In this post we take the basic model developed in Marketing Mix Modeling Explained – with R and add the non-linear effect of advertising. I distinguish between advertising and advertising returns, and I call the latter sales for ease of readability.
Advertising non-linearity beyond the adstock/carry-over idea comes from two concepts: (1) diminishing returns and (2) saturation. Diminishing returns means that the marginal return to advertising is not constant but decreasing. For example, sales from $200 of advertising are less than twice the sales from $100 of advertising. Saturation is the limiting case of diminishing returns: sales reach a ceiling beyond which more advertising has near-zero incremental effect.
There are a number of functions used to model this non-linear advertising-sales relationship. I present a few of the most popular and, where applicable, specify the parameter ranges usually used with each. But like they say, a picture is worth a thousand words.
- Power Function: y = α⋅x^β ; 0 < β ≤ 1
The power function has the nice property that it becomes linear when β = 1. That, of course, means there are no diminishing returns within the observed data range. This function never saturates: y → ∞ as x → ∞. This could be unreasonable for testing marketing levels outside the observed range of the modeled data. Aside from its use in advertising, the power function is also used to model the price variable, with β < 0.
- Michaelis-Menten Function: y = (α⋅x) / (1 + β⋅x) ; β ≥ 0
The Michaelis-Menten function has a similar property to the power function: it becomes linear, but when β = 0. This function has the added bonus of reaching a sales saturation of α/β when β > 0.
- Negative Exponential Function: y = α⋅(1 − e^(−β⋅x)) ; β > 0
This function is called the negative exponential function due to the −β term in the exponent. It is also referred to as the 2-parameter asymptotic exponential. The maximum sales attained by this functional form, i.e. saturation, is α.
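The three functions can be written down directly in R to see how each behaves as advertising grows. This is a minimal sketch: the function names, the spend levels, and the parameter values are all hypothetical and chosen only for illustration.

```r
# Three response curves; alpha and beta follow the notation used in the post.
power_fn <- function(x, alpha, beta) alpha * x^beta
michaelis_menten <- function(x, alpha, beta) (alpha * x) / (1 + beta * x)
neg_exp <- function(x, alpha, beta) alpha * (1 - exp(-beta * x))

# Hypothetical advertising levels, including a very large one.
spend <- c(0, 100, 200, 1e6)

power_fn(spend, alpha = 2, beta = 0.5)            # keeps growing, no ceiling
michaelis_menten(spend, alpha = 2, beta = 0.01)   # approaches alpha / beta = 200
neg_exp(spend, alpha = 200, beta = 0.01)          # approaches alpha = 200
```

With these values the power function returns 20 at a spend of 100 and less than 40 at a spend of 200, which is the diminishing-returns property, while the other two curves flatten out near 200.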
There are three important questions to ask:
- What happens at a zero level of advertising?
- What happens at a very high level of advertising?
- What happens between zero and a high level of advertising?
Naturally, a zero level of advertising should produce no sales effect, and at higher levels the advertising effect should reach an upper limit, mathematically called an asymptote. What happens in between is up for debate and is the subject of the next post, but suffice it to say that an S-curve function is sometimes desired.
The input variable, x, in all three functions above can be one of three things: (1) advertising, (2) advertising adstock, or (3) cumulative advertising over a certain period of time.
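To make the three input choices concrete, here is a small sketch that builds each one from the same series and passes the adstocked version through the negative exponential function. The geometric adstock recursion a[t] = x[t] + rate ⋅ a[t − 1] is an assumption on my part (this post does not define adstock), and the data and parameter values are made up.

```r
# Geometric adstock: a[t] = x[t] + rate * a[t - 1] (assumed form).
adstock <- function(x, rate) {
  as.numeric(stats::filter(x, filter = rate, method = "recursive"))
}
neg_exp <- function(x, alpha, beta) alpha * (1 - exp(-beta * x))

ads <- c(100, 0, 0, 50, 0)                # hypothetical weekly advertising
ads_adstock <- adstock(ads, rate = 0.5)   # 100, 50, 25, 62.5, 31.25
ads_cumulative <- cumsum(ads)             # 100, 100, 100, 150, 150

# Any of the three series can play the role of x in a response function.
neg_exp(ads_adstock, alpha = 200, beta = 0.01)
```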
The last thing I want to add is that these functions are monotonically increasing, i.e. sales at a higher level of advertising are always greater than sales at a lower level. Mathematically, f(x + ϵ) > f(x) for ϵ > 0. This, of course, means advertising can do no harm, which is a whole different topic on its own.
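The monotonicity claim, and the diminishing returns discussed earlier, can be checked from the first and second derivatives. The lines below assume α > 0 and x > 0, and β > 0 in each case; the sign of α is not stated in the post, so treat it as an assumption.

$$
\begin{aligned}
\text{Power:} \quad & \frac{dy}{dx} = \alpha \beta x^{\beta - 1} > 0, \qquad \frac{d^2 y}{dx^2} = \alpha \beta (\beta - 1) x^{\beta - 2} \le 0 \quad (0 < \beta \le 1) \\
\text{Michaelis-Menten:} \quad & \frac{dy}{dx} = \frac{\alpha}{(1 + \beta x)^2} > 0, \qquad \frac{d^2 y}{dx^2} = -\frac{2 \alpha \beta}{(1 + \beta x)^3} < 0 \\
\text{Negative exponential:} \quad & \frac{dy}{dx} = \alpha \beta e^{-\beta x} > 0, \qquad \frac{d^2 y}{dx^2} = -\alpha \beta^2 e^{-\beta x} < 0
\end{aligned}
$$

A positive first derivative gives f(x + ϵ) > f(x), and a non-positive second derivative is the diminishing marginal return.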
This seems really interesting. Can you help me execute this in R?
Use the nls() function.
Looking forward to your next post on the shape of the function!
Thank you for the interesting post Gabriel! Can you give an example of how to apply the functions to the adstock model illustrated in your previous post? I tried to use nls() to optimize the parameters of negative exponential function as well as the adstock decay rate but got this error:
Error in nlsModel(formula, mf, start, wts) :
singular gradient matrix at initial parameter estimates
Hi Doro,
The nls function will fail quite a bit. You can use the nlsLM function from the minpack.lm package. Here is an example I put in the comments of Adstock Rate – Deriving with Analytical Methods.
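As a rough illustration of the nlsLM route, the sketch below estimates the adstock rate together with the negative exponential parameters on simulated data. It assumes the minpack.lm package is installed and uses a geometric adstock helper that is my own assumption; the data, starting values, and bounds are hypothetical and not taken from the linked post.

```r
# Assumes minpack.lm is installed; every number below is simulated.
library(minpack.lm)

# Geometric adstock helper (assumed form, not defined in this post).
adstock <- function(x, rate) {
  as.numeric(stats::filter(x, filter = rate, method = "recursive"))
}

set.seed(1)
ads <- runif(104, 0, 100)   # hypothetical weekly advertising
sales <- 500 + 300 * (1 - exp(-0.02 * adstock(ads, 0.4))) + rnorm(104, sd = 5)

# The adstock rate and the curve parameters are estimated in one step.
# Bounds are given in the same order as the starting values.
fit <- nlsLM(
  sales ~ b0 + alpha * (1 - exp(-beta * adstock(ads, rate))),
  start = list(b0 = 400, alpha = 200, beta = 0.01, rate = 0.2),
  lower = c(0, 0, 1e-6, 0),
  upper = c(Inf, Inf, 1, 0.99)
)
coef(fit)
```

Bounding the rate below 1 and keeping beta positive are choices made for this sketch; they keep the optimizer away from regions where the curve is flat.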
Thank you Gabriel. This is super helpful. Looking forward to your new posts!
How does this show the use of the functions in the post, with saturation etc.?
Try using nls(y ~ a + b * adstock(x, r)^p, …)
Thanks a lot Gabriel. I tried to use the nlsLM function for the saturation effect, but I am not getting exactly how to combine it with the original sales equation. Can you please help me with the code in R?
Can you put some data along with your model?
Hi, can you please show me with an example how to solve exponential decay using nlsLM()?
There are many comments that explain that.
Hope you are doing well.
I have been following your blog recently and it’s very helpful in my current work on promotional response analysis.
As I am new to this area of analysis, I would like to learn more (please bear with me, some of my questions may be basic).
1. Can you explain in more detail the statistical model used to calculate the optimum adstock rate? Do you also have code in SAS, like in R? If so, can you please share it?
2. I am able to follow the optimization in Excel using Solver to minimize errors. Is there a particular proc in SAS? Which module in SAS supports it?
Also, while using nls in R, we have to transform the variables outside the model.
I no longer use SAS. Moved to R & Python and never looked back. Your model should be able to transform the variable and calculate the coefficient in one step.
Hello Gabriel,
just wanted to ask about finding the parameters of these curves when we know, that these variables (spends/GRP etc.) are only a part of the model.
For example: Sales = A*constant + B*trend + C* Promotion(0,1) + D*f(TV) + E*f(Online).
We know the values of GRPs and Online spends. We also know that the f(TV) – based on GRPs – will probably be S-Shaped and f(Online) will be negative-exponential. How (and when – after defining the model or at the very beginning) can we find the parameters of these functions?
Would really appreciate Your help
J.
The answers to these questions are covered in a few blog posts. If you know the functional form of the transformation, then you’ll need to set up an optimization procedure. You can do it with Excel, like in this blog post. The trick is to write the function in the formula and let the optimization function choose the parameters. The nls() function, or a variation of it, is usually what I use.
Thank you very much! I’m working on my first MMM model and wanted to find some kind of algorithm. The purpose is to decompose the sales function and calculate ROI/threshold for media investment. The variables are: Sales, trend, promotion (0,1), seasonal vars (DoW without the weakest Monday), also coded as (0,1), GRPs and online spends. Are these steps correct?
1. Finding optimal AdStock rates for TV and Online maximizing R^2 in:
a) Sales ~ GRP relation
b) Sales ~ Online relation
2. Replacing GRP and Online spends with AdStock variables
3. Running OLS regression so I can estimate the initial impact of my variables on Sales
4. NLSLM -> Sales ~ b0 + b1*trend + b2*Promotion(0,1) + b3….b8*Tuesday – Sunday(0,1) + b9*f(S-Shaped ~ AdStock GRP) + b10*g(Negative Exponential ~ AdStock Online)
So NLSLM has to find b0 – b10 parameters + parameters of both f and g functions.
…but something seems to be wrong here 😦 I feel like AdStock and parameters of the final function should be optimized at the same time, but I have no idea how to do it…
Or maybe I should build two models – one for decomposition, linear with optimized adstock and the second one for ROI and the diminishing functions…?
The steps are correct. You should build one model.
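One way to read "one model" is to let the adstock rate and the saturation parameter be chosen at the same time as the linear coefficients. The sketch below does this with optim() wrapped around lm(), which is a substitute for the nls()/nlsLM() calls mentioned in this thread rather than the method used in the post. The adstock helper, the simulated data, and the simplified set of variables (trend, promotion, one online channel) are all assumptions.

```r
# Geometric adstock helper (assumed form).
adstock <- function(x, rate) {
  as.numeric(stats::filter(x, filter = rate, method = "recursive"))
}

# Simulated data; every number here is hypothetical.
set.seed(42)
n <- 104
trend <- seq_len(n)
promo <- rbinom(n, 1, 0.2)
online <- runif(n, 0, 100)
sales <- 1000 + 2 * trend + 50 * promo +
  300 * (1 - exp(-0.02 * adstock(online, 0.5))) + rnorm(n, sd = 10)

# par[1] is the adstock rate, par[2] is beta. For fixed values of both, the
# remaining coefficients are linear, so lm() estimates them and the function
# returns the residual sum of squares.
sse <- function(par) {
  g <- 1 - exp(-par[2] * adstock(online, par[1]))
  sum(resid(lm(sales ~ trend + promo + g))^2)
}

opt <- optim(
  par = c(0.3, 0.05), fn = sse, method = "L-BFGS-B",
  lower = c(0, 1e-4), upper = c(0.95, 1)
)

# Refit once at the chosen rate and beta to read off all coefficients.
g <- 1 - exp(-opt$par[2] * adstock(online, opt$par[1]))
final_fit <- lm(sales ~ trend + promo + g)
opt$par
coef(final_fit)
```

The coefficient on g plays the role of α (b10 in the question above), so the adstock rate, the curve shape, and the sales decomposition all come out of the same fit.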
Hello, Gabriel. I was trying to create a fictional dataset: Sales = 1000 + 0.8*SQRT(TV) + 0.1*SQRT(RD) + random value. The data is below.
Sales TV RD
1222 100000 30000
1201 70000 20000
1149 35000 40000
1258 70000 20000
1250 90000 18000
1282 120000 31000
1148 50000 33000
1221 70000 20000
1227 88000 17000
1238 90000 19000
1344 150000 40000
1218 40000 24000
1133 21000 27000
1281 77000 42000
1312 120000 50000
1241 100000 32000
1257 90000 11000
1158 40000 19000
1199 66000 22000
1293 82000 31000
1302 100000 30000
1177 70000 20000
1193 35000 40000
1209 70000 20000
1293 90000 18000
1295 120000 31000
1155 50000 33000
1270 70000 20000
1227 88000 17000
1261 90000 19000
1327 150000 40000
1219 40000 24000
1115 21000 27000
1273 77000 42000
1317 120000 50000
1240 100000 32000
1226 90000 11000
1165 40000 19000
1182 66000 22000
1270 82000 31000
1246 100000 30000
1211 70000 20000
1155 35000 40000
1177 70000 20000
1260 90000 18000
1270 120000 31000
1163 50000 33000
1259 70000 20000
1273 88000 17000
1257 90000 19000
1376 150000 40000
1126 40000 24000
1106 21000 27000
1240 77000 42000
1326 120000 50000
1305 100000 32000
1279 90000 11000
1194 40000 19000
1188 66000 22000
1212 82000 31000
1285 100000 30000
1252 70000 20000
1124 35000 40000
1206 70000 20000
1229 90000 18000
1337 120000 31000
1149 50000 33000
1228 70000 20000
1237 88000 17000
1287 90000 19000
1367 150000 40000
1161 40000 24000
1104 21000 27000
1264 77000 42000
1289 120000 50000
1280 100000 32000
1279 90000 11000
1170 40000 19000
1178 66000 22000
1229 82000 31000
1262 100000 30000
1196 70000 20000
1152 35000 40000
1187 70000 20000
1243 90000 18000
1326 120000 31000
1152 50000 33000
1270 70000 20000
1284 88000 17000
1301 90000 19000
1307 150000 40000
1193 40000 24000
1082 21000 27000
1255 77000 42000
1273 120000 50000
1291 100000 32000
1264 90000 11000
1166 40000 19000
1178 66000 22000
1223 82000 31000
1221 100000 30000
1205 70000 20000
1178 35000 40000
1266 70000 20000
1233 90000 18000
1337 120000 31000
1231 50000 33000
1234 70000 20000
1277 88000 17000
1290 90000 19000
1326 150000 40000
1190 40000 24000
1140 21000 27000
1258 77000 42000
1250 120000 50000
1272 100000 32000
1244 90000 11000
1216 40000 19000
1202 66000 22000
1262 82000 31000
1250 100000 30000
1272 70000 20000
1121 35000 40000
1253 70000 20000
1258 90000 18000
1255 120000 31000
1225 50000 33000
1238 70000 20000
1284 88000 17000
1218 90000 19000
1295 150000 40000
1204 40000 24000
1133 21000 27000
1203 77000 42000
1317 120000 50000
1254 100000 32000
1253 90000 11000
1131 40000 19000
1267 66000 22000
1227 82000 31000
1220 100000 30000
1233 70000 20000
1128 35000 40000
1246 70000 20000
1249 90000 18000
1259 120000 31000
1155 50000 33000
1196 70000 20000
1270 88000 17000
1295 90000 19000
1296 150000 40000
1218 40000 24000
1109 21000 27000
1206 77000 42000
1275 120000 50000
1248 100000 32000
1241 90000 11000
1211 40000 19000
1267 66000 22000
1290 82000 31000
1251 100000 30000
1190 70000 20000
1168 35000 40000
1246 70000 20000
1258 90000 18000
1270 120000 31000
1239 50000 33000
1223 70000 20000
1213 88000 17000
1294 90000 19000
1345 150000 40000
1167 40000 24000
1102 21000 27000
1270 77000 42000
1348 120000 50000
1258 100000 32000
1218 90000 11000
1218 40000 19000
1248 66000 22000
1233 82000 31000
1273 100000 30000
1186 70000 20000
1217 35000 40000
1267 70000 20000
1214 90000 18000
1251 120000 31000
1229 50000 33000
1219 70000 20000
1203 88000 17000
1283 90000 19000
1379 150000 40000
1144 40000 24000
1171 21000 27000
1276 77000 42000
1314 120000 50000
1288 100000 32000
1290 90000 11000
1138 40000 19000
1177 66000 22000
1280 82000 31000
1302 100000 30000
1249 70000 20000
1133 35000 40000
1198 70000 20000
1218 90000 18000
1297 120000 31000
1168 50000 33000
1256 70000 20000
1269 88000 17000
My R code for non-linear regression is:
Example summary(nls_model)
Formula: Sales ~ b0 + a1 * TV^b1 + a2 * RD^b2
Parameters:
Estimate Std. Error t value Pr(>|t|)
b0 9.624e+02 1.115e+02 8.630 1.74e-15 ***
b1 4.326e-01 1.643e-01 2.633 0.0091 **
b2 7.650e-01 7.268e+00 0.105 0.9163
a1 2.078e+00 4.569e+00 0.455 0.6497
a2 2.256e-03 1.891e-01 0.012 0.9905
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 30.21 on 204 degrees of freedom
Number of iterations till stop: 50
Achieved convergence tolerance: 1.49e-08
Reason stopped: Number of iterations has reached `maxiter’ == 50.
What am I doing wrong…?
A.
Once again, as the prev comment was a bit chaotic. 🙂
Sales in the sample dataset is: 1000 + 0.8 * SQRT(TV) + 0.1 * SQRT(RD) + a random number between -50 and 50.
Hi Gabriel, I have been using the power transformation to make my models and I was looking for new ideas to incorporate saturation when I reached your blog. Interesting read! Can you suggest how we should go about selecting values for beta in transformations 2 and 3? The possible ranges for beta are very wide, unlike for the power transformation, where it’s between 0 and 1.
Hi Nikharn, it’s been a while since I’ve played with this, but for the Michaelis-Menten function a value between 0 and 1 would do. Note that a value of zero makes the function linear. This is the opposite of the power function, which is linear when β = 1.
I’ve been exploring the concept of ad fatigue or ad half-life. The mathematical models explained here incorporate saturation but what happens after that? Seems like there should be a decay in the response at some point in time.
Take a look at advertising adstock: https://analyticsartist.wordpress.com/2013/11/02/calculating-adstock-effect/
Hi Gabriel, if we are already fitting a non-linear model using nls() or nlsLM() functions, why do we need to additionally transform the adstock to account for saturation? Referring to your earlier comment where you suggested using nls(y ~ a + b * adstock(x, r)^p, …). Thanks!
The non-linear fitting will give you the adstock. If you are superimposing a particular adstock prior to creating the model then you don’t need non-linear estimation. You can simply use your favorite linear function after having transformed the data the way you see fit.
But see, that is the problem. Namely, “transforming data the way you see fit” is a very biased way to create models. Some people even call this cheating because you continue transforming and modeling till you find a model that fits a preconceived conclusion.
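For contrast, this is what the "fix the adstock first, then fit a linear model" route looks like. It is a sketch on simulated data with an assumed geometric adstock helper; the fixed rate of 0.6 is a hypothetical analyst choice, which is exactly the kind of pre-imposed value the reply above cautions about.

```r
# Geometric adstock helper (assumed form).
adstock <- function(x, rate) {
  as.numeric(stats::filter(x, filter = rate, method = "recursive"))
}

# Simulated data; all values are hypothetical.
set.seed(7)
ads <- runif(52, 0, 100)
sales <- 200 + 1.5 * adstock(ads, 0.6) + rnorm(52, sd = 10)

# Adstock rate imposed by the analyst before modeling.
fixed_rate <- 0.6
ads_adstock <- adstock(ads, fixed_rate)

# With the transformation fixed, ordinary least squares is enough.
linear_fit <- lm(sales ~ ads_adstock)
coef(linear_fit)
```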
Exponential decay adstock function: if we use that decay pattern, Alpha * (1 − e^(−Beta * X)),
with X being ad spend or ad units, then for any value of X greater than 10 the adstock is no more than 1 unit. Is that appropriate? All ads are always > 100 units. I see that you mentioned that the max value of adstock is Alpha. That means Alpha has to be a large value. How do we arrive at Alpha?
Is the Alpha mentioned in the exponential decay function and in the power function the same? I believe alpha is the ‘decay’ or lag weight parameter and is usually between 0 and 1. Then Alpha cannot be large! I’m a little confused; can you clarify, Gabriel? Thank you.
That is ok. The level of the transformed data doesn’t matter. Alpha and Beta can be obtained through non-linear optimization functions like nls() in R. Also, alpha is the saturation max for that variable.