In this post we take the basic model developed in Marketing Mix Modeling Explained – with R and add the non-linear effect of advertising. I distinguish between advertising and advertising returns, and I call the latter sales for ease of readability.

Advertising non-linearity beyond the adstock/carry-over idea comes from two concepts: (1) diminishing returns and (2) saturation. Diminishing returns means that advertising exhibits a non-constant, decreasing marginal return to scale. For example, sales from $200 of advertising are less than twice the sales from $100 of advertising. Closely related to diminishing returns is the saturation effect, where sales reach a limit beyond which more advertising has a near-zero incremental effect.

There are a number of functions used to model this non-linear advertising-sales relationship. I present a few of the most popular. I also specify the parameter ranges usually used with each function, where applicable. But like they say, a picture is worth 1,000 words.

**Power Function**: $y = \alpha \cdot x^{\beta}$; $0 < \beta \le 1$

The power function has the nice property that it becomes linear when $\beta = 1$. That, of course, means that there are no diminishing returns within the observed data range. This function never saturates, since $y \to \infty$ as $x \to \infty$, which can be unreasonable when testing marketing levels outside the observed range of the modeled data. Aside from its use in advertising, the power function is also used to model the price variable, with $\beta < 0$.

**Michaelis-Menten Function**: $y = \dfrac{\alpha \cdot x}{1 + \beta \cdot x}$; $\beta \ge 0$

The Michaelis-Menten function has a similar property to the power function in that it becomes linear, but when $\beta = 0$. For $\beta > 0$, this function has the added bonus of reaching a sales saturation of $\alpha / \beta$.

**Negative Exponential Function**: $y = \alpha \cdot (1 - e^{-\beta \cdot x})$; $\beta > 0$

This function is called the negative exponential function due to the $-\beta$ term in the exponent. It is also referred to as the 2-parameter asymptotic exponential. The maximum sales attained by this functional form, i.e. the saturation level, is $\alpha$.
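As a minimal sketch, the three curves can be written as plain R functions. The function names and the parameter values below are illustrative assumptions, not estimates from any data set; they are chosen only so that the diminishing-returns and saturation behavior is easy to see.

```r
# Three diminishing-returns response curves (function names are illustrative).
power_fn         <- function(x, alpha, beta) alpha * x^beta
michaelis_menten <- function(x, alpha, beta) alpha * x / (1 + beta * x)
neg_exp          <- function(x, alpha, beta) alpha * (1 - exp(-beta * x))

# Zero, two moderate levels, and one very high level of advertising.
x <- c(0, 100, 200, 1e6)

# Parameter values are made up for illustration only.
p  <- power_fn(x, alpha = 2, beta = 0.5)            # never saturates
mm <- michaelis_menten(x, alpha = 2, beta = 0.01)   # approaches alpha / beta = 200
ne <- neg_exp(x, alpha = 150, beta = 0.01)          # approaches alpha = 150

round(cbind(x, power = p, michaelis_menten = mm, neg_exp = ne), 2)
```

In each column the value at x = 200 is less than twice the value at x = 100, which is the diminishing-returns property. At the very high level, the Michaelis-Menten and negative exponential columns sit close to their limits, while the power function keeps growing.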

There are three important questions to ask:

- What happens at zero level of advertising?
- What happens at very high level of advertising?
- What happens between zero and high level of advertising?

Naturally, a zero level of advertising should produce no sales effect, and the effect of higher levels of advertising should reach an upper limit, mathematically called an asymptote. What happens in between is open to debate and is the subject of the next post, but suffice it to say that an S-curve function is sometimes desired.

The input variable, $x$, in all three functions above can be one of three things: (1) advertising, (2) advertising adstock, or (3) cumulative advertising over a certain period of time.

The last thing I want to add is that these functions are monotonically increasing, i.e. sales at a higher level of advertising are always greater than sales at a lower level. Mathematically, $f(x + \epsilon) > f(x)$ for any $\epsilon > 0$. This, of course, means advertising can do no harm, which is a whole different topic on its own.
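One way to check the monotonicity claim is to look at the first derivatives. The following is a short derivation under the assumption that $\alpha > 0$ and $x > 0$ (the sign of $\alpha$ is not stated above, so treat it as an assumption):

$$
\begin{aligned}
\text{Power:} \quad & \frac{dy}{dx} = \alpha \beta x^{\beta - 1} > 0 \\
\text{Michaelis-Menten:} \quad & \frac{dy}{dx} = \frac{\alpha}{(1 + \beta x)^{2}} > 0 \\
\text{Negative exponential:} \quad & \frac{dy}{dx} = \alpha \beta e^{-\beta x} > 0
\end{aligned}
$$

Each derivative is positive, so sales always rise with advertising. Each derivative also shrinks as $x$ grows (for $\beta < 1$ in the power function and $\beta > 0$ in the other two), which is the diminishing-returns property.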

Sray Agarwal: This seems really interesting. Can you help me in executing this in R?

AnalyticsArtist (post author): Use the `nls()` function.

Lil: Looking forward to your next post on the shape of the function!

Doro: Thank you for the interesting post Gabriel! Can you give an example of how to apply the functions to the adstock model illustrated in your previous post? I tried to use `nls()` to optimize the parameters of the negative exponential function as well as the adstock decay rate but got this error:

Error in nlsModel(formula, mf, start, wts) :

singular gradient matrix at initial parameter estimates

AnalyticsArtist (post author): Hi Doro,

The `nls` function will fail quite a bit. You can use the `nlsLM` function from the `minpack.lm` package. Here is an example I put in the comments of Adstock Rate – Deriving with Analytical Methods.

Doro: Thank you Gabriel. This is super helpful. Looking forward to your new posts!

Jim: How does this show use of the functions in the post with saturation etc?

AnalyticsArtist (post author): Try using `nls(y ~ a + b * adstock(x, r)^p, …)`
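A minimal sketch of what such a call might look like is shown below. Everything around the formula is an assumption made for illustration: the `adstock()` helper is a simple geometric carry-over written for this sketch, the data are simulated, and the starting values and bounds are arbitrary but plausible choices. The `"port"` algorithm is used only because it allows bounds on the adstock rate and the power.

```r
# Geometric adstock, a[t] = x[t] + r * a[t - 1]. This helper is an assumed
# definition written for this sketch, not a function from a package.
adstock <- function(x, r) {
  as.numeric(stats::filter(x, filter = r, method = "recursive"))
}

# Simulated data, made up purely for illustration.
set.seed(42)
n <- 156
x <- runif(n, min = 10, max = 100)                     # weekly advertising
y <- 50 + 4 * adstock(x, 0.5)^0.7 + rnorm(n, sd = 3)   # weekly sales

# Estimate intercept, slope, adstock rate and power in one step.
fit <- nls(y ~ a + b * adstock(x, r)^p,
           start     = list(a = 40, b = 3, r = 0.3, p = 0.8),
           algorithm = "port",
           lower     = c(a = -Inf, b = 0,   r = 0,    p = 0.05),
           upper     = c(a = Inf,  b = Inf, r = 0.95, p = 1),
           control   = nls.control(maxiter = 200))

round(coef(fit), 3)
```

The intercept, slope and power in this form are strongly correlated, so the fit can be sensitive to starting values. If `nls()` struggles on real data, `nlsLM()` from `minpack.lm` takes a similar formula, `start`, `lower` and `upper`.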

Prateek: Thanks a lot Gabriel. I tried to use the `nlsLM` function for the saturation effect, but I am not getting exactly how to combine this with the original Sales equation. Can you please help me with its code in R?

AnalyticsArtist (post author): Can you put some data along with your model?

Chandan: Hi, can you please let me know with an example how to solve exponential decay using `nlsLM()`?

AnalyticsArtist (post author): There are many comments that explain that.

Srikanth K: Hope you are doing well.

I have been following your blog recently and it's very helpful in my current work on promotional response analysis.

As I am new to this area of analysis, would like to learn more (Please bear with me, some of my questions may be basic )

1. Can you explain more detail on the statistical model used to calculate optimum ad-stock rate, do you also have a code in SAS, like in R? if so can you please share it?

2. I am able to follow, Optimization in Excel using solver to minimize errors. Any particular proc in SAS? which module in SAS support?

Srikanth K: Also, while using `nls` in R, we have to transform the variables outside the model.

AnalyticsArtist (post author): I no longer use SAS. Moved to R & Python and never looked back. Your model should be able to transform the variable and calculate the coefficient in one step.

Jean: Hello Gabriel,

just wanted to ask about finding the parameters of these curves when we know, that these variables (spends/GRP etc.) are only a part of the model.

For example: Sales = A*constant + B*trend + C* Promotion(0,1) + D*f(TV) + E*f(Online).

We know the values of GRPs and Online spends. We also know that the f(TV), based on GRPs, will probably be S-shaped and f(Online) will be negative-exponential. How (and when: after defining the model or at the very beginning) can we find the parameters of these functions?

Would really appreciate Your help

J.

AnalyticsArtistPost authorThe answers to these questions are defined in few blogs. If you know the functional form of the transformation then you’ll need to setup an optimization procedures. You can do it with with Excel like in this blog post. The trick is to write the function in the formula and let the optimization function choose the parameters. The

`nls()`

function or variations of it is usually what I use.JeanThank you very much! I’m working on my first MMM model and wanted to find some kind of algorithm. The purpose is to decompose the sales function and calculate ROI/treshold for media investment. The variables are: Sales, trend, promotion (0,1), seasonal vars (DoW without the weakest Monday) also described as (0,1), GRPs ans online Spends. Are these steps correct?

1. Finding optimal AdStock rates for TV and Online maximizing R^2 in:

a) Sales ~ GRP relation

b) Sales ~ Online relation

2. Replacing GRP and Online spends with AdStock variables

3. Running OLS regression so I can estimate the initial impact of my variables on Sales

4. NLSLM -> Sales ~ b0 + b1*trend + b2*Promotion(0,1) + b3….b8*Tuesday – Sunday(0,1) + b9*f(S-Shaped ~ AdStock GRP) + b10*g(Negative Exponential ~ AdStock Online)

So NLSLM has to find b0 – b10 parameters + parameters of both f and g functions.

…but something seems to be wrong here 😦 I feel like AdStock and parameters of the final function should be optimized at the same time, but I have no idea how to do it…

Or maybe I should build two models – one for decomposition, linear with optimized adstock and the second one for ROI and the diminishing functions…?

AnalyticsArtist (post author): The steps are correct. You should build one model.
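As a rough sketch of what "one model" could look like, the example below estimates the linear coefficients, the adstock rate and the negative exponential parameters in a single `nls()` call. The data are simulated, the `adstock()` helper is an assumed geometric carry-over written for this sketch, and only one media variable is included to keep it short; none of this comes from Jean's actual data.

```r
# Assumed geometric adstock helper: a[t] = x[t] + r * a[t - 1].
adstock <- function(x, r) {
  as.numeric(stats::filter(x, filter = r, method = "recursive"))
}

# Simulated weekly data, made up for illustration only.
set.seed(7)
n     <- 156
trend <- seq_len(n)
promo <- rbinom(n, size = 1, prob = 0.2)   # promotion flag (0/1)
tv    <- runif(n, min = 0, max = 100)      # TV GRPs
sales <- 500 + 0.5 * trend + 30 * promo +
  200 * (1 - exp(-0.01 * adstock(tv, 0.5))) + rnorm(n, sd = 5)
d <- data.frame(sales, trend, promo, tv)

# Linear terms, adstock rate r, and the curve parameters alpha and beta
# are all estimated together.
fit <- nls(sales ~ b0 + b1 * trend + b2 * promo +
             alpha * (1 - exp(-beta * adstock(tv, r))),
           data      = d,
           start     = list(b0 = 450, b1 = 0.3, b2 = 20,
                            alpha = 150, beta = 0.02, r = 0.3),
           algorithm = "port",
           lower     = c(-Inf, -Inf, -Inf, 0,   1e-4, 0),
           upper     = c(Inf,  Inf,  Inf,  Inf, 1,    0.95),
           control   = nls.control(maxiter = 200))

round(coef(fit), 3)
```

The starting value for `beta` is set on the order of one over a typical adstock level. A `beta` that is far too large for the scale of the data flattens the curve and its gradient, which is one common way to end up with a singular gradient error.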

Adam Kupiec: Hello, Gabriel. I was trying to create a fictional data set: Sales = 1000 + 0,8*SQRT(TV) + 0,1*SQRT(RD) + random value. The data is below.

Sales TV RD

1222 100000 30000

1201 70000 20000

1149 35000 40000

1258 70000 20000

1250 90000 18000

1282 120000 31000

1148 50000 33000

1221 70000 20000

1227 88000 17000

1238 90000 19000

1344 150000 40000

1218 40000 24000

1133 21000 27000

1281 77000 42000

1312 120000 50000

1241 100000 32000

1257 90000 11000

1158 40000 19000

1199 66000 22000

1293 82000 31000

1302 100000 30000

1177 70000 20000

1193 35000 40000

1209 70000 20000

1293 90000 18000

1295 120000 31000

1155 50000 33000

1270 70000 20000

1227 88000 17000

1261 90000 19000

1327 150000 40000

1219 40000 24000

1115 21000 27000

1273 77000 42000

1317 120000 50000

1240 100000 32000

1226 90000 11000

1165 40000 19000

1182 66000 22000

1270 82000 31000

1246 100000 30000

1211 70000 20000

1155 35000 40000

1177 70000 20000

1260 90000 18000

1270 120000 31000

1163 50000 33000

1259 70000 20000

1273 88000 17000

1257 90000 19000

1376 150000 40000

1126 40000 24000

1106 21000 27000

1240 77000 42000

1326 120000 50000

1305 100000 32000

1279 90000 11000

1194 40000 19000

1188 66000 22000

1212 82000 31000

1285 100000 30000

1252 70000 20000

1124 35000 40000

1206 70000 20000

1229 90000 18000

1337 120000 31000

1149 50000 33000

1228 70000 20000

1237 88000 17000

1287 90000 19000

1367 150000 40000

1161 40000 24000

1104 21000 27000

1264 77000 42000

1289 120000 50000

1280 100000 32000

1279 90000 11000

1170 40000 19000

1178 66000 22000

1229 82000 31000

1262 100000 30000

1196 70000 20000

1152 35000 40000

1187 70000 20000

1243 90000 18000

1326 120000 31000

1152 50000 33000

1270 70000 20000

1284 88000 17000

1301 90000 19000

1307 150000 40000

1193 40000 24000

1082 21000 27000

1255 77000 42000

1273 120000 50000

1291 100000 32000

1264 90000 11000

1166 40000 19000

1178 66000 22000

1223 82000 31000

1221 100000 30000

1205 70000 20000

1178 35000 40000

1266 70000 20000

1233 90000 18000

1337 120000 31000

1231 50000 33000

1234 70000 20000

1277 88000 17000

1290 90000 19000

1326 150000 40000

1190 40000 24000

1140 21000 27000

1258 77000 42000

1250 120000 50000

1272 100000 32000

1244 90000 11000

1216 40000 19000

1202 66000 22000

1262 82000 31000

1250 100000 30000

1272 70000 20000

1121 35000 40000

1253 70000 20000

1258 90000 18000

1255 120000 31000

1225 50000 33000

1238 70000 20000

1284 88000 17000

1218 90000 19000

1295 150000 40000

1204 40000 24000

1133 21000 27000

1203 77000 42000

1317 120000 50000

1254 100000 32000

1253 90000 11000

1131 40000 19000

1267 66000 22000

1227 82000 31000

1220 100000 30000

1233 70000 20000

1128 35000 40000

1246 70000 20000

1249 90000 18000

1259 120000 31000

1155 50000 33000

1196 70000 20000

1270 88000 17000

1295 90000 19000

1296 150000 40000

1218 40000 24000

1109 21000 27000

1206 77000 42000

1275 120000 50000

1248 100000 32000

1241 90000 11000

1211 40000 19000

1267 66000 22000

1290 82000 31000

1251 100000 30000

1190 70000 20000

1168 35000 40000

1246 70000 20000

1258 90000 18000

1270 120000 31000

1239 50000 33000

1223 70000 20000

1213 88000 17000

1294 90000 19000

1345 150000 40000

1167 40000 24000

1102 21000 27000

1270 77000 42000

1348 120000 50000

1258 100000 32000

1218 90000 11000

1218 40000 19000

1248 66000 22000

1233 82000 31000

1273 100000 30000

1186 70000 20000

1217 35000 40000

1267 70000 20000

1214 90000 18000

1251 120000 31000

1229 50000 33000

1219 70000 20000

1203 88000 17000

1283 90000 19000

1379 150000 40000

1144 40000 24000

1171 21000 27000

1276 77000 42000

1314 120000 50000

1288 100000 32000

1290 90000 11000

1138 40000 19000

1177 66000 22000

1280 82000 31000

1302 100000 30000

1249 70000 20000

1133 35000 40000

1198 70000 20000

1218 90000 18000

1297 120000 31000

1168 50000 33000

1256 70000 20000

1269 88000 17000

My R code for non-linear regression is:

Example summary(nls_model)

Formula: Sales ~ b0 + a1 * TV^b1 + a2 * RD^b2

Parameters:

Estimate Std. Error t value Pr(>|t|)

b0 9.624e+02 1.115e+02 8.630 1.74e-15 ***

b1 4.326e-01 1.643e-01 2.633 0.0091 **

b2 7.650e-01 7.268e+00 0.105 0.9163

a1 2.078e+00 4.569e+00 0.455 0.6497

a2 2.256e-03 1.891e-01 0.012 0.9905

—

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 30.21 on 204 degrees of freedom

Number of iterations till stop: 50

Achieved convergence tolerance: 1.49e-08

Reason stopped: Number of iterations has reached `maxiter’ == 50.

What am I doing wrong…?

A.

Adam: Once again, as the previous comment was a bit chaotic. 🙂

Sales from the sample dataset is: 1000 + 0,8 * SQRT(TV) + 0,1 * SQRT (RD) + random number -50 – 50.

nikharn: Hi Gabriel, I have been using the power transformation to make my models and I was looking for new ideas to incorporate saturation when I reached your blog. Interesting read! Can you suggest how we should go about selecting values for beta in transformations 2 and 3? The possible ranges for beta are very wide, unlike for the power transformation where it's between 0 and 1.

AnalyticsArtist (post author): Hi Nikharn, it’s been a while since I’ve played with this, but for the Michaelis-Menten function a value between 0 and 1 would do. Note that a value of zero gives a linear function (the identity, up to the scale factor alpha). This is the opposite of the power function.

John Zicker: I’ve been exploring the concept of ad fatigue or ad half-life. The mathematical models explained here incorporate saturation, but what happens after that? It seems like there should be a decay in the response at some point in time.

AnalyticsArtist (post author): Take a look at advertising adstock: https://analyticsartist.wordpress.com/2013/11/02/calculating-adstock-effect/

PS: Hi Gabriel, if we are already fitting a non-linear model using the `nls()` or `nlsLM()` functions, why do we need to additionally transform the adstock to account for saturation? I am referring to your earlier comment where you suggested using `nls(y ~ a + b * adstock(x, r)^p, …)`. Thanks!

AnalyticsArtist (post author): The non-linear fitting will give you the adstock. If you are superimposing a particular adstock prior to creating the model then you don’t need non-linear estimation. You can simply use your favorite linear function after having transformed the data the way you see fit.

But see, that is the problem. Namely, “transforming data the way you see fit” is a very biased way to create models. Some people even call this cheating because you continue transforming and modeling till you find a model that fits a preconceived conclusion.

Satish: A question on the exponential decay adstock function. If we use that decay pattern, Alpha * (1 - e^(-Beta * X)), with X being ad spend or ad units, then for any value of X greater than 10 the adstock is never more than 1 unit. Is that appropriate? All ads are always > 100 units. I see that you mentioned that the max value of adstock is Alpha. That means Alpha has to be a large value. How do we arrive at Alpha?

Is the Alpha mentioned in the exponential decay function and the power function the same? I believe Alpha is the ‘decay’ or lag weight parameter and is usually between 0 and 1. Then Alpha cannot be large! I am a little confused; can you clarify, Gabriel? Thank you.

AnalyticsArtist (post author): That is OK. The level of the transformed data doesn’t matter. Alpha and Beta can be obtained through non-linear optimization functions like `nls()` in R. Also, Alpha is the saturation maximum for that variable.
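To illustrate the scale issue raised in the question, here is a small sketch with made-up numbers. With Alpha fixed at 1, a Beta of 1 saturates the curve for every advertising level above 100 units, while a Beta on the order of one over a typical advertising level spreads the curve across the data. The value 1/400 is an arbitrary illustrative choice; in practice Beta would be estimated.

```r
neg_exp <- function(x, alpha, beta) alpha * (1 - exp(-beta * x))

x <- c(100, 200, 400, 800)   # advertising units, all at or above 100

# beta = 1 is far too large for this scale: every value is already at alpha.
flat <- neg_exp(x, alpha = 1, beta = 1)

# beta on the order of 1 / (typical x) spreads the curve over the data.
curved <- neg_exp(x, alpha = 1, beta = 1 / 400)

round(rbind(flat, curved), 3)
```

The first row is 1 everywhere, so it carries no information about advertising. The second row rises from about 0.22 to about 0.86, and any overall level is then absorbed by Alpha or by the regression coefficient.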
