Marketing mix modeling (MMM) is a process used to quantify the effects of different advertising mediums, i.e. media, on sales. It is also used to optimize how a budget is spent across these mediums. The most common method is multiple regression analysis. The model also accounts for other variables such as pricing, distribution points and competitor tactics. This article explains the mathematics behind MMM by starting with a simple model and then adding complexity. I’ll also include R code so you can immediately reproduce the results.
Start Simple:
Let’s assume there is only one advertising variable that affects sales. This simple model is usually defined as:
$$\text{Sales} = \text{Base} + b_1 \cdot \text{Advertising}$$
There are two aspects to this model: (1) it is linear and (2) the Base is a constant. This is OK for now, as we’ll add more complexity later. However, I can quickly tell you that Base can include other variables to make it non-constant. Non-linearity will be introduced in a future blog post.
Sample R code:
sales <- c(37, 89, 82, 58, 110, 77, 103, 78, 95, 106, 98, 96, 68, 96, 157, 198, 145, 132, 96, 135)
ad <- c(6, 27, 0, 0, 20, 0, 20, 0, 0, 18, 9, 0, 0, 0, 13, 25, 0, 15, 0, 0)
modFit.0 <- lm(sales~ad)
summary(modFit.0)
This model has an $R^2$ of 0.184, so there is much work to be done.
Complexity 1: The Adstock Case
The model above assumes that advertising in week $t$ affects sales only in that same week. This is wrong and causes the advertising effect to be undervalued. Simply put, past ads can (and usually do) affect present and future sales. This carry-over effect of advertising can be controlled for with an adstock transformation, which I covered in a previous blog post.
Our model now becomes:
$$\text{Sales} = \text{Base} + b_1 \cdot f(\text{Advertising} \mid \alpha)$$
where $f()$ is the adstock transformation function for the Advertising variable given an adstock rate of $\alpha$. Other functional forms besides adstock can be incorporated here as well. Also notice that the order of observations matters for adstocking, because each period's adstocked value is built from the previous period's value.
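To make the transformation concrete, here is a minimal sketch of a geometric adstock written as an explicit loop. The helper name `adstock` is my own and purely illustrative; the recursion it implements, A[t] = x[t] + rate * A[t-1] starting from zero, is what a one-coefficient recursive `stats::filter` computes, so the two results should match.

```r
# Hypothetical helper (name is an assumption, not from a package):
# geometric adstock, A[t] = x[t] + rate * A[t-1], with A[0] = 0.
adstock <- function(x, rate) {
  out <- numeric(length(x))
  carry <- 0
  for (t in seq_along(x)) {
    out[t] <- x[t] + rate * carry  # this week's ad plus decayed carry-over
    carry <- out[t]
  }
  out
}

ad <- c(6, 27, 0, 0, 20, 0, 20, 0, 0, 18, 9, 0, 0, 0, 13, 25, 0, 15, 0, 0)

manual  <- adstock(ad, 0.5)
builtin <- as.numeric(stats::filter(ad, filter = 0.5, method = "recursive"))

all.equal(manual, builtin)  # TRUE
head(manual, 4)             # 6.0 30.0 15.0 7.5
```

The first four values show the carry-over at work: the 27 units in week 2 still contribute 15 in week 3 and 7.5 in week 4, even though no advertising ran in those weeks.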
With an adstock rate of 50% the R code is:
sales <- c(37, 89, 82, 58, 110, 77, 103, 78, 95, 106, 98, 96, 68, 96, 157, 198, 145, 132, 96, 135)
ad <- c(6, 27, 0, 0, 20, 0, 20, 0, 0, 18, 9, 0, 0, 0, 13, 25, 0, 15, 0, 0)
# stats::filter is called explicitly so that dplyr::filter cannot mask it
ad.adstock <- as.numeric(stats::filter(x=ad, filter=.50, method="recursive"))
modFit.1 <- lm(sales~ad.adstock)
summary(modFit.1)
Notice how we improved $R^2$ from 0.184 to 0.252.
Complexity 2: More Advertising Variables
It should be clear by now that I have been using advertising mediums and advertising variables interchangeably. From a modeling perspective, Advertising can be a paid media channel like TV, radio or banner ads, a non-paid media variable like social impressions or word-of-mouth, or a marketing campaign. When adding more variables, however, their units of measure need not be the same. Many measures can be used, including TRPs, GRPs, impressions or spend. I listed them in order of preference when available. Regardless of the unit of measure, in a statistical model they are all called advertising variables, and our model formulation becomes:
$$\text{Sales} = \text{Base} + \sum_{i=1}^{n} b_i \cdot f(\text{Advertising}_i \mid \alpha_i)$$
where $n$ is the number of advertising variables and $f()$ is the adstock transformation function for $\text{Advertising}_i$ with an adstock rate of $\alpha_i$, i.e. each advertising variable has its own alpha rate.
The R code for two advertising variables with adstock rates of 30% is:
sales <- c(37, 89, 82, 58, 110, 77, 103, 78, 95, 106, 98, 96, 68, 96, 157, 198, 145, 132, 96, 135)
ad1 <- c(6, 27, 0, 0, 20, 0, 20, 0, 0, 18, 9, 0, 0, 0, 13, 25, 0, 15, 0, 0)
ad2 <- c(3, 0, 4, 0, 5, 0, 0, 0, 8, 0, 0, 5, 0, 11, 16, 11, 5, 0, 0, 15)
# stats::filter is called explicitly so that dplyr::filter cannot mask it
ad1.adstock <- as.numeric(stats::filter(x=ad1, filter=.3, method="recursive"))
ad2.adstock <- as.numeric(stats::filter(x=ad2, filter=.3, method="recursive"))
modFit.2 <- lm(sales~ad1.adstock+ad2.adstock)
summary(modFit.2)
Now our model is even stronger, with an $R^2$ of 0.769.
Complexity 3: Changing Base & Other Variables
So far we have assumed the Base to be a constant, i.e. an intercept. I often get asked how to make the Base non-constant. The simple answer is that Base includes more than just the intercept. If you notice an increasing trend in Sales, then part of modeling is to create a trend variable. This trend variable gets added to the Base. Seasonal variables also sometimes get added to the Base. Finally, there is the idea of distribution points.
Distribution points account for the number of outlets (stores or online) where the product in question is sold. If a retailer, for example, doubles its number of stores, then we would expect its sales to increase not because of marketing but simply because more stores are available. Marketing plays a role, of course, but I think you get the point.
Finally, pricing & promotions are of prime importance. They too are variables to add to the model. However, these variables aren’t part of the base. Due to their complexity I’ll leave their discussion to a future blog post.
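Hence, our current model is now of the form:
$$\text{Sales} = a_0 + a_1 \cdot \text{Trend} + a_2 \cdot \text{Distribution} + \sum_{i=1}^{n} b_i \cdot f(\text{Advertising}_i \mid \alpha_i)$$
The sample data has no distribution variable, so the R code below adds only the trend term, again with adstock rates of 30%: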
sales <- c(37, 89, 82, 58, 110, 77, 103, 78, 95, 106, 98, 96, 68, 96, 157, 198, 145, 132, 96, 135)
ad1 <- c(6, 27, 0, 0, 20, 0, 20, 0, 0, 18, 9, 0, 0, 0, 13, 25, 0, 15, 0, 0)
ad2 <- c(3, 0, 4, 0, 5, 0, 0, 0, 8, 0, 0, 5, 0, 11, 16, 11, 5, 0, 0, 15)
trend <- 1:20
# stats::filter is called explicitly so that dplyr::filter cannot mask it
ad1.adstock <- as.numeric(stats::filter(x=ad1, filter=.3, method="recursive"))
ad2.adstock <- as.numeric(stats::filter(x=ad2, filter=.3, method="recursive"))
modFit.3 <- lm(sales~trend+ad1.adstock+ad2.adstock)
summary(modFit.3)
Our final model’s $R^2$ is 0.940.
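One caveat when comparing these fits: in ordinary least squares, $R^2$ can never decrease when a variable is added to a model that already contains the others, so some of the improvement is mechanical. A rough way to check, sketched below, is to look at adjusted $R^2$ side by side for the models with and without the trend term. The helper `fit.stats` and the object names are my own, used only for illustration.

```r
sales <- c(37, 89, 82, 58, 110, 77, 103, 78, 95, 106, 98, 96, 68, 96, 157, 198, 145, 132, 96, 135)
ad1 <- c(6, 27, 0, 0, 20, 0, 20, 0, 0, 18, 9, 0, 0, 0, 13, 25, 0, 15, 0, 0)
ad2 <- c(3, 0, 4, 0, 5, 0, 0, 0, 8, 0, 0, 5, 0, 11, 16, 11, 5, 0, 0, 15)
trend <- 1:20

ad1.adstock <- as.numeric(stats::filter(ad1, filter = 0.3, method = "recursive"))
ad2.adstock <- as.numeric(stats::filter(ad2, filter = 0.3, method = "recursive"))

# Same adstock rates in both models, so the first is nested in the second
fit.no.trend <- lm(sales ~ ad1.adstock + ad2.adstock)
fit.trend    <- lm(sales ~ trend + ad1.adstock + ad2.adstock)

# Hypothetical helper: pull R-squared and adjusted R-squared from a fit
fit.stats <- function(fit) {
  s <- summary(fit)
  c(r.squared = s$r.squared, adj.r.squared = s$adj.r.squared)
}

comparison <- rbind(no.trend   = fit.stats(fit.no.trend),
                    with.trend = fit.stats(fit.trend))
round(comparison, 3)
```

If adjusted $R^2$ rises along with $R^2$, the added variable is earning its place rather than just inflating the fit.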
Business Implications & Contributions
Aside from the statistical fit of our model, clients always ask about the business implications. This is usually referred to as the sales lift or uplift due to marketing, a.k.a. the contribution. The contribution in our model is the product of the adstocked advertising variable and its coefficient:
$$\text{Contribution}_i = b_i \cdot f(\text{Advertising}_i \mid \alpha_i)$$
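As a sketch of how the contributions could be computed from the final model, the code below multiplies each fitted coefficient by its adstocked variable, week by week. The object names (`fit`, `contrib`) are my own. Following the discussion above, I am assuming the intercept and trend together make up the Base.

```r
sales <- c(37, 89, 82, 58, 110, 77, 103, 78, 95, 106, 98, 96, 68, 96, 157, 198, 145, 132, 96, 135)
ad1 <- c(6, 27, 0, 0, 20, 0, 20, 0, 0, 18, 9, 0, 0, 0, 13, 25, 0, 15, 0, 0)
ad2 <- c(3, 0, 4, 0, 5, 0, 0, 0, 8, 0, 0, 5, 0, 11, 16, 11, 5, 0, 0, 15)
trend <- 1:20

ad1.adstock <- as.numeric(stats::filter(ad1, filter = 0.3, method = "recursive"))
ad2.adstock <- as.numeric(stats::filter(ad2, filter = 0.3, method = "recursive"))

fit <- lm(sales ~ trend + ad1.adstock + ad2.adstock)
b <- coef(fit)

# Weekly contribution of each component: coefficient times (adstocked) variable
contrib <- data.frame(
  base = b[["(Intercept)"]] + b[["trend"]] * trend,  # assumption: Base = intercept + trend
  ad1  = b[["ad1.adstock"]] * ad1.adstock,
  ad2  = b[["ad2.adstock"]] * ad2.adstock
)

# The components add back up to the model's fitted sales
all.equal(unname(rowSums(contrib)), unname(fitted(fit)))

# Each component's share of total fitted sales over the period
round(colSums(contrib) / sum(fitted(fit)), 3)
```

Because the model is linear, the weekly contributions sum exactly to the fitted sales, which makes the decomposition easy to present as shares of the total.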
Final Remarks & a Challenge:
You can see now that Marketing Mix Modeling is a business term for regression analysis on transformed variables. Any decent data scientist or statistician can do the job. However, it is important to note that the mix in Marketing Mix refers to the different mediums, media, campaigns or variables and their effects on sales. This is in contrast to mixed effects models, which measure the effect of one variable across many different levels, as in DMA-level modeling. Mixed effects models can be used instead of multiple regression analysis when dealing with multiple geographies, like DMAs, but the two uses of "mixed" refer to different things, which I thought was worth calling out.
The challenge that faces all statistical analyses is data, as preparing it is often said to be 80% of the work. While that can be taken care of by data personnel, there is still one challenge that escapes many: what adstock rate should each advertising variable get? This is harder than it sounds, and it goes beyond basic statistics. Modelers don’t only have to worry about a particular adstock rate being statistically valid; they also have to choose among different adstock rates that produce different contributions, all of which are statistically valid as well. One reason for this is that the ultimate consumer of MMM results is a human. The model that makes the “most sense” (however that is defined) can trump the most accurate model. HBR has a good article about this problem. My recommendation for such scenarios is to track the model’s fit statistics at each decision point in the modeling process. The modeler or data scientist can then show the decision maker that choosing a higher contribution will make $R^2$ drop, for example, from 90% to 70%, and leave the final decision to the business users.
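One way to track fit statistics across adstock choices is a simple grid search, sketched below under my own assumptions: the grid of candidate rates (0.1 to 0.9 in steps of 0.2) is arbitrary, and the column names are illustrative. For every pair of rates it refits the final model and records $R^2$ along with each variable's total contribution, so the trade-off between fit and contribution can be laid out for the decision maker.

```r
sales <- c(37, 89, 82, 58, 110, 77, 103, 78, 95, 106, 98, 96, 68, 96, 157, 198, 145, 132, 96, 135)
ad1 <- c(6, 27, 0, 0, 20, 0, 20, 0, 0, 18, 9, 0, 0, 0, 13, 25, 0, 15, 0, 0)
ad2 <- c(3, 0, 4, 0, 5, 0, 0, 0, 8, 0, 0, 5, 0, 11, 16, 11, 5, 0, 0, 15)
trend <- 1:20

# Candidate adstock rates (an arbitrary illustrative grid)
rates <- seq(0.1, 0.9, by = 0.2)
grid <- expand.grid(rate1 = rates, rate2 = rates)
grid$r.squared <- NA_real_
grid$contrib1 <- NA_real_
grid$contrib2 <- NA_real_

for (k in seq_len(nrow(grid))) {
  x1 <- as.numeric(stats::filter(ad1, filter = grid$rate1[k], method = "recursive"))
  x2 <- as.numeric(stats::filter(ad2, filter = grid$rate2[k], method = "recursive"))
  fit <- lm(sales ~ trend + x1 + x2)
  grid$r.squared[k] <- summary(fit)$r.squared
  # Total contribution over the period: coefficient times adstocked variable
  grid$contrib1[k] <- sum(coef(fit)[["x1"]] * x1)
  grid$contrib2[k] <- sum(coef(fit)[["x2"]] * x2)
}

# Best-fitting combinations first
head(grid[order(-grid$r.squared), ], 5)
```

Sorting by $R^2$ is only one possible view; sorting by a contribution column instead shows how much fit is given up to reach a contribution the business finds more plausible.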