Data Science today can learn something from the macroeconomics of the 1970s
In 1976, Robert Lucas critiqued macroeconomic forecasting by arguing that models based on aggregate data needed to be replaced with ones estimated from individual-level choices. Data Science today needs to evolve in a similar fashion. Three practices would help:
- Move beyond the Business Intelligence style of correlation-based analysis
- Make decisions based on individual user behavior, rather than population averages
- Update analyses in real time
Macroeconomics in the 1970s was a field very much in flux – forecasting growth, unemployment, and the effects of monetary policy was a pressing concern; aggregate data on economic conditions was becoming more accurate and extensive; and better computers allowed ever more complex regression analyses to be run. But the predictions of these statistical forecasts proved unreliable for policymakers. As a famous example, the Phillips curve captured a negative historical correlation between unemployment and inflation – high inflation tended to coincide with periods of low unemployment, and vice versa. But when central banks reacted by pursuing inflationary policies, unemployment didn’t drop!
Future Nobel laureate Robert Lucas provided an explanation for this sort of forecasting failure in a seminal 1976 paper, in an argument that came to be known as the “Lucas critique”. His point was essentially that contemporary macroeconomic forecasting confused correlation with causation – high inflation today may be correlated with low unemployment tomorrow, but that doesn’t mean the former causes the latter.
Lucas argued that forecasting models had to be revamped. Models estimating correlations between aggregate variables would be thrown out. In their place would be models capturing individual-level economic choices, with data used to estimate the “structural parameters” governing their behavior. These structural parameters would ideally be stable across time and environments, particularly following an intervention by a government or central bank. Only then could forecasting models really become useful for predicting the results of policy changes.
We think the data science world of today is facing many of the same challenges confronted by macroeconomists of the 1970s. The field is overdue for a Lucas critique of its own to point the way toward a new generation of models providing better predictions and more successful and cost-effective interventions.
Data Scientists need to get to the atomic level
The world of Data Science has spent the last two decades on a path mirroring that of macroeconomists a quarter century before them. As Business Intelligence tools have improved and hiring data scientists has become common practice, business analytics has grown into a sophisticated effort to find “actionable insights”. In other words, what factors drive user behavior, and how can companies use them to proactively engage with customers or craft policies that drive desired outcomes?
This analysis, though often sophisticated in execution, is conceptually simple: for a desired outcome, are there user traits, either demographic (who is this person?) or behavioral (how has this person interacted with my product?), that predict a user’s likelihood to engage or purchase? The standard approach works with broad characteristics and coarse demographic bins. A typical Business Intelligence result from Online Advertising looks like this: “Women aged 25-35 in California clicked on this ad at a rate 2x higher than Men aged 55+ in Wisconsin.”
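To make the flavor of this analysis concrete, here is a minimal sketch in Python/pandas of the segment-level roll-up described above. The `events` table and its columns (`gender`, `age_band`, `state`, `clicked`) are hypothetical placeholders, not a real schema.

```python
# A minimal sketch of segment-level Business Intelligence analysis.
# The `events` dataframe and its columns are illustrative assumptions.
import pandas as pd

events = pd.DataFrame({
    "gender":   ["F", "F", "M", "M", "F", "M"],
    "age_band": ["25-35", "25-35", "55+", "55+", "25-35", "55+"],
    "state":    ["CA", "CA", "WI", "WI", "CA", "WI"],
    "clicked":  [1, 0, 0, 0, 1, 1],
})

# Aggregate click-through rate per coarse demographic segment --
# the kind of result quoted in the example above.
segment_ctr = (
    events.groupby(["gender", "age_band", "state"])["clicked"]
          .mean()
          .rename("click_rate")
          .reset_index()
)
print(segment_ctr)
```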
Ubiquitous collection of demographic and behavioral data on users has made such insights much easier to obtain. However, it has also made possible much more powerful analyses. With data collection becoming ever more granular, we now have a substantial amount of “atomic-level” user data. This data enables us to learn about individuals, and not just population averages. Now we know not just that “Men 25-35 in California tended to buy at a rate of 0.1%”, but that our individual customer Erik Madsen “has bought a Plush Stuffed Animal every time he was shown an ad for one.”
This atomic-level data unlocks more tailored interventions. (An “intervention” is any action a company takes to affect customer behavior.) Now when we want to boost sales via promotion, instead of advertising a discount on our Plush Animals to “All Men aged 25-35 in California”, we can more precisely target “All Men aged 25-35 in California who have purchased our product at least 3 times in the last month”. This benefit of more granular analysis is clear and well-understood.
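As a sketch of what this sharper targeting rule might look like in practice, the snippet below filters a hypothetical user-level table. The `users` dataframe and the `purchases_last_30d` column are illustrative assumptions, not a prescribed data model.

```python
# Targeting at the atomic level: same demographic segment as before, but
# filtered on each user's own recent purchase history (hypothetical data).
import pandas as pd

users = pd.DataFrame({
    "user_id":            [1, 2, 3, 4],
    "gender":             ["M", "M", "M", "F"],
    "age":                [28, 31, 29, 27],
    "state":              ["CA", "CA", "CA", "CA"],
    "purchases_last_30d": [4, 1, 3, 5],
})

# Old rule: everyone in the coarse segment.
segment = users[(users.gender == "M")
                & users.age.between(25, 35)
                & (users.state == "CA")]

# Atomic-level rule: same segment, but only users whose own purchase
# history signals high responsiveness.
target = segment[segment.purchases_last_30d >= 3]
print(target.user_id.tolist())
```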
A less obvious opportunity afforded by atomic-level user data is the design of interventions that avoid “attenuation.” We use the term attenuation to describe a commonly observed discrepancy between the predicted and actual effectiveness of an intervention. In particular, models based on aggregate correlations in customer data will typically over-predict the success of an intervention, because, as Lucas pointed out, those correlations will usually weaken when the environment changes. By contrast, when we model individual-level behavior, we are less likely to confuse correlation with causation and more likely to identify patterns in user behavior that hold up after an intervention. This strategy is more akin to modern structural macroeconomic modeling, in contrast to pre-Lucas-critique forecasting based on aggregates. In the ideal, atomic-level modeling resolves the problem identified by the Lucas critique.
Why is atomic-level analysis the answer? Recall Lucas’s original prescription: anchor your analysis around behavioral patterns that will remain invariant across environments. It’s true that user invariants in a Business Intelligence context will be more situation-specific than the universal invariants of Economics (e.g. “Consumers save more when interest rates rise”). Even so, individual user traits are much more likely to be stable across time and environments than characteristics of an entire demographic segment. User traits look more like structural parameters and less like aggregate correlations the more we drill down toward the atomic level.
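One way (though certainly not the only way) to nudge user traits toward something more like structural parameters is to estimate a propensity for each individual, shrinking users with sparse histories toward the segment average rather than relying on the aggregate alone. The sketch below uses a simple empirical-Bayes-style blend; the data and the prior strength are made-up illustrations.

```python
# Per-user propensity estimates that shrink sparse histories toward the
# segment average. All numbers here are invented for illustration.
import pandas as pd

history = pd.DataFrame({
    "user_id":   [1, 1, 1, 2, 2, 3],
    "purchased": [1, 1, 1, 0, 1, 0],   # outcome of each ad impression
})

segment_rate = history.purchased.mean()   # the aggregate correlation
prior_strength = 5.0                      # assumed "pseudo-impressions"

per_user = history.groupby("user_id")["purchased"].agg(["sum", "count"])
per_user["propensity"] = (
    (per_user["sum"] + prior_strength * segment_rate)
    / (per_user["count"] + prior_strength)
)
print(per_user["propensity"])
```

The point of the shrinkage is practical: a user seen three times shouldn’t get a propensity of exactly 100% or 0%, but their own history should still pull the estimate away from the segment average.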
Atomic-level modeling in action
A concrete example will help illustrate the concepts we’ve been discussing. Consider a subscription media company, The Political Scientist, working to reduce churn. Their analytics team starts by looking for characteristics of subscribers vulnerable to churn, and discovers that failing to sign up for auto-renewal significantly increases the chance a subscription lapses in the next cycle. They propose that the marketing team offer discounts for signing up for auto-renewal to increase customer retention.
This isn’t a crazy proposal by any means. Many companies offer such discounts, and their effectiveness lines up with results from behavioral science on the power of default options. What could go wrong with making decisions from this sort of aggregate analysis? An obvious shortcoming is that without any targeting, the scheme will be expensive – The Political Scientist will offer discounts to plenty of subscribers who wouldn’t have churned even at full price. More granular analysis can reduce this cost by identifying, with greater confidence, subscribers whose combination of characteristics signals a high churn risk.
But there’s an additional, more subtle attenuation effect that is just as important. Suppose subscribers in our example have two relevant individual characteristics – “auto-renewal status” and “number of articles viewed on The Political Scientist website”. It’s natural that subscribers may self-sort into auto-renewal based on how often they read The Political Scientist. So consider the behavior of a subscriber who’s offered a discount to sign up for auto-renewal. Subscribers who only read sporadically will be happy to take advantage of the deal. But if they decide they’re not getting enough value out of the site, they may still go to the trouble of cancelling. The correlation between auto-renewal and churn would then be attenuated once the marketing team offers a discount. In other words, auto-renewal numbers may spike without yielding the drop in churn predicted prior to the intervention. Discounting could turn out to be not only expensive but ineffective at influencing the desired metric. This is the Lucas critique in action in a Business Intelligence context.
Let’s see how atomic-level modeling improves the situation. Suppose The Political Scientist marketing team tailors the intervention to target only subscribers who are also consistent users of the website. It’s likely these users will get enough benefit from the site going forward that they won’t bother cancelling once they’re signed up for auto-renewal. Targeting discounts to this group will lead to a much less attenuated effect than would a mass intervention across the entire subscriber base. In the language of the Lucas critique, drilling down to the atomic level helps pick out so-called “structural parameters” of user behavior which remain stable after an intervention.
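A small simulation helps make the attenuation story tangible. In the sketch below, churn is driven by reading habits rather than by auto-renewal itself, so a blanket discount inflates auto-renewal numbers while weakening the auto-renewal/churn correlation, whereas targeting consistent readers preserves it. All of the numbers (churn probabilities, reading behavior, enrollment rates) are invented; only the qualitative pattern matters.

```python
# Illustrative simulation of attenuation in the subscription example.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# "Structural" trait: how often each subscriber actually reads the site.
heavy_reader = rng.random(n) < 0.4

# In this sketch, churn depends on reading habits, not on auto-renewal itself.
churn_prob = np.where(heavy_reader, 0.05, 0.40)
churn = rng.random(n) < churn_prob

# Before the promotion, heavy readers self-sort into auto-renewal.
auto_renew_before = heavy_reader & (rng.random(n) < 0.8)
print("pre-intervention churn, auto-renew on :", churn[auto_renew_before].mean())
print("pre-intervention churn, auto-renew off:", churn[~auto_renew_before].mean())

# Blanket discount: many light readers also enroll, but their churn risk
# is unchanged, so the auto-renewal/churn relationship attenuates.
auto_renew_after = auto_renew_before | (rng.random(n) < 0.6)
print("blanket-discount churn, auto-renew on :", churn[auto_renew_after].mean())

# Targeted discount: only consistent readers are offered the deal,
# so the pre-intervention pattern holds up.
targeted = auto_renew_before | (heavy_reader & (rng.random(n) < 0.6))
print("targeted-discount churn, auto-renew on:", churn[targeted].mean())
print("overall churn (unchanged by enrollment):", churn.mean())
```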
Keeping up with how quickly your data is changing is keeping up with your business
Analyzing user behavior at the atomic level greatly improves the accuracy and cost-effectiveness of your models. But it is not enough merely to cut data at the atomic level. User behavior can change rapidly as business conditions evolve. It is therefore necessary to track analytics “in real time” to keep up with these changes.
Here are two simple examples we’ve noticed:
- An individual’s propensity to buy tickets for a sports team’s games depends on the team’s recent win/loss record.
- A user’s willingness to pay for a ride-share company’s product depends on whether a competitor is running a promotion.
Marketing, sales, and client service professionals can rattle off myriad similar examples that litter their daily work.
Interestingly, though some of these shifts are observable, others are not. Any sports team knows its win/loss record. On the other hand, very few companies have data on a competitor’s marketing plans. Regardless, both observed and unobserved events impact the effectiveness of interventions. This illustrates the need to keep atomic-level models updated in “real time” to keep pace with changes in user behavior. Otherwise, stale analytics introduce another source of attenuation and dilute the gains from modeling at the atomic level. If you’re not keeping up with how quickly your data is changing, you’re not keeping up with your business.
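What “real time” might look like at the atomic level can be as simple as an exponentially weighted update of each user’s propensity, so that recent behavior outweighs stale history. The decay rate and the event stream below are illustrative assumptions, not a recommendation for any particular system.

```python
# A minimal sketch of keeping an atomic-level estimate fresh with an
# exponentially weighted running update (all values are illustrative).
from collections import defaultdict

DECAY = 0.9                       # weight kept by the old estimate each update
propensity = defaultdict(float)   # per-user purchase propensity, starts at 0.0

def update(user_id: str, purchased: bool) -> None:
    """Blend the newest observation into the running per-user estimate."""
    old = propensity[user_id]
    propensity[user_id] = DECAY * old + (1 - DECAY) * float(purchased)

# Stream of (user, outcome) events arriving as they happen.
for user, bought in [("erik", True), ("erik", True), ("erik", False), ("dana", False)]:
    update(user, bought)

print(dict(propensity))
```

The design choice here is the decay rate: a lower `DECAY` reacts faster to shifts like a losing streak or a competitor’s promotion, at the cost of noisier estimates.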