Hard Hat Stats: Some Common and Uncommon Sense (Part 1)
By Kevin Gray
I’m not a scholar – just a lunch pail guy – but I do have more than 30 years’ experience as a marketing researcher and statistician. I’d like to share some tips I’ve learned on my journey.
Be careful about making assumptions. Period.
Be focused on decisions, not technology. I’ve turned down work because I realized that all that was really needed were a few questions and a couple of cross tabs, not the fancy analytics I love. Technology can help or hinder – it’s not a one-way street.
Do your homework. Try to put yourself in your client’s shoes and understand their business, their point of view, how they make decisions and the information they use to make decisions.
Think seriously about how marketing research (MR) quality impacts decision quality. Below a certain quality threshold, the risk of bad decisions can skyrocket. Where that threshold is depends on the circumstances, but we seldom know in advance where it is. “70%” may be good enough…or a disaster. Doing things on the cheap can be very expensive.
Remember that marketing and advertising are much more than direct marketing and targeted ads. Also, don’t forget that how a product is marketed is usually at least as important as the product itself. Perhaps our best customers are our best customers in part because we haven’t been badgering them with targeted ads and emails!
Recognize that Statistical Science is not just math and programming, or point-and-click software. It’s part science, part art, and multifaceted. Experience and subject matter expertise are very important. Moreover, the field is developing at an accelerating pace, and nowadays statisticians must know how to use dozens of machine learners, such as neural nets and support vector machines (SVMs), in addition to new statistical methods.
Avoid cookie cutters. For example, machine learners are often better at pattern recognition than the statistical methods most of us are familiar with but are usually very difficult to interpret. They don’t tell us much about the Why or How. Even more crucially, nearly any method has many options and requires a number of choices. Always going with the default settings is very bad practice.
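To make the point about settings concrete, here is a minimal, purely illustrative sketch (synthetic data, hand-rolled nearest-neighbor classifier – not any particular package’s defaults): the number of neighbors k is a choice the analyst must make, and an unexamined choice like k=1 memorizes the training data perfectly while telling us nothing about how the model will generalize.

```python
import random

def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x."""
    neighbors = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = [label for _, label in neighbors]
    return max(set(votes), key=votes.count)

random.seed(42)
# Two noisy, heavily overlapping classes on the number line (simulated).
train = [(random.gauss(0, 1), "A") for _ in range(50)] + \
        [(random.gauss(1, 1), "B") for _ in range(50)]

def train_accuracy(k):
    hits = sum(knn_predict(train, x, k) == label for x, label in train)
    return hits / len(train)

# With k=1 every training point is its own nearest neighbor, so the
# model "fits" the training data perfectly no matter how noisy it is.
print(train_accuracy(1))   # 1.0 – looks great, means little
print(train_accuracy(15))  # lower, but a more honest picture
```

The choice of k here stands in for the many tuning decisions – regularization, kernels, tree depth, stopping rules – that any real method involves, and that defaults quietly make for you.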
Be wary of sweatshop statistics…cranking out zillions of models in mechanical fashion with only cursory checking and little effort to see if they make sense and will be of use to decision makers. Caveat Emptor.
Interpret data directionally, not absolutely. Even in extraordinary cases when data are near “perfect” – i.e., when sampling, coverage, non-response and measurement errors are trivial – data speak most clearly to us when placed in context. Look for patterns in data, not just for statistically significant differences between groups of respondents or customers. Meaning is in the mind, not in the math.
Understand that accurate predictive models rarely need to be built on all the data we have. Millions of records usually aren’t needed for model building and samples are typically sufficient. Statistical inference, after all, was developed primarily for small samples – teeny-tiny data by today’s standards. Moreover, some “big data” aren’t really very big, just wide, with the same variables repeated over and over again. Often transaction data, for example, can be collapsed into weekly or monthly periods for the analyses we need to run.
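A quick simulated sketch of why samples usually suffice (synthetic “transaction amounts,” not real data): a 1% random sample recovers the mean of a million skewed values to within roughly a percent.

```python
import random
import statistics

random.seed(1)
# One million simulated transaction amounts – skewed, as spend data often are.
population = [random.lognormvariate(3, 1) for _ in range(1_000_000)]

full_mean = statistics.fmean(population)
sample_mean = statistics.fmean(random.sample(population, 10_000))

# The 10,000-record sample estimates the mean of the full file closely.
print(round(full_mean, 2), round(sample_mean, 2))
```

For estimating means, proportions, or fitting most models, the gain from the remaining 99% of records is usually negligible – which is exactly why statistical inference was built for small samples in the first place.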
Don’t forget that marketing is also about changing behavior, not just predicting it, and don’t underestimate how difficult predicting the behavior of individual consumers actually is. Regression to the mean can ruin more than our day…I sometimes use a machine learner for making predictions and a more complex statistical procedure on a sub-sample of the same data to help me understand the Why that is driving the What and What’s Next.
Think multivariate. Most effects have multiple causes, and just looking at a total column or cross tab may be very misleading. A lot of good data go to waste because they aren’t analyzed beyond simple cross tabs and graphics…descriptive information is not insight!
Understand that even complex statistical models are simplified representations of reality. This is one reason why automated modeling can be so risky – it’s not at all uncommon for two or more models to be equivalent statistically but suggest very different courses of action for decision-makers. A mere number-cruncher wouldn’t know which to recommend or whether it might be better to go back to the drawing board.
Avoid over-interpreting computer simulations. Even when based on highly sophisticated mathematical models, computer simulations are computer simulations, not reality. We also should remember that marketing research is not engineering and that most simulations we run are very rough compared to those conducted by engineers and scientists.
Don’t personify AI. The hearts and minds of Artificial Intelligence are computer programs. Robots, however charming, will never be able to think the way we do or feel emotions as we do. Being able to recognize human emotions and respond to them is not the same as actually feeling emotions. To my knowledge, AlphaGo did not revel in its victory over Lee Sedol. Your cat or dog is much more human than any machine will ever be.
Hope you find this helpful!