Hard Hat Stats: Some Common and Uncommon Sense (Part 3)
By Kevin Gray
I’m not a scholar – just a lunch pail guy – but I do have more than 30 years of experience as a marketing researcher and statistician. I’d like to share some tips I’ve learned on my journey.
Embrace ambiguity, or at least learn to be tolerant of it! Marketing research draws heavily from the social and behavioral sciences, which lack quantifiable natural laws. It’s not Engineering. That said, I should note that Statistics is nowhere near as cut and dried as it might appear from an introductory course. At the more advanced levels there is often little consensus among statisticians as to what works best in similar situations.
Question legacy practice. Example: A still-popular way of conducting consumer segmentation is to perform a principal component factor analysis on a large number of values, attitudes and lifestyle ratings (“psychographics”) and then cluster the factor scores with K-means or hierarchical cluster analysis. The clusters are then cross-tabulated against demographics, purchase behavior and other key marketing variables. This practice, which I call the “Cluster & Pray” method, was already under criticism when I began my MR career in the 1980s and I have seldom found it useful. The odds of obtaining a useful segmentation are much better if we use attitudinal statements known to be related to consumer behavior (though there are now more sophisticated approaches to segmentation we can use, too).
Watch out for the Devil in the details. Example: Though it isn’t the end of the world if you use statistical methods intended for cross-sectional data on time-series data, it isn’t good practice and can lead us astray. If you’re new to time-series analysis, here’s a primer.
Don’t focus on a small slice of consumers based on preconceptions – not evidence – that they are “the target.” This runs up costs and can lead to very bad decisions.
Remember that all variables in a principal components (“factor”) analysis will have at least some weight in the computation of factor scores. As an illustration, say only two variables out of 30 load heavily on a factor and we name that factor based on these two variables. In reality, they may only account for a small percentage of the variance of that factor’s score, thus the label we gave the factor is misleading and the conclusions we draw from the analysis may be incorrect. This is a very common mistake in MR.
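A rough sketch of the arithmetic behind this point, assuming an unrotated principal component and made-up loadings (the values 0.70 and 0.25 are hypothetical, not from any real study). For an unrotated component, the variance of the score equals the sum of squared loadings, so one reasonable way to apportion that variance is each variable's squared loading divided by the total. Rotated solutions or other scoring methods would need a different calculation.

```python
# Hypothetical loadings on one unrotated principal component:
# two variables load heavily, the other 28 load only modestly.
heavy_loadings = [0.70, 0.70]
modest_loadings = [0.25] * 28
loadings = heavy_loadings + modest_loadings

# For an unrotated component, the sum of squared loadings is the
# component's eigenvalue, i.e. the variance of its score.
score_variance = sum(l ** 2 for l in loadings)

# Share of that variance attributable to the two "naming" variables.
heavy_share = sum(l ** 2 for l in heavy_loadings) / score_variance

print(f"Variance of the component score: {score_variance:.2f}")
print(f"Share from the two heavy loaders: {heavy_share:.1%}")
```

Under these assumed numbers the two variables we would have used to name the factor account for well under half of the score's variance; the 28 "minor" variables jointly carry the rest.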
Remember that brand mapping is not just correspondence analysis with our software’s default settings – essentially a clerical task. Using different options may have a substantial impact on the map and there are many good ways to do mapping besides correspondence analysis.
Don’t forget that adding a predictor to a regression will nearly always increase the model R square. The adjusted R square, which compensates for model complexity, is usually more meaningful. (There are many other indices as well.) Also, R square is a proportion, so if someone reports that the new model has an R square “25% higher”, does this mean an increase from .25 to .50, for example, or a less spectacular improvement from .25 to .31?
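To make both points concrete, here is a minimal sketch using the standard adjusted R square formula, 1 - (1 - R²)(n - 1)/(n - p - 1). The sample size and predictor counts are hypothetical and chosen only for illustration.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R square: R square penalized for the number of predictors."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

old_r2 = 0.25

# Two readings of "25% higher":
absolute_reading = old_r2 + 0.25   # 25 percentage points higher
relative_reading = old_r2 * 1.25   # 25 percent higher in relative terms

print(f"Absolute reading: {old_r2:.2f} -> {absolute_reading:.2f}")
print(f"Relative reading: {old_r2:.2f} -> {relative_reading:.4f}")

# Hypothetical sample: 200 respondents, 5 predictors before and 6 after.
n_obs = 200
print(f"Adjusted R square, old model: {adjusted_r2(old_r2, n_obs, 5):.4f}")
print(f"Adjusted R square, new model: {adjusted_r2(relative_reading, n_obs, 6):.4f}")
```

The adjusted figure is always a little below the raw R square, and the gap widens as predictors are added relative to the sample size.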
Ignore standardized regression coefficients for dummy variables. It’s hard to think of a case where interpreting a proportion in terms of standardized units is meaningful. More to the point, we aren’t truly standardizing since the variance of a proportion decreases as you move away from .5 towards 1 or 0.
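A small sketch of why this matters, using the usual definition of a standardized coefficient (raw coefficient times the predictor's standard deviation, divided by the outcome's standard deviation). A 0/1 dummy with proportion p has variance p(1 - p), so the same raw effect yields different standardized values depending only on how the sample happens to split. The raw coefficient and outcome standard deviation below are hypothetical.

```python
import math

def dummy_sd(p):
    """Standard deviation of a 0/1 dummy variable with proportion p of ones."""
    return math.sqrt(p * (1 - p))

def standardized_coef(raw_coef, p, sd_y):
    """Standardized coefficient for a dummy: raw_coef * sd(x) / sd(y)."""
    return raw_coef * dummy_sd(p) / sd_y

# Hypothetical values: the group difference is 2.0 units in every scenario.
raw_coef = 2.0
sd_y = 4.0

for p in (0.5, 0.7, 0.9):
    beta = standardized_coef(raw_coef, p, sd_y)
    print(f"p = {p:.1f}: sd of dummy = {dummy_sd(p):.3f}, standardized coef = {beta:.3f}")
```

The raw difference between the two groups never changes here, yet the standardized coefficient shrinks as the split moves away from 50/50.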
Don’t mix apples and oranges, e.g. “Age group has more impact on purchase frequency than gender.” Does this mean that including age group in the model improved model fit more than adding gender did? If so, then we should say so. Also, categorical variables such as age group usually have more than two categories and require more than one column in the data file, so adding or deleting these variables will tend to have more impact on a model than adding or deleting variables occupying just one column.
Realize that “importance” is in the eye of the decision-maker. Unfortunately, decision-makers are often not clear what they mean by “important” and it can connote several things to statisticians. To decision-makers it often implies impact on the bottom line but, when asked about importance, statisticians should pin down specifically what the questioner has in mind. This can be a very hard question and is not something a computer can answer for us. For example, a profitable consumer segment may be distinguished from other consumers by just a few variables. This means, however, that the overall discriminatory power of these variables will be small because they don’t vary systematically among other consumers – in that sense, they are not important. Statisticians need to tread carefully when they use the word important themselves.
Remember that most “organic” data are not really organic. Finding the trees and plucking the fruit costs time and money. Moreover, the nutritional value of this fruit to the business should not be taken for granted…
Learn how to communicate with statisticians! Statisticians can come across as geeky purists preoccupied with mathematical minutiae. For some this is a fair assessment. However, what might seem like technical trivia to non-statisticians may actually have a substantial impact on the bottom line. If you are ever in doubt, ask your statistician specifically how the details they’re fretting about might affect the decisions at hand. Be patient – these things are often hard to explain in words. You may be very happy you asked!
Learn how to communicate with decision-makers. This will be obvious to most marketing researchers, but the best research in the world will be the worst research in the world if its results and implications are badly communicated. Don’t try to dazzle clients with tech talk. Also, let’s not forget that we’re not professional entertainers. Don’t put your audience to sleep, but tell them what they need to know simply and clearly.
Avoid thinking of new methods and new data sources as complete replacements for “traditional” marketing research. They are supplements and complements to it and should be welcomed rather than feared or seen as panaceas. Progress is rarely either/or.
Don’t over-react to hype. Believing everything we hear is risky but so is rejecting everything we hear. True, a lot of claims about disruptive innovation are downright silly, but that doesn’t mean it’s OK to just stick to what we know. I’m no soothsayer but my bet is that the world and MR will be very different in 15 years, and I want to be ready for it.
Hope you find this helpful!