Psychohistory & VAST Data: An Interview with 2011 Parlin Award Winner Steve Cohen
Earlier this month it was announced that Steven H. Cohen, Co-founder & Partner of In4mation Insights, had been named the recipient of the American Marketing Association’s (AMA) 2011 Charles Coolidge Parlin Marketing Research Award.
This award recognizes Mr. Cohen’s substantial contributions and unwavering dedication to the ongoing advancement of marketing research practice. The Charles Coolidge Parlin Marketing Research Award ceremony took place on Tuesday, September 13, 2011, here at the AMA Research and Strategy Summit Conference in Orlando, FL.
I had the opportunity to chat with Steve about his award and the industry as a whole both before & during the conference, and it’s been a profound honor to get to know one of the true giants of market research. We delve into a lot of interesting topics, and I think you’ll enjoy this very much!
LM: Congratulations on being the recipient of the 2011 Parlin Award! For those who may not be familiar with it, can you tell us what the Parlin Award is and what you were awarded it for?
SC: The award was established in 1945 by the Philadelphia Chapter of the AMA and The Wharton School in association with the Curtis Publishing Company to honor Charles Parlin, who is rightly called the “father of marketing research.” The award is given to persons who have made outstanding contributions to the field.
It has been given for 66 years now and some of the prior recipients include practitioners such as A.C. Nielsen, George Gallup, Peter Drucker, John Malec (founder of IRI), Rich Johnson (founder of Sawtooth Software), and Magid Abraham (founder of ComScore). Academics who have won the award include Paul Green, Glen Urban, John Hauser, Frank Bass, and Jordan Louviere.
I am humbled to get this award, let alone be mentioned in the same sentence as these illustrious and accomplished people.
The recipient must have demonstrated outstanding leadership and sustained impact on advancing the profession of marketing research over an extended period of time.
“Specifically, this transformational impact might be reflected in one or more of the following:
- New concepts, methods, and models for measurement and analysis that expand the capabilities of organizations to achieve a better understanding of markets, customers and consumers.
- Creative integration of existing methodologies and an understanding of information needs resulting in more widespread use and/or appreciation of marketing research.
- Demonstrated leadership resulting in stimulating the effective use and value of marketing research and market based knowledge.”
I have been fortunate to work with some fabulous colleagues whose insights and knowledge have helped me to grow and learn, and to make the contributions that I have over the years.
LM: Your career is one of the most distinguished in the industry and you’re widely considered a thought leader on applying advanced analytical models to enhance marketing effectiveness. What do you consider to be the most exciting recent development in that arena?
SC: Really two things, if I may.
The first is the adoption of Bayesian statistical modeling in the marketing profession. Bayesian models hold several advantages over classical analytic approaches.
First, they allow a more comprehensive and broad view of the drivers of behavior. They are good at integrating several different sources of data to provide a more in-depth understanding. For example, if we are doing a taste test of a new product, we could assume that:
- The chemicals that form the new product affect human sensory receptors;
- The consumer’s sensory experience translates into feelings and reactions to the product; and
- Such feelings and reactions have an influence on whether the consumer will buy the product or not.
It is clear that this process proceeds in a hierarchical fashion, with a stimulus causing a chain of reactions in the consumer. This hierarchical process encompasses phenomena at both the product (chemical) and the person (consumer) level. Bayesian models can easily estimate the effects of each element in the chain, taking into account the different levels of data available to the analyst.
Bayesian models are also good at incorporating the beliefs of the manager into the model. A simple belief is that price should have a negative effect on demand. More complicated beliefs are also possible.
And finally, Bayesian models can estimate very complicated models of behavior by using Monte Carlo simulation to solve for the effects of interest, rather than solving a bunch of equations, which may not even be solvable with ordinary math!
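To make the last two points concrete, here is a minimal sketch of a Bayesian estimate obtained by Monte Carlo simulation, with the manager's belief that price cannot raise demand built into the prior. Everything specific in it is an assumption for illustration only: the toy prices and log-demand figures, the fixed intercept and noise level, and the simple random-walk Metropolis sampler. It is not the approach used by any particular study or product mentioned in this interview.

```python
import math
import random


def log_posterior(b, prices, log_demand, intercept, noise_sd, prior_sd):
    """Unnormalized log posterior of the price coefficient b.

    Model (assumed for illustration): log_demand = intercept + b * price + noise.
    """
    if b >= 0:
        # Manager's belief encoded in the prior: price never increases demand.
        return float("-inf")
    log_prior = -0.5 * (b / prior_sd) ** 2
    log_lik = sum(
        -0.5 * ((y - (intercept + b * p)) / noise_sd) ** 2
        for p, y in zip(prices, log_demand)
    )
    return log_prior + log_lik


def sample_price_effect(prices, log_demand, intercept=5.0, noise_sd=0.5,
                        prior_sd=2.0, n_draws=5000, burn_in=1000, seed=0):
    """Draw from the posterior of b with a random-walk Metropolis sampler."""
    rng = random.Random(seed)
    b = -0.1  # starting value that satisfies the sign constraint
    current = log_posterior(b, prices, log_demand, intercept, noise_sd, prior_sd)
    draws = []
    for step in range(n_draws + burn_in):
        proposal = b + rng.gauss(0.0, 0.1)
        candidate = log_posterior(proposal, prices, log_demand,
                                  intercept, noise_sd, prior_sd)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            b, current = proposal, candidate
        if step >= burn_in:
            draws.append(b)
    return draws


# Hypothetical data: five price points and the observed log of units sold.
prices = [1.0, 1.5, 2.0, 2.5, 3.0]
log_demand = [4.3, 3.7, 3.5, 2.9, 2.6]

draws = sample_price_effect(prices, log_demand)
posterior_mean = sum(draws) / len(draws)
print(f"Posterior mean of the price effect: {posterior_mean:.2f}")
```

The sampler never solves an equation for the price effect. It simply proposes values, keeps those the data and the prior find plausible, and summarizes the retained draws.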
The second exciting development that I want to mention is the extension of simple models of choice to a broader set of choice situations. Marketing researchers often use choice models to understand the drivers of picking, say, one particular bottle of ketchup off the shelf. Choice-based conjoint and other discrete choice models are used for this situation.
Yet, there are many choice situations that are not a simple “pick one.” For example, I might buy three cans of tomato soup at the grocery, or perhaps a hundred of the same Dell PC for office use. In this case, the person is choosing one item, yet ordering multiple units. Another situation looks like the choice off a restaurant menu. People are typically picking multiple items (beverage, entrée, and dessert) yet only one unit of each item. And finally, there is the yogurt or pet food purchase situation, where not only are multiple flavors or variants being chosen, but also multiple units of each. Statistical models of these behaviors have only been developed over the past few years and are not well-known to the marketing research community.
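For reference, the simple “pick one” case is commonly modeled with a multinomial logit, where each option's choice probability grows with its utility. The sketch below assumes made-up utilities for three hypothetical ketchup bottles; in practice these utilities would be estimated from choice data rather than typed in by hand, and the richer multi-item and multi-unit situations described above need models beyond this one.

```python
import math


def choice_probabilities(utilities):
    """Multinomial logit: P(i) = exp(u_i) / sum over j of exp(u_j)."""
    max_u = max(utilities.values())  # subtract the max for numerical stability
    exp_u = {name: math.exp(u - max_u) for name, u in utilities.items()}
    total = sum(exp_u.values())
    return {name: value / total for name, value in exp_u.items()}


# Hypothetical utilities for three bottles on the shelf.
utilities = {"Brand A": 1.0, "Brand B": 0.5, "Store brand": 0.0}

for name, prob in choice_probabilities(utilities).items():
    print(f"{name}: {prob:.2f}")
```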
LM: What do you think the opportunity for market researchers may be in applying advanced analytics such as Bayesian models or behavioral economics to social media data or to the broader concept of “Big Data”? Do our current analytical toolkits even have applications within this new era of massive converged data sets, or do we need to develop new approaches or skill sets?
SC: Good questions, Len.
My impression is that social media data are still not ready for big time data analytics, because of the frailties and foibles of language. So many words in English (and in other languages I might add) have multiple meanings and, even with humans reading and coding posts, it is still difficult to accurately interpret what posters mean. However, a recent study I know of finds that online posts can be good predictors of consumer sentiment as measured in brand tracking studies.
As for “Big Data,” I find this term to be ill-defined. I read the recent McKinsey study on Big Data, and they fail to adequately define it. Instead, I prefer the term VAST data – data consisting of Variables, Attributes, Subjects (people), and Time – where one or more of these are in the thousands, millions, or billions.
We are accustomed to reducing the scope of such databases by sampling subjects (consumers) or through data reduction on variables. Sampling subjects (consumers) permits simpler computation, but also misses the nuances and idiosyncrasies available in the full dataset. We are able to use data reduction because the data space is well-populated. Yet in a typical VAST dataset, as the number of variables increases beyond the usual several dozen, any local place in the “cloud” of data with several thousand variables is most likely empty. So in contrast to small datasets, where there are clouds of data with relatively few outliers, in VAST databases, the data space is mostly empty and almost all data points are outliers!
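The “mostly empty” point can be illustrated with a small simulation. This sketch rests on assumptions chosen purely for illustration: 1,000 subjects scattered uniformly at random, with each variable split into just a “low” and a “high” half. It then counts what share of the resulting cells contains at least one subject.

```python
import random


def occupied_fraction(n_points, n_dims, seed=0):
    """Share of cells holding at least one point when each axis is halved."""
    rng = random.Random(seed)
    occupied = {
        # A cell is identified by which half (0 or 1) the point falls in per axis.
        tuple(int(rng.random() * 2) for _ in range(n_dims))
        for _ in range(n_points)
    }
    return len(occupied) / 2 ** n_dims


for dims in (2, 5, 10, 20):
    share = occupied_fraction(1000, dims)
    print(f"{dims:>2} variables: {share:.4%} of cells contain data")
```

With two variables there are only four cells and every one is filled. With twenty variables there are over a million cells, so 1,000 subjects can occupy at most a tenth of a percent of them, and nearly every subject sits alone.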
With VAST data, we will need both stronger computational power and statistical tools that are good with datasets that have lots of empty spaces. Luckily, computational power continues to grow apace and is available at lower prices than ever before. For example, I recently read about a computer at the University of Illinois that “predicts the future” and it runs at over 8 teraflops (a teraflop is one trillion floating point operations per second). For comparison, my company just bought a very affordable computer for use on large databases that runs at 1.3 teraflops, and can be expanded easily to run four times that speed. It’s remarkable that a company our size can afford such computational power.
To analyze VAST data, my company is banking heavily on Bayesian models as the way of the future. Such statistical models are complicated and the foundations of Bayesian models are not being taught on a regular basis in universities today. I suspect that it may take at least five years before many researchers have basic training in their use.
LM: There is a lot of discussion around emotional measurement as the “missing piece” in the consumer insight puzzle. First, what is your take on the idea? Second, how (if at all) can we combine the emotional measurement model with more traditional quantitative approaches?
SC: I have done a bit of research into emotional measurement and I do have a point of view. This is a really good idea that is in its early stages. Several means of such measurement are available, everything from measuring brain waves to measuring facial expressions. The brain wave measurements sound very cool, yet doing them on a grand scale has quite a ways to go.
On the other hand, facial emotion recognition systems were originally developed by Paul Ekman 40 years ago. This work required trained coders to record the movements of facial muscles, which were then translated into the associated emotions.
Recently, automated coding systems have been developed based on similar technology now used in defense and security applications. The systems that I am most familiar with code facial movements into six or seven broad categories of emotion. They do have difficulty with a common negative emotion like disgust – mostly because when a person is disgusted, they tend to move their head away and to the side – which of course wreaks havoc with the software that is trying to “read” the face.
A study by Michel Wedel at the University of Maryland used such a system to read facial expressions and translate them into emotions, by coding the muscle movements four times each second. There is the beginning of a VAST database! The cool thing is that he took these facial measurements while his research subjects were watching television commercials, which they were allowed to zap whenever they wanted. His work shows that emotional reactions like joy and surprise are highly related to the lack of zapping behavior, and he provides suggestions to advertisers on how to pace the emotional content of commercials so as to reduce the zaps. Naturally, and not surprisingly, Wedel used Bayesian statistics to analyze these data.
I have also been experimenting with the idea of using such emotional measurement tools with other common quantitative surveys, like concept evaluations and sensory tests. So I am bullish on the use of such tools and techniques in broad quant studies in the future.
LM: I love your thinking on the future of data integration and the models that will be required in order to drive value from the massive data sets available to us today. When we chatted earlier we discovered that you and I have a shared love of classic science fiction, specifically the “Foundation” series by Isaac Asimov. In that series Asimov envisioned a future branch of science he called “Psychohistory” which is effectively a Predictive Model driven by “VAST” data sets and advanced analytics that was used to chart the macro developments of human culture for hundreds of years into the future. It seems to me that we’re actually not far off from being able to create a limited version of that idea. What do you think?
SC: Well as one geek to another …
Sometimes it just pays to read the news. Just look at today’s. The University of Illinois supercomputer that I mentioned earlier has been receiving news feeds from hundreds of millions of web sites. They claim that, through their analysis of the news feeds, they predicted the Arab Spring and the decline of the popularity of Hosni Mubarak several months in advance.
I just read that the police department of Santa Cruz, CA is using a mathematical algorithm to predict where crime is likely to occur and has deployed police to those locations before the crime happens. They claim that the algorithm has been correct 40% of the time.
Asimov’s fiction teaches us that nothing is going to predict freaky one-time events, but it looks like the beginnings of psychohistory are with us.
LM: Winning the Parlin Award is a big deal and you’ve earned bragging rights, so go ahead and brag: what did you do to win the award?
SC: I have always been interested in things academic and I have been lucky to work with some of the best in the business. This collaboration has facilitated my consulting work and has also made it easier for some of the best ideas of academics to be diffused into commercial research practice.
- First to use Choice-based Conjoint Analysis in commercial marketing research;
- Worked with Glen Urban of MIT to develop Information Acceleration;
- Worked with John Hauser of MIT and Abbie Griffin of the University of Utah to develop Voice of the Customer tools;
- Co-wrote the first academic paper on Latent Class Choice-based Conjoint;
- Co-wrote with Venkat Ramaswamy a paper on multiway Latent Class models that was nominated for the prestigious Paul Green award; we also wrote a basic paper on LCMs that was published in Marketing Research Magazine;
- Co-wrote with Venkat and John Liechty a paper on Menu-based Conjoint Analysis that was also nominated for the Green award;
- Spoke at over a dozen AMA and ESOMAR conferences over the years;
- Taught (and have been teaching) an Advanced Segmentation workshop for ESOMAR; and finally, the thing I am best known for,
- Developed and wrote about Maximum Difference Scaling (MaxDiff) and won three awards for this work.
LM: Thanks for the great interview, Steve! Congratulations on a distinguished career, on winning the Parlin Award, and on being a true pioneer in our industry!