What’s Really Wrong With Polling
Editor’s Note: There is much to be learned from the recent election on many different levels, but perhaps most relevant to our readers are the implications for MR of the hit-or-miss predictive accuracy of polling and/or various analytical approaches. Nate Silver’s FiveThirtyEight began the postmortem with the post “The Polls Missed Trump. We Asked Pollsters Why.” and there have been hundreds of articles dissecting the data, casting blame, and suggesting changes. The same is happening in groups, on forums, on private listservs (yes, they do still exist!) and in individual discussion threads on social media. We think the topic is so important that today, GreenBook and the ARF are meeting to discuss collaborating on a webinar with a panel of key stakeholders on the topic, so look for more on that in the days ahead.
Here’s my take: some polling did better than others, as this post by Investor’s Business Daily rightly shows: “IBD/TIPP’s final numbers put Trump up by 1.6 points in a four-way race. As of 2 a.m. Wednesday morning, Trump was up by about 1 point in the popular vote. (The actual vote outcome will likely change as votes continue to be counted over the next several weeks.)
The Los Angeles Times, which had employed a panel of people who were queried about their choice (and which had been ridiculed throughout the election) showed Trump up in a two-way race by 3 points.”
So the outliers and innovators in polling did better than traditional methods. But so did other approaches using social media analytics, behavioral economics-based analysis, “big data”, meta-analysis and data synthesis, and the focus of today’s post, text analytics. Tom Anderson posted on election day what text analytics was suggesting as an outcome, and in today’s follow-up (reposted from the OdinText blog) he takes a clear-eyed view of how he did.
The key takeaway here is that some polling approaches work, but so do many other approaches and we’d do well to apply those lessons to political polling, public policy research, and commercial research.
Whatever your politics, I think you’ll agree that Tuesday’s election results were stunning. What is now being called an historic upset victory for Donald Trump apparently came as a complete shock to both of the campaigns, the media and, not least, the polling community.
The question everyone seems to be asking now is: How could so many projections have been so far off the mark?
Some pretty savvy folks at Pew Research Center took a stab at some reasonable guesses on Wednesday—non-response bias, social desirability bias, etc.—all of which probably played a part, but I suspect there’s more to the story.
I believe the real problem lies with quantitative polling itself. It simply is not a good predictor of actual behavior.
Research Told Us Monday that Clinton Was In Trouble
On Monday I ran a blog post highlighting responses to what was inherently a question about the candidates’ respective positioning:
“Without looking, off the top of your mind, what issues does [insert candidate name] stand for?”
Interestingly, in either case, rather than naming a political issue or policy supported by the candidate, respondents frequently offered up a critical comment about his/her character instead (reflecting a deep-seated, negative emotional disposition toward that candidate). [See chart below]
Our analysis strongly suggested that Hillary Clinton was in more trouble than any of the other polling data to that point indicated.
- The #1 most popular response for Hillary Clinton involved the perception of dishonesty/corruption.
- The #1 and #2 most popular responses for Donald Trump related to platform (immigration, followed by pro-USA/America First), followed thirdly by perceived racism/hatemongering.
Bear in mind, again, that these were unaided, top-of-mind responses to an open-ended question.
So for those keeping score, the most popular response for Clinton was an emotionally charged character dig; the two most popular responses for Trump were related to political platform.
This suggested not only that Trump’s campaign message, “Make America Great Again,” was resonating better, but also that the negative emotional disposition toward Hillary Clinton was stronger than that toward Trump.
Did We Make a Mistake?
What I did not mention in that blog post was that initially my colleagues and I suspected we might have made a mistake.
Essentially, what these responses were telling us didn’t jibe with any of the projections available from pollsters, with the possible exception of the highly respected Nate Silver, who was actually criticized for being too generous with Trump in weighting poll numbers up (about a 36% chance of winning, or somewhat better than the 25% chance of flipping tails twice in a row with a coin).
How could this be? Had we asked the wrong question? Was it the sample*?
Nope. The data were right. I just couldn’t believe everyone else could be so wrong.
So out of fear that I might look incompetent and/or just plain nuts, I decided to downplay what these data clearly showed.
I simply wrote, “This may prove problematic for the Clinton camp.”
The Real Problem with Polls
Well, I can’t say I told you so, because what I wrote was a colossal understatement; however, this experience has reinforced my conviction that conventional quantitative Likert-scale survey questions—the sort used in every poll—are generally not terrific predictors of actual behavior.
If I ask you a series of questions with a set of answers or a rating scale, I’m not likely to get a response that tells me anything useful.
We know that consumers (and, yes, voters) are generally not rational decision-makers; we rely on emotions and heuristics to make most of our decisions.
If I really want to understand what will drive actual behavior, the surest way to find out is by allowing you to tell me unaided, in your own words, off the top of your head.
“How important is price to you on a scale of 1-10?” is no more likely to predict actual behavior than “How important is honesty to you in a president on a scale of 1-10?”
This applies as much to cans of tuna as to presidents.
[*Note: N=3,000 responses were collected via Google Surveys 11/5-11/7 2016. Google Surveys allows researchers to reach a validated (U.S. General Population Representative) sample by intercepting people attempting to access high-quality online content (such as news, entertainment and reference sites) or who have downloaded the Google Opinion Rewards mobile app. These users answer up to 10 questions in exchange for access to the content or Google Play credit. Google provides additional respondent information across a variety of variables including source/publisher category, gender, age, geography, urban density, income, parental status and response time, as well as Google-calculated weighting. All 3,000 comments were then analyzed using OdinText to understand frequency of topics, emotions and key topic differences. Out of 65 total topics identified using OdinText, 19 topics were mentioned significantly more often for Clinton, and 21 topics were mentioned significantly more often for Trump. Results are accurate to +/- 2.51% at the 95% confidence level.]
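For readers who want to sanity-check a margin of error like the one quoted in the note above, here is a minimal sketch in Python. It uses the textbook normal-approximation formula for a proportion and assumes simple random sampling with the worst-case proportion of 50%. The function name `margin_of_error` and the `design_effect` parameter are illustrative and are not taken from OdinText or Google Surveys. The note does not say how the +/- 2.51% figure was derived, so the comparison below is only an illustration, not a reconstruction of the author's calculation.

```python
import math

def margin_of_error(n, p=0.5, z=1.96, design_effect=1.0):
    """Half-width of a normal-approximation confidence interval for a proportion.

    n             -- number of responses
    p             -- assumed proportion (0.5 is the worst case)
    z             -- critical value (1.96 for a 95% confidence level)
    design_effect -- hypothetical inflation factor for weighting (1.0 = none)
    """
    return z * math.sqrt(design_effect * p * (1.0 - p) / n)

# All 3,000 responses treated as one simple random sample
moe_full = margin_of_error(3000)
print(f"n=3000: +/- {moe_full:.2%}")   # n=3000: +/- 1.79%

# Half the sample, e.g. if each respondent were asked about only one candidate
# (an assumption; the note does not describe the split)
moe_half = margin_of_error(1500)
print(f"n=1500: +/- {moe_half:.2%}")   # n=1500: +/- 2.53%

# Design effect that would be needed to turn 1.79% into the reported 2.51%
reported = 0.0251
implied_design_effect = (reported / moe_full) ** 2
print(f"implied design effect: {implied_design_effect:.2f}")
```

Under these assumptions, the full sample of 3,000 gives a margin nearer 1.8%, while about 1,500 responses gives roughly 2.5%. The reported figure is therefore consistent with either a per-candidate subsample or an adjustment for weighting, though the note does not say which.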