
What’s Really Wrong With Polling

Posted by Tom H.C. Anderson Friday, November 11, 2016, 8:40 am
Posted in category General Information
What can researchers learn from yet another major polling fail?

Editor’s Note: There is much to be learned from the recent election on many different levels, but perhaps most relevant to our readers are the implications for MR of the hit-or-miss predictive accuracy of polling and of various analytical approaches. Nate Silver’s FiveThirtyEight began the post-mortem with the post The Polls Missed Trump. We Asked Pollsters Why. and there have been hundreds of articles dissecting the data, casting blame, and suggesting changes. The same is happening in groups, on forums, on private listservs (yes, they do still exist!) and in individual discussion threads on social media. We think the topic is so important that today, GreenBook and the ARF are meeting to discuss collaborating on a webinar with a panel of key stakeholders on the topic, so look for more on that in the days ahead.

Here’s my take: some polling did better than others, as this post by Investor’s Business Daily rightly shows: “IBD/TIPP’s final numbers put Trump up by 1.6 points in a four-way race. As of 2 a.m. Wednesday morning, Trump was up by about 1 point in the popular vote. (The actual vote outcome will likely change as votes continue to be counted over the next several weeks.)

 Not one other national poll had Trump winning in four-way polls. In fact, they all had Clinton winning by 3 or more points. For the entire run of the IBD/TIPP poll, we showed the race as being far tighter than other polls. This isn’t a fluke. This will be the fourth presidential election in a row in which IBD/TIPP got it right.

The Los Angeles Times, which had employed a panel of people who were queried about their choice (and which had been ridiculed throughout the election), showed Trump up in a two-way race by 3 points.”

So the outliers and innovators in polling did better than traditional methods. But so did other approaches using social media analytics, behavioral economics-based analysis, “big data,” meta-analysis and data synthesis, and the focus of today’s post: text analytics. Tom Anderson posted on election day what text analytics was suggesting as an outcome, and in today’s follow-up (reposted from the OdinText blog) he takes a clear-eyed view of how he did.

The key takeaway here is that some polling approaches work, but so do many other approaches and we’d do well to apply those lessons to political polling, public policy research, and commercial research.

 

By Tom H. C. Anderson

Whatever your politics, I think you’ll agree that Tuesday’s election results were stunning. What is now being called a historic upset victory for Donald Trump apparently came as a complete shock to both campaigns, the media and, not least, the polling community.

The question everyone seems to be asking now is how so many projections could have been so far off the mark.

Some pretty savvy folks at Pew Research Center took a stab at some reasonable guesses on Wednesday—non-response bias, social desirability bias, etc.—all of which probably played a part, but I suspect there’s more to the story.

I believe the real problem lies with quantitative polling itself: it is simply not a good predictor of actual behavior.

Research Told Us Monday that Clinton Was In Trouble

On Monday I ran a blog post highlighting responses to what was inherently a question about the candidates’ respective positioning:

“Without looking, off the top of your mind, what issues does [insert candidate name] stand for?”

Interestingly, for both candidates, rather than naming a political issue or policy the candidate supported, respondents frequently offered a critical comment about his or her character instead (reflecting a deep-seated, negative emotional disposition toward that candidate). [See chart below]

 

[Chart: OdinText analysis of unaided, top-of-mind issue associations for Trump and Clinton]

 

Our analysis strongly suggested that Hillary Clinton was in more trouble than any of the other polling data to that point indicated.

Why?

  1. The #1 most popular response for Hillary Clinton involved the perception of dishonesty/corruption.
  2. The #1 and #2 most popular responses for Donald Trump related to platform (immigration, followed by pro-USA/America First), followed thirdly by perceived racism/hatemongering.

Bear in mind, again, that these were unaided, top-of-mind responses to an open-ended question.

So for those keeping score, the most popular response for Clinton was an emotionally-charged character dig; the two most popular responses for Trump were related to political platform.

This suggested not only that Trump’s “Make America Great Again” campaign messaging was resonating better, but also that negative emotional disposition toward Hillary Clinton ran higher than it did toward Trump.

Did We Make a Mistake?

What I did not mention in that blog post was that initially my colleagues and I suspected we might have made a mistake.

Essentially, what these responses were telling us didn’t jibe with any of the projections available from pollsters, with the possible exception of the highly respected Nate Silver, who was actually criticized for being too generous with Trump in weighting poll numbers up (giving Trump about a 36% chance of winning, somewhat better than the 25% chance of flipping tails twice in a row with a fair coin).

How could this be? Had we asked the wrong question? Was it the sample*?

Nope. The data were right. I just couldn’t believe everyone else could be so wrong.

So out of fear that I might look incompetent and/or just plain nuts, I decided to downplay what the data clearly showed.

I simply wrote, “This may prove problematic for the Clinton camp.”

The Real Problem with Polls

Well, I can’t say I told you so, because what I wrote was a colossal understatement; however, this experience has reinforced my conviction that conventional quantitative Likert-scale survey questions—the sort used in every poll—are generally not terrific predictors of actual behavior.

If I ask you a series of questions with a fixed set of answers or a rating scale, I’m not likely to get responses that tell me anything useful about how you will actually behave.

We know that consumers (and, yes, voters) are generally not rational decision-makers; people rely on emotions and heuristics to make most of their decisions.

If I really want to understand what will drive actual behavior, the surest way to find out is by allowing you to tell me unaided, in your own words, off the top of your head.

“How important is price to you on a scale of 1-10?” is no more likely to predict actual behavior than “How important is honesty to you in a president on a scale of 1-10?”

It applies to cans of tuna and to presidents.

@TomHCAnderson

 

[*Note: N=3,000 responses were collected via Google Surveys 11/5–11/7 2016. Google Surveys allows researchers to reach a validated (U.S. general-population representative) sample by intercepting people attempting to access high-quality online content—such as news, entertainment and reference sites—or who have downloaded the Google Opinion Rewards mobile app. These users answer up to 10 questions in exchange for access to the content or Google Play credit. Google provides additional respondent information across a variety of variables, including source/publisher category, gender, age, geography, urban density, income, parental status and response time, as well as Google-calculated weighting. All 3,000 comments were then analyzed using OdinText to understand frequency of topics, emotions and key topic differences. Of the 65 total topics identified using OdinText, 19 topics were mentioned significantly more often for Clinton and 21 topics were mentioned significantly more often for Trump. Results are accurate to within ±2.51 percentage points at the 95% confidence level.]
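For readers who want to sanity-check the figures in the note above, here is a minimal Python sketch of the standard simple-random-sampling margin of error, plus the kind of two-proportion z-test one might use to judge whether a topic is mentioned significantly more often for one candidate than for the other. The note does not say how the 3,000 responses were split between the candidates or how OdinText tests significance, so the even 1,500/1,500 split, the topic counts and the specific test below are illustrative assumptions, not a description of the actual methodology.

```python
# Minimal sketch, not the actual OdinText/Google Surveys methodology.
# Assumptions: simple random sampling, worst-case p = 0.5, and an even
# (hypothetical) 1,500/1,500 split of the 3,000 responses per candidate.
from math import sqrt
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # ~1.96 for a 95% confidence level

def margin_of_error(n: int, p: float = 0.5, z: float = Z95) -> float:
    """Half-width of the confidence interval for a single proportion."""
    return z * sqrt(p * (1 - p) / n)

print(f"n=3000: +/-{margin_of_error(3000):.2%}")  # ~ +/-1.79%
print(f"n=1500: +/-{margin_of_error(1500):.2%}")  # ~ +/-2.53%, near the quoted 2.51%

def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> float:
    """z statistic for H0: a topic is mentioned at the same rate in both groups."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical counts: a topic named in 300 of 1,500 Clinton responses
# versus 210 of 1,500 Trump responses; |z| > 1.96 means significant at 95%.
z = two_proportion_z(300, 1500, 210, 1500)
print(f"z = {z:.2f}, significant: {abs(z) > Z95}")
```

One caveat on the sketch: weighting of the kind Google applies typically inflates the variance (a design effect greater than 1), so the effective margins of error would be somewhat wider than these unweighted figures.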


6 Responses to “What’s Really Wrong With Polling”

  1. Tom H. C. Anderson says:

    November 11th, 2016 at 10:36 am

    Thanks for the repost, Lenny. Curious to hear what GreenBook blog/IIEX readers think.

  2. Alan Grabowsky says:

    November 11th, 2016 at 12:53 pm

    Great, Tom. That’s a big chunk of it. What’s missing in most electoral research is Vector and Valence, to indicate how relevant each attribute will be to a voter’s decision.
    Good research tries to discover the Derived Impact of each supposed driver of choice or satisfaction.
    There are so many issues in the sociopolitical environment that don’t appear in these surveys, but impact voter behavior. I counted 25 so far, such as Suppressed Fears, Repressed Resentments, and Masked Envy.
    The so-called Silent Majority are unwilling or unable to voice them, but wow, did they vote!

  3. Lana Novikova says:

    November 11th, 2016 at 6:19 pm

    Well said, Tom: “If I really want to understand what will drive actual behavior, the surest way to find out is by allowing you to tell me unaided, in your own words, off the top of your head. ‘How important is price to you on a scale of 1-10?’ is no more likely to predict actual behavior than ‘How important is honesty to you in a president on a scale of 1-10?’” I would only add that we don’t have to choose between closed-ends and open-ends; we just have to ask good questions, whether closed- or open-ended. Lisa Lewers and I will be presenting our model, which correctly predicted the outcome of the election in 4 “swing” states using BOTH closed- and open-ended emotion-focused questions (5 simple questions, n=500 per state, report: http://159.203.20.171/report/client/529-241866ea30958b0b2007ffde94ddbed9/). Hope to see you in Chicago next week!

  4. Brian Lunde says:

    November 11th, 2016 at 6:39 pm

    I hesitate to take on Tom as I greatly respect his thinking and experience, but I’m not ready to buy what he’s selling. First, U.S. presidential politics, in my view, is a high-involvement, emotionally charged “category” in which people are intensely bombarded for months with “stimulation” about the available “brands”–at a level far beyond what is typical of any other marketing context. This fact alone makes me suspicious of the broad conclusion that “It applies to cans of tuna and to presidents.” The second problem is one of logical argument: Tom uses the examples of “How important is price to you on a scale of 1-10?” and “How important is honesty to you in a president on a scale of 1-10?” and argues both are equally unlikely to predict behavior. I agree; but of course these are *terrible* quant questions that no experienced quant researcher would ever recommend. This is nothing but a tautology, i.e. “poor quantitative methods are hopelessly flawed.” I’ve been doing quant research for more than 25 years now and I know it can be done well and produce reliable guidance that accurately reflects real-world behavior. Maybe the struggle in presidential polls is the exception that proves the rule.

  5. Chris Robinson says:

    November 13th, 2016 at 11:07 pm

    It’s a brave researcher who claims the ability to predict an event like the US elections – ask Nate Silver as he removes the mud from his face. Sentiment analysis looks to provide a good post-hoc confirmation of the results, and the claimants are coming out of the woodwork, but I doubt it has mileage. The problems in being able to measure election results in the US are so overwhelming I doubt any system could guarantee accuracy. Just consider the sampling challenges. This is not a 3,000-sample, 95%-accuracy problem; it’s first about the segmentation, which is near impossible to duplicate. Just consider the uncontrollables – voting is not compulsory, it’s held on a workday, voting influences are of the moment, often down to the last few hours – and that segment-control issue: which Hispanics, which black voters, which counties, which jobless, which blue-collar segments, etc. Maybe we should just be honest and recognize this is one of those Black Swan events with long tails that can never safely be predicted.

  6. Paul Neto says:

    November 23rd, 2016 at 10:28 am

    It’s clear that a survey mechanism is quite a weak predictor. This, I believe, is ever more evident today, especially around political issues. The ever-changing information that floats around, whether fake or not, has some influence and simply is not reflected in a survey response that takes hours or days to process. I believe in this election in particular there were a lot of individuals on the fence, and a poll simply cannot detect, at this granularity, where they would land. Media influence, more than ever, makes these contests that much more volatile. We also all know that people’s perceptions and minds are very unpredictable. Unfortunately, we need to wait another four years to test whether any incremental changes to polling will be effective.
