By Joel Rubinson
How should researchers think of big data?
When your data move beyond crosstabs and Excel…when you are predicting rather than profiling…when your data do not fit neatly into rows and columns because they are unstructured in their natural state…when you are anonymously matching data at the user level that come from different sources and require some analytic detective work to optimize the match…when you are extracting information in real time from a massive database on servers that does NOT fit on your laptop.
All of these are hallmarks of big data, and there were numerous world-class talks at the ARF Re:Think 2014 conference that fit this practical definition. And practically speaking, big data can take you to new territories of insights and actions where 20-minute online surveys cannot go.
There are three broad classes of big data applications that were brought to life:
- Matching different databases
- Using data science to create predictive meaning from massive data
- Creating structure from unstructured data
Matching different data sets
Different presentations made it clear that the following matches are possible.
- Matching cable subscribers' TV viewing with voter records and, via predictive analytics, precisely targeting advertising to swing voters.
- Individually matching TV viewing, Facebook, digital clickstream, and radio listening to frequent shopper data. Nielsen has combined their audio panel with frequent shopper data via Catalina and their own Homescan panel. IRI and comScore have linked clickstream and purchase data. In every case, we now have a direct linkage between brand communication exposure and sales outcome that can be used for ad targeting and for determining return on marketing more precisely than macro-regression-based marketing mix modeling.
- Facebook emphasized the importance of matching behaviors across screens using a persistent log-in. Linking behaviors across screens is critical to properly allocate advertising funding in a world where 40% or more of online behaviors seamlessly and subconsciously go from one screen to another. Retailers should pick up on this, linking behaviors by frequent shopper number log-ins irrespective of screen and whether the purchase occurs online or in-store.
- IRI bringing together store scanner data, TV viewing data and digital behaviors for marketing mix modeling at a much more granular geo level.
- Conducting surveys among people whose clickstream behavior is known helped Ford to gain great insight into digital behaviors leading up to acquiring a vehicle.
- Matching attitudinal segmentation with third party data such as hobbies/interests, viewing and purchasing behaviors via data fusion.
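To make the idea of anonymous user-level matching concrete, here is a minimal sketch in Python. It is not how any of the presenters do it. It assumes both data owners can hash the same normalized identifier (an email address here) with a shared salt, so records can be joined without exchanging the raw identifier. The field names, the salt, and the sample records are all hypothetical.

```python
import hashlib

def anon_key(email: str, salt: str = "shared-secret") -> str:
    """Hash a normalized identifier so raw PII never has to be exchanged.
    The salt value is a placeholder (assumption), agreed on by both parties."""
    normalized = email.strip().lower()
    return hashlib.sha256((salt + normalized).encode("utf-8")).hexdigest()

# Hypothetical ad exposure log from a media partner.
ad_exposures = [
    {"email": "Ann@example.com", "saw_ad": True},
    {"email": "bob@example.com", "saw_ad": False},
    {"email": "cat@example.com", "saw_ad": True},
]

# Hypothetical frequent shopper records from a retailer.
purchases = [
    {"email": "ann@example.com ", "bought": True},
    {"email": "bob@example.com", "bought": False},
    {"email": "dan@example.com", "bought": True},
]

def match(exposures, purchase_records):
    """Join the two sources on the hashed key; unmatched records are dropped."""
    bought_by_key = {anon_key(p["email"]): p["bought"] for p in purchase_records}
    matched = []
    for e in exposures:
        key = anon_key(e["email"])
        if key in bought_by_key:
            matched.append(
                {"key": key, "saw_ad": e["saw_ad"], "bought": bought_by_key[key]}
            )
    return matched

matched = match(ad_exposures, purchases)
match_rate = len(matched) / len(ad_exposures)
```

The match rate matters as much as the match itself: here only two of the three exposed users can be linked to a purchase record, and in practice the "analytic detective work" goes into raising that rate (normalizing identifiers, handling households versus individuals) without creating false matches.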
Using data science
CivicScience (disclosure: I consult with them) presented a new way of collecting massive amounts of data that can be connected. Instead of a lengthy survey, they ask only three questions at a time but on a scale such that they have tens of millions of answers across nearly 30,000 questions in their database available for data mining. While the matrix is sparse (i.e. no one has answered all 30,000 questions), any question can be analyzed by any other question so, for example, you can find unexpected correlations of lifestyle and media factors with being persuadable regarding a media property or brand. This has great value for insights, media targeting, and prediction. In particular, they presented something they call “expectation science” where they are able to cookie respondents with a good forecasting track record in a given domain and then ask expectation questions of them, such as, “What movie will win best picture Oscar?”, with very impressive results (8/9 winners correctly called).
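The "any question by any other question" idea can be sketched in a few lines of Python. This is only an illustration of analyzing a sparse answer matrix, not CivicScience's actual system. It assumes answers are stored per respondent as a small dictionary holding only the questions that person answered. The question names and answers are made up.

```python
from collections import Counter

# Sparse storage (assumption): each respondent holds only the questions answered.
answers = {
    "r1": {"q_streams_tv": "yes", "q_would_try_brand": "yes"},
    "r2": {"q_streams_tv": "yes", "q_would_try_brand": "yes", "q_owns_pet": "no"},
    "r3": {"q_streams_tv": "no", "q_would_try_brand": "no"},
    "r4": {"q_streams_tv": "no", "q_owns_pet": "yes"},
    "r5": {"q_would_try_brand": "no", "q_owns_pet": "yes"},
    "r6": {"q_streams_tv": "yes", "q_would_try_brand": "no"},
}

def crosstab(all_answers, q_a, q_b):
    """Count answer pairs using only respondents who answered both questions."""
    counts = Counter()
    for resp in all_answers.values():
        if q_a in resp and q_b in resp:
            counts[(resp[q_a], resp[q_b])] += 1
    return counts

table = crosstab(answers, "q_streams_tv", "q_would_try_brand")
overlap = sum(table.values())  # respondents who answered both questions
```

With these made-up data, four of the six respondents answered both questions, and only those four enter the table. At real scale the overlap for any pair of questions is a small fraction of the database, so the size of that overlap has to be checked before reading anything into a correlation.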
Creating structure from unstructured data
Oculus 360 presented a way of mining social media conversations to understand what certain central concepts (like romantic or bohemian in the world of fashion) really mean to consumers and how you can tell if your brand is fully aligned.
Because of the massive number of possibilities to click on, clickstream behaviors can also be thought of as unstructured. Ford and Luth Research conducted surveys among those whose online behaviors were metered, and analyzed clickstreams to understand digital behaviors along the path to purchase.
So what do all of these big data applications have in common?
- They allow marketers to target advertising based on behaviors and interests rather than simply based on demographics. I call this “precision marketing” and it will improve advertising effectiveness as well as change the ways we measure what is working.
- We are using digital data to measure things that respondents cannot accurately recall, like their clickstream behaviors or the effect that a fleeting (but potentially impactful) ad had on purchase outcomes.
- To handle massive amounts of data, often unstructured, and needed in close to real time, big data applications require technology solutions that go beyond current research tools like cross-tabs and CSV files.
- They extract quant insights from naturally occurring data streams like digital, social, and customer data, rather than relying exclusively on surveys.
- They require new statistical analysis tools.
The ARF conference did not give us the full array of big data applications that marketers need to focus on (for example, there was little on understanding the power of first-party data that come from brand websites and customer data), but it offered enough to represent a call to action: research tool kits and skill sets must evolve beyond the “n=1000, 20-minute survey”.