It’s time for the GRIT Survey – please participate now!

Please participate in the newest GRIT Survey and help us understand what’s shaping our industry. Your participation in the GRIT Survey makes the GRIT Report possible.

The GreenBook Research Industry Trends (GRIT) study is the leading global survey of the market research profession and industry. Twice a year we ask insights professionals to help us all understand the trends impacting our work, our businesses, and our jobs via the GRIT Survey.

It’s now time for the Q4 2017 wave, and we’d like to ask you to take no more than 15 minutes to share your experience and opinions with us. Want to give back to our industry by taking the survey? Please head over to the survey.

As always, we’re diving deep into many of the most important areas and issues that impact insights professionals. In this wave we’re exploring topics such as the role of research in the organization, who owns the data, time spent on research functions, challenges & opportunities in the future, adoption of emerging methods & technologies, use of traditional methods & technologies, satisfaction levels with suppliers, the financial outlook for the industry, and your take on the buzz topics of the day.

Because many of the GRIT tracking questions included in this edition were developed in the pre-mobile era, it’s challenging to optimize them for a great mobile survey-taking experience while maintaining data consistency. As a result, we recommend completing this survey on a desktop, laptop, or tablet.

All who complete the survey will receive:

  • A PDF copy of the final GRIT Report before it’s released publicly.
  • Access to all survey data via a special GRIT data portal.
  • Priority free access to the GRIT webinar series.

Thank you in advance for sharing your perspective … and for sharing the survey with your colleagues!




Thanks to Our GRIT Partners

Research Partners

Ascribe, AYTM – Ask Your Target Market, Bakamo Social, Consensus Point, CX Network, 3 Translate, Gen2 Advisors, Lightspeed, Michigan State University, MROC Japan – Community Solutions Company, mTAB, Multivariate Solutions, NewMR, OfficeReports, Research Now, Researchscape International, Stakeholder Advisory Services, Virtual Incentives

Sample Partners

A.C. Nielsen Center for Marketing Research at The Wisconsin School of Business, AIM, AMAI, American Marketing Association New York, Asia Pacific Research Committee (APRC), ARIA, Australian Market & Social Research Society (AMSRS), BAQMaR, BVA, MRS, Next Gen Market Research (NGMR), OdinText Inc., Provokers, Qualitative Research Consultants Association, The Research Club, The UTA MSMR Alumni Association, University of Georgia | MRII, Women In Research

Mongolian Adventures

Find out the key market research takeaways from Asia Pacific Research Committee's annual conference.

By Dave McCaughan

Genghis Khan, rich lamb stew, incredibly cheap cashmere sweaters. Go on, what else comes to mind when you think of Mongolia? How about a truly professional and interesting market research conference? Well, last week in Ulaanbaatar that is what we got.

The APRC (Asia Pacific Research Committee) brings together 12 MR industry bodies from 11 Asia Pacific countries, and last week nearly 350 marketers and researchers, including over 60 from all parts of Asia and beyond, gathered in UB for its annual conference. In its brief history of a little over a decade, one of the defining features of the APRC has been its organizing of top-level international speakers at events held in important but often overlooked locations for conferences like this. Places like Xian, Auckland and now Mongolia’s capital.

Eleven presentations covered some of the hot topics in marketing, like Big Data, on-line shopping, behavioural economics and data science, from the perspective of practitioners in China, Japan, Thailand, Taiwan and Korea. And yes, I did do a session on my pet subjects of narrative identification for story building and the use of Artificial Intelligence. Case studies abounded on how Samsung’s amazing growth was all about completely integrating research and insight at every stage of development, how online and off-line shopper decision making unfolds in Japan compared to Taiwan, and the inevitable story of how the 1985 generation is disrupting and changing behavior in China and beyond. And please do not refer to Chinese, and Asians in general, born after 1985 as millennials, a thoroughly discredited term anyway. As speakers like Victor Yuan Yue pointed out, and as was debated in a number of breaks, the post-85s of China and many other markets in the region (Japan, Thailand, Taiwan, Korea …) are likely the first and only generation where the great majority are single children, and they are raised in a world of breathtaking, unprecedented change well beyond their Western cohorts’ experience.

The BIG learning? Something Victor said really clicked: “It’s not news anymore, it’s ooze”. The constant bane of marketers and the opportunity for researchers is finding something that sticks. As Andy Zhou, the incoming President of APRC, pointed out, we live in a time where in China’s biggest cities the millions of privately owned bicycles, still a major personal transport system, were 90 per cent replaced by shared service bikes in less than six months. Things are not evolving; there are revolutions everywhere. And again Victor explained that those 85ers are really “non-experienced” and beyond open: they expect constant newness and change, and because they have so little experience in nearly everything, and often have parents who cannot advise them from their own experience, they have absolutely no bias.

I was reminded of a piece of research I managed a few years ago as the 1985 generation were leaving college and joining the workforce. In one focus group about brand loyalty one mid twenty-something Thai explained, “I am perfectly loyal, I change my brands every month.” By which he meant that in his experience of constant newness and the need to be constantly up to date there was greater risk in staying with the same brands and more opportunity to develop his own brand by constant trial.

Maybe my one regret was that, apart from passing references, there was not enough discussion about the real growth opportunity in Asia: the ageing populations. It would have been time well invested to hear more about innovation in understanding how ageing is changing so fast and how the 60-90 plus age groups (the fastest growing segment of every Asian population) are being researched, marketed to and adapting.

That would have solidified a theme across the conference for researchers from the people of Asia’s perspective: don’t hold me back, help me evolve quicker whether young or old.

Peter Harris, the outgoing APRC President, made the great point that we have to move beyond structures that are stilted and not fluid. Match the people of Asia’s experience of decades of change and dynamic choice with techniques to be more involved in lifetime experiences rather than “point in time” insertions. At its simplest, in a “social” world, stop re-asking for basic data and use tools to “know who people are.” Andy Zhao built a theme around the conundrum that while there were many great new ideas, techniques and technologies being developed across the region, too many clients really just wanted to use legacy processes, and that one of the challenges for the industry in Asia was to do more to educate clients on new ideas and practices. Given that over half of the conference attendees were clients, it was a good start.

So the message from Mongolia: innovation, technology, insights, revolution to break through the ooze.

It all sounded like, and was, a great intro for us all to gather in my home city, Bangkok, for IIeX Asia Pacific in December.

Oh, and well done to the Mongolian Marketing Research Society and key sponsor the Mongolian Marketing Consulting Group for a very professional and forward looking organization.


Machine Learning in Market Research

Machine learning is a buzzword in the industry. This post explores how its adoption can help market researchers reduce timing and costs, and some of its challenges.

By Brooke Patton

Obviously with all the talk about machine learning, it’s clear it has a significant impact on a variety of industries. One in particular is market research, but you probably already knew that. Our previous post in this series outlined the basics of machine learning: with that knowledge, we can now begin to understand how it can apply to market research. Recall that machine learning uses a series of algorithms in an iterative process to gather, process, and interpret data into learnings. Does that last part sound similar to anything else?

Applying Machine Learning to Market Research

Market researchers actively gather and interpret information specific to consumers, so it stands to reason that machine learning and market research are a match made in heaven. Below is a more detailed outline of the machine learning process and where market research can play a role:

Chart explaining how machine learning related to market research

The most influential aspects of combining market research with machine learning include the impact on insights and the bottom line for research productivity. While machine learning is great for data collection, its primary purpose is to quickly learn from data and adjust itself, whereas market research wants to take those learnings and apply additional action outside of the data. The ability to collect more robust data sets that can quickly gather the who, what, when, where, and how allows market researchers to focus on more important aspects like why, and spend less time and money on additional research. Additionally, larger and more varied data sets can improve data quality and reduce bias.


So while machine learning in market research certainly has its benefits, many wonder if market research is at risk of being replaced by machine learning. Machine learning is unable to take in external factors like politics, economics, or seasonality. Just because machine learning can gather, analyze, predict, and adjust itself all on its own doesn’t mean it can act on the findings or explain their business implications. For example, machine learning could tell you that someone who clicks on your ads at least three times is likely to purchase your product within the next month. However, what it can’t tell you is why they clicked on those ads and how the ads can be improved; that comes from the combination with market research.
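To make the ad-click example concrete, here is a minimal sketch in Python. The records and the three-click threshold are hypothetical, invented purely for illustration; a real model would be trained on actual behavioural data and would likely be far more elaborate.

```python
# Hypothetical records: (ad_clicks, purchased_within_30_days)
records = [
    (0, False), (1, False), (1, False), (2, False), (2, True),
    (3, True), (3, True), (4, True), (5, False), (6, True),
]

def purchase_rate(rows):
    """Share of rows that ended in a purchase (0.0 if there are no rows)."""
    rows = list(rows)
    if not rows:
        return 0.0
    return sum(1 for _, bought in rows if bought) / len(rows)

frequent = [r for r in records if r[0] >= 3]    # clicked at least three times
infrequent = [r for r in records if r[0] < 3]   # clicked fewer than three times

rate_frequent = purchase_rate(frequent)
rate_infrequent = purchase_rate(infrequent)

print(f"Purchase rate, 3+ clicks: {rate_frequent:.0%}")
print(f"Purchase rate, <3 clicks: {rate_infrequent:.0%}")
```

The comparison surfaces the "what" (frequent clickers buy more often in this made-up data), but nothing in the records says why they clicked, which is the gap the post assigns to market research.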

There are also several challenges when it comes to machine learning, such as data security, scalability, and cost of implementation. Data security is somewhat out of our control. With the influx of data and the difficulty in finding quality data, parameters will likely be put in place to control how much can be accessed and by whom. But it shouldn’t cause much of a slowdown in machine learning. Scalability, on the other hand, is dependent on how agile and prepared a business is before implementing machine learning. Some businesses, as we’ve seen with big data, are slower at adopting new tools than others. The cost of machine learning is typically highest up front, during implementation. And while the data will still come at a price, once the model and process are running smoothly, cost becomes less of a factor relative to the benefits received.

Automating My Job

Posted by Kevin Gray Wednesday, October 18, 2017, 6:55 am
Posted in category General Information
Marketing researchers, like just about everyone else, are concerned about losing their jobs through automation. Thinking ahead to this possibility, I decided to try to automate myself out of a job.

For background, I am a marketing science and analytics consultant working primarily, but not entirely, in marketing research. All of my work is tailored to specific client needs, and I do not use standardized proprietary methods I have developed or licensed. Customized analytics is by no means rare and, in that respect, I am not an oddball. Why Customize? explains why decision makers frequently need customized analytics. The methods page of my company website will give you some details about the sort of work I do and statistical and machine learning tools I use.

There are at least six stages in most of my projects, and I would like to take you through each of them to see which aspects of my work can be automated.

Defining the Problem: This is the most important and usually most challenging part of my job. Requests I receive, either directly from clients, or via other marketing research agencies or data science consultancies, are typically vague. Who will be using the results of my analytics, how they will be used and when they will be used is frequently unclear. At times, requests are at the other extreme and consist of a very narrow question such as “How much do you charge for conjoint?” (To an analytics person, this can be likened to walking into a restaurant and asking, “How much do you charge for dinner?”)

Defining the problem is essential for me to determine if there is even a role for me in the project. It often requires considerable patience and tact, and this is true when I work with my fellow Americans too, let alone when language and culture are barriers. How can this part of my work be automated? If it can, then clients will have also been automated.

Deciding on the Analytics Method: It is not at all unusual for me to use many statistical methods in sequence or in combination for one project. In fact, this is typical. Once I have a good grasp of why the research or analytics is being proposed, I can then think more specifically about the methods I will use. The method or methods may not be what the client originally had in mind, but through experience I have learned to focus on decisions, not technology. This part of my job cannot be automated either.

Deciding on the Data: The required data may already exist, but this is rare. More commonly, we need to explore the interrelationships among many variables and, because of this, there will be gaps in existing data. What we have may also be too old to be useful, or we may have no relevant data at all! Customized analytics usually involves collecting new data suited to specific business objectives or assembling data of various types from several sources. I am normally heavily involved in this step, including sample and questionnaire design when they are applicable. Automating this part of the process is only possible when we are repeating a project that has been successfully completed. It cannot be done for the benchmark. Machines 0 – Me 3.

Data Cleaning and Setup: Exploratory analysis is an opportunity to kill several birds with one stone. When we are setting up the data for analysis, we also are usually cleaning it, recoding it and exploring it. We are learning from the data. Once again, the first time around, this step is not possible to automate, nor would it be wise to attempt to do so.

Data Analysis: Statisticians normally have expended at least 70% of their time budget before reaching this point. Though obviously undesirable, in some cases we may have already exceeded 100%! Analysis is typically squeezed, as are interpretation and reporting, and part of the reason is unrealistic planning or misunderstanding of the project’s objectives. Here, once again, only when the analytics are being repeated with new data is automation feasible.

I have heard it claimed that methods such as cluster analysis lend themselves to automation, but nothing can be further from the truth. There are a gigantic number of clustering methods (for example) and different implementations of the same method. There is no single measure of “fit” that will tell me (or AI) which is the best method, implementation or options to choose, let alone which variables to consider using in the first place. (In an “ideal” cluster analysis, each clustering variable’s score would differ in just one cluster and thus each variable would have low overall “importance.”)
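One way to see why a single "fit" number cannot settle the matter is the sketch below. It is only an illustration under stated assumptions, not the author's method: the scores are hypothetical, and within-cluster sum of squares is just one common fit measure. Because that measure improves every time a cluster is added, it cannot by itself say how many clusters to keep.

```python
from itertools import combinations

# Hypothetical one-dimensional scores (e.g., a satisfaction index), sorted.
scores = sorted([1.0, 1.2, 1.9, 4.8, 5.1, 5.3, 9.0, 9.4, 9.5, 12.0])

def within_ss(group):
    """Sum of squared deviations from the group mean."""
    mean = sum(group) / len(group)
    return sum((x - mean) ** 2 for x in group)

def best_partition(data, k):
    """Exhaustively find the k contiguous groups with the lowest total within-cluster SS.

    For sorted 1-D data the optimal k-means solution is contiguous, so brute
    force over cut points is exact (and only feasible for tiny data sets).
    """
    best = None
    for cuts in combinations(range(1, len(data)), k - 1):
        bounds = (0, *cuts, len(data))
        groups = [data[a:b] for a, b in zip(bounds, bounds[1:])]
        total = sum(within_ss(g) for g in groups)
        if best is None or total < best[0]:
            best = (total, groups)
    return best

# Fit for 1 to 5 clusters: the number keeps shrinking as k grows.
fits = {k: best_partition(scores, k)[0] for k in range(1, 6)}
for k, total in fits.items():
    print(f"k={k}: within-cluster SS = {total:.2f}")
```

Choosing between the solutions still requires a judgment about which segmentation is useful for the decision at hand.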

In multivariate analysis, it is more the rule than the exception for competing models to provide nearly equal fit to the data but suggest very different courses of action to decision makers. More fundamentally, any model provides a simplified representation of the process or processes that gave rise to the data – “Essentially, all models are wrong, but some are useful” in the immortal words of legendary statistician George Box.

Greater variety of data and an explosion in the number of analytics tools now available have actually made automation more difficult, not easier. In his excellent book Statistical Rethinking, Richard McElreath of the Max Planck Institute makes a very important observation: “…statisticians do not in general exactly agree on how to analyze anything but the simplest of problems. The fact that statistical inference uses mathematics does not imply that there is only one reasonable or useful way to conduct an analysis. Engineering uses math as well, but there are many ways to build a bridge.”

So, who programs these tools? How do they decide which procedures or options are best? The more choices there are, the more complex the programming task becomes. AI cannot decide how to program itself. There is also a heightened risk of bugs. Yes, AI does have bugs.

If the project is going to be repeated again and again, and once we have decided which data and analytic methods are sufficient, then it definitely makes sense to try to automate these stages as much as possible. But not the first time around, and human checks will still be needed at least periodically. Machines 0 – Me 5, and my future is starting to look bright.

Interpretation and Reporting: This is the final step and raison d’etre of most projects. There is also implementation and assessment post-implementation, but these are big subjects and worthy of their own article. (Hint: The human stuff gets very hairy…) Unless the project is very simple, with a limited number of variables and very basic analytics, this final step is very hard to automate.

Tracking studies in which software can pick out basic trends lend themselves to automated reporting, but the opinions of a human analyst can greatly enhance the final deliverable. The vast bulk of my work, however, is customized and also too complicated for this to be feasible. An AI would need to be developed and trained for each of my projects. Let’s not forget that the core of AI is computer programs, not magic.

So, at best, my job could only be partly automated. Let’s be charitable to the machines and make the final score Machines 1 – Me 5.

I’m safe, for now.

Brokers of Knowledge: The Currency of Market Research

How do research companies deliver actionable insights that contribute to clients’ business success? Be brokers of knowledge.

By Geoff Lowe

Brokers understand what their customers want and use their experience and expertise to get it for them. As market researchers, we are brokers of knowledge and insights. We are experts at exposing the heart of the matter. Our clients rely on us to guide them to results that impact their business. Or do they? According to a recent report by market research leader GreenBook, clients are still dissatisfied with the final products they receive from their research teams. Specifically, “Clients are mostly satisfied with how market research suppliers conduct research, but less so when it comes to consulting skills like understanding their business or business issues and reporting results and recommending actions.”

This is something that we should all pay very close attention to as researchers: tying our work directly back to the client’s business success. Rather than just handing findings over to clients, we must act as good brokers, guiding clients on the right way to invest this knowledge for positive results. In short, our outputs must be shared in a way that goes beyond the insights; beyond even making data-based decisions. Our outputs must substantially contribute to clients taking the right data-based actions. But how do we bridge this gap?

If you’ve been in the market research space for any amount of time, you’ll know there’s no “one size fits all” solution here. Some projects require a deep dive for a full understanding of exactly what’s going on in the marketplace, while others must cater to the C-suite with topline results and high level business impacts. However, there are a few things that are universal. One is the need to spend time in the planning and strategy phase to really get our heads around the client’s business goals. This is essential to shape our deliverables and ensure subsequent actions clients take move them closer to their goals.

So, what do good brokers do? They know their strengths, they ask good questions and they leave their clients a step ahead: smarter and more informed.

  1. Know our strengths. Our clients hire us because we know our stuff. We know how to ask the right questions and how to find the right people to ask. Let’s not pretend we know everything, though: we don’t. Which is why we must…
  2. Ask questions. We can’t do the research our clients need unless we know about them, their goals and their expectations. When we understand these basics, then we can help recommend research-based actions that will lead to positive outcomes and to…
  3. Moving the client forward. Our clients should exit their interactions with us knowing more than they did when they went in. After all we’re broking knowledge. The knowledge we share must leave our clients more equipped, more confident, more in control and closer to meeting their goals.

With the right tools in place, we can broker knowledge that works. Beyond up-front planning and an understanding of the real underlying need, we should harness all the new technology and innovations that are flooding our industry with increasing speed. Our long-standing desire to show the very real impact of market research is now, more than ever before, within our reach. We have tools that integrate data from many sources and create outputs that can influence a company’s actions to reach its goals. We now have no excuse. We must make the currency we trade in, knowledge, really count for our clients.

Data Visualization Best Practices with Tim Bock: 8 Types of Online Dashboards

Data Visualization Best Practices with Tim Bock is back! Learn how to create different types of online dashboards in the latest edition.

By Tim Bock

What type of online dashboard will work best for your data? This post reviews eight types of online dashboards to assist you in choosing the right approach for your next dashboard. Note that there may well be more than eight types of dashboards, and I am sure I will miss a few. If so, please tell me in the comments section of this post.

KPI Online Dashboards

The classic dashboards are designed to report key performance indicators (KPIs). Think of the dashboard of a car or the cockpit of an airplane. The KPI dashboard is all about dials and numbers. Typically, these dashboards are live and show the latest numbers. In a business context, they typically show trend data as well.

A very simple example of a KPI Dashboard is below. Such dashboards can, of course, be huge. Huge dashboards have lots of pages crammed with numbers and charts, looking at all manner of operational and strategic data.

KPI Online Dashboards



Geographic Online Dashboards

The most attractive dashboards are often geographic. The example below was created by Iaroslava Mizai in Tableau. Because people are inspired by such dashboards, I imagine that a lot of money has been spent on Tableau licenses.

While visually attractive, such dashboards tend to make up a tiny proportion of the dashboards in widespread use. Outside of sales, geography, and demography, few people spend much time exploring geographic data.

Geographic online dashboard example



Catalog Online Dashboards

A catalog online dashboard is based around a menu. The viewer can select the results they are interested in from that menu. It is a much more general type of dashboard, used for displaying any kind of data rather than just geography. Here, you can also use any variable to cut the data. For example, the Catalog Dashboard below gives the viewer a choice of country to investigate.

Catalog online dashboards


The dashboard below has the same basic idea, except the user navigates by clicking the control box on the right side of the heading. In this example of a brand health dashboard, the control box is currently set to IGA (but you could click on it to change it to another supermarket).


Dashboard of Supermarket Brand Health (brand funnel)



The PowerPoint Alternative Dashboard

A story dashboard consists of a series of pages specifically ordered for the reader. This type of online dashboard is used as a powerful alternative to PowerPoint, with the additional benefits of being interactive, updatable and live. Typically, a user navigates through such a dashboard using navigation buttons (i.e., forward and backward). Alternatively, they use the navigation bar on the left, as shown in the online dashboard example below.

PowerPoint Alternative Dashboard


Drill-Down Online Dashboards

A drill-down is an online dashboard (or dashboard control) where the viewer can “drill” into the data to get more information. The whole dashboard is organized in a hierarchical fashion.

There are five common ways of facilitating drill-downs in dashboards: zoom, graphical filtering, controls, filters, and hyperlinks from landing pages. The choice of which to use is partly technological and partly related to the structure of the data.

1. Zoom

Zooming is perhaps the most widely used technique for permitting users to drill down. The user can typically achieve the zoom via mouse, touch, + buttons, and draggers. For example, the earlier Microsoft KPI dashboard permitted the viewer to change the time series window by dragging on the bottom of each chart.

While zooming is the most aesthetically pleasing way of drilling into data, it is also the least general. This approach to dashboarding only works when there is a strong and obvious ordering of the data. This is typically only the case with geographic and time series data, although sometimes data is forced into a hierarchy to make zooming possible. This is the case in the Zooming example below, which shows survival rates for the Titanic (double-click on it to zoom).

Unless writing everything from scratch, the ability to add zoom to a dashboard will depend on the components being used (i.e. whether the components support zoom).

Zoom Drill-down Online Dashboards



2. Graphical filtering

Graphical filtering allows the user to explore data by clicking on graphical elements of the dashboard. For example, in this Qlik dashboard, I clicked on the Ontario pie slice (on the right of the screen) and all the other elements on the page automatically updated to show data relating to Ontario.

Graphical filtering is cool. However, it requires both highly structured data and quite a bit of time figuring out how to design and implement the user interface. Such dashboards are also the most challenging to build. The most amazing examples tend to be bespoke websites created by data journalists. The most straightforward way of creating dashboards with graphical filtering tends to be using business intelligence tools, like Qlik and Tableau. Typically, there is a lot of effort required to structure the data up front. You then get the graphical filtering “for free”. If you are more the DIY type, wanting to build your own dashboards and pay nothing, RStudio’s Shiny is probably the most straightforward option.

Geographic dashboard example



3. Control-based drill-downs

A quicker and easier way of implementing drill-downs is to give the user controls that they can use to select data. From a user interface perspective, the appearance is essentially the same as with the Supermarket Brand Health dashboard (example a few dashboards above). Here, a user chooses from the available options (or uses sliders, radio buttons, etc.).

4. Filtered drill-downs

When drilling down involves restricting the data to a subset of the observations (e.g., to a subset of respondents in a survey), users can zoom in using filtering tools. For example, you can filter the Supermarket Brand Health dashboard by various demographic groups. While using filters to zoom is the least sexy of the ways of permitting users to drill into data, it is usually the most straightforward to implement. Furthermore, it is also a lot more general than any of the other styles of drill-downs considered so far. For example, the picture below illustrates drilling into the data of women aged 35 or more (using the Filters drop-down menu on the top right corner).

Supermarket brand funnel - with filter options on the right
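In code terms, a filtered drill-down amounts to recomputing the same statistic on a subset of respondents. The sketch below illustrates the idea in Python with a handful of hypothetical survey records; the field names and values are invented and are not taken from the dashboard itself.

```python
# Hypothetical survey respondents; field names are illustrative only.
respondents = [
    {"gender": "F", "age": 42, "shops_at_iga": True},
    {"gender": "F", "age": 29, "shops_at_iga": False},
    {"gender": "M", "age": 51, "shops_at_iga": True},
    {"gender": "F", "age": 35, "shops_at_iga": False},
    {"gender": "M", "age": 23, "shops_at_iga": False},
    {"gender": "F", "age": 67, "shops_at_iga": True},
]

def usage_share(rows):
    """Proportion of rows who shop at the brand; None when the filter leaves no one."""
    if not rows:
        return None
    return sum(r["shops_at_iga"] for r in rows) / len(rows)

def apply_filter(rows, predicate):
    """Keep only the respondents for whom predicate(row) is true."""
    return [r for r in rows if predicate(r)]

# The filter a viewer might pick from a Filters menu: women aged 35 or more.
women_35_plus = apply_filter(
    respondents, lambda r: r["gender"] == "F" and r["age"] >= 35
)

overall = usage_share(respondents)
filtered = usage_share(women_35_plus)
print(f"All respondents: {overall:.0%} (n={len(respondents)})")
print(f"Women aged 35+: {filtered:.0%} (n={len(women_35_plus)})")
```

Because the filter is just a predicate on respondents, any variable in the data can be used to cut it, which is why this style generalizes so easily.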



5. Hyperlink drill-downs

The most general approach for creating drill-downs is to link together multiple pages with hyperlinks. While all of the other approaches involve some aspect of filtering, hyperlinks enable the user to drill into qualitatively different data. Typically, there is a landing page that contains a summary of key data. The user clicks on the data of interest to drill down and get more information. In the example of a hyperlinked dashboard below, the landing page shows the performance of different departments in a supermarket. The viewer clicks on the result for a department (e.g., CHECK OUT), which takes them to a screen showing more detailed results.

NPS dashboard example



NPS online dashboard example
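The landing-page pattern can be sketched as two views over the same data: a summary with one number per department, and a detail view that a click would open. The Python below is a rough illustration with hypothetical ratings; it uses the standard Net Promoter Score definition (percentage of promoters scoring 9-10 minus percentage of detractors scoring 0-6) and does not reproduce the dashboard's actual calculations.

```python
# Hypothetical 0-10 "likelihood to recommend" ratings by supermarket department.
ratings = {
    "CHECK OUT": [10, 9, 7, 6, 3, 9, 8, 10],
    "BAKERY": [9, 10, 10, 8, 9, 5],
    "DELI": [6, 7, 4, 9, 2],
}

def nps(scores):
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6)."""
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)

def landing_page():
    """Summary view: one headline number per department."""
    return {dept: round(nps(scores)) for dept, scores in ratings.items()}

def detail_page(dept):
    """Drill-down view: qualitatively different data, here the rating breakdown."""
    scores = ratings[dept]
    return {
        "department": dept,
        "responses": len(scores),
        "promoters": sum(1 for s in scores if s >= 9),
        "passives": sum(1 for s in scores if 7 <= s <= 8),
        "detractors": sum(1 for s in scores if s <= 6),
    }

summary = landing_page()
detail = detail_page("CHECK OUT")  # what a click on the CHECK OUT result would open
print(summary)
print(detail)
```

The detail page is not a filtered copy of the summary; it shows a different kind of information, which is what distinguishes hyperlink drill-downs from the filter-based approaches.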


Interactive Infographic Dashboard

Infographic dashboards present viewers with a series of closely related charts, text, and images. Here is an example of an interactive infographic on Gamers, where the user can change the country at the top and the dashboard automatically updates.

Interactive Infographic example



Visual Confections Dashboard

A visual confection is an online dashboard that layers multiple visual elements. By contrast, a series of related visualizations is an infographic. The dashboard below overlays time series information with exercise and diet information.

Visual confection dashboard example



Simulator Dashboards

The final type of dashboard that I can think of is a simulator. The simulator dashboard example below is from a latent class logit choice model of the egg market. The user can select different properties for each of the competitors and the dashboard predicts market share.

Simulator dashboard example
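The arithmetic behind a simulator of this kind can be sketched as follows. In a latent class logit model, each class has its own utilities, shares within a class follow the logit formula, and the overall prediction is the average weighted by class size. Every number, attribute, and class in the Python below is hypothetical and chosen only to show the mechanics, not to reproduce the egg market model.

```python
import math

# Hypothetical part-worth utilities for two latent classes of egg buyers.
# All numbers and attribute levels are invented for illustration.
classes = [
    {"size": 0.6, "price_per_dollar": -0.8, "free_range": 0.3},  # price-sensitive
    {"size": 0.4, "price_per_dollar": -0.2, "free_range": 1.5},  # welfare-focused
]

def utility(product, cls):
    """Linear utility of one product for one latent class."""
    return (cls["price_per_dollar"] * product["price"]
            + cls["free_range"] * product["free_range"])

def predict_shares(products):
    """Logit shares within each class, averaged by class size."""
    shares = {name: 0.0 for name in products}
    for cls in classes:
        exp_u = {name: math.exp(utility(p, cls)) for name, p in products.items()}
        total = sum(exp_u.values())
        for name in products:
            shares[name] += cls["size"] * exp_u[name] / total
    return shares

# The "controls" a viewer would change in the simulator.
scenario = {
    "Brand A": {"price": 3.0, "free_range": 0},
    "Brand B": {"price": 4.5, "free_range": 1},
}
shares = predict_shares(scenario)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Changing a product's properties in `scenario` and calling `predict_shares` again mimics what happens when a viewer adjusts the simulator's controls.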



Create your own Online Dashboards

I have mentioned a few specific apps for creating online dashboards, including Tableau, Qlik, and Shiny. All the other online dashboards in this post used R from within Displayr (you can even just use Displayr to see the underlying R code for each online dashboard). To explore or replicate the Displayr dashboards, just follow the links below for Edit mode for each respective dashboard, and then click on each of the visual elements.

Microsoft KPI

Overview: A one-page dashboard showing stock price and Google Trends data for Microsoft.
Interesting features: Automatically updated every 24 hours, pulling in data from Yahoo Finance and Google Trends.
Edit mode: Click here to see the underlying document.
View mode: Click here to see the dashboard.


Europe and Immigration

Overview: Attitudes of Europeans to Immigration
Interesting features: Based on 213,308 survey responses collected over 13 years. Custom navigation via images and hyperlinks.
Edit mode: Click here to see the underlying document.
View mode: Click here to see the online dashboard.


Supermarket Brand Health

Overview: Usage and attitudes towards supermarkets
Interesting features: Uses a control (combo box) to update the calculations for the chosen supermarket brand.
Edit mode: Click here to see the underlying document.
View mode: Click here to see the online dashboard.


Supermarket Department NPS

Overview: Performance by department of supermarkets.
Interesting features: Color-coding of circles based on underlying data (they change when the data is filtered using the Filters menu in the top right). Custom navigation, whereby the user clicks on the circle for a department and gets more information about that department.
Edit mode: Click here to see the underlying document.
View mode: Click here to see the online dashboard.
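For readers who want to see the arithmetic behind a dashboard like this one, here is a minimal Python sketch of the standard Net Promoter Score calculation (percentage of promoters, who rate 9 or 10, minus percentage of detractors, who rate 0 to 6) together with a simple color rule. The color thresholds, the sample ratings, and the function names are hypothetical and are not taken from the Displayr document.

```python
def net_promoter_score(ratings):
    """NPS = % promoters (9-10) minus % detractors (0-6) on a 0-10 scale."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)


def circle_color(nps, good=30.0, poor=0.0):
    """Map an NPS to a color; the cut-offs are assumptions for illustration."""
    if nps >= good:
        return "green"
    if nps >= poor:
        return "amber"
    return "red"


# Hypothetical likelihood-to-recommend ratings for one department.
checkout_ratings = [10, 9, 8, 7, 6, 3, 10, 9]
checkout_nps = net_promoter_score(checkout_ratings)
print(checkout_nps, circle_color(checkout_nps))
```

Because the score is recomputed from whichever ratings are passed in, applying a filter simply means calling the function again on the filtered subset, which is why the circle colors can change when the data is filtered.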


Blood Glucose Confection

Overview: Blood glucose measurements and food diary.
Interesting features: Fully automated underlying charts that integrate data from a wearable blood glucose implant and a food diary. See Layered Data Visualizations Using R, Plotly, and Displayr for more about this dashboard.
Edit mode: Click here to see the underlying document.
View mode: Click here to see the online dashboard.


Interactive infographic

Overview: An infographic that updates based on the viewer’s selection of country.
Interesting features: Based on an infographic created in Canva. The data is pasted in from a spreadsheet (i.e., no hookup to a database).
Edit mode: Click here to see the underlying document.
View mode: Click here to see the online dashboard.


Presidential MaxDiff

Overview: A story-style dashboard showing an analysis of what Americans desire in their Commander-in-Chief.
Interesting features: A revised data file can be used to automatically update the visualizations, text, and the underlying analysis (a MaxDiff model); i.e., it is an automated report.
Edit mode: Click here to see the underlying document.
View mode: Click here to see the online dashboard.


Choice Simulator

Overview: A decision-support system
Interesting features: The simulator is hooked up directly to an underlying latent class model. See How to Create an Online Choice Simulator for more about this dashboard.
Edit mode: Click here to see the underlying document.
View mode: Click here to see the online dashboard.


What the Research Now & SSI Merger Really Means

Research Now & SSI are merging, creating the newest Mega MR company. But the implications go far beyond a simple merger of sample companies and will have far reaching impact on almost all aspects of the insights industry.


The industry has been abuzz for months with rumors of a merger between sample giants Research Now and SSI, and last week this “open secret” finally became public fact.

There were many reactions to the announcement both publicly and privately, running the gamut from concerns about monopolization (silly based on the immense number of competitors in the industry) to the impact on human capital in mergers like this (a realistic concern since you can bet there will be layoffs due to resource arbitrage) and concerns about panel duplication and data quality (silly again since most studies use sample blending from multiple sources, including Research Now and SSI). However, some of the conversations I had with folks quickly moved to the more far reaching and strategic implications, which is what I want to focus on in this post.

Coming in at numbers 30 and 38 respectively in my recent ranking of the Top 150 companies in MR, the two companies combine into a business with around $500M in annual revenue (assuming my size estimates are accurate!), which moves it easily into the Top 20, likely with a market cap that puts it close to “Unicorn” status in value. It’s important to recognize this fact. Not only does this company have larger revenues than most of the sector, it is far larger than some of the most influential companies in our space: Qualtrics, SurveyMonkey, Google Surveys, Zappistore, VisionCritical, Medallia, etc. In fact, its revenues are larger than those of Qualtrics and SurveyMonkey combined. Few would argue that those companies are not centers of gravity in our industry and bellwethers of where things are going. The new entity is large enough that it will become another center of gravity, and the industry will begin to morph and adapt around it. In short, it is now a catalyst of accelerated change by sheer virtue of its size, reach, and influence.

Outside of the obvious financial aspects of the deal, in the press release issued by the two companies they were far more open about their intentions than announcements of deals usually are:


Because of accelerated adoption of automation and digital and social technology by both brands and consumers, considerable opportunities exist for delivering solutions for data-driven consumer engagement.  The combined assets of the two companies – in first-party data, technology platforms, and partnerships with major brands, publishers, and ad tech providers – will position the integrated organization to enhance its product offerings to include new services such as audience activation and engagement, paths to purchase, measurement and AI-based insights.

Together, the companies will be able to serve customers better and faster, while developing innovative marketing solutions based on technology and data. The companies’ complementary capabilities – in data integration methodology, automated and DIY research, cross-channel ad and audience solutions, panel recruitment/management technology, mobile market research and geo-location solutions, multi-mode data collection and operational platforms – will provide the foundation for leveraging first-party data beyond traditional market research as well as accelerate development of innovative client solutions based on data, technology, and real-time consumer sentiment. In addition, the new entity will have expanded global reach, allowing worldwide delivery of existing and new solutions.

Chris Fanning, president and CEO of SSI, says, “Combining our capabilities and talents creates accelerated synergies across products, services, and operations.  The combination will also enable accelerated investments in the development of new markets and data solutions that will ultimately help our customers grow their businesses more successfully.”

Gary S. Laben, CEO of Research Now, says, “Together, we can advance the state-of-the-art in automated research, delivery and solutions as well as in research-enriched data integration to give our customers increased competitive advantage. We will be better able to serve existing and new customers with a range of solutions for data-driven marketing and decision-making informed by research-based direct consumer engagement.”


Let’s parse this out a bit.

First, both companies have been developing a suite of automated solutions, and they very rightfully point out that part of the motivation for the merger was the “accelerated adoption of automation”, which applies both to the sample segment (Lucid, Cint, Pureprofile, Prodege, P2 Sample and other sample automation players) and to data collection and reporting (Zappistore, Gutcheck, Wizer, Methodify, AYTM, QuestionPro, DiscussIO, Remesh, and many more). If the flood of VC funding into the companies that have been pioneering automation in insights was not signal enough, the merger of two large entities driven by the same trend should be the final sign that automation is not only the future but will be one of the defining drivers of this industry for the next few years.

Second, they boldly announce that they are no longer simply a research company: “…considerable opportunities exist for delivering solutions for data-driven consumer engagement…leveraging first-party data beyond traditional market research….range of solutions for data-driven marketing and decision-making informed by research-based direct consumer engagement.”  Again, many of the newer entrants in the automated sample market I listed earlier are ahead of them here, but make no mistake, the merger of Martech, Adtech and Research is about to accelerate significantly.  Last year my good friend Peter Orban wrote on this topic several times, and his take then, based on the available evidence at the time, is worth repeating:


Whether Martech eats Adtech, they merge or continue to coexist one thing seems to be clear – we are looking at the new, data & software fueled marketing world order. We know software is eating the world and it is certainly eating every aspect of marketing.

But here is the scary thing: in this new world order there is no mention of Marketing Research. Pour through any of the landscapes and there is not a single utterance of the word “research”. Yes, there are research companies mentioned under various different headings, such as: “Customer Experience”, “Performance Attribution”, “Audience & Market Data” and “BI, CI & Data Science” but there is no ‘bucket’ called research.


It is clearly the intention of Research Now and SSI to move as quickly as possible to correct this and to stake their claim as a major player in the world of DMPs that drive the Martech/Adtech industries. The wall between research and marketing has been crumbling for many years. Assuming they can pull it off, and assisted by the significant strides their smaller competitors have already made, this should be the blow that knocks it down completely.

Third, if it was ever in doubt based on their movements over the past few years, they will join Toluna as a former “sample company” that unabashedly serves both research suppliers and direct clients with a whole host of research solutions. “The companies’ complementary capabilities – in data integration methodology, automated and DIY research, cross-channel ad and audience solutions, panel recruitment/management technology, mobile market research and geo-location solutions, multi-mode data collection and operational platforms…” clearly says they will double down on developing solutions that can capture end-user budgets directly. Their competitive set has been redefined to effectively include every sample company, technology provider, and agency in the world. The good news is that as they integrate, scale and re-position the business, they will very likely shy away from any aspect of service so as not to impact their drive towards a “tech company valuation”. The upside for agencies is that this will further accelerate the necessary shift from data collection to consulting-based business models and will further solidify the bifurcation of the industry into tech and service segments.

Lastly, I think this may finally be an opportunity to begin to fundamentally re-think our relationship with consumers. I wrote a few weeks ago that “Personal data has been described as the ‘new oil’ that will drive the economy of tomorrow, but it’s currently being treated as a commodity rather than a precious resource. We need to start developing models that both incentivize and reward individuals for contributing to the data economy – and here’s how.” If the new company truly wants to be in the consumer engagement and activation business, then the single best way to do that is to transform its relationship with individual panelists from a purely transactional one to something that fits in the “personal data economy”. I am not as willing to bet on this outcome as I am on the previous three, but I sincerely hope the leaders of the organization can realize that a path to this model could be the real key to unlocking their growth and valuation.

My overall net is that in the short run there will be some upheaval in terms of their internal operations, staffing levels, competition and pricing but it will also push the industry into new markets and create more jobs eventually. They are pivoting to being a DMP for Martech and Adtech and will invest in technology to make that happen (AI and automation), and as a likely Unicorn in terms of valuation they will be able to influence the direction of the industry as a whole in a very significant way. This has to happen for us to thrive, and this deal is big enough that it will be a very meaningful catalyst. It will also help smaller players already going down similar paths as validation for their models and will add even more energy to investment dollars flowing in.

Just to double check my thinking, I reached out to a few folks to ask their opinions. I didn’t ask their permission to attribute their comments publicly so I won’t do that here, but they are some of the smartest people in the industry who have deep insight into macro trends.  Here are a few:


My initial guess is in the next 6-12 months we’ll see prices up a bit, quality up a bit, operating costs down, and then they need to pick the right extensions. Oh, and a bunch of clients are going to moan, and a number of people who hoped for sponsorship will be disappointed for a while!



Good to see this is now official news. Think it’s good for the industry, will create openings and opportunities for many, and will shift clients. Time will tell if it’s a good merger, but not sure either company had much choice or too many other moves they could make. Good luck to them though. Big company after the merger, with scale, and known brand, so they will be a serious player regardless.


Interested to see how well they execute. SSI’s recent acquisitions have been executed very well in my opinion, though this is a very different beast. If executed well I think this will add significantly more value than the sum of its parts — SSI has built competitive advantage in tech to integrate other panels + sample distribution, RNs self-service platform is decent and they are positioning themselves well in the programmatic space as a premium data provider — self-reported data matched to other 3rd party / offline data and connected to client 1st party data. One thing I think you’re missing is that service will likely drop in the short-medium term as staff retention will falter as the merger negatively impacts culture. Opportunity for other panels to pick up top talent, and bring clients with it.

Long term I see the battle between SSI/RN, Cint, Kantar’s data division (though they will struggle to win outside of WPP), with YouGov nipping at their heels, and a few dark horses like us that if can execute can quickly grow and mount a significant challenge.

My prediction is that in the short-medium term you’ll see the players that can connect, productize, and sell their data offerings will perform the strongest. Cint, RN, and to a lesser extent Kantar, YouGov, Lucid seem to be making the most waves at the moment on this front. I wouldn’t be surprised if you see survey programming (qualtrics/survey monkey), mar tech/programmatic tech platforms (Adobe/Oracle/etc.), and data companies (Data Logix, LiveRamp, Neustar) with a war chest start buying panels for similar reasons.

Long term it will be the businesses with the best connections to the end-consumer that will win. This will be driven by having access to better and more scaleable data, if not driven by regulation. For this reason I think Cint and Kantar with current strategy is shaky long term. RN / SSI still need significant improvement in their consumer prop to reach the necessary scale to win in media (once we move past predictive modelling to scale trainer sets).


So there we go. I may be wrong, but if so I am in very good company. The consensus definitely seems to be that this merger may be one of the more significant events to occur in our industry in the past few years and will have far reaching implications for not just Research Now and SSI, but for the entire insights space.

One last note, and a personal request to Chris Fanning and Gary Laben, the CEOs of SSI and Research Now.

Earlier in this post I mentioned that many jumped to worries about the security of their jobs, and it’s a reasonable concern. Normally I would think that whatever talent you shed during consolidation would be snapped up by others, but I’ve heard much over the years about the Non-Competes you have in place for your teams.  Non-competes are absolutely appropriate in many employment transition situations, but in my opinion not in scenarios of involuntary downsizing or layoffs.

Since so many folks expressed concern about this issue, I am publicly asking that you reconsider your Non-Compete policy for any employees who are not sufficiently “packaged out” to allow them not to worry about working during the term of such an agreement. Your new company is a major force for good in this industry and will only bolster your reputation and brand by being known as an organization that puts people on the top of their list of concerns. Plus, as you move more into the realm of tech you will need to recruit good people to support that vision, and restrictive Non-Competes without commensurate compensation clauses tend to be counter-productive for attracting great talent.

To put skin in the game myself, I promise to use my resources to help anyone caught in the transition to make the needed connections to potential new employers; just email me at and connect with me on LinkedIn.

Can Research be Right if the Participants are Wrong?

Posted by Hugh Carling Friday, October 13, 2017, 7:00 am
Help benchmark the views of qualitative researchers and share your experiences on research participant recruitment.

By Hugh Carling

Share Your Research Recruitment Experiences

Effective recruitment and good sample quality are key challenges within the research industry, as the latest GRIT reports revealed. Clients and suppliers consider ‘being able to trust the results’ and ‘sample/panel quality’ as the most important factors in research study design (GRIT 2016 Q3-Q4). But the very latest GRIT survey found that 47% of insights buyers and 42% of research providers thought that sample quality was getting worse. Trust in sample quality within the industry is clearly low.

We want to benchmark the views of qualitative researchers on the current state of research recruitment to help us recognise and understand the challenges faced by researchers, to explore opportunities for improving recruitment methods within the industry and to improve our Behavioural Recruitment services.

We’d love qual researchers around the world to tell us their views on research participant recruitment in this 10-15 minute survey: Take the Survey

The report will be shared with all participants and one lucky participant will be randomly selected to receive $130.


The Fallacies of Facial Coding: Exploring Prophecy Feelings vs Facial Coding

Facial coding is a widely used tool to measure emotions. This blog from Michael Sankey, Ph.D., and Ken Roberts examines how effective it is in predicting purchase behavior for ad testing.

By Michael Sankey Ph.D. and Ken Roberts

Forethought welcomes all good science into the world of marketing. We are advocates for evidenced-based decision making in marketing and we welcome competition. But every now and again we see the need to examine alternate offers for our own education and that of our clients.

Recently, we found it interesting and worrisome to learn that more than a third of global Fortune 500 companies are using facial coding for ad pre-testing and some (e.g. Unilever and Mars) have made it a mandatory component for all copy-testing. Brands want to understand emotion in consumption. Great, no quarrel there. So, why does this news worry us? In a nutshell, facial coding for ad testing does not predict purchase behavior.

What are our credentials to comment? Since 2007, we have been in an ongoing conversation with world-leading academics from MIT, Cornell, Duke, ANU, Monash, UNSW and LBS to examine our methodology, Prophecy Feelings (which quantifies and models emotion in choice at a category, brand and communications level), and explore it against other emotions measurement methods such as facial coding. Our post-doctoral colleagues have made it their business to understand the differences and the valid applications of our own approach versus competitor methods. We have been invited to present and debate the validity of emotions measurement at international marketing science conferences including IIEX (USA & Aust), ARF Annual Conference and Audience Measurement (USA), MRMW (USA), AME (China), AMSRS (Aust) and WARC Researching Implicit (UK). Our work in this space has been rigorously tested: in being granted patents in the USA, UK and Australia, during the peer-review process for the Marketing Science Practice Prize, and for publication in the journal Marketing Science.

We wanted to offer you some of what we have learned:

  1. Facial coding is based on the work of Ekman, who identified six basic facial expressions of emotion that he claimed to be universally recognized. From a marketing perspective, his theory was not grounded in understanding discrete emotions linked to consumption behavior. Conversely, the Forethought Prophecy Feelings scale measures implicit emotional responses to brands and communications on nine discrete emotions. These emotions have been empirically demonstrated to drive consumption behavior across different categories and regions (see Laros & Steenkamp, 2005). Moreover, and most critically, there is minimal, if any, scientific evidence establishing facial coding as a predictor of market performance. The Prophecy Feelings scale is scientifically validated and provides a causal model for the relative importance of nine discrete emotions in driving consumption behavior across categories with high degrees of predictive validity (i.e. correlations frequently above 0.7 in predicting future changes in market share with a one period lag and validated using third-party data). The technique has been published in top tier academic journals for its ability to augment explanatory power in models of choice and delivers far richer insight than commonly employed, stated emotional measures.
  2. Emotion is not constrained to facial expression – it must also consider body movement, especially for social emotions such as Pride or Shame. Pride and Shame have not been linked to distinct facial behaviors. The Prophecy Feelings scale animates both body movements and facial expressions, thus enabling a more holistic representation of the emotion. Moreover, an emotion such as Pride (which is not captured in facial coding) has frequently been proven in Forethought studies to be a strong driver of consumption behavior across many diverse categories.
  3. While appropriate for measuring dynamic (second-by-second) content such as video, facial coding is extremely limited in its ability to measure static content (e.g. print, brand logo). In contrast, Prophecy Feelings captures emotional intensity after the stimulus (static or dynamic) has been shown. This data may then be used to model the hierarchy of emotions driving purchase behavior.
  4. Forethought believes that to most effectively assess communications performance, marketers should be assessing campaigns on their ability to change brand performance (rational and emotional) on the scientifically derived drivers of market share. Facial coding does not measure change in the brand’s emotional performance due to the communication – it simply measures the emotional performance of the communication. In contrast, Prophecy Feelings determines creative efficacy by measuring the degree to which the creative improves/detracts from the brand’s performance on the key emotional consumption drivers; however, it can also be used to assess the emotional performance of the creative itself. From our experience working with clients across many diverse categories, we have seen widespread evidence of creative that elicited strong levels of emotion (positive and/or negative) but did not shift the emotional performance of the brand (i.e. the emotion elicited by the ad was not linked to the brand). If the objective of the communication is to bring about a business outcome such as gaining market share, then the creative measurement should be assessing change in emotional performance of the brand.
  5. Facial coding is dependent on the experience of relatively strong emotions – muscle movements that occur below the threshold of visual observation cannot be analyzed. Facial coding has greater sensitivity for some emotions (e.g. Happiness) than others (e.g. Surprise and negative emotions such as Contempt). Indeed, only the feeling of Happiness with its expressive feature of the “smile,” is observably related to the underlying physiological and facial pattern of expression (Wolf, 2015). While both facial coding and Prophecy Feelings capture different degrees of emotional intensity, the Prophecy Feelings scale is arguably more sensitive and can detect more subtle emotional differences, as it is not dependent on a physical manifestation of emotion.
  6. Deploying facial coding outside of a controlled environment is challenging. Three important considerations – pose, illumination, and expression – need to be tightly controlled. For instance, on a mobile device the respondent may be moving around or may shift the angle at which they’re holding the device, thereby hindering the ability to detect facial expressions. Prophecy Feelings is not subject to the same limitations, as the scale captures discrete emotion and contains the contextual elements that the emotion pertains to.
  7. Facial expressions are highly complex and context-dependent. For example, gender, culture and age differences in expressiveness have all been observed. Studies have demonstrated that perceptions of emotions through facial expressions are not universal, but highly influenced by cultural contexts (Gendron, Roberson, Marietta van der Vyver, & Barrett, 2014).
  8. Facial coding has challenges in accounting for differences in facial morphology. For example, some people have a curvature to the mouth that naturally (i.e., when not otherwise emoting) looks like a smile or a frown.

From the evidence we’ve seen, facial coding does not inform which emotion should be elicited in communications (i.e. it doesn’t help to understand the hierarchy of emotional drivers of market share in a given category), rather it is a moment-by-moment measure of emotional elicitation in creative.

Marketers must first understand which emotion to communicate. This can be achieved by building a quantitative model determining the hierarchy of importance of discrete emotions driving category consumption (note to facial coding: it’s not always Happiness!), then undertake the process of working with research and agency partners to understand how that emotion is best triggered for your brand to help inform communications development. The creative should then be assessed to determine its performance in activating the target emotion, while at the same time understanding how the emotional content may be optimized in both current and future creative.

Conducting Latam Online Research? Read This First

Find out how to conduct successful research in Latam with this guide to navigating the region.

By Andreina De Abreu

We now live in a digital, data-driven society, and we know the Internet has made a lot of things (and our lives) easier. Even so, when you do research, you will always have to deal with many “unknowns.” In this increasingly global world, there are (still) many differences among countries and cultures. It’s like entering a territory with many lands unconquered.

Researchers are required to develop global research and to collect data across different nations, each with its own culture, language, internet penetration, mobile adoption and so on. In a continent as unique as Latin America, being knowledgeable about the cultural differences (and particular qualities) is what will define the success of your research.

Before deciding which method, or combination of methods, you will use to collect your data, there are some basic tips for understanding Latam markets. These will help you invest your efforts (and resources) more wisely in order to get insightful data from your online research.

Internet penetration

Internet penetration in Latam is lower than in other markets. The gap used to be larger years ago, but rapid mobile internet adoption has helped close it. Internet adoption is unequal, depending not only on the population but also on socioeconomic level and on whether the population is urban or rural.

Young population

The population in Latam is dominated by a large young segment. This is why marketers pay a lot of attention to this group. It is also great for online data collection, because this segment makes up a major portion of the sample. Young people are adopting the internet at a far higher rate than other age groups. So, that’s great news for your research project!

Go mobile!

The continent is migrating enthusiastically to the Internet, but at a slower pace than other regions. As we mentioned, the younger population is accelerating the process, and it is also creating a particular online ecosystem that is different in essence from the one in Europe or North America.

What is the main difference? Precisely the mobile effect. Most of Latam is jumping onto the internet through smartphones, attracted by the interactivity offered by WhatsApp and Twitter. This is not what happened everywhere. Germans, for instance, started using the Internet through their desktops and walked through the online content revolution (from news websites to social media).

Now, how does this impact your research? Well, if you are planning on doing an online survey, your questionnaire should be adaptable to all devices. Otherwise, you might experience a drastic drop-out rate.

Internet infrastructure

Whether mobile or fixed, Internet connectivity is not as fast nor as widespread as it is in Europe or North America. Is this important? Well, yes. The most important reason is that surveys might take longer to complete.

You might not believe it, but the same 20-minute questionnaire administered to participants in Europe and North America may become a 25-minute one in Latam. Even though some of the difference might be caused by translation (for example, the same question in Spanish will probably be longer because of the number of words needed to express the same idea), the main reason is that internet speed is slower in Latam.

Now… who is to blame for this? The panelist? The panel company? The answer is: neither. Before unfairly punishing your participants, take their commitment into consideration. Reward them according to the time they spend completing your survey.

Social class differences and their definition

Social class has been a very important aspect in the industry since the rise of traditional market research because, among other reasons, it is a frequent quota variable used to control sample representativeness.

In Latam, social class is more than important: it needs to be analyzed because of a demonstrable fact, namely that differences among classes are huge! For instance, the low social stratum is large compared to Europe and the United States. In addition, people in this stratum do not live in the conditions their counterparts have in those two regions. We are referring to people living in shacks, waking up in the morning without knowing what they are going to eat later that day.

Besides these differences, the way social class is defined in Latam is very complex and varies from one country to another. The method used to define social class might give a vague result. Normally, the participant is asked several questions about household composition and characteristics (including shocking ones like the number of light bulbs at home or the floor’s type of material). As a result, the social class assigned to each person varies a lot depending on the answers to one question or another.

Very large urban population

The word diversity suits Latam well. There are great differences between regions within each country. Depending on the area, urban or rural, the characteristics (including population) are very dissimilar. Some of the most populated urban areas in the world are located in Latam: Mexico DF, São Paulo, Buenos Aires, Rio de Janeiro, Lima, Bogotá and Santiago de Chile.

Internet access is one particular difference to keep in mind. Access in rural areas is extremely low, which makes it difficult to get a representative sample from those areas.

The key to successful research is to design it around each country and its diversity. If that is not possible, try to group countries with similar characteristics. Regional accents are one aspect to consider: Colombia and Venezuela have a similar way of speaking Spanish, but a Mexican or a Chilean sounds quite different. This is crucial when you are creating an online survey.

Last but not least, look for help. Find a panel company with a solid presence in Latam. Usually, the companies that deliver the most accurate representation of Latam are the ones that already have a presence in the region. Viva Latin América!

Want to Become a Data Scientist? Read This First.

Posted by Kevin Gray Tuesday, October 10, 2017, 7:00 am
Interested in becoming a data scientist? Learn the basics about this trending field in Kevin Gray's interview with Jennifer Priestley.

By Kevin Gray and Jennifer Priestley

There’s been a lot of hype about Data Science…and probably just as much confusion about it. Marketing scientist Kevin Gray asks Jennifer Priestley, Associate Dean of The Graduate College and Professor of Statistics and Data Science at Kennesaw State University, what Data Science really is, what makes a good Data Scientist, and how to become one.

Can you define Data Science for us in simple, layperson’s terms?

I like the definition that was tweeted (appropriately) by Josh Wills, Director of Data Engineering at Slack – “(It’s the) Person who is better at statistics than any software engineer and better at software engineering than any statistician”.  I like to add what I call “The Priestley Corollary” – “(It’s the) Person who is better at explaining the business implications of analytical results than any scientist and better at the analytical science than any MBA”.  

What’s the difference between a statistician and a data scientist?

It’s a great question.  I am also frequently asked, “What’s the difference between a computer scientist and a data scientist?”  The fact that both disciplines question whether there is effectively anything new here is telling.  While both domains are contributing in important and meaningful ways to this nascent discipline, neither is independently sufficient.

Data is not only growing in size, but the definition of what we even consider to be data is expanding.  For example, text and images are increasingly common forms of data to be integrated into analytical methodologies like classification and risk modelling.  This expanding definition of data is pushing both statistics and computer science out of their traditional cores and into their respective fringes – and it’s at those fringes where the new thinking is taking place – and the fusion of the fringes is forming the basis of Data Science.  Much of the traditional core of statistics does not readily accommodate problems defined by billions of records and/or by unstructured data.  Similarly, while the core of computer science enables the efficient capture and storage of massive amounts of structured and unstructured data, the discipline is ill equipped to accommodate the translation of that data into information through modelling, classification and then visualization.

I do agree that in Data Science circles, statisticians are more likely to get the short end of the stick.  I think this is unfortunate.  A few years ago, there was an article on the Simply Statistics blog, “Why Big Data Is in Trouble: They Forgot About Applied Statistics”. The article highlighted how the rush toward the excitement of machine learning, text mining, and neural networks missed the importance of basic statistical concepts related to the behavior of data, including variation, confidence, and distributions, which led to bad decisions.  While Data Science is NOT statistics, statistics contributes in a foundational way to the discipline.

Until a few years ago, few of us had ever heard of Data Science. Can you give us a snapshot of its history?

The term has been traced back to computer scientist Peter Naur in 1960, but “Data Science” also has evolutionary seeds in Statistics. In 1962 John W. Tukey (one of the best known and respected Statisticians of our time) wrote: “For a long time I thought I was a statistician, interested in inferences from the particular to the general. But as I have watched mathematical statistics evolve, I have … come to feel that my central interest is in data analysis… data analysis is intrinsically an empirical science.”

A reference to the term “Data Science” was made in the Proceedings of the Fifth Conference of the International Federation of Classification Societies in 1996.  The article was titled “Data Science, Classification, and Related Methods”.  In 1997, during his inaugural lecture as the H. C. Carver Chair in Statistics at the University of Michigan, Professor C. F. Jeff Wu (currently at the Georgia Institute of Technology) actually called for statistics to be renamed data science and statisticians to be renamed data scientists.

A critical milestone for Data Science occurred in 2002 with the launch of the first academic, peer-reviewed journal dedicated to Data Science – Data Science Journal…followed the next year by The Journal of Data Science.  Since then several other journals have emerged to promote and disseminate academic research specifically in this space.

The emergence of dedicated academic journals is particularly important to the academic community – these journals now enable emerging doctoral programs (like ours) and emerging academic departments to establish unique platforms for research, scholarship and publication.  Now Data Science faculty and doctoral students can engage in the production of knowledge and thought leadership within their own community – not Statistics, not Computer Science, not Mathematics, not the Business School.

A 2011 study by McKinsey that has been widely publicized predicted that by 2018 “…the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions.” How accurate has this forecast proven to be? Are there now other projections we should use instead?

I get asked this question a lot – specifically, a lot of corporate executives ask me questions like “Is this Data Science thing a fad?”  I think we need to reframe the discussion.

My opinion is that we don’t need “190,000 people” or “1.5 million managers” who have deep analytical skills.  I think everyone needs to have some level of analytical skills.  I think that basic data literacy should be as foundational in our education system as reading and math.  It’s encouraging to see basic coding skills increasingly being taught in elementary schools.  At the university level, my opinion is that Data Science should be part of the General Education curriculum (right now I can hear our Academic Affairs office gasping).  

So, while the current talent gap is very real, it’s a function of an education system that has been misaligned with the demands of the market.  Education at all levels is still pivoting and likely will continue to do so for the foreseeable future.  I would expect that within a generation, the demand for these skills will not diminish, but the supply will be more closely aligned.

Many people, including those contemplating a mid-career change, have set their career sights on Data Science. It may not be right for everyone, though. What sorts of aptitudes and skills do you need to work in Data Science? What are the best ways to become a Data Scientist?

It’s an excellent question.  We can talk about the generation coming up behind us…and what we need to do to get them ready…but the reality is that there are a lot of people in their late 20s, 30s, 40s who are looking for opportunities to pivot their career towards Data Science.

I see a lot of these people in my office.  I have had more than one conversation that goes like this: “I just paid $10,000 to XX university to complete a certificate in Data Science…and I still can’t get a job”.  While some of these “certificates” are well developed and are good value, sadly, many are not.

First, you cannot go from being a poet to a Data Scientist by going through a 5-day certificate program.  Or worse, an online certificate program.

Second, I think people need to have realistic expectations about what it truly takes to accomplish their career goals. These skills are in high demand and are well paid because they are HARD – or at least take initiative to develop and refine.

Third, I think people need to take an inventory of where their skills are now and where they want to go.  The answer to that question of course will dictate how to get there.  Those who fall for the lure of the easy online certificate programs should be mindful of the Cheshire Cat from Alice in Wonderland – “if you don’t know where you are going, it really does not matter what path you take”.

Here is what I tell people who ask my advice in this area:

  1. If you are a poet and you are looking to get into Data Science – REALLY get into Data Science as a career in a deep and meaningful way – you need to put away your plumed pen and pull out your jeans and your backpack, and go back to school. Full time. Most graduate programs in Data Science are less than two years and most offer some form of graduate research assistantship. You should be looking for programs that include programming, statistics and modeling, but also ample opportunity to work on REAL world projects with local companies, nonprofits, local governments…etc.  I can’t emphasize strongly enough how critical applied, hands-on, real experience is to any Data Science program. This is why online/short term certificate programs don’t work for people who are starting from scratch in this area.  It’s hands-on experience that will help people understand the more latent aspects of data science – like the role of storytelling, creativity (which is woefully underappreciated) and project management.
  2. If you are a computer scientist/programmer, look for a business school program with an analytics track or a strong applied statistics program.  Presumably your coding and math skills are where they need to be – you likely need the statistics/modeling/analytics – and the training (again) to tell the story and learn how to work in teams of people who think differently than you do.
  3. I would encourage everyone/anyone to consider learning basic visualization tools like Tableau.  I would also encourage anyone/everyone to consider taking online/asynchronous programming courses periodically.  These are typically inexpensive (even free) and enable you to continue to keep your skills sharp.

I get the argument that not everyone wants to become a computer programmer – I do not particularly enjoy programming.  I had to learn to program to get answers to the research questions that were posed to me.  If I could have found the answers using my trusty HP-12C and a mechanical pencil I would have.  You have to know basic math, you have to be able to read and write and, increasingly, you have to be competent in some basic programming in the 21st century.         

Data Scientists frequently comment that, in many organizations, management doesn’t really know how to use analytics for decision making. Decisions are still mostly made by gut and heavily influenced by organizational politics. Is that also your experience?

I frequently give talks at corporate events, where this issue is present in the room – even if it is not vocalized.  I frame the conversation like this – organizations can be roughly categorized as native and non-native to data.  

Examples of the “natives” are the companies that dominate the headlines – as well as the stock market – Amazon, Google, Facebook. These companies could not have existed 30 years ago.  Not only did the data that is so foundational to who they are and what they do not exist, but even if it had, we did not have the computing power to capture it or to execute the deep analytical methodologies related to AI, machine learning, deep learning…that enable them to do what they do.

However, another dimension to these companies that is often overlooked is that because they are native to data, this has HUGE cultural implications.  These are data-driven companies from the top to the bottom of the org chart.  They have data running through their DNA.  Most everyone who comes into these companies has a data centric orientation – and likely studied a computational discipline – increasingly Data Science.  The median age of an employee at Facebook is 29.  At Google, it’s also 29 and at Amazon it’s 30 (not including warehouse employees).

Companies that are non-native to data are the companies that were successful long before we heard terms like “Data Science” and “Big Data.”  Examples might include Walmart and Arby’s.  These are very successful companies that did not initially have data running through their DNA.  And although these companies now lean heavily into data to inform their decision making and delivery of their products and services, there is a great deal of variation across the org chart in terms of computational literacy.  But their leadership has been very forward thinking in terms of making these companies leaders in their markets because of the cultural shifts in becoming fact-based, data-driven organizations.  Others in their respective markets (Sears, Macy’s…McDonald’s, Wendy’s) have not.

Lastly, what impact do you think Artificial Intelligence and automation will have on Data Science in the next 10-15 years?

I’m not really an expert in this area, but I would say that any forecasting of the death of Statistics, Computer Science or Data Science because of automation is premature.  Calculators “automated” mathematics…but mathematics is broader and more complex now than it was before calculators.  I expect the same will be true in Data Science.

Thank you, Jen!