50 New Tools Democratizing Data Analysis & Visualization
Just as we’ve seen the shift to “DIY” data collection platforms, we’re also seeing the development of a whole new class of self-service data exploration and visualization tools. These are not necessarily replacements for SPSS, SAS, R, and other traditional analytical suites: those enterprise-level systems are often still needed for more complex and advanced statistical analysis. However, in many ways these newer entrants have a leg up over legacy providers: they are less expensive (in many cases free), more flexible, easier to use, and built with the needs of a variety of users in mind. Today it’s possible to find many tools that can help even the most inexperienced user quickly begin doing sophisticated analysis and produce great visualizations from many different types of data.
The democratization of data exploration is well underway.
Of course there have always been many software options available to analysts and researchers, so increased choice in and of itself isn’t revolutionary. For instance, thinking only of the market research industry, survey-based data collection platforms have for years integrated fairly advanced data exploration tools into their offerings, but those tools have been mostly limited to the data collected in that system and don’t easily allow for the synthesis of external data sets. Even though they are often capable of much more, general usage has been confined to field management and some early-stage client access rather than truly integrated data reporting.
When I started in MR at the turn of the century, the standard data processing workflow was generally to export data in a delimited format to SPSS, clean it up in that package, export again to WinCross or another tabulation package, produce voluminous crosstabs, and create Excel charts from the tables. Software like e-Tabs and MarketSight sought to streamline some of that process, and of course some organizations did more fully leverage the capabilities of data collection systems or develop homegrown solutions using macros, but for much of the industry this was SOP.
To be fair, that was generally all that was expected of market researchers: to analyze discrete data sets and produce a variety of graphical depictions of the major findings. This expectation started to change in the early 2000s with the advent of dashboarding platforms such as Xcelsius, Crystal Reports, Dundas Charts and others, as client demand for new ways to interact with data began to emerge. Often, though, those platforms were still dependent on a phased approach: data collection in one system, export into a statistical package for cleaning and exploration, and then an additional export into a database structure such as SQL, or even spreadsheets like Excel, for the dashboard tools to access. The process was cumbersome and prone to breakage. Few MR suppliers embraced dashboard delivery systems; most were content with the status quo, allowing the IT and Business Intelligence sectors to lead the early stages of the data visualization revolution using these and similar tools.
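For readers who never lived through that workflow, here is a minimal sketch of its core steps (load a delimited export, drop incomplete records, tabulate one variable against another) using only the Python standard library. It is an illustration under stated assumptions, not how SPSS or WinCross work internally: the data, the column names, and the `load_clean_rows` and `crosstab` helpers are all hypothetical.

```python
import csv
import io
from collections import Counter

# Hypothetical survey export; column names and values are invented for illustration.
RAW_EXPORT = """respondent_id,gender,satisfaction
1,Female,Satisfied
2,Male,Satisfied
3,Female,Dissatisfied
4,Male,
5,Female,Satisfied
"""


def load_clean_rows(text):
    """Read delimited text and drop respondents with any blank answer."""
    reader = csv.DictReader(io.StringIO(text))
    return [row for row in reader if all(value.strip() for value in row.values())]


def crosstab(rows, row_var, col_var):
    """Count respondents for each (row_var, col_var) combination."""
    return Counter((row[row_var], row[col_var]) for row in rows)


rows = load_clean_rows(RAW_EXPORT)
table = crosstab(rows, "gender", "satisfaction")

# Print one line per non-empty cell of the crosstab.
for (gender, answer), count in sorted(table.items()):
    print(f"{gender:8} {answer:14} {count}")
```

Even this toy version shows why the old process was fragile: every hand-off between packages is another place where cleaning rules or variable names can drift.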
Since the client demand hasn’t abated (and in fact has only grown in volume and clarity) technologists have done what they always do: created efficiencies and new models that disrupt existing paradigms. From data synthesis to data visualization and across every imaginable type of data, new tools have emerged that make it easier than ever to harness data for impactful insight generation. Most of these tools don’t require coding or even significant training; they are point-and-click and do much of the hard work behind the scenes.
I’ve been experimenting with a few of the platforms that align most closely with traditional research, and in the rest of this post I’ll introduce you to some of my favorites and share my thoughts on their fit within the MR industry. This is by no means a comprehensive list or even a deep review, but rather a quick introduction to some tools that you might not be aware of yet so you can get familiar with them.
I should also mention that platforms like Infotools, Research Reporter and Tableau are not covered here because I personally have not used them, although I have heard nothing but praise for them from their client-side users. Infotools in particular seems to be pushing the boundaries to make research-centric data analysis and visualization across the enterprise easy and impactful.
It’s companies like these and those profiled below that are helping to reshape the insights function by making data exploration more accessible to those of us (myself included) who are not trained statisticians or data scientists.
OfficeReports is a gem of a program. It solves a longtime frustration for many who work within market research by simplifying how data can be migrated from a statistical package directly into that stalwart of reporting: Microsoft Office. It doesn’t reinvent the wheel, but it makes it oh so much better. We might bemoan our dependence on PowerPoint and Excel, but they are still the standard in business productivity software globally. The folks at OfficeReports get that and have developed a software package that looks, feels, and functions like an add-in to Office and does it really, really well.
OfficeReports does nothing less than turn Microsoft Office into a complete data analysis and reporting suite for surveys. OfficeReports adds the “OfficeReports menu” to PowerPoint and Word. The OfficeReports menu enables you to embed your dataset into a document or presentation, and easily work with the data, and create tables and charts. All from within Microsoft Office.
Check out this short video introduction:
A 14-day trial version of the complete software can be downloaded from www.officereports.com/download. Try it; I think you’ll fall in love with it as I have.
Second Prism is part of the Survey Analytics family of companies. It is a lightweight Database to Mobile Adapter which helps publish dynamic reports easily and securely onto mobile devices. The graphics are sharp and intuitive, and the whole system is designed for social sharing of information within teams or publicly. If accessing interactive charts and graphs on mobile devices is a priority, Second Prism is a solid platform to help deliver that capability in a user-friendly way.
Databoard is the newest foray from Google into the world of easy-to-use data tools. It’s a simple, intuitive, and free application that exemplifies the new open-data paradigm. Gigaom did a review a few weeks ago and here is their take:
It’s not so much the statistics from its various mobile industry surveys that make Google’s new Databoard so cool, but the fact that they’re free, pretty and ready to use. It’s not going to put analyst or research firms out of business any time soon, but if I were in the business of charging hundreds to thousands of dollars for market research reports, I wouldn’t take Databoard too lightly either.
Here’s the service in a nutshell: Google has done a bunch of research into how people are using mobile devices, and now it has created a service where you can easily find the key data points from those studies and then share them or turn them into infographics. You can also just download the reports in their entirety. It’s remarkably simple and, in theory, remarkably useful.
I’m working to get some insight on next steps in the product plan for Databoard. If that includes allowing users to upload their own datasets or tying this to Google Consumer Surveys, then I might tend to agree with Gigaom that this is a shot across the bow for MR. For now, it’s a great resource to explore secondary data and produce some basic infographics, and clearly a signal of what the future may hold.
DataMarket is a hybrid solution that does several things at once:
- It’s an Open Data Portal that allows you to access and analyze thousands of publicly available datasets. What is truly special about it is that you can cherry-pick variables from multiple discrete sets and combine or cross-tabulate them. Want to know the correlation between unemployment rates in Ireland and the Dow Jones on any given day, and look at that by standard demographics? DataMarket allows you to do it.
- You can license the platform as a proprietary Data Hub. Upload your own data and allow anyone you want to access and analyze it. Again, that includes the ability to access multiple variables from different sets on the fly and do some fairly sophisticated analysis, all at the click of a mouse.
- Organize and search. DataMarket has great search functionality and allows you to organize data in a variety of ways to meet your needs. It’s a very flexible taxonomic system that builds your library as you go.
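To make the "combine variables from multiple sets" idea concrete, here is a rough sketch of what happens conceptually when two series from different datasets are lined up and correlated. This is not DataMarket’s API or method; it is plain Python with invented figures, and the `align` and `pearson` helpers are hypothetical names used only for illustration.

```python
from math import sqrt

# Hypothetical monthly series from two separate datasets; all figures are invented.
unemployment_rate = {"2013-01": 14.1, "2013-02": 13.9, "2013-03": 13.7, "2013-04": 13.6}
stock_index = {"2013-02": 100.0, "2013-03": 104.0, "2013-04": 103.0, "2013-05": 107.0}


def align(series_a, series_b):
    """Keep only the periods present in both series, in sorted order."""
    shared = sorted(series_a.keys() & series_b.keys())
    return [series_a[k] for k in shared], [series_b[k] for k in shared]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


xs, ys = align(unemployment_rate, stock_index)
r = pearson(xs, ys)
print(f"{len(xs)} shared months, r = {r:.2f}")
```

The alignment step is the part these platforms hide from you: two datasets rarely cover exactly the same periods, so only the overlapping ones can be compared.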
Here is their introductory video.
I’ve been a fan of Q for quite a few years. They have been our data processing and delivery partner for the GRIT studies through the last several phases because of how robust the platform is. Unlike the other solutions mentioned, it is a full-fledged alternative to SPSS, SAS, etc. Developed by Australian firm Numbers, Inc., Q offers a robust and easy-to-learn interface that can meet the needs of novice to advanced users, including Choice Analysis, Multivariate Analysis, Predictive Modelling, and interactive charting and dashboarding for visualization and reporting of it all. For a full list of the features click here.
If you haven’t checked out the dashboard for the GRIT studies yet, it is a great example of that capability in Q, although the package delivers far more than what’s evidenced there. One of the features I love is the ability to cherry-pick and combine variables from any data set I’ve loaded for additional analysis, and the options to easily merge or recode data on the fly are also very cool. I’m amazed IBM or a competitor has not snapped up Q yet; it truly is a very well-designed analytical package and I suspect someone will acquire it to take it to the next level very soon.
The video below will give you a good taste of this amazing tool:
Statwing was one of the companies that was invited to participate in the Insight Innovation Challenge at IIeX Philadelphia in June. Even before their presentation they seemed to be engaged with many participants, but afterwards that interest increased exponentially. It’s easy to understand why. In their own words the software lets you:
Explore statistical relationships without the jargon.
Visualize your data with one click.
Statwing makes working with data intuitive and beautiful.
And it delivers. It’s an incredibly simple UI that belies the sophistication of the models powering it. It chooses appropriate statistical tests automatically, even accounting for outliers or other data issues. Its output is clear, visual, and interactive. The company has been getting lots of attention from the tech industry and BI community with good reason; it truly does empower the most novice of users to conduct complex analysis and produce solid visualizations.
Check out their presentation from IIeX below, or their video here.
Finally, let’s look at Dapresy. Dapresy is in wide use by many MR firms for its dashboarding and reporting capabilities (and it is very good at that), but what sets it apart is the integration of an infographic engine along with a design studio to help users build a custom graphical library quickly and cost-effectively. Here is a bit more from their website:
Dapresy Pro creates web-based portals for dynamic online presentation and report downloads, promptly delivering clear actionable results from survey data and other business data coming from markets, users and customers.
In addition to creating stunning InfoGraphics based dashboards, the system has fully integrated modules for data processing, statistics and analysis, cross tabs, charting and automated data updates.
You can see many examples of their outputs on their Ideabox, and as a user I can say that it’s relatively easy to produce similar high-quality visualizations and interactive dashboards. I’ve been so impressed that I plan to start using some Dapresy outputs in GRIT starting with the next round.
Rudy Nadilo, President of North America, spoke at IIeX as well. Here is his presentation; I think you’ll find it interesting.
Those are just a few of the newer tools that I have been personally exploring. There are many, many more available, especially when we branch out into the broader arena of Business Intelligence and Big Data. Tools like DataHero, Wolfram Alpha, BigML, LavaStorm, and ManyEyes are increasingly replacing traditional statistical packages used by researchers. Certainly some of these are not simple by any stretch of the imagination, but many are designed to take away the pain of analysis from many different types of data sources. The era of the statistician or analyst being a specialized role is rapidly coming to a close as these tools make it easy for anyone to produce similar deliverables. That said, the skills of data professionals will still be in demand, although more within the IT development arena than insights as more and more tools come into play to automate the process for everyone.
And just to make sure you find this post especially useful, Computerworld has a wonderful interactive chart tracking a variety of absolutely free tools in this post: Chart and image gallery: 30+ free tools for data visualization and analysis. I’ve reposted the basic chart below for convenience, minus the interactive elements.
| Tool | Category | Multi-purpose visualization | Mapping | Platform | Skill level | Data stored or processed | Designed for Web publishing? |
|---|---|---|---|---|---|---|---|
| Data Wrangler | Data cleaning | No | No | Browser | 2 | External server | No |
| OpenRefine (formerly Google Refine) | Data cleaning | No | No | Browser | 2 | Local | No |
| R Project | Statistical analysis | Yes | With plugin | Linux, Mac OS X, Unix, Windows XP or later | 4 | Local | No |
| Google Fusion Tables | Visualization app/service | Yes | Yes | Browser | 1 | External server | Yes |
| Many Eyes | Visualization app/service | Yes | Limited | Browser | 1 | Public external server | Yes |
| Tableau Public | Visualization app/service | Yes | Yes | Windows | 3 | Public external server | Yes |
| VIDI | Visualization app/service | Yes | Yes | Browser | 1 | External server | Yes |
| Zoho Reports | Visualization app/service | Yes | No | Browser | 2 | External server | Yes |
| Choosel | Framework | Yes | Yes | Chrome, Firefox, Safari | 4 | Local or external server | Not yet |
| Exhibit | Library | Yes | Yes | Code editor and browser | 4 | Local or external server | Yes |
| Google Chart Tools | Library and Visualization app/service | Yes | Yes | Code editor and browser | 2 | Local or external server | Yes |
| D3 | Library | Yes | Yes | Code editor and browser | 4 | Local or external server | Yes |
| Quantum GIS (QGIS) | GIS/mapping: Desktop | No | Yes | Linux, Unix, Mac OS X, Windows | 4 | Local | With plugin |
| OpenHeatMap | GIS/mapping: Web | No | Yes | Browser | 1 | External server | Yes |
| OpenLayers | GIS/mapping: Web, Library | No | Yes | Code editor and browser | 4 | Local or external server | Yes |
| OpenStreetMap | GIS/mapping: Web | No | Yes | Browser or desktops running Java | 3 | Local or external server | Yes |
| TimeFlow | Temporal data analysis | No | No | Desktops running Java | 1 | Local | No |
| IBM Word-Cloud Generator | Word clouds | No | No | Desktops running Java | 2 | Local | As image |
| Gephi | Network analysis | No | No | Desktops running Java | 4 | Local | As image |
| NodeXL | Network analysis | No | No | Excel 2007 and 2010 on Windows | 4 | Local | As image |
| CSVKit | CSV file analysis | No | No | Linux, Mac OS X or Linux with Python installed | 3 | Local | No |
| DataTables | Create sortable, searchable tables | No | No | Code editor and browser | 3 | Local or external server | Yes |
| FreeDive | Create sortable, searchable tables | No | No | Browser | 2 | External server | Yes |
| Highcharts* | Library | Yes | No | Code editor and browser | 3 | Local or external server | Yes |
| Mr. Data Converter | Data reformatting | No | No | Browser | 1 | Local or external server | No |
| Panda Project | Create searchable tables | No | No | Browser with Amazon EC2 or Ubuntu Linux | 2 | Local or external server | No |
| PowerPivot** | Analysis and charting | Yes | No | Excel 2010 and some 2013 versions on Windows | 3 | Local | No |
| Weave | Visualization app/service | Yes | Yes | Flash-enabled browsers; Linux server on backend | 4 | Local or external server | Yes |
| Statwing | Visualization app/service | Yes | No | Browser | 1 | External server | Not yet |
| Infogr.am | Visualization app/service | Yes | Limited | Browser | 1 | External server | Yes |
| Datawrapper | Visualization app/service | Yes | No | Browser | 1 | Local or external server | Yes |
| Cascading Tree Sheets | Library | Yes | Yes | Browser | 1 | Local or external server | Yes |
| Dataset | Library | No | No | Browser | 4 | Local or external server | Yes |
| Leaflet | Library | No | Yes | Browser | 4 | Local or external server | Yes |
| Searchable Fusion Table Map Template | Library | No | Yes | Browser | 3 | Local or external server | Yes |
| Tabletop | Library | No | No | Browser | 3 | Local or external server | Yes |
| Data Explorer** | Data acquisition, data reformatting | No | No | Excel 2010 and 2013 on Windows | 2 | Local | No |
*Highcharts is free for non-commercial use and $80 for most single-site-wide licenses. **While add-ons are free, Excel (which is required to run them) is not.
I hope you’ll check out some of the solutions listed here. There are far more tools available to MR professionals than SPSS, WinCross, Excel and PowerPoint, and many are easier to use. Explore the options and find the solutions that can help your data analysis and visualization efforts become cheaper, easier, and more impactful.