Data Science Platforms: Myth v. Reality
The phrase “data science platform” has been bandied about a lot recently — at conferences, in market research, and in tech publications like this one. Forrester named data science platforms a top emerging technology last year, and companies using data science at an enterprise level are being wooed by offerings in a rapidly expanding marketplace of platform providers. But what is a data science platform, really? And is it more than just a buzzword?
First, a definition: Data science platforms are meant to encompass the whole of a data scientist’s work. That means they typically provide tools that help users integrate and explore data from varied sources, build and deploy models, and make the outputs of those models operational. Essentially, this suite of tools is meant to keep data science work transparent, reproducible, and scalable — and make it easy for a data scientist to push dynamic results (like the predicted outcomes of ad campaigns) to the people who make decisions based on those results, replacing or supplementing static (and quickly outdated) reports.
These platforms are no flash-in-the-pan product, either. Data science as a profession has blown up: according to recruiting site Glassdoor, data scientist has been the best job in the United States for two years running, and data science teams at Fortune 500 companies like Cisco number in the hundreds. Enterprise-grade technology is just beginning to catch up to demand. How do I know? We asked Forrester Consulting to take a reading of the industry and find out if, and why, businesses are using platforms*.
The Rise of the Platforms
The last major wave of big data tech investment was focused on enabling data science for organizations: building data lakes, centralizing data, and scaling support to continually integrate data through technologies like Hadoop. But now that companies have access to big data, Forrester has found that data science platform adoption is poised to more than double in the next two years — rising from 29% to 69% by the end of 2018. The reason, the firm concluded, is that more and more companies will soon realize the potential benefits. Among them, survey respondents suggest, are an improved customer experience, more informed business decisions, better business planning, and increased operational cost efficiency and customer retention.
Those aren’t the only benefits to performing data science work around a central software hub. Forrester’s survey also found that tool sprawl, where the number of tools exceeds an organization’s ability to use them effectively, was the number one challenge data-driven businesses face. On average, 6.7 tools were being used to find value in data, from business intelligence tools and relational databases to predictive analytics, streaming analytics, and NoSQL databases. And almost half (46%) of the 208 companies Forrester spoke with lacked an integrated approach to their data science technology stack.
‘Insights leaders’ are the real MVPs
Companies already using data science platforms, on the other hand, are excelling. Forrester identified a group of businesses that regularly exceed profit and growth expectations, which it dubbed “insights leaders.” These leading companies were most likely to be small and agile (53% report having fewer than 5,000 total employees) and, most notably, 88% of them use a fully functional platform to do data science work. The majority (62%) also have a data science development plan and roadmap in place, as well as top-down support for data science initiatives starting in the C-suite.
Insights leaders currently make up only 22% of the market and are far ahead of their less data-driven peers when it comes to investing in data science and retaining analytical talent. But nearly every company surveyed, whether insights leader or laggard, reported that data science is an important discipline to develop and that it ranks among their most important corporate initiatives.
Clearly, there are a lot of components involved in running a business that does data science well. But as the buzz surrounding platforms becomes steadily louder, it’s my belief that these tools will become a vital ingredient in the recipe for overall business success. Having the ability to iterate on live data models, share code, and push results to other departments doesn’t just affect the reports that land on your CEO’s desk — it informs product development, helps optimize marketing decisions, and much, much more.
*(Full disclosure: We wanted to take stock of the market because we offer a data science platform.)