David Socha takes a short break from discussing Digital Twins and returns to his core business of integrated data & analytics. As this month’s title suggests, he has a surprising message.
In the never-ending quest to turn data into useful, actionable information, many companies still struggle to establish any semblance of a standard data architecture and toolset to deliver the consistent answers their business needs.
Meanwhile, analytics has become a $200 billion industry, fuelled by digitalisation, Big Data, and a multitude of new and apparently “free” ways to analyse it all. This explosion of options has often only served to add complexity and uncertainty for companies hungry to invest in real answers to their most pressing questions, not in just more analytics.
We’ve been here before
The current analytic dilemma is analogous to what we could call The Original Data Problem1 in the early 1980s. Back then, people had a challenging time trying to store and analyse large volumes of data simply due to the limits of available technology.
Conventional wisdom was then to push data to functional silos – data marts – that would allow individuals to run their own analytics without having to worry about scaling to an enterprise level, or adversely affecting SLAs on operational systems. Circumvent the problem and build point solutions. Get the data from wherever; store and manage for your purpose; move on.
We now all understand the issues with that approach: duplicated data sets telling different versions of the “truth”; massive complexity in manual cross-functional reporting; slow and expensive change; high management and maintenance costs…the list goes on.
These problems were solved with the advent of the Data Warehouse. It was designed to effectively manage data at scale to enable enterprise analytics. It showed that by integrating data in a relational model we could allow any question at any time for any function. We solved the problem by understanding that silos do not and cannot provide a full view of the business. Now, hold that thought.
The rise of the analytics silo
So why aren’t all analytics run on the Data Warehouse? Well, while the best traditional Data Warehouses are still delivering fantastic performance, meeting SLAs and saving companies around the world millions every day, they typically don’t have the capability to support all the modern tools, languages, and analytical engines available today.
A SQL engine is no longer enough. What about Machine Learning and Graph engines? And what of languages? Can I run R against the data in my Warehouse?2 Or Python? Or whatever new entrant to the market appears next week?
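To make the "run Python against the Warehouse" idea concrete, here is a minimal sketch. It uses Python's standard-library sqlite3 as a stand-in for a warehouse SQL engine (a real deployment would connect via an ODBC/JDBC driver instead), and the table and column names are purely illustrative. The point it shows is the choice the article alludes to: push the aggregation into the SQL engine, or pull rows out and analyse them in the language of your choice.

```python
# Illustrative sketch only: sqlite3 stands in for a warehouse SQL engine,
# and the meter_readings table is a hypothetical example data set.
import sqlite3
import statistics

# In practice this would be a connection to the warehouse;
# an in-memory database plays that role here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE meter_readings (meter_id TEXT, kwh REAL)")
conn.executemany(
    "INSERT INTO meter_readings VALUES (?, ?)",
    [("m1", 10.0), ("m1", 12.0), ("m2", 8.0), ("m2", 9.0), ("m2", 7.0)],
)

# Option 1: let the SQL engine do the aggregation, close to the data...
per_meter = conn.execute(
    "SELECT meter_id, AVG(kwh) FROM meter_readings "
    "GROUP BY meter_id ORDER BY meter_id"
).fetchall()

# ...Option 2: extract the rows and analyse them in Python itself,
# where richer statistical and ML libraries live.
kwh_values = [row[0] for row in conn.execute("SELECT kwh FROM meter_readings")]
overall_mean = statistics.mean(kwh_values)

print(per_meter)     # [('m1', 11.0), ('m2', 8.0)]
print(overall_mean)  # 9.2
```

The same two options exist for R, Spark, or whatever engine appears next week; a platform that supports only the first, SQL-shaped option is the limitation the article is describing.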
Also, it’s often no longer the case that all the data a business needs will be – or should be – in a traditional Data Warehouse at all. In certain circumstances, it will make much more sense to store certain data sets in a cheaper, less performant environment such as Hadoop or Microsoft’s Azure Blob Storage.
And so, when companies want to extend the analytics community to include Data Scientists and Business Analysts, tasked with finding innovative new opportunities or answers to the most pressing business problems, they don’t immediately turn to the traditional Data Warehouse. Understandably, they take the path of least resistance and create new, isolated analytic silos, often buying new tools for every new problem.
But remember when I said earlier that silos do not and cannot provide a full view of the business? Here we are again. And all the same issues apply.
The analytical platform
To break down those silos once again and invest strategically in answers, businesses need a new, comprehensive analytical platform capable of supporting today’s – and tomorrow’s – multiple analytical tools, languages and analytical engines. One that’s able to seamlessly access data in a variety of storage media, and scale to enterprise level.
Such a platform must have its foundation in secure, scalable and highly accessible data storage optimised for analytics. Data must be accessible by many types of analytical engine, from the familiar SQL to the more modern and potentially esoteric, such as Google’s TensorFlow.
Similarly, Data Scientists must be able to work with their favourite and most applicable languages and be able to apply new languages as they appear. Analysts familiar with SAS and similar applications must be able to continue working with them.
Beyond this, Business Intelligence users, rightly unaware of the architecture they are accessing, must still be able to run their reports on tools such as Business Objects. Line-of-business applications from vendors such as Siemens or GE must also be able to access this single source of truth, enriching the capabilities of their tools with the same secure, trusted and accessible data as is used by the Data Science community and the Chief Executive. This is modern, Pervasive Data Intelligence3.
Let’s put it another way. Typically, I’m not one to use contrived analogies from the consumer world to highlight a point in our very different asset-centric, industrial environment. But here’s one that I’m happy to make an exception for: well into the 21st century, nobody wants separate devices for email, music, navigation, photos, web searches and the like. We all want a single platform that integrates all these capabilities – and just as importantly, is ready to support and deliver the new capabilities and applications we want, whenever they are developed.
Why don’t you already expect the same of your analytics environment? Stop buying analytics. Invest in answers.
1 I just made that up. But I’m pretty comfortable with a loose definition for The Original Data Problem being an inability to do what you want to do with available data sets due to the limitations of the existing technology. Apply it to the WW2 codebreakers at Bletchley Park for instance.
2 Personally, I most certainly can’t. But that’s more to do with my lack of Data Science skills.
3 I didn’t actually make that one up. Learn more about it here.
About the author
David Socha is Teradata’s Practice Partner for the Industrial Internet of Things (IoT). He began his career as a hands-on electrical distribution engineer, keeping the lights on in Central Scotland, before becoming a part of ScottishPower’s electricity retail deregulation programme in the late 1990s. After a period in IT management and consulting roles, David joined Teradata to found their International Utilities practice, later also taking on responsibilities in Smart Cities and the wider Industrial IoT sector.