Data is the new water
Such is its value and importance in driving the global economy that data has often been called the new oil. But it is an imperfect comparison: oil is scarce and finite whereas data is an abundant and infinite resource. Moreover, comparing data with a fossil fuel seems outdated at a time when sustainability is more important than ever, and when the most effective data is clean data.
Data has also been likened to gold: both are hyper-valuable commodities that have to be refined and analysed from a raw state for their ultimate potential to be realised. But this, too, is a flawed analogy, because data is easy to obtain and its supply is increasing. Given its importance to the global economy, data is inarguably more valuable than gold.
While previous attempts to find a physical parallel for data have fallen short, market research firm International Data Corporation (IDC) and the business intelligence organisation Qlik believe they have found the perfect comparison: data, they say, is the new water.
It is an analogy the two companies recently ran with when they collaborated on a new report, Data as the New Water: The Importance of Investing in Data and Analytics Pipelines, which involved speaking to 1,200 companies in an attempt to find out how closely data is tied to business success.
“We started thinking of water as a useful metaphor for data because you see terms like pipelines and data lakes being widely used and understood. The water idea has been part of the lexicon of data professionals for a little while,” says Dan Vesset, Group Vice President, Analytics and Information Management at IDC.
“Like water, data needs to be accessible, it needs to be clean and it is needed to survive. Oil had its use and a lot of economic impact but data is bigger, it has a greater economic effect. We believe data as the new water has a far better social context and helps to elevate the value of data in the executive suite.”
Unsurprisingly, the study found organisations that strategically invest in creating data-to-insights capabilities through modern data and analytics pipelines are seeing significant bottom line impact.
Companies in the survey with the highest demonstrable data-to-insights capabilities overwhelmingly reported that strong data pipelines are driving better decisions, resulting in significant bottom line impact. Stand-out results included 76% saying operational efficiency improved by an average of 21%, 75% saying revenue increased by an average of 21% and 74% saying profit increased by an average of 22%.
“We looked at four steps and we ended up calling that data-to-insights, which encompasses identifying data, gathering data, transforming data and analysing data,” Vesset reveals. “What was really interesting to me was that if you look at the correlation of each individual step to better decisions there was a positive correlation.
“One of our data statisticians then looked at creating a composite variable combining all four steps and that correlated with the decision making being even better. To me, that says that you need to really focus on all of those steps and that led to the pipeline, water and leakage metaphors.
“It doesn’t matter when they leak, the result is going to be the same. You’re not going to get enough fresh water out of the other end if there is a leak somewhere – the whole thing has to work together and you have to invest in everything.”
The challenges to deploying data pipelines that drive better decisions and business outcomes are stark, however. The results show that many companies find themselves inundated with data, struggling to maximise its value as it flows through “unintegrated and leaky pipelines”.
“Many companies don’t have the right processes, governance and technologies in place so are spending 80% of their time preparing, cleansing and gathering the data, and just 20% on analysis – the ideal situation is 80/20 the other way,” Vesset comments.
“Others are making decisions based on bad data, which has costs connected to it. That can be incorrect product recommendations, treating customers in a suboptimal way because you don’t have a lifetime view of them or personalising offers that you can’t actually fill because you don’t have intelligence about your inventory.”
Organisations, the study found, are dealing with complex and varied data types and sources, which can leak through data and analytics pipeline gaps. Of those surveyed, over 60% experienced significant challenges in assessing the value of data and identifying valuable data sources, often due to a lack of data cataloguing.
And over 42% of those surveyed identified assuring data correctness as a main challenge when processing or transforming data for analysis.
Vesset says that there is “an element of resignation” over data quality being the top challenge when it comes to gleaning valuable insights. Companies must, he believes, waste no time in seizing the initiative by meticulously collecting and tracking ever-increasing datasets to ward off leakages.
“It is an issue that has existed for decades, and a perennial challenge,” he says. “Over the last several years that data environment has become hugely more complex, there has been a move away from data warehouses to data lakes and various cloud environments.
“Enterprises have many more applications that can be in a number of locations and different clouds, and there is more real-time data as opposed to being collected at the end of the day. As that data environment complexity has increased, the ability to identify where all the data is and when it was updated, who updated it, and the lineage of the data becomes even more complex.
“It might seem trivial but the cataloguing and taking of inventory is a really important first step. Then it is about the gathering of data, and knowing which data sources can be combined to deliver real insights. For example, can I get weather data from a government source and combine it with my logistics data so I can optimise my transportation routes? Can I add traffic data to that?”
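To make the cataloguing step concrete, here is a minimal sketch of what a data inventory might look like in code, and how it could answer the question of which sources can be combined. It is an illustration only, not a method described in the report: the `CatalogEntry` fields, the `joinable_pairs` helper and the sample sources and dates are all hypothetical, and the assumption is that two sources can be combined when they share at least one key field.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CatalogEntry:
    """One row of a hypothetical data inventory."""
    name: str
    owner: str
    last_updated: date
    keys: frozenset  # fields that other sources could be joined on


def joinable_pairs(catalog):
    """Return (source_a, source_b, shared_keys) for every pair sharing a key."""
    pairs = []
    for i, a in enumerate(catalog):
        for b in catalog[i + 1:]:
            shared = a.keys & b.keys
            if shared:
                pairs.append((a.name, b.name, sorted(shared)))
    return pairs


# Illustrative entries only; names, owners and dates are made up.
catalog = [
    CatalogEntry("logistics", "operations", date(2020, 6, 1),
                 frozenset({"date", "region", "route_id"})),
    CatalogEntry("weather", "government feed", date(2020, 6, 2),
                 frozenset({"date", "region"})),
    CatalogEntry("traffic", "third party", date(2020, 6, 2),
                 frozenset({"date", "route_id"})),
    CatalogEntry("payroll", "finance", date(2020, 5, 31),
                 frozenset({"employee_id"})),
]

for a, b, shared in joinable_pairs(catalog):
    print(f"{a} + {b}: join on {', '.join(shared)}")
```

In this toy inventory the logistics, weather and traffic sources can all be paired through shared date, region or route fields, while payroll has nothing in common with any of them. A real catalogue would also need to record lineage and who updated what, as Vesset notes.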
A lot has been said down the years about the importance of collaboration between IT and business, but a new dynamic is emerging. The importance of getting data strategy right has never been greater, and is resulting in more companies adding executives solely focussed on data in the form of Chief Data Officers or Chief Analytics Officers.
“These roles are being elevated, you can’t just say IT and business, you need that third person,” says Vesset. “The teams and the leadership in those teams needs to recognise that, and see that the analytics and data function is a stand-alone service, not something that is just distributed across a few analysts.”
Increasingly, those data and analytics experts are considering how various technologies can aid them in ensuring datasets are accurate and suitable to be mined and analysed for insights.
Any technologist worth their salt will want to consider how artificial intelligence (AI) could help them, but Vesset is cautious about its suitability for the specific problem of data leakage, and is steadfast that the best results come from a combination of man and machine, at least for the time being.
“There is so much noise and marketing around AI that some people think it is the solution to everything, but the reality is that it is still pretty rudimentary. When it comes to complex business decisions it is still really in its early stages, so there has to be this interplay between people and machines,” he says.
“It is important to start looking at technology that incorporates some form of automation and increasingly that means some level of machine learning. That doesn’t mean that the engineers and business analysts need to have PhDs in statistics, data science or AI, because the software itself can start looking at the patterns and see where there might be data quality issues.
“It can recommend fixes and options to take where there are problems, giving the analysts the option to decide. Eventually after time there will be enough learning that it might be possible with closed loop automation where it starts correcting some of those data quality errors itself, and that would also work with data visualisation as well. But it is still early days.”
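The "recommend, then let the analyst decide" pattern Vesset describes can be sketched in a few lines. The example below is deliberately simple and rule-based rather than machine learning, so it stands in for the idea rather than for any real product: the `suggest_fixes` function, its parameters and the sample order rows are all hypothetical. It flags missing values, out-of-range numbers and duplicate rows, proposes a fix for each, and changes nothing itself.

```python
def suggest_fixes(rows, required, numeric_ranges):
    """Flag likely quality issues and propose a fix for each.

    Returns a list of (row_index, field, problem, suggestion) tuples.
    The input rows are never modified; a person decides what to apply.
    """
    suggestions = []
    seen = set()
    for index, row in enumerate(rows):
        # Required fields that are absent or empty.
        for field in required:
            if row.get(field) in (None, ""):
                suggestions.append(
                    (index, field, "missing value",
                     "fill from source system or drop row"))
        # Numeric fields outside their expected range.
        for field, (low, high) in numeric_ranges.items():
            value = row.get(field)
            if isinstance(value, (int, float)) and not low <= value <= high:
                suggestions.append(
                    (index, field, f"value {value} outside {low}..{high}",
                     "review for unit or entry error"))
        # Exact duplicates of an earlier row.
        fingerprint = tuple(sorted(row.items()))
        if fingerprint in seen:
            suggestions.append(
                (index, "*", "duplicate row", "remove duplicate"))
        seen.add(fingerprint)
    return suggestions


# Made-up sample data for illustration.
rows = [
    {"order_id": 1, "customer": "A", "quantity": 3},
    {"order_id": 2, "customer": "", "quantity": 2},
    {"order_id": 3, "customer": "B", "quantity": -5},
    {"order_id": 1, "customer": "A", "quantity": 3},
]

for suggestion in suggest_fixes(rows, ["order_id", "customer"],
                                {"quantity": (0, 1000)}):
    print(suggestion)
```

Keeping the function read-only reflects the human-in-the-loop stage described above. The closed-loop automation Vesset anticipates would amount to applying some of these suggestions automatically once there is enough confidence in them.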
With IT spending taking a significant hit because of the COVID-19 pandemic, some companies might be reluctant to invest significant sums into their data strategies and technologies. Research from IDC shows a fluid situation, with the most recent figures from the U.S. suggesting a third of organisations still say they will increase budgets for this type of technology but 52% will cut spending, up from 40% in March.
“We are entering a world where the advantage of having better enterprise intelligence is going to become more and more real, and we see that already with the likes of Netflix and Amazon that are heavily data driven, while companies like McDonald’s and IKEA are actively going out and acquiring companies to help them in this area,” Vesset concludes.
“There are some sectors like travel that are in crisis mode but outside of that maintaining some investment in data and analytics will position businesses better for the recovery, I have no doubt about that.”