The data gold mine: How to get the most from your data
By Filip Verloy, Field CTO EMEA, Rubrik
Think back to before that first sip of coffee this morning. What did you do? Google the weather? Send a birthday message via Facebook (and delight in a cat video while you were there)? Whatever your morning ritual, it’s almost a given that you began creating data the moment you opened your eyes.
And you’re not alone. Whether it’s the 40,000 search queries performed every second on Google or the 300 hours of video uploaded every minute to YouTube, it’s easy to see how 90% of the world’s data has been generated in the last two years alone.
That’s all before you reached the office and the real data generation began.
Data, data, everywhere
There are 2.5 quintillion bytes of data created each day at our current pace. What’s more, this data is more dispersed than ever. It’s being generated across several locations, be it an on-premises data centre, the public cloud or branch offices. The sheer volume also makes it increasingly difficult to move this data around and to recover the data you need.
For some, it can be challenging enough to bridge the gap between their existing islands of data. Yet creating one unified system of record that acts as a single source of all your data is the crucial first step before you can start to do more interesting things with it.
Simply managing data is no longer enough. If businesses want to survive - to thrive - they need cloud data intelligence.
Don’t take my word for it. Last year The Economist dubbed data the world’s most valuable resource, overtaking last century’s oil by some degree. And just as those drilling for oil came up trumps in yesteryear, those drilling into their data, and not simply ‘storing’ it, are leading the charge today. Take Amazon, Apple, Google and Microsoft, which, by recognising the value in their data, are harnessing it to gain a serious competitive advantage.
It goes without saying that using data should always be done responsibly and with people’s consent. As the Cambridge Analytica scandal has proved, people’s trust in brands can very easily be dissolved when foul play comes to light. That said, those who aren’t utilising the data under their noses are going to lose their edge, and customers, to the competition.
Getting meta about data
But even if you’re ahead of the curve, drilling into your data is a demanding task. You need to truly understand your data if you plan on using it effectively. You, my savvy friend, need data for your data.
Metadata is a set of data that describes and gives information about your actual data, such as where it is stored or which application created it. Adopting a cloud data management platform that creates metadata as standard allows businesses to manage and analyse their data at scale.
And that’s not all it enables you to do. Decisions can be automated with algorithms, machine learning allows a business to establish a baseline of behaviour and identify anomalies, and artificial intelligence means the platform becomes smarter over time. Crucially, a robust and secure cloud data management platform does all of the above in the background.
The result of this? The most immediate effect is the ability to radically improve the way data is governed, allowing for more efficient storage. Got some sensitive data, like credit card information? You can set policies that automate the system so that sensitive datasets are always kept on premises and cannot be moved to the public cloud.
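To make the idea of a placement policy concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular platform implements it: the DatasetMetadata record, its field names and the location labels are all assumptions invented for the example.

```python
from dataclasses import dataclass


# Hypothetical metadata record; every field name here is an illustrative assumption.
@dataclass(frozen=True)
class DatasetMetadata:
    name: str
    source_application: str
    location: str             # e.g. "on_premises" or "public_cloud"
    contains_card_data: bool


def allowed_locations(meta: DatasetMetadata) -> set[str]:
    """Return where a dataset may be stored under the example policy."""
    if meta.contains_card_data:
        # Sensitive datasets must stay on premises.
        return {"on_premises"}
    return {"on_premises", "public_cloud"}


def can_move(meta: DatasetMetadata, target: str) -> bool:
    """True if moving the dataset to `target` complies with the policy."""
    return target in allowed_locations(meta)


payments = DatasetMetadata("payments-db", "billing", "on_premises", True)
logs = DatasetMetadata("web-logs", "web-server", "on_premises", False)

print(can_move(payments, "public_cloud"))  # False
print(can_move(logs, "public_cloud"))      # True
```

The point of the sketch is that the decision is driven entirely by the metadata, so it can be checked automatically every time data is about to move.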
Understanding which applications are creating each piece of data allows you to treat it accordingly. By grouping similar data types, e.g. databases versus file servers, you can deduplicate and store this data globally in a more intelligent and cost-effective way.
Historically, this process involved manually pooling data source channels, dumping them into a shared storage location and then combing through the data to identify duplicates or similarities. Metadata does this efficiently, intelligently and independently, all the while helping you use less of your expensive storage space.
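As a rough illustration of metadata-driven deduplication, the Python sketch below groups items by data type and by a hash of their content, then reports any group holding more than one copy. The item names, the type labels and the choice of SHA-256 are assumptions made for the example; a production system would typically work on blocks or chunks rather than whole files.

```python
import hashlib
from collections import defaultdict


def find_duplicates(items):
    """Group items by (data_type, content hash); return groups with more than one name.

    `items` is an iterable of (name, data_type, content_bytes) tuples.
    """
    groups = defaultdict(list)
    for name, data_type, content in items:
        digest = hashlib.sha256(content).hexdigest()
        groups[(data_type, digest)].append(name)
    return [names for names in groups.values() if len(names) > 1]


# Hypothetical sample data: two identical files and one database dump.
sample = [
    ("hq/report.docx", "file", b"quarterly figures"),
    ("branch/report-copy.docx", "file", b"quarterly figures"),
    ("hq/orders.bak", "database", b"orders table dump"),
]

duplicates = find_duplicates(sample)
print(duplicates)  # [['hq/report.docx', 'branch/report-copy.docx']]
```

Only one copy from each duplicate group needs to be stored; the others can be replaced with a reference to it.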
Moving data from the edge
Another growing data source comes in the form of IoT devices. On factory floors, for example, machine sensors generate more data than can be easily managed. You need a way of either processing this data at the edge, or efficiently transferring it to somewhere more central to process it there. Metadata allows for this, and ensures the process is fast and efficient.
IoT devices aren’t the only ones creating data at the edge. Increasingly, the world’s data is created remotely, and as a result it’s difficult for organisations to move that data around intelligently - especially at speed.
And of course it wouldn’t be a conversation about data without mentioning regulation. With metadata, data is indexed, making it easier to honour the right to be forgotten. You can easily locate a customer’s data across your organisation and ensure it’s instantly removed. The uses for metadata in this capacity will be exciting to watch as regulation continues to move into the spotlight.
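A small Python sketch shows why an index makes erasure requests tractable. The MetadataIndex class, the customer IDs and the system names are hypothetical; the sketch only tracks where a customer’s records live and says nothing about how the underlying data is actually deleted.

```python
from collections import defaultdict


class MetadataIndex:
    """Toy in-memory index mapping a customer ID to the records that mention it."""

    def __init__(self):
        self._by_customer = defaultdict(set)

    def add(self, customer_id: str, system: str, record_id: str) -> None:
        self._by_customer[customer_id].add((system, record_id))

    def locate(self, customer_id: str) -> list[tuple[str, str]]:
        """List every (system, record) known to hold this customer's data."""
        return sorted(self._by_customer.get(customer_id, set()))

    def forget(self, customer_id: str) -> list[tuple[str, str]]:
        """Drop the customer's index entries and return what must be erased."""
        return sorted(self._by_customer.pop(customer_id, set()))


index = MetadataIndex()
index.add("cust-42", "crm", "contact/981")
index.add("cust-42", "backup", "snapshot-07/contact/981")
index.add("cust-7", "crm", "contact/112")

to_erase = index.forget("cust-42")
print(to_erase)                  # [('backup', 'snapshot-07/contact/981'), ('crm', 'contact/981')]
print(index.locate("cust-42"))   # []
```

Without the index, answering the same request means searching every system in turn.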
From a security perspective, metadata makes it easier to identify and recover from ransomware attacks. By monitoring the data being generated and establishing a baseline of behaviour, you can identify any sudden change or anomaly. On seeing that change - e.g. certain files suddenly being deleted or tampered with - businesses are then able to identify exactly which data has been affected. From there it’s easy to revert to the last known set of safe, untampered-with data in a matter of clicks.
Previously this would have been a laborious task: trawling through each individual piece of data, identifying whether it had been affected, and replacing it with old copies, not to mention leaving room for human error. Thanks to metadata, this process is totally automated.
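The baseline-and-anomaly idea can be sketched in a few lines of Python. This is a deliberately simple statistical check, not the machine learning a real platform would use: the daily change counts are made up, and the threshold of three standard deviations is an assumption chosen for illustration.

```python
from statistics import mean, stdev


def is_anomalous(history, latest, threshold=3.0):
    """Flag `latest` if it sits more than `threshold` standard deviations from the baseline mean."""
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat history: any deviation at all is unusual.
        return latest != baseline
    return abs(latest - baseline) / spread > threshold


# Hypothetical counts of files changed in each daily snapshot.
daily_changes = [120, 135, 110, 128, 140, 125, 131]

print(is_anomalous(daily_changes, 133))   # False: within the normal range
print(is_anomalous(daily_changes, 9500))  # True: a sudden mass change
```

A snapshot flagged this way marks the point to investigate, and the snapshot before it is a candidate for the last known good copy.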
From governance to security, none of the above is possible without metadata, and it saves a serious amount of manual labour, time and money - resources that aren’t going spare in any of the businesses I’ve worked with!
Our industry is forever looking to the future, to the next disruptive technology, but is equally guilty of overlooking what it will take to get us there. Tim Berners-Lee theorised that “data is a precious thing and will last longer than the systems themselves”. I’d agree; not only will data outlast the systems of today and tomorrow, it’ll be the vital lifeblood powering them along the way. The sooner businesses acknowledge this, the sooner they can use their data to look inwards, bolstering every aspect of their organisation.
To paraphrase the old adage: take care of the data and the pounds will take care of themselves.