5 Stages of Data-Evolution to become a Data-driven organization

"stages of data evolution graph"
Data Driven

Author : Aina Raj

“Data and analytics are the key accelerants of an organization’s digitization and transformation efforts. Yet today, fewer than 50% of documented corporate strategies mention data and analytics as fundamental components for delivering enterprise value” – Gartner

Though treating data as a strategic priority is still in its early adoption phase, organizations are quickly realizing that, in order to thrive in the digital economy, data and analytics competencies need to be integrated into corporate strategic planning. A data-driven organization is one where data acts as a catalyst for faster, forward-looking decision making.

An organization goes through several stages in its data journey before maturing into a data-driven organization:

Stage 1: Data-Resistant: In this stage, using data to find insights is not a mandate for the organization, so it takes an entrepreneurial mindset to put in the effort to make data a priority.

Stage 2: Data-Aware: Organizations are aware of the data being generated and implicitly realize its value. Yet the potential value of data is not fully unlocked, because the focus is mostly on collecting data. They may have ERP and CRM systems in place. They are curious about the insights that data can reveal and may also consider investing in web and social media analytics.

Stage 3: Data-Guided: Data-guided companies focus on the analysis of data. What happened? Who bought? What did they buy? How often? When? As companies start exploring their data, they need guidance from vendors on tools and platforms that can help them manage data and make sense of it, in areas such as:

ETL: Extract, Transform, Load

Data Storage and Data warehousing

Data analysis and Business Intelligence

On-premise, cloud, and hybrid computing

These companies derive “tactical” value from data. Once a company matures in this stage, it transitions towards becoming a data-savvy organization.
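As a rough illustration of the ETL step and the kind of questions a data-guided company asks, here is a minimal sketch using only the Python standard library. The CSV content, the table name `purchases`, and the function names are assumptions made up for this example, and an in-memory SQLite database stands in for a real data store.

```python
import csv
import io
import sqlite3

# Hypothetical raw export, e.g. from a CRM or point-of-sale system.
RAW_CSV = """customer,product,amount,date
alice,Widget,19.99,2023-01-05
bob,Gadget,5.00,2023-01-06
alice,Widget,19.99,2023-02-01
 Carol ,gadget,5.00,2023-02-03
"""

def extract(text):
    """Extract: parse raw CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim whitespace, normalize case, convert amounts to numbers."""
    return [
        (
            row["customer"].strip().lower(),
            row["product"].strip().title(),
            float(row["amount"]),
            row["date"].strip(),
        )
        for row in rows
    ]

def load(conn, records):
    """Load: write the cleaned records into a table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS purchases "
        "(customer TEXT, product TEXT, amount REAL, date TEXT)"
    )
    conn.executemany("INSERT INTO purchases VALUES (?, ?, ?, ?)", records)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))

# Who bought, what did they buy, and how often?
summary = conn.execute(
    "SELECT customer, product, COUNT(*) FROM purchases "
    "GROUP BY customer, product ORDER BY customer, product"
).fetchall()
print(summary)
```

In practice each step would usually be handled by a dedicated tool or platform; the point here is only the shape of the extract, transform, load sequence.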

Stage 4: Data-Savvy: These companies realize that the value of data is not merely “tactical” but also strategic. They evolve beyond “what”, “how” and “when” to “why”. They start investing more resources in building data best practices internally. This is where the “citizen data scientist” and “self-service analytics” are born.

Stage 5: Data-Driven: Once organizations have gained insights into what happened and why, they advance in their data journey to understand “what next”. Data becomes valuable at every level and is made available to all. Every major decision is backed by data evidence, and data becomes the language of conversation within and between teams.

Data-driven organizations are future thinking, but they need to evaluate whether they are future-ready and future-proofed.

When data is used in every stage of decision making and made available to all, it brings its own set of challenges in handling the 4 Vs of data.

Volume: “Over 2.5 quintillion bytes of data are created every single day, and it’s only going to grow from there”. Today organizations store data in terabytes or petabytes. This volume of data generated needs to be collected, stored, managed, processed and analysed to derive valuable insights.

Velocity: Data is being generated at tremendous speed. Twitter users generate on average ~500 million tweets/day. Data is processed in real time, in near real time or in batches, and each mode of processing needs a different level of computing resources.

Variety: Data can be structured, semi-structured or unstructured, depending on the sources that generate it: online or offline, humans or machines. Analysis of streaming data is cumbersome and requires dedicated tools and specialized skills.

Veracity: “Poor data quality costs the economy $3.1 Trillion a year”. Veracity refers to the noise, abnormalities and biases in data. Mining data to ensure quality and sanity is a complex affair and one of the biggest challenges compared to the other three Vs.
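To make the veracity challenge a little more concrete, the sketch below runs three basic quality checks (duplicates, missing values, abnormal values) over a handful of records. The records, the field names and the "10 times the median" outlier rule are all assumptions chosen for illustration, not an established standard.

```python
from statistics import median

# Hypothetical order records with typical quality problems.
records = [
    {"order_id": 1, "amount": 20.0},
    {"order_id": 2, "amount": 22.5},
    {"order_id": 2, "amount": 22.5},   # duplicate of the previous record
    {"order_id": 3, "amount": None},   # missing value
    {"order_id": 4, "amount": 21.0},
    {"order_id": 5, "amount": 19.5},
    {"order_id": 6, "amount": 950.0},  # abnormal value
]

def quality_report(rows, key="order_id", field="amount", factor=10.0):
    """Return ids of duplicated, missing and abnormal records.

    A value counts as abnormal when it exceeds `factor` times the median
    of the non-missing values (an assumed rule of thumb).
    """
    seen, duplicates, missing = set(), [], []
    for row in rows:
        if row[key] in seen:
            duplicates.append(row[key])
        seen.add(row[key])
        if row[field] is None:
            missing.append(row[key])

    values = [row[field] for row in rows if row[field] is not None]
    typical = median(values)
    outliers = sorted(
        {row[key] for row in rows
         if row[field] is not None and row[field] > factor * typical}
    )
    return {"duplicates": duplicates, "missing": missing, "outliers": outliers}

report = quality_report(records)
print(report)
```

Real pipelines usually apply many more rules than this, which is part of why veracity is considered the hardest of the four Vs.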

As organizations grow, they transition from data-aware to data-driven. With the sheer volume, variety and complexity of data increasing exponentially, the real challenge lies in collecting and handling different types of data from multiple sources, maintaining data quality and making it available to all for real-time analysis.

A plausible long-term solution to this challenge is to set up a data warehouse.

A data warehouse (DW) can act as a central repository that integrates all the data available from multiple disparate sources. A wide variety of business analytics can be performed in a DW, and different functional departments within the organization can use it to derive actionable insights.
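The idea of a central repository can be sketched in a few lines. In the example below, an in-memory SQLite database stands in for the warehouse, and the two source extracts (a CRM customer list and ERP invoices), along with all table and column names, are hypothetical.

```python
import sqlite3

# Hypothetical extracts from two separate source systems.
crm_customers = [(1, "Acme Corp", "Retail"), (2, "Globex", "Manufacturing")]
erp_invoices = [(101, 1, 1200.0), (102, 1, 300.0), (103, 2, 750.0)]

# An in-memory SQLite database stands in for the data warehouse.
dw = sqlite3.connect(":memory:")
dw.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, segment TEXT)"
)
dw.execute(
    "CREATE TABLE invoices (invoice_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
dw.executemany("INSERT INTO customers VALUES (?, ?, ?)", crm_customers)
dw.executemany("INSERT INTO invoices VALUES (?, ?, ?)", erp_invoices)

# With both sources in one place, a single query can answer a question
# that neither system could answer on its own.
revenue_by_segment = dw.execute(
    """
    SELECT c.segment, SUM(i.total)
    FROM invoices AS i
    JOIN customers AS c ON c.customer_id = i.customer_id
    GROUP BY c.segment
    ORDER BY c.segment
    """
).fetchall()
print(revenue_by_segment)
```

Different departments could run their own queries against the same tables, which is the main appeal of consolidating the data first.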

Traditionally, organizations have used on-premise data warehouses. But the more data-driven an organization becomes, the more its professionals rely on advanced analytical tools to derive predictive and prescriptive insights. Unfortunately, a traditional DW often cannot keep up with the demands of these tools or support newer technologies.

A cloud-native DW can handle not only structured but also semi-structured data, and can deliver insights without the same constraints on scalability and performance. A modern data warehouse like Snowflake separates storage resources from computing resources, making the DW a flexible, scalable, and cost-effective solution.
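For readers unfamiliar with the term, the sketch below shows what "semi-structured" means in practice: JSON records whose fields vary from one record to the next, flattened into a uniform set of columns. It uses plain Python rather than any particular warehouse's features, and the event payloads and the `flatten` helper are assumptions made up for this example.

```python
import json

# Hypothetical event payloads; note that the fields differ between records.
events = [
    '{"user": {"id": 7, "name": "Ana"}, "action": "login"}',
    '{"user": {"id": 8}, "action": "purchase", "items": [{"sku": "A1", "qty": 2}]}',
]

def flatten(obj, prefix=""):
    """Flatten nested dicts into dotted column names; lists are kept as JSON text."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        elif isinstance(value, list):
            flat[name] = json.dumps(value)
        else:
            flat[name] = value
    return flat

rows = [flatten(json.loads(event)) for event in events]

# Build one consistent column list; fields missing from a record become None.
columns = sorted({column for row in rows for column in row})
table = [[row.get(column) for column in columns] for row in rows]

print(columns)
for line in table:
    print(line)
```

A warehouse that supports semi-structured data natively spares teams from writing and maintaining this kind of flattening logic themselves.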

An organization that is forward-looking and recognizes the value of data assets in strategic decision making should consider investing in a modern cloud data warehouse. The best approach is to start small, realize the benefits, and then expand gradually across the entire organization.