Data mesh vs. data monolith: which structure is for you?

As your company begins to adopt a data-driven culture, your organizational set-up matters.

Aug 3, 2022 - 4 min read

It’s the day before a board meeting, and your data team is suddenly bombarded with analytical questions from the CEO, CFO, Head of Product, and Head of Sales. Your team is working as quickly as they can, fixing broken data pipelines and running queries, but the requests keep coming in, and the 5pm deck deadline is fast approaching. One team of data superheroes is supposed to somehow save the day and answer to all business functions.

While this scenario feels relatable, the occasional board meeting isn’t the only crunch time during which the data team is under pressure. For a business to grow based on informed, data-driven decisions, the constantly growing volume of data always needs to be analyzed and reported on, at any given time. That’s the only way a company can become successful and stay competitive. There needs to be an ongoing view into what’s going on in the organization for any improvements to be made.

This set-up reflects what we call a data monolith, and it’s traditionally how most organizations start out when they begin to see the importance of data. A central data team and a monolithic data management architecture are generally easy to set up and seem like a good idea at first. However, this organizational structure can easily become a bottleneck for getting data out to the various functions and business teams who need it. And as you can imagine, the chances of things getting overwhelming - fast - are high. As data volume and demand increase, cracks and flaws start to appear in the structure.

So what’s the alternative?

Say hello to data mesh, which you’ve probably heard of as a recent trending concept. It refers to a new architectural structure that re-imagines how your company’s data experts are organized.

Rather than the monolithic approach in which they sit together in the middle of the organization ready to be pounced on from all angles, a data mesh architecture is a decentralized approach that splits the data team across different domains and departments. Within each department, you have one or more designated data analysts supporting that specific business function. In this way, each department can become proactive and self-sufficient, with requests distributed across data analysts who sit within, and are responsible for, their respective functions.

A data mesh structure leverages a domain-driven design, in which each domain has its own flexible, scalable solutions that correspond to its business needs. Each domain handles its own ETL pipelines and is accountable for providing its data as products. There’s an underlying data infrastructure that’s responsible for providing each domain with the processing solutions. Then, each domain manages the ingestion, cleaning, and aggregation needed for the data to be used by business intelligence platforms. This is a far cry from the monolithic structures that handle consumption, storage, transformation, and output of data in a single, central data lake.
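To make those domain responsibilities a little more concrete, here is a minimal Python sketch of a domain-owned pipeline that ingests, cleans, and aggregates its own records before publishing them as a data product. Everything named here (SalesDomain, DataProduct, the region and amount fields) is a hypothetical illustration rather than part of any particular platform, and a real pipeline would run on the shared infrastructure instead of in-memory lists.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class DataProduct:
    """A dataset a domain publishes for others (e.g. a BI tool) to consume."""
    domain: str
    name: str
    rows: list


class SalesDomain:
    """Hypothetical domain team that owns its pipeline end to end."""

    def ingest(self, raw_orders):
        # Ingestion: pull raw records from the domain's own source systems.
        return list(raw_orders)

    def clean(self, orders):
        # Cleaning: drop records with a missing region or a non-positive amount.
        return [o for o in orders if o.get("region") and o.get("amount", 0) > 0]

    def aggregate(self, orders):
        # Aggregation: total revenue per region, sorted by region name.
        totals = defaultdict(float)
        for o in orders:
            totals[o["region"]] += o["amount"]
        return [{"region": r, "revenue": totals[r]} for r in sorted(totals)]

    def publish(self, raw_orders):
        # The domain is accountable for the finished product, not just raw data.
        rows = self.aggregate(self.clean(self.ingest(raw_orders)))
        return DataProduct(domain="sales", name="revenue_by_region", rows=rows)


# Made-up sample records, including two that the cleaning step should drop.
raw_orders = [
    {"region": "EMEA", "amount": 120.0},
    {"region": "EMEA", "amount": 80.0},
    {"region": "", "amount": 50.0},
    {"region": "APAC", "amount": -10.0},
    {"region": "APAC", "amount": 30.0},
]
product = SalesDomain().publish(raw_orders)
```

The point of the sketch is ownership: the sales team decides what counts as a clean order and what the published table looks like, while consumers only ever see the finished product.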

Data meshes have ushered in the era of self-service data and BI platforms, removing the technical complexity of analyzing data and focusing on individual use cases according to the business function. With self-service Business Intelligence platforms like Whaly, data analysts belonging to any function can sit between modeling, consumption, and exploration, and are empowered to enable the end business users to digest and understand data in an easy, user-friendly way.

You might wonder how governance is tackled in this case, and how to prevent the duplication of efforts if data teams are sprinkled across the organization. Domain-agnostic data is hosted in a central platform that handles the data pipeline engines, storage, and other infrastructure. In addition, there’s typically a governance body that includes representatives of all teams in the data mesh. They agree on global policies and rules, and define how the domain teams must build their data products. Each domain leverages these global policies, standards and components to run their custom ETL pipelines - giving them the necessary support, as well as the autonomy to really own their own process. With the clear definition of global policies and standards, cross-functional collaboration is facilitated.
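As a rough illustration of how global policies might be applied, the Python sketch below checks a domain’s data product description against a small set of shared rules. The three rules and the ProductSpec fields are assumptions invented for this example; an actual governance body would agree on its own standards.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ProductSpec:
    """Hypothetical description a domain team files for each data product."""
    domain: str
    name: str
    owner: str = ""
    description: str = ""
    columns: dict = field(default_factory=dict)  # column name -> type label


def has_owner(spec):
    return bool(spec.owner.strip())


def has_description(spec):
    return bool(spec.description.strip())


def uses_snake_case_columns(spec):
    # Every column name must be lowercase letters, digits, and underscores.
    return all(re.fullmatch(r"[a-z][a-z0-9_]*", c) for c in spec.columns)


# Global policies (assumed for illustration) that every domain must satisfy.
GLOBAL_POLICIES = {
    "has an owner": has_owner,
    "has a description": has_description,
    "uses snake_case column names": uses_snake_case_columns,
}


def violations(spec):
    """Return the names of the global policies this spec does not meet."""
    return [rule for rule, check in GLOBAL_POLICIES.items() if not check(spec)]


compliant = ProductSpec(
    domain="sales",
    name="revenue_by_region",
    owner="sales-analytics",
    description="Revenue per region",
    columns={"region": "str", "revenue": "float"},
)
non_compliant = ProductSpec(
    domain="product",
    name="usage",
    columns={"User ID": "int"},
)
```

Calling violations(compliant) returns an empty list, while violations(non_compliant) lists all three rules. Each domain stays free to build its pipeline however it likes, as long as what it publishes passes the shared checks.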

In short…

Data monolith: your central data team sits in support / “ticket” mode, having to constantly build answers for the business teams. Since this is time-consuming and everyone’s relying on a single team, it’s not uncommon for business teams to start building things on their own if they can’t get a response fast enough. How do they do this? Most often with spreadsheets. This does not lead to adoption or synergies between the data and business teams, and it doesn’t lead to a truly data-driven organization where all teams trust the data. It’s also harder for data teams to evangelize and show value from their data and their work.
Data mesh: more autonomy is given to business teams to build their own dashboards and reports, so that data teams can focus on designing and evangelizing the right metrics. The business teams can then begin to proactively “own” and better understand the data that they have the power to play with on their own, fostering trust and true adoption of data. Data teams can also be more proactive - as opposed to reactive - in how they analyze data, not just acting as a support team fielding requests all day.

If you’re an early stage startup, it might be easier to start with the monolithic approach and set up a central data team first. However, you may quickly find that it’s not a sustainable way to promote data adoption across different business functions, and your central group of data experts will soon be overwhelmed. If your data team has to handle a large number of data sources that they experiment with and transform, then you’re likely best suited to a data mesh model.

In summary, data meshes allow for greater autonomy and flexibility for data owners in each domain, which facilitates greater experimentation and adoption. In this way, business users get value from the data faster, nailing that board meeting and unlocking business growth. In parallel, the burden on data teams is lessened as they don’t have to field the needs of every single data consumer, from any and all business units, through one single pipeline. It’s a win-win situation, and your future self will thank you as your company scales.

What’s your current structure and is it working for you? Get in touch, we’d be happy to hear from you!