The Startup Data Dilemma

Startups need data to grow, but in order to get data, they need to already be growing. How can startups overcome this to find a scalable way to make data-driven decisions?

Startups are made up of small, early-stage teams full of risk-takers who want to change the world and are ready for anything. The result is a dynamic, determined, fast-growing organization that’s all about testing hypotheses and learning quickly.

However, startups are not sustainable. “Startup” is a phase that newly born businesses must pass through, and it’s transient. Startups need to evolve quickly into larger, more structured businesses in order to reach profitability.

This evolution usually comes at the cost of their initial speed and growth rate, which inevitably decrease as the company matures.

In a way, you could compare startups to children. Children grow quickly and absorb new things fast. They’re full of potential as they evolve at lightning speed, but they are also fragile and inexperienced.

Over the course of their transition into “adulthood,” startups and their employees face all sorts of dilemmas and crossroads, one of them being what I call the “Startup Data Dilemma.”

The first part of this dilemma is that startups need data to grow into their next stage:

  1. Data is a key instrument for measuring growth: Startups need to track their performance over time to measure their growth. That's obvious, but many don’t go about it in an efficient or productive way. Let’s be real: a spreadsheet maintained by a co-founder is fragile, and it can’t be the best way.
  2. Data is necessary for making the right decisions that will lead to growth: To reach their growth potential, startups need to make the best, most informed decisions they possibly can. They need data to make “data-driven decisions” that are backed by real figures. A startup can only be driven by its founder’s gut feeling, opinions, or luck for a limited time. Proper decisions should be backed by data.

The second part of the dilemma is that startups need to be growing already to get data:

  1. Data doesn’t grow on trees. Data is a byproduct of a startup’s growth. The more leads, customers, usage, etc. the startup has over time, the more data it generates that can be collected and analyzed. So the startup really needs to have grown already - at least a little bit - to have relevant data on hand for analysis.
  2. Data analysis is expensive. It takes resources from your team and is time-consuming to perform. Quite early on, you will probably need a full-time data analyst on the job. Tooling is also expensive (we’re not talking about spreadsheets here, they won't scale). So we’re asking startups to give up a slice of their scarcest resources: time and money. And don't get me started on open source being free: those tools require skills and time to deploy and manage - time that startups think they have, but don’t.
  3. Most startup founders don't know what they don't know. If you’ve never worked for a data-driven company like Uber, Airbnb, Google, or Amazon, chances are you don't even realize the role that data should play in your organization. You’ve probably watched TV shows and movies that have given you the false idea that companies are run on their leaders’ intuition and charisma alone. Remember that movies aren’t reality - they tell you this story because it looks better on a screen, not because it’s the truth.

To summarize, startups need data to grow, BUT to get data they need to be growing already.

This sounds like a chicken-and-egg problem to me.

How to overcome this dilemma?

Startups should focus on what they are good at, which is speed.

Their choice in data tooling and processes should meet the following criteria:

  • Is this a fast way to deploy a data culture?
  • Does this provide a short path from “question-to-answer”?
  • Can I evolve it quickly to meet future needs?

Let’s dive into each of these below.

Is this a fast way to deploy a data culture?

To deploy a data culture quickly, you need to show business impact fast, on a small budget of both time and money.

Initial results will help create the momentum to convert the startup into a data-driven organization later on.

This means that you should focus on selecting a solution that:

  • Can easily be set up (in a few days/weeks)
  • Can be rolled out to the end user with minimal training and cost
  • Is cheap to get started

Does this provide a short path from “question-to-answer”?

Internal users at a startup need to move quickly to keep up their competitive advantage, so you want to aim for a “self-service” solution that anyone can use. You’ll want to optimize for adoption at first.

That means you should find a solution with which people can easily:

  • Connect new data sources on their own
  • Transform the data with their own pre-existing skills (SQL for technical people, spreadsheet-inspired UI for business people)
  • Visualize and share their findings with the rest of the team, via platforms on which they already work (CRMs, Slack, Notion, etc.)

Can I evolve it quickly to meet future needs?

In the future, you’ll have more and more questions that require answers. And more and more people will join the company as it grows.

Hence, you should start setting certain “standards” right away to minimize the time to adoption for future teammates.

This means:

Get a Data Warehouse from the start - BigQuery, Snowflake or Redshift, not a “black box” solution that you don’t own

→ Data warehouses are the most common place for data tools (connectors, transformation tools, quality control, catalog, BI, ML, etc.) to plug in, and they let you centralize all your data in one place. By investing in a data warehouse from the early days, you’re doing yourself a favor for tomorrow and preparing for your future needs.

Push for SQL usage whenever possible

→ SQL is the “lingua franca” of data. Your new hires will most likely already know it, the tools you implement in the future will most likely support it, and training for SQL is widely available. Logic written in SQL is comparatively easy to migrate from tool to tool and can be maintained by future coworkers.
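To make this concrete, here is a minimal sketch of a metric whose logic lives entirely in SQL. Everything in it is an assumption made for illustration: the `signups` table, its columns, and the sample rows are hypothetical, and Python's built-in `sqlite3` module stands in for a real warehouse so the example runs anywhere. Date functions differ between warehouses, so a query like this may need small adjustments when it moves.

```python
import sqlite3

# Hypothetical metric: new signups per month, written in plain SQL.
# The table and column names are made up for this example.
MONTHLY_SIGNUPS_SQL = """
SELECT substr(signed_up_on, 1, 7) AS month,
       COUNT(*) AS signups
FROM signups
GROUP BY month
ORDER BY month
"""


def demo_connection():
    """Build an in-memory database with a few made-up signup rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE signups (user_id INTEGER, signed_up_on TEXT)")
    conn.executemany(
        "INSERT INTO signups VALUES (?, ?)",
        [(1, "2024-01-05"), (2, "2024-01-20"), (3, "2024-02-02")],
    )
    return conn


def monthly_signups(conn):
    """Run the metric query and return a list of (month, signups) rows."""
    return conn.execute(MONTHLY_SIGNUPS_SQL).fetchall()


if __name__ == "__main__":
    for month, count in monthly_signups(demo_connection()):
        print(month, count)
```

The point of the sketch is that the business logic sits in the query string, not in the surrounding Python, so a future teammate or a different tool only needs to understand the SQL.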

Get a solution that has “governance” features baked in

→ When people start working with a lot of data, conflicting numbers for the same metric will pop up, and some sensitive data will emerge (finance, for example). These problems are solved through strong data governance, which can be achieved by centralizing the definition of metrics, giving users the proper access, etc. Your tools will need to adapt, so it’s best to opt right away for a tool that is strong in this field by design, to prepare for your future growth.
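As a rough illustration of what “centralizing the definition of metrics” and “proper user access” can mean, here is a small sketch of a metric registry with role checks. It does not describe how any particular tool works: the metric names, SQL strings, table names, and roles are all hypothetical, and a real product would handle this inside the platform itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """One agreed-upon definition of a metric, plus who may query it."""
    name: str
    sql: str
    allowed_roles: frozenset


# Hypothetical registry: a single place where each metric is defined once.
METRICS = {
    "active_customers": Metric(
        name="active_customers",
        sql="SELECT COUNT(DISTINCT customer_id) FROM orders",
        allowed_roles=frozenset({"sales", "finance", "product"}),
    ),
    "monthly_revenue": Metric(
        name="monthly_revenue",
        sql="SELECT SUM(amount) FROM invoices",
        allowed_roles=frozenset({"finance"}),  # sensitive: finance only
    ),
}


def metric_sql(name, role):
    """Return the shared SQL for a metric if the role is allowed to see it."""
    metric = METRICS[name]
    if role not in metric.allowed_roles:
        raise PermissionError(f"role '{role}' may not query '{name}'")
    return metric.sql
```

With something like this in place, everyone who asks for `active_customers` gets the same query, and a call such as `metric_sql("monthly_revenue", "sales")` is refused instead of exposing finance data.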

Solving the “Startup Data Dilemma” is hard, and it’s getting harder as data hires become more and more expensive. Data competency is scarce, while demand is higher than ever before.

However, as the success of recent startups has shown us (Airbnb, Uber, Facebook) - if you solve this data dilemma early on and make it a priority, the future of your startup is bright!