In a previous post, we talked about how we formed and scaled Data at Monzo. One key thought was the desire to move fast but also create re-usable data assets that are owned by Monzonauts. The thoughts shared in this blog post also set in stone how important self-serve is to Monzo as a business. We need a data culture such that every engineer owns the data they're producing and further understands the importance of this data. Everyone is free to draw insights without a gatekeeper blocking their progress.
ETL is everyone's responsibility; we don't have a Business Intelligence team
While self-serve is still one of the core tenets of data at Monzo, it has evolved. The 2019 view of Monzo discussed the challenge of maintaining quality and the desire to avoid "The Tragedy Of The Commons". The ongoing battle to move fast but maintain quality and enable everyone to be autonomous led us to form a new team.
A lot has happened since that first article — we've launched Monzo Business, Monzo Plus, Monzo Premium and many other amazing features. To support these products and features, the data function has grown from 20 people to over 80 Monzonauts. In this post we will share more recent thoughts and discuss the challenges that we've faced. We'll move on to cover the tactics we've used to overcome the roadblocks along our Data journey.
🤕 Growing Pains
ETL is everyone's responsibility; but not everyone is an Analytics Engineer
Every member of our data team knows how to write high-quality SQL, but not everyone is a professional data modeller. This is normal! Shaping data is not everyone's top priority, and shaping data well is a job in itself.
When we were a small team, it was easy to enforce standards across our models. But as we've grown, we have identified the need for a backbone of standards and principles upholding modelling. The aim of this is to keep our Warehouse healthy and suitable for self service.
Shared ownership is not so shared anymore
When you're part of a small team, you know how everything's connected and you understand the context behind design decisions. If there's a change, you know how it will impact downstream components. You have full control of how to model the data and you can be on top of everything.
But things change - we have more than 4,000 models in our dbt project, and the number of data assets within our Warehouse is continually growing. More and more Monzonauts join the data discipline (while others continue their career outside Monzo) and, most importantly, we have specialised squads creating domain-centric models. All of this means that things start to get tricky - it becomes difficult to have a holistic view of how things are connected. Ownership starts to be siloed across specialised teams and maintenance of shared models becomes a grey area.
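To make the "grey area" concrete, here is a minimal Python sketch of the kind of check this problem suggests. It is illustrative only and does not describe Monzo's tooling: the model names, the owner field and both helper functions are hypothetical. It walks a small dependency graph to find everything downstream of a model, then flags models that have no owner but feed more than one team.

```python
from collections import defaultdict, deque

# Hypothetical model metadata: name -> (owning team or None, upstream models).
# These names are invented for illustration; they are not real Monzo models.
MODELS = {
    "raw_transactions": ("payments", []),
    "raw_accounts": ("accounts", []),
    "customer_core": (None, ["raw_transactions", "raw_accounts"]),
    "plus_revenue": ("subscriptions", ["customer_core"]),
    "business_churn": ("business_banking", ["customer_core"]),
}


def downstream_of(model, models):
    """Return every model that directly or transitively depends on `model`."""
    # Invert the "depends on" edges so we can walk from parent to children.
    children = defaultdict(list)
    for name, (_owner, upstream) in models.items():
        for parent in upstream:
            children[parent].append(name)

    # Breadth-first walk over the inverted edges.
    seen, queue = set(), deque([model])
    while queue:
        for child in children[queue.popleft()]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


def unowned_shared_models(models):
    """Return models with no owner whose downstream models span several teams."""
    flagged = []
    for name, (owner, _upstream) in models.items():
        if owner is not None:
            continue
        consumer_teams = {
            models[d][0] for d in downstream_of(name, models) if models[d][0]
        }
        if len(consumer_teams) > 1:
            flagged.append(name)
    return sorted(flagged)


print(downstream_of("raw_transactions", MODELS))
print(unowned_shared_models(MODELS))
```

In this toy graph, `customer_core` is the shared model nobody owns: two different teams build on it, so a change to it affects both, yet no single team is responsible for maintaining it.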
The right balance between engineering and analytics
In a small team you have to practice what you preach when you talk about the concept of shared ownership. By this I mean that a Data Analyst might sometimes need to wear the Engineer hat to unclog an ETL pipeline (The many hats of a Data Analyst). This still holds true at Monzo: everyone changes hats from time to time. But we're now reaching a degree of complexity that requires deeper specialisation, where changing hats is no longer an easy task.
Data Analysts and Data Scientists need time and space to understand the business's problems and perform all kinds of exploratory tasks to deliver good work, ensuring that stakeholders are able to make the right decisions based on quality data. Their core competency is to drive business decisions. By removing any gatekeepers for ETL processes we are ensuring they can deliver high-value analysis in an agile and autonomous way. However, our Warehouse is reaching a level of complexity where modelling the data correctly also requires time and scoping if we want to keep it under control. Requirements change and even the best model will need refactoring in the future! In this scenario, team members swapping hats is no longer efficient - Data Analysts can't afford to deviate from their specialism.
👶 A New Team Is Born
In 2021 we began to form an Analytics Engineering team. The idea behind this was to iterate on self-serve, not discard it. Like any concept in technology, if it can be improved then we'll make incremental progress towards greatness.
Our goals in forming this team are to:
Provide standards, best practice and ways of working that enable others across the business to build quality data assets in a maintainable way.
Embed within squads to provide the muscle required to deliver complex data projects.
Take ownership of data that is crucial and common to many projects, bringing more rigour where it is required.
This team doesn't exist to disrupt self-serve but to complement it.
The team is currently made up of five engineers. Each team member has their own unique set of skills but all members have an eye for detail and are rockstar data modellers. Another common pattern within the team is an engineering and coaching mindset - we aim to support and guide our Analysts with software engineering best practices. We often act as an interface between engineering and data, shaping ideas such that they are reliable, testable and maintainable.
🗣️ Hub and Spoke
Monzo is organised into a number of Collectives (think Tribes) where the default model is to embed data people in a decentralised way (again, similar to the Tribes and Squads model).
We could quite easily continue this model and permanently embed Analytics Engineers within Collectives. However, this may make it difficult to curate standards and more centralised models, and it would remove oversight of the discipline. Engineers could develop deep domain knowledge and so help their squad move fast, but we'd still fall into the same traps we've seen before. Centralising work is another possibility; however, this would make prioritisation very difficult and the team may end up as a bottleneck.
With the positives and negatives of both approaches in mind, we've decided to adopt a hybrid approach - the Hub and Spoke model, a solution that is not new at Monzo as the Machine Learning team has also adopted it.
Using this model we will maintain a small, central pool that will provide the centre of excellence function that we need, but also have the oversight to own key data models. From this central pool, some members of the team will temporarily join other collectives, implementing more specific models and providing data expertise, at the same time feeding back concepts to the hub. This will mean:
We ensure consistency of standards across our Data Warehouse
We can unify our support and maintenance mechanism
We can add more people to a project that requires them
We ensure we develop standards across everything we do in the Data Warehouse
We can teach people across the business to fly solo when embedded within a Collective.
🚀 Beyond Self-Serve Analytics
In this post we've discussed the continuation of our Data journey at Monzo. The challenge of making money work for everyone never ends. We're continually pushing that goal forward by providing data-driven projects in a responsive way.
We've learnt a lot over the past few years and we're still learning. We've enhanced the technologies that have been adopted and reorganised ourselves into different structures and operating models to be more efficient.
We believe that by adopting a hybrid of centralised and decentralised models for data modelling expertise, we'll be able to move fast while still providing high-quality data assets. We're also realistic: we'll continually monitor and adapt this approach as we go, so that we're best equipped to deliver first-class data assets.
If you’re interested in being part of this amazing new team, we have a number of positions open for the following roles: