Machine Learning at Monzo in 2022

Over the last two years, I’ve written posts about Machine Learning at Monzo (in 2020, and in 2021). As the adoption and impact of machine learning inside Monzo continue to grow, I thought it was time for a refresh – here is our 2022 update!

ML is now fully embedded inside of Monzo’s structure

In my 2021 post, I described the hub-and-spoke model that we were using for the machine learning folks in the company. We were moving away from being a single centralised team, and balancing between having a common place to work, building our platform, having reporting lines (in the ‘hub’) and being embedded inside of specific areas of Monzo (in the ‘spokes’).

Since then, each area of Monzo has grown so much that the entire Data discipline was restructured in mid-2022 to better fit Monzo’s needs. Each area of Monzo (Operations, Financial Crime, Personal Banking, etc.) now has a Data Director who oversees all types of Data work in that area, from analytics engineering through to machine learning, and everyone’s reporting line in that area leads towards that Director.

Machine Learning is no different: everyone who has the words 'Machine Learning' in their job title now reports into a Head of Machine Learning for their area (who, in turn, reports into the local Data Director). This org change coincided nicely with me taking the sabbatical that folks with four years of tenure are given; I decided to pivot towards a more hands-on role, so I returned as a Staff Machine Learning Engineer in Operations.

More importantly, the use cases for, and interest in, machine learning at Monzo have now gone beyond a single discipline. We have folks across Engineering, Data, Decision Science, and Machine Learning who participate in training models and building systems that use or enable us to use ML more effectively. Because of that, I created a cross-disciplinary Machine Learning Steering Group that is currently setting the agenda for our near-future platform requirements and sharing common pain points from different parts of the business. What it means to do machine learning at Monzo now varies depending on which part of Monzo you’re talking about.

We also have many new faces in the team. As in previous years, here’s a snapshot of everyone who is in the machine learning discipline / area / steering group in the company – I think this might be the last year that I can fit faces onto one picture!

Some of the people involved in machine learning at Monzo in 2022

Team growth also means that we're less frequently all in the same room together, so sharing our projects and ideas across the discipline becomes harder. To help with this, we have set up mini knowledge-sharing sessions (following the PechaKucha format) where anyone in the team can share insights, methods, or just a cool thing they learned about, in a short and snappy format.

ML problems – new and existing

As Monzo continues to grow, we now have areas of the company where machine learning is new and being bootstrapped, as well as areas where machine learning models already exist and need to be improved and maintained. Having both ends of this spectrum inside the company means we're doing a wider range of work.

On the former, for example: our new experimental customer demand forecasting system is currently running on BigQuery ML; we’ve been exploring the applicability of speech-to-text models; and we’ve shipped early experiments for predicting outcomes related to customer experience and for auto-completing search queries.

The latter is an exercise in retraining and re-evaluating new models – using historical data, experimenting with features from new data sources, running shadow deployments, tuning thresholds, and ensuring we have the monitoring that we need so that systems continue to perform as they were originally designed to.
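To make the "tuning thresholds" step a little more concrete, here is a minimal sketch of one common way to do it on historical data: pick the lowest score threshold that still meets a target precision, which keeps recall as high as possible under that constraint. This is an illustration only, not Monzo's actual code; the function names (`precision_recall_at`, `pick_threshold`) and the precision target are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def precision_recall_at(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Precision and recall when every score >= threshold is flagged as positive."""
    flagged = [score >= threshold for score in scores]
    tp = sum(1 for f, y in zip(flagged, labels) if f and y == 1)
    fp = sum(1 for f, y in zip(flagged, labels) if f and y == 0)
    fn = sum(1 for f, y in zip(flagged, labels) if not f and y == 1)
    # Convention (an assumption of this sketch): nothing flagged counts as precision 1.0.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def pick_threshold(
    scores: Sequence[float], labels: Sequence[int], min_precision: float
) -> Optional[float]:
    """Lowest observed score that meets min_precision, or None if none does.

    Recall can only fall as the threshold rises, so the lowest qualifying
    threshold is also the one with the highest recall.
    """
    for threshold in sorted(set(scores)):
        precision, _ = precision_recall_at(scores, labels, threshold)
        if precision >= min_precision:
            return threshold
    return None
```

In practice, a threshold chosen this way on historical data would typically be checked again on a held-out period, or in a shadow deployment, before it is allowed to affect real decisions.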

Outside of Monzo, interest in artificial intelligence is increasing as new, wonderful models like Stable Diffusion and ChatGPT are released. This excitement has made its way into the company, and we now think deeply about both whether we should pragmatically apply ML to a given area and what new opportunities recent developments in ML are bringing to the table.

Our ML platform continues to evolve

Earlier this year, I wrote a blog post about Monzo’s machine learning stack. Now that our structure has changed, the owners and contributors of different components of our platform are also shifting. For example, we’ve been collaborating with the Security team while they have been improving how we manage Python dependencies in our machine learning microservices, and the Data Platform team now co-owns some of our platform services, like our feature store.

We have also continued to add things into our systems to make them safer and more observable. These range from quality-of-life improvements (for example, people who run jobs in our AI Platform get Slack notifications when their job finishes), through to more automation (we have a code generation tool to add fields into our production feature store) and more control (we can schedule the same cron job to run multiple times with different arguments).
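As a rough illustration of the last point, scheduling the same job several times with different arguments can be thought of as expanding one job definition into one scheduled run per argument set. The sketch below is hypothetical and is not how Monzo's scheduler is implemented; `ScheduledRun`, `expand_job`, and the naming scheme are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class ScheduledRun:
    """One concrete run of a job: a name, a cron expression, and its arguments."""
    job_name: str
    schedule: str
    arguments: Dict[str, str]


def expand_job(
    job_name: str, schedule: str, argument_sets: Sequence[Dict[str, str]]
) -> List[ScheduledRun]:
    """Expand one job definition into one scheduled run per argument set."""
    return [
        # Suffix the name with an index so each run is distinguishable (hypothetical scheme).
        ScheduledRun(f"{job_name}-{index}", schedule, dict(arguments))
        for index, arguments in enumerate(argument_sets)
    ]
```

For example, `expand_job("retrain", "0 3 * * *", [{"model": "a"}, {"model": "b"}])` would yield two runs on the same nightly schedule, one per model.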

One of the major topics that we are currently exploring in this space is the fairly steep learning curve of our approach to ML, which builds on our data and production stacks. It requires folks to onboard into the tools used across data and engineering, and to jump between the tools that exist in both of those worlds based on what we are trying to achieve. On the one hand, this makes sense: getting ML models from idea to production requires data and engineering tooling, and we built our workflows at the intersection of what existed already. On the other hand, this may hinder further adoption of ML. Our new ML Steering Group is currently looking into how we can make the ML developer experience faster and safer, without compromising on our foundations and how tightly we integrate with other platform teams at Monzo.

What are the key takeaways?

As always, it’s been tricky to write a post that I feel does justice to all of the work that is happening behind the scenes.

We’re no longer a machine learning team in a company – machine learning is an established tool that is used by many disciplines to help teams reach their goals. Being established means that there is more work to do (maintenance, re-training, re-usability, and governance) and more people who are interested in that work, ranging from executives through to product managers. But, at the same time, there are areas of the business that have no ML presence at all. In short, part of maturing ML at the company means that it looks quite different depending on your vantage point.

The next phase of our growth is not just about hiring: our new inflection point is about how we can maintain our speed and autonomy, while safely increasing our impact and building on each other’s work, now that ML folks are spread across the entire company.