Proton

Last year at Proton, we migrated from a polyrepo architecture to a monorepo architecture to make it easier to manage the packages that are part of our front-end web application stack. We’d been facing problems for some time, and after considering our options, we decided that a monorepo would be the most suitable solution. This article explains the problems we faced with our polyrepo setup, explores the benefits of a monorepo setup, and describes our journey from polyrepo to monorepo.

Before going further, when I say “polyrepo” and “monorepo”, this is what I mean:

  • Polyrepo: A system of source code modules that depend on each other but live in separate version control repositories.
  • Monorepo: A system of source code modules that depend on each other but all live in a single version control repository.

I’m going to say “Git repositories” or just “repositories” instead of “version control repositories” from here on out. And, to be clear, Git is not a prerequisite of monorepo architecture.

The beginning

Proton started with an email client, Proton Mail, as its sole application but has since evolved into a privacy provider offering a broad range of products, including web applications for Proton Mail, Proton Calendar, Proton Drive, and the Proton Account that links them all. Adding new applications to our stack caused the number of Git repositories we maintain to grow proportionally, with one repository per application. However, we also created repositories beyond the ones required for our applications. As you might imagine, our apps have to share functionality, look, and feel, even though they are different products, so we used additional repositories for code that was shared between products.

As an example, we used to have a separate repository for shared React components. This was the result of a natural evolution of our existing systems. However, sharing code across codebases became increasingly complex as we added more applications and products, making it hard to manage packages under this multi-repository structure. There are several reasons this system didn’t scale well.

Our main issue with our polyrepo

During and after our transition to a monorepo, we started seeing how we could benefit from its architecture. However, one issue in particular — the unnecessary and wasteful replication of administrative tasks — drove us to look into this monorepo option in the first place. Whenever a feature implementation required changes across multiple projects to complete (e.g., adding a React component for a new feature inside the Proton Mail application), administrative tasks in Git were highly impractical to execute. To prepare a single feature, we had to mirror Git operations — branching, committing, opening merge requests, reviewing, rebasing, etc. — across many repositories.

We then came across the idea of “atomic changes”, which resonated with us, even if it represented a shift in our philosophy. The main idea is that instead of scoping a change to a technical boundary of your project(s), such as a single repository or package, you scope it to a semantic unit: one coherent modification to your product’s functionality. There’s no reason to split up changes that affect both our shared UI components and (for example) the Proton Mail application if they all address the same concern. Such semantically connected changes should be:

  • Grouped under the same change, diff, and commit
  • Reviewable simultaneously (not in two separate merge requests)
  • Revertible as one unit.

A monorepo allows us to achieve this as it naturally supports atomic changes in the form of Git commits.
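
To make the administrative difference concrete, here is a minimal sketch that counts the merge requests a single feature needs under each layout. The folder names and file paths are hypothetical, and the sketch assumes every package sits at `<group>/<name>/` and that each package was its own repository in the polyrepo.

```typescript
// Hypothetical layout: every package lives under <group>/<name>/,
// e.g. "packages/components" or "applications/mail". These names are assumptions.
type Layout = "polyrepo" | "monorepo";

// Returns the package a file belongs to, based on its first two path segments.
function packageOf(filePath: string): string {
  return filePath.split("/").slice(0, 2).join("/");
}

// Counts how many merge requests one feature needs under each layout.
// Polyrepo: one per repository (package) touched. Monorepo: always one.
function mergeRequestsNeeded(changedFiles: string[], layout: Layout): number {
  if (changedFiles.length === 0) return 0;
  if (layout === "monorepo") return 1;
  return new Set(changedFiles.map(packageOf)).size;
}

// A feature that adds a shared component and uses it in the Mail application.
const feature = [
  "packages/components/src/Banner.tsx",
  "packages/components/src/index.ts",
  "applications/mail/src/Inbox.tsx",
];
```

For this sample feature the polyrepo count is two and the monorepo count is one. Every additional shared package the feature touches adds another branch, review, and rebase in the polyrepo case only.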

In the polyrepo, testing code before accepting and merging it to the main branch also proved challenging, especially from a CI/CD automation point of view. To test a feature, builds had to pull in versions of dependencies that were not yet on the main branch of their respective repositories. Nonetheless, with some CI/CD hacking and trickery, we could get the job done, and it was possible to send features through the development lifecycle successfully.

We also weren’t using semver and registry hosting to version our packages (and still aren’t), which would have been one way to address some of these issues. However, semver would have been far from a silver bullet for our needs, and it comes with its own baggage, such as complexity around managing hosted packages, publishing them, and versioning them on consumption.

Our polyrepo architecture had many other minor, inconvenient quirks given our needs. I’ll go into more of the problems we faced while discussing the advantages of our monorepo. For more context, the problems went beyond developer experience and included inherent technical issues. One tangible example was that we couldn’t roll back to previous versions on a cross-repository basis. If a new feature that affected multiple repositories was merged and then turned out to have an issue, it was challenging to roll it back automatically, as no single operation could revert separate Git histories simultaneously.

These issues were slowly piling up, and it became apparent that we needed a solution. After some consideration, that solution turned out to be migrating to a monorepo architecture.

Weighing our options

With the decision to migrate locked in, we had to devise a plan.

At that time, we had about 15 developers on the Front-end team working on our web application stack. Additionally, many people from other teams, such as Crypto or Back-end, would also frequently contribute to our repositories. Having many people actively working on these repositories meant that the physical migration would need to happen fast, and the implementation would have to be robust once we were on the other side. Otherwise, we risked blocking our colleagues’ work for an extended period of time.

To ensure a robust implementation, we spent quite some time researching different tools and experimenting with proofs of concept. We would check how each option felt and whether we could get it to behave as we wanted. We explored different package managers (specifically npm, Yarn, and pnpm), semantic versioning with a hosted registry, different types of dependency installations, lockfile management, and more.

In the end, we decided to go very bare bones. We chose Yarn (Berry) and Yarn Workspaces, a single lockfile in the root of the monorepo, no semantic versioning, and no zero-installs. We arrived at these decisions because we wanted as little overhead as possible and mature tools that our team already knew.
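
As a rough illustration of what a workspaces setup involves, the sketch below models the relevant fields of a root `package.json` and checks whether a folder would count as a workspace. The folder names `applications` and `packages` are assumptions for illustration, and the matcher only handles patterns of the form `<folder>/*`, which is far simpler than real glob resolution.

```typescript
// Sketch of the fields a root package.json might carry for Yarn Workspaces.
// The folder names "applications" and "packages" are illustrative assumptions.
interface RootManifest {
  private: boolean;      // commonly set so the root itself is never published
  workspaces: string[];  // glob patterns locating workspace folders
}

const rootManifest: RootManifest = {
  private: true,
  workspaces: ["applications/*", "packages/*"],
};

// Minimal matcher for patterns of the form "<folder>/*" only.
// Real workspace resolution supports richer globs; this is a simplification.
function isWorkspaceFolder(folder: string, manifest: RootManifest): boolean {
  return manifest.workspaces.some((pattern) => {
    if (!pattern.endsWith("/*")) return pattern === folder;
    const parent = pattern.slice(0, -1); // "packages/*" -> "packages/"
    const rest = folder.startsWith(parent) ? folder.slice(parent.length) : "";
    // Exactly one extra path segment below the parent folder.
    return rest.length > 0 && !rest.includes("/");
  });
}
```

With a single lockfile at the root, every folder matched this way resolves its dependencies from the same install, which is what keeps the setup low on overhead.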

All the potential benefits of a monorepo

A key moment during our research on monorepos was realizing that, while this architecture would certainly deal with the problems we were facing, these systems offered so much more. Monorepos provided many benefits we hadn’t necessarily considered, most revolving around developer collaboration.

We argued that monorepo architecture would incentivize people to collaborate on projects they don’t necessarily own by making all of the code visible, thus empowering developers to implement simple fixes. Instead of being forced to look for help because you’re looking at a black box, you might be able to implement a necessary change yourself since all of the code would be easily accessible.

Monorepos would also likely make large-scale refactoring possible, as we would be able to change huge parts of different projects with unified commits. Since all of the interdependent source code would now be hosted in the same Git repository, the availability and file system location of any piece of code would be predictable. That would make it possible to provide utilities for performing any action necessary to work with the monorepo locally or in continuous integration (CI), e.g., environment configuration, dev servers, builds, checks, automated symlinking, lockfile management, and more. We were pretty hyped about it, to say the least.
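
One example of such a utility is working out the order in which packages should be built. The following is a minimal sketch, not a description of our actual tooling: it assumes the internal dependencies of each workspace are already known, and all package names are hypothetical.

```typescript
// Maps each workspace name to the internal workspaces it depends on.
// All names below are hypothetical.
type DependencyGraph = Record<string, string[]>;

// Returns workspace names ordered so that dependencies come before dependents
// (depth-first topological sort). Throws on a dependency cycle.
function buildOrder(graph: DependencyGraph): string[] {
  const order: string[] = [];
  const state = new Map<string, "visiting" | "done">();

  const visit = (name: string): void => {
    const current = state.get(name);
    if (current === "done") return;
    if (current === "visiting") throw new Error(`Dependency cycle at ${name}`);
    state.set(name, "visiting");
    for (const dep of graph[name] ?? []) visit(dep);
    state.set(name, "done");
    order.push(name);
  };

  // Sort the entry points so the result is deterministic.
  for (const name of Object.keys(graph).sort()) visit(name);
  return order;
}

const graph: DependencyGraph = {
  mail: ["components", "shared"],
  calendar: ["components", "shared"],
  components: ["shared"],
  shared: [],
};
```

For the sample graph, `buildOrder(graph)` yields `shared`, `components`, `calendar`, `mail`. Because every package sits at a predictable location in one repository, a script like this can read the whole graph without cloning anything else.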

After arriving at a monorepo blueprint that we were happy with, we put together a presentation for the rest of the team, presented our findings and proof of concept, collected feedback, and iterated on it. We wanted to make sure that we wouldn’t create a setup that someone would be unable or unhappy to work with. It was well received, and we decided to move forward.

The physical migration

As we prepared to migrate, our main objective was to avoid disrupting ongoing work. We wrote a script that would take all the existing repositories from our polyrepo setup, merge their Git histories into a single history, and fill in the gaps necessary to realize the full monorepo. This script could generate our entire monorepo at the execution of a command, which meant that we could create the monorepo at any instant, no matter what state the polyrepo was currently in. This was much better than having to shut down development while we manually built the monorepo from the polyrepo.
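
To give a feel for what a repeatable migration script can look like, here is a minimal sketch that only builds the list of Git commands for a given set of repositories. It is not our actual script: it uses `git subtree add` as a stand-in way to import each repository’s history under a folder, and the `SourceRepo` shape, URLs, branches, and folder names are all hypothetical.

```typescript
// One polyrepo repository to fold into the monorepo. All values are hypothetical.
interface SourceRepo {
  url: string;     // clone URL of the old repository
  branch: string;  // branch whose history should be imported
  prefix: string;  // folder the code should occupy inside the monorepo
}

// Builds the list of Git commands that would create the monorepo from scratch.
// `git subtree add` imports a repository's history under a subfolder; it is used
// here as a stand-in, not as the technique the original script relied on.
function planMigration(repos: SourceRepo[]): string[] {
  const commands = [
    "git init monorepo",
    "cd monorepo",
    // `git subtree add` needs an existing commit to build on.
    'git commit --allow-empty -m "Initial commit"',
  ];
  for (const repo of repos) {
    commands.push(`git subtree add --prefix=${repo.prefix} ${repo.url} ${repo.branch}`);
  }
  return commands;
}
```

Because the plan is derived entirely from its input, it can be regenerated and rerun against whatever state the source repositories are in at that moment, which is the property that let development continue right up to the switch.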

The full implementation also saw a complete rewrite of our CI for all the app and package checks and deployments, which was quite a big part of the transition. Exploring how to adjust and write CI for a monorepo will be covered in its own article at a later date.

Once everything was ready and set up, we set a date for the migration: a Saturday. We chose a weekend day so people could go home, leave their work behind on a Friday, then come back the following Monday and find what they had been working on now inside the monorepo.

At this point, we considered the polyrepo deprecated because we didn’t want to maintain multiple conflicting Git histories continuously. To ensure that no work got lost, we compiled a list of all the active branches people wanted salvaged and ported over (we added support for this in our monorepo creation script).

On the other side

As unrealistically ambitious as the plan sounds on paper, it worked out quite smoothly for us! During the first week after the migration, a few pipelines failed, and some incomplete bits of code that had been left behind in the polyrepo setup had to be ported over manually. Apart from these and a few other minor hiccups, everything went well. Nobody was seriously blocked from continuing their work, and now that the migration is complete, nobody has looked back.

Since the migration, we’ve discovered that the monorepo offers even more benefits than anticipated. It’s much easier to onboard people to our codebase now, thanks to the near one-click setup of a local development environment. A small internal community has developed around it, and it’s not just members of the Proton Front-end team. It includes anyone interested in monorepo architecture and anyone who works with ours. In this community, we talk about:

  • Monorepos in general (and our WebClients monorepo in particular)
  • Dealing with issues around the monorepo when people need help
  • Proposing improvements to our monorepo’s workflow.

Most importantly, we’re now all speaking the same language when it comes to Git workflow and administration. Since it’s all one Git repo now, we’ve also normalized Git guidelines across the different front-end feature teams and configured the rules of our Git hosting tool (e.g., merge rules) once for the entire monorepo.

Conclusion

In retrospect, this monorepo implementation has exceeded our expectations. It’s a good solution given our needs, and we’re happy we went with it! The improvement in developer experience led to a notable boost in productivity. It’s still not a silver bullet, and there are many challenges that come with it, but for us, these challenges are heavily outweighed by the benefits it has delivered. We hope this baseline package architecture will hold up and allow us to scale and add any other required packages with ease for the foreseeable future.

The Git repository discussed in this article is open source and can be found at https://github.com/ProtonMail/WebClients.
