At Proton, we strive to plan for all contingencies that might affect our service. Our efforts paid off on Tuesday, September 29 (around 4 AM Wednesday morning, Australia time), allowing us to turn a very serious incident into a minor inconvenience for some of our users.
For several hours, around 30% of the global internet looking for us was pointed to Telstra instead. As far as we currently understand the situation, this was done by accident and not maliciously, although we have still not been provided with any additional details. Some 1,680 other networks were also affected, making this perhaps the most serious BGP hijacking incident ever.
As we will explain in this article, although there are ways to mitigate the fallout from BGP hijacking, due to the way the internet is designed there is simply no way to prevent this kind of incident from happening. The episode highlights, in fact, how broken some key aspects of the internet are.
Users were minimally affected
Fortunately, we mitigated the effects of this incident relatively quickly thanks to measures such as Route Origin Authorization and monitoring that we already have in place for situations like this.
We were able to divert all mail and web traffic along unimpacted internet routes, so the only effect was some delay in sending and receiving emails. Some non-essential systems, however, like our payments system, were not functioning. This meant that customers were not able to pay us for some number of hours, and we incurred meaningful financial losses as a result.
In this incident, no user data was lost or breached.
What is BGP hijacking?
Border Gateway Protocol (BGP) is a key component of how the internet works. It is the routing protocol that determines how data packets travel from one IP address to another.
IP addresses are the computer-friendly numbers that ISPs assign to every device that connects to the internet. If we imagine the internet as a postal network, IP addresses are the ZIP codes for each destination, and data packets are letters that each contain a portion of the information being sent. BGP is the road map used to deliver the letters to the correct addresses.
BGP considers the available paths and chooses the one it judges best, often the one that crosses the fewest networks. Usually this means transferring data packets between autonomous systems (AS) until they reach their destination.
Autonomous systems are the smaller networks that together make up the internet. All IP addresses belong to an autonomous system, and most autonomous systems belong to ISPs like Telstra.
Using our earlier analogy, autonomous systems are like regional post offices. Letters are delivered to a regional post office, which then delivers them to individual addresses in the post office’s catchment area.
BGP relies on routing information published by autonomous systems, but there are no real safeguards in place to prevent a malicious or misconfigured AS from announcing a route to IP prefixes that it does not control.
To return to our analogy, BGP hijacking is like painting over road signs with false directions so that letters intended for one post office are delivered to a different one.
BGP always prefers the route that looks best, however, so for a BGP hijack to succeed, the fraudulent announcement must appear to offer a better route than the existing routes stored on BGP routers across the internet, for example by announcing a more specific prefix or a shorter AS path.
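To make this concrete, here is a minimal Python sketch of how a router might compare competing announcements. It is a deliberate simplification that uses only two rules (most specific prefix first, then shortest AS path), whereas real BGP routers also apply operator policy and several further tie-breakers. The `Route` class, the `best_route` function, the prefixes, and the AS numbers are all made up for illustration (they come from ranges reserved for documentation) and do not describe the actual incident.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str      # announced IP prefix, e.g. "203.0.113.0/24"
    as_path: tuple   # AS numbers the announcement passed through; the last one is the origin


def best_route(routes, destination):
    """Return the route preferred for `destination`, or None if nothing covers it.

    Simplified on purpose: real routers apply operator policy and several
    more tie-breakers besides AS path length.
    """
    addr = ipaddress.ip_address(destination)
    candidates = [r for r in routes if addr in ipaddress.ip_network(r.prefix)]
    if not candidates:
        return None
    # Most specific prefix first, then the shortest AS path.
    return min(
        candidates,
        key=lambda r: (-ipaddress.ip_network(r.prefix).prefixlen, len(r.as_path)),
    )


# Hypothetical announcements: AS 64510 is the real owner, AS 64511 is not.
legitimate = Route("203.0.113.0/24", (64500, 64501, 64510))
shorter_path = Route("203.0.113.0/24", (64500, 64511))
more_specific = Route("203.0.113.0/25", (64500, 64501, 64502, 64511))

for rival in (shorter_path, more_specific):
    winner = best_route([legitimate, rival], "203.0.113.10")
    print("traffic goes to origin AS", winner.as_path[-1])
```

In both comparisons the announcement from AS 64511 wins: once because its path is shorter, and once because its prefix is more specific, even though its path is longer. Nothing in this selection logic asks whether AS 64511 is entitled to announce the prefix at all.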
How we stopped this incident so quickly
We continually monitor our networks for signs of BGP hijacking and have measures in place to mitigate the problem when an attack is detected. Upon detecting that some portion of our internet traffic was being routed to Australia, we moved essential services over to backup subnets and IPs which were not being hijacked.
Thus, even though our primary network was still being hijacked, the bulk of our traffic was no longer flowing through that network. To follow the analogy above, even though the main highway now had incorrect road signs, we sent users down a different highway that still had the correct signs.
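The detect-and-divert idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of Proton's actual tooling: the observation format (pairs of prefix and origin AS, as a route monitoring feed might report them), the function names, the prefixes, and the AS numbers are all hypothetical.

```python
import ipaddress


def hijacked_prefixes(observations, expected):
    """Return the set of our prefixes that someone else appears to originate.

    observations: iterable of (prefix, origin_asn) pairs seen on the internet
    expected:     dict mapping each of our prefixes to the AS that should originate it
    """
    hijacked = set()
    for seen_prefix, origin in observations:
        seen = ipaddress.ip_network(seen_prefix)
        for our_prefix, our_asn in expected.items():
            ours = ipaddress.ip_network(our_prefix)
            # Flag exact matches and more-specific announcements inside our prefix.
            if seen.version == ours.version and seen.subnet_of(ours) and origin != our_asn:
                hijacked.add(our_prefix)
    return hijacked


def choose_healthy_prefix(preferred_order, hijacked):
    """Return the first prefix, in order of preference, that is not hijacked."""
    for prefix in preferred_order:
        if prefix not in hijacked:
            return prefix
    return None


# Hypothetical setup: a primary and a backup prefix, both originated by AS 64510.
expected = {"203.0.113.0/24": 64510, "198.51.100.0/24": 64510}
observations = [
    ("203.0.113.0/24", 64510),
    ("203.0.113.0/25", 64511),   # unexpected origin inside the primary prefix
    ("198.51.100.0/24", 64510),
]

bad = hijacked_prefixes(observations, expected)
print("serve from:", choose_healthy_prefix(["203.0.113.0/24", "198.51.100.0/24"], bad))
```

Here the primary prefix is flagged because part of it is announced by an unexpected AS, so the sketch falls back to the backup prefix. In practice, moving services also involves DNS and other systems that this example leaves out.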
How can BGP hijacking be prevented?
The bad news is that there is no practical way to prevent BGP hijacking. There are, however, some proposals to fix this. The main approach is Resource Public Key Infrastructure (RPKI), also known as Resource Certification.
This is a system designed to certify that BGP route announcements can only be made by an organization with the authority to make that announcement.
We believe this is the most sensible solution, and Proton fully implements RPKI on all our networks. RPKI is opt-in and only works if other internet service providers agree to abide by it. Unfortunately, over 80% of the internet does not use RPKI, and until it does, incidents of BGP hijacking can cause significant damage (in our case, 30% of global traffic).
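The origin check that RPKI makes possible can be sketched as follows. A Route Origin Authorization (ROA) states which AS may originate a prefix and how specific its announcements may be. The sketch follows the three outcomes commonly used for route origin validation (valid, invalid, not-found). The `ROA` class and `validate_origin` function are illustrative names, the sample data is invented, and the cryptographic verification of the ROAs themselves is assumed to have happened already.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class ROA:
    prefix: str       # prefix the holder has signed, e.g. "203.0.113.0/24"
    max_length: int   # longest (most specific) prefix length the origin may announce
    origin_asn: int   # AS authorized to originate the prefix


def validate_origin(announced_prefix, origin_asn, roas):
    """Classify an announcement as "valid", "invalid", or "not-found"."""
    announced = ipaddress.ip_network(announced_prefix)
    covering = []
    for roa in roas:
        signed = ipaddress.ip_network(roa.prefix)
        # A ROA covers the announcement if the announced prefix lies inside it.
        if announced.version == signed.version and announced.subnet_of(signed):
            covering.append(roa)
    if not covering:
        return "not-found"   # nobody has signed this address space
    for roa in covering:
        if roa.origin_asn == origin_asn and announced.prefixlen <= roa.max_length:
            return "valid"
    return "invalid"         # covered by a ROA, but wrong AS or too specific


# Hypothetical ROA: only AS 64510 may announce 203.0.113.0/24, and nothing more specific.
roas = [ROA("203.0.113.0/24", 24, 64510)]

print(validate_origin("203.0.113.0/24", 64510, roas))
print(validate_origin("203.0.113.0/24", 64511, roas))
print(validate_origin("203.0.113.0/25", 64511, roas))
print(validate_origin("198.51.100.0/24", 64511, roas))
```

The first announcement is valid, the next two are invalid, and the last is not-found because no ROA covers that address space. The check only helps when networks receiving an announcement actually run it and discard invalid routes, which is why adoption by other providers matters so much.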
As BGP hijacking cannot be prevented without the collaboration of the main internet players, the only solution is to have infrastructure in place to quickly switch to alternative networks if and when it happens. Proton had previously invested in such infrastructure as part of our work for alternative routing.
For the internet to become a safer place, more ISPs and cloud service providers must adopt RPKI, and for that to happen, internet users will need to lobby their ISPs. If you would like to get involved in this effort, please see Is BGP safe yet?, an initiative by Cloudflare to pressure ISPs into adopting the protocol.
Encouraging your ISP to support RPKI is a good first step toward preventing future incidents of this sort anywhere on the internet. If you are a local internet registry (LIR), sign your prefix now. It is a simple, five-minute process.
The Proton Team