Effective Spam Filtering with Encrypted Email

As many of you have noticed, spam filtering performance at Proton Mail has improved dramatically in recent weeks, thanks to a series of updates we recently deployed. In this three-part series of posts, we will discuss many of the spam challenges Proton Mail faces and explain in detail how to fight spam in the end-to-end encrypted world. Proton Mail’s spam challenges can be summarized broadly into three categories.

  1. Incoming Spam
  2. Outgoing Spam
  3. Internal Spam

Incoming spam is spam sent to Proton Mail from third-party email providers, for example Hotmail. Outgoing spam is spammers using Proton Mail to send spam to third-party email providers. Finally, internal spam is Proton Mail accounts being used to spam other Proton Mail accounts. Each of these poses very different risks and challenges. For an encrypted email service provider with over 1 million users, spam is a continuous battle and one of the toughest challenges to overcome.

In this blog post, we will be discussing incoming spam filtering. In future blog posts, we will cover the challenges of preventing Proton Mail from being used by spammers and the challenges of doing spam filtering with end-to-end encrypted emails which we cannot read.

Incoming Spam

Incoming spam is not dangerous, but it can be a major inconvenience for users. It doesn’t just clog up user inboxes: if it is not handled efficiently, the sheer volume of spam that sometimes arrives at the Proton Mail servers can also cause performance problems.

Emails that come from third-party email providers obviously cannot be delivered with end-to-end encryption, but once they reach our mail servers, we encrypt them with the recipient’s public key before saving the messages. All of this is done in memory, so that by the time anything is permanently stored to disk, the email is already unreadable to us. This gives us a very limited window to perform spam filtering on incoming messages.

When an incoming message is received, it goes through the following filtering steps. The goal is to use less computationally expensive methods first to reject as much spam as possible before more expensive methods are used.

1. First, the IP address of the incoming SMTP server is checked against spam blacklists which contain IP addresses of servers we have previously received spam from. If we receive a hit, the message is rejected.

2. Secondly, the message is passed through our customized Bayesian filters, which mark suspicious messages as spam.

3. Next, we generate checksums of incoming messages and check them against a database of known spam messages. If there is a match, we mark the message as spam. The checksums are computed in such a way that they are also effective against mutating spam emails.

4. Afterwards, we also apply a few other anti-spam techniques which we cannot detail here for security reasons (see below).

5. Since email headers can be easily spoofed and abused, we also verify the authenticity (SPF, DKIM, and DMARC) of incoming emails to protect users. An email that fails DMARC is likely spoofed, so it will be sent to the spam folder with a warning for our users.

6. Finally, user-specific spam rules are applied. These apply user-specified whitelists and blacklists to avoid false positives or to catch more spam messages.
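To make the cheapest-first ordering concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not Proton Mail's actual implementation: the names Message, UserRules, BLOCKED_IPS, KNOWN_SPAM_DIGESTS, normalized_digest and classify are all hypothetical, the IP blocklist is a plain in-memory set, the content score is supplied by the caller, and the "mutation-tolerant" checksum is simply a SHA-256 over text with digits and punctuation stripped. The undisclosed step 4 is deliberately left out, and the choice that an allow-listed sender cannot override a DMARC failure is our assumption.

```python
import hashlib
import re
from dataclasses import dataclass, field


@dataclass
class Message:
    sender_ip: str
    sender: str
    body: str
    dmarc_pass: bool = True  # assumed to be computed by a separate verifier


@dataclass
class UserRules:
    allow: set = field(default_factory=set)  # hypothetical per-user whitelist
    block: set = field(default_factory=set)  # hypothetical per-user blacklist


BLOCKED_IPS = {"203.0.113.7"}   # placeholder entry from a documentation IP range
KNOWN_SPAM_DIGESTS = set()      # digests of previously seen spam


def normalized_digest(body: str) -> str:
    """Hash the body after lowercasing and collapsing every run of
    non-letters, so small mutations (amounts, punctuation) collide."""
    text = re.sub(r"[^a-z]+", " ", body.lower()).strip()
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def classify(msg: Message, rules: UserRules, spam_score) -> str:
    """Return 'reject', 'spam' or 'inbox', running cheap checks first."""
    # Step 1: a set lookup on the connecting IP; a hit rejects outright.
    if msg.sender_ip in BLOCKED_IPS:
        return "reject"

    verdict = "inbox"
    # Step 2: content score from a caller-supplied (e.g. Bayesian) scorer.
    if spam_score(msg.body) >= 0.9:
        verdict = "spam"
    # Step 3: checksum match against known spam.
    elif normalized_digest(msg.body) in KNOWN_SPAM_DIGESTS:
        verdict = "spam"
    # Step 5: authentication failure suggests spoofing.
    elif not msg.dmarc_pass:
        verdict = "spam"

    # Step 6: user rules run last and may override the content verdict.
    if msg.sender in rules.block:
        verdict = "spam"
    elif msg.sender in rules.allow and msg.dmarc_pass:
        verdict = "inbox"
    return verdict
```

Because the blocklist check returns before any message content is examined, messages from listed servers never reach the more expensive scoring and hashing stages, which is the point of ordering the steps this way.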

Over the past few months, we have optimized and improved many of the above components to achieve a 500% improvement in spam detection. In doing this, we learned a few lessons:

Security through Obscurity

Generally speaking, security through obscurity is not recommended. This is why Proton Mail is open source, and we have all of our front end code open to inspection from the community. Security through obscurity is the antithesis of open source, and relies on the notion that the security of a system can be improved if attackers do not know how the system works. Generally, this is a bad approach. It is better to have a system so secure that even if attackers know how it works, they cannot bypass it. This is certainly the case with the PGP email encryption that Proton Mail utilizes.

However, one case where security through obscurity DOES work is fighting spammers. This is particularly the case when it comes to fighting outgoing spam. Fighting spam is like trying to hit a moving target: it requires constant adjustment and tuning, especially since the distinction between spam and non-spam messages can be unclear at times.

There simply isn’t any foolproof method for defeating spam. Thus, if spammers don’t know how we are blocking their messages, it makes it much more difficult for them to find a workaround. This is why we cannot publish detailed specs of how our spam filters work. It also means we cannot open source our backend server configs which contain our spam filter settings.

In terms of privacy and trust, there is little advantage in open sourcing the server configs because even if the configs were released, there would be no way to guarantee that they are the configs actually running on the server side. On the other hand, releasing the backend configs would let spammers know exactly what they need to do to bypass our spam filters, which would put the entire Proton Mail community at risk.

Personalized Spam Filtering

Before designing our spam filtering system, we looked through months of spam reports from the community. What we quickly learned is that every user has a different definition of spam. What you consider to be spam won’t be the same as what your neighbor considers to be spam. Thus, it is impossible to define a single ruleset that works for everybody. This pushed us in the direction of personalized rulesets.

Today, every single Proton Mail account comes with its own spam filter settings which are unique to that account. When you mark messages as spam or not-spam, the filter will dynamically adjust to take into account your personal preferences. You can also view and modify your personal spam filter settings. Proton Mail also accounts for whether an email came from one of your contacts or not. If it comes from a contact, it is allowed through the spam filter.
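As a rough illustration of how a per-account filter could adapt to spam and not-spam feedback, here is a minimal sketch of a naive-Bayes-style scorer that keeps separate word counts for each user and lets contacts bypass scoring. This is an assumption-laden toy, not a description of Proton Mail's filter: the PersonalFilter class, its method names, the add-one smoothing, and the zero decision threshold are all hypothetical choices made for the example.

```python
import math
import re
from collections import Counter


class PersonalFilter:
    """Toy per-user filter: one instance per account (hypothetical design)."""

    def __init__(self, contacts=()):
        self.contacts = set(contacts)
        # Number of marked messages containing each word, per class.
        self.counts = {True: Counter(), False: Counter()}
        # Number of messages the user has marked, per class.
        self.totals = {True: 0, False: 0}

    @staticmethod
    def tokens(text):
        # Distinct lowercase words; each counts once per message.
        return set(re.findall(r"[a-z']+", text.lower()))

    def mark(self, text, is_spam):
        """Record the user's spam / not-spam feedback for one message."""
        self.counts[is_spam].update(self.tokens(text))
        self.totals[is_spam] += 1

    def spam_log_odds(self, text):
        # Smoothed prior, then one smoothed likelihood ratio per word present.
        odds = math.log((self.totals[True] + 1) / (self.totals[False] + 1))
        for tok in self.tokens(text):
            p_spam = (self.counts[True][tok] + 1) / (self.totals[True] + 2)
            p_ham = (self.counts[False][tok] + 1) / (self.totals[False] + 2)
            odds += math.log(p_spam / p_ham)
        return odds

    def is_spam(self, sender, text):
        # Mail from a contact is allowed through without scoring.
        if sender in self.contacts:
            return False
        return self.spam_log_odds(text) > 0


# Two users give opposite feedback on the same newsletter.
alice = PersonalFilter()
alice.mark("weekly newsletter deals", is_spam=True)
alice.mark("meeting notes attached", is_spam=False)

bob = PersonalFilter(contacts={"shop@example.com"})
bob.mark("weekly newsletter deals", is_spam=False)
bob.mark("cheap pills online", is_spam=True)
```

With this feedback, alice.is_spam("news@example.com", "newsletter deals this week") is true while the same call on bob is false, which mirrors the observation that two people can disagree about whether the same message is spam.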

The Best Spam Filter is You

Proton Mail has a comprehensive multi-tiered protection system to prevent spam from entering your inbox, but actually you are the best protection against spam. The vast majority of spam can be avoided by simply not giving your Proton Mail email address to unscrupulous websites which then resell that information to spammers. To learn more about how to avoid receiving spam in the first place, read our guide to avoiding spam.


Proton Team

We are scientists, engineers, and specialists from around the world drawn together by a shared vision of protecting freedom and privacy online. Proton was born out of a desire to build an internet that puts people before profits, and we're working to create a world where everyone is in control of their digital lives.
