---
date: 2024-10-29
title: Postes, télégraphes et téléphones, next steps
description: An update of our email stack
tags:
  - SMTP
  - emails
  - mailing-lists
author:
  name: Romain Calascibetta
  email: romain.calascibetta@gmail.com
  link: https://blog.osau.re/
breaks: false
---

As you know from our article on Robur's finances, we've just received funding for our email project. This project started when I was doing my internship in Cambridge and it's great to see that it's been able to evolve over time and remain functional. This article will introduce you to the latest changes to our PTT project and how far we've got towards providing an OCaml mailing list service.

A Git repository or a simple block device as a database?

One issue that came up quickly in our latest experiments with our SMTP stack was the database of users with an email address. Since we had decided to break down the various stages of an email submission to offer simple unikernels, we ended up having to deploy 4 unikernels to have a service that worked.

  • a unikernel for authentication
  • a unikernel DKIM-signing the incoming email
  • a unikernel acting as primary DNS server
  • a unikernel sending the signed email to its real destination

And this only covers the submission of an email; reception goes through another pipeline.

The problem with such an architecture is that some unikernels need to have the same data: the users. In this case, the first unikernel needs to know the user's password in order to verify authentication. The final unikernel needs to know the real destinations of the users.

Let's take the example of two users: foo@robur.coop and bar@robur.coop. The first points to hannes@foo.org and the second to reynir@example.com.

If Hannes wants to send a message to bar@robur.coop under the identity of foo@robur.coop, he will need to authenticate himself to our first unikernel. This first unikernel must therefore:

  1. check that the user foo exists
  2. check that the hash of the password supplied by Hannes matches the one in the database

Next, the email will be signed by our second unikernel. It will then forward the email to the last unikernel, which will do the actual translation of the recipients and DNS resolution. In other words:

  1. it will see that the one (and only) recipient is bar@robur.coop
  2. it will check that bar@robur.coop exists and look up its real address
  3. it will obtain reynir@example.com and perform DNS resolution on example.com to find the email server for this domain
  4. it will finally send the signed email from foo@robur.coop to reynir@example.com!

So the first and last unikernels need to have the same information about our users: the first needs the passwords, the last needs the real email addresses.
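To make the shared data concrete, here is a minimal OCaml sketch of the two lookups described above. Everything in it is an assumption made for illustration: the `user` record, the function names and the `hash` placeholder (built on the standard library's `Digest`, which is not a suitable password hash for a real service) are not taken from PTT or from our unikernels.

```ocaml
(* Hypothetical user record: field names are illustrative only. *)
type user = {
  local : string;              (* local part, e.g. "foo" for foo@robur.coop *)
  password_hash : string;      (* what the authentication unikernel needs *)
  destinations : string list;  (* what the relaying unikernel needs *)
}

(* Placeholder hash for the sketch (MD5 via the stdlib). A real deployment
   would use a dedicated password-hashing function. *)
let hash password = Digest.string password

let users = [
  { local = "foo"; password_hash = hash "secret-foo";
    destinations = [ "hannes@foo.org" ] };
  { local = "bar"; password_hash = hash "secret-bar";
    destinations = [ "reynir@example.com" ] };
]

let find_user users local =
  List.find_opt (fun u -> String.equal u.local local) users

(* First unikernel: the user must exist and the password hash must match. *)
let authenticate users ~local ~password =
  match find_user users local with
  | Some u -> String.equal u.password_hash (hash password)
  | None -> false

(* Last unikernel: translate a recipient of our domain into its real
   destinations, or [None] if the recipient is unknown. *)
let resolve users ~domain recipient =
  match String.split_on_char '@' recipient with
  | [ local; d ] when String.equal d domain ->
    Option.map (fun u -> u.destinations) (find_user users local)
  | _ -> None
```

With this sketch, `authenticate` only reads `password_hash` and `resolve` only reads `destinations`, yet both need the same list of users, which is exactly the data two independent unikernels would have to keep in sync.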

But as you know, we're talking about unikernels that exist independently of each other. What's more, they can't share files, and the possibility of them sharing block devices remains an open question (and a complex one where parallel access may be involved). In short, the only way to keep shared data in sync between these unikernels is with a Git repository.

Git is already widely used by our unikernels (primary-git, pasteur, unipi and contruno). Its advantage is that you can track changes, modify files and notify the unikernel to update itself (using nsupdate, a simple ping or an HTTP request to the unikernel).

The problem is that this requires certain skills. Even if setting up a Git server and then deploying our unikernels is simple enough, we can restructure our architecture and simplify the deployment of an SMTP stack!

Elit and OneFFS

We have therefore decided to merge the email exchange service and email submission into a single unikernel, so that it is the only one that needs the user information.

We also decided to use OneFFS as the file system for our database, which will be a plain JSON file. This is perhaps one of the advantages of MirageOS: you can pick exactly the components you need to meet a specific objective.

In this case, those with experience of Postfix, LDAP or MariaDB could confirm that configuring an email service should be simpler than implementing a multitude of pipes between different applications and authentication methods.

The JSON file is therefore very simple, and so is the creation of a OneFFS image:

$ cat >database.json<<EOF
> [ { "name": "din"
>   , "password": "xxxxxx"
>   , "mailboxes": [ "romain.calascibetta@gmail.com" ] } ]
> EOF
$ opam install oneffs
$ oneffs create -i database.json -o database.img

All you have to do is register this image as a block device with albatross and launch our Elit unikernel with this block device.

$ albatross-client create-block --data=database.img database 1024
$ albatross-client create --net=service:br0 --block=database:database \
    elit elit.hvt \
    --arg=...

At this stage, and if we add our unikernel signing incoming emails, we have more or less the same thing as what I've described in my previous articles on deploying an email service.

Multiplex receiving & sending emails

The PTT project is a toolkit for implementing SMTP servers. It gives developers the choice of implementing their logic as they see fit:

  • sign an email
  • resolve destinations according to a database
  • check SPF information
  • annotate the email as spam or not
  • etc.

Previously, PTT was split into 2 parts:

  1. management of incoming clients/emails
  2. the logic to be applied to incoming emails and their delivery

The second point was becoming increasingly complex, however, and errors in sending emails are legion (DMARC non-alignment, the email is too big for the destination, the destination doesn't exist, etc.). All the more so since, up to now, PTT could only report these errors via the logs...

Hannes immediately mentioned the possibility of separating the logic of the unikernel from the delivery. This will allow us to deal with temporary failures (greylisting) as well. So a fundamental change was made:

  • improve the sendmail and sendmail-lwt packages (as well as proposing sendmail-miou!) when sending or submitting an email
  • improve PTT so that there are now 3 distinct jobs: receiving, what to do with incoming emails and sending emails

[Figure: SMTP]

This finally allows us to describe a clearer error management policy that is independent of what we want to do with incoming emails. At this stage, we can look for the Return-Path in emails that we haven't managed to send and notify the senders!
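As an illustration of what such a policy could look like once delivery is a separate job, here is a small OCaml sketch. It is not PTT's API: the `email`, `outcome` and `action` types, the `max_attempts` limit and the `next_action` function are all hypothetical, and the retry limit of 3 is an arbitrary value chosen for the example.

```ocaml
(* Hypothetical types: none of these names come from PTT itself. *)
type email = {
  return_path : string option;  (* [None] models an empty Return-Path *)
  recipient : string;
  attempts : int;               (* delivery attempts made so far *)
}

type outcome =
  | Delivered
  | Temporary_failure of string  (* e.g. greylisting: worth retrying *)
  | Permanent_failure of string  (* e.g. unknown destination, email too big *)

type action =
  | Done
  | Retry of email
  | Bounce of { to_ : string; reason : string }
  | Drop of string

(* Arbitrary limit for this sketch. *)
let max_attempts = 3

(* Decide what the sending job should do after one delivery attempt,
   independently of the logic applied to the incoming email. *)
let next_action email outcome =
  let bounce reason =
    match email.return_path with
    | Some to_ -> Bounce { to_; reason }  (* notify the sender *)
    | None -> Drop reason                 (* nobody to notify *)
  in
  match outcome with
  | Delivered -> Done
  | Permanent_failure reason -> bounce reason
  | Temporary_failure reason ->
    if email.attempts + 1 < max_attempts
    then Retry { email with attempts = email.attempts + 1 }
    else bounce reason
```

The point of the design is that `next_action` knows nothing about signing, filtering or resolving recipients: it only looks at the result of a delivery attempt, which is what makes an error policy possible without going through the logs.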

All this is still in the experimental stage and practical cases are needed to observe how we should handle errors and how others do.

Insights & Next goals

We're already starting to have a bit of fun with email and we can start sending and receiving emails right away.

We're also already seeing hacking attempts on our unikernel:

  • people trying to authenticate themselves without STARTTLS (or with it, depending on how clever the bot is)
  • people trying to send emails as users who don't exist in our database
  • content that has nothing to do with SMTP

Above all, this shows that, very early on, bots try to impersonate the identity linked to your server (in our case, osau.re) in order to send spam, authenticate themselves or simply send stuff and observe what happens. In all the cases mentioned, Elit (and PTT) reacts well: it simply cuts off the connection.

We were also able to observe how services such as Gmail work. In addition, for the purposes of a mailing list, email forwarding breaks DMARC verification (specifically, SPF verification). The case is very simple:

foo@gmail.com replies to robur@osau.re, which is a mailing list forwarding to several addresses (one of them is bar@gmail.com). The unikernel receives the email and sends it to bar@gmail.com. The problem is the alignment between the From field (which still corresponds to foo@gmail.com) and our osau.re server, which actually delivers the message. From gmail.com's point of view, these two pieces of information are misaligned, and it therefore refuses the email.
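The alignment rule at play can be sketched in a few lines of OCaml. This is a simplified model and not our DMARC implementation: the `check` record and the function names are hypothetical, only strict alignment (identical domains) is modelled, whereas relaxed alignment compares organizational domains and needs the Public Suffix List, and ARC is not modelled at all.

```ocaml
(* Hypothetical result of an SPF or DKIM verification: whether it passed,
   and for which domain it was evaluated. *)
type check = { passed : bool; domain : string }

(* Domain part of an address, lowercased; [None] if there is no '@'. *)
let domain_of address =
  match String.rindex_opt address '@' with
  | Some i ->
    let len = String.length address - i - 1 in
    Some (String.lowercase_ascii (String.sub address (i + 1) len))
  | None -> None

(* Strict alignment only: the checked domain must equal the From domain. *)
let aligned ~from domain =
  match domain_of from with
  | Some d -> String.equal d (String.lowercase_ascii domain)
  | None -> false

(* DMARC is satisfied if at least one check (SPF, or any DKIM signature)
   both passed and is aligned with the From field. *)
let dmarc_pass ~from ~spf ~dkim =
  List.exists (fun c -> c.passed && aligned ~from c.domain) (spf :: dkim)
```

For the forwarded email above, and assuming for the example that SPF is evaluated against osau.re and that no aligned DKIM result is available, `dmarc_pass ~from:"foo@gmail.com" ~spf:{ passed = true; domain = "osau.re" } ~dkim:[]` is `false`: SPF may well pass for our server, but it passes for the wrong domain.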

This is where our next objectives come in:

  • finish our DMARC implementation
  • implement ARC so that our server can attest that, on our side, the DMARC check went well and that gmail.com should trust us on this.

There is another way of solving the problem, perhaps a little more problematic: modifying the incoming email, and in particular the From field. Although this could be done quite simply with mrmime, it's better to concentrate on DMARC and ARC so that we can send our emails as they are and never alter them (especially as altering them would invalidate previous DKIM signatures!).

Conclusion

It's always satisfying to see your projects working more or less correctly. This article will surely be the start of a series on the intricacies of email and the difficulty of deploying such a service at home.

We hope that this NLnet-funded work will enable us to replace our current email system with unikernels. We've already reached the stage where we can, more or less (without DMARC checking), send emails to each other, which is a big step!

So follow our work on our blog and if you like what we're producing (which involves a whole bunch of protocols and formats - much more than just SMTP), you can make a donation here!