diff --git a/articles/2024-10-23-ptt.md b/articles/2024-10-23-ptt.md
new file mode 100644
index 0000000..7c2bfbe
--- /dev/null
+++ b/articles/2024-10-23-ptt.md
@@ -0,0 +1,221 @@
+---
+date: 2024-10-23
+title: Postes, télégraphes et téléphones, next steps
+description: An update of our email stack
+tags:
+  - SMTP
+  - emails
+  - mailing-lists
+author:
+  name: Romain Calascibetta
+  email: romain.calascibetta@gmail.com
+  link: https://blog.osau.re/
+breaks: false
+---
+
+As you know from [our article on Robur's
+finances](https://blog.robur.coop/articles/finances.html), we've just received
+[funding for our email project](https://nlnet.nl/project/PTT). This project
+started when I was doing my internship in Cambridge and it's great to see that
+it's been able to evolve over time and remain functional. This article will
+introduce you to the latest changes to [our PTT
+project](https://github.com/mirage/ptt) and how far we've got towards providing
+an OCaml mailing list service.
+
+## A Git repository or a simple block device as a database?
+
+One issue that came up quickly in our latest experiments with our SMTP stack was
+the database of users with an email address. Since we had decided to ‘break
+down’ the various stages of an email submission to offer simple unikernels, we
+ended up having to deploy 4 unikernels to have a service that worked:
+- a unikernel for authentication
+- a unikernel DKIM-signing the incoming email
+- one unikernel as primary DNS server
+- one unikernel sending the signed email to its real destination
+
+And we're only talking here about the submission of an email; reception
+goes through another ‘pipe’.
+
+The problem with such an architecture is that some unikernels need to have the
+same data: the users. In this case, the first unikernel needs to know the user's
+password in order to verify authentication. The final unikernel needs to know
+the real destinations of the users.
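
As a minimal sketch of why both ends need the same user data, the OCaml below models one shared table serving the two lookups: the password check needed by the first unikernel and the real destinations needed by the last one. The record fields, function names and passwords are hypothetical (they are not PTT's API), the two users are the ones from the example that follows, and `Digest` (MD5) only stands in for a real password hash to keep the sketch dependency-free.

```ocaml
(* Hypothetical shape of a user entry; the field names are assumptions
   made for illustration, not PTT's actual types. *)
type user = {
  name : string;            (* local part, e.g. "foo" for foo@robur.coop *)
  password_hash : string;   (* hex digest of the password *)
  mailboxes : string list;  (* real destinations of the user *)
}

(* Stand-in hash from the standard library (MD5), used only to keep the
   sketch self-contained. A real service needs a proper password hash. *)
let hash password = Digest.to_hex (Digest.string password)

(* The shared data: both unikernels must see the same table. *)
let database = [
  { name = "foo"; password_hash = hash "secret-foo";
    mailboxes = [ "hannes@foo.org" ] };
  { name = "bar"; password_hash = hash "secret-bar";
    mailboxes = [ "reynir@example.com" ] };
]

let find name =
  List.find_opt (fun user -> String.equal user.name name) database

(* What the authentication unikernel needs: the user exists and the
   hashed password matches the stored one. *)
let authenticate ~name ~password =
  match find name with
  | Some user -> String.equal user.password_hash (hash password)
  | None -> false

(* What the last unikernel needs: the real destinations of a user. *)
let destinations name =
  match find name with
  | Some user -> user.mailboxes
  | None -> []
```

Both functions read the same `database` value, which is exactly the data that has to be kept in sync when the two roles live in separate unikernels.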
+
+Let's take the example of two users: foo@robur.coop and bar@robur.coop. The
+first points to hannes@foo.org and the second to reynir@example.com.
+
+If Hannes wants to send a message to bar@robur.coop under the identity of
+foo@robur.coop, he will need to authenticate himself to our first unikernel.
+This first unikernel must therefore:
+1) check that the user `foo` exists
+2) check that the hashed password used by Hannes matches the one in the database
+
+Next, the email will be signed by our second unikernel. It will then forward the
+email to the last unikernel, which will do the actual translation of the
+recipients and DNS resolution. In other words, the last unikernel will:
+1) see that one (the only) recipient is bar@robur.coop
+2) check that bar@robur.coop exists and obtain its real address
+3) obtain reynir@example.com and perform DNS resolution on
+   `example.com` to find out the email server for this domain
+4) finally send the email signed by foo@robur.coop to reynir@example.com!
+
+So the first and last unikernels need to have the same information about our
+users: the first for the passwords, the last for the real email addresses.
+
+But as you know, we're talking about unikernels that exist independently of each
+other. What's more, they can't share files and the possibility of them sharing
+block-devices remains an open question (and a complex one where parallel access
+may be involved). In short, the only way to ‘synchronise’ these unikernels in
+relation to common data is with a Git repository.
+
+[Git][git-kv] has the advantage of being widely used for our unikernels
+([primary-git][primary-git], [pasteur][pasteur], [unipi][unipi] and
+[contruno][contruno]). It also lets you track changes, modify
+files and notify the unikernel to update itself (using nsupdate, a simple ping
+or an HTTP request to the unikernel).
+
+The problem is that this requires certain skills.
+Even if it's ‘simple’ to set up a Git server and then deploy our unikernels,
+we can still restructure our architecture and simplify the deployment of an SMTP stack!
+
+## Elit and OneFFS
+
+We have therefore decided to merge the email exchange service and email
+submission into a single unikernel, so that it is the only one that needs the
+user information.
+
+So we decided to use [OneFFS][oneffs] as the file system for our database,
+which will be a plain JSON file. This is perhaps one of the advantages of
+MirageOS: you can decide exactly what you need to meet
+specific objectives.
+
+In this case, those with experience of Postfix, LDAP or MariaDB could confirm
+that configuring an email service should be ‘simpler’ than implementing a
+multitude of pipes between different applications and authentication methods.
+
+The JSON file is therefore very simple and so is the creation of a OneFFS
+image:
+```sh
+$ cat >database.json <<EOF
+> [ { "name": "din"
+> , "password": "xxxxxx"
+> , "mailboxes": [ "romain.calascibetta@gmail.com" ] } ]
+> EOF
+$ opam install oneffs
+$ oneffs create -i database.json -o database.img
+```
+
+All you have to do is register this image as a block device with [albatross][albatross] and launch
+our Elit unikernel with this block-device.
+```sh
+$ albatross-client create-block --data=database.img database 1024
+$ albatross-client create --net=service:br0 --block=database:database \
+  elit elit.hvt \
+  --arg=...
+```
+
+At this stage, and if we add our unikernel signing incoming emails, we have more
+or less the same thing as what I've described in [my previous articles][smtp_1] on
+[deploying][smtp_2] an [email service][smtp_3].
+
+## Multiplex receiving & sending emails
+
+The PTT project is a toolkit for implementing SMTP servers. It gives developers
+the choice of implementing their logic as they see fit:
+* sign an email
+* resolve destinations according to a database
+* check SPF information
+* annotate the email as spam or not
+* etc.
+
+Previously, PTT was split into 2 parts:
+1) management of incoming clients/emails
+2) the logic to be applied to incoming emails and their delivery
+
+The second point was becoming increasingly complex, however, and errors in
+sending emails are legion (DMARC non-alignment, the email is too big for the
+destination, the destination doesn't exist, etc.). All the more so since, up to
+now, PTT could only report these errors via the logs...
+
+Hannes immediately mentioned the possibility of separating the logic of the
+unikernel from the delivery. This will allow us to deal with temporary failures
+(greylisting) as well. So a fundamental change was made:
+- improve the [sendmail][sendmail] and `sendmail-lwt` packages (as well as proposing
+`sendmail-miou`!) used when sending or submitting an email
+- improve PTT so that there are now 3 distinct jobs: receiving, what to do with
+incoming emails and sending emails
+
+![SMTP](../images/smtp.jpg)
+
+This finally allows us to describe a clearer error management policy that is
+independent of what we want to do with incoming emails. At this stage, we can
+look for the `Return-Path` in emails that we haven't managed to send and notify
+the senders!
+
+All this is still in the experimental stage and practical cases are needed to
+observe how we should handle errors and how others handle them.
+
+## Insights & Next goals
+
+We're already starting to have a bit of fun with email and we can start sending
+and receiving emails right away.
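
Going back to the sending job and its error policy from the previous section, one way to picture it is as a classification of the SMTP reply returned by the destination: a 2xx reply means the email was delivered, a 4xx reply is a temporary failure (the greylisting case) and a 5xx reply is a permanent one. The sketch below only illustrates that policy; the type and function names are hypothetical and are not PTT's API.

```ocaml
(* Hypothetical outcome of one delivery attempt. *)
type outcome =
  | Delivered
  | Retry_later  (* temporary failure, e.g. greylisting *)
  | Bounce       (* permanent failure: the sender should be notified *)

(* Classify the final SMTP reply code sent by the destination.
   Anything that is neither 2xx nor 4xx is treated as a permanent
   failure in this sketch. *)
let classify code =
  if code >= 200 && code < 300 then Delivered
  else if code >= 400 && code < 500 then Retry_later
  else Bounce

(* Decide who to notify. [return_path] is [None] when the email has a
   null sender, in which case nobody is notified (this avoids answering
   a bounce with another bounce). *)
let notification ~return_path outcome =
  match outcome, return_path with
  | Bounce, Some sender -> Some sender
  | _ -> None
```

Keeping the classification separate from the logic applied to incoming emails mirrors the split into three jobs: the sending job can retry or notify without knowing what was done to the email beforehand.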
+
+We're also already seeing hacking attempts on our unikernel:
+- people trying to authenticate themselves without `STARTTLS` (or with it,
+depending on how clever the bot is)
+- people trying to send emails as non-existent users in our database
+- content that has nothing to do with SMTP
+
+Above all, this shows that, very early on, bots try to usurp the identity linked
+to your server (in our case, osau.re) in order to send spam, authenticate
+themselves or simply send ‘stuff’ and observe what happens. In
+all the cases mentioned, Elit (and PTT) reacts well: in other words, it simply
+cuts off the connection.
+
+We were also able to observe how services such as Gmail work. In addition, for
+the purposes of a mailing list, email forwarding distorts DMARC verification
+(specifically, SPF verification). The case is very simple:
+
+foo@gmail.com tries to reply to robur@osau.re. robur@osau.re is a mailing list
+to several addresses (one of them is bar@gmail.com). The unikernel will receive
+the email and send it to bar@gmail.com. The problem is the alignment between
+the `From` field (which corresponds to foo@gmail.com) and our osau.re server.
+From gmail.com's point of view, there is a misalignment between these two
+pieces of information and it therefore refuses to receive the email.
+
+This is where our next objectives come in:
+- finish our DMARC implementation
+- implement ARC so that our server can attest that, on our side, the DMARC
+  check went well and that gmail.com should trust us on this.
+
+There is another way of solving the problem, perhaps a little more problematic:
+modifying the incoming email, and in particular the `From` field. Although this
+could be done quite simply with [mrmime][mrmime], it's better to concentrate on
+DMARC and ARC so that we can send our emails as they are and never alter them
+(especially as altering them would invalidate previous DKIM signatures!).
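
The misalignment described above can be sketched as follows. This is a simplification under stated assumptions: it models strict alignment only (exact domain match; DMARC's relaxed mode, which compares organizational domains, is left out) and it takes the SPF and DKIM results as inputs instead of computing them. The function names are hypothetical and belong neither to PTT nor to mrmime.

```ocaml
(* Domain part of an address such as "foo@gmail.com", lowercased. *)
let domain_of address =
  match String.rindex_opt address '@' with
  | Some i ->
    let len = String.length address - i - 1 in
    Some (String.lowercase_ascii (String.sub address (i + 1) len))
  | None -> None

(* Strict alignment: the authenticated domain must be identical to the
   domain of the From field. *)
let aligned ~from ~authenticated =
  match domain_of from with
  | Some domain -> String.equal domain (String.lowercase_ascii authenticated)
  | None -> false

(* [spf] is the domain for which SPF passed, [dkim] the domain of a valid
   DKIM signature; [None] means the check failed or is absent.
   DMARC passes when at least one of them is aligned with From. *)
let dmarc_pass ~from ~spf ~dkim =
  let ok = function
    | Some domain -> aligned ~from ~authenticated:domain
    | None -> false
  in
  ok spf || ok dkim
```

Assuming the forwarded email leaves our server with SPF passing for osau.re and no surviving DKIM signature, `dmarc_pass ~from:"foo@gmail.com" ~spf:(Some "osau.re") ~dkim:None` is `false`, which is the refusal described above. It becomes `true` only if a valid signature for gmail.com survives the forwarding, which is one more reason not to alter the email.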
+
+## Conclusion
+
+It's always satisfying to see your projects working ‘more or less’ correctly.
+This article will surely be the start of a series on the intricacies of email
+and the difficulty of deploying such a service at home.
+
+We hope that this NLnet-funded work will enable us to replace our current email
+system with unikernels. We're already past the stage where we can, more or less
+(without DMARC checking), send emails to each other, which is a big step!
+
+So follow our work on our blog and if you like what we're producing (which
+involves a whole bunch of protocols and formats - much more than just SMTP), you
+can make [a donation here](https://robur.coop/Donate)!
+
+[mrmime]: https://github.com/mirage/mrmime
+[smtp_1]: https://blog.osau.re/articles/smtp_1.html
+[smtp_2]: https://blog.osau.re/articles/smtp_2.html
+[smtp_3]: https://blog.osau.re/articles/smtp_3.html
+[oneffs]: https://github.com/robur-coop/oneffs
+[albatross]: https://github.com/robur-coop/albatross
+[git-kv]: https://github.com/robur-coop/git-kv
+[primary-git]: https://github.com/robur-coop/dns-primary-git/
+[contruno]: https://github.com/dinosaure/contruno
+[pasteur]: https://github.com/dinosaure/pasteur
+[unipi]: https://github.com/robur-coop/unipi
+[sendmail]: https://github.com/mirage/colombe
diff --git a/images/smtp.jpg b/images/smtp.jpg
new file mode 100644
index 0000000..c974dcc
Binary files /dev/null and b/images/smtp.jpg differ