---
date: 2024-10-23
title: Postes, télégraphes et téléphones, next steps
description: An update of our email stack
tags:
- SMTP
- emails
- mailing-lists
author:
  name: Romain Calascibetta
  email: romain.calascibetta@gmail.com
  link: https://blog.osau.re/
breaks: false
---
As you know from [our article on Robur's
finances](https://blog.robur.coop/articles/finances.html), we've just received
[funding for our email project](https://nlnet.nl/project/PTT). This project
started when I was doing my internship in Cambridge and it's great to see that
it's been able to evolve over time and remain functional. This article will
introduce you to the latest changes to [our PTT
project](https://github.com/mirage/ptt) and how far we've got towards providing
an OCaml mailing list service.
## A Git repository or a simple block device as a database?
One issue that came up quickly in our latest experiments with our SMTP stack was
the database of users with an email address. Since we had decided to break
down the various stages of an email submission to offer simple unikernels, we
ended up having to deploy 4 unikernels to have a working service:

- a unikernel for authentication
- a unikernel DKIM-signing the incoming email
- a unikernel acting as primary DNS server
- a unikernel sending the signed email to its real destination

And this only covers the submission of an email; reception goes through a
separate pipeline.

The problem with such an architecture is that some unikernels need to have the
same data: the users. In this case, the first unikernel needs to know the user's
password in order to verify authentication. The final unikernel needs to know
the real destinations of the users.

Let's take the example of two users: foo@robur.coop and bar@robur.coop. The
first points to hannes@foo.org and the second to reynir@example.com.

If Hannes wants to send a message to bar@robur.coop under the identity of
foo@robur.coop, he will need to authenticate himself to our first unikernel.
This first unikernel must therefore:

1) check that the user `foo` exists
2) check that the hash of the password given by Hannes matches the one in the
   database

Next, the email will be signed by our second unikernel, which will then forward
it to the last unikernel. That one does the actual translation of the
recipients and the DNS resolution. In other words, it will:

1) see that the one (and only) recipient is bar@robur.coop
2) check that bar@robur.coop exists and obtain its real address,
   reynir@example.com
3) perform DNS resolution on `example.com` to find out the email server for
   this domain
4) finally send the signed email from foo@robur.coop to reynir@example.com!

So the first and last unikernels need to have the same information about our
users: the first for the passwords, the last for the real email addresses.
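
To make the two lookups concrete, here is a minimal OCaml sketch of them. It is only an illustration and not PTT's actual code: the `user` record, its field names, the placeholder hashes and the functions `authenticate` and `resolve` are all hypothetical.

```ocaml
(* Hypothetical user record: these names are assumptions made for this
   sketch, they are not PTT's actual types. *)
type user =
  { name : string            (* local part, e.g. "foo" for foo@robur.coop *)
  ; password_hash : string   (* compared as an opaque string *)
  ; mailboxes : string list  (* real destination addresses *) }

let database =
  [ { name = "foo"; password_hash = "<hash-of-foo>"
    ; mailboxes = [ "hannes@foo.org" ] }
  ; { name = "bar"; password_hash = "<hash-of-bar>"
    ; mailboxes = [ "reynir@example.com" ] } ]

let find_user db name = List.find_opt (fun u -> String.equal u.name name) db

(* First unikernel: the user must exist and the hashes must match. *)
let authenticate db ~name ~password_hash =
  match find_user db name with
  | Some u -> String.equal u.password_hash password_hash
  | None -> false

(* Split "local@domain" on the last '@'. *)
let split_address addr =
  match String.rindex_opt addr '@' with
  | None -> None
  | Some i ->
    let local = String.sub addr 0 i in
    let domain = String.sub addr (i + 1) (String.length addr - i - 1) in
    Some (local, domain)

(* Last unikernel: translate a recipient of our domain into pairs of
   (real address, domain whose email server must be found via DNS). *)
let resolve db ~domain recipient =
  match split_address recipient with
  | Some (local, d) when String.equal d domain ->
    (match find_user db local with
     | None -> []
     | Some u ->
       List.filter_map
         (fun mailbox ->
           Option.map
             (fun (_, mx_domain) -> (mailbox, mx_domain))
             (split_address mailbox))
         u.mailboxes)
  | _ -> []
```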

But as you know, we're talking about unikernels that exist independently of each
other. What's more, they can't share files, and whether they can share
block devices remains an open question (and a complex one where parallel access
may be involved). In short, the only way to keep these unikernels synchronised
on common data is with a Git repository.

[Git][git-kv] is already widely used by our unikernels
([primary-git][primary-git], [pasteur][pasteur], [unipi][unipi] and
[contruno][contruno]). Its advantage is that you can track changes, modify
files and notify the unikernel to update itself (using nsupdate, a simple ping
or an HTTP request to the unikernel).

The problem is that this requires certain skills. Even if it's simple to set
up a Git server and then deploy our unikernels, we can restructure our
architecture and simplify the deployment of an SMTP stack!
## Elit and OneFFS
We have therefore decided to merge the email exchange service and email
submission into a single unikernel, so that it is the only one that needs the
user information.

So we decided to use [OneFFS][oneffs] as the file system for our database,
which will be a plain JSON file. This is perhaps one of the advantages of
MirageOS: you can pick exactly what you need to meet a specific objective.

In this case, those with experience of Postfix, LDAP or MariaDB could confirm
that configuring an email service should be simpler than implementing a
multitude of pipes between different applications and authentication methods.

The JSON file is therefore very simple and so is the creation of a OneFFS
image:
```sh
$ cat >database.json<<EOF
> [ { "name": "din"
> , "password": "xxxxxx"
> , "mailboxes": [ "romain.calascibetta@gmail.com" ] } ]
> EOF
$ opam install oneffs
$ oneffs create -i database.json -o database.img
```
All you have to do is register this image as a block device with
[albatross][albatross] and launch our Elit unikernel with this block device.
```sh
$ albatross-client create-block --data=database.img database 1024
$ albatross-client create --net=service:br0 --block=database:database \
elit elit.hvt \
--arg=...
```
At this stage, and if we add our unikernel signing incoming emails, we have more
or less the same thing as what I've described in [my previous articles][smtp_1] on
[deploying][smtp_2] an [email service][smtp_3].
## Multiplex receiving & sending emails
The PTT project is a toolkit for implementing SMTP servers. It gives developers
the choice of implementing their logic as they see fit:

* sign an email
* resolve destinations according to a database
* check SPF information
* annotate the email as spam or not
* etc.

Previously, PTT was split into 2 parts:

1) management of incoming clients/emails
2) the logic to be applied to incoming emails and their delivery

The second point was becoming increasingly complex, however, and errors in
sending emails are legion (DMARC non-alignment, the email is too big for the
destination, the destination doesn't exist, etc.). All the more so since, up to
now, PTT could only report these errors via the logs...

Hannes immediately mentioned the possibility of separating the logic of the
unikernel from the delivery. This will allow us to deal with temporary failures
(greylisting) as well. So a fundamental change was made:

- improve the [sendmail][sendmail] and `sendmail-lwt` packages (as well as proposing
  `sendmail-miou`!) when sending or submitting an email
- improve PTT so that there are now 3 distinct jobs: receiving, what to do with
  incoming emails and sending emails

![SMTP](../images/smtp.jpg)

This finally allows us to describe a clearer error management policy that is
independent of what we want to do with incoming emails. At this stage, we can
look for the `Return-Path` in emails that we haven't managed to send and notify
the senders!
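
To give an idea of what such a policy could look like once delivery is a job of its own, here is a small OCaml sketch. It is not PTT's API: the `email`, `outcome` and `report` types and the `deliver_all` function are hypothetical, and `send` stands in for the real SMTP client. Temporary failures are kept for a later retry, while permanent ones are turned into a notification for the `Return-Path` address when there is one.

```ocaml
(* Hypothetical types: a sketch of the "sending" job, not PTT's actual API. *)
type email =
  { return_path : string option  (* address to notify on failure, if any *)
  ; recipient : string
  ; data : string }

type outcome =
  | Delivered
  | Temporary_failure of string  (* e.g. greylisting: worth retrying later *)
  | Permanent_failure of string  (* e.g. unknown destination, email too big *)

type report =
  { retry : email list
  ; bounces : (string * string) list  (* (address to notify, reason) *) }

(* [send] is passed as an argument so that the policy below stays
   independent of how emails are actually delivered. *)
let deliver_all ~send emails =
  let step report email =
    match send email with
    | Delivered -> report
    | Temporary_failure _ -> { report with retry = email :: report.retry }
    | Permanent_failure reason ->
      (match email.return_path with
       | Some sender ->
         { report with bounces = (sender, reason) :: report.bounces }
       | None -> report (* nobody to notify: only the logs remain *))
  in
  let report = List.fold_left step { retry = []; bounces = [] } emails in
  (* Restore the original order of the emails. *)
  { retry = List.rev report.retry; bounces = List.rev report.bounces }
```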

All this is still in the experimental stage, and practical cases are needed to
observe how we should handle errors and how other mail servers handle them.
## Insights & Next goals
We're already starting to have a bit of fun with email and we can start sending
and receiving emails right away.

We're also already seeing hacking attempts on our unikernel:

- people trying to authenticate themselves without `STARTTLS` (or with it,
  depending on how clever the bot is)
- people trying to send emails as users that don't exist in our database
- content that has nothing to do with SMTP

Above all, this shows that, very early on, bots try to usurp the identity linked
to your server (in our case, osau.re) in order to send spam, authenticate
themselves or simply send stuff and observe what happens. In all the cases
mentioned, Elit (and PTT) reacts well: it simply cuts off the connection.
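
The reaction to these three kinds of attempts can be pictured as a tiny state machine. The OCaml sketch below is a heavy simplification written for this article, not Elit's code: the `command` type, the `state` record and the `react` function are hypothetical, and `check_password` stands in for the database lookup.

```ocaml
(* Hypothetical, heavily simplified view of a submission session; the
   command set and the policy are assumptions, not Elit's actual code. *)
type command =
  | Starttls
  | Auth of string * string  (* user name, password *)
  | Mail_from of string      (* user name the client wants to send as *)
  | Garbage of string        (* anything that is not SMTP *)

type state = { tls : bool; user : string option }

type reaction = Continue of state | Close

let initial = { tls = false; user = None }

(* [check_password] is a stand-in for the lookup in the user database. *)
let react ~check_password state = function
  | Starttls -> Continue { state with tls = true }
  | Auth (name, password) ->
    (* Authentication is refused outside TLS or with wrong credentials. *)
    if state.tls && check_password name password
    then Continue { state with user = Some name }
    else Close
  | Mail_from name ->
    (* The sender must be the user who authenticated. *)
    (match state.user with
     | Some user when String.equal user name -> Continue state
     | _ -> Close)
  | Garbage _ -> Close
```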

We were also able to observe how services such as Gmail work. In particular, for
the purposes of a mailing list, email forwarding breaks DMARC verification
(specifically, SPF verification). The case is very simple:

foo@gmail.com tries to reply to robur@osau.re. robur@osau.re is a mailing list
that forwards to several addresses (one of them is bar@gmail.com). The unikernel
will receive the email and send it to bar@gmail.com. The problem is the
alignment between the `From` field (which corresponds to foo@gmail.com) and our
osau.re server. From gmail.com's point of view, there is a misalignment between
these two pieces of information and it therefore refuses to receive the email.
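
The alignment problem can be sketched in a few lines of OCaml. This is a deliberately reduced model and not our DMARC implementation: the `check` type and the `dmarc_pass` function are hypothetical, only strict alignment is modelled (the authenticated domain must be exactly the domain of the `From` field), and the policy published by the domain is ignored. The example assumes that SPF is evaluated against osau.re, since that is the server actually sending the forwarded email.

```ocaml
(* Result of one authentication mechanism (SPF, or one DKIM signature):
   did it pass, and for which domain? Hypothetical type for illustration. *)
type check = { passed : bool; domain : string }

let domain_of address =
  match String.rindex_opt address '@' with
  | None -> None
  | Some i ->
    let d = String.sub address (i + 1) (String.length address - i - 1) in
    Some (String.lowercase_ascii d)

(* Strict alignment only: the authenticated domain must equal the domain of
   the [From] field. Relaxed alignment (same organizational domain) and the
   published DMARC policy are deliberately left out. *)
let aligned ~from_domain check =
  check.passed
  && String.equal (String.lowercase_ascii check.domain) from_domain

(* The email passes if SPF or at least one DKIM signature is aligned. *)
let dmarc_pass ~from ~spf ~dkim =
  match domain_of from with
  | None -> false
  | Some from_domain ->
    aligned ~from_domain spf || List.exists (aligned ~from_domain) dkim

(* The mailing-list case: SPF passes, but for osau.re, while [From] says
   gmail.com, and no DKIM signature for gmail.com is taken into account. *)
let forwarded =
  dmarc_pass ~from:"foo@gmail.com"
    ~spf:{ passed = true; domain = "osau.re" }
    ~dkim:[]
(* [forwarded] is [false]: nothing is aligned with gmail.com. *)
```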

This is where our next objectives come in:

- finish our DMARC implementation
- implement ARC so that our server can state that, on our side, the DMARC
  check went well and that gmail.com should trust us on this.

There is another way of solving the problem, perhaps a little more problematic:
modifying the incoming email, and in particular the `From` field. Although this
could be done quite simply with [mrmime][mrmime], it's better to concentrate on
DMARC and ARC so that we can send our emails as they are and never alter them
(especially as altering them would invalidate previous DKIM signatures!).
## Conclusion
It's always satisfying to see your projects working more or less correctly.
This article will surely be the start of a series on the intricacies of email
and the difficulty of deploying such a service at home.

We hope that this NLnet-funded work will enable us to replace our current email
system with unikernels. We've already reached the stage where we can, more or
less (without DMARC checking), send emails to each other, which is a big step!

So follow our work on our blog and if you like what we're producing (which
involves a whole bunch of protocols and formats - much more than just SMTP), you
can make [a donation here](https://robur.coop/Donate)!

[mrmime]: https://github.com/mirage/mrmime
[smtp_1]: https://blog.osau.re/articles/smtp_1.html
[smtp_2]: https://blog.osau.re/articles/smtp_2.html
[smtp_3]: https://blog.osau.re/articles/smtp_3.html
[oneffs]: https://github.com/robur-coop/oneffs
[albatross]: https://github.com/robur-coop/albatross
[git-kv]: https://github.com/robur-coop/git-kv
[primary-git]: https://github.com/robur-coop/dns-primary-git/
[contruno]: https://github.com/dinosaure/contruno
[pasteur]: https://github.com/dinosaure/pasteur
[unipi]: https://github.com/robur-coop/unipi
[sendmail]: https://github.com/mirage/colombe