forked from robur/blog.robur.coop
Compare commits
68 commits
Commits (SHA1):
d74e429ba7, 8ccb063981, dd0fc6e95e, 575cc7f095, 04e71b2485, f9b9b712ec,
2fda58ca98, 3ebe667432, 086485e904, ec0dec16ef, e4762faff5, a8629616de,
26880677b0, f3be2beb34, e993307d83, bc0bbbc706, f37e1a5c1c, c2aeb492ec,
0aadded352, ceb856999c, 45917109f0, 340a53acd4, cf9e1df880, f35e0763db,
bc02087612, 62c7a926f0, 2228432cab, 4a4c22e244, 1918cb3821, 0153d20b89,
ca5efb67e8, dcbb5e0e05, caec8c419c, 58f1bf634b, f10a30f29c, 80192cb685,
9991309d38, f65105c31a, 6f77487c47, c85b9ad712, 4643347770, 6ee1282392,
52d6ce6b67, 06b0b673c6, 7093e64796, db5e8fd9cb, 9388e31171, 0b0c57de5c,
f1c0e05e81, 8ced832149, e1618b8c2c, a1dbffeb7f, 21abbd9f94, 73dd1a52c7,
4076bdff9e, 86dbd5df8c, 62c58c5c30, 843b7a887f, 3deddd702f, 33ef4926e2,
90fc53c8d2, 6c79caeb19, 3b1d9f0805, e54e40c6aa, 74ad2da16d, 34539125c4,
4ff887acee, 6b343af91c
49 changed files with 2894 additions and 1003 deletions

.gitignore (vendored, new file, 3 lines)
@@ -0,0 +1,3 @@
_build/
_site/
_cache

README.md (15 lines changed)
@@ -10,25 +10,26 @@ $ git clone git@git.robur.coop:robur/blog.robur.coop
 $ cd blog.robur.coop/
 $ opam pin add -yn .
 $ opam install --deps-only blogger
-$ dune exec src/blogger.exe -- watch
+$ dune exec bin/watch.exe --
 ```
 
-A little server runs on `http://localhost:8888`.
+A little server runs on `http://localhost:8000`.
 
 The user can add an article into the `articles/` directory. The format is easy:
 a simple header which starts with `---` and finishes with `---`. Inside, you
 have a YAML description of the article where some fields are required:
 - `date`
-- `article.title`
-- `article.description`
+- `title`
+- `description`
 - `tags`
 
 You can specify an `author` (with its `name`, `email` and `link`) or not. By
 default, we use `team@robur.coop`. If everything looks good, you can generate
 the website via the `blogger.exe` tool:
 ```shell-session
-$ dune exec src/blogger.exe -- push \
-  -r git@git.robur.coop:robur/blog.robur.coop.git#gh-pages
+$ dune exec bin/push.exe -- push \
+  -r git@git.robur.coop:robur/blog.robur.coop.git#gh-pages \
+  --host https://blog.robur.coop \
+  [--name "The Robur team"] \
+  [--email team@robur.coop]
 ```

@@ -37,7 +38,7 @@ An SSH communication will start. If you already registered your private key
 with `ssh-agent` and your `.ssh/config` is configured to take this one if you
 communicate with `git@git.robur.coop`, everything will be smooth! Et voilà!
 At the end, an HTTP request will be sent to `https://blog.robur.coop` (via
-Gitea) to update the unikernel with the last version of the blog.
+Forgejo) to update the unikernel with the last version of the blog.
 
 You can also use the `update.sh` script to update the blog with the builder user
 on the server machine.

@@ -1,7 +1,7 @@
 ---
 date: 2024-02-03
-article.title: Python's `str.__repr__()`
-article.description: Reimplementing Python string escaping in OCaml
+title: Python's `str.__repr__()`
+description: Reimplementing Python string escaping in OCaml
 tags:
 - OCaml
 - Python

@@ -1,7 +1,7 @@
 ---
 date: 2024-08-21
-article.title: MirageVPN and OpenVPN
-article.description: Discoveries made implementing MirageVPN, an OpenVPN-compatible VPN library
+title: MirageVPN and OpenVPN
+description: Discoveries made implementing MirageVPN, an OpenVPN-compatible VPN library
 tags:
 - MirageVPN
 - OpenVPN

@@ -236,7 +236,7 @@ This work was funded by [the EU NGI Assure Fund through NLnet](https://nlnet.nl/
 In my opinion, this shows that funding one open source project can have a positive impact on other open source projects, too.
 
 [robur]: https://robur.coop/
-[miragevpn-server]: https://blog.robur.coop/articles/miragevpn-server.html
+[miragevpn-server]: miragevpn-server.html
 [contact]: https://reyn.ir/contact.html
 
 [^openvpn-tls]: This is not always the case. It is possible to use static shared secret keys, but it is mostly considered deprecated.

articles/2024-10-29-ptt.md (new file, 221 lines)
@@ -0,0 +1,221 @@
---
date: 2024-10-29
title: Postes, télégraphes et téléphones, next steps
description: An update of our email stack
tags:
- SMTP
- emails
- mailing-lists
author:
  name: Romain Calascibetta
  email: romain.calascibetta@gmail.com
  link: https://blog.osau.re/
breaks: false
---

As you know from [our article on Robur's
finances](https://blog.robur.coop/articles/finances.html), we've just received
[funding for our email project](https://nlnet.nl/project/PTT). This project
started when I was doing my internship in Cambridge and it's great to see that
it's been able to evolve over time and remain functional. This article will
introduce you to the latest changes to [our PTT
project](https://github.com/mirage/ptt) and how far we've got towards providing
an OCaml mailing list service.

## A Git repository or a simple block device as a database?

One issue that came up quickly in our latest experiments with our SMTP stack was
the database of users with an email address. Since we had decided to ‘break
down’ the various stages of an email submission to offer simple unikernels, we
ended up having to deploy 4 unikernels to have a working service:
- a unikernel for authentication
- a unikernel DKIM-signing the incoming email
- a unikernel as primary DNS server
- a unikernel sending the signed email to its real destination

And we're only talking here about the submission of an email; reception is
handled by another ‘pipe’.

The problem with such an architecture is that some unikernels need to have the
same data: the users. In this case, the first unikernel needs to know the user's
password in order to verify authentication. The final unikernel needs to know
the real destinations of the users.

Let's take the example of two users: foo@robur.coop and bar@robur.coop. The
first points to hannes@foo.org and the second to reynir@example.com.

If Hannes wants to send a message to bar@robur.coop under the identity of
foo@robur.coop, he will need to authenticate himself to our first unikernel.
This first unikernel must therefore:
1) check that the user `foo` exists
2) check that the hashed password used by Hannes is the same as the one in the
   database

Next, the email will be signed by our second unikernel. It will then forward the
email to the last unikernel, which will do the actual translation of the
recipients and DNS resolution. In other words:
1) it will see that one (the only) recipient is bar@robur.coop
2) it will check that bar@robur.coop exists and obtain its real address
3) it will obtain reynir@example.com and perform DNS resolution on
   `example.com` to find out the email server for this domain
4) it will finally send the email signed by foo@robur.coop to reynir@example.com!

So the first and last unikernels need to have the same information about our
users: one for the passwords, the other for the real email addresses.

But as you know, we're talking about unikernels that exist independently of each
other. What's more, they can't share files, and the possibility of them sharing
block devices remains an open question (and a complex one where parallel access
may be involved). In short, the only way to ‘synchronise’ these unikernels in
relation to common data is with a Git repository.

[Git][git-kv] has the advantage of being widely used for our unikernels
([primary-git][primary-git], [pasteur][pasteur], [unipi][unipi] and
[contruno][contruno]). The advantage is that you can track changes, modify
files and notify the unikernel to update itself (using nsupdate, a simple ping
or an HTTP request to the unikernel).

The problem is that this requires certain skills. Even if it's ‘simple’ to set
up a Git server and then deploy our unikernels, we can restructure our
architecture and simplify the deployment of an SMTP stack!

## Elit and OneFFS

We have therefore decided to merge the email exchange service and email
submission into a single unikernel, so that it is the only one that needs the
user database.

So we decided to use [OneFFS][oneffs] as the file system for our database,
which will be a plain JSON file. This is perhaps one of the advantages of
MirageOS: you can decide exactly what you need in order to implement specific
objectives.

In this case, those with experience of Postfix, LDAP or MariaDB could confirm
that configuring an email service should be ‘simpler’ than implementing a
multitude of pipes between different applications and authentication methods.

The JSON file is therefore very simple and so is the creation of a OneFFS
image:
```sh
$ cat >database.json<<EOF
> [ { "name": "din"
>   , "password": "xxxxxx"
>   , "mailboxes": [ "romain.calascibetta@gmail.com" ] } ]
> EOF
$ opam install oneffs
$ oneffs create -i database.json -o database.img
```
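
For illustration, here is a minimal OCaml sketch of what this database amounts
to once decoded. The field names follow the JSON above, but the `user` type and
the `find_user`/`resolve` helpers are hypothetical, not Elit's actual code:

```ocaml
(* A user record mirroring the JSON schema above. *)
type user =
  { name : string            (* local part, e.g. "din" *)
  ; password : string        (* hashed in practice *)
  ; mailboxes : string list  (* real destinations *) }

let find_user db name =
  List.find_opt (fun u -> String.equal u.name name) db

(* Resolving a local recipient to its real mailboxes. *)
let resolve db local =
  match find_user db local with
  | Some { mailboxes; _ } -> mailboxes
  | None -> []
```

The same lookup serves both jobs described earlier: checking a password at
submission time and translating recipients at delivery time.
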
All you have to do is register this image as a block with [albatross][albatross]
and launch our Elit unikernel with this block device.
```sh
$ albatross-client create-block --data=database.img database 1024
$ albatross-client create --net=service:br0 --block=database:database \
    elit elit.hvt \
    --arg=...
```

At this stage, and if we add our unikernel signing incoming emails, we have more
or less the same thing as what I've described in [my previous articles][smtp_1]
on [deploying][smtp_2] an [email service][smtp_3].

## Multiplex receiving & sending emails

The PTT project is a toolkit for implementing SMTP servers. It gives developers
the choice of implementing their logic as they see fit:
* sign an email
* resolve destinations according to a database
* check SPF information
* annotate the email as spam or not
* etc.

Previously, PTT was split into 2 parts:
1) management of incoming clients/emails
2) the logic to be applied to incoming emails and their delivery

The second point was becoming increasingly complex, however, and errors in
sending emails are legion (DMARC non-alignment, the email is too big for the
destination, the destination doesn't exist, etc.). All the more so since, up to
now, PTT could only report these errors via the logs...

Hannes immediately mentioned the possibility of separating the logic of the
unikernel from the delivery. This will allow us to deal with temporary failures
(greylisting) as well. So a fundamental change was made:
- improve the [sendmail][sendmail] and `sendmail-lwt` packages (as well as
  proposing `sendmail-miou`!) when sending or submitting an email
- improve PTT so that there are now 3 distinct jobs: receiving, deciding what to
  do with incoming emails, and sending emails

![SMTP](../images/smtp.jpg)

This finally allows us to describe a clearer error management policy that is
independent of what we want to do with incoming emails. At this stage, we can
look for the `Return-Path` in emails that we haven't managed to send and notify
the senders!

All this is still in the experimental stage and practical cases are needed to
observe how we should handle errors and how others do.

## Insights & Next goals

We're already starting to have a bit of fun with email and we can start sending
and receiving emails right away.

We're also already seeing hacking attempts on our unikernel:
- people trying to authenticate themselves without `STARTTLS` (or with it,
  depending on how clever the bot is)
- people trying to send emails as non-existent users in our database
- we're also seeing content that has nothing to do with SMTP

Above all, this shows that, very early on, bots try to spoof the identity linked
to your server (in our case, osau.re) in order to send spam, authenticate
themselves or simply send ‘stuff’ and observe what happens. In all the cases
mentioned, Elit (and PTT) reacts well: in other words, it simply cuts off the
connection.

We were also able to observe how services such as gmail work. In addition, for
the purposes of a mailing list, email forwarding distorts DMARC verification
(specifically, SPF verification). The case is very simple:

foo@gmail.com tries to reply to robur@osau.re. robur@osau.re is a mailing list
to several addresses (one of them is bar@gmail.com). The unikernel will receive
the email and send it to bar@gmail.com. The problem is the alignment between
the `From` field (which corresponds to foo@gmail.com) and our osau.re server.
From gmail.com's point of view, there is a misalignment between these two
pieces of information and it therefore refuses to receive the email.
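
To make the misalignment concrete, here is roughly what the relayed message
looks like to gmail.com (illustrative headers, not a real capture):

```
From: foo@gmail.com            (RFC 5322 From: domain is gmail.com)
Return-Path: <robur@osau.re>   (RFC 5321 MAIL FROM: domain is osau.re)
Received: from osau.re ...     (the sending server is osau.re, not gmail.com)
```

SPF may well pass for osau.re, but that domain does not align with the `From`
domain (gmail.com), so the DMARC SPF check fails.
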

This is where our next objectives come in:
- finish our DMARC implementation
- implement ARC so that our server can attest that, on our side, the DMARC
  check went well and that gmail.com should trust us on this

There is another way of solving the problem, perhaps a little more problematic:
modifying the incoming email, in particular the `From` field. Although this
could be done quite simply with [mrmime][mrmime], it's better to concentrate on
DMARC and ARC so that we can send our emails as they are and never alter them
(especially as altering them would invalidate previous DKIM signatures!).

## Conclusion

It's always satisfying to see your projects working ‘more or less’ correctly.
This article will surely be the start of a series on the intricacies of email
and the difficulty of deploying such a service at home.

We hope that this NLnet-funded work will enable us to replace our current email
system with unikernels. We're already past the stage where we can, more or less
(without DMARC checking), send emails to each other, which is a big step!

So follow our work on our blog and if you like what we're producing (which
involves a whole bunch of protocols and formats - much more than just SMTP), you
can make [a donation here](https://robur.coop/Donate)!

[mrmime]: https://github.com/mirage/mrmime
[smtp_1]: https://blog.osau.re/articles/smtp_1.html
[smtp_2]: https://blog.osau.re/articles/smtp_2.html
[smtp_3]: https://blog.osau.re/articles/smtp_3.html
[oneffs]: https://github.com/robur-coop/oneffs
[albatross]: https://github.com/robur-coop/albatross
[git-kv]: https://github.com/robur-coop/git-kv
[primary-git]: https://github.com/robur-coop/dns-primary-git/
[contruno]: https://github.com/dinosaure/contruno
[pasteur]: https://github.com/dinosaure/pasteur
[unipi]: https://github.com/robur-coop/unipi
[sendmail]: https://github.com/mirage/colombe

articles/2024-12-04-github-sponsor.md (new file, 45 lines)
@@ -0,0 +1,45 @@
---
date: 2024-12-04
title: Sponsor us via GitHub
description: A new way to sponsor our cooperative
tags:
- cooperative
- github
author:
  name: Romain Calascibetta
  email: romain.calascibetta@gmail.com
  link: https://blog.osau.re/
breaks: false
---

We're delighted to announce the possibility of helping our cooperative through
the GitHub Sponsors platform. The link is available here:

[https://github.com/sponsors/robur-coop](https://github.com/sponsors/robur-coop)

We would also like to reiterate the possibility of making a donation[^preferable]
to our cooperative via the IBAN of [Änderwerk][anderwerk] available here (if you
need a tax-deductible donation receipt, please use [this form][donate]):

    Account holder: Änderwerk gGmbH
    Subject: robur
    IBAN: DE46 4306 0967 1289 8604 00
    BIC: GENODEM1GLS
    Bank: GLS Gemeinschaftsbank, Christstrasse 9, 44789 Bochum, Germany

More generally, you can refer to our [article][financial] which explains our
funding since the creation of Robur. We would like to point out that, despite
this funding, part of our work remains unfunded: in particular the maintenance
of certain software as well as certain services made available to our users.

We would therefore be delighted if users of our software and services could
finance our work according to their means. GitHub in particular offers an
easy-to-use platform for funding us (even if, in all transparency, it takes a
certain amount from each transaction).

[financial]: https://blog.robur.coop/articles/finances.html
[anderwerk]: https://aenderwerk.de/
[donate]: https://aenderwerk.de/donate

[^preferable]: In fact, this method is preferable to us as this means it will go directly to us instead of through GitHub and Stripe, who take a small cut of the donation in fees.

articles/2025-01-07-carton-and-cachet.md (new file, 604 lines)
@@ -0,0 +1,604 @@
---
date: 2025-01-07
title: Git, Carton and emails
description: A way to store and archive your emails
tags:
- emails
- storage
- Git
author:
  name: Romain Calascibetta
  email: romain.calascibetta@gmail.com
  link: https://blog.osau.re/
breaks: false
---

We are pleased to announce the release of Carton 1.0.0 and Cachet. You can have
an overview of these libraries in our announcement on the OCaml forum. This
article goes into more detail about the PACK format and its use for archiving
your emails.

## Back to Git and patches

In our Carton announcement, we talk about 2 levels of compression for Git
objects: zlib compression, and compression between objects using a patch.

For example, if we have 2 blobs (2 versions of a file), one of which contains
'A' and the other contains 'A+B', the second blob will probably be saved in the
form of a patch requiring the contents of the first blob and adding '+B'. At a
higher level, and given how we use Git, we understand why this second level of
compression is very interesting: we generally just add a few lines (like
introducing a new function) or delete some (removing code) in the files of
our project.

However, there is a bias in what Git does and what we perceive. When it comes
to patching in Git, we usually think of the [patience diff][patience-diff] or
the [Eugene Myers diff][eugene-myers-diff]. While these offer the advantage of
readability in terms of knowing what has been added or deleted between two
files, they are not necessarily optimal for producing a _small_ patch.

In reality, what interests us in the case of the storage and transmission of
these patches over the network is not a form of readability in these patches
but optimality in what can be considered common between two files and what is
not. It is at this stage that [duff][duff] is introduced.

This is a small library which can generate a patch between two files according
to the series of bytes common to both files. We're talking about 'series of
bytes' here because these elements common to our two files are not necessarily
human-readable. To find these series of common bytes, we use [Rabin's
fingerprint][rabin] algorithm: [a rolling hash][rolling-hash] used since time
immemorial.
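
As a rough illustration of the rolling idea (a toy polynomial hash, not duff's
actual fingerprint), sliding a fixed window one byte to the right only needs
the outgoing and incoming bytes, not a full rehash of the window:

```ocaml
let base = 257
let modulus = 1_000_000_007

(* Hash of the [len]-byte window starting at [off]. *)
let hash_window s off len =
  let h = ref 0 in
  for i = off to off + len - 1 do
    h := (!h * base + Char.code s.[i]) mod modulus
  done;
  !h

(* base^(len-1) mod modulus, precomputed once per window size. *)
let pow_mod b e m =
  let r = ref 1 in
  for _ = 1 to e do r := !r * b mod m done;
  !r

(* Slide the window one byte: drop [outgoing], append [incoming]. *)
let roll ~top h ~outgoing ~incoming =
  let h = ((h - Char.code outgoing * top) mod modulus + modulus) mod modulus in
  (h * base + Char.code incoming) mod modulus
```

Roughly speaking, duff indexes one file by such fingerprints, then scans the
other for windows whose fingerprints match, emitting 'copy' instructions for
shared series of bytes and 'insert' instructions for the rest.
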

### Patches and emails

So, as far as emails are concerned, it's fairly obvious that there are many
"words" common to all your emails. The simple word `From:` should exist in all
your emails.

From this simple idea we can understand the impact: the headers of your emails
are more or less similar and have more or less the same content. The idea of
`duff`, applied to your emails, is to consider these other emails as a
"slightly" different version of your first email:
1) we can store a single raw email
2) and we build patches of all your other emails from this first one

A fairly concrete example of this compression through patches is 2
notification emails from GitHub: these are quite similar, particularly in the
header. Even the content is just as similar: the HTML remains the same, only
the commentary differs.

```shell
$ carton diff github.01.eml github.02.eml -o patch.diff
$ du -sb github.01.eml github.02.eml patch.diff
9239    github.01.eml
9288    github.02.eml
5136    patch.diff
```

This example shows that our patch for rebuilding `github.02.eml` from
`github.01.eml` is almost 2 times smaller in size. In this case, with the PACK
format, this patch will also be compressed with zlib (and we can reach ~2900
bytes, so 3 times smaller).

#### Compress and compress!

To put this into perspective, a compression algorithm like zlib can also have
such a ratio (3 times smaller). But the latter also needs to serialise the
Huffman tree required for compression (in the general case). What can be
observed is that concatenating separately compressed emails makes it difficult
to maintain such a ratio; concatenating all the emails first and compressing
the whole gives us a better ratio!

That's what the PACK file is all about: the aim is to be able to concatenate
these compressed emails and keep an interesting overall compression ratio. This
is the reason for the patch: to reduce the objects even further so that the
impact of the zlib _header_ on all our objects is minimal and, above all, so
that we can access objects **without** having to decompress the previous ones
(as we would have to do for a `*.tar.gz` archive, for example).

The initial intuition about the emails was right: they do indeed share quite a
few elements, and in the end we were able to save ~4000 bytes in our GitHub
notification example.

## Isomorphism, DKIM and ARC

One property that we wanted to pay close attention to throughout our
experimentation was "isomorphism". This property is very simple: imagine a
function that takes an email as input and transforms it into another value
using a method (such as compression). Isomorphism ensures that we can 'undo'
this method and obtain exactly the same result again:

```
decode(encode(x)) == x
```

This property is very important for emails because signatures exist in your
email and these signatures result from the content of your email. If the email
changes, these signatures change too.

For instance, the DKIM signature allows you to sign an email and check its
integrity on receipt. ARC (which will be our next objective) also signs your
emails, but goes one step further: all the relays that receive your email and
send it back to the real destination must add a new ARC signature, just like
adding a new block to the Bitcoin blockchain.

So you need to make sure that the way you serialise your email (in a PACK file)
doesn't alter the content, in order to keep these signatures valid! It just so
happens that here too we have a lot of experience with Git. Git has the same
constraint with [Merkle trees][merkle-tree] and, as far as we're concerned,
we've developed a library that allows you to generate an encoder and a decoder
from a description and that respects the isomorphism property _by
construction_: the [encore][encore] library.

We could then store our emails as they are in the PACK file. However, the
advantage of `duff` really comes into play when several objects are similar. In
the case of Git, tree objects are similar to each other, but they are not
similar to commits, for example. For emails, there is also such a distinction:
email headers are similar to each other, but they are not similar to email
contents.

You can therefore try to "split" emails into 2 parts: the header on one side and
the content on the other. We would then have a third value which would tell us
how to reconstruct our complete email (i.e. identify where the header is and
identify where the content is).

However, after years of reading email RFCs, I can tell you that things are much
more complex. Above all, this experience has enabled me to synthesise a
skeleton that all emails share:

```ocaml
(* multipart-body :=
     [preamble CRLF]
     --boundary transport-padding CRLF
     part
     ( CRLF --boundary transport-padding CRLF part )*
     CRLF
     --boundary-- transport-padding
     [CRLF epilogue]

   part := headers ( CRLF body )? *)

(* Assumed here: transport-padding is raw whitespace after a boundary. *)
type transport_padding = string

type 'octet body =
  | Multipart of 'octet multipart
  | Single of 'octet option
  | Message of 'octet t

and 'octet part = { headers : 'octet; body : 'octet body }

and 'octet multipart =
  { preamble : string
  ; epilogue : string * transport_padding
  ; boundary : string
  ; parts : (transport_padding * 'octet part) list }

and 'octet t = 'octet part
```

As you can see, the distinction is not only between the header and the content
but also between the parts of an email as soon as it has an attachment. You can
also have an email inside an email (and I'm always surprised to see that this
particular case is _frequent_). Finally, there's the annoying _preamble_ and
_epilogue_ of an email with several parts, which is often empty but necessary:
you always have to ensure isomorphism — even for "useless" bytes, they count for
signatures.

We'll therefore need to serialise this structure: all we have to do is
transform a `string t` into a `SHA1.t t` so that our structure no longer
contains the actual content of our emails but a unique identifier referring to
this content, which will be available in our PACK file.

```ocaml
(* The interface of our description (its implementation is elided): *)
module Format : sig
  val t : SHA1.t Encore.t
end

let decode str =
  let parser = Encore.to_angstrom Format.t in
  Angstrom.parse_string ~consume:All parser str

let encode t =
  let emitter = Encore.to_lavoisier Format.t in
  Encore.Lavoisier.emit_string ~chunk:0x7ff t emitter
```

However, we need to check that the isomorphism is respected. You should be
aware that work on [Mr. MIME][mrmime] has already been done on this subject
with the [afl][afl] fuzzer: checking our assertion `x == decode(encode(x))`.
This ability to check isomorphism using afl has enabled us to use the latter to
generate valid random emails. This allows me to reintroduce you to the
[hamlet][hamlet] project, perhaps the biggest database of valid — but
incomprehensible — emails. So we've checked that our encoder/decoder for
"splitting" our emails respects isomorphism on this corpus of a million emails.
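
Concretely, the property we fuzz is just a round trip. A minimal sketch of the
check, assuming `decode`/`encode` over the inline (`string t`) description of
the skeleton rather than the hashed one:

```ocaml
(* afl gives us a candidate email [str]; valid inputs must round-trip. *)
let check str =
  match decode str with
  | Error _ -> true          (* rejected input: nothing to check *)
  | Ok t -> String.equal (encode t) str
```
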

## Carton, POP3 & mbox, some metrics

We can therefore split an email into several parts and calculate an optimal
patch between two similar pieces of content. So now you can start packaging!
This is where I'm going to reintroduce a tool that hasn't been released yet,
but which allows me to go even further with emails: [blaze][blaze].

This little tool is my _Swiss army knife_ for emails! And it's in this tool
that we're going to have fun deriving Carton so that it can manipulate emails
rather than Git objects. So we've implemented the very basic [POP3][pop3]
protocol (and thanks to [ocaml-tls][tls] for offering a free encrypted
connection) as well as the [mbox][mbox] format.

Both are **not** recommended. The first is an old protocol and interacting with
Gmail, for example, is very slow. The second is an old, non-standardised format
for storing your emails — and unfortunately this may be the format used by your
email client. After resolving a few bugs, such as the unspecified behaviour of
pop.gmail.com and the mix of CRLF and LF in the mbox format, you'll end up with
lots of emails that you'll have fun packaging!

```shell
$ mkdir mailbox
$ blaze.fetch pop3://pop.gmail.com -p $(cat password.txt) \
    -u recent:romain.calascibetta@gmail.com -f 'mailbox/%s.eml' > mails.lst
$ blaze.pack make -o mailbox.pack mails.lst
$ tar czf mailbox.tar.gz mailbox
$ du -sh mailbox mailbox.pack mailbox.tar.gz
97M     mailbox
28M     mailbox.pack
23M     mailbox.tar.gz
```

In this example, we download the latest emails from the last 30 days via POP3
and store them in the `mailbox/` folder. This folder weighs 97M and if we
compress it with gzip, we end up with 23M. The problem is that we need to
decompress the `mailbox.tar.gz` document to extract the emails.

This is where the PACK file comes in handy: it only weighs 28M (so we're very
close to what `tar` and `gzip` can do) but we can rebuild our emails without
unpacking everything:

```shell
$ blaze.pack index mailbox.pack
$ blaze.pack list mailbox.pack | head -n1
0000000c 4e9795e268313245f493d9cef1b5ccf30cc92c33
$ blaze.pack get mailbox.idx 4e9795e268313245f493d9cef1b5ccf30cc92c33
Delivered-To: romain.calascibetta@gmail.com
...
```

Like Git, we now associate a hash with our emails and can retrieve them using
this hash. Like Git, we also calculate the `*.idx` file to associate the hash
with the position of the email in our PACK file. Just like Git (with `git show`
or `git cat-file`), we can now access our emails very quickly. So we now have a
(read-only) database system for our emails: we can now archive our emails!

Let's have a closer look at this PACK file. We've developed a tool more or less
similar to `git verify-pack` which lists all the objects in our PACK file and,
above all, gives us information such as the number of bytes needed to store
these objects:

```shell
$ blaze.pack verify mailbox.pack
4e9795e268313245f493d9cef1b5ccf30cc92c33 a 12 186 6257b7d4
...
517ccbc063d27dbd87122380c9cdaaadc9c4a1d8 b 666027 223 10 e8e534a6 cedfaf6dc22f3875ae9d4046ea2a51b9d5c6597a
```

It shows the hash of our object, its type (`a` for the structure of our email,
`b` for the content), its position in the PACK file, the number of bytes used
to store the object (!) and finally the depth of the patch, the checksum, and
the source of the patch needed to rebuild the object.

Here, our first object is not patched, but the next object is. Note that it
only needs 223 bytes in the PACK file. But what is the real size of this
object?

```shell
$ carton get mailbox.idx 517ccbc063d27dbd87122380c9cdaaadc9c4a1d8 \
    --raw --without-metadata | wc -c
2014
```

So we've gone from 2014 bytes to 223 bytes! That's almost a compression ratio
of 10! In this case, the object is the content of an email. Guess which one? A
GitHub notification! If we go back to our very first example, we saw that we
could compress with a ratio close to 2, and that we could go further with zlib
by compressing the patch too. This example allows us to introduce one last
feature of PACK files: the depth.

```shell
$ carton get mailbox.idx 517ccbc063d27dbd87122380c9cdaaadc9c4a1d8
kind: b
length: 2014 byte(s)
depth: 10
cache misses: 586
cache hits: 0
tree: 000026ab
       Δ 00007f78
       ...
       Δ 0009ef74
       Δ 000a29ab
       ...
```

In our example, our object requires a source which, in turn, is a patch
requiring another source, and so on (you can see this chain in the `tree`).
The length of this patch chain corresponds to the depth of our object. There is
therefore a succession of patches between objects. What Carton tries to do is
to find the best patch from a window of possibilities and keep the best
candidates for reuse. If we unroll this chain of patches, we find a "base"
object (at `0x000026ab`) that is just compressed with zlib, and this base is
also the content of another GitHub notification email. This shows that Carton
is well on its way to finding the best candidate for the patch, which should be
similar content: here, another GitHub notification.

The idea is to sacrifice a little computing time (in the reconstruction of
objects via their patches) to gain in compression ratio. It's fair to say that
a very long patch chain can degrade performance. However, there is a limit in
Git and Carton: a chain can't be longer than 50. Another point is the search
for the candidate source for the patch, which is often physically close to the
patch (within a few bytes): reading the PACK file by page (thanks to
[Cachet][cachet]) sometimes gives access to 3 or 4 objects, which have a
certain chance of being patched together.
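
A toy model of what reconstruction costs (not Carton's actual types): an object
is either an inflated base or a patch over a source, and the depth printed
above is just the length of that chain.

```ocaml
(* Toy model of objects in a PACK file (not Carton's API). *)
type obj =
  | Base of string (* inflated base object *)
  | Patch of { source : obj; apply : string -> string }

(* Rebuilding an object walks the whole chain down to its base... *)
let rec reconstruct = function
  | Base raw -> raw
  | Patch { source; apply } -> apply (reconstruct source)

(* ...so the depth reported by [carton get] bounds the work per access. *)
let rec depth = function
  | Base _ -> 0
  | Patch { source; _ } -> 1 + depth source
```
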

Let's take the example of Carton and a Git object:

```shell
$ carton get pack-*.idx eaafd737886011ebc28e6208e03767860c22e77d
...
cache misses: 62
cache hits: 758
tree: 160720bb
       Δ 160ae4bc
       Δ 160ae506
       Δ 160ae575
       Δ 160ae5be
       Δ 160ae5fc
       Δ 160ae62f
       Δ 160ae667
       Δ 160ae6a5
       Δ 160ae6db
       Δ 160ae72a
       Δ 160ae766
       Δ 160ae799
       Δ 160ae81e
       Δ 160ae858
       Δ 16289943
```

We can see here that we had to load 62 pages, but that we also reused the pages
we'd already read 758 times. We can also see that the offsets of the patches
(visible in `tree`) are always close (the objects often follow each other).
### Mbox and real emails

In a way, the concrete cases we have used so far are my own emails. There may
be a fairly simple bias, which is that all these emails have the same
destination: romain.calascibetta@gmail.com. This is a common point which can
also have a significant impact on compression with `duff`. We will therefore
try another corpus, the archives of certain mailing lists relating to OCaml:
[lists.ocaml.org-archive](https://github.com/ocaml/lists.ocaml.org-archive)

```shell
$ blaze.mbox lists.ocaml.org-archive/pipermail/opam-devel.mbox/opam-devel.mbox \
    -o opam-devel.pack
$ gzip -c lists.ocaml.org-archive/pipermail/opam-devel.mbox/opam-devel.mbox \
    > opam-devel.mbox.gzip
$ du -sh opam-devel.pack opam-devel.mbox.gzip \
    lists.ocaml.org-archive/pipermail/opam-devel.mbox/opam-devel.mbox
3.9M    opam-devel.pack
2.0M    opam-devel.mbox.gzip
10M     lists.ocaml.org-archive/pipermail/opam-devel.mbox/opam-devel.mbox
```

The compression ratio is a bit worse than before, but we're still on to
something interesting. Here again we can take an object from our PACK file and
see how the compression between objects reacts:

```shell
$ blaze.pack index opam-devel.pack
...
09bbd28303c8aafafd996b56f9c071a3add7bd92 b 362504 271 10 60793428 412b1fbeb6ee4a05fe8587033c1a1d8ca2ef5b35
$ carton get opam-devel.idx 09bbd28303c8aafafd996b56f9c071a3add7bd92 \
    --without-metadata --raw | wc -c
2098
```

Once again, we see a ratio close to 8 (2098 bytes stored in 271)! Here the
object corresponds to the header of an email. This is compressed with other
email headers. This is the situation where the fields are common to all your
emails (`From`, `Subject`, etc.). Carton successfully patches headers together
and email content together.

## Next things

All the work done on email archiving is aimed at producing a unikernel (`void`)
that could archive all incoming emails. Finally, this unikernel could send the
archive back (via an email!) to those who want it. This is one of our goals for
implementing a mailing list in OCaml with unikernels.

Another objective is to create a database system for emails in order to offer
two features to the user that we consider important:
- quick and easy access to emails
- saving disk space through compression

With this system, we can extend the method of indexing emails with other
information such as the keywords found in the emails. This will enable us,
among other things, to create an email search engine!

## Conclusion

This milestone in our PTT project was quite long, as we were very interested in
metrics such as compression ratio and software execution speed.

The experience we'd gained with emails (in particular with Mr. MIME) enabled us
to move a little faster, especially in terms of serialising our emails. Our
experience with ocaml-git also enabled us to identify the benefits of the PACK
file for emails.

But the development of [Miou][miou] was particularly helpful in satisfying us
in terms of program execution time, thanks to the ability to parallelise
certain calculations quite easily.

The format is still a little rough and not quite ready for the development of a
keyword-based email indexing system, but it provides a good basis for the rest
of our project.

So, if you like what we're doing and want to help, you can make a donation via
[GitHub][donate-github] or using our [IBAN][donate-iban].

<hr />

## Post

This little note is an extension of the feedback we got on the Carton release.
[nojb][nojb], in this case, pointed to the [public-inbox][public-inbox]
software as the archiver of the various Linux kernel mailing lists. The latter
is based on the same intuition we had, namely to use the PACK format to archive
emails.

The question then arises: are we starting to reinvent the wheel?

In truth, the devil is in the detail. As it happens, you can download LKML
mailing list archives with Git in this way:
```shell
$ git clone --mirror http://lore.kernel.org/lkml/15 lkml/git/15.git
$ cd lkml/git/15.git
$ du -sh objects/pack/pack-*.pack
981M    objects/pack/pack-*.pack
$ cd objects/pack/
$ mkdir loose
$ carton explode 'loose/%s/%s' pack-*.pack
$ du -sh loose/c/
2.7G    loose/c
```

`public-inbox` is based not only on the PACK format for email archiving, but
also on Git concepts. In this case, such a Git repository actually only
contains an `m` file corresponding to the last email received on the mailing
list. The other emails are "old versions of this email". In other words,
`public-inbox` considers a certain form of _versioning_ between emails: each
commit is a new email and "replaces" the previous one.

### Heuristics to patch

`public-inbox` then relies on the heuristics implemented by Git to find the
best candidate for patching emails. These heuristics are explained
[here][git-heuristics]. The idea is to consider a base object (which will be
the source of several patches) as the **last** version of your file (in the
case of `public-inbox`, the last email received) and to build patches of
previous versions against this base object. The heuristic comes from the
spontaneous idea that software files mostly grow over time. The latest version
is therefore most likely to contain all the similarities with previous
versions.

Once again, when it comes to code, we tend to add code. So we should be able to
use all the occurrences available in the latest version of a file to produce
patches for earlier versions.

### Comparison

Let's have some fun comparing `public-inbox` and the `blaze` tool:
```markdown
+-----------+-------+--------------+------+
|           | blaze | public-inbox | raw  |
+-----------+-------+--------------+------+
| caml-list | 160M  | 154M         | 425M |
+-----------+-------+--------------+------+
| lkml.15   | 1.1G  | 981M         | 2.7G |
+-----------+-------+--------------+------+
| kvm.0     | 1.2G  | 1.1G         | 3.1G |
+-----------+-------+--------------+------+
```

The first thing you'll notice is that `blaze` produces PACK files that are a
little larger than those produced by Git. The problem is that `blaze` doesn't
store exactly the same thing! The emails it stores are emails with lines ending
in `\r\n`, whereas `public-inbox` stores emails with `\n`. It may just be one
small character, but multiplied by the number of lines in an email and the
number of emails in the archive, it adds up.

It's also true that [decompress][decompress], the OCaml implementation of zlib,
is not as efficient as its C competitor in terms of ratio. So this is a
disadvantage we have which is not linked to the way we generate the PACK file
(we could replace `decompress` with zlib!).

However, there's another interesting metric between what we produce and what
`public-inbox` does. It's important to understand that we maintain "some
compatibility" with the Git PACK file. The objects aren't the same and don't
have the same meaning, but it's still a PACK file. As such, we can use `git
verify-pack` on our archive as on the `public-inbox` archive:

```markdown
+-----------+-----------------+------------------------+
|           | PACK from blaze | PACK from public-inbox |
+-----------+-----------------+------------------------+
| caml-list | ~2.5s           | ~4.1s                  |
+-----------+-----------------+------------------------+
| lkml.15   | ~14.7s          | ~16.3s                 |
+-----------+-----------------+------------------------+
| kvm.0     | ~18s            | ~21s                   |
+-----------+-----------------+------------------------+
```

The analysis of our PACK file is faster than that of `public-inbox`. This is
where we need to understand what we're trying to store and how we're doing it.
When it comes to finding a candidate for a patch, `blaze` relies solely on the
similarities between the two objects/emails, whereas `public-inbox`, via Git's
heuristics, will still prioritise a patch between emails that follow each other
in time via "versioning".

The implication is that the last 2 emails may have no similarity at all, but
Git/`public-inbox` will still try to patch them together, as one is the
_previous version_ (in terms of time) of the other.

Another aspect is that Git sometimes breaks _the patch chain_ so that, when it
comes to extracting an object, if it's a patch, its source shouldn't be very
far away in the PACK file. Git prefers to patch an object with a source that
may be less good but close to it, rather than keeping the best candidate as the
source for all patches. Here too, `blaze` reacts differently: we try to keep
and reuse the best candidate as much as possible.

A final difference, which may also be important, is the way in which emails are
stored. We store emails "split" into parts, whereas `public-inbox` only stores
them as they are. One implication of this can be the extraction of an
attachment. As far as `blaze` is concerned, we would just have to extract the
_skeleton_ of the email, search the various headers for the desired attachment
and extract the attachment as is, since it is a full-fledged object in our PACK
file.

As for `public-inbox`, we'd have to extract the email, **parse** it, then
search for the part containing the attachment according to the header and
finally extract the attachment.

### Conclusion

If we had to draw a "meta" conclusion from the differences between `blaze` and
`public-inbox`, it's that our tool focuses on the content of your emails,
whereas `public-inbox` focuses on the historicity of your emails. As such, and
in the hope of making an OCaml-based email client, we believe our approach
remains interesting.

But these experiments have shown us 2 important things:
- we're capable of handling millions of emails, parsing and storing them. It's
  pretty impressive to see our tool handle almost a million emails (`kvm.0`)
  without any bugs!
- our initial intuition remains valid. Even if the path seems subtly different
  from what `public-inbox` can do, our approach is clearly the right one and
  keeps us going.

[patience-diff]: https://opensource.janestreet.com/patdiff/
[eugene-myers-diff]: https://www.nathaniel.ai/myers-diff/
[duff]: https://github.com/mirage/duff
[rabin]: https://en.wikipedia.org/wiki/Rabin_fingerprint
[rolling-hash]: https://en.wikipedia.org/wiki/Rolling_hash
[merkle-tree]: https://en.wikipedia.org/wiki/Merkle_tree
[encore]: https://github.com/mirage/encore
[mrmime]: https://github.com/mirage/mrmime
[afl]: https://afl-1.readthedocs.io/en/latest/fuzzing.html
[hamlet]: https://github.com/mirage/hamlet
[blaze]: https://github.com/dinosaure/blaze
[pop3]: https://en.wikipedia.org/wiki/Post_Office_Protocol
[tls]: https://github.com/mirleft/ocaml-tls
[mbox]: https://en.wikipedia.org/wiki/Mbox
[donate-github]: https://github.com/sponsors/robur-coop
[donate-iban]: https://robur.coop/Donate
[miou]: https://github.com/robur-coop/miou
[nojb]: https://discuss.ocaml.org/t/ann-release-of-carton-1-0-0-and-cachet/15953/2?u=dinosaure
[public-inbox]: https://public-inbox.org/README.html
[decompress]: https://github.com/mirage/decompress
[git-heuristics]: https://github.com/git/git/blob/master/Documentation/technical/pack-heuristics.txt

articles/arguments.md (new file, 538 lines)
@@ -0,0 +1,538 @@
---
date: 2024-10-22
title: Runtime arguments in MirageOS
description:
  The history of runtime arguments to a MirageOS unikernel
tags:
- OCaml
- MirageOS
author:
  name: Hannes Mehnert
  email: hannes@mehnert.org
  link: https://hannes.robur.coop
---

TL;DR: Passing runtime arguments around is tricky, and prone to change every other month.

## Motivation

Sometimes, as a unikernel developer and also as an operator, it's nice to have
some runtime arguments passed to a unikernel. Now, if you're into OCaml,
command-line parsing - together with error messages, man page generation, ... -
can be done by the amazing [cmdliner](https://erratique.ch/software/cmdliner)
package from Daniel Bünzli.

MirageOS uses cmdliner for command-line argument passing. This also enabled
us from the early days to have nice man pages for unikernels (see
`my-unikernel-binary --help`). There are two kinds
of arguments: those at configuration time (`mirage configure`), such as the
target to compile for, and those at runtime - when the unikernel is executed.

In Mirage 4.8.1 and 4.8.0 (released October 2024) there have been some changes
to command-line arguments, which were motivated by 4.5.0 (released April 2024)
and user feedback.

First of all, our current way to pass a custom runtime argument to a unikernel
(`unikernel.ml`):
```OCaml
open Lwt.Infix
open Cmdliner

let hello =
  let doc = Arg.info ~doc:"How to say hello." [ "hello" ] in
  let term = Arg.(value & opt string "Hello World!" doc) in
  Mirage_runtime.register_arg term

module Hello (Time : Mirage_time.S) = struct
  let start _time =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        Logs.info (fun f -> f "%s" (hello ()));
        Time.sleep_ns (Duration.of_sec 1) >>= fun () -> loop (n - 1)
    in
    loop 4
end
```

We define the [Cmdliner.Term.t](https://erratique.ch/software/cmdliner/doc/Cmdliner/Term/index.html#type-t)
in line 6 (`let term = ..`) - which provides documentation ("How to say hello."), the option to
use (`["hello"]` - which is then translated to `--hello=`), that it is optional,
of type `string` (cmdliner allows you to convert the incoming strings to more
complex (or more narrow) data types, with decent error handling).

The defined argument is directly passed to [`Mirage_runtime.register_arg`](https://ocaml.org/p/mirage-runtime/4.8.1/doc/Mirage_runtime/index.html#val-register_arg)
(in line 7), so our binding `hello` is of type `unit -> string`.
In line 14, the value of the runtime argument is used (`hello ()`) for printing
a log message.
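
As an aside, here is a hedged sketch of such a narrower type: a port number
argument built with cmdliner's `Arg.conv`. The `port_conv` and `port` names are
ours, not from this article:

```OCaml
open Cmdliner

(* A converter that only accepts valid TCP port numbers. *)
let port_conv =
  let parse s =
    match int_of_string_opt s with
    | Some p when p > 0 && p < 65536 -> Ok p
    | _ -> Error (`Msg (Printf.sprintf "invalid port: %S" s))
  in
  Arg.conv (parse, Format.pp_print_int)

let port =
  let doc = Arg.info ~doc:"Port to listen on." [ "port" ] in
  Mirage_runtime.register_arg Arg.(value & opt port_conv 8080 doc)
```
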

The nice property is that it is all local in `unikernel.ml`; there are no other
parts involved. It is just a bunch of API calls. The downside is that `hello ()`
should only be evaluated after the function `start` was called - since
`Mirage_runtime` needs to parse and fill in the command-line arguments. If you
call `hello ()` earlier, you'll get an exception "Called too early. Please delay
this call to after the start function of the unikernel.". Also, since
Mirage_runtime needs to collect and evaluate the command-line arguments,
`Mirage_runtime.register_arg` may only be called at top-level, otherwise you'll
get another exception "The function register_arg was called to late. Please call
register_arg before the start function is executed (e.g. in a top-level binding).".
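
In short (reusing `term` from the snippet above; a sketch of the lifecycle, not
additional API):

```OCaml
(* Registering must happen at top-level, before [start] runs... *)
let hello = Mirage_runtime.register_arg term

(* ...while evaluation must wait until [start]: *)
(* let eager = hello ()   <- would raise "Called too early." *)
let start _time =
  Logs.info (fun f -> f "%s" (hello ())); (* fine: we're inside start *)
  Lwt.return_unit
```
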
|
||||
Another advantage is, having it all in unikernel.ml means adding and removing
|
||||
arguments doesn't need another execution of `mirage configure`. Also, any
|
||||
type can be used that the unikernel depends on - the config.ml is compiled only
|
||||
with a small set of dependencies (mirage itself) - and we don't want to impose a
|
||||
large dependency cone for mirage just because someone may like to use
|
||||
X509.Key_type.t as argument type.
|
||||
|
||||
Earlier, before mirage 4.5.0, we had runtime and configure arguments mixed
|
||||
together. And code was generated when `mirage configure` was executed to
|
||||
deal with these arguments. The downsides included: we needed serialization for
|
||||
all command-line arguments (at configure time you could fill the argument, which
|
||||
was then serialized, and deserialized at runtime and used unless the argument
|
||||
was provided explicitly), they had to appear in `config.ml` (which also means
|
||||
changing any would need an execution of `mirage configure`), since they generated code
|
||||
potential errors were in code that the developer didn't write (though we had
|
||||
some `__POS__` arguments to provide error locations in the developer code).
|
||||
|
||||
Related recent changes are:
|
||||
- in mirage 4.8.1, the runtime arguments to configure the OCaml runtime system
|
||||
(such as GC settings, randomization of hashtables, recording of backtraces)
|
||||
are now provided using the [cmdliner-stdlib](https://ocaml.org/p/cmdliner-stdlib)
|
||||
package.
|
||||
- in mirage 4.8.0, for git, dns-client, and happy-eyeballs devices the optional
|
||||
arguments are generated by default - so they are always available and don't
|
||||
need to be manually done by the unikernel developer.
|
||||
|
||||
Let's dive a bit deeper into the history.
|
||||
|
||||
## History
|
||||
|
||||
In MirageOS, since the early stages (I'll go back to 2.7.0 (February 2016) where
|
||||
functoria was introduced) used an embedded fork of `cmdliner` to handle command
|
||||
line arguments.
|
||||
|
||||
[![Animated changes to the hello world unikernel](https://asciinema.org/a/ruHoadi2oZGOzgzMKk5ZYoFgf.svg)](https://asciinema.org/a/ruHoadi2oZGOzgzMKk5ZYoFgf)
|
||||
|
||||
### February 2016 (Mirage 2.7.0)

When looking into the MirageOS 2.x series, here's the code for our hello world
unikernel:

`config.ml`
```OCaml
open Mirage

let hello =
  let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in
  Key.(create "hello" Arg.(opt string "Hello World!" doc))

let main =
  foreign
    ~keys:[Key.abstract hello]
    "Unikernel.Hello" (console @-> job)

let () = register "hello-key" [main $ default_console]
```

and `unikernel.ml`
```OCaml
open Lwt.Infix

module Hello (C: V1_LWT.CONSOLE) = struct
  let start c =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        C.log c (Key_gen.hello ());
        OS.Time.sleep 1.0 >>= fun () ->
        loop (n-1)
    in
    loop 4
end
```

As you can see, the cmdliner term was provided in `config.ml`, and in
`unikernel.ml` the expression `Key_gen.hello ()` was used - `Key_gen` was
a module generated by the `mirage configure` invocation.

You can also see that the term was wrapped in `Key.create "hello"` - where
this string was used as the identifier for the code generation.

As mentioned above, a change to the argument needed to be done in `config.ml`,
followed by a `mirage configure`, to take effect.

### July 2016 (Mirage 2.9.1)

The `OS.Time` usage was functorized into a `Time` functor argument:

`config.ml`
```OCaml
open Mirage

let hello =
  let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in
  Key.(create "hello" Arg.(opt string "Hello World!" doc))

let main =
  foreign
    ~keys:[Key.abstract hello]
    "Unikernel.Hello" (console @-> time @-> job)

let () = register "hello-key" [main $ default_console $ default_time]
```

and `unikernel.ml`
```OCaml
open Lwt.Infix

module Hello (C: V1_LWT.CONSOLE) (Time : V1_LWT.TIME) = struct
  let start c _time =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        C.log c (Key_gen.hello ());
        Time.sleep 1.0 >>= fun () ->
        loop (n-1)
    in
    loop 4
end
```

### February 2017 (Mirage pre3)

The `Time` signature changed: the `sleep_ns` function now sleeps in
nanoseconds. This avoids floating point numbers at the core of MirageOS. The
helper package `duration` is used to avoid manual conversions.

Also, the console signature changed - `log` is now inside the Lwt monad.

`config.ml`
```OCaml
open Mirage

let hello =
  let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in
  Key.(create "hello" Arg.(opt string "Hello World!" doc))

let main =
  foreign
    ~keys:[Key.abstract hello]
    ~packages:[package "duration"]
    "Unikernel.Hello" (console @-> time @-> job)

let () = register "hello-key" [main $ default_console $ default_time]
```

and `unikernel.ml`
```OCaml
open Lwt.Infix

module Hello (C: V1_LWT.CONSOLE) (Time : V1_LWT.TIME) = struct
  let start c _time =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        C.log c (Key_gen.hello ()) >>= fun () ->
        Time.sleep_ns (Duration.of_sec 1) >>= fun () ->
        loop (n-1)
    in
    loop 4
end
```

### February 2017 (Mirage 3)

Another big change is that the console is not used anymore, but
[logs](https://erratique.ch/software/logs).

`config.ml`
```OCaml
open Mirage

let hello =
  let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in
  Key.(create "hello" Arg.(opt string "Hello World!" doc))

let main =
  foreign
    ~keys:[Key.abstract hello]
    ~packages:[package "duration"]
    "Unikernel.Hello" (time @-> job)

let () = register "hello-key" [main $ default_time]
```

and `unikernel.ml`
```OCaml
open Lwt.Infix

module Hello (Time : Mirage_time_lwt.S) = struct
  let start _time =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        Logs.info (fun f -> f "%s" (Key_gen.hello ()));
        Time.sleep_ns (Duration.of_sec 1) >>= fun () ->
        loop (n-1)
    in
    loop 4
end
```

### January 2020 (Mirage 3.7.0)

The `_lwt` is dropped from the interfaces. We used to have Mirage_time and
Mirage_time_lwt - where the latter was instantiating the former with concrete
types: `type 'a io = 'a Lwt.t` and `type buffer = Cstruct.t`. In a cleanup
session we dropped the `_lwt` interfaces and opam packages. The reasoning was
that when we get around to moving to another IO system, we'll move everything
at once anyway. No need to have `lwt` and something else (`async`, or nowadays
`miou` or `eio`) in a single unikernel.

`config.ml`
```OCaml
open Mirage

let hello =
  let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in
  Key.(create "hello" Arg.(opt string "Hello World!" doc))

let main =
  foreign
    ~keys:[Key.abstract hello]
    ~packages:[package "duration"]
    "Unikernel.Hello" (time @-> job)

let () = register "hello-key" [main $ default_time]
```

and `unikernel.ml`
```OCaml
open Lwt.Infix

module Hello (Time : Mirage_time.S) = struct
  let start _time =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        Logs.info (fun f -> f "%s" (Key_gen.hello ()));
        Time.sleep_ns (Duration.of_sec 1) >>= fun () ->
        loop (n-1)
    in
    loop 4
end
```

### October 2021 (Mirage 3.10)

Some renamings to fix warnings. Only `config.ml` changed.

`config.ml`
```OCaml
open Mirage

let hello =
  let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in
  Key.(create "hello" Arg.(opt string "Hello World!" doc))

let main =
  main
    ~keys:[key hello]
    ~packages:[package "duration"]
    "Unikernel.Hello" (time @-> job)

let () = register "hello-key" [main $ default_time]
```

and `unikernel.ml`
```OCaml
open Lwt.Infix

module Hello (Time : Mirage_time.S) = struct
  let start _time =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        Logs.info (fun f -> f "%s" (Key_gen.hello ()));
        Time.sleep_ns (Duration.of_sec 1) >>= fun () ->
        loop (n-1)
    in
    loop 4
end
```

### June 2023 (Mirage 4.4)

The argument was moved to runtime.

`config.ml`
```OCaml
open Mirage

let hello =
  let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in
  Key.(create "hello" Arg.(opt ~stage:`Run string "Hello World!" doc))

let main =
  main
    ~keys:[key hello]
    ~packages:[package "duration"]
    "Unikernel.Hello" (time @-> job)

let () = register "hello-key" [main $ default_time]
```

and `unikernel.ml`
```OCaml
open Lwt.Infix

module Hello (Time : Mirage_time.S) = struct
  let start _time =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        Logs.info (fun f -> f "%s" (Key_gen.hello ()));
        Time.sleep_ns (Duration.of_sec 1) >>= fun () ->
        loop (n-1)
    in
    loop 4
end
```

### March 2024 (Mirage 4.5)

The runtime argument is declared in `config.ml` by referring to the argument
as a string ("Unikernel.hello"), and it is passed to the `start` function as
an argument.

`config.ml`
```OCaml
open Mirage

let runtime_args = [ runtime_arg ~pos:__POS__ "Unikernel.hello" ]

let main =
  main
    ~runtime_args
    ~packages:[package "duration"]
    "Unikernel.Hello" (time @-> job)

let () = register "hello-key" [main $ default_time]
```

and `unikernel.ml`
```OCaml
open Lwt.Infix
open Cmdliner

let hello =
  let doc = Arg.info ~doc:"How to say hello." [ "hello" ] in
  Arg.(value & opt string "Hello World!" doc)

module Hello (Time : Mirage_time.S) = struct
  let start _time hello =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        Logs.info (fun f -> f "%s" hello);
        Time.sleep_ns (Duration.of_sec 1) >>= fun () ->
        loop (n-1)
    in
    loop 4
end
```

### October 2024 (Mirage 4.8)

Again, moved out of `config.ml`.

`config.ml`
```OCaml
open Mirage

let main =
  main
    ~packages:[package "duration"]
    "Unikernel.Hello" (time @-> job)

let () = register "hello-key" [main $ default_time]
```

and `unikernel.ml`
```OCaml
open Lwt.Infix
open Cmdliner

let hello =
  let doc = Arg.info ~doc:"How to say hello." [ "hello" ] in
  Mirage_runtime.register_arg Arg.(value & opt string "Hello World!" doc)

module Hello (Time : Mirage_time.S) = struct
  let start _time =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        Logs.info (fun f -> f "%s" (hello ()));
        Time.sleep_ns (Duration.of_sec 1) >>= fun () ->
        loop (n-1)
    in
    loop 4
end
```

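In all of these variants, the value is supplied on the unikernel's command
line at startup. As a sketch (assuming the hvt target and that the built
image is named `hello-key.hvt` - the invocation varies by target and setup):

```shell
$ solo5-hvt hello-key.hvt --hello='Hallo, Welt!'
```
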
### 2024 (Not yet released)

This is the future with time defunctorized. Read more in the [discussion](https://github.com/mirage/mirage/issues/1513).
To delay the start function, a `dep` on `noop` is introduced.

`config.ml`
```OCaml
open Mirage

let main =
  main
    ~packages:[package "duration"]
    ~dep:[dep noop]
    "Unikernel" job

let () = register "hello-key" [main]
```

and `unikernel.ml`
```OCaml
open Lwt.Infix
open Cmdliner

let hello =
  let doc = Arg.info ~doc:"How to say hello." [ "hello" ] in
  Mirage_runtime.register_arg Arg.(value & opt string "Hello World!" doc)

let start () =
  let rec loop = function
    | 0 -> Lwt.return_unit
    | n ->
      Logs.info (fun f -> f "%s" (hello ()));
      Mirage_timer.sleep_ns (Duration.of_sec 1) >>= fun () ->
      loop (n-1)
  in
  loop 4
```

## Conclusion

The history of hello world shows that over time we slowly improved the
developer experience, removing the boilerplate needed to get MirageOS
unikernels up and running. This is work spanning over a decade, including lots
of other (here invisible) improvements to the mirage utility.

Our current goal is to minimize the code generated by mirage, since code
generation has lots of issues (e.g. error locations, naming, binary size). It
is a long journey. At the same time, we are working on improving the performance
of MirageOS unikernels, developing unikernels that are useful in the real
world ([VPN endpoint](https://github.com/robur-coop/miragevpn), [DNSmasq replacement](https://github.com/robur-coop/dnsvizor), ...), and also [simplifying the
deployment of MirageOS unikernels](https://github.com/robur-coop/mollymawk).

If you're interested in MirageOS and using it in your domain, don't hesitate
to reach out to us (via eMail: team@robur.coop) - we're keen to deploy MirageOS
and find more domains where it is useful. If you can spare a dime, we're a
registered non-profit in Germany - and can provide tax-deductible receipts for
donations ([more information](https://robur.coop/Donate)).

articles/dnsvizor01.md (new file, 107 lines)

@@ -0,0 +1,107 @@
---
date: 2024-10-25
title: "Meet DNSvizor: run your own DHCP and DNS MirageOS unikernel"
description:
  The NGI-funded DNSvizor provides core network services on your network; DNS resolution and DHCP.
tags:
- OCaml
- MirageOS
- DNSvizor
author:
  name: Hannes Mehnert
  email: hannes@mehnert.org
  link: https://hannes.robur.coop
---

TL;DR: We got [NGI0 Entrust (via NLnet)](https://nlnet.nl/entrust/) funding for developing
[DNSvizor](https://nlnet.nl/project/DNSvizor/) - a DNS resolver and
DHCP server. Please help us by [sharing your dnsmasq
configuration with us](https://github.com/robur-coop/dnsvizor/issues/new), so we can
prioritize the configuration options to support.

## Introduction

The [dynamic host configuration protocol (DHCP)](https://en.wikipedia.org/wiki/Dynamic_Host_Configuration_Protocol)
is fundamental in today's Internet and local networks. It usually runs on your
router (or as a dedicated independent service) and automatically configures
computers that join your network (for example wireless laptops, smartphones)
with an IP address, routing information, a DNS resolver, etc. No manual
configuration is needed once your friends' smartphone got the password of your
wireless network \o/

The [domain name system (DNS)](https://en.wikipedia.org/wiki/Domain_Name_System)
is responsible for translating domain names (such as "robur.coop", "nlnet.nl")
to IP addresses (such as 193.30.40.138 or 2a0f:7cc7:7cc7:7c40::138) - used by
computers to talk to each other. Humans can remember domain names instead of
memorizing IP addresses. Computers then use DNS to translate these domain names
to IP addresses to communicate with each other. DNS is a hierarchic,
distributed, fault-tolerant service.

These two protocols are fundamental to today's Internet: without them it would
be much harder for humans to use it.

## DNSvizor

We at [robur](https://robur.coop) got funding (from
[NGI0 Entrust via NLnet](https://nlnet.nl/project/DNSvizor/)) to continue our work on
[DNSvizor](https://github.com/robur-coop/dnsvizor) - a
[MirageOS unikernel](https://mirageos.org) that provides DNS resolution and
DHCP service for a network. It is fully implemented in
[OCaml](https://ocaml.org).

Already at our [MirageOS retreats](https://retreat.mirageos.org) we deployed
such a unikernel, to test our [DHCP implementation](https://github.com/mirage/charrua)
and our [DNS resolver](https://github.com/mirage/ocaml-dns) - and found and
fixed issues on-site. At the retreats we have a very limited Internet uplink,
thus caching DNS queries and answers is great for reducing the load on the
uplink.

Thanks to the funding we received, we'll be able to work on improving the
performance, but also to finish our DNSSec implementation, provide DNS-over-TLS
and DNS-over-HTTPS services, and also a web interface. DNSvizor will use the
existing [dnsmasq](https://thekelleys.org.uk/dnsmasq/doc.html) configuration
syntax, provide lots of features from dnsmasq, and also provide features
such as block lists from [pi-hole](https://pi-hole.net/).

We are at a point where the [basic unikernel (our MVP)](https://github.com/robur-coop/dnsvizor)
- providing DNS and DHCP services - is ready, and we provide
[reproducible binary builds](https://builds.robur.coop/job/dnsvizor). Phew. This
means that the first step is done. The `--dhcp-range` option from dnsmasq is
already being parsed.

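As an illustration, a dnsmasq-style configuration of the kind DNSvizor aims to
accept could look like the following (a sketch; apart from `dhcp-range`, which
is already parsed, the exact set of supported directives is still evolving):

```shell
# hand out IPv4 leases between .50 and .150, valid for 12 hours
dhcp-range=192.168.1.50,192.168.1.150,12h
```
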
We are now curious about concrete usages of dnsmasq and the configurations you use.
If you're interested in DNSvizor, please [open an issue at our repository](https://github.com/robur-coop/dnsvizor/issues/new)
with your dnsmasq configuration. This will help guide us on which parts of the configuration to prioritize.

## Usages of DNSvizor

We have several use cases for DNSvizor:
- at your home router, to provide DNS resolution and DHCP service, filtering ads,
- in the datacenter, auto-configuring your machine park,
- when running your unikernel swarm, to auto-configure the unikernels.

The first one is where pi-hole fits in as well, and where dnsmasq is used quite
a lot. The second one is also a domain where dnsmasq is used. The third one
comes from our experience that lots of people struggle with deploying MirageOS
unikernels since they have to manually do IP configuration etc. We ourselves
also pass additional information to the unikernels, such as syslog host,
monitoring sink, X.509 certificates or host names, do some DNS provisioning, ...

With DNSvizor we will leverage the common configuration options of all
unikernels (reducing the need for boot arguments), and also go a bit further
and make deployment seamless (including adding hostnames to DNS, forwarding
from our reverse TLS proxy, etc.).

## Conclusion

[DNSvizor](https://github.com/robur-coop/dnsvizor) provides DNS resolution and
DHCP service for your network, and [already exists](https://builds.robur.coop/job/dnsvizor) :).
Please [report issues](https://github.com/robur-coop/dnsvizor/issues/) you
encounter and questions you may have. Also, if you use dnsmasq, please
[show us your configuration](https://github.com/robur-coop/dnsvizor/issues/new).

If you're interested in MirageOS and using it in your domain, don't hesitate
to reach out to us (via eMail: team@robur.coop) - we're keen to deploy MirageOS
and find more domains where it is useful. If you can
[spare a dime](https://robur.coop/Donate), we're a registered non-profit in
Germany - and can provide tax-deductible receipts in Europe.

articles/finances.md (new file, 302 lines)

@@ -0,0 +1,302 @@
---
date: 2024-10-21
title: How has robur financially been doing since 2018?
description: How we organise as a collective, and why we're doing that.
tags:
- finances
- cooperative
author:
  name: Hannes Mehnert
  email: hannes@mehnert.org
  link: https://hannes.robur.coop
---

Since the beginning, robur has been working on MirageOS unikernels and getting
them deployed. Due to our experience in hierarchical companies, we wanted to
create something different - a workplace without bosses and management. Instead,
we are a collective where everybody has a say on what we do, and who gets how
much money at the end of the month. This means nobody has to write reports or
meet any goals - there are no KPIs involved. We strive to be a bunch of people
working together nicely on projects that we own and want to bring forward. If
we discover a lack of funding, we reach out to (potential) customers to fill
our cash register, or reach out to people to donate money.

Since our mission is fulfilling and already complex - organising ourselves in a
hierarchy-free environment, including the payment, and working on software in a
niche market - we decided from the early days that bookkeeping and invoicing
should not be part of our collective, especially since we want to be free in
what kind of funding we accept - donations, commercial contracts, public
funding. In the books, robur is part of the non-profit company
[Änderwerk](https://aenderwerk.de) in Germany - and friends of ours run that
company. They get a cut of each income we generate.

To be inclusive and enable everyone to participate in decisions, we are 100%
transparent in our books - every collective member has access to the financial
spreadsheets, contracts, etc. We use a needs-based payment model, so we talk
about the needs everyone has on a regular basis and adjust the salaries, with
everyone agreeing to all the numbers.

## 2018

We started operations in 2018. In late 2017, we got donations (in the form of
bitcoins) from friends who were convinced of our mission. This was 54,194.91 €.
So, in 2018 we started with that money, tried to find a mission, and
generate income to sustain our salaries.

Also, already in 2017, we applied for funding from
[Prototypefund](https://prototypefund.de) for a [CalDAV server](https://prototypefund.de/project/robur-io/),
and we received the grant in early 2018. This was another 48,500 €, paid to
individuals (due to reasons, Prototypefund can't cash out to the non-profit -
this put us into some struggle, since we needed some double bookkeeping and
individuals had to dig into health care etc.).

In the second half of 2018 we also did a security audit for
[Least Authority](https://leastauthority.com/blog/audits/five-security-audits-for-the-tezos-foundation/)
(invoicing 19,600 €).

And later in 2018 we started on what is now called NetHSM with an initial
design workshop (5,000 €).

And lastly, we started to work on a grant to implement [TLS 1.3](https://datatracker.ietf.org/doc/html/rfc8446),
funded by Jane Street (via OCaml Labs Consulting). In 2018, we received 12,741.71 €.

We applied at NLnet for improving the QubesOS firewall developed in MirageOS
(without success), tried to get the IT security prize in Germany (without
success), and applied to DIAL OSC (without success).

| Project         | Amount |
|-----------------|----------:|
| Donation        | 54,194.91 |
| Prototypefund   | 48,500.00 |
| Least Authority | 19,600.00 |
| TLS 1.3         | 12,741.71 |
| Nitrokey        | 5,000.00 |
| __Total__       | __140,036.62__ |

## 2019

We were keen to finish the CalDAV implementation (and start a CardDAV
implementation), and received some financial support from Tarides for it
(15,000 €).

The TLS 1.3 work continued; we got in total 68,887.53 €.

We also applied to (and got funding from) Prototypefund, once with an [OpenVPN-compatible
MirageOS unikernel](https://prototypefund.de/en/project/robust-openvpn-client-with-low-use-of-resources/),
and once with [improving the QubesOS firewall developed as a MirageOS unikernel](https://prototypefund.de/project/portable-firewall-fuer-qubesos/).
This means again twice 48,500 €.

We also started the implementation work of NetHSM - which still included a lot
of design work - in total the contract was over 82,500 €. In 2019, we invoiced
Nitrokey in total 40,500 €.

We also received a total of 516.48 € as donations from sources unknown to us.

We also applied to NLnet with [DNSvizor](https://nlnet.nl/project/Robur/), and
got a grant, but due to bureaucratic reasons they couldn't transfer the money to
our non-profit (which was involved with NLnet in some EU grants), and we didn't
get any money in the end.

| Project  | Amount |
|----------|----------:|
| CardDAV  | 15,000.00 |
| TLS 1.3  | 68,887.53 |
| OpenVPN  | 48,500.00 |
| QubesOS  | 48,500.00 |
| Donation | 516.48 |
| Nitrokey | 40,500.00 |
| __Total__ | __221,904.01__ |

## 2020

In 2020, we agreed with OCaml Labs Consulting to work on maintenance of OCaml
packages in the MirageOS ecosystem. This was a contract where at the end of the
month, we reported on which PRs and issues we spent how much time. For us, this
was great: we had the freedom to work on whichever OCaml packages we were keen
to get up to speed. In 2020, we received 45,000 € for this maintenance.

We finished the TLS 1.3 work (18,659.01 €).

We continued to work on the NetHSM project, and invoiced 55,500 €.

We received a total of 255 € in donations from sources unknown to us.

We applied at reset.tech again with DNSvizor, unfortunately without success.

We also applied at [NGI pointer](https://pointer.ngi.eu) to work on reproducible
builds for MirageOS, and a web frontend. Here we got a grant of 200,000 €,
which we worked on in 2021 and 2022.

| Project   | Amount |
|-----------|----------:|
| OCLC      | 45,000.00 |
| TLS 1.3   | 18,659.01 |
| Nitrokey  | 55,500.00 |
| Donations | 255.00 |
| __Total__ | __119,414.01__ |

## 2021

As outlined, we worked on reproducible builds of unikernels - rethinking the way
a unikernel is configured: no more compiled-in secrets, but instead using
boot parameters. We set up the infrastructure for doing daily reproducible
builds, serving system packages via a package repository, and a
[web frontend](https://builds.robur.coop) hosting the reproducible builds.
We received in total 120,000 € from NGI Pointer in 2021.

Our work on NetHSM continued, including the introduction of elliptic curves
in mirage-crypto (using [fiat](https://github.com/mit-plv/fiat-crypto/)). The
invoices to Nitrokey summed up to 26,000 € in 2021.

We developed in a short timeframe two packages, [u2f](https://github.com/robur-coop/u2f)
and later [webauthn](https://git.robur.coop/robur/webauthn), for Skolem Labs based
on [gift economy](https://en.wikipedia.org/wiki/Gift_economy). This resulted in
donations of 18,976 €.

We agreed with [OCSF](https://ocaml-sf.org/) to work on
[conex](https://github.com/hannesm/conex), which we have not delivered yet
(lots of other things had to be cleared first: we did a security review of opam
(leading to [a security advisory](https://opam.ocaml.org/blog/opam-2-1-5-local-cache/)),
we got rid of [`extra-files`](https://discuss.ocaml.org/t/ann-opam-repository-policy-change-checksums-no-md5-and-no-extra-files)
in the opam-repository, and we [removed the weak hash md5](https://discuss.ocaml.org/t/ann-opam-repository-policy-change-checksums-no-md5-and-no-extra-files)
from the opam-repository).

| Customer    | Amount |
|-------------|----------:|
| NGI Pointer | 120,000.00 |
| Nitrokey    | 26,000.00 |
| Skolem      | 18,976.00 |
| __Total__   | __164,976.00__ |

## 2022

We finished our NGI Pointer project, and received another 80,000 €.

We also did some minor maintenance for Nitrokey, and invoiced 4,500 €.

For Tarides, we started another contract maintaining MirageOS packages (and
continuing [our TCP/IP stack](https://github.com/robur-coop/utcp)), and invoiced
in total 22,500 €.

A grant application for [bob](https://github.com/dinosaure/bob/) was rejected,
but a grant application for [MirageVPN](https://github.com/robur-coop/miragevpn)
got accepted. Both were at NLnet within the EU NGI project.

| Project     | Amount |
|-------------|---------:|
| NGI Pointer | 80,000.00 |
| Nitrokey    | 4,500.00 |
| Tarides     | 22,500.00 |
| __Total__   | __107,000.00__ |

## 2023

We finished the NetHSM project, and had a final invoice over 2,500 €.

We started a collaboration with [semgrep](https://semgrep.dev), porting some of
their Python code to OCaml. We received in total 37,500 €.

We continued the MirageOS opam package maintenance and invoiced in total
89,250 € to Tarides.

A grant application for [MirageVPN](https://nlnet.nl/project/MirageVPN/) got
accepted (NGI Assure), and we received in total 12,000 € for our work on it.
This is a continuation of our 2019 work funded by Prototypefund.

We also wrote various funding applications, including one for
[DNSvizor](https://github.com/robur-coop/dnsvizor) that was
[accepted](https://nlnet.nl/project/DNSvizor/) (NGI0 Entrust).

| Customer  | Amount |
|-----------|---------:|
| Nitrokey  | 2,500.00 |
| semgrep   | 37,500.00 |
| Tarides   | 89,250.00 |
| MirageVPN | 12,000.00 |
| __Total__ | __141,250.00__ |

## 2024

We're still in the middle of it, but so far we have continued the Tarides
maintenance contract (54,937.50 €).

We also finished the MirageVPN work, and received another 45,000 €.

We had a contract with Semgrep again on porting Python code to OCaml and received 18,559.40 €.

We again worked on several successful funding applications, one on
[PTT](https://nlnet.nl/project/PTT/) (NGI Zero Core), a continuation of the
[NGI DAPSI](https://www.ngi.eu/funded_solution/ngi-dapsiproject-24/) project -
now realizing mailing lists with our SMTP stack.

We also got [MTE](https://nlnet.nl/project/MTE/) (NGI Taler) accepted.

The table below covers up to the end of September 2024.

| Project   | Amount |
|-----------|----------:|
| Semgrep   | 18,559.40 |
| Tarides   | 62,812.50 |
| MirageVPN | 45,000.00 |
| __Total__ | __126,371.90__ |

## Total

In a single table, here's our income since robur started.

| Year  | Amount |
|-------|-----------:|
| 2018  | 140,036.62 |
| 2019  | 221,904.01 |
| 2020  | 119,414.01 |
| 2021  | 164,976.00 |
| 2022  | 107,000.00 |
| 2023  | 141,250.00 |
| 2024  | 126,371.90 |
| __Total__ | __1,020,952.54__ |

![Plot of above income table](../images/finances.png)

As you can spot, it varies quite a bit. In some years we have less money
available than in other years.

## Expenses

As mentioned, the non-profit company [Änderwerk](https://aenderwerk.de), running
the bookkeeping and legal stuff (invoices, tax statements, contracts, etc.), gets
a cut of each income we generate. They are doing amazing work and are very
quick in responding to our queries.

We spend most of our income on salaries. Some money we spend on travel. We also
pay monthly for our server (plus some extra for hardware, and in June 2024 a
huge amount for trying to recover data from failed SSDs).

## Conclusion

We have provided an overview of our income; we were three to five people working
at robur over the entire time. As written at the beginning, we use needs-based
payment. Our experience with this is great! It builds a lot of trust in each
other.

Our funding is diverse, from multiple sources - donations, commercial work,
public funding. This was our initial goal, and we're very happy that it has
worked fine over the last five years.

Taking the numbers into account, we are not paying ourselves "industry standard"
rates - but we really love what we do - and sometimes we just take some time off.
We do work on various projects that we really really enjoy - but for which (at
the moment) no funding is available.

We are always happy to discuss how our collective operates. If you're
interested, please drop us a message.

Of course, if we receive donations, we use them wisely - mainly for working on
the currently unfunded projects (bob, albatross, miou, mollymawk - to name a few). If you
can spare a dime or two, don't hesitate to [donate](https://robur.coop/Donate).
Donations are tax-deductible in Germany (and should be in Europe) since we're a
registered non-profit.

If you're interested in MirageOS and using it in your domain, don't hesitate
to reach out to us (via eMail: team@robur.coop) so we can start to chat - we're
keen to deploy MirageOS and find more domains where it is useful.

articles/gptar-update.md (new file, 110 lines)

@@ -0,0 +1,110 @@
---
title: GPTar (update)
date: 2024-10-28
description: libarchive vs hybrid GUID partition table and GNU tar volume header
tags:
- OCaml
- gpt
- tar
- mbr
- persistent storage
author:
  name: Reynir Björnsson
  email: reynir@reynir.dk
  link: https://reyn.ir/
---

In a [previous post][gptar-post] I describe how I craft a hybrid GUID partition table (GPT) and tar archive by exploiting that there are disjoint areas of a 512 byte *block* that are important to tar headers and *protective* master boot records used in GPT respectively.
I recommend reading it first if you haven't already, for context.

After writing the above post I read an excellent and fun *and totally normal* article by Emily on how [she created **executable** tar archives][tar-executable].
Therein I learned a clever hack:
GNU tar has a tar extension for *volume headers*.
These are essentially labels for your tape archives when you're forced to split an archive across multiple tapes.
They can (seemingly) hold any text as label, including shell scripts.
What's more is GNU tar and bsdtar **do not** extract these as files!
This is excellent, because I don't actually want to extract or list the GPT header when using GNU tar or bsdtar.
This prompted me to [use a different link indicator](https://github.com/reynir/gptar/pull/1).

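For a sense of what such a block looks like, here is a rough OCaml sketch (not
the actual gptar code) of constructing a GNU volume-header block: a 512-byte
ustar-layout block whose typeflag byte (offset 156) is `V`:

```OCaml
(* A sketch, not the actual gptar code: build a 512-byte tar block with
   the GNU volume header typeflag 'V'. Offsets follow the ustar layout;
   the checksum is computed over the block with the checksum field set
   to spaces, then stored as octal, as the tar format prescribes. *)
let volume_header_block label =
  assert (String.length label < 100);
  let block = Bytes.make 512 '\000' in
  let put off s = Bytes.blit_string s 0 block off (String.length s) in
  put 0 label;                  (* name field carries the label itself *)
  put 100 "0000644";            (* mode *)
  put 124 "00000000000";        (* size: 0 for a plain label *)
  put 136 "00000000000";        (* mtime *)
  Bytes.fill block 148 8 ' ';   (* checksum field: spaces while summing *)
  Bytes.set block 156 'V';      (* typeflag: GNU volume header *)
  put 257 "ustar  ";            (* GNU magic + version *)
  let sum = ref 0 in
  Bytes.iter (fun c -> sum := !sum + Char.code c) block;
  put 148 (Printf.sprintf "%06o\000 " !sum);
  block
```
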
This worked pretty great.
Listing the archive using GNU tar I still get `GPTAR`, but with verbose listing it's displayed as a `--Volume Header--`:

```shell
$ tar -tvf disk.img
Vr-------- 0/0           16896 1970-01-01 01:00 GPTAR--Volume Header--
-rw-r--r-- 0/0              14 1970-01-01 01:00 test.txt
```

And more importantly, the `GPTAR` entry is ignored when extracting:

```shell
$ mkdir tmp
$ cd tmp/
$ tar -xf ../disk.img
$ ls
test.txt
```

## BSD tar / libarchive

Unfortunately, this broke bsdtar!

```shell
$ bsdtar -tf disk.img
bsdtar: Damaged tar archive
bsdtar: Error exit delayed from previous errors.
```

This is annoying because we run FreeBSD on the host for [opam.robur.coop](https://opam.robur.coop), our instance of [opam-mirror][opam-mirror].
This Autumn we updated [opam-mirror][opam-mirror] to use the hybrid GPT+tar GPTar *tartition table*[^tartition] instead of hard coded or boot parameter specified disk offsets for the different partitions - which was extremely brittle!
So we were no longer able to inspect the contents of the tar partition from the host!
Unacceptable!
So I started to dig into libarchive, where bsdtar comes from.
To my surprise, after building bsdtar from the git clone of the source code it ran perfectly fine!

```shell
$ ./bsdtar -tf ../gptar/disk.img
test.txt
```

I eventually figured out that [this change][libarchive-pr] fixed it for me.
I got in touch with Emily to let her know that bsdtar recently fixed this (ab)use of GNU volume headers.
Her reply was basically "as of when I wrote the article, I was pretty sure bsdtar ignored it."
And indeed it did.
Examining the diff further revealed that it ignored the GNU volume header - just not "correctly" when the GNU volume header was abused to carry file content as I did:

```diff
 /*
  * Interpret 'V' GNU tar volume header.
  */
 static int
 header_volume(struct archive_read *a, struct tar *tar,
     struct archive_entry *entry, const void *h, size_t *unconsumed)
 {
-	(void)h;
+	const struct archive_entry_header_ustar *header;
+	int64_t size, to_consume;
+
+	(void)a; /* UNUSED */
+	(void)tar; /* UNUSED */
+	(void)entry; /* UNUSED */

-	/* Just skip this and read the next header. */
-	return (tar_read_header(a, tar, entry, unconsumed));
+	header = (const struct archive_entry_header_ustar *)h;
+	size = tar_atol(header->size, sizeof(header->size));
+	to_consume = ((size + 511) & ~511);
+	*unconsumed += to_consume;
+	return (ARCHIVE_OK);
 }
```

So thanks to the above change we can expect a release of libarchive supporting further flavors of abuse of GNU volume headers!
🥳

[gptar-post]: gptar.html
[tar-executable]: https://uni.horse/executable-tarballs.html
[opam-mirror]: https://git.robur.coop/robur/opam-mirror/
[libarchive-pr]: https://github.com/libarchive/libarchive/pull/2127

[^tartition]: Emily came up with the much better term "tartition table" than what I had come up with - "GPTar".

@@ -1,7 +1,7 @@
 ---
 date: 2024-02-21
-article.title: GPTar
-article.description: Hybrid GUID partition table and tar archive
+title: GPTar
+description: Hybrid GUID partition table and tar archive
 tags:
 - OCaml
 - gpt

@@ -1,7 +1,7 @@
 ---
 date: 2024-02-11
-article.title: Cooperation and Lwt.pause
-article.description:
+title: Cooperation and Lwt.pause
+description:
   A disgression about Lwt and Miou
 tags:
 - OCaml

@@ -9,6 +9,10 @@ tags:
 - Community
 - Unikernel
 - Git
+author:
+  name: Romain Calascibetta
+  email: romain.calascibetta@gmail.com
+  link: https://blog.osau.re/
 breaks: false
 ---

@@ -1,7 +1,7 @@
 ---
 date: 2023-11-20
-article.title: MirageVPN updated (AEAD, NCP)
-article.description:
+title: MirageVPN updated (AEAD, NCP)
+description:
   How we resurrected MirageVPN from its bitrot state
 tags:
 - OCaml

@@ -16,7 +16,7 @@ author:
 
 ## Updating MirageVPN
 
-As announced [earlier this month](https://blog.robur.coop/articles/miragevpn.html), we've been working hard over the last months on MirageVPN (initially developed in 2019, targeting OpenVPN™ 2.4.7, now 2.6.6). We managed to receive funding from [NGI Assure](https://www.assure.ngi.eu/) call (via [NLnet](https://nlnet.nl)). We've made over 250 commits with more than 10k lines added, and 18k lines removed. We closed nearly all old issues, and opened 100 fresh ones, of which we already closed more than half of them. :D
+As announced [earlier this month](miragevpn.html), we've been working hard over the last months on MirageVPN (initially developed in 2019, targeting OpenVPN™ 2.4.7, now 2.6.6). We managed to receive funding from [NGI Assure](https://www.assure.ngi.eu/) call (via [NLnet](https://nlnet.nl)). We've made over 250 commits with more than 10k lines added, and 18k lines removed. We closed nearly all old issues, and opened 100 fresh ones, of which we already closed more than half of them. :D
 
 ### Actual bugs fixed (that were leading to non-working MirageVPN applications)
 

@@ -29,7 +29,7 @@ To avoid any future breakage while revising the code (cleaning it up, extending
 
 ### New features: AEAD ciphers, supporting more configuration primitives
 
-We added various configuration primitives, amongst them configuratble tls ciphersuites, minimal and maximal tls version to use, [tls-crypt-v2](https://blog.robur.coop/articles/miragevpn.html), verify-x509-name, cipher, remote-random, ...
+We added various configuration primitives, amongst them configuratble tls ciphersuites, minimal and maximal tls version to use, [tls-crypt-v2](miragevpn.html), verify-x509-name, cipher, remote-random, ...
 
 From a cryptographic point of view, we are now supporting more [authentication hashes](https://github.com/robur-coop/miragevpn/pull/108) via the configuration directive `auth`, namely the SHA2 family - previously, only SHA1 was supported, [AEAD ciphers](https://github.com/robur-coop/miragevpn/pull/125) (AES-128-GCM, AES-256-GCM, CHACHA20-POLY1305) - previously only AES-256-CBC was supported.
 

@@ -1,7 +1,7 @@
 ---
 date: 2024-04-16
-article.title: Speeding up MirageVPN and use it in the wild
-article.description:
+title: Speeding up MirageVPN and use it in the wild
+description:
   Performance engineering of MirageVPN, speeding it up by a factor of 25.
 tags:
 - OCaml

@@ -19,10 +19,9 @@ coauthors:
   name: Reynir Björnsson
   email: reynir@reynir.dk
   link: https://reyn.ir/
-  contribution: What is this field used for?
 ---
 
-As we were busy continuing to work on [MirageVPN](https://github.com/robur-coop/miragevpn), we got in touch with [eduVPN](https://eduvpn.org), who are interested about deploying MirageVPN. We got example configuration from their side, and [fixed](https://github.com/robur-coop/miragevpn/pull/201) [some](https://github.com/robur-coop/miragevpn/pull/168) [issues](https://github.com/robur-coop/miragevpn/pull/202), and also implemented [tls-crypt](https://github.com/robur-coop/miragevpn/pull/169) - which was straightforward since we earlier spend time to implement [tls-crypt-v2](https://blog.robur.coop/articles/miragevpn.html).
+As we were busy continuing to work on [MirageVPN](https://github.com/robur-coop/miragevpn), we got in touch with [eduVPN](https://eduvpn.org), who are interested about deploying MirageVPN. We got example configuration from their side, and [fixed](https://github.com/robur-coop/miragevpn/pull/201) [some](https://github.com/robur-coop/miragevpn/pull/168) [issues](https://github.com/robur-coop/miragevpn/pull/202), and also implemented [tls-crypt](https://github.com/robur-coop/miragevpn/pull/169) - which was straightforward since we earlier spend time to implement [tls-crypt-v2](miragevpn.html).
 
 In January, they gave MirageVPN another try, and [measured the performance](https://github.com/robur-coop/miragevpn/issues/206) -- which was very poor -- MirageVPN (run as a Unix binary) provided a bandwith of 9.3Mb/s, while OpenVPN provided a bandwidth of 360Mb/s (using a VPN tunnel over TCP).
 

@@ -46,7 +45,7 @@ The learnings of our performance engineering are in three areas:
 
 ## Conclusion
 
-To conclude: we already achieved a factor of 25 in performance by adapting the code in various ways. We have ideas to improve the performance even more in the future - we also work on using OCaml string and bytes, instead of off-the-OCaml-heap-allocated bigarrays (see [our previous article](https://blog.robur.coop/articles/speeding-ec-string.html), which provided some speedups).
+To conclude: we already achieved a factor of 25 in performance by adapting the code in various ways. We have ideas to improve the performance even more in the future - we also work on using OCaml string and bytes, instead of off-the-OCaml-heap-allocated bigarrays (see [our previous article](speeding-ec-string.html), which provided some speedups).
 
 Don't hesitate to reach out to us on [GitHub](https://github.com/robur-coop/miragevpn/issues), or [by mail](https://robur.coop/Contact) if you're stuck.
 

@@ -1,7 +1,7 @@
 ---
 date: 2024-06-17
-article.title: MirageVPN server
-article.description:
+title: MirageVPN server
+description:
   Announcement of our MirageVPN server.
 tags:
 - OCaml

@@ -18,12 +18,11 @@ coauthors:
   name: Reynir Björnsson
   email: reynir@reynir.dk
   link: https://reyn.ir/
-  contribution: What is this field used for?
 ---
 
 It is a great pleasure to finally announce that we have finished a server implementation for MirageVPN (OpenVPN™-compatible). This allows to setup a very robust VPN network on both the client and the server side.
 
-As announced last year, [MirageVPN](https://blog.robur.coop/articles/miragevpn.html) is a reimplemtation of OpenVPN™ in OCaml, with [MirageOS](https://mirage.io) unikernels.
+As announced last year, [MirageVPN](miragevpn.html) is a reimplemtation of OpenVPN™ in OCaml, with [MirageOS](https://mirage.io) unikernels.
 
 ## Why a MirageVPN server?
 

@@ -37,4 +36,4 @@ The overall progress was tracked in [this issue](https://github.com/robur-coop/m
 
 Please move along to our handbook with the [chapter on MirageVPN server](https://robur-coop.github.io/miragevpn-handbook/miragevpn_server.html).
 
-If you encounter any issues, please open an issue at [the repository](https://github.com/robur-coop/miragevpn).
+If you encounter any issues, please open an issue at [the repository](https://github.com/robur-coop/miragevpn).

articles/miragevpn-testing.md (new file, 54 lines)

@@ -0,0 +1,54 @@
---
date: 2024-06-26
title: Testing MirageVPN against OpenVPN™
description: Some notes about how we test MirageVPN against OpenVPN™
tags:
- OCaml
- MirageOS
- cryptography
- security
- testing
- vpn
author:
  name: Reynir Björnsson
  email: reynir@reynir.dk
  link: https://reyn.ir/
---

As our last milestone for the [EU NGI Assure](https://www.assure.ngi.eu/) funded MirageVPN project (for now), we have been working on testing MirageVPN, our OpenVPN™-compatible VPN implementation, against the upstream OpenVPN™.
During the development we have conducted many manual tests.
However, this scales poorly and it is easy to forget testing certain cases.
Therefore, we designed and implemented interoperability testing, driving the C implementation on the one side, and our OCaml implementation on the other side. The input for such a test is a configuration file that both implementations can use.
Thus we test establishment of the tunnel as well as the tunnel itself.

While conducting the tests, our instrumented binaries expose code coverage information. We use that to guide us on which other configurations are worth testing. Our goal is to achieve a high code coverage rate while using a small number of different configurations. These interoperability tests run fast enough that they are executed on each commit by CI.

A nice property of this test setup is that it runs with an unmodified OpenVPN binary.
This means we can use an off-the-shelf OpenVPN binary from the package repository, and it does not entail further maintenance of an OpenVPN fork.
Testing against a future version of OpenVPN becomes trivial.
We do not just test a single part of our implementation but achieve an end-to-end test.
The same configuration files are used for both our implementation and the C implementation, and each configuration is used twice: once our implementation acts as the client, once as the server.

We added a flag to our client and our [recently finished server](miragevpn-server) applications, `--test`, which makes them exit once a tunnel is established and an ICMP echo request from the client has been replied to by the server.
Our client and server can be run without a tun device, which otherwise would require elevated privileges.
Unfortunately, OpenVPN requires privileges to at least configure a tun device.
Our MirageVPN implementation does IP packet parsing in userspace.
We test our protocol implementation, not the entire unikernel - but the unikernel code is a tiny layer on top of the purely functional protocol implementation.

We explored unit testing the packet decoding and decryption with our implementation and the C implementation.
Specifically, we encountered a packet whose message authentication code (MAC) was deemed invalid by the C implementation.
It helped us discover that the MAC computation was correct but the packet encoding was truncated - both implementations agreed that the MAC was bad.
The test was very tedious to write and would not easily scale to cover a large portion of the code.
If of interest, take a look into our [modifications to OpenVPN](https://github.com/reynir/openvpn/tree/badmac-test) and [modifications to MirageVPN](https://github.com/robur-coop/miragevpn/tree/badmac-test).

The end-to-end testing is in addition to our unit tests and fuzz testing, and to our [benchmarking](miragevpn-performance.html) binary.

Our results are that with 4 configurations we achieve above 75% code coverage in MirageVPN.
While investigating the code coverage results, we found various pieces of code that were never executed, and we were able to remove them.
Code that does not exist is bug-free :D
With these tests in place, future maintenance is less daunting as they will help guard us from breaking the code.

At the moment we do not exercise the error paths very well in the code.
This is much less straightforward to test in this manner, and is important future work.
We plan to develop a client and server that injects faults at various stages of the protocol to test these error paths.
OpenVPN built with debugging enabled also comes with a `--gremlin` mode that injects faults, which would be interesting to investigate.

@@ -1,7 +1,7 @@
 ---
 date: 2023-11-14
-article.title: MirageVPN & tls-crypt-v2
-article.description:
+title: MirageVPN & tls-crypt-v2
+description:
   How we implementated tls-crypt-v2 for miragevpn
 tags:
 - OCaml

@@ -1,7 +1,7 @@
 ---
 date: 2024-06-24
-article.title: qubes-miragevpn, a MirageVPN client for QubesOS
-article.description: A new OpenVPN client for QubesOS
+title: qubes-miragevpn, a MirageVPN client for QubesOS
+description: A new OpenVPN client for QubesOS
 tags:
 - OCaml
 - vpn

@@ -1,7 +1,7 @@
 ---
 date: 2024-02-13
-article.title: Speeding elliptic curve cryptography
-article.description:
+title: Speeding elliptic curve cryptography
+description:
   How we improved the performance of elliptic curves by only modifying the underlying byte array
 tags:
 - OCaml

@@ -93,6 +93,6 @@ As a spoiler, for P-256 sign there's another improvement of around 4.5 with [Vir
 
 Remove all cstruct, everywhere, apart from in mirage-block-xen and mirage-net-xen ;). It was a fine decision in the early MirageOS days, but from a performance point of view, and for making our packages more broadly usable without many dependencies, it is time to remove cstruct. Earlier this year we already [removed cstruct from ocaml-tar](https://github.com/mirage/ocaml-tar/pull/137) for similar reasons.
 
-Our MirageOS work is only partially funded, we cross-fund our work by commercial contracts and public (EU) funding. We are part of a non-profit company, you can make a (tax-deducable - at least in the EU) [donation](https://aenderwerk.de/donate/) (select "DONATION robur" in the dropdown menu).
+Our MirageOS work is only partially funded, we cross-fund our work by commercial contracts and public (EU) funding. We are part of a non-profit company, you can make a (tax-deductable - at least in the EU) [donation](https://aenderwerk.de/donate/) (select "DONATION robur" in the dropdown menu).
 
-We're keen to get MirageOS deployed in production - if you would like to do that, don't hesitate to reach out to us via eMail team at robur.coop
+We're keen to get MirageOS deployed in production - if you would like to do that, don't hesitate to reach out to us via eMail team at robur.coop

@@ -1,7 +1,7 @@
 ---
 date: 2024-08-15
-article.title: The new Tar release, a retrospective
-article.description: A little retrospective to the new Tar release and changes
+title: The new Tar release, a retrospective
+description: A little retrospective to the new Tar release and changes
 tags:
 - OCaml
 - Cstruct

bin/blog.ml (new file, 681 lines)

@@ -0,0 +1,681 @@
open Yocaml

module SM = Map.Make(String)

let is_empty_list = function [] -> true | _ -> false

module Date = struct
|
||||
type month =
|
||||
| Jan
|
||||
| Feb
|
||||
| Mar
|
||||
| Apr
|
||||
| May
|
||||
| Jun
|
||||
| Jul
|
||||
| Aug
|
||||
| Sep
|
||||
| Oct
|
||||
| Nov
|
||||
| Dec
|
||||
|
||||
type day_of_week = Mon | Tue | Wed | Thu | Fri | Sat | Sun
|
||||
type year = int
|
||||
type day = int
|
||||
type hour = int
|
||||
type min = int
|
||||
type sec = int
|
||||
|
||||
type t = {
|
||||
year : year
|
||||
; month : month
|
||||
; day : day
|
||||
; hour : hour
|
||||
; min : min
|
||||
; sec : sec
|
||||
}
|
||||
|
||||
let invalid_int x message =
|
||||
Data.Validation.fail_with ~given:(string_of_int x) message
|
||||
|
||||
let month_from_int x =
|
||||
if x > 0 && x <= 12 then
|
||||
Result.ok
|
||||
      [| Jan; Feb; Mar; Apr; May; Jun; Jul; Aug; Sep; Oct; Nov; Dec |].(x - 1)
    else invalid_int x "Invalid month value"

  let year_from_int x =
    if x >= 0 then Result.ok x else invalid_int x "Invalid year value"

  let is_leap year =
    if year mod 100 = 0 then year mod 400 = 0 else year mod 4 = 0

  let days_in_month year month =
    match month with
    | Jan | Mar | May | Jul | Aug | Oct | Dec -> 31
    | Feb -> if is_leap year then 29 else 28
    | _ -> 30

  let day_from_int year month x =
    let dim = days_in_month year month in
    if x >= 1 && x <= dim then Result.ok x
    else invalid_int x "Invalid day value"

  let hour_from_int x =
    if x >= 0 && x < 24 then Result.ok x else invalid_int x "Invalid hour value"

  let min_from_int x =
    if x >= 0 && x < 60 then Result.ok x else invalid_int x "Invalid min value"

  let sec_from_int x =
    if x >= 0 && x < 60 then Result.ok x else invalid_int x "Invalid sec value"

  let ( let* ) = Result.bind

  let make ?(time = (0, 0, 0)) ~year ~month ~day () =
    let hour, min, sec = time in
    let* year = year_from_int year in
    let* month = month_from_int month in
    let* day = day_from_int year month day in
    let* hour = hour_from_int hour in
    let* min = min_from_int min in
    let* sec = sec_from_int sec in
    Result.ok { year; month; day; hour; min; sec }

  let validate_from_datetime_str str =
    let str = String.trim str in
    match
      Scanf.sscanf_opt str "%04d%c%02d%c%02d%c%02d%c%02d%c%02d"
        (fun year _ month _ day _ hour _ min _ sec ->
          ((hour, min, sec), year, month, day))
    with
    | None -> Data.Validation.fail_with ~given:str "Invalid date format"
    | Some (time, year, month, day) -> make ~time ~year ~month ~day ()

  let validate_from_date_str str =
    let str = String.trim str in
    match
      Scanf.sscanf_opt str "%04d%c%02d%c%02d" (fun year _ month _ day ->
          (year, month, day))
    with
    | None -> Data.Validation.fail_with ~given:str "Invalid date format"
    | Some (year, month, day) -> make ~year ~month ~day ()

  let validate =
    let open Data.Validation in
    string & (validate_from_datetime_str / validate_from_date_str)

  let month_to_int = function
    | Jan -> 1
    | Feb -> 2
    | Mar -> 3
    | Apr -> 4
    | May -> 5
    | Jun -> 6
    | Jul -> 7
    | Aug -> 8
    | Sep -> 9
    | Oct -> 10
    | Nov -> 11
    | Dec -> 12

  let dow_to_int = function
    | Mon -> 0
    | Tue -> 1
    | Wed -> 2
    | Thu -> 3
    | Fri -> 4
    | Sat -> 5
    | Sun -> 6

  let compare_date a b =
    let cmp = Int.compare a.year b.year in
    if Int.equal cmp 0 then
      let cmp = Int.compare (month_to_int a.month) (month_to_int b.month) in
      if Int.equal cmp 0 then Int.compare a.day b.day else cmp
    else cmp

  let compare_time a b =
    let cmp = Int.compare a.hour b.hour in
    if Int.equal cmp 0 then
      let cmp = Int.compare a.min b.min in
      if Int.equal cmp 0 then Int.compare a.sec b.sec else cmp
    else cmp

  let compare a b =
    let cmp = compare_date a b in
    if Int.equal cmp 0 then compare_time a b else cmp

  let pp_date ppf { year; month; day; _ } =
    Format.fprintf ppf "%04d-%02d-%02d" year (month_to_int month) day

  let month_value = function
    | Jan -> 0
    | Feb -> 3
    | Mar -> 3
    | Apr -> 6
    | May -> 1
    | Jun -> 4
    | Jul -> 6
    | Aug -> 2
    | Sep -> 5
    | Oct -> 0
    | Nov -> 3
    | Dec -> 5

  let day_of_week { year; month; day; _ } =
    let yy = year mod 100 in
    let cc = (year - yy) / 100 in
    let c_code = [| 6; 4; 2; 0 |].(cc mod 4) in
    let y_code = (yy + (yy / 4)) mod 7 in
    let m_code =
      let v = month_value month in
      if is_leap year && (month = Jan || month = Feb) then v - 1 else v
    in
    let index = (c_code + y_code + m_code + day) mod 7 in
    [| Sun; Mon; Tue; Wed; Thu; Fri; Sat |].(index)

  let normalize ({ year; month; day; hour; min; sec } as dt) =
    let day_of_week = day_of_week dt in
    let open Data in
    record
      [
        ("year", int year); ("month", int (month_to_int month)); ("day", int day)
      ; ("hour", int hour); ("min", int min); ("sec", int sec)
      ; ("day_of_week", int (dow_to_int day_of_week))
      ; ("human", string (Format.asprintf "%a" pp_date dt))
      ]

  let to_archetype_date_time { year; month; day; hour; min; sec } =
    let time = (hour, min, sec) in
    let month = month_to_int month in
    Result.get_ok (Archetype.Datetime.make ~time ~year ~month ~day ())
end

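A note on the parsing above: the `%c` conversions in the `Scanf` formats match any single separator character, so shapes like `2024-02-03` and `2024/02/03T10:30:00` both go through; `validate` then tries the full datetime parser first and falls back to the date-only one via `/`. A rough sketch of what that alternative combinator does (an assumption for illustration, not YOCaml's actual implementation, which also keeps track of both error reports):

(* Hypothetical simplification of Data.Validation's ( / ): try the first
   validator and, only if it fails, the second one. *)
let ( / ) v1 v2 given =
  match v1 given with
  | Ok x -> Ok x
  | Error _ -> v2 given
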
module Page = struct
  let entity_name = "Page"

  class type t = object ('self)
    method title : string option
    method charset : string option
    method description : string option
    method tags : string list
    method with_host : string -> 'self
    method get_host : string option
  end

  class page ?title ?description ?charset ?(tags = []) () =
    object (_ : #t)
      method title = title
      method charset = charset
      method description = description
      method tags = tags
      val host = None
      method with_host v = {< host = Some v >}
      method get_host = host
    end

  let neutral = Result.ok @@ new page ()

  let validate fields =
    let open Data.Validation in
    let+ title = optional fields "title" string
    and+ description = optional fields "description" string
    and+ charset = optional fields "charset" string
    and+ tags = optional_or fields ~default:[] "tags" (list_of string) in
    new page ?title ?description ?charset ~tags ()

  let validate =
    let open Data.Validation in
    record validate
end

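The `let+`/`and+` bindings make the validation applicative: every field is checked, and errors accumulate instead of stopping at the first failure. As a plausible reading (an assumed, simplified shape; the real combinator works over YOCaml's `Data.t` representation, not an association list), `optional_or` behaves like this:

(* Assumed simplification of optional_or: substitute a default when the
   field is missing from the metadata record. *)
let optional_or fields ~default key validator =
  match List.assoc_opt key fields with
  | None -> Ok default
  | Some value -> validator value
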
module Author = struct
  class type t = object
    method name : string
    method link : string
    method email : string
    method avatar : string option
  end

  let gravatar email =
    let tk = String.(lowercase_ascii (trim email)) in
    let hs = Digest.(to_hex (string tk)) in
    "https://www.gravatar.com/avatar/" ^ hs

  class author ~name ~link ~email ?(avatar = gravatar email) () =
    object (_ : #t)
      method name = name
      method link = link
      method email = email
      method avatar = Some avatar
    end

  let validate fields =
    let open Data.Validation in
    let+ name = required fields "name" string
    and+ link = required fields "link" string
    and+ email = required fields "email" string
    and+ avatar = optional fields "avatar" string in
    match avatar with
    | None -> new author ~name ~link ~email ()
    | Some avatar -> new author ~name ~link ~email ~avatar ()

  let validate =
    let open Data.Validation in
    record validate

  let normalize obj =
    let open Data in
    record
      [
        ("name", string obj#name); ("link", string obj#link)
      ; ("email", string obj#email); ("avatar", option string obj#avatar)
      ]
end

let robur_coop =
  new Author.author
    ~name:"The Robur Team" ~link:"https://robur.coop/"
    ~email:"team@robur.coop" ()

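Worth noting: `Digest` in the OCaml standard library is MD5, which matches Gravatar's historical URL scheme (the MD5 hex digest of the trimmed, lowercased address). A usage sketch from inside this file, with an illustrative address:

(* Illustrative only: normalises to "team@robur.coop" before hashing. *)
let avatar_url = Author.gravatar "  Team@Robur.coop  "
(* = "https://www.gravatar.com/avatar/<md5-hex-of-normalised-address>" *)
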
module Article = struct
  let entity_name = "Article"

  class type t = object ('self)
    method title : string
    method description : string
    method charset : string option
    method tags : string list
    method date : Date.t
    method author : Author.t
    method co_authors : Author.t list
    method with_host : string -> 'self
    method get_host : string option
  end

  class article ~title ~description ?charset ?(tags = []) ~date ~author
    ?(co_authors = []) () =
    object (_ : #t)
      method title = title
      method description = description
      method charset = charset
      method tags = tags
      method date = date
      method author = author
      method co_authors = co_authors
      val host = None
      method with_host v = {< host = Some v >}
      method get_host = host
    end

  let title p = p#title
  let description p = p#description
  let date p = p#date

  let neutral =
    Data.Validation.fail_with ~given:"null" "Cannot be null"
    |> Result.map_error (fun error ->
           Required.Validation_error { entity = entity_name; error })

  let validate fields =
    let open Data.Validation in
    let+ title = required fields "title" string
    and+ description = required fields "description" string
    and+ charset = optional fields "charset" string
    and+ tags = optional_or fields ~default:[] "tags" (list_of string)
    and+ date = required fields "date" Date.validate
    and+ author =
      optional_or fields ~default:robur_coop "author" Author.validate
    and+ co_authors =
      optional_or fields ~default:[] "co-authors" (list_of Author.validate)
    in
    new article ~title ~description ?charset ~tags ~date ~author ~co_authors ()

  let validate =
    let open Data.Validation in
    record validate

  let normalize obj =
    Data.
      [
        ("title", string obj#title); ("description", string obj#description)
      ; ("date", Date.normalize obj#date); ("charset", option string obj#charset)
      ; ("tags", list_of string obj#tags)
      ; ("author", Author.normalize obj#author)
      ; ("co-authors", list_of Author.normalize obj#co_authors)
      ; ("host", option string obj#get_host)
      ]
end

module Articles = struct
  class type t = object ('self)
    method title : string option
    method description : string option
    method articles : (Path.t * Article.t) list
    method with_host : string -> 'self
    method get_host : string option
  end

  class articles ?title ?description articles =
    object (_ : #t)
      method title = title
      method description = description
      method articles = articles
      val host = None
      method with_host v = {< host = Some v >}
      method get_host = host
    end

  let sort_by_date ?(increasing = false) articles =
    List.sort
      (fun (_, articleA) (_, articleB) ->
        let r = Date.compare articleA#date articleB#date in
        if increasing then r else ~-r)
      articles

  let fetch (module P : Required.DATA_PROVIDER) ?increasing
      ?(filter = fun x -> x) ?(on = `Source) ~where ~compute_link path =
    Task.from_effect begin fun () ->
      let open Eff in
      let* files = read_directory ~on ~only:`Files ~where path in
      let+ articles =
        List.traverse
          (fun file ->
            let url = compute_link file in
            let+ metadata, _content =
              Eff.read_file_with_metadata (module P) (module Article) ~on file
            in
            (url, metadata))
          files
      in
      articles |> sort_by_date ?increasing |> filter end

  let compute_index (module P : Required.DATA_PROVIDER) ?increasing
      ?(filter = fun x -> x) ?(on = `Source) ~where ~compute_link path =
    let open Task in
    (fun x -> (x, ()))
    |>> second
          (fetch (module P) ?increasing ~filter ~on ~where ~compute_link path)
    >>> lift (fun (v, articles) ->
            new articles ?title:v#title ?description:v#description articles)

  let normalize (ident, article) =
    let open Data in
    record (("url", string @@ Path.to_string ident) :: Article.normalize article)

  let normalize obj =
    let open Data in
    [
      ("articles", list_of normalize obj#articles)
    ; ("has_articles", bool @@ is_empty_list obj#articles)
    ; ("title", option string obj#title)
    ; ("description", option string obj#description)
    ; ("host", option string obj#get_host)
    ]
end

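`compute_index` wires a task that duplicates its input, fetches the article list on the second component, then merges both into an `articles` page object. On plain functions, the combinators used here have roughly the following shapes (simplified assumptions for intuition only; YOCaml's real `Task` values also carry dependency information):

(* Function-level analogues of the Task combinators, assumed shapes. *)
let second f (x, y) = (x, f y)   (* run a task on the second component *)
let ( >>> ) f g x = g (f x)      (* left-to-right composition *)
let ( |>> ) f t x = t (f x)      (* pre-compose a pure function *)
let lift f = f                   (* promote a pure function to a task *)
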
module Tag = struct
  type t = {
    name : string;
    articles : (Path.t * Article.t) list;
  }

  let make ~name ~articles =
    { name; articles }

  let normalize_article (ident, article) =
    let open Data in
    record (("url", string @@ Path.to_string ident) :: Article.normalize article)

  let normalize { name; articles } =
    let open Data in
    [
      ("name", string name);
      ("articles", (list_of normalize_article) articles);
    ]
end

module Tags = struct
  class type t = object ('self)
    inherit Articles.t
    method tags : Tag.t list
  end

  class tags ?title ?description articles =
    object
      inherit Articles.articles ?title ?description articles as super
      method! title = Some "Tags"

      method tags =
        let tags =
          let update article sm tag =
            SM.update tag
              (function
                | None -> Some [ article ]
                | Some urls -> Some (article :: urls))
              sm
          in
          List.fold_left
            (fun sm (url, article) ->
              List.fold_left (update (url, article)) sm article#tags)
            SM.empty
            super#articles
          |> SM.bindings
        in
        List.map (fun (tag, articles) ->
            Tag.make ~name:tag ~articles)
          tags
    end

  let of_articles articles =
    new tags ?title:articles#title ?description:articles#description articles#articles

  let normalize_tag tag =
    let open Data in
    record (Tag.normalize tag)

  let normalize tags =
    let open Data in
    ("all_tags", (list_of normalize_tag tags#tags)) :: Articles.normalize tags
end

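The `tags` method inverts the article-to-tag relation: every article is folded into a string map keyed by tag (`SM` is presumably a `Map.Make (String)` instance defined earlier in the file). In isolation, and with simplified types, the grouping step amounts to the sketch below:

module StringMap = Map.Make (String)

(* Assumed simplification: group (url, article) pairs under each of the
   article's tags, then list the bindings in key order. *)
let group_by_tag articles =
  List.fold_left
    (fun acc ((_, article) as entry) ->
      List.fold_left
        (fun acc tag ->
          StringMap.update tag
            (function None -> Some [ entry ] | Some es -> Some (entry :: es))
            acc)
        acc article#tags)
    StringMap.empty articles
  |> StringMap.bindings
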
module Make_with_target (S : sig
  val source : Path.t
  val target : Path.t
end) =
struct
  let source_root = S.source

  module Source = struct
    let css = Path.(source_root / "css")
    let js = Path.(source_root / "js")
    let images = Path.(source_root / "images")
    let articles = Path.(source_root / "articles")
    let index = Path.(source_root / "pages" / "index.md")
    let tags = Path.(source_root / "pages" / "tags.md")
    let templates = Path.(source_root / "templates")
    let template file = Path.(templates / file)
    let binary = Path.rel [ Sys.argv.(0) ]
    let cache = Path.(source_root / "_cache")
  end

  module Target = struct
    let target_root = S.target
    let pages = target_root
    let articles = Path.(target_root / "articles")
    let rss2 = Path.(target_root / "feed.xml")

    let as_html into file =
      file |> Path.move ~into |> Path.change_extension "html"
  end

  let target = Target.target_root

  let process_css_files =
    Action.copy_directory ~into:Target.target_root Source.css

  let process_js_files =
    Action.copy_directory ~into:Target.target_root Source.js

  let process_images_files =
    Action.copy_directory ~into:Target.target_root Source.images

  let process_article ~host file =
    let file_target = Target.(as_html articles file) in
    let open Task in
    Action.write_static_file file_target
      begin
        Pipeline.track_file Source.binary
        >>> Yocaml_yaml.Pipeline.read_file_with_metadata (module Article) file
        >>* (fun (obj, str) -> Eff.return (obj#with_host host, str))
        >>> Yocaml_cmarkit.content_to_html ~strict:false ()
        >>> Yocaml_jingoo.Pipeline.as_template
              (module Article)
              (Source.template "article.html")
        >>> Yocaml_jingoo.Pipeline.as_template
              (module Article)
              (Source.template "layout.html")
        >>> drop_first ()
      end

  let process_articles ~host =
    Action.batch ~only:`Files ~where:(Path.has_extension "md") Source.articles
      (process_article ~host)

  let process_index ~host =
    let file = Source.index in
    let file_target = Target.(as_html pages file) in

    let open Task in
    let compute_index =
      Articles.compute_index
        (module Yocaml_yaml)
        ~where:(Path.has_extension "md")
        ~compute_link:(Target.as_html @@ Path.abs [ "articles" ])
        Source.articles
    in

    Action.write_static_file file_target
      begin
        Pipeline.track_files [ Source.binary; Source.articles ]
        >>> Yocaml_yaml.Pipeline.read_file_with_metadata (module Page) file
        >>> Yocaml_cmarkit.content_to_html ~strict:false ()
        >>> first compute_index
        >>* (fun (obj, str) -> Eff.return (obj#with_host host, str))
        >>> Yocaml_jingoo.Pipeline.as_template ~strict:true
              (module Articles)
              (Source.template "index.html")
        >>> Yocaml_jingoo.Pipeline.as_template ~strict:true
              (module Articles)
              (Source.template "layout.html")
        >>> drop_first ()
      end

  let process_tags ~host =
    let file = Source.tags in
    let file_target = Target.(as_html pages file) in

    let open Task in
    let compute_index =
      Articles.compute_index
        (module Yocaml_yaml)
        ~where:(Path.has_extension "md")
        ~compute_link:(Target.as_html @@ Path.abs [ "articles" ])
        Source.articles
    in

    Action.write_static_file file_target
      begin
        Pipeline.track_files [ Source.binary; Source.articles ]
        >>> Yocaml_yaml.Pipeline.read_file_with_metadata (module Page) file
        >>> Yocaml_cmarkit.content_to_html ~strict:false ()
        >>> first compute_index
        >>* (fun (obj, str) ->
              Eff.return (Tags.of_articles (obj#with_host host), str))
        >>> Yocaml_jingoo.Pipeline.as_template ~strict:true
              (module Tags)
              (Source.template "tags.html")
        >>> Yocaml_jingoo.Pipeline.as_template ~strict:true
              (module Tags)
              (Source.template "layout.html")
        >>> drop_first ()
      end

  let feed_title = "The Robur's blog"
  let site_url = "https://blog.robur.coop"
  let feed_description = "The Robur cooperative blog"

  let fetch_articles =
    let open Task in
    Pipeline.track_files [ Source.binary; Source.articles ]
    >>> Articles.fetch
          (module Yocaml_yaml)
          ~where:(Path.has_extension "md")
          ~compute_link:(Target.as_html @@ Path.abs [ "articles" ])
          Source.articles

  let rss2 =
    let open Task in
    let from_articles ~title ~site_url ~description ~feed_url () =
      let open Yocaml_syndication in
      lift
        begin
          fun articles ->
            let last_build_date =
              List.fold_left
                begin
                  fun acc (_, elt) ->
                    let v = Date.to_archetype_date_time (Article.date elt) in
                    match acc with
                    | None -> Some v
                    | Some a ->
                        if Archetype.Datetime.compare a v > 0 then Some a
                        else Some v
                end
                None articles
              |> Option.map Datetime.make
            in
            let feed =
              Rss2.feed ?last_build_date ~title ~link:site_url ~url:feed_url
                ~description
                begin
                  fun (path, article) ->
                    let title = Article.title article in
                    let link = site_url ^ Path.to_string path in
                    let guid = Rss2.guid_from_link in
                    let description = Article.description article in
                    let pub_date =
                      Datetime.make
                        (Date.to_archetype_date_time (Article.date article))
                    in
                    Rss2.item ~title ~link ~guid ~description ~pub_date ()
                end
                articles
            in
            Xml.to_string feed
        end
    in
    Action.write_static_file Target.rss2
      begin
        fetch_articles
        >>> from_articles ~title:feed_title ~site_url
              ~description:feed_description
              ~feed_url:"https://blog.robur.coop/feed.xml" ()
      end

  let process_all ~host =
    let open Eff in
    Action.restore_cache ~on:`Source Source.cache
    >>= process_css_files >>= process_js_files >>= process_images_files
    >>= process_tags ~host
    >>= process_articles ~host >>= process_index ~host >>= rss2
    >>= Action.store_cache ~on:`Source Source.cache
end

module Make (S : sig
  val source : Path.t
end) =
Make_with_target (struct
  include S

  let target = Path.(source / "_site")
end)

14 bin/blog.mli Normal file
@@ -0,0 +1,14 @@
module Make_with_target (_ : sig
  val source : Yocaml.Path.t
  val target : Yocaml.Path.t
end) : sig
  val target : Yocaml.Path.t
  val process_all : host:string -> unit Yocaml.Eff.t
end

module Make (_ : sig
  val source : Yocaml.Path.t
end) : sig
  val target : Yocaml.Path.t
  val process_all : host:string -> unit Yocaml.Eff.t
end

24 bin/dune Normal file
@@ -0,0 +1,24 @@
(executable
 (name watch)
 (libraries
  yocaml
  yocaml_syndication
  yocaml_yaml
  yocaml_jingoo
  yocaml_cmarkit
  yocaml_unix))

(executable
 (name push)
 (libraries
  fmt.tty
  logs.fmt
  git-unix
  bos
  yocaml
  yocaml_git
  yocaml_syndication
  yocaml_yaml
  yocaml_jingoo
  yocaml_cmarkit
  yocaml_unix))

82 bin/push.ml Normal file
@@ -0,0 +1,82 @@
let reporter ppf =
  let report src level ~over k msgf =
    let k _ =
      over ();
      k ()
    in
    let with_metadata header _tags k ppf fmt =
      Format.kfprintf k ppf
        ("%a[%a]: " ^^ fmt ^^ "\n%!")
        Logs_fmt.pp_header (level, header)
        Fmt.(styled `Magenta string)
        (Logs.Src.name src)
    in
    msgf @@ fun ?header ?tags fmt -> with_metadata header tags k ppf fmt
  in
  { Logs.report }

let run_git_rev_parse () =
  let open Bos in
  let value =
    OS.Cmd.run_out
      Cmd.(v "git" % "describe" % "--always" % "--dirty"
           % "--exclude=*" % "--abbrev=0")
  in
  match OS.Cmd.out_string value with
  | Ok (value, (_, `Exited 0)) -> Some value
  | Ok (value, (run_info, _)) ->
      Logs.warn (fun m ->
          m "Failed to get commit id: %a: %s"
            Cmd.pp (OS.Cmd.run_info_cmd run_info)
            value);
      None
  | Error (`Msg e) ->
      Logs.warn (fun m -> m "Failed to get commit id: %s" e);
      None

let message () =
  match run_git_rev_parse () with
  | Some hash -> Fmt.str "Pushed by YOCaml 2 from %s" hash
  | None -> Fmt.str "Pushed by YOCaml 2"

let () = Fmt_tty.setup_std_outputs ~style_renderer:`Ansi_tty ~utf_8:true ()
let () = Logs.set_reporter (reporter Fmt.stdout)

(* let () = Logs.set_level ~all:true (Some Logs.Debug) *)

let author = ref "The Robur Team"
let email = ref "team@robur.coop"
let message = ref (message ())
let remote = ref "git@git.robur.coop:robur/blog.robur.coop.git#gh-pages"
let host = ref "https://blog.robur.coop"

module Source = Yocaml_git.From_identity (Yocaml_unix.Runtime)

let usage =
  Fmt.str
    "%s [--message <message>] [--author <author>] [--email <email>] -r \
     <repository>#<branch>"
    Sys.argv.(0)

let specification =
  [
    ("--message", Arg.Set_string message, "The commit message")
  ; ("--email", Arg.Set_string email, "The email used to craft the commit")
  ; ("-r", Arg.Set_string remote, "The Git repository including #branch, e.g. " ^ !remote)
  ; ("--author", Arg.Set_string author, "The Git commit author")
  ; ("--host", Arg.Set_string host, "The host where the blog is available")
  ]

let () =
  Arg.parse specification ignore usage;
  let author = !author
  and email = !email
  and message = !message
  and remote = !remote in
  let module Blog = Blog.Make_with_target (struct
    let source = Yocaml.Path.rel []
    let target = Yocaml.Path.rel []
  end) in
  Yocaml_git.run
    (module Source)
    (module Pclock)
    ~context:`SSH ~author ~email ~message ~remote
    (fun () -> Blog.process_all ~host:!host)
  |> Lwt_main.run
  |> Result.iter_error (fun (`Msg err) -> invalid_arg err)

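Since `run_git_rev_parse` shells out to `git describe --always --dirty --exclude=* --abbrev=0`, the commit message embeds the current short hash when the command succeeds. Illustratively (the hash below is made up):

(* With HEAD at, say, 0123abc and a clean tree:
     message () = "Pushed by YOCaml 2 from 0123abc"
   and, when git describe fails:
     message () = "Pushed by YOCaml 2" *)
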
15 bin/watch.ml Normal file
@@ -0,0 +1,15 @@
let port = ref 8000
let usage = Fmt.str "%s [--port <port>]" Sys.argv.(0)

let specification =
  [ ("--port", Arg.Set_int port, "The port where we serve the website") ]

module Dest = Blog.Make (struct
  let source = Yocaml.Path.rel []
end)

let () =
  Arg.parse specification ignore usage;
  let host = Fmt.str "http://localhost:%d" !port in
  Yocaml_unix.serve ~level:`Info ~target:Dest.target ~port:!port
  @@ fun () -> Dest.process_all ~host

12 blogger.opam
@@ -5,7 +5,7 @@ maintainer: "romain.calascibetta@gmail.com"
authors: [ "The XHTMLBoy <xhtmlboi@gmail.com>" ]

build: [
  [ "dune" "subst" ]
  [ "dune" "subst" ] {dev}
  [ "dune" "build" "-p" name "-j" jobs ]
  [ "dune" "runtest" "-p" name ] {with-test}
  [ "dune" "build" "@doc" "-p" name ] {with-doc}
@@ -18,16 +18,18 @@ dev-repo: "git://github.com/dinosaure/blogger.git"
bug-reports: "https://github.com/dinosaure/blogger/issues"

depends: [
  "ocaml" { >= "4.11.1" }
  "dune" { >= "2.8" }
  "ocaml" { >= "5.1.0" }
  "dune" { >= "3.16.0" }
  "preface" { >= "0.1.0" }
  "logs" {>= "0.7.0" }
  "cmdliner" { >= "1.0.0"}
  "http-lwt-client"
  "yocaml"
  "bos"
  "yocaml" {>= "2.0.1"}
  "yocaml_unix"
  "yocaml_yaml"
  "yocaml_cmark"
  "yocaml_cmarkit"
  "yocaml_git"
  "yocaml_jingoo"
  "yocaml_syndication"
]

@@ -197,6 +197,10 @@ article code {
  color: #fff;
}

.tag-box:target > h3 > span {
  background-color: #c2410c;
}

.tag-box > h3 > span::before {
  content: "#";
}

@@ -1,2 +1,2 @@
(lang dune 2.8)
(lang dune 3.16)
(name blogger)

BIN images/finances.png Normal file (binary file not shown; after: 6.3 KiB)
BIN images/smtp.jpg Normal file (binary file not shown; after: 128 KiB)

332 src/blogger.ml
@@ -1,332 +0,0 @@
let caller = Sys.argv.(0)
let version = "%%VERSION%%"
let default_port = 8888
let default_target = Fpath.v "_site"

let program ~target =
  let open Yocaml in
  let* () = Task.move_javascript target in
  let* () = Task.move_css target in
  let* () = Task.move_images target in
  let* () = Task.process_articles target in
  let* () = Task.generate_feed target in
  let* () = Task.generate_tags target in
  Task.generate_index target

let local_build _quiet target =
  Yocaml_unix.execute (program ~target:(Fpath.to_string target))

module SSH = struct
  open Lwt.Infix

  type error = Unix.error * string * string
  type write_error = [ `Closed | `Error of Unix.error * string * string ]

  let pp_error ppf (err, f, v) =
    Fmt.pf ppf "%s(%s): %s" f v (Unix.error_message err)

  let pp_write_error ppf = function
    | `Closed -> Fmt.pf ppf "Connection closed by peer"
    | `Error (err, f, v) -> Fmt.pf ppf "%s(%s): %s" f v (Unix.error_message err)

  type flow = { ic : in_channel; oc : out_channel }

  type endpoint = {
    user : string;
    path : string;
    host : Unix.inet_addr;
    port : int;
    capabilities : [ `Rd | `Wr ];
  }

  let pp_inet_addr ppf inet_addr =
    Fmt.string ppf (Unix.string_of_inet_addr inet_addr)

  let connect { user; path; host; port; capabilities } =
    let edn = Fmt.str "%s@%a" user pp_inet_addr host in
    let cmd =
      match capabilities with
      | `Wr -> Fmt.str {sh|git-receive-pack '%s'|sh} path
      | `Rd -> Fmt.str {sh|git-upload-pack '%s'|sh} path
    in
    let cmd = Fmt.str "ssh -p %d %s %a" port edn Fmt.(quote string) cmd in
    try
      let ic, oc = Unix.open_process cmd in
      Lwt.return_ok { ic; oc }
    with Unix.Unix_error (err, f, v) -> Lwt.return_error (`Error (err, f, v))

  let read t =
    let tmp = Bytes.create 0x1000 in
    try
      let len = input t.ic tmp 0 0x1000 in
      if len = 0 then Lwt.return_ok `Eof
      else Lwt.return_ok (`Data (Cstruct.of_bytes tmp ~off:0 ~len))
    with Unix.Unix_error (err, f, v) -> Lwt.return_error (err, f, v)

  let write t cs =
    let str = Cstruct.to_string cs in
    try
      output_string t.oc str;
      flush t.oc;
      Lwt.return_ok ()
    with Unix.Unix_error (err, f, v) -> Lwt.return_error (`Error (err, f, v))

  let writev t css =
    let rec go t = function
      | [] -> Lwt.return_ok ()
      | x :: r -> (
          write t x >>= function
          | Ok () -> go t r
          | Error _ as err -> Lwt.return err)
    in
    go t css

  let close t =
    close_in t.ic;
    close_out t.oc;
    Lwt.return_unit

  let shutdown t mode =
    match mode with
    | `read -> close_in t.ic; Lwt.return_unit
    | `write -> close_out t.oc; Lwt.return_unit
    | `read_write -> close t
end

let ssh_edn, ssh_protocol = Mimic.register ~name:"ssh" (module SSH)

let unix_ctx_with_ssh () =
  let open Lwt.Infix in
  Git_unix.ctx (Happy_eyeballs_lwt.create ()) >|= fun ctx ->
  let open Mimic in
  let k0 scheme user path host port capabilities =
    match (scheme, Unix.gethostbyname host) with
    | `SSH, { Unix.h_addr_list; _ } when Array.length h_addr_list > 0 ->
        Lwt.return_some
          { SSH.user; path; host = h_addr_list.(0); port; capabilities }
    | _ -> Lwt.return_none
  in
  ctx
  |> Mimic.fold Smart_git.git_transmission
       Fun.[ req Smart_git.git_scheme ]
       ~k:(function `SSH -> Lwt.return_some `Exec | _ -> Lwt.return_none)
  |> Mimic.fold ssh_edn
       Fun.
         [
           req Smart_git.git_scheme;
           req Smart_git.git_ssh_user;
           req Smart_git.git_path;
           req Smart_git.git_hostname;
           dft Smart_git.git_port 22;
           req Smart_git.git_capabilities;
         ]
       ~k:k0

let run_git_config key = function
  | Some value -> Some value
  | None -> (
      let open Bos in
      let value = OS.Cmd.run_out Cmd.(v "git" % "config" % "--global" % key) in
      match OS.Cmd.out_string value with
      | Ok (value, _) -> Some value
      | Error _ -> None)

let run_git_rev_parse default =
  let open Bos in
  let value =
    OS.Cmd.run_out
      Cmd.(v "git" % "describe" % "--always" % "--dirty"
           % "--exclude=*" % "--abbrev=0")
  in
  match OS.Cmd.out_string value with
  | Ok (value, (_, `Exited 0)) -> value
  | Ok (value, (run_info, _)) ->
      Logs.warn (fun m ->
          m "Failed to get commit id: %a: %s"
            Cmd.pp (OS.Cmd.run_info_cmd run_info)
            value);
      default
  | Error (`Msg e) ->
      Logs.warn (fun m -> m "Failed to get commit id: %s" e);
      default

let get_name_and_email name email =
  let name = run_git_config "user.name" name in
  let email = run_git_config "user.email" email in
  (name, email)

let name_and_email =
  let name_arg =
    let doc = "Name of the committer." in
    Cmdliner.Arg.(value & opt (some string) None & info [ "name" ] ~doc)
  in
  let email_arg =
    let doc = "Email of the committer." in
    Cmdliner.Arg.(value & opt (some string) None & info [ "email" ] ~doc)
  in
  Cmdliner.Term.(const get_name_and_email $ name_arg $ email_arg)

let build_and_push _quiet remote (author, email) hook =
  let fiber () =
    let open Lwt.Syntax in
    let commit_id = run_git_rev_parse "an unknown state" in
    let comment = Printf.sprintf "Built from %s" commit_id in
    let* ctx = unix_ctx_with_ssh () in
    let* res =
      Yocaml_git.execute
        (module Yocaml_unix)
        (module Pclock)
        ~ctx ?author ?email ~comment remote (program ~target:"")
    in
    match res with
    | Error (`Msg err) -> Fmt.failwith "build-and-push: %s." err
    | Ok () -> (
        match hook with
        | None -> Lwt.return_unit
        | Some hook -> (
            let open Lwt.Infix in
            Http_lwt_client.request ~config:(`HTTP_1_1 Httpaf.Config.default)
              ~meth:`GET (Uri.to_string hook)
              (fun _ () _ -> Lwt.return_unit)
              ()
            >>= function
            | Ok (_response, ()) -> Lwt.return_unit
            | Error (`Msg err) -> failwith err))
  in
  Lwt_main.run (fiber ())

let watch quiet target potential_port =
  let port = Option.value ~default:default_port potential_port in
  let () = local_build quiet target in
  let target = Fpath.to_string target in
  let server = Yocaml_unix.serve ~filepath:target ~port (program ~target) in
  Lwt_main.run server

let common_options = "COMMON OPTIONS"

let verbosity =
  let open Cmdliner in
  let env = Cmd.Env.info "BLOGGER_LOGS" in
  Logs_cli.level ~docs:common_options ~env ()

let renderer =
  let open Cmdliner in
  let env = Cmd.Env.info "BLOGGER_FMT" in
  Fmt_cli.style_renderer ~docs:common_options ~env ()

let utf_8 =
  let open Cmdliner in
  let doc = "Allow binaries to emit UTF-8 characters." in
  let env = Cmd.Env.info "BLOGGER_UTF_8" in
  Arg.(value & opt bool true & info [ "with-utf-8" ] ~doc ~env)

let reporter ppf =
  let report src level ~over k msgf =
    let k _ =
      over ();
      k ()
    in
    let with_metadata header _tags k ppf fmt =
      Fmt.kpf k ppf
        ("%a[%a]: " ^^ fmt ^^ "\n%!")
        Logs_fmt.pp_header (level, header)
        Fmt.(styled `Magenta string)
        (Logs.Src.name src)
    in
    msgf @@ fun ?header ?tags fmt -> with_metadata header tags k ppf fmt
  in
  { Logs.report }

let setup_logs utf_8 style_renderer level =
  Fmt_tty.setup_std_outputs ~utf_8 ?style_renderer ();
  Logs.set_level level;
  Logs.set_reporter (reporter Fmt.stderr);
  Option.is_none level

let setup_logs = Cmdliner.Term.(const setup_logs $ utf_8 $ renderer $ verbosity)

let man =
  let open Cmdliner in
  [ `S Manpage.s_authors; `P "blog.robur.coop" ]

let build_cmd =
  let open Cmdliner in
  let doc = Format.asprintf "Build the blog into the specified directory" in
  let exits = Cmd.Exit.defaults in
  let info = Cmd.info "build" ~version ~doc ~exits ~man in
  let path_arg =
    let doc =
      Format.asprintf "Specify where we build the website (default: %a)"
        Fpath.pp default_target
    in
    let arg = Arg.info ~doc [ "destination" ] in
    Arg.(value & opt (conv (Fpath.of_string, Fpath.pp)) default_target & arg)
  in
  Cmd.v info Term.(const local_build $ setup_logs $ path_arg)

let watch_cmd =
  let open Cmdliner in
  let doc =
    Format.asprintf
      "Serve from the specified directory as an HTTP server and rebuild \
       website on demand"
  in
  let exits = Cmd.Exit.defaults in
  let path_arg =
    let doc =
      Format.asprintf "Specify where we build the website (default: %a)"
        Fpath.pp default_target
    in
    let arg = Arg.info ~doc [ "destination" ] in
    Arg.(value & opt (conv (Fpath.of_string, Fpath.pp)) default_target & arg)
  in
  let port_arg =
    let doc = Format.asprintf "The port (default: %d)" default_port in
    let arg = Arg.info ~doc [ "port"; "P"; "p" ] in
    Arg.(value & opt (some int) None & arg)
  in
  let info = Cmd.info "watch" ~version ~doc ~exits ~man in
  Cmd.v info Term.(const watch $ setup_logs $ path_arg $ port_arg)

let push_cmd =
  let open Cmdliner in
  let doc =
    Format.asprintf
      "Push the blog (from the specified directory) into a Git repository"
  in
  let exits = Cmd.Exit.defaults in
  let remote_arg =
    let remote =
      let parser str =
        match Smart_git.Endpoint.of_string str with
        | Ok _ -> Ok str
        | Error _ as err -> err
      in
      Arg.conv (parser, Fmt.string)
    in
    let doc = "The remote Git repository" in
    let arg = Arg.info ~doc [ "r"; "remote" ] in
    Arg.(required & opt (some remote) None & arg)
  in
  let hook_arg =
    let doc = "The URL of the hook to update the unikernel" in
    let arg = Arg.info ~doc [ "h"; "hook" ] in
    let of_string str =
      match Uri.of_string str with
      | v -> Ok v
      | exception _ -> Rresult.R.error_msgf "Invalid URI: %s" str
    in
    Arg.(value & opt (some (conv (of_string, Uri.pp))) None & arg)
  in
  let info = Cmd.info "push" ~version ~doc ~exits ~man in
  Cmd.v info
    Term.(
      const build_and_push $ setup_logs $ remote_arg $ name_and_email $ hook_arg)

let cmd =
  let open Cmdliner in
  let sdocs = Manpage.s_common_options in
  let doc = "Build, push or serve my personal website" in
  let default_info = Cmd.info caller ~version ~doc ~sdocs ~man in
  let default = Term.(ret (const (`Help (`Pager, None)))) in
  Cmd.group ~default default_info [ build_cmd; watch_cmd; push_cmd ]

let () = exit @@ Cmdliner.Cmd.eval cmd

@@ -1,64 +0,0 @@
open Yocaml

let get_article (module V : Metadata.VALIDABLE) article_file =
  let arr =
    Build.read_file_with_metadata
      (module V)
      (module Model.Article)
      article_file
  in
  let deps = Build.get_dependencies arr in
  let task = Build.get_task arr in
  let+ meta, _ = task () in
  deps, (meta, Model.article_path article_file)
;;

let get_articles (module V : Metadata.VALIDABLE) path =
  let* files = read_child_files path File.is_markdown in
  let+ articles = Traverse.traverse (get_article (module V)) files in
  let deps, effects = List.split articles in
  Deps.Monoid.reduce deps, effects
;;

module Articles = struct
  type t = (Model.Article.t * Filepath.t) list

  let get_all (module V : Metadata.VALIDABLE) ?(decreasing = true) path =
    let+ deps, articles = get_articles (module V) path in
    let sorted_article = Model.Articles.sort ~decreasing articles in
    Build.make deps (fun x -> return (x, sorted_article))
  ;;
end

module Tags = struct
  module M = Map.Make (String)

  let by_quantity ?(decreasing = true) (_, a) (_, b) =
    let r = Int.compare $ List.length a $ List.length b in
    if decreasing then ~-r else r
  ;;

  let group metas =
    List.fold_left
      (fun accumulator (article, path) ->
        List.fold_left
          (fun map tag ->
            match M.find_opt tag map with
            | Some articles -> M.add tag ((article, path) :: articles) map
            | None -> M.add tag [ article, path ] map)
          accumulator
          (Model.Article.tags article))
      M.empty
      metas
    |> M.map
         (List.sort (fun (a, _) (b, _) -> Model.Article.compare_by_date a b))
    |> M.to_seq
    |> List.of_seq
    |> List.sort by_quantity
  ;;

  let compute (module V : Metadata.VALIDABLE) path =
    let+ deps, articles = get_articles (module V) path in
    deps, group articles
  ;;
end

@@ -1,18 +0,0 @@
open Yocaml

module Articles : sig
  type t = (Model.Article.t * Filepath.t) list

  val get_all
    :  (module Metadata.VALIDABLE)
    -> ?decreasing:bool
    -> Filepath.t
    -> ('a, 'a * t) Build.t Effect.t
end

module Tags : sig
  val compute
    :  (module Metadata.VALIDABLE)
    -> Filepath.t
    -> (Deps.t * (string * (Model.Article.t * string) list) list) Effect.t
end

21 src/dune
@@ -1,21 +0,0 @@
(executable
 (name blogger)
 (libraries
  logs
  logs.fmt
  logs.cli
  fmt
  fmt.tty
  fmt.cli
  cmdliner
  preface
  mirage-clock-unix
  http-lwt-client
  git-unix
  cmarkit
  yocaml
  yocaml_yaml
  yocaml_cmark
  yocaml_unix
  yocaml_git
  yocaml_jingoo))

15 src/feed.ml
@@ -1,15 +0,0 @@
open Yocaml

let domain = "https://blog.robur.coop"
let feed_url = into domain "feed.xml"

let articles_to_items articles =
  List.map
    (fun (article, url) -> Model.Article.to_rss_item (into domain url) article)
    articles

let make ((), articles) =
  Yocaml.Rss.Channel.make ~title:"Robur's blog" ~link:domain ~feed_link:feed_url
    ~description:"The Robur cooperative blog" ~generator:"yocaml"
    ~webmaster:"team@robur.coop"
    (articles_to_items articles)

@@ -1 +0,0 @@
val make : unit * Collection.Articles.t -> Yocaml.Rss.Channel.t

18 src/file.ml
@@ -1,18 +0,0 @@
open Yocaml

let is_css = with_extension "css"
let is_javascript = with_extension "js"

let is_image =
  let open Preface.Predicate in
  with_extension "png"
  || with_extension "svg"
  || with_extension "jpg"
  || with_extension "jpeg"
  || with_extension "gif"
;;

let is_markdown =
  let open Preface.Predicate in
  with_extension "md" || with_extension "markdown"
;;

@@ -1,6 +0,0 @@
open Yocaml

val is_css : Filepath.t -> bool
val is_javascript : Filepath.t -> bool
val is_image : Filepath.t -> bool
val is_markdown : Filepath.t -> bool

256 src/model.ml
@@ -1,256 +0,0 @@
open Yocaml

let article_path file =
  let filename = basename $ replace_extension file "html" in
  filename |> into "articles"

let tag_path tag = add_extension tag "html" |> into "tags"

module Author = struct
  type t = {
    name : string;
    link : string;
    email : string;
    avatar : string option;
  }

  let equal a b =
    String.equal a.name b.name && String.equal a.link b.link
    && String.equal a.email b.email
    && Option.equal String.equal a.avatar b.avatar

  let make name link email avatar = { name; link; email; avatar }

  let from (type a) (module V : Metadata.VALIDABLE with type t = a) obj =
    V.object_and
      (fun assoc ->
        let open Validate.Applicative in
        make
        <$> V.(required_assoc string) "name" assoc
        <*> V.(required_assoc string) "link" assoc
        <*> V.(required_assoc string) "email" assoc
        <*> V.(optional_assoc string) "avatar" assoc)
      obj

  let default_user =
    make "robur" "https://blog.robur.coop/" "team@robur.coop" None

  let gravatar email =
    let tk = String.(lowercase_ascii $ trim email) in
    let hs = Digest.(to_hex $ string tk) in
    "https://www.gravatar.com/avatar/" ^ hs

  let inject (type a) (module D : Key_value.DESCRIBABLE with type t = a)
      { name; link; email; avatar } =
    let avatar = match avatar with Some uri -> uri | None -> gravatar email in
    D.
      [
        ("name", string name);
        ("link", string link);
        ("email", string email);
        ("avatar", string avatar);
      ]
end

module Co_author = struct
  type t = { author : Author.t; contribution : string }

  let make author contribution = { author; contribution }

  let from (type a) (module V : Metadata.VALIDABLE with type t = a) obj =
    V.object_and
      (fun assoc ->
        let open Validate.Applicative in
        make
        <$> V.(required_assoc (Author.from (module V))) "author" assoc
        <*> V.(required_assoc string) "contribution" assoc)
      obj

  let inject (type a) (module D : Key_value.DESCRIBABLE with type t = a)
      { author; contribution } =
    D.
      [
        ("author", object_ $ Author.inject (module D) author);
        ("contribution", string contribution);
      ]
end

module Article = struct
  type t = {
    article_title : string;
    article_description : string;
    tags : string list;
    date : Date.t;
    title : string option;
    description : string option;
    author : Author.t;
    co_authors : Co_author.t list;
    invited_article : bool;
  }

  let date { date; _ } = date
  let tags { tags; _ } = tags

  let escape_string str =
    let renderer = Cmarkit_renderer.make () in
    let buffer = Buffer.create (String.length str) in
    let ctx = Cmarkit_renderer.Context.make renderer buffer in
    Cmarkit_html.html_escaped_string ctx str;
    Buffer.contents buffer

  let to_rss_item url article =
    let title = escape_string article.article_title in
    let description = escape_string article.article_description in
    Rss.(
      Item.make ~title ~link:url ~pub_date:article.date ~description
        ~guid:(Guid.link url) ())

  let make article_title article_description tags date title description author
      co_authors =
    let author = Option.value ~default:Author.default_user author in
    let invited_article = not (Author.equal author Author.default_user) in
    {
      article_title;
      article_description;
      tags = List.map String.lowercase_ascii tags;
      date;
      title;
      description;
      author;
      co_authors;
      invited_article;
    }

  let from_string (module V : Metadata.VALIDABLE) = function
    | None -> Validate.error $ Error.Required_metadata [ "Article" ]
    | Some str ->
        let open Validate.Monad in
        V.from_string str
        >>= V.object_and (fun assoc ->
                let open Validate.Applicative in
                make
                <$> V.(required_assoc string) "article.title" assoc
                <*> V.(required_assoc string) "article.description" assoc
                <*> V.(optional_assoc_or ~default:[] (list_of string))
                      "tags" assoc
                <*> V.required_assoc
                      (Metadata.Date.from (module V))
                      "date" assoc
                <*> V.(optional_assoc string) "title" assoc
                <*> V.(optional_assoc string) "description" assoc
                <*> V.(optional_assoc (Author.from (module V))) "author" assoc
                <*> V.(
                      optional_assoc_or ~default:[]
                        (list_of (Co_author.from (module V)))
                        "coauthors" assoc))

  let inject (type a) (module D : Key_value.DESCRIBABLE with type t = a)
      {
        article_title;
        article_description;
        tags;
        date;
        title;
        description;
        author;
        co_authors;
        invited_article;
      } =
    let co_authors =
      List.map (fun c -> D.object_ $ Co_author.inject (module D) c) co_authors
    in
    let has_co_authors = match co_authors with [] -> false | _ -> true in
    D.
      [
        ("root", string "..");
        ( "metadata",
          object_
            [
              ("title", string article_title);
              ("description", string article_description);
            ] );
        ("tags", list (List.map string tags));
        ("date", object_ $ Metadata.Date.inject (module D) date);
        ("author", object_ $ Author.inject (module D) author);
        ("coauthors", list co_authors);
        ("invited", boolean invited_article);
        ("has_coauthors", boolean has_co_authors);
      ]
    @ Metadata.Page.inject (module D) (Metadata.Page.make title description)

  let compare_by_date a b = Date.compare a.date b.date
end

module Articles = struct
  type t = {
    articles : (Article.t * string) list;
    title : string option;
    description : string option;
  }

  let make ?title ?description articles = { articles; title; description }
  let title p = p.title
  let description p = p.description
  let articles p = p.articles
  let set_articles new_articles p = { p with articles = new_articles }
  let set_title new_title p = { p with title = new_title }
  let set_description new_desc p = { p with description = new_desc }

  let sort ?(decreasing = true) articles =
    List.sort
      (fun (a, _) (b, _) ->
        let a_date = Article.date a and b_date = Article.date b in
        let r = Date.compare a_date b_date in
        if decreasing then ~-r else r)
      articles

  let sort_articles_by_date ?(decreasing = true) p =
    { p with articles = sort ~decreasing p.articles }

  let inject (type a) (module D : Key_value.DESCRIBABLE with type t = a)
      { articles; title; description } =
    ( "articles",
      D.list
        (List.map
           (fun (article, url) ->
             D.object_
               (("url", D.string url) :: Article.inject (module D) article))
           articles) )
    :: ("root", D.string ".")
    :: (Metadata.Page.inject (module D) $ Metadata.Page.make title description)
end

let article_object (type a) (module D : Key_value.DESCRIBABLE with type t = a)
    (article, url) =
  D.object_ (("url", D.string url) :: Article.inject (module D) article)

module Tag = struct
  type t = {
    tag : string;
    tags : (string * int) list;
    articles : (Article.t * string) list;
    title : string option;
    description : string option;
  }

  let make ?title ?description tag articles tags =
    { tag; tags; articles = Articles.sort articles; title; description }

  let inject (type a) (module D : Key_value.DESCRIBABLE with type t = a)
      { tag; tags; articles; title; description } =
    ("tag", D.string tag)
    :: ("root", D.string "..")
    :: ("articles", D.list (List.map (article_object (module D)) articles))
    :: ( "tags",
         D.list
           (List.map
              (fun (tag, n) ->
                D.object_
                  [
                    ("name", D.string tag);
                    ("link", D.string (tag_path tag));
                    ("number", D.integer n);
                  ])
              tags) )
    :: (Metadata.Page.inject (module D) $ Metadata.Page.make title description)
end

@@ -1,55 +0,0 @@
open Yocaml

val article_path : Filepath.t -> Filepath.t
val tag_path : string -> Filepath.t

module Article : sig
  type t

  val date : t -> Date.t
  val tags : t -> string list
  val to_rss_item : string -> t -> Rss.Item.t
  val compare_by_date : t -> t -> int

  include Metadata.INJECTABLE with type t := t
  include Metadata.READABLE with type t := t
end

module Tag : sig
  type t

  val make
    :  ?title:string
    -> ?description:string
    -> string
    -> (Article.t * string) list
    -> (string * int) list
    -> t

  include Metadata.INJECTABLE with type t := t
end

module Articles : sig
  type t

  val make
    :  ?title:string
    -> ?description:string
    -> (Article.t * string) list
    -> t

  val sort
    :  ?decreasing:bool
    -> (Article.t * string) list
    -> (Article.t * string) list

  val sort_articles_by_date : ?decreasing:bool -> t -> t
  val articles : t -> (Article.t * string) list
  val title : t -> string option
  val description : t -> string option
  val set_title : string option -> t -> t
  val set_description : string option -> t -> t
  val set_articles : (Article.t * string) list -> t -> t

  include Metadata.INJECTABLE with type t := t
end

108 src/task.ml
@@ -1,108 +0,0 @@
open Yocaml
module Metaformat = Yocaml_yaml
module Markup = Yocaml_cmark
module Template = Yocaml_jingoo

let css_target target = "css" |> into target
let javascript_target target = "js" |> into target
let images_target target = "images" |> into target
let template file = add_extension file "html" |> into "templates"
let article_template = template "article"
let layout_template = template "layout"
let list_template = template "list_articles"
let article_target file target = Model.article_path file |> into target
let binary_update = Build.watch Sys.argv.(0)
let index_html target = "index.html" |> into target
let index_md = "index.md" |> into "pages"
let rss_feed target = "feed.xml" |> into target
let tag_file tag target = Model.tag_path tag |> into target
let tag_template = template "tag"

let move_css target =
  process_files
    [ "css" ]
    File.is_css
    (Build.copy_file ~into:(css_target target))
;;

let move_javascript target =
  process_files
    [ "js" ]
    File.is_javascript
    (Build.copy_file ~into:(javascript_target target))
;;

let move_images target =
  process_files
    [ "images" ]
    File.is_image
    (Build.copy_file ~into:(images_target target))
;;

let process_articles target =
  process_files [ "articles" ] File.is_markdown (fun article_file ->
    let open Build in
    create_file
      (article_target article_file target)
      (binary_update
       >>> Metaformat.read_file_with_metadata
             (module Model.Article)
             article_file
       >>> Markup.content_to_html ~strict:false ()
       >>> Template.apply_as_template (module Model.Article) article_template
       >>> Template.apply_as_template (module Model.Article) layout_template
       >>^ Stdlib.snd))
;;

let merge_with_page ((meta, content), articles) =
  let title = Metadata.Page.title meta in
  let description = Metadata.Page.description meta in
  Model.Articles.make ?title ?description articles, content
;;

let generate_feed target =
  let open Build in
  let* articles_arrow =
    Collection.Articles.get_all (module Metaformat) "articles"
  in
  create_file
    (rss_feed target)
    (binary_update >>> articles_arrow >>^ Feed.make >>^ Rss.Channel.to_rss)
;;

let generate_tags target =
  let* deps, tags = Collection.Tags.compute (module Metaformat) "articles" in
  let tags_string = List.map (fun (i, s) -> i, List.length s) tags in
  let mk_meta tag articles () = Model.Tag.make tag articles tags_string, "" in
  List.fold_left
    (fun program (tag, articles) ->
      let open Build in
      program
      >> create_file
           (tag_file tag target)
           (init deps
            >>> binary_update
            >>^ mk_meta tag articles
            >>> Template.apply_as_template (module Model.Tag) tag_template
            >>> Template.apply_as_template (module Model.Tag) layout_template
            >>^ Stdlib.snd))
    (return ())
    tags
;;

let generate_index target =
  let open Build in
  let* articles_arrow =
    Collection.Articles.get_all (module Metaformat) "articles"
  in
  create_file
    (index_html target)
    (binary_update
     >>> Metaformat.read_file_with_metadata (module Metadata.Page) index_md
     >>> Markup.content_to_html ~strict:false ()
     >>> articles_arrow
     >>^ merge_with_page
     >>> Template.apply_as_template (module Model.Articles) list_template
     >>> Template.apply_as_template (module Model.Articles) layout_template
     >>^ Stdlib.snd)
;;

@@ -1,7 +0,0 @@
val move_css : string -> unit Yocaml.Effect.t
val move_images : string -> unit Yocaml.Effect.t
val move_javascript : string -> unit Yocaml.Effect.t
val process_articles : string -> unit Yocaml.Effect.t
val generate_feed : string -> unit Yocaml.Effect.t
val generate_index : string -> unit Yocaml.Effect.t
val generate_tags : string -> unit Yocaml.Effect.t

@@ -1,13 +1,13 @@
<a href="/index.html">Back to index</a>

<article>
<h1>{{ metadata.title }}</h1>
<h1>{{ title }}</h1>
<ul class="tags-list">
{%- for tag in tags -%}
<li><a href="/tags/{{ tag }}.html">{{ tag }}</a></li>
<li><a href="/tags.html#tag-{{ tag }}">{{ tag }}</a></li>
{%- endfor -%}
</ul>
{%- autoescape false -%}
{{ body }}
{{ yocaml_body }}
{% endautoescape -%}
</article>

@@ -1,7 +1,7 @@
<a class="small-button rss" href="./feed.xml">RSS</a>
<a class="small-button rss" href="/feed.xml">RSS</a>

{%- autoescape false -%}
{{ body }}
{{ yocaml_body }}
{% endautoescape -%}

<h3>Essays and ramblings</h3>

@@ -20,13 +20,13 @@
{%- endfor -%}
</div>
<div class="content">
<span class="date">{{ article.date.canonical }}</span>
<a href="{{ article.url }}">{{ article.metadata.title }}</a><br />
<p>{{ article.metadata.description }}</p>
<span class="date">{{ article.date.human }}</span>
<a href="{{ article.url }}">{{ article.title }}</a><br />
<p>{{ article.description }}</p>
<div class="bottom">
<ul class="tags-list">
{%- for tag in article.tags -%}
<li><a href="/tags/{{ tag }}.html">{{ tag }}</a></li>
<li><a href="/tags.html#tag-{{ tag }}">{{ tag }}</a></li>
{%- endfor -%}
</ul>
</div>

@@ -1,20 +1,3 @@
{%- if metadata.title -%}
{%- set dash = " - " -%}
{%- set page_title = metadata.title -%}
{%- elseif title -%}
{%- set dash = " - " -%}
{%- set page_title = title -%}
{%- else -%}
{%- set dash = "" -%}
{%- set page_title = "" -%}
{%- endif %}
{% if metadata.description -%}
{%- set page_description = metadata.description -%}
{%- elseif description -%}
{%- set page_description = description -%}
{%- else -%}
{%- set page_description = "blog.robur.coop" -%}
{%- endif -%}
<!doctype html>
<html lang="en">
<head>

@@ -22,13 +5,13 @@
<meta http-equiv="x-ua-compatible" content="ie=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>
Robur's blog{{ dash }}{{ page_title }}
Robur's blog - {{ title }}
</title>
<meta name="description" content="{{ page_description }}">
<link type="text/css" rel="stylesheet" href="{{ root }}/css/hl.css">
<link type="text/css" rel="stylesheet" href="{{ root }}/css/style.css">
<script src="{{ root }}/js/hl.js"></script>
<link rel="alternate" type="application/rss+xml" href="{{ root }}/feed.xml" title="blog.robur.coop">
<meta name="description" content="{{ description }}">
<link type="text/css" rel="stylesheet" href="/css/hl.css">
<link type="text/css" rel="stylesheet" href="/css/style.css">
<script src="/js/hl.js"></script>
<link rel="alternate" type="application/rss+xml" href="/feed.xml" title="blog.robur.coop">
</head>
<body>
<header>

@@ -39,7 +22,7 @@
</header>
<main>
{%- autoescape false -%}
{{ body }}
{{ yocaml_body }}
{% endautoescape -%}
</main>
<footer>

@@ -1,21 +0,0 @@
<a href="/index.html">Back to index</a>

<ul class="tags-list aeration">
{%- for tag in tags -%}
<li><a href="/{{ tag.link }}">{{ tag.name }} ({{ tag.number }})</a></li>
{%- endfor -%}
</ul>

<div class="tag-box" id="tag-{{ tag }}">
{%- set nb_tags = length (articles) -%}
<h3>
<span>{{ tag }}</span>
{{ nb_tags }}
{%- if nb_tags > 1 %} entries{%- else %} entry{%- endif -%}
</h3>
<ul>
{%- for article in articles -%}
<li><a href="/{{ article.url }}">{{ article.metadata.title }}</a></li>
{%- endfor -%}
</ul>
</div>

20 templates/tags.html Normal file
@@ -0,0 +1,20 @@
<a href="/index.html">Back to index</a>

<ul class="tags-list aeration">
{%- for tag in all_tags -%}
<li><a href="#tag-{{ tag.name }}">{{ tag.name }}</a></li>
{%- endfor -%}
</ul>

{%- for tag in all_tags -%}
<div class="tag-box" id="tag-{{ tag.name }}">
<h3>
<span>{{ tag.name }}</span>
</h3>
<ul>
{%- for article in tag.articles -%}
<li><a href="{{ article.url }}">{{ article.title }}</a></li>
{%- endfor -%}
</ul>
</div>
{%- endfor -%}

@@ -1,6 +1,7 @@
#!/bin/sh

opam exec -- dune exec src/blogger.exe -- push \
opam exec -- dune exec bin/push.exe -- \
  -r git@git.robur.coop:robur/blog.robur.coop.git#gh-pages \
  --name "The Robur team" \
  --host https://blog.robur.coop \
  --author "The Robur team" \
  --email team@robur.coop