191 lines
11 KiB
Text
191 lines
11 KiB
Text
---
|
|
title:
|
|
author: hannes
|
|
tags: mirageos, security, package signing, tls, monitoring, deployment
|
|
abstract:
|
|
---
|
|
|
|
## Working at [robur](https://robur.io)
|
|
|
|
As announced [previously](/Posts/DNS), I started to work at robur early 2018.
|
|
We're a collective of five people, distributed around Europe and the US, with
|
|
the goal to deploy MirageOS unikernels. We do this by developing bespoke
|
|
MirageOS unikernels which provide useful services, and deploy them for ourselves.
|
|
We also develop new libraries and enhance existing ones and other components of MirageOS.
|
|
Example unikernels include [our website](https://robur.io) which uses [Canopy](https://github.com/Engil/Canopy),
|
|
a [CalDAV server that stores entries in a git remote](https://robur.io/Projects/CalDAV),
|
|
and [DNS servers](https://github.com/roburio/unikernels) (the latter two are further described below).
|
|
|
|
Robur is part of the non-profit company [Center for the Cultivation of Technology](https://techcultivation.org),
|
|
who are managing the legal and administrative sides for us. We're ourselves responsible to
|
|
acquire funding to pay ourselves reasonable salaries. We received funding for CalDAV from
|
|
[prototypefund](https://prototypefund.de) and further funding from [Tarides](https://tarids.com),
|
|
for TLS 1.3 from [OCaml Labs](http://ocamllabs.io/); worked for
|
|
[Least Authority](https://leastauthority.com/) on a security audit of an OCaml codebase,
|
|
and received [donations](https://robur.io/Donate), also in
|
|
the form of Bitcoins. We're looking for further funded collaborations and
|
|
also contracting, mail us at `team@robur.io`. Please
|
|
[donate](https://robur.io/Donate) (tax-deductible in EU), so we can accomplish
|
|
our goal of putting robust and sustainable MirageOS unikernels into production,
|
|
replacing insecure legacy system that emit tons of
|
|
CO<span style="vertical-align: baseline; position: relative;bottom: -0.4em;">2</span>.
|
|
|
|
## Deploying MirageOS unikernels
|
|
|
|
While several examples are running since years (the [MirageOS website](https://mirage.io),
|
|
[Bitcoin Piñata](http://ownme.ipredator.se), [TLS demo server](https://tls.nqsb.io), etc.),
|
|
and some shell-scripts for cloud providers are floating around, it is not (yet)
|
|
streamlined.
|
|
|
|
Service deployment is complex: you have to consider its configuration,
|
|
exfiltration of logs and metrics, provisioning with valid key material (TLS
|
|
certificate, hmac shared secret) and authenticators (CA certificate, ssh key
|
|
fingerprint). Instead of requirign millions lines of code during orchestration
|
|
(such as Kubernetes), creating the images (docker), or provisioning (ansible),
|
|
why not minimise the required configuration and dependencies?
|
|
|
|
[Earlier in this blog I introduced Albatross](/Posts/VMM), which serves in an
|
|
enhanced version as our deployment platform on a physical machine (running
|
|
15 unikernels at the moment), I won't discuss more detail thereof in this
|
|
article.
|
|
|
|
## CalDAV
|
|
|
|
[Steffi](https://linse.me/) and I developed in 2018 a CalDAV server. Since
|
|
November 2018 we have a test installation for robur, initially running as a Unix
|
|
process on a virtual machine and persisting data to files on the disk.
|
|
Mid-June 2019 we migrated it to a MirageOS unikernel, thanks to
|
|
great efforts in [git](https://github.com/mirage/ocaml-git) and
|
|
[irmin](https://github.com/mirage/irmin), unikernels can push to a
|
|
remote git repository. We [extended the ssh library](https://github.com/haesbaert/awa-ssh/pull/8)
|
|
with a ssh client and [use this in git](https://github.com/mirage/ocaml-git/pull/362).
|
|
This also means our CalDAV server is completely immutable (does not carry state
|
|
across reboots, apart from the data in the remote repository) and does not have
|
|
persistent state in the form of a block device. Its configuration is mainly done
|
|
at compile time by the selection of libraries (syslog, monitoring, ...), and
|
|
boot arguments passed to the unikernel at startup.
|
|
|
|
We monitored the resource usage when migrating our CalDAV server from Unix
|
|
process to a MirageOS unikernel. The unikernel size is just below 10MB. The
|
|
workload is some clients communicating with the server on a regular basis.
|
|
We use [Grafana](https://grafana.com/) with a
|
|
[influx](https://www.influxdata.com/) time series database to monitor virtual
|
|
machines. Data is collected
|
|
on the host system (`rusage` sysctl, `kinfo_mem` sysctl, `ifdata` sysctl,
|
|
`vm_get_stats` BHyve statistics), and our unikernels these days emit further
|
|
metrics (mostly counters: gc statistics, malloc statistics, tcp sessions, http
|
|
requests and status codes).
|
|
|
|
[<img src="https://berlin.ccc.de/~hannes/crobur-june-2019.png" width="700" />](https://berlin.ccc.de/~hannes/crobur-june-2019.png)
|
|
|
|
Please note that memory usage (upper right)
|
|
and vm exits (lower right) use logarithmic scale. The CPU usage reduced by more
|
|
than a factor of 4. The memory usage dropped by a factor of 25, and the network
|
|
traffic increased - previously we stored log messages on the virtual machine
|
|
itself, now we send them to a dedicated log host.
|
|
|
|
A MirageOS unikernel,
|
|
apart from a smaller attack surface, indeed uses fewer resources and actually emits
|
|
less CO<span style="vertical-align: baseline; position: relative;bottom: -0.4em;">2</span>
|
|
than the same service on a Unix virtual machine. So we're doing something good
|
|
for the environment! :)
|
|
|
|
Our calendar server contains at the moment 63 events, the git repository had
|
|
around 500 commits in the past month: nearly all of them from the CalDAV server
|
|
itself when a client modified data via CalDAV, and two manual commits:
|
|
the initial data imported from the file system, and one commit for fixing a bug
|
|
of the encoder
|
|
in our [icalendar library](https://github.com/roburio/icalendar/pull/2).
|
|
|
|
Our CalDAV implementation is very basic, scheduling, adding attendees (which
|
|
requires sending out eMail), is not supported. But it works well for us, we have
|
|
individual calendars and a shared one which everyone can write to.
|
|
On the client side we use macOS and iOS iCalendar, Android DAVdroid, and
|
|
Thunderbird. If you like to try our CalDAV server, have a look
|
|
[at our installation instructions](https://github.com/roburio/caldav/tree/future/README.md). At the
|
|
moment, not all of our above development is merged and released yet, that's
|
|
why we provide an [opam repository overlay](https://github.com/roburio/git-ssh-dns-mirage3-repo) with various packages pinned to specific
|
|
branches. We recommend to use a custom opam `switch`
|
|
(`opam switch create caldav 4.07.1`) just for the CalDAV unikernel. Please
|
|
[report issues](https://github.com/roburio/caldav/issues) if you find issues
|
|
or struggle with the installation.
|
|
|
|
## DNS
|
|
|
|
There has been more work on our DNS implementation, now
|
|
[here](https://github.com/mirage/ocaml-dns). We included a DNS client library,
|
|
and some [examples](https://github.com/roburio/unikernels/tree/future) are
|
|
available. They as well require our
|
|
[opam repository overlay](https://github.com/roburio/git-ssh-dns-mirage3-repo).
|
|
Please report issues if you run into trouble while experimenting with that.
|
|
|
|
Most prominently is `primary-git`, a unikernel which acts as a primary
|
|
authoritative DNS server (UDP and TCP). On startup, it fetches a remote git
|
|
repository that contains zone files and shared hmac secrets. The zones are
|
|
served, and secondary servers are notified with the respective serial numbers
|
|
of the zones, authenticated using TSIG with the shared secrets. The primary server
|
|
provides dynamic in-protocol updates of DNS resource records (`nsupdate`), and
|
|
after successful authentication pushes the change to the remote git. To change
|
|
the zone, you can just edit the zonefile and push to the git remote - with the
|
|
proper pre- and post-commit-hooks an authenticated notify is send to the primary
|
|
server which then pulls the git remote.
|
|
|
|
Another noteworthy unikernel is `letsencrypt`, which acts as a secondary server,
|
|
and whenever a TLSA record with private type and a DER-encoded certificate
|
|
signing request is observed, it requests a signature from letsencrypt by solving
|
|
the DNS challenge. The certificate is pushed to the DNS server as TLSA record
|
|
as well. The DNS implementation provides `ocertify` and `dns-mirage-certify`
|
|
which use the above mechanism to retrieve valid let's encrypt certificates.
|
|
The caller (unikernel or Unix command-line utility) either takes a private key
|
|
directly or generates one from a (provided) seed and generates a certificate
|
|
signing request. It then looks in DNS for a certificate which is still valid and
|
|
matches the public key and the hostname. If such a certificate is not present,
|
|
the certificate signing request is pushed to DNS (via the nsupdate protocol),
|
|
authenticated using TSIG with a given secret. This way our public facing
|
|
unikernels (website, this blog, TLS demo server, ..) block until they got a
|
|
certificate via DNS on startup - we avoid embedding of the certificate into the
|
|
unikernel image.
|
|
|
|
## Monitoring
|
|
|
|
We like to gather statistics about the resource usage of our unikernels to
|
|
find potential bottlenecks and observe memory leaks ;) The base for the setup
|
|
is the [metrics](https://github.com/mirage/metrics) library, which is similarly
|
|
in design to the [logs](https://erratique.ch/software/logs) library: libraries
|
|
use the core to gather metrics. A different aspect is the reporter, which is
|
|
globally registered and responsible for exfiltrating the data via their
|
|
favourite protocol. If no reporter is registered, the work overhead is
|
|
negligible.
|
|
|
|
[<img src="https://berlin.ccc.de/~hannes/crobur-june-2019-unikernel.png" width="700" />](https://berlin.ccc.de/~hannes/crobur-june-2019-unikernel.png)
|
|
|
|
This is a dashboard which combines both statistics gathered from the host system
|
|
and various metrics from the MirageOS unikernel. The `monitoring` branch of our
|
|
opam repository overlay is used together with
|
|
[monitoring-experiments](https://github.com/hannesm/monitoring-experiments).
|
|
The logs errors counter (middle right) was the icalendar parser which tried to
|
|
parse its badly emitted ics (the bug is now fixed, the dashboard is from last
|
|
month).
|
|
|
|
|
|
## OCaml libraries
|
|
|
|
The [domain-name](https://github.com/hannesm/domain-name) library was developed
|
|
to handle RFC 1035 domain names and host names. It initially was part of the DNS
|
|
code, but is now freestanding to be used in other core libraries (such as ipaddr)
|
|
with a small dependency footprint.
|
|
|
|
The [GADT map](https://github.com/hannesm/gmap) is a normal OCaml Map structure,
|
|
but takes key-dependent value types by using a GADT. This library also was part
|
|
of DNS, but is more broadly useful, we already use it in our icalendar (the data
|
|
format for calendar entries in CalDAV) library, our [OpenVPN](https://git.robur.io/?p=openvpn.git;a=summary) configuration
|
|
parser uses it as well, and also [x509](https://github.com/mirleft/ocaml-x509/pull/115)
|
|
- which got reworked quite a bit recently (release pending), and there's
|
|
preliminary PKCS12 support (which deserves its own article).
|
|
[TLS 1.3](https://github.com/hannesm/ocaml-tls) is available on a branch, but
|
|
is not yet merged. More work is underway, hopefully with sufficient time to write more articles
|
|
about it.
|
|
|
|
I'm interested in feedback, either via <strike>[twitter](https://twitter.com/h4nnes)</strike>
|
|
hannesm@mastodon.social or an issue on the [data
|
|
repository](https://github.com/hannesm/hannes.nqsb.io/issues).
|