new
This commit is contained in:
parent
0edf5e7f1d
commit
5f9301cb01
1 changed files with 191 additions and 0 deletions
191
Posts/Summer2019
Normal file
191
Posts/Summer2019
Normal file
|
@ -0,0 +1,191 @@
|
|||
---
|
||||
title:
|
||||
author: hannes
|
||||
tags: mirageos, security, package signing, tls, monitoring, deployment
|
||||
abstract:
|
||||
---
|
||||
|
||||
## Working at [robur](https://robur.io)
|
||||
|
||||
As announced [previously](/Posts/DNS), I started to work at robur early 2018.
|
||||
We're a collective of five people, distributed around Europe and the US, with
|
||||
the goal to deploy MirageOS unikernels. We do this by developing bespoke
|
||||
MirageOS unikernels which provide useful services, and deploy them for ourselves.
|
||||
We also develop new libraries and enhance existing ones and other components of MirageOS.
|
||||
Example unikernels include [our website](https://robur.io) which uses [Canopy](https://github.com/Engil/Canopy),
|
||||
a [CalDAV server that stores entries in a git remote](https://robur.io/Projects/CalDAV),
|
||||
and [DNS servers](https://github.com/roburio/unikernels) (the latter two are further described below).
|
||||
|
||||
Robur is part of the non-profit company [Center for the Cultivation of Technology](https://techcultivation.org),
|
||||
who are managing the legal and administrative sides for us. We're ourselves responsible to
|
||||
acquire funding to pay ourselves reasonable salaries. We received funding for CalDAV from
|
||||
[prototypefund](https://prototypefund.de) and further funding from [Tarides](https://tarids.com),
|
||||
for TLS 1.3 from [OCaml Labs](http://ocamllabs.io/); worked for
|
||||
[Least Authority](https://leastauthority.com/) on a security audit of an OCaml codebase,
|
||||
and received [donations](https://robur.io/Donate), also in
|
||||
the form of Bitcoins. We're looking for further funded collaborations and
|
||||
also contracting, mail us at `team@robur.io`. Please
|
||||
[donate](https://robur.io/Donate) (tax-deductible in EU), so we can accomplish
|
||||
our goal of putting robust and sustainable MirageOS unikernels into production,
|
||||
replacing insecure legacy system that emit tons of
|
||||
CO<span style="vertical-align: baseline; position: relative;bottom: -0.4em;">2</span>.
|
||||
|
||||
## Deploying MirageOS unikernels
|
||||
|
||||
While several examples are running since years (the [MirageOS website](https://mirage.io),
|
||||
[Bitcoin Piñata](http://ownme.ipredator.se), [TLS demo server](https://tls.nqsb.io), etc.),
|
||||
and some shell-scripts for cloud providers are floating around, it is not (yet)
|
||||
streamlined.
|
||||
|
||||
Service deployment is complex: you have to consider its configuration,
|
||||
exfiltration of logs and metrics, provisioning with valid key material (TLS
|
||||
certificate, hmac shared secret) and authenticators (CA certificate, ssh key
|
||||
fingerprint). Instead of requirign millions lines of code during orchestration
|
||||
(such as Kubernetes), creating the images (docker), or provisioning (ansible),
|
||||
why not minimise the required configuration and dependencies?
|
||||
|
||||
[Earlier in this blog I introduced Albatross](/Posts/VMM), which serves in an
|
||||
enhanced version as our deployment platform on a physical machine (running
|
||||
15 unikernels at the moment), I won't discuss more detail thereof in this
|
||||
article.
|
||||
|
||||
## CalDAV
|
||||
|
||||
[Steffi](https://linse.me/) and I developed in 2018 a CalDAV server. Since
|
||||
November 2018 we have a test installation for robur, initially running as a Unix
|
||||
process on a virtual machine and persisting data to files on the disk.
|
||||
Mid-June 2019 we migrated it to a MirageOS unikernel, thanks to
|
||||
great efforts in [git](https://github.com/mirage/ocaml-git) and
|
||||
[irmin](https://github.com/mirage/irmin), unikernels can push to a
|
||||
remote git repository. We [extended the ssh library](https://github.com/haesbaert/awa-ssh/pull/8)
|
||||
with a ssh client and [use this in git](https://github.com/mirage/ocaml-git/pull/362).
|
||||
This also means our CalDAV server is completely immutable (does not carry state
|
||||
across reboots, apart from the data in the remote repository) and does not have
|
||||
persistent state in the form of a block device. Its configuration is mainly done
|
||||
at compile time by the selection of libraries (syslog, monitoring, ...), and
|
||||
boot arguments passed to the unikernel at startup.
|
||||
|
||||
We monitored the resource usage when migrating our CalDAV server from Unix
|
||||
process to a MirageOS unikernel. The unikernel size is just below 10MB. The
|
||||
workload is some clients communicating with the server on a regular basis.
|
||||
We use [Grafana](https://grafana.com/) with a
|
||||
[influx](https://www.influxdata.com/) time series database to monitor virtual
|
||||
machines. Data is collected
|
||||
on the host system (`rusage` sysctl, `kinfo_mem` sysctl, `ifdata` sysctl,
|
||||
`vm_get_stats` BHyve statistics), and our unikernels these days emit further
|
||||
metrics (mostly counters: gc statistics, malloc statistics, tcp sessions, http
|
||||
requests and status codes).
|
||||
|
||||
[<img src="https://berlin.ccc.de/~hannes/crobur-june-2019.png" width="700" />](https://berlin.ccc.de/~hannes/crobur-june-2019.png)
|
||||
|
||||
Please note that memory usage (upper right)
|
||||
and vm exits (lower right) use logarithmic scale. The CPU usage reduced by more
|
||||
than a factor of 4. The memory usage dropped by a factor of 25, and the network
|
||||
traffic increased - previously we stored log messages on the virtual machine
|
||||
itself, now we send them to a dedicated log host.
|
||||
|
||||
A MirageOS unikernel,
|
||||
apart from a smaller attack surface, indeed uses fewer resources and actually emits
|
||||
less CO<span style="vertical-align: baseline; position: relative;bottom: -0.4em;">2</span>
|
||||
than the same service on a Unix virtual machine. So we're doing something good
|
||||
for the environment! :)
|
||||
|
||||
Our calendar server contains at the moment 63 events, the git repository had
|
||||
around 500 commits in the past month: nearly all of them from the CalDAV server
|
||||
itself when a client modified data via CalDAV, and two manual commits:
|
||||
the initial data imported from the file system, and one commit for fixing a bug
|
||||
of the encoder
|
||||
in our [icalendar library](https://github.com/roburio/icalendar/pull/2).
|
||||
|
||||
Our CalDAV implementation is very basic, scheduling, adding attendees (which
|
||||
requires sending out eMail), is not supported. But it works well for us, we have
|
||||
individual calendars and a shared one which everyone can write to.
|
||||
On the client side we use macOS and iOS iCalendar, Android DAVdroid, and
|
||||
Thunderbird. If you like to try our CalDAV server, have a look
|
||||
[at our installation instructions](https://github.com/roburio/caldav/tree/future/README.md). At the
|
||||
moment, not all of our above development is merged and released yet, that's
|
||||
why we provide an [opam repository overlay](https://github.com/roburio/git-ssh-dns-mirage3-repo) with various packages pinned to specific
|
||||
branches. We recommend to use a custom opam `switch`
|
||||
(`opam switch create caldav 4.07.1`) just for the CalDAV unikernel. Please
|
||||
[report issues](https://github.com/roburio/caldav/issues) if you find issues
|
||||
or struggle with the installation.
|
||||
|
||||
## DNS
|
||||
|
||||
There has been more work on our DNS implementation, now
|
||||
[here](https://github.com/mirage/ocaml-dns). We included a DNS client library,
|
||||
and some [examples](https://github.com/roburio/unikernels/tree/future) are
|
||||
available. They as well require our
|
||||
[opam repository overlay](https://github.com/roburio/git-ssh-dns-mirage3-repo).
|
||||
Please report issues if you run into trouble while experimenting with that.
|
||||
|
||||
Most prominently is `primary-git`, a unikernel which acts as a primary
|
||||
authoritative DNS server (UDP and TCP). On startup, it fetches a remote git
|
||||
repository that contains zone files and shared hmac secrets. The zones are
|
||||
served, and secondary servers are notified with the respective serial numbers
|
||||
of the zones, authenticated using TSIG with the shared secrets. The primary server
|
||||
provides dynamic in-protocol updates of DNS resource records (`nsupdate`), and
|
||||
after successful authentication pushes the change to the remote git. To change
|
||||
the zone, you can just edit the zonefile and push to the git remote - with the
|
||||
proper pre- and post-commit-hooks an authenticated notify is send to the primary
|
||||
server which then pulls the git remote.
|
||||
|
||||
Another noteworthy unikernel is `letsencrypt`, which acts as a secondary server,
|
||||
and whenever a TLSA record with private type and a DER-encoded certificate
|
||||
signing request is observed, it requests a signature from letsencrypt by solving
|
||||
the DNS challenge. The certificate is pushed to the DNS server as TLSA record
|
||||
as well. The DNS implementation provides `ocertify` and `dns-mirage-certify`
|
||||
which use the above mechanism to retrieve valid let's encrypt certificates.
|
||||
The caller (unikernel or Unix command-line utility) either takes a private key
|
||||
directly or generates one from a (provided) seed and generates a certificate
|
||||
signing request. It then looks in DNS for a certificate which is still valid and
|
||||
matches the public key and the hostname. If such a certificate is not present,
|
||||
the certificate signing request is pushed to DNS (via the nsupdate protocol),
|
||||
authenticated using TSIG with a given secret. This way our public facing
|
||||
unikernels (website, this blog, TLS demo server, ..) block until they got a
|
||||
certificate via DNS on startup - we avoid embedding of the certificate into the
|
||||
unikernel image.
|
||||
|
||||
## Monitoring
|
||||
|
||||
We like to gather statistics about the resource usage of our unikernels to
|
||||
find potential bottlenecks and observe memory leaks ;) The base for the setup
|
||||
is the [metrics](https://github.com/mirage/metrics) library, which is similarly
|
||||
in design to the [logs](https://erratique.ch/software/logs) library: libraries
|
||||
use the core to gather metrics. A different aspect is the reporter, which is
|
||||
globally registered and responsible for exfiltrating the data via their
|
||||
favourite protocol. If no reporter is registered, the work overhead is
|
||||
negligible.
|
||||
|
||||
[<img src="https://berlin.ccc.de/~hannes/crobur-june-2019-unikernel.png" width="700" />](https://berlin.ccc.de/~hannes/crobur-june-2019-unikernel.png)
|
||||
|
||||
This is a dashboard which combines both statistics gathered from the host system
|
||||
and various metrics from the MirageOS unikernel. The `monitoring` branch of our
|
||||
opam repository overlay is used together with
|
||||
[monitoring-experiments](https://github.com/hannesm/monitoring-experiments).
|
||||
The logs errors counter (middle right) was the icalendar parser which tried to
|
||||
parse its badly emitted ics (the bug is now fixed, the dashboard is from last
|
||||
month).
|
||||
|
||||
|
||||
## OCaml libraries
|
||||
|
||||
The [domain-name](https://github.com/hannesm/domain-name) library was developed
|
||||
to handle RFC 1035 domain names and host names. It initially was part of the DNS
|
||||
code, but is now freestanding to be used in other core libraries (such as ipaddr)
|
||||
with a small dependency footprint.
|
||||
|
||||
The [GADT map](https://github.com/hannesm/gmap) is a normal OCaml Map structure,
|
||||
but takes key-dependent value types by using a GADT. This library also was part
|
||||
of DNS, but is more broadly useful, we already use it in our icalendar (the data
|
||||
format for calendar entries in CalDAV) library, our [OpenVPN](https://git.robur.io/?p=openvpn.git;a=summary) configuration
|
||||
parser uses it as well, and also [x509](https://github.com/mirleft/ocaml-x509/pull/115)
|
||||
- which got reworked quite a bit recently (release pending), and there's
|
||||
preliminary PKCS12 support (which deserves its own article).
|
||||
[TLS 1.3](https://github.com/hannesm/ocaml-tls) is available on a branch, but
|
||||
is not yet merged. More work is underway, hopefully with sufficient time to write more articles
|
||||
about it.
|
||||
|
||||
I'm interested in feedback, either via <strike>[twitter](https://twitter.com/h4nnes)</strike>
|
||||
hannesm@mastodon.social or an issue on the [data
|
||||
repository](https://github.com/hannesm/hannes.nqsb.io/issues).
|
Loading…
Reference in a new issue