---
title: Robur Reproducible Builds
---
Over the past year we at [Robur](https://robur.coop/) have been working towards easing deployment of reproducible MirageOS applications. The work has been funded by the European Union under the [Next Generation Internet (NGI Pointer) initiative](https://pointer.ngi.eu/). The result is [available as a website](https://builds.robur.coop).
The overall goal is to push MirageOS into production in a trustworthy way. We worked on reproducible builds for MirageOS - with the infrastructure being reproducible itself. A handful of core packages are hosted there (described below in this article), next to several ready-to-use MirageOS unikernels - ranging from [authoritative DNS servers](https://builds.robur.coop/job/dns-primary-git/) ([secondary](https://builds.robur.coop/job/dns-secondary/), [let's encrypt DNS solver](https://builds.robur.coop/job/dns-letsencrypt-secondary/)), [DNS-and-DHCP service (similar to dnsmasq)](https://builds.robur.coop/job/dnsvizor/), [TLS reverse proxy](https://builds.robur.coop/job/tlstunnel/), [Unipi - a web server that delivers content from a git repository](https://builds.robur.coop/job/unipi/), [DNS resolver](https://builds.robur.coop/job/dns-resolver/), [CalDAV server](https://builds.robur.coop/job/caldav/), and of course your own MirageOS unikernel.
Reproducible builds are crucial for supply chain security: everyone can reproduce the exact same binary (by using the same sources and environment). Without reproducible builds, we would not publish binaries.
Reproducible builds are also great for fleet management: by inspecting the hash of the binary that is executed, we can figure out which versions of which packages are in the unikernel, and suggest updates if newer builds are available or if a package in use has a security flaw in that version. `albatross-client-local update my-unikernel` is everything needed for an update.
In the following, we'll explain two scenarios in more detail: how to deploy MirageOS unikernels using the infrastructure we provide, and how to bootstrap and run the infrastructure for yourself. Afterwards we briefly describe how to reproduce a package, and what our core packages are and how they relate to each other.
## Brief robur and MirageOS introduction
MirageOS is an operating system, developed in OCaml, which produces unikernels. A unikernel serves a single purpose and is a single process, i.e. it contains only the dependencies it really needs. For example, an OpenVPN endpoint includes neither persistent storage (block device, file system) nor user management. MirageOS unikernels are developed in OCaml, a statically typed and type-safe programming language, which avoids common pitfalls from the ground up (spatial and temporal memory safety issues).
[Robur](https://robur.coop) is a collective that develops MirageOS and OCaml software under open source licenses. It was started in 2017, and is part of the non-profit company [center for the cultivation of technology](https://techcultivation.org). We received funding from several projects (prototypefund, NGI pointer), donations, and some commercial contracts.
## For someone who wants to run MirageOS unikernels
To run a MirageOS unikernel on your laptop or computer with virtualization extensions (VT-x - KVM/BHyve), first install solo5-hvt as a [package](https://builds.robur.coop/job/solo5-hvt/) (pick the one that fits your distribution), and [albatross](https://builds.robur.coop/job/albatross/).
No configuration is needed: start the `albatross_console` and the `albatross_daemon` service (via `systemctl daemon-reload ; systemctl start albatross_daemon` on Linux or `service albatross_daemon start` on FreeBSD). Executing `albatross-client-local info` should return success (exit code 0) and report no running unikernel. You may need to be in the albatross group, or change the permissions of the Unix domain socket (`vmmd.sock` in `/run/albatross/util/` on Linux, `/var/run/albatross/util/` on FreeBSD).
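The startup steps above can be sketched as a small shell function. This is an illustration rather than part of albatross: the function name `start_albatross` and the `DRY` switch are made up here, and starting `albatross_console` as its own service is inferred from the paragraph above. By default the commands are only printed.

```shell
# Sketch: start the albatross services, then check that the daemon answers.
# With DRY unset the commands are only printed; set DRY= (empty) to execute.
DRY="${DRY-echo}"

start_albatross() {
    platform="${1:-$(uname -s)}"
    case "$platform" in
        Linux)
            $DRY systemctl daemon-reload
            $DRY systemctl start albatross_console
            $DRY systemctl start albatross_daemon
            ;;
        FreeBSD)
            $DRY service albatross_console start
            $DRY service albatross_daemon start
            ;;
        *)
            echo "no service commands known for $platform" >&2
            return 1
            ;;
    esac
    # Exit code 0 means the daemon is reachable.
    $DRY albatross-client-local info
}
```

Calling `start_albatross` prints the commands for the current platform; `DRY= start_albatross` runs them (root privileges are likely required).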
### Network setup
To set up networking, you need a bridge interface, usually named `service`, that albatross will use for unikernels. To provide network connectivity to that bridge interface, you can either use NAT, forward public IP addresses there, provide a gateway that tunnels via VPN, or add your network interface to the bridge. In the following, we describe the setup in detail on Linux. Get in touch with us if you're interested in other platforms.
Bridge setup on Linux in `/etc/network/interfaces`:
```
auto service
# Host-only bridge
iface service inet manual
up ip link add service-master address 02:00:00:00:00:01 type dummy
up ip link set dev service-master up
up ip link add service type bridge
up ip link set dev service-master master service
up ip link set dev service up
down ip link del service
down ip link del service-master
```
#### Routing of a subnet
If your host system acts as a router for a network, enable IPv4 forwarding (`echo "1" > /proc/sys/net/ipv4/ip_forward`), and assign an IP address from that network to the bridge (`up ip addr add 192.168.0.1/24 dev service`).
#### Physical network interface with IP address space
To put your unikernels on the same network as your host system, add that external network interface to the bridge: `up ip link set dev enp0s20f0 master service`.
#### NAT (no public IP address, e.g. for testing on your laptop)
Set up a private network on the `service` bridge (`up ip addr add 192.168.0.1/24 dev service`), enable IPv4 forwarding (`echo "1" > /proc/sys/net/ipv4/ip_forward`), and add a firewall rule (`iptables -t nat -A POSTROUTING -o enp0s20f0 -j MASQUERADE`).
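For a quick test without editing `/etc/network/interfaces`, the three NAT steps can be collected in one function. This is a sketch under assumptions: the name `setup_nat` and the `DRY` switch are hypothetical, the `service` bridge must already exist, and `sysctl -w` is used in place of writing to `/proc` directly (both set the same kernel parameter).

```shell
# Sketch: NAT for the `service` bridge, run by hand instead of via `up` lines.
# With DRY unset the commands are only printed; set DRY= (empty) to execute.
DRY="${DRY-echo}"

setup_nat() {
    uplink="${1:-enp0s20f0}"   # outgoing interface; the name differs per host
    $DRY ip addr add 192.168.0.1/24 dev service
    # Same effect as writing "1" to /proc/sys/net/ipv4/ip_forward.
    $DRY sysctl -w net.ipv4.ip_forward=1
    $DRY iptables -t nat -A POSTROUTING -o "$uplink" -j MASQUERADE
}
```

`setup_nat eth0` prints the commands for an uplink named `eth0`. None of these settings persist across a reboot.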
### Unikernel execution
Download the [traceroute](https://builds.robur.coop/job/traceroute/) unikernel ([direct link to unikernel image](https://builds.robur.coop/job/traceroute/build/latest/f/bin/traceroute.hvt)) and run it via albatross. In one shell, observe the console output with `albatross-client-local console traceroute`. In a second shell, create the unikernel with `albatross-client-local create --net=service traceroute traceroute.hvt --arg='--ipv4=192.168.0.2/24' --arg='--ipv4-gateway=192.168.0.1'`.
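The download and create steps can be sketched as follows. The function name `run_traceroute` and the `DRY` switch are hypothetical, and `curl` is only one of several ways to fetch the image. The addresses default to the NAT example above.

```shell
# Sketch: fetch the traceroute unikernel and create it on the `service` bridge.
# With DRY unset the commands are only printed; set DRY= (empty) to execute.
DRY="${DRY-echo}"
IMAGE_URL="https://builds.robur.coop/job/traceroute/build/latest/f/bin/traceroute.hvt"

run_traceroute() {
    address="${1:-192.168.0.2/24}"   # unikernel address on the bridge network
    gateway="${2:-192.168.0.1}"      # address of the host on the bridge
    # -f fails on HTTP errors, -L follows redirects, -O keeps the file name.
    $DRY curl -fLO "$IMAGE_URL"
    $DRY albatross-client-local create --net=service traceroute traceroute.hvt \
        "--arg=--ipv4=$address" "--arg=--ipv4-gateway=$gateway"
}
```

Start `albatross-client-local console traceroute` in another shell first if you want to see the output from the moment the unikernel boots.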
That's it. Albatross has more features, such as block devices, multiple bridges (for management, private networks, ...), restart on certain exit codes, and assignment to a specific CPU. It also has remote command execution and resource limits (you can allow your friends to execute U unikernels with M MB memory and B MB block devices accessing your bridges X and Y). There is a daemon to collect metrics and report them to Telegraf (to push them into Influx and view in nice Grafana dashboards). MirageOS unikernels also support IPv6; you're not limited to legacy IP.
You can also use `albatross-client-local update` to ensure you're running the latest unikernel: it checks https://builds.robur.coop for the job and suggests an update if there is a newer binary available.
## For someone who wants to build and run MirageOS unikernels
The fundamental tools for building in a reproducible way are orb and builder. On some distributions we provide binary packages ([orb](https://builds.robur.coop/job/orb/), [builder](https://builds.robur.coop/job/builder/)) that you can use. On other distributions you'll need to bootstrap them from source:
- To build in a reproducible way, we developed orb, which is written in OCaml. It is an opam package available at https://github.com/roburio/orb (installation via `opam pin add orb https://github.com/roburio/orb.git`) - once you have OCaml and [opam](https://opam.ocaml.org) installed.
- To build builder, `opam install builder` is all you need to do. `opam install builder-web` will install the latest version of builder-web.
### Setup builder
On Linux:
Builder provides a systemd service (builder) that you should start. There is also a builder-worker service that executes the worker process in a docker container. Check the URLs and configuration in the systemd service files, if necessary modify them using `systemctl edit --full builder-worker.service`, and start the service. The provided builder-worker.service script will build for Ubuntu 20.04 as of writing.
On FreeBSD:
For FreeBSD, rc scripts and an example jail.conf (and shell script to launch) are provided. Setting up a jail is documented in the README (using poudriere).
### Setup builder-web
Builder-web needs an initial database and an initial user, and also has a service script. Use the `builder-db migrate` command to create an initial database, and `builder-db user-add --unrestricted my_user` to create a privileged user `my_user`. Set up your builder to use reproducible packages from builder-web and upload results there (by setting the `--upload https://my_user:my_password@builds.robur.coop/upload` option).
### Schedule an orb job
The command `builder-client info` should output the schedule, queues, and running builds. To schedule a daily build, run `builder-client orb-build traceroute traceroute-hvt`. This will create a new job named traceroute and pick up the job template (`/etc/builder/orb-build.template.PLATFORM`) and schedule that job to your worker in order to build the opam package traceroute-hvt.
All commands are documented: you can always run any of them with `--help` to see its man page.
## Reproducing builds
From a build on https://builds.robur.coop, select an operating system and distribution that has been used for a build. Go to the specific build, and download the "system-packages" file -- these are the exact versions of host system packages that were used during the build. Make sure they're installed, since version variance may lead to non-reproducibility. Orb and builder are not needed for a manual rebuild.
Download the build-environment file, which contains all environment variables that were set during the build. Set these, and only these, in your shell.
Install opam (at least in version 2.1). Then, download the opam-switch file, which includes all opam files and dependencies (including the OCaml compiler). Execute `opam switch import opam-switch --switch reproduced-unikernel`, which will create a fresh opam switch and install the unikernel into it. The resulting binary will be located in `opam switch prefix`/bin/unikernel.hvt.
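Setting "these, and only these" variables is easy to get wrong in an interactive shell. One way to do it is sketched below. The helper name `with_build_env` is hypothetical, and the sketch assumes that the build-environment file holds one `KEY=VALUE` entry per line without quoting or escapes; check the downloaded file before relying on it.

```shell
# Sketch: run a command with exactly the variables listed in an environment
# file. Assumes one KEY=VALUE entry per line, without quoting or escapes.
with_build_env() {
    env_file="$1"
    shift
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            *=*) set -- "$line" "$@" ;;   # prepend as an assignment for env
        esac
    done < "$env_file"
    # -i starts from an empty environment, so nothing leaks in from the shell.
    env -i "$@"
}
```

With this helper, the import becomes `with_build_env build-environment opam switch import opam-switch --switch reproduced-unikernel`. Afterwards, the checksum of the produced binary (for example from `sha256sum`, assuming SHA-256 is the digest shown on the build page) can be compared with the one published on https://builds.robur.coop.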
## Core software components in more detail
### [orb](https://github.com/roburio/orb)
The Opam Reproducible Builder uses the opam libraries to conduct a build of an opam package using any opam repositories. It collects system packages, environment variables, and a full and frozen opam switch export. These artifacts contain the build information and can be used to reproduce the exact same binary.
### [builder](https://github.com/roburio/builder/)
Builder is a suite of three executables: builder-server, builder-worker and builder-client. Together they periodically run scheduled jobs which execute orb, collecting build artifacts and information used for reproducing the build. The builder-worker is executed in a container or jailed environment, and communicates via TCP with the builder-server. The result of the build can be uploaded to builder-web or stored in the file system.
### [builder-web](https://github.com/roburio/builder-web)
Builder-web is a web interface for viewing and downloading builds and build artifacts created by builder jobs. The binary checksums can be viewed and the build inputs (opam packages, environment variables, system packages) can be compared across builds.
It uses [dream](https://github.com/aantron/dream) with sqlite3 as backend database. The database schema evolved over time, so we developed migration and rollback tooling to update our live database.
### [albatross](https://github.com/roburio/albatross)
Albatross is an orchestration system for MirageOS unikernels. It manages system resources (tap interfaces, virtual block devices) that can be passed to the unikernels. It reads the console output of a unikernel and provides it via a TCP stream. It also offers remote access via TLS, which allows inspecting the running status as well as uploading new unikernels. Albatross integrates with builder-web to look up running unikernels by their hash and optionally update the unikernel binary.
### [solo5](https://github.com/solo5/solo5)
Solo5 is the tender - the application that runs in the host system as a user process, consuming the system resources, and delegating them to the unikernel. This is a pretty small binary with a tiny API between host and unikernel. [A great solo5 overview talk (FOSDEM 2019)](https://archive.fosdem.org/2019/schedule/event/solo5_unikernels/).
## Future
We have enhancements and more features planned for the future. At the same time we are looking for feedback on the reproducible build and unikernel deployment system (from a security perspective, from a devops perspective, etc.). We are also keen to collaborate and would take new people on board.
- Improving the web UI on https://builds.robur.coop/. If you're interested, please get in touch, we have funding available.
- Supporting more distributions: tell us your favourite distribution and how to build a package, then we can integrate that into our reproducible builds infrastructure.
- Supporting spt - the sandboxed process tender - to run unikernels without a hypervisor.
- Data analytics: which system package updates or opam package releases result in variance of the binaries - did the release of an opam package increase or decrease the overall build times?
- Functional and performance tests of the unikernels: for each different build, conduct basic functional and performance testing, and graph the results in the output. This also includes data analytics: did the release of an opam package increase or decrease the performance of unikernels?
- Whole system performance analysis with memory profiling, and how to integrate this into a running unikernel.
- MirageOS 4.0 support.
- Metrics and logging collection and dynamic adjustment of metrics and log levels.
- DNSSEC support for the DNS resolver unikernel, which is still missing.
Interested? Get in touch with us.