hannes.robur.coop/Posts/OperatingSystem
2017-01-24 13:31:51 +00:00


---
title: Operating systems
author: hannes
tags: overview, operating system, mirageos
abstract: Operating systems and MirageOS
---
Sorry to be late with this entry, but I had to fix some issues.
## What is an operating system?
Wikipedia says: "An operating system (OS) is system software that manages
computer hardware and software resources and provides common services for
computer programs." Great. In other words, it is an abstraction layer.
Applications don't need to deal with the low-level bits (device drivers) of the
computer.
But if we look at the landscape of deployed operating systems, there is a lot
more going on than abstracting devices: usually this includes process management (scheduler),
memory management (virtual memory), [C
library](https://en.wikipedia.org/wiki/C_standard_library), user management
(including access control), persistent storage (file system), network stack,
etc. all being part of the kernel, and executed in kernel space. A
counterexample is [Minix](http://www.minix3.org/), which consists of a tiny
microkernel, and executes the above mentioned services as user-space processes.
We are (or at least I am) interested in robust systems. Development is done
by humans, thus will always be error-prone. Even a proof of its functional
correctness can be flawed if the proof system is inconsistent or the
specification is wrong. We need to have damage control in place by striving
for the [principle of least authority](https://en.wikipedia.org/wiki/Principle_of_least_privilege).
The goods to guard are the user data (passwords, personal information, private
mails, ...), which live in memory.
A CPU contains [protection rings](https://en.wikipedia.org/wiki/Protection_ring),
where the kernel runs in ring 0 and thus has full access to the hardware,
including memory. A flaw in the kernel is devastating for the security of the
entire system, since the kernel is part of the [trusted computing base](https://en.wikipedia.org/wiki/Trusted_computing_base).
Every byte of kernel code should be carefully developed and audited. If we
can confine code to areas with less authority, we should do so. Obviously,
the mechanism to contain code needs to be carefully audited as well, since
it will likely need to run in privileged mode.
In a virtualised world, we run a
[hypervisor](https://en.wikipedia.org/wiki/Hypervisor) in ring -1, on top of
which we run an operating system kernel. The hypervisor gives access to memory
and hardware to virtual machines, schedules those virtual machines on
processors, and should isolate the virtual machines from each other (by using
the MMU).
![there's no cloud, just other people's computers](https://fsfe.org/contribute/promopics/thereisnocloud-v2-preview.png)
This ominous "cloud" uses hypervisors on a huge number of physical machines, and
executes off-the-shelf operating systems as virtual machines on top. Accounting
is done by resource usage (time, bandwidth, storage).
## From scratch
Ok, now we have hypervisors which already deal with memory and scheduling. Why
should we have the very same functionality again in the (general purpose) operating
system running as a virtual machine?
Additionally, earlier in my life (back in 2005 at the Dutch hacker camp "What
the hack") I proposed (together with Andreas Bogk) to [phase out UNIX before
2038-01-19](https://berlin.ccc.de/~hannes/wth.pdf) (this is when `time_t`
overflows, unless promoted to 64 bit), and replace it with Dylan. A [random
comment](http://www.citizen428.net/blog/2005/08/03/what-the-hack-recap/) about
our talk on the Internet is "the proposal that rewriting an entire OS in a
language with obscure syntax was somewhat original. However, I now somewhat feel
a strange urge to spend some time on Dylan, which is really weird..."
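
As a side note on that date, here is a minimal sketch of the arithmetic, assuming `time_t` is a signed 32-bit count of seconds since 1970-01-01 00:00:00 UTC (the variable names are only for illustration). The largest value is 24855 days plus 03:14:07 after the epoch, and the 24855th day after 1970-01-01 is 2038-01-19:

```ocaml
(* Assumption: time_t is a signed 32-bit count of seconds since
   1970-01-01 00:00:00 UTC. The int arithmetic needs a 64-bit OCaml. *)
let max_time_t = Int32.to_int Int32.max_int (* 2^31 - 1 seconds *)

let days = max_time_t / 86_400     (* whole days since the epoch *)
let rest = max_time_t mod 86_400   (* seconds into the last day *)
let hours = rest / 3_600
let minutes = rest mod 3_600 / 60
let seconds = rest mod 60

(* One second later, a 32-bit counter wraps to the most negative value. *)
let wrapped = Int32.succ Int32.max_int

let () =
  Printf.printf "%d days and %02d:%02d:%02d after the epoch\n"
    days hours minutes seconds;
  Printf.printf "next value: %ld\n" wrapped
```
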
Being without funding back then, we didn't get far (our biggest success was a
[TCP/IP](https://github.com/dylan-hackers/network-night-vision/) stack in
Dylan), and as mentioned earlier I went into formal methods and mechanised
proofs of full functional correctness properties.
### MirageOS
At the end of 2013, David pointed me to
[MirageOS](https://mirage.io), an operating system developed from scratch in the
functional and statically typed language [OCaml](https://ocaml.org). I had not
used much OCaml before, but I had used some other functional programming languages.
Since then, I have spent nearly every day developing OCaml libraries (with varying success on being happy
with my code). In contrast to Dylan, there are more than two people developing MirageOS.
The idea is straightforward: use a hypervisor, and its hardware
abstractions (virtualised input/output and network device), and execute the
OCaml runtime directly on it. No C library included (since May 2015, see [this
thread](http://lists.xenproject.org/archives/html/mirageos-devel/2014-05/msg00070.html)).
The virtual machine, based on the OCaml runtime and composed of OCaml libraries,
uses a single address space and runs in ring 0.
As mentioned above, all code which runs in ring 0 needs to be carefully
developed and checked since a flaw in it can jeopardise the security properties
of the entire system: the TCP/IP library should not have access to the private
key used for the TLS handshake. If we trust the OCaml runtime, especially its
memory management, there is no way for the TCP/IP library to access the memory
of the TLS subsystem: the TLS API does not expose the private key via an API
call, and being in a memory safe language, a library cannot read arbitrary
memory. There is no real need to isolate each library into a separate address
space. In my opinion, using capabilities for memory access would be a great
improvement, similar to [barrelfish](http://www.barrelfish.org). OCaml has a C
foreign function interface which can be used to read arbitrary memory, so
you have to take care that none of the C bits of the system are malicious.
Fortunately, it is difficult to embed C code into MirageOS, thus only a few bits
of MirageOS are written in C, such as the (loop and allocation free) [crypto
primitives](https://github.com/mirleft/ocaml-nocrypto/tree/f076d4e75c56054d79b876e00b6bded06d90df86/src/native).
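
The point about the API not exposing the private key can be illustrated with an abstract type. This is only a sketch: `Tls_secret`, `of_pem` and `sign` are hypothetical names, not the API of ocaml-tls, and the "signature" is a toy digest rather than real cryptography.

```ocaml
(* Hypothetical module for illustration; this is not the ocaml-tls API. *)
module Tls_secret : sig
  type t (* abstract: the representation is hidden from other libraries *)
  val of_pem : string -> t
  val sign : t -> string -> string
end = struct
  type t = string
  let of_pem pem = pem
  (* Toy stand-in for a signature (an MD5 digest), NOT real cryptography. *)
  let sign key msg = Digest.to_hex (Digest.string (key ^ msg))
end

let key = Tls_secret.of_pem "placeholder private key"
let signature = Tls_secret.sign key "handshake data"

(* The following line would be rejected by the type checker, because
   outside the module a [Tls_secret.t] is not known to be a string:
   let leaked : string = key *)
```

Code outside the module can ask for a signature, but there is no function that returns the key material, and in a memory safe language there is no pointer arithmetic to get at it either.
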
To further read up on the topic, there is a [nice article about the
security](https://matildah.github.io/posts/2016-01-30-unikernel-security.html).
The image for this website is 12MB in size (and I haven't even bothered to strip it yet), which
includes the static CSS and JavaScript (bootstrap, jquery, fonts), [HTTP](https://github.com/mirage/ocaml-cohttp), [TLS](https://github.com/mirleft/ocaml-tls) (also [X.509](https://github.com/mirleft/ocaml-x509), [ASN.1](https://github.com/mirleft/ocaml-asn1-combinators), [crypto](https://github.com/mirleft/ocaml-nocrypto)), [git](https://github.com/mirage/ocaml-git/) (and [irmin](https://github.com/mirage/irmin)), [TCP/IP](https://github.com/mirage/mirage-tcpip) libraries.
The memory management in MirageOS is
straightforward: the hypervisor provides a chunk of memory to the OCaml runtime, which
immediately takes all of it.
This is much simpler to configure and deploy than a UNIX operating system:
There is no virtual memory, no process management, no file
system (the markdown content is held in memory with irmin!), no user management in the image.
At compile (configuration) time, the TLS keys are baked into the image, in addition to the URL of the remote
git repository, the IPv4 address and ports the image should use.
The full command line for configuring this website is: `mirage configure --no-opam --xen -i Posts -n "full stack engineer" -r https://github.com/hannesm/hannes.nqsb.io.git --dhcp false --network 0 --ip 198.167.222.205 --netmask 255.255.255.0 --gateways 198.167.222.1 --tls 443 --port 80`.
It relies on the fact that the TLS certificate chain and private key are in the `tls/` subdirectory, which is transformed to code and included in the image (using [crunch](https://github.com/mirage/ocaml-crunch)). An improvement would be to [use an ELF section](https://github.com/mirage/mirage/issues/489), but there is no code yet.
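
To give an idea of what "transformed to code" means, here is a rough sketch of a module with file contents embedded as string constants. The module name `Static_tls`, the file names and the contents are placeholders, and the code that ocaml-crunch really generates looks different.

```ocaml
(* Hypothetical shape of a generated module; the code that ocaml-crunch
   really emits differs. File names and contents are placeholders. *)
module Static_tls = struct
  let files =
    [ ("server.pem", "-----BEGIN CERTIFICATE-----\n(placeholder)\n");
      ("server.key", "(placeholder)") ]

  (* Look a file up by name; no file system is involved at run time. *)
  let read name = List.assoc_opt name files
end

let () =
  match Static_tls.read "server.pem" with
  | Some data -> Printf.printf "certificate: %d bytes\n" (String.length data)
  | None -> print_endline "no such file"
```

Since the data is part of the compiled code, the image needs neither a block device nor a file system to find its keys.
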
After configuring and installing the required dependencies, a `make` builds the statically linked image.
Deployment is done via `xl create canopy.xl`. The file `canopy.xl` is automatically generated by `mirage configure` (but might need modifications). It contains the full path to the image, the name of the bridge
interface, and how much memory the image can use:
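```
name = 'canopy'
# unikernel image to boot
kernel = 'mir-canopy.xen'
# 'linux' selects a paravirtualised (PV) guest
builder = 'linux'
# memory available to the guest, in MiB
memory = 256
# keep a crashed domain around for inspection
on_crash = 'preserve'
# one virtual network interface, attached to the bridge br0
vif = [ 'bridge=br0' ]
```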
To rephrase: instead of running on a multi-purpose operating system including processes, file system, etc., this website uses a
set of libraries, which are compiled and statically
linked into the virtual machine image.
MirageOS uses the module system of OCaml to define interfaces, thus an
application developer does not need to care whether they are using the TCP/IP
stack written in OCaml, or the sockets API of a UNIX operating system. This
also allows you to compile and debug your library on UNIX using off-the-shelf tools
before deploying it as a virtual machine (NB: this is a lie, since there is code
which is only executed when running on Xen, and this code can be buggy) ;).
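
A minimal sketch of that idea, using a functor over a module type. The signature `STACK` and both implementations are made up for this example and are far simpler than the real MirageOS signatures; neither of them touches a network.

```ocaml
(* Hypothetical signature, much simpler than the real MirageOS ones. *)
module type STACK = sig
  val name : string
  val send : string -> int (* returns the number of bytes "sent" *)
end

(* Stand-in for an implementation on top of the UNIX sockets API. *)
module Unix_sockets : STACK = struct
  let name = "unix sockets"
  let send payload = String.length payload
end

(* Stand-in for the TCP/IP stack written in OCaml. *)
module Ocaml_tcpip : STACK = struct
  let name = "OCaml TCP/IP"
  let send payload = String.length payload
end

(* The application only depends on the signature, not on an implementation. *)
module Server (S : STACK) = struct
  let greet () =
    let sent = S.send "hello" in
    Printf.sprintf "%d bytes via %s" sent S.name
end

module On_unix = Server (Unix_sockets)
module On_xen = Server (Ocaml_tcpip)
```

The application code in `Server` is written once, and the choice of implementation is made when the functor is applied, which in MirageOS happens at configuration time.
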
Most of the MirageOS ecosystem is developed under MIT/ISC/BSD license, which
allows everybody to use it for whichever project they want.
Did I mention that by using less code the attack surface shrinks? In
addition to that, using a memory safe programming language, where the developer
does not need to care about memory management and bounds checks, immediately removes
several classes of security problems (namely spatial and temporal memory
issues), once the runtime is trusted.
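
A small sketch of what the absence of spatial memory issues looks like in practice (the names `buffer` and `read` are only for illustration): an out-of-range index raises an exception instead of reading whatever happens to be next to the array in memory.

```ocaml
let buffer = [| 10; 20; 30 |]

(* Every access is bounds checked (unless compiled with -unsafe), so an
   out-of-range index raises Invalid_argument instead of reading
   neighbouring memory. *)
let read i =
  try Some buffer.(i) with Invalid_argument _ -> None

let () =
  match read 3 with
  | Some v -> Printf.printf "read %d\n" v
  | None -> print_endline "index out of bounds"
```
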
The OCaml runtime was reviewed by the French [Agence nationale de la sécurité des systèmes d'information](http://www.ssi.gouv.fr/agence/publication/lafosec-securite-et-langages-fonctionnels/) in 2013,
leading to some changes, such as separation of immutable strings (`String`) from mutable byte vectors (`Bytes`).
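
A short sketch of that separation, assuming a compiler where immutable strings are the default (OCaml 4.06 or later); the variable names are only for illustration:

```ocaml
let greeting : string = "hello"

(* Bytes.of_string returns a fresh, mutable copy of the string. *)
let buffer : bytes = Bytes.of_string greeting

(* Mutation goes through the Bytes module and only affects the copy. *)
let () = Bytes.set buffer 0 'j'

let () =
  Printf.printf "%s / %s\n" greeting (Bytes.to_string buffer)
```

A value of type `string` handed to another library can therefore not be modified behind your back.
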
The attack surface is still big enough: logical issues, resource management, and there is no access
control. This website does not need access control: publishing of content is protected by GitHub's
access control.
I hope I gave some insight into what the purpose of an operating system is, and
how MirageOS fits into the picture. I'm interested in feedback, either via
[twitter](https://twitter.com/h4nnes) or as an issue on the [data repository on
GitHub](https://github.com/hannesm/hannes.nqsb.io/issues).
## Other updates in the MirageOS ecosystem
- this website is based on [Canopy](https://github.com/Engil/Canopy), the content is stored as markdown in a [git repository](https://github.com/hannesm/hannes.nqsb.io)
- it was running in a [FreeBSD](https://FreeBSD.org) jail, but when I compiled too much the underlying [zfs file system](https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/zfs.html) wasn't happy (and is now hanging in kernel space in a read)
- no remote power switch (lent to a friend 3 weeks ago), and nobody was willing to go to the data centre and reboot
- I wanted to move it anyway to a host where I can deploy [Xen](http://www.xenproject.org/) guest VMs
- turns out the Xen compilation and deployment mode needed some love:
  - I ported a newer [bin_prot](https://github.com/hannesm/bin_prot/tree/113.33.00+xen) to Xen
  - I wrote a clean patch to [serve via TLS](https://github.com/Engil/Canopy/pull/15) (including [HSTS header](https://en.wikipedia.org/wiki/HTTP_Strict_Transport_Security) and redirecting HTTP (moved permanently) to HTTPS)
  - I found a memory leak in the [mirage-http](https://github.com/mirage/mirage-http/pull/23) library
- I was travelling
- good news: it now works on Xen, and there is [an atom feed](https://hannes.nqsb.io/atom)
- life of an "eat your own dogfood" full stack engineer ;)
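
For the TLS patch mentioned in the list above, here is a rough sketch of the two pieces involved: a "moved permanently" redirect from HTTP to HTTPS, and the HSTS response header. The helper names are hypothetical and this is not the code from the Canopy pull request; it only builds the status code and header pairs.

```ocaml
(* Hypothetical helpers for illustration, not the Canopy code. *)

(* Answer a plain HTTP request with 301 and the HTTPS location. *)
let redirect_to_https ~host ~path =
  (301, [ ("Location", Printf.sprintf "https://%s%s" host path) ])

(* Header telling browsers to use HTTPS only for [max_age] seconds. *)
let hsts_header ~max_age =
  ("Strict-Transport-Security", Printf.sprintf "max-age=%d" max_age)

let () =
  let status, headers = redirect_to_https ~host:"example.org" ~path:"/atom" in
  Printf.printf "%d\n" status;
  List.iter (fun (k, v) -> Printf.printf "%s: %s\n" k v) headers
```
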