add arguments article

Hannes Mehnert 2024-10-21 16:15:38 +02:00
parent 9388e31171
commit db5e8fd9cb

articles/arguments.md Normal file

@ -0,0 +1,538 @@
---
date: 2024-10-22
title: Runtime arguments in MirageOS
description:
The history of runtime arguments to a MirageOS unikernel
tags:
- OCaml
- MirageOS
author:
name: Hannes Mehnert
email: hannes@mehnert.org
link: https://hannes.robur.coop
---
TL;DR: Passing runtime arguments around is tricky, and prone to change every other month.

## Motivation

Sometimes, as a unikernel developer and also as an operator, it's nice to have
some runtime arguments passed to a unikernel. Now, if you're into OCaml,
command-line parsing - together with error messages, man page generation, ... -
can be done by the amazing [cmdliner](https://erratique.ch/software/cmdliner)
package from Daniel Bünzli.

MirageOS uses cmdliner for command line argument passing. This also enabled
us from the early days to have nice man pages for unikernels (see
`my-unikernel-binary --help`). There are two kinds
of arguments: those at configuration time (`mirage configure`), such as the
target to compile for, and those at runtime - when the unikernel is executed.

In Mirage 4.8.1 and 4.8.0 (released October 2024) there have been some changes
to command-line arguments, which were motivated by 4.5.0 (released April 2024)
and user feedback.

First of all, our current way to pass a custom runtime argument to a unikernel
(`unikernel.ml`):
```OCaml
open Lwt.Infix
open Cmdliner

let hello =
  let doc = Arg.info ~doc:"How to say hello." [ "hello" ] in
  let term = Arg.(value & opt string "Hello World!" doc) in
  Mirage_runtime.register_arg term

module Hello (Time : Mirage_time.S) = struct
  let start _time =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        Logs.info (fun f -> f "%s" (hello ()));
        Time.sleep_ns (Duration.of_sec 1) >>= fun () -> loop (n - 1)
    in
    loop 4
end
```

We define the [Cmdliner.Term.t](https://erratique.ch/software/cmdliner/doc/Cmdliner/Term/index.html#type-t)
in the `let term = ..` binding. It provides the documentation ("How to say hello.")
and the option name to use (`["hello"]`, which is then translated to `--hello=`),
and states that the argument is optional and of type `string` (cmdliner allows
you to convert the incoming strings to more complex (or more narrow) data types,
with decent error handling).

The defined argument is passed directly to [`Mirage_runtime.register_arg`](https://ocaml.org/p/mirage-runtime/4.8.1/doc/Mirage_runtime/index.html#val-register_arg)
(the last line of the `hello` binding), so our binding `hello` is of type
`unit -> string`.

Inside the `loop` function, the value of the runtime argument is used
(`hello ()`) for printing a log message.
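
To illustrate the conversion to a narrower type mentioned above, here is a minimal sketch of a custom cmdliner converter, assuming the cmdliner 1.x API (`Arg.conv`, `Cmd.v`, `Cmd.eval_value`). The `port` argument, the `port_conv` converter and the `parse` helper are made up for this example and are not part of MirageOS.

```OCaml
open Cmdliner

(* Hypothetical converter: accepts only integers in the range 1..65535. *)
let port_conv =
  let parse s =
    match int_of_string_opt s with
    | Some p when p > 0 && p < 65536 -> Ok p
    | _ -> Error (`Msg (Printf.sprintf "%S is not a valid port" s))
  in
  Arg.conv ~docv:"PORT" (parse, Format.pp_print_int)

(* An optional argument --port=PORT with a default of 8080. *)
let port =
  let doc = Arg.info ~doc:"Port to listen on." [ "port" ] in
  Arg.(value & opt port_conv 8080 doc)

(* Evaluate the term against an explicit argv; [None] on any failure. *)
let parse argv =
  match Cmd.eval_value ~argv (Cmd.v (Cmd.info "demo") port) with
  | Ok (`Ok p) -> Some p
  | _ -> None
```

With this sketch, `parse [| "demo"; "--port=53" |]` evaluates to `Some 53`, and an input such as `--port=abc` is rejected (`None`) while cmdliner reports the problem on standard error.
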

The nice property is that it is all local in `unikernel.ml`; there are no other
parts involved. It is just a bunch of API calls. The downside is that `hello ()`
should only be evaluated after the function `start` was called - since
`Mirage_runtime` needs to parse and fill in the command line arguments. If you
call `hello ()` earlier, you'll get an exception "Called too early. Please delay
this call to after the start function of the unikernel.". Also, since
`Mirage_runtime` needs to collect and evaluate the command line arguments,
`Mirage_runtime.register_arg` may only be called at top-level, otherwise you'll
get another exception "The function register_arg was called to late. Please call
register_arg before the start function is executed (e.g. in a top-level binding).".

Another advantage is that having it all in `unikernel.ml` means adding and removing
arguments doesn't need another execution of `mirage configure`. Also, any
type that the unikernel depends on can be used - the `config.ml` is compiled only
with a small set of dependencies (mirage itself) - and we don't want to impose a
large dependency cone for mirage just because someone may like to use
`X509.Key_type.t` as argument type.
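
The two exceptions described above follow from a register-first, read-later protocol. As a rough, stdlib-only model of that idea (this is not the `Mirage_runtime` implementation; `parsed`, `pending`, `parse` and the association list standing in for the command line are hypothetical), consider:

```OCaml
(* Toy model only; all names here are invented for illustration. *)

(* Becomes [true] once the (simulated) command line has been parsed. *)
let parsed = ref false

(* Callbacks that fill in each registered argument during [parse]. *)
let pending : ((string * string) list -> unit) list ref = ref []

(* Must be called before [parse]; returns a thunk that may only be
   forced after [parse] has run. *)
let register_arg ~name ~default : unit -> string =
  if !parsed then invalid_arg "register_arg was called too late";
  let cell = ref None in
  let fill argv =
    cell := Some (Option.value (List.assoc_opt name argv) ~default)
  in
  pending := fill :: !pending;
  fun () ->
    match !cell with
    | Some v -> v
    | None -> invalid_arg "Called too early"

(* Simulates what happens right before [start] is executed. *)
let parse argv =
  List.iter (fun fill -> fill argv) !pending;
  parsed := true

(* A top-level binding, as in unikernel.ml. *)
let hello = register_arg ~name:"hello" ~default:"Hello World!"
```

In this model, forcing `hello ()` before `parse` raises `Invalid_argument`, and so does calling `register_arg` after `parse`, which mirrors why registration has to happen in a top-level binding and reads have to wait until `start`.
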

Earlier, before mirage 4.5.0, we had runtime and configure arguments mixed
together, and code was generated when `mirage configure` was executed to
deal with these arguments. This had several downsides. We needed serialization
for all command-line arguments: at configure time you could fill in the
argument, which was then serialized, deserialized at runtime, and used unless
the argument was provided explicitly. The arguments had to appear in
`config.ml`, which also means changing any of them needed an execution of
`mirage configure`. And since they generated code, potential errors were in code
that the developer didn't write (though we had some `__POS__` arguments to
provide error locations in the developer code).
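
The precedence between the three possible sources of a value in that older scheme can be modelled in a few lines. This is a hypothetical sketch (the `resolve` function and its labels are invented here and do not come from the generated code):

```OCaml
(* Hypothetical model of the pre-4.5.0 behaviour described above. *)
let resolve ~default ~configured ~runtime =
  match runtime, configured with
  | Some v, _ -> v         (* passed explicitly when the unikernel starts *)
  | None, Some v -> v      (* filled in at `mirage configure` time *)
  | None, None -> default  (* the default given in the cmdliner term *)
```

For example, `resolve ~default:"Hello World!" ~configured:(Some "Hi") ~runtime:None` evaluates to `"Hi"`. The configure-time value had to survive until runtime, which is why every argument type needed a serializer.
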

Related recent changes are:

- in mirage 4.8.1, the runtime arguments to configure the OCaml runtime system
  (such as GC settings, randomization of hashtables, recording of backtraces)
  are now provided using the [cmdliner-stdlib](https://ocaml.org/p/cmdliner-stdlib)
  package.
- in mirage 4.8.0, for git, dns-client, and happy-eyeballs devices the optional
  arguments are generated by default - so they are always available and don't
  need to be manually done by the unikernel developer.

Let's dive a bit deeper into the history.

## History

MirageOS has, since the early stages (I'll go back to 2.7.0 (February 2016),
where functoria was introduced), used an embedded fork of `cmdliner` to handle
command line arguments.

[![Animated changes to the hello world unikernel](https://asciinema.org/a/ruHoadi2oZGOzgzMKk5ZYoFgf.svg)](https://asciinema.org/a/ruHoadi2oZGOzgzMKk5ZYoFgf)

### February 2016 (Mirage 2.7.0)
When looking into the MirageOS 2.x series, here's the code for our hello world
unikernel:

`config.ml`
```OCaml
open Mirage

let hello =
  let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in
  Key.(create "hello" Arg.(opt string "Hello World!" doc))

let main =
  foreign
    ~keys:[Key.abstract hello]
    "Unikernel.Hello" (console @-> job)

let () = register "hello-key" [main $ default_console]
```

and `unikernel.ml`
```OCaml
open Lwt.Infix

module Hello (C: V1_LWT.CONSOLE) = struct
  let start c =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        C.log c (Key_gen.hello ());
        OS.Time.sleep 1.0 >>= fun () ->
        loop (n-1)
    in
    loop 4
end
```

As you can see, the cmdliner term was provided in `config.ml`, and in
`unikernel.ml` the expression `Key_gen.hello ()` was used - `Key_gen` was
a module generated by the `mirage configure` invocation.

You can as well see that the term was wrapped in `Key.create "hello"` - where
this string was used as the identifier for the code generation.

As mentioned above, any change to the argument had to be made in `config.ml`,
and `mirage configure` had to be run again for it to take effect.

### July 2016 (Mirage 2.9.1)
The `OS.Time` was functorized with a `Time` functor:

`config.ml`
```OCaml
open Mirage

let hello =
  let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in
  Key.(create "hello" Arg.(opt string "Hello World!" doc))

let main =
  foreign
    ~keys:[Key.abstract hello]
    "Unikernel.Hello" (console @-> time @-> job)

let () = register "hello-key" [main $ default_console $ default_time]
```

and `unikernel.ml`
```OCaml
open Lwt.Infix

module Hello (C: V1_LWT.CONSOLE) (Time : V1_LWT.TIME) = struct
  let start c _time =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        C.log c (Key_gen.hello ());
        Time.sleep 1.0 >>= fun () ->
        loop (n-1)
    in
    loop 4
end
```
### February 2017 (Mirage pre3)
The `Time` signature changed, now the `sleep_ns` function sleeps in nanoseconds.
This avoids floating point numbers at the core of MirageOS. The helper package
`duration` is used to avoid manual conversions.

Also, the console signature changed - and `log` is now inside the Lwt monad.

`config.ml`
```OCaml
open Mirage

let hello =
  let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in
  Key.(create "hello" Arg.(opt string "Hello World!" doc))

let main =
  foreign
    ~keys:[Key.abstract hello]
    ~packages:[package "duration"]
    "Unikernel.Hello" (console @-> time @-> job)

let () = register "hello-key" [main $ default_console $ default_time]
```

and `unikernel.ml`
```OCaml
open Lwt.Infix

module Hello (C: V1_LWT.CONSOLE) (Time : V1_LWT.TIME) = struct
  let start c _time =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        C.log c (Key_gen.hello ()) >>= fun () ->
        Time.sleep_ns (Duration.of_sec 1) >>= fun () ->
        loop (n-1)
    in
    loop 4
end
```
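
As a small illustration of why integer nanoseconds are attractive, here is a stdlib-only sketch. The `of_sec` function below is a simplified stand-in written for this example; the real `Duration.of_sec` lives in the `duration` package and may perform additional checks.

```OCaml
(* Simplified stand-in for Duration.of_sec: convert whole seconds to
   nanoseconds as an int64 (no overflow or sign checks here). *)
let ns_per_sec = 1_000_000_000L

let of_sec (s : int) : int64 = Int64.mul (Int64.of_int s) ns_per_sec

(* Integer nanoseconds add up exactly: 0.1 s + 0.2 s = 0.3 s. *)
let three_tenths_ns = Int64.add 100_000_000L 200_000_000L

(* Binary floating point cannot represent 0.1 or 0.2 exactly, so the
   same sum in floats is not equal to 0.3. *)
let float_sum_is_exact = 0.1 +. 0.2 = 0.3
```

Here `of_sec 1` is `1_000_000_000L`, `three_tenths_ns` is exactly `300_000_000L`, and `float_sum_is_exact` is `false`.
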
### February 2017 (Mirage 3)
Another big change is that the console is no longer used; logging now goes
through [logs](https://erratique.ch/software/logs).

`config.ml`
```OCaml
open Mirage

let hello =
  let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in
  Key.(create "hello" Arg.(opt string "Hello World!" doc))

let main =
  foreign
    ~keys:[Key.abstract hello]
    ~packages:[package "duration"]
    "Unikernel.Hello" (time @-> job)

let () = register "hello-key" [main $ default_time]
```

and `unikernel.ml`
```OCaml
open Lwt.Infix

module Hello (Time : Mirage_time_lwt.S) = struct
  let start _time =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        Logs.info (fun f -> f "%s" (Key_gen.hello ()));
        Time.sleep_ns (Duration.of_sec 1) >>= fun () ->
        loop (n-1)
    in
    loop 4
end
```
### January 2020 (Mirage 3.7.0)
The `_lwt` suffix is dropped from the interfaces. We used to have `Mirage_time`
and `Mirage_time_lwt`, where the latter was instantiating the former with
concrete types: `type 'a io = 'a Lwt.t` and `type buffer = Cstruct.t`. In a
cleanup session we dropped the `_lwt` interfaces and opam packages. The
reasoning was that when we get around to moving to another IO system, we'll move
everything at once anyways. No need to have `lwt` and something else (`async`,
or nowadays `miou` or `eio`) in a single unikernel.

`config.ml`
```OCaml
open Mirage

let hello =
  let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in
  Key.(create "hello" Arg.(opt string "Hello World!" doc))

let main =
  foreign
    ~keys:[Key.abstract hello]
    ~packages:[package "duration"]
    "Unikernel.Hello" (time @-> job)

let () = register "hello-key" [main $ default_time]
```

and `unikernel.ml`
```OCaml
open Lwt.Infix

module Hello (Time : Mirage_time.S) = struct
  let start _time =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        Logs.info (fun f -> f "%s" (Key_gen.hello ()));
        Time.sleep_ns (Duration.of_sec 1) >>= fun () ->
        loop (n-1)
    in
    loop 4
end
```
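
The relationship between the two interface flavours can be sketched with plain OCaml signatures. This is a simplified model, not the actual `Mirage_time` code: `TIME`, `TIME_THUNK`, `TIME_SIMPLE`, `Fake_time` and the `thunk` type (standing in for `Lwt.t`) are names invented for this example.

```OCaml
(* Generic signature, abstract over the IO type. *)
module type TIME = sig
  type 'a io
  val sleep_ns : int64 -> unit io
end

(* A delayed computation; plays the role of Lwt.t in this sketch. *)
type 'a thunk = unit -> 'a

(* The specialised flavour: the generic signature with [io] made concrete. *)
module type TIME_THUNK = TIME with type 'a io = 'a thunk

(* A dummy implementation that does not really sleep. *)
module Fake_time : TIME_THUNK = struct
  type 'a io = 'a thunk
  let sleep_ns _ns = fun () -> ()
end

(* After such a cleanup, a single signature with the concrete type suffices. *)
module type TIME_SIMPLE = sig
  val sleep_ns : int64 -> unit thunk
end

module Simple : TIME_SIMPLE = Fake_time
```

The same implementation satisfies both the specialised and the simplified signature, which suggests why keeping two parallel sets of interfaces added little as long as only one IO system is in use.
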
### October 2021 (Mirage 3.10)
Some renamings to fix warnings. Only `config.ml` changed.

`config.ml`
```OCaml
open Mirage

let hello =
  let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in
  Key.(create "hello" Arg.(opt string "Hello World!" doc))

let main =
  main
    ~keys:[key hello]
    ~packages:[package "duration"]
    "Unikernel.Hello" (time @-> job)

let () = register "hello-key" [main $ default_time]
```

and `unikernel.ml`
```OCaml
open Lwt.Infix

module Hello (Time : Mirage_time.S) = struct
  let start _time =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        Logs.info (fun f -> f "%s" (Key_gen.hello ()));
        Time.sleep_ns (Duration.of_sec 1) >>= fun () ->
        loop (n-1)
    in
    loop 4
end
```
### June 2023 (Mirage 4.4)
The argument was moved to runtime.

`config.ml`
```OCaml
open Mirage

let hello =
  let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in
  Key.(create "hello" Arg.(opt ~stage:`Run string "Hello World!" doc))

let main =
  main
    ~keys:[key hello]
    ~packages:[package "duration"]
    "Unikernel.Hello" (time @-> job)

let () = register "hello-key" [main $ default_time]
```

and `unikernel.ml`
```OCaml
open Lwt.Infix

module Hello (Time : Mirage_time.S) = struct
  let start _time =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        Logs.info (fun f -> f "%s" (Key_gen.hello ()));
        Time.sleep_ns (Duration.of_sec 1) >>= fun () ->
        loop (n-1)
    in
    loop 4
end
```
### March 2024 (Mirage 4.5)
The runtime argument is defined in `unikernel.ml`; `config.ml` refers to it by
name as a string ("Unikernel.hello"), and its value is passed to the `start`
function as an argument.

`config.ml`
```OCaml
open Mirage

let runtime_args = [ runtime_arg ~pos:__POS__ "Unikernel.hello" ]

let main =
  main
    ~runtime_args
    ~packages:[package "duration"]
    "Unikernel.Hello" (time @-> job)

let () = register "hello-key" [main $ default_time]
```

and `unikernel.ml`
```OCaml
open Lwt.Infix
open Cmdliner

let hello =
  let doc = Arg.info ~doc:"How to say hello." [ "hello" ] in
  Arg.(value & opt string "Hello World!" doc)

module Hello (Time : Mirage_time.S) = struct
  let start _time hello =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        Logs.info (fun f -> f "%s" hello);
        Time.sleep_ns (Duration.of_sec 1) >>= fun () ->
        loop (n-1)
    in
    loop 4
end
```
### October 2024 (Mirage 4.8)
Again, moved out of `config.ml`.

`config.ml`
```OCaml
open Mirage

let main =
  main
    ~packages:[package "duration"]
    "Unikernel.Hello" (time @-> job)

let () = register "hello-key" [main $ default_time]
```

and `unikernel.ml`
```OCaml
open Lwt.Infix
open Cmdliner

let hello =
  let doc = Arg.info ~doc:"How to say hello." [ "hello" ] in
  Mirage_runtime.register_arg Arg.(value & opt string "Hello World!" doc)

module Hello (Time : Mirage_time.S) = struct
  let start _time =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        Logs.info (fun f -> f "%s" (hello ()));
        Time.sleep_ns (Duration.of_sec 1) >>= fun () ->
        loop (n-1)
    in
    loop 4
end
```
### 2024 (Not yet released)
This is the future with time defunctorized. Read more in the [discussion](https://github.com/mirage/mirage/issues/1513).
To delay the start function, a `dep` of `noop` is introduced.

`config.ml`
```OCaml
open Mirage

let main =
  main
    ~packages:[package "duration"]
    ~dep:[dep noop]
    "Unikernel" job

let () = register "hello-key" [main]
```

and `unikernel.ml`
```OCaml
open Lwt.Infix
open Cmdliner

let hello =
  let doc = Arg.info ~doc:"How to say hello." [ "hello" ] in
  Mirage_runtime.register_arg Arg.(value & opt string "Hello World!" doc)

let start () =
  let rec loop = function
    | 0 -> Lwt.return_unit
    | n ->
      Logs.info (fun f -> f "%s" (hello ()));
      Mirage_timer.sleep_ns (Duration.of_sec 1) >>= fun () ->
      loop (n-1)
  in
  loop 4
```
## Conclusion
The history of hello world shows that over time we slowly improve the developer
experience and remove the boilerplate needed to get MirageOS unikernels up
and running. This is the work of over a decade, including lots of other (here
invisible) improvements to the mirage utility.

Our current goal is to minimize the code generated by mirage, since code
generation has lots of issues (e.g. error locations, naming, binary size). It
is a long journey. At the same time, we are working on improving the performance
of MirageOS unikernels, developing unikernels that are useful in the real
world ([VPN endpoint](https://github.com/robur-coop/miragevpn), [DNSmasq replacement](https://github.com/robur-coop/dnsvizor), ...), and also [simplifying the
deployment of MirageOS unikernels](https://github.com/robur-coop/mollymawk).

If you're interested in MirageOS and using it in your domain, don't hesitate
to reach out to us (via email: team@robur.coop) - we're keen to deploy MirageOS
and find more domains where it is useful. If you can spare a dime, we're a
registered non-profit in Germany - and can provide tax-deductible receipts for
donations ([more information](https://robur.coop/Donate)).