From db5e8fd9cb09298e862bbf8e0dc10f114b40edb8 Mon Sep 17 00:00:00 2001 From: Hannes Mehnert Date: Mon, 21 Oct 2024 16:15:38 +0200 Subject: [PATCH] add arguments article --- articles/arguments.md | 538 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 538 insertions(+) create mode 100644 articles/arguments.md diff --git a/articles/arguments.md b/articles/arguments.md new file mode 100644 index 0000000..ba225d7 --- /dev/null +++ b/articles/arguments.md @@ -0,0 +1,538 @@ +--- +date: 2024-10-22 +title: Runtime arguments in MirageOS +description: + The history of runtime arguments to a MirageOS unikernel +tags: + - OCaml + - MirageOS +author: + name: Hannes Mehnert + email: hannes@mehnert.org + link: https://hannes.robur.coop +--- + +TL;DR: Passing runtime arguments around is tricky, and prone to change every other month. + +## Motivation + +Sometimes, as a unikernel developer and also as an operator, it's nice to have +some runtime arguments passed to a unikernel. Now, if you're into OCaml, +command-line parsing - together with error messages, man page generation, ... - +can be done by the amazing [cmdliner](https://erratique.ch/software/cmdliner) +package from Daniel Bünzli. + +MirageOS uses cmdliner for command-line argument passing. This also enabled +us from the early days to have nice man pages for unikernels (see +`my-unikernel-binary --help`). There are two kinds +of arguments: those at configuration time (`mirage configure`), such as the +target to compile for, and those at runtime - when the unikernel is executed. + +In Mirage 4.8.1 and 4.8.0 (released October 2024) there have been some changes +to command-line arguments, which were motivated by 4.5.0 (released April 2024) +and user feedback. + +First of all, here is our current way to pass a custom runtime argument to a unikernel +(`unikernel.ml`): +```OCaml +open Lwt.Infix +open Cmdliner + +let hello = + let doc = Arg.info ~doc:"How to say hello." 
[ "hello" ] in + let term = Arg.(value & opt string "Hello World!" doc) in + Mirage_runtime.register_arg term + +module Hello (Time : Mirage_time.S) = struct + let start _time = + let rec loop = function + | 0 -> Lwt.return_unit + | n -> + Logs.info (fun f -> f "%s" (hello ())); + Time.sleep_ns (Duration.of_sec 1) >>= fun () -> loop (n - 1) + in + loop 4 +end +``` + +We define the [Cmdliner.Term.t](https://erratique.ch/software/cmdliner/doc/Cmdliner/Term/index.html#type-t) +in line 6 (`let term = ..`) - which provides documentation ("How to say hello."), the option to +use (`["hello"]` - which is then translated to `--hello=`), that it is optional, +of type `string` (cmdliner allows you to convert the incoming strings to more +complex (or more narrow) data types, with decent error handling). + +The defined argument is directly passed to [`Mirage_runtime.register_arg`](https://ocaml.org/p/mirage-runtime/4.8.1/doc/Mirage_runtime/index.html#val-register_arg), +(in line 7) so our binding `hello` is of type `unit -> string`. +In line 14, the value of the runtime argument is used (`hello ()`) for printing +a log message. + +The nice property is that it is all local in `unikernel.ml`, there are no other +parts involved. It is just a bunch of API calls. The downside is that `hello ()` +should only be evaluated after the function `start` was called - since the +`Mirage_runtime` needs to parse and fill in the command line arguments. If you +call `hello ()` earlier, you'll get an exception "Called too early. Please delay +this call to after the start function of the unikernel.". Also, since +Mirage_runtime needs to collect and evaluate the command line arguments, the +`Mirage_runtime.register_arg` may only be called at top-level, otherwise you'll +get another exception "The function register_arg was called to late. Please call +register_arg before the start function is executed (e.g. in a top-level binding).". 
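
The two exceptions quoted above follow from a simple lifecycle: arguments are registered at top level, evaluated once by the runtime, and only then read. The sketch below models that lifecycle in plain OCaml. It is a toy under stated assumptions, not the `Mirage_runtime` implementation: the `register_arg` and `run` defined here are hypothetical stand-ins, the "command line" is just an association list, and the error messages are shortened.

```ocaml
(* Toy model of "register at top level, read after start".
   NOT the Mirage_runtime implementation: [register_arg] and [run] are
   hypothetical stand-ins, and the command line is an association list. *)

(* Becomes true once the (simulated) command line has been evaluated. *)
let evaluated = ref false

(* Fillers collected at registration time; each one stores a parsed value. *)
let fillers : ((string * string) list -> unit) list ref = ref []

(* Register an argument [name] with a [default]; returns a getter. *)
let register_arg ~name ~default : unit -> string =
  if !evaluated then invalid_arg "register_arg was called too late";
  let cell = ref None in
  let fill argv =
    cell := Some (Option.value (List.assoc_opt name argv) ~default)
  in
  fillers := fill :: !fillers;
  fun () ->
    match !cell with
    | Some v -> v
    | None -> invalid_arg "Called too early"

(* Simulates the runtime: evaluate all registered arguments, then start. *)
let run argv start =
  List.iter (fun fill -> fill argv) !fillers;
  evaluated := true;
  start ()

(* Top-level binding, as in unikernel.ml. *)
let hello = register_arg ~name:"hello" ~default:"Hello World!"

let () =
  (* Reading before [run] fails, mirroring the "called too early" case. *)
  (match hello () with
   | _ -> assert false
   | exception Invalid_argument _ -> print_endline "too early");
  (* After evaluation, the getter yields the supplied value. *)
  run [ ("hello", "Hi there") ] (fun () -> print_endline (hello ()))
```

In this model the getter is a closure over a mutable cell that is only filled when `run` evaluates the registered arguments, which is why the binding has type `unit -> string` rather than `string`.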
+ +Another advantage is that having it all in `unikernel.ml` means adding and removing +arguments doesn't need another execution of `mirage configure`. Also, any +type that the unikernel depends on can be used - the `config.ml` is compiled only +with a small set of dependencies (mirage itself) - and we don't want to impose a +large dependency cone for mirage just because someone may like to use +X509.Key_type.t as argument type. + +Earlier, before mirage 4.5.0, we had runtime and configure arguments mixed +together, and code was generated when `mirage configure` was executed to +deal with these arguments. The downsides included: we needed serialization for +all command-line arguments (at configure time you could fill the argument, which +was then serialized, and deserialized at runtime and used unless the argument +was provided explicitly); they had to appear in `config.ml` (which also means +changing any would need an execution of `mirage configure`); and since they generated code, +potential errors were in code that the developer didn't write (though we had +some `__POS__` arguments to provide error locations in the developer code). + +Related recent changes are: +- in mirage 4.8.1, the runtime arguments to configure the OCaml runtime system + (such as GC settings, randomization of hashtables, recording of backtraces) + are now provided using the [cmdliner-stdlib](https://ocaml.org/p/cmdliner-stdlib) + package. +- in mirage 4.8.0, for git, dns-client, and happy-eyeballs devices the optional + arguments are generated by default - so they are always available and don't + need to be added manually by the unikernel developer. + +Let's dive a bit deeper into the history. + +## History + +Since the early stages (I'll go back to 2.7.0 (February 2016) where +functoria was introduced), MirageOS has used an embedded fork of `cmdliner` to handle command +line arguments. 
+ +[![Animated changes to the hello world unikernel](https://asciinema.org/a/ruHoadi2oZGOzgzMKk5ZYoFgf.svg)](https://asciinema.org/a/ruHoadi2oZGOzgzMKk5ZYoFgf) + +### February 2016 (Mirage 2.7.0) + +When looking into the MirageOS 2.x series, here's the code for our hello world +unikernel: + +`config.ml` +```OCaml +open Mirage + +let hello = + let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in + Key.(create "hello" Arg.(opt string "Hello World!" doc)) + +let main = + foreign + ~keys:[Key.abstract hello] + "Unikernel.Hello" (console @-> job) + +let () = register "hello-key" [main $ default_console] +``` + +and `unikernel.ml` +```OCaml +open Lwt.Infix + +module Hello (C: V1_LWT.CONSOLE) = struct + let start c = + let rec loop = function + | 0 -> Lwt.return_unit + | n -> + C.log c (Key_gen.hello ()); + OS.Time.sleep 1.0 >>= fun () -> + loop (n-1) + in + loop 4 +end +``` + +As you can see, the cmdliner term was provided in `config.ml`, and in +`unikernel.ml` the expression `Key_gen.hello ()` was used - `Key_gen` was +a module generated by the `mirage configure` invocation. + +You can also see that the term was wrapped in `Key.create "hello"` - where +this string was used as the identifier for the code generation. + +As mentioned above, any change had to be made in `config.ml` and needed a +`mirage configure` run to take effect. + +### July 2016 (Mirage 2.9.1) + +The `OS.Time` was functorized with a `Time` functor: + +`config.ml` +```OCaml +open Mirage + +let hello = + let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in + Key.(create "hello" Arg.(opt string "Hello World!" 
doc)) + +let main = + foreign + ~keys:[Key.abstract hello] + "Unikernel.Hello" (console @-> time @-> job) + +let () = register "hello-key" [main $ default_console $ default_time] +``` + +and `unikernel.ml` +```OCaml +open Lwt.Infix + +module Hello (C: V1_LWT.CONSOLE) (Time : V1_LWT.TIME) = struct + let start c _time = + let rec loop = function + | 0 -> Lwt.return_unit + | n -> + C.log c (Key_gen.hello ()); + Time.sleep 1.0 >>= fun () -> + loop (n-1) + in + loop 4 +end +``` + +### February 2017 (Mirage pre3) + +The `Time` signature changed, now the `sleep_ns` function sleeps in nanoseconds. +This avoids floating point numbers at the core of MirageOS. The helper package +`duration` is used to avoid manual conversions. + +Also, the console signature changed - and `log` is now inside the Lwt monad. + +`config.ml` +```OCaml +open Mirage + +let hello = + let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in + Key.(create "hello" Arg.(opt string "Hello World!" doc)) + +let main = + foreign + ~keys:[Key.abstract hello] + ~packages:[package "duration"] + "Unikernel.Hello" (console @-> time @-> job) + +let () = register "hello-key" [main $ default_console $ default_time] +``` + +and `unikernel.ml` +```OCaml +open Lwt.Infix + +module Hello (C: V1_LWT.CONSOLE) (Time : V1_LWT.TIME) = struct + let start c _time = + let rec loop = function + | 0 -> Lwt.return_unit + | n -> + C.log c (Key_gen.hello ()) >>= fun () -> + Time.sleep_ns (Duration.of_sec 1) >>= fun () -> + loop (n-1) + in + loop 4 +end +``` + +### February 2017 (Mirage 3) + +Another big change is that now console is not used anymore, but +[logs](https://erratique.ch/software/logs). + +`config.ml` +```OCaml +open Mirage + +let hello = + let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in + Key.(create "hello" Arg.(opt string "Hello World!" 
doc)) + +let main = + foreign + ~keys:[Key.abstract hello] + ~packages:[package "duration"] + "Unikernel.Hello" (time @-> job) + +let () = register "hello-key" [main $ default_time] +``` + +and `unikernel.ml` +```OCaml +open Lwt.Infix + +module Hello (Time : Mirage_time_lwt.S) = struct + let start _time = + let rec loop = function + | 0 -> Lwt.return_unit + | n -> + Logs.info (fun f -> f "%s" (Key_gen.hello ())); + Time.sleep_ns (Duration.of_sec 1) >>= fun () -> + loop (n-1) + in + loop 4 +end +``` + +### January 2020 (Mirage 3.7.0) + +The `_lwt` is dropped from the interfaces. We used to have Mirage_time and +Mirage_time_lwt - where the latter was instantiating the former with concrete +types: `type 'a io = 'a Lwt.t` and `type buffer = Cstruct.t`. In a cleanup +session we dropped the `_lwt` interfaces and opam packages. The reasoning was +that when we get around to moving to another IO system, we'll move everything +at once anyway. No need to have `lwt` and something else (`async`, or nowadays +`miou` or `eio`) in a single unikernel. + +`config.ml` +```OCaml +open Mirage + +let hello = + let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in + Key.(create "hello" Arg.(opt string "Hello World!" doc)) + +let main = + foreign + ~keys:[Key.abstract hello] + ~packages:[package "duration"] + "Unikernel.Hello" (time @-> job) + +let () = register "hello-key" [main $ default_time] +``` + +and `unikernel.ml` +```OCaml +open Lwt.Infix + +module Hello (Time : Mirage_time.S) = struct + let start _time = + let rec loop = function + | 0 -> Lwt.return_unit + | n -> + Logs.info (fun f -> f "%s" (Key_gen.hello ())); + Time.sleep_ns (Duration.of_sec 1) >>= fun () -> + loop (n-1) + in + loop 4 +end +``` + +### October 2021 (Mirage 3.10) + +Some renamings to fix warnings. Only `config.ml` changed. + +`config.ml` +```OCaml +open Mirage + +let hello = + let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in + Key.(create "hello" Arg.(opt string "Hello World!" 
doc)) + +let main = + main + ~keys:[key hello] + ~packages:[package "duration"] + "Unikernel.Hello" (time @-> job) + +let () = register "hello-key" [main $ default_time] +``` + +and `unikernel.ml` +```OCaml +open Lwt.Infix + +module Hello (Time : Mirage_time.S) = struct + let start _time = + let rec loop = function + | 0 -> Lwt.return_unit + | n -> + Logs.info (fun f -> f "%s" (Key_gen.hello ())); + Time.sleep_ns (Duration.of_sec 1) >>= fun () -> + loop (n-1) + in + loop 4 +end +``` + +### June 2023 (Mirage 4.4) + +The argument was moved to runtime. + +`config.ml` +```OCaml +open Mirage + +let hello = + let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in + Key.(create "hello" Arg.(opt ~stage:`Run string "Hello World!" doc)) + +let main = + main + ~keys:[key hello] + ~packages:[package "duration"] + "Unikernel.Hello" (time @-> job) + +let () = register "hello-key" [main $ default_time] +``` + +and `unikernel.ml` +```OCaml +open Lwt.Infix + +module Hello (Time : Mirage_time.S) = struct + let start _time = + let rec loop = function + | 0 -> Lwt.return_unit + | n -> + Logs.info (fun f -> f "%s" (Key_gen.hello ())); + Time.sleep_ns (Duration.of_sec 1) >>= fun () -> + loop (n-1) + in + loop 4 +end +``` + +### March 2024 (Mirage 4.5) + +The runtime argument is declared in `config.ml`, referring to the argument by a string +("Unikernel.hello"), and is passed to the `start` function as an argument. + +`config.ml` +```OCaml +open Mirage + +let runtime_args = [ runtime_arg ~pos:__POS__ "Unikernel.hello" ] + +let main = + main + ~runtime_args + ~packages:[package "duration"] + "Unikernel.Hello" (time @-> job) + +let () = register "hello-key" [main $ default_time] +``` + +and `unikernel.ml` +```OCaml +open Lwt.Infix +open Cmdliner + +let hello = + let doc = Arg.info ~doc:"How to say hello." [ "hello" ] in + Arg.(value & opt string "Hello World!" 
doc) + +module Hello (Time : Mirage_time.S) = struct + let start _time hello = + let rec loop = function + | 0 -> Lwt.return_unit + | n -> + Logs.info (fun f -> f "%s" hello); + Time.sleep_ns (Duration.of_sec 1) >>= fun () -> + loop (n-1) + in + loop 4 +end +``` + +### October 2024 (Mirage 4.8) + +Again, moved out of `config.ml`. + +`config.ml` +```OCaml +open Mirage + +let main = + main + ~packages:[package "duration"] + "Unikernel.Hello" (time @-> job) + +let () = register "hello-key" [main $ default_time] +``` + +and `unikernel.ml` +```OCaml +open Lwt.Infix +open Cmdliner + +let hello = + let doc = Arg.info ~doc:"How to say hello." [ "hello" ] in + Mirage_runtime.register_arg Arg.(value & opt string "Hello World!" doc) + +module Hello (Time : Mirage_time.S) = struct + let start _time = + let rec loop = function + | 0 -> Lwt.return_unit + | n -> + Logs.info (fun f -> f "%s" (hello ())); + Time.sleep_ns (Duration.of_sec 1) >>= fun () -> + loop (n-1) + in + loop 4 +end +``` + +### 2024 (Not yet released) + +This is the future with time defunctorized. Read more in the [discussion](https://github.com/mirage/mirage/issues/1513). +To delay the start function, a `dep` of `noop` is introduced. + +`config.ml` +```OCaml +open Mirage + +let main = + main + ~packages:[package "duration"] + ~dep:[dep noop] + "Unikernel" job + +let () = register "hello-key" [main] +``` + +and `unikernel.ml` +```OCaml +open Lwt.Infix +open Cmdliner + +let hello = + let doc = Arg.info ~doc:"How to say hello." [ "hello" ] in + Mirage_runtime.register_arg Arg.(value & opt string "Hello World!" doc) + +let start () = + let rec loop = function + | 0 -> Lwt.return_unit + | n -> + Logs.info (fun f -> f "%s" (hello ())); + Mirage_timer.sleep_ns (Duration.of_sec 1) >>= fun () -> + loop (n-1) + in + loop 4 +``` + +## Conclusion + +The history of hello world shows that over time we slowly improve the developer +experience, and remove the boilerplate needed to get MirageOS unikernels up +and running. 
This is the work of over a decade, including lots of other (here invisible) +improvements to the mirage utility. + +Our current goal is to minimize the code generated by mirage, since code +generation has lots of issues (e.g. error locations, naming, binary size). It +is a long journey. At the same time, we are working on improving the performance +of MirageOS unikernels, developing unikernels that are useful in the real +world ([VPN endpoint](https://github.com/robur-coop/miragevpn), [DNSmasq replacement](https://github.com/robur-coop/dnsvizor), ...), and also [simplifying the +deployment of MirageOS unikernels](https://github.com/robur-coop/mollymawk). + +If you're interested in MirageOS and using it in your domain, don't hesitate +to reach out to us (via email: team@robur.coop) - we're keen to deploy MirageOS +and find more domains where it is useful. If you can spare a dime, we're a +registered non-profit in Germany - and can provide tax-deductible receipts for +donations ([more information](https://robur.coop/Donate)).