blog.robur.coop

The Robur cooperative blog.

Runtime arguments in MirageOS

TL;DR: Passing runtime arguments around is tricky, and prone to change every other month.

Motivation

Sometimes, as a unikernel developer and also as an operator, it's nice to have some runtime arguments passed to a unikernel. Now, if you're into OCaml, command-line parsing - together with error messages, man page generation, ... - can be done by the amazing cmdliner package from Daniel Bünzli.

MirageOS uses cmdliner for command line argument passing. This also enabled us from the early days to have nice man pages for unikernels (see my-unikernel-binary --help). There are two kinds of arguments: those at configuration time (mirage configure), such as the target to compile for, and those at runtime - when the unikernel is executed.

In Mirage 4.8.1 and 4.8.0 (released October 2024) there have been some changes to command-line arguments, which were motivated by 4.5.0 (released April 2024) and user feedback.

First of all, our current way to pass a custom runtime argument to a unikernel (unikernel.ml):

open Lwt.Infix
open Cmdliner

let hello =
  let doc = Arg.info ~doc:"How to say hello." [ "hello" ] in
  let term = Arg.(value & opt string "Hello World!" doc) in
  Mirage_runtime.register_arg term

module Hello (Time : Mirage_time.S) = struct
  let start _time =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
          Logs.info (fun f -> f "%s" (hello ()));
          Time.sleep_ns (Duration.of_sec 1) >>= fun () -> loop (n - 1)
    in
    loop 4
end

We define the Cmdliner.Term.t in line 6 (let term = ..). It carries the documentation ("How to say hello."), the option name (["hello"], which is translated to --hello=), the fact that the argument is optional, and its type, string. cmdliner also allows you to convert the incoming strings to more complex (or more narrow) data types, with decent error handling.

The defined argument is passed directly to Mirage_runtime.register_arg (in line 7), so our binding hello is of type unit -> string. In line 14, the value of the runtime argument is used (hello ()) for printing a log message.
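To make "convert the incoming strings to more narrow data types, with decent error handling" concrete, here is a minimal sketch in plain OCaml. It does not use cmdliner; the names level, level_of_string and opt are hypothetical and only model the idea of a converter that either yields a typed value or an error message.

```ocaml
(* Hypothetical model of an argument converter, NOT cmdliner's API:
   a converter turns a raw string into a narrower type or an error. *)
type level = Quiet | Normal | Verbose

let level_of_string = function
  | "quiet" -> Ok Quiet
  | "normal" -> Ok Normal
  | "verbose" -> Ok Verbose
  | s -> Error (Printf.sprintf "invalid value %S" s)

(* Look for "--name=value" in argv; fall back to the default if absent. *)
let opt ~name ~default conv argv =
  let prefix = "--" ^ name ^ "=" in
  let plen = String.length prefix in
  let matches a = String.length a >= plen && String.sub a 0 plen = prefix in
  match List.find_opt matches argv with
  | None -> Ok default
  | Some a -> conv (String.sub a plen (String.length a - plen))

let () =
  match opt ~name:"level" ~default:Normal level_of_string [ "--level=loud" ] with
  | Ok _ -> print_endline "accepted"
  | Error msg -> print_endline msg
```

The rest of the program only ever sees a level, never a raw string, and a bad value is rejected at parse time instead of deep inside the unikernel.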

The nice property is that it is all local in unikernel.ml, there are no other parts involved. It is just a bunch of API calls. The downside is that hello () should only be evaluated after the function start was called - since the Mirage_runtime needs to parse and fill in the command line arguments. If you call hello () earlier, you'll get an exception "Called too early. Please delay this call to after the start function of the unikernel.". Also, since Mirage_runtime needs to collect and evaluate the command line arguments, the Mirage_runtime.register_arg may only be called at top-level, otherwise you'll get another exception "The function register_arg was called to late. Please call register_arg before the start function is executed (e.g. in a top-level binding).".

Another advantage is, having it all in unikernel.ml means adding and removing arguments doesn't need another execution of mirage configure. Also, any type can be used that the unikernel depends on - the config.ml is compiled only with a small set of dependencies (mirage itself) - and we don't want to impose a large dependency cone for mirage just because someone may like to use X509.Key_type.t as argument type.
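The two ordering rules (register at top-level, read only after start) can be illustrated with a small model. This is a sketch under assumptions, not the Mirage_runtime implementation: it uses an association list instead of cmdliner terms, and the names parsed, setters and parse are made up for the illustration.

```ocaml
(* Simplified, hypothetical model of the ordering rules; this is NOT
   how Mirage_runtime is implemented. *)

(* Becomes true once the arguments have been parsed (just before start). *)
let parsed = ref false

(* Every registered argument adds a setter that is fed the parsed argv. *)
let setters : ((string * string) list -> unit) list ref = ref []

(* Register an argument with a default value; returns a getter. *)
let register_arg name default : unit -> string =
  if !parsed then failwith "register_arg was called too late";
  let cell = ref None in
  let set argv =
    cell := Some (Option.value (List.assoc_opt name argv) ~default)
  in
  setters := set :: !setters;
  fun () ->
    match !cell with
    | Some v -> v
    | None -> failwith "Called too early"

(* Fill in all registered arguments; the runtime does this before start. *)
let parse argv =
  List.iter (fun set -> set argv) !setters;
  parsed := true

(* Top-level registration, as in unikernel.ml. *)
let hello = register_arg "hello" "Hello World!"

let () =
  (* Reading the value before parsing fails. *)
  (match hello () with
   | _ -> assert false
   | exception Failure _ -> print_endline "too early");
  parse [ ("hello", "Hi there") ];
  print_endline (hello ());
  (* Registering after parsing fails as well. *)
  match register_arg "bye" "Bye" with
  | _ -> assert false
  | exception Failure _ -> print_endline "too late"
```

The getter closes over a cell that is only filled by parse, which is why the binding has type unit -> string rather than string.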

Earlier, before mirage 4.5.0, runtime and configure arguments were mixed together, and the code to deal with these arguments was generated when mirage configure was executed. This had several downsides. We needed serialization for all command-line arguments: at configure time you could fill in the argument, which was then serialized, deserialized at runtime, and used unless the argument was provided explicitly. The arguments had to appear in config.ml, which also means that changing any of them required an execution of mirage configure. And since they generated code, potential errors were in code that the developer didn't write (though we had some __POS__ arguments to provide error locations in the developer code).
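The serialization requirement is easier to see in a toy model. The sketch below is an assumption-laden illustration, not the code mirage generated: the record key and the functions bake and resolve are hypothetical names. A value given at configure time is stored as a string, and at runtime an explicit argument wins over the stored one, which wins over the default.

```ocaml
(* Hypothetical model of the pre-4.5.0 scheme, NOT the generated code. *)
type 'a key = {
  default : 'a;
  to_string : 'a -> string; (* needed to store a configure-time value *)
  of_string : string -> 'a; (* needed to read it back at runtime *)
}

(* Configure time: serialize the value, if one was given. *)
let bake key configured = Option.map key.to_string configured

(* Runtime: an explicit argument beats the baked value beats the default. *)
let resolve key ~baked ~runtime =
  match runtime, baked with
  | Some s, _ | None, Some s -> key.of_string s
  | None, None -> key.default

let port = { default = 80; to_string = string_of_int; of_string = int_of_string }

let () =
  let baked = bake port (Some 8080) in
  Printf.printf "%d\n" (resolve port ~baked ~runtime:None);
  Printf.printf "%d\n" (resolve port ~baked ~runtime:(Some "9090"));
  Printf.printf "%d\n" (resolve port ~baked:None ~runtime:None)
```

Every argument type needs both to_string and of_string in this scheme, which is the burden that moving the arguments into unikernel.ml removes.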

Related recent changes are:

Let's dive a bit deeper into the history.

History

Since the early stages, MirageOS has used an embedded fork of cmdliner to handle command line arguments. I'll go back to 2.7.0 (February 2016), where functoria was introduced.

Animated changes to the hello world unikernel

February 2016 (Mirage 2.7.0)

When looking into the MirageOS 2.x series, here's the code for our hello world unikernel:

config.ml

open Mirage

let hello =
  let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in
  Key.(create "hello" Arg.(opt string "Hello World!" doc))

let main =
  foreign
    ~keys:[Key.abstract hello]
    "Unikernel.Hello" (console @-> job)

let () = register "hello-key" [main $ default_console]

and unikernel.ml

open Lwt.Infix

module Hello (C: V1_LWT.CONSOLE) = struct
  let start c =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        C.log c (Key_gen.hello ());
        OS.Time.sleep 1.0 >>= fun () ->
        loop (n-1)
    in
    loop 4
end

As you can see, the cmdliner term was provided in config.ml, and in unikernel.ml the expression Key_gen.hello () was used - Key_gen was a module generated by the mirage configure invocation.

You can as well see that the term was wrapped in Key.create "hello" - where this string was used as the identifier for the code generation.

As mentioned above, changing an argument required an edit in config.ml and another run of mirage configure to take effect.

July 2016 (Mirage 2.9.1)

The OS.Time was functorized with a Time functor:

config.ml

open Mirage

let hello =
  let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in
  Key.(create "hello" Arg.(opt string "Hello World!" doc))

let main =
  foreign
    ~keys:[Key.abstract hello]
    "Unikernel.Hello" (console @-> time @-> job)

let () = register "hello-key" [main $ default_console $ default_time]

and unikernel.ml

open Lwt.Infix

module Hello (C: V1_LWT.CONSOLE) (Time : V1_LWT.TIME) = struct
  let start c _time =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        C.log c (Key_gen.hello ());
        Time.sleep 1.0 >>= fun () ->
        loop (n-1)
    in
    loop 4
end

February 2017 (Mirage pre3)

The Time signature changed, now the sleep_ns function sleeps in nanoseconds. This avoids floating point numbers at the core of MirageOS. The helper package duration is used to avoid manual conversions.
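As a rough illustration of what such a conversion helper computes, here is a hand-written stand-in using only the standard library. The function names ns_of_sec and ns_of_ms are hypothetical; the real duration package provides its own functions (such as Duration.of_sec, used in the listings here) and more units.

```ocaml
(* Hypothetical stand-ins for duration helpers: integer nanoseconds
   as int64, so no floating point is involved. *)
let ns_of_sec (s : int) : int64 =
  Int64.mul (Int64.of_int s) 1_000_000_000L

let ns_of_ms (ms : int) : int64 =
  Int64.mul (Int64.of_int ms) 1_000_000L

let () =
  Printf.printf "1 s = %Ld ns\n" (ns_of_sec 1);
  Printf.printf "250 ms = %Ld ns\n" (ns_of_ms 250)
```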

Also, the console signature changed - and log is now inside the Lwt monad.

config.ml

open Mirage

let hello =
  let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in
  Key.(create "hello" Arg.(opt string "Hello World!" doc))

let main =
  foreign
    ~keys:[Key.abstract hello]
    ~packages:[package "duration"]
    "Unikernel.Hello" (console @-> time @-> job)

let () = register "hello-key" [main $ default_console $ default_time]

and unikernel.ml

open Lwt.Infix

module Hello (C: V1_LWT.CONSOLE) (Time : V1_LWT.TIME) = struct
  let start c _time =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        C.log c (Key_gen.hello ()) >>= fun () ->
        Time.sleep_ns (Duration.of_sec 1) >>= fun () ->
        loop (n-1)
    in
    loop 4
end

February 2017 (Mirage 3)

Another big change is that now console is not used anymore, but logs.

config.ml

open Mirage

let hello =
  let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in
  Key.(create "hello" Arg.(opt string "Hello World!" doc))

let main =
  foreign
    ~keys:[Key.abstract hello]
    ~packages:[package "duration"]
    "Unikernel.Hello" (time @-> job)

let () = register "hello-key" [main $ default_time]

and unikernel.ml

open Lwt.Infix

module Hello (Time : Mirage_time_lwt.S) = struct
  let start _time =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        Logs.info (fun f -> f "%s" (Key_gen.hello ()));
        Time.sleep_ns (Duration.of_sec 1) >>= fun () ->
        loop (n-1)
    in
    loop 4
end

January 2020 (Mirage 3.7.0)

The _lwt is dropped from the interfaces. We used to have Mirage_time and Mirage_time_lwt, where the latter was instantiating the former with concrete types: type 'a io = 'a Lwt.t and type buffer = Cstruct.t. In a cleanup session we dropped the _lwt interfaces and opam packages. The reasoning was that when we get around to moving to another IO system, we'll move everything at once anyway. There is no need to have lwt and something else (async, or nowadays miou or eio) in a single unikernel.
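The relationship between a generic interface and its _lwt instantiation can be sketched as follows. These are simplified, hypothetical signatures rather than the historical Mirage_time code, and Fake_lwt is a stand-in so that the sketch needs no external library.

```ocaml
(* Hypothetical, simplified signatures; not the historical Mirage code. *)

(* Generic interface: abstract over the IO type. *)
module type TIME = sig
  type 'a io
  val sleep_ns : int64 -> unit io
end

(* Stand-in for Lwt, to keep the sketch free of dependencies. *)
module Fake_lwt = struct
  type 'a t = Done of 'a
  let return x = Done x
  let run (Done x) = x
end

(* What a "_lwt" interface did: pin the IO type to a concrete one. *)
module type TIME_LWT = TIME with type 'a io = 'a Fake_lwt.t

(* An implementation that "sleeps" by returning immediately. *)
module Instant_time : TIME_LWT = struct
  type 'a io = 'a Fake_lwt.t
  let sleep_ns _ns = Fake_lwt.return ()
end

let () = Fake_lwt.run (Instant_time.sleep_ns 1_000_000_000L)
```

Dropping the _lwt layer amounts to keeping only the pinned signature, since every unikernel used the same IO type in practice.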

config.ml

open Mirage

let hello =
  let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in
  Key.(create "hello" Arg.(opt string "Hello World!" doc))

let main =
  foreign
    ~keys:[Key.abstract hello]
    ~packages:[package "duration"]
    "Unikernel.Hello" (time @-> job)

let () = register "hello-key" [main $ default_time]

and unikernel.ml

open Lwt.Infix

module Hello (Time : Mirage_time.S) = struct
  let start _time =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        Logs.info (fun f -> f "%s" (Key_gen.hello ()));
        Time.sleep_ns (Duration.of_sec 1) >>= fun () ->
        loop (n-1)
    in
    loop 4
end

October 2021 (Mirage 3.10)

Some renamings to fix warnings. Only config.ml changed.

config.ml

open Mirage

let hello =
  let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in
  Key.(create "hello" Arg.(opt string "Hello World!" doc))

let main =
  main
    ~keys:[key hello]
    ~packages:[package "duration"]
    "Unikernel.Hello" (time @-> job)

let () = register "hello-key" [main $ default_time]

and unikernel.ml

open Lwt.Infix

module Hello (Time : Mirage_time.S) = struct
  let start _time =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        Logs.info (fun f -> f "%s" (Key_gen.hello ()));
        Time.sleep_ns (Duration.of_sec 1) >>= fun () ->
        loop (n-1)
    in
    loop 4
end

June 2023 (Mirage 4.4)

The argument was moved to runtime.

config.ml

open Mirage

let hello =
  let doc = Key.Arg.info ~doc:"How to say hello." ["hello"] in
  Key.(create "hello" Arg.(opt ~stage:`Run string "Hello World!" doc))

let main =
  main
    ~keys:[key hello]
    ~packages:[package "duration"]
    "Unikernel.Hello" (time @-> job)

let () = register "hello-key" [main $ default_time]

and unikernel.ml

open Lwt.Infix

module Hello (Time : Mirage_time.S) = struct
  let start _time =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        Logs.info (fun f -> f "%s" (Key_gen.hello ()));
        Time.sleep_ns (Duration.of_sec 1) >>= fun () ->
        loop (n-1)
    in
    loop 4
end

March 2024 (Mirage 4.5)

The runtime argument is referred to in config.ml by a string ("Unikernel.hello"), and its value is passed to the start function as an argument.

config.ml

open Mirage

let runtime_args = [ runtime_arg ~pos:__POS__ "Unikernel.hello" ]

let main =
  main
    ~runtime_args
    ~packages:[package "duration"]
    "Unikernel.Hello" (time @-> job)

let () = register "hello-key" [main $ default_time]

and unikernel.ml

open Lwt.Infix
open Cmdliner

let hello =
  let doc = Arg.info ~doc:"How to say hello." [ "hello" ] in
  Arg.(value & opt string "Hello World!" doc)

module Hello (Time : Mirage_time.S) = struct
  let start _time hello =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        Logs.info (fun f -> f "%s" hello);
        Time.sleep_ns (Duration.of_sec 1) >>= fun () ->
        loop (n-1)
    in
    loop 4
end

October 2024 (Mirage 4.8)

Again, moved out of config.ml.

config.ml

open Mirage

let main =
  main
    ~packages:[package "duration"]
    "Unikernel.Hello" (time @-> job)

let () = register "hello-key" [main $ default_time]

and unikernel.ml

open Lwt.Infix
open Cmdliner

let hello =
  let doc = Arg.info ~doc:"How to say hello." [ "hello" ] in
  Mirage_runtime.register_arg Arg.(value & opt string "Hello World!" doc)

module Hello (Time : Mirage_time.S) = struct
  let start _time =
    let rec loop = function
      | 0 -> Lwt.return_unit
      | n ->
        Logs.info (fun f -> f "%s" (hello ()));
        Time.sleep_ns (Duration.of_sec 1) >>= fun () ->
        loop (n-1)
    in
    loop 4
end

2024 (Not yet released)

This is the future with time defunctorized. Read more in the discussion. To delay the start function, a dep of noop is introduced.

config.ml

open Mirage

let main =
  main
    ~packages:[package "duration"]
    ~deps:[dep noop]
    "Unikernel" job

let () = register "hello-key" [main]

and unikernel.ml

open Lwt.Infix
open Cmdliner

let hello =
  let doc = Arg.info ~doc:"How to say hello." [ "hello" ] in
  Mirage_runtime.register_arg Arg.(value & opt string "Hello World!" doc)

let start () =
  let rec loop = function
    | 0 -> Lwt.return_unit
    | n ->
      Logs.info (fun f -> f "%s" (hello ()));
      Mirage_timer.sleep_ns (Duration.of_sec 1) >>= fun () ->
      loop (n-1)
  in
  loop 4

Conclusion

The history of hello world shows that over time we have slowly improved the developer experience and removed the boilerplate needed to get MirageOS unikernels up and running. This is the work of more than a decade, including lots of other (here invisible) improvements to the mirage utility.

Our current goal is to minimize the code generated by mirage, since code generation has lots of issues (e.g. error locations, naming, binary size). It is a long journey. At the same time, we are working on improving the performance of MirageOS unikernels, developing unikernels that are useful in the real world (VPN endpoint, DNSmasq replacement, ...), and also simplifying the deployment of MirageOS unikernels.

If you're interested in MirageOS and using it in your domain, don't hesitate to reach out to us (via email: team@robur.coop) - we're keen to deploy MirageOS and find more domains where it is useful. If you can spare a dime, we're a registered non-profit in Germany - and can provide tax-deductible receipts for donations (more information).