From 86c45a6e5f0457c377cdaf910b53d3a79b3f51e3 Mon Sep 17 00:00:00 2001 From: The Robur team Date: Wed, 21 Feb 2024 09:45:43 +0000 Subject: [PATCH] Built from d1e411bf7e3e13f5a9f8c535dc86f07830881690-dirty --- articles/lwt_pause.html | 264 ++++++++++++++++++++++++++++++++++++++++ feed.xml | 2 +- index.html | 13 ++ tags/community.html | 41 +++++++ tags/cryptography.html | 2 +- tags/git.html | 41 +++++++ tags/mirageos.html | 2 +- tags/ocaml.html | 6 +- tags/python.html | 2 +- tags/scheduler.html | 41 +++++++ tags/security.html | 2 +- tags/unicode.html | 2 +- tags/unikernel.html | 41 +++++++ tags/vpn.html | 2 +- 14 files changed, 451 insertions(+), 10 deletions(-) create mode 100644 articles/lwt_pause.html create mode 100644 tags/community.html create mode 100644 tags/git.html create mode 100644 tags/scheduler.html create mode 100644 tags/unikernel.html diff --git a/articles/lwt_pause.html b/articles/lwt_pause.html new file mode 100644 index 0000000..fd18f09 --- /dev/null +++ b/articles/lwt_pause.html @@ -0,0 +1,264 @@ + + + + + + + + + Robur's blog - Cooperation and Lwt.pause + + + + + + + + +
+

blog.robur.coop

+
+ The Robur cooperative blog. +
+
+
Back to index + +
+

Cooperation and Lwt.pause

+

Here's a concrete example of the notion of availability and the scheduler used +(in this case Lwt). As you may know, at Robur we have developed a unikernel: +opam-mirror. It launches an HTTP service that can be used as an +OPAM overlay available from a Git repository (with opam repository add <name> <url>).

+

The purpose of such a unikernel was to respond to a failure of the official repository (which fortunately did not last long) and to offer decentralisation of such a service. You can use https://opam.robur.coop!

+

It was also useful at the Mirage retreat, where we don't usually have a great internet connection. Caching packages on the local network let our OCaml users fetch opam packages locally instead of over the shared, metered 4G Internet connection, which helped our Internet bill.

+

Finally, it's a unikernel that I also use on my server for my software +reproducibility service in order to have an overlay for my +software like Bob.

+

In short, I advise you to use it; you can see its installation here. I think that in the context of a company, internally, it can be interesting to have such a unikernel available.

+

However, this unikernel had a long-standing problem. We were already talking about it at the Mirleft retreat: when we tried to get the repository from Git, our HTTP server was unavailable for a fairly long time. Basically, we had to wait ~10 min before the service offered by the unikernel was available.

+

Availability

+

If you follow my articles, you know that, as far as Miou is concerned, I have talked from the outset about the notion of availability that yet another new scheduler for OCaml 5 should offer. We emphasised this notion because we had quite a few problems on this subject with Lwt.

+

In this case, the notion of availability requires the scheduler to be able to +observe system events as often as possible. The problem is that Lwt doesn't +really offer this approach.

+

Indeed, Lwt offers a way of observing system events (Lwt.pause) but does not +do so systematically. The only time you really give the scheduler the +opportunity to see whether you can read or write is when you want to... +read or write...

+

More generally, it is said that Lwt's bind does not yield. In other words, +you can chain any number of functions together (via the >>= operator), but +from Lwt's point of view, there is no opportunity to see if an event has +occurred. Lwt always tries to go as far down your chain as possible:

+
    +
  • and finish your promise
  • or come across an operation that requires a system event (read or write)
  • or come across an Lwt.pause (as a yield point)
+

Lwt is rather sparse in adding cooperation points besides Lwt.pause and +read/write operations, in contrast with Async where the bind operator is a +cooperation point.
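To make this concrete, here is a minimal sketch (assuming the lwt and lwt.unix libraries; `worker` and `run` are illustrative names of ours, not part of Lwt). Two workers log their steps, and the only thing that changes between the two runs is whether each step goes through `Lwt.pause`:

```ocaml
(* Build with the lwt and lwt.unix libraries.
   [worker] and [run] are illustrative names, not part of Lwt. *)
let worker ~yield trace name n =
  let open Lwt.Infix in
  let rec go i =
    if i > n then Lwt.return_unit
    else begin
      trace := Printf.sprintf "%s%d" name i :: !trace;
      (* The only possible cooperation point of this loop. *)
      (if yield then Lwt.pause () else Lwt.return_unit) >>= fun () ->
      go (i + 1)
    end
  in
  go 1

let run ~yield =
  let trace = ref [] in
  (* Explicit [let]s fix the order in which the two promises are started. *)
  let a = worker ~yield trace "a" 2 in
  let b = worker ~yield trace "b" 2 in
  Lwt_main.run (Lwt.map ignore (Lwt.both a b));
  List.rev !trace
```

With `run ~yield:false` the trace is a1, a2, b1, b2: the first worker finishes before the second one even starts, although both are combined with `Lwt.both`. With `run ~yield:true`, the second worker logs b1 before the first one gets to a2.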

+

If there is no I/O, do not wrap in Lwt

+

It was (bad[1]) advice I was given. If a function doesn't do I/O, there's no point in putting it in Lwt. At first glance, the idea may seem a good one: if you have a function that doesn't do I/O, whether it's in the Lwt monad or not won't make any difference to the way Lwt tries to execute it. Once again, Lwt goes as far as possible. So Lwt tries to resolve both functions in the same way:

+
val merge : int array -> int array -> int array
+
+let rec sort0 arr =
+  if Array.length arr <= 1 then arr
+  else
+    let m = Array.length arr / 2 in
+    let arr0 = sort0 (Array.sub arr 0 m) in
+    let arr1 = sort0 (Array.sub arr m (Array.length arr - m)) in
+    merge arr0 arr1
+
+let rec sort1 arr =
+  let open Lwt.Infix in
+  if Array.length arr <= 1 then Lwt.return arr
+  else
+    let m = Array.length arr / 2 in
+    Lwt.both
+      (sort1 (Array.sub arr m (Array.length arr - m)))
+      (sort1 (Array.sub arr 0 m))
+    >|= fun (arr0, arr1) ->
+    merge arr0 arr1
+
+
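The listing above only gives the signature of `merge`. A minimal sketch of one possible implementation follows; it is an assumption on our part (any stable merge of two sorted arrays would do), shown together with `sort0` so that the pure version can be compiled on its own with the standard library:

```ocaml
(* One possible [merge]: combines two sorted arrays into a sorted array.
   This implementation is an assumption; the article only gives its type. *)
let merge a b =
  let la = Array.length a and lb = Array.length b in
  if la = 0 then Array.copy b
  else if lb = 0 then Array.copy a
  else begin
    let out = Array.make (la + lb) a.(0) in
    let i = ref 0 and j = ref 0 in
    for k = 0 to la + lb - 1 do
      (* Take from [a] when [b] is exhausted or when [a]'s head is smaller. *)
      if !j >= lb || (!i < la && a.(!i) <= b.(!j)) then begin
        out.(k) <- a.(!i);
        incr i
      end else begin
        out.(k) <- b.(!j);
        incr j
      end
    done;
    out
  end

(* Same structure as the [sort0] of the article, without tracing. *)
let rec sort0 arr =
  if Array.length arr <= 1 then arr
  else
    let m = Array.length arr / 2 in
    let arr0 = sort0 (Array.sub arr 0 m) in
    let arr1 = sort0 (Array.sub arr m (Array.length arr - m)) in
    merge arr0 arr1
```

Since the result of `merge` does not depend on which half comes first, the same function can be used by the Lwt version.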

If we trace the execution of the two functions (for example, by displaying our +arr each time), we see the same behaviour whether Lwt is used or not. However, +what is interesting in the Lwt code is the use of both, which suggests that +the processes are running at the same time.

+

"At the same time" does not necessarily suggest the use of several cores or "in +parallel", but the possibility that the right-hand side may also have the +opportunity to be executed even if the left-hand side has not finished. In other +words, that the two processes can run concurrently.

+

But factually, this is not the case, because even if we had the possibility of +a point of cooperation (with the >|= operator), Lwt tries to go as far as +possible and decides to finish the left part before launching the right part:

+
$ ./a.out
+sort0: [|3; 4; 2; 1; 7; 5; 8; 9; 0; 6|]
+sort0: [|3; 4; 2; 1; 7|]
+sort0: [|3; 4|]
+sort0: [|2; 1; 7|]
+sort0: [|1; 7|]
+sort0: [|5; 8; 9; 0; 6|]
+sort0: [|5; 8|]
+sort0: [|9; 0; 6|]
+sort0: [|0; 6|]
+
+sort1: [|3; 4; 2; 1; 7; 5; 8; 9; 0; 6|]
+sort1: [|3; 4; 2; 1; 7|]
+sort1: [|3; 4|]
+sort1: [|2; 1; 7|]
+sort1: [|1; 7|]
+sort1: [|5; 8; 9; 0; 6|]
+sort1: [|5; 8|]
+sort1: [|9; 0; 6|]
+sort1: [|0; 6|]
+
+
+

1: However, if you are not interested in availability +and would like the scheduler to try to resolve your promises as quickly as +possible, this advice is clearly valid.

+

Performance

+

It should be noted, however, that Lwt has an impact. Even if the behaviour is +the same, the Lwt layer is not free. A quick benchmark shows that there is an +overhead:

+
let _ =
+  let t0 = Unix.gettimeofday () in
+  (* run 1000 iterations, like the Lwt loop below *)
+  for _ = 1 to 1000 do let _ = sort0 arr in () done;
+  let t1 = Unix.gettimeofday () in
+  Fmt.pr "sort0 %fs\n%!" (t1 -. t0)
+
+let _ =
+  let t0 = Unix.gettimeofday () in
+  Lwt_main.run @@ begin
+    let open Lwt.Infix in
+    let rec go idx = if idx = 1000 then Lwt.return_unit
+      else sort1 arr >>= fun _ -> go (succ idx) in
+    go 0 end;
+  let t1 = Unix.gettimeofday () in
+  Fmt.pr "sort1 %fs\n%!" (t1 -. t0)
+
+
$ ./a.out
+sort0 0.000264s
+sort1 0.000676s
+
+

This is the fairly obvious argument for not using Lwt when there's no I/O. Then, if the Lwt monad is really needed, a simple Lwt.return at the very last moment is sufficient (or, better, the use of Lwt.map / >|=).
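As a small sketch of that advice (assuming the lwt library; `sort`, `fetch`, `fetch_sorted` and `sort_lwt` are hypothetical names used only for illustration), the computation stays pure and Lwt only appears at the boundary:

```ocaml
(* Pure computation: no Lwt involved. *)
let sort arr =
  let copy = Array.copy arr in
  Array.stable_sort compare copy;
  copy

(* Hypothetical stand-in for something that really does I/O. *)
let fetch () : int array Lwt.t = Lwt.return [| 3; 1; 2 |]

(* The pure function is applied to the result of the promise with Lwt.map. *)
let fetch_sorted () : int array Lwt.t = Lwt.map sort (fetch ())

(* If an Lwt value is required, wrap only at the very end. *)
let sort_lwt arr : int array Lwt.t = Lwt.return (sort arr)
```

Here `sort` can be benchmarked and tested without any scheduler, and only `fetch_sorted` and `sort_lwt` pay for the Lwt layer.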

+

Cooperation and concrete example

+

So Lwt.both is the one to use when we want to run two processes "at the same time". For example, ocaml-git attempts both to retrieve a repository and to analyse it. This can be seen in this snippet of code.

+

In our example with ocaml-git, the problem "shouldn't" appear because, in this case, both the left and right side do I/O (the left side binds into a socket while the right side saves Git objects in your file system). So, in our tests with Git_unix, we were able to see that the analysis (right-hand side) was indeed executed and 'interleaved' with the reception of objects from the network.

+

Composability

+

However, if we go back to our initial problem, we were talking about our +opam-mirror unikernel. As you might expect, there is no standalone MirageOS file +system (and many of our unikernels don't need one). So, in the case of +opam-mirror, we use the ocaml-git memory implementation: Git_mem.

+

Git_mem is different in that Git objects are simply stored in a Hashtbl. +There is no cooperation point when it comes to obtaining Git objects from this +Hashtbl. So let's return to our original advice:

+
+

don't wrap code in Lwt if it doesn't do I/O.

+
+

And, of course, Git_mem doesn't do I/O. It does, however, require the process +to be able to work with Lwt. In this case, Git_mem wraps the results in Lwt +as late as possible (as explained above, so as not to slow down our +processes unnecessarily). The choice inevitably means that the right-hand side +can no longer offer cooperation points. And this is where our problem begins: +composition.

+

In fact, we had something like:

+
let clone socket git =
+  Lwt.both (receive_pack socket) (analyse_pack git) >>= fun ((), ()) ->
+  Lwt.return_unit
+
+

However, our analyse_pack function is injected through a functor argument that represents the Git backend, in other words Git_unix or Git_mem:

+
module Make (Git : Git.S) = struct
+  let clone socket git =
+    Lwt.both (receive_pack socket) (Git.analyse_pack git) >>= fun ((), ()) ->
+    Lwt.return_unit
+end
+
+

Composability poses a problem here because even if Git_unix and Git_mem offer the same function (so both modules can be used), the fact remains that one will always offer a certain availability to other services (such as an HTTP service) while the other will offer an Lwt function which will try to go as far as possible, to the point of making other services unavailable.

+

Composing with one or the other therefore does not produce the same behaviour.
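The difference can be reproduced with a toy model. In this sketch (assuming lwt and lwt.unix), `BACKEND`, `Eager`, `Cooperative` and `serve` are hypothetical stand-ins for the Git backends and for the HTTP service; they are not the ocaml-git API:

```ocaml
module type BACKEND = sig
  (* Analyse [n] objects, calling [log] once per object. *)
  val analyse_pack : log:(string -> unit) -> int -> unit Lwt.t
end

(* Like an in-memory store: no I/O, hence no cooperation point. *)
module Eager : BACKEND = struct
  let analyse_pack ~log n =
    for i = 1 to n do log (Printf.sprintf "obj%d" i) done;
    Lwt.return_unit
end

(* Like a store doing I/O: every object goes through a cooperation point. *)
module Cooperative : BACKEND = struct
  let analyse_pack ~log n =
    let open Lwt.Infix in
    let rec go i =
      if i > n then Lwt.return_unit
      else begin
        log (Printf.sprintf "obj%d" i);
        Lwt.pause () >>= fun () -> go (i + 1)
      end
    in
    go 1
end

module Make (B : BACKEND) = struct
  (* Hypothetical other service (think HTTP) that wants to run as well. *)
  let serve ~log n =
    let open Lwt.Infix in
    let rec go i =
      if i > n then Lwt.return_unit
      else begin
        log (Printf.sprintf "http%d" i);
        Lwt.pause () >>= fun () -> go (i + 1)
      end
    in
    go 1

  let run n =
    let trace = ref [] in
    let log s = trace := s :: !trace in
    let analysis = B.analyse_pack ~log n in
    let http = serve ~log n in
    Lwt_main.run (Lwt.map ignore (Lwt.both analysis http));
    List.rev !trace
end
```

The functor type-checks with both modules, but `Make (Eager)` logs every object before the first `http` line, while with `Make (Cooperative)` the first `http` line comes right after the first object.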

+

Where to put Lwt.pause?

+

In this case, our analyse_pack does read/write on the Git store. As far as +Git_mem is concerned, we said that these read/write accesses were just +accesses to a Hashtbl.

+

Thanks to Hannes' help, it took us an afternoon to work out where we +needed to add cooperation points in Git_mem so that analyse_pack could give +another service such as HTTP the opportunity to work. Basically, this series of +commits shows where we needed to add Lwt.pause.

+

However, this points to a number of problems:

+
    +
  1. It is not necessarily true that, on the basis of composability alone (by functor or by value), Lwt reacts in the same way.
  2. You have to dig into the code to find the right places to put Lwt.pause by hand, which is subtle.
  3. In the end, Lwt has no mechanism for ensuring the availability of a service (this is something that must be taken into account by the implementer).
+
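To give an idea of what adding cooperation points by hand looks like, here is a minimal sketch (assuming lwt and lwt.unix). `iter_with_pause` and its `every` parameter are hypothetical; this is not the code of the commits mentioned above, only an illustration of the idea on a Hashtbl:

```ocaml
(* Hypothetical helper: apply [f] to every binding of [tbl] and insert a
   cooperation point every [every] bindings. The table must not be
   modified while the iteration is in progress. *)
let iter_with_pause ?(every = 64) f tbl =
  let rec go i seq =
    match seq () with
    | Seq.Nil -> Lwt.return_unit
    | Seq.Cons ((k, v), rest) ->
      f k v;
      let next () = go (i + 1) rest in
      (* Give other promises a chance to run from time to time. *)
      if (i + 1) mod every = 0 then Lwt.bind (Lwt.pause ()) next
      else next ()
  in
  go 0 (Hashtbl.to_seq tbl)

let () =
  let tbl = Hashtbl.create 16 in
  for i = 1 to 10 do Hashtbl.add tbl i (i * i) done;
  let seen = ref 0 in
  let p = iter_with_pause ~every:3 (fun _ _ -> incr seen) tbl in
  (* The promise is still pending after the first three bindings... *)
  assert (!seen = 3);
  (* ...and the remaining ones are processed once the scheduler runs. *)
  Lwt_main.run p;
  assert (!seen = 10)
```

Pausing every few objects rather than on every object is a trade-off: fewer pauses mean less overhead for the analysis, more pauses mean better availability for the other services.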

In-depth knowledge of Lwt

+

I haven't mentioned another problem we encountered with Armael when implementing multipart_form where the use of stream meant that Lwt didn't interleave the two processes and the use of a bounded stream was required. Again, even when it comes to I/O, Lwt always tries to go as far as possible in one of the two branches of a Lwt.both.
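A minimal sketch of the bounded-stream idea (assuming lwt and lwt.unix; `transfer` is an illustrative name and this is not the multipart_form code). With `Lwt_stream.create_bounded`, pushing into a full stream returns a pending promise, so the producer cannot run arbitrarily far ahead of the consumer:

```ocaml
(* [transfer] is a hypothetical example, not taken from multipart_form. *)
let transfer items =
  (* At most one element may be waiting in the stream. *)
  let stream, push = Lwt_stream.create_bounded 1 in
  let producer =
    let open Lwt.Infix in
    (* [push#push] stays pending while the stream is full. *)
    Lwt_list.iter_s (fun x -> push#push x) items >|= fun () ->
    push#close
  in
  let consumer = Lwt_stream.to_list stream in
  Lwt.map snd (Lwt.both producer consumer)

let () =
  assert (Lwt_main.run (transfer [ 1; 2; 3 ]) = [ 1; 2; 3 ])
```

The bound acts as a cooperation point that the stream imposes on the producer, instead of one that has to be added by hand.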

+

This allows us to conclude that beyond the monad, Lwt has subtleties in its behaviour which may differ from those of another scheduler such as Async (hence the incompatibility between the two, which is not just a matter of the 'a t type).

+

Digression on Miou

+

That's why we put so much emphasis on the notion of availability when it comes +to Miou: to avoid repeating the mistakes of the past. The choices that can be +made with regard to this notion in particular have a major impact, and can be +unsatisfactory to the user in certain cases (for example, so-called pure +calculations could take longer with Miou than with another scheduler).

+

In this sense, we have tried to constrain ourselves in the development of Miou +through the use of Effect.Shallow which requires us to always re-attach our +handler (our scheduler) as soon as an effect is produced, unlike Effect.Deep +which can re-use the same handler for several effects. In other words, and as +we've described here, an effect yields!
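The idea can be sketched with a toy round-robin scheduler (this requires OCaml 5; it is not Miou's implementation, and `Step`, `run` and `task` are names made up for the example). With Effect.Shallow, the handler is given again at each `continue_with`, so every effect hands control back to the scheduler loop:

```ocaml
open Effect
open Effect.Shallow

(* A single illustrative effect: log a message. *)
type _ Effect.t += Step : string -> unit Effect.t

let run (tasks : (unit -> unit) list) : string list =
  let trace = ref [] in
  let queue : (unit -> unit) Queue.t = Queue.create () in
  let rec resume (k : (unit, unit) continuation) : unit =
    (* The handler is re-attached on every resumption. *)
    continue_with k ()
      { retc = (fun () -> ());
        exnc = raise;
        effc =
          (fun (type c) (eff : c Effect.t) ->
            match eff with
            | Step msg ->
              Some
                (fun (k : (c, unit) continuation) ->
                  trace := msg :: !trace;
                  (* Performing the effect suspends the task: it goes back
                     to the end of the queue. *)
                  Queue.push (fun () -> resume k) queue)
            | _ -> None) }
  in
  List.iter (fun f -> Queue.push (fun () -> resume (fiber f)) queue) tasks;
  while not (Queue.is_empty queue) do
    (Queue.pop queue) ()
  done;
  List.rev !trace

let task name () =
  for i = 1 to 2 do
    perform (Step (Printf.sprintf "%s%d" name i))
  done

let () = assert (run [ task "a"; task "b" ] = [ "a1"; "b1"; "a2"; "b2" ])
```

Neither task contains an explicit yield, yet the two are interleaved: performing an effect is enough to let the other task run.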

+

Conclusion

+

As far as opam-mirror is concerned, we now have a unikernel that remains available even while it clones a Git repository and saves Git objects in memory. At least, an HTTP service can co-exist with ocaml-git!

+

I hope we'll be able to use it at the next retreat, which I invite you to attend to talk more about Lwt, schedulers, Git and unikernels!

+ +
+ +
+ + + + diff --git a/feed.xml b/feed.xml index 3b5c812..66d37e6 100644 --- a/feed.xml +++ b/feed.xml @@ -1 +1 @@ -Robur's bloghttps://blog.robur.coopThe Robur cooperative blogyocamlteam@robur.coopSpeeding elliptic curve cryptographyhttps://blog.robur.coop/articles/speeding-ec-string.htmlTue, 13 Feb 2024 10:00:00 GMTHow we improved the performance of elliptic curves by only modifying the underlying byte arrayhttps://blog.robur.coop/articles/speeding-ec-string.htmlPython's `str.__repr__()`https://blog.robur.coop/articles/2024-02-03-python-str-repr.htmlSat, 03 Feb 2024 10:00:00 GMTReimplementing Python string escaping in OCamlhttps://blog.robur.coop/articles/2024-02-03-python-str-repr.htmlMirageVPN updated (AEAD, NCP)https://blog.robur.coop/articles/miragevpn-ncp.htmlMon, 20 Nov 2023 10:00:00 GMTHow we resurrected MirageVPN from its bitrot statehttps://blog.robur.coop/articles/miragevpn-ncp.htmlMirageVPN & tls-crypt-v2https://blog.robur.coop/articles/miragevpn.htmlTue, 14 Nov 2023 10:00:00 GMTHow we implementated tls-crypt-v2 for miragevpnhttps://blog.robur.coop/articles/miragevpn.html \ No newline at end of file +Robur's bloghttps://blog.robur.coopThe Robur cooperative blogyocamlteam@robur.coopSpeeding elliptic curve cryptographyhttps://blog.robur.coop/articles/speeding-ec-string.htmlTue, 13 Feb 2024 10:00:00 GMTHow we improved the performance of elliptic curves by only modifying the underlying byte arrayhttps://blog.robur.coop/articles/speeding-ec-string.htmlCooperation and Lwt.pausehttps://blog.robur.coop/articles/lwt_pause.htmlSun, 11 Feb 2024 10:00:00 GMTA disgression about Lwt and Miouhttps://blog.robur.coop/articles/lwt_pause.htmlPython's `str.__repr__()`https://blog.robur.coop/articles/2024-02-03-python-str-repr.htmlSat, 03 Feb 2024 10:00:00 GMTReimplementing Python string escaping in OCamlhttps://blog.robur.coop/articles/2024-02-03-python-str-repr.htmlMirageVPN updated (AEAD, NCP)https://blog.robur.coop/articles/miragevpn-ncp.htmlMon, 20 Nov 2023 10:00:00 
GMTHow we resurrected MirageVPN from its bitrot statehttps://blog.robur.coop/articles/miragevpn-ncp.htmlMirageVPN & tls-crypt-v2https://blog.robur.coop/articles/miragevpn.htmlTue, 14 Nov 2023 10:00:00 GMTHow we implementated tls-crypt-v2 for miragevpnhttps://blog.robur.coop/articles/miragevpn.html \ No newline at end of file diff --git a/index.html b/index.html index 0047b58..23dc595 100644 --- a/index.html +++ b/index.html @@ -38,6 +38,19 @@ +
  • + +
    + 2024-02-11 + Cooperation and Lwt.pause
    +

    A digression about Lwt and Miou

    + +
  • diff --git a/tags/community.html b/tags/community.html new file mode 100644 index 0000000..c66c95b --- /dev/null +++ b/tags/community.html @@ -0,0 +1,41 @@ + + + + + + + + + Robur's blog + + + + + + + + +
    +

    blog.robur.coop

    +
    + The Robur cooperative blog. +
    +
    +
    Back to index + + + +

    + community + 1 entry

    + +
    + +
    + + + + diff --git a/tags/cryptography.html b/tags/cryptography.html index 8686474..9f08b3d 100644 --- a/tags/cryptography.html +++ b/tags/cryptography.html @@ -23,7 +23,7 @@
    Back to index - +

    cryptography diff --git a/tags/git.html b/tags/git.html new file mode 100644 index 0000000..9df8ee5 --- /dev/null +++ b/tags/git.html @@ -0,0 +1,41 @@ + + + + + + + + + Robur's blog + + + + + + + + +
    +

    blog.robur.coop

    +
    + The Robur cooperative blog. +
    +
    +
    Back to index + + + +

    + git + 1 entry

    + +
    + +
    + + + + diff --git a/tags/mirageos.html b/tags/mirageos.html index 9789fae..85913d7 100644 --- a/tags/mirageos.html +++ b/tags/mirageos.html @@ -23,7 +23,7 @@
    Back to index - +
    diff --git a/tags/python.html b/tags/python.html index 568d3ea..147dd97 100644 --- a/tags/python.html +++ b/tags/python.html @@ -23,7 +23,7 @@
    Back to index - +

    python diff --git a/tags/scheduler.html b/tags/scheduler.html new file mode 100644 index 0000000..2730b3f --- /dev/null +++ b/tags/scheduler.html @@ -0,0 +1,41 @@ + + + + + + + + + Robur's blog + + + + + + + + +
    +

    blog.robur.coop

    +
    + The Robur cooperative blog. +
    +
    +
    Back to index + + + +

    + scheduler + 1 entry

    + +
    + +
    + + + + diff --git a/tags/security.html b/tags/security.html index 831ac8b..7d2413d 100644 --- a/tags/security.html +++ b/tags/security.html @@ -23,7 +23,7 @@
    Back to index - +

    security diff --git a/tags/unicode.html b/tags/unicode.html index e4ff8d3..d5ee053 100644 --- a/tags/unicode.html +++ b/tags/unicode.html @@ -23,7 +23,7 @@
    Back to index - +

    unicode diff --git a/tags/unikernel.html b/tags/unikernel.html new file mode 100644 index 0000000..25b388f --- /dev/null +++ b/tags/unikernel.html @@ -0,0 +1,41 @@ + + + + + + + + + Robur's blog + + + + + + + + +
    +

    blog.robur.coop

    +
    + The Robur cooperative blog. +
    +
    +
    Back to index + + + +

    + unikernel + 1 entry

    + +
    + +
    + + + + diff --git a/tags/vpn.html b/tags/vpn.html index c6cb765..a3bd902 100644 --- a/tags/vpn.html +++ b/tags/vpn.html @@ -23,7 +23,7 @@
    Back to index - +

    vpn