Complete the README.md with how to integrate Cachet into schedulers

2024-11-08 11:30:47 +01:00 · 2024-11-08 11:30:47 +01:00 · 806ea63152
commit 806ea63152
parent 2860b7d33a
1 changed files with 49 additions and 0 deletions
--- a/README.md
+++ b/README.md
@ -28,3 +28,52 @@ let () =
  let seq = Cachet.get_seq cache 0 in
  ...
 ```
 ## Cachet and schedulers
 Cachet is designed to treat the `map` function as **atomic**. In other words: a
 unit of work that is indivisible and guaranteed to be executed as a single,
 coherent, and uninterrupted operation. Therefore, the `load` function (used to
 load a page) cannot be more cooperative (and give other tasks the opportunity to
 run) than it already is.
 Using Cachet with a scheduler requires addressing two issues:
 1) enabling cooperation **after** a page has been loaded
 2) the possibility of parallel loading of the page to ensure that other tasks
   can be executed
 For the first point, with regard to Lwt or Async, it's essentially a question of
 potentially adding `Lwt.pause` or `Async.yield` after using `Cachet.load` (or
 the user-friendly functions):
 ```ocaml
 let () = Lwt_main.run begin
  let page = Cachet.load cache 0xdead in
  let* () = Lwt.pause () in
  ... end
 ```
 For the second point, only OCaml 5 and effects can answer this issue by using an
 effect which will notify the scheduler to read the page **in parallel**.
 ```ocaml
 let map fd ~pos len = Effect.perform (Scheduler.Pread (fd, pos, len))
 let () = Scheduler.run begin fun () ->
    let fd = Unix.openfile "disk.img" Unix.[ O_RDONLY ] 0o644 in
    let finally () = Unix.close fd in
    Fun.protect ~finally @@ fun () ->
    let cache = Cachet.make ~pagesize:(getpagesize ()) ~map fd in
    let page = Cachet.load cache 0xdead in
    ...
  end
 ```
 Note that this is only effective if the page is read **in parallel**. If this is
 not the case, adding a cooperation point as you could do with Lwt/Async is
 enough. Reading a page remains **atomic** and allowing other tasks to run at
 the same time as this reading implies that the latter must necessarily be done
 in parallel (via a `Thread` or a `Domain`).
 Finally, the Cachet documentation specifies how many pages we would need to read
 to obtain the requested value. As a result, it's up to the user to know where
 the cooperation point should be placed and whether it makes sense to use, for
 example, `get_string` or just use `load` interspersed with cooperation points.