diff --git a/Posts/OpamMirror b/Posts/OpamMirror
index 82a83af..fb172d4 100644
--- a/Posts/OpamMirror
+++ b/Posts/OpamMirror
@@ -2,18 +2,16 @@
 title: Mirroring the opam repository and all tarballs
 author: hannes
 tags: mirageos, deployment, opam
-abstract:
+abstract: Re-developing the opam cache from scratch, as a unikernel
 ---
-# TL;DR
-
 We at [robur](https://robur.coop) developed [opam-mirror](https://git.robur.io/robur/opam-mirror) in the last month and run a public opam mirror at https://opam.robur.coop (updated hourly).
 # What is opam and why should I care?
 [Opam](https://opam.ocaml.org) is the OCaml package manager (also used by other projects such as [coq](https://coq.inria.fr)). It is a source based system: the so-called repository contains the metadata (url to source tarballs, build dependencies, author, homepage, development repository) of all packages. The main repository is hosted on GitHub as [ocaml/opam-repository](https://github.com/ocaml/opam-repository), where authors of OCaml software can contribute (as pull request) their latest releases.
-When opening a pull request, automated systems attempt to build not only the newly released package on various platforms and OCaml versions, but also all reverse dependencies, and also with dependencies with the lowest allowed version numbers. That's crucial since neither semantic versioning has been adapted across the OCaml ecosystem (which is tricky, for example due to local opens any newly introduced binding will lad to a major version bump), neither do many people add upper bounds of dependencies when releasing a package (nobody is keen to state "my package will not work with [cmdliner](https://erratique.ch/software/cmdliner) in version 1.2.0").
+When opening a pull request, automated systems attempt to build not only the newly released package on various platforms and OCaml versions, but also all reverse dependencies, and also with dependencies with the lowest allowed version numbers. That's crucial since neither semantic versioning has been adapted across the OCaml ecosystem (which is tricky, for example due to local opens any newly introduced binding will lead to a major version bump), neither do many people add upper bounds of dependencies when releasing a package (nobody is keen to state "my package will not work with [cmdliner](https://erratique.ch/software/cmdliner) in version 1.2.0").
 So, the opam-repository holds the metadata of lots of OCaml packages (around 4000 at the moment this article was written) with lots of versions (in total 25000) that have been released. It is used by the opam client to figure out which packages to install or upgrade (using a solver that takes the version bounds into consideration).
@@ -24,11 +22,12 @@ The vast majority of opam packages released to the opam-repository include a lin
 # How does the opam client work?
 Opam, after initialisation, downloads the `index.tar.gz` from `https://opam.ocaml.org/index.tar.gz`, and uses this as the local opam universe. An `opam install cmdliner` will resolve the dependencies, and download all required tarballs. The download is first tried from the cache, and if that failed, the URL in the package file is used. The download from the cache uses the base url, appends the archive-mirror, followed by the hash algorithm, the first two characters of the has of the tarball, and the hex encoded hash of the archive, i.e. for cmdliner 1.1.1 which specifies its sha512: `https://opam.ocaml.org/cache/sha512/54/5478ad833da254b5587b3746e3a8493e66e867a081ac0f653a901cc8a7d944f66e4387592215ce25d939be76f281c4785702f54d4a74b1700bc8838a62255c9e`.
+
 # How does the opam repository work?
-According to DNS, opam.ocaml.org is a machine at amazon. It likely, apart from the website, uses `opam admin index` periodically to create the index tarball and the cache. There's an obversable delay between a package merge in the opam-repository and when it shows up at opam.ocaml.org. Recently, there was recently [a reported downtime](https://discuss.ocaml.org/t/opam-ocaml-org-is-currently-down-is-that-where-indices-are-kept-still/).
+According to DNS, opam.ocaml.org is a machine at amazon. It likely, apart from the website, uses `opam admin index` periodically to create the index tarball and the cache. There's an observable delay between a package merge in the opam-repository and when it shows up at opam.ocaml.org. Recently, there was [a reported downtime](https://discuss.ocaml.org/t/opam-ocaml-org-is-currently-down-is-that-where-indices-are-kept-still/).
-Apart from being a single point of failure, if you're compiling a lot of opam projects (e.g. a continuous integration / continuous build system), it makes sense from a network usage (and thus sustainability perspective) to move the cache closer to where you need the source archives. We're also organising the MirageOS [hack retreats](http://retreat.mirage.io) in a northern African country with poor connectivity - so if you gather two dozen camels you better bring your opam repository cache with you to reduce the bandwidth usage (NB: this requires at the momemnt cooperation of all participants to configure their default opam repository accordingly).
+Apart from being a single point of failure, if you're compiling a lot of opam projects (e.g. a continuous integration / continuous build system), it makes sense from a network usage (and thus sustainability perspective) to move the cache closer to where you need the source archives. We're also organising the MirageOS [hack retreats](http://retreat.mirage.io) in a northern African country with poor connectivity - so if you gather two dozen camels you better bring your opam repository cache with you to reduce the bandwidth usage (NB: this requires at the moment cooperation of all participants to configure their default opam repository accordingly).
 # Re-developing "opam admin create" as MirageOS unikernel
@@ -62,6 +61,6 @@ What is next? Downloading and writing to the tar archive could be done chunk-wis
 # Conclusion
-To conclude, we managed within a month to develop this opam-mirror cache from scratch. It has a reasonable footprint (CPU and memory-wise), is easy to maintain and easy to update - if you want to use it, we also provide [reproducible binaries](https://builds.robur.coop/job/opam-mirror) for solo5-hvt. You can use our opam mirror with `opam repository set-url default https://opam.robur.coop` (revert to the other with `opam repository set-url default https://opam.ocaml.org`).
+To conclude, we managed within a month to develop this opam-mirror cache from scratch. It has a reasonable footprint (CPU and memory-wise), is easy to maintain and easy to update - if you want to use it, we also provide [reproducible binaries](https://builds.robur.coop/job/opam-mirror) for solo5-hvt. You can use our opam mirror with `opam repository set-url default https://opam.robur.coop` (revert to the other with `opam repository set-url default https://opam.ocaml.org`) or use it as a backup with `opam repository add robur --rank 2 https://opam.robur.coop`.
 Please reach out to us (at team AT robur DOT coop) if you have feedback and suggestions. We are a non-profit company, and rely on [donations](https://robur.coop/Donate) for doing our work - everyone can contribute.
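
For readers of this patch, the cache URL layout described under "How does the opam client work?" can be sketched in a few lines of Python. This is only an illustration of the path scheme as the post states it (base URL, cache directory, hash algorithm, first two hex characters, full hex digest). The function name `cache_url` and the `cache_dir` parameter are hypothetical and are not taken from opam or opam-mirror.

```python
def cache_url(base_url: str, algo: str, hex_digest: str, cache_dir: str = "cache") -> str:
    """Build <base>/<cache_dir>/<algo>/<first two hex chars>/<full hex digest>.

    Hypothetical helper mirroring the layout described in the post;
    it is not part of opam or opam-mirror.
    """
    digest = hex_digest.lower()
    # Reject anything that is not a hex-encoded digest of at least two characters.
    if len(digest) < 2 or any(c not in "0123456789abcdef" for c in digest):
        raise ValueError("expected a hex-encoded digest")
    return "/".join([base_url.rstrip("/"), cache_dir, algo, digest[:2], digest])


# sha512 of the cmdliner 1.1.1 tarball, as quoted in the post.
CMDLINER_SHA512 = (
    "5478ad833da254b5587b3746e3a8493e66e867a081ac0f653a901cc8a7d944f6"
    "6e4387592215ce25d939be76f281c4785702f54d4a74b1700bc8838a62255c9e"
)

if __name__ == "__main__":
    print(cache_url("https://opam.ocaml.org", "sha512", CMDLINER_SHA512))
```

Assuming a mirror follows the same layout, passing a different base URL (for example `https://opam.robur.coop`) yields the corresponding path on that mirror.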