Add a paragraph about patches and the limit on them

2025-01-13 18:10:16 +01:00
commit 575cc7f095
parent 04e71b2485
1 changed files with 39 additions and 0 deletions
--- a/articles/2025-01-07-carton-and-cachet.md
+++ b/articles/2025-01-07-carton-and-cachet.md
@@ -328,6 +328,45 @@ also the content of another GitHub notification email. This shows that Carton
 is well on its way to finding the best candidate for the patch, which should be
 similar content, moreover, another GitHub notification.

+The idea is to trade a little computing time (spent reconstructing objects
+from their patches) for a better compression ratio. Admittedly, a very long
+patch chain can degrade performance. However, both Git and Carton enforce a
+limit: a chain can't be longer than 50. Another point is the search for the
+candidate source of a patch, which is often physically close to the patch
+itself (within a few bytes): reading the PACK file page by page (thanks to
+[Cachet][cachet]) sometimes gives access to 3 or 4 objects at once, which have
+a fair chance of being patched together.
+
+Let's take the example of Carton and a Git object:
+
+```shell
+$ carton get pack-*.idx eaafd737886011ebc28e6208e03767860c22e77d
+...
+cache misses: 62
+cache hits:   758
+tree:           160720bb
+              Δ 160ae4bc
+              Δ 160ae506
+              Δ 160ae575
+              Δ 160ae5be
+              Δ 160ae5fc
+              Δ 160ae62f
+              Δ 160ae667
+              Δ 160ae6a5
+              Δ 160ae6db
+              Δ 160ae72a
+              Δ 160ae766
+              Δ 160ae799
+              Δ 160ae81e
+              Δ 160ae858
+              Δ 16289943
+```
+
+We can see here that we had to load 62 pages, but that we also reused pages
+we'd already read 758 times. We can also see that the offsets of the patches
+(shown under `tree:`) are mostly close to one another (the objects often follow
+each other in the PACK file).
+
 ### Mbox and real emails

 In a way, the concrete cases we use here are my emails. There may be a fairly