From 511816141305d7bb07933490a818beed7e938a48 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Reynir=20Bj=C3=B6rnsson?= Date: Tue, 29 Oct 2024 11:15:13 +0000 Subject: [PATCH] Pushed by YOCaml 2 from e993307d838a335e44509f99773bd29a3f5d0364 --- articles/gptar-update.html | 110 +++++++++++++++++++++++++++++++++++++ 1 file changed, 110 insertions(+) create mode 100644 articles/gptar-update.html diff --git a/articles/gptar-update.html b/articles/gptar-update.html new file mode 100644 index 0000000..8bde6d8 --- /dev/null +++ b/articles/gptar-update.html @@ -0,0 +1,110 @@ + + + + + + + + Robur's blog - GPTar (update) + + + + + + + + +
+

blog.robur.coop

+
+ The Robur cooperative blog. +
+
+
+

GPTar (update)

+

In a previous post I describe how I craft a hybrid GUID partition table (GPT) and tar archive by exploiting the fact that tar headers and the protective master boot records used in GPT each care about disjoint areas of a 512-byte block. +I recommend reading it first for context if you haven't already.

+

After writing the above post I read an excellent and fun and totally normal article by Emily on how she created executable tar archives. +Therein I learned a clever hack: +GNU tar has a tar extension for volume headers. +These are essentially labels for your tape archives when you're forced to split an archive across multiple tapes. +They can (seemingly) hold any text as a label, including shell scripts. +What's more, GNU tar and bsdtar do not extract these as files! +This is excellent, because I don't actually want to extract or list the GPT header when using GNU tar or bsdtar. +This prompted me to use a different link indicator: V, the type GNU tar uses for volume headers.

+

This worked pretty great. +Listing the archive using GNU tar I still get GPTAR, but with verbose listing it's displayed as a --Volume Header--:

+
$ tar -tvf disk.img
+Vr-------- 0/0           16896 1970-01-01 01:00 GPTAR--Volume Header--
+-rw-r--r-- 0/0              14 1970-01-01 01:00 test.txt
+
+

And more importantly the GPTAR entry is ignored when extracting:

+
$ mkdir tmp
+$ cd tmp/
+$ tar -xf ../disk.img
+$ ls
+test.txt
+
+
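For readers who want to see what such an entry looks like at the byte level, here is a minimal sketch in C of a 512-byte header whose link indicator is `V`. It is not the code GPTar uses: the helper name `make_volume_header` is made up for illustration, the field offsets are those of the standard ustar layout, and the mode, owner and timestamp values simply mirror the listing above. It also leaves out the protective MBR bytes that GPTar places in the same block.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TAR_BLOCK 512

/* Offsets and widths of the ustar header fields used below. */
enum {
    OFF_NAME = 0,     LEN_NAME = 100,
    OFF_MODE = 100,   LEN_MODE = 8,
    OFF_UID = 108,    LEN_UID = 8,
    OFF_GID = 116,    LEN_GID = 8,
    OFF_SIZE = 124,   LEN_SIZE = 12,
    OFF_MTIME = 136,  LEN_MTIME = 12,
    OFF_CHKSUM = 148, LEN_CHKSUM = 8,
    OFF_TYPEFLAG = 156,
    OFF_MAGIC = 257
};

/* Sum of all header bytes, with the checksum field counted as spaces. */
static unsigned tar_checksum(const unsigned char hdr[TAR_BLOCK])
{
    unsigned sum = 0;
    for (size_t i = 0; i < TAR_BLOCK; i++) {
        int in_chksum = i >= OFF_CHKSUM && i < OFF_CHKSUM + LEN_CHKSUM;
        sum += in_chksum ? (unsigned)' ' : hdr[i];
    }
    return sum;
}

/* Hypothetical helper: fill hdr with a 'V' entry named label whose body is
 * size bytes long. Numeric fields are NUL-terminated octal strings. */
static void make_volume_header(unsigned char hdr[TAR_BLOCK],
                               const char *label, unsigned long long size)
{
    assert(strlen(label) < LEN_NAME);
    assert(size <= 077777777777ULL); /* must fit in 11 octal digits */

    memset(hdr, 0, TAR_BLOCK);
    memcpy(hdr + OFF_NAME, label, strlen(label));
    snprintf((char *)hdr + OFF_MODE, LEN_MODE, "%07o", 0400u);
    snprintf((char *)hdr + OFF_UID, LEN_UID, "%07o", 0u);
    snprintf((char *)hdr + OFF_GID, LEN_GID, "%07o", 0u);
    snprintf((char *)hdr + OFF_SIZE, LEN_SIZE, "%011llo", size);
    snprintf((char *)hdr + OFF_MTIME, LEN_MTIME, "%011o", 0u);
    hdr[OFF_TYPEFLAG] = 'V';               /* GNU volume header */
    memcpy(hdr + OFF_MAGIC, "ustar  ", 8); /* old GNU magic: two spaces, NUL */

    /* Checksum goes in last: six octal digits, a NUL, then a space. */
    snprintf((char *)hdr + OFF_CHKSUM, LEN_CHKSUM, "%06o", tar_checksum(hdr));
    hdr[OFF_CHKSUM + 7] = ' ';
}
```

Calling `make_volume_header(hdr, "GPTAR", 16896)` gives a header whose size field holds the octal string `00000041000` and whose byte 156 is `V`.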

BSD tar / libarchive

+

Unfortunately, this broke bsdtar!

+
$ bsdtar -tf disk.img
+bsdtar: Damaged tar archive
+bsdtar: Error exit delayed from previous errors.
+
+

This is annoying because we run FreeBSD on the host for opam.robur.coop, our instance of opam-mirror. +This autumn we updated opam-mirror to use the hybrid GPT+tar GPTar tartition table[1] instead of hard-coded or boot-parameter-specified disk offsets for the different partitions - which was extremely brittle! +So we were no longer able to inspect the contents of the tar partition from the host! +Unacceptable! +So I started to dig into libarchive, where bsdtar comes from. +To my surprise, after building bsdtar from a git clone of the source code, it ran perfectly fine!

+
$ ./bsdtar -tf ../gptar/disk.img
+test.txt
+
+

I eventually figured out that this change fixed it for me. +I got in touch with Emily to let her know that bsdtar recently fixed this (ab)use of GNU volume headers. +Her reply was basically "as of when I wrote the article, I was pretty sure bsdtar ignored it." +And indeed it did. +Examining the diff further revealed that it ignored the GNU volume header - just not "correctly" when the GNU volume header was abused to carry file content as I did:

+
 /*
+  * Interpret 'V' GNU tar volume header.
+  */
+ static int
+ header_volume(struct archive_read *a, struct tar *tar,
+     struct archive_entry *entry, const void *h, size_t *unconsumed)
+ {
+-       (void)h;
++       const struct archive_entry_header_ustar *header;
++       int64_t size, to_consume;
++
++       (void)a; /* UNUSED */
++       (void)tar; /* UNUSED */
++       (void)entry; /* UNUSED */
+
+-       /* Just skip this and read the next header. */
+-       return (tar_read_header(a, tar, entry, unconsumed));
++       header = (const struct archive_entry_header_ustar *)h;
++       size = tar_atol(header->size, sizeof(header->size));
++       to_consume = ((size + 511) & ~511);
++       *unconsumed += to_consume;
++       return (ARCHIVE_OK);
+ }
+
+
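To make the effect of the diff concrete, here is a small sketch of the arithmetic involved. The helper names are made up for illustration and are not libarchive functions. Before the change, the reader went straight on to parse the next 512-byte block as a header, which for a volume header carrying content is the first block of that content. After the change, it first skips the body, rounded up to whole 512-byte blocks.

```c
#include <assert.h>
#include <stdint.h>

#define TAR_BLOCK 512

/* Round a body size up to a multiple of 512, the same expression as
 * ((size + 511) & ~511) in the diff above. */
static int64_t tar_padded_size(int64_t size)
{
    assert(size >= 0);
    return (size + (TAR_BLOCK - 1)) & ~(int64_t)(TAR_BLOCK - 1);
}

/* Offset of the header following an entry whose own header starts at
 * hdr_off and whose body is size bytes long. */
static int64_t next_header_offset(int64_t hdr_off, int64_t size)
{
    return hdr_off + TAR_BLOCK + tar_padded_size(size);
}
```

With the sizes from the listing earlier, `next_header_offset(0, 16896)` is 17408, which is where the header for test.txt starts. Its 14-byte body is padded to 512 bytes, so `next_header_offset(17408, 14)` is 18432.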

So thanks to the above change we can expect a release of libarchive supporting further flavors of abuse of GNU volume headers! +🥳

+
    +
  1. +

    Emily came up with the term "tartition table", which is much better than what I had come up with: "GPTar".

+ +
+ +
+ + + +