Compare commits
No commits in common. "e993307d838a335e44509f99773bd29a3f5d0364" and "bc0bbbc7065262b41e6a8c189bf8c0043a1e1109" have entirely different histories.
e993307d83
...
bc0bbbc706
1 changed files with 0 additions and 110 deletions
@@ -1,110 +0,0 @@
---
title: GPTar (update)
date: 2024-10-28
description: libarchive vs hybrid GUID partition table and GNU tar volume header
tags:
  - OCaml
  - gpt
  - tar
  - mbr
  - persistent storage
author:
  name: Reynir Björnsson
  email: reynir@reynir.dk
  link: https://reyn.ir/
---

In a [previous post][gptar-post] I described how I craft a hybrid GUID partition table (GPT) and tar archive by exploiting the fact that the areas of a 512-byte *block* that matter to tar headers and to the *protective* master boot record used in GPT are disjoint.
I recommend reading it first for context if you haven't already.

After writing the above post I read an excellent and fun *and totally normal* article by Emily on how [she created **executable** tar archives][tar-executable].
Therein I learned a clever hack:
GNU tar has a tar extension for *volume headers*.
These are essentially labels for your tape archives when you're forced to split an archive across multiple tapes.
They can (seemingly) hold any text as a label, including shell scripts.
What's more, GNU tar and bsdtar **do not** extract these as files!
This is excellent, because I don't actually want to extract or list the GPT header when using GNU tar or bsdtar.
This prompted me to [use a different link indicator](https://github.com/reynir/gptar/pull/1).

This worked pretty great.
Listing the archive using GNU tar I still get `GPTAR`, but with verbose listing it's displayed as a `--Volume Header--`:

```shell
$ tar -tvf disk.img
Vr-------- 0/0 16896 1970-01-01 01:00 GPTAR--Volume Header--
-rw-r--r-- 0/0 14 1970-01-01 01:00 test.txt
```

And more importantly the `GPTAR` entry is ignored when extracting:

```shell
$ mkdir tmp
$ cd tmp/
$ tar -xf ../disk.img
$ ls
test.txt
```

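For context, the link indicator is the single type byte at offset 156 of a 512-byte tar header, and GNU tar marks volume headers with the value `V`.
A minimal C sketch of that check could look like the following; the macro and function names are made up for illustration and are not taken from GPTar, GNU tar, or libarchive:

```c
#include <assert.h>
#include <stddef.h>

/* Offset of the one-byte link indicator (typeflag) in a 512-byte tar header. */
#define TAR_TYPEFLAG_OFFSET 156
/* Value GNU tar uses for a volume header entry. */
#define GNU_TYPE_VOLUME 'V'

/*
 * Return 1 if the given 512-byte header block is marked as a GNU volume
 * header, 0 otherwise. Hypothetical helper; it only inspects the type byte
 * and does not validate the checksum or any other field.
 */
static int
is_gnu_volume_header(const unsigned char *block)
{
	assert(block != NULL);
	return block[TAR_TYPEFLAG_OFFSET] == GNU_TYPE_VOLUME;
}
```

A reader that sees this type byte can treat the entry as a label rather than as a file to list or extract, which is the behaviour shown in the listings above.
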
## BSD tar / libarchive

Unfortunately, this broke bsdtar!

```shell
$ bsdtar -tf disk.img
bsdtar: Damaged tar archive
bsdtar: Error exit delayed from previous errors.
```

This is annoying because we run FreeBSD on the host for [opam.robur.coop](https://opam.robur.coop), our instance of [opam-mirror][opam-mirror].
This autumn we updated [opam-mirror][opam-mirror] to use the hybrid GPT+tar GPTar *tartition table*[^tartition] instead of hard-coded or boot-parameter-specified disk offsets for the different partitions, which was extremely brittle!
So we were no longer able to inspect the contents of the tar partition from the host!
Unacceptable!
So I started to dig into libarchive, where bsdtar comes from.
To my surprise, after building bsdtar from a git clone of the source code, it ran perfectly fine!

```shell
$ ./bsdtar -tf ../gptar/disk.img
test.txt
```

I eventually figured out that [this change][libarchive-pr] fixed it for me.
I got in touch with Emily to let her know that bsdtar recently fixed this (ab)use of GNU volume headers.
Her reply was basically "as of when I wrote the article, I was pretty sure bsdtar ignored it."
And indeed it did.
Examining the diff further revealed that it ignored the GNU volume header - just not "correctly" when the GNU volume header was abused to carry file content as I did:

```diff
 /*
  * Interpret 'V' GNU tar volume header.
  */
 static int
 header_volume(struct archive_read *a, struct tar *tar,
     struct archive_entry *entry, const void *h, size_t *unconsumed)
 {
-	(void)h;
+	const struct archive_entry_header_ustar *header;
+	int64_t size, to_consume;
+
+	(void)a; /* UNUSED */
+	(void)tar; /* UNUSED */
+	(void)entry; /* UNUSED */

-	/* Just skip this and read the next header. */
-	return (tar_read_header(a, tar, entry, unconsumed));
+	header = (const struct archive_entry_header_ustar *)h;
+	size = tar_atol(header->size, sizeof(header->size));
+	to_consume = ((size + 511) & ~511);
+	*unconsumed += to_consume;
+	return (ARCHIVE_OK);
 }
```

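The arithmetic in the new code can be tried in isolation.
The sketch below is not libarchive code: `parse_octal_field` is a simplified, hypothetical stand-in for `tar_atol` that only handles plain octal digits, and `padded_size` repeats the round-up-to-512 expression from the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TAR_BLOCK_SIZE 512

/*
 * Parse an octal tar header field: optional leading spaces followed by
 * octal digits, terminated by a space, a NUL, or the end of the field.
 * Simplified stand-in for libarchive's tar_atol (assumption: no base-256
 * encoding, no negative values).
 */
static int64_t
parse_octal_field(const char *field, size_t len)
{
	int64_t value = 0;
	size_t i = 0;

	assert(field != NULL);
	while (i < len && field[i] == ' ')
		i++;
	for (; i < len && field[i] >= '0' && field[i] <= '7'; i++)
		value = value * 8 + (field[i] - '0');
	return value;
}

/* Round a content size up to the next multiple of the 512-byte block size. */
static int64_t
padded_size(int64_t size)
{
	return (size + (TAR_BLOCK_SIZE - 1)) & ~(int64_t)(TAR_BLOCK_SIZE - 1);
}
```

With the sizes from the listing earlier, the 16896-byte `GPTAR` entry (octal `41000`) is already a multiple of 512, so 33 blocks are skipped, while a 14-byte file such as `test.txt` occupies one padded 512-byte block.
The old code presumably skipped none of the volume header's content and tried to read it as the next tar header, which would explain the "Damaged tar archive" error.
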
So thanks to the above change we can expect a release of libarchive supporting further flavors of abuse of GNU volume headers!
🥳

[gptar-post]: gptar.html
[tar-executable]: https://uni.horse/executable-tarballs.html
[opam-mirror]: https://git.robur.coop/robur/opam-mirror/
[libarchive-pr]: https://github.com/libarchive/libarchive/pull/2127

[^tartition]: Emily came up with the much better term "tartition table" than what I had come up with - "GPTar".