Sunday, March 3, 2013

git-annex: encrypted remotes

Due to the data loss I blogged about, I had to reverse engineer the encryption used by git-annex for its encrypted special remotes. The file system on which the content lived now has an 8 GB bullet hole in it, which was helpfully discarded by pvmove. The file system is pretty unhappy about that: parts of the git repository are unusable and directories cannot be accessed anymore, so git-annex cannot possibly run on it.

However, I was still able to access the git-annex branch within said git repository (using porcelain). This branch contains a file called remote.log, which holds the keys of the special remotes. There's one key per remote, encrypted to a GPG key of your choice, and all files within that remote are encrypted with that same symmetric key.
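For illustration, here is a minimal Python sketch of pulling remote.log out of the git-annex branch and splitting it into per-remote fields. The line layout it assumes (a UUID followed by space-separated key=value pairs, with the encrypted key in a field named cipher) is my reading of the file, not something the format guarantees, and the function names are made up for this example.

```python
import subprocess
from typing import Dict


def read_remote_log(repo: str) -> str:
    """Return the contents of remote.log from the git-annex branch of `repo`."""
    result = subprocess.run(
        ["git", "-C", repo, "show", "git-annex:remote.log"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def parse_remote_log(text: str) -> Dict[str, Dict[str, str]]:
    """Map each remote UUID to its key=value fields.

    Assumed layout (hypothetical): "<uuid> key=value key=value ...".
    """
    remotes: Dict[str, Dict[str, str]] = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        uuid, fields = parts[0], parts[1:]
        # Split on the first "=" only, since base64 values may end in "=".
        remotes[uuid] = dict(f.split("=", 1) for f in fields if "=" in f)
    return remotes
```

The value of the cipher field would then be passed to gpg for decryption with your private key; that step is left out here.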

One small detail stopped me from getting the decryption right the first time, though. It seems that git-annex uses randomness generated by GPG and armored into base64. In my naïveté I spotted the base64 and decoded it. Instead, the base64 text is used verbatim: the first 256 bytes serve as the HMAC key (being base64, they carry only 192 bytes of randomness) and the remaining bytes as the symmetric key handed to GPG (which will do another key derivation for CAST5 with it). A bug about that just hit the git-annex wiki.

With that knowledge in mind I wrote a little tool that's able to generate encrypted content keys from the plain ones used in the symlinks. That helps you to locate the file in the encrypted remote. Fetch it and then use the tool to decrypt the file in question with the right key.
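The key mapping can be sketched in a few lines of Python. This is not the tool itself, just an illustration under assumptions: the decrypted cipher is treated as opaque bytes (not base64-decoded), the first 256 bytes are the HMAC key as described above, and I assume HMAC-SHA1 and a "GPGHMACSHA1--" name prefix for the encrypted key. The function names are hypothetical.

```python
import hashlib
import hmac
from typing import Tuple

HMAC_KEY_LEN = 256  # from the post: the first 256 bytes are the HMAC key


def split_cipher(cipher: bytes) -> Tuple[bytes, bytes]:
    """Split the decrypted cipher (still base64 text, used verbatim)
    into the HMAC key and the passphrase given to GPG."""
    if len(cipher) <= HMAC_KEY_LEN:
        raise ValueError("cipher too short to hold both keys")
    return cipher[:HMAC_KEY_LEN], cipher[HMAC_KEY_LEN:]


def encrypted_key_name(cipher: bytes, plain_key: str) -> str:
    """Derive the name under which `plain_key` is stored in the remote.

    Assumptions: HMAC-SHA1 over the plain key name, hex-encoded,
    prefixed with "GPGHMACSHA1--".
    """
    hmac_key, _passphrase = split_cipher(cipher)
    digest = hmac.new(hmac_key, plain_key.encode("utf-8"), hashlib.sha1)
    return "GPGHMACSHA1--" + digest.hexdigest()
```

The plain key is the last path component of the symlink target. The second half of the cipher is what you would feed to gpg as the symmetric passphrase when decrypting the fetched file.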

The lesson: Really back up the git repository used with git-annex, and especially remote.log. I'm now missing most of the metadata, but for some of the more important files it's luckily still present. Recovery of the file content does not depend on the metadata if you can deduce the filename from the content. If you have many little files it might be a bit futile without it, though.

Saturday, March 2, 2013

PSA: LVM, pvmove and SSDs

If you use LVM with Wheezy on a solid-state drive, you really want to install the latest lvm2 update (i.e. 2.02.95-6, which contains the changes of -5). Otherwise, if you set issue_discards=1 in /etc/lvm/lvm.conf, you will experience severe data loss when using pvmove. It happened to me twice: the first time I didn't care (only chroot data was lost), the second time (today) I did. Not fun, especially when the backup of the data was scheduled for the same day.
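If you are unsure whether a machine is affected, a quick check of the config can be sketched like this in Python. It is a rough line-based scan rather than a real lvm.conf parser, the function name is made up, and it assumes that a missing setting means discards are off.

```python
import re


def issue_discards_enabled(conf_text: str) -> bool:
    """Return True if the given lvm.conf text sets issue_discards to non-zero.

    Naive scan: strips "#" comments and takes the first matching line.
    Assumes the option is off when it is not set at all.
    """
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0]
        match = re.match(r"\s*issue_discards\s*=\s*(\d+)", line)
        if match:
            return match.group(1) != "0"
    return False


if __name__ == "__main__":
    with open("/etc/lvm/lvm.conf") as conf:
        if issue_discards_enabled(conf.read()):
            print("issue_discards is enabled: update lvm2 before using pvmove")
```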

One has to wonder why it takes three months for a fix for a bug that trashes data to reach testing. (Obviously I know the reasons, but they're not particularly good ones.) Other distributions, like Ubuntu, were much quicker to notice and incorporate that fix, and in Ubuntu's case not because they auto-synced it from unstable. If somebody notices such a grave bug, please yell at people to get the fix out there to our users. Thanks.