Friday, November 29, 2013

On PDiffs Usefulness for Stable

Axel just said that PDiffs are really useful for stable because stable changes so seldom. The truth is that there are no PDiffs for stable. I consider this a bug, because you need to download a huge blob on every point release even though only a few stanzas changed.

The real problem with PDiffs is the number of diffs that need to be applied. We run "dinstall" much more often now than we used to, and with a dak-maintained repository you currently cannot skip diffs in the middle. It would be more useful if you could fetch a single diff from some revision to the current state and apply that, instead of going through those (download, apply, hash check)+ cycles. 56 available index diffs are not particularly helpful in this regard either: they cover 14 days with four dinstalls each, but applying them takes a very long time, even on modern machines.
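To illustrate why the count matters: a client hashes its local Packages file, looks that hash up in the history section of the Index file, and then has to download and apply every later patch in sequence. A minimal sketch of that selection step, assuming the SHA256-History field layout of the PDiff Index format (the parsing is simplified and the helper names are my own):

```python
import hashlib

def parse_index(text):
    """Parse the SHA256-History section of a PDiff Index file into
    (sha256, size, patch_name) tuples, oldest first."""
    history = []
    in_history = False
    for line in text.splitlines():
        if line.startswith("SHA256-History:"):
            in_history = True
            continue
        if in_history:
            if line.startswith(" "):
                sha, size, name = line.split()
                history.append((sha, int(size), name))
            else:
                in_history = False
    return history

def patches_needed(local_packages, index_text):
    """Return the names of the patches to download and apply, in
    order. Each history entry names the patch that upgrades the
    file *from* that state, so everything from our current state
    onwards must be applied, one after the other."""
    digest = hashlib.sha256(local_packages).hexdigest()
    history = parse_index(index_text)
    for i, (sha, _size, name) in enumerate(history):
        if sha == digest:
            return [entry[2] for entry in history[i:]]
    return None  # our copy predates the history: full download
```

The point being: the return value is a whole list, and every element of it costs one download, one ed-style patch run, and one hash check.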

I know that there was ongoing work on improving PDiffs that wanted to use git to generate the intermediate diffs. The current algorithm used in dak requires only one old version to be kept; keeping everything is the far other end of the spectrum, where you have a git repository that potentially gets huge unless you rewrite it. It should be possible to implement a rolling ring buffer of Packages files to diff against: reprepro already generates its PDiffs this way, and the Index format is flexible enough to support it. So someone would just need to make that bit work in dak.

Monday, September 16, 2013

Did you notice that it was gone? It's back now. Phew.

Apparently an upgrade to wheezy was scheduled for the past weekend. It went badly, with the RAID controller acting up after the upgrade. So the move to a new machine (or rather, a VM) was… sped up. And because various policies changed, we needed to move to a dedicated database server with a new authentication scheme, get rid of the /org symlink, get rid of the buildd_* accounts, cope with a new PHP, cope with rlimits that broke a Python CGI (thanks, KiBi!), tickle triggers that weren't tickled, and deal with various other stuff.

Thanks to DSA for handling the underlying infrastructure, despite some communication issues during heavy firefighting.

Sunday, March 3, 2013

git-annex: encrypted remotes

Due to the data loss I blogged about, I had to reverse engineer the encryption used by git-annex for its encrypted special remotes. The file system on which the content lived has an 8 GB bullet hole in it, helpfully discarded by pvmove. It's pretty unhappy about that fact: parts of the git repository are unusable and some directories cannot be accessed anymore. git-annex itself cannot possibly run anymore.

However, I was still able to access the git-annex branch within said git repository (using porcelain). This branch contains a file called remote.log, which holds the keys of the special remotes. There's one per remote, encrypted to a GPG key of your choice, and all files within that remote are encrypted with the same symmetric key.

One small detail stopped me from getting the decryption right the first time, though. It seems that git-annex uses randomness generated by GPG and armored into base64. In my naïveté I spotted the base64 and decoded it. Instead, the armored text is used verbatim: the first 256 bytes serve as the HMAC key (which reduces the actual randomness to 192 bytes), and the remaining bytes form the symmetric key used by GPG (which does another key derivation for CAST5 with it). A bug about that just hit the git-annex wiki.

With that knowledge in mind I wrote a little tool that can generate the encrypted content keys from the plain ones used in the symlinks. That lets you locate a file in the encrypted remote: fetch it, then use the tool to decrypt the file in question with the right key.
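The core of such a tool is small. Here's a sketch of the mapping from a plain key to its name in the encrypted remote, assuming the scheme described above (HMAC-SHA1 keyed with the first 256 bytes of the armored randomness); the "GPGHMACSHA1--" prefix and the function name are my own labels, not something the source above confirms:

```python
import hmac
import hashlib

def encrypted_key_name(plain_key: str, cipher_b64: str) -> str:
    """Map a plain git-annex key (as found in the symlink target)
    to the object name used in an encrypted special remote.

    cipher_b64 is the base64-armored randomness from remote.log,
    used verbatim: its first 256 bytes act as the HMAC-SHA1 key.
    """
    mac = hmac.new(cipher_b64[:256].encode("ascii"),
                   plain_key.encode("ascii"),
                   hashlib.sha1)
    # Hypothetical naming convention for the encrypted object.
    return "GPGHMACSHA1--" + mac.hexdigest()
```

With the name in hand you can fetch the object from the remote and feed it to gpg together with the rest of the armored blob as the symmetric passphrase.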

The lesson: really back up the git repository used with git-annex, and especially remote.log. I'm now missing most of the metadata, but for some of the more important files it's luckily still present. Recovering the file content does not depend on it if you can deduce the filename from the content; with many little files, though, it might be a bit futile without it.

Saturday, March 2, 2013

PSA: LVM, pvmove and SSDs

If you use LVM with Wheezy on a solid-state drive, you really want to install the latest lvm2 update (i.e. 2.02.95-6, which contains the changes of -5). Otherwise, if you set issue_discards=1 in /etc/lvm/lvm.conf, you will experience severe data loss when using pvmove. This happened to me twice: the first time I didn't care (only chroot data was lost), the second time (today) I did. Not fun, especially when the backup of that data was scheduled for the very same day.
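For reference, the setting in question lives in the devices section of /etc/lvm/lvm.conf; until you are on a fixed lvm2, the safe value looks like this:

```
devices {
    # issue_discards = 1 tells LVM to send discards (TRIM) for
    # freed extents; combined with the buggy pvmove this hit
    # extents that still held live data. Leave it at 0 unless
    # you are running 2.02.95-6 or later.
    issue_discards = 0
}
```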

One has to wonder why it takes three months for a bug that trashes data to reach testing. (Obviously I know the answer, but the reasons aren't particularly good ones.) Other distributions, like Ubuntu, were much quicker to notice and incorporate the fix, and in Ubuntu's case not because they auto-synced it from unstable. If somebody notices such a grave bug, please yell at people to get the fix out to our users. Thanks.

Friday, February 8, 2013

Mozilla's direction

Am I the only one who's disappointed with the route Mozilla is taking and left wondering what the direction is? First they killed off the development of Thunderbird because, as we all know, people mainly use webmail these days. Then they presented us their view that the big Certificate Authorities are too big to fail, even as CAs gravely violated our trust (cf. Trustwave and their MitM authority). And "now" they're also blocking the introduction of new formats into their browser because they cannot be the one who innovates. Instead, Microsoft and Apple apparently need to take the lead in introducing a format into their browsers, because otherwise it wouldn't be useful. Even though it's safe to say that Chrome and Firefox together make up more than half of the desktop browser market share. Chrome may be nibbling at Firefox's share, but IE seems to be in decline, and Safari is more of a footnote than something many people care strongly about.

There were of course some valid reasons for not supporting WebP yet. But most of them have been fixed in the meantime, and all we hear is the referral to proprietary vendors who need to move first. If I wanted to depend on such vendors, I'd go with proprietary operating systems. (Having to deal with the hardware products of proprietary vendors at $dayjob is enough.) So what's up, Mozilla? Is the solution to ignore your users and tag bugs that come with patches as wontfix?

The only real advantage of Firefox over Chromium these days is the vast number of plugins and extensions (e.g. Pentadactyl, for which there is no real equivalent). Another sad fact is that you need to pull Firefox from a 3rd-party repository (even though the packages come from the 2nd party) to get a current version onto your Debian system that works with today's web. But then it's not Mozilla who's to blame here. Maybe we should have introduced one Iceweasel version that's allowed to have reverse dependencies and one that isn't.

(This post might contain hyperboles, which should be taken as such.)