Backups

Sep. 5th, 2006 11:24 pm
ewx: (geek)
[personal profile] ewx

For some years I semi-regularly backed up my computers to tape. This was actually pretty inconvenient, but I never had to restore.

At the beginning of this year I switched to a disk-based backup scheme, using a program of my own devising, originally in Python but now with a C++ implementation as well.

Since then, I've used this software to restore my laptop (which was stolen); to preserve [livejournal.com profile] naath's files across a reinstall; to recover the configuration of my firewall (which failed roughly a month ago) in order to replicate it on another machine; to save (but not yet to restore) [livejournal.com profile] lnr's files, though they're hopefully still on her hard disk as well; and to restore my home directory on my desktop PC, a bunch of files having been smashed by (I think) a new kernel. Although diff hasn't finished running yet, so far it looks like the only files damaged were in /usr and so have already been restored via dpkg.

While repeatedly discovering that my backup software works well is gratifying, I am left wondering why I've needed it three times this year, in addition to the times it's merely been useful, when I've had so little pressing need for backups for many years previously.

(no subject)

Date: 2006-09-06 10:17 am (UTC)
From: [identity profile] robhu.livejournal.com
Why do you use your own program rather than an existing backup program?

(no subject)

Date: 2006-09-06 10:22 am (UTC)
ext_8103: (Default)
From: [identity profile] ewx.livejournal.com
You're welcome to point to one that actually has the features I require, and is comparably simple to it.

(no subject)

Date: 2006-09-06 10:25 am (UTC)
From: [identity profile] robhu.livejournal.com
I might if I knew what you wanted :-) I wasn't saying there was a better solution, I was wondering what your program did (although I obviously did an amazingly bad job of explaining that).

(no subject)

Date: 2006-09-06 10:45 am (UTC)
ext_8103: (Default)
From: [identity profile] ewx.livejournal.com
  • It must run on Linux and MacOS (and perhaps other UNIX systems in the future).
  • It must be able to store files on non-POSIX filesystems (e.g. FAT32) yet nonetheless preserve POSIX file attributes; I want to be able to share the media between different operating systems, and the intersection of filesystems supported doesn't reliably include one with POSIX semantics. I solve this by keeping the mapping from name to permissions/content in a linear index file.
  • It must be able to back up over the network; I don't want separate media for each host. I solve this by implementing SFTP support.
  • It must not duplicate files on the backup medium unnecessarily. Full backups take too long; incrementals must be the default. I solve this by storing files by their hash (or more recently by putting the content in the index file, if it's smaller than the hash would be).
  • It must provide easy access to old backups, not just the most recent. I solve this by including the date in the index filenames.
  • Restoration requirements must be small. I solve this by providing a complete implementation in a few hundred lines of Python, which can be put on the backup medium; as soon as you have a Python install (on whatever OS) you can restore your files. (That doesn't mean that there can't also be a cleverer version that takes more infrastructure to get going.)
  • It must be free in the sense of permitting unrestricted modification. Backups are too important to allow my ability to fix bugs to be curtailed.
  • It must be simple, as this will keep the bug count down.
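For illustration only, here is a minimal sketch of the hash-plus-index idea from the list above. It is not the actual hbackup code: the choice of SHA-256, the index line layout, the `index_name` naming scheme and all function names are assumptions made up for this example, and the SFTP transport and the small-file-inlined-in-the-index optimisation are left out.

```python
import hashlib
import os
import shutil
import stat


def index_name(host, day):
    """Hypothetical naming scheme: one index file per host per date."""
    return "%s-%s.index" % (host, day.isoformat())


def store_file(src_path, store_dir):
    """Copy src_path into store_dir under its content hash, unless already there."""
    h = hashlib.sha256()  # assumption: the source does not say which hash is used
    with open(src_path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    digest = h.hexdigest()
    dest = os.path.join(store_dir, digest)
    if not os.path.exists(dest):
        # Only content not yet on the medium costs any space or copying time.
        shutil.copyfile(src_path, dest)
    return digest


def backup(root, store_dir, index_path):
    """Store every regular file under root by hash and write a linear index.

    Each index line is "mode uid gid digest relative-path" (a made-up layout).
    File names containing newlines are not handled in this sketch.
    """
    os.makedirs(store_dir, exist_ok=True)
    with open(index_path, "w", encoding="utf-8") as index:
        for dirpath, dirnames, filenames in os.walk(root):
            dirnames.sort()  # deterministic traversal order
            for name in sorted(filenames):
                path = os.path.join(dirpath, name)
                st = os.lstat(path)
                if not stat.S_ISREG(st.st_mode):
                    continue  # symlinks, devices etc. are skipped here
                digest = store_file(path, store_dir)
                rel = os.path.relpath(path, root)
                index.write("%o %d %d %s %s\n" % (
                    stat.S_IMODE(st.st_mode), st.st_uid, st.st_gid, digest, rel))
```

Because the POSIX attributes live in the index rather than in the backup filesystem, the store directory can sit on FAT32. A second run on a later date writes a new index but adds nothing to the store for unchanged files, which is what makes every backup effectively incremental.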

(no subject)

Date: 2006-09-06 10:48 am (UTC)
From: [identity profile] robhu.livejournal.com
Is your backup system free? I'd like to take a look at it... have you licensed it?

(no subject)

Date: 2006-09-06 10:53 am (UTC)
ext_8103: (Default)
From: [identity profile] ewx.livejournal.com

It's not really ‘released’ as such but it's available from the mirror of my Arch archive:

$ tla register-archive http://www.greenend.org.uk/arch/rjk@greenend.org.uk--2004
$ tla get rjk@greenend.org.uk--2004/hbackup--mainline--0 hbackup

(no subject)

Date: 2006-09-06 11:15 am (UTC)
From: [identity profile] imc.livejournal.com
Very silly question, but why do you need to share the media between different OSes if it's network-based?

I have been known to use GNU tar in the past, which seems to satisfy most of the above (you'd need to do something like pipe it via nc and stick a listener on the other end in order to do the networking part), but of course this makes it very slow to restore a single file. I only ever had to do it once (after accidentally deleting something I shouldn't have) and that was back when no partition was bigger than about 800MB. My current /home will just about fit uncompressed on to a single-layer DVD.

(no subject)

Date: 2006-09-06 12:25 pm (UTC)
ext_8103: (Default)
From: [identity profile] ewx.livejournal.com

The network might not be available in a disaster-recovery scenario.

Tape archive formats are, indeed, horribly limiting if your backup medium has a proper filesystem on it; worse than the single file case is when you want to cobble together a consistent filesystem from some collection of incrementals. If I did full dumps every night I'd fill my backup disk in a few days, but with the current arrangement it'll be years before I fill it, even given that the files aren't compressed, and a full restore to any chosen date requires only a single pass.

I did try backing up to CD for a while, but the need to manually feed CDs to computers is just too painful. DVDs would improve matters, but not that much...
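To show why a restore to a chosen date needs only one pass, here is a rough sketch, again not the real hbackup code. It assumes a hypothetical index layout of one line per file, "mode uid gid digest relative-path", and a store directory holding each file's content under its hash; both are inventions for this example.

```python
import os
import shutil


def restore(index_path, store_dir, target_root):
    """Recreate every file listed in one dated index; return the file count.

    Each index line is "mode uid gid digest relative-path" (hypothetical layout).
    The index for a given date lists the complete tree, so there is no need
    to replay a chain of incrementals.
    """
    restored = 0
    with open(index_path, encoding="utf-8") as index:
        for line in index:
            mode, uid, gid, digest, rel = line.rstrip("\n").split(" ", 4)
            dest = os.path.join(target_root, rel)
            os.makedirs(os.path.dirname(dest) or ".", exist_ok=True)
            shutil.copyfile(os.path.join(store_dir, digest), dest)
            # Permissions come from the index, not from the backup filesystem.
            os.chmod(dest, int(mode, 8))
            # Restoring uid/gid with os.chown needs root, so this sketch omits it.
            restored += 1
    return restored
```

Picking a different date just means picking a different index file; the store is shared by all of them.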

(no subject)

Date: 2006-09-06 12:32 pm (UTC)
ext_8103: (Default)
From: [identity profile] ewx.livejournal.com
... also tar and similar can't de-dupe between systems; apart from obvious differences like what packages are installed, my backup medium has only one copy of Debian stable's /usr despite multiple machines backing up to it.

(no subject)

Date: 2006-09-06 03:28 pm (UTC)
From: [identity profile] imc.livejournal.com
That's neat.

I'm in two minds about backing up /usr though; if something disastrous enough to wipe out /usr happened, I'd probably take the opportunity to install a more recent release of the appropriate distro. (The problem then is deciding which bits of /etc I need to restore.)

(no subject)

Date: 2006-09-06 04:55 pm (UTC)
ext_8103: (Default)
From: [identity profile] ewx.livejournal.com
Backing up /usr is cheap once the backup is seeded and increases the options at restore time. In particular it opens up the possibility of fairly naively restoring onto a transplanted hard disk, with the only potential trickiness being getting it to boot properly; a fresh install would require at least that one answered all the questions the installer wanted.

(no subject)

Date: 2006-09-06 04:55 pm (UTC)
ext_8103: (Default)
From: [identity profile] ewx.livejournal.com
(But yes, regular backups aren't the only option for restoring bits of the OS.)

(no subject)

Date: 2006-09-06 12:30 pm (UTC)
ext_8103: (Default)
From: [identity profile] ewx.livejournal.com
Oh, the 'I must be able to modify it' bit means it has to be written in a language I'm reasonably familiar with, as well as requiring a suitable licence - debugging might have to take place during an emergency, so I don't want to have to be waiting on other people for help. That's less of an obstacle as I can learn new languages easily enough, so a really good backup system in some language I didn't know might nonetheless be acceptable.
