July 1st, 2003

GAH!

Today has NOT been my day.

Spent the morning playing political games trying to get everything happy for the CRASH servers that are now in the CSL. We have them, and they are (thankfully) finally being installed. So I suppose that's a good thing. But politics kinda piss me off.

Got home today to discover that polaris is now cold-booting itself every few minutes. I am not happy about this, but there's really not much I can do. I suspect the power supply, but it could be any number of things, I suppose... I may end up keeping deneb as my primary server after all (although if I do, I'm going to have to do a reload on it this summer, maybe with Gentoo, maybe with FreeBSD).

I also booted up alnath after an unclean shutdown this morning to find that ReiserFS had (again) corrupted my zsh history, my GAIM settings, and my KDE panel settings. Worse yet, I found portions of my zsh history in my Noatun playlist (along with a bunch of gibberish). ReiserFS has not *once* gracefully recovered from an unclean shutdown in the year or so that I've been using it. I think Reiser is going bye-bye the next time I reload.

If anybody has had any positive experiences with ReiserFS and unclean shutdowns, I'd like to hear them. Because, as interesting and fast as Reiser is, I just don't think it's stable and robust enough for my needs.

On the other hand, BSD's filesystem (FFS) on polaris had a rather...interesting unfixable error. If a directory somehow gets created without '.' and '..' (i.e. the directory is created and written to disk, but the two entries are still in the disk cache when the machine crashes), and some subdirs/files get created (which do get written), then fsck can't fix the directory because the first two entries are already taken. What's more, you'll never be able to delete said directory, or (afaict) its contents. This is almost as bad as Reiser...

I don't know if this is a result of extreme system insanity (a la bad RAM, bad power supply, whatever), but I *do* know that it seems stupid that you can't recover from filesystem errors, even with fsck. In that situation, if there's *really* nothing you can do about it, I'd say just remove the erring directory entry and recover the rest as lost data.
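To make that concrete, here's a toy sketch of the recovery I have in mind. It is purely an in-memory model: the `ToyDir` class and the `has_dot_entries` and `salvage` helpers are names I made up for illustration, and none of it reflects the real FFS on-disk layout or what fsck actually does internally.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDir:
    """Hypothetical stand-in for a directory: a name plus entry names in slot order."""
    name: str
    entries: list = field(default_factory=list)


def has_dot_entries(d):
    """True if the first two slots hold '.' and '..', as a sane directory should."""
    return d.entries[:2] == [".", ".."]


def salvage(parent_entries, broken, lost_found):
    """Unlink the broken directory from its parent and hand its children to lost+found."""
    parent_entries.remove(broken.name)
    for name in broken.entries:
        if name not in (".", ".."):
            lost_found.append(name)
    broken.entries.clear()


# A directory whose first two slots were taken by real files before
# '.' and '..' ever made it to disk.
broken = ToyDir("build", ["a.c", "sub"])
parent = [".", "..", "build", "notes"]
lost = []

if not has_dot_entries(broken):
    salvage(parent, broken, lost)
```

After this runs, the parent no longer references `build`, and `a.c` and `sub` end up in the lost list, which is about all I'd ask of a real repair tool in this situation.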

I believe in things like "stability"...I mean, sheesh, how weird is that?

-- Des

 18:46:10 up 76 days, 12:19,  0 users,  load average: 0.00, 0.00, 0.00
  • Current Mood
    pissed off