It's not a bug, it's a feature...
Never worry about theory as long as the machinery does what
it's supposed to do.
-- Robert A. Heinlein
Sat, 01 Dec 2007
AFS fileserver issue
One of our AFS fileservers lost a disk late this afternoon, resulting in a couple of hours of downtime. A single disk failure shouldn't cause any downtime, but in this case it did. The disk was part of a mirror set hosting the machine's root filesystem and boot blocks, and for some reason the machine didn't seem to notice that the disk had failed, so it kept trying to access it. Those access attempts hung, and the machine built up a backlog of AFS fileserver requests, which eventually triggered an alert to the TIG on-call people (which included me this weekend).
The dead disk has been replaced, and things are OK again...
Posted by Noah, 2007-12-01 13:29
Copyright © 2006 Noah Meyerhans |
This content is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License