Dec 02, 2010 17:43
So, about 2 months ago, I asked my IT department about some innocuous warnings I saw in the event log on two of my ClearCase servers. Nothing too alarming, just a message that the battery on the drive array accelerator board had died and could someone please fix it. Given that I'd been told to drop everything I was in the middle of and un-fuck the ClearCase servers, this made sense.
Monday, IT got back to me: they had the batteries in and wanted to do a quick 5-minute firmware update and reboot. I notified my users of a lunchtime outage on Tuesday, scheduled it with IT, and so on. One server went just fine. The other, after rebooting, would not boot, POST, or anything else. It turned out that a heat sink held under spring tension had escaped when the spring clamp gave. 3 hours later, IT had managed to migrate everything into a new chassis. Moral of the story: never assume that something will be just a quick task.