Apr 27, 2006 23:39
i'm exhausted. we had a hardware failure that corrupted 1.5 TB of email last week. the kind of hardware failure that isn't supposed to happen.
when one of the disks in this kind of raid fails, the raid is supposed to automatically fail over to another disk.
instead the raid froze and the data ended up a bit corrupted. i put in about 45 hours working from friday to sunday.
i've been exhausted all week and barely made it in today. i've been working hard this week to make sure we can get replacement hardware online.
a failure happened again on a different raid today. another 1.5 TB. this isn't supposed to happen.
these units have worked properly in the past.
so thanks to faulty hardware, it looks like we are running a crappy service to campus, despite super hard work from tons of people to bring the email back online as soon as possible.
it's gonna be about 30 hours till we are done copying the data to another raid, then who knows.