LJ outage

Jan 20, 2005 15:51

Why LJ went down

Another customer in the facility accidentally pressed the EPO button, then depressed it, replaced the protective case, and left the building.

Brad is being very kind here. It should read:

Another customer in the facility "accidentally" opened the protective case, pressed the EPO button, then depressed it, replaced the protective case, and left the building. This shut down hundreds or even thousands of active web servers, whose owners considered them critical, and the culprit just took off without even telling anyone.

WTF? No one does that accidentally. If I ever pressed the EPO button just in our local server room (which only hosts internal servers, since our website and critical servers are in a secure data center), I'd be kicked out on my ass so hard that I'd bounce like an antwerp*.

This is not just the network equivalent of hitting a pedestrian and leaving them in the street. This is like someone who aims, accelerates, swerves into the crosswalk, and probably backs up afterwards to hit them again.

*Bonus geek points to anyone who knows what an antwerp is.