recent_* on clusters-- PLZ DIE THX

bradfitz in lj_dev

Mar 02, 2002 18:37

I've been checking in a few patches to make the LJ server code do intelligent things when any given type of database role is unavailable. One thing I realized during the process of these patches is that the whole idea of recent_* tables on clusters is frickin' retarded. I'll now be removing all that code and deleting the files from the clusters.

The whole point of recent_ was to have slaves strung off the master that don't have $10,000 disk arrays attached to them. That is, we needed to store a subset of the data on the slaves. With the cluster master/slaves, the disk capacity is the same... 30-36 usable GB on all machines, with raid 5 on masters and raid 0 on slaves. Each is a bit different, but none has 140 GB usable like Jesus does (in raid 10, no less!). Dear god that's an insane amount of disk space... 280 GB raw.
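For anyone wondering where the raw-vs-usable numbers come from, here's a rough sketch of the arithmetic. The function name and the 4-disk raid 5 example are made up for illustration; only the 280 GB raw / 140 GB usable raid 10 figure comes from the box described above.

```python
def usable_gb(raw_gb: float, level: str, disks: int = 0) -> float:
    """Usable capacity for a few common RAID levels, assuming equal-sized disks."""
    if level == "raid0":
        return raw_gb                        # striping only, no redundancy
    if level == "raid10":
        return raw_gb / 2                    # every block is mirrored once
    if level == "raid5":
        if disks < 3:
            raise ValueError("raid 5 needs at least 3 disks")
        return raw_gb * (disks - 1) / disks  # one disk's worth goes to parity
    raise ValueError(f"unknown level: {level}")


# 280 GB raw in raid 10 leaves half of it usable.
print(usable_gb(280, "raid10"))  # 140.0
```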

Anyway, back to the point ... what I eventually want to do is add another master to each user cluster (master = raid 5 & redundant power), and have the two masters doing two-way replication amongst each other, but with writes always directed at one. If for some reason we need to do maintenance on one, we turn off master on one and turn it on on the other (this is literally two clicks of work with the new dbadmin tool). Magic. And we'll always keep one raid 0 in the pool just for speed, and because we already have it now. Plus it's good to keep the clusters really fucking fast, so if we do lose any one machine of three, then things keep tickin' and it's still fucking fast.
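To make the plan concrete, here's a minimal sketch of the bookkeeping for one cluster: two masters replicating to each other, writes going to exactly one of them, and a raid 0 box in the read pool. This is not the LJ server code or the dbadmin tool; the class, method, and host names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """One user cluster (hypothetical model, not the real LJ schema)."""
    masters: tuple                 # two hosts replicating to each other
    active: str                    # the one master currently taking writes
    readers: list = field(default_factory=list)  # e.g. the raid 0 box

    def __post_init__(self):
        if len(self.masters) != 2 or self.active not in self.masters:
            raise ValueError("need two masters, one of them active")

    def write_host(self) -> str:
        # Writes always go to a single master, even though both replicate.
        return self.active

    def switch_master(self) -> str:
        # The "two clicks": master role off on one box, on on the other.
        a, b = self.masters
        self.active = b if self.active == a else a
        return self.active

    def read_hosts(self, down=()) -> list:
        # Any machine that is up can serve reads, since all hold full data.
        pool = list(self.masters) + list(self.readers)
        return [h for h in pool if h not in down]


c = Cluster(masters=("m1", "m2"), active="m1", readers=["fast1"])
c.switch_master()                  # take m1 out for maintenance
print(c.write_host())              # m2
print(c.read_hosts(down=("m1",)))  # ['m2', 'fast1']
```

The point of the sketch is that any box can take over any role, which only works if every machine in the cluster holds the full data set.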

So recent_ is useless because all machines have the same disk space and need to have the same capabilities in case of failover role changes. Actually, it's more than useless.... it's wasting about a GB of space already, and all that extra I/O and memory hoggin'.

To recent_* I say, "PLZ DIE THX".

That is all. Questions?