The joys of database repair (NOT) part 4

Apr 28, 2008 00:22

This is the last of it....

Here, I'm going to detail out EXACTLY what we're doing.

This is the cleanup phase and what is entailed.

First of all, what we're doing now, much of it SHOULD have been done following the subdivide of the database tables.  However, due to being politicked out of what she was doing, manta2g wasn't able to do anything about it until recently.  So, those that were left in charge, did not do what was necessary to finish up fixing things after the first major crash, nor did they do things to prevent the subsequent minor crashes which followed.

This is actually where I come in.  I actually have done this sort of thing for years, as part of my job IRL.  Not only have I prevented major db crashes, I've had to recover from them as well.

The DIFFERENCE is, when doing it for a business one has hard copy records to work from, should the data be unrecoverable, which I encountered more than a few times.  In this for instance, we do not.  So, in order to prevent another major crash, which would've caused tremendous data loss,  I've set up this whole project.  manta2g has been aware of it, of course, every step of the way.  We (the techs) all have our strong suits, this one is mine.  The nuts and bolts of how a database works, and keeping it healthy and viable.  Also, making it so a crash can be prevented.

So on to the project...

Phase one is already complete.  This was the initial recategorization across all the subdomains, and the deletion of the categories now handled in the forum.  This meant that I subleveled and split as much as I could, to cut down the query load.

Phase two, is going through each subdomain.  A category at time, a story at at a time.  What we are looking for are the following:

  • "wordord" glitch in chapters.  These are the remnants of the 2003 crash.  Where they have been found, I go direct into that table and fix it.  I am NOT proofreading, simply fixing the duplicateplicate (yes that's intentional) words where they are found. 
  • no author glitch in chapters.  There are chapters which lost their link to the author who added/uploaded them.  Once this happens, the author has no way to access that chapter.  Ever.  Unless it's relinked.  They are being relinked.
  • endless horizontal scrolling...
    About half is corruption (a variation of wordord) and about half is user input error.  Regardless, it's all getting fixed once it's found.  Whether in the chapter or the review board.
  • Review board checking.  There are duplicate entries, there are glitched entries (they have no data at all, just a user name, usually).  Troll and spam entries as well.  These are being found, noted, and deleted.
  • Orphan stories.  When one is looking at a page of stories listed, there are stories with no author attached.  They, and their associated records (review, chapter, rate) are being deleted once found.

Then, there are the new top level categories, and the categories which look like they will grow IN to top level categories.  Each are being sorted and the data shifted around.  The new top levels, this allows us to further subcategorize the larger ones where a story, for example is a pair type.  Then, it features a particular pair.  This makes it so we can work those three levels.  Which once again reduces load on the database and the server.  The categories which look like they will eventually grow into top levels at this point, are being subdivided as well.  Not to the level of a top level category, as that is not possible, but subdivided nonetheless.  Again, the end result is reduction of stress on the database.

Hopefully, I've explained this in plain english enough to where it's easy to understand.

database repair

Previous post Next post
Up