Defragment before backup?

Nov 01, 2006 11:54


According to ps, I started dd-ing my Windows partition at 1:42 am. It is still going.

Of course, I am also using bzip2 to compress, but here are some interesting stats on speed:
  • My Windows partition is about 88G. Assuming a slow sustained transfer rate of 5MB per second (not sure how they got this data, but in this example a slower disk than mine purportedly has a sustained transfer rate of 35MB per second, and my data should be read sequentially since it's an entire partition), dd should have finished the copy in a little more than 5 hours. It has been over 10 hours.
  • Bzip2 seems to be a slow algorithm. I think I may use gzip, since this article seems to assert that gzip offers a compression ratio comparable to bzip2's, and is faster besides. However, the author doesn't describe his methodology (what his sample was, how he timed the compression, etc.), so I take this with a grain of salt.
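
To sanity-check the 5-hour figure in the first bullet, here is a rough Python sketch of the arithmetic. It assumes "88G" means 88 × 1024 MB and that the disk really sustains the quoted rate for the whole read; the function name `transfer_hours` is made up for this example.

```python
def transfer_hours(size_gb: float, rate_mb_per_s: float) -> float:
    """Hours needed to read size_gb at a sustained rate_mb_per_s.

    Assumes 1G = 1024 MB, which is a guess about how the sizes were reported.
    """
    seconds = size_gb * 1024 / rate_mb_per_s
    return seconds / 3600


partition_gb = 88  # partition size quoted in the post

# 5 MB/s is the pessimistic rate, 35 MB/s the rate quoted for the slower disk.
for rate in (5, 35):
    print(f"{rate} MB/s -> {transfer_hours(partition_gb, rate):.2f} hours")
# 5 MB/s -> 5.01 hours
# 35 MB/s -> 0.72 hours
```

Even at the pessimistic rate the raw read should take about 5 hours, so the extra time most likely comes from something other than the disk read, with the compression step being the obvious suspect.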

Here are some more interesting stats on compression ratio:
  • I am using only 5% (4.4G) of my Windows partition.
  • I'm not sure how far along the copy and compressing are, but the unfinished file is currently 62G. This means that bzip2 compresses my disk image by less than 30%.
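
The percentages in these two bullets can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the helper name `space_saved` is invented here, and the numbers are the ones quoted above.

```python
def space_saved(original_gb: float, compressed_gb: float) -> float:
    """Fraction of the original size that compression removed."""
    return 1 - compressed_gb / original_gb


partition_gb = 88      # size of the Windows partition
used_gb = 4.4          # space actually in use on it
output_so_far_gb = 62  # size of the unfinished .bz2 file

print(f"in use: {used_gb / partition_gb:.1%}")
# in use: 5.0%
print(f"saved at most: {space_saved(partition_gb, output_so_far_gb):.1%}")
# saved at most: 29.5%
```

Because the output file is still growing, 29.5% is an upper bound on the savings; the final figure can only be lower.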

I understand how the bzip2 algorithm works, so I am confused as to why the compression ratio is so bad. Lots of free space = heaven for Huffman encoding, yes?

My current theory is that I need to defragment/zero out the free space on my Windows partition. I did not reformat my disk when I kicked off this whole project; I'm not sure what Ubuntu did when I repartitioned and installed since I clicked "install" and went off to make chocolate chip cookies. When I came back it was done, so if it did format the whole drive I missed the message. The Windows install only "quick-formatted" my Windows partition. Therefore, all the "free space" on my Windows partition could be filled with garbage - since dd is passing all that information byte for byte to bzip2 as one giant file/stream, bzip2 probably treats it all as data.
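
The theory is easy to try on a small scale with Python's standard `bz2` module. The sketch below compresses 1 MiB of zeros (a stand-in for zeroed free space) and 1 MiB of random bytes (a worst-case stand-in for leftover garbage). Real leftover data is old file content rather than true randomness, so it would probably land somewhere between the two; the sample size and the helper name `compressed_fraction` are choices made for this example only.

```python
import bz2
import os

SIZE = 1024 * 1024  # 1 MiB sample, a tiny stand-in for the partition's free space

zeroed = bytes(SIZE)        # free space that has been overwritten with zeros
garbage = os.urandom(SIZE)  # random bytes: worst case for any compressor


def compressed_fraction(data: bytes) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return len(bz2.compress(data)) / len(data)


zero_fraction = compressed_fraction(zeroed)
garbage_fraction = compressed_fraction(garbage)

print(f"zeros:   {zero_fraction:.4%} of original size")
print(f"garbage: {garbage_fraction:.4%} of original size")
```

The zeroed sample should shrink to a tiny fraction of its size, while the random sample should barely shrink at all (it may even grow slightly). That is consistent with the idea that un-zeroed free space is what keeps the image large.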

I say this is my theory because I'm not particularly good at this computer sciencey stuff (I really need to learn more about dd and bzip2 -- I only approximately know what they're supposed to do), so if anyone wants to comment and prove me wrong, please feel free.

I'm also debating whether I should let this stupid copy run itself out, so I can test my theory. That will probably take at least another 4 hours or so, though, and as brief stints in two separate Bio labs have taught me, I have little patience for conducting experiments. If I'm wrong, though, and I cancel and restart (and defragging and clearing out garbage does absolutely nothing), it will take another 12-15 hours. Hmmm.

In the meantime, I have not really been able to do very much besides watch my disk I/O get hosed. I know it's sort of like watching a pot boil, but I will note that the last time I decided NOT to watch a pot boil, I ended up with bitter, brown, burned split-pea soup. (This is probably because I forgot about it entirely, and hence neither turned the heat down nor bothered to stir.)

Since I cannot seem to sit down at my computer without obsessively typing "ls -lh" to see how the backup/compression is going, I decided to finally clean my kitchen.

Glyph and I actually have a very nice kitchen - it had just been really dusty and cluttered up with garbage and whatnot since we finally moved into our apartment. I cleared it all out though and even made steaks and cookies. Doo do doo dooooo.... FINISH COPYING OVER ALREADY.

home, windows
