It pains me to write this.
Has anybody ever done backups for Apache Lucene? (In particular, Solr, though I've no reason to believe that means anything special for backup/recovery.) How did you do it, in broad strokes?
Note that I'm interested in responses in accordance with jwz's rules here. I do not want to know what you think might work.
It would appear that Lucene pretends that it only ever writes files to disk atomically. The Solr community provides several utilities (actually, those are links to documentation about them; the utilities are in the distribution) that perform backups / site redundancy by way of making a hard link to the important files and then copying the data in the hard link.
Because, apparently, Lucene uses a brand new inode every time it writes data out, and then removes the predecessor link in the file system. I'm so sure this is faster than, um, using mmap(2), for example. It definitely makes sense to run every single actual I/O through the file system. Oh, wait, maybe not.
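For what it's worth, here is a minimal sketch of the hard-link-then-copy idea as I understand it. This is not the code of the Solr utilities; the function names `make_snapshot` and `copy_snapshot` and all the paths are hypothetical, and it assumes the snapshot directory lives on the same file system as the index (hard links can't cross file systems) and that index files really are never modified in place once written.

```python
import os
import shutil
import tempfile


def make_snapshot(index_dir, snapshot_dir):
    """Hard-link every regular file in index_dir into a new snapshot_dir.

    Linking is cheap and gives each file a second name, so the data
    survives even if the index later unlinks its own name for it.
    """
    os.makedirs(snapshot_dir)  # raises if the snapshot already exists
    linked = []
    for name in sorted(os.listdir(index_dir)):
        src = os.path.join(index_dir, name)
        if not os.path.isfile(src):
            continue
        try:
            os.link(src, os.path.join(snapshot_dir, name))
        except FileNotFoundError:
            # The file was removed between listdir() and link(); skip it.
            continue
        linked.append(name)
    return linked


def copy_snapshot(snapshot_dir, backup_dir):
    """Copy the (now stable) snapshot somewhere else at leisure."""
    shutil.copytree(snapshot_dir, backup_dir)
    return sorted(os.listdir(backup_dir))


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    index = os.path.join(root, "index")
    os.makedirs(index)
    with open(os.path.join(index, "segment_1"), "w") as f:
        f.write("old data")
    make_snapshot(index, os.path.join(root, "snapshot"))
    print(copy_snapshot(os.path.join(root, "snapshot"),
                        os.path.join(root, "backup")))
```

The slow part (the copy) works from the links rather than from the live index directory, so it doesn't matter if the index removes its own names for those files in the meantime. Whether the set of files linked is a consistent index at that instant is a separate question that this sketch does not answer.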
But, you know, this is just my take as of 05:00 local Thursday morning, and I haven't gone back to actually sort it out in the ensuing twenty-one hours, so maybe I'm missing something. I will, however, post again when I have an answer to my own question here (since, it would appear, none of my readership does).