It drives me nuts that we in the Django community rely on Solr or Haystack to provide us with full-text search when MySQL provides a
perfectly functional full-text search feature, at least at the table level and for modest projects. I understand that not every app runs on MySQL, but mine do, and I’m sure many of you are running exactly that, and could use this technique without modification.
Well, after much digging, I found an
article on MercuryTide’s website covering custom QuerySets with FULLTEXT and relevance, and built this library around it.
I used this rather than Django’s internal filter keyword search, because this technique adds an additional aggregated value, the relevance of the search terms to the search. This is useful in sorting the search, something not automatically provided by the QuerySet.filter() mechanism.
You must create the indexes against which the search will be conducted. For performance reasons, if you’re importing a massive collection of data, it’s better to import all of the data and then create the index. More importantly, when you declare that a SearchManager to be used by a Model, you declare it thusly:
class Book(models.Model):
...
objects = SearchManager()
When you do, you must add an index that corresponds to that list of fields:
CREATE FULLTEXT INDEX book_text_index ON books_book (title, summary)
Notice how the contents of the index correspond with the contents of the Search Manager. Or you can automate the process with South:
def forwards(self, orm):
db.execute('CREATE FULLTEXT INDEX book_text_index ON books_book (title, summary)')
def backwards(self, orm):
db.execute('DROP INDEX book_text_index on books_book')
To use the library is fairly trivial. If there is only one index (which can encompass several columns) for any table, you call
books = Book.objects.search('The Metamorphosis').order_by('-relevance')
If there’s more than one index, you specify the index by the list of fields:
books = Book.objects.search('The Metamorphosis', ('title', 'summary')).order_by('-relevance')
Note that that’s a tuple, and must be.
If you specify fields that are not part of a FULLTEXT index, the error message will include lists of viable indices. It will also tell you if there are no indices. (Getting that to work was tricky, as it involved database introspection and the decoration of methods, so I’m especially proud of it.)
The library is fully available on my github account:
django_mysqlfulltextsearch
This entry was automatically cross-posted from Elf's technical journal,
ElfSternberg.com