Tags and SQLObject

Jan 09, 2006 18:57


Tagging! It is popular! For my current project (in TurboGears) I decided to do a tag-based keywording system. I wanted handling the tags to be as simple as possible, to the point of just feeding an object a string and have it chop it up for me and manifest the list of tags as a string. I made a simple keyword class:

class keyword(SQLObject): class sqlmeta: defaultOrder="keyword" keyword = StringCol(length=64, alternateID=True, unique=True) documents = RelatedJoin('document')
And then you can add the following methods to your document SQLObject in the model to allow them to be tagged by setting the keywordstring:
def _set_keywordstring(self, keywordstring): import sets p = sets.Set() for kw in self.keywords: self.removeKeyword(kw) for word in keywordstring.replace('/', ':').split(' '): word = word.lower() kw = None try: kw = keyword.byKeyword(word) except sqlobject.SQLObjectNotFound: kw = keyword(keyword=word) if not word in p: self.addKeyword(kw) p.add(word) def _get_keywordstring(self): return ' '.join(map(lambda x: x.keyword, self.keywords))
Then you can make documents and set document strings!

myDocument = document(title="Some document") myDocument.keywordstring = "web programming awesome" for document in keyword.byKeyword("awesome").documents: print "Document: '%s', keywords: %s" % (document.title, ' '.join([kw.keyword for kw in document.keywords])) # "Some document" should be in here
Obviously, you can extend this to other things as well: add a links = RelatedJoin('link') to your keyword and add a keywords = RelatedJoin('keyword') in your link object definition and you're golden.

Other potential useful things: set up comma-separated tag fields rather than del.icio.us-style space separated ones, using [x.strip() for x in somestring.split(',')]. Using shlex for smarter tokenizing. Also, if you don't like alphabetized tags, or want a tag to appear multiple times, or tags to maintain case, well, that's all possible but cutting out the list comprehension/set trickery in the _set_keywordstring method. I thought an all-lowercase, one-time-use set of tags was unambiguous and less confusing than a more freeform system for the task at hand. Also, note that any tag with a forwardslash has it replaced with a colon. This was kind of important because I have URLs that look something like /keyword/whatever/ and a tag with a forwardslash would break the URL pattern.

One thing I cut out of my example code: deleting tags that are no longer in use by anything. This is simple enough with a single DELETE statement, something like DELETE FROM keyword WHERE id NOT IN (SELECT DISTINCT keyword_id FROM document_keyword) would work. Note if you're using more than one thing tagged, like links, you can do something like DELETE FROM keyword WHERE id NOT IN (SELECT DISTINCT keyword_id FROM document_keyword UNION SELECT DISTINCT keyword_id FROM keyword_link) and just keep chaining UNIONs to include all your pertinent tagged tables.
Previous post Next post
Up