What is Toronto?

Feb 24, 2011 05:41

Let me preface this post by saying that I am speaking from human experience and not from formal expertise in Natural Language Processing (NLP), which I lack.

IBM and author Stephen Baker make the claim that Jeopardy! is full of wordplay and that there is little correlation between the category name and each clue's question. I assert that more often than not, it's clear to a human what type of question (person, place, thing, or other part of speech) each clue is looking for.

Assuming that both IBM and I are correct, one concludes that NLP is a complex field (which is nothing new) and that Trebek is right in saying that "Watson does NLP well enough to compete in Jeopardy!", with the obvious implication that Watson isn't as good at NLP as humans are.

However, I take the unpopular position that both IBM and Baker are incorrect: not only is it clear to a human what type of question each clue is looking for, but to a large degree it's clear to humans from each clue how literally to interpret the category name. Let me consider Baker's example:

[Country Clubs]
A French riot policeman may wield this, simply the French word for stick.

To a large degree it is clear to humans that the answer is not a country club... but allow me to further observe that here the clue's "answer" is a direct object. By comparison, the question Watson missed features an answer which is a subject. I have a very strong suspicion that clues which feature the answer as the subject, and which repeat the subject multiple times, suggest that the category name is to be taken literally (except when the clue blatantly suggests otherwise), whereas clues which feature the answer as a direct object suggest that the question could be open-ended, depending on the natural-language context of the clue.

I have a weaker suspicion that this observation isn't generally applicable.
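
For what it's worth, the suspicion above can be written down as a toy decision rule. This is only a sketch of my own guess, not anything IBM or Baker describe: the `Clue` fields and the `category_reading` function are hypothetical names, the grammatical role is assumed to be labeled by hand (no actual parsing happens here), and "multiple times" is assumed to mean two or more.

```python
from dataclasses import dataclass


@dataclass
class Clue:
    # All fields are hypothetical, hand-labeled inputs; nothing here parses text.
    answer_role: str                # "subject" or "direct_object"
    answer_mentions: int            # how many times the clue refers to the answer
    hints_nonliteral: bool = False  # the clue blatantly suggests wordplay


def category_reading(clue: Clue) -> str:
    """Guess how literally to read the category name for this clue."""
    # The exception: the clue itself blatantly says not to read it literally.
    if clue.hints_nonliteral:
        return "open-ended"
    # Answer is the subject and is repeated: take the category literally.
    # The threshold of 2 is an assumption standing in for "multiple times".
    if clue.answer_role == "subject" and clue.answer_mentions >= 2:
        return "literal"
    # Answer is a direct object: the category may be wordplay.
    if clue.answer_role == "direct_object":
        return "open-ended"
    # Anything else is outside what the suspicion covers.
    return "unclear"


# The [Country Clubs] clue: the answer is what the policeman "may wield".
country_clubs = Clue(answer_role="direct_object", answer_mentions=1)
# A made-up clue whose answer is the subject and is referred to twice.
repeated_subject = Clue(answer_role="subject", answer_mentions=2)

print(category_reading(country_clubs))
print(category_reading(repeated_subject))
```

Under these assumptions the [Country Clubs] clue comes out as open-ended and the repeated-subject clue as literal. A real system would still have to work out the grammatical role from the clue text, which is exactly the hard part.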

watson, ibm, grammar, jeopardy
