Jul 11, 2008 23:01
(JONNY is setting plates on a table. MOM is placing romaine lettuce into a bowl.)
JONNY
The meeting was really great. I met everyone involved in the project -- except for one guy who joined us by speakerphone. There's me, my professor Chris, Zhifei, who's the PhD student whose project this is, and Wren and Jason and Omar, and Lane, who was the disembodied voice.
So Zhifei walked us through all of his code. And it was a little difficult for me because sometimes the descriptions were very high-level, as in "and we obviously have to do it this way, since we're using such-and-such an algorithm." And everyone nods, but it's clear I'll have to do a lot more background reading on CYK parsing and synchronous context-free grammars and hypergraphs.
And so we all have our different components. Mine is important, because it's the language model, which is just one of the many probabilities we have to consider when finding the best translation for a phrase. The language model essentially answers the question, "Is my translation candidate something a native English speaker would ever say?" And so my job is to implement that. At the moment we don't have an in-house implementation, so the decoder relies on SRILM, developed at SRI International (formerly the Stanford Research Institute), which no one really seems to like. (Particularly Zhifei, and it's his project after all.) I'll be reimplementing it in Java, for portability, as the log-frequency Bloom-filter-based language model described by David Talbot in a couple of short papers, which he eventually developed into a PhD thesis. Talbot sent me some implementation tips by email; did you know that?
(Short pause.)
MOM
Do you think there will be any girls in your program?
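(A postscript, outside the scene, for the curious: the log-frequency Bloom filter Jonny describes stores n-gram counts by quantizing each raw count onto a logarithmic scale, then inserting one filter entry per quantized level; a lookup probes successive levels until it gets a miss. Because a Bloom filter has only false positives, the retrieved count can be overestimated but never underestimated. Here is a rough sketch of that idea in Java. The class name, hash scheme, and parameters are my own illustration for this post, not the project's actual code.)

```java
import java.util.BitSet;

// Sketch of a Talbot-style log-frequency Bloom filter for n-gram counts.
// Names and hashing are illustrative assumptions, not the real implementation.
public class BloomLM {
    private final BitSet bits;
    private final int m;                     // filter size in bits
    private final int k;                     // number of hash probes per key
    private static final double BASE = 2.0;  // base of the logarithmic quantization

    public BloomLM(int mBits, int numHashes) {
        this.m = mBits;
        this.k = numHashes;
        this.bits = new BitSet(mBits);
    }

    // Quantize a raw count c to q = 1 + floor(log_BASE(c)).
    static int quantize(long count) {
        return 1 + (int) (Math.log(count) / Math.log(BASE));
    }

    // k probe positions via simple double hashing (a toy stand-in for real hashes).
    private int probe(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 >>> 16) | 1;            // force the second hash to be odd
        return Math.floorMod(h1 + i * h2, m);
    }

    private void insert(String key) {
        for (int i = 0; i < k; i++) bits.set(probe(key, i));
    }

    private boolean contains(String key) {
        for (int i = 0; i < k; i++) if (!bits.get(probe(key, i))) return false;
        return true;
    }

    // Store an n-gram: insert (ngram, j) for every level j up to its quantized count.
    public void add(String ngram, long count) {
        int q = quantize(count);
        for (int j = 1; j <= q; j++) insert(ngram + "|" + j);
    }

    // Retrieve the quantized count: probe increasing levels until the first miss.
    // One-sided error: false positives can only inflate the answer, never shrink it.
    public int quantizedCount(String ngram) {
        int j = 0;
        while (contains(ngram + "|" + (j + 1))) j++;
        return j;
    }
}
```

(The point of the one-sided error is that a language model can tolerate an occasionally inflated count, in exchange for a memory footprint far below an exact hash table of all n-grams.)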