K had been reading "The Goldfinch", in which some of the characters speak Polish, and she noticed that one said "Dziȩkujȩ" ("thank you"), which is TOTALLY WRONG: the diacritics on the "e"s are supposed to be ogoneks, not cedillas (it should be "Dziękuję"). It seems odd that you'd even make that mistake. I know if I were writing a novel with a foreign-to-me language in it I
Comments 2
My understanding is that a lot of training data for machine translation systems ends up having wack-ass characters in it, possibly because it was originally transcribed from a book by some hapless undergrad using whatever characters they happened to have handy on their input device.