Brit-picking tool for writers and other word-lovers: the British National Corpus

Mar 26, 2008 16:10

Have you ever wondered how the British use a word like "pavement" or "jumper"? In fandom, non-British writers with such questions can get their fic Brit-picked, and that's a terrific idea. But Brit-picking has its limitations. What if you consult several British fans and they disagree? Who's right? Would a British person really use a word in the way you want your character to use it? How can you tell?

You can do what professional linguists do: consult the British National Corpus. Maintained at Oxford, the BNC is a 100-million-word searchable database of spoken and written British English. The general public can do free web-based searches for a word or phrase. If you search, for example, for "pavement," your search will return the number of times the word appears in the database (1,263), plus fifty randomly chosen sample sentences containing it.

As an American trying to grasp the subtleties of a British expression, I've found it invaluable to see fifty instances of the expression all in one place. Here, for example, is one of the hits for "pavement."

I ran into the road, did a Highland fling and ran back on to the pavement.

If an American performed the same rash action, she'd say: "I ran into the street, did a Highland fling, and ran back onto the sidewalk." (There are only 79 hits for "sidewalk" in the BNC, and most of them seem to come from novels about American characters.)

The database has its limitations. It's descriptive, not prescriptive: it merely reports the words and phrases that British writers and speakers use, telling you nothing about whether these words would be perceived as correct, as Americanisms, etc. Also, the database extracts the sample sentences from their context -- though you can click on a link to see a bibliographic reference for a quote, so you can deduce approximately what the source text is about, and perhaps take a guess at other important facts like the age of the speaker, his or her regional dialect, etc.

With all those caveats in mind, the BNC still strikes me as both useful and fascinating. It's like a snapshot of British English from the mid-nineties (when the database was collected). And it decisively and objectively answers the question of whether British writers and speakers use a word, and how often.

Oh, and one of the samples entered into the database was a selection from Tom Shippey's The Road to Middle-earth, so many of Tolkien's characters and places have been recorded for posterity. The word "hobbit" appears in the database fifty-nine times. :D

ETA Oooooh! As darius has pointed out in the comments, there's an equivalent database for American English here. Perfect for British writers seeking to have their SGA or SPN fics -- what's the equivalent word? American-picked? In-stated? De-britted?