hacklog

Computational, You Say?

Written by Patrick Hall, August 5th, 2008

Man, am I confused.

Could someone explain to me how the “Computational” Linguistics Olympiad has anything to do with computation?

Via Language Log I learned that Google, the NSA, Cambridge University Press, and a bunch of universities all over the US and Canada ran a competition called the “North American Computational Linguistics Olympiad.” The winners get to go to something called the 6th International Olympiad in Linguistics.

Query 1: Man, why didn’t this kind of thing exist when I was in high school? (Oh yeah, computational linguistics on clay tablets is tiresome…)

Query 2: Do the questions in the first round problems (pdf) really have anything to do with computation… at all?

Since the pdf says that the questions are copyrighted, I can’t reproduce them here, but I can say that they are exactly the sorts of problems that I was given in my undergraduate linguistics classes. (The kinds of classes where people would get indignant if you suggested they make use of machine-readable dictionaries… “That’s cheating!” they countered. I kid you not). There isn’t a thing in there that can’t be done with a pencil and paper. And indeed, that seems to have been the point; the students were given 3 hours and no computer to complete their “computational” linguistics test.

I’m doubly confused because of just how illustrious the committee running the thing was: a Who’s-Who of numerically-informed linguistics. (Among them was Steven Abney, author of the best paper on why linguistics needs statistics that I have ever had the pleasure of discovering.)

So why are all the problems exercises in logic? There’s nary a number in sight.

We do not grok.

2 Comments for 'Computational, You Say?'

  1. Comment received August 5th, 2008 from MBM

    Well, there are two schools of thought as to what computational linguistics actually is, and how it is different from natural language processing (NLP). Some people treat those terms as synonyms but others believe that computational linguistics is to NLP like mathematics is to engineering: it’s highly theoretical and obsesses about “elegant” theories while NLP is dirty, data-intensive and mired in heuristics. I’m sure you’ll agree that those are at least two ends of a continuum if not separate disciplines.

    Under the latter definition, the Olympiad questions are not all that off-target.

  2. Comment received August 5th, 2008 from Patrick Hall

    Hey there MBM,

    I think the dichotomy you mention certainly exists, but I think it’s not in fact terribly useful, if we think of the long-term goals of… whatever we’re going to call it. It’s my humble opinion that if we want to see a new generation of researchers in NLP/HLT/Computational Linguistics/Whatever, they need to be learning how to crunch numbers, and the sooner the better.

    There is no shortage of tasks in this arena which can be accomplished with only the most basic mathematical background. Cavnar and Trenkle’s language identification algorithm requires no more than a bit of counting. Ando and Lee’s Tango algorithm for splitting up Kanji sequences is based on pretty simple math. Brill’s tagger is “rule based” but has a statistical element.

    I’m not saying that these particular projects are suitable for a high-school competition; what I’m saying is that dealing with a bit of statistics is not generally beyond high school students. The problem that I have with the test is not that there were logic problems (I like those too and agree with their importance); it’s that there were zero problems that convey the importance of math to the new worlds of linguistic research.
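To make the “bit of counting” in the comment above concrete, here is a rough sketch in the spirit of Cavnar and Trenkle’s approach: rank the most frequent character n-grams in a text, then compare rankings with an “out-of-place” distance. This is a simplification under stated assumptions, not their exact procedure. The function names, the underscore padding, the `max_n=3` and `top_k=300` settings, and the two toy training sentences are all made up for illustration.

```python
from collections import Counter


def ngram_profile(text, max_n=3, top_k=300):
    """Map each of the top_k most frequent character n-grams to its rank."""
    counts = Counter()
    for word in text.lower().split():
        # Pad with underscores so word boundaries show up in the n-grams.
        padded = f"_{word}_"
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    ranked = [gram for gram, _ in counts.most_common(top_k)]
    return {gram: rank for rank, gram in enumerate(ranked)}


def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences; n-grams missing from lang_profile get a fixed penalty."""
    max_penalty = len(lang_profile)
    total = 0
    for gram, rank in doc_profile.items():
        if gram in lang_profile:
            total += abs(rank - lang_profile[gram])
        else:
            total += max_penalty
    return total


def identify(text, lang_profiles):
    """Return the language whose profile is closest to the text's profile."""
    doc_profile = ngram_profile(text)
    return min(lang_profiles, key=lambda lang: out_of_place(doc_profile, lang_profiles[lang]))


# Toy training data (hypothetical; real profiles need far more text).
SAMPLES = {
    "english": "the quick brown fox jumps over the lazy dog and then the dog sleeps",
    "spanish": "el perro duerme en la casa y el gato corre por la calle con la luna",
}
PROFILES = {lang: ngram_profile(sample) for lang, sample in SAMPLES.items()}
```

Calling `identify(some_text, PROFILES)` returns the name of the closest profile. Nothing here goes beyond counting, sorting, and subtraction, which is the point of the comment; how well it works depends almost entirely on how much training text sits behind each profile.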
