Hacklog: Blogamundo — poking holes in the language barrier since approximately 1 month from now

Machine translation and Open Source

Written by Patrick Hall, 2 months, 2 weeks ago.

InformationWeek blogger Serdar Yegulalp has some thoughts on the intersection of machine translation and open source:
Talk To Me, Openly - Open Source Blog - InformationWeek

He’s got an anecdote about how he tackled studying Japanese, and it makes a nice intro to the idea behind bitext and statistical machine translation:

..Since I didn’t have money for classes, I homebrewed my own self-teaching method. I went out and bought a grammar guide, and then two copies of a given book — one in Japanese, the other an English translation — and sat with them side-by-side, comparing the two on a sentence-by-sentence and phrase-by-phrase level. It worked, up to a point, and while I’m no native speaker I can certainly figure out a fair amount of what’s put in front of me as long as I have a dictionary.

I didn’t know it at the time, but this parallel-texts technique is actually one of the best ways to also teach a computer to perform translations between languages.
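
To make the parallel-texts idea a bit more concrete, here is a minimal sketch (not from Yegulalp's post) of one classic way a program can learn word translations from bitext: IBM Model 1, trained with expectation-maximization. The three-sentence German-English corpus and the names `BITEXT`, `train_ibm_model1`, and `most_likely` are made up for illustration; real systems train on millions of sentence pairs and use much richer models.

```python
from collections import defaultdict

# Toy bitext (hypothetical data): pairs of (source tokens, target tokens).
BITEXT = [
    ("das Haus".split(), "the house".split()),
    ("das Buch".split(), "the book".split()),
    ("ein Buch".split(), "a book".split()),
]


def train_ibm_model1(bitext, iterations=10):
    """Estimate word translation probabilities t(target | source) with EM."""
    target_vocab = {t for _, tgt in bitext for t in tgt}
    # Start with no preference: every target word is equally likely.
    prob = defaultdict(lambda: 1.0 / len(target_vocab))

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (target, source) pairs
        total = defaultdict(float)  # expected count of each source word

        # E-step: share each target word among the source words in its
        # sentence, in proportion to the current probabilities.
        for src, tgt in bitext:
            for t in tgt:
                norm = sum(prob[(t, s)] for s in src)
                for s in src:
                    frac = prob[(t, s)] / norm
                    count[(t, s)] += frac
                    total[s] += frac

        # M-step: re-estimate the probabilities from the expected counts.
        for (t, s), c in count.items():
            prob[(t, s)] = c / total[s]

    return prob


def most_likely(prob, source_word):
    """Return the target word with the highest t(target | source_word)."""
    candidates = {t: p for (t, s), p in prob.items() if s == source_word}
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    model = train_ibm_model1(BITEXT)
    for word in ["das", "Haus", "Buch", "ein"]:
        print(word, "->", most_likely(model, word))
```

Nobody tells the program that "Haus" means "house". It works that out only because "das" also shows up next to "Buch", which is more or less the same side-by-side comparison the anecdote describes.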

He’s also got some thoughts on the licensing issues involved with the data used to build MT systems, a topic that I don’t think has gotten enough attention.

(Please consider this an open thread for your thoughts on how MT and FOSS can and should interact.)
