hacklog

Mysteries in translation seem …

Written by Patrick Hall, January 4th, 2009

Mysteries in translation seem to be hot http://tinyurl.com/8l4gvw

Onomatopoeia

Written by Patrick Hall, January 1st, 2009

It took me forever to learn how to spell that word, so that’s what you get for the title.

Good fun:

None of the young women behind the counter spoke a word of English, and the kanji signs were no help. So in trying to ascertain the pedigrees of the various meats, I pointed to a likely candidate, flapped my wings and let out a heartfelt “cluck cluck!”

In unison, the women behind the counter held their hands to their mouths to cover their giggles. They exchanged puzzled looks, and then I saw a light bulb go off in the head of one.

“Ku ku ku, ko ko ko!” she said, flapping her arms.

Happy New Year everybody!

Translations of the US Declara…

Written by Patrick Hall, December 28th, 2008

Translations of the US Declaration of Independence at the Center for History & New Media http://tinyurl.com/8zyjzx

An idea for building a Japanese dictionary from Wikipedia

Written by Patrick Hall, December 26th, 2008

Here’s a thought:

Many entries on the Japanese Wikipedia seem to begin with a consistent pattern.

Here’s the first sentence from the entry for “continent”:

大陸(たいりく、continent)は、地球上のの中で特に面積の広いものをいう。これに対して面積の小さな陸をという。大陸の定義は、文化や学問により異なる

And here’s “island”:

島(しま、Island)は、水域に四方を囲まれたの中で面積の規模の小さいものをいう。より規模の大きなものは大陸と呼ばれる。

Even if you have only the slightest familiarity with Japanese, you can probably detect the similar patterns at the beginning of the sentences. It goes:

  1. Title in bold
  2. Open parenthesis…
  3. Hiragana rendering of the subject
  4. U+3001 IDEOGRAPHIC COMMA
  5. English translation
  6. …closing parenthesis.

That could be matched with some regular-expression-fu and extracted from lots of articles. Implementation is left as (*cough*) an exercise for the reader.
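For anyone who wants a head start, here is a rough Python sketch of the six-step pattern. It assumes the first sentence has already been reduced to plain text (no wiki markup, so the bold title is just the text before the parenthesis), and it accepts both ASCII and fullwidth parentheses since I'm not sure which one a given dump will contain. The names `LEAD_PATTERN` and `extract_entry` are made up for illustration.

```python
import re

# Title, open paren, hiragana reading, U+3001 comma, English gloss, close paren.
# Assumption: input is plain text; parentheses may be ASCII or fullwidth.
LEAD_PATTERN = re.compile(
    r"^(?P<title>[^(（]+)"                 # 1. title (everything before the paren)
    r"[(（]"                               # 2. open parenthesis
    r"(?P<reading>[\u3041-\u3096ー]+)"     # 3. hiragana rendering
    r"、"                                  # 4. U+3001 IDEOGRAPHIC COMMA
    r"(?P<english>[^)）、]+)"              # 5. English translation
    r"[)）]"                               # 6. closing parenthesis
)


def extract_entry(first_sentence):
    """Return (title, reading, english) or None if the pattern is absent."""
    match = LEAD_PATTERN.match(first_sentence.strip())
    if match is None:
        return None
    return (
        match.group("title").strip(),
        match.group("reading"),
        match.group("english").strip(),
    )


if __name__ == "__main__":
    sample = "大陸(たいりく、continent)は、地球上の..."
    print(extract_entry(sample))
```

Articles whose reading is in katakana, or that list several glosses, would slip through this version; it only covers the simple shape shown in the two examples.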

(Of course, EDICT is already pretty huge and available…)

Russian on the New York Times,…

Written by Patrick Hall, December 25th, 2008

Russian on the New York Times, a year in review http://tinyurl.com/8rpmvf

Mining for stuff to translate …

Written by Patrick Hall, December 24th, 2008

Mining for stuff to translate on Wikipedia (just getting off the ground): http://github.com/amundo/translationspider

Dutch names and your database columns

Written by Patrick Hall, December 23rd, 2008

So sayeth Wikipedia:

A tussenvoegsel, in Dutch linguistics, is a word that is positioned between someone’s given name and surname, but is still a part of someone’s last name.

In the Netherlands, the tussenvoegsels strictly speaking are not a part of someone’s last name. For example, someone whose family name is “De Vries” is not found at the letter “D” in the telephone directory but at “V”. Tussenvoegsels are therefore also required to be listed separately in databases; another reason for this is that it makes finding someone’s name relatively easy, as most Dutch prepositions start with the same letter. In the Netherlands, the tussenvoegsel is written with a capital letter if no name precedes it. So Jan de Vries (Jan being a given name), but: de heer De Vries (meaning Mr De Vries) and de heer en mevrouw Jansen-De Vries, (Mr and Mrs Jansen-De Vries).

In Flanders tussenvoegsels of personal names always keep their original orthography: mevrouw van der Velde, mevrouw J. van der Velde, and Jan Vanden Broucke.

So, if you have a database with “first name” and “last name” fields, where do you put the van or the van den or the uijt te de or whatever?
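One possible answer, going by the quoted passage, is to give the tussenvoegsel its own column. Below is a minimal Python sketch of that idea, not a recommendation for any particular database. The `PersonName` class and its field names are hypothetical, and the capitalisation helper only follows the Netherlands convention described above (it would be wrong for Flemish names, which keep their original spelling).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonName:
    given_name: str
    tussenvoegsel: str  # e.g. "de", "van der"; empty string if there is none
    surname: str

    def sort_key(self):
        # Phone-book order: file "de Vries" under V, not D.
        return (
            self.surname.casefold(),
            self.tussenvoegsel.casefold(),
            self.given_name.casefold(),
        )

    def full_name(self):
        # A given name precedes, so the tussenvoegsel stays lowercase.
        parts = [self.given_name, self.tussenvoegsel, self.surname]
        return " ".join(part for part in parts if part)

    def formal_surname(self):
        # No name precedes (as in "de heer De Vries"): capitalise the first letter.
        if not self.tussenvoegsel:
            return self.surname
        prefix = self.tussenvoegsel[0].upper() + self.tussenvoegsel[1:]
        return f"{prefix} {self.surname}"


names = [
    PersonName("Jan", "de", "Vries"),
    PersonName("Anna", "van der", "Velde"),
    PersonName("Piet", "", "Jansen"),
]
sorted_names = sorted(names, key=PersonName.sort_key)

if __name__ == "__main__":
    for name in sorted_names:
        print(name.full_name(), "/", name.formal_surname())
```

Keeping the prefix separate means sorting and display can each apply their own rule without parsing the name back apart.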

A great illustrated explanatio…

Written by Patrick Hall, December 23rd, 2008

A great illustrated explanation of Unicode, UTF-8, UTF-16, and all that: http://tinyurl.com/7u4kob

Informative podcast by Eve Bod…

Written by Patrick Hall, December 19th, 2008

Informative podcast by Eve Bodeux and Corinne McKay on translation and the business of translation: http://speakingoftranslation.com

Using github for l10n http://t…

Written by Patrick Hall, December 13th, 2008

Using github for l10n http://tinyurl.com/6cp9dm Neat idea.
