Wikipedia is the biggest multilingual project, ever. “Interwiki links,” links between articles in all the Wikipedias, constitute an impressive translation database. The question is, could it be harvested somehow?
What kind of information is in these links? Well, translations: the Tasmanian Wilderness is called タスマニア原生地域 in Japanese… the Snowdon lily is called Späte Faltenlilie in German… a Moustached Warbler is called a Мустакато шаварче in Bulgarian… and translations for secondary sex characteristics are available in Español (Spanish), Lietuvių (Lithuanian), Svenska (Swedish), and 中文 (Chinese). And on and on.
I’m only just starting to look into this stuff, but the thought of somehow programmatically collecting these correspondences as a rough multilingual lexicon is pretty interesting. Which links are made, between which languages, is all still a pretty random affair, but there are a lot of Wikipedians (myself included) working on fleshing them out. It seems that in the future this will be a sizeable resource indeed. And though there is no metadata in these interwiki links, I suspect that as a whole they will turn out to be more robust than Wiktionary.
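To make the idea of collecting these correspondences a bit more concrete, here is a minimal Python sketch. It assumes the interwiki links show up in an article's wikitext as `[[ja:タスマニア原生地域]]`-style lines, and that you already have the wikitext in hand. The function names (`extract_interwiki`, `build_lexicon`), the sample wikitext, and the simplified language-code pattern are all my own illustrative assumptions, not anything an existing bot actually does.

```python
import re

# Matches links such as [[de:Pilze]] or [[zh-min-nan:Title]].
# Assumption: language prefixes are 2-3 lowercase letters, optionally followed
# by hyphenated parts. Real wikis also use prefixes (e.g. "simple") that this
# simplified pattern will miss.
INTERWIKI_RE = re.compile(r"\[\[([a-z]{2,3}(?:-[a-z]+)*):([^\]|]+)\]\]")


def extract_interwiki(wikitext):
    """Return {language_code: title} for each interwiki link in the text."""
    links = {}
    for lang, title in INTERWIKI_RE.findall(wikitext):
        # Titles sometimes carry URL-style underscores; show them as spaces.
        links[lang] = title.replace("_", " ").strip()
    return links


def build_lexicon(pages):
    """pages maps an English title to its wikitext (hypothetical input)."""
    return {title: extract_interwiki(text) for title, text in pages.items()}


# Hypothetical wikitext snippets, using the examples mentioned above.
sample_pages = {
    "Tasmanian Wilderness": "Article text...\n[[ja:タスマニア原生地域]]",
    "Snowdon lily": "[[Image:lily.jpg]] Article text...\n[[de:Späte_Faltenlilie]]",
}

lexicon = build_lexicon(sample_pages)
for english, translations in lexicon.items():
    print(english, translations)
```

The result is just a nested dictionary keyed by the English title, which is about as rough as a lexicon gets: no part of speech, no sense information, only "this article corresponds to that article."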
Harvesting and analysis efforts are already being made with Interwiki bots. This graph, for instance, was generated by an automated tool (which apparently lacked a Japanese font!).

A final prerequisite, and a fundamental one: in order to re-use any content from Wikipedia, one would have to spend some time thinking about how the interwiki links themselves fall under Wikipedia’s license, the GNU Free Documentation License. It’s not clear to me how the license applies to the interwiki links alone. The Wikipedia:Forking FAQ says:
As set forth at Wikipedia:Copyrights#Definitions_and_trademarks, Wikipedia considers each Wikipedia article to be an individual document. Moreover, for the purposes of creating derivative works of individual Wikipedia articles, Wikipedia considers a direct link-back to a particular Wikipedia article as being in full compliance with the GNU Free Documentation License (GFDL), provided your derivative work is also licensed under the auspices of the GFDL. As such, would-be Wikipedia forkers need not worry about the challenges involved in setting up a large-scale Web site.
But a link isn’t a document. And should one cite each link with another link? And how could we help to give back to the interwiki linking effort?
Lots of stuff to think about, here. The legalities can get a bit tedious. But they’re certainly important!
Update: Here’s a little comparison of the number of translations available through interwiki links with the number available on Wiktionary, in this case for the article Fungus. The interwiki links give:
- فطر
- Fungi
- Гъби
- ব্যাঙের ছাতা
- Fong
- Houby
- Ffwng
- Svampe
- Pilze
- Fungo
- Fungi
- Seened
- Sienet
- Mycota
- Fungas
- פטריות
- Gomba
- Fungi
- 菌類
- 균류
- Fungi
- Pilzeräich
- Grybų karalystė
- Габа
- Poggenstöhl
- Schimmels
- Sopper
- Grzyby
- Fungos
- Грибы
- Svampar
- பூஞ்சைகள்
- เห็ดรา
- Mantar
- Tchampion
- 真菌
Wiktionary has seven entries.