Hacklog: Blogamundo — poking holes in the language barrier since approximately 1 month from now

Socrates translates Etruscan to Greek, Socrates translates Etruscan to Latin, therefore…

Written by Patrick Hall, 9 months, 3 weeks ago.

Dear translators:

I have a question for you. It’s my understanding that translators generally describe their language skills in one of two ways:

  1. A language which you can translate into or out of
  2. A language which you can translate out of

I’m using plain language because I find that the terminology used in the translation community can be a bit confusing. (Cases in point: the term “working language” seems to refer sometimes to the first of these and sometimes to the second. The term “native language” usually implies that the translator can translate into or out of the language. Both of these terms seem to have exceptions.)

And another question:

I saw a CV of a translator who described their language skills as follows:

  • English → French
  • English → Portuguese
  • French → Portuguese

What I don’t understand is, why wouldn’t one also list Portuguese to French? If a translator can translate into French from English, isn’t it safe to assume that they can also translate into French from any other language they know well (Portuguese, in this case)?

Bilingual Packaging is Interesting

Written by Patrick Hall, 9 months, 4 weeks ago.

Print materials are a great place to look for web design inspiration. (I’m particularly fond of table of contents pages in magazines — lots of ideas for navigation motifs.)

The sort of print that appears on packaging has an additional problem: limited real estate.

Which is what makes bilingual packaging doubly interesting. Bilingual designers have to find ways to balance more than one language, as well as deal with space restrictions. A post on bilingual packaging got me thinking about this again.

I’ve started poking around in results like these looking for interesting examples. While the author of the link above is concentrating on Welsh/English bilingual packages, I’ll be looking for links to any language pair. Pointers welcome.

(Language) Lost in Aggregation

Written by Patrick Hall, 10 months ago.

Lots of “aggregation”-style sites, where content streams in either from various users (del.icio.us, Digg / News, Reddit) or automatically from various sites via feeds or via search (Bloglines, Technorati, Afrigator), face an interesting problem: how should such a service deal with multilingual input?

Let me show you what I mean with just a few examples:

del.icio.us

Japanese links on del.icio.us

Technorati

Spanish link on Technorati

Afrigator

French link on Afrigator

Disclaimer: Creating an aggregator which expressly allows and even encourages multilingual content is a perfectly noble thing to do. There are tons of reasons why one would want such a thing. (Perhaps the aggregator is from Switzerland or South Africa!)

In each of the cases in the screenshots, you have a language which is not “the” language of the site (such links are shaded yellow): in these instances we see Japanese on del.icio.us, Spanish on Technorati, French on Afrigator. But this is the nature of a collaborative site; clearly, in the case of del.icio.us, there are a ton of Japanese users*.

What is the right way to handle language identification in an aggregator?

  1. Let it be. Aggregators are supposed to be “emergent” anyhow.
  2. Use the Spec. Rely on things like lang attributes in XHTML and all that to classify posts.
  3. Get statistical. Classify posts automatically by spidering the links, then running a statistical language identifier on the page.

My own opinions on these hypothetical attitudes:

  1. Okay John Lennon, but… The fact is, nobody reads every language. And if current trends continue, the linguistic diversity of the web is only going to grow and grow. Does it make sense to rule out even the option of filtering content in aggregators by language, just to be… um… emergent?
  2. Specs are all well and good but… Sinful though it may be, people don’t use the lang attribute consistently (yet), in XHTML forms or anywhere else.
  3. Just count letters! This is the option I favor, in theory.
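To make the second attitude concrete, here is a minimal sketch in Python of what “using the spec” might look like: read whatever lang or xml:lang the page declares on its html element. The names LangSniffer and declared_language are made up for this illustration, it relies only on the standard library's html.parser, and it isn't how any particular aggregator does it.

```python
from html.parser import HTMLParser


class LangSniffer(HTMLParser):
    """Collects the lang / xml:lang value declared on the <html> element."""

    def __init__(self):
        super().__init__()
        self.lang = None

    def handle_starttag(self, tag, attrs):
        # Only the first <html> tag is consulted in this sketch.
        if tag == "html" and self.lang is None:
            declared = dict(attrs)
            # XHTML pages may use lang, xml:lang, or both.
            self.lang = declared.get("lang") or declared.get("xml:lang")


def declared_language(markup):
    """Return the declared language code, or None if the page doesn't say."""
    sniffer = LangSniffer()
    sniffer.feed(markup)
    return sniffer.lang
```

A page that declares nothing comes back as None, which is exactly the weakness of relying on the spec alone.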

Let’s talk about that third option. Implementing a statistical language identifier isn’t too hard.

Here’s a buggy but fairly functional one I wrote by leveraging some existing libraries:

what language is this?
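For a sense of what “just count letters” can mean, here is a rough sketch in Python (not the tool linked above, and not necessarily its method). It assumes character trigram counts compared by cosine similarity, which is one common approach among several. The three sample sentences and the names trigrams, cosine, and guess_language are invented for illustration; a usable identifier would need far more training text and far more languages.

```python
from collections import Counter
import math

# Tiny made-up training samples (an assumption for this sketch only).
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and then the dog "
          "runs after the fox through the fields of the old farm",
    "es": "el rápido zorro marrón salta sobre el perro perezoso y luego el "
          "perro corre tras el zorro por los campos de la vieja granja",
    "fr": "le rapide renard brun saute par-dessus le chien paresseux et puis "
          "le chien court après le renard dans les champs de la vieille ferme",
}


def trigrams(text):
    """Count overlapping three-character sequences, padded with spaces."""
    padded = "  " + " ".join(text.lower().split()) + "  "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two trigram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# One trigram profile per language, built once from the samples.
PROFILES = {lang: trigrams(text) for lang, text in SAMPLES.items()}


def guess_language(text):
    """Return the language code whose profile is closest to the text."""
    probe = trigrams(text)
    return max(PROFILES, key=lambda lang: cosine(probe, PROFILES[lang]))
```

With profiles this small the guesses only hold up on text that resembles the samples, but the shape of the idea is the same at any scale: count, compare, pick the nearest.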

The problem is, (most) aggregators aren’t search engines. They don’t want to go spidering every one of the bazillions of pages that people post. Seriously, think of what Digg.com would have to do to spider every page posted. It would have to be Google.

So, there you go. I don’t know what the right answer is.

What are your reactions to these “attitudes”? Are there any I have missed?

*Interestingly, the tag I took that screenshot from is from http://del.icio.us/popular/tool, and there’s also a Japanese word “ツール” (tsuuru, “tool”), which has its own tag: http://del.icio.us/tag/ツール. Somebody oughta write a paper about these sorts of cross-linguistic tag relationships…

Ḻ⎋ŀ

Written by Patrick Hall, 10 months ago.

ȈⒻ ㄚ⒪Ů ĂℛḘ Ầ ựⓝⒾᏨȪ₫Ḙ ⒢Ȇⓔḳ, ¢ḨặɲČȅ$ ⒶⓇ∊ ¥ỡự ⓦȋ⒧Ł ṨṖ⒠₦đ ⓦ@ㄚ ŤɵⓄ ṂṺⒸℎ ⓣⒾℳǝ ṖȴǞŸȈꊁⓖ ẘⓘڅℋ ʈĦⒾꇘ Ŧℋ༏ℵğ.

OLPC in Brazil

Written by Patrick Hall, 10 months ago.

Just for fun, I transcribed and then translated a short video about the XO-1, the laptop formerly (?) known as the OLPC. And of course I did it with our tool!

Quite aside from the whole topic of translation, I’m a huge fan of the OLPC project. If you’re interested too, you can read the transcription or translation here:

Kids test OLPC

olpc.tv » GLOBO- BRASIL: Crianças testam computador portátil

Typing is tricky

Written by Patrick Hall, 10 months, 1 week ago.

With the spread of Unicode, it’s now possible to create texts in most of the languages of the world directly on the web. No more weird, inaccessible images of text.

So, everyone gets to write in their own language now, right?

Well, not so fast, unfortunately. There’s a problem that many folks seem to forget: Typing is tricky.

Think about trying to teach someone to type.

“This’ll be easy,” you think, “To explain how to type «a» I just point at the key with the «a» on it, and say, ‘Pound on that thingie right there!’”

Except, what do you see on the key you use to type «a»?

You see «A». That’s right kids, hit «A», get «a».

What if you actually want an «A»? Press the mysterious “shift” key. Well, explain that there are two, but they’re really the same (usually). And each has a redundant “arrow pointing up” icon (the «A» on the key is already “up”!). Got that? Good. Now, hold it down while you type «A».

That’s the thought process that new typists face… to type «A».

Now, multiply this challenge by the number of keys on the keyboard.

And multiply that challenge by the number of physical keyboard layouts in the world.

And that’s where we begin to see the problems faced by a computer newbie who bravely buys an hour’s worth of time in a cybercafe.

I actually have a broader reason for getting into this topic, and there will be some more posts coming along, but for now…

Seriously, just think about this for a second:

typing is tricky: a simple chart.

Translation starts at the pub!

Written by Jonas Galvez, 10 months, 2 weeks ago.

Globalized menu

That’s it, I’m going to get drunk right now and see if I grok this translation business better… [via Ben Hammersley]