Hacklog: Blogamundo — poking holes in the language barrier since approximately 1 month from now

Should tags be translated? I don’t think so.

Written by Patrick Hall, 2 years, 3 months ago.

Quick note about the Python/Linguistics series — the first one was actually done, until I tested it with a friend who was using Windows. We discovered some… issues with the steps as they stood. Newlines + platforms, bah. Anyway, the first installment should be up today or tomorrow.

Here’s a post that’s been sitting in the Hacklog outbox for a while now…

You’re It! led me to an intriguing question: Should tags be translated?

Information architect Peter Van Dijck poses the question in the description to his talk “Tags and facets, tags and languages: a case study”:

Tag clouds are … notably hard to localize. Most tag clouds currently simply present a combination of languages to the user.

  1. What approaches are possible to fix that, and what works?
  2. Can algorithms be used efficiently?
  3. Should tags be translated?
  4. Manually, or by the computer?

And he answers the question with regard to his own site, Mefeedia, in the slides to the talk:

The plan for Mefeedia:

  1. Don’t translate tags.
  2. Don’t copy tags – new language = new community. Possible complaints:
    “My tags don’t show up in the Spanish version”.
  3. I could perhaps use names (“People”) in multiple languages (that have Latin charsets). → Semantics might help with localization.

All this brings to mind what Joshua Schachter (of del.icio.us fame) had to say about the nature of tagging, as Suw Charman at Strange Attractor described it:

Tagging is not really about classification or organisation, it’s user interface. It’s a way to store your working state or context. Useful for recall. Ok for discovery because someone might tag similarly to you. Bad for distribution.

Not all metadata is tags. People ask for automatic metadata, but that’s not the value - the value is attention, that you saw it and decided that it was important enough to tag. Auto-tagging doesn’t help you do what you’re trying to do.

I agree with both of these viewpoints — translating tags doesn’t make sense. I used to think it would be cool if del.icio.us, for instance, would let you filter by the language of the linked content, but the fact is that if you want to do that, you can: just use the tags in whatever language you’re looking for. If you want to find Portuguese articles on programming, you just go to http://del.icio.us/tag/programação+artigos.

(I will say that I’m not sure what Peter meant with regard to Latin letter character sets… It’s just as easy to find Japanese articles on programming with a Japanese tag as it is to find Portuguese…)

Python and Linguistics

Written by Patrick Hall, 2 years, 3 months ago.

I’m starting a series of posts with short, simple tutorials on using Python for doing linguistics.

Via a roundabout route, I came across Heidi Harley’s interesting post on the frequency distributions of letters. Heidi muses:

I know it would be a supersimple programming problem to produce a list of letters and their respective percentage distributions in the headwords of any online dictionary database … but it’d be a biggish time investment for me to figure it out right this second.

This is just the sort of person I have in mind: I’d like to write some tutorials that help language geeks learn just enough programming to scratch itches without biggish time investments.

If this sounds like your idea of a good time, then the only step you have to take ahead of time is to make sure you have Python installed. You can learn how to do that here:

http://wiki.python.org/moin/BeginnersGuide

If you’re running a relatively recent Linux or Mac OS X, you probably don’t even need to do that, since Python comes installed by default. If you’re running Windows, just follow these download instructions:

http://wiki.python.org/moin/BeginnersGuide/Download

I can think of a lot of interesting little textual and linguistic questions that can be answered with a short Python program (10 to 30 lines, maybe). We’ll see if the whole concept has any traction…
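To give a taste of what one of those short programs might look like, here’s a rough sketch of the letter-frequency problem Heidi describes. It’s only an illustration, not the tutorial itself: it assumes Python 3, the function name letter_percentages is made up for this example, and the three headwords are placeholder data standing in for a real dictionary’s headword list.

```python
from collections import Counter


def letter_percentages(headwords):
    """Return (letter, percent) pairs, most frequent letter first.

    `headwords` is any iterable of strings. Case is folded and
    anything that isn't a letter (spaces, hyphens, digits) is ignored.
    """
    counts = Counter(
        ch
        for word in headwords
        for ch in word.lower()
        if ch.isalpha()
    )
    total = sum(counts.values())
    if total == 0:
        return []  # no letters at all, so nothing to report
    return [(letter, 100 * n / total) for letter, n in counts.most_common()]


# Placeholder data: swap in the headwords from a real dictionary.
sample_headwords = ["apple", "banana", "cherry"]

for letter, percent in letter_percentages(sample_headwords):
    print(f"{letter}  {percent:.1f}%")
```

Because the check is str.isalpha() rather than a hard-coded a to z, accented and non-Latin letters get counted too, which seems like the right default for language geeks.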

If you’re a Python hacker and find this sort of thing interesting as well, and would like to post a related tutorial somewhere, feel free to use that little icon doohickey I made for no apparent reason.
