Hacklog: Blogamundo — poking holes in the language barrier since approximately 1 month from now

Why doesn’t Google index Khmer and Amharic?

Update: They fixed it! All the links in this post that previously didn’t work do now. Good job Google, better late than never! ☺ (That’s our working motto around here, too…)

Note: you might want to install Khmer and Ethiopic fonts. But you can still get the idea behind this post without having them installed.
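If the sample words show up as empty boxes, you can still check which script they belong to by asking for the Unicode name of each character. Here is a minimal sketch in Python, using only the standard `unicodedata` module. The helper names `describe` and `script_prefixes` are made up for this illustration, and taking the first word of a character name is only a rough stand-in for a real script lookup.

```python
import unicodedata

# Query strings copied from this post.
QUERIES = {
    "Khmer": "ភាសាខ្មែរ",
    "Ethiopic": "ዩኒኮድ",
}


def describe(text):
    """Return (code point, Unicode character name) pairs, one per character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


def script_prefixes(text):
    """Return the set of first words of the character names.

    This is a rough approximation of "which script is this?", not a
    proper Unicode Script property lookup.
    """
    return {unicodedata.name(ch, "<unnamed>").split()[0] for ch in text}


if __name__ == "__main__":
    for label, query in QUERIES.items():
        print(label, script_prefixes(query))
        for code_point, name in describe(query):
            print("  ", code_point, name)
```

Running it lists every code point in the two queries with its name, so you can see what is being searched for even without the fonts.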

Compare:

Three searches for ភាសាខ្មែរ (“Khmer language”) on Yahoo, MSN, and Google. You can click on the images to run the searches yourself.

Successful search for a Khmer word on Yahoo

Successful search for a Khmer word on MSN

Failed search for a Khmer word on Google

Google doesn’t just return zero results, it returns nothingness.

A Google blue screen of death.

And it’s not just Khmer:

Here’s a search for ዩኒኮድ (“Unicode” in Amharic, Tigrigna, and several other Ethiopic languages) on Yahoo, MSN, and Google.

Successful search for an Ethiopic word on Yahoo

Successful search for an Ethiopic word on MSN

Failed search for an Ethiopic word on Google

Once again, Google gives us nothing for those queries. Nary a “Did you mean” or “Your search did not match any documents.”

Just zilch. Zippo. Nada. Niente.

It’s not like nobody at Google has ever heard of these languages: unlike Yahoo and MSN, Google has actually been localized into Amharic, Tigrigna, and Khmer. And they’ve all got millions of speakers.

So what gives?

Theories welcome, I have none.

Update: They don’t bother to index Burmese, either. ဗမာစာ. Yahoo does. MSN, too.

Lame, Google.

5 Comments for 'Why doesn’t Google index Khmer and Amharic?'

  1. Comment received 2 years, 4 months ago from Leandro N. Camargo

    Maybe these big fellows just don’t give the credit to none of these people. Perhaps, for them, none of them have at least computer at home or at office. =(

  2. Comment received 2 years, 4 months ago from Patrick Hall

    Hi Leandro,

    Obrigado pela visita :)

    It may be that the size of the communities of these languages is too small to warrant Google’s indexing them, but I suspect that there is some other reason. It doesn’t seem to be the case that the number of speakers corresponding to a particular Unicode block is relevant.

    Google indexes plenty of writing systems that are only used for “small” languages. To wit: Հայերեն (Armenian), with 7 million speakers, and ქართული ენა (Georgian), with just 4.1 million.

    Amharic alone has 26 million speakers, Tigrinya another 5.1 million; Khmer has something like 20 million.

    Admittedly, most of those speakers have no access to computers… yet. But the point

    Oh look! I found another missing language: ဗမာစာ … Burmese! There’s another 40 million ungoogleable speakers.

    Pfeh.

  3. Comment received 2 years, 3 months ago from Denis Jacquerye

    One search seems to work: ማውጫ - ጉግል መፈለጊያ

  4. Comment received 2 years, 2 months ago from U Khant

    fyi
    Now Google, Altavista, MSN, Scroggle (any language) now
    supports search in Burmese characters using truly, fully compliant Unicode-Padauk G font
    Yahoo’s only supports in web mails up to now, May 07. 2006

  5. Comment received 2 years ago from Mike Maxwell

    Funny thing is, Ethiopic searching was working on Google about two years ago, then it suddenly stopped working. I had email and face-to-face correspondence with several people at Google over this. As you note, they finally got it working again. Not sure what the problem was…
