An wiki-focloir gaeilge
Via the Language and Linguistics Reddit, I ran across an interesting account of running a Wiki-style dictionary.
The author, Eoin Ó Conchúir, describes how he has been running an English↔Irish wiki-style dictionary at IrishDictionary.org and FocloirGaeilge.ie.
I found this bit quite interesting:
People submit everything and anything. As much as I could, I tried to write the wording on the pages to guide people into adding words into the database if they knew the translation. Ok, let’s look at the last 15 *Irish* headwords added to the dictionary:
dúirt
doire
salus
emphasize
josh
anunwise
cuirm
goog morning ruth
sorry
i am very well
sweater
ma chara
symphony
muintir
You see, this is a validation pain, for want of a better term. “emphasize” is not an Irish word. “josh” is not an Irish word. “goog morning ruth” not an Irish word, far from it!
Validation pain is something I can identify with. On my statistical language identifier, I’ve found that no amount of help text convinces users that it will not work without enough text. I have plans to enforce this rule programmatically.
But even that won’t prevent people from putting in bad data—there are plenty of examples of people pasting in the same two words over and over a few hundred times, so that they are satisfied that there’s “enough” text in the system.
Sigh.
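For what it's worth, here is roughly what enforcing that rule could look like. This is only a minimal sketch in Python, not the identifier's actual code: the function name `check_input` and both thresholds are made up for illustration, and the "share of distinct words" test is just one simple way to catch the paste-two-words-a-hundred-times trick.

```python
import re

# Both thresholds are assumptions for illustration, not tuned values.
MIN_WORDS = 20            # reject anything shorter than this many words
MIN_DISTINCT_RATIO = 0.3  # reject text where too few of the words are distinct


def check_input(text, min_words=MIN_WORDS, min_distinct_ratio=MIN_DISTINCT_RATIO):
    """Return (ok, reason) for a piece of text submitted for identification."""
    # \w is Unicode-aware for Python 3 strings, so accented letters count as word characters.
    words = re.findall(r"\w+", text.lower())
    if len(words) < min_words:
        return False, "too short"
    # Repeating the same few words inflates the length but not the number of distinct words.
    if len(set(words)) / len(words) < min_distinct_ratio:
        return False, "too repetitive"
    return True, "ok"
```

With these made-up thresholds, `check_input("hello world " * 200)` is rejected as too repetitive even though it is 400 words long, which is exactly the case that a plain length check would let through.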
As for Eoin’s project, I wonder if it might make sense to go out on the web and acquire tons of Irish text, tokenize it, and then put those words into the database as “empty” headwords.
Then, if someone tries to submit a word which has never been seen, at least one could flag those words as doubly suspicious, and put a low priority on vetting them.
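Something like the following Python sketch is what I have in mind. Everything in it is hypothetical: the function names, the tiny two-sentence "corpus", and the "normal"/"low" priority labels are stand-ins, and I have no idea how Eoin's database is actually structured. The tokenizer is also deliberately naive, since it keeps only runs of letters and would split hyphenated or apostrophized forms.

```python
import re


def tokenize(text):
    """Lowercase the text and return runs of letters (accented letters included)."""
    # [^\W\d_] means "a word character that is not a digit or underscore", i.e. a letter.
    return re.findall(r"[^\W\d_]+", text.lower())


def build_seen_words(corpus_texts):
    """Collect every word form seen in the harvested texts: the 'empty' headwords."""
    seen = set()
    for text in corpus_texts:
        seen.update(tokenize(text))
    return seen


def vetting_priority(submission, seen_words):
    """Return 'normal' if every word of the submission was seen in the corpus, else 'low'."""
    tokens = tokenize(submission)
    if tokens and all(token in seen_words for token in tokens):
        return "normal"
    return "low"


# Stand-in for "tons of Irish text" harvested from the web.
corpus = [
    "Tá mo chara ag dul go dtí an doire.",
    "Dúirt sé go raibh an mhuintir sásta.",
]
seen = build_seen_words(corpus)

for word in ["doire", "Dúirt", "emphasize", "goog morning ruth"]:
    print(word, "->", vetting_priority(word, seen))
```

Being in the list proves nothing about a submission, of course, and harvested web text will contain some English too. It only decides which submissions get looked at first.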
Just a thought.
Anyway, cool project, Eoin!
PS, I have no idea if the title of this post is acceptable as Irish, but I couldn’t resist trying… never had a post titled in Irish, you see… ☺

Hi Patrick. It’s an interesting idea to have an empty list of headwords taken from lots of text. I hadn’t thought of approaching it that way.
Good attempt with the title :) My grammar isn’t great, so I’m not sure if the words should transform as follows: “An wiki-fhoclóir Ghaeilge”.
Hey Eoin!
Small world.
Ah, I see that Irish has the same bug/feature as Welsh: consonant mutations. ☺ I never got those straightened out in Welsh, either.
Ádh mór ort with the dictionary!!