Aligning translations with text compression?
Dear interwebs series of tubes people:
Random thought:
If you have two translations, and you perform some sort of compression on both of them, could interesting relationships between the content of the two translations be uncovered? For instance, it seems like compression might let you get rid of non-content words, which might make it feasible to align the texts at the phrase level.
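To make the idea a bit more concrete, here is a rough sketch of one way to poke at it. It assumes both translations are in the same language and have already been split into comparable segments (verses, sentences, whatever). It uses normalized compression distance (NCD) with zlib, which is a stand-in I picked for "some sort of compression", not an established alignment method. It also doesn't strip non-content words; it only checks whether a compressor notices shared material between segments. The names `csize`, `ncd`, and `best_matches` are made up for illustration.

```python
import zlib


def csize(text: str) -> int:
    """Size in bytes of the zlib-compressed UTF-8 encoding of text."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def ncd(a: str, b: str) -> float:
    """Normalized compression distance: lower means more shared content.

    If b repeats much of a, compressing them together costs little more
    than compressing the larger one alone, so the score is small.
    """
    ca, cb = csize(a), csize(b)
    cab = csize(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)


def best_matches(left: list[str], right: list[str]) -> list[int]:
    """For each segment in left, the index of the lowest-NCD segment in right."""
    return [
        min(range(len(right)), key=lambda j: ncd(seg, right[j]))
        for seg in left
    ]


if __name__ == "__main__":
    # Two hypothetical same-language translations, segments in different order.
    translation_a = [
        "In the beginning God created the heaven and the earth.",
        "And God said, Let there be light: and there was light.",
    ]
    translation_b = [
        "Then God said, Let there be light; and there was light.",
        "In the beginning God created the heavens and the earth.",
    ]
    for i, j in enumerate(best_matches(translation_a, translation_b)):
        print(f"A[{i}] pairs with B[{j}] (NCD {ncd(translation_a[i], translation_b[j]):.2f})")
```

Caveats: zlib has fixed overhead, so very short segments give noisy scores, and this greedy matching lets two segments on the left pick the same segment on the right. Translations in two different languages would need something else entirely, since the compressor only sees shared byte sequences.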
I’ve dug a bit, but only found a paper by Conley and Klein, “Using Alignment for Text Compression.” A quick glance at it (I haven’t read it yet) suggests that they’re interested in improving compression for compression’s sake, which isn’t what I have in mind.
Thanks for your thoughts and observations, interwebs.
Technorati tags: alignment, Code, Linguistic Computing, translation