Automated page translation with Google

Hasan Aljudy hasan.aljudy at gmail.com
Mon Mar 26 16:37:26 PDT 2007



Jan Claeys wrote:
> On Fri, 23 Mar 2007 15:46:24 -0700
> Walter Bright <newshound at digitalmars.com> wrote:
> 
>> Hasan Aljudy wrote:
> 
>>> I don't think it's google that wrote the translation engines ..
>>> it's probably some other company's 30+ years of work!  
>> You're right they bought it. But I think they'll continue to improve
>> it, because doing it better can be worth enormous money.
> 
> The Systran software they have licensed (not bought AFAIK) hasn't
> improved in any obvious way since the first time I used it something
> like 10 years ago...
> 
> It's often usable if you want to get an impression of what a page talks
> about, but IMHO technical documentation requires accuracy.
> 
> E.g., something like "Objets de classe d'Instantiating ailleurs que le
> tas de CHROMATOGRAPHIE GAZEUSE" is complete nonsense if you are
> talking about D.   ;-)
> 
> 

Actually, I was searching Google for "free statistical translation" (or
something like that) when I came across a Google blog entry stating that
Google now uses a statistical model for translating Arabic and Chinese
(I think all languages labeled BETA use that model now):
http://googleresearch.blogspot.com/2006/04/statistical-machine-translation-live.html

Interestingly enough, you can now "suggest a better translation"
for any piece of text that Google translates! I'm guessing the suggestion
goes through some sort of filtering mechanism and then gets passed to the
statistical engine.

http://googleblog.blogspot.com/2007/03/suggest-better-translation.html

I've found that translating news articles from Arabic to English gives
very good results.
However, translating technical articles from English to Arabic gives the
crappiest results!! I guess it all depends on what they feed the
statistical engine.

Try it on aljazeera.net or something .. I think you'll be amazed; I was. 
I never thought there'd be any hope for "reasonable" machine translation 
involving Arabic, and I happily admit that I've been proved wrong!


