Interesting Machine Translation Project by a Karachite

Zeeshan Ahmed is a young engineer who is actively working on an interesting machine translation project called PakTranslations. A self-described die-hard Karachiwala, Zeeshan is a graduate engineer from GIKI.

KMB has conducted an email interview with Zeeshan on this project and what he expects out of it in the coming days. In a time when we have a tough fight between the forces of gloom and that of hope, let us do every bit possible to side with the positive forces.

Interview with PakTranslations follows.

KMB: What is Pak Translations?

Zeeshan: PakTranslations.com is a personal project, something I set out to do a year and a half ago. I personally developed the technology and got the machine translation dictionary done by volunteers, friends and data-entry people sitting in photo-stat shops across Karachi. With the new version launching in a few days, my prime focus will then be on bringing in traffic and introducing Sindhi.

For more info, see the mission here and the road map here.

KMB: Tell us something more about yourself than what is included in your online profile.

Zeeshan: Well, I am a Computer Systems Engineer and graduated in 2004. I did a few jobs first; I call it ‘paying my dues’. It was in May 2006 that I realized one should save his best for his own work. No one else deserves it, and no one else would ever appreciate it. Therefore I decided to dedicate a portion of my career to my own work rather than a job. It has been tough, but overall a great experience!

KMB: Is Pak Translations just a word-by-word replacement of English words with Urdu words, or something more than that?

Zeeshan: It’s not word for word. It rearranges the sentence and supports tenses, grammar, gender, respect and other language details. The technology is much more advanced than it looks. It’s actually the data that is lacking. That will improve with time, and with the new version coming, the system will also become context aware.

The translation core uses rule-based grammar. This was the only way I could do it. As Urdu has not been worked on much, there is not much digital data available. The statistical approach (Google’s way) needs millions of examples to train the system. Therefore I went for the age-old rule-based approach for translation and the statistical approach for learning. Learning will kick in with the new version.
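As an editorial illustration only (this is not PakTranslations code), here is a minimal sketch of what rule-based translation can look like. It assumes a hypothetical three-entry-style lexicon in romanized Urdu and handles only three-word "subject verb object" sentences with a first-person subject. The names `LEXICON`, `PRESENT_ENDINGS` and `translate_svo` are invented for this example. It shows two of the ideas mentioned above: rearranging English subject-verb-object order into Urdu subject-object-verb order, and making the verb agree with the speaker's gender.

```python
# Toy lexicon: English word -> (part of speech, romanized Urdu).
# These entries are illustrative assumptions, not PakTranslations data.
LEXICON = {
    "i": ("PRON", "main"),
    "water": ("NOUN", "pani"),
    "tea": ("NOUN", "chai"),
    "drink": ("VERB", "pee"),  # verb stem only; the ending is added by a rule
}

# Present-habitual endings for a first-person singular subject,
# chosen by the gender of the speaker.
PRESENT_ENDINGS = {"masculine": "ta hoon", "feminine": "ti hoon"}


def translate_svo(sentence: str, gender: str = "masculine") -> str:
    """Translate a three-word English SVO sentence into romanized Urdu."""
    words = sentence.lower().rstrip(".").split()
    if len(words) != 3:
        raise ValueError("this sketch only handles three-word sentences")

    # Dictionary lookup (raises KeyError for words outside the toy lexicon).
    subject, verb, obj = (LEXICON[w] for w in words)
    if (subject[0], verb[0], obj[0]) != ("PRON", "VERB", "NOUN"):
        raise ValueError("expected a pronoun, a verb and a noun, in that order")

    # Rule 1: conjugate the verb stem so it agrees with the speaker's gender.
    conjugated = verb[1] + PRESENT_ENDINGS[gender]

    # Rule 2: reorder English subject-verb-object into subject-object-verb.
    return " ".join([subject[1], obj[1], conjugated])


print(translate_svo("I drink water"))                  # main pani peeta hoon
print(translate_svo("I drink tea", gender="feminine")) # main chai peeti hoon
```

A plain word-for-word replacement would have produced "main pee pani". The two rules are what move the verb to the end and give it the right ending. A real system would need far more rules and, as the interview notes, far more dictionary data.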

KMB: What is the nature of the project? (Commercial, hobby, passion or a cross between these)

Zeeshan: Started as a passion but yes now it is a cross between everything. Still, commercial or no commercial, technology would always be worked on and the end result would always be free for the end user.

KMB: Do you agree that once machine translation for English/Urdu is perfected to a reasonable level, its main pull will come from the mobile population?

Zeeshan: This being a very new concept for Pakistan, I actually didn’t know where the real pull would come from, and I didn’t spend much time trying to figure it out. I will keep improving the technology and see where the light bulbs go off. It could be in media, newspapers, dot-coms, ISPs, etc. But I do agree that the mobile market is one where I have high hopes.

KMB: How does it feel to have Pak Translations in first place in Google’s Zeitgeist this month?

Zeeshan: Haha … it may sound indifferent of me, but I didn’t know about Google Zeitgeist till I got your mail. Then I Googled it and found out. This of course is an honor, but it actually makes me wonder about the possibilities ahead, i.e. when the technology has matured and proper marketing is done.

KMB: Do you have a dream of someone from GMY acquiring Pak Translations in the future?

Zeeshan: As you may have guessed already, I am not much of a dot-com person. About acquisition: of course, but only if

– the technology is constantly worked on

– it stays free for the end user

– the price is right

18 Comments so far

  1. Kashif (unregistered) on January 22nd, 2008 @ 12:11 pm

    Very good. The guy should be supported and promoted.


  2. morality cop (unregistered) on January 22nd, 2008 @ 12:16 pm

    excellent work!


  3. BoZz (unregistered) on January 22nd, 2008 @ 12:38 pm

    Will the Bloggers please stop referring constantly to this cliche, “gloom and that of hope”, it is getting kind of painful to read it so often. Besides I doubt if PakTranslations could do much if anything to correct our predicament. Do not try and over sell your story please.


  4. Tee Emm (unregistered) on January 22nd, 2008 @ 1:11 pm

    @Bozz: How often do you hear positive news and stories coming out of Pakistan? How often do you hear otherwise?

    It’s not overselling.


  5. Raza (unregistered) on January 22nd, 2008 @ 1:19 pm

    Also, take a look at another project of similar nature – ApniUrdu.com started by a fellow engineer of NED University.


  6. AMMAR (unregistered) on January 22nd, 2008 @ 4:07 pm

    Well done Zeeshan. Believe me, you are doing great. All of us must support him and also try to find some sponsorship so that he can work without any worry about money. Can we work out an endowment fund for this? This is really a national service. Please folks, keep highlighting this.


  7. Adnan Siddiqi (unregistered) on January 22nd, 2008 @ 5:27 pm

    Clickety

    LOL! Though wrong translation of “Crazy” :-)


  8. Atif Abdul-Rahman (unregistered) on January 22nd, 2008 @ 6:55 pm

    I met him three years ago. He was very passionate about technology and very keen on achievements back then too. Well done Zeeshan…


  9. balma (unregistered) on January 22nd, 2008 @ 8:40 pm

    Fantastic


  10. d0ct0r (unregistered) on January 22nd, 2008 @ 10:34 pm

    interesting.. am impressed by his replies.. btw what would be the urdu word for blogging?


  11. d0ct0r (unregistered) on January 22nd, 2008 @ 10:40 pm

    am not a techie so the comment might sound dumb but can’t he open up the dictionary and make it editable so that we all may help him add new words just like wiki


  12. d0ct0r (unregistered) on January 22nd, 2008 @ 10:43 pm

    instead of data entry operators at photo-stat shops,all the pakistanis/urdu speakers online can do wonders… just a thought


  13. d0ct0r (unregistered) on January 22nd, 2008 @ 10:47 pm

    add a bookmarklet as well


  14. d0ct0r (unregistered) on January 22nd, 2008 @ 11:11 pm

    whats up with pakistanis.. seems like most of em are wasting their time online

    gaining queries on google..
    urdu love sms pushto music commandersafeguard

    although there is no xxx search which is surprising n encouraging but still people need to be more productive while they’re online


  15. SELF (unregistered) on January 22nd, 2008 @ 11:35 pm

    Funny yes, practical no.

    Translation of ‘Software & Driver Downloads’;

    »کمپیوٹر کا پروگرام اور گاڑی چلانے والا ڈون لوڈ کرتا

    Unless he adds some semantic processing to his word-by-word translation, it won’t work.


  16. Adnan Siddiqi (unregistered) on January 23rd, 2008 @ 12:57 am


    “am not a techie so the comment might sound dumb but can’t he open up the dictionary and make it editable so that we all may help him add new words just like wiki”

    heavy chances of misuse.

    wish someone could make a transliteral application like this for Urdu as well.


  17. ZeeShan Ahmed (unregistered) on January 25th, 2008 @ 3:07 am

    hello everyone.

    thank you for evaluating paktranslations.com project.

    KASHIF/MORALITY COP/AMMAR/ATIF ABDUL-RAHMAN/BALMA/DOCTOR – thank you. you guys are the best!

    BOZZ – ahhh the troubled kid. you are already my favorite :-]

    TEE EMM – my hero!

    RAZA – his name is faisal nizam, currently at microsoft. i have met him. he did apniurdu when he was in matric. pure genius.

    DOCTOR/ADNAN SIDDIQI/SELF – i agree. yes, guys like yourselves can do a lot better, and yes, it would be very prone to misuse. in the next 10 days, i will be deploying an online proof-reader which solves everything. the dictionary actually contains multiple meanings for every word. it only lacks contextual statistical knowledge: binding a meaning to a context. this is what the proof-reader does with your help. in return, you can instantly translate and proof-read with a few clicks. hint: this can be very useful for newspapers, which already have translation teams working round the clock. to control misuse, people will submit the proof-read page with their own key and this way build their own contextual repository, which means they can fine-tune this technology for their business. different businesses are from different domains. this is how i would get domain-specific repositories, like for automotive, media, weather, web, etc. i would also use this data to build my own general repository. problem solved!

    ADNAN SIDDIQI – about the transliterator. well i call it the PhoneticTranslator.java. this piece of code is already a part of the core. just never thought people would find it useful. would also add it in the new version. this one is only for you :-]


  18. Adnan Ul Haque (unregistered) on January 25th, 2008 @ 9:00 pm

    Dear All, salam:

    I am an NED University graduate (1997) and currently working at Ericsson R&D in Montreal, Canada.

    I worked at Instaphone from 2000 to 2002, where I designed, implemented and wrote software for telecom switches, which was used all over Pakistan (in both the Paktel and Instaphone networks).

    The reason for the introduction above is not to speak highly of myself but to pay tribute to Zeeshan, who has made such a wonderful effort, one whose potential no one can imagine right now. If this effort is properly supported, then no wonder it can change the fate of Pakistan….. whatever problems we are facing in Pakistan are because of illiteracy… if this project is seriously sponsored, and once we have high-speed internet available in every home (at low cost), then it will change the style of education, from kindergarten to university level.

    Please support Zeeshan and this project by all means.

    Thanks,

    Adnan Ul Haque
    adnan.ul.haque@ericsson.com


