Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

user:zeman:transliteration-of-urdu-to-latin-script [2010/11/10 13:38]
zeman Where is the script?
user:zeman:transliteration-of-urdu-to-latin-script [2010/11/10 13:53]
zeman Frequent words.
Line 128: Line 128:
   * http://en.wikipedia.org/wiki/Arabic_diacritics   * http://en.wikipedia.org/wiki/Arabic_diacritics
   * http://users.skynet.be/hugocoolens/newurdu/vowels.html   * http://users.skynet.be/hugocoolens/newurdu/vowels.html
 +
 +===== Vocabulary of Frequent Words =====
 +
 +Some frequent words cannot be disambiguated by character-based rules alone. A vocabulary that lists them as known, unambiguous Urdu words could supply their transliteration directly and save much manual work. Here are some examples:
 +
 +  * ہے => he
 +  * میں => meñ
 +  * ایک => ek
 +  * اور => or
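
A minimal sketch of how such a vocabulary lookup might sit in front of the character-based rules. The names `FREQUENT_WORDS`, `transliterate_word`, and `transliterate_text` are hypothetical and are not taken from the transliteration script; the `fallback` argument stands in for whatever character-based rules are used, and the vocabulary contains only the four examples listed above.

```python
# Hypothetical vocabulary of frequent, unambiguous words.
# The entries are the four examples from the list above, nothing more.
FREQUENT_WORDS = {
    "ہے": "he",
    "میں": "meñ",
    "ایک": "ek",
    "اور": "or",
}


def transliterate_word(word, fallback):
    """Return the vocabulary entry for `word` if there is one.

    Otherwise defer to `fallback`, a callable that stands in for the
    character-based rules (assumed here, not defined in this sketch).
    """
    if word in FREQUENT_WORDS:
        return FREQUENT_WORDS[word]
    return fallback(word)


def transliterate_text(text, fallback):
    """Transliterate whitespace-separated words one by one."""
    return " ".join(transliterate_word(w, fallback) for w in text.split())
```

Checking the vocabulary first means that the character-based rules only see words the vocabulary does not cover, so a wrong guess by the rules cannot override a known word.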
 +
 +Note, however, that some words are inherently ambiguous and cannot be disambiguated without human intervention (or at least without looking at the neighboring words). Examples:
 +
 +  * تو => to | tū
 +  * اس => is | us
 +  * ان => in | un
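
One possible way to handle the ambiguous words is to keep every reading and mark the token for later manual (or context-based) resolution. The sketch below assumes this approach; `AMBIGUOUS_WORDS`, `candidates`, and `mark_ambiguous` are hypothetical names, and the table contains only the three examples above.

```python
# Hypothetical table of inherently ambiguous words and their readings.
# The entries are the three examples from the list above.
AMBIGUOUS_WORDS = {
    "تو": ["to", "tū"],
    "اس": ["is", "us"],
    "ان": ["in", "un"],
}


def candidates(word):
    """Return all listed readings of `word`, or an empty list if none."""
    return AMBIGUOUS_WORDS.get(word, [])


def mark_ambiguous(tokens):
    """Replace each ambiguous token by its readings joined with '|'.

    Tokens that are not in the table are passed through unchanged, so
    this step can run before or after other transliteration steps.
    """
    marked = []
    for token in tokens:
        readings = candidates(token)
        marked.append("|".join(readings) if readings else token)
    return marked
```

The `|` separator mirrors the notation used in the list above, so an annotator can search the output for it and pick the intended reading.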
  
 ===== The Transliteration Script ===== ===== The Transliteration Script =====
