Differences
This shows you the differences between two versions of the page.
user:zeman:transliteration-of-urdu-to-latin-script [2010/11/10 13:38] zeman Where is the script? |
user:zeman:transliteration-of-urdu-to-latin-script [2010/11/10 13:53] zeman Frequent words. |
||
---|---|---|---|
Line 128: | Line 128: | ||
* http:// | * http:// | ||
* http:// | * http:// | ||
+ | |||
+ | ===== Vocabulary of Frequent Words ===== | ||
+ | |||
+ | Some frequent words cannot be disambiguated by character-based rules alone. A vocabulary could recognize them as known, unambiguous Urdu words and resolve them automatically, which would save much manual work. Here are some examples: | ||
+ | |||
+ | * ہے => he | ||
+ | * میں => meñ | ||
+ | * ایک => ek | ||
+ | * اور => or | ||
+ | |||
+ | Note, however, that some words are inherently ambiguous and cannot be disambiguated without human intervention (or at least without looking at the neighboring words). Examples: | ||
+ | |||
+ | * تو => to | tū | ||
+ | * اس => is | us | ||
+ | * ان => in | un | ||
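The vocabulary idea described above could be sketched roughly as follows. This is only an illustration, not the actual transliteration script: the names `FREQUENT_WORDS`, `AMBIGUOUS_WORDS`, `lookup_word` and `transliterate` are hypothetical, the entries are just the examples listed above, and `fallback` stands in for whatever character-based rules are used for out-of-vocabulary words.

```python
# Hypothetical vocabulary of frequent, unambiguous words
# (entries are the examples from the list above).
FREQUENT_WORDS = {
    "ہے": "he",
    "میں": "meñ",
    "ایک": "ek",
    "اور": "or",
}

# Inherently ambiguous words: every candidate reading is kept so that
# a human (or a later context-aware step) can choose between them.
AMBIGUOUS_WORDS = {
    "تو": ["to", "tū"],
    "اس": ["is", "us"],
    "ان": ["in", "un"],
}


def lookup_word(word, fallback):
    """Return the list of candidate transliterations for one token.

    `fallback` is an assumed callable implementing the character-based
    rules; it is used only for words missing from both vocabularies.
    """
    if word in FREQUENT_WORDS:
        return [FREQUENT_WORDS[word]]
    if word in AMBIGUOUS_WORDS:
        return list(AMBIGUOUS_WORDS[word])
    return [fallback(word)]


def transliterate(text, fallback):
    """Transliterate whitespace-separated tokens.

    Unresolved alternatives are joined with "|" so that they remain
    visible for manual disambiguation.
    """
    return " ".join(
        "|".join(lookup_word(token, fallback)) for token in text.split()
    )
```

Checking the vocabulary before applying the character-based rules means a frequent word is resolved in one step, while an ambiguous word keeps all of its readings instead of being silently assigned one.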
===== The Transliteration Script ===== | ===== The Transliteration Script ===== |