Differences
This shows you the differences between two versions of the page.
user:zeman:transliteration-of-urdu-to-latin-script [2010/11/10 13:38] zeman Where is the script? |
user:zeman:transliteration-of-urdu-to-latin-script [2010/11/10 14:03] zeman Translations of frequent words. |
||
---|---|---|---|
Line 128: | Line 128: | ||
* http:// | * http:// | ||
* http:// | * http:// | ||
+ | |||
+ | ===== Vocabulary of Frequent Words ===== | ||
+ | |||
+ | Some frequent words cannot be disambiguated by character-based rules alone. A vocabulary could recognize them as known, unambiguous Urdu words and resolve them automatically, which would save much manual work. Here are some examples: | ||
+ | |||
+ | * ہے => he (“is”) | ||
+ | * میں => meñ (“in”) | ||
+ | * ایک => ek (“one”) | ||
+ | * اور => or (“and”) | ||
+ | |||
+ | Note, however, that some words are inherently ambiguous and cannot be disambiguated without human intervention (or at least without looking at the neighboring words). Examples: | ||
+ | |||
+ | * تو => to (“so”) | tū (“thou”) | ||
+ | * اس => is (“of this”) | us (“of that”) | ||
+ | * ان => in (“of these”) | un (“of those”) | ||
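A vocabulary lookup of this kind could be sketched roughly as follows. This is only an illustration, not the actual transliteration script: the table names `UNAMBIGUOUS` and `AMBIGUOUS`, the functions `lookup` and `romanize_tokens`, and the `{a|b}` notation for unresolved alternatives are all assumptions made for this example. The `fallback` argument stands in for whatever character-based rules handle words that are not in the vocabulary.

```python
# Hypothetical vocabulary tables; the entries are the examples listed above.
UNAMBIGUOUS = {
    "ہے": "he",     # "is"
    "میں": "meñ",   # "in"
    "ایک": "ek",    # "one"
    "اور": "or",    # "and"
}

# Words with more than one reading; the lookup cannot choose between them.
AMBIGUOUS = {
    "تو": ["to", "tū"],   # "so" / "thou"
    "اس": ["is", "us"],   # "of this" / "of that"
    "ان": ["in", "un"],   # "of these" / "of those"
}


def lookup(word):
    """Return the candidate romanizations of a word, or None if it is unknown."""
    if word in UNAMBIGUOUS:
        return [UNAMBIGUOUS[word]]
    if word in AMBIGUOUS:
        return list(AMBIGUOUS[word])
    return None


def romanize_tokens(tokens, fallback):
    """Romanize tokens, consulting the vocabulary before the fallback rules.

    Ambiguous words are emitted as {a|b} so that a human (or a later
    context-aware step) can pick the right reading.
    """
    output = []
    for token in tokens:
        candidates = lookup(token)
        if candidates is None:
            output.append(fallback(token))
        elif len(candidates) == 1:
            output.append(candidates[0])
        else:
            output.append("{" + "|".join(candidates) + "}")
    return output
```

Keeping the ambiguous words in a separate table makes it explicit which tokens still need manual review or a look at the neighboring words.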
===== The Transliteration Script ===== | ===== The Transliteration Script ===== | ||
Line 136: | Line 151: | ||
If you happen to sit on the ÚFAL network, you will find the script in '' | If you happen to sit on the ÚFAL network, you will find the script in '' | ||
+ | |||
+ | I am also attaching the current snapshot of the two folders to this wiki {{: | ||
This is an example of an Urdu sentence and the romanized output by the script: | This is an example of an Urdu sentence and the romanized output by the script: |