
Institute of Formal and Applied Linguistics Wiki



Differences

This shows you the differences between two versions of the page.

user:zeman:transliteration-of-urdu-to-latin-script [2010/11/10 13:53]
zeman Frequent words.
user:zeman:transliteration-of-urdu-to-latin-script [2010/11/10 14:03]
zeman Translations of frequent words.
Line 133:
 Some frequent words cannot be disambiguated by character-based rules alone but a vocabulary could identify them as existing unambiguous Urdu words and save much manual work by disambiguating them. Here are some examples:
  
-  * ہے => he +  * ہے => he (“is”) 
-  * میں => meñ +  * میں => meñ (“in”) 
-  * ایک => ek +  * ایک => ek (“one”) 
-  * اور => or +  * اور => or (“and”)
  
 Note however that there are inherently ambiguous words that cannot be disambiguated without human intervention (or at least without looking at the neighboring words). Examples:
  
-  * تو => to | tū +  * تو => to (“so”) | tū (“thou”) 
-  * اس => is | us +  * اس => is (“of this”) | us (“of that”) 
-  * ان => in | un +  * ان => in (“of these”) | un (“of those”)
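
The vocabulary lookup described above can be sketched roughly as follows. This is only an illustration in Python and not the actual transliteration script: the names `UNAMBIGUOUS`, `AMBIGUOUS`, `romanize_word` and `romanize_sentence` are hypothetical, the word lists contain only the examples given here, and the `fallback` parameter merely stands in for whatever character-based rules would handle the remaining words.

```python
# Hypothetical sketch of a vocabulary-based disambiguation step.
# The tables hold only the example words from this page.

# Frequent words with exactly one romanization.
UNAMBIGUOUS = {
    "ہے": "he",     # "is"
    "میں": "meñ",   # "in"
    "ایک": "ek",    # "one"
    "اور": "or",    # "and"
}

# Inherently ambiguous words: all candidates are kept for a human
# (or a context-aware step) to choose from.
AMBIGUOUS = {
    "تو": ["to", "tū"],   # "so" | "thou"
    "اس": ["is", "us"],   # "of this" | "of that"
    "ان": ["in", "un"],   # "of these" | "of those"
}


def romanize_word(word, fallback=None):
    """Return the romanization of one word.

    Unambiguous vocabulary hits are returned directly, ambiguous ones
    as 'a|b' alternatives. Anything else goes to `fallback` (assumed to
    be the character-based rules) or is returned unchanged.
    """
    if word in UNAMBIGUOUS:
        return UNAMBIGUOUS[word]
    if word in AMBIGUOUS:
        return "|".join(AMBIGUOUS[word])
    return fallback(word) if fallback is not None else word


def romanize_sentence(sentence, fallback=None):
    """Romanize a whitespace-separated sentence word by word."""
    return " ".join(romanize_word(w, fallback) for w in sentence.split())
```

Keeping the ambiguous words in a separate table makes it easy to list exactly those tokens that still need manual review.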
  
 ===== The Transliteration Script =====
Line 151:
  
 If you happen to be on the ÚFAL network, you will find the script in ''~zeman/projekty/transliterace''. It should be able to find the library itself; the library is in ''~zeman/lib/translit'' (you will find programs and libraries for other writing systems in these two folders as well).
 +
 +I am also attaching the current snapshot of the two folders to this wiki {{:user:zeman:translit.zip|here}}. Note however that it will not be updated regularly.
  
 This is an example of an Urdu sentence and the romanized output by the script:
