Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

user:zeman:transliteration-of-urdu-to-latin-script [2010/11/10 13:59]
zeman Attached the software.
user:zeman:transliteration-of-urdu-to-latin-script [2010/11/10 14:18]
zeman New vowels with hamza.
Line 103: Line 103:
 | 06CC | ی | j, i:, e: | y, ī, e |
 | 06D2 | ے | e: | e |
-| 0626 | ئ | - | 0 |
-| 0674 | ٔ (high hamza) | - | 0 |
+| 06D3 | ۓ | e: | e |
+| 0624 | ؤ | u:, o: | ū, o |
+| 0626 | ئ | -, i:, e | 0, ī, e |
+| 0654 | ٔ (hamza above) | - | 0 |
+| 0674 | ٔ (high hamza) | - | 0 |
  
 The transliteration script should contain a gradually growing vocabulary that would help disambiguate known words. Otherwise there would be a very high number of ambiguous positions in any transliterated string.
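
To illustrate why character rules alone leave many ambiguous positions, the sketch below enumerates every reading that a small subset of the table rows above would allow. It is not the attached software: the names `CHAR_OPTIONS` and `candidates` are hypothetical, the table value `0` is assumed to mean "no output", and characters outside the subset are assumed to pass through unchanged.

```python
from itertools import product

# Subset of the mapping table (new revision). Assumption: "0" means "emit nothing".
CHAR_OPTIONS = {
    "\u06CC": ["y", "ī", "e"],   # ی
    "\u06D2": ["e"],             # ے
    "\u06D3": ["e"],             # ۓ
    "\u0624": ["ū", "o"],        # ؤ
    "\u0626": ["", "ī", "e"],    # ئ
    "\u0654": [""],              # hamza above
    "\u0674": [""],              # high hamza
}

def candidates(text):
    """Return every Latin reading allowed by the per-character options."""
    # Characters missing from the table are kept as-is (an assumption of this sketch).
    options = [CHAR_OPTIONS.get(ch, [ch]) for ch in text]
    return ["".join(combo) for combo in product(*options)]
```

The number of candidates is the product of the option counts per character, so even a two-letter string such as `"\u06CC\u0624"` already yields six readings; this is the growth that a vocabulary is meant to cut down.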
Line 133: Line 136:
 Some frequent words cannot be disambiguated by character-based rules alone but a vocabulary could identify them as existing unambiguous Urdu words and save much manual work by disambiguating them. Here are some examples:
 
-  * ہے => he
-  * میں => meñ
-  * ایک => ek
-  * اور => or
+  * ہے => he (“is”)
+  * میں => meñ (“in”)
+  * ایک => ek (“one”)
+  * اور => or (“and”)
  
 Note however that there are inherently ambiguous words that cannot be disambiguated without human intervention (or at least without looking at the neighboring words). Examples:
 
-  * تو => to | tū
-  * اس => is | us
-  * ان => in | un
+  * تو => to (“so”) | tū (“thou”)
+  * اس => is (“of this”) | us (“of that”)
+  * ان => in (“of these”) | un (“of those”)
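
The vocabulary idea described in this hunk can be sketched as a simple word-level lookup that runs before any character rules. This is only an illustration under stated assumptions, not the attached script: the names `VOCABULARY`, `lookup_readings` and `is_resolved` are hypothetical, and the entries are just the example words listed above, written as escapes with the Urdu form in a comment.

```python
# Word-level vocabulary built from the examples above (hypothetical structure).
VOCABULARY = {
    "\u06C1\u06D2": ["he"],              # ہے  "is"
    "\u0645\u06CC\u06BA": ["meñ"],       # میں "in"
    "\u0627\u06CC\u06A9": ["ek"],        # ایک "one"
    "\u0627\u0648\u0631": ["or"],        # اور "and"
    "\u062A\u0648": ["to", "tū"],        # تو  "so" | "thou"
    "\u0627\u0633": ["is", "us"],        # اس  "of this" | "of that"
    "\u0627\u0646": ["in", "un"],        # ان  "of these" | "of those"
}

def lookup_readings(word):
    """Return the known readings of a word, or an empty list if it is not listed."""
    return list(VOCABULARY.get(word, []))

def is_resolved(word):
    """True only when the vocabulary gives exactly one reading."""
    return len(lookup_readings(word)) == 1
```

A word with one reading can be emitted directly, a word with several readings still needs a human (or the neighboring words) to choose, and a word with no entry would fall back to the character-based rules.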
  
 ===== The Transliteration Script =====