
Institute of Formal and Applied Linguistics Wiki



Differences

This shows you the differences between two versions of the page.


user:zeman:transliteration-of-urdu-to-latin-script [2010/11/09 16:40]
zeman Short vowels.
user:zeman:transliteration-of-urdu-to-latin-script [2010/11/10 09:35]
zeman Schwa.
Line 68: Line 68:
 ===== Vowels =====
  
-The consonant (or semi-vowel) و //(w)// is also ambiguously used to represent the long vowels //ū// (pronounced as //oo// in English //fool//) and //o// (pronounced as //oo// in English //door//). I want to distinguish these three pronunciations. In most cases however, the script can only output //[wūo]// and leave the disambiguation to a human judgment:
+The consonant (or semi-vowel) و //(w)// is also ambiguously used to represent the long vowels //ū// (pronounced as //oo// in English //fool//) and //o// (pronounced as //oo// in English //door//). I want to distinguish these three pronunciations (note however that I am not attempting to further distinguish //o// from the slightly different vowel //ao// that is pronounced as //au// in English //automatic//; I am pretending that these two are identical). In most cases however, the script can only output //[wūo]// and leave the disambiguation to a human judgment:
  
   * In word-initial position, I assume that only consonantal pronunciation is possible and always output //w//.
Line 76: Line 76:
   * In all other cases I output //[wūo]//.
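
A minimal Python sketch of the two waw rules quoted in this hunk (the word-initial case and the fallback). The rules that sit between them on the page are not shown in this diff and are deliberately not reproduced; the function name `transliterate_waw` is hypothetical and not taken from the actual script.

```python
WAW = "\u0648"  # ARABIC LETTER WAW


def transliterate_waw(word: str, index: int) -> str:
    """Return the Latin output for the waw at word[index].

    Covers only the word-initial rule and the final fallback;
    the intermediate rules are omitted (assumption of this sketch).
    """
    if word[index] != WAW:
        raise ValueError("character at index is not waw")
    if index == 0:
        # Word-initial: only the consonantal reading is assumed possible.
        return "w"
    # Fallback: emit all three readings and leave the choice to a human.
    return "[wūo]"


print(transliterate_waw("\u0648\u0647", 0))
print(transliterate_waw("\u06A9\u0648\u0646", 1))
```

The bracketed output keeps every candidate reading visible, so a later pass (or a human) can narrow it down without re-reading the Urdu source.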
  
-The consonant (or semi-vowel) ی //(y)// is also ambiguously used to represent the long vowels //ī// (pronounced as //ee// in English //feet//) and //e// (pronounced roughly as //ai// in English //fair//). I want to distinguish these three pronunciations. In most cases however, the script can only output //[yīe]// and leave the disambiguation to a human judgment:
+The consonant (or semi-vowel) ی //(y)// is also ambiguously used to represent the long vowels //ī// (pronounced as //ee// in English //feet//) and //e// (pronounced roughly as //ai// in English //fair//). I want to distinguish these three pronunciations (note however that I am not attempting to further distinguish //e// from the slightly different vowel //ae// that is pronounced more open; I am pretending that these two are identical). In most cases however, the script can only output //[yīe]// and leave the disambiguation to a human judgment:
  
   * In word-initial position, I assume that only consonantal pronunciation is possible and always output //y//.
Line 106: Line 106:
 | 0674 | ٔ (high hamza) | - | 0 |
  
-===== Vowel Diacritics =====
+The transliteration script should contain a gradually growing vocabulary that would help disambiguate known words. Otherwise there would be a very high number of ambiguous positions in any transliterated string.
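
One way such a vocabulary could be wired in is sketched below in Python. Everything here is an assumption for illustration: the names `VOCABULARY` and `transliterate_word` are hypothetical, the single entry is the word کون //kon// used as an example on this page, and the rule-based fallback is passed in as a parameter rather than implemented.

```python
from typing import Callable

# Hypothetical vocabulary: Urdu word (without diacritics) -> disambiguated Latin form.
VOCABULARY = {
    "\u06A9\u0648\u0646": "kon",  # "who"
}


def transliterate_word(word: str, rule_based: Callable[[str], str]) -> str:
    """Prefer a known vocabulary entry; otherwise fall back to the rules."""
    if word in VOCABULARY:
        return VOCABULARY[word]
    return rule_based(word)


# Stand-in fallback that only marks the word as unresolved.
def unresolved(word: str) -> str:
    return "<ambiguous>"


print(transliterate_word("\u06A9\u0648\u0646", unresolved))
print(transliterate_word("\u0648\u0647", unresolved))
```

Looking words up before applying the character rules means each new vocabulary entry removes ambiguous positions without touching the rules themselves.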
 + 
 +===== Short Vowels and Diacritics ===== 
 + 
 +Without diacritics (the more common case), every consonant that is not followed by a long vowel may or may not be followed by a short vowel. I denote this possibility by the character for the neutral vowel schwa: //ə//.
  
 //Warning! This section is under construction. I am still confused about the exact rules for Urdu vowel representation, so I also expect more errors to occur here.//
  
 Although used rarely, Urdu has means to mark the three short vowels as well. This is done using one of the three diacritical marks. Long vowels can be disambiguated as well, e.g. a consonant with the pesh mark followed by a waw without any diacritic means that the waw is a long vowel //[ūo]// but not the consonant //w//.
 +
 +^ Unicode ^ Unicode Name ^ Urdu Name ^ With Alef ^ Transliteration ^
 +| 064E | ARABIC FATHA | zabar | َا | a |
 +| 064F | ARABIC DAMMA | pesh | ُا | u |
 +| 0650 | ARABIC KASRA | zer | ِا | i |
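
The table above maps directly onto a lookup table. The Python sketch below is only an illustration of that mapping, not the actual script; the names `SHORT_VOWELS` and `transliterate_short_vowels` are hypothetical, and all other characters are passed through unchanged.

```python
import unicodedata

# Short-vowel diacritics from the table: code point -> Latin transliteration.
SHORT_VOWELS = {
    "\u064E": "a",  # ARABIC FATHA, Urdu name zabar
    "\u064F": "u",  # ARABIC DAMMA, Urdu name pesh
    "\u0650": "i",  # ARABIC KASRA, Urdu name zer
}


def transliterate_short_vowels(text: str) -> str:
    """Replace the three short-vowel marks; leave every other character as is."""
    return "".join(SHORT_VOWELS.get(ch, ch) for ch in text)


# Print the table rows, taking the Unicode names from the standard library.
for mark, latin in SHORT_VOWELS.items():
    print(f"{ord(mark):04X} {unicodedata.name(mark)} -> {latin}")
```

Consonants and long vowels still need their own rules; this step only resolves the positions where a diacritic is actually written.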
  
 pesh (ARABIC DAMMA, 064F) ... u ... کُون //kon// “who”
Line 116: Line 125:
 zer (ARABIC KASRA, 0650) ... i ...
  
-Possible further reading: http://en.wikipedia.org/wiki/Arabic_diacritics
-http://users.skynet.be/hugocoolens/newurdu/vowels.html
+Possible further reading:
+  * http://en.wikipedia.org/wiki/Arabic_diacritics
+  * http://users.skynet.be/hugocoolens/newurdu/vowels.html
  
