Differences

This shows you the differences between two versions of the page.

--- user:zeman:transliteration-of-urdu-to-latin-script [2010/11/09 13:05]
zeman created
+++ user:zeman:transliteration-of-urdu-to-latin-script [2010/11/09 16:14]
zeman Hamza.
@@ Line 22: / Line 22: @@
 Some other notes: //j// is pronounced as in English, not as in Czech or German. //č// and //š// are used in Baltic and Slavic languages (among others) to represent the sounds that are usually written “ch” or “sh”, respectively, in English. Of similar descent is the character //ž//; the corresponding sound is sometimes represented as “zh” in English and corresponds to the French pronunciation of //j//. //x// represents (in accord with phonetic tradition) the same sound as Czech/German/Scottish “ch”. English-oriented transcriptions of Arabic often transcribe this sound as “kh”, a solution that we want to avoid. It would conflict with the aspirated //kh// of Urdu. //ğ// is taken from Turkish and describes the sound that is often transcribed “gh” from Arabic (which we cannot use, again because of the aspirated //gh//).
-| **Unicode** | **Character** | **Pronunciation** | **Transliteration** |
+I do not attempt to map the special Semitic guttural consonant //ayin// to a Latin letter based on its pronunciation in a European language, as this sound is foreign to most Europeans. In transcriptions of Arabic, it is sometimes represented by a superscript //c//. We use the IPA symbol ˀ (MODIFIER LETTER GLOTTAL STOP).
+The letter ں (NOON GHUNNA) occurs only at the end of a word and marks nasalization of the preceding vowel rather than a real consonant.
+There are two //h// letters: ہ (HEH GOAL) and ھ (HEH DOACHASHMEE). It is not necessary to distinguish them by diacritics as they occur in different positions. The normal consonant //h// is written using ہ (HEH GOAL), which can also appear at the end of the word to mark an (otherwise invisible) word-final short vowel //a// (transcribed //ah//). In contrast, ھ (HEH DOACHASHMEE) is used exclusively after other consonants (such as //k, g, č, j, t, d, b, p//) to form their aspirated counterparts. Thus, بھ is //bh//, پھ is //ph// etc.
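The point that both //h// letters can share one Latin letter can be illustrated with a minimal Python sketch. The function name `transliterate_consonants` and the partial consonant table are assumptions made for this illustration, not the actual script; only the eight consonants named above are covered.

```python
# Hypothetical sketch: names and the partial table are assumptions.
HEH_GOAL = "\u06C1"         # ہ, the normal consonant h
HEH_DOACHASHMEE = "\u06BE"  # ھ, marks aspiration after a consonant

# Only the consonants named in the text as having aspirated counterparts.
BASE = {
    "\u06A9": "k",  # ک
    "\u06AF": "g",  # گ
    "\u0686": "č",  # چ
    "\u062C": "j",  # ج
    "\u062A": "t",  # ت
    "\u062F": "d",  # د
    "\u0628": "b",  # ب
    "\u067E": "p",  # پ
}


def transliterate_consonants(text):
    """Map the listed consonants and both HEH letters to Latin.

    Both HEH letters become plain "h": they occur in different
    positions, so no diacritic is needed to tell them apart.
    Characters outside the table are passed through unchanged.
    """
    out = []
    for ch in text:
        if ch in BASE:
            out.append(BASE[ch])
        elif ch in (HEH_GOAL, HEH_DOACHASHMEE):
            out.append("h")
        else:
            out.append(ch)
    return "".join(out)
```

With this sketch, `transliterate_consonants("\u0628\u06BE")` (بھ) gives `bh` and `transliterate_consonants("\u067E\u06BE")` (پھ) gives `ph`.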
+^ Unicode ^ Character ^ Pronunciation ^ Transliteration ^
 | 0628 | ب | b | b |
 | 067E | پ | p | p |
@@ Line 54: / Line 60: @@
 | 0645 | م | m | m |
 | 0646 | ن | n | n |
-| 06BA | ں | n | n |
+| 06BA | ں | n | ñ |
 | 0648 | و | v | w |
 | 06C1 | ہ | h | h |
 | 06BE | ھ | h | h |
 | 06CC | ی | j | y |
+===== Vowels =====
+The consonant (or semi-vowel) و //(w)// is also used, ambiguously, to represent the long vowels //ū// (pronounced as //oo// in English //fool//) and //o// (pronounced as //oo// in English //door//). I want to distinguish these three pronunciations. In most cases, however, the script can only output //[wūo]// and leave the disambiguation to human judgment:
+  * In word-initial position, I assume that only consonantal pronunciation is possible and always output //w//.
+  * Anywhere immediately before ا (ALEF), I assume that only consonantal pronunciation is possible and always output //w//.
+  * In word-final position, I believe that a vowel is more likely, although I am not sure that the consonant can be completely excluded. Nevertheless, I currently output //[ūo]//.
+  * If it appears immediately before word-final ں (NOON GHUNNA), I consider it part of the plural oblique case suffix and invariably output //o//.
+  * In all other cases I output //[wūo]//.
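The rules for و listed above can be sketched as a small Python function. This is only an illustration under stated assumptions: the name `transliterate_waw`, the word-plus-index interface, and the order in which the rules are tested (the same order as the list) are not taken from the actual script.

```python
# Hypothetical sketch of the WAW rules; names and rule order are assumptions.
ALEF = "\u0627"         # ا
WAW = "\u0648"          # و
NOON_GHUNNA = "\u06BA"  # ں


def transliterate_waw(word, i):
    """Return the transliteration of the WAW at index i of word."""
    if word[i] != WAW:
        raise ValueError("no WAW at index %d" % i)
    last = len(word) - 1
    if i == 0:
        return "w"      # word-initial: consonant only
    if i < last and word[i + 1] == ALEF:
        return "w"      # immediately before ALEF: consonant only
    if i == last:
        return "[ūo]"   # word-final: a vowel, but which one is open
    if i == last - 1 and word[last] == NOON_GHUNNA:
        return "o"      # taken as the plural oblique suffix
    return "[wūo]"      # left to human judgment
```

For example, the final و of اردو (`"\u0627\u0631\u062F\u0648"`, index 3) yields `[ūo]`, and the و of کوئی (`"\u06A9\u0648\u0626\u06CC"`, index 1) yields `[wūo]`. Because the word-initial rule is tested first, a one-letter word consisting only of و would come out as `w`; whether that is desirable is not settled by the rules above.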
+The consonant (or semi-vowel) ی //(y)// is also used, ambiguously, to represent the long vowels //ī// (pronounced as //ee// in English //feet//) and //e// (pronounced roughly as //ai// in English //fair//). I want to distinguish these three pronunciations. In most cases, however, the script can only output //[yīe]// and leave the disambiguation to human judgment:
+  * In word-initial position, I assume that only consonantal pronunciation is possible and always output //y//.
+  * Anywhere immediately before ا (ALEF), I assume that only consonantal pronunciation is possible and always output //y//.
+  * In word-final position, I assume that the only possible reading is //ī//.
+  * In all other cases I output //[yīe]//.
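The four rules for ی can be sketched in Python as follows. The function name `transliterate_yeh` and its interface are assumptions made for illustration, and the rules are tested in the order in which they are listed.

```python
# Hypothetical sketch of the YEH rules; names are assumptions.
ALEF = "\u0627"  # ا
YEH = "\u06CC"   # ی


def transliterate_yeh(word, i):
    """Return the transliteration of the YEH at index i of word."""
    if word[i] != YEH:
        raise ValueError("no YEH at index %d" % i)
    last = len(word) - 1
    if i == 0:
        return "y"      # word-initial: consonant only
    if i < last and word[i + 1] == ALEF:
        return "y"      # immediately before ALEF: consonant only
    if i == last:
        return "ī"      # word-final: the only reading assumed possible
    return "[yīe]"      # left to human judgment
```

For example, the final ی of کوئی (`"\u06A9\u0648\u0626\u06CC"`, index 3) yields `ī`, while the ی of ایک (`"\u0627\u06CC\u06A9"`, index 1) yields the ambiguous `[yīe]`.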
+The letter ے (YEH BARREE) only appears in word-final position and is transliterated as //e// (which is written in other positions using the ambiguous ی).
+The letter ا (ALEF) is ambiguous and can lead to many different readings:
+  * In word-initial position, it merely says that the word begins with a vowel. It could be any of the three short vowels //[aiu]//: افریقہ //afrīqah// “Africa”, اسلام //islām// “Islam”, اردو //urdū// “Urdu”.
+    * If word-initial ا is followed by و or ی, they together could represent a word-initial long vowel //[ūoīe]//, such as in ایک //ek// “one”. In this case, ا should map to an empty string (because the next character itself will allow for transliteration by the long vowel).
+  * In word-internal and word-final positions, ا is transliterated to the long vowel //ā// (pronounced as //a// in English //father//).
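The position-dependent readings of ا described above can be sketched in Python. The function name `transliterate_alef` is hypothetical, and emitting the bracketed set `[aiu]` for the undetermined short vowel is an assumption modelled on the bracket notation used for و and ی.

```python
# Hypothetical sketch of the ALEF rules; names and the "[aiu]" output
# convention are assumptions.
ALEF = "\u0627"  # ا
WAW = "\u0648"   # و
YEH = "\u06CC"   # ی


def transliterate_alef(word, i):
    """Return the transliteration of the ALEF at index i of word."""
    if word[i] != ALEF:
        raise ValueError("no ALEF at index %d" % i)
    if i > 0:
        return "ā"      # word-internal or word-final: long vowel
    if len(word) > 1 and word[1] in (WAW, YEH):
        return ""       # the following letter carries the long vowel
    return "[aiu]"      # some short vowel; the script cannot tell which
```

For اسلام (`"\u0627\u0633\u0644\u0627\u0645"`), the initial ا yields `[aiu]` and the ا at index 3 yields `ā`. For ایک (`"\u0627\u06CC\u06A9"`), the initial ا yields the empty string.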
+The letter آ (ALEF MADDA) only appears in word-initial position and is transliterated as //ā// (which is written in other positions using normal ا).
+The YEH with the diacritic HAMZA above separates two consecutive vowels, e.g. جائے گا //jāe gā// “will go” or کوئی //koī// “some”.
+Similarly, the diacritic HAMZA above a و separates it from the preceding vowel as in ہاؤسنگ //hāūsing// “housing”. (In this case, the hamza is a separate character that is placed in the logical sequence after the و.)
+^ Unicode ^ Character ^ Pronunciation ^ Transliteration ^
+| 0627 | ا | -, a: | a, i, u, 0, ā |
+| 0622 | آ | a: | ā |
+| 0648 | و | v, u:, o: | w, ū, o |
+| 06CC | ی | j, i:, e: | y, ī, e |
+| 06D2 | ے | e: | e |
+| 0626 | ئ | - | 0 |
+| 0674 | ٔ (high hamza) | - | 0 |


Institute of Formal and Applied Linguistics Wiki
