Module talk:ur-translit
Food for thought
Latest comment: 3 years ago, 5 comments, 3 people in discussion
Is there a difference in pronunciation between final small prolonged ye and final short 'i' ye (i.e. 'i' vs 'ī' in the final form)?
Is there any need for ō rather than just 'o' (i.e. is there a shortened o in Urdu)?
1. I have seen a few such words in Hindi (शक्ति, मति, मिति) but not a single one in Urdu. Final short 'i's appear only in Sanskrit loanwords. They do not differ in pronunciation; they neutralize to the same phone (prolonged or not).
3. I have found a word پیدائش (pɛ̄.dɑ.iʃ); I believe ئ is used to transcribe post-vocalic short i (इ). The इ in Devanāgarī is used to transcribe English words: साइड. Kushalpok01 (talk) 10:09, 3 January 2021 (UTC)
1. Do you think that it's better to transliterate ye in final form as 'i' over 'ī'? I think I would support this too because there isn't a distinction in Urdu unlike Hindi.
3. I wonder whether something should be done to lower the chances of user error? Also, do you think we should remove the diacritic from ɛ̄ and replace it with ɛ for prolonged ai sounds? -Taimoor Ahmed (گل بات؟) 02:55, 10 January 2021 (UTC)
@Fenakhay Thank you, but since Urdu and Punjabi have very different transliteration policies, I don't know if having them use the same module is a good idea... Urdu policy treats the hamza as a zero consonant (a silent consonant that only exists to bear a vowel), whereas Punjabi policy treats it as a glottal stop. The two also transcribe nasals differently, and the letter correspondences are different.
Hola @Sameerhameedy, Fenakhay, I requested Module:pa-Arab-translit to be used for Urdu because of how closely the languages are related (speaker-wise). I'm not sure why pronunciation is relevant here, because transliteration doesn't necessarily represent pronunciation, no? Anyway, for me, when it comes to transliteration, the representation of the individual letters matters the most, so IMO we shouldn't just transliterate, for instance, ز(z), ض(z), ظ(z) as merely 'z'. I'm assuming there never was a policy set for Urdu, and the transliteration policy for Hindi was adopted for Urdu. نعم البدل (talk) 19:28, 6 September 2023 (UTC)
@نعم البدل To be clear, I have no objections to whatever transliteration policy is implemented. I am only saying that you should start a discussion about changing the policy. In fact, if you do start a discussion and get a new transliteration policy implemented, I will change Module:ur-translit to conform to whatever transliteration that is (since the Urdu translit is currently less buggy than the Punjabi one). But I am only asking that you discuss the changes instead of unilaterally implementing them. سَمِیر | Sameer (مشارکتها • کتی من گپ بزن) 19:29, 6 September 2023 (UTC)
And not with me (because I will follow whatever the community decides): go to the Beer parlour, tag all active Urdu editors and propose your new transliteration. If they agree, I will change this module to match whatever policy you put in place. سَمِیر | Sameer (مشارکتها • کتی من گپ بزن) 19:34, 6 September 2023 (UTC)
@Sameerhameedy: No, you have a perfectly valid objection/point; a discussion is warranted. I'm not really sure who to invite to the discussion though; perhaps you could help me in this case. I'd love to get your opinion as well, as a Persian speaker. نعم البدل (talk) 19:35, 6 September 2023 (UTC)
Latest comment: 1 year ago, 47 comments, 4 people in discussion
@Sameerhameedy: Hi. Pinging you as the only active editor on the module but maybe I should post in WT:GP or WT:BP.
Words like مشعہ, currently showing transliteration as "mśʻa", (found at radio#Translations) should fail for the lack of sufficient information. Either vowels should be provided or sokuns when there is no vowel. Otherwise, we'll get a lot of wrong transliterations.
I don't know if the word is valid and how to read it.
Urdu transliterations currently still use Module:pa-Arab-translit (Ben hasn't made the switch yet). I can add a provision that words that do not have any diacritics should not be transliterated. The issue is that some vowels in Urdu don't need diacritics, as in سے. I'm not sure what to do in that situation, but maybe in that case we can add a spurious sukoon, e.g. سےْ? سَمِیر | Sameer (مشارکتها • کتی من گپ بزن) 01:15, 7 September 2023 (UTC)
@Sameerhameedy: @Benwing2 used the provision for Arabic to fail transliterations if a word looks "suspicious", with various consonant clusters and no vocalisation. I guess Urdu and Persian can use a similar approach, but the rules will be somewhat different. Errors like کُرُوز vs کُروز, read as "korowz" instead of "koruz", are unavoidable if someone uses incorrect vocalisation modelled on e.g. Arabic. Anatoli T. (обсудить/вклад) 01:55, 7 September 2023 (UTC)
Hmm, I actually might not be able to implement this, but I'll try. I was going to just copy and paste what Ben put in the Arabic transliteration module, but I don't know what kind of magic he used. I can't really understand what the code he wrote does, but if I figure it out I'll implement it here. سَمِیر | Sameer (مشارکتها • کتی من گپ بزن) 03:21, 7 September 2023 (UTC)
@Sameerhameedy Hi Sameer. The basic approach I used is to start with the vocalized source and remove sequences that are considered vocalized or are allowed to be unvocalized. If any are left at the end, you fail the translit. Benwing2 (talk) 03:31, 7 September 2023 (UTC)
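The approach Benwing2 describes can be sketched roughly as follows. This is a minimal Python illustration, not the code of Module:ar-translit or Module:ur-translit; the function name, the letter ranges and the choice of which sequences count as acceptable are all assumptions made for the example.

```python
import re

# Code points are written as escapes so the sketch does not depend on rendering.
ZABAR, PESH, ZER, SUKOON, SHADDA = "\u064E", "\u064F", "\u0650", "\u0652", "\u0651"
MARKS = ZABAR + PESH + ZER + SUKOON
# Letters assumed to be readable as vowels without a diacritic:
# alif, vao, choti ye, bari ye.
VOWEL_LETTERS = "\u0627\u0648\u06CC\u06D2"
# Rough range of Arabic-script letters (an assumption; it excludes the diacritics).
LETTER = "[\u0621-\u064A\u0671-\u06D3]"

# One alternation, applied in a single pass, of sequences that count as
# vocalized or that are allowed to stay unvocalized.
ALLOWED = re.compile(
    LETTER + SHADDA + "?[" + MARKS + "]"            # letter carrying a vowel mark or sukoon
    + "|" + LETTER + "(?=[" + VOWEL_LETTERS + "])"  # letter directly before a vowel letter
    + "|[" + VOWEL_LETTERS + "]"                    # a bare vowel letter
    + "|" + LETTER + r"(?=\s|$)"                    # word-final letter
)

def is_sufficiently_vocalized(text):
    """Return True if only whitespace is left after removing allowed sequences."""
    leftover = ALLOWED.sub("", text)
    return leftover.strip() == ""
```

Under these assumptions a fully marked word passes, the same word without any diacritics is rejected, and a word spelled only with consonants followed by vowel letters still passes.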
@Benwing2 Okay, sorry, I fixed it. It was because it called some functions that didn't exist (a function I copied called for the letter waaw, but it's called vao in the module). But usually it says that a value that was called was null, not that the module didn't exist, so I'm still not sure why it did that.
Anyway, I got it working in this revision, but I had to remove it because it caused 2/3 of the test cases to fail instead of the 2 that I wanted. I might be able to get it working, but I'm worried about the aspirated consonants and whether they'd mess things up. سَمِیر | Sameer (مشارکتها • کتی من گپ بزن) 06:14, 7 September 2023 (UTC)
@Sameerhameedy The aspirated consonants shouldn't mess things up. Basically, you have a series of pattern subs to remove sequences that are allowable. You can see the patterns used in Arabic near the bottom of Module:ar-translit, starting at line 319. You'd just need some extra patterns to handle the aspirated consonants, which go before the corresponding regular-consonant patterns. I don't know whether the fatha/kasra/damma goes on the first or second consonant in the aspirated cluster, but e.g. if it goes on the first one, you just remove the consonant for /h/ in the sequence of consonant + fatha/kasra/damma + /h/, then you can treat everything following as if it were the corresponding unaspirated consonant. Benwing2 (talk) 06:33, 7 September 2023 (UTC)
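A rough Python sketch of that extra step, assuming the vowel mark sits on the first consonant of the digraph. The function name and the letter range are hypothetical and are not taken from either module.

```python
import re

DO_CHASHMI_HE = "\u06BE"                  # marks aspiration in Urdu digraphs
MARKS = "\u064E\u064F\u0650\u0652"        # zabar, pesh, zer, sukoon
LETTER = "[\u0621-\u064A\u0671-\u06D3]"   # rough letter range (assumption)

# consonant + optional mark + do-chashmi he
ASPIRATE = re.compile("(" + LETTER + ")([" + MARKS + "]?)" + DO_CHASHMI_HE)

def fold_aspirates(text):
    """Drop do-chashmi he after a consonant, keeping the consonant and its mark."""
    return ASPIRATE.sub(r"\1\2", text)
```

After this step the remaining text can be checked with the same patterns as for unaspirated consonants.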
@نعم البدل No, the module already handles vocalized aspirated consonants. What we're trying to do is have the module do a "count" to see how many syllables don't have any vowels (or a sukoon). If too many syllables don't have any vowels, it won't transliterate. So if I wrote سنسکرت the module would return blank, but if I wrote something like سَنْسْکْرِت(sanskrit), it would transliterate. Unfortunately, words like دیدار will transliterate as "dēdār" instead of going blank, since Urdu has some vowels that don't need diacritics.
@Atitarev @Benwing2 This now works, at least the vast majority of the time. But there are some false positives (so far I've only seen one, but that means there are a lot more out there). I tried making it more strict to prevent that, but it caused a lot of false negatives. I'll try to see if I can figure out what's going on, but since Urdu vowels don't need diacritics the way Arabic vowels do, it can't be as strict as Arabic is. سَمِیر | Sameer (مشارکتها • کتی من گپ بزن) 00:20, 8 September 2023 (UTC)
For the purposes of good transliterations, perhaps it needs to be more strict than the usual casual vocalisation, e.g. using sokuns in the middle of words, but there should be rules regarding which clusters are allowed and at what position, which is not easy.
E.g. I've changed from کھلائی to کِھلائی(khilāī). The former should fail, since I don't think any word can start with "khl" but if it does, it needs to have sukuns. It's further complicated by digraphs. "کھ" is "kh", and the diacritic is set on the first consonant.
If you look at the test cases, {{l|ur|کھلائی}} transliterates, but the cluster "khl" shouldn't be allowed, at least without a sukoon. The module could bar adjacent consonants without a sukoon (I tried to do that but did something wrong), but unlike Arabic (AFAIK) it can't require all letters to have a diacritic, since semivowels without diacritics are distinguished (e.g. ایـ "e", اِیـ "ī", and اَیـ "ai"). سَمِیر | Sameer (مشارکتها • کتی من گپ بزن) 06:22, 8 September 2023 (UTC)
@Atitarev I assume جن can occur word-medially in words like Arabic مَجْنُون(majnūn), which appears to become مَجْنُوں(majnū̃) in Urdu. Either we need to have a full understanding of Urdu phonotactics, or (better) we need to require sukuuns to mark consonant clusters. I believe requiring sukuuns is the right thing. Benwing2 (talk) 08:38, 8 September 2023 (UTC)
@نعم البدل By jazm you mean sukuun? Sukuun is a symbol, whereas from what I can tell, jazm is a phonological concept referring to a consonant followed directly by another consonant. Benwing2 (talk) 08:57, 8 September 2023 (UTC)
@نعم البدل Let's standardize on sukuun (or sukoon, whatever). Sukuun has only one meaning, which is consistent everywhere, but jazm has multiple meanings and is inconsistent: in Arabic, jazm means only the jussive mood (which happens to be marked by a null ending in Arabic); its extension of use to mean both a consonant cluster and the symbol marking a consonant cluster shows a general confusion between morphology, phonology and orthography. Benwing2 (talk) 09:27, 8 September 2023 (UTC)
@Sameerhameedy – The module is returning nil in cases like رَہنا(“to live”); بَہنا(“to flow”). It happens essentially in words with two-lettered stems, which will only have one diacritic + the infinitive ending (na). نعم البدل (talk) 01:18, 10 September 2023 (UTC)
grammatically incorrect for verbs*, might I add. I think it's something that we didn't consider in the discussions about consonant clusters. نعم البدل (talk) 01:25, 10 September 2023 (UTC)
@نعم البدل Are you sure? Urdu lughat does use sukoon for رہنا (though the font uses the Quranic style of sukoon, which resembles a ح cut in half). IMO this is more likely an issue with hi-translit, unless there is a reason why he + sukoon = ā instead of "h"?
But if you think ur-translit is the problem, then you'd have to ask @Benwing2 to look into it, because I genuinely have no idea how I would make transliteration exceptions for certain lemma types.
"Are you sure? Urdu lughat does use sukoon for رہنا" – Huh, I never noticed that. I'd always thought that when it came to verbs, the sukoon was never needed, since the stem is separate from the infinitive ending. If so, I guess it can stay as it is.
"Because I genuinely have no idea how I would make transliteration exceptions for certain lemma types." – I just assumed that some botch with Module:ur-headword could create an exception of some kind. I'm not great at coding lmao, but if not, then it's not the end of the world.
The issue with Module:ur-hi-convert is that the sukoon will combine the stem with the infinitive ending, so for instance رَہْنا(rahnā) would become रह्ना(rahnā) instead of रहना(rahnā), and this would have to change for all Urdu verbs. So I'm wondering if some sort of exception can be added in either module, or whether this will make the ur-hi-convert module redundant for Urdu verbs. نعم البدل (talk) 02:22, 10 September 2023 (UTC)
@نعم البدل But this is a more general issue, isn't it? Consonant clusters can be rendered two ways in Devanagari (either through a conjunct consonant or two consonants placed next to each other) and AFAIK there's no way to predict which one will be used in any particular circumstance. Benwing2 (talk) 03:05, 10 September 2023 (UTC)
@Benwing2: But in this case the pattern would be that the sukoon before the infinitive ending in verbs just needs to be ignored, i.e. if the module is being used for Template:ur-verb, the sukoon before the نا (the last two characters) just needs to return "" or continue (or something similar)? نعم البدل (talk) 03:14, 10 September 2023 (UTC)
@نعم البدل IMO having a hack for verbs is the wrong approach. Why can't you just make ALL clusters with sukuun be rendered using Module:ur-hi-convert using two consonants next to each other rather than using a conjunct? Conjuncts aren't so common in modern Hindi. Benwing2 (talk) 23:18, 12 September 2023 (UTC)
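As an illustration only, the suggestion amounts to never emitting the virama between the two consonants. The Python sketch below post-processes a Devanagari string instead; it does not reflect how Module:ur-hi-convert is actually written, and the function name is hypothetical.

```python
VIRAMA = "\u094D"  # Devanagari sign virama (halant), which forms conjuncts

def split_conjuncts(devanagari):
    """Render every cluster as plain consonants side by side by dropping the virama."""
    return devanagari.replace(VIRAMA, "")
```

Applied to रह्ना this yields रहना, which is the spelling wanted for the verb in the comments above.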
For the 6th example, is that really written with ghunna instead of sukoon? I have set ghunna to be a nasal vowel unless it's in front of specifically listed characters, namely گکجچخغٹڈڑحہ; in front of those characters ghunna becomes an assimilated nasal consonant. Should I add all the dentals as well? I didn't think ghunna was used before them. سَمِیر | Sameer (مشارکتها • کتی من گپ بزن) 08:44, 8 September 2023 (UTC)
There isn't a guideline for how ghunna assimilates, so I based the assimilation on how Hindi uses ं. And Hindi does not use ं before dentals. But obviously the scripts are different, so that might not apply to Urdu. سَمِیر | Sameer (مشارکتها • کتی من گپ بزن) 08:47, 8 September 2023 (UTC)
@Atitarev Actually, I will undo this change. According to this official dictionary from the Pakistan government (at least judging by which words had a sukoon vs a ghunna), an assimilated noon is represented with a sukoon, and a noon ghunna is usually a nasal vowel. By the looks of it, ان٘گ = aŋ, انْگ = ang. سَمِیر | Sameer (مشارکتها • کتی من گپ بزن) 22:03, 8 September 2023 (UTC)
@Sameerhameedy: Apologies for not responding earlier. I am flat out at work with a new project. Thanks for checking.
I think vowel + ن٘(̃) is almost like Hindi vowel + ं(̃) as in दांत(dānt) but our transliteration is phonetic, so it's transliterated as "dānt", just like दान्त(dānt) and I think there is no difference to दाँत(dā̃t). Feel free to edit and add new transliteration cases when you're more certain.
I am more interested in iẓāfat in this case and how it should be transliterated: "e" or "i", with or without "y", and at what position. @نعم البدل: please also comment on vocalisation and "expected" transliterations. Anatoli T. (обсудить/вклад) 06:35, 12 September 2023 (UTC)
@Atitarev Historically, the izafat in Urdu has been transliterated as -i, but I'm still not sure what should be used for the TR. I likely would have leaned towards -i if we were making a distinction between individual letters, and it would have made sense since the zer is a 'short i', but since we're evidently not so strict with the TR, -e might be on the table. I believe I've already expressed my opinion regarding nasalisation. نعم البدل (talk) 21:37, 12 September 2023 (UTC)
@نعم البدل, Sameerhameedy, Benwing2: Thank you! If you agree, we will leave iẓāfat with a zer (kasra) as "-i" or "-yi". ئے(e) seems to always produce "e". Please take another look at the iẓāfat endings in all examples. I added some comments. Are there any changes required to the vocalisations and the currently produced (automatic) transliterations? --Anatoli T. (обсудить/вклад) 00:18, 13 September 2023 (UTC)
@نعم البدل: Just to clarify, I saw your message about "bulãd" vs "buland" in another discussion, with which I agree. So I'll make test cases considering this. It may be hard to follow and remember all discussions, but if a test case is created, it will be addressed sooner or later. E.g. for صَدائے بُلَن٘د(sadāe bulãd), it should be "sadāe buland", not "sadāe bulãd". I will ping you when I make cases after your response. (And as User:Sameerhameedy said in the same discussion, بُلَنْد(buland) produces "buland", so is بُلَن٘د(bulãd) still a correct vocalisation and transliteration?) (Repeating the ping, since I included Sameerhameedy as well.) --Anatoli T. (обсудить/вклад) 00:49, 13 September 2023 (UTC)
@Atitarev Urdu lughat uses ghunna only for a nasal vowel; for consonant assimilation it uses ghunna only before kaaf and gaaf. In this case, UR-L would only use the spelling بُلَنْد for "buland", but would use دان٘ت for dā̃t. My usage of ghunna in fa-IPA was wrong and I'll remove it soon. سَمِیر | Sameer (مشارکتها • کتی من گپ بزن) 18:26, 14 September 2023 (UTC)
@Atitarev So after hours of research I found a site and a book by Manjari Ohala (I can't access the book directly, but it's quoted on Wikipedia and on that site). These are the only phonetic guides that focus on Urdu. According to both of them, nasalized vowels tend to assimilate before certain voiced stops (b, j, g) but (almost) never before voiceless stops (k, c, p). Nasal vowel assimilation before voiceless stops is a rarity and only really occurs in loanwords. This seems to be exactly the case in Hindi too: hi-translit removes nasal vowels before b, bh, j, jh, g, gh, Dh, dh but leaves them in front of t, th, T, Th, c, ch, k, kh, d, D, even if they use the strict nasalization marker ँ. So for بَین٘ک, the assimilation most likely has to be inputted manually. It seems this is a rare case anyway, so it probably isn't worth creating a hack for it. There's really no other option; at least, I cannot think of a way that wouldn't fuck up entries like سان٘پ آن٘کھ پان٘چ. سَمِیر | Sameer (مشارکتها • کتی من گپ بزن) 05:10, 15 September 2023 (UTC)
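The rule described here (assimilate before b, j, g; leave the nasal vowel elsewhere) can be sketched on the Latin transliteration. This Python example is an assumption-laden illustration: the function name is hypothetical, and writing the assimilated nasal as plain "n" is a simplification, not the module's or hi-translit's actual output.

```python
import re
import unicodedata

TILDE = "\u0303"        # combining tilde marking a nasal vowel in the transliteration
ASSIMILATING = "bjg"    # voiced stops named in the comment above

def assimilate_nasal(translit):
    """Turn a nasal-vowel tilde into "n" when a listed voiced stop follows."""
    # Decompose so a nasal vowel is always base letter + combining marks.
    decomposed = unicodedata.normalize("NFD", translit)
    replaced = re.sub(TILDE + "(?=[" + ASSIMILATING + "])", "n", decomposed)
    return unicodedata.normalize("NFC", replaced)
```

A form such as sā̃p is left untouched because p is voiceless, which is why a case like بَین٘ک would still need manual transliteration under this rule.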
ah -> ā
Latest comment: 1 year ago, 10 comments, 2 people in discussion
It changes back to "ah" if you put a sukoon on it, since, according to Wikipedia, it's not consistently pronounced as a short vowel in Urdu. If you want, I can change it so that a final -he becomes -a, but a final zabar + he = -ā. Or vice versa? سَمِیر | Sameer (مشارکتها • کتی من گپ بزن) 22:54, 8 September 2023 (UTC)
Final-a words like کھاتا(khātā), کَھاتَہ(khāta) (alt-form of the first lemma) and جَگَہْ(jagah). The first and last are fine; it's just the middle one. Words which end in zabar he shouldn't be prolonged in this case, despite the pronunciation. نعم البدل (talk) 01:01, 9 September 2023 (UTC)
I was looking into this, and it seems like it could be problematic due to how many entries list a final he as an -ā. Are you sure the hamza trick isn't good enough? It allows a final he to be transliterated as -ā and as -ah. سَمِیر | Sameer (مشارکتها • کتی من گپ بزن) 06:45, 17 September 2023 (UTC)
"how many entries list a final he as an -ā" – this is because final he usually becomes -ā in Hindi (with exceptions like जगह(jagah) / جَگَہْ(jagah)), and the TR used to be just copied over, even though they're not strictly the same form. As I said, it would treat lemmas like کھاتا(khātā) and کھاتَہ(khāta) the same. Is the issue strictly that the TR in older lemmas won't match the newer TR, or is it that the code might become too ambiguous? Many of the TRs in the old lemmas need correcting anyway.
There's no technical reason why we can't change a final -he; I was just worried about changing it since transliterating a final he as -ā seems to be common practice. But if you're 100% sure it should be changed, I can do it. سَمِیر | Sameer (مشارکتها • کتی من گپ بزن) 19:19, 17 September 2023 (UTC)
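The two behaviours under discussion can be put side by side in a small Python sketch. The function name and the `prolong` flag are hypothetical; the flag only stands in for the choice between the existing "-ā" reading and the requested "-a" reading, and nothing here is taken from the module's code.

```python
HE = "\u06C1"       # gol he
SUKOON = "\u0652"

def final_he_ending(word, prolong=False):
    """Return the ending a word-final gol he contributes, or None if there is none."""
    if word.endswith(HE + SUKOON):
        return "ah"                          # he + sukoon keeps the consonant
    if word.endswith(HE):
        return "\u0101" if prolong else "a"  # "ā" (older entries) or "a" (requested)
    return None
```

With `prolong=False`, جَگَہْ still ends in "ah" while کھاتَہ ends in a short "a", so it no longer collides with کھاتا.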
Latest comment: 1 year ago, 1 comment, 1 person in discussion
Hi @Sameerhameedy. About the various transliterations of 'n', as in ہونْٹ(honṭ) and پَنْجاب(panjāb): please normalise them as 'n', as these sounds – the retroflex 'n' and (to a lesser extent) the palatal 'n' – aren't found in Urdu (neither in the alphabet nor in the phonetic inventory), and are specific to Hindi. نعم البدل (talk) 13:29, 17 September 2023 (UTC)
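A minimal Python sketch of such a normalisation on the Latin output. The mapping covers only the retroflex and palatal nasals named in the request, and the function name is hypothetical.

```python
import unicodedata

# Retroflex n-with-dot-below and palatal n-with-tilde, both mapped to plain n.
NASALS = str.maketrans({"\u1E47": "n", "\u00F1": "n"})

def normalise_nasals(translit):
    """Replace Hindi-specific nasal letters in a transliteration with plain n."""
    # Compose first so decomposed input (n + combining mark) is handled too.
    return unicodedata.normalize("NFC", translit).translate(NASALS)
```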
Arabic tāʾ marbūṭa + الْ with an unmarked vowel
Latest comment: 1 year ago, 8 comments, 3 people in discussion
Will the module be able to automatically transliterate it as "dāiratu l-ma'ārif"? Is it a good and correct test case, or do we just transliterate such words manually? Should the الْ be spelled as ٱلْ to show that it's silent? Anatoli T. (обсудить/вклад) 23:23, 24 September 2023 (UTC)
@Atitarev It has to use the letter "te marbūta goal" ۃ (which is encoded differently than the Arabic one ة "ye marbūta") but it should work. In دائِرَۃُ ُالْمَعَارِف(dāiratu ulma'ārif) the ۃ is correct: when it has a vowel it becomes "atV", but otherwise it is "a". However, I'm not sure how to handle the Arabic al- because I'm not sure how the module would distinguish Arabic words starting with al- from Urdu words (which shouldn't transliterate like that). Theoretically we could get it to work with ٱلْ, but that diacritic is not on Urdu keyboards; it's not typeable on either my phone or laptop. I'm not sure how I feel about using diacritics that are not utilized by Urdu at all and that most Urdu speakers don't normally have access to. سَمِیر | Sameer (مشارکتها • کتی من گپ بزن) 20:01, 29 September 2023 (UTC)
@Sameerhameedy Are there any native Urdu words that end in "te marbūta goal" + vowel + ال in the next word? I'd think that the sequence of "te marbūta goal" + vowel indicates a phrase borrowed from Arabic. Benwing2 (talk) 20:09, 29 September 2023 (UTC)
Thanks, I suppose I could get that sequence to work. The only issue is that an initial alif has to be paired to a vowel or else the module will return nil. I would have to have the alif paired to a sukoon. I could have it so that "alif + sukoon + laam + sukoon" always returns "l-". سَمِیر | Sameer (مشارکتها • کتی من گپ بزن) 20:29, 29 September 2023 (UTC)
@Sameerhameedy, @Benwing2: Thanks for addressing this. You can use the same methods as in the Arabic module, even if such a combination is much less common. I also think "dāiratu ul-ma'ārif" with a hyphen would be more accurate but I don't have a strong opinion about it. It may not be so necessary to insert hyphens for etymological reasons only.
What do you think of کِتابُ الْمَعارِف(kitābu l-ma'ārif) (or kitābu lma'ārif - with no hyphen)? I think the indication that it is an Arabic borrowing and "ا" should be read as "ٱ", is the lack of a diacritic over the alif and a vowel on a preceding word. Would that work? Anatoli T. (обсудить/вклад) 00:24, 30 September 2023 (UTC)
@Sameerhameedy I agree with User:Atitarev, it would be better to not require the sukoon over the alif (since it's a nonstandard use of sukoon, which is normally only placed over consonants, and is likely to confuse users). That would probably mean adding a special case for this situation to allow it. Benwing2 (talk) 00:47, 30 September 2023 (UTC)
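The heuristic proposed in this thread (a vowel on the end of the preceding word plus an unmarked alif before laam) can be sketched as a simple check. This is a hypothetical Python illustration of that special case only; the function name is made up, and it does not cover sun-letter assimilation or how the module would actually integrate such a rule.

```python
SHORT_VOWELS = ("\u064E", "\u064F", "\u0650")   # zabar, pesh, zer
ALIF, LAAM = "\u0627", "\u0644"

def reads_as_arabic_article(previous_word, next_word):
    """True if next_word's initial alif-laam looks like an elided Arabic article."""
    ends_in_vowel = previous_word.endswith(SHORT_VOWELS)
    bare_alif_laam = next_word.startswith(ALIF + LAAM)  # no mark on the alif
    return ends_in_vowel and bare_alif_laam
```

For کِتابُ الْمَعارِف the check succeeds, whereas a spelling with a vowel written on the alif and no vowel on the preceding word is treated as an ordinary word boundary.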
Latest comment: 1 year ago, 2 comments, 2 people in discussion
@Sameerhameedy, @Benwing2: Hi, the Arabic ال article is not (yet) handled but is it preferred to do:
عِید اُلْفِطْر('īd ulfitr) (automated): not correct from the Arabic point of view, but it produces a correct transliteration; or
عِیْدُ الْفِطْر('īd ul-fitr) (manual) (possibly just "īd ulfitr" with no hyphens?)
Benwing2 suggested the Arabic article can be handled just as in the Arabic module, but Sameerhameedy has previously removed my test case. So I just want to know your views, also regarding assimilation of L before Arabic sun letters, e.g. عیدُ الصَّوم('īd us-saum), and tāʾ marbūṭa ۃ in the previous topic (above). Anatoli T. (обсудить/вклад) 00:08, 10 October 2023 (UTC)
@Atitarev I removed it because I didn't think it was possible, but I'll use Ben's suggestion, so you can add the test case back. Or perhaps I will when I need to test it.
@Atitarev There appears to be a complication: sometimes ؤُ is intended to mean 'ū', like in طاؤس and ساؤ. I know this doesn't conform to the rules in Arabic, but in Urdu ؤُ can mean both 'u' and 'ū'. At the end of a word (from what I'm seeing in the Platts dictionary), it's apparently always long, so a rule can be introduced for that, and then there's only طاؤس left to do manually. Exarchus (talk) 09:34, 15 January 2024 (UTC)
@Atitarev Well, I had already changed it to: word-finally = 'ū', word-medially = 'u', nothing difficult to implement, and you can rationalise this by saying that word-final 'u' becomes long anyway. Exarchus (talk) 21:48, 16 January 2024 (UTC)
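The positional rule stated here is simple enough to sketch. The Python below is only an illustration with a hypothetical function name; it treats trailing diacritics as not affecting word-final position, which is an assumption.

```python
HAMZA_VAO = "\u0624"   # vao with hamza above
DIACRITICS = "\u064B\u064C\u064D\u064E\u064F\u0650\u0651\u0652"

def hamza_vao_value(word, index):
    """Return the vowel for the hamza-vao at word[index]: long if final, else short."""
    if word[index] != HAMZA_VAO:
        raise ValueError("no hamza-vao at this position")
    rest = word[index + 1:]
    # Word-final means nothing but diacritics follows the letter.
    is_final = all(ch in DIACRITICS for ch in rest)
    return "\u016B" if is_final else "u"
```

Under this rule ساؤ gets the long vowel automatically, while طاؤس comes out short and would still need a manual transliteration, as noted above.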
@Exarchus, Atitarev: This is what's called a silent vao (or Vā'o-i Ma'dūla). I don't see how you can identify a silent vao in code, considering there's no diacritic or rule that you can follow to recognise it. It would almost certainly always need a manual translit, and I'll be honest, I don't really think there is any need to highlight a silent vao. Transliterating خود as xod, even though it's pronounced as xud, is fine, really. نعم البدل (talk) 22:28, 14 January 2024 (UTC)