Module talk:km-translit


Module

@Stephen G. Brown Steve, what are the transliteration/pronunciation rules for letters such as "ហ្វ", which have two names, fâ and wâ? And what are the "a-series" and "o-series" for vowels? For example, ភាសា (phiəsaa) is transliterated as "pīəsā", although the vowel letter is the same in both syllables (ា). @Wyang Please take a look at this module, it's not much but I'm struggling :) --Anatoli (обсудить/вклад) 02:50, 18 June 2014 (UTC)

ហ្វ is the consonant cluster ហ + វ (h+w). It is a rare combination in native Khmer words, and it is an artificial construct used to represent the foreign sound f: កាហ្វេ (kaafee). However, since it contains វ, it can sometimes be pronounced as w (actually, a sound halfway between w and v, /ʋ/). For example, ហ្វឹក can be pronounced either /fək/ or /ʋək/. ហ្វូង can be pronounced /ʋooŋ/, /fooŋ/, or /pʰooŋ/. ហ្វេ, borrowed from Vietnamese, is only pronounced /ʋee/ and means the Vietnamese city of Hue. ហ្វៃយ៉ង់ is only pronounced /ʋayyɑŋ/, meaning faience. ហ្វៃហ្វា is only pronounced /ʋayvaa/. There is no rule; it depends on the word.
Khmer consists of two different alphabets, called the "a-series" and "o-series". Vowels with "a-series" consonants are pronounced differently from the same vowel with "o-series" consonants. There is also a converter, ៊, which converts a-series consonants to the o-series, and another, ៉, which converts o-series consonants to the a-series.
The following tables show the a-series and the o-series consonants. ស is a-series, so សា = saa. ភ is o-series, so ភា = pʰie. —Stephen (Talk) 11:51, 18 June 2014 (UTC)
a-series consonants | Subscript form | IPA
ក (kɑɑ) | ្ក (្kâ) |
ខ (khɑɑ) | ្ខ (្khâ) | kʰɑ
ច (cɑɑ) | ្ច (្châ) |
ឆ (chɑɑ) | ្ឆ (្chhâ) | cʰɑ
ដ (dɑɑ) | ្ដ (្dâ) | ɗɑ
ឋ (thɑɑ) | ្ឋ (្thâ) | tʰɑ
ណ (nɑɑ) | ្ណ (្nâ) |
ត (tɑɑ) | ្ត (្tâ) |
ថ (thɑɑ) | ្ថ (្thâ) | tʰɑ
ប (bɑɑ) | ្ប (្bâ) | ɓɑ
ផ (phɑɑ) | ្ផ (្phâ) | pʰɑ
ឝ (sɑɑ) | ្ឝ (្shâ) | śɑ (Pali/Sanskrit)
ឞ (sɑɑ) | ្ឞ (្ssô) | ṣɑ (Pali/Sanskrit)
ស (sɑɑ) | ្ស (្sâ) |
ហ (hɑɑ) | ្ហ (្hâ) |
ឡ (lɑɑ) | ្ឡ (្lâ) |
អ (ʼɑɑ) | ្អ (្ʼâ) | ʔɑ
o-series consonants | Subscript form | IPA
គ (kɔɔ) | ្គ (្kô) |
ឃ (khɔɔ) | ្ឃ (្khô) | kʰɔ
ង (ngɔɔ) | ្ង (្ngô) | ŋɔ
ជ (cɔɔ) | ្ជ (្chô) |
ឈ (chɔɔ) | ្ឈ (្chhô) | cʰɔ
ញ (ñɔɔ) | ្ញ (្nhô) | ɲɔ
ឌ (dɔɔ) | ្ឌ (្dô) | ɗɔ
ឍ (thɔɔ) | ្ឍ (្thô) | tʰɔ
ទ (tɔɔ) | ្ទ (្tô) |
ធ (thɔɔ) | ្ធ (្thô) | tʰɔ
ន (nɔɔ) | ្ន (្nô) |
ព (pɔɔ) | ្ព (្pô) |
ភ (phɔɔ) | ្ភ (្phô) | pʰɔ
ម (mɔɔ) | ្ម (្mô) |
យ (yɔɔ) | ្យ (្yô) |
រ (rɔɔ) | ្រ (្rô) |
ល (lɔɔ) | ្ល (្lô) |
វ (vɔɔ) | ្វ (្vô) | ʋɔ
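The series rule can be sketched in a few lines. This is only an illustration in Python, not the module's Lua code: it handles a single consonant followed by the vowel sign ា, takes the two readings (aa and ie) from the សា and ភា examples in this thread, and assumes the two converter signs are U+17CA (a-series to o-series) and U+17C9 (o-series to a-series). The names `series_of` and `read_cv` are hypothetical.

```python
# Consonant inventories as listed in the two tables of this thread.
A_SERIES = set("កខចឆដឋណតថបផឝឞសហឡអ")
O_SERIES = set("គឃងជឈញឌឍទធនពភមយរលវ")

# Assumption: these are the two converter signs mentioned in the thread.
TRIISAP = "\u17ca"      # shifts an a-series consonant to the o-series
MUUSIKATOAN = "\u17c9"  # shifts an o-series consonant to the a-series

VOWEL_AA = "\u17b6"     # the vowel sign in សា and ភា

# Only the two onsets and two readings quoted in the thread are filled in.
ONSET = {"ស": "s", "ភ": "pʰ"}
AA_READING = {"a": "aa", "o": "ie"}


def series_of(consonant, shifter=""):
    """Return "a" or "o" for a consonant, honouring an optional converter sign."""
    if consonant in A_SERIES:
        base = "a"
    elif consonant in O_SERIES:
        base = "o"
    else:
        raise ValueError("not a Khmer consonant: %r" % consonant)
    if shifter == TRIISAP and base == "a":
        return "o"
    if shifter == MUUSIKATOAN and base == "o":
        return "a"
    return base


def read_cv(syllable):
    """Read consonant (+ optional converter) + ា, choosing the vowel by series."""
    consonant, rest = syllable[0], syllable[1:]
    shifter = ""
    if rest[:1] in (TRIISAP, MUUSIKATOAN):
        shifter, rest = rest[0], rest[1:]
    if rest != VOWEL_AA:
        raise ValueError("this sketch only handles consonant + ា")
    return ONSET[consonant] + AA_READING[series_of(consonant, shifter)]


print(read_cv("សា"))  # a-series onset
print(read_cv("ភា"))  # o-series onset
```

A real transliterator would need the full vowel table for both series, plus clusters and finals; the point here is only that the vowel reading is looked up by the series of the onset.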
Thank you, Stephen! Multiple unpredictable readings sound like a bit of an obstacle. Can we default such letters to their most common pronunciations and provide phonetic respellings (or additional parameters) for the exceptions? Could lists of exceptions be made, or is that not very practical? Stephen, are you happy to follow the new transliteration standard as used here, or do you have a different preference? I see you use the Sealang dictionary transliteration. This will be mostly automatic, but a standardised transliteration may be needed for test cases. Wikipedia doesn't describe the transliteration of diacritics well. @Wyang, thank you for making it work already, at the basic level! Could you add spaces between syllables, as with Lao? It will be important to make longer strings more readable. --Anatoli (обсудить/вклад) 12:30, 18 June 2014 (UTC)
The most common pronunciation is with f, but f would be incorrect for words such as Hue. I don’t know how this can be handled. I have not seen what the new transliteration scheme will look like, but Khmer has many vowel sounds, and they will require special symbols to represent them all. I think the f and the transliteration scheme will be the least of the problems ... the biggest problem, in my opinion, is determining the ends of words. Putting a space between every syllable, like some people do with Chinese, is not going to be acceptable. It is important to put a space only between words. I think every transliteration will have to be replaced with a manual transliteration. —Stephen (Talk) 13:18, 18 June 2014 (UTC)
The module won't be smart enough to determine the ends of words but can hopefully determine the ends of syllables. If you oppose spaces, we can do without them. As for manual vs. automatic: the automatic transliteration can be made non-mandatory, so that a manual one (when "tr=" exists) overrides the automatic one. If a transliteration method is accepted (it can be changed and tuned), you can preview, e.g., ព្រហ្មវិហារ (prum vihiə), copy/paste "prômvĭharô" and, if it's incorrect, fix it, insert spaces between words in phrases, etc. Entries/translations without a manual transliteration (tr=) will definitely benefit. --Anatoli (обсудить/вклад) 13:30, 18 June 2014 (UTC)
ព្រហ្មវិហារ (prum vihiə) = prum vi’hie. —Stephen (Talk) 14:22, 18 June 2014 (UTC)
Thanks, Stephen. I saw that it wasn't transliterated correctly. Sealang dictionary gives "prummeaʔviʔhie" or "prum viʔhie". I've added a few more words to test cases. Hopefully Frank can make this module work. --Anatoli (обсудить/вклад) 23:50, 18 June 2014 (UTC)
Yes. When រ (rɔɔ) comes at the end of a syllable, it is not pronounced. It is similar to the British -r, which is pronounced at the beginning of a syllable but not at the end. Sometimes it can cause the preceding vowel to be long. The ់ (bantoc) makes the preceding vowel short: បក (bɑɑk), but បក់ (bɑk). The ័ (samyok sannya) has no pronunciation, but denotes a deviation from the general rules of pronunciation (used mostly in loan words from other languages). The ៍ (toandakhiat) indicates that the base character is not pronounced. —Stephen (Talk) 06:28, 19 June 2014 (UTC)
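As a rough illustration of the toandakhiat rule only (written in Python rather than the module's Lua), the silenced character can be dropped before any further transliteration. The sketch assumes toandakhiat is U+17CD and that it directly follows the single base consonant it silences; `drop_silenced` is a hypothetical name, and the bantoc, samyok sannya and final-រ rules are not covered.

```python
import re

TOANDAKHIAT = "\u17cd"

# A base consonant (U+1780 to U+17A2) immediately followed by toandakhiat.
# Assumption: the mark silences exactly the one consonant it sits on.
SILENCED = re.compile("[\u1780-\u17a2]" + TOANDAKHIAT)


def drop_silenced(text):
    """Remove every consonant that carries a toandakhiat, along with the mark."""
    return SILENCED.sub("", text)


# The final ហ of this word carries the mark, so it is removed.
print(drop_silenced("\u179f\u1794\u17d2\u178f\u17b6\u17a0\u17cd"))
```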
"toandakhiat" must be like Thai thankhankhat (a consonant killer) as in แมนเชสเตอร์ (Manchester) making ร silent? --Anatoli (обсудить/вклад) 06:46, 19 June 2014 (UTC)
Exactly. ៗ is the repetition sign, which repeats the previous word. ឧ is the independent vowel ’u. ។ល។ (laʼ) (etc.) is lɑ’ or lɑ’nɨŋlɑ’. Many Khmer texts include a zero-width space (U+200B) between words, which allows software to break lines at the correct place... the zero-width space should be transliterated as a word space. —Stephen (Talk) 12:07, 19 June 2014 (UTC)
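The zero-width-space suggestion could look roughly like this. It is a Python sketch, not the module's code: `transliterate_words` is a hypothetical name, and the per-word transliteration function is passed in as a parameter because no such function is defined in this thread.

```python
ZWSP = "\u200b"  # zero-width space, used between words in many Khmer texts


def transliterate_words(text, transliterate_word):
    """Split on zero-width spaces, transliterate each word, join with spaces."""
    words = [w for w in text.split(ZWSP) if w]  # skip empty pieces
    return " ".join(transliterate_word(w) for w in words)


# Stand-in word function for demonstration; a real one would map Khmer to Latin.
print(transliterate_words("ab" + ZWSP + "cd", str.upper))
```

Text without any zero-width spaces comes back as a single unbroken word, which is why the spaces would have to be inserted by hand wherever they are missing.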

Question

@Atitarev @Stephen G. Brown

Wiktionary:Khmer transliteration seems to suggest that the transliteration system for Khmer here is the United Nations Romanization System for Geographical Names scheme. Should we use the UN scheme for Khmer in this module as well? ភាសា: phéasa or pʰiesaa? Would this mean syllable-final 'r' would still be 'r'? Wyang (talk) 04:03, 25 June 2014 (UTC)

As for me, whatever is achievable. Apart from the basics I couldn't find anything useful. If one transliteration system starts working, it would be great. Sealang dictionary seems to be using the UN scheme and Stephen has been using it or a similar scheme. --Anatoli (обсудить/вклад) 04:16, 25 June 2014 (UTC)
ភាសា pʰiesaa is the only one I am familiar with. Just skimming United Nations Romanization System for Geographical Names, it appears that syllable-final r would still be r.
If we use the United Nations Romanization System for Geographical Names, then we have to rely on the transliteration program exclusively, because I would not know how to make manual transliterations in that system.
If that’s the case, I think the only way it can work is if we insert zero-width spaces (U+200B) wherever needed. I think Lua makes it possible to filter the zero-width spaces out of links so that they don’t appear in page names (the same way it works with Russian acute accents). The zero-width space is the standard used in native Khmer keyboards, where the spacebar by default inserts a zero-width space (instead of a Western-style word space), and if we insert them where needed, then the transliteration program would know how to separate words. —Stephen (Talk) 12:30, 25 June 2014 (UTC)
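The link filtering mentioned above might be sketched as follows. This is Python for illustration only, not the actual Lua link code: `make_link` is a hypothetical helper that removes zero-width spaces from the page name and keeps them in the displayed text, using the ordinary `[[target|display]]` wikilink form.

```python
ZWSP = "\u200b"  # zero-width space


def make_link(text):
    """Build a wikilink whose target has no zero-width spaces."""
    target = text.replace(ZWSP, "")  # page names must not contain ZWSP
    return "[[" + target + "|" + text + "]]"  # display text keeps the word breaks


link = make_link("ab" + ZWSP + "cd")
print(link.split("|")[0])  # the target part, free of zero-width spaces
```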

misnested tags

In Module:km-translit, there is a section

			if match(syl, '៍') then
				syl = '<small><del>' .. gsub(syl, '.', function(consonant)
					if cons_conv then
						return cons_conv
					end end) .. '</small></del>'

and it would appear that (to avoid misnested tags lint errors) the opening HTML tags

<small><del>

should be closed with

</del></small>

rather than

</small></del>

as it is now. However, the edit tab doesn't give access to this section and I don't know how to do it, so I leave it for others to fix. Anomalocaris (talk) 23:19, 17 June 2018 (UTC)

Done. Suzukaze-c 23:24, 17 June 2018 (UTC)
Suzukaze-c: Thanks! I get it now! I have to click on the tool that looks like <> in order to access the source code. —Anomalocaris (talk) 00:08, 18 June 2018 (UTC)
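For anyone checking similar output by hand, the nesting problem in this section can be detected with a small stack-based check. The Python sketch below is not how the lint tool itself works: `first_misnested` is a hypothetical helper that only understands simple paired tags, ignores void and self-closing tags, and does not report tags that are left unclosed.

```python
import re

# Matches an opening or closing tag and captures the slash and the tag name.
TAG = re.compile(r"<(/?)([a-zA-Z]+)[^>]*>")


def first_misnested(markup):
    """Return the name of the first closing tag that does not match the
    most recently opened tag, or None if every closing tag matches."""
    stack = []
    for m in TAG.finditer(markup):
        closing = m.group(1) == "/"
        name = m.group(2).lower()
        if not closing:
            stack.append(name)
        elif not stack or stack.pop() != name:
            return name
    return None


print(first_misnested("<small><del>x</small></del>"))  # closing order as reported
print(first_misnested("<small><del>x</del></small>"))  # closing order as fixed
```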