Probably should not have used a comma in the first place, but multiple template parameters. By the way, I would suggest renaming the parameters to use language codes of the topolects (i.e. {{zh-pron|cmn=...|cmn2=...|wuu=...|yue=...|yue2=...|...}}) One-letter parameters may be handy to type, but they present some cognitive burden as yet another thing to look up in the documentation and memorise. Also, changing these two things at once would solve the problem of distinguishing parameters in new format from the old. — Keφr16:32, 25 May 2014 (UTC)Reply
Fixed - use ', ' as dictated by Pinyin orthography. I don't think I agree that using language codes is the better option. A user may have to look up the documentation to know that 'w' is the code for 'Wu' (chances are not, when the person sees 'm' is used for 'Mandarin'), but a user definitely has to look up the documentation if 'wuu' is the code for 'Wu'. Wyang (talk) 00:03, 26 May 2014 (UTC)Reply
There's also the matter of lects that don't have language codes. Categorization would probably need to be handled differently, but I'm sure there are some that would be worth incorporating into this framework. Knowing how dialectology and the ISO work, I would be quite surprised if there were no dialects with lexically-significant differences lacking ISO codes. I think the main problem would be deciding which ones not to cover- though lack of sources might solve that problem for us. Chuck Entz (talk) 00:55, 26 May 2014 (UTC)Reply
links to erhua-ed pinyin
The reason I showed the erhua-ed pinyin on the un-erhua-ed page is just to show that writing "兒/儿" is optional. A Beijinger may write "我去玩了", and pronounce it as "我去玩儿了" instead. Wyang (talk) 00:30, 28 May 2014 (UTC)Reply
Wuu tones 2 and 4
Thank you. From the above, is Cantonese the driver? Is there a correspondence for Mandarin -> Shanghainese -> Cantonese? Such as Mandarin tone 3 can be 2 or 5 in Cantonese and 2 or 3 in Shanghainese?
1 and 7, 3 and 8, 6 and 9 occur in complementary distribution. Jyutping merges the two into the former. The former is for non-checked syllables, the latter for checked syllables. Wyang (talk) 00:47, 28 May 2014 (UTC)Reply
I see, I thought so. Is it true that Guangzhou and Hong Kong Cantonese differ in the number of tones, with 7 and 6 respectively? Not sure where I read this now. Strangely, it's hard to find numeric values in Wikipedia for the 6 Cantonese tones: 55 35 33 21 13 22; I'm not sure about the checked tones. Does this "complementary distribution" actually mean different, lower tones for 7, 8 and 9? --Anatoli (обсудить/вклад) 01:04, 28 May 2014 (UTC)
The checked tones are: 5, 3, 2, for 7/8/9 (they are just shorter versions of 1/3/6). Tone 1 is typically 55 in Hong Kong, but can be 55 or 53 in Guangzhou. Some view it as two tones instead, citing characters which are basically only pronounced as 55 or 53, and never the other. At present the two values are largely interchangeable, although reading some 55 characters as 53 might sound weird. Compare 三: Hong Kong (JustinLam) and Guangzhou (greatharry). Wyang (talk) 01:22, 28 May 2014 (UTC)Reply
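The merger described above can be sketched in a few lines of Python. This is only an illustration of the correspondence discussed in this thread, not code from any Wiktionary module: the function names are hypothetical, the contour values are the Hong Kong figures quoted in the comments above, and the check for a checked syllable (a final -p, -t or -k) is a simplifying assumption.

```python
# Checked tones 7, 8, 9 are written as 1, 3, 6 in Jyutping (per the thread above).
CHECKED_TO_JYUTPING = {7: 1, 8: 3, 9: 6}
JYUTPING_TO_CHECKED = {1: 7, 3: 8, 6: 9}

# Contour values quoted in the discussion (tone 1 may also be 53 in Guangzhou).
CONTOURS = {1: "55", 2: "35", 3: "33", 4: "21", 5: "13", 6: "22",
            7: "5", 8: "3", 9: "2"}

# Assumption for this sketch: a checked syllable ends in -p, -t or -k.
CHECKED_CODAS = ("p", "t", "k")


def to_jyutping_tone(nine_tone: int) -> int:
    """Map a tone number in the nine-tone scheme to its Jyutping number."""
    return CHECKED_TO_JYUTPING.get(nine_tone, nine_tone)


def contour(syllable: str, jyutping_tone: int) -> str:
    """Return the contour for a toneless Jyutping syllable plus tone number.

    Checked syllables with tones 1, 3, 6 get the short contours 5, 3, 2.
    """
    if syllable.endswith(CHECKED_CODAS) and jyutping_tone in JYUTPING_TO_CHECKED:
        return CONTOURS[JYUTPING_TO_CHECKED[jyutping_tone]]
    return CONTOURS[jyutping_tone]
```

For example, `contour("saam", 1)` gives the long contour for 三, while `contour("sik", 1)` gives the short checked one.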
Thanks. Very educational. Are there numeric values for 7, 8 and 9? Can Shanghainese and Mandarin tones be mapped to each other similarly, or are they completely unpredictable? --Anatoli (обсудить/вклад) 01:04, 28 May 2014 (UTC)
The table above should do - ignore the Cantonese column. It's a lot less regular, but there are some correspondences. Wyang (talk) 01:22, 28 May 2014 (UTC)Reply
Thank you. I've added a Shanghainese usage example in 講/讲. I noticed there's no standard form for Shanghainese. I have to make adjustments when reading 上海话方言词典 but their audio is very good. --Anatoli(обсудить/вклад)00:12, 29 May 2014 (UTC)Reply
A Question for One Word
I have a question about a certain word. Is this Chinese word an adjective? If so, what does it mean? I could suggest improving your list of yellow linkable Chinese words by adding parts of speech next to them. --Lo Ximiendo (talk) 09:34, 29 May 2014 (UTC)Reply
Making entries for inflected forms is a bit of a waste of time, IMHO. It's for bots, not humans :) I hope someone will create accelerated methods for making them and/or write a bot for quick creation. You can try something more challenging by making an entry from translations. Adverbs are not inflected, so they are easier to do.
It doesn't seem to be a war at all... yet. I prefer new line, because every other parameter gets a new line granted to it. Wyang (talk) 07:31, 30 May 2014 (UTC)Reply
This is a silly non-issue. As long as it renders the same, I would leave it alone. Personally, though, I prefer them at the beginning of the line. Makes line-based diffs less messy when changing the last item. — Keφr07:37, 30 May 2014 (UTC)Reply
Result: 3 people (Kenny, Wyang, Kephir) prefer putting it on a new line; 1 person (Lo Ximiendo) prefers appending it to the last item.
Classical tag
Hi,
Could you make it work, please? It would be easier for me to get expected IPA, using what is currently produced. I've added some descriptions and more test cases. --Anatoli(обсудить/вклад)12:08, 1 June 2014 (UTC)Reply
Hi Frank,
Do these words have different readings? You have corrected my edit in 学校 where I used "5hhoq" from 学习 (your edit). I want to make Wu reading for 学堂, which seems more common than 学校 - used in one of my textbooks. What is the transliteration? "5hhoq daan" or "5hhiaq daan"--Anatoli(обсудить/вклад)23:13, 1 June 2014 (UTC)Reply
Fixed (except the talk page, which can't be removed from the category without removing the test transclusion). Putting a category wrapped in noincludes in a template that's transcluded by other templates only stops the transcluded template from going into the category- not the ones transcluding it. Fortunately, all the transcluding templates had the same code, so it could be deleted from the transcluded template without any effect on the entries. Chuck Entz (talk) 08:19, 3 June 2014 (UTC)Reply
I didn't delete it because I don't know if it's still needed. The Latin-script noun has been tagged for attention since January. Someone who knows Belarusian needs to create an entry under the correct Cyrillic-script spelling so the Latin-script one can be removed without losing information. Chuck Entz (talk) 08:56, 3 June 2014 (UTC)
There are 332 monosyllabic entries there and 908 multisyllabic ones, both intentionally omitted (the bot omits the entry if the title is monosyllabic or content lacks {{Pinyin-IPA}}). There is a complete list of the multisyllabic entries omitted here. I think the multisyllabic ones are automatable, at least semi-automatable. Not sure about the monosyllabic ones. Wyang (talk) 04:33, 5 June 2014 (UTC)Reply
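The omission rule stated above can be written as a small predicate. This is a hedged sketch rather than the bot's actual code: the function name is hypothetical, and "monosyllabic" is approximated as a one-character title, on the assumption that each Han character is one syllable.

```python
def is_omitted(title: str, wikitext: str) -> bool:
    """Return True if the bot would skip this entry under the stated rule.

    Assumptions for this sketch: a one-character title counts as
    monosyllabic, and the template is detected by a plain substring match.
    """
    monosyllabic = len(title) == 1
    lacks_template = "{{Pinyin-IPA" not in wikitext
    return monosyllabic or lacks_template


def split_entries(entries: dict) -> tuple:
    """Split {title: wikitext} into (processed titles, omitted titles)."""
    processed, omitted = [], []
    for title, wikitext in entries.items():
        (omitted if is_omitted(title, wikitext) else processed).append(title)
    return processed, omitted
```

Running `split_entries` over a dump of titles and page text would give the two lists that the counts above refer to.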
A sample entry with unknown PoS, only reading is available - 恅
In this revision I have added Mandarin and Cantonese readings, removed translingual definition requests and left one for Chinese; the cat= parameter is empty. Can all single character entries be categorised in by default and PINT, Jyutping, etc. if a reading is added? (Definitions and PoS can be added, but I'd like to establish a format for definitionless characters, as a sample.) What do you think? --Anatoli (обсудить/вклад) 23:38, 5 June 2014 (UTC)
I'm adjusting okay so far to the new unified Chinese formatting for the most part. One thing I'm not seeing in Template:zh-pron is the option for adding the older Wade-Giles romanization for Mandarin. Wade-Giles isn't used as much anymore but is still found frequently in older texts and is still preferred by many Chinese linguistics experts in academia. It'd be really good to have the ability to automatically (or manually) convert from Hanyu pinyin to Wade-Giles. This page is good for showing most of the conversions from Hanyu pinyin to Wade-Giles (doesn't use IPA charts, though). Bumm13 (talk) 03:24, 6 June 2014 (UTC)Reply
Any chance that Wade-Giles will be added to Template:zh-pron anytime soon? It's actually a big reason why I'm not spending more time converting topolect sections to the new "Chinese" formatting. Just curious. Bumm13 (talk) 19:20, 20 June 2014 (UTC)Reply
(E/C)I am neutral on Wade-Giles. Like Gwoyeu Romatzyh, it's just another system to understand and support, making the Chinese pronunciation box larger. Well, we had Wade-Giles in Hanzi headers (for single-character entries) all the time, to avoid being accused of destroying it, we should probably keep it/add it but perhaps for single-character entries only(?), perhaps the same for Gwoyeu Romatzyh(?). @Kc kennylau it must be an easy task for you? @Wyang, same thinking :)--Anatoli(обсудить/вклад)03:33, 6 June 2014 (UTC)Reply
I have done a draft for the py_wg function at Module:cmn-pron. All the rudimentary monosyllabic testcases work as expected, which I think is fairly sufficient for showing its robustness if it were to be applied solely to Hanzi entries. Please see if anything needs to be improved and enable it when it is deemed trustworthy. Wyang (talk) 05:47, 6 June 2014 (UTC)Reply
I'm curious if the copula 이다(ida) might have undergone "n" deletion at some point in the distant past. Is there any chance that the negative 아니다(anida) was originally composed of 아(a) + 니다(nida), with negative prefix 안(an) originating as a contraction of 아니(ani)?
There are interesting suggestions (such as in this slide deck by Bjarke Frellesvig about Old Japanese and earlier) that classical Japanese perfective auxiliary ぬ(nu) might have developed from an older copula, and that this might be the root of even modern particles like に(ni). That and related discussions about the 未然形(mizenkei, “irrealis”) conjugation got me wondering if there were any analogs in Korean, either in the formation of negatives by using a, or in the copula, and hence my question above.
(Incidentally, etyms 3 and 4 at 안(an) look completely indistinguishable to me... and on a different note, Chinese entries are looking pretty snazzy. :) )
Interesting! I love Bjarke Frellesvig's book "A History of the Japanese Language" and the presentation you linked to is very interesting as well.
There are four negatives in Korean: an ("not", 안, 아니, 아니다, 않다), mos ("cannot", 못, 못하다, 모르다), mal ("don't", 말다), and eps ("not have", 없다). I will post a detailed reply tonight when I have access to the Korean etymology books. Wyang (talk) 03:00, 10 June 2014 (UTC)Reply
Hi Eirikr! Sorry for the delay...
Korean differs from Japanese in that its negative constructions cannot be done purely with endings, and require a combination of verbal endings and negative verb/adjective/adverbs (i.e. 생각지 않다 = 思わず), and is hence less agglutinative in morphology. To me, the first Korean negative series of an has the root form of (a)n, and I always thought this must be cognate with the n or z (< n-su) in the negative forms of Japanese verbs. There doesn't seem to be a negative-forming process by attaching a to the positive copula. The positive copula might eventually be a reduced form of 있다 (itda, “there is”), which in Middle Korean was ista ~ isita ~ sita and this might be related to Japanese aru.
There is a very interesting discussion in Lee Namdeuk's book 한국어 어원 연구 IV, Chapter 基礎 語彙의 語源과 比較 考察. He thinks that an and eps negatives in Korean are ultimately from the same source, and that the -n- negative in Korean is related to the Japanese n negative. Here is the original text: https://www.dropbox.com/s/ok868r50asv2yox/Lee.tar.gz.
So to make sure we're on the same page, it sounds like:
KO ida did not undergo "n" deletion.
By extension, KO itda did not undergo "n" deletion.
I would be very grateful if you could confirm the above two, as that would help categorically rule out any connection between these and hypothetical JA copula nu (with inflected form ni).
(As two tangential ideas, do you think KO i- in ida has any relation to JA i- in iru, classically rendered as wiru? I'm not aware of any phonological processes that might explain "w" deletion in Korean, but you certainly know more about that than I do. And do you have any thoughts on the apparent overlap between KO iss- in itda and PIE *h₁es-(“to be”), among other odd coincidental KO-PIE collisions?)
KO negatives are historically indicated primarily by the consonant /n/, with the vowel /a/ being an incidental (or otherwise not important in conveying negativity), and the vowel /a/ in its negative capacity definitely not having anything to do with verb conjugation patterns.
This is interesting as a possible point of real divergence, in that the Japanese negative nu could be analyzed as identical with perfective nu, provided one accepts the 未然形(mizenkei, “irrealis form”) as a real feature of the language and not an artifact of some sort: + nu == == . For the zu forms, Frellesvig postulates that this was a fusion between the 連用形(ren'yōkei, “continuative form”)ni of root form nu + apparent adverbial complement su: /ni/ + /su/ > /nsu/ > /zu/. Have a look at slide 34 of the linked deck -- Frellesvig diagrammed this as “*ani-su”, and that ani is what got me thinking about possible KO connections. Ultimately though, I think the “a” there is just intended to convey the mizenkei. This /ni/ + /su/ matches Lee's notes for OJ on page 79, as much as I can read of them (thank you very much for that, though I regret that I can currently only make out some of the text -- I've really got to spend more time studying Korean). JA negative nashi mentioned on that same page by Lee could be analyzed as the mizenkeina of root form nu + adjectival suffix shi.
I see on the bottom of page 79 and the start of page 80 that Lee equates this JA “n” element with the KO an element, as you mentioned. I'm still chewing on the JA; I have real trouble viewing JA “n” purely as a negative given the prevalence of affirmative meanings that can potentially be ascribed to this same root, such as modern naru, verbal auxiliary -nau, non-negative adjectival suffix -nai (as in 危ない(abunai), 少ない(sukunai), etc.), perfective nu, possibly even particle ni. I might be open to the possibility of two JA “n” roots that converged or collided somehow, but the semantics for such opposite meanings being expressed in the same sound leave me uncertain as to how that would happen. With the mizenkei verb conjugation stem providing necessary context, the overlap between affirmative and negative “n” meanings in JA can be explained. I know that some authors, Frellesvig apparently among them, have advanced the notion that the mizenkei is purely an historical artifact and not an underlying semantic feature of the language, but without reading their arguments, I can't see where that could be the case -- the mizenkei appears to be an integral feature since the earliest writings, and its semantics can explain a number of otherwise-weird constructions.
Anyway, I realize this is a lot, but if you're willing :), I'd greatly appreciate it if you could 1) confirm that I'm restating the numbered items correctly, and 2) share your thoughts on the rest of the above. I'm an incorrigible language geek, especially when it comes to figuring out how things are put together, so if this has exceeded your interest threshold, just let me know. :) ‑‑ Eiríkr Útlendi │ Tala við mig17:37, 11 June 2014 (UTC)Reply
It looks like after converting this article to "Chinese", there might be a bug with how categories are being displayed: it's trying to add to "Hakka nouns/verbs" (and Wu) when no Hakka or Wu readings are specified, and the phantom Wu nouns/verbs category links have munged formatting in general. Bumm13 (talk) 08:07, 7 June 2014 (UTC)
Unforeseen naming issue with pronunciation audio files (Template:zh-pron)
After converting the 恩 article to using "Chinese" and the new templates, I noticed that the current Mandarin pronunciation Ogg file has a name that it isn't expecting (and thus showing a red link). The expected name is zh-ēn.ogg but the actual Wikimedia Commons filename for that sound is at Zh-en.ogg instead. I expect that this issue will continue to show up for many other such sound files. :\ Bumm13 (talk) 08:25, 7 June 2014 (UTC)Reply
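One way to picture the mismatch is a fallback that also tries the toneless spelling when the tone-marked filename does not exist. The sketch below is not how Template:zh-pron works; the function names are hypothetical, and the `zh-<pinyin>.ogg` pattern is taken only from the two filenames mentioned above. As far as I know, Commons capitalises the first letter of file names, so `zh-en.ogg` and `Zh-en.ogg` would point to the same file.

```python
import unicodedata


def strip_tone_marks(pinyin: str) -> str:
    """Remove combining diacritics from a pinyin string.

    Caveat: this also strips the diaeresis of ü, so it is only a rough
    fallback, not a faithful toneless pinyin conversion.
    """
    decomposed = unicodedata.normalize("NFD", pinyin)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def candidate_audio_names(pinyin: str) -> list:
    """List filenames to try, tone-marked first, then the toneless form."""
    names = ["zh-" + pinyin + ".ogg"]
    plain = "zh-" + strip_tone_marks(pinyin) + ".ogg"
    if plain not in names:
        names.append(plain)
    return names
```

For 恩, `candidate_audio_names("ēn")` yields the expected name first and the toneless name second, so a lookup could fall through to the existing file.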
Hi Frank,
I will get back on the Russian pronunciation appendix. Sorry for not doing much lately. Is that OK? I hope you won't lose interest :) I may not be able to describe assimilative palatalisation and gemination rules in good details.
I have checked the remaining multi-syllabic Mandarin entries. All of them either miss pronunciation sections altogether or use some old-style non-standard pronunciation method. You could probably fix them with AWB and your bot. Could you do that when you have time, please? They are too many to do manually. --Anatoli(обсудить/вклад)01:29, 10 June 2014 (UTC)Reply
No worries - Great works are not finished in a day :). I will be on a lookout for the testcase and talk pages, so please add anything that needs to be improved whenever you think of them.
Hi Wyang, I just had a quick question. Is it true that under the current formatting arrangements, there is no categorisation of simplified and traditional scripts? I just realised this may be the case. ---> Tooironic (talk) 07:59, 10 June 2014 (UTC)Reply
I personally don't miss this categorisation but I can imagine it won't be hard to introduce but without PoS separation. As previously agreed, 中國 and 中国 are now sorted the same way, by numbered pinyin. --Anatoli(обсудить/вклад)00:42, 11 June 2014 (UTC)Reply
Seems to be a bit of a shame to me. This kind of categorisation and its related data could be useful for both the average user and people who wish to make use of the data. ---> Tooironic (talk) 09:11, 11 June 2014 (UTC)Reply
What's the alternative, guys? I see a big issue with sorting, for example in topical categories (they quickly get out of hand, if |sort= is not specified). How are you going to sort Chinese entries, e.g. Category:Chinese nouns, if we drop pinyin sorting? Back to radical sort or by characters themselves? I'm not suggesting that Mandarin should overwhelm topolects but it's better to have some sorting key than nothing. I see that topolect categories are sorted by the appropriate romanisation but what if we decide not to split by Mandarin traditional/simplified, Cantonese traditional/simplified, etc.? --Anatoli(обсудить/вклад)05:08, 13 June 2014 (UTC)Reply
My preference is numbered pinyin, or some alternative sorting. If radicals are chosen, it would be great to have a radical index on categories, as a minimum; otherwise finding a word in a list of thousands won't be possible. One could use Category:Mandarin nouns, of course. There's a table at the top of Category:Mandarin nouns in traditional script but it's no longer usable because the rs= value no longer exists. --Anatoli (обсудить/вклад) 05:46, 13 June 2014 (UTC)
The Chinese dictionary I have has an index of radicals at the beginning of the book and under each radical is a list of characters that incorporate the said radical ordered by the number of strokes of the phonetic element. The actual dictionary is ordered by pinyin from A to Z. Jamesjiao → T ◊ C22:55, 15 June 2014 (UTC)Reply
Query: Is there any way to add back-end routines to {{zh-pron}} that would add categorizations for each reading (i.e. topolect) as it's added? I haven't explored Lua enough to know if it's even possible, but what of code that could parse the page for POSes and pronunciations, and auto-generate the corresponding categories? ‑‑ Eiríkr Útlendi │ Tala við mig06:34, 13 June 2014 (UTC)Reply
Sorry I haven't been very responsive lately... I don't quite understand what you meant above. {{zh-pron}} currently operates under that premise (it seems), generating the corresponding categories depending on the readings and PoS parameter value given. Wyang (talk) 04:19, 20 June 2014 (UTC)Reply
Eirikr probably means using zh-pron to make e.g. "Cantonese nouns in traditional script", etc. I think we shouldn't split by topolects and PoS but that's only me. Just "Chinese terms in traditional script" and ...simplified would do but some people will disagree. --Anatoli(обсудить/вклад)04:26, 20 June 2014 (UTC)Reply
Hi, yes, that's more what I was trying to convey. I'm not much for using the Chinese on this site, but I could see some utility in being topolect-specific -- as a user, to find readings for terms in a specific topolect; and as an editor, in order to find those entries that might still need topolect data. ‑‑ Eiríkr Útlendi │ Tala við mig18:15, 20 June 2014 (UTC)Reply
No problem, I have unprotected it for a week for you. Please let me know if that is not long enough or if you need to edit other templates. Wyang (talk) 05:14, 11 June 2014 (UTC)Reply
It doesn't answer the question of the assimilative palatalization and gemination, though. Well, we can keep working on it and deal with problematic cases when they arise.
Could you add handling for some prefixes (they do cause gemination) - I will try to maintain the list. One of the currently failed tests: отдохну́ть (prefix: от-). --Anatoli(обсудить/вклад)23:59, 15 June 2014 (UTC)Reply
Yellow Link Deal
Generate different indentation for quotations not following definitions, following definitions but inline, and following definitions and not inline. See the bottom of Wiktionary:Feedback for an example of in_notes. Wyang (talk) 12:44, 17 June 2014 (UTC)Reply
You have an "if in_notes then" in L343 which overrides the "elseif in_notes then" in L336. By the way, how can I not make it a quotation? --kc_kennylau (talk) 14:16, 17 June 2014 (UTC)Reply
L336 defines other_lines_indent and simp_indent, whereas L343 defines first_line_indent. What do you mean by a non-quotation? in_notes and inline? Wyang (talk) 23:51, 17 June 2014 (UTC)
Speedy deletion
I've just restored a large number of deleted cmn categories that had subcategories- in fact, most of the category tree for cmn was obliterated, leaving redlinks all over the place.
There are many topical categories that are populated by {{context}}, so the only way to empty all cmn categories would be to get rid of either the context template or the "lang=cmn" parameter. Unless you do that, there will be non-empty cmn categories, which will in turn be categorized in the parent categories set by {{topic cat}} subtemplates.
Just so we're clear: a category that has subcategories is not empty, and shouldn't be deleted. Not every category is designed to directly contain entries- many are just for navigating between sister categories. If we're going to have cmn topical categories, we should have a category tree to link them together.
If you're going to delete a category, first empty and delete all of its subcategories (and sub-subcategories, etc.). Otherwise, leave it alone. Thanks Chuck Entz (talk) 02:55, 20 June 2014 (UTC)Reply
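The rule above amounts to a post-order walk of the category tree: visit subcategories first, and treat a category as deletable only when it holds no entries and every subcategory is itself deletable. The sketch below is only an illustration under those assumptions; the function name and the in-memory dictionaries are hypothetical stand-ins for whatever the wiki actually reports.

```python
def safe_deletions(tree: dict, entry_counts: dict, root: str) -> list:
    """Return categories under `root` that can be deleted, children first.

    tree maps a category name to its list of subcategories;
    entry_counts maps a category name to its number of entries.
    A category with entries, or with any non-deletable subcategory, is kept.
    """
    order = []

    def visit(category: str) -> bool:
        children_ok = True
        # Visit every subcategory, even after one turns out to be non-empty.
        for sub in tree.get(category, []):
            if not visit(sub):
                children_ok = False
        deletable = children_ok and entry_counts.get(category, 0) == 0
        if deletable:
            order.append(category)
        return deletable

    visit(root)
    return order
```

With a tree where one topical subcategory is still populated by {{context}}, the populated branch and all of its ancestors stay out of the returned list, which is the behaviour asked for above.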
That'll work... As much as I hate to see all my work in creating and then restoring the categories just evaporate like that, my only real problem was with deleting non-empty categories. As long as you make sure the categories are empty before you delete them, I can live with it. Sorry for the extra work! Chuck Entz (talk) 05:22, 20 June 2014 (UTC)Reply
Hey folks, I noticed that entries such as 檳榔, that need both hira and kata specified, wind up getting two romaji listings in the headline. That doesn't seem quite right... ‑‑ Eiríkr Útlendi │ Tala við mig01:13, 29 June 2014 (UTC)Reply
Hmm, I just found that now 鮎#Japanese isn't showing any romaji at all -- specifically for the first etym noun sense, where the あゆ reading is supplied as an unnamed positional parameter and the アユ reading is supplied as the named kata= parameter. ‑‑ Eiríkr Útlendi │ Tala við mig22:56, 14 July 2014 (UTC)Reply
This entry apparently got accidentally scrambled by an AWB edit of yours in the heat of the topolect merger, and a contributor has reverted it to the edit before that. I thought you might want to take a look at it. Chuck Entz (talk) 04:22, 30 June 2014 (UTC)
Almost. There are still 9 entries with a variation on the same error that don't respond to null edits. Thanks for the other 111 entries, though. Chuck Entz (talk) 01:10, 4 July 2014 (UTC)Reply
I know about those- they went away after null edits- but these are in och-pron, not yue-pron (in several cases, an entry had both, a few lines apart from each other). Chuck Entz (talk) 02:05, 4 July 2014 (UTC)Reply
It's hard to be sure of anything, with all the edits cycling through the edit queue. After I read your comment, I checked and saw 55 entries in the category. In the time it took me to do a quick null edit on one entry, it was empty again. Still, I haven't seen anything that displayed a module error on the page or that survived a null edit, so I think we're out of the woods on this one, for now anyway. Thanks! Chuck Entz (talk) 04:47, 4 July 2014 (UTC)
Thanks a lot! I was a bit bored converting them manually :) However, there is still a list of verbs, phrases and quite a lot of proverbs and idioms. --Anatoli (обсудить/вклад) 11:06, 10 July 2014 (UTC)
Frank, please put finishing those pesky cmn templates (e.g. proper nouns, idioms, proverbs, etc.) on your to-do list :) It just seems much easier for you with AWB. If any of them are hard to do because of bad formatting, I'll finish them manually. I have a question: how would you write IPA for 三Q? Not sure how to convert it to the new format. --Anatoli (обсудить/вклад) 00:43, 15 July 2014 (UTC)
Sorry! I missed your message earlier. There is no need to do them manually since time would be better spent on other tasks, although on the other hand I haven't been very free lately... I reckon we should disable Zhuyin, IPA etc. if the entry title contains non-Chinese characters. Wyang (talk) 23:35, 21 July 2014 (UTC)Reply
That's OK, whenever you have time, just wanted to make sure you read my message. Thanks. :) Re: IPA, Zhuyin, 卡拉OK may get hits in Zhuyin, besides, users may want to know how Chinese pronounce those words but it's too hard, well... --Anatoli T.(обсудить/вклад)23:47, 21 July 2014 (UTC)Reply
Created an entry for 合期. This has a rare alternate reading of gaggo, for which {{ja-pron}} has produced unlikely IPA, flagged with the error "replace g with ɡ, invalid IPA characters (gg)". So far as I know, geminate "g" sounds in Japanese (rare as they are) never manifest this way. Could someone look into this? ‑‑ Eiríkr Útlendi │ Tala við mig 19:14, 10 July 2014 (UTC)
Good point. I thought the Japanese are unable to pronounce voiced geminates and thus bed and bet would end up basically identical (even though written differently), which is why I devoiced the first part of the geminate with no audible release (probably also influenced by the limited distribution of checked tone to voiceless codas in Chinese). I have changed them to truly voiced geminates, and added a voicelessness sign, and removed the optional nasalisation of g in gg. Wyang (talk) 09:13, 11 July 2014 (UTC)Reply
The act of quietly accumulating shares of stock by traders when the stock is at a lower price? Would it sound too literal? Wyang (talk) 09:18, 11 July 2014 (UTC)Reply
Middle Chinese
Now that {{zh-pron}} includes Middle Chinese pronunciation info, what should be done with these 275 Middle Chinese entries that were discussed in the BP in January (list)? Can the bizarrely-annotated, half-hidden pronunciation information be removed from the ==Middle Chinese== sections of those entries now, once the entries are made to use {{zh-pron}}? (If the info isn't removed, I'd like to standardize the wording and make it visible, like this.) - -sche(discuss)02:05, 11 July 2014 (UTC)Reply
I parsed through your list. 41 articles are gone and here is the updated version:
I would just leave them as they are as the Chinese merger is actively ongoing and they would probably be gone in a year. Unless someone wants to decimate them now... Wyang (talk) 09:46, 11 July 2014 (UTC)Reply
Hi Wyang,
The unified "Chinese" with Template:zh-pron is working quite nicely for the most part. I have found a few small errors that eventually will need to be fixed regarding romanization readings. For Mandarin (Wade-Giles), the pinyin "gui" is showing as "kui" when it should be showing "kuei". Example: 龜.
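To make the reported "gui" case concrete, here is a small sketch of the rule as I understand Wade-Giles: the pinyin final -ui is written -uei after g and k, and stays -ui after other initials. This is not the py_wg function in Module:cmn-pron; the function name is hypothetical, only "gui" → "kuei" comes from the report above, and the initials table is a deliberately tiny subset with tones ignored.

```python
# Pinyin initial -> Wade-Giles initial, for a small illustrative subset only.
WG_INITIALS = {"g": "k", "k": "k'", "h": "h", "d": "t", "t": "t'", "s": "s"}


def ui_syllable_to_wade_giles(pinyin: str) -> str:
    """Convert a toneless pinyin syllable ending in -ui to Wade-Giles.

    Only the initials in WG_INITIALS are handled; anything else raises.
    """
    if not pinyin.endswith("ui"):
        raise ValueError("this sketch only handles syllables ending in -ui")
    initial = pinyin[:-2]
    if initial not in WG_INITIALS:
        raise ValueError("initial not covered by this sketch: " + repr(initial))
    # After g/k the final is spelled -uei; otherwise it remains -ui.
    final = "uei" if initial in ("g", "k") else "ui"
    return WG_INITIALS[initial] + final
```

So `ui_syllable_to_wade_giles("gui")` returns `kuei`, the value expected for 龜, rather than `kui`.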
There's also two relatively minor Min Nan romanization issues. Going from Peh-oe-ji to Tai-Lo (in both cases), the Tai-Lo -eh (as in "ngeh") is supposed to correspond to POJ -oeh (as in "ngoeh"), at least according to the sources I checked against. The template is changing POJ -oeh to -ueh in Tai-Lo. Example: 夾.
Also, the -o͘ suffix in Peh-oe-ji corresponds to -oo in Tai-Lo but is showing up as if they are the same. Example: 走. Other than that, everything looks great so far. Keep up the good work! :) Bumm13 (talk) 20:54, 11 July 2014 (UTC)Reply
For the pinyin "gui" conversion, here are two good sources: and . These are both university library sources (Hong Kong University of Science and Technology and the University of Chicago, respectively). The former is actually in China, while the latter is basically the equivalent of an Ivy League institution in the United States. I'll have to get back to you on the "ngeh" Min Nan issue. My sources for that one are (admittedly somewhat weak) the Open Dictionary Network - Min Nan Dictionary (kaifangcidian.com) for Peh-oe-ji compared with the Taiwan Min Nan Common Words Dictionary (based in Taiwan at twblg.dict.edu.tw - Taiwan Ministry of Education) for Tai-Lo. Bumm13 (talk) 17:10, 12 July 2014 (UTC)
Hello again. Is there an expression that can easily convert the parameter {{{vol}}} in Roman numerals (I, II, ..., X) into Arabic numbers (1, 2, ..., 10) in Template:R:xcl:HAB, in the &volume={{{vol}}} part? --Vahag (talk) 08:14, 28 July 2014 (UTC)Reply
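For reference, the conversion being asked about is simple to state outside the template language. The Python sketch below only illustrates the arithmetic (the function name is hypothetical); inside a template the same mapping would have to be expressed with whatever parser functions or module the wiki provides, which this sketch does not attempt.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral: str) -> int:
    """Convert a Roman numeral such as 'IV' or 'X' to an integer.

    Reads right to left: a symbol smaller than the one after it is
    subtracted (the I in IV), otherwise it is added.
    """
    total = 0
    previous = 0
    for symbol in reversed(numeral.strip().upper()):
        value = ROMAN_VALUES[symbol]  # KeyError on anything that is not a numeral
        if value < previous:
            total -= value
        else:
            total += value
        previous = value
    return total
```

Applied to the volume numbers I to X, this gives 1 to 10, which is the range needed for the &volume= part of the URL.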
We are missing about four extra senses here. Don't suppose you'd be interested in taking a stab? I'm busy with something else at the moment. ---> Tooironic (talk) 11:53, 16 July 2014 (UTC)Reply
I tried adding a second Mandarin pronunciation .ogg file entry to Template:zh-pron in the 教 article and nothing I've tried seems to work in causing the second file's click play button thing to show up in my browser(s). Could you check the article to see if I'm doing something wrong? I have both Mandarin readings in the "m=" parameter, so I would think the audio files would show up without a lot of effort. Bumm13 (talk) 08:41, 21 July 2014 (UTC)Reply
It could be generated by putting ,2a=y in the |m= field, please see what I did. Ideally the pronunciations should be split since they have alternative etymologies, as in 知 or 會, but for short articles like 教, extra parameters like ,2a=, 3a=, and 4a= are available for use. Wyang (talk) 23:33, 21 July 2014 (UTC)Reply
This looks real, but obviously messy, and I am not sure if this is a brand name, in which case it would need to pass WT:BRAND. (The same IP has also added prigle, for what it may be worth.) Can you take care of this? — Keφr06:38, 26 July 2014 (UTC)Reply
Hi Wyang
Thanks for the response in the Beer Parlour. We would like to contribute the data for Dzongkha dictionaries. I think it will need someone familiar with Wikimedia software and something like Python to convert and import this data - unfortunately I don't have those skills. Any help would be appreciated. CFynn (talk) 13:06, 31 July 2014 (UTC)Reply
@CFynn Hi Chris! It's great to have you here. I wrote a module for transliterating Tibetan/Dzongkha a while ago (Module:bo-translit) and have written a few simple Python scripts for either retrieving or uploading data from/to Wiktionary. I have used the dictionary at dzongkha.gov.bt a couple of times, and was impressed by how well-organised the website is. For creating an entry of a word in Wiktionary, we need two pieces of information about the word: the definition and part of speech. I am more than glad to help out if you have any questions. Thanks, Wyang (talk) 23:26, 31 July 2014 (UTC)Reply
@Wyang Hi. We have XDXF (XML) files of the dictionaries which are probably the easiest format to deal with. The Dzongkha-English dictionary has part of speech and English definition and sometimes a Dzongkha synonym. I think this could be used to make the basic entries. There are separate files which list verb forms (past, present, future) and honorific forms of words - which might be added on top. The English-Dzongkha dictionary has English word, part of speech, Dzongkha definition(s). The Dzongkha-Dzongkha dictionary is just word+definition with a separate field for part of speech - though this information is often embedded within the definition.
The Tibetan-Dzongkha dictionary has Word+Definition with part of speech within square brackets as the first part of the definition which should be easy to extract. Sometimes the square brackets also contain a code indicating the head word is Sanskrit in Tibetan script or an archaic form. The Dzongkha-English and English-Dzongkha dictionaries are clearly going to be the easiest to deal with. The differences in format are due to the fact that these were originally compiled by different people at different times using only a word processor - not even a database. At that time people were only concerned about print publication. 07:17, 5 August 2014 (UTC)
In the XDXF files Dzongkha-English entries look like this:
<ar><k>ཀྲུམ་ཀྲུ</k> <def> noun cartilage (པགས་ཀོ་ཧྲབ་ཧྲོབ།) adj. crisp, crunchy, gristle (ཕྲུམ་ཕྲུམ།)</def> </ar>
@CFynn Thanks for the reply. The XML file easiest to use here would be the Dzongkha-English dictionary, as well as the additional files of verb conjugation and honorifics. The English-Dzongkha dictionary can only be used here for adding to translation tables (e.g. aardvark), which is less straightforward (multiple translation tables, linking to components in translations). Embedding of part of speech in the definition shouldn't be too much of a problem. Would there be anything to take care of in terms of copyright (referencing) and externally linking to the dzongkha.gov.bt website? It seems it's not a very difficult task, and we can get started on this soon. Wyang (talk) 23:22, 5 August 2014 (UTC)Reply
OK. I'll get the latest versions of those files together and post a link here to the files. If Wikimedia need an official letter saying the data is released under CC-BY-SA 3.0 + GFDL I can get the Secretary of the DDC to write one and we can fax it or send it by snail mail if you can tell me where and to whom this should be sent. Is there some kind of standard release form? A note or references saying the Dzongkha data is from the DDC and a link to their website http://dzongkha.gov.bt/ would be nice. (BTW PDF versions of all the dictionaries are available on the DDC site.) CFynn (talk) 04:28, 6 August 2014 (UTC)Reply
Thank you. References containing link to the website would be appended to all entries. Here is the Wikipedia policy on donating copyrighted information: w:Wikipedia:Donating copyrighted materials#Granting us permission to copy material already online. Copyright is usually less of a concern at Wiktionary, since the material involved is generally short in length and not of innovative nature. If we want to be safe, we could request that the Secretary send a brief email declaring permission to use DDC data. Wyang (talk) 04:49, 6 August 2014 (UTC)Reply
OK - this may take me a few days as I'm recovering from a minor operation on my foot and it is a little difficult for me to get around. CFynn (talk) 21:00, 6 August 2014 (UTC)Reply
BTW you may need to slightly modify your Tibetan transliterating tool for some Dzongkha entries. Dzongkha syllables sometimes contain a second root which does not occur in Tibetan. This mostly happens when the tseg between syllables is dropped to reflect Dzongkha pronunciation. e.g. Tibetan བླ་མ་ (bla ma / Lama) = Dzongkha བླམ་ (blam / Lam). About 12 years ago when I was working on Tibetan & Dzongkha collation I compiled a spreadsheet which shows all the possibilities for a second root in a Dzongkha syllable which might be useful to you. I'll try and find it and post a link. CFynn (talk) 17:29, 12 August 2014 (UTC)Reply
A minor tweak for zh-pron
At 或, I just fixed an edit where spaces before the commas in the cat= part caused the module not to recognize the POS abbreviations. I found out about it from an entry in Special:WantedCategories for Category:Hakka pron (I would highly recommend regularly checking Special:WantedCategories for non-catastrophic module errors- it updates every 3 days). Is it too much trouble to have the module allow for whitespace in arguments to avoid this in the future? Thanks. Chuck Entz (talk) 18:42, 1 August 2014 (UTC)Reply
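The whitespace tolerance requested above amounts to trimming each item after splitting on commas. Here is a minimal Python sketch of that idea; the function name `parse_cat` and the sample abbreviations are assumptions for illustration, and the actual module is written in Lua and may parse its arguments differently.

```python
def parse_cat(cat_value):
    """Split a comma-separated cat= value and trim whitespace around each item.

    Empty items (for example from a trailing comma) are dropped.
    """
    return [part.strip() for part in cat_value.split(",") if part.strip()]

if __name__ == "__main__":
    # Spaces before or after the commas no longer matter.
    print(parse_cat("n , v ,adv"))
```

With this kind of normalisation, "n,v" and "n , v" yield the same list, so stray spaces would not produce unrecognised category names.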
I would say "1) to be absurd; preposterous; 2) to be down on one's luck; 3) to go to hell; to hell with ...; 4) damn; damn you; for Christ's sake" for these words. Wyang (talk) 11:31, 26 August 2014 (UTC)Reply
I saw that you edited the article 枯萎 two weeks ago. How did you do that? I tried to show the Taiwan pronunciation by typing the character on the pronunciation section, but it resulted in a module error. Please explain so that I can do the same thing for other articles, thank you. --Mar vin kaiser (talk) 06:35, 3 September 2014 (UTC)Reply
Hi, Mar vin kaiser. I'll try to answer, as I am also using this method now. It's not so straightforward. The module Module:zh/data contains the following lines:
={'wěi','wēi'}
={'Mainland','Taiwan'}
So, for any term containing this character, the parameter should contain just the character itself instead of its pinyin, e.g. |m=kū萎 in this case. If a character with variant pronunciations (Mainland, Taiwan) is missing, it needs to be added. BTW, please add a Babel box to your user page. --Anatoli T.(обсудить/вклад)06:47, 3 September 2014 (UTC)Reply
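To make the mechanism described above more concrete, here is a rough Python sketch of how a raw character left in |m= could be expanded into one reading per region. The table layout and the names `VARIANT_READINGS`, `VARIANT_LABELS` and `expand_variants` are assumptions made for illustration only; the real data lives in Module:zh/data and is processed in Lua.

```python
# Hypothetical data: each character maps to one reading per region,
# in the same order as VARIANT_LABELS.
VARIANT_READINGS = {"萎": ("wěi", "wēi")}
VARIANT_LABELS = ("Mainland", "Taiwan")

def expand_variants(m_value):
    """Replace each listed character in m_value with its regional reading.

    Returns a dict mapping each region label to the resulting pinyin string.
    Characters without an entry in the table are left unchanged.
    """
    results = {}
    for idx, label in enumerate(VARIANT_LABELS):
        results[label] = "".join(
            VARIANT_READINGS[ch][idx] if ch in VARIANT_READINGS else ch
            for ch in m_value
        )
    return results

if __name__ == "__main__":
    print(expand_variants("kū萎"))
```

If the character is not in the table, the string comes back unchanged for every region, which is why a missing character has to be added to the data first.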
Hi Frank,
What would be the correct format for Korean terms with variant hanja? Also, could you consider adding synonyms, etc. to Korean entry creation templates to match Chinese? --Anatoli T.(обсудить/вклад)04:59, 9 September 2014 (UTC)Reply
-- For that matter, 狸#Japanese doesn't show the expected cats either. Maybe it's just this particular character being parsed funny?
Hi. ja-kanjitab uses the "read as" for all grade 1-6, jouyou and jinmeiyou kanjis, as well as a proportion of hyougai kanjis (厭昌之芽昌浩智晃淳敦聡晃旭亮糊桂隘阿唖撫鼠阿耘迂寅已伊餡姦闊..., see Module:ja-kanjitab). I have added 狸 there. Not sure what the source of the exempted hyougai kanjis is... Wyang (talk) 02:34, 8 October 2014 (UTC)Reply
輕ㄑ者
Correct me if I'm wrong, but I believe the second character here is the ditto mark, indicating that the character from the line above should be copied. As such, reading your link source, the term would presumably be fully spelled out as 輕哨者.
Eirikr is right, it is the iteration mark 〱. Thus the translation of poppyzon given in the book is 輕哨者, meaning "one who whistles lightly". Wyang (talk) 22:06, 9 October 2014 (UTC)Reply
Hmm, I'm not familiar with that meaning. But of course this is a dialectal term - prevalent in the south I guess. Which Chinese term are you referring to? ---> Tooironic (talk) 06:43, 13 October 2014 (UTC)Reply
It's a northern Chinese colloquialism. 挫 literally means "short in stature". From what I see in google:挫男, most of the results in page 1 refer to men who are ugly/behave awkwardly and are therefore unable to attract girls. Wyang (talk) 06:49, 13 October 2014 (UTC)Reply
tempête
I know the old Derived terms formatting was outdated, but at least you could view the list in alphabetical order, now it seems to be all randomised, what's up with that? ---> Tooironic (talk) 01:17, 24 October 2014 (UTC)Reply
@Tooironic They are positioned in the same order but apparently the template {{der3}} changes the order automatically. An alternative is to use {{der-top}} and {{der-bottom}} around the list.
I don't understand why you made changes to this entry. The current standard at Wiktionary is to indicate part of speech for all entries. I don't think you can just make up your own headings like "Definitions", etc. Please explain. ---> Tooironic (talk) 12:27, 29 October 2014 (UTC)Reply
I don't understand why PoS headers for Chinese should go. I think they are very useful, especially considering many dictionaries - both online and paper-based - do not include them. Regardless, I think 研究 should be restored to match how all 词 entries are currently formatted until a consensus is reached. What do you think? ---> Tooironic (talk) 06:33, 30 October 2014 (UTC)Reply
@Tooironic Please read those links. Wyang's argument is that Chinese has no inflection, so PoS headers have little value. Also,
PoS is not inherent to Chinese words. You can't tell by their form, if they are nouns, verbs, etc. PoS can be determined in the complete phrases, not as stand-alone words.
PoS can change, depending on the usage. Most adjectives can be used as verbs or nouns, verbs can be used as nouns, etc.
Various dictionaries treat various PoS differently. I mentioned these discrepancies. We have to either list all PoS possible or limit to one.
It's too complicated for idioms and 字 words to determine and list all PoS, and adding PoS headers doesn't add any value. Please check 個/个 (gè). As you mentioned yourself, dictionaries do not always use PoS, only sometimes for clarification, like "protest" (n.)
Change the format of 研究 back if you want, until consensus is reached. Other languages without inflections could be reviewed as well - Vietnamese, Thai, Lao, Khmer, Burmese, etc. Chinese doesn't have to be "exceptional" in this regard. --Anatoli T.(обсудить/вклад)21:22, 30 October 2014 (UTC)Reply
I don't actually have much of an opinion at this point. Having read through the arguments, I can see why some editors here would like to list the translations as "Definitions", with some part of speech information included therein. Looking at an entry like 保险 I can see how that would work - currently there is a lot of white space, so it's not as user-friendly IMO. But I'm concerned that such a radical change (for 词 I mean, not 字 entries) would be a logistical nightmare, and hard for non-programmer editors like myself to deal with. ---> Tooironic (talk) 06:11, 31 October 2014 (UTC)Reply
thúi/thúy
I don't often agree with Fête's phonetic observations, but in this case they're absolutely right: Module:vi-pron had Thúy (female name) homophonous with thúi (dialectal word for "stinky"). "-uy" should be /wi/ and "-ui" should be /-uj/ or even /-ui/. (In other words, Thúy should be /tʰwɪ/, while thúi should be /tʰuj/.) I've restored their change. – Minh Nguyễn💬08:37, 13 November 2014 (UTC)Reply
Good. My actions were based on their making far-reaching unilateral changes without giving enough time for a response to their comment, rather than the substance of those changes (about which I know nothing). Chuck Entz (talk) 14:13, 13 November 2014 (UTC)Reply
vi-new
@Angr Yes, there are a few new cases at Module:my-translit/testcases. It seems "-teen" numerals are all in that boat, pls add them there, so that we have them in one place. Frank, sorry for giving you more work, no-one seems to be able to work with these. :) Lao module also needs attention. For Russian, I have some requests for fixing secondary stress but this can wait as you seem to be busy. --Anatoli T.(обсудить/вклад)03:33, 17 November 2014 (UTC)Reply
When I converted the 懙 article to use the unified "Chinese" header and Template:zh-pron, attempting to use the "oc=y" parameter for Old Chinese gave me a red-text "Lua error in Module:och-pron at line 23: attempt to perform arithmetic on global 'codepoint' (a nil value)" error. Could you check the article to make sure my syntax is correct and to spot any possible module issues? It's not urgent but I want to make sure I'm not doing anything wrong. Cheers! Bumm13 (talk) 05:40, 27 November 2014 (UTC)Reply
Looks like we have a similar issue when trying to use "mc=y" at the 不 article. The error in red text reads: "Lua error in Module:ltc-pron at line 514: attempt to concatenate global 'fanqieB' (a nil value)". Bumm13 (talk) 17:23, 6 December 2014 (UTC)Reply
After further editing, I haven't seen this error show up on any other article lacking a Middle Chinese infobox, just that one. Bumm13 (talk) 19:27, 6 December 2014 (UTC)Reply
I was curious what you think of Special:Contributions/118.6.149.25. I'm currently seeing just edits to JA entry い (i), and KO entries 이 (i) and 가 (ga). I'm quite interested in the OKO connection suggested, and whether you know anything more about that? Also, I remember reading from a couple different linguists that 가 (ga) was relatively recent, and was probably derived from JA, which this anon seems to be discounting. Their description of the JA particle in the KO entry is both misplaced and misleading, which raises doubts about their trustworthiness. TIA, ‑‑ Eiríkr Útlendi │ Tala við mig00:58, 9 December 2014 (UTC)Reply
I find the four edits to 이 (i) very puzzling - I'm not even sure the current etymology is what the IP intended to write. The current etymology there is incorrect, as the archaic Korean form of "ni" was non-existent. I would be curious to know where he/she got the "ni" etymology from. Korean "ga" is quite recent and it was used initially as an alternative emphatic particle to i, only after i/y-ending nouns. More discussion can be found here. Wyang (talk) 06:54, 9 December 2014 (UTC)Reply
Thank you for the link. Unfortunately:
You have either reached a page that is unavailable for viewing or reached your viewing limit for this book.
... but I think I might have an earlier edition of this same book at home (with the almost-kelly-green cover, also Cambridge University Press, same cover design as Shibatani's The Languages of Japan from the Cambridge Language Surveys series).
In light of your comments here, I'm reverting the anon's edits to 이 (i). (The edits to い (i) were on the mark.) I'll see if I have the book at home, and if I do, read up on 가 (ga) and make a judgment then. Or, feel free to beat me to it and revert/alter the anon's edits to 가 (ga) as you see fit. :)
Frank, are you able to add data for Min Nan from Min Nan Wiktionary? It would be great if readings could be automatically loaded like Cantonese and Hakka, if it's possible. Doesn't have to be now, I know you're busy. BTW, mn_note (and other notes) parameter seems to have stopped working. --Anatoli T.(обсудить/вклад)04:57, 12 December 2014 (UTC)Reply
No problem. I am getting the pronunciations elsewhere now and will format and upload them when it finishes. Notes at 生 seem to be working well. Wyang (talk) 06:10, 12 December 2014 (UTC)Reply
@Tooironic Carl, it seems you need to create fantizi before making jiantizi. Both entries are OK now, I've made fantizi entries, pls. check if they are what you wanted them to be. The problem can be replicated by making jiantizi without corresponding fantizi entries. Ideally even if fantizi doesn't exist, it won't give module errors. I'm sure Frank can fix it. --Anatoli T.(обсудить/вклад)11:15, 13 December 2014 (UTC)Reply
Oh I see. Is there a way to display the contents at both the simplified and traditional entries? At the moment the user has to click on a redirect, it's not very user-friendly. ---> Tooironic (talk) 13:58, 13 December 2014 (UTC)Reply
Oh. That's a shame. I was under the impression that we could. Now I feel that users of simplified Chinese are at a disadvantage having to click through to see the contents of most of the entries they look up. ---> Tooironic (talk) 16:03, 13 December 2014 (UTC)Reply
@Tooironic It's strange that you voted in support, although the topic had clear example entries. Note that no published dictionaries use both scripts equally; one or the other script is always the primary script and the other is provided once. Do you oppose the centralisation of entries under fantizi? --Anatoli T.(обсудить/вклад)21:35, 14 December 2014 (UTC)Reply
Like I said, I think I misunderstood the nature of the centralisation. I thought that entry content would be viewable on either script. I don't think it's very user-friendly to require any user - whether it be simp or trad form user - to have to click-through to see the content of an entry. ---> Tooironic (talk) 05:29, 15 December 2014 (UTC)Reply
I'm not sure why, but there's been a module error at the documentation page for module:zh since your edits to it the other day, which is showing up at the module page, too, via transclusion (it's the first time I've ever seen a module with a module error). The invocations with the error:
{{#invoke:zh|hzbox|光合作用}}
{{#invoke:zh|hzbox|光合作用|22}}
{{#invoke:zh|hzbox|葉綠體}}
{{#invoke:zh|hzbox|葉綠體|21}}
Since it's restricted to just one location, which is outside of mainspace, it's not exactly an emergency, but I thought I'd bring it to your attention, anyway. Chuck Entz (talk) 04:12, 15 December 2014 (UTC)Reply
I've lost all the "derived terms" on 气 but I don't find it easy to convert to traditional or, even better, reformat and use {{zh-l}} with both forms. There's no easy way to do that, is there? That's one concern (entries are out of sync) and a reason for the centralisation (to avoid this). Editors edit one version but not the other. --Anatoli T.(обсудить/вклад)06:19, 17 December 2014 (UTC)Reply
The best I can do is automatic simp->trad conversion of these Simplified lists of compounds. An example is {{zh-der}}, where surrounding the derived terms list with {{zh-der|...}} syntax and previewing it can give the formatted list. The results must be doublechecked, though. Wyang (talk) 06:45, 17 December 2014 (UTC)Reply
Hi Anatoli, I'm inclined to not include the Brand name if there are no other meanings. By the way, hong2 qi2 and wang4 kei4 are the correct pronunciations, but they are hardly used in real life. Most people pronounce it as hong2 ji1 and wang4 gei1/hung4 gei1/hung4 kei4. Wyang (talk) 02:30, 23 December 2014 (UTC)Reply
zh-forms documentation
Hi Wyang. I can see that you have made the template zh-forms and it looks really cool. The documentation is just a list of examples with no explanation. I can figure out what the parameters mean, but where does it get the translations in the table from and how are people supposed to edit them? I have looked at 人民 and it says that 民 means "the people; nationality; citizen" but the entry doesn't have a definition in the Chinese section, only in the character section, which is "people, subjects, citizens". Could you write a few words in the documentation about how it works and how people should use it? Kinamand (talk) 18:47, 30 December 2014 (UTC)Reply
Hi Frank, how would you use {{zh-forms}} in entries with commas or punctuation marks, e.g. 和尚打傘,無法無天? The template should ignore them, including in the |type= parameter. There are still quite a few entries with "Mandarin" header and {{cmn-idiom}}, etc.
I used the same method on 車同軌,書同文, although {{zh-new}} doesn't work well with commas (it needs to convert full-width "，" to "," in pinyin). Further trouble is with {{zh-usex}} when there are English words inside. I couldn't force spaces. I think you fixed it a while ago on a Min Nan mixed script usage example. --Anatoli T.(обсудить/вклад)11:02, 2 January 2015 (UTC)Reply
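The comma conversion mentioned above is a simple character replacement. A minimal Python sketch follows; the function name `normalize_commas` is hypothetical, and the choice to emit a comma followed by a space is an assumption based on ordinary Pinyin punctuation, not on what {{zh-new}} actually does.

```python
FULLWIDTH_COMMA = "\uff0c"  # the full-width comma used in Chinese text

def normalize_commas(text):
    """Replace every full-width comma with an ASCII comma plus a space."""
    return text.replace(FULLWIDTH_COMMA, ", ")

if __name__ == "__main__":
    print(normalize_commas("a" + FULLWIDTH_COMMA + "b"))
```

Text that contains no full-width comma passes through unchanged, so the step is safe to apply to every pinyin string.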
Module:zh-see and multiple possible traditional forms
Hi Wyang, is there any way to add multiple traditional character links to Template:zh-see? Occasionally, a simplified character will have more than one possible traditional equivalent. Thanks Bumm13 (talk) 05:06, 3 January 2015 (UTC)Reply
I feel that the template should only take one parameter. 偽 and 僞 are variants of each other, and one of them should be redirected too. In cases where multiple non-variant characters simplify to the same character, it should be separated by Etymology 1/2... like 干. Wyang (talk) 07:39, 3 January 2015 (UTC)Reply
I've reformatted the entry as the "dealer" sense is more common. I can also make the code try to extract another sense from entries, but I'm not sure that will be useful and not too confusing. Wyang (talk) 23:16, 5 January 2015 (UTC)Reply
No Chinese hanzi box is supposed to be used in the translingual section. The error message is due to multiple hanzi boxes being used on page, including one in the translingual section. In the simplified entry it should be
Why should simplified entries only use the zh-see template and not have the whole definition? Since simplified characters are used far more often than traditional, it seems odd to me. Has there been a discussion about this? I would like to know the arguments behind that decision. Kinamand (talk) 07:19, 6 January 2015 (UTC)Reply
There was extensive discussion. As I understand it, it's easier technically to convert from traditional to simplified than in the other direction, so the full information is the traditional entry. There's nothing political about it: Wyang is from the Mainland and grew up with the simplified script, so he would have done it otherwise if he could have. Chuck Entz (talk) 07:55, 6 January 2015 (UTC)Reply
Not just me, other editors have started to change entries to the new format a while ago. There is no point for a vote when no Chinese-language editor opposes the proposal. The vote is a means for a bunch of utter standers-by to dictate what chores others should do. Wyang (talk) 12:22, 6 January 2015 (UTC)Reply
The problem is that you forget to make documentation of new templates for example of zh-see and that makes it very difficult for others to make contributions. Kinamand (talk) 13:28, 6 January 2015 (UTC)Reply
I think you are confused by how the bot works (diff and diff) - the previous bot edits on those pages did not remove {{also}} templates. Anyway there needs to be better automatic handling of such correspondence sets, since we are possibly looking at thousands of affected entries. Comprehensive variant lists such as this may be used to compile lists of variant forms, which are then maintained automatically by form-templates. Wyang (talk) 06:17, 7 January 2015 (UTC)Reply
I agree about the auto handling but to illustrate what I'm referring to: diff and diff. It processed the (simplified) Chinese entry, then wiped the {{also}} matter above. Even when there is a Japanese entry on the page: diff. Hongthay (talk) 11:13, 7 January 2015 (UTC)Reply
I was trying to see if I could make it work. :) There is still one unfixed testcase, which either results from error in the en.wp article or an exception to DerekWinters' rules. More testcases are needed. Wyang (talk) 02:09, 9 January 2015 (UTC)Reply
I changed the list of special consonants to remove 'ṇ'. When I first made the list, I was just going off of my own intuition. Now it seems actually that I had made a mistake with 'ṇ'. DerekWinters (talk) 11:20, 9 January 2015 (UTC)Reply
Great. What is the rule concerning anusvara as in aṁgrez? Should preceding anusvara be treated as vowel-like in vowel dropping? Wyang (talk) 11:26, 9 January 2015 (UTC)Reply
ṁ in front of a velar (k, kh, g, gh) is ṅ. In front of a palatal letter (c, ch, j, jh) it is ñ. In front of a retroflex (ṭ, ṭh, ḍ, ḍh) it is ṇ. In front of a dental (t, th, d, dh) and all remaining consonants (y, r, l, v, ḷ, ś, ṣ, s, h) it is simply n.
The case of the anusvara is rather strange. Originally, it only indicated a word-final 'm' or a nasalization. Later, there was an orthography reform in which the anusvara took the place of all nasals in cluster-initial position. Thus, 'k' + 'i' + anusvara + 'g' = 'king'. So even though it is a diacritic, it took the place of consonants and thus should be treated as such. DerekWinters (talk) 11:39, 9 January 2015 (UTC)Reply
Great. The rule should actually be XTaCV = XTaCV. Also, I must tell you that the rules I gave were only a few. There are sure to be many more.
About the other testcases. व्यवच्छेद (vyavaccheda without any dropped vowels) would get split into vyav|cched. स्वत्वहरण (svatvaharaṇa without any dropped vowels) would get split into svat|va|ha|raṇ. संगमरमर is actually an ambiguous case, so I'll take care of that one. DerekWinters (talk) 12:05, 9 January 2015 (UTC)Reply
What would be the best way to automate these? The code currently has two vowel dropping rules: 1) word-final -CSa is reduced to -CS. 2) the sequence 'VCaCV' is reduced to 'VCCV', applying from right to left. How should the code be modified to account for those reductions or non-reductions? Wyang (talk) 12:19, 9 January 2015 (UTC)Reply
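The second rule described above (medial 'VCaCV' reduced to 'VCCV', applied from right to left) can be sketched as follows. This is a simplified Python illustration only: it works on a pre-split list of transliteration tokens, the vowel inventory and the name `drop_medial_a` are assumptions, and it deliberately ignores the word-final rule and every exception discussed in this thread. It is not the code of the actual transliteration module.

```python
# Assumed vowel inventory of the romanisation; every other token
# is treated as a consonant.
VOWELS = {"a", "ā", "i", "ī", "u", "ū", "e", "ai", "o", "au"}

def drop_medial_a(tokens):
    """Apply VCaCV -> VCCV from right to left on a list of tokens."""
    out = list(tokens)
    i = len(out) - 3  # rightmost index that still has two tokens after it
    while i >= 2:
        if (out[i] == "a"
                and out[i - 2] in VOWELS and out[i - 1] not in VOWELS
                and out[i + 1] not in VOWELS and out[i + 2] in VOWELS):
            del out[i]  # drop the inherent 'a' between the two consonants
        i -= 1
    return out

if __name__ == "__main__":
    # k-a-r-a-n-ā loses its medial 'a'.
    print("".join(drop_medial_a(["k", "a", "r", "a", "n", "ā"])))
```

Working on tokens rather than raw strings keeps aspirates such as "kh" and long vowels such as "ā" as single units, which avoids miscounting consonants and vowels.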
Forgive me, I completely missed it. Ok. Actually, a word-final -CSa should not lose its 'a', however a word-final -CRa should become a -CR.
So, ignore vyavacched. Svatvaharaṇ has an underlying CSaCSaCaCaCa structure, it being a compound of 2 other structures: CSaCSa (which would lose no vowels) and CaCaCa (which would lose its final 'a').
With sangamarmar, and all the other aṁgQQQ forms, I have no clue what to do. I'm getting conflicting pronunciations. I'm consulting online dictionaries, myself, and my cousin, and sometimes we all agree, and sometimes not at all. Sometimes I feel as if I could pronounce it both ways, with and without it. See, disregarding those, everything else is working, so suffice it to say we can easily hardcode them.
With antarrāṣṭrīya and bhārtīya I realized that we need to have a few special rules regarding 'y'. A word final īya, eya, and aiya (the single vowel 'ai' (ै)) maintain their final 'a's. In word-medial position at the end of a syllable, normally the 'a' is dropped (latakaro -> latkaro). However, layakaro would not become laykaro, it would instead maintain its 'a'. DerekWinters (talk) 12:46, 10 January 2015 (UTC)Reply
Sorry, I was supposed to say the first rule was "word-final -a is dropped unless in -CSa". I guess much of the variation comes from compounding. For the moment I only added another non-dropping rule for '-ya', and I'm not sure how to automate the rest. Please tell me if there are other things that you would like me to add or modify. Wyang (talk) 13:58, 10 January 2015 (UTC)Reply
No problem, I know it's a difficult task. You've done ridiculously well so far. Let's see, could you perhaps modify the syncope thing to work the medial y non-dropping rule? Also, could you add the bit where 'ṁ' (anusvara) becomes the correct nasal preceding another consonant as I mentioned above? I think everything else is good. We'll just have to hardcode the aṁg words, perhaps with the two alternate pronunciations. DerekWinters (talk) 14:17, 10 January 2015 (UTC)Reply
Also, for the anusvara. If there exists a word-final -aṁ, it becomes -am. If it is word final on any other vowel it nasalizes it to ā̃, ẽ, ĩ, ī̃, ũ, ū̃, etc. DerekWinters (talk) 14:20, 10 January 2015 (UTC)Reply
I've made the following changes: 1) made the 'VyaCV' sequence not become 'VyCV'; 2) added anusvara assimilations (consonant and word-final vowels); 3) added a "+" marker for compound boundaries, such that medial vowel dropping cannot apply across "+", e.g. रंगपटल (raṅgapṭal) would be input as "रंग+पटल" (raṅgpaṭal). Please check the testcases, thanks! Wyang (talk) 01:00, 11 January 2015 (UTC)Reply
You are an absolute genius. Thank you. I think this is ready for application. This is ready for Hindi. I'm hesitant to make it usable for any other language at the moment. I'll make some more testcases for Marathi and then give it a go. DerekWinters (talk) 11:59, 11 January 2015 (UTC)Reply
I'm joining in the praise. Excellent job again, Frank! It's not the hardest module he has done, though. Korean, Burmese, etc. are much more complicated. Sorry for not helping much in the last few days. @DerekWinters re: Bengali, Oriya, Gujarati (also Amharic/Tigrinya) modules: the logic for schwa-dropping is the same or almost the same as Hindi but the code is too complicated for me to simply transfer it to other modules. Bengali would be a higher priority after Hindi (it's an official language of a very populous country and we have lots of entries) but we need to make the basic module first. --Anatoli T.(обсудить/вклад)13:04, 11 January 2015 (UTC)Reply
Super sorry, but I forgot that for the labials (p, ph, b, bh, m) the anusvara becomes a 'm'. I already took care of it though, just put it here for future notice. Also, let's make this module official for Hindi and Marathi as of now @Atitarev. I'll see if any other languages can come under it. Nepali, Newari, Sanskrit, and the Prakrits cannot because they follow different rules under devanagari. DerekWinters (talk) 14:10, 12 January 2015 (UTC)Reply
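Putting together the anusvara rules stated in this thread (homorganic nasal before a consonant, including the labial case; word-final -aṁ becoming -am; nasalization of any other word-final vowel), a rough Python sketch could look like this. It works on a pre-split token list, and the names `ANUSVARA`, `HOMORGANIC` and `resolve_anusvara` are hypothetical. It assumes the anusvara is either word-final or followed by a consonant, and it is not the code of the actual module.

```python
ANUSVARA = "ṁ"
COMBINING_TILDE = "\u0303"  # added to a vowel to mark nasalization

# Nasal that the anusvara turns into before each class of consonant.
HOMORGANIC = {}
for nasal, consonants in {
    "ṅ": ("k", "kh", "g", "gh"),       # velars
    "ñ": ("c", "ch", "j", "jh"),       # palatals
    "ṇ": ("ṭ", "ṭh", "ḍ", "ḍh"),       # retroflexes
    "m": ("p", "ph", "b", "bh", "m"),  # labials
}.items():
    for consonant in consonants:
        HOMORGANIC[consonant] = nasal

def resolve_anusvara(tokens):
    """Replace each anusvara token according to the rules in the thread."""
    out = []
    for i, tok in enumerate(tokens):
        if tok != ANUSVARA:
            out.append(tok)
        elif i + 1 < len(tokens):
            # Before dentals and all remaining consonants it is plain 'n'.
            out.append(HOMORGANIC.get(tokens[i + 1], "n"))
        elif out and out[-1] == "a":
            out.append("m")  # word-final -aṁ becomes -am
        elif out:
            out[-1] += COMBINING_TILDE  # nasalize any other final vowel
        else:
            out.append(tok)  # lone anusvara: leave untouched
    return out

if __name__ == "__main__":
    print("".join(resolve_anusvara(["a", ANUSVARA, "g", "r", "e", "z"])))
```

Keeping the consonant classes in one table means a correction such as the labial rule is a one-line change.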
I'm certain that it becomes an 'm' in front of an 'm'. Truthfully I've never seen it before an 'ṅ'. I tried to see what it would be like and I wasn't able to make any sense out of anusvara + 'ṅ'. I think it's safe to assume we'll never see that combination. DerekWinters (talk) 08:08, 13 January 2015 (UTC)Reply
I have already added the module to Module:languages/data2 for Hindi, Marathi and Nepali (non-mandatory, i.e. overrideable with manual translit). Nepali could be taken out if it doesn't work (I judged by examples in my Nepali phrasebook and assumed it's the same as Hindi; Marathi contributors, on the other hand, provide all vowels, ignoring schwa dropping, for some reason). For languages which use multiple scripts, script detection should be used (as for e.g. Mongolian). @DerekWinters How do you think nasal diphthongs should be transliterated, e.g. हैं (ha͠i) ("h͠ai" is used in the entry)? Also, I think there should be an apostrophe between vowels, e.g. डाउनलोड (ḍāunloḍ) should be "ḍā'unloḍ". What do you think? --Anatoli T.(обсудить/вклад)21:40, 12 January 2015 (UTC)Reply
Please do not use it for Nepali. Nepali is supposed to work strictly like Sanskrit, with each letter having the 'a' unless a virama is used. Although I guess I'm not 100% sure. Recent examples, including online dictionaries, seem to use Devanagari the same way as Marathi and Hindi, but I can't tell if it's just laziness, a new trend, or something else altogether. Truly, I'm uncertain and I'll try to figure it out. For nasalizations of diphthongs, both vowels should show the tilde, because in Hindi/Marathi it's treated as one vowel (h͠ai). The apostrophe makes it seem like there is a break between the vowels, almost like a glottal stop. I know Google Translate employs the apostrophe, but I personally just don't like it. If you feel it's necessary, then that's fine. DerekWinters (talk) 08:08, 13 January 2015 (UTC)
I have changed Nepali to use the Sanskrit module for now (it's definitely better than nothing). Added a new test case for हैं (ha͠i) to use "h͠ai", if it's OK with you guys. An apostrophe would be essential to separate the diphthongs "ai" (ऐ) and "au" (औ) (or their diacritics) from consecutive "a + i" and "a + u", if they occur. User:Dijan had some questions on my talk page: User_talk:Atitarev#hi-translit.
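To make the apostrophe proposal concrete, here is a rough Python sketch, offered only as an illustration of the idea and not as what the module does. It assumes the word has already been split into transliterated units, so that a true diphthong arrives as the single unit "ai" or "au"; the function name `join_units` and the vowel inventory `VOWEL_UNITS` are hypothetical.

```python
import unicodedata

# Assumed vowel inventory for this sketch; diphthongs are single units.
VOWEL_UNITS = {
    unicodedata.normalize("NFC", v)
    for v in ("a", "ā", "i", "ī", "u", "ū", "e", "ai", "o", "au")
}


def join_units(units):
    """Join transliterated units, putting an apostrophe between two
    adjacent vowel units so that "a" + "i" stays distinct from "ai"."""
    pieces = []
    previous_was_vowel = False
    for unit in units:
        # Normalise so precomposed and decomposed letters compare equal.
        unit = unicodedata.normalize("NFC", unit)
        is_vowel = unit in VOWEL_UNITS
        if is_vowel and previous_was_vowel:
            pieces.append("'")
        pieces.append(unit)
        previous_was_vowel = is_vowel
    return "".join(pieces)


print(join_units(["h", "ai"]))  # diphthong as one unit: hai
print(join_units(["a", "i"]))   # two separate vowels: a'i
```

With this approach the apostrophe appears only where two separate vowel letters meet, which is the case that would otherwise be ambiguous.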
I may have gone a little overboard with the extra letters not used in Hindi, but at the time I was envisioning the Pahari languages employing the module as well. It turns out that those languages use some other rules, but there isn't enough literature on them. However, most of the extra letters get used at least once or twice in Hindi.
Thanks, I think he has done a very good job at the i entry and there is nothing I'd like to change there. My comment is that the content is presented in a very confusing way at i due to Wiktionary's entry formatting. I also had a look at some of the other changes of his; they seem okay to me, although some templates for Korean syllables should have been simplified with Lua. Wyang (talk) 22:36, 14 January 2015 (UTC)
Hi Frank, is there a way to get the pinyin in the example sentence to display as Léi Fēng rather than LéiFēng? I asked Anatoli but he wasn't sure. Thanks. ---> Tooironic (talk) 02:49, 18 January 2015 (UTC)
I hope you can find time to have a look at 正山小種, 立山小種, and 拉普山小種. The spelling alternation between 立 and 拉普 kinda makes sense phonetically, but I'm baffled how, why, and when 正 came into the picture, and how it is that the pronunciation is still "lap". I added a note to that effect in the 正山小種#Japanese entry's Etymology section, but that note might need changing. TIA, ‑‑ Eiríkr Útlendi │ Tala við mig 18:51, 19 January 2015 (UTC)
This really is an area of mystery. From what I found on Google Books and in Chinese sources: the name "Souchong" first appeared in the late 18th century, and it's quite certain that it came from Chinese 小種/小种 (xiǎozhǒng, “a small (tea) variety; small sort”). The name "Lapsang" is a modifier of "souchong" that started to appear around the 1830s-1840s. I haven't found a definitive etymology of "lapsang" (to me it sounds like 臘腸, lol), and here are the hypotheses I found:
From Cantonese "立山" (lap saan). 立山 seems to be a mountain in the Wuyi Mountains, where lapsang souchong is found.
From Min Nan "內山" ("inner mountain"), to distinguish it from the "outer mountain small tea varieties".
From Cantonese "爉生酥種" ("the smoked, fresh and fragile variety"). Unlikely.
Simply an invented commercial name.
Uncertain. :)
拉普山 is a later Mandarin rendering of English "lapsang", a phonetic transcription adopted because the etymology is obscure. 正山 (the "lineal mountain") or "內山" (the "inner mountain") refers to the tea variety produced in Tongmuguan, Xingcun Village, Chong'an County, Fujian (福建崇安县星村乡桐木关) and is differentiated from "外山" (the "outer mountains"), which are "non-lineal varieties" ("The Chinese Tea Bible" (中国茶经)). Wyang (talk) 05:33, 20 January 2015 (UTC)
zh-pron
Hi Wyang. The template zh-pron sometimes gives a play button which makes it possible to hear the word. The documentation of the template doesn't say anything about how it works, but it works on 很 and not on 好. The sound file for 好 is here. Why does the template not find it? How can I fix it? Can you update the documentation with info about it? Kinamand (talk) 15:09, 21 January 2015 (UTC)
I have now found and read the documentation and made this edit but I don't understand why it did not work in the first place with ma=y. The filename is zh-PINYIN.ogg so it should work according to the documentation. Kinamand (talk) 11:04, 25 January 2015 (UTC)
I now see that you have edited 好 since I first asked, and that you fixed an error in the use of zh-pron. I have therefore undone my last edit of 好. Kinamand (talk) 11:17, 25 January 2015 (UTC)
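For anyone checking a similar case later, the naming pattern mentioned above (zh-PINYIN.ogg) can be sketched like this. It is a hypothetical Python illustration, not the template's own logic: the function names and the `existing_files` argument are made up for the example, and whether PINYIN carries tone marks or particular capitalisation is not stated in this thread, so the sketch simply passes the string through unchanged.

```python
def expected_audio_filename(pinyin: str) -> str:
    """Build the file name following the zh-PINYIN.ogg pattern.

    The pinyin string is used as given; its exact form (tone marks,
    capitalisation) is an open assumption in this sketch.
    """
    return f"zh-{pinyin}.ogg"


def has_default_audio(pinyin: str, existing_files: set) -> bool:
    """Check whether a file with the expected name is in a known set."""
    return expected_audio_filename(pinyin) in existing_files


# Hypothetical file list, only for illustration.
files = {"zh-hen.ogg"}
print(has_default_audio("hen", files))
print(has_default_audio("hao", files))
```

If the expected name is not found, a manual file name would be needed, which matches the kind of fix discussed in this thread.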