rfm
Latest comment: 13 years ago, 4 comments, 2 people in discussion
I suggest moving this template to Template:Hani-forms, and keeping the old name indefinitely as a redirect.
The code "zh" is ambiguous and unwanted per consensus for a number of reasons. In particular, this template begins with "zh" (which means "Chinese", or "Mandarin", depending on how you look at it), but it is also used in other languages written in Han script, whose code is "Hani". This template serves the purpose of showing varieties of Han script, so a name beginning with that code seems to be a very natural choice.
If there are many good template names to be chosen, then you can consider my proposal of "Hani-forms" as completely arbitrary, but a proposal nonetheless, that I believe to be better than the current system.
However, I do think that "Hani-forms" is even better than "zhx-forms". The template is used with Translingual entries, which are neither Sinitic nor of any other family but are nonetheless written with Han characters. --Daniel 12:28, 8 June 2011 (UTC)
I originally put "This category includes any Chinese word containing two consecutive identical characters." in the description of Category:Chinese reduplications to show that this is only a category for all words with reduplicated characters. I tightened the criteria a bit to exclude sole transcriptions and reduplications crossing component boundaries, but it may be hard to achieve the linguistic sense of reduplication automatically. Wyang (talk) 01:57, 26 October 2016 (UTC)
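The original category criterion quoted above (two consecutive identical characters) is easy to state in code. The following is a minimal Python sketch of that check only. The function name is hypothetical, and it does not implement the tightened criteria (excluding transcriptions or reduplications that cross component boundaries), which would need entry-level data.

```python
def has_adjacent_duplicate(word: str) -> bool:
    """Return True if any two consecutive characters in `word` are identical.

    Python iterates over code points, so characters outside the BMP
    are compared as whole characters.
    """
    return any(a == b for a, b in zip(word, word[1:]))
```

For instance, `has_adjacent_duplicate("媽媽")` is true, while a word with no repeated neighbour, such as `"中文"`, is not matched.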
Latest comment: 8 months ago, 24 comments, 10 people in discussion
@Wyang, @Zcreator alt, @Justinrleung, @Suzukaze-c Hi. After the most recent edit in Module:zh-forms, the zh-forms box isn't displaying the proper traditional and simplified forms for 溍 and 溍 (both encoded under the same code point) based on the language tag. Also, I think it would be preferable to add in such characters manually rather than letting it do so automatically. If you look at revision 49664286, some characters added to the unified_char list such as 芔 (huì), 郮 (zhōu) show no significant difference between traditional and simplified forms in the Unicode charts. I don't think it is necessary to split between traditional and simplified forms for characters that show only minor cosmetic differences (mostly in the stroke direction) such as 今/今, 氐/氐, 令/令, 艾/艾, 叟/叟, 丰/丰, 犮/犮, 壬/壬, 呈/呈. Instead, these differences should be noted in the translingual section (either as alternative forms or in their respective ids). KevinUp (talk) 13:08, 8 June 2018 (UTC)
I think it would be much better to add something such as "zh-forms|s=t" to characters such as 珊/珊 (U+73CA), 琤/琤 (U+7424), 猺/猺 (U+733A), 瘟/瘟 (U+761F), 莒/莒 (U+8392), as these characters are special exceptions that have been unified, in contrast to other derived characters of 冊 (U+518A)/册 (U+518C), 爭 (U+722D)/争 (U+4E89), 䍃 (U+4343), 𥁕 (U+25055)/昷 (U+6637), 呂 (U+5442)/吕 (U+5415), such as 姍 (U+59CD)/姗 (U+59D7), 睜 (U+775C)/睁 (U+7741), 搖 (U+6416)/摇 (U+6447), 溫 (U+6EAB)/温 (U+6E29) and 宮 (U+5BAE)/宫 (U+5BAB), that have been disunified. It should be noted that Han unification is slightly inconsistent, with frequently used characters split into separate code points whereas rarely used characters are unified. Hence, I would suggest adding "zh-forms|s=t" to such anomalies when encountered instead of having a unified_char list that is prone to errors when not properly checked. KevinUp (talk) 02:12, 6 July 2018 (UTC)
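The unified/disunified distinction in the comment above can be checked directly from code points. This is a small Python sketch, not part of Module:zh-forms (which is written in Lua). The helper names and the two lists are hypothetical, and the code points are the ones cited in the comment, written as escapes so that fonts cannot alter them.

```python
def code_point(ch: str) -> str:
    """Format a single character's code point in U+XXXX notation."""
    return f"U+{ord(ch):04X}"


def is_disunified(form_a: str, form_b: str) -> bool:
    """True if the two forms are encoded at separate code points."""
    return ord(form_a) != ord(form_b)


# Traditional/simplified pairs with separate code points (from the comment).
DISUNIFIED_PAIRS = [
    ("\u518A", "\u518C"),  # U+518A / U+518C
    ("\u722D", "\u4E89"),  # U+722D / U+4E89
    ("\u5442", "\u5415"),  # U+5442 / U+5415
]

# Characters whose two glyph forms share a single code point (from the comment).
UNIFIED_EXCEPTIONS = ["\u73CA", "\u7424", "\u8392"]
```

A unified exception cannot be told apart by code point alone, which is why it needs either a manual flag such as "s=t" or an entry in a maintained list.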
@KevinUp: 溍 looks fine on my computer. What does it look like on your system? About unified_char, I do agree that the list needs improvement, but I like the idea that this is done automatically. We can always update the list when needed. There are still some problems to consider: (1) not all systems have the right fonts; (2) some simplified glyph shapes are acceptable (or even standard in Hong Kong) in traditional Chinese; (3) how different is different? To me, 犮 and 犮 are different enough. — justin(r)leung{ (t...) | c=› } 02:30, 6 July 2018 (UTC)
I'm not sure if the problem still persists on your (KevinUp) computer, but I can see a trad-simp form difference on 溍, same as Justin above. Wyang (talk) 03:07, 6 July 2018 (UTC)
@Wyang: No, it's still not working for me. However, if I were to copy the code from your previous edit at 49663403 and apply it to the page for 傜, I would be able to distinguish between the two forms. Otherwise I'm only seeing the simplified forms in both boxes. KevinUp (talk) 04:40, 6 July 2018 (UTC)
@Wyang, @Justinrleung: I managed to get the characters to display correctly via this edit. Can you all check to see if the fonts are applied correctly on the devices that you are using? Thanks. KevinUp (talk) 07:56, 6 July 2018 (UTC)
@Justinrleung: (1) On my system I am able to distinguish between 溍 and 溍. Before this, in edit 49666022, I was still able to see the difference. But since edit 49666028, only the simplified form is shown. (2) Can you list a few more examples where the glyph shape in Hong Kong is different from the one used in Taiwan besides 户 (standard in Hong Kong/mainland China) vs 戶 (standard in Taiwan) and 昷 (standard in Hong Kong/mainland China) vs 𥁕 (standard in Taiwan)? So far I'm only aware of these two, as well as 𤏁/𤏁 (U+243C1) and 𤇍/𤇍 (U+241CD), which have different compositions in Hong Kong compared to Taiwan based on HKSCS 2016. In this case, adding usage notes for the respective characters would be more helpful. (3) I agree that 犮 and 犮 are different enough because there is an additional horizontal stroke for the form used in mainland China. Most of the characters that look different in mainland China due to Xin Zixing (新字形) and are encoded under the same code point in Unicode should not be considered as "simplified forms" as this would cause some confusion. Simplified characters should be defined as those that are found in 1956 《漢字簡化方案》, 1964 《簡化字總表》, 1988 《現代漢語通用字表》, 2013 《通用規范漢字表》 and 1956 《第一批异体字整理表》 (Revised 1986, 1988, 1993). Besides this, I am of the opinion that characters which have separate code points in Unicode such as 別 (U+5225) and 别 (U+522B), 內 (U+5167) and 内 (U+5185) or preferred forms that are encoded separately such as 玨 (preferred in Taiwan) and 珏 (preferred in mainland China) can be listed as being traditional/simplified in the zh-forms box. However, I don't think it is a good idea to consider Xin Zixing characters as being simplified. Some traditional characters in China are composed of simplified elements due to Xin Zixing such as 殺 (mainland China) vs 殺 (Taiwan) and 鷀 (mainland China) vs 鷀 (Taiwan).
In this case both mainland China and Taiwan character forms are encoded under the same code point but are composed of different forms and have different stroke counts. I think having the unified_char list is great but it needs to be properly checked and compared with the Unicode charts to ensure that the characters are actually different. To me, characters that were unified inconsistently (such as the anomalies given in the second top level of this discussion) should be added to the list while those that are unified consistently across their sets of derived characters such as 犮/犮 should not be added to the list. Consider 任/任, listed as being traditional/simplified due to the difference in composition of 壬/壬. By analogy the derived characters of 任 such as 凭/凭 should be added as well. But if someone were to add in derived characters of 任 en masse, some anomalies are bound to occur such as in 凭, which is both a traditional character found in Shuowen Jiezi and the simplified form of 憑. Hence I don't think it is a good idea to define Xin Zixing characters that are unified consistently as being simplified. By the way, I'm using Source Han Sans fonts. It covers the differences in glyph shapes between different regions and supports all characters found in HKSCS 2016. KevinUp (talk) 04:40, 6 July 2018 (UTC)
We should determine a reasonable limit to this, or otherwise we might as well show zh_CN-Hans, zh_CN-Hant, zh_HK-Hant, and zh_TW-Hant at all times. —Suzukaze-c◇◇ 05:42, 6 July 2018 (UTC)
@Suzukaze-c: I think that one way to overcome this issue is to upload SVG files of open-source Unicode typefaces such as Source Han Sans, Source Han Serif and Google Noto Sans/Serif CJK to Wikimedia Commons so that the different character forms can be displayed independently of the fonts used by the user's computer system. Another possibility is to put a special note to specify that the character may appear differently due to Xin Zixing character forms rather than splitting the box into traditional and simplified forms. Note that some 新字形 and 舊字形 (jiùzìxíng) images have already been uploaded to Wikimedia Commons, and these can be found on the 新字形 page on Chinese Wikipedia. KevinUp (talk) 07:56, 6 July 2018 (UTC)
I am flattered to be pinged to this discussion, but I'm afraid I can contribute very little to these kinds of technical issues. I defer to the experts here. ---> Tooironic (talk) 07:08, 6 July 2018 (UTC)
@KevinUp, @Justinrleung, @Suzukaze-c, @Wyang The problem with the output is that this template is outputting lang="zh", which only contains a language code. To get correct display between simplified and traditional Chinese, you need to use ISO 15924 script codes in the lang attribute (i.e., lang="zh-Hans" and lang="zh-Hant") because this information is what Web browsers use for correct glyph selection. For a proof of concept, see Template:CJKV-forms, which had the problem this template currently has; I just fixed it.
If it's necessary to display distinct glyphs for Hong Kong and Taiwan traditional Chinese, you'll have to get even more specific and use the language codes for Cantonese (yue) and Mandarin (cmn) (i.e., lang="cmn-Hant" and lang="yue-Hant"). Or so I assume; I've never tried to display distinct glyphs in this case.
You can also use region codes: lang="zh-Hant-HK" and lang="zh-Hant-TW". However, I dislike this approach because it ties a language to a political designation.
For the first, simpler case, it looks like there are two places in the code where lang="zh" attributes are output and need to be fixed. In each, lang="zh-Hant" needs to be output when the script arguments are 'trad', lang="zh-Hans" when they are 'simp', and lang="zh-Hani" otherwise.
I can attempt to fix this template myself, but I would prefer that someone else try since I don't feel particularly comfortable modifying live code in a programming language I don't know. (I have strong abilities in several programming languages, but Lua isn't one of them.) If no one tries, I'll probably make an attempt anyway.
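The mapping described above (lang="zh-Hant" for 'trad', lang="zh-Hans" for 'simp', lang="zh-Hani" otherwise) can be sketched as follows. This is an illustration in Python rather than the module's Lua, and it makes no claim about how Module:zh-forms is actually structured; the function names are hypothetical, and only the 'trad'/'simp' argument values and the three tag values come from the comment.

```python
import html


def lang_attribute(script: str) -> str:
    """Map a script argument to the lang attribute value proposed above."""
    if script == "trad":
        return "zh-Hant"
    if script == "simp":
        return "zh-Hans"
    # Any other value: Han script without a traditional/simplified distinction.
    return "zh-Hani"


def tagged_span(text: str, script: str) -> str:
    """Wrap text in a span carrying the script-specific lang attribute."""
    return f'<span lang="{lang_attribute(script)}">{html.escape(text)}</span>'
```

With this, the same code point can be emitted twice, once as `tagged_span("溍", "trad")` and once as `tagged_span("溍", "simp")`, leaving glyph selection to the browser and the installed fonts.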
@Justinrleung: I also feel that region codes are a bad idea (as previously stated), but ISO 15924 script codes should be used. Users' browsers are already picking fonts since Wiktionary doesn't serve its own fonts. It's using a stylesheet to make educated guesses about what fonts are available on a user's system, but those fonts can't be predicted reliably and the guesses are more likely to be wrong for users on minority operating systems (e.g., Ubuntu (Linux)) such as myself. It therefore should be assumed that browsers will need this information until Wiktionary serves its own fonts.
As for that stylesheet, CSS has a :lang() selector specifically for dealing with this subject, but it doesn't work properly if script codes aren't specified. This is evidenced in said stylesheet, which is using classes in an attempt to work around a lack of script codes. For example, code like .Latn is brittle and breaks as soon as someone adds a script or region code; it should be :lang(zh-Latn).
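To show why :lang() depends on script codes being present, here is a rough Python sketch of the simple prefix rule: a selector value matches when the element's language tag equals it or begins with it followed by a hyphen, ignoring case. This is a simplification under that stated assumption (newer selector specifications also define wildcard matching, which is not modelled here), and the function name is hypothetical.

```python
def lang_matches(element_lang: str, selector_lang: str) -> bool:
    """Approximate :lang() matching using the simple prefix rule."""
    element = element_lang.lower()
    wanted = selector_lang.lower()
    return element == wanted or element.startswith(wanted + "-")
```

Under this rule `lang_matches("zh-Hant-TW", "zh-Hant")` holds, but `lang_matches("zh", "zh-Hant")` does not: an element tagged only lang="zh" can never be selected by :lang(zh-Hant), so the markup has to carry the script code first.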
AFAIK, allographic variant characters like 次/次, 草/草, 道/道 or 骨/骨 are regional differences rather than differences between traditional and simplified characters, not least because these characters aren't part of the Complete List of Simplified Characters. Therefore it's probably better to abandon that list. By the way, this list is far from complete. --SelfishSeahorse (talk) 21:31, 18 February 2020 (UTC)
Edit: Some examples of characters that look the same in mainland China and Hong Kong, but different in Taiwan:
Mainland China | Hong Kong | Taiwan
情 | 情 | 情
次 | 次 | 次
雇 | 雇 | 雇
And an example of a character that looks different in mainland China, Hong Kong and Taiwan:
Honestly we should probably display a "Taiwan" section at all times, like zh.wiktionary, instead of maintaining these huge lists. —Fish bowl (talk) 01:02, 7 April 2022 (UTC)
@Fish bowl: I think it would be nice, though would it look too clunky on the side? Another issue is that sometimes Taiwan or HK (and, on rare occasions, simplified) may have more than one acceptable/accepted variant (officially or otherwise); it’s hard to say for HK sometimes because there is much less standardization at the 詞 level afaik, given that there aren’t big official dictionaries for HK afaik. A third issue is that places like Macau don’t have a clear standard afaik; do we assume it’s traditional or following Hong Kong? BTW, the HK standard (according to 常用字字形表) should be 説夢話. — justin(r)leung{ (t...) | c=› } 13:26, 1 May 2022 (UTC)
Do you mean something like the Chinese Wiktionary template that shows variants and relatives?
Also what about adding a remote character composer renderer?
For example, ⿱𥫗旦 could become 笪 automatically. It could pull in an SVG renderer, or perhaps combining the components with a tag could render them as a single character? Kernel-chan (talk) 02:12, 2 May 2022 (UTC)
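The simplest form of the composition idea above is a lookup from an ideographic description sequence (IDS) to an already encoded character. The Python sketch below is only that: the table holds the single example from the comment, the names are hypothetical, and a real implementation would need a full IDS data source (and a renderer for sequences that have no encoded character).

```python
# Hypothetical lookup table; only the example from the discussion is included.
IDS_TO_CHARACTER = {
    "⿱𥫗旦": "笪",
}


def compose(ids: str) -> str:
    """Return the encoded character for an IDS, or the IDS unchanged if unknown."""
    return IDS_TO_CHARACTER.get(ids, ids)
```

Unknown sequences fall through unchanged, so a display layer could still decide to render them some other way.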
The problem is old, and since Unicode 4 it has in fact had a clean solution. The attempt to support variants in browsers using language tagging (either with deprecated Unicode language tag characters, or using rich-text tagging with HTML, XML, or even CSS) has been deprecated for years. The real solution (which works even in modern browsers and renderers, even in plain text) is to use variation sequences, i.e. to follow the unified ideograph with a variation selector. These sequences are standardized in the Ideographic Variation Database (IVD), an integral standard part of the Unicode Character Database (UCD). However, this template (and the associated module) does not use any such IVD sequence.
Note that the module would need to specify which "variation selector" to use for each form of each ideograph: the same "variation selector" used after different characters is not guaranteed to select the same form, and in fact Han ideographs may have MORE than just two forms ("simplified" and "traditional"). These variant forms may be encoded and added to the Unicode standard (in the IVD) at any time, long after the encoding of isolated ("unified" or "compatibility") ideographs and isolated variation selectors, so you need to use the normative data from the IVD. There, you'll find multiple variants for traditional forms, and multiple variants for the simplified form, depending on the language (Chinese, Japanese, Vietnamese, Korean) or on the relevant national standards.
It is the standardisation of the IVD that allowed Unicode and the ISO TC to affirm that there would no longer be any new additions to "compatibility ideographs" and that any such request for standardization will now be rejected. The two existing compatibility blocks in the BMP have been "frozen", except to fix a few missing characters that were forgotten in the relevant standards that were accepted and normatively referenced in past Unicode/ISO standard versions, due to past defects in the Han unification. All seems to be fixed now, and there are more quality assurance tools used by Unicode and the IRG to make sure that all variants are referenced in the IVD (all past compatibility ideographs are present in the IVD with their defined variation selector, along with the variation selector for the unified ideograph, so that canonical equivalence now works perfectly with Han characters). All "compatibility ideographs" are now deprecated (this does not concern the 12 characters from the "IBM 32" subset that are present in one compatibility block but are NOT "compatibility ideographs"; they are unified ideographs). Since this IVD standardization, all new additions to Han ideographs have only occurred in new blocks allocated exclusively for "unified ideographs", all of them mappable at any time in the IVD to assign their needed variants.
I therefore strongly suggest including support for the IVD (part of the standard UCD and integrated in the Unihan Database), and then generating variation sequences in this template instead of relying on language tagging, which was experimental and was removed from all modern browsers. Their text renderers are already capable of correctly displaying variation selectors, given quality fonts that have mappings for them. Legacy font mappings for compatibility ideographs are also starting to disappear: modern fonts are now removing these old mappings in favor of mappings for variation sequences!
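As a sketch of what "generating variation sequences" means mechanically: an ideographic variation sequence is a base ideograph followed by one code point from the Variation Selectors Supplement (U+E0100 to U+E01EF, i.e. VS17 to VS256). The Python below only builds and recognises such sequences. The function names are hypothetical, and which selector is registered for which ideograph must come from the IVD data itself; the sketch does not claim that any particular pairing is registered.

```python
IVS_FIRST = 0xE0100  # VARIATION SELECTOR-17
IVS_LAST = 0xE01EF   # VARIATION SELECTOR-256


def variation_sequence(base: str, selector_number: int) -> str:
    """Append VS<selector_number> (17..256) to a single base character."""
    code = IVS_FIRST + (selector_number - 17)
    if len(base) != 1 or not IVS_FIRST <= code <= IVS_LAST:
        raise ValueError("expected one base character and a selector in 17..256")
    return base + chr(code)


def is_ivs_selector(ch: str) -> bool:
    """True if ch lies in the Variation Selectors Supplement block."""
    return IVS_FIRST <= ord(ch) <= IVS_LAST
```

The result is plain text: two code points that a font with the relevant mapping renders as one glyph, and that fall back to the base character's default glyph otherwise.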
Latest comment: 5 years ago, 5 comments, 4 people in discussion
Is the |alt= parameter for written or spoken variants? 圖窮匕見 / 图穷匕见 (túqióngbǐxiàn) is unfortunately mixing the two.
I think it's better to move spoken variants to the "Alternative forms" section and format them with {{zh-l}} so that their pronunciation is visible. That's what I'm proposing for Japanese too.
I think of 'alt=' as written variants which are pronounced in the same way as the word being defined (like on the 停車 / 停车 (tíngchē) or 高雄 (Gāoxióng) pages). But here's another page where that rule is not being followed: 柴米油鹽醬醋茶 / 柴米油盐酱醋茶 (chái mǐ yóu yán jiàng cù chá) --Geographyinitiative (talk) 22:06, 25 November 2019 (UTC)
@Dine2016: I agree. I prefer to have |alt= reserved for variations in orthography only, i.e. they are all pronounced the same way as the main entry. Any other type of "alternative form" should either be treated as a synonym (if it's very different) or as an alternative form under the "alternative form" header. — justin(r)leung{ (t...) | c=› } 22:43, 25 November 2019 (UTC)
Latest comment: 4 years ago, 2 comments, 2 people in discussion
@Erutuon Could the IDS functions from Module:zh-sortkey be called so that these work properly? (and perhaps moved to a more general module? Module:Hani?)
Latest comment: 2 years ago, 4 comments, 3 people in discussion
Using this template, sometimes the box sits on the left, unchanged; sometimes it is aligned to the right. I can't see anything in the module code that would do this, and it's very annoying. Is it meant to be this way? Levi OP (talk) 16:49, 10 January 2022 (UTC)
@Fish bowl Nice find. I might just be bold and change this. If anyone has an issue with it, it can be reverted, but I can't see any reason that it would be like this. Levi OP (talk) 01:08, 15 April 2022 (UTC)
I think the reasoning for this is that above a certain length it takes up the whole row anyway, so it may as well be aligned left. I don’t really mind it, but it seems to be set too short, and it does make it inconsistent. Theknightwho (talk) 12:32, 1 May 2022 (UTC)
|ss= and 二簡 1977 vs. 1981
Latest comment: 2 years ago, 1 comment, 1 person in discussion
@Theknightwho, Kernel-chan You guys should probably figure out a way to adapt |ss= for whatever this difference is.
It is not at all critical. The problem may have to do with there being two templates on that line in the module. There are many instances of {{vern}} and {{taxlink}} in that module, but I don't recall ever seeing the two together.
The purpose of those templates is to count links to determine which organism names are most worth adding. I would like to count all uses of the name, from definitions, image captions, etymology sections, and the forms boxes. But, as I have to use the XML dumps to count the templates, I can't count links in the forms boxes. Arguably, they are of lesser importance than the links from the other items, but it may lead to failure to add organism-name entries that are of fundamental cultural importance in China and elsewhere in Asia. As far as I can tell, my template count finds only three links to Quercus serrata, whereas a search finds 18 uses. DCDuring (talk) 02:10, 7 May 2023 (UTC)
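The limitation described above follows from how dump-based counting works: it can only see template calls written literally in a page's wikitext. The Python sketch below counts such calls. It is not DCDuring's actual method; the function name is hypothetical, and it assumes the organism name is the first positional parameter of {{taxlink}} and {{vern}}.

```python
import re
from collections import Counter

# Matches "{{taxlink|Name" or "{{vern|Name" and captures the first parameter.
TEMPLATE_CALL = re.compile(r"\{\{\s*(?:taxlink|vern)\s*\|\s*([^|}]+)")


def count_organism_links(wikitext: str) -> Counter:
    """Count literal {{taxlink}} / {{vern}} calls per organism name."""
    counts = Counter()
    for name in TEMPLATE_CALL.findall(wikitext):
        counts[name.strip()] += 1
    return counts
```

A link that a module generates when the forms box is rendered never appears as literal text in the dump, so a count of this kind misses it, while a site search over rendered pages does not.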
{{{ss}}} needs simplified form suppression
Latest comment: 10 months ago, 2 comments, 2 people in discussion