This is the page for proposing changes to Wiktionary's category and label treatment practices, including creating, renaming, merging and splitting.
Use this page if you want to propose a non-trivial change to:
A category that is currently handled by the category tree (i.e. whose definition is handled by {{auto cat}}), particularly but not limited to cross-language topic and grammatical categories. Note that topic categories cover a particular semantic area, such as Category:Hindu deities, and their language-specific incarnations are prefixed with the code of the language in question (e.g. Category:en:Hindu deities), while grammatical categories cover a particular syntactic or morphological area, such as Category:Proper nouns by language, and their language-specific incarnations are prefixed with the name of the language in question (e.g. Category:English proper nouns). Categories limited to a particular language (which currently is only possible for grammatical categories) may be best discussed among the editors of that language community in question.
A label that is currently defined in the label data, particularly but not limited to cross-language labels. Labels restricted to a particular language (which are often names of varieties of that language) may be best discussed among the editors of that language community in question.
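The two naming schemes described above can be sketched in a few lines of Python. This is only an illustration of the convention, not how the category tree modules are implemented; the function names and the tiny code-to-name table are assumptions made for the example.

```python
# Hypothetical lookup table for the example; the real language data is far larger.
LANGUAGE_NAMES = {"en": "English", "is": "Icelandic"}


def topic_category(lang_code: str, topic: str) -> str:
    """Topic categories are prefixed with the language code."""
    return f"Category:{lang_code}:{topic}"


def grammatical_category(lang_code: str, label: str) -> str:
    """Grammatical categories are prefixed with the language name."""
    return f"Category:{LANGUAGE_NAMES[lang_code]} {label}"


print(topic_category("en", "Hindu deities"))       # Category:en:Hindu deities
print(grammatical_category("en", "proper nouns"))  # Category:English proper nouns
```

The only difference between the two schemes is the prefix: a code plus colon for topic categories, a language name plus space for grammatical ones.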
Archiving: Category and label treatment requests, once closed and (if applicable) acted upon, are archived on Wikipedia-style archive subpages. These can be found at Wiktionary:Category and label treatment requests/Archives and in the list below:
Category and label treatment requests: Archive index
Reviving the earlier discussion, I'm still bothered by the fact that we have two different categories for names. But the previous discussion also made it clear that it's not as easy as just merging them.
I think Category:en:Place names should probably be renamed to Category:en:Places, since it's really meant to contain terms for places. That is, since it's a topical/set-type category, the focus should be on the referent of the word, whereas part-of-speech categories like Category:English names focus on the word itself. A word is a name, and it refers to something bearing that name.
Category:en:Demonyms is a bit more problematic and I brought it up before, though I don't remember where. "Demonym", again, is a term focused on the word, not the referent. A word is a demonym. Perhaps this could be renamed to something else? Category:en:Peoples maybe?
"Category:en:Transliteration of personal names" could be renamed to "Category:English names transliterated from other languages", I suppose. What's the matter with the demonyms category? It contains demonyms, as expected. Would it be better titled "English demonyms", on the model of "English phrases"? - -sche(discuss)06:02, 10 November 2015 (UTC)Reply
@ExcarnateSojourner There being no opposition here, only support (albeit mostly old support), and no opposition or interest when I brought this up in the BP, let's revise whatever needs to be revised to put (at a minimum) all given names and surnames into subcategories of Category:Names by language, instead of some of them being in subcategories of Category:Names. The split is haphazard and arbitrary; I see the intention — put a name that was given within English in one top-level category and a name transliterating a foreign name in a different top-level category — but in practice that's not maintained, since e.g. Alexandra in the context of discussing ancient Greek is transliterating the Ancient Greek name, Sergei has been given to babies born in the Anglosphere (and to characters in English fiction), and we don't maintain such a split with place names. - -sche(discuss)16:01, 24 April 2023 (UTC)Reply
It making no sense to have Alexandra (in works about ancient Greece where it's romanizing a Greek name), Alexandra (in fiction about ancient Greece where it's a given name), Alexandra (as borne by British or American people today), Sonya, Vadim and Vladimir divided haphazardly into two different top-level categories, "Names" vs "Names by language", I'm now (attempting) editing the modules to consolidate them into "Names by language" subcategories. - -sche(discuss)14:37, 5 May 2023 (UTC)Reply
@ExcarnateSojourner @-sche I am going to take a stab at implementing this. Can you help with what the renames should be? I understand the separation between poscat categories and topic categories should be "lexical" vs. "semantic" but I sometimes have trouble putting this into practice. A tentative list based on what's already been proposed:
'DESTLANGCODE:SOURCELANG male given names' -> 'DESTLANG male given names transliterated from SOURCELANG'; same for 'female given names', 'surnames', etc. This doesn't work; these are not DESTLANG names but SOURCELANG names rendered into DESTLANG. So I propose 'DESTLANG renderings of SOURCELANG male given names' or similar. ("Transliteration" isn't quite right; sometimes these are transliterations, sometimes respellings, sometimes mere borrowings (cf. Italian Clinton).)
'LANGCODE:Foreign personal names' (a grouping category) -> 'LANG foreign personal names'
'LANGCODE:Named roads' -> 'LANGCODE:Names of roads' and remove from 'LANGCODE:Names'
'LANGCODE:Named prayers' -> 'LANGCODE:Names of prayers' and remove from 'LANGCODE:Names'
What about the following:
Subcategories of 'LANGCODE:Demonyms':
'LANGCODE:Armenian demonyms'?
'LANGCODE:Celestial inhabitants'?
'LANGCODE:Ufology' -> stays as a topic category.
'LANGCODE:Latvian demonyms'?
'LANGCODE:Nationalities'
'LANGCODE:Tribes'
'LANGCODE:Celtic tribes'
'LANGCODE:Germanic tribes'
'LANGCODE:Native American tribes'
See also 'LANGCODE:Mongolian tribes' under 'LANGCODE:Ethnonyms'.
Subcategories of 'LANGCODE:Ethnonyms':
'LANGCODE:Mongolian tribes' -> Goes wherever 'LANGCODE:Celtic tribes', 'LANGCODE:Germanic tribes' and 'LANGCODE:Native American tribes' go.
'LANGCODE:Place names' -> Delete and reclassify the terms under them using {{place}} so they end up in 'Places in FOO'.
'LANGCODE:Places' -> Leave as a topic category but remove 'LANGCODE:Names' as a parent?
Script-specific variants of 'LANGCODE:Letter names': 'LANGCODE:Arabic letter names', 'LANGCODE:Devanagari letter names', 'LANGCODE:Imperial Aramaic letter names', 'LANGCODE:Korean letter names', 'LANGCODE:Latin letter names'?
Subcategories of 'LANGCODE:Nicknames':
'LANGCODE:Nicknames' itself? This is a grouping category.
'LANGCODE:Nicknames of individuals'?
'LANGCODE:City nicknames'?
'LANGCODE:Country nicknames'?
'LANGCODE:Racist names for countries' -> Terminate with extreme prejudice, see WT:BP.
'LANGCODE:Sports nicknames' -> either 'LANGCODE:Sports team nicknames', 'LANGCODE:Nicknames of sports teams', 'LANG sports team nicknames', 'LANG nicknames of sports teams'
See also 'LANGCODE:Couple nicknames' above.
'LANGCODE:Onomastics' -> stays as topic category but should not have 'LANGCODE:Names' as one of its parents.
'LANGCODE:Language families'? Regardless, it should not have 'LANGCODE:Names' as one of its parents.
'LANGCODE:Languages'? Regardless, it should not have 'LANGCODE:Names' as one of its parents.
'LANGCODE:Taxonomic names' and subcategories:
'LANGCODE:Taxonomic names' itself?
'Taxonomic eponyms by language': Already a pos category.
'Specific epithets' -> 'Translingual specific epithets'?
Other topic categories not directly reachable through 'LANGCODE:Names' but needing consideration:
Sorry, didn't mean to ignore your ping, but got distracted by life after seeing it. As far as the categories for "English renderings of Ukrainian names" (or whatever), I have no strong preference for any particular name at this time. My immediate concern was just with addressing the odd point of bifurcation where "native English placename like Warwick or Alberta; English rendering of an Armenian placename like Stepanakert; English rendering of a personal name someone gave a baby born in Ukraine like Volodymyr" are in one top-level category system ("LANGCODE:Names", named like 'set' categories), and "personal name someone gave a baby born in Canada" is in a different top-level category system ("LANGNAME names", treated like a quasi-part of speech). It's hard to decide where exactly to split the spectrum of categories we're dealing with here, if we're wanting to keep e.g. "John" in "Category:English male given names" at that (part-of-speech-esque) category name, but wanting to consider some things like Category:en:Native American tribes to be clearly a set/list category (a set/list of tribes); my immediate point was just that I don't see a sound basis for considering "John, Jane" a POS-type (LANGNAME) category but "Volodymyr, Sergei" a LANGCODE:-set-type category — surely they're both one or both the other, and the greater momentum seems to be towards considering "names" a POS-type/LANGNAME category. But maybe we should think about that more carefully and consider them all to be "sets"? (But then, "Category:English verbs" is also just a category containing the set of English verbs. Hmm... should we perhaps allow only things that are truly "parts of speech" to have "Category:LANGNAME foobars" names, and make all the "names" categories that contain John and Volodymyr into set categories? Should that be the direction in which we eliminate the bifurcation of the 'John' vs 'Volodymyr' categories?) 
I do think even keeping names in two subcategories like "English given names" vs "English renderings of Ukrainian names"/"English renderings of Chinese names"/etc based on, in effect, whether they were born in Ukraine vs to a Ukrainian family in Canada (or in China vs to a Chinese family in America) may be less than ideal; e.g. what do we do if a transliterated Ukrainian or Chinese name is common in English-language fiction? What about if it's a German name; does the fact that those names are "natively" Latin script make the threshold for considering them to have become "English names" lower? Does it make a difference if the fiction is set in lightly-fictionalized Germany or Ukraine or China, vs in a space future or a generic medievalesque Middle Earth / Westeros? But I don't have time to think through and suggest any proposal for any better approach to that yet. "LANG foreign personal names" (e.g. "English foreign personal names") sounds a bit odd; would "LANG renderings of foreign personal names" (aligning with your proposed "DESTLANG renderings of SOURCELANG male given names") be better, iff we're sticking with moving "Names" categories to LANGNAME names and not LANGCODE names? I will try to respond more, and to the rest, later. - -sche(discuss)17:54, 4 November 2023 (UTC)Reply
@-sche Thanks for your comments. I have no issue with "LANG renderings of foreign personal names". I see your point about the line between nativized foreign-origin names and renderings of actual foreign names being fuzzy, but there does feel to me like a distinction, esp. in languages like Latvian that tend to respell foreign names according to Latvian spelling conventions, and the distinction is fairly clearly made in reality between e.g. the large number of Russian names respelled according to Latvian conventions (and used e.g. by the large population of Russians in Latvia) vs. the smaller number of Russian-origin names that have become nativized for naming of ethnic Latvians. In a multi-ethnic society like the US or Canada where nationality and ethnicity aren't always clearly distinguished, things get a lot fuzzier, although it still feels like there's some sort of distinction between names like Volodymyr or Volha that are unlikely to be borne by anyone other than someone who is Ukrainian (resp. Belarusian) or whose parents or grandparents are Ukrainian (resp. Belarusian), vs. a name like Vladimir or Olga that might be given to someone with no particular connection to Russia. As for whether these should use LANGNAME-type or LANGCODE-type naming, I'm not sure although I gather the distinction is supposed to be lexical vs. semantic, if that helps at all. Benwing2 (talk) 23:57, 4 November 2023 (UTC)Reply
I guess we should stick with LANGNAME naming for given names / surnames, then, at least for now. (Switching gears for a moment to address a different aspect:) Regarding "horse given names", we also have (but apparently don't currently categorize) dog given names like Scruffy, Fido, and Spot, and we have Polly as a name for a parrot, and Mittens, Kitty, Socks for cats (also e.g. Miming in Cebuano). Perhaps we should merge all the different animals into one category for "animal given names". To me, at least, it seems intuitive to then handle this category in whatever way we handle the human given name categories; so, if we're naming the category that contains 'John' "English male given names", then 'Fido' goes in "English animal given names", or if we're using language codes, then use codes for both. (Back to the first gear:) We also have names that belong to specific individual people (Confucius, Cicero) or animals (Laika, and mythically Cerberus, Garm); we seem to put these in LANGCODE-set categories; I suppose the rationale is that the category that contains "Confucius, Cicero" contains a set of individuals, whereas "John" and "Jane" are 'less restricted'... in practice, people have undoubtedly also named babies 'Confucius' and 'Cicero', but if we demonstrate that, then we add a {{given name}} sense, so I guess we're fine leaving the individuals in LANGCODE-set categories and the {{given name}}s in LANGNAME categories... I guess this also explains the difference between nicknames (LANGNAME nicknames) and relationship names (the category contains a set of specific ships)...? Never mind, "Category:Nicknames" doesn't contain what I would've expected ("Bob, Jim, Tom" for Robert, James, Thomas) - -sche(discuss)18:45, 5 November 2023 (UTC)Reply
Just checking, when your "list based on what's already been proposed" includes "'LANGCODE:Demonyms' -> 'LANG demonyms'" but then your follow-up proposal is for Subcategories of 'LANGCODE:Demonyms': like 'LANGCODE:Armenian demonyms'?, you're proposing to not actually rename "'LANGCODE:Demonyms' -> 'LANG demonyms'", right? I'm just checking that we're going to handle "Demonyms" and the subcategories like "Armenian demonyms" the same way, either all using LANGCODEs or all using LANGNAME. I could see handling the categories that actually have the word "demonyms" in their name either way, but since some of the other subcategories like "LANGCODE:Native American tribes" do seem more like set categories, maybe it's best to consider the whole batch to be set categories and stick with LANGCODE names like they have at present? (But maybe move them out of the "Names" category?) "Couple nicknames" is an interesting case, because intuitively it seems like those and (relation)ship names should be handled the same way, since they seem like the exact same thing: "Lumity" is the portmanteau name for the two specific individuals Luz Noceda and Amity Blight, and Billary is the portmanteau name for the two specific individuals Bill Clinton and Hillary Clinton... maybe LANGCODE:Couple nicknames should be renamed "LANGCODE:Couples" to be more clearly a set category? and moved out from under the "names" category, since we don't categorize ship names as "names"? - -sche(discuss)02:34, 6 November 2023 (UTC)Reply
@-sche Thanks for pointing out that inconsistency. Rua's point a while ago was that 'Native American tribes' is named correctly as a set category because the contents are "names of Native American tribes" but 'Armenian demonyms' isn't named correctly as the contents aren't "names of Armenian demonyms". Rua suggested renaming 'Demonyms' -> 'Peoples' although that seems a bit strange to me as the term 'demonym' is fairly well established, and furthermore a distinction could be made between nominal demonyms and adjectival demonyms (note, we have {{demonym-noun}} and {{demonym-adj}} for these two, respectively), which is clearly a lexical distinction. That suggests maybe they should all be considered lexical categories, esp. since I think something like Category:en:Exonyms doesn't make sense as a set category (being an exonym is completely a lexical property). If we are to make Category:en:Armenian demonyms a lexical category, IMO it should be Category:English demonyms for Armenians as Category:English Armenian demonyms doesn't make much sense. As for CAT:en:Couples, that seems ambiguous so maybe it should be CAT:en:Nicknames of couples or something (which would be keeping with future names like CAT:Types of stars and such). Benwing2 (talk) 02:54, 6 November 2023 (UTC)Reply
"CAT:en:Nicknames of couples" works. Or should it even be "Nicknames of pairs", since it currently contains a few things like Bushbama, or should we remove those? (We don't categorize e.g. Republicrat as anything but "US politics".) Good point about exonyms. "Demonyms", or at least the things currently in the "Demonyms" categories, seem to straddle the line between being a set category like "Occupations", vs being lexical like "Exonyms"... ugh, as you said earlier, it's hard to pin down and "put into practice" the difference, since so many of these categories exist in a grey area with characteristics of both. Like: it would not technically be wrong AFAICT to say "Category:English male given names and Category:English nouns are set categories containing the set of all English male given names or nouns respectively" (it would just be madness, heh). And in the other direction, isn't being a placename as much a lexical property as being a given name? But should they go into the same top-level "LANGNAME names" category, or is that madness? Thinking aloud for a moment, I guess one difference is whether a term refers to one specific entity, or to an open-ended cast, which would rationalize why "John" and "Bob", as names that can be given to an open-ended variety of people (new babies every day), are in (or belong in, in the case of "Volodymyr") "LANGNAME names" categories, whereas "Baghdad Bob" (individual's nickname), "Billary" and "Lumity" (real and fictional couples' nicknames) and e.g. "Saskatchewan" and "Yerevan" (placenames) refer to specific entities, and so are LANGCODE set categories...? So then, since demonyms like "Saskatchewanian" and "Yerevanian" also refer to an open-ended set of people (new babies born in Saskatchewan every day), and as you say, 'being a demonym' can be argued to be a lexical property like 'being an exonym', that justifies them being "LANGNAME demonyms" categories...? 
(Then the "type of"-set categories, like the category for "the set of all types of stars" or "the set of Native American tribes", are LANGCODE-set categories for a different reason.) - -sche(discuss)19:04, 6 November 2023 (UTC)Reply
@-sche Yes, that seems to make a lot of sense. BTW I have written the script to move topic (langcode) categories to lexical (langname) categories and I'm probably going to run it on exonyms first. Benwing2 (talk) 19:59, 6 November 2023 (UTC)Reply
Relevant to the discussion above about creating a general animal given names category, this discussion points out "Ralph" for a raven, as well as "Rover" as another dog name. Whenever the situation with human names is sorted out, I suggest moving "LANGCODE:Horse given names" ("is:Horse given names") to "LANGNAME animal given names" ("Icelandic animal given names"), unless anyone has objections... (or we could add a general "animal given names" category and retain subcategories for specific animals if one or more languages had a lot of names for them, as might be the case for dogs and horses...) - -sche(discuss)17:24, 11 November 2023 (UTC)Reply
Keep - not in itself a rationale for deletion. Also ask yourself: is the category useful? Does it fit into a schema of categorisation? Is it likely that we are going to have things to categorise into it in the future? If the answer to any of these is "yes", then we should keep it, rather than have to repeat the work later. Respectfully, Lx 121 (talk) 10:55, 29 April 2019 (UTC)Reply
Delete all - the first one is more dubious than the rest though, as Kangxi Radicals block is just a Unicode block category. If the contents are the same though, there's little point in keeping them separate. — surjection ⟨??⟩ 18:46, 19 October 2021 (UTC)Reply
So the entries requiring manual review are those only in CJKV radicals. It appears to me, as a CJK outsider, that many of these are essentially alternative forms of radicals. Many of these have corresponding radical appendices, and many of them have the residual strokes parameter set to "00", which I assume is meant to do something magical.
@This, that and the other It's a really badly-named category. It's added by {{ja-kanji}} if grade=r is manually set, which is (a) not what that parameter is supposed to be for, and (b) clearly not something that should be controlled by a Japanese template. I suspect this is a holdover from the very early days of Wiktionary.
The entire kanji grade system is something that's already handled via the back-end anyway and shouldn't have any kind of manual override since it's all strictly defined.
The reason the residual strokes parameter is set to 0 is because residual strokes are defined by reference to a character's radical (e.g. it's radical X + 3 additional strokes). This is useful for sorting purposes, as the radical is used as the primary sorting weight, with the residual strokes being used for fine-tuning within that. Naturally, characters which are themselves radicals have 0 additional strokes. Theknightwho (talk) 15:49, 12 July 2024 (UTC)Reply
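As a rough illustration of the sorting scheme just described (radical as the primary weight, residual strokes as the tie-breaker), here is a minimal Python sketch. It is not the back-end code; the `CharInfo` type and `sort_key` function are names invented for this example, and the four characters are hand-picked sample data.

```python
from typing import NamedTuple, Tuple


class CharInfo(NamedTuple):
    char: str
    radical_number: int    # index of the character's radical
    residual_strokes: int  # strokes in addition to the radical


def sort_key(info: CharInfo) -> Tuple[int, int]:
    # Radical first; residual strokes fine-tune the order within a radical.
    return (info.radical_number, info.residual_strokes)


# Sample data: 人 and 木 are themselves radicals, so they have 0 residual strokes.
chars = [
    CharInfo("休", 9, 4),
    CharInfo("木", 75, 0),
    CharInfo("人", 9, 0),
    CharInfo("林", 75, 4),
]

ordered = "".join(c.char for c in sorted(chars, key=sort_key))
print(ordered)  # 人休木林
```

Each radical sorts ahead of the characters built on it precisely because its residual stroke count is 0.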
I presume that such templates are categorized by the target language, not the language in which they are written. Do we not care about the language in which the reference is written? What about a multilingual dictionary? (There are at least two such templates.) DCDuringTALK16:15, 23 February 2017 (UTC)Reply
They're placed in whichever language they're relevant to as a reference. So the language it's written in is not taken into account, but they can be placed into more than one language category. —CodeCat16:21, 23 February 2017 (UTC)Reply
In that case, delete according to the reason provided by the nominator. — SMUconlaw (talk) 19:12, 23 February 2017 (UTC)Reply
I've seen both categories, & they seem to serve different purposes. This one is general-purpose (any template related to references), & the other one, Wiktionary: Reference templates, seems to be narrowly defined (a list of dictionary-reference templates). So either merge or differentiate them better? & 'Reference Templates' is still the obvious meta-category for ALL reference templates. Lx 121 (talk) 15:28, 29 April 2019 (UTC)Reply
Support. I oppose the existence of categories with language code like "en:" in the first place, but what is proposed here seems to be an improvement over the status quo. --Daniel Carrero (talk) 20:27, 20 October 2017 (UTC)Reply
@Rua It looks sane to me if politics are let out. But why is Abkhazia in Georgia though it is an independent state, statehood only depending on factual prerequisites and not on diplomatic recognition which has nothing to do with it? Where does the Crimea belong to? (article Sevastopol is only in Category:en:Ukraine because it has not really been edited since 2014.) I can think of two solutions: First possibility: We focus on geographical and cultural constants. Second possibility: We focus on the actual political power. I disprefer the second slightly because it can mean much work in cases of war (i.e. how much the Islamic state holds etc., or say the current factions in Libya). But in neither case Abkhazia is in Georgia. But the first possibility does not even answer what the Crimea belongs to, i.e. I am not sure if it is historically correct to speak of the Crimea as Ukraine. And geographical terms are often fuzzy and subject to editorial decisions. All seems so easy if you start your concepts from the United States, which do not even have a name for the region they are situated in. And even for the USA your idea is questionable because the constituent states of the United States are states in their own right (Teilstaat, Gliedstaat in German), as is also the case for the Federal Republic of Germany and the Russian Federation partially (according to the Russian constitution only those of the 85 subjects are states which are called Republic, not the Oblasti etc.). Is Tatarstan Russia? Not even Russians can agree with such a sentence, as in Russia one sharply distinguishes русские and россияне, Россия and Российская федерация. Technically Ceuta and Melilla are in Morocco because Spain is not in Africa. Also, Kosovo je Srbija, and it would become just a coincidence if a place important in Serbian history is listed as X, Kosovo or X, Serbia. Palaestrator verborum (loquier) 16:06, 14 November 2017 (UTC)Reply
Starting with the above, I don't know how the Tokyo ward system works, but I imagine it's a subdivision of the city. In England wards are subdivisions in cities, boroughs, local government districts, and possibly counties. "Wards in" is the natural usage.
Municipalities similarly. For example in Norway there are hundreds of municipalities (kommuner) which are subdivisions within counties (fylker). Some of these can be large, especially in the north, but so are the counties in the north. To me "municipalities in" is the natural wording.
States and provinces in the USA and Canada: In nearly all cases it is unnecessary to add the country name as the names are unambiguous. The only exception I can think of is Georgia, USA. This could also apply to prefectures in Japan and states in India (is there a Punjab in Pakistan?). DonnanZ (talk) 18:52, 14 November 2017 (UTC)Reply
Yes, there is, like there is in India. Maybe categorisations should be abundant? Cities can belong to Punjab as well as to Punjab, India, and the Crimea is part of administration of both the Russian Federation and the Republic Ukraine at least for some purposes in the Republic Ukraine. We can make the least thing wrong by adding Sheikh Zuweid (presuming it exists) as well to the Islamic State as to the Arab Republic of Egypt, because we do not want to judge morally and formally states and terror organizations are indistinguishable. On the other hand of course we need sufficient data to relate towns to administrative divisions and ISIS presumably does not publish organigrams. Palaestrator verborum (loquier) 19:44, 14 November 2017 (UTC)Reply
A benefit to having it as a category is that theoretically it ought to be addable by the headword templates examining the pagename (like "English terms spelled with Œ"), which, if implemented (...if it could be implemented without excessive memory costs), would allow it to be kept up to date automatically. - -sche(discuss)17:16, 15 March 2018 (UTC)Reply
Meh. Mehhhhhh. On one hand, I still like the idea of a category which can be populated automatically any time a new relevant entry is added. OTOH, it's very trivial. Well, it would be simple for someone to copy the current contents of the category over to the appendix and then remove the category from the entries (maybe with AWB to speed things up). - -sche(discuss)09:04, 28 December 2023 (UTC)Reply
After some discussion on Category talk:Baybayin script (that went a bit off-topic), some of the Indian language editors (@Bhagadatta, Msasag and myself) have agreed that this category should be renamed to Category:Eastern Nagari script, the reasons being (1) several languages other than Bengali use this script, and (2) the Bengali alphabet is just a subset of this script and lacks some of the glyphs used by other Bengali-script languages (most prominently Assamese which has a separate r-glyph). I want to make sure that there are no objections to this by editors who were not in the discussion. —AryamanA(मुझसे बात करें • योगदान)02:06, 20 July 2018 (UTC)Reply
I feel determiner is the more common name for this in English; the different definitions of these terms across languages should not be a concern - e.g. we also use adjective differently for Korean. adnominal may be confused with the -eun, -neun, -eul, -deon forms of Korean verbs and adjectives. Wyang (talk) 03:57, 13 October 2018 (UTC)Reply
A lot of this is redundant to our suffix derivation categories. In many cases, the suffix used already determines what something is derived from. For example, -ness always forms deadjectival nouns, it can't really be anything else. —Rua (mew) 18:47, 25 February 2019 (UTC)Reply
However this does not work with non-concatenative morphology thus far – you may link the previous discussions on those infix categorization matters here, but even if that pattern collecting is solved the derived terms listed at صَلِيب (ṣalīb, “cross”), for instance, would only be categorized by pattern but nothing would imply that the terms are denominal –, and the point I have made about the categorization and naming of these categories is still there. But I give you green light in any case, if you want to replace all those “ deverbals” and “ denominal verbs” categorizations by suffigation categories of the format “ words suffixed with -∅ ”, as well if it concerns action towards categorization of nonconcatenative morphology language terms, since your idea of uniformity is correct. Fay Freak (talk) 19:49, 25 February 2019 (UTC)Reply
Nonconcatenative morphology is still an underexplored part of Wiktionary, which is kind of annoying. But quite often, we simply show the concatenative part as the affix, and then leave a usage note saying what other changes occur when this form of derivation is used. For example on Northern Sami -i and -hit. —Rua (mew) 20:40, 25 February 2019 (UTC)Reply
How to create an affix category with an id: add the id to the definition line in the affix's entry with {{senseid|language code|id}}, add {{affix|language code|affix|id1=id}} (at minimum) to the etymology section of a term that uses the affix, find the resulting red-linked category and create it with {{auto cat}}. — Eru·tuon20:51, 25 February 2019 (UTC)Reply
Thanks, this is easier than I imagined, so it takes the category name from {{senseid}}. I thought it is in some background module data. Now where to document it? Add it to the documentation of {{affix}} under |idN=? This is the main or even only use of this parameter in this template, right? Fay Freak (talk) 21:18, 25 February 2019 (UTC)Reply
It's not that {{senseid}} has any effect on the category name, but that a category with a parenthesis after it, such as Latin words suffixed with -tus (action noun), expects a matching {{senseid}} in the entry for -tus, in this case {{senseid|la|action noun}} because the link in the category description points to -tus#Latin-action_noun, which is the format of the anchor created by {{senseid}}. The |id= type parameters, including in {{affix}}, generally create a link of that type. In {{affix}}, the parameter also has the effect of changing the category name. Sorry, I am not sure if I am explaining this clearly. — Eru·tuon22:36, 25 February 2019 (UTC)Reply
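The relationship between the id, the category name and the anchor can be sketched as follows. This is a minimal Python illustration built only from the single -tus example above, not the actual module logic; the function names are invented, and the space-to-underscore rule is an assumption inferred from "Latin-action_noun".

```python
def senseid_anchor(lang_name: str, sense_id: str) -> str:
    """Anchor in the pattern seen above, e.g. 'Latin-action_noun'.

    Assumption: spaces in the id become underscores, as in the one
    example given; other characters may be encoded differently.
    """
    return f"{lang_name}-{sense_id.replace(' ', '_')}"


def affix_category(lang_name: str, affix: str, sense_id: str = "") -> str:
    """Category name, with the id in parentheses when one is given."""
    name = f"{lang_name} words suffixed with {affix}"
    return f"{name} ({sense_id})" if sense_id else name


def description_link(lang_name: str, affix: str, sense_id: str) -> str:
    """Link target the category description would point to."""
    return f"{affix}#{senseid_anchor(lang_name, sense_id)}"


print(affix_category("Latin", "-tus", "action noun"))
print(description_link("Latin", "-tus", "action noun"))
```

The same id string feeds both outputs, which is why the id given in {{affix}} has to match the one in {{senseid}}.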
You explain this clearly. I just approached it from the other side: I need to choose, in {{senseid}}, the name that I want to have in the category name, so that later with {{affix}} I will categorize into a reasonably named category, whereas in other cases the id can be arbitrary – not that {{senseid}} has an effect on the category name. Fay Freak (talk) 22:53, 25 February 2019 (UTC)Reply
Our affix system is not sufficient to handle morphological derivation we have to deal with (unless you want us to introduce lambdas...) Serbo-Croatian hardly has the intricacy of Arabic conjugation, but there are plenty of nouns that are created from verbal roots through apophony, and this needs to be categorized somehow. Crom daba (talk) 17:24, 2 March 2019 (UTC)Reply
@Crom daba At least for Indo-European, we do have a system for handling combinations of affixation + ablaut, like on *-os (notice the parentheses showing the root grade) and -ος(-os). Our current system totally fails where there is no affix, though, a case which also exists in Indo-European. For example, there are some Indo-European forms of derivation, called "internal derivation", which are built entirely around changing ablaut grades and accents: *krótus(“strength”) > *krétus(“strong”) or τόμος(tómos, “slice”) > τομός(tomós, “sharp”). We have no systematic way to indicate this kind of derivation, but it is sorely needed. —Rua (mew) 23:42, 30 April 2019 (UTC)Reply
Numerals can be words (one, two in spelling alphabets), while numeral symbols are not (Roman numerals). The difference is subtle, but I think it is there. — surjection ⟨??⟩ 18:51, 19 October 2021 (UTC)Reply
I don't mind one way or another, but the whole category tree then needs to be renamed for consistency. (@Donnanz: how is car ambiguous? Do you mean it could be confused for, say, a train carriage or something?) — SGconlaw (talk) 10:34, 3 June 2019 (UTC)Reply
Well, car is used especially in US English for a railroad car (either freight or passenger), and can be used in BrE for a railway passenger carriage. I feel the word auto can be ambiguous as well; "auto parts" can be used in the UK, but "car parts" is preferred. The word "auto" isn't used for a motor car in the UK. There is another category, Category:Automotive, so Category:Automotive parts may be a solution. DonnanZ (talk) 13:52, 3 June 2019 (UTC)Reply
We say ourselves in the entry for oxymoron that its use to mean "contradiction in terms" is loose and sometimes proscribed (despite the fact that many people use it this way nowadays). We say much the same thing at contradiction in terms as well.
The so-called oxymorons in this category are all or almost all contradictions in terms, where the contradiction is accidental or comes about only by interpreting the component words in a different way from their actual meanings in the phrase. An oxymoron in the strict sense has an intentional contradiction. I think we should be more precise about this, in the same way as we already are with using the term "blend" instead of "portmanteau", which has a narrower meaning. I therefore suggest we move this page to "Category:English contradictions in terms" (but see my second comment below). Likewise for any corresponding categories for other languages. — Paul G (talk) 06:51, 25 August 2019 (UTC)Reply
On second thoughts, I think this category should be retained but restricted to true oxymorons, such as "bittersweet" and "deafening silence". Ones such as "man-child" and "pianoforte" are not intended to be oxymoronic and are only accidentally contradictions in terms. — Paul G (talk) 17:18, 26 August 2019 (UTC)Reply
California at one time probably had about a hundred indigenous languages and represented the intersection of the Algic languages (which extend to the east coast), the Athabascan languages (which extend from Alaska to northern Mexico), the Uto-Aztecan languages, (which extend to Central America), a few still-to-be-proven language families like Hokan and Penutian, and a few probable isolates like the Chimariko language and the Karuk language, with a very high percentage endemic to the state. Right now the category contains only one language which was added by a clueless editor based on a bogus etymology, but we already have hundreds of entries in upwards of 5 dozen indigenous languages- about a fifth of Category:Languages of the United States. I should also mention that we have Category:Languages of Hawaii, among others. Chuck Entz (talk) 04:04, 13 April 2020 (UTC)Reply
What about nonindigenous languages? Besides English and Spanish, Chinese, Korean, Vietnamese, and Tagalog are all widely spoken in California. —Mahāgaja · talk08:16, 13 April 2020 (UTC)Reply
Yes, and I've been to stores with signs in Arabic, Armenian, Hebrew, Hindi, Indonesian, Japanese, Persian, Russian and Thai, and I've met people from Greek, Malagasy, Samoan and Tongan communities as well. The Los Angeles County election websites can be viewed in Spanish, Chinese, Tagalog, Hindi, Khmer, Korean, Vietnamese and Thai, and American Sign Language interpreters are in considerable demand. I understand that we have lots of people speaking American Indian languages from the rest of the US and from other parts of the Americas. I've even heard of a radio station somewhere in the Central Valley broadcasting in Assyrian Aramaic. I should add that I know there are lots of people speaking other South Asian languages than Hindi and other Chinese languages than Mandarin, but I don't know which ones. Chuck Entz (talk) 10:16, 13 April 2020 (UTC)Reply
I don't know, but I disagree with categorizing Category:American Sign Language into Languages of California. ASL is used in all 50 states. I don't think it needs to be in potentially 51 location categories when 1 covers that same information. Leave the demographic specifics to Wikipedia. Ultimateria (talk) 01:41, 16 April 2020 (UTC)Reply
That's what I'm arguing for, rather than put e.g. Spanish in 300 categories for all the states of the US and South American countries. Ultimateria (talk) 16:21, 17 April 2020 (UTC)Reply
Delete because it is ambiguous. If we talk about native languages we go also beyond the current state borders and might think about the California of the now United Mexican States. Fay Freak (talk) 15:14, 13 April 2020 (UTC)Reply
Deletion for the reason given immediately above is completely inappropriate. The rationale would suggest renaming. "Early languages...", "Pre-Columbian languages..." might work for the instant case.
We use current governmental borders for categories such as this because of the administrative processes that govern almost all the research on such matters and because that is how most of our users would approach the subject matter. California may secede after the coming election so it would seem prudent to wait before any rash deletion or renaming. DCDuring (talk) 17:13, 13 April 2020 (UTC)Reply
Keep because it seems useful to have a category of indigenous languages. That said, I'm not sure about cases like Mandarin, Tagalog, Korean, and Vietnamese, all currently in the category. Worth noting that neither English nor Spanish currently is in the category, and they undoubtedly have the most speakers. I can see a case for including these non-indigenous languages, but if we were to admit any language that has a community of speakers in California then the category would probably stop being useful, even though I suspect there may be more speakers of Icelandic in California than of Valley Yokuts. Not sure where the line should be. There might be a case for only including languages that are (semi-)unique to California, which would probably limit the category to indigenous languages. There might also be a case for having separate categories for indigenous and non-indigenous. Or we could just hope that having all indigenous languages + maybe the top ten non-indigenous is a sane heuristic and hope nobody decides to add Icelandic. (This issue is not by any means unique to California, and I'm not sure whether there's a general rule that's been agreed upon.) 70.172.194.25 06:57, 16 February 2023 (UTC)Reply
Maybe the rule should be indigenous to California OR speakers per capita in California >> speakers per capita in US, which would exclude English but plausibly include those Asian languages. There may still be some disagreement about what ">>" means, however, as I assume Spanish is spoken at a greater rate per capita in California than in the US as a whole, but it seems weird to include Spanish but not English. I don't have a fully satisfactory answer. 70.172.194.25 07:10, 16 February 2023 (UTC)Reply
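The proposed rule can be written down as a small predicate, which makes the open question about ">>" explicit as a tunable parameter. This is a rough sketch only: the function name, the `ratio_threshold` parameter and its default of 2.0 are assumptions made for illustration, not anything agreed in the discussion, and the numbers in the usage lines are invented placeholders rather than real census figures.

```python
def belongs_in_state_category(indigenous: bool,
                              state_speakers: int,
                              state_population: int,
                              national_speakers: int,
                              national_population: int,
                              ratio_threshold: float = 2.0) -> bool:
    """Indigenous to the state, OR spoken at a much higher per-capita rate there.

    `ratio_threshold` stands in for the undefined ">>" (hypothetical value).
    """
    if indigenous:
        return True
    if state_population <= 0 or national_population <= 0:
        raise ValueError("populations must be positive")
    state_rate = state_speakers / state_population
    national_rate = national_speakers / national_population
    return state_rate > ratio_threshold * national_rate


# Placeholder figures, chosen only to exercise both branches.
print(belongs_in_state_category(True, 0, 100, 0, 1000))       # indigenous
print(belongs_in_state_category(False, 30, 100, 100, 1000))   # 0.30 vs 0.10
print(belongs_in_state_category(False, 10, 100, 100, 1000))   # 0.10 vs 0.10
```

Whatever threshold is picked, a language spoken at roughly the national rate is excluded and one only moderately above it sits right at the boundary, which is the Spanish-versus-English awkwardness raised above.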
Keep: states have a variety of languages and this saves space in the main category. The Australia category is big too, so I'm creating some for Australian states and territories (the NT will probably have the biggest category because of all the living Indigenous languages there). 2001:8004:2778:4E8D:40F:54A0:A43F:F695 08:16, 6 January 2024 (UTC)Reply
An example of w:U and non-U English, which probably should be decided for the latter. While “scent” can possibly be broader, this category also has the danger of just about including anything that has a strong odour naturally. Hence I included بَارْزَد(bārzad, “galbanum”) and جُنْدُبَادَسْتَر(jundubādastar, “castoreum”). The English category has a weak six entries since created in 2011. But even Category:en:Perfumes includes dubious things. I doubt perfumes are something that can be categorized well – it’s basically anything smelly? –, maybe delete all? Fay Freak (talk) 01:09, 27 July 2020 (UTC)Reply
I think a case could be made for "scent" being not something that smells, but smell itself (like musk and maybe putridity). I don't see any reason why perfumes can't be categorized. I don't think it's meant to include anything that could be used as the scent of a perfume, but words that specifically describe perfumes. For instance, cologne isn't "cologne-scented", it's the name of a type of perfume; jasmine is a plant, but it is also used as the word for a perfume, not just to describe a perfume (you could say, "She always wore a liberal quantity of jasmine" and not just "She always wore a liberal quantity of jasmine-scented perfume". Of course, you could also say "She always wore a liberal quantity of Autumn Breeze" because it's a proper noun, but I don't think you could say "She always wore a liberal quantity of lilac". Instead you would say "lilac perfume".) Andrew Sheedy (talk) 03:07, 27 July 2020 (UTC)Reply
IMO it does not make sense to have some terms categorized directly into Category:Regional English (not its subcategories) and other terms categorized directly into Category:English dialectal terms, because in practice no-one seems to be maintaining a distinction as far as putting one kind of entry in one and another in the other, it seems haphazard as to whether an entry uses e.g. {{lb|en|US|regional}} / {{lb|en|UK|regional}} like pope, mercury, jack, snap, wedge, phosphate, tab, or gob, or else uses {{lb|en|US|dialectal}} / {{lb|en|UK|dialectal}} like pope (!), admire, haunt, on, sook, book, yinz, and gon. Many of the {{lb|en|US|dialectal}} / {{lb|en|UK|dialectal}} terms go on to specify which regions they're used in, like "Pittsburgh and Appalachia" or "Northern England" or "Scotland". And we put every more specific dialect category as a subcat of "Regional", not of "Dialectal". I'm not entirely sure which category the entries in the two top-level categories should be consolidated into, but I'm inclined to think they should go in one or the other. Or do we want to try to implement some distinction? (At the very least, entries that use "regional" but then go on to specify the regions, like "US, regional, Pittsburgh", can drop the unnecessary "regional".) The one situation I can think of where simply changing "regional" to "dialectal" would not work is that some entries are labelled "regional AAVE". Thoughts? - -sche(discuss)01:06, 10 October 2020 (UTC)Reply
I personally think that dialectal and regional terms should be separated, since a term for something in a region from an out-of-region dialect should be categorized into both regional dialects. -- 65.92.244.147 16:29, 22 November 2020 (UTC)Reply
That doesn't make sense. It's not the thing referred to that makes it regional or dialectal, it's the term itself. Do you have an example in mind? Chuck Entz (talk) 18:21, 22 November 2020 (UTC)Reply
I think the real problem is that it's not clear what we mean when we say something is dialectal. Linguistically, a dialect can be any speech variety that is separate from the rest of the language. With a language such as English that has multiple standards, you could say that much of the language is dialectal, though no one uses the term that way. I suspect there may be a value judgment involved: dialectal English is the way local people talk when they're not using proper English. Regional has less of that: I say potayto and you say potahto, but that's just a matter of geography. Theoretically, sociolects like AAVE and Cockney would be better described as dialectal than regional, but I'm not sure whether they're described as either. For a lot of people, though, it's probably whatever it's called in the references they check (or copy from). Chuck Entz (talk) 18:21, 22 November 2020 (UTC)Reply
"dialectal English is the way local people talk when they're not using proper English".
What, pray tell, is proper English? General Australian? Standard Canadian English? General American (*had trouble including that as a suggestion with a straight face*)? Standard Indian English?
If someone were to suggest that whatever is arbitrarily declared to be the 'standard' dialect of the English in their country is thus "proper English", and every other dialect is not, then that is obvious nonsense. I get that that is the reason why you used the phrasing value judgement, but if what you suggest to be going on is actually going on, then that is a problem.
Wiktionary aims to be descriptive, not prescriptive. So if the category "Regional English" is being used to suggest that certain dialectal terms are more "proper" than others, then we need to get rid of one category or the other. Tharthan (talk) 18:42, 22 November 2020 (UTC)Reply
I'm not agreeing with the value judgment. I was too lazy this morning to put everything in quotation marks. The basic problem is that this terminology goes back to earlier academic standards and it's hard to tell what it means in a more modern context. A dialectologist or other linguist would probably have a more rigorous definition, but we don't seem to. Chuck Entz (talk) 19:36, 22 November 2020 (UTC)Reply
November 2020
Category:en:Artificial languages
Changing the name of the category will lead to greater consistency with Category:Conlanging, putting the contrast between the purpose of each category (names of constructed languages vs. conlanging terminology) in sharper relief.
The odd choice of wording was intended to avoid the topical category conflicting with Category:Constructed languages, which is a holding category for those languages. Given that our MediaWiki trappings make it impossible to resolve this conflict, I support this proposal as a better compromise. —Μετάknowledgediscuss/deeds06:16, 8 November 2020 (UTC)Reply
These are terms that were historically used in the Dutch East Indies, perhaps to some degree also in Malay-speaking territories of the Dutch East India Company. A rename to Category:Dutch_East_Indies_Malay makes the most sense. It is doubtful that a category "Netherlands Malay" is needed because the number of speakers of Malay in the Netherlands is not very high. ←₰-→Lingo BingoDingo (talk) 19:57, 10 January 2021 (UTC)Reply
I'm all but certain that one can't have a word without pronounced vowels, but I feel that it reads better if it's explicitly stated anyway. Johano★01:15, 15 June 2021 (UTC)Reply
Yeah (except maybe "English terms"), that would also reduce how dumb it looks that the category includes lots of numbers which are quite regularly pronounced with vowels, and things where the vowels have merely been obscured (b****cks), and abbreviations that aren't even "words" per se, like BHD. - -sche(discuss)22:03, 8 July 2021 (UTC)Reply
Also, why is it a subcategory of Category:English shortenings? Sure, a lot of shortenings omit the vowels, but the converse isn't true: hmm, grr, 1984 (unless every number is a shortening of its spelled out form, which doesn't seem all that useful). Do I need to start a separate request to remove a subcategory? Medmunds (talk) 18:53, 18 March 2022 (UTC)Reply
I also think we should demote the resulting Category:Classical Chinese language to an etym-only language of Category:Chinese language. It's purely a literary construct and not on the same level as the spoken varieties. Note for example that we don't have separate languages for Classical Latin, Koranic Arabic or Modern Standard Arabic.
@Benwing2: "Literary Chinese" is generally regarded as the same as "Classical Chinese", but reading w:Classical_Chinese#Definitions I think we might want to keep them separate for lexicographic purposes, by treating Literary Chinese as the later stages. Note that Classical Chinese is also distinct from "literary Chinese" (note the capitalisation), although they overlap in certain places (such as Ming/Qing era usage).
Using the hypothetical conjunction "if" as an example, 誠 / 诚(chéng) and 向使(xiàngshǐ) are found in Qin-Han era Classical Chinese, but not in Ming-Qing era Classical Chinese (which uses words closer to modern usage instead, i.e. literary terms) - I wouldn't call the former ones as literary terms, instead more like obsolete or archaic.
Early Classical Chinese (Qin-Han) is significantly different from modern Chinese in terms of grammar and pronunciation, a reasonably educated person would have a hard time understanding a text even with annotations; Tang-Song era Classical Chinese is still somewhat incomprehensible with a couple of obsolete (in modern standards) terms; late Classical Chinese (Ming-Qing) is more fuzzy and one might simply call it literary Chinese; in early modern times these are all considered to be one thing, which is why we have the misnomer Literary Chinese.
I think #1 of your proposal would be relatively uncontroversial, though I would wait to see input from others.
#2 is questionable, depending on what we regard Chinese to be. Because we have everything placed under Chinese, this is like stuffing everything from Old Latin (or even earlier) to Neo-Latin into a subvariety of a Latin-Romance language without treating Latin itself as a language, which is a very awkward thing.
Classical Chinese has its own quotations (and as Fish bowl has mentioned we have Category:Korean Classical Chinese and Category:Vietnamese Classical Chinese - the quotations for these varieties are also placed under the Chinese L2), which are categorised as Category:Literary Chinese terms with quotations. Changing Classical Chinese to etymology-only would mean these quotations have nowhere to go - it is often impossible to discern where they should otherwise be treated. I would rather counter-propose that Category:Old Chinese language and Category:Middle Chinese language be treated as etymology-only variants of Classical Chinese - OC and MC are essentially just snapshots of the sound system at a particular time point in the history of Chinese.
#3 is 1000% a no, though I would support it if one day we were to accept Altaic languages as valid :)
Fish bowl's proposal might have been motivated by the similarities between late Classical Chinese and literary terms in modern Chinese, but a more in-depth look would suggest that this is untrue. – wpi (talk) 06:09, 19 September 2023 (UTC)Reply
@Wpi Old Chinese and Middle Chinese are more conventional languages, even if semi-reconstructed, so I would argue they should stay as full languages, whereas Classical Chinese is somewhat of an artificial construct, and normally we place those as etym languages. I think the issue here is that there is more than one Classical Chinese, whereas most Classical Foo languages are fairly unified. This suggests we should separate Classical Chinese into something like Old Classical Chinese or Early Classical Chinese (an etym language of Old Chinese), Middle Classical Chinese (an etym language of Middle Chinese), and Late Classical Chinese (an etym language of Chinese?). Benwing2 (talk) 06:20, 19 September 2023 (UTC)Reply
A few issues here. I think we've been kind of sloppy when it comes to the literary/classical distinction. Most entries have been using "literary" since that was the norm back in the day. "Classical" came later as the "Classical" label was introduced to other languages, which is probably why we have fewer uses of this label. While I think the Literary/Classical distinction is useful, I wonder how we should be making the distinction in labelling. If a term is used in both Classical Chinese and Literary Chinese, such as 首 "head" (now labelled "archaic"), do we have to label it with both? Whatever we decide on, I think we also need to think about how this is organized in {{zh-x}}.
In principle, I think #1 would be something okay to do. #2 doesn't seem okay per Wpi. The issue with Classical Chinese is that it cannot fit neatly in OC or MC because as Wpi said, these are snapshots of the phonology. There is also Classical/Literary Chinese works written way after the Middle Chinese period, but not necessarily able to be considered as any modern Chinese variety. #3 would probably need to be worked out entry by entry. Some entries should probably be moved to Classical Chinese, but others may be used in highly formal modern writing. It might be difficult to distinguish the two given our previous usage of "literary" as a label. We would need to set stricter definitions for what goes where. — justin(r)leung{ (t...) | c=› }03:33, 20 September 2023 (UTC)Reply
@Justinrleung @Wpi @Fish bowl Pinging the people who previously participated as well as @Theknightwho. I am trying to convert all the bespoke variety codes in {{zh-x}} to standard codes. I added zhx-lit for "Literary Chinese", specifically the later stage of Classical Chinese; but this conflicts with the name of zhx. I really think we should rename zhx, probably to Classical Chinese. The only issue with this term is that sometimes Classical Chinese specifically seems to refer to the 5th century BC - 2nd century AD period, as in Category:Classical Chinese, or some other similar time period. I'm thinking maybe we need a different term for this: Han Classical Chinese? Although strictly speaking, the Han dynasty only began in the 3rd century BC. Or "Late Old Classical Chinese"? Please note, I also added zhx-pre for "Pre-Classical Chinese" corresponding to the old CL-PC code; but I have no idea if this makes any sense, as it seems awfully similar to Old Chinese. Benwing2 (talk) 04:47, 27 March 2024 (UTC)Reply
I should add, I also added a code cmn-bec for "Beijingic Mandarin", which is the primary branch of Mandarin that includes Beijing and environs. This is described in Wikipedia under Beijing Mandarin (division of Mandarin) whereas Beijing Mandarin itself (code cmn-bei) is described under Beijing dialect. The term "Beijingic" comes from Glottolog. This was added to correspond to the M-UIB code added and used primarily by User:Dokurrat. Since M-UIB is described as "dialectal Beijingesque Mandarin", I assume it approximately corresponds to the Beijingic primary branch. Note that the existence of Beijingic is somewhat controversial as some researchers place Beijing and surrounding dialects into Northeastern Mandarin. I also added labels (but not etym codes) for all primary Mandarin branches and many individual dialects under these branches; basically, any dialect that had 4 or more mentions among the labels as well as any dialect where I could find a corresponding English Wikipedia page describing it. (There are more dialects with Chinese Wikipedia pages but I haven't yet found them all.) Eventually I think we should assign etym codes to most or all of these dialects but for the moment I'm mostly just collecting them into labels; once I have a fairly complete set of labels it will be easier to assign codes in a semi-consistent fashion. Also, I am ready to push the code to allow both new (standard) and old (bespoke) variety codes in {{zh-x}} and then convert all uses to the new codes, but this can't run until my current run obsoleting {{zh-noun}} and {{zh-hanzi}} finishes. (It's run for ~ 22 hours so far and has maybe 14 hours to go.) Benwing2 (talk) 07:01, 27 March 2024 (UTC)Reply
BTW here is the current mapping I have worked out from old bespoke {{zh-x}} codes to standard codes:
On second thought, I don't think we should have yue-wvc and zhx-tai-wvc as they seem to be too similar to lzh-yue and lzh-tai, plus WVC-C is used only once in 萊苑 (and none for WVC-C-T). @Fish bowl who added the ux in 萊苑 for comment. – wpi (talk) 13:07, 27 March 2024 (UTC)Reply
WVC-C can probably be merged into C-LIT, but I don't have any particular suggestion for the *-C-T codes.
Mentioning this again: perhaps we should use a bipartite system giving the text language and the pronunciation language, such as lzh/cmn-TW (Literary Chinese in Taiwanese Mandarin pronunciation). —Fish bowl (talk) 03:49, 30 March 2024 (UTC)Reply
@Fish bowl Thanks for bringing this up; I missed it last time. In fact my recent overhaul of Module:zh-usex/data implemented something very similar. Essentially, there is the variety code, which is typically an etym-only language code, and then for each such variety there is a second "norm code" that is used for romanization (i.e. pronunciation) purposes. There's nothing preventing us from implementing your suggestion on top of this, if it proves necessary. Benwing2 (talk) 04:15, 30 March 2024 (UTC)Reply
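To make the bipartite idea concrete, here is a minimal Python sketch of how a combined code such as lzh/cmn-TW could be split into a text-language part and a pronunciation part. The `VarietyCode` type, the `parse_bipartite` function and the fallback rule for codes without a slash are all hypothetical; this is not how Module:zh-usex/data is actually written.

```python
from typing import NamedTuple, Optional


class VarietyCode(NamedTuple):
    text: str           # language of the written text, e.g. "lzh"
    pronunciation: str  # variety used for romanization, e.g. "cmn-TW"


def parse_bipartite(code: str,
                    default_pronunciation: Optional[str] = None) -> VarietyCode:
    """Split 'text/pronunciation'; a bare code falls back to a default.

    Assumption: with no slash and no default, the text code doubles as
    the pronunciation code.
    """
    text, sep, pron = code.partition("/")
    if not text:
        raise ValueError(f"missing text-language code in {code!r}")
    if not sep:
        return VarietyCode(text, default_pronunciation or text)
    if not pron:
        raise ValueError(f"missing pronunciation code in {code!r}")
    return VarietyCode(text, pron)


print(parse_bipartite("lzh/cmn-TW"))
print(parse_bipartite("yue"))
```

Keeping the two parts separate means the same Literary Chinese text could be tagged once and paired with whichever reading tradition a given quotation needs.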
@Fish bowl @Wpi Just to confirm: All three of WVC-C (yue-wvc) "Written vernacular Cantonese", CL-C (lzh-yue) "Classical Cantonese" and C-LIT (yue-lit) "Literary Cantonese" can be merged? This is based on @Wpi suggesting merging WVC-C with CL-C and @Fish bowl suggesting merging WVC-C with C-LIT. I assume that if these were different they would refer to different time periods (?), but I don't know if there's enough difference to warrant separation. Benwing2 (talk) 05:02, 30 March 2024 (UTC)Reply
@Benwing2: Incorrect. I believe the problem here is that we have multiple understandings of the usage of the Cantonese codes. Generally speaking, there are these types:
1. modern spoken Cantonese
2. modern written vernacular Chinese (e.g. Hong Kong Chinese)
3. written vernacular Chinese from 19th/early 20th century
4. Classical Chinese that uses Cantonese pronunciation
5. spoken Cantonese from 19th/early 20th century (e.g. dictionaries from missionaries)
They are especially difficult to tell apart when the phrase/sentence is short and does not contain many grammatical features.
Justin, Alex and I use C for #1 and #5 (or the newly added C-HK when applicable), C-LIT for #2, and C-CL for #3 and #4.
Fish bowl uses C-GZ for #1, WVC-C for #2 and #3, and C-CL for #4. (Please correct me if my understanding is incorrect)
@Wpi OK thanks and apologies for my confusion, I haven't encountered before (as a linguist) the situation where there's a big gap between the written and spoken forms and multiple ways of pronouncing a given written form. Benwing2 (talk) 20:11, 30 March 2024 (UTC)Reply
I should add, does anyone mind my renaming the old {{zh-x}} codes to the new ones? As shown in the above table, no information will be lost because there is a one-to-one mapping between the old and new codes. Benwing2 (talk) 02:56, 31 March 2024 (UTC)Reply
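The "no information will be lost" claim amounts to the old-to-new mapping being one-to-one, which is easy to check mechanically. The Python sketch below uses only the handful of code pairs that are spelled out in this discussion; it is not the full mapping, and the helper names are hypothetical.

```python
# Partial mapping: only pairs mentioned explicitly in the comments above.
OLD_TO_NEW = {
    "WVC-C": "yue-wvc",   # Written vernacular Cantonese
    "CL-C": "lzh-yue",    # Classical Cantonese
    "C-LIT": "yue-lit",   # Literary Cantonese
    "M-UIB": "cmn-bec",   # Beijingic Mandarin
    "CL-PC": "zhx-pre",   # Pre-Classical Chinese
}


def is_one_to_one(mapping: dict) -> bool:
    """True if no two old codes share the same new code."""
    return len(set(mapping.values())) == len(mapping)


def invert(mapping: dict) -> dict:
    """Reverse the mapping; refuses to run if that would lose information."""
    if not is_one_to_one(mapping):
        raise ValueError("mapping is not one-to-one; inversion would lose codes")
    return {new: old for old, new in mapping.items()}


print(is_one_to_one(OLD_TO_NEW))
print(invert(OLD_TO_NEW)["yue-wvc"])
```

If any of the merges discussed above went ahead (for example folding WVC-C into another code), the check would start failing, which is exactly the signal that the rename is no longer reversible.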
I don't particularly support the usage of the C-GZ tag, and think that (for zh-usex at least) it can be safely merged into C. For #2 (modern HK) I also use C-LIT.
@Fish bowl Is your thought that we should use just "Cantonese" (yue, or yue-can as suggested in the RFM discussion below) as the language code? This would be parallel to the normal handling of Latin, where Classical Latin terms are usually identified as just "Latin" (code la), although there's also a code for "Classical Latin" (code la-cla or CL.). The use of Guangzhou Cantonese specifically (yue-gua) as a code would then be restricted to cases where it's important to distinguish usage that is specific to urban Guangzhou speech as opposed to Standard Cantonese. I think this is also parallel to the use of cmn (Standard Mandarin) vs. cmn-bei for Beijing Mandarin. Benwing2 (talk) 00:48, 1 April 2024 (UTC)Reply
yue-gua: maybe yue-gzh? Keeping the initials is more sane IMO but I also remember that you wrote your own guideline in one of the other discussions.
Even the category name is not grammatically correct, it should be either multiracial people or multiracials.
A lot of mixed-race group names are not dictionary material, being SoPs. Therefore, I do not think we need any category dedicated to multiracial people (the name as used in that category, which itself links to Wikipedia). ·~dictátor·mundꟾ21:36, 26 February 2022 (UTC)Reply
Issue #1 could be addressed very easily by adding it to the category tree. Issue #3 could be addressed by renaming the category. I don't find #4 a very convincing argument. We could just keep the ones that aren't SOP and delete the ones that are, which is the same rule we apply to any other kind of term. There are evidently plenty of such terms in English that are single words or idiomatic. Also, the RfD of Irish American closed as keep, and although that isn't a term related to mixed ancestry, it shows that hyphenated or spaced combinations of nationalities aren't necessarily regarded as SOP by the community.
Point #2 seems to be the most substantial, but as I wrote in my comment below I think there might be value in separating out these terms from other ethnonyms. And not all of these fall under Scientific racism or Eugenics. I don't think Blasian or Finndian are necessarily associated with racism or eugenics. Such terms seem to be used as identities by members of the groups themselves. That said, I don't particularly object to deletion either. 70.172.194.25 02:17, 23 February 2023 (UTC)Reply
I can see some potential value in keeping a separate category for things like mulatto, Blasian, Chindian, the last of which isn't even in the category currently. I think there is something different about those terms as compared to most ethnonyms, and Wikipedia seems to agree given that it has a "Multiracial affairs" category with similar terms as members.
While an argument could be made that almost all human populations have admixture from multiple groups and so almost everyone is in some sense multi-ethnic, I don't think most would e.g. include Desi in this category even if one could theoretically make an argument that the Indian subcontinent has a mix of Indo-Aryan and Dravidian genetics. It has to be a term for someone whose recent ancestors came from different groups, not way back in (pre)history. One doesn't have to be a racial essentialist to realize that people who fall under this umbrella are viewed differently in society (hence the existence of such terms).
That said, I don't particularly object to deletion, as the issue may be more trouble than it's worth, most commenters support deletion, and the category mixes relatively PC and highly offensive terms as though there is no distinction.
The inclusion of BIPOC in this category perplexes me. I thought "multiracial" in this context describes a person with mixed ancestry, not a coalition of different ethnic groups. 70.172.194.25 01:55, 23 February 2023 (UTC)Reply
Delete. Category:Ethnicity is enough, this looks like a race fetish based on the social construct of race that is particular to the English language having this word race and discourses about it. America moment. Fay Freak (talk) 15:12, 11 December 2024 (UTC)Reply
It would be impossible to add these automatically without rendering the categories essentially meaningless. For example, d has several POSs which consist only of abbreviation senses (which evidently don't count as "words" in the eyes of this categorisation system), but the headword line template has no way of knowing that. This, that and the other (talk) 04:27, 12 March 2022 (UTC)Reply
Not to mention the fact that we don't want to increase Lua memory burden on Latin script letter pages, so a lot of headword templates on a few such entries (currently a, A, b, o, u) are using {{head-lite}} anyway. 70.172.194.2505:22, 12 March 2022 (UTC)Reply
Keep I agree the one and two letter categories are useful. As pointed out, a bot can't make proper categorization since it can't separate words from other character groups like abbreviations. Bots could assist maintenance, if all n-letter entries were flagged with either "include" or "exclude" templates or some such; the bots could report entries missing either flag into a maintenance category for manual attention. --R. S. Shaw (talk) 18:11, 21 May 2022 (UTC)Reply
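The flag-based maintenance workflow proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions: the template names `n-letter include` and `n-letter exclude` are hypothetical (the comment only says "include" or "exclude" templates "or some such"), and entries are modelled as a plain mapping from page title to wikitext rather than fetched from the wiki.

```python
import re

# Hypothetical flag template names; no such templates are named in the discussion.
INCLUDE_FLAG = "n-letter include"
EXCLUDE_FLAG = "n-letter exclude"


def has_template(wikitext, name):
    """Return True if wikitext contains a call to the template called name."""
    pattern = r"\{\{\s*" + re.escape(name) + r"\s*(\||\}\})"
    return re.search(pattern, wikitext) is not None


def classify(title, wikitext, n):
    """Classify one entry with respect to the n-letter category."""
    if len(title) != n:
        return "not applicable"
    included = has_template(wikitext, INCLUDE_FLAG)
    excluded = has_template(wikitext, EXCLUDE_FLAG)
    if included and not excluded:
        return "include"
    if excluded and not included:
        return "exclude"
    # Missing both flags, or carrying contradictory flags: a human must decide.
    return "needs attention"


def entries_needing_attention(entries, n):
    """List the n-letter titles a bot would report for manual review."""
    return sorted(
        title
        for title, text in entries.items()
        if classify(title, text, n) == "needs attention"
    )
```

The bot never decides whether a page is a "word"; it only reports n-letter pages that no editor has flagged yet, so the judgment about abbreviations and other non-words stays manual.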
Delete. Racist. Everything in Semitic languages is three letters, the most important words shorter than that. Not to speak of CJK. Databases usable for word games should be designed by external applications, this is basically special-casing Standard Average European languages. Also disproportionate maintenance burden. Fay Freak (talk) 15:09, 11 December 2024 (UTC)Reply
i agree with your last point. it is a high maintenance burden for those who maintain it. i do not agree with your first point that the category is racist because three letter words are common in semitic languages. unless you can explain why the existence of semitic languages makes this language-independent category racist, that line of reasoning is invalid. TheQWERTYCoder (talk) 20:12, 3 April 2025 (UTC)Reply
@Koavf: Please see the bottom of that category that contains reference templates pertaining to the Indo-Aryan language family as a whole, rather than specific languages or even chronolects. Since Wiktionary does not treat Indo-Aryan as a united macrolanguage like Sinitic/Chinese, it makes more sense to dedicate a separate category for the current reference templates that deal with Indo-Aryan linguistics. That said, we may remove the 56 individual language categories from the list. (Pinging @Kutchkutch, Bhagadatta, Svartava, AryamanA for more input.)
I’m not sure what is to be done with other families, or whether consistency is needed across all languages; in that case you could raise the matter in the BP. ·~dictátor·mundꟾ 11:32, 13 March 2022 (UTC)
@Inqilābī: You are correct that some of these are about Indo-Aryan at large, but 1.) they can just be put into specific language categories as they are used on entries for those languages, 2.) what I'm suggesting is already done in practice for several of these categories (and was before I started editing them), and 3.) there are definitely other references that apply to more than one (e.g.) Romance language or Semitic language as well, so we're back to either sorting one reference template into several individual language categories (my preference) or building out the module and hierarchy to include language families. —Justin (koavf)❤T☮C☺M☯15:45, 13 March 2022 (UTC)Reply
@Koavf I find the existence of these categories very convenient and do not support their deletion. There is a long history of comparative literature on these families and in that respect being able to easily find reference templates for related language varieties is useful. If these categories are outliers, then I would suggest that the creation of more like them would actually be a better idea. I could see such categories being particularly useful for Iranian languages, Dravidian languages, Austro-Asiatic, for example. عُثمان (talk) 16:27, 21 July 2023 (UTC)Reply
No—the larger a category is, the more challenging I find it to navigate it. (Even the difference in the amount of time it takes to load the pages is non-trivial since I don't have a great internet connection.) عُثمان (talk) 17:29, 22 July 2023 (UTC)Reply
If you make smaller subcategories, the containing category becomes larger. There also aren't any established genetic subgroupings of Indo-Aryan to make smaller categories with. (Most schemes which do so out of convenience on solely geographic criteria end up splitting mutually intelligible dialects between subgroups.) So it is not clear what sort of categories you are suggesting be made. عُثمان (talk) 03:11, 23 July 2023 (UTC)
The first category contains things like "state", "county", "province". The second contains things like "California", "Yorkshire", "Guangdong". 70.172.194.2518:58, 12 April 2022 (UTC)Reply
The intended distinction (which, when I spot-check a few categories, actually seems to be decently well maintained) seems to be as IP 70.172 says. But I am inclined to agree that the current names don't convey a meaningful distinction. If we want to continue having separate categories for "county, burgh, kingdom, ..." vs "Mayo, Yorkshire, Idaho, ...", it would be better to devise more distinct names for the categories... - -sche(discuss)23:14, 20 February 2023 (UTC)Reply
IP is right. I just came here because Ottoman Turkish قضا(kaza) was in the wrong category, and pushed the panic button. The naming should be something more intelligent. Fay Freak (talk) 03:33, 21 February 2023 (UTC)Reply
I agree that the names are highly confusing. Maybe we should rename the first one “types of administrative division”, or something similar. Incidentally, that’s exactly the name of the corresponding en.wikipedia category. 70.172.194.2503:39, 21 February 2023 (UTC)Reply
Now the yerba gave me the idea. We just name the latter “named political subdivisions”, to avert the exemplified mistake. The former shall not be renamed because it is added manually while the other is a mediate effect of Template:place etc. I also briefly thought about going to Wikipedia to see how they do but we don’t have the same problems. Fay Freak (talk) 03:47, 21 February 2023 (UTC)Reply
New Caledonia is a sui generis overseas collectivity of France. It has membership in the French parliament and France's rule of law and citizenship extends there just like in Corsica or Guadeloupe or Lyons. None of these are dependencies: they are all first-level administrative divisions of the French Republic. —Justin (koavf)❤T☮C☺M☯ 00:48, 24 June 2022 (UTC)
I want a category for all overseas territories of France, and I don't much care about the technicalities. What is the right category? Benwing2 (talk) 01:52, 24 June 2022 (UTC)Reply
The regular approach is to list these at WT:RFM. This seems, however, a place where proposals go to linger in limbo: there is an unresolved category move request (WT:RFM § Category:WC) from 2015. The sledgehammer approach is to create a vote at WT:VOTE. --Lambiam17:20, 15 July 2022 (UTC)Reply
It would be good if we fixed it, as we have with category and label inconsistencies previously. If not now, I am sure someone will bring this issue up and fix it sometime. J3133 (talk) 16:50, 15 July 2022 (UTC)Reply
Personally, I would lowercase the label (and anything else). On the other hand, Google Books Ngrams suggests Internet is more common. That said, it's less work to lowercase the label than to move all the categories... - -sche(discuss)23:33, 23 July 2022 (UTC)Reply
It should be capitalised. There is such a thing as "an internet" or internetwork (generic; although you very rarely hear this terminology any more), versus "the Internet" (the global thing we all use all the time). Same deal with "the Web" versus (I suppose) "a web" although I don't remember even the most braggart webmasters using the latter. As always, citable usage trumps what I say, but I am historically correct. Equinox◑03:14, 13 March 2023 (UTC)Reply
"Category: " is the standard naming convention of lexical categories. Category:English irregular nouns, Category:English onomatopoeias, Category:English fandom slang, etc. This category contains only English-language DoggoLingo terms, and thus the correct name should be "Category:English DoggoLingo". German-language DoggoLingo terms would go under "Category:German DoggoLingo", French DoggoLiggo would go under "Category:French DoggoLingo", etc. (Presuming this meme has spread to other languages.) WordyAndNerdy (talk) 06:24, 30 July 2022 (UTC)Reply
We could use some empirical data here. Does DoggoLingo or a close equivalent actually exist in German or French? If it does, that provides some reason to approve this proposal (and possibly to update the relevant articles). If not, it provides some reason to reject it. 98.170.164.8806:40, 30 July 2022 (UTC)Reply
I agree with this. WordyAndNerdy, do you have proof that Internet slang related to dogs (i.e., of the type of DoggoLingo) exists in other languages, and would use the same name derived from English slang? J3133 (talk) 06:45, 30 July 2022 (UTC)Reply
*deep existential sigh* English-language lexical categories have an established naming convention. I have never seen an English-language lexical category that was just "Category:Word type" (e.g. "Category:Fandom slang", "Category:Military slang", etc.) in 10+ years of contributing. Can't speak for lexical categories in other languages, but if someone wants to change an established convention, they need to do so by obtaining consensus, not by unilaterally imposing a new standard. This is an extremely straightforward request and having to get bogged down in bureaucratic discussions like this means less time for doing productive things like attesting Internet slang. WordyAndNerdy (talk) 07:15, 30 July 2022 (UTC)Reply
A proper noun that specifically refers to English, if you are not already aware. Like Rotwelsch is a proper noun referring to German. J3133 (talk) 07:30, 30 July 2022 (UTC)Reply
The Wikipedia article defines DoggoLingo as an "Internet language" and doesn't specify that it's limited exclusively to English in said definition. In any case, this is completely perpendicular to the issue of what the category should be named. No one had to prove the existence of Dutch Twitch-speak, Korean Twitch-speak, etc. to create "Category:English Twitch-speak." That's what the category ought to be named following the established naming convention of English-language lexical categories. (And given that you haven't incorporated this category into the category tree module -- which is like step two of creating a new category -- maybe it isn't prudent to act as if you have special expertise or authority in this area.) WordyAndNerdy (talk) 07:51, 30 July 2022 (UTC)Reply
And the consensus of that discussion seems to be that "Thieves' cant" is a strictly English historical example of criminal slang, and that the non-English entries in Category:Thieves' cant should be moved to language-specific criminal slang subcategories- the opposite of this proposal.
It's true that there's a naming convention to put language names in category names, but that doesn't apply to this kind of entry, and saying it does shows a misunderstanding of the convention. While there's nothing to stop other languages from having their own equivalents to DoggoLingo, it seems to have been created by English-speakers using humor based on the peculiarities of the English language. If other languages come up with their own equivalents, I sincerely doubt that they would be called DoggoLingo. DoggoLingo is a variety of English, just like pig Latin and double Dutch, and "English DoggoLingo" would be redundant. Chuck Entz (talk) 08:13, 31 July 2022 (UTC)Reply
This was put up for RFM back in 2022, but I couldn't find a discussion so I assume one was never started. This should definitely be at a topic category Category:en:Doggo lingo if we should even have such a category in the first place (which I don't think we do). @WordyAndNerdy: pinging original nominator - saph ^_^⠀talk⠀14:32, 3 February 2025 (UTC)Reply
@Saph: I have merged your comment with the discussion. As has been mentioned by Chuck Entz (and I agreed), “English DoggoLingo” would be redundant (not “Doggo lingo” (see DoggoLingo), and not with “en:” because this is not a topic category). J3133 (talk) 14:46, 3 February 2025 (UTC)Reply
Do any languages other than English have "DoggoLingo" terms, which they call DoggoLingo? It's believable, but AFAICT undemonstrated (and I haven't spotted any in my own limited search). If "DoggoLingo" is, as presently defined, definitionally English-only, then the current name is in line with e.g. Category:Verlan (which is a subcategory of CAT:French language without having to be named "French verlan", since there is apparently no other kind of verlan), and , as patiently pointed out to OP above. (Category:English Thieves' Cant has "English" in the name despite being English-only, but that's partly my own fault and was the result of an RFM trying to solve a different problem, of people putting other languages' criminal slang in the category. Still, it means Wiktionary does not have one singular standard way of dealing with "language-specific slangs".) OTOH, if non-English DoggoLingo entries exist, then the categories would need to be specified by language. - -sche(discuss)23:56, 4 February 2025 (UTC)Reply
As AG202 stated in the DoggoLingo category RFM, “There’s no overarching Category:Pig Latin or Category Pig Latin terms, nor does there seem to be other languages linked to it, so there really shouldn’t be an English label there.” This was after Chuck Entz used the argument there that “English DoggoLingo” would be redundant, “just like pig Latin”, then Binarystep pointed out that the Pig Latin category does use the English label—redundantly. J3133 (talk) 11:24, 1 August 2022 (UTC)Reply
2. Terms denoting a currency or something similar that is in no way specific to this period in time (halerz, korona)
3. Terms denoting a place name of a town or city that still exists (Prague, ベオグラード)
The category for Yugoslavia does include a handful of entries that do not fall into these types, but are still pretty basic and do not imho need a category (these are terms like Yugoslav People's Army).
I feel like these categories denote historical periods of countries that are not culturally significant enough to warrant a separate category, but rather should be distributed among the country's descendant's/s' categories. I don't see - judging by the entries now present in the category - how Yugoslavia is more culturally and/or historically significant than, say, the Batavian Republic, the Kingdom of Italy or the French Third Republic. I do not think these three categories are at all comparable with CAT:Soviet Union or even CAT:East Germany (again, judging by the current entries that are added to the respective categories). Thadh (talk) 18:15, 21 August 2022 (UTC)Reply
It seems that at least for Category:West Germany, the category just isn't populated in the same way that Category:East Germany is, at least for English. I've created the category Category:en:West Germany, and populated with a few entries. I also added terms like Wessi and Besserwessi to the respective category for German. With that, I think that the categories just need to have entries actually marked with them before really talking about deletion. AG202 (talk) 18:57, 21 August 2022 (UTC)Reply
My point was more that if Category:en:East Germany exists then Category:en:West Germany should as well, looking at the entries in both. I'm aware that Category:DDR_German exists as well for East German German, but that can exist easily without Category:de:East Germany. Either they should both go or they should both stay. Same with the Yugoslavia category. The only one I'd maybe support deleting is the Czechoslovakia category, but I have a strong feeling that it just hasn't been populated and there may be Czech or Slovak terms that would have a special place in it. AG202 (talk) 22:54, 21 August 2022 (UTC)
The category seems to be a very small set of characters that may or may not actually be derived from a process of "breaking". I also don't know if the title of the category is appropriate. — justin(r)leung{ (t...) | c=› }02:37, 19 January 2023 (UTC)Reply
What do you mean? I'm not sure how much you know about Chinese glyph origins, but is it not obvious that 彳亍 is a broken up version of 行? It's pretty much accepted that 行 came first and then the other characters were derived from breaking it into two halves. And thus with all other characters in the category. WiktionariThrowaway (talk) 00:46, 20 January 2023 (UTC)Reply
@Justinrleung any response here? Even as a non-CJK speaker I kind of understand what this category is doing, even though the name feels wrong (it seems like it should be "Han characters derived by breaking" or something). This, that and the other (talk) 07:04, 31 March 2023 (UTC)Reply
@WiktionariThrowaway, This, that and the other, Sgconlaw: Sorry for not checking again. I kind of understand what the category is for now, but I do agree that the name of the category looks wrong since the characters in the category are not breakable AFAICT. While 彳亍 is from "breaking up" 行, other uses of 彳 (mostly as a radical) are simplifications of 行 rather than "breaking up" the character. I'm not sure if there's an established name for this phenomenon. @RcAlex36, Wpi31, do you know of any names for this? — justin(r)leung{ (t...) | c=› }15:32, 1 April 2023 (UTC)Reply
@Justinrleung: ah. Well, it's clear that the category name is wrong, because the category contains characters that are the result of "breaking" other characters, so the first-mentioned characters are not themselves "breakable". My question is whether there is any particular value in having a category of characters that are derived from other characters in this way. For example, if such "breaking" (for want of a better term) is not a common way of forming new characters, then it doesn't sound very useful. The fact that such characters are formed from other characters is already explained in the relevant etymology sections. — Sgconlaw (talk) 15:43, 1 April 2023 (UTC)Reply
I doubt there is a proper academic name for this, not even in Chinese. It seems that all these characters come in (vaguely) mirrored pairs, each constituting half of a common character, so maybe "Han characters with etymologically-related mirror characters" or "Han characters derived from halving a character"? – Wpi31 (talk) 16:07, 1 April 2023 (UTC)Reply
@Wpi31: yes, but is there even any point in categorizing characters in this way? The category currently has just 14 entries. Are there likely to be more? — Sgconlaw (talk) 18:00, 1 April 2023 (UTC)Reply
English. There seems to be some conflation between the two. {{lb|en|China}} categorizes into the former, though people often do mean the latter, which only has 3 entries. For example, typhoon shelter, Hong Kong foot, add oil, and aiya are labelled as both {{lb|en|China}} and {{lb|en|Hong Kong}}.
Also, "Chinese English" technically includes Hong Kong English by the criteria of geography, but linguistically and lexicographically speaking, there is very little influence on HKE from the mainland, which means there are not many instances where we actually need to categorize into both; the existing ones in the category that I'm aware of are (excluding the four already mentioned above) joss stick, Ins, KMT, and ACG. Note that this also causes abominations like the one at ACG, which is meant to include Taiwan as well. (We can ignore Macau for the sake of simplicity, since the English used there is basically a toned down version of formal HKE) – Wpi (talk) 17:29, 30 May 2023 (UTC)Reply
Off-topic: In my opinion, KMT and joss stick are not regional forms of English; indeed, the latter is currently not labelled as such. (Indeed, 'joss' is not so labelled, though it's not part of my active vocabulary.) --RichardW57m (talk) 09:18, 2 June 2023 (UTC)Reply
Should these categories be merged? Many terms in the -cide categories end in -icide, and thus should be moved, unless we decide not to make this distinction. J3133 (talk) 12:45, 14 June 2023 (UTC)Reply
(Merged from a request to merge into Category:en:Christianity)
Not clear to me how these are supposed to be distinguished. The boilerplate description at Category:Ecclesiastical terms by language says "terms used only by religious figures", but that's manifestly wrong for the terms at Category:English ecclesiastical terms which are also variously used by commentators like historians or musicologists who may or may not be religious themselves. In reality the category, certainly for English, seems to just contain terms topically related to Christian churches—not just religion in general—and these should be listed under Category:Christianity instead. The "ecclesiastical" label should perhaps also be made an alias of "Christianity". @Andrew Sheedy —Al-Muqanna المقنع (talk) 15:56, 29 August 2023 (UTC)Reply
This is actually already on this page! See Ioaxxere's discussion above. It was pointed out that not all the terms are related to Christianity. However, I do agree that "ecclesiastical" is not the best label. Simply labelling according to religion would be preferable, I think. Andrew Sheedy (talk) 17:42, 29 August 2023 (UTC)Reply
@Andrew Sheedy: Oops, completely missed that. I'll merge the discussions (and add a template to save anyone else making the same mistake). @Theknightwho The Thai category is very interesting, looking through it, but it seems to be describing a very different thing from the English category—maybe the problem is specifically how the English category is being used? —Al-Muqanna المقنع (talk) 18:03, 29 August 2023 (UTC)Reply
It seems like everything which is in this category would be better off in a specific religion's category or, if pan-religious, in the "religion" category. (But many things currently in the "religion" categories are Christianity-specific, as I raised at Wiktionary:Information desk/2023/August#Christianity_terms_labelled_broadly_"religion" and intend to deal with at some point.) The widespread misuse of the label / category for terms that are better in other categories means we might be better off retiring it, although the other possibility is making it an alias of "religion" and then trying to monitor misuse, which we have to do with "religion" already anyway. - -sche(discuss)16:33, 6 September 2023 (UTC)Reply
In the Thai case it might be useful to distinguish between terms that are topically relevant to religions and terms used in religious contexts. I'm not convinced that distinction is generally useful, though: stuff like PBUH would certainly fall into the latter category but I think the (Islam) context label does the job (and labelling it "ecclesiastical" would come off as decidedly odd in general). My inclination would also be to merge it in the way you describe, so moving it to the relevant religion(s) or to the overall religion category if it's non-specific, but that leaves more complicated stuff like the POS subcategories at Category:Thai ecclesiastical terms up in the air given that we don't generally do that kind of breakdown for topic categories. —Al-Muqanna المقنع (talk) 17:19, 6 September 2023 (UTC)Reply
Do the Thai POS subcategories make any sense or can they simply be deleted? "to kill (a god, high priest, or royal person)." does not seem to be an "ecclesiastical verb"-as-different-from-a-"verb", any more than deicide is an "English ecclesiastical noun", it seems to just include religious figures in its scope. And if specific verbs are only used by Buddhists (or whatever), then using the usual POS categories and then also using {{lb}} would seem to be the normal way of handling that, right? - -sche(discuss) 03:13, 7 September 2023 (UTC)
Since the BP discussion has grown stale I'm going to vote delete all (with the understanding that the key changes will be to the modules, not the categories themselves). — excarnateSojourner (ta·co) 20:36, 7 March 2024 (UTC)
If a set category is truly restricted to one language (e.g. Translingual), should we leave it at whatever prefixless name it may have, or move it to "mul:" (or whatever other language code is appropriate) and put it into the "set category" system, even if it only exists for one language?
Do the categories named above actually only exist in one language (Translingual)? Should .გე, .հայ, .한국, etc go in the same category as .de, or would they belong in "ka:Top-level domain codes" (and "hy:", "ko:", etc)?
@-sche My current tendency is to only create topic and set categories if they exist (or may exist) for more than one language. However, I think this is probably the wrong thing to do. The poscatboiler system supports language-specific categories like Category:Bulgarian conjugation 2.1 verbs and I don't see why we can't do the same in the topic category system. (BTW the poscatboiler system now handles all categories of all sorts except for topic and set categories. I've been thinking for a while of making it handle topic/set categories as well and eliminate the separate topic category system; this would make it possible to consolidate the generic category code into the poscatboiler system, so there's only one unified category system.) For #2, I'm not really sure, but my instinct is that non-Latin-script top level domains should also be translingual. Note for example that Korea created Korean-specific Latin-script TLDs like .kia, .samsung and .hyundai (see .kr on Wikipedia); if these are translingual I don't see why the Korean-script ones shouldn't be. Benwing2 (talk) 02:22, 26 September 2023 (UTC)
I just added this category to the list template {{ccTLD}} which goes in TLD entries so that all the mainspace transclusions will be in the category. I figured we might as well have the category full until we decide to change it. That means there will be some entries with the category both hard-coded and template-generated. Chuck Entz (talk) 15:34, 12 December 2024 (UTC)Reply
There are multiple constellation systems, but we only have one category for all constellations - contrast this with Category:Chinese astronomy which is a subcategory of Category:Astronomy. In the label tree there is already {{lb|zh|Chinese constellation}} which categorises into Category:LANG:Chinese astronomy and Category:LANG:Constellations, and therefore makes these categories very messy (see e.g. Category:zh:Constellations where terms ending in 座 are in the European system while the rest are the Chinese ones - I'm in the process of adding more for the latter). Also note that there are still a bunch more that have been incorrectly categorised, e.g. Ox which has {{lb|en|astronomy}} rather than {{lb|en|Chinese constellation}}, so there would be a decent amount of terms to warrant a split.
@Victar I don't see the point of Category:Rest. It groups Category:Sitting and Category:Sleep, which don't form a natural category. I also think we don't need a Category:Sitting. This category exists in only one language (Proto-West-Germanic), in which it has only two morphologically-related terms, a verb for "sit" and a verb for "sit on, occupy, etc.". It is better to use affix or root categories to group morphologically-related terms, and it is enough to create Category:Body positions (a set category) to enumerate verbs related to body positions, like sit, stand, lie, crouch, loom, etc. Benwing2 (talk) 07:16, 28 September 2023 (UTC)Reply
"Sitting" is not necessarily rest, and it seems too specific to be a category of its own. We don't need categories for every single semantic concept; that would be far too ramified. IMO we need to be parsimonious (which Chrome underlines in red, for no obvious reason) with our categories so we don't end up overwhelming the end user or creating categories that are largely unfilled. I really think you should think about creating larger-scale categories and only create the finer-scale ones when there's a genuine need. (BTW non-humans have bodies too, so Category:Body positions should be fine for animals, and "sitting" in reference to inanimate objects is not the same concept.) Benwing2 (talk) 22:39, 28 September 2023 (UTC)Reply
I hate to vote to delete something that another contributor to this project clearly thought/thinks was a good idea, but I have to agree with Benwing that nothing about grouping "sleep" and "sitting" together as "rest" seems logical or necessary to me; I would delete the "Rest" category, at least as it is presently constituted. "Sitting" is a more plausible category, I would like to see how many terms it could contain: sit (and trivial variations like sit up and sit down), terms like Indian style and criss-cross applesauce, and what else? If it's only a handful of terms, I'm not sure it's useful, but if it's a lot of terms, then sure... - -sche(discuss)00:33, 29 September 2023 (UTC)Reply
Hmm, if we can flesh it out (categorize more entries into it, or list entries that would go in it), I could get a better sense of how useful a Category:Rest would be; maybe it's useful after all. I see Category:Sitting has been populated with a variety of entries. When I add a new category, I try to populate it with entries right away to demonstrate that it could contain a significant number of entries, to head off RFDs (or at least let them proceed based on evaluation of what a decently fleshed out category looks like). - -sche(discuss)14:05, 30 September 2023 (UTC)Reply
I also have a request for you: please let's have a moratorium on further Victar-created categories until we've made sure the dozens you've recently created are needed. Many of them appear to be unneeded or ill-thought-out, and I need to have time to review them individually while you're not adding even more. Benwing2 (talk) 07:27, 28 September 2023 (UTC)Reply
Again, Category:Work is a port of Wikipedia Category:Work. There is work that isn't employment, which is why Category:Employment is a separate category on Wikipedia. I can tell you, none of the categories I've created are "ill-thought-out", and I've put a lot of consideration into them. But please feel free to audit me on any of my additions. -- Sokkjō 21:20, 28 September 2023 (UTC)
Absolutely, we have no imperative to follow Wikipedia, but seeing as their categories are far more extensive and already vetted, it's not a bad starting point. Category:Work contains both Category:Employment and Category:Slavery, which are both clearly distinct, yet still fall under "work". I think there's plenty of "rhyme or reason". -- Sokkjō08:20, 29 September 2023 (UTC)Reply
@Sokkjo From looking at Wikipedia's categories, they are a sorry mess, much like a lot of the Wikipedia modules that people wrongly think are a good idea to "port over". They are too ramified and highly duplicative. I would not use Wikipedia as a starting point. You have a point that slavery is a type of forced labor, which is a type of labor, and employment is also a type of labor; so maybe we should rename Work -> Labor, which makes it clearer. (Wikipedia again, in their messiness, distinguishes Work from Labor, but I disagree.) I also see you created 'Slaves' as a topic cat separate from Slavery; this makes no sense at all, so I'm RFD'ing Category:Slaves. Once again I ask you to stop creating new categories until we have a chance to work through the dozens you've already created. Benwing2 (talk) 08:49, 29 September 2023 (UTC)
I agree that we're going to have a hard time making users keep "Work" and "Employment" distinct (and if we rename "Work" to "Labor", then we also have a hard time distinguishing it from Pregnancy / labor). I also appreciate the point that not all work is employment. What if we solve this in the other direction: keep CAT:Work and delete CAT:Employment, which currently contains no entries in any language and no subcategories besides CAT:Occupations? CAT:Employment seems like it does nothing but make users put in an extra click when navigating between CAT:Occupations and CAT:Business (or whatever higher-level category we might decide to put Occupations in). If we make Occupations a subcategory of Work instead, then any work-that's-not-employment could go in Work, as can any employment-that's-not-an-occupation (while still putting anything that does belong in a more specific category like Occupations, or e.g. CAT:Hobbies, in those categories). - -sche(discuss)14:45, 30 September 2023 (UTC)Reply
OK, I have recategorized "occupations" from "employment" to "work" (in the module). Because "occupations" categories were the only contents of the "employment" categories for all but three languages, the "employment" categories are now empty. There were only six entries categorized as "employment"; everything else was already in a more relevant category like "occupations". AFAICT, "employment" can now be removed from the module and all the empty "employment" categories which were so useless and unused for so long can be deleted. - -sche(discuss)17:03, 28 March 2024 (UTC)Reply
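The recategorization described above can be pictured with a small sketch. It is only an illustration under stated assumptions: the `parents` mapping, the entry counts, and both helper functions are hypothetical and do not reflect the actual layout of the category tree module.

```python
# Hypothetical miniature of a topic-category tree: child label -> parent label.
# The entry counts below are made up for illustration only.

def reparent(parents, label, new_parent):
    """Return a copy of the tree with `label` moved under `new_parent`."""
    updated = dict(parents)
    updated[label] = new_parent
    return updated


def removable_labels(parents, entry_counts):
    """Labels that have neither child categories nor entries of their own."""
    with_children = {p for p in parents.values() if p is not None}
    return sorted(
        label
        for label in parents
        if label not in with_children and entry_counts.get(label, 0) == 0
    )


parents = {"work": None, "employment": "work", "occupations": "employment"}
entry_counts = {"work": 2, "employment": 0, "occupations": 10}

# Before the move, "employment" still has "occupations" as a child.
before = removable_labels(parents, entry_counts)

# After the move, "employment" has no children and no entries left.
after = removable_labels(reparent(parents, "occupations", "work"), entry_counts)
```

In this toy tree nothing is removable before the move, and only "employment" becomes removable afterwards, which mirrors the situation reported in the comment.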
I imagine this was done this way because in the US, "football" universally refers to American football (or occasionally Canadian football, which is quite similar), and never to soccer (except in the names of certain soccer clubs, which often call themselves "football clubs" (F.C. for short) in imitation of European football clubs). But it looks out of place, and Canada similarly refers to Canadian football as just "football" but our category is Category:Canadian football not Category:Football (Canadian). Wikipedia has its article on American football at American football (logically) and similarly for Commons. BTW once we rename Category:Football (American) to Category:American football, we might consider renaming the soccer category to Category:Association football (consistent with Wikipedia), but that's a separate can of worms. Benwing2 (talk) 04:01, 4 October 2023 (UTC)Reply
I would think that our contributors could tolerate a lack of parallelism in topical categories where the base terms reflect common usage and the differentia are in parentheses. This seems like overtidying, letting one's own personal preferences for parallelism override broader, user-oriented considerations. The (non)problem only appears in the Category:Football page. DCDuring (talk) 12:16, 4 October 2023 (UTC)Reply
I would've guessed it was done this way so someone typing "Footba..." into Hotcat (or typing "Category:Footba..." into search) would notice that they needed to specify rather than just using bare "Football". I'm not wedded to the current names, but I don't see a compelling reason to change them, either. - -sche(discuss)05:56, 3 November 2023 (UTC)Reply
Football categories
The following categories, as well as their child categories, should be moved for name harmonisation.
non-integrated topical categories and what to do about them
(moved to WT:CLTR)
Solomonfromfinland and other users have been busy creating and populating random non-integrated topical categories. I found all of them that are prefixed with en:. The following are my suggestions for how to handle them. Note that "keep" implies integrating them into the topic cat system.
Delete? Seems too general. An allotrope is "Any form of an element that has a distinctly different molecular structure to another form of the same element, with different physical properties and often different chemical properties." Note, we do have CAT:en:Isotopes.
Keep, I think, but the individual entries (only 4 of them) need to be in subcategories like Category:en:Counties of Victoria (only some Australian states and territories have counties).
??? This has several such terms like Putinism, Stalinism, Trumpism, Bidenism, etc. but seems rather specific; maybe rename to 'Eponymous ideologies' and/or make it a POS category?
Yeah, this is one reason I keep coming back to the idea that maybe we should "prefix" all the categories (so all the topic categories get "topic:" added to their name, and all the different types of sets get their own wording like "Named " vs "Types of ", or maybe even "set:Named ..." and "set:Types of ..."); then we'd have "Category:topic:English" for the top-level category — assuming it's intended to be a topic category, anyway; it currently contains a mix of types of English and terms which have some sort of connection to the topic of language and spelling as used by English speakers. - -sche(discuss)05:40, 10 October 2023 (UTC)Reply
@-sche I've been meaning to make a longer posting about sets vs. names vs. types. I added the functionality to support a type param in the topic definitions, with the values "topic", "set", "name" and "type", with the intention that "set" categories should be converted into either "name" or "type" categories. You can actually put a comma-separated list in type, something like type = "type,topic" for Category:Beards and type = "topic,name,type" for Category:Flags. These are indicative of categories that should be split. One issue I'm running into is that not all set categories can be easily assigned to the name vs. type distinction, or at least it isn't clear which one should be used. For example, Category:en:Heraldic charges contains terms for shapes found on coats of arms, and Category:en:Heraldic tinctures contains terms for colors found on coats of arms (heraldry has its own lingo for colors and shapes). Maybe these are "types" but then the default definition becomes e.g. "English terms for types of heraldic tinctures" which reads weirdly to me and suggests that it should contain generic types of colors rather than specific actual colors. Similarly for all the animal categories, which frequently contain genuses and species; you could argue the genuses are types, but the species seem rather specific for that. I wonder if we should use the terminology "class" instead of "type"; this terminology gets a bit technical but "class" is opposed to "instance", and the class-instance distinction maps well onto what we're trying to accomplish by splitting "types" and "names". Note also that some categories currently have 'type' or a synonym in their name, e.g. 
Category:Types of planets, Category:Bicycle types, Category:Literary genres (= "types of literature"), Category:Manga genres (= "types of manga", etc.), Category:Musical genres, Category:Film genres, Category:Video game genres, Category:Forms of government (= "types of government"), Category:Forms of discrimination (= "types of discrimination"). Benwing2 (talk) 08:29, 10 October 2023 (UTC)Reply
Oof, yes, that's tricky. I wonder if we should use 'word-then-colon' style prefixes in the actual names of the categories, and add verbiage after "set:" whenever it was necessary to distinguish "specific foobars" from "types of foobars", so "Category:en:set:Types of wars" (civil war, war of attrition, war of conquest, police action, armed conflict, ...), "Category:en:set:Named wars" or "Category:en:set:Individual wars" or "Category:en:set:Specific wars" or whatever (World War I, World War II, Vietnam War, ...), but then some things could just have "set:" without the extra verbiage if we aren't going to split the category, so "Category:en:set:Heraldic tinctures". Technically, we could split individual tinctures like gules and argent and vair off from types of tinctures, but the latter category could only ever contain a handful of entries—metal, colour, stain, fur—so I'm not sure it'd be reasonable to split. If we want to include placenames in the naming scheme, they could also just have "set:" like "Category:en:set:Towns in the United Kingdom". And either {{autocat}} could know to continue to display "English names of " or "English terms for " when the category name is just "set:" and not "set:Types of..." or "set:Named...", or we could store the information somewhere (the way we currently have Module:category tree/topic cat/data/History etc) that tells it to do that. (This proposal should still work even if we decide to leave placenames — and other names? — where they are.) - -sche(discuss)22:00, 10 October 2023 (UTC)Reply
@-sche Thanks for your input. All of this makes sense and I think I will proceed for now in trying to clarify the nature of each existing category. Since we are likely to have categories where the type (topic, set, name, type) isn't derivable from the category name (cf. CAT:en:Literary genres, which is probably clearer than CAT:en:Types of literature), I think it makes sense to continue to require that the type field be specified in the definition of each category. The way the code works currently, if the description has the value of "default", it will automatically display "names of", "types of", "terms for various" or "terms related to" + the topic name. You can also customize just the topic name by specifying the description as e.g. "=the {{w|Barbie}} fashion doll produced by Mattel" for CAT:en:Barbie, which will automatically get converted to "{{{langname}}} terms related to the {{w|Barbie}} fashion doll produced by Mattel" since CAT:Barbie is specified using type = "topic". Benwing2 (talk) 22:57, 10 October 2023 (UTC)Reply
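A rough sketch of the default-description behaviour described above, written in Python rather than the module's own Lua. The function name, the mapping table, and the assignment of each phrase to a type are assumptions inferred from the examples in this thread ("English terms for types of heraldic tinctures", the Barbie description); the real module may work differently.

```python
# Assumed mapping from a category's type to the phrase used in its
# auto-generated description. Which phrase belongs to which type is inferred
# from the discussion, not taken from the module source.
PHRASE_BY_TYPE = {
    "topic": "terms related to",
    "set": "terms for various",
    "name": "names of",
    "type": "terms for types of",
}


def describe(langname, topic, cat_type, description="default"):
    """Build a category description (hypothetical helper).

    "default"  -> phrase for the type + the topic name
    "=<text>"  -> phrase for the type + the custom text after "="
    other text -> used verbatim
    """
    if description == "default":
        subject = topic
    elif description.startswith("="):
        subject = description[1:]
    else:
        return description
    return f"{langname} {PHRASE_BY_TYPE[cat_type]} {subject}"


tinctures = describe("English", "heraldic tinctures", "type")
barbie = describe(
    "English",
    "Barbie",
    "topic",
    "=the {{w|Barbie}} fashion doll produced by Mattel",
)
```

With these assumptions, `tinctures` reproduces the wording that was said to read oddly, and `barbie` reproduces the "terms related to the {{w|Barbie}} fashion doll produced by Mattel" form.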
Agree with the list, if only in order to obstruct not, though the astrophysical stuff is above my paygrade. If there is something intelligent wrongly deleted I can still create an integrated category.
Presumably each category to be deleted will be run through WT:RFDO?
Could we have any simple objective criteria for determining:
when a category should/could be deleted without being run through WT:RFDO and
what should be done with the members.
Categories that are "too small" and whose members are includable or included in other categories (that capture the essence of the to-be-deleted category) would be examples. A procedure for categorizing the members would be desirable. This may not be possible, but perhaps we could at least lay down practical guidelines. DCDuring (talk) 13:10, 10 October 2023 (UTC)
@DCDuring I did it this way because having 136 or so separate entries in WT:RFDO would be overwhelming, but I agree that it would be useful to have some clear criteria for how to decide when a category can be deleted without going through WT:RFDO. Hopefully we can get some consensus on which categories to keep and which to delete through the BP process. I can RFD some of the groups of categories together that I think should be deleted (e.g. the chemical-element-related categories and the "racist/vulgar terms for" categories) but I'm not sure about the one-offs. As for the members of the categories I'm proposing to delete, I can add a column indicating approximately where to move them to, although there will inevitably be some per-category judgment needed for certain entries. Benwing2 (talk) 23:03, 10 October 2023 (UTC)
It's a good way to start, but there are only about 30 items which you think should be deleted. I don't think that would overwhelm WT:RFDO. Maybe some can be speedied. There also seem to be items for WT:RFC and WT:RFM. Grouping would be nice to help focus, yet generalize, a deletion discussion. BP is the place for policy. If we can tease out some policies or, more likely, guidelines from reviewing a large number of cases, that seems to me like a good use of BP.
BTW, the category system and the use of {{autocat}} do not seem transparent to less-frequent contributors or to aging ones like me. As you know, the result of excessive template opacity is that folks are more likely to hard-code (if they know how) instead of using templates. I believe that those who have trouble and can't work around are likely to fail to contribute or even abandon enwikt as users. DCDuring (talk) 15:31, 11 October 2023 (UTC)Reply
This is a good point, DCDuring. If you follow the instructions in the edit notice at Cat:en:Something and add {{auto cat}} to the page, you see this message:
The automatically-generated contents of this category has errors.
The label given to the {{topic cat}} template is not valid. You may have mistyped it, or it simply has not been created yet. To add a new label, please consult the documentation of the template.
This is doubly problematic: I didn't give a label to {{topic cat}}, I just added {{auto cat}}! And the words "documentation of the template" don't link to the documentation; you're expected to know that you need to click the template name.
And then once you get to the documentation of Template:topic cat/documentation, the page very much assumes you know Lua and how our modules are organised.
lmao I'm the panphobic "free speech extremist" your mother warned you about, but "Racist names for continents", "Racist names for countries", and "Racist names for places" can surely only have been planted by the CIA or MI5 to try to discredit us. Hilarious. Equinox◑15:33, 12 October 2023 (UTC)Reply
@Sokkjo You are reading hostility where there is none. I simply stated my reasons why I think this category should be deleted. BTW I don't think "could be filled" is a valid reason for keeping a category, nor is "exists in Wikipedia". Benwing2 (talk) 23:55, 12 October 2023 (UTC)Reply
You keep asserting that I shouldn't refer to Wikipedia, but as I said before, the categories on Wikipedia are far more built out than those on this project, and to say they have nothing worth taking as guidance is absolute hubris. So yes, I will be continuing to make comparisons and citing their project. -- Sokkjō 06:07, 13 October 2023 (UTC)
@Sokkjo Why does CAT:Nautical not suffice? Once you remove the set entries from consideration, the non-set entries are small and can go into CAT:Nautical or nowhere. You seem not to understand that synonyms do not really belong in topic categories; they go in the Thesaurus. If we are to have a related-to "Ports and harbors" category, we need a DIFFERENT set category for ports and harbors. They should not be mixed. BTW under what circumstances will you be willing to admit that one of your created categories doesn't belong? It seems like never. Benwing2 (talk) 06:14, 13 October 2023 (UTC)Reply
Comment: since the category boilerplate and contents are for a topic (not set) category, Ioaxxere's "keep" !vote is functionally a vote to delete the currently-existing category and make a new one with partly different contents that happens to have the same name (repurposing and repopulating it). I'm not sure there are enough types of ports to make for much of a category, though. I almost think the topic category has a stronger claim to having enough entries to merit existing (since terms denoting types of ports are also terms pertaining to the topic of ports, so could go in a "topic:ports" category alongside the various terms like harbormaster and stevedore/longshoreman that pertain to ports but are not types of ports, making for a decent number of entries). I'm on the fence for now. - -sche(discuss)17:24, 28 March 2024 (UTC)Reply
Created by User:Koavf, who then added a bunch of tangentially-related terms to this category, whose only connection is containing the word "brick" in them:
There might be use for a category that could house terms semantically related to bricks and brickwork that do not contain the morpheme brick, or it could all go into Category:en:Masonry, which looks underpopulated. DCDuring (talk) 13:30, 8 July 2024 (UTC)
Having such a subcategory of Masonry seems reasonable, although I prefer the name Brickwork, which suggests a related-to category (as opposed to Bricks). Einstein2 (talk) 11:47, 1 October 2024 (UTC)Reply
Created by User:Sokkjo/User:Victar. We already have Category:Deception. Contains only Proto-West-Germanic terms: four verbs meaning "to cheat" or "to deceive", and two tangentially related terms meaning "unloyal" and "adultery". There is no reasonable way to make this into a set category, and the items here should either be listed as synonyms of a basic verb "to deceive", moved to Category:Deception or removed. Benwing2 (talk) 00:57, 13 October 2023 (UTC)Reply
Delete. Make sure context is appropriately placed in entries. Some of this would belong in a well-designed Thesaurus, if we had one. All of this seems to be the result of our having this ponderous, readily abusable topic category structure without explicit, comprehensive, understandable criteria for their formation and population. Seems to fall under "it seemed like a good idea at the time" and to have grown like Topsy. DCDuring (talk) 01:03, 13 October 2023 (UTC)
@DCDuring I agree. I have done some work recently to clarify the types of topic categories ("related-to", "set", "type" or "name") and include verbiage stating what belongs in the categories, but I completely agree we need some clear criteria for what topics are appropriate. Benwing2 (talk) 01:08, 13 October 2023 (UTC)Reply
The combination of the {{topics}} and {{autocat}} templates seems highly prone to abuse. At least one or the other needs oversight. I haven't paid much attention because I have no use for such categories. I wonder who does. This will be a matter for BP. DCDuring (talk) 01:15, 13 October 2023 (UTC)Reply
Created by User:Sokkjo/User:Victar. This is populated only with a single subcategory Category:War, except for Proto-West-Germanic, where it has a smattering of terms meaning "to quarrel", "quarrel" or "conflict", and a couple other vaguely related terms. These belong as synonyms of a basic term "to quarrel"; terms related to war can be moved to Category:War. This should be deleted and Category:War moved up one level. Benwing2 (talk) 01:05, 13 October 2023 (UTC)Reply
@Sokkjo Why did you need a category for such words? I really think you misunderstand what the topic category system is for. Topics should not be used for vague collections of synonyms; that's what the Thesaurus and Synonyms sections and inline synonyms are for. Benwing2 (talk) 01:47, 13 October 2023 (UTC)Reply
I think this line you're drawing of what's too "vague" is arbitrary and gatekeeping. Higher categories by their very concept are more generalized. Again, war and conflicts are not synonyms. The terms brawl, spat, feud, dispute, rivalry, etc. simply do not belong under CAT:War. -- Sokkjō06:20, 13 October 2023 (UTC)Reply
Terms like battle and conflict can go in the "war" topic category. I am not sure terms like quarrel need a category; I am not sure there are enough closely-topically-related terms relating to quarreling but not war to make a sensible non-war conflict category. So, delete. - -sche (discuss) 18:14, 28 March 2024 (UTC)
They're not. Not all multiword terms are phrases by our reckoning, and the multiword term category contains more than 10 times as many entries as the phrase category. —Mahāgaja · talk08:49, 13 November 2023 (UTC)Reply
The descriptions in the categories are
Hebrew groups of words elaborated to express ideas, not necessarily phrases in the grammatical sense.
and
Hebrew lemmas that are an idiomatic combination of multiple words.
I agree the descriptions aren't clear, but "phrases" in Wiktionary are a grammatical concept and indicate things that can't be clearly classified as nouns, verbs, adjectives and the like, while any POS can be multiword. Benwing2 (talk) 23:09, 14 November 2023 (UTC)Reply
Strong oppose. @Musetta6729 and I have discussed this previously in private and have already cleaned up Shanghainese Chinese, which we both found unnecessary as most of the terms in it can be classified as either "chiefly Shanghainese (Wu)" or just plain Shanghainese. As correctly identified previously, the Chinese category contained mostly Wu terms, which we have already dealt with. We have already dealt with the majority of the category's pages, and left four that could also be removed:
鄉下人/乡下人 (shián-gho-gnin), 硬盤/硬盘 (ngan-boe), and 硬盤人/硬盘人 (ngan-boe-gnin) are all generally "xenophobic" terms that can be classed as "chiefly Shanghainese (Wu)" (or something similar)
三環/三环 (sé-gue) is a geographical term that pertains to the city of Shanghai. We can simply remove the Shanghainese Chinese label and deal with it much like the other geographical terms, cf. 筲箕灣/筲箕湾 as just one example
If we implement these two measures, the Chinese category will be completely vacated and can potentially be removed. Even if we do not remove it, I would like for at least some dignity to be given to Shanghainese, as the to-be completely unused label will get the succinct "Shanghai" name while the language of urban Shanghai will be relegated to the term "Shanghainese Wu", which to be frank, we both found somewhat insulting. — 義順 (talk) 12:57, 26 January 2024 (UTC)Reply
@ND381 I am confused why you think "Shanghainese Wu" is insulting, unless you deny that Shanghainese is a variety of Wu. As for the label, that is an orthogonal discussion and we can change it any way we want. Benwing2 (talk) 19:48, 26 January 2024 (UTC)Reply
@Benwing2: Wu is a grouping of languages. No one speaks "Wu". We treat it as part of Chinese for practical reasons, but the Wu languages are quite divergent from the rest of Chinese, and presumably fairly distinct from each other. I suppose they see it as analogous to "English West Germanic" or "Ukrainian East Slavic". Chuck Entz (talk) 20:30, 26 January 2024 (UTC)Reply
I would like to add a bit onto what has already been said here. Shanghai is incredibly complex sociolinguistically, and what is referred to as "Shanghainese" (on wiktionary as much as elsewhere) tends to be the city-centre varieties that developed during the course of the last centuries as a lingua franca between the original Shanghai locals and migrant populations from nearby areas who now constitute a major part of Shanghai.
But Shanghai in fact has a whole range of regional languages - a range of Wu varieties, in fact, which can all be fairly divergent from each other but still very much maintain mutual contact and influence internally. When someone speaks of "Shanghainese", if they don't specify non-city-centre Shanghainese, then one would usually assume they are talking about city-centre or something adjacent to that. But "Shanghainese Wu" feels then more vague somehow as to whether it refers to any dialect, sociolect or topolect that can be considered "a Wu variety of Shanghai which is not necessarily city-centre", a label which is not in itself necessarily useful, and can potentially even be quite confusing in my opinion.
As of now we have been adding modifiers such as "urban" or "suburban" in front of "Shanghainese" when we come across situations where we need to clarify, and that's been working alright. But coming back to the original point, I think it is also just that "Shanghainese Chinese" - which currently is used as "Standard Mandarin terms found in Shanghai" (the language itself not being native to Shanghai, simply spoken in Shanghai for being the official national language) - should arguably not take precedence to the Chinese varieties that are native to Shanghai instead. — Musetta6729 (talk) 02:47, 27 January 2024 (UTC)Reply
This discussion is very much more of a footnote, but the fact that the significantly more irrelevant category gets the label that the language is meant to have (ie. I would prefer for S’nese the language to get "Shanghainese" or even just "Shanghai" like how other non-top level groups/lects are handled) and instead we have to settle for the (intentionally obtuse?) mouthful that is "Shanghainese Wu" — not even Northern Wu à la Quanzhou Hokkien or Hong Kong Cantonese. Again, this is very much not the main point and from your profile I'm assuming you don't know that much about socioling and language politics in the area so it would be I suppose easier to leave the discussion here
The main problem is still just the category: S’nese Wu is unnecessarily obtuse and if we can get back to the point of whether or not we can just clear S’nese Chinese’s four remaining pages we can have a more fruitful consensus — 義順 (talk) 20:34, 26 January 2024 (UTC)Reply
As there have not been any negative comments regarding the vacating of Category:Shanghainese Chinese, I have removed all four remaining entries in the category.
Regarding the situation of the naming convention, unless there are any further objections, the current Category:Shanghainese Wu should be renamed to just "Shanghainese", and S'nese Chinese is to be either kept as is, renamed to something like "Standard Chinese in Shanghai", or deleted. Of the three options, I believe the last one would be best, as there genuinely isn't a need for it: "chiefly Shanghainese" would cover most if not all cases of words in Standarin that are used in Shanghai, as those terms almost always originate from the local variety anyways. Misspellings or Shanghainese-influenced sayings in Standarin that are not found in Shanghainese should perhaps be labelled with "influenced by Shanghainese", if, again, that is necessary, which I highly doubt.
For the time being, the "Shanghainese Wu" label will be renamed to "Shanghainese" as per above discussions, and to stay in line with other "-(n)ese" labels (cf. Hainanese, Sichuanese). If for whatever reason S'nese Chinese (ie. Standarin used in Shanghai that isn't "chiefly Shanghainese") is actually needed, unless there are any objections, something along the lines of "Standard Chinese, Shanghai" or "influenced by Shanghainese" (if appropriate) is to be used, though again, there really aren't any words that would warrant this designation. — 義順 (talk) 13:50, 31 January 2024 (UTC)Reply
Apologies for the late reply; I too am fine with renaming or removing "Category:Shanghainese Chinese" (and updating Module:labels). In some similar situations we've used noun forms instead of adjectives to make this kind of distinction, e.g. "Category:Switzerland German" (for de) was renamed to that name to distinguish it from "Swiss German" the Alemannic lect, so if we need a category for "standard Chinese / Mandarin terms chiefly found in Shanghai", it would fit the overall schema to name it something like "Category:Shanghai Chinese"... but if people just don't want such a category, and want {{lb|zh|Shanghainese}} / {{lb|cmn|Shanghainese}} to throw an error and put the entry in a cleanup category so someone can re-code it as a wu entry, that works too... - -sche(discuss)01:30, 6 March 2024 (UTC)Reply
Comment: If we are trying to make a distinction, one category should be referring to Shanghainese Wu, and another should be referring to any variety spoken in Shanghai (i.e. both Shanghainese Wu and Mandarin). I don't know if this distinction should/can be made, though. — justin(r)leung{ (t...) | c=› }04:06, 11 October 2020 (UTC)Reply
I guess the issue then is, do we have native Shanghainese speakers here who can make this distinction? It looks to me like most entries in both categories are Wu terms. Benwing2 (talk) 22:05, 11 October 2020 (UTC)Reply
If we have any entries that make this distinction (and one such entry has been convincingly adduced above), then merger would result in losing information. Do you want Shanghai-specific Mandarin terms to go uncategorised as such? —Μετάknowledgediscuss/deeds03:26, 12 October 2020 (UTC)Reply
@Benwing2, Metaknowledge: @Thedarkknightli probably knows the Mandarin terms and may know some of the Wu terms. For Shanghainese, we have some resources we can consult, so it's the Mandarin terms that are more difficult to figure out. The terms that are in CAT:Shanghainese are Wu for sure (and I would prefer to call the category "Shanghainese Wu" to make it clear). We would need to sift through the CAT:Shanghainese Chinese category to check what's actually Wu and relabel them with "Shanghainese Wu" or just "Wu". BTW, there might be some need to revamp other labels/categories, like "Sichuan" displaying as "Sichuanese" and categorizing to CAT:Sichuanese Mandarin, which could be confusing when we introduce terms in Sichuanese Hakka or Xiang (which we might have some already). — justin(r)leung{ (t...) | c=› }03:40, 12 October 2020 (UTC)Reply
(edit conflict) A native Shanghainese speaker would be User:辛时雨 but he is not very active.
What we lack with regional labels, which is specific to Chinese since the merger needs to work for varieties and subvarieties is the ability to add variety specific categories, {{lb|zh|Shanghai|Wu}} is meant to not only label a term but also categorise it as Shanghainese Wu but {{lb|zh|Shanghai}} is for general Chinese, esp. Mandarin. --Anatoli T.(обсудить/вклад)03:43, 12 October 2020 (UTC)Reply
I think you would need to use {{lb|zh|Shanghai Wu}} or something, not {{lb|zh|Shanghai|Wu}}, since I don't think the same label ("Shanghai") can categorize into two categories. Anyway, add my voice to those saying that if there is intended to be a distinction here, the category names (and, probably, boilerplate texts) should be made clearer. We could also consider "see also"-style crossreferencing them, like Category:Louisiana French and Category:Louisiana Creole French language. - -sche(discuss)17:26, 13 October 2020 (UTC)Reply
The way we define Greek and Ancient Greek pretty much excludes any borrowing from the former into Coptic while it was a living language. In fact, after going through the entries in the main category starting with the first letter in the Coptic alphabet I was unable to find any that had modern Greek in their etymologies. These are all due to misuse of the {{given name}} template's "from" parameter: i.e., |from=Greek instead of |from=Ancient Greek. The clincher is that this main category has no borrowing subcategory, while the Ancient Greek equivalent does.
There are still 82 entries left where "Greek" needs to be changed to "Ancient Greek" in the templates, or I might have speedied these myself. It does give us an opportunity, though, to consider what might be done to spot blatant mismatches such as these between the etymology templates and the name templates, maybe not in real time, but as a periodic bot or AWB task fed from the dumps. This is unusual in being categorically 100% wrong, but you can be sure that there are similar errors scattered through other derived terms categories. Chuck Entz (talk) 01:58, 25 December 2023 (UTC)
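As a rough illustration of the periodic check suggested above, the following Python sketch scans page wikitext for {{given name}} calls whose from= value is one of a list of known-bad combinations. It is only a sketch: the regular expression handles simple, un-nested template calls, and the `SUSPECT` table, the function name, and the sample pages are hypothetical rather than taken from any existing bot.

```python
import re

# Matches simple {{given name|<lang>|...|from=<source>}} calls.
# Nested templates inside the call are not handled in this sketch.
GIVEN_NAME_RE = re.compile(r"\{\{given name\|([^|}]+)\|[^}]*?\bfrom=([^|}]+)")

# Hypothetical table: (language code, from= value) -> suggested replacement.
SUSPECT = {("cop", "Greek"): "Ancient Greek"}


def find_mismatches(pages):
    """Return (title, found, suggested) for each suspicious from= value."""
    hits = []
    for title, wikitext in pages.items():
        for match in GIVEN_NAME_RE.finditer(wikitext):
            lang = match.group(1).strip()
            source = match.group(2).strip()
            suggested = SUSPECT.get((lang, source))
            if suggested is not None:
                hits.append((title, source, suggested))
    return hits


# Made-up sample pages standing in for text read from a dump.
sample_pages = {
    "page A": "# {{given name|cop|male|from=Greek}}",
    "page B": "# {{given name|cop|female|from=Ancient Greek}}",
}
mismatches = find_mismatches(sample_pages)
```

Only "page A" is flagged here; a real task would read the titles and wikitext from the dumps and would need a fuller template parser.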
@Chuck Entz This is an issue that comes up with any language that has a name frequently used to refer to its ancestors: cf. Mongolian, French etc. The real issue is that we need to overhaul our name templates, since they use their own bespoke system that simply categorises entries based on whatever you put in the from= parameter. This is useful when used with things that aren't languages, but with languages it simply causes a headache. Pinging @Benwing2 who may have ideas on how to fix this. Theknightwho (talk) 19:12, 25 December 2023 (UTC)Reply
Yes, let's fix the entries so their etymologies refer to the right language. I agree there are many pairs of languages where we could flag the existence of categories like this as likely in error and something to fix. The reverse case, where a living language is said to have "borrowed" a word from a dead language, is also something to monitor: it's not always an error, because things like ghrelin do exist, but for many dead languages (e.g. Old High German, as opposed to Latin) it's usually an error in my experience. One idea (perhaps there are better ones!) is to have a TODO/monitoring page containing links to every 'suspect' category we can think of, e.g. every instance of a modern language "borrowing" from a dead language (excluding any cases where such borrowing is actually common, like from Latin or Greek) and vice versa (like here, Coptic borrowing from Greek or German or whatnot); if possible, it'd be great to auto-generate the list; even better would be to only generate bluelinks i.e. cases where the category exists; but if it's not possible to auto-generate the list, we can just make a massive page of manually-added links. Users could scroll down the page and check any links that are blue (say, if Category:Russian terms borrowed from Old English turns blue, you know that's something you want to look into because probably someone used "bor" where they should've used "der"). - -sche(discuss)18:32, 28 March 2024 (UTC)Reply
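The monitoring-page idea above could be approximated as follows. This is a minimal sketch under stated assumptions: the list of suspect pairs and the set of existing categories are made-up inputs, and the category name pattern is taken from the "Category:Russian terms borrowed from Old English" example in the comment.

```python
# Hypothetical list of (borrowing language, source language) pairs where a
# "borrowed from" category would most likely indicate an error.
SUSPECT_PAIRS = [
    ("Russian", "Old English"),
    ("German", "Old High German"),
]


def category_name(borrower, source):
    """Category title for terms in `borrower` borrowed from `source`."""
    return f"Category:{borrower} terms borrowed from {source}"


def bluelinks(pairs, existing_categories):
    """Keep only suspect categories that actually exist (the 'blue' links)."""
    return [
        category_name(borrower, source)
        for borrower, source in pairs
        if category_name(borrower, source) in existing_categories
    ]


# Made-up stand-in for a list of existing category titles.
existing = {"Category:Russian terms borrowed from Old English"}
to_review = bluelinks(SUSPECT_PAIRS, existing)
```

The resulting `to_review` list contains only the categories worth a manual look, which corresponds to showing just the bluelinks on the monitoring page.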
The dates wouldn't need to be super accurate, but it would make it easy for me to generate a todo list WT:Todo/Lists/Anachronistic etymologies which would contain entries that derive from a language spoken later. There would be a section to list known good entries that should be skipped over.
(2) We start a page like WT:Todo/Lists/Anachronistic etymologies/List which contains pairs of languages where one cannot derive from the other. For example (in reality I would format it as a table)
Keep: As set categories, cat:Hit and cat:Rub should contain terms for kinds of hits and rubs, and they do seem to. (I don't know any of these languages, but I saw terms defined as "slap" and "kick" as hits and "polish" and "scrub" as rubs.) We could convert them to thesaurus pages, but I think we could convert any set category to thesaurus pages. — excarnateSojourner (ta·co)00:04, 25 August 2024 (UTC)Reply
Not specific to any one simplification system, so it's a random mix of Chinese and Japanese characters without any language-specific indication.
Is it supposed to refer to characters which were unchanged by simplification? Presumably not (since that would include almost all of them), but a strict reading of the name would include them.
What about situations like 再, 𠕅 and 𠕂 where simplified Chinese simplifies all three to the first one?
None of this is explained, and the entries seem to be pretty random. At the very least if we keep this, we should subcategorise by simplification system.
Theknightwho (talk) 09:47, 7 January 2024 (UTC)Reply
This is definitely a flawed category, but I am very interested in having a category where the characters that are used in both Simplified Chinese and Traditional Chinese, i.e. the characters that WEREN'T simplified, are identified. --Geographyinitiative (talk) 02:04, 12 September 2024 (UTC)Reply
Essentially these categories are categorizing compound terms made up of specific words, e.g. pǟva(“day”), vālda(“white”) and mer(“sea”). In general, we don't categorize in this fashion; certainly not in English, for example. The set of words out of which such categories are made seems to be chosen essentially arbitrarily and if we did this consistently, it would lead to endless category bloat. @Neitrāls vārds who created the categories and Template:liv-compoundcat. Benwing2 (talk) 06:45, 29 February 2024 (UTC)Reply
Delete. I've added the pages in these categories to the Derived terms sections where they belong, except the three compounds with vālda because the page doesn't exist. Ultimateria (talk) 19:08, 1 March 2024 (UTC)Reply
I agree there is not enough distinction. I think the distinction some people hope for is "unknown means no-one has any ideas, uncertain means people have ideas" (?), but I'm sure I've seen even other dictionaries use "Uncertain." as the complete etymology for a word they have no ideas about, and conversely I've seen things like "Unknown. Theories include..."; there is no logical or maintainable distinction; if you're not certain what the etymology is, you don't know (with certainty) what it is (you just hypothesize), and conversely if it's unknown you're not certain what it is. I would not object to renaming the category as Benwing proposes, but I would also not object to just merging "unknown" into "uncertain" (or vice versa). - -sche(discuss)15:55, 4 April 2024 (UTC)Reply
The argument is fallacious because editors regularly do not have precious knowledge about the existence and extent of previous attempts, so template application is quite a guess and theology. Given that the different categorization invites wasteful concerns of editors (adding to the learning curve load), I not only do not see the utility of it but also reckon it harmful, and am also sure that Metaknowledge would position himself likewise, as confronted by my argument about underspecified species names vs. uncertain meaning words on Talk:بركة. If you go from unknownness to uncertainty you can also visit underspecification and other more “science-theoretical” details that can only be left to philosophy papers nobody will actually want to write. Fay Freak (talk) 16:27, 4 April 2024 (UTC)
Coming back to this, if the distinction is not clear enough to have two categories, then why have two templates? Could we merge the templates, as @-sche mentioned? AG202 (talk) 02:51, 7 January 2025 (UTC)Reply
Fine with me. We seem to have an apparent consensus for merging (you, me, Victar, Surjection, -sche and I think FF, although as usual his writing is impenetrable). Maybe ask on Discord to see if anyone else has any opinions? If not I can go forward with it. I think the merged template should be called {{uncertain}} because it's rare, at least in well-researched languages, for an etymology to be truly unknown; typically there are various speculations. Benwing2 (talk) 04:22, 7 January 2025 (UTC)Reply
I agree we should merge the templates, as there is no clearly definable difference between an unknown etymology and an uncertain one. —Mahāgaja · talk08:19, 7 January 2025 (UTC)Reply
April 2024
Categories for entries "spelled with" Ideographic Description Characters
@This, that and the other Support. The existing categories are especially problematic when you have multiple ideographic description characters, such as ⿰⿳⿰SIR木阝. However, why are you proposing to use "entry titles" in the category instead of just "terms"? Benwing2 (talk) 06:31, 5 April 2024 (UTC)
Well, the term itself is not spelled with the ideographic description character. That's just a consequence of the fact the character is not encoded in Unicode. Nobody would consider these characters to be part of the spelling of the term. Moreover, it's ludicrous to say that ⿰亻尭 is spelled with ⿰ when 侥 is not – they are both equally composed of two CJK characters placed side-by-side (not sure of the technical CJK term for that). Compare this to Category:Translingual terms spelled with ◌́, which includes terms that use the combining accent character as well as those using precomposed Unicode characters, hence truly containing all terms spelled with the accent. This, that and the other (talk) 09:31, 5 April 2024 (UTC)Reply
Hah, I see you didn't actually argue for the use of the word "spelled". Whoops! I guess my argument against "terms" still runs along the same lines though. The terms themselves do not use these sequences, it is their Unicode encodings of the entry titles that do. This, that and the other (talk) 09:33, 5 April 2024 (UTC)Reply
@Benwing2 I see this situation as unique, on the grounds that no other category tree picks so heavily on the specific Unicode encoding of the entry title at the total exclusion of the term's actual, human-centred visual appearance or orthography. Even Category:English terms spelled with ◌́ includes titles that use the precomposed characters like é.
Anyway, I'm not going to press the point any further - a merger is the most valuable outcome here. I'd be satisfied to merge to "Category:Translingual terms spelled with/using ideographic description sequences" or any similar name. This, that and the other (talk) 04:05, 1 December 2024 (UTC)Reply
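The contrast drawn above, between accent categories that catch precomposed characters and IDS categories that depend purely on the encoding of the title, can be illustrated with a small sketch. The function names are hypothetical, and the check assumes that the Ideographic Description Characters block spans U+2FF0 to U+2FFF; it is not how the headword module works.

```python
import unicodedata

COMBINING_ACUTE = "\u0301"


def spelled_with_acute(title: str) -> bool:
    """True if the title contains an acute accent, precomposed or combining."""
    # NFD decomposition turns a precomposed "é" into "e" + U+0301.
    return COMBINING_ACUTE in unicodedata.normalize("NFD", title)


def uses_ids(title: str) -> bool:
    """True if the title contains any ideographic description character."""
    return any(0x2FF0 <= ord(ch) <= 0x2FFF for ch in title)


print(spelled_with_acute("caf\u00e9"), spelled_with_acute("cafe\u0301"))
print(uses_ids("⿰亻尭"), uses_ids("侥"))
```

The first check looks through the encoding to the orthography, while the second can only ever report a property of the entry title's encoding, which is the asymmetry the comment describes.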
@Benwing2 @This, that and the other There are two ways to handle these: where possible, I think we should do what Module:zh-pron currently does by constructing the character (e.g. see Category:Chinese terms spelled with ⿰氵厶). Where that isn't possible, either the term is a mistake, or the IDS characters are being used for some other purpose (i.e. the term actually contains them as characters), so they should retain the current separate categories. I can only imagine the latter case coming up with emoticons. What I would oppose would be any kind of category like "terms using IDS" or whatever - the IDS are just a tool to represent unencoded characters; we, as a dictionary, only care about them insofar as they help us create entries, but the terms are in no way actually spelled with them (with the exception of the emoticon example I mentioned before). TTATO isn't being nitpicky by pointing this out, imo - it's actually a crucial distinction. Theknightwho (talk) 04:32, 1 December 2024 (UTC)Reply
@Theknightwho The reason I proposed this merge is that it seems valuable or useful to have a category keeping track of entries using IDS in their titles - the Han characters using IDS are necessarily unusual or exceptional in some way and it somehow makes sense to me to group them into the category. Splitting by the individual IDS character used doesn't seem worthwhile, but an overarching category may be. (Perhaps farfetched, but I could imagine it being useful for people looking for new Han characters to propose for inclusion in Unicode!) This, that and the other (talk) 04:49, 1 December 2024 (UTC)Reply
@This, that and the other I have no problem with having an IDS category, but it should be “entries with IDS in the title” or something, not “terms spelled with”, since that’s just factually wrong, as the IDS is only there due to the fact these characters haven’t been encoded yet.
I don’t think there’s a problem in having separate categories for them - they’re still separate characters in their own right, and the only thing that unites them is the fact they aren’t encoded, which isn’t lexically relevant. Chinese and Japanese already do just fine with them, and the general headword template only creates those categories when the title consists of more than a single character anyway (note that I’m taking a complete IDS sequence to be one character). Chinese and Japanese have language-specific reasons for creating categories for single-character entries, but most (maybe all) our IDS entries wouldn’t create Translingual categories for this reason in the first place. They only exist now because the headword module doesn’t know IDS sequences are special. Theknightwho (talk) 12:30, 1 December 2024 (UTC)Reply
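Counting a complete IDS sequence as one character, as described above, could be sketched as follows. This is an illustrative assumption about how such counting might work, not the headword module's actual code: the function names are hypothetical, and only the twelve operators U+2FF0 to U+2FFB are handled, with U+2FF2 and U+2FF3 taking three components and the rest taking two.

```python
# Ideographic description characters, grouped by how many components they take.
TERNARY = {chr(0x2FF2), chr(0x2FF3)}
BINARY = {chr(c) for c in (0x2FF0, 0x2FF1, *range(0x2FF4, 0x2FFC))}


def _skip_unit(title: str, i: int) -> int:
    """Return the index just past one character or one complete IDS starting at i."""
    ch = title[i]
    i += 1
    arity = 2 if ch in BINARY else 3 if ch in TERNARY else 0
    for _ in range(arity):
        if i >= len(title):
            raise ValueError("incomplete ideographic description sequence")
        i = _skip_unit(title, i)  # components may themselves be nested sequences
    return i


def count_characters(title: str) -> int:
    """Count characters, treating each complete IDS as a single character."""
    count, i = 0, 0
    while i < len(title):
        i = _skip_unit(title, i)
        count += 1
    return count


print(count_characters("⿰亻尭"))          # a single unencoded character
print(count_characters("⿰⿳⿰SIR木阝"))   # nested sequence from the discussion
```

Under this counting, a title consisting of one IDS sequence is a single-character title, so a rule that only categorises multi-character titles would skip it.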
Support. "France French" fits existing practice, as you say; we already use nouns rather than adjectives in some other cases, like "Switzerland German" to avoid the ambiguity of "Swiss German". In fact, now that it's possible to have labels categorize differently for different languages, we could consider changing "Switzerland French" and "Switzerland Italian" back to "Swiss...", since those two are not ambiguous and were just collateral damage of people wanting to rename the German category. But in the other direction... I wonder if we should consider changing not only "French French" but also e.g. "French Yiddish" to "France Yiddish", and "Vietnamese Chinese" to "Vietnam Chinese": I wonder if we should in general try to avoid categories that look like " ". But that's probably a bigger discussion... - -sche(discuss)05:58, 15 April 2024 (UTC)Reply
In the vein of "Peninsular Spanish", it occurs to me that "French French" could be "Metropolitan French" (though then people unfamiliar with that term might think it means French spoken in metropolises, so I don't know if that's better or worse than "France French"). "England English" seems to be an actual term I can find in use (contrasted with e.g. "American English" and "Australian English"). - -sche(discuss)15:52, 15 April 2024 (UTC)Reply
I'm really not keen on these two categories, because they don't really make sense with the way that Japanese is traditionally analysed (and the way we treat it everywhere else on Wiktionary). For instance:
き(ki) is described as "the seventh syllable in the gojūon order", but the etymology section clearly refers to the origin of the kana itself (i.e. the glyph), not the development of the sound in Japanese. The distinction is clearer if you remember that あ(a) and ア(a) are distinct kana that refer to the same mora (a).
キャ(kya) is described as a "katakana syllable", and while it can function as a syllable, if you were to analyse Japanese syllabically, you could rightly say that キャン(kyan, “kiang”) consists of one syllable that can be broken down into two morae: キャ(kya) and ン(n). I don't think anyone would support having a syllable entry for キャン(kyan), though, since there's nothing meaningful about that. This is in contrast to every other language in Category:Syllables by language, where you can't subdivide their syllables into component units (other than letters, in some cases).
Category:Japanese combining forms is used as a kludge to get around the problems caused by calling (full-size) kana syllables, as it's a dumping ground for the kana (and other glyphs) that can't be analysed as syllables. This is mostly okay for vowels like ゃ(-ya), which is described as a "combining form of や(ya) used in yōon mora ...", but it makes a lot less sense with っ and ー, which are full morae in their own right, and therefore function in a completely different way to the small vowel kana. However, because they can't form independent syllables, they've been shoved into the same category. I can also see that 酒 and 水 have been put in there as well, for some reason, which suggests this category just causes confusion at best.
I therefore suggest the following:
Allow a "kana" part of speech (as an alias of "letter", in the same way "kanji" is an alias of "Han character"), which should be used for full-size and small kana. The definitions should refer to the glyphs themselves, so the kana entries for あ(a) and ア(a) would be distinct, since they belong to different systems and have different origins, even though they refer to the same sound. This also goes for any hentaigana etc.
Allow a "mora" part of speech, which should encompass yōon like キャ(kya), but also the gojūon as well, which are written with a single kana. In that respect, きゃ(kya) and キャ(kya) both refer to the same mora, so it's fine for one or other to be an alt form.
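The kana/mora distinction in the two proposals above can be made concrete with a short sketch. It relies on the fact that the basic hiragana U+3041 to U+3096 sit exactly 0x60 code points below their katakana counterparts; the function names are hypothetical and nothing here reflects existing Wiktionary code.

```python
def to_katakana(text: str) -> str:
    """Map basic hiragana (U+3041 to U+3096) onto the corresponding katakana."""
    return "".join(
        chr(ord(ch) + 0x60) if 0x3041 <= ord(ch) <= 0x3096 else ch
        for ch in text
    )


def same_mora(a: str, b: str) -> bool:
    """True if two kana spellings write the same mora, ignoring the script."""
    return to_katakana(a) == to_katakana(b)


# Distinct kana (glyphs) that represent the same mora.
print("あ" == "ア", same_mora("あ", "ア"))
print("きゃ" == "キャ", same_mora("きゃ", "キャ"))
```

A "kana" entry would be keyed on the glyph itself, while a "mora" entry would be keyed on something like the normalised value, which is why one script's spelling can serve as an alternative form of the other.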
As a side point, I also think these entries need serious cleaning up, as I'm not convinced some of these morae actually exist. For instance, ゐゅ(wyu) claims it is "rarely used, with うゅ seeing more use", but is うゅ(wyu) even used in the first place? Seems like someone just got overexcited and created all the theoretical syllables they could think of.
I agree that Category:Japanese syllables seems like an inappropriate way to categorize what are orthographic representations of morae. I'm not totally convinced that kana and mora are "parts of speech" in the sense of grammatical roles, but as categories for things in a broad dictionary, they are probably better than calling e.g. え a "syllable". Similarly, the small kana are not "combining forms" in any real sense – though I would argue that things such as 酒 and 水 really are. I think those were categorized automatically because the lemma entries use {{com form}}. So in the latter case, the problem may be how the category is currently used rather than the category as such. To the side point, I can't recall ever seeing ゐゅ and can't imagine it being used, but one never can tell. Advertising, for example, sometimes uses bizarre forms to capture attention. Cnilep (talk) 23:31, 17 July 2024 (UTC)
@Cnilep I completely agree that "mora" and "kana" aren't parts of speech in the strict sense, but I don't think it's possible to craft a definition that excludes them while still including "syllable" or "letter", which we use quite widely cross-linguistically (especially "letter"). If we still want to keep using the "combining forms" category for 酒 and 水 then I have no issue with that, but that's quite a different meaning of "combining form" (more akin to an affix). Theknightwho (talk) 23:47, 17 July 2024 (UTC)
For my part, the inclusion of 酒(saka-) and 水(mi-) in Category:Japanese_combining_forms looks like a mistake, brought on by unclear definitions.
As I understand it, Category:Japanese_combining_forms is intended for combining orthographic forms, while the saka- and mi- readings for 酒 and 水 are combining morphophonemic forms, relating (in part) to still-poorly-understood vowel-fronting behavior seen in certain ancient nouns when used as standalone nouns or the latter element in a compound, versus when used as the first component in a compound; and (in part) to how certain ancient nouns could appear in compounds in abbreviated forms (perhaps the original words? or perhaps as contractions? uncertain).
Similarly, the inclusion of ん in this category also appears to be a mistake -- this glyph is not a combining orthographic form, nor does any such exist (AFAIK).
If you mean the "I therefore suggest the following:" part above about new pseudo-POS headers and consequent entry restructuring, I support your proposal 👍, with the addendum that I think we need to treat combining orthographic forms separately from combining morphophonemic forms, regarding 酒(saka-) and 水(mi-).
@Eirikr Thanks - good to know. Re the categories, they can be deleted right now tbh, since they’re empty. If we do discover any bizarre terms using those morae and add them, they’ll get populated/readded automatically anyway. Theknightwho (talk) 19:21, 21 August 2024 (UTC)Reply
Alternatively “Category:English terms spelled with numbers” should be renamed to an Arabic-numeral category and instead be a supercategory for Arabic- and Roman-numeral subcategories. J3133 (talk) 06:07, 11 August 2024 (UTC)Reply
I am not convinced that there is any need for action on either issue:
The "spelled with" categories call out the specific characters used to spell the term, with no concern for what they are being used to represent. The entries featuring Roman numerals in their titles are "spelled with" I, X, etc.
Looking at the situation in English in isolation, the term "number" is not ambiguous and changing it to "Arabic numerals" seems somewhat pedantic. The situation could be different in languages which use other systems of numeration, but I'm not familiar with these and can't comment.
The claims about the pronunciation of the sequence, enriching that time’s glosses, moreover are largely disinfo preying on the sketchy awareness about the production of speech and its transcription. Contrary to the category description, it is not pronounced /mf/, like at all, but —or just . Fay Freak (talk) 21:05, 3 August 2024 (UTC)Reply
For better or worse, we do comparably (currently) have Category:English words ending in "-gry" (although see RFM). I have no strong feelings on the matter. I am not opposed to deleting the category; the five words in this set could be mentioned in a usage note at fünf, which is surely the most common of the lot. (Edited to add: I'm fine with keeping it, too. Ideally we should be consistent in deciding whether to have this and the English -gry and -yre categories, or consistently delete them all.) - -sche (discuss) 01:27, 10 August 2024 (UTC)
I am going to de-tag the category and call this kept for lack of any consensus to delete. It would be nice to move the above-mentioned English categories, dropping the quotation marks, for consistency with this and our various affix categories. - -sche(discuss)22:43, 21 June 2025 (UTC)Reply
Removing the "Scottish" from regional varieties of Scottish Gaelic
I suggest we remove the word "Scottish" from the following categories, as it is redundant and in most if not all cases less common than the name with just plain "Gaelic":
Sounds reasonable to me. I agree Scottish in this context is redundant. (Arran Gaelic might be slightly confusing to some due to Aran islands being a Gaeltacht area in Ireland… but I don’t think the impact of this would be significant anyway, the spelling is different and specific dialects of Ireland are rarely referred to with the word Gaelic) // Silmeth@talk00:00, 24 August 2024 (UTC)Reply
"Canadian Gaelic" might be confusing for similar reasons: without knowing about Scottish immigration in Canada, it might not be clear which Gaelic. Also, I hope the module isn't just tacking the language name to the end of the variety to make the category. Chuck Entz (talk) 00:25, 24 August 2024 (UTC)Reply
@Chuck Entz: Canadian Scottish Gaelic is the one in the list for which I am most amenable to keeping "Scottish" in the name. And yes, Module:labels/data/lang/gd lists regional_categories = true, which means it is just tacking the language name to the end of the variety to make the category. But we can change it to plain_categories = "Argyll Gaelic", etc. to change the category name. —Mahāgaja · talk09:39, 24 August 2024 (UTC)Reply
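The difference between the two settings mentioned above can be modelled roughly as follows. The field names `regional_categories` and `plain_categories` come from the comment; everything else, including the function name and the dictionary layout, is a simplified assumption written in Python for illustration, whereas the real label data is a Lua module whose behaviour may differ in detail.

```python
from typing import Optional


def category_for_label(label: str, lang_name: str, settings: dict) -> Optional[str]:
    """Simplified model of how a label's category name might be chosen."""
    plain = settings.get("plain_categories")
    if isinstance(plain, str):
        # An explicit name is used as given, without appending the language name.
        return f"Category:{plain}"
    if settings.get("regional_categories"):
        # Default regional behaviour: tack the language name onto the label.
        return f"Category:{label} {lang_name}"
    return None


print(category_for_label("Argyll", "Scottish Gaelic", {"regional_categories": True}))
print(category_for_label("Argyll", "Scottish Gaelic", {"plain_categories": "Argyll Gaelic"}))
```

On this model the proposed rename would be a per-label data change rather than a change to the category-building logic itself.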
It would seem highly strange to me to have the categories for dialects of a language not contain the actual name of the language. Scottish English is a dialect of English; it's not at all necessary to speak of a given sub-dialect as Glasgow Scottish English. The Scottish is absolutely redundant there. But Scottish Gaelic is not merely a variant or dialect; it's a fully standardized language of its own. To me, logistically, this idea feels somewhat equivalent to, for example, renaming Gotlandic Swedish to Gotlandic Norse.
There's a few other reasons, as well. At least in the United States, Gaelic sans modifier is almost completely synonymous with Irish, unless the context makes it explicit that one is discussing Scottish Gaelic or the Goidelic family as a whole. (Which is why individual dialects would most often be referred to as simply Gaelic; they're not being presented in a multilingual context like Wiktionary.) I feel like the possibility for innocent confusion isn't too dangerously high, given the fairly niche nature of the language and the lack of Irish spoken natively in Scotland; but to Chuck Entz's point, there was also a dialect of Newfoundland Irish, extant all the way into the 1900s, that could rightly be deemed Canadian Gaelic. Qwertygiy (talk) 02:24, 24 August 2024 (UTC)Reply
@Qwertygiy: It's really a very different case from Gotlandic Swedish. Swedish is never called simply "Norse" the way Scottish Gaelic is frequently – even usually – called simply "Gaelic". As I mentioned above in my reply to Chuck, I am most amenable to keeping the word "Scottish" in Canadian Scottish Gaelic, but even there, I think "Canadian Gaelic" is a more common name for it; and in the literature "Canadian Gaelic" refers only to Scottish Gaelic spoken in Canada. As for Newfoundland Irish, Irish language in Newfoundland says "The Irish language was once spoken by some immigrants to the island of Newfoundland before it disappeared in the early 20th century", suggesting it may have been extinct before 1949, when Newfoundland joined Canada. In that case, Newfoundland Irish really couldn't reasonably be called Canadian Gaelic (quite apart from the fact that it never is). —Mahāgaja · talk09:59, 24 August 2024 (UTC)Reply
Well, we also need to ask ourselves whose standard we want to follow:
For a majority of speakers in Ireland and the UK, saying "Gaelic" automatically and exclusively refers to Gàidhlig na h-Albann, while "Irish" automatically and exclusively refers to Gaeilge na hÉireann. As @Qwertygiy reports, this is not the case in the U.S., where "Gaelic" seems to refer to Gaeilge na hÉireann.
I think it is redundant to denote that Argyll, Perthshire, Uist, etc. are all in Scotland by adding "Scottish", but we should be aware of the fact that this has the potential to cause confusion for less well-informed people.
P.S.: I was never before confused by the term "Canadian Gaelic", because I didn't even think a variety of Gaeilge might have ever been spoken there, but I stand corrected! The potential for confusion abounds everywhere, it seems.
I really think the potential for confusion is minimal, especially since these aren't L2 language headers we're talking about, but just labels next to terms under a ==Scottish Gaelic== header and the corresponding categories that are all inside CAT:Regional Scottish Gaelic. —Mahāgaja · talk15:53, 3 September 2024 (UTC)Reply
I am cautiously in favour. Gaelic really isn't used to refer to either of the other members of the family (in English), and those category names are what those subvarieties are habitually called (including Canadian Gaelic). embryomystic (talk) 01:38, 28 August 2024 (UTC)Reply
Oppose for the reasons cited by User:Qwertygiy. In general we do include the full name of the language in categories containing varieties of that language, and I at least would find it confusing to see something like Category:Harris Gaelic, as I would not know if this is a variety of Scottish Gaelic or Irish. For similar reasons, varieties of e.g. 'Walser German' should be 'Foo Walser German' not just 'Foo German'. Benwing2 (talk) 09:56, 3 October 2024 (UTC)Reply
You wouldn't be confused for long, since anything in CAT:Harris Gaelic would be under a ==Scottish Gaelic== header, and the category itself would be within CAT:Regional Scottish Gaelic. The comparison with Walser German is poorly chosen, since "German" unmodified refers to a different language (de), while "Gaelic" does not refer to a different language. Or consider the varieties of Regional Ancient Greek, which with the exception of Egyptian Ancient Greek are called "Foo Greek", not "Foo Ancient Greek" (despite the fact that "Greek" unmodified refers to the modern language). I ran some Google Ngrams searches:
In every single instance, the "Foo Scottish Gaelic" variant was too rare to be plottable on an ngram. In the cases not listed above, both "Foo Gaelic" and "Foo Scottish Gaelic" were too rare. There was no case where "Foo Scottish Gaelic" was common enough to be plotted at all, let alone being more common than "Foo Gaelic". —Mahāgaja · talk10:18, 3 October 2024 (UTC)Reply
I’m not a Scottish Gaelic editor. But what about a compromise like “Scottish (Arran) Gaelic” or “Scottish Gaelic (Arran)”? Just a thought. — Sgconlaw (talk) 11:19, 3 October 2024 (UTC)Reply
A tough one. I feel that a good analogy would be if we called English "Modern English". Then we'd end up with regional categories like "Australian Modern English" instead of "Australian English". This name isn't wrong, nor is it even confusing. It would just be a Wiktionary-ism; readers could easily make the connection to the formal language name and realise why we were using an unusual name for the dialect.
On the other hand, this project has seen fit to call the category "Attic Greek" instead of the Wiktionary-ism "Attic Ancient Greek". I would prefer the latter name, but if we make exceptions for Ancient Greek it's difficult to justify why we shouldn't do so for Scottish Gaelic too. This, that and the other (talk) 14:01, 3 October 2024 (UTC)Reply
Given that several people have either expressed outright opposition, suggested alternatives that don't involve removing the word "Scottish" or given at most grudging acceptance, I would not recommend doing this at all. Benwing2 (talk) 19:28, 3 November 2024 (UTC)
It may not logistically work but e.g. "Harris Gaelic" is universally used both by linguists and speakers, and e.g. "Harris Scottish Gaelic" is not a phrase anyone uses. If this change is possible without messing up the coding of the site I would strongly recommend this change.
Google Ngrams shows incomparable to be over 400 times more common than uncomparable. I appreciate that incomparable is often used with the meaning "beyond compare", rather than "not comparable", but it is trivially easy to find examples of it in use in terms like incomparable adjective and incomparable adverb, while Ngrams doesn't even register uncomparable adjective when I try to compare them. While it's certainly possible to find some examples of uncomparable being used as a grammatical term, results for incomparable are much more numerous.
For some reason, our entry at incomparable had had the "not comparable" sense marked as "rare" by an old admin since 2008, so I think the current situation stems from their misconception that uncomparable was the proper term when talking about grammar, which does not seem to be the case. Theknightwho (talk) 01:51, 4 September 2024 (UTC)
Comment: "incomparable adjective" (presumably pronounced "incompárable") sounds wrong to me because of the clash with "incomparable" pronounced "incómparable". Benwing2 (talk) 22:47, 15 September 2024 (UTC)Reply
The categories do not contain citations per se, but Citations: namespace pages. There are plenty of citations in the main namespace, which are not in these categories.
There are currently two different category trees for taxonomic names
Category:Taxonomic name is a variety of Category:Translingual language (language code mul-tax) that's based on the concept of taxonomic nomenclature as a language: its members are all names in the standard taxonomic nomenclatural systems. Aside from the names of viruses, taxonomic nomenclature is basically a rather artificial construct formed from New Latin. It's new and was made possible by the expansion of the capabilities of language varieties a.k.a etymology-only languages. Right now it's a redlink, and I haven't figured out how to get {{auto cat}} to recognize it as valid.
Besides which, the name is kind of silly, since no one uses it to refer to taxonomic names as a group or a system, let alone a language. It would have been better named "Category:Taxonomic names", but the name has been taken by the following.
Category:Taxonomic names is a name category, part of the topical category system, and based on the concept of taxonomic names being something that different languages have. This system has been developed over the years in an ad hoc fashion, so it's a real mixed bag.
Prescriptively, "taxonomic name" refers only to names that are part of past and present systems used to create and manage the official names of biological taxonomic entities. All the names in specific languages as opposed to "translingual" or "multilingual" terms shared across all languages can't be referred to that way. That's because modern taxonomy is done by scientists acting as taxonomists, who have all agreed to abide by the taxonomic codes that dictate what valid taxonomic names are.
The language-specific subcategories mostly violate that. Many of their members are just adapted borrowings of the "real" taxonomic names, equivalent to "cervids" in English from the family Cervidae, which are deer. Others are vernacular names from within the languages that coincide with the taxonomic entities referred to by the official taxonomic names.
Remove all the entries from subcategories that aren't either taxonomic names in the strict sense or derived from taxonomic names in the strict sense.
Move the "terms derived from taxonomic name" categories to "terms derived from taxonomic names" categories under the new Category:Taxonomic names
Convert the language-specific categories to "terms derived from taxonomic names" categories and move them under the new Category:Taxonomic names, or merge them with any of the categories in the previous step if they would have the same name.
Either that, or leave them as they are, but move them under the "terms derived from taxonomic names" categories in the new Category:Taxonomic names and also make them children of the Category:Eponyms by language subcategories
It seems like a good program to rationalize what has grown a bit like Topsy, fitting in to the existing framework, however awkwardly. I'll look at the individual-entry category cleanup today and report back if there are any problems.
I think it is clear that "Translingual" is too big a wastebasket. Segregating taxonomic entries and CJKV characters would leave us with a much smaller wastebasket, itself to be rationalized eventually. DCDuring (talk) 16:03, 15 December 2024 (UTC)Reply
This is a relic that seems to predate even the old system of catboiler templates: it has no templates, just hard-coded category markup. That has allowed this to be overlooked while categories such as CAT:Cuneiform script have been created using our regular category infrastructure and employed in all of the more recently added cuneiform entries. Judging by the rfcs archived on the talk page, the entries that were still in this category (now removed) were mostly there because they've been neglected. Also, two of the subcategories are only used in a single entry: 𒌋 Chuck Entz (talk) 16:13, 11 December 2024 (UTC)
Okay, someone depopulated the category, so that all that remains are three subcats:
@This, that and the other: I'd keep the Cuneiform Syllabary category as a subcategory of Cuneiform script. Those are the cuneiform signs used to spell out words phonetically. I don't think the "Cuneiform Luwian" is a category of its own, but the Hittite syllabary could make sense. I don't do Hittite, though, so I can't really tell for sure. — Sartma【𒁾𒁉 ● 𒊭 𒌑𒊑𒀉𒁲】 15:55, 22 December 2024 (UTC)
Please note there was a previous discussion which is archived at “Category talk:Nautical”. I had previously suggested “Category:Water transport” to align the name with “Rail transport” and “Road transport”, but it didn’t gain sufficient consensus. I’m not sure “Nautical terms” is a good alternative; we don’t use the word terms in labels. — Sgconlaw (talk) 23:46, 7 January 2025 (UTC)Reply
What exactly is this category supposed to be useful for? It has more than 3000 entries in just about every language imaginable. I think it just adds clutter to the category list at the bottom of entries. This, that and the other (talk) 00:53, 25 January 2025 (UTC)Reply
I can see two possible uses: to let people who don't add explanations know that everyone will have a way to find out about it, or to allow someone to go through the body of explanationless requests for attention and look for problems with the usage of the template. In the first case, that's pretty much canceled out by the fact that most people don't know the category exists. The second runs into the problem of the lack of context- no tagging for who did it or in which types of entries. To understand what's going on, you have to visit the entries.
I suppose it might let one spot bad usage: e.g. that a particular editor is simultaneously creating lots of stubs in a language they don't know and tagging all of them to shift the responsibility for fixing them to others- but one actually would have to take the time to patrol the newer additions to the category. There are so many other ways to look for problems that I've never bothered to even look at this category. Chuck Entz (talk) 02:19, 25 January 2025 (UTC)Reply
Keep. This is potentially useful to draw attention to entries that have content problems of a kind likely only readily noticed by a human and not easily otherwise searchable. A review of such entries that led to correction of the problem, better categorization of the problem, or removal of a spurious template would be a better-than-average contribution to Wiktionary IMHO. DCDuring (talk) 15:07, 21 February 2025 (UTC)Reply
My understanding of the rationale is basically that these seven classes date back to Proto-Germanic or PIE and the verbs have undergone such heavy evolution since that period that the classes are essentially meaningless in English.
Looking from a modern English perspective, I don't see any particular commonalities between verbs of the same "class". Sure, most of the so-called "class 4" verbs have a past tense in -o-, but so does much of "class 1", and there are exceptions. Some of the included verbs (seethe, starve) don't even seem to show any "strong" characteristics in Modern English.
Delete. If someone wants to propose etymology categories, like "English verbs derived from Proto-Germanic class 4 verbs", we can discuss that, but these categories as they presently exist don't seem ... coherent? useful? - -sche(discuss)20:32, 13 February 2025 (UTC)Reply
The title of this request says "Category:English strong verbs and its subcategories", but the body of the request only gives reasons to delete the subcategories. Is the deletion of Category:English strong verbs also being proposed? I don't think the overarching category is meaningless in modern English. Strong verbs as a whole still generally show the characteristic features of vowel change between present versus past or past participle; no dental suffix in the past tense and participle; and (often) addition of the suffix -en or -n in the past participle. As for "seethe" and "starve", the question is whether strong forms of these verbs such as "starven" are attested since 1500, our cutoff date for Modern English: if so, there isn't anything erroneous in principle about including them in a category of Modern English strong verbs, even if the strong forms are not used in present-day English.--Urszag (talk) 22:38, 13 February 2025 (UTC)Reply
@Urszag Even w.r.t. "strong" and "weak" verbs, this distinction is very muddled in English. Etymologically, meet/met is a weak verb whereas bite/bit and hold/held are strong verbs, but there is little to indicate this in modern English; the only remnant of the -en past participle is alternative past participle bitten and the semi-archaic adjective beholden. As for light/lit, I have no idea whether it's etymologically strong or weak, and there's no obvious indication of either in the modern language. Meanwhile you have dive/dove, shine/shone, sneak/snuck and others that are etymologically weak verbs but have gotten a strong-like past tense by analogy to other verbs. Overall I think it makes more sense to simply distinguish "regular" and "irregular" verbs, as most grammatical books on English do. Benwing2 (talk) 22:56, 13 February 2025 (UTC)Reply
Delete only subcategories but keep the main category of strong verbs as a subcategory of Category:English irregular verbs. I'll admit that English vowel shifts and inconsistent stem generalization were not kind to the original strong subclasses. However, I still firmly believe that strong verbs with their characteristic ablaut and -en participles are still a synchronically distinguished subclass of irregular verb. The fact that secondary transfers to strong verbs like snuck and dove even occur in the first place indicates that yes, English speakers still distinguish strong verb ablaut as a distinctive subkind of irregularity. — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 06:55, 9 March 2025 (UTC)
Hmm are we sure there is a clear line between strong and other irregular verbs? Example: tell/told (etymologically weak) vs. hold/held (etymologically strong). I would say that examples like snuck and dove (and more recent ones like dranken and tooken, both of which I've heard in the wild) just show that English speakers are willing to generalize from common irregular patterns (much as how German speakers extracted the umlaut+-er plural formation for neuters from a handful of Old High German examples). In particular, catch/caught was originally an analogical formation based on inherited teach/taught, which is irregular but etymologically weak. Benwing2 (talk) 08:05, 10 March 2025 (UTC)Reply
Told is obviously not strong given the weak -d suffix there vs. the present stem. And to me I see that there are separate strong and weak irregular patterns to even generalize in the first place (strong snuck + dove + dranken + tooken with a vowel change and/or -en suffix, no damage to the coda of the root, and no extra coronal stop tacked on the end), and irregularity like catch and teach (where the entire rhyme of the root is torn off with the root-final consonant never to be seen again, unlike strong verbs that keep the root's coda or lack of one around). I still see English strong verbs having a distinctive manifestation of irregularity compared to other irregulars with different behaviour. — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 21:58, 10 March 2025 (UTC)Reply
OK how about meet/met? This is etymologically weak but has no clear indication of this anymore and has apparent vowel ablaut, and you'd have to be pretty well-versed in English historical linguistics to know why this is weak and not strong. Benwing2 (talk) 22:05, 10 March 2025 (UTC)Reply
meet, feed, bleed, breed are characterized by /i/ to past /ɛ/ to participle /ɛ/ "ablaut". Certainly they stand out from the actual old-style strong verbs? I can't think of any historically strong verbs that ended up with this ablaut. — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 22:46, 10 March 2025 (UTC)Reply
What I'm saying is that they "stand out" only because you know the phonological history of English. Someone who doesn't know the difference between ablaut and umlaut, for example, won't have any idea why the vowel change of meet -> met is classified differently from the vowel change of bite -> bit. You might say "well the latter has past participle bitten" but in many people's speech, the past participle of bite is bit not bitten. How about light -> lit? Is this weak or strong (and why do we even care)? My point is that from a synchronic perspective things are too mixed up to make a cogent strong/weak distinction. Benwing2 (talk) 23:15, 10 March 2025 (UTC)Reply
Meet/met actually still does not require etymological information; /i/-/ɛ/-/ɛ/ alternation characterizes verbs like sweep. From sweep and meet one can see a class of iC - ɛCt - ɛCt verbs where the -t is suppressed after an alveolar stop. And on "Is this weak or strong (and why do we even care)?" I am not intending to classify English irregular verbs in a dichotomy of strong or weak; I do not believe there is a unified "irregular weak verb" class. But I do still believe that there are subcategories of the irregular verbs with distinct formation patterns that we can sort verbs in, even without saying "weak verb". — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 23:51, 10 March 2025 (UTC)Reply
I still believe that a strong verb is redefinable synchronically: as a verb that simply, without any additional endings, forms its past tense by exclusively vowel ablaut (forget about the -en), except for the iT-ɛT-ɛT ablauting group (since its vowel alternation resembles the iC-ɛCt-ɛCt e.g. sweep group so closely that iT-/ɛT-/ɛT can be seen as a variant of iC-ɛCt-ɛCt with conditionally suppressed -t). And re: light, I believe you are assuming that I would not classify it as strong because of its earlier history. However, if the diachronic definition of "strong verb" is replaced with a synchronic definition, I would classify both light and bite as strong verbs; as we allowed earlier, it would not be the first or last English verb to secondarily acquire strong conjugation. — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 02:53, 11 March 2025 (UTC)Reply
March 2025
Pasadena-related categories (re-creation)
The places category formerly contained Annandale, Oak Knoll and Linda Vista, all neighborhoods within the city, but they were removed from the category inexplicably by the editor who wanted to delete the categories. The Pasadena category, in addition to containing places, also contained the entry Rose City; again, this was inexplicably removed by another editor. Having a category for places or neighborhoods within a city is standard; there are many of them and there was no reason for deletion. "Empty cat" is not a reason either because there clearly exist entries to populate the cat. I would also note that being malformed isn't a reason for deletion either; malformed categories should be FIXED, not deleted. And I'm flummoxed as to why creation-protection was invoked on an entry that had only been created once. Purplebackpack89 12:31, 22 March 2025 (UTC)
@Purplebackpack89 The immediate reason for the deletions was no doubt the module error resulting from its absence from the modules. The relevant modules have been substantially reworked, so that may not have been true until recently. As for why there's no data for your categories in the modules: I notice that we have Category:Long Beach, not Category:Long Beach, California, even though Wikipedia has Long Beach, California and Long Beach, New York. There might be some time in the future when we'll have to figure out how to deal with homographic city categories, but that's something that can be discussed. I also notice that there are categories for only the top 50 cities in the US by population. I can only guess as to whether either of those points to the actual reason to not allow your categories, but I didn't see you asking what might be wrong with the categories. I've created a lot of categories over the past decade and a half, and some have been deleted, renamed, or substantially changed. I would never even consider undoing any of those actions without a discussion. Chuck Entz (talk) 20:21, 22 March 2025 (UTC)Reply
Are there things wrong with the categories? Possibly, maybe even probably. Was deletion the remedy? I would say "no", because we would be better served by fixing them or renaming them rather than deleting them. At the very least, discussion first, deletion afterward, creation-protection WAY afterward.
As for "we only have categories for the top 50 cities", shouldn't we allow creation of a category for any city that has enough entries to fill it? Purplebackpack89 20:36, 22 March 2025 (UTC)
We don't normally have categories for cities under 1,000,000 people or so and Pasadena has < 200,000. On top of that, the categories were created manually so they don't fit into the {{auto cat}} system. Pasadena is only the 45th biggest in California per Wikipedia; we can't reasonably create categories for every random city in the US. Benwing2 (talk) 22:12, 22 March 2025 (UTC)Reply
We CAN and SHOULD create categories for every city that has three or more entries related to it. There's zero problem with doing that, like, AT ALL. Purplebackpack89 22:55, 22 March 2025 (UTC)
Three is a poor baseline number for a category. It will lead to excessive proliferation of categories and an unmanageable and unnavigable category structure. I would think around 20 CFI-compliant place names would be a bare minimum for a set category of this kind. WT:Categorization could and should offer some guidance here, but it currently doesn't. This, that and the other (talk) 06:46, 24 March 2025 (UTC)Reply
I noticed that most languages on Wiktionary start suffixes with "-" but there are some suffixes listed in Telugu that don't start with "-". I think this is because Charles Philip Brown's A Telugu-English dictionary doesn't distinguish root morphemes from affixes with the dash and those entries were just copied into Wiktionary. Some lemmata, like లో, can be used both as a word and a suffix and still keep the same meaning, making this distinction a bit blurry, but other ones, like వు, aren't entire words. Can @Emmanuel Asbon, or @Rajasekhar1961, or anyone else familiar with these pages confirm that this was intentional or if we should move the affixes to indicate that they aren't true words in their own right? A handful of Category:Telugu prefixes also have the same issue. Anivegesana (talk) 02:14, 12 April 2025 (UTC)Reply
If you search for CJKV characters on Google... this category page is the first result! Clearly few or no people outside the wikiverse are using this term. (Also note the odd capitalisation.)
/ʊ/ is not phonemic in Dutch, and the only page on it (WW, which I corrected) has a pronunciation that does not match the rhyme (it actually rhymes with /-eː/, and it has two syllables, not six — and the full word, Werkloosheidswet, has four syllables).
As of now, the English category only has 174 entries, so it doesn't seem like it needs to be diffused for the purpose of navigation, but I certainly wouldn't object if you can identify a subset of kinds of vessels that have a sufficient distinction from other such vessels. —Justin (koavf)❤T☮C☺M☯06:00, 10 June 2025 (UTC)Reply
Support as there are more than enough to make a category, and many aren't even categorized currently. I also think Category:Vessels should potentially be renamed, as it's easily confused with the other meaning of "vessel" as a type of transportation and most people aren't even clear on the distinction between "vessel" and "container". I'm not sure though what to call it, maybe "food and drink containers"? Benwing2 (talk) 19:27, 10 June 2025 (UTC)Reply
True, re "vessel" being a confusing name. There are currently some things in "Vessels" that are not food/drink containers, e.g. the fuel flimsy, and there are some things in "Containers" that are drink containers (bottle, maybe ghurra). I have no objection to renaming "vessels" to "food and drink containers" (a subcategory of "containers"), and then this alcoholic-drink-container category, if created, could be a subcategory of "food and drink containers". - -sche(discuss)01:01, 15 June 2025 (UTC)Reply
What precisely do we want the scope of the category to be? I was thinking of the way coupes vs flutes vs Nick&Noras (etc) have specific shapes, but e.g. dead marine (currently in "CAT:en:Containers") is arguably an "alcoholic drink container"... do we want a category only for specific shapes (what could it be named?), or for any "alcoholic drink containers"? - -sche(discuss)01:01, 15 June 2025 (UTC)Reply
@-sche: Swerving off the immediate topic, I would say we would want "Drinking containers" (the "-ing" is to exclude things like pitchers and milk cartons) as the parent. It seems odd to include things like steins and martini glasses together, and there are non-alcoholic drinks like maté that have their own special drinking containers. I wonder if there are similar material-cultural complexes in Islamic societies that don't allow alcohol. Chuck Entz (talk) 02:03, 15 June 2025 (UTC)Reply
Proposal: "English diaeretic spellings" category
I'm proposing creating a category for handling the "New Yorker style" spellings - coöperate, reënter, reäppearance - where a diaeresis is used between consecutive vowels in compounds and prefixed terms and conventional spelling forms would be closed (cooperate) or hyphenated (co-operate). I think Category:English diaeretic spellings would be useful in the same way that Category:Oxford spellings is - these are forms that are correct in the house style of a few significant and respected publishers, but very rare or incorrect in all contemporary standard Englishes, so it makes sense to flag them specially and group them in one place.
This category would not include loan words like doppelgänger (which is umlaut, not diaeresis) or naïve (which is not a compound and cannot be written as na-ive, and is more widely accepted than coöperate).
diaeretic: Spelled with a diaeresis. Where a compound or affix results in two vowels being placed consecutively, this can be marked with a diaeresis (for example, coöperate). This practice is archaic or obsolete in all standard forms of English, but is maintained by a few publishers (perhaps most notably, The New Yorker). In standard usage, these terms would usually be spelled closed (cooperate) or with a hyphen (co-operate).
"diaeretic" is a rare term, but I think linking to the glossary should be enough explanation. Calling it "New Yorker spelling" is an alternative (like "Oxford spelling" as shorthand for "Oxford English Dictionary spelling"), but historically this spelling was used much more widely (spellings like zoölogy were once standard in educated writing), so it feels weird to call out one specific publication.
I think this should be a fairly uncontroversial change, but I just wanted to run it past everyone because it would involve quite a few edits, and we'd need to be clear about edge cases like aërial and zoölogy - my personal feeling is that aërial would not be included, since it occurs within a stem, but zoölogy would be because it's on the boundary between two stems, even though zo-ology is extremely rare and would now likely be interpreted as zoo- + -logy. (FWIW, the New Yorker stopped using aërial in the 1930s, and it usually - but not always - uses zoology instead of zoölogy.) Smurrayinchester (talk) 09:29, 10 June 2025 (UTC)Reply
I agree with @Urszag here in everything said. "Diaeretic" sounds like a type of medicine (cf. diuretic) and is going to confuse more than anything else. There appear to be several different reasons why words may be spelled with a diaeresis/umlaut; they would need to be teased out. There's no support currently for adding a tag to "spelled with" categories of a form like Category:English terms spelled with ◌̈ (diaeresis), but it could be added. Benwing2 (talk) 22:24, 10 June 2025 (UTC)Reply
I agree with Chuck Entz's comment that "spelled with diaereses" or "Alternative spelling with diaereses" may be best.
"Those two dots, often mistaken for an umlaut, are actually a diaeresis (pronounced “die heiresses”; it’s from the Greek for “divide”)." I, too, am one of those who did not know of a distinction between diaeresis and umlaut. One way to write this could be "Alternative spelling using a diaeresis" rather than "Diaeretic spelling". Trying to differentiate the umlauts and diaereses, and having a common parent category for them, would be valuable for helping readers understand and study the usage patterns. The effort to classify each word would really help people understand these words much better. It brings to mind Lüeyang and similar, which I believe has an umlaut. My guess is that umlaut spellings are more likely to be printed today than diaeresis spellings? Geographyinitiative (talk) 22:48, 10 June 2025 (UTC)
@Benwing2 those names are fine by me, but I'd prefer if there was a shortcut to easily search for Afghan-Uzbek without typing in such a long name... but if that's not possible then I suppose I'll just deal with it. — BABR・talk18:43, 12 June 2025 (UTC)Reply
@Babr We could potentially put a category like Category:Uzbek terms in Afghan-Uzbek Arabic script underneath CAT:Afghan Uzbek, if that makes logical sense. I don't know if something similar could be done for Yangi Imlo, as I'm not very familiar with Uzbek dialects. Also, IMO the names need to include the language and script in them for clarity. Maybe the names could be restructured to have Yangi Imlo and such at the beginning but that definitely makes the category tree code trickier; if we were to do that, it should be done generally so that we maintain consistency in naming. Benwing2 (talk) 20:36, 12 June 2025 (UTC)Reply
Related Romanian issue
This is off the immediate topic, but while trying to think whether the Uzbek situation was analogous to—and thus could be named similarly to—anything else, I noticed two things: Category:Oxford spellings is a subcategory of Category:British English, when maybe it should be a subcategory of Category:British English forms? And similar to the situation with Uzbek, Category:Romanian Cyrillic spellings does not currently differentiate "pre-1860s" Romanian spellings like нє, and "post-1930s" / Moldovan spellings like вис, so if we start subcategorizing different Uzbek orthographies, let's also consider subcategorizing Romanian ones (and perhaps we can find better labels for the Romanian orthographies; both dates are suboptimal), and consider whether to switch Romanian to the "Language terms in Foo script" category system (rather than "Language Foo spellings"). - -sche(discuss)05:10, 11 June 2025 (UTC)Reply
Better yet, we should have a policy that excludes the ‘archaic Romanian Cyrillic script’. It is a vastly inconsistent and unstandardised writing system used across wide time intervals, there is no lexicographical precedent to including it, and there is no benefit in bothering with it. The handful of entries we have in it should be deleted.
I heartily disagree with excluding the archaic Romanian Cyrillic spellings. If they're attested, they should be included. We aren't paper; there's no reason to prevent an interested user from finding the word they're looking for, just because it's in "a vastly inconsistent and unstandardised writing system used across wide time intervals". —Mahāgaja · talk10:53, 11 June 2025 (UTC)Reply
Categories related to hanafuda and karuta
Japanese and Korean have a bunch of terminology related to hanafuda (a type of playing cards, also known as hwatu), some of which have been added to Wiktionary. A dedicated hanafuda category would probably be useful, to make such terms easier to find and organize. So I suggest adding the following to Module:category tree/topic/Games:
labels["hanafuda"] = {
	type = "related-to",
	description = "=[[hanafuda]] playing cards, also known as [[hwatu]]",
	parents = {"card games"},
}
I've noticed that Module:category tree/topic/Games already includes a "Go-Stop" category, which is the most popular hanafuda game in Korea. This category seems to currently be entirely unused (or at least, Category:Go-Stop, Category:ko:Go-Stop and Category:en:Go-Stop all contain no pages), and I think any Go-Stop-specific terminology probably wouldn't be out of place in a more general hanafuda category, so I'd suggest removing the Go-Stop category from the module. But if we're keeping it, we should change its "card games" parent category to "hanafuda".
Lastly, I noticed that there is a "karuta" category, which has some problems. Karuta are a category of card games of Japanese origin that include hanafuda, but in common parlance the term often refers specifically to the game of uta-garuta, and right now it's ambiguous which of the two meanings of the word the category is about. (Currently, all terms in Category:ja:Karuta seem to be related to uta-garuta specifically.) We should probably either unambiguously make the category about uta-garuta by changing its description and perhaps rename it, or move the terms to a new uta-garuta-specific category, and have "karuta" be the parent category of both "uta-garuta" and "hanafuda". I'm not sure how useful it is to have such a parent category including both, as the two classes of games are quite different and don't seem to have a lot of terminology in common. As far as I know, other types of karuta games aren't nearly as popular, and any terminology for those games could maybe fit well enough under the "card games" label. But I'm not sure what would be better. Spenĉjo (talk) 00:20, 17 June 2025 (UTC)Reply
@Spenĉjo I moved your request to WT:CLTR, where these sorts of requests are normally handled (the Grease pit is for technical issues, and the Beer parlour is for policy issues). I don't know anything about these games, but your request sounds reasonable. How many terms currently exist in Wiktionary related to hanafuda? Generally we'd want there to be at least 10 terms in at least one language for it to make sense to have a dedicated category. As for the specific topics, I think it makes the most sense to have two categories for hanafuda and uta-garuta that are directly under card games, rename the existing Category:ja:Karuta to Category:ja:Uta-garuta, and remove the Go-Stop topic. Benwing2 (talk) 05:06, 17 June 2025 (UTC)Reply
Thank you for moving it to the right place, I'll remember it for the future. I can currently find the following 18 Japanese hanafuda terms on English Wiktionary: 親, 子, かす, 猪鹿蝶, 脱衣花札, 立直, 松, 梅, 桜, 藤, 菖蒲, 牡丹, 萩, 芒, 菊, 紅葉, 柳, 桐. In Korean, I only managed to find one: 풍. (It is currently the sole entry in Category:ko:Karuta, but it could easily be moved to Category:ko:Hanafuda.) Japanese Wikipedia also has a list with dozens more hanafuda terms at 花札#用語, of which I was considering adding some of the most notable ones to Wiktionary. Spenĉjo (talk) 07:20, 17 June 2025 (UTC)Reply
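For reference, the layout suggested above (separate "hanafuda" and "uta-garuta" topics sitting directly under "card games", with no Go-Stop topic) might look roughly like the sketch below. This is only an illustration: the field names (type, description, parents) are the ones used in the proposal, while the "uta-garuta" label name and both description strings are assumptions, and the real Module:category tree/topic/Games may require additional fields.

```lua
-- Hypothetical sketch only; not the deployed module data.
-- Field names follow the proposal in this thread; the description
-- wording and the "uta-garuta" label name are assumptions.
local labels = {}

-- Proposed new topic for hanafuda/hwatu terminology.
labels["hanafuda"] = {
	type = "related-to",
	description = "hanafuda playing cards, also known as hwatu",
	parents = {"card games"},
}

-- Would take over the entries currently in Category:ja:Karuta.
labels["uta-garuta"] = {
	type = "related-to",
	description = "the card game uta-garuta",
	parents = {"card games"},
}

-- No labels["Go-Stop"] entry: under this layout the unused topic is dropped,
-- and any Go-Stop terms would go under "hanafuda".

return labels
```

Under this layout, 풍 would move from Category:ko:Karuta to Category:ko:Hanafuda, as mentioned above.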
"outdated" or "no longer used" parent category/label of "obsolete", "archaic" and "dated"
@Vininn126 @-sche Pinging a couple of people who may have opinions, but opinions are welcome from all. We currently make a distinction between obsolete, archaic and dated, and have both terms and forms categories for each, with the forms category (e.g. Category:English archaic forms) being a subcategory of the terms category (e.g. Category:English archaic terms), and the terms category a subcategory of terms by usage. Unfortunately this three-way distinction is problematic in many languages because the respective dictionaries don't make such a distinction. For example, Russian dictionaries label all such terms indiscriminately as устар. approximately meaning "outdated" or "no longer in use". For Russian in particular we've followed the principle of using the dated label to translate устар., but I think this gives a misleading impression as many of these terms are not merely dated but are archaic (less likely obsolete; obsolete terms are often left out of dictionaries entirely except for historical-minded dictionaries like Dal's). I'm thinking we can create a label outdated (or maybe no longer used), which is restricted to certain languages (or at least not allowed for English), and is used to translate устар. and similar notations for other languages. The corresponding category would be either Category:English outdated terms or Category:English terms no longer in use or something similar, and it would be the parent of the three obsolete, archaic and dated categories. (Possibly "outdated" is better for this purpose than "no longer used" because archaic and especially dated terms are indeed sometimes still used.) Thoughts? (BTW Happy Juneteenth; we had a major parade through our neighborhood this morning.) Benwing2 (talk) 21:36, 19 June 2025 (UTC)Reply
The solution is for editors to use their own judgement. And to learn what we mean by dated, archaic and obsolete, of course. ―K(ə)tom (talk) 21:49, 19 June 2025 (UTC)Reply
The problem is that this is impossible for someone like me who is not a native Russian speaker. This assumes that all editors are native speakers of the language they're editing, which is not in practice the case. Benwing2 (talk) 22:26, 19 June 2025 (UTC)Reply
And even for native speakers, they are unlikely to be sufficiently familiar with older literature and poetic language, etc. to be able to make such a judgment. That's why we rely on dictionaries to tell us this. Benwing2 (talk) 22:31, 19 June 2025 (UTC)Reply
Speaking of dictionaries: in my experience, even the best and most modern are so behind the times as to not tag terms which are dated (by our standards) in any way. When a dictionary tells you a word is archaic/obsolete, then you know it’s for real. ―K(ə)tom (talk) 22:43, 19 June 2025 (UTC)Reply
I also have some thoughts on the way the affective character of the ‘dated’ and ‘archaic’ labels leads to their failing to cover and accurately describe the totality of not-entirely obsolete vocabulary, making necessary the use of {{lb|now|uncommon}} and possibly its spinning off onto a category separate from ‘uncommon terms’—but that’s probably just an entirely personal issue. ―K(ə)tom (talk) 21:56, 19 June 2025 (UTC)
Polish dictionaries are not always consistent about this either. WSJP matches our system more closely, whereas PWN and Doroszewski use just daw. I prefer having a three-way distinction because in my opinion there clearly is one; I'd mostly just change "archaic" to "archaicizing". Vininn126 (talk) 08:19, 20 June 2025 (UTC)
I understand you want to preserve the 3-way system, but what do you do when it just says 'daw.' or 'устар.'? Do you just put "archaic" and hope for the best? That seems questionable. Benwing2 (talk) 08:34, 20 June 2025 (UTC)Reply
I'll add also that with older dictionaries it's easier, as they tend to be obsolete; also that I do see the benefit of a catch-all, not necessarily because of other dictionaries, but rather due to editors themselves poorly distinguishing them. Perhaps a two-way distinction, outdated vs archaicizing. Vininn126 (talk) 09:05, 20 June 2025 (UTC)Reply
You are lucky to have a dictionary that makes the distinctions ... I haven't yet come across a single Russian dictionary that distinguishes them. Benwing2 (talk) 17:40, 20 June 2025 (UTC)Reply
@Vininn, if we were going to have just a two-way distinction, wouldn't it be better to group "(out)dated" and "archaic(izing)" together on one side, and have the other side be "obsolete"="no longer used at all"? It seems easier to ascertain (through RFV if needed) "do citations more recent than such-and-such cutoff date exist?", and thus whether the term is on the "obsolete/unused" side of the line or not, than to ascertain whether a still-used term's connotations are merely dated / outdated or are fully archaic / archaicizing. @Benwing, what if we add a label but treat it as a cleanup category that a native speaker will ideally come along and specify as either dated or archaic, maybe even have the label display "dated or archaic"? (I'm not sure what to have the label input itself or the category name be; should that too just be "dated or archaic"?) To avoid T:orthographic borrowing- and smoug-type problems (that whenever something exists for one language, people who don't know why it exists try to invent ways to apply it to every other language), perhaps we only allow it for certain languages? - -sche(discuss)06:14, 21 June 2025 (UTC)Reply
I'm not sure about that, as archaicizing is an active part of the language, whereas dated/obsolete are more "passive", if that makes sense. What I mean is that dated/obsolete are indeed on a sliding scale, whereas archaicizing is a separate process (that is, a word being used intentionally to evoke an old sound, which is not the same as a word falling out of use). Vininn126 (talk) 06:55, 21 June 2025 (UTC)
The Ottoman Empire label references the Near East label, which doesn't seem to exist. I think it should be updated to Ancient Near East, but the page is edit protected. — NaomiAmethyst 00:54, 21 June 2025 (UTC)
Category:English short forms is generated by {{short for}}, while Category:English shortenings is the parent category of "short forms" and all other sorts of abbreviations, ellipses and the like. The problem is that "short forms" is completely ill-defined; all members of this category need to be cleaned up and categorized properly as an ellipsis, clipping, acronym, abbreviation or other type of shortening. I propose simply dumping anything coming out of {{short for}} into the parent category (and creating an abuse filter warning people against new uses of {{short for}}, but that's a different matter). Benwing2 (talk) 04:16, 26 June 2025 (UTC)
Adding a label for dubbing/voice-over
labels = {
	type = "related-to",
	description = "], the replacement of a voice part in media",
	parents = {"film", "television", "video games"},
	-- not sure if we put it under each one or just "mass media"; also just realized there isn't a topic cat for "translation" :/
}
And the corresponding label in Module:labels/data/topical, although I have the perms to create this one myself:
@Sgconlaw: hmmmm. For dubbing it would be fine, but I don't think things like dubber and voice actor would fit so well. I actually proposed this thinking of languages that have a bigger dubbing culture and thus have more specific terms, like anel and loop in Brazilian Portuguese. Trooper57 (talk) 16:53, 29 June 2025 (UTC)
I wonder if the words given with the {{suf}} or {{suffix}} template can be automatically categorized into the "X terms derived from X" category. This would help me fill in the unstated part in the pie chart I’m working on. – BurakD53 (talk) 03:49, 18 July 2025 (UTC)
This seems like a completely ordinary way in which words are formed within a language. In fact, I would even hazard that this is the most common way (compared to borrowing, inheritance, etc). Why does it have to be specifically categorized? — Sgconlaw (talk) 21:37, 19 July 2025 (UTC)
Aren't categories there to make it easier for people to find what they're looking for? The suffix template doesn't link to suffixed lemmas. What's the point? Why isn't there a template list? The inherited template links to inherited terms, the borrowed template links to borrowed terms, and so on for derived terms, but what about suffixed terms? It does not have to be X terms derived from X. It can be just 'X suffixed terms', which is very useful for anyone. - BurakD53 (talk) 07:35, 20 July 2025 (UTC)
Anyway, if you think this is not reasonable, just don't do it; I can make it myself. I will do it myself by collecting the suffixes' own categories, even though it will be an unnecessary occupation. I was just asking for a basic category, but OK. - BurakD53 (talk) 07:46, 20 July 2025 (UTC)
I think what Burak wants is to make the "unstated" section in charts like this one reflect the origin of the base of derivation, so that the etymological make-up of a language is not obscured by the large majority of morphologically derived terms. If the base of the derivation ultimately goes back to a language that is set as an ancestor, it should be included in the "inherited" (or, more correctly, native) vocabulary, whereas if the base of the derivation is ultimately from any other language, it should be included under the other languages that are already counted among underived items (in the case of Turkobaijani, mostly Arabic or Persian). Allahverdi Verdizade on a flying visit (talk) 07:34, 21 July 2025 (UTC)
I thought about that. But wouldn't it be difficult to do so? How could we extract the root of a word like selamlamak using a suffix template? We would have to indicate the etymology of selam too with another template, which requires edits to all suffixed pages. BurakD53 (talk) 13:47, 21 July 2025 (UTC)
Also, unless we're careful to exclude {{surf}} (and to make sure it's always used when it should be), a lot of terms will be counted twice. —Mahāgaja · talk 21:10, 21 July 2025 (UTC)