This is the documentation page for the generic category tree system, as well as for its submodules. Collectively, these modules generate the descriptions and categorization for all category pages which are automated as part of the category tree system (i.e. all categories except those that use manual wikicoding). All pages handled by the category tree should use {{auto cat}} as the text of the category page. (In some cases, parameters need to be specified to {{auto cat}}; see its documentation for details.)
For historical reasons, the generic category tree implementation is split between Module:category tree and Module:category tree/poscatboiler, but these two modules will eventually be merged. (Originally, Module:category tree was a generic module in the object-oriented tradition with many implementations, of which Module:category tree/poscatboiler was only one. It originally handled part-of-speech categories like Category:French nouns and Category:German lemmas, and corresponding "umbrella" categories such as Category:Nouns by language and Category:Lemmas by language; hence the name.)
The main data module at Module:category tree/data does not contain data itself, but rather imports the data from the category tree submodules, and applies some post-processing.
The submodules to import are named in the subpages list at the top of Module:category tree/data.
The category tree system internally distinguishes the following types of categories:
* Language categories have names of the form LANG LABEL (e.g. Category:French lemmas and Category:English learned borrowings from Late Latin). Here, LANG is the name of a language, and LABEL can be anything, but should generally describe a topic that can apply to multiple languages. Note that the language mentioned by LANG must currently be a regular language, not an etymology-only language. (Etymology-only languages include lects such as Provençal, considered a variety of Occitan, and Biblical Hebrew, considered a variety of Hebrew.) Most language categories have an associated umbrella category; see below.
* Umbrella categories have names of the form LABEL by language, and group all categories with the same label. Examples are Category:Lemmas by language and Category:Learned borrowings from Late Latin by language. Note that the label appears with an initial lowercase letter in a language category, but with an initial uppercase letter in an umbrella category, consistent with the general principle that category names are capitalized. Umbrella categories themselves are grouped into umbrella metacategories, which group related umbrella categories under a given high-level topic. Examples are Category:Lemmas subcategories by language (which groups umbrella categories describing different types of lemmas, such as Category:Nouns by language and Category:Interrogative adverbs by language) and Category:Terms derived from Proto-Indo-European roots (which groups umbrella categories describing terms derived from particular Proto-Indo-European roots, such as Category:Terms derived from the Proto-Indo-European root *preḱ- and Category:Terms derived from the Proto-Indo-European root *bʰeh₂- (speak)). The names of umbrella metacategories are not standardized (although many end in subcategories by language), and internally they are handled as raw categories; see below.
** Some language categories have umbrella categories with nonstandard names. An example is categories of the form LANG phrasebook/AREA (e.g. Category:English phrasebook/Health), whose umbrella category has the nonstandard name Phrasebooks by language/AREA (e.g. Category:Phrasebooks by language/Health). Another example is categories of the form LANG terms borrowed back into LANG, with a nonstandard umbrella category Category:Terms borrowed back into the same language. Both of these examples are handled by disabling the standard umbrella category support and listing the nonstandard umbrella category as an additional parent.
** Some umbrella categories lack the by language suffix; an example is Category:Terms borrowed from Latin, which groups categories of the form LANG terms borrowed from Latin. There is special support for umbrella categories of this nature, so they do not need to be handled as described above for umbrella categories with nonstandard names.
* Language-specific categories have names of the same form LANG LABEL as regular language categories, but with the difference that the label in question applies only to a single language, rather than to all or a large group of languages. Examples are Category:Belarusian class 4c verbs, Category:Dutch separable verbs with bloot, and Category:Japanese kanji by kan'yōon reading. For these categories, it does not make sense to have a corresponding umbrella category.
Under the hood, the category tree system distinguishes two types of implementations for categories: individual labels (or individual raw categories), and handlers. Individual labels describe a single label, such as nouns or refractory rhymes. Similarly, an individual raw category describes a single raw category. Handlers, on the other hand, describe a whole class of similar labels or raw categories, e.g. labels of the form learned borrowings from SOURCE where SOURCE is any language or etymology language. Handlers are more powerful than individual labels, but require knowledge of Lua to implement.
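The naming scheme for language and umbrella categories described above can be sketched in a few lines of Lua. This is only an illustration, not the module's own code: the helper names ucfirst, language_category and umbrella_category are hypothetical, and the capitalization shown handles ASCII initials only.

```lua
-- Hypothetical helpers illustrating the category naming scheme.
-- ASCII-only capitalization; real labels may need Unicode-aware handling.
local function ucfirst(text)
	-- Parentheses drop gsub's second return value (the match count).
	return (text:gsub("^%l", string.upper))
end

-- "LANG LABEL": the label keeps its lowercase initial.
local function language_category(langname, label)
	return "Category:" .. langname .. " " .. label
end

-- "LABEL by language": the label is capitalized because it begins the name.
local function umbrella_category(label)
	return "Category:" .. ucfirst(label) .. " by language"
end

print(language_category("French", "lemmas"))
print(umbrella_category("lemmas"))
```

The same label string thus serves for both kinds of category; only its position in the name decides whether it is capitalized.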
A sample entry is as follows (in this case, found in Module:category tree/lemmas):
	labels["adjectives"] = {
		description = "{{{langname}}} terms that give attributes to nouns, extending their definitions.",
		parents = {"lemmas"},
		umbrella_parents = "Lemmas subcategories by language",
	}
This generates the description and categorization for all categories of the form "LANG adjectives" (e.g. Category:English adjectives or Category:Norwegian Bokmål adjectives), as well as for the umbrella category Category:Adjectives by language.
The meanings of these fields are as follows:
* The description field gives the description text that will appear when a user visits the category page. Here, {{{langname}}} is automatically replaced with the name of the language in question. The text in this field is also used to generate the description of the umbrella category Category:Adjectives by language, by chopping off the {{{langname}}} and capitalizing the next letter.
* The parents field gives the labels of the parent categories. For example, Category:English adjectives will have Category:English lemmas as its parent category, and Category:Norwegian Bokmål adjectives will have Category:Norwegian Bokmål lemmas as its parent category. The umbrella category Category:Adjectives by language will automatically be added as an additional parent.
* The umbrella_parents field specifies the parent category of the umbrella category Category:Adjectives by language (i.e. the umbrella metacategory which this page belongs to; see #Concepts above).
The following fields are recognized for the object describing a label:
* parents: The parent categories of this category. Each item is either a plain string (the parent label) or a table with at least the fields name and sort. In the latter case, name specifies the parent label name, while the sort value specifies the sort key to use to sort it in that category. The default sort key is the category's label. (The text of the breadcrumb itself is specified using the breadcrumb setting, as described below.) A parent table can also contain the following:
** Use raw = true to specify that the parent is a raw category.
** Use lang = CODE to specify that the parent is a language category for the language code CODE instead of the current language. Note that template substitutions happen in the lang field; see #Template substitutions in field values below.
** Use lang = false to specify that the parent is an umbrella category.
** Use is_label = true and lang = CODE to specify that the parent is a language category with the specified language code CODE. Template substitutions happen in the lang field, as above.
** Use is_label = true and lang = false to specify that the parent is an umbrella category.
** If the parent name begins with Category:, it is interpreted as a category outside the category tree system. It can still have its own sort key as usual.
** Use sc = SCRIPT_CODE to specify a script code SCRIPT_CODE for script-specific categories (e.g. Category:Pali nouns in Thai script) and/or args = {...} to specify additional arguments, for categories implemented using a handler that accepts or requires additional arguments passed to {{auto cat}} (e.g. a category like Category:Latin terms suffixed with -inus or Category:Okinawan language). Template substitutions happen in the values of both of these properties; see #Template substitutions in field values below.
* description: The description text of the category. Put longer explanatory text in the additional field described below, and put {{wikipedia}} boxes in the topright field described below so that they are correctly right-aligned with the description. Template invocations and special template-like references such as {{{langname}}} and {{{langcode}}} will be expanded appropriately; see #Template substitutions in field values below.
* breadcrumb: The breadcrumb text of the category. The value is either a string or a table with the fields name and nocap. In the latter case, name specifies the breadcrumb text, while nocap can be used to disable the automatic capitalization of the breadcrumb text that normally happens.
* displaytitle: A formatted title for the category page, as with the {{DISPLAYTITLE:...}} magic word (see mw:Help:Magic words). The value of this is either a string (which should be the formatted category title, without the preceding Category:) or a Lua function to generate the formatted category title. A Lua function is most useful inside of a handler (see #Handlers below). The Lua function is passed two parameters, the raw category title (without the preceding Category:) and the language object of the category's language (or nil for umbrella categories), and should return the formatted category title (again without the preceding Category:). If the value of this field is a string, template invocations and special template-like references such as {{{langname}}} and {{{langcode}}} will be expanded appropriately; see below. See Module:category tree/etymology and Module:category tree/lang/nl for examples of using displaytitle.
* topright: Text displayed at the top right of the page, such as {{wikipedia}} and other similar boxes. Template invocations and special template-like references such as {{{langname}}} and {{{langcode}}} are expanded appropriately, just as with description; see #Template substitutions in field values below. Compare the preceding field, which is similar to topright but used for left-aligned text placed above the description.
* preceding: Text displayed before the text in the description field. The difference between the two is that description text will also be shown in the list of children categories shown on the parent category's page, while the preceding text will not. For this reason, use preceding instead of description for {{also}} hatnotes and similar text, and keep description relatively short. Template invocations and special template-like references such as {{{langname}}} and {{{langcode}}} are expanded appropriately, just as with description; see #Template substitutions in field values below. Compare the topright field, which is similar to preceding but is right-aligned, placed above the edit and recent-entries boxes.
* additional: Text displayed after the text in the description field. The difference between the two is that description text will also be shown in the list of children categories shown on the parent category's page, while the additional text will not. For this reason, use additional instead of description for long explanatory notes, "See also" references and the like, and keep description relatively short. Template invocations and special template-like references such as {{{langname}}} and {{{langcode}}} are expanded appropriately, just as with description; see #Template substitutions in field values below.
* umbrella: A table describing the umbrella category, or false to indicate that there is no umbrella category. The umbrella category is normally called "LABEL by language". For example, for adjectives, the umbrella category is named Category:Adjectives by language, and is a parent category (in addition to any categories specified using parents) of Category:English adjectives, Category:French adjectives, Category:Norwegian Bokmål adjectives, and all other language-specific categories holding adjectives. This table contains the following fields:
** name: The name of the umbrella category. For an umbrella category with a nonstandard name, instead use umbrella = false and list the nonstandard umbrella category as an additional parent (and add a raw-category entry for the umbrella category itself; see the implementation of categories like Category:English terms borrowed back into English for an example).
** description: The description of the umbrella category. By default, it is derived from the description field of the category itself by removing any {{{langname}}}, {{{langcode}}} or {{{langcat}}} template parameter reference and capitalizing the remainder. Text is automatically added to the end indicating that this category is an umbrella category that only contains other categories, and does not contain pages describing terms.
** parents: The parent category or categories of the umbrella category. The value is a string (a category name, written without the Category: prefix in the sample entry above), a table with fields name (the category name) and sort (the sort key, as in the outer parents field described above), or a list of either type of entity.
** breadcrumb, displaytitle: The umbrella category's counterparts of the fields of the same name on regular category pages.
** topright: Like the topright field on regular category pages; see above.
** preceding: Like the preceding field on regular category pages; see above.
** additional: Like the additional field on regular category pages; see above.
** toc_template, toc_template_full: The umbrella category's counterparts of the fields of the same name described below.
* umbrella_parents: Equivalent to the parents subfield of the umbrella field. This typically specifies a single umbrella metacategory to which the page's corresponding umbrella page belongs (see #Concepts above). A separate field is provided for this because the umbrella's parent or parents always need to be given, whereas other umbrella properties can usually be defaulted. (In practice, you will find that most entries in a subpage of Module:category tree/data do not explicitly specify the umbrella's parent. This is because a default value is supplied near the end of the "LABELS" section in which the entry is found.)
* toc_template: The template or templates used to display the table of contents bar. By default, the template is named CODE-categoryTOC, where CODE is the language code of the category's language. (If no such template exists, no table of contents bar is displayed. If the category has no associated language, as with umbrella pages, the English-language table of contents bar is used.) For example, the category Category:Spanish interjections (like other Spanish-language categories) uses {{es-categoryTOC}} to display a Spanish-appropriate table of contents bar. (In the case of Spanish, this includes entries for Ñ and for acute-accented vowels such as Á and Ó.) To override this behavior, specify a template or a list of templates in toc_template. The first template that exists will be used; if none of the specified templates exist, the regular behavior applies, i.e. the language-appropriate table of contents bar is selected.
** In toc_template, the special template-like reference {{{langcode}}} (to specify the language code of the category's language) can be used in the template names; see below.
** Use false to disable the table of contents bar.
** For example, Category:Gothic romanizations would by default use {{got-categoryTOC}} to display a Gothic-script table of contents bar. This is inappropriate for this particular category, which contains Latin-script romanizations of Gothic terms rather than terms written in the Gothic script. To fix this, the "romanizations" label specifies a toc_template value of {"{{{langcode}}}-rom-categoryTOC", "en-categoryTOC"}, which first checks for a special Gothic-romanization-specific template {{got-rom-categoryTOC}} (which in this case does exist), and falls back to the English-language table of contents template.
* toc_template_full: Like toc_template but used for categories with large numbers of entries (specifically, more than 2,500 entries or 2,500 subcategories). If none of the specified templates exist, the templates listed in toc_template are tried, and if none of them exist either, the default behavior applies. In this case, the default behavior is to use a language-appropriate "full" table of contents template named CODE-categoryTOC/full, and if that doesn't exist, fall back to the regular table of contents template named CODE-categoryTOC. An example of a "full" table of contents template is {{es-categoryTOC/full}}, which shows links for all two-letter combinations and appears on pages such as Category:Spanish nouns, with over 50,000 entries.
* catfix: Controls the use of the catfix() function in Module:utilities on this page. The catfix() function is used to ensure that page names in foreign scripts show up in the correct fonts and are linked to the correct language.
** By default, the catfix mechanism uses the category's language (i.e. the LANG in pages of the form LANG LABEL). If the category has no associated language, or if the setting catfix = false is used, the catfix mechanism is not applied.
** catfix = false is used, for example, on the romanizations label (which holds Latin-script romanizations of foreign-script terms, rather than terms in the language's native script) and the terms with redundant transliterations label (which holds pages mentioning terms in the language in question with redundant transliterations). If this is omitted, for example, then pages in Category:Manchu romanizations will show up oriented vertically despite being in Latin script, and pages in Category:Cantonese terms with redundant transliterations will show up using a double-width font despite mostly not being Cantonese-language pages.
** catfix = "en" is used, for example, on categories of the form Requests for translations into LANG (see Module:category tree/entry maintenance) because these categories contain English pages that need translations into a given language, rather than containing pages of that language.
** Setting a particular language as the value of catfix will normally cause that language's table of contents page to display in place of that of the category's normal language, and setting a value of false will normally cause the English table of contents page to display. In both cases, this behavior can be overridden by specifying the toc_template or toc_template_full fields.
* |hidden = true
* |can_be_empty = true
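As an illustration of how several of these fields combine, the following is a sketch of a label entry. The label "example verbs", its texts and its parent labels are hypothetical and exist only for this example; the field names and value shapes follow the descriptions above.

```lua
local labels = {}

-- Hypothetical label, for illustration only; not an entry in any real submodule.
labels["example verbs"] = {
	description = "{{{langname}}} verbs used as examples in this documentation.",
	-- Longer notes go here so that the description stays short.
	additional = "Longer explanatory notes belong here rather than in the description.",
	-- One plain parent label, and one parent with an explicit sort key.
	parents = {
		"verbs",
		{name = "lemmas", sort = "example verbs"},
	},
	-- Breadcrumb given as a table so that automatic capitalization can be disabled.
	breadcrumb = {name = "example", nocap = true},
	-- Try a romanization-specific template first, then fall back to the English one.
	toc_template = {"{{{langcode}}}-rom-categoryTOC", "en-categoryTOC"},
	umbrella_parents = "Lemmas subcategories by language",
}
```

Fields that are left out (here, for instance, catfix and umbrella) keep their default behavior.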
Arbitrary template invocations can be inserted in the text of description, parents (both name and sort key), breadcrumb, toc_template and toc_template_full values, and will be expanded appropriately. In addition, the following special template-like invocations are recognized and replaced by the equivalent text:
* {{PAGENAME}}: the name of the current page.
* {{{langname}}}: the name of the category's language.
* {{{langcode}}}: the code of the category's language (e.g. en for English, de for German). Not recognized in umbrella fields.
* {{{langcat}}}: the name of the language's own category (a category of the form Category:Okinawan language).
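The replacement of the special references listed above can be pictured with a small sketch. The function substitute and the values table are hypothetical and cover only the triple-brace references; expanding arbitrary template invocations, which the module also does, is not shown.

```lua
-- Hypothetical sketch of the special template-like substitutions.
-- `values` maps reference names (langname, langcode, langcat) to their text.
local function substitute(text, values)
	-- Parentheses drop gsub's second return value (the match count).
	return (text:gsub("{{{(%w+)}}}", function(name)
		-- Returning nil leaves an unrecognized reference unchanged.
		return values[name]
	end))
end

local values = {langname = "French", langcode = "fr"}
print(substitute("{{{langname}}} terms that give attributes to nouns.", values))
print(substitute("{{{langcode}}}-categoryTOC", values))
```

Leaving unknown references untouched, rather than deleting them, makes a misspelled reference easy to spot in the rendered text.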
Raw categories are treated similarly to regular labels. The main differences are:
* Raw categories are specified in the raw_categories table. The key is the full category name (rather than the label name, as in the case of language categories), and the value is a structure much like for language categories.
* The umbrella and umbrella_parents fields are unnecessary and do nothing. If you want an umbrella category that groups several related raw categories, you should add the umbrella category yourself as an additional parent (and create a separate entry in the raw_categories table for this umbrella category).
See Module:category tree/modules for an example of a module with several labels and raw categories.
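A sketch of two raw category entries, one of which acts as a hand-made grouping category for the other, might look as follows. All category names and texts here are hypothetical and chosen only for illustration.

```lua
local raw_categories = {}

-- Hypothetical raw category; the key is the full category name.
raw_categories["Example maintenance categories"] = {
	description = "Categories used as examples in this documentation.",
	parents = {
		-- The grouping category is listed by hand as an ordinary parent.
		{name = "Example grouping category", sort = "maintenance"},
	},
}

-- The grouping category needs its own entry in the same table.
raw_categories["Example grouping category"] = {
	description = "Groups the example raw categories.",
	parents = {"Example root category"},
}
```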
It is also possible to have handlers that can handle arbitrarily-formed labels and raw categories. There are two types of handlers:
* Label handlers handle labels such as lang ###-syllable words for any lang and ### (e.g. Category:English 3-syllable words), and lang learned borrowings from source for any lang and source (e.g. Category:Spanish learned borrowings from Ancient Greek);
* Raw handlers handle raw categories such as Rhymes:lang/rhyme for any lang and rhyme (e.g. Category:Rhymes:Polish/ajkɛ).
The difference between the two is that label handlers are used for categories prefixed with the language name (and associated umbrella categories, such as Category:3-syllable words by language and Category:Learned borrowings from Ancient Greek by language), while raw handlers are used for arbitrarily-named raw categories. Raw categories may have a language name or code in them (as in the example above), but it generally does not occur as a prefix.
As an example, the following is the label handler for the label terms coined by coiner (such as Category:English terms coined by Lewis Carroll):
	table.insert(handlers, function(data)
		local coiner = data.label:match("^terms coined by (.+)$")
		if coiner then
			return {
				description = "{{{langname}}} terms coined by " .. coiner .. ".",
				breadcrumb = coiner,
				umbrella = false,
				parents = {{
					name = "coinages",
					sort = coiner,
				}},
			}
		end
	end)
The handler checks if the passed-in label has a recognized form, and if so, returns an object that follows the same format as described above for directly-specified labels. In this case, the handler disables the umbrella category Terms coined by coiner by language because most people coin words in only one language.
The handler is passed a single argument data, which is an object containing the following fields:
* label: the label;
* lang: the language object of the language at the beginning of the category, or nil for no language (this happens with umbrella categories);
* sc: the script code of the script mentioned in the category, if the category is of the form lang label in script, or nil otherwise;
* args: a table of extra parameters passed to {{auto cat}}.
If the handler interprets the extra parameters passed as data.args, it should return two values: a label object (as described above), and the value true. Otherwise, an error will be thrown if any extra parameters are passed to {{auto cat}}. An example of a handler that interprets the extra parameters is the affix-cat handler in Module:category tree/affixes and compounds, which supports {{auto cat}} parameters |alt=, |sort=, |tr= and |sc=. The |alt= parameter in particular is used to specify extra diacritics to display on the affix that forms part of the category name, as in categories such as Category:Latin terms suffixed with -inus (properly -īnus).
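The two-value return convention can be sketched as follows. The label form "terms tagged NAME", the parent label "terms by tag" and the dispatch function run_handlers are all hypothetical; only the data.label and data.args fields and the second return value of true come from the description above.

```lua
local handlers = {}

-- Hypothetical handler for labels of the form "terms tagged NAME".
-- It reads an assumed |alt= parameter from data.args, so it returns true as a
-- second value to signal that the extra parameters have been dealt with.
table.insert(handlers, function(data)
	local name = data.label:match("^terms tagged (.+)$")
	if not name then
		return nil
	end
	local shown = data.args and data.args.alt or name
	return {
		description = "{{{langname}}} terms tagged " .. shown .. ".",
		breadcrumb = shown,
		parents = {{name = "terms by tag", sort = name}},
		umbrella = false,
	}, true
end)

-- Hypothetical dispatcher: try each handler until one recognizes the label.
local function run_handlers(data)
	for _, handler in ipairs(handlers) do
		local label_obj, args_handled = handler(data)
		if label_obj then
			return label_obj, args_handled
		end
	end
end

local obj, handled = run_handlers{label = "terms tagged foo", args = {alt = "fóo"}}
print(obj.description, handled)
```

A handler that ignores data.args would simply return the label object alone, leaving the second value nil.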
For further examples, see Module:category tree/lexical properties, Module:category tree/terms by script or Module:category tree/etymology.
Raw handlers are similar to label handlers in that they also accept a single argument data, but this object contains only the following fields:
* category: the raw category;
* args: a table of extra parameters passed to {{auto cat}}.
Here, there is no language or script object passed in. If there is a language in the category name, it needs to be handled inside of the handler. For example, the following is the raw handler for categories of the form Varieties of lang:
	table.insert(raw_handlers, function(data)
		local langname = data.category:match("^Varieties of (.*)$")
		if langname then
			local lang = require("Module:languages").getByCanonicalName(langname)
			if lang then
				return {
					lang = lang:getCode(),
					description = "Categories containing terms in varieties of " .. lang:makeCategoryLink() .. " (regional, temporal, sociolectal, etc.).",
					parents = {
						"{{{langcat}}}",
						{name = "Language varieties", sort = langname},
					},
					breadcrumb = "Varieties",
				}
			end
		end
	end)
Note that if a handler is specified, the module should return a table holding both the label and handler data; see the above modules.
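The overall shape of such a submodule might look like the following sketch, which reuses the sample label and handler shown earlier on this page. The key names LABELS, RAW_CATEGORIES, HANDLERS and RAW_HANDLERS in the returned table are an assumption made for this example; check an existing submodule for the names actually expected.

```lua
local labels = {}
local raw_categories = {}
local handlers = {}
local raw_handlers = {}

labels["adjectives"] = {
	description = "{{{langname}}} terms that give attributes to nouns, extending their definitions.",
	parents = {"lemmas"},
	umbrella_parents = "Lemmas subcategories by language",
}

table.insert(handlers, function(data)
	local coiner = data.label:match("^terms coined by (.+)$")
	if coiner then
		return {
			description = "{{{langname}}} terms coined by " .. coiner .. ".",
			breadcrumb = coiner,
			umbrella = false,
			parents = {{name = "coinages", sort = coiner}},
		}
	end
end)

-- Key names below are assumed, not taken from this page.
local export = {
	LABELS = labels,
	RAW_CATEGORIES = raw_categories,
	HANDLERS = handlers,
	RAW_HANDLERS = raw_handlers,
}
-- A real submodule would end with: return export
```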
Support exists for labels and handlers that are specialized to particular languages. A typical label such as verbs applies to many languages, but some categories have labels that are specialized to a particular language, e.g. Category:Belarusian class 4c verbs or Category:Dutch prefixed verbs with ver-. Here, the label class 4c verbs is specific to Belarusian with a description and other properties only for this particular language, and similarly for the Dutch-specific label prefixed verbs with ver-. Yet, it is desirable to integrate these categories into the category tree hierarchy, so that e.g. breadcrumbs and other features are available. This can be done by creating a module such as Module:category tree/lang/be (for Belarusian) or Module:category tree/lang/nl (for Dutch), and specifying labels and/or handlers in the same fashion as is done for language-agnostic categories. See Module:category tree/lang/documentation for more information. Note that once you create a per-language module, you must add the language code to the langs_with_modules table in Module:category tree/lang listing all the languages with language-specific modules; otherwise, the corresponding categories won't be recognized.
-- Prevent substitution.
if mw.isSubsting() then
	return require("Module:unsubst")
end

local export = {}
local category_tree_submodule_prefix = "Module:category tree/"
local category_tree_styles_css = "Module:category tree/styles.css"
local m_str_utils = require("Module:string utilities")
local m_template_parser = require("Module:template parser")
local m_utilities = require("Module:utilities")
local ceil = math.ceil
local class_else_type = m_template_parser.class_else_type
local concat = table.concat
local deep_copy = require("Module:table").deepCopy
local full_url = mw.uri.fullUrl
local insert = table.insert
local is_callable = require("Module:fun").is_callable
local log10 = math.log10 or require("Module:math").log10
local new_title = mw.title.new
local pages_in_category = mw.site.stats.pagesInCategory
local parse = m_template_parser.parse
local remove_comments = require("Module:string/removeComments")
local sort = table.sort
local split = m_str_utils.split
local string_compare = require("Module:string/compare")
local trim = m_str_utils.trim
local uupper = m_str_utils.upper
local yesno = require("Module:yesno")
local current_frame = mw.getCurrentFrame()
local current_title = mw.title.getCurrentTitle()
local namespace = current_title.namespace
local poscatboiler_subsystem = "poscatboiler"
local extra_args_error = "Extra arguments to {{((}}auto cat{{))}} are not allowed for this category."
-- Generates a sortkey for a numeral `n`, adding leading zeroes to avoid the "1, 10, 2, 3" sorting problem. `max_n` is the greatest expected value of `n`, and is used to determine how many leading zeroes are needed. If not supplied, it defaults to the number of languages.
function export.numeral_sortkey(n, max_n)
	max_n = max_n or require("Module:list of languages").count()
	return ("#%%0%dd"):format(ceil(log10(max_n + 1))):format(n)
end

function export.split_lang_label(title_text)
	local getByCanonicalName = require("Module:languages").getByCanonicalName
	-- Progressively remove a word from the potential canonical name until it
	-- matches an actual canonical name.
	local words = split(title_text, " ", true)
	for i = #words - 1, 1, -1 do
		local lang = getByCanonicalName(concat(words, " ", 1, i))
		if lang then
			return lang, concat(words, " ", i + 1)
		end
	end
	return nil, title_text
end

local function show_error(text)
	return require("Module:message box").maintenance(
		"red",
		"]",
		"This category is not defined in Wiktionary's category tree.",
		text
	)
end

-- Show the text that goes at the very top right of the page.
local function show_topright(current)
	return current.getTopright and current:getTopright() or nil
end

local function link_box(content)
	return ("<div class=\"noprint plainlinks\" style=\"float: right; clear: both; margin: 0 0 .5em 1em; background: var(--wikt-palette-paleblue, #f9f9f9); border: 1px var(--border-color-base, #aaaaaa) solid; margin-top: -1px; padding: 5px; font-weight: bold;\">%s</div>"):format(content)
end

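-- Show a box linking to the edit page of the data module for this category.
local function show_editlink(current)
	return link_box(("[%s Edit category data]"):format(tostring(full_url(current:getDataModule(), "action=edit"))))
end

-- Show a box linking to recent changes to the pages in this category.
local function show_related_changes()
	local title = current_title.fullText
	return link_box(("[%s <span title=\"Recent edits and other changes to pages in %s\">Recent changes</span>]"):format(
		tostring(full_url("Special:RecentChangesLinked", {
			target = title,
			showlinkedto = 0,
		})),
		title
	))
end
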
-- Show the "newest and oldest pages" box, restricted to the namespace
-- appropriate to the category.
local function show_pagelist(current)
	local namespace = "namespace="
	local info = current:getInfo()
	local lang_code = info.code

	if info.label == "citations" or info.label == "citations of undefined terms" then
		namespace = namespace .. "Citations"
	elseif lang_code then
		local lang = require("Module:languages").getByCode(lang_code, true)
		if lang then
			-- Proto-Norse (gmq-pro) is probably the only language with a code ending in -pro
			-- that's intended to have mostly non-reconstructed entries.
			if (lang_code:find("%-pro$") and lang_code ~= "gmq-pro") or lang:hasType("reconstructed") then
				namespace = namespace .. "Reconstruction"
			elseif lang:hasType("appendix-constructed") then
				namespace = namespace .. "Appendix"
			end
		end
	elseif info.label:match("templates") then
		namespace = namespace .. "Template"
	elseif info.label:match("modules") then
		namespace = namespace .. "Module"
	elseif info.label:match("^Wiktionary") or info.label:match("^Pages") then
		namespace = ""
	end

	return ([=[
{| id="newest-and-oldest-pages" class="wikitable mw-collapsible" style="float: right; clear: both; margin: 0 0 .5em 1em;"
! Newest and oldest pages
|-
| id="recent-additions" style="font-size:0.9em;" | '''Newest pages ordered by last category link update:'''
%s
|-
| id="oldest-pages" style="font-size:0.9em;" | '''Oldest pages ordered by last edit:'''
%s
|}]=]):format(
		current_frame:extensionTag(
			"DynamicPageList",
			([=[
category=%s
%s
count=10
mode=ordered
ordermethod=categoryadd
order=descending]=]
			):format(current_title.text, namespace)
		),
		current_frame:extensionTag(
			"DynamicPageList",
			([=[
category=%s
%s
count=10
mode=ordered
ordermethod=lastedit
order=ascending]=]
			):format(current_title.text, namespace)
		)
	)
end

-- Show navigational "breadcrumbs" at the top of the page.
local function show_breadcrumbs(current)
	local steps = {}

	-- Start at the current label and move our way up the "chain" from child to parent, until we can't go further.
	while current do
		local category, display_name, nocap

		if type(current) == "string" then
			category = current
			display_name = current:gsub("^Category:", "")
		else
			if not current.getCategoryName then
				error("Internal error: Bad format in breadcrumb chain structure, probably a misformatted value for `parents`: " ..
					mw.dumpObject(current))
			end
			category = "Category:" .. current:getCategoryName()
			display_name, nocap = current:getBreadcrumbName()
		end

		if not nocap then
			display_name = mw.getContentLanguage():ucfirst(display_name)
		end
		-- The leading colon makes this a link to the category rather than a categorization.
		insert(steps, 1, ("[[:%s|%s]]"):format(category, display_name))

		-- Move up the "chain" by one level.
		if type(current) == "string" then
			current = nil
		else
			current = current:getParents()
		end

		-- getParents() returns a list; follow the first parent, if there is one.
		if current then
			current = current[1] and current[1].name
		end
	end

	local templateStyles = require("Module:TemplateStyles")(category_tree_styles_css)

	local ol = mw.html.create("ol")
	for i, step in ipairs(steps) do
		local li = mw.html.create("li")
		if i ~= 1 then
			local span = mw.html.create("span")
				:attr("aria-hidden", "true")
				:addClass("ts-categoryBreadcrumbs-separator")
				:wikitext(" » ")
			li:node(span)
		end
		li:wikitext(step)
		ol:node(li)
	end

	return templateStyles .. tostring(mw.html.create("div")
		:attr("role", "navigation")
		:attr("aria-label", "Breadcrumb")
		:addClass("ts-categoryBreadcrumbs")
		:node(ol))
end

-- Show the "see also" hatnote, if any related pages are listed.
local function show_also(current)
	local also = current._info.also
	if also and #also > 0 then
		return ('<div style="margin-top:-1em;margin-bottom:1.5em">%s</div>'):format(require("Module:also").main(also))
	end
	return nil
end

-- Show a short description text for the category.
local function show_description(current)
	return current.getDescription and current:getDescription() or nil
end

-- Show a link to the appendix associated with the category, if any.
local function show_appendix(current)
	local appendix = current.getAppendix and current:getAppendix()
	return appendix and ("For more information, see [[%s]]."):format(appendix) or nil
end

-- Comparison function for sorting children case-insensitively by sort key.
local function sort_children(child1, child2)
	return string_compare(uupper(child1.sort), uupper(child2.sort))
end

-- Show a list of child categories.
local function show_children(current)
	local children = current.getChildren and current:getChildren() or nil

	if not children then
		return nil
	end

	sort(children, sort_children)

	local children_list = {}

	for _, child in ipairs(children) do
		local child_name, child_pagetitle = child.name
		if type(child_name) == "string" then
			child_pagetitle = child_name
		else
			child_pagetitle = "Category:" .. child_name:getCategoryName()
		end

		-- Only list children whose category pages actually exist.
		if new_title(child_pagetitle).exists then
			insert(children_list, ("* [[:%s]]: %s"):format(
				child_pagetitle,
				child.description or
					type(child_name) == "string" and child_name:gsub("^Category:", "") .. "." or
					child_name:getDescription("child")
			))
		end
	end

	return concat(children_list, "\n")
end

-- Show a table of contents with links to each letter in the language's script.
local function show_TOC(current)
	local titleText = current_title.text

	local inCategoryPages = pages_in_category(titleText, "pages")
	local inCategorySubcats = pages_in_category(titleText, "subcats")

	local TOC_type

	-- Compute type of table of contents required.
	if inCategoryPages > 2500 or inCategorySubcats > 2500 then
		TOC_type = "full"
	elseif inCategoryPages > 200 or inCategorySubcats > 200 then
		TOC_type = "normal"
	else
		-- No (usual) need for a TOC if all pages or subcategories can fit on one page;
		-- but allow this to be overridden by a custom TOC handler.
		TOC_type = "none"
	end

	if current.getTOC then
		local TOC_text = current:getTOC(TOC_type)
		if TOC_text ~= true then
			return TOC_text or nil
		end
	end

	if TOC_type ~= "none" then
		local templatename = current:getTOCTemplateName()

		local TOC_template
		if TOC_type == "full" then
			-- This category is very large, see if there is a "full" version of the TOC.
			local TOC_template_full = new_title(templatename .. "/full")

			if TOC_template_full.exists then
				TOC_template = TOC_template_full
			end
		end

		if not TOC_template then
			local TOC_template_normal = new_title(templatename)
			if TOC_template_normal.exists then
				TOC_template = TOC_template_normal
			end
		end

		if TOC_template then
			return current_frame:expandTemplate{title = TOC_template.text, args = {}}
		end
	end

	return nil
end

-- Show the "catfix" that adds language attributes and script classes to the page.
local function show_catfix(current)
local lang, sc = current:getCatfixInfo()
return lang and m_utilities.catfix(lang, sc) or nil
end
-- Show the parent categories that the current category should be placed in.
local function show_categories(current, categories)
local parents = current.getParents and current:getParents() or nil
if not parents then
return nil
end
for _, parent in ipairs(parents) do
local parent_name = parent.name
local sortkey = type(parent.sort) == "table" and parent.sort:makeSortKey() or parent.sort
if type(parent_name) == "string" then
insert(categories, ("[[%s|%s]]"):format(parent_name, sortkey))
else
insert(categories, ("[[Category:%s|%s]]"):format(parent_name:getCategoryName(), sortkey))
end
end
-- Also put the category in its corresponding "umbrella" or "by language" category.
local umbrella = current:getUmbrella()
if umbrella then
-- FIXME: use a language-neutral sorting function like the Unicode Collation Algorithm.
local sortkey = current._lang and current._lang:getCanonicalName() or current:getCategoryName()
sortkey = require("Module:languages").getByCode("en", true):makeSortKey(sortkey)
if type(umbrella) == "string" then
insert(categories, ("[[%s|%s]]"):format(umbrella, sortkey))
else
insert(categories, ("[[Category:%s|%s]]"):format(umbrella:getCategoryName(), sortkey))
end
end
-- Check for various unwanted parser functions, which should be integrated into the category tree data instead.
-- Note: HTML comments shouldn't be removed from `content` until after this step, as they can affect the result.
local content = current_title:getContent()
if not content then
-- This happens when {{auto cat}} is expanded for a nonexistent category page,
-- which is needed by Benwing's create_wanted_categories.py script.
return
end
local defaultsort, displaytitle, page_has_param
for node in parse(content):iterate_nodes() do
local node_class = class_else_type(node)
if node_class == "template" then
local name = node:get_name()
if name == "DEFAULTSORT:" and not defaultsort then
insert(categories, "]")
defaultsort = true
elseif name == "DISPLAYTITLE:" and not displaytitle then
insert(categories,"]")
displaytitle = true
end
elseif node_class == "parameter" and not page_has_param then
insert(categories,"]")
page_has_param = true
end
end
-- Check for raw category markup, which should also be integrated into the category tree data.
content = remove_comments(content, "BOTH")
local head = content:find("[[", 1, true)
while head do
local close = content:find("]]", head + 2, true)
if not close then
break
end
-- Make sure there are no intervening "[[" between head and close.
local open = content:find("[[", head + 2, true)
while open and open < close do
head = open
open = content:find("[[", head + 2, true)
end
local cat = content:sub(head + 2, close - 1)
local colon = cat:match("^**():")
if colon then
local pipe = cat:find("|", colon + 1, true)
if pipe ~= #cat then
local title = new_title(pipe and cat:sub(1, pipe - 1) or cat)
if title and title.namespace == 14 then
insert(categories,"]")
break
end
end
end
head = open
end
end
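-- Generate the full wikitext for the category page. If `current` is nil (no matching category object was found),
-- return the error text along with `true` as a second value.
local function generate_output(current)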
if current then
for _, functionName in pairs{
"getBreadcrumbName",
"getDataModule",
"canBeEmpty",
"getDescription",
"getParents",
"getChildren",
"getUmbrella",
"getAppendix",
"getTOCTemplateName",
} do
if not is_callable(current[functionName]) then
require("Module:debug").track{"category tree/missing function", "category tree/missing function/" .. functionName}
end
end
end
local boxes, display, categories = {}, {}, {}
-- Categories should never show files as a gallery.
insert(categories, "__NOGALLERY__")
if current_frame:getParent():getTitle() == "Template:auto cat" then
insert(categories, "]")
end
-- Check if the category is empty
local totalPages = pages_in_category(current_title.text, "all")
local hugeCategory = totalPages > 1000000 -- 1 million
-- Categorize huge categories, as they cause DynamicPageList to time out and make the category inaccessible.
if hugeCategory then
insert(categories, "]")
end
-- Are the parameters valid?
if not current then
insert(categories, "]")
insert(categories, totalPages == 0 and "]" or nil)
insert(display, show_error(
"Double-check the category name for typos. <br>" ..
'Check whether this category should be created under a different name (for example, "Fruits" instead of "Fruit"). <br>' ..
"To add a new category to Wiktionary's category tree, please consult " .. current_frame:expandTemplate{title = "section link", args = {
"Help:Category#How_to_create_a_category",
}} .. "."))
-- Exit here, as all code beyond here relies on current not being nil
return concat(categories, "") .. concat(display, "\n\n"), true
end
-- Does the category have the correct name?
local currentName = current:getCategoryName()
local correctName = current_title.text == currentName
if not correctName then
insert(categories, "]")
insert(display, show_error(("Based on the data in the category tree, this category should be called '''[[:Category:%s]]'''."):format(currentName)))
end
-- Add cleanup category for empty categories.
local canBeEmpty = current:canBeEmpty()
if canBeEmpty and correctName then
insert(categories, " __EXPECTUNUSEDCATEGORY__")
elseif totalPages == 0 then
insert(categories, "]")
end
if current:isHidden() then
insert(categories, "__HIDDENCAT__")
end
-- Put all the float-right stuff into a <div> that does not clear, so that float-left stuff like the breadcrumbs and
-- description can go opposite the float-right stuff without vertical space.
insert(boxes, "<div style=\"float: right;\">")
insert(boxes, show_topright(current))
insert(boxes, show_editlink(current))
insert(boxes, show_related_changes())
-- Show pagelist, unless it's a huge category (since they can't use DynamicPageList - see above).
if not hugeCategory then
insert(boxes, show_pagelist(current))
end
insert(boxes, "</div>")
-- Generate the displayed information
insert(display, show_breadcrumbs(current))
insert(display, show_also(current))
insert(display, show_description(current))
insert(display, show_appendix(current))
insert(display, show_children(current))
insert(display, show_TOC(current))
insert(display, show_catfix(current))
insert(display, '<br class="clear-both-in-vector-2022-only">')
show_categories(current, categories)
return concat(boxes, "\n") .. "\n" .. concat(display, "\n\n") .. concat(categories, "")
end
--[==[
List of handler functions that try to match the page name. A handler should return the name of a category tree
submodule and an info table which is passed as an argument to the submodule. If a handler does not
recognize the page name, it should return nil. Note that the order of handlers matters!
]==]
local handlers = {}
-- Thesaurus per-language category
insert(handlers, function(title)
local code, label = title:match("^Thesaurus:(%l*%a):(.+)")
if code then
return poscatboiler_subsystem, {label = title, raw = true}
end
end)
-- Topic per-language category
insert(handlers, function(title)
local code, label = title:match("^(%l*%a):(.+)")
if code then
return poscatboiler_subsystem, {label = title, raw = true}
end
end)
-- Lect category, selected when the `lect` or `dialect` parameter is set.
insert(handlers, function(title, args)
local lect = args.lect or args.dialect
if lect ~= "" and yesno(lect, true) then -- Same handling as a boolean parameter.
return poscatboiler_subsystem, {label = title, args = args, raw = true}
end
end)
-- poscatboiler per-language label, e.g. "French lemmas"
insert(handlers, function(title, args)
local lang, label = export.split_lang_label(title)
if not lang then
return
end
local baseLabel, script = label:match("(.+) in (.-) script$")
if script and baseLabel ~= "terms" then
local scriptObj = require("Module:scripts").getByCanonicalName(script)
if scriptObj then
return poscatboiler_subsystem, {label = baseLabel, code = lang:getCode(), sc = scriptObj:getCode(), args = args}
end
end
return poscatboiler_subsystem, {label = label, code = lang:getCode(), args = args}
end)
-- poscatboiler label umbrella category
insert(handlers, function(title, args)
local label = title:match("(.+) by language$")
if label then
-- The poscatboiler code will appropriately lowercase if needed.
return poscatboiler_subsystem, {label = label, args = args}
end
end)
-- poscatboiler raw handlers
insert(handlers, function(title, args)
return poscatboiler_subsystem, {label = title, args = args, raw = true}
end)
-- poscatboiler umbrella handlers without 'by language'
insert(handlers, function(title, args)
return poscatboiler_subsystem, {label = title, args = args}
end)
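-- Main entry point, normally invoked through {{auto cat}}.
function export.show(frame)
local args, other_args = require("Module:parameters").process(frame:getParent().args, {
also = {type = "title", sublist = "comma without whitespace", namespace = 14}
}, true)
if args.also then
-- Convert each title object to its prefixed page name.
for k, arg in next, args.also do
args.also[k] = arg.prefixedText
end
end
-- Trim whitespace from every remaining argument.
for k, arg in next, other_args do
other_args[k] = trim(arg)
end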
if namespace == 10 then -- Template
return "(This template should be used on pages in the Category: namespace.)"
elseif namespace ~= 14 then -- Category
error("This template/module can only be used on pages in the Category: namespace.")
end
local first_fail_args_handled, first_fail_cattext
-- Go through each handler in turn. If a handler doesn't recognize the format of the category, it will return nil,
-- and we will consider the next handler. Otherwise, it returns a submodule name and an info table to call it with,
-- but even then, that submodule might return an error, and we need to consider the next handler. This happens, for
-- example, with the category "CAT:Mato Grosso, Brazil", where "Mato" is the name of a language, so the poscatboiler
-- per-language label handler fires and tries to find a label "Grosso, Brazil". This throws an error, and
-- previously, this blocked further handler consideration, but now we check for the error and continue checking
-- handlers; eventually, the topic umbrella handler will fire and correctly handle the category.
for _, handler in ipairs(handlers) do
-- Pass a fresh copy of the args table to each handler, to keep them isolated.
local submodule, info = handler(current_title.text, deep_copy(other_args))
if submodule then
info.also = deep_copy(args.also)
require("Module:debug").track("auto cat/" .. submodule)
submodule = require(category_tree_submodule_prefix .. submodule)
-- `failed` is true if no match was found.
local cattext, failed = generate_output(submodule.main(info))
if failed then
if not first_fail_cattext then
first_fail_cattext = cattext
first_fail_args_handled = info.args and true or false
end
elseif not info.args and next(other_args) then
error(extra_args_error)
else
return cattext
end
end
end
-- If there were no matches, throw an error if any arguments were given, or otherwise return the cattext
-- from the first fail encountered. The final handlers call the boilers unconditionally, so there should
-- always be something to return.
if not first_fail_args_handled and next(other_args) then
error(extra_args_error)
end
return first_fail_cattext
end
-- TODO: new test entrypoint.
return export