This is the documentation page for the main data module for the Module:category tree/topic cat category tree subsystem, as well as for its submodules. Collectively, these modules handle generating the descriptions and categorization for topic pages such as Category:en:Birds, Category:es:France and Category:zh:State capitals of Germany, and the corresponding non-language-specific pages such as Category:Birds, Category:France and Category:State capitals of Germany. (All other categories handled through the {{auto cat}}
system are handled by the Module:category tree/poscatboiler subsystem.)
The main data module at Module:category tree/topic cat/data does not contain data itself, but rather imports the data from its submodules, and applies some post-processing.
subpages
list at the top of Module:category tree/topic cat/data.
The topic cat system internally makes a distinction based on which languages a category applies to:
langcode:label
(e.g. Category:es:Birds and Category:de:States of the United States). Here, langcode
is the language code of a recognized full Wiktionary language (see WT:LOL for the list of all such languages and their codes), and label
is a topic, generally one that can apply to multiple languages. The intended contents of the category are terms in the language in question that are related to, instances of, or types of the topic in question (depending on the type of category; see below). Associated with each per-language category is an umbrella category; see below. The following restrictions apply to per-language categories:
langcode
must currently be a full language, not an etymology-only language. (Etymology-only languages include lects such as Provençal, considered a variety of Occitan, and Biblical Hebrew, considered a variety of Hebrew. See here for the list of such lects.)
label
as found in the category name always begins with a capital letter, whether or not the underlying form of the label is capitalized (contrast Category:en:Birds with Category:en:France). Internally, this is different, and the internal form of a label begins with a lowercase or uppercase letter as appropriate (birds but France).
label
, i.e. a bare category label. As with per-language categories, this label is always capitalized in the category name, regardless of the underlying form of the label. Examples are Category:Birds, Category:France and Category:State capitals of Germany. Umbrella categories serve to group all the per-language categories for a particular topic. They also serve to group more specific subcategories, e.g. under Category:Birds can be found Category:Birds of prey, Category:Freshwater birds, Category:Columbids (which includes doves and pigeons), etc. as well as Category:Eggs and Category:Feathers. Umbrella categories should not normally directly contain any terms.
In addition to the above distinction, the topic cat system divides categories according to the category type, which specifies the relationship between the category and the members of that category:
type = "related-to"
) contain terms that are semantically related to the category topic. For example, Category:en:Chess contains terms such as checkmate, rank (a row on a chessboard), endgame, en passant, Grandmaster, etc. "Related to" is a nebulous criterion, and as a result the terms in the category should be related to the category as directly as possible, to avoid the category becoming a grab bag of random terms.
type = "name"
) categories contain terms that are names of individual, specific instances of the category. For example, Category:Chess openings contains names of specific openings, such as Ruy Lopez and Sicilian Defense. Even more clearly, Category:Moons of Jupiter contains names of individual moons that orbit the planet Jupiter.
type = "type"
) categories contain terms for types of the entity described by the category name. For example, Category:Checkmate patterns contains types of checkmates, such as ladder mate and smothered mate. Even more clearly, Category:Hobbyists contains terms for types of hobbyists, such as oenophile (a wine enthusiast), numismatist (a coin collector), etc. (If this were a name category, it would contain names of specific, presumably famous, hobbyists, something that would probably not be dictionary-worthy material.)
type = "set"
) categories are used when the distinction between names and types of a given topic may not always be clear, but the overall membership is still well-defined. For example, Category:Heraldic charges contains terms for components of coats of arms, e.g. bend sinister (a diagonal band from lower left to upper right), fleur-de-lis (a stylized image of a lily, as is commonly associated with New Orleans) and quatrefoil (a symmetrical shape made from the outline of four circles).
type = "grouping"
) categories are higher-level categories that are used only to group more specific categories and should not contain elements themselves (but nevertheless sometimes do). An example is Category:Industries, which contains subcategories devoted to particular industries (e.g. Category:Banking, Category:Mining, Category:Music industry, Category:Oil industry, etc.).
type = "toplevel"
) categories are special high-level categories that list all the categories of one of the above types, and which are always named List of type categories
, e.g. Category:List of related-to categories (listing all the "related-to" umbrella categories) or Category:es:List of name categories (listing all the Spanish name-type categories). The number of top-level categories is fixed.
Note that name, type and set categories are conceptually similar to each other, in that each contains terms that have an is-a relationship with the topic in question, whereas related-to categories express a weaker sort of relation between term and topic, merely asserting that the term is in some way "related" or "pertinent" to the topic in question. For this reason, when creating new topics, you should always strive to create name, type or set topics whenever possible, and avoid related-to topics unless there is no alternative and you're convinced this topic is really necessary. Before creating such a category:
brick
, do not add terms like brick house, thick as a brick or yellow brick road merely because they have the word "brick" in them; instead, use the ===Related terms=== section of the brick lemma to include these terms).
It should also be noted that name, type and set categories typically use the plural in their topic name, while related-to categories often use the singular. This is not a hard and fast rule, however, and there are exceptions in both directions. If it's not obvious what type of category a given topic refers to, consider making this explicit in the topic name, e.g. names of stars
or types of stars
rather than just stars
. (In the future, all, or at least most, topic categories may be named in such a fashion.)
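The naming convention described above (langcode:Label for per-language categories, a bare capitalized label for umbrella categories) can be illustrated with a small plain-Lua sketch. This is not the module's actual code: the labels table, parse_category and the ASCII-only lcfirst are hypothetical stand-ins, and trying the lowercase-initial form of the label first is an assumption made for the example.

```lua
-- Hypothetical stand-in for the real label data; only the keys matter here.
local labels = {
	["birds"] = { type = "set" },
	["France"] = { type = "related-to" },
}

-- Lowercase the first character (ASCII only; on-wiki code would use
-- mw.getContentLanguage():lcfirst(), which is Unicode-aware).
local function lcfirst(s)
	return s:sub(1, 1):lower() .. s:sub(2)
end

-- Split a category name into (langcode, internal label).
-- Umbrella categories have no language prefix, so langcode is nil.
-- Returns a nil label if the label is not found in the table.
local function parse_category(name)
	local rest = name:gsub("^Category:", "")
	local langcode, label = rest:match("^([a-z][a-z%-]*):(.+)$")
	if not langcode then
		label = rest
	end
	-- The category name always capitalizes the label, so look up the
	-- lowercase-initial form first and fall back to the name as written.
	local lower = lcfirst(label)
	if labels[lower] then
		label = lower
	elseif not labels[label] then
		label = nil
	end
	return langcode, label
end
```

For instance, parse_category("Category:es:Birds") yields the language code es and the internal label birds, while parse_category("Category:France") yields no language code and the label France.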
A sample entry is as follows (in this case, found in Module:category tree/topic cat/data/History):
labels["ancient history"] = {
	type = "related-to",
	description = "default",
	parents = {"history"},
}
This generates the description and categorization for all per-language categories of the form langcode:Ancient history
(e.g. Category:en:Ancient history) as well as for the umbrella category Category:Ancient history (see above for the definition of per-language and umbrella categories).
The meaning of this snippet is as follows:
Ancient Near East
(as in the example below) is capitalized because the label refers to a specific region, and toponyms are capitalized in English.
type
field specifies the category type, as described above. This label is a "related-to" category.
description
field gives the description text that will appear when a user visits the category page. Certain special values are recognized, including "default"
, which generates a default description. The value of the default description depends on the label's name, the language of the category, and the label's type. In this case, it is equivalent to "{{{langname}}} terms related to [[ancient]] [[history]]"
(where {{{langname}}}
is replaced with the name of the language in question) and "terms related to [[ancient]] [[history]]"
for the umbrella category. See #Descriptions below for more information on specifying descriptions.
parents
field gives the labels of the parent categories. Here, the category specifies a single parent "history"
. This means that a category such as Category:en:Ancient history will have Category:en:History as its parent. An additional top-level list parent will automatically be added (in this case Category:en:List of related-to categories) as well as the umbrella parent Category:Ancient history.
Another example follows:
labels["places in Romance of the Three Kingdoms"] = {
	type = "name",
	displaytitle = "places in ''Romance of the Three Kingdoms''",
	description = "=places in ''{{w|Romance of the Three Kingdoms}}''",
	parents = {"Romance of the Three Kingdoms", "China"},
}
This is a subcategory of "Romance of the Three Kingdoms"
(a 14th century Chinese historical novel) and accordingly specifies "Romance of the Three Kingdoms"
as the parent, along with "China"
(note the capitalization, in accordance with the principles laid out above). A description is given explicitly, preceded by =
(which in this case prepends "names of specific" to the description). The displaytitle
field is also set so that the name of the work is italicized.
The following fields are recognized for the object describing a label:
type
flags
currently has type = "related-to,name,type"
because it contains a mixture of terms related to flags (e.g. flagpole and grommet), terms for individual flags (e.g. Star-Spangled Banner) and terms for types of flags (e.g. prayer flag, flag of convenience). Mixed categories are strongly dispreferred and should be split into separate per-type categories.
description
additional
field described below, and put {{wikipedia}}
boxes in the topright
field described below so that they are correctly right-aligned with the description. Template invocations and special template-like references such as {{{langname}}}
and {{{langcode}}}
will be expanded appropriately; see #Template substitutions in field values below. Certain values are handled specially, including "default"
(and variants such as "default with the"
, "default wikify"
and "default no singularize"
) and phrases preceded by an =
sign, as explained in more detail below.
parents
name
and sort
. In the latter case, name
specifies the parent label name, while the sort
value specifies the sort key to use to sort it in that category. The default sort key is the category's label.
Category:
it is interpreted as a raw category name, rather than as a label name. It can still have its own sort key as usual.
breadcrumb
setting, as described below.)
breadcrumb
name
and nocap
. In the latter case, name
specifies the breadcrumb text, while nocap
can be used to disable the automatic capitalization of the breadcrumb text that normally happens.
displaytitle
{{DISPLAYTITLE:...}}
magic word (see mw:Help:Magic words). The same formatting is also applied to breadcrumbs, descriptions and other mentions of the label in formatted text. The value of this is either a string (which should be the formatted label, e.g. "The Matrix"
, "people in Romance of the Three Kingdoms"
or "Glee (TV series)"
) or a Lua function to generate the formatted category title. The Lua function is passed two parameters: the raw label (without any preceding language code) and the language object of the category's language (or nil
for umbrella categories). It should return the appropriately formatted label. If the value of this field is a string, template invocations and special template-like references such as {{{langname}}}
and {{{langcode}}}
will be expanded appropriately; see below. See Module:category tree/topic cat/data/Culture for examples of using displaytitle
.
topright
{{wikipedia}}
and other similar boxes. Template invocations and special template-like references such as {{{langname}}}
and {{{langcode}}}
are expanded appropriately, just as with description
; see #Template substitutions in field values below. Compare the preceding
field, which is similar to topright
but used for left-aligned text placed above the description.
preceding
description
field. The difference between the two is that description
text will also be shown in the list of children categories shown on the parent category's page, while the preceding
text will not. For this reason, use preceding
instead of description
for {{also}}
hatnotes and similar text, and keep description
relatively short. Template invocations and special template-like references such as {{{langname}}}
and {{{langcode}}}
are expanded appropriately, just as with description
; see #Template substitutions in field values below. Compare the topright
field, which is similar to preceding
but is right-aligned, placed above the edit and recent-entries boxes.
additional
description
field. The difference between the two is that description
text will also be shown in the list of children categories shown on the parent category's page, while the additional
text will not. For this reason, use additional
instead of description
for long explanatory notes, See also references and the like, and keep description
relatively short. Template invocations and special template-like references such as {{{langname}}}
and {{{langcode}}}
are expanded appropriately, just as with description
; see #Template substitutions in field values below.
wp
true
to link to an entry that is the same as the label; a string, to link to that entry; or a list of strings or true
, to generate multiple boxes, one per list item. For example, if the label pesäpallo
has wp = true
, a box will be generated that links to Pesäpallo on Wikipedia, and if the label football (American)
has wp = "American football"
, a box will be generated that links to American football on Wikipedia.
wpcat
wp
except that the link is to a category (the generated entry or entries is/are prepended with Category:
). For example, if the label animals
has wpcat = true
set, a box will be generated that links to Category:Animals on Wikipedia.
commonscat
wpcat
except that the link is to Wikimedia Commons instead of Wikipedia. For example, if the label racquet sports
has commonscat = true
set, a box will be generated that links to Category:Racquet sports on Wikimedia Commons.
topic
type
field) and what sorts of terms should go into it. This does not normally need to be specified, as it's derived directly from the label. But it is useful e.g. for the label types of planets, which sets topic = "planets"
, because the auto-generated "additional" message contains the text " ... It should contain terms for types of {{{topic}}}, ..."
, and using the label directly will result in redundant text. Template invocations and special template-like references such as {{{langname}}}
and {{{langcode}}}
are expanded appropriately, just as with description
; see #Template substitutions in field values below. The value of this field can be "default"
or "default with the"
, which will be expanded appropriately based on the label.
umbrella
parents
) of Category:en:Ancient history, Category:fr:Ancient history and all other language-specific categories for the same label. This table contains the following fields:
description
description
field of the label itself by removing language references (specifically, {{{langname}}}
, {{{langcode}}}:
, {{{langcode}}}
and {{{langcat}}}
) and adding This category concerns the topic: before the result. Text is automatically added to the end indicating that this category is an umbrella category that only contains other categories, and does not contain pages describing terms.
breadcrumb
topright
topright
field on regular category pages; see above.
preceding
preceding
field on regular category pages; see above.
additional
additional
field on regular category pages; see above.
topic
topic
field on regular category pages; see above.
umbrella_description
description
subfield of the umbrella
field.
Template invocations can be inserted in the text of description
, parents
(both name and sort key), breadcrumb
, toc_template
and toc_template_full
values, and will be expanded appropriately. In addition, the following special template-like invocations are recognized and replaced by the equivalent text:
{{PAGENAME}}
{{{langname}}}
{{{langcode}}}
en
for English, de
for German). Not recognized in umbrella fields.
{{{langcat}}}
{{{langlink}}}
{{{umbrella_msg}}}
{{{topic}}}
topic
field (or the umbrella.topic
field for umbrella categories), if specified; else, the value of displaytitle
(if specified) or the label, with "the" added if the description is "default with the"
or a variant containing "with the"
(such as "default with the wikify"
).
The description field is of one of three types:
=
and not ending in a period.
"default"
or one of its variants, such as "default with the"
or "default wikify"
.
If preceded by =
, the description is generated from the specified phrase by prepending {{{langname}}}
(which is replaced with the language name) followed by standard type-dependent text, and appending a period. The text prepended is currently as follows:
Type | Text |
---|---|
related-to | terms related to |
set | terms for types or instances of |
name | names of specific |
type | terms for types of |
grouping | categories concerning more specific variants of |
toplevel | N/A |
For example, for the label biblical characters
, the description is currently "=characters in the [[Bible]]"
, which expands to {{{langname}}} names of specific characters in the [[Bible]].
, and in turn is expanded to e.g. French names of specific characters in the [[Bible]].
(if the category is Category:fr:Biblical characters).
Note that no standard text is provided for top-level categories, all of which include a custom description.
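The expansion of =-prefixed descriptions can be sketched in plain Lua as follows. This is an illustrative sketch rather than the module's actual implementation: type_text simply transcribes the table above, while expand_description and its parameters are hypothetical names, and only per-language categories are handled.

```lua
-- Standard type-dependent text, transcribed from the table above.
-- Top-level categories have no entry because they use custom descriptions.
local type_text = {
	["related-to"] = "terms related to",
	["set"] = "terms for types or instances of",
	["name"] = "names of specific",
	["type"] = "terms for types of",
	["grouping"] = "categories concerning more specific variants of",
}

-- Expand a description for a per-language category.
-- desc:     the raw description string, possibly starting with "="
-- cat_type: the label's type field
-- langname: the language name to substitute for {{{langname}}}
local function expand_description(desc, cat_type, langname)
	local phrase = desc:match("^=(.*)$")
	if phrase then
		local prefix = type_text[cat_type]
		if not prefix then
			error("no standard text for type '" .. tostring(cat_type) .. "'")
		end
		desc = "{{{langname}}} " .. prefix .. " " .. phrase .. "."
	end
	-- A function replacement avoids treating "%" in langname specially.
	return (desc:gsub("{{{langname}}}", function() return langname end))
end
```

Under these assumptions, expand_description("=chess openings", "name", "French") produces "French names of specific chess openings."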
If "default"
or one of its variants is used as the description, a default description is generated as if the description consisted of =
prepended to the label, except that the word the
might be added to the beginning of the label, and the words in the label might be wikilinked. Specifically:
"default with the"
(or a form such as "default with the wikify"
, "default with the no singularize"
, etc.), the word the
is prefixed to the label.
"default wikify"
(or a related form), the label is linked to Wikipedia. If the label ends in an -s, the label is linked to a Wikipedia entry based on the singular form of the label (which converts -ies to -y; converts -xes, -ches or -shes, respectively, to -x, -ch or -sh; and otherwise just removes -s), unless the label is "default wikify no singularize"
or a related form, in which case the label is linked unchanged.
no singularize
is not specified in the description, and the singular form of the label (generated according to the algorithm described just above) is a Wiktionary term, the label is linked to that term. Note that "is a Wiktionary term" simply means that a page of this name exists; the code does not currently check to see whether there is an English entry or whether the term is a lemma.
no singularize
is not found in the description, in that the code first attempts to link the word to its singular equivalent, falling back to the word itself if the singular equivalent doesn't name a Wiktionary term.
For example, a label video games
will be linked as [[video game]]s
because the page video game exists, but Arabian deities
will be linked as [[Arabian]] [[deity|deities]]
because neither Arabian deity nor Arabian deities exists as a page. The use of no singularize
is needed with labels such as linguistics
, comics
and humanities
, because their respective singular forms linguistic, comic and humanity exist as Wiktionary pages.
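The singularization rule and the singular-first linking described above can be sketched in plain Lua. This is only an approximation written for illustration: singularize and link_word are hypothetical names, page_exists is a caller-supplied stand-in for the on-wiki page existence check, and the exact link syntax produced is an assumption.

```lua
-- Singularize according to the rules stated above: -ies becomes -y;
-- -xes, -ches and -shes lose their -es; otherwise a final -s is removed.
local function singularize(label)
	if label:find("ies$") then
		return (label:gsub("ies$", "y"))
	end
	for _, ending in ipairs({ "xes", "ches", "shes" }) do
		if label:find(ending .. "$") then
			return label:sub(1, -3) -- drop the final "es"
		end
	end
	if label:find("s$") then
		return label:sub(1, -2)
	end
	return label
end

-- Link a word or phrase, preferring its singular form when that page exists.
-- page_exists: function(title) -> boolean (hypothetical existence check)
-- no_singularize: if true, link the text unchanged
local function link_word(word, page_exists, no_singularize)
	if not no_singularize then
		local singular = singularize(word)
		if singular ~= word and page_exists(singular) then
			if word:sub(1, #singular) == singular then
				-- The plural only adds a suffix: keep it outside the link.
				return "[[" .. singular .. "]]" .. word:sub(#singular + 1)
			end
			-- Otherwise pipe the link to the singular page.
			return "[[" .. singular .. "|" .. word .. "]]"
		end
	end
	return "[[" .. word .. "]]"
end
```

With no_singularize set, a label such as linguistics is linked as written, which avoids linking to the unrelated adjective page.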
Finally, note that the components of a default-type description (wikify
, with the
and no singularize
) can be given in any order if more than one of them needs to be specified.
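Order-independent recognition of the default-type components can be sketched as follows. The function parse_default and the flag names in its result are hypothetical, and raising an error on unrecognized text is an assumption; the real module may treat such values differently.

```lua
-- Map each recognized component to a flag name in the result table.
local known = {
	["with the"] = "with_the",
	["wikify"] = "wikify",
	["no singularize"] = "no_singularize",
}

-- Parse a description such as "default wikify with the".
-- Returns nil if the description is not a default-type description,
-- otherwise a table of boolean flags.
local function parse_default(desc)
	local rest = desc:match("^default(.*)$")
	if not rest then
		return nil
	end
	local flags = {}
	for phrase, key in pairs(known) do
		local count
		-- Remove at most one occurrence of each component.
		rest, count = rest:gsub(" " .. phrase, "", 1)
		flags[key] = count > 0
	end
	if rest ~= "" then
		error("unrecognized default description '" .. desc .. "'")
	end
	return flags
end
```

Because each component is removed independently, "default with the wikify" and "default wikify with the" yield the same flags.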
It is also possible to have handlers that can handle arbitrarily-formed labels, e.g. political subdivisions of country
for any country
(categories such as Category:tg:Political subdivisions of the United Arab Emirates) or divisions of polity
for any division
and polity
(e.g. Category:fr:Counties of South Korea or Category:pt:Municipalities of Tocantins, Brazil). Currently, handlers exist only in the toponym-handling code in Module:category tree/topic cat/data/Places and in Module:category tree/topic cat/data/Names. As an example, the following is the handler for script letter names
:
table.insert(handlers, function(label)
	local script = label:match("^(.*) letter names$")
	if script then
		local sc = require("Module:scripts").getByCanonicalName(script)
		if sc then
			local script_page
			local appendix = ("Appendix: %s script"):format(script)
			local appendix_title = mw.title.new(appendix)
			if appendix_title and appendix_title.exists then
				script_page = appendix
			else
				script_page = "w:" .. sc:getWikipediaArticle()
			end
			local link = ("]"):format(script_page, script)
			return {
				type = "name",
				description = ("{{{langname}}} terms that serve as names for letters and symbols directly based on letters, " ..
					"such as ]s and letters with ]s, of the %s."):format(link),
				parents = {"letter names"},
			}
		end
	end
end)
The handler is passed a single argument (the label), checks if the passed-in label has a recognized form, and if so, returns an object that follows the same format as described above for directly-specified labels. In this case, the handler makes sure the given script name specifies an actual script, and constructs an appropriate link for the script, depending on whether an appendix page for the script exists (falling back to Wikipedia).
NOTE: The handler needs to be prepared to handle both umbrella categories and per-language categories. The label is passed in as it appears in the category; this means the handler may need to handle both uppercase-initial and lowercase-initial variants of the label. (For this handler, this isn't an issue because the script always appears uppercased.) One way to do that is to convert the label to lowercase-initial before further processing, using mw.getContentLanguage():lcfirst()
.
Note also that if a handler is specified, the module should return a table holding both the label and handler data; see the above modules.
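The overall shape of a data submodule with a handler can be sketched in plain Lua. Everything here is illustrative: the "types of X" handler is invented for the example, lcfirst is an ASCII-only stand-in for mw.getContentLanguage():lcfirst(), lookup is a hypothetical dispatcher, and the field names LABELS and HANDLERS in the returned table are an assumption that should be checked against the existing submodules.

```lua
local labels = {}
local handlers = {}

-- ASCII-only stand-in for mw.getContentLanguage():lcfirst().
local function lcfirst(s)
	return s:sub(1, 1):lower() .. s:sub(2)
end

-- Hypothetical handler for labels of the form "types of X".
-- The label arrives as it appears in the category name (capitalized),
-- so it is converted to lowercase-initial form before matching.
table.insert(handlers, function(label)
	label = lcfirst(label)
	local thing = label:match("^types of (.+)$")
	if thing then
		return {
			type = "type",
			topic = thing,
			description = "=" .. thing,
			parents = { thing },
		}
	end
end)

-- Hypothetical dispatcher: directly specified labels win, then each
-- handler is tried in order until one returns a specification.
local function lookup(label)
	if labels[label] then
		return labels[label]
	end
	for _, handler in ipairs(handlers) do
		local data = handler(label)
		if data then
			return data
		end
	end
	return nil
end

-- A data submodule would end with "return export" (field names assumed).
local export = { LABELS = labels, HANDLERS = handlers }
```

Trying directly specified labels before handlers keeps explicit entries authoritative, which matches the idea that handlers only cover arbitrarily-formed labels.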
local labels = {}
local handlers = {}
local m_shared = require("Module:place/shared-data")
local m_strutils = require("Module:string utilities")
--[=[
This module contains specifications that are used to create labels that allow {{auto cat}} and
to create the appropriate definitions for topic categories for places (e.g. 'de:Hokkaido',
'es:Cities in France', 'pt:Municipalities of Tocantins, Brazil', etc.).
Note that this module doesn't actually create the categories; that must be done manually,
with the text "{{auto cat}}" as the definition of the category. (This process should automatically
happen periodically for non-empty categories, because they will appear in ]
and a bot will periodically examine that list and create any needed category.)
There are two ways that such labels are created: (1) by manually adding an entry to the 'labels'
table, keyed by the label (minus the language code) with a value consisting of a Lua table
specifying the description text and the category's parents; (2) through handlers (pieces of
Lua code) added to the 'handlers' list, which recognize labels of a specific type (e.g.
'Cities in France') and generate the appropriate specification for that label on-the-fly.
]=]
local function lcfirst(label)
return mw.getContentLanguage():lcfirst(label)
end
labels = {
type = "name",
description = "{{{langname}}} names for geographical ]s; ]s.",
parents = {"names"},
}
-- Generate bare labels in 'label' for all political subdivisions.
-- Do this before handling 'general_labels' so the latter can override if necessary.
for subdiv, desc in pairs(m_shared.political_subdivisions) do
desc = m_shared.format_description(desc, subdiv)
labels[subdiv] = {
type = "name",
description = "{{{langname}}} names of " .. desc .. ".",
parents = {"political subdivisions"},
}
end
--[=[
General labels. These are intended for places of all sorts that are not qualified by a holonym (e.g. it does not include
'regions in Africa'). These also do not need to include any political subdivisions listed in 'political_subdivisions' in
]. Each entry is {LABEL, PARENTS, DESCRIPTION, WPCAT, COMMONSCAT}:
* PARENTS should not include "list of names", which is added automatically.
* DESCRIPTION is the linked plural description of label, and is formatted using format_description() in
], meaning it can have the value of 'true' to construct a default singularized description
using link_label() in ] (e.g. 'atolls' -> ']s', 'beaches' -> ']es',
'countries' -> ']', etc.). A value of "w" links the singularized label to Wikipedia.
* WPCAT should be the name of a Wikipedia category to link to, or 'false' to disable this. A value of 'true' or 'nil'
links to a category the same as the label.
* COMMONSCAT should be the name of a Commons category to link to, or 'false' to disable this. A value of 'true' or
'nil' links to a category the same as the label.
]=]
local general_labels = {
{"airports", {"places"}},
{"ancient settlements", {"historical settlements"}, "former ], ]s and ]s that existed in ]"},
{"atolls", {"islands"}},
{"bays", {"places", "bodies of water"}},
{"beaches", {"places", "water"}},
{"bodies of water", {"landforms", "water"}, "]"},
{"boroughs", {"polities"}},
{"capital cities", {"cities"}, "] ]: the ] for a country or ] ] of a country"},
{"census-designated places", {"places"}},
{"cities", {"polities"}},
{"city-states", {"polities"}, "] ]s consisting of a single ] and ]"},
{"communities", {"polities"}, "] of all sizes"},
{"continents", {"places"}, "the ]s of the world"},
{"countries", {"polities"}},
{"dependent territories", {"polities"}, "w"},
{"deserts", {"places"}},
{"forests", {"places"}},
{"ghost towns", {"historical settlements"}},
{"gulfs", {"places", "water"}},
{"headlands", {"places"}},
{"historical and traditional regions", {"places"}, "regions that have no administrative significance"},
{"historical capitals", {"historical settlements"}, "former ] ] and ]s"},
{"historical dependent territories", {"dependent territories"}, "] (colonies, dependencies, protectorates, etc.) that no longer exist"},
{"historical political subdivisions", {"polities"}, "] ]s (states, provinces, counties, etc.) that no longer exist"},
{"historical polities", {"polities"}, "] (countries, kingdoms, empires, etc.) that no longer exist"},
{"historical settlements", {"historical polities"}, "], ]s and ]s that no longer exist or have been merged or reclassified"},
{"hills", {"places"}},
{"islands", {"places"}},
{"kibbutzim", {"places"}, "]im"},
{"lakes", {"places", "bodies of water"}},
{"landforms", {"places", "Earth"}},
{"micronations", {"places"}},
{"mountain passes", {"places"}, "]es"},
{"mountains", {"places"}},
{"moors", {"places"}},
{"neighborhoods", {"places"}, "]s, ]s and other subportions of a ]"},
-- FIXME, is the following parent correct?
{"oceans", {"seas"}},
{"parks", {"places"}},
{"peninsulas", {"places"}},
{"plateaus", {"places"}},
{"political subdivisions", {"polities"}, "] ]s, such as ]s, ]s or ]s"},
{"polities", {"places"}, "] or ] ]s"},
{"rivers", {"places", "bodies of water"}},
{"seas", {"places", "bodies of water"}},
{"straits", {"places", "bodies of water"}},
{"subdistricts", {"polities"}},
{"suburbs", {"places"}, "]s of a ]"},
{"towns", {"polities"}},
{"townships", {"polities"}},
{"unincorporated communities", {"places"}},
{"valleys", {"places", "water"}},
{"villages", {"polities"}},
{"volcanoes", {"landforms"}, "]es"},
}
-- Generate bare labels in 'label' for all "general labels" (see above).
for _, label_spec in ipairs(general_labels) do
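-- Field order follows the entry layout documented above: {LABEL, PARENTS, DESCRIPTION, WPCAT, COMMONSCAT}.
local label, parents, desc, wpcat, commonscat = unpack(label_spec)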
desc = m_shared.format_description(desc, label)
labels[label] = {
type = "name",
description = "{{{langname}}} names of " .. desc .. ".",
parents = parents,
commonscat = commonscat == nil and true or commonscat,
wpcat = wpcat == nil and true or wpcat,
}
end
labels = {
type = "name",
-- special-cased description
description = "{{{langname}}} informal alternative names for ] (e.g., ] for ]).",
parents = {"cities", "nicknames"},
}
labels = {
type = "name",
-- special-cased description
description = "{{{langname}}} ]s.",
parents = {"places"},
}
-- Generate bare labels in 'label' for all polities (countries, states, etc.).
for _, group in ipairs(m_shared.polities) do
for key, value in pairs(group.data) do
group.bare_label_setter(labels, group, key, value)
end
end
local function city_description(group, key, value)
-- The purpose of all the following code is to construct the description. It's written in
-- a general way to allow any number of containing polities, each larger than the previous one,
-- so that e.g. for Birmingham, the description will read "{{{langname}}} terms related to the city of
-- ], in the county of the ], in the ] of ],
-- in the ]."
local bare_key, linked_key = m_shared.construct_bare_and_linked_version(key)
local descparts = {}
table.insert(descparts, "the city of " .. linked_key)
local city_containing_polities = m_shared.get_city_containing_polities(group, key, value)
local label_parent -- parent of the label, from the immediate containing polity
for n, polity in ipairs(city_containing_polities) do
local bare_polity, linked_polity = m_shared.construct_bare_and_linked_version(polity)
if n == 1 then
label_parent = bare_polity
end
table.insert(descparts, ", in ")
if n < #city_containing_polities then
local divtype = polity.divtype or group.default_divtype
local pl_divtype = m_strutils.pluralize(divtype)
local pl_linked_divtype = m_shared.political_subdivisions[pl_divtype]
if not pl_linked_divtype then
error("When creating city description for " .. key .. ", encountered divtype '" .. divtype .. "' not in m_shared.political_subdivisions")
end
pl_linked_divtype = m_shared.format_description(pl_linked_divtype, pl_divtype)
local linked_divtype = m_strutils.singularize(pl_linked_divtype)
table.insert(descparts, "the " .. linked_divtype .. " of ")
end
table.insert(descparts, linked_polity)
end
return table.concat(descparts), label_parent
end
-- Generate bare labels in 'label' for all cities.
for _, group in ipairs(m_shared.cities) do
for key, value in pairs(group.data) do
if not value.alias_of then
local desc, label_parent = city_description(group, key, value)
desc = "{{{langname}}} terms related to " .. desc .. "."
local parents = value.parents or label_parent
if not parents then
error("When creating city bare label for " .. key .. ", at least one containing polity must be specified or an explicit parent must be given")
end
if type(parents) ~= "table" then
parents = {parents}
end
local key_parents = {}
for _, parent in ipairs(parents) do
local polity_group, key_parent = m_shared.city_containing_polity_to_group_and_key(parent)
if key_parent then
local bare_key_parent, linked_key_parent =
m_shared.construct_bare_and_linked_version(key_parent)
table.insert(key_parents, bare_key_parent)
else
error("Couldn't find entry for city '" .. key .."' parent '" .. parent .. "'")
end
end
-- wp= defaults to group-level wp=, then to true (Wikipedia article matches bare key = label)
local wp = value.wp
if wp == nil then
wp = group.wp or true
end
-- wpcat= defaults to wp= (if Wikipedia article has its own name, Wikipedia category and Commons category generally follow)
local wpcat = value.wpcat
if wpcat == nil then
wpcat = wp
end
-- commonscat= defaults to wpcat= (if Wikipedia category has its own name, Commons category generally follows)
local commonscat = value.commonscat
if commonscat == nil then
commonscat = wpcat
end
local function format_boxval(val)
if type(val) == "string" then
val = val:gsub("%%c", key):gsub("%%d", label_parent)
end
return val
end
labels[key] = {
type = "related-to",
description = desc,
parents = key_parents,
wp = format_boxval(wp),
wpcat = format_boxval(wpcat),
commonscat = format_boxval(commonscat),
}
end
end
end
-- Handler for "cities in the Bahamas", "rivers in Western Australia", etc.
-- Places that begin with "the" are recognized and handled specially.
table.insert(handlers, function(label)
label = lcfirst(label)
local place_type, place = label:match("^([a-z%- ]-) in (.*)$")
if place_type and m_shared.generic_place_types[place_type] then
for _, group in ipairs(m_shared.polities) do
local placedata = group.data[place]
if placedata then
placedata = group.value_transformer(group, place, placedata)
local allow_cat = true
if place_type == "neighborhoods" and placedata.british_spelling or
place_type == "neighbourhoods" and not placedata.british_spelling then
allow_cat = false
end
if placedata.is_former_place and place_type ~= "places" then
allow_cat = false
end
if placedata.is_city and not m_shared.generic_place_types_for_cities[place_type] then
allow_cat = false
end
if allow_cat then
local parent
if placedata.containing_polity then
parent = place_type .. " in " .. placedata.containing_polity
elseif place_type == "neighbourhoods" then
parent = "neighborhoods"
else
parent = place_type
end
local bare_place, linked_place = m_shared.construct_bare_and_linked_version(place)
local keydesc = placedata.keydesc or linked_place
local parents
if place_type == "places" then
parents = {{name = parent, sort = bare_place}, bare_place}
else
parents = {{name = parent, sort = bare_place}, bare_place, "places in " .. place}
end
return {
type = "name",
topic = label,
description = "{{{langname}}} names of " .. m_shared.generic_place_types[place_type] .. " in " .. keydesc .. ".",
parents = parents
}
end
end
end
end
end)
-- Handler for "places in Paris", "neighbourhoods of Paris", etc.
table.insert(handlers, function(label)
label = lcfirst(label)
local place_type, in_of, city = label:match("^(places) (in) (.*)$")
if not place_type then
place_type, in_of, city = label:match("^([a-z%- ]-) (of) (.*)$")
end
if place_type and m_shared.generic_place_types_for_cities[place_type] then
for _, group in ipairs(m_shared.cities) do
local city_data = group.data[city]
if city_data then
local spelling_matches = true
if place_type == "neighborhoods" or place_type == "neighbourhoods" then
local containing_polities = m_shared.get_city_containing_polities(group, city, city_data)
-- Assumes the first entry of containing_polities is the immediate containing polity.
local polity_group, polity_key = m_shared.city_containing_polity_to_group_and_key(
containing_polities[1])
if not polity_key then
error("Can't find polity data for city '" .. city ..
"' containing polity '" .. containing_polities[1] .. "'")
end
local polity_value = polity_group.value_transformer(polity_group, polity_key, polity_group.data[polity_key])
if place_type == "neighborhoods" and polity_value.british_spelling or
place_type == "neighbourhoods" and not polity_value.british_spelling then
spelling_matches = false
end
end
if spelling_matches then
local parents
if place_type == "places" then
parents = {city}
else
parents = {city, "places in " .. city}
end
local desc = city_description(group, city, city_data)
return {
type = "name",
topic = label,
description = "{{{langname}}} names of " .. m_shared.generic_place_types_for_cities[place_type] .. " " .. in_of .. " " .. desc .. ".",
parents = parents
}
end
end
end
end
end)
-- Handler for "political subdivisions of the Philippines" and other "political subdivisions of X" categories.
table.insert(handlers, function(label)
label = lcfirst(label)
local place = label:match("^political subdivisions of (.*)$")
if place then
for _, group in ipairs(m_shared.polities) do
local placedata = group.data[place]
if placedata then
placedata = group.value_transformer(group, place, placedata)
local bare_place, linked_place = m_shared.construct_bare_and_linked_version(place)
local keydesc = placedata.keydesc or linked_place
local desc = "{{{langname}}} names of [[political]] [[subdivision]]s of " .. keydesc .. "."
return {
type = "name",
topic = label,
description = desc,
breadcrumb = "political subdivisions",
parents = {bare_place, {name = "political subdivisions", sort = bare_place}},
}
end
end
end
end)
-- Handler for "provinces of the Philippines", "counties of Wales", "municipalities of Tocantins, Brazil", etc.
-- Places that begin with "the" are recognized and handled specially.
table.insert(handlers, function(label)
label = lcfirst(label)
local place_type, place = label:match("^([a-z%- ]-) of (.*)$")
if place then
for _, group in ipairs(m_shared.polities) do
local placedata = group.data[place]
if placedata then
placedata = group.value_transformer(group, place, placedata)
local divcat = nil
local poldiv_parent = nil
if placedata.poldiv then
for _, div in ipairs(placedata.poldiv) do
if type(div) == "string" then
div = {div}
end
if place_type == div[1] then
divcat = "poldiv"
poldiv_parent = div.parent
break
end
end
end
if not divcat and placedata.miscdiv then
for _, div in ipairs(placedata.miscdiv) do
if type(div) == "string" then
div = {div}
end
if place_type == div[1] then
divcat = "miscdiv"
break
end
end
end
if divcat then
local linkdiv = m_shared.political_subdivisions[place_type]
if not linkdiv then
error("Saw unknown place type '" .. place_type .. "' in label '" .. label .. "'")
end
linkdiv = m_shared.format_description(linkdiv, place_type)
local bare_place, linked_place = m_shared.construct_bare_and_linked_version(place)
local keydesc = placedata.keydesc or linked_place
local desc = "{{{langname}}} names of " .. linkdiv .. " of " .. keydesc .. "."
if divcat == "poldiv" then
return {
type = "name",
topic = label,
description = desc,
breadcrumb = poldiv_parent and m_shared.call_key_to_placename(group, place) or place_type,
parents = poldiv_parent and
{{name = poldiv_parent, sort = bare_place}, bare_place} or
{"political subdivisions of " .. place, {name = place_type, sort = bare_place}},
}
else
return {
type = "name",
topic = label,
description = desc,
breadcrumb = place_type,
parents = {bare_place},
}
end
end
end
end
end
end)
-- Generate bare labels in 'labels' for all types of capitals.
for capital_cat, placetype in pairs(m_shared.capital_cat_to_placetype) do
local pl_placetype = m_strutils.pluralize(placetype)
local linkdiv = m_shared.political_subdivisions[pl_placetype]
if not linkdiv then
error("Saw unknown place type '" .. pl_placetype .. "' for capital category '" .. capital_cat .. "'")
end
linkdiv = m_shared.format_description(linkdiv, pl_placetype)
labels[capital_cat] = {
type = "name",
description = "{{{langname}}} names of [[capital]]s of " .. linkdiv .. ".",
parents = {"capital cities"},
}
end
-- Handler for "state capitals of the United States", "provincial capitals of Canada", etc.
-- Places that begin with "the" are recognized and handled specially.
table.insert(handlers, function(label)
label = lcfirst(label)
local capital_cat, place = label:match("^([a-z%- ]- capitals) of (.*)$")
-- Make sure we recognize the type of capital.
if place and m_shared.capital_cat_to_placetype[capital_cat] then
local placetype = m_shared.capital_cat_to_placetype[capital_cat]
local pl_placetype = m_strutils.pluralize(placetype)
-- Locate the containing polity, fetch its known political subdivisions, and make sure
-- the placetype corresponding to the type of capital is among the list.
for _, group in ipairs(m_shared.polities) do
local placedata = group.data[place]
if placedata then
placedata = group.value_transformer(group, place, placedata)
if placedata.poldiv then
local saw_match = false
local variant_matches = {}
for _, div in ipairs(placedata.poldiv) do
if type(div) == "string" then
div = {div}
end
-- HACK. Currently if we don't find a match for the placetype, we map e.g.
-- 'autonomous region' -> 'regional capitals' and 'union territory' -> 'territorial capitals'.
-- When encountering a political subdivision like 'autonomous region' or
-- 'union territory', chop off everything up through a space to make things match.
-- To make this clearer, we record all such "variant match" cases, and down below we
-- insert a note into the category text indicating that such "variant matches"
-- are included among the category.
if pl_placetype == div[1] or pl_placetype == (div[1]:gsub("^.* ", "")) then
saw_match = true
if pl_placetype ~= div[1] then
table.insert(variant_matches, div[1])
end
end
end
if saw_match then
-- Everything checks out, construct the category description.
local linkdiv = m_shared.political_subdivisions[pl_placetype]
if not linkdiv then
error("Saw unknown place type '" .. pl_placetype .. "' in label '" .. label .. "'")
end
linkdiv = m_shared.format_description(linkdiv, pl_placetype)
local bare_place, linked_place = m_shared.construct_bare_and_linked_version(place)
local keydesc = placedata.keydesc or linked_place
local variant_match_text = ""
if #variant_matches > 0 then
for i, variant_match in ipairs(variant_matches) do
variant_matches[i] = m_shared.political_subdivisions[variant_match]
if not variant_matches[i] then
error("Saw unknown place type '" .. variant_match .. "' in label '" .. label .. "'")
end
variant_matches[i] = m_shared.format_description(variant_matches[i], variant_match)
end
variant_match_text = " (including " .. require("Module:table").serialCommaJoin(variant_matches) .. ")"
end
local desc = "{{{langname}}} names of [[capital]]s of " .. linkdiv .. variant_match_text .. " of " .. keydesc .. "."
return {
type = "name",
topic = label,
description = desc,
parents = {{name = capital_cat, sort = bare_place}, bare_place},
}
end
end
end
end
end
end)
-- "regions in (continent)", esp. for regions that span multiple countries
labels["regions in the world"] = { -- for multinational regions which do not fit neatly within one continent
type = "name",
description = "{{{langname}}} names of [[region]]s in the world (which do not fit neatly within one country or continent).",
parents = {"places"},
}
labels["regions in the Americas"] = {
type = "name",
description = "{{{langname}}} names of [[region]]s in the Americas.",
parents = {"America"},
}
-- "countries in (continent)", "rivers in (continent)"
for _, continent in ipairs({"Africa", "Asia", "Central America", "Europe", "North America", "Oceania", "South America"}) do
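labels["countries in " .. continent] = {
type = "name",
description = "{{{langname}}} names of [[country|countries]] in [[" .. continent .. "]].",
parents = {{name = "countries", sort = " "}, continent},
}
labels["rivers in " .. continent] = {
type = "name",
description = "{{{langname}}} names of [[river]]s in [[" .. continent .. "]].",
parents = {{name = "rivers", sort = " "}, continent},
}
labels["regions in " .. continent] = {
type = "name",
description = "{{{langname}}} names of [[region]]s of [[" .. continent .. "]].",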
parents = {{name = "regions", sort = " "}, continent},
}
end
-- autonomous communities, oblasts, etc
labels = {
type = "name",
-- special-cased description
description = "{{{langname}}} names of the ].",
parents = {{name = "political subdivisions", sort = "Spain"}, "Spain"},
}
labels = {
type = "name",
-- special-cased description
description = "{{{langname}}} names of the ].",
parents = {{name = "political subdivisions", sort = "Spain"}, "Spain"},
}
-- boroughs
labels["boroughs in England"] = {
type = "name",
description = "{{{langname}}} names of boroughs, local government districts and unitary authorities in [[England]].",
parents = {{name = "boroughs", sort = "England"}, "England"},
}
labels["boroughs in Pennsylvania, USA"] = {
type = "name",
description = "{{{langname}}} names of boroughs in [[Pennsylvania]], USA.",
parents = {{name = "boroughs in the United States", sort = "Pennsylvania"}, "Pennsylvania, USA"},
}
labels["boroughs in New Jersey, USA"] = {
type = "name",
description = "{{{langname}}} names of boroughs in [[New Jersey]], USA.",
parents = {{name = "boroughs in the United States", sort = "New Jersey"}, "New Jersey, USA"},
}
labels["boroughs in New York City"] = {
type = "name",
description = "{{{langname}}} names of boroughs in [[New York City]].",
parents = {{name = "boroughs in the United States", sort = "New York City"}, "New York City"},
}
labels["boroughs in the United States"] = {
type = "name",
description = "{{{langname}}} names of [[borough]]s in the [[United States]].",
-- parent is "boroughs" not "political subdivisions" and category says "in"
-- not "of", because boroughs aren't really political subdivisions in the US
-- (more like cities)
parents = {{name = "boroughs", sort = "United States"}, "United States"},
}
-- census-designated places
labels["census-designated places in the United States"] = {
type = "name",
description = "{{{langname}}} names of [[census-designated place]]s in the [[United States]].",
-- parent is just United States; census-designated places have no political
-- status and exist only in the US, so no need for a top-level
-- "census-designated places" category
parents = {"United States"},
}
-- counties
labels["counties of Northern Ireland"] = {
type = "name",
description = "{{{langname}}} names of the counties of [[Northern Ireland]].",
-- has two parents: "political subdivisions" and "counties of Ireland"
parents = {{name = "political subdivisions", sort = "Northern Ireland"}, {name = "counties of Ireland", sort = "Northern Ireland"}, "Northern Ireland"},
}
-- nomes
labels["nomes of Ancient Egypt"] = {
type = "name",
-- special-cased description
description = "{{{langname}}} names of the nomes of [[Ancient Egypt]].",
parents = {{name = "political subdivisions", sort = "Egypt"}, "Ancient Egypt"},
}
-- regions and "regional units"
labels["regions of Albania"] = {
type = "name",
-- special-cased description
description = "{{{langname}}} names of the regions (peripheries) of [[Albania]].",
parents = {{name = "political subdivisions", sort = "Albania"}, "Albania"},
}
labels["regions of Greece"] = {
type = "name",
-- special-cased description
description = "{{{langname}}} names of the regions (peripheries) of [[Greece]].",
parents = {{name = "political subdivisions", sort = "Greece"}, "Greece"},
}
labels["regions of North Macedonia"] = {
type = "name",
-- special-cased description
description = "{{{langname}}} names of the regions (peripheries) of [[North Macedonia]].",
parents = {{name = "political subdivisions", sort = "North Macedonia"}, "North Macedonia"},
}
-- subdistricts and subprefectures
labels = {
type = "name",
description = "default",
-- not listed in the normal place because no categories like "cities in Jakarta"
parents = {{name = "political subdivisions", sort = "Jakarta"}, "Indonesia"},
}
labels = {
type = "name",
-- special-cased description
description = "{{{langname}}} names of subprefectures of Japanese prefectures.",
parents = {{name = "political subdivisions", sort = "Japan"}, "Japan"},
}
-- towns and townships
labels["townships in Canada"] = {
type = "name",
description = "{{{langname}}} names of townships in [[Canada]].",
parents = {{name = "townships", sort = "Canada"}, "Canada"},
}
labels["townships in Ontario"] = {
type = "name",
description = "{{{langname}}} names of townships in [[Ontario]]. A municipality in Ontario can be called a city, a town, a township, or a village.",
parents = {{name = "townships in Canada", sort = "Ontario"}, "Ontario"},
}
labels["townships in Quebec"] = {
type = "name",
description = "{{{langname}}} names of townships in [[Quebec]].",
parents = {{name = "townships in Canada", sort = "Quebec"}, "Quebec"},
}
-- temporary while users adjust to recent changes, also kept in case of desire to use for its topical purpose, see description; can be removed later if unused
labels = {
type = "type",
description = "{{{langname}}} terms like ''hydronym'', for types of geographical [[name]]s.",
parents = {"names"},
}
return {LABELS = labels, HANDLERS = handlers}