This module contains data on all known locations, along with some lower-level code to process them (higher-level known-location code is in Module:place/placetypes). You must load this module using require(), not using mw.loadData().
NOTE: In order to understand the following better, first read the introductory documentation in Module:place, especially the section More about known locations.
The bulk of the code in this module (after some helper functions and placetype tables) describes the known locations and their relationships. Locations are grouped into location groups that share some common properties (examples are states of the United States and cities in Brazil). Each location group is associated with two tables, a data table that lists the locations and their individual properties, and a metadata table that lists group-level properties and defaults for the location properties. Each metadata table points to the associated data table (i.e. contains the data table as its data
field), and the global locations
variable holds a list of all group metadata tables. A given location is generally described by three values: (a) the group metadata table for the group the location is part of; (b) the location's canonical key, which is the actual key in the group's data table and is globally unique across all locations; and (c) the location's spec, which is the initialized object describing the properties of the location and comes from the value in the data table corresponding to the canonical key, transformed by the initialize_spec()
function. These are typically named group
, key
and spec
, respectively and in that order, and are found in the arguments to many functions.
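As a rough illustration of how these pieces fit together, here is a minimal sketch in plain Lua. The two state keys, the `find_location` helper and the miniature tables are assumptions made for the example; only the `data` field, the `default_` prefix, `brazil_states` and the `locations` list come from the description above.

```lua
-- Hypothetical miniature of one location group; the real tables are much larger.
local brazil_states_data = {
	["Bahia, Brazil"] = {},  -- canonical key -> (uninitialized) location spec
	["Paraná, Brazil"] = {},
}

local brazil_states_group = {
	default_placetype = "state",  -- group-level default for `placetype`
	default_container = "Brazil", -- group-level default for `container`
	data = brazil_states_data,    -- the metadata table points at its data table
}

-- Global list of all group metadata tables.
local locations = { brazil_states_group }

-- Illustrative lookup (not part of the module's API): scan every group for a key
-- and return the usual (group, key, spec) triple.
local function find_location(wanted_key)
	for _, group in ipairs(locations) do
		local spec = group.data[wanted_key]
		if spec then
			return group, wanted_key, spec
		end
	end
	return nil
end

local group, key, spec = find_location("Bahia, Brazil")
```

Note that the spec returned here is still uninitialized; in the module itself it would be passed through initialize_spec() before use.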
In a per-group data table, the keys are either canonical keys describing locations (which, as mentioned above, must be globally unique) or alias keys specifying an allowed alias for a given location. There may be multiple aliases for a given location and the alias keys only need to be unique within a particular group data table, not across all groups. It is also possible for the same string to serve as an alias key in one group and a canonical key in another group. (For example, Newcastle
appears as an alias key in two different groups, referring to two different locations, canonically known as Newcastle upon Tyne
, for the city in England, and Newcastle, New South Wales
, for the city in New South Wales, Australia; and Birmingham
appears both as a canonical key in the group of English cities and an alias key for canonical Birmingham, Alabama
in the group of US cities.) The corresponding value objects are different for canonical and alias keys. Corresponding to canonical keys are location specs, describing the properties of the location that cannot be derived from default properties of the group or global defaults. Corresponding to alias keys are alias specs, which are highly restricted in the properties they can contain, and whose properties do not have per-group defaults, but only global defaults.
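The Newcastle and Birmingham cases can be sketched as follows. This is a simplified illustration, not the module's code: the `resolve_key` helper and the cut-down group tables are hypothetical, and only the `alias_of` property (described later on this page) is taken from the documentation.

```lua
-- Two hypothetical, heavily trimmed groups.
local england_cities = {
	data = {
		["Newcastle upon Tyne"] = {},                       -- canonical key
		["Newcastle"] = { alias_of = "Newcastle upon Tyne" }, -- alias key
		["Birmingham"] = {},                                -- canonical key
	},
}
local us_cities = {
	data = {
		["Birmingham, Alabama"] = {},                        -- canonical key
		["Birmingham"] = { alias_of = "Birmingham, Alabama" }, -- alias key
	},
}

-- Illustrative helper: resolve a key inside ONE group to its canonical key and
-- the spec stored under that canonical key. Returns nil if the key is unknown.
local function resolve_key(group, key)
	local spec = group.data[key]
	if not spec then
		return nil
	end
	if spec.alias_of then
		return spec.alias_of, group.data[spec.alias_of]
	end
	return key, spec
end
```

Because resolution happens per group, the same string "Birmingham" yields a different canonical key depending on which group is consulted.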
The canonical key is always the same as the bare category corresponding to the location, which is one of the reasons it must be globally unique. For example, the country of Georgia uses the canonical key Georgia
and corresponding bare category Category:Georgia, while the US state of Georgia uses the canonical key Georgia, USA
and corresponding bare category Category:Georgia, USA. The following conventions are followed in naming keys:
Newcastle, New South Wales
and Birmingham, Alabama
above. Examples of name conflicts are the two cities just given; examples of ambiguities are the major cities of León and Mérida in Mexico and city of Cartagena, Colombia, which are given the respective canonical keys of León, Guanajuato
, Mérida, Yucatán
and Cartagena, Colombia
to avoid ambiguity with the well-known respective cities of the same name in Spain, even though none of those cities are large enough to be included as known locations in this module. (The cutoff is generally a metro area of at least 1,000,000 inhabitants, although there are exceptions.)
Normandy, France
(a region), Calvados, France
(a department in the region of Normandy), Herefordshire, England
(a ceremonial county), Northwest Territories, Canada
(a territory), Central Finland, Finland
(a region), Antalya Province, Turkey
(a province), Cluj County, Romania
(a county), County Cork, Ireland
(a county) and New York, USA
(a state). As shown in these various examples, (a) first- and second-level divisions are sometimes both included (as in France, the United Kingdom and China); (b) the qualifier after the comma is sometimes a constituent country (England) instead of a country (United Kingdom), and is sometimes abbreviated (USA rather than United States or United States of America); (c) the word the
is not normally included in the key even if the location is normally preceded by the
when following a preposition (there is a property in the location and alias specs to indicate this), except in a very few cases (most notably The Hague
); (d) the country is included as a qualifier even if it creates an apparent redundancy, as with Central Finland, Finland
; and (e) sometimes the placetype is included in the key, as with provinces in Turkey and several other countries; states in Nigeria; and counties in Ireland, Romania and several other countries. Whether the placetype is included, and whether it follows or precedes the placename, depends on per-country conventions. For example, provinces in Turkey, Iran and several other countries (likewise for states in Nigeria, oblasts in Russia, etc.) conventionally include the word "Province", "State", "Oblast" etc. in their name because they are normally named after the largest city in the division, which would otherwise lead to ambiguity; and counties in Ireland and Northern Ireland (and likewise County Durham, England) normally have the word "County" preceding rather than following them in their conventional name, so we follow this practice. The Wikipedia article naming scheme for a given administrative division is a strong clue as to how the division is normally referred to, and we usually follow this practice. (A minor exception is that the Wikipedia articles for provinces in Iran, Laos and Thailand include the word province
with an initial lowercase letter while provinces elsewhere, e.g. North and South Korea, Saudi Arabia and Turkey, use uppercase Province
; we normalize to uppercase Province
in all cases.)
As mentioned above, associated with canonical keys in the group data table are location specs, which are objects containing properties. It is important here to distinguish initialized specs from uninitialized specs. Uninitialized specs are exactly as specified in Module:place/locations, containing only those properties that differ from the per-group or global defaults. Initialized specs result from calling initialize_spec()
on an uninitialized spec (it is idempotent in that it will do nothing if encountering an already-initialized spec). This copies all group-level defaults that are not overridden in the location spec itself from the group-level metadata table into the location spec, so that in general, no more reference need be made to the group to fetch the correct value of a given location property. (The initialization process also does more transformations in a few cases, noted below.) Note that the default value of a given property is stored under a key in the group metadata table that is preceded by the string default_
; for example, the default value corresponding to the placetype
property of a given location is specified in the default_placetype
key in the group metadata table.
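The default-copying step can be pictured with the following minimal sketch. It is an assumption-laden simplification: the function name `initialize_spec_sketch` and the `initialized` marker field are invented for the example, and the additional transformations performed by the real initialize_spec() (such as container canonicalization) are omitted.

```lua
-- Sketch only: copy every group-level `default_<prop>` into the spec as `<prop>`
-- unless the spec already sets that property.
local function initialize_spec_sketch(group, spec)
	if spec.initialized then
		return spec -- idempotent: an already-initialized spec is left alone
	end
	for group_key, value in pairs(group) do
		local prop = group_key:match("^default_(.+)$")
		-- Compare against nil so that an explicit `false` on the spec is kept.
		if prop and spec[prop] == nil then
			spec[prop] = value
		end
	end
	spec.initialized = true -- hypothetical marker field
	return spec
end

-- Usage with a made-up group:
local example_group = { default_placetype = "state", default_the = true, data = {} }
local example_spec = initialize_spec_sketch(example_group, { placetype = "territory" })
-- example_spec.placetype stays "territory"; example_spec.the is filled in from default_the.
```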
The following are the properties of the location spec.
placetype
: String specifying the placetype of the location (e.g. "country", "state", "province"). This can also be a table of such types; in this case, the first listed type is the canonical type that will be used in descriptions, but the location will be recognized (e.g. in a holonym, or for categorizing into the bare category) when tagged with any of the specified types. The placetype must be either specified on an individual location or defaulted at the group level, or an error occurs.
container
: Either a string, a canonicalized container structure or a list of either type, specifying the immediate container (or containers) of the given location. A container is another location which this location is considered to be directly part of, either politically or (above the country level) geographically. Some locations belong to multiple immediate containers; this applies especially to transcontinental countries such as Russia and Turkey. Containers can themselves have containers, forming a tree (or more correctly, a w:directed acyclic graph) of locations. The list of immediate container(s), followed by the container(s) of the container(s), etc., is termed the container trail, and some functions compute and return this trail as part of their operation. When a location spec is initialized, the given container spec is canonicalized into canonical container form, which consists of a list of canonicalized container structures, each of which is of the form {key = "container_key", placetype = "container_placetype"}
, where container_key
is a canonical location key and container_placetype
should be the listed placetype for the location, or the first listed placetype if there are multiple. (FIXME: Since the key uniquely identifies the container location, we should eliminate the placetype from the container structure.) The list of canonicalized container structures is stored into the .containers
field of the location spec (this happens even if the container value is unset in its uninitialized spec form, causing it to default to the corresponding group-level value), and the .container
field is set to nil
. The canonicalization process is described in more detail below under #Container spec canonicalization.
divs
: List of recognized political divisions; e.g. for the Netherlands, a specification of the form divs = {"provinces", "municipalities"}
will allow categories such as Category:de:Provinces of the Netherlands and Category:pt:Municipalities of the Netherlands to be created. Any division that appears here must also be found in placetype_data
, or an error occurs. The entities appearing in the divs
list can be structures as well as just strings; this is explained more below under #Location divisions. Additional political divisions that apply to all locations in a group can be specified at the group level using the group-only property addl_divs
, which has the same format as divs
. This is intended to be used in the situation where some division types are shared among all locations in the group and others differ from location to location. An example where this is used is the United States, where census-designated places
is specified in the group-level addl_divs
so that all 50 states have census-designated places categorized as e.g. Category:Census-designated places in Arizona, USA, but counties
and county seats
are specified in the group-level default_divs
because not all states have counties and county seats (Alaska has boroughs and borough seats and Louisiana has parishes and parish seats), and some states have additional divisions (New Jersey and Pennsylvania also have boroughs, while Colorado and Connecticut have municipalities). Note that under most circumstances (particularly, if container_parent_type
is not set as a property associated with the division type), any division type specified on a sub-country-level location must also be specified on all containers up through the country. For example, since French departments specify communes
and municipalities
in default_divs
, the same division types must be (and are) specified on French regions and for France itself.
keydesc
: String directly specifying a description of the location, for use in generating the contents of category pages related to the location. In place of a string, a function of three arguments (group
, key
, spec
, as is normal for locations) that computes the location description can also be given. This is used, for example, for Russian federal subjects; see construct_russia_federal_subject_keydesc
. The special string +++
contained in the keydesc is replaced with the default value of the location description, which specifies the location's placename, placetype, and the corresponding values for each container in the container trail, generally up through (but not beyond) the country level; see no_include_container_in_desc
below. The location description is used to construct the full description of various categories, such as bare location categories, whose description generally reads "{{{langname}}} terms related to the people, culture, or territory of keydesc."
where keydesc
is the specified or auto-constructed location description.
fulldesc
: String overriding the full description for the bare location category (but not for any other category). This is currently used only for the location Earth
, at the very top of the tree (because the standard "people, culture or territory of ..." text doesn't make sense here), and for
Antarctica (because it has no permanent inhabitants). FIXME: This should be renamed bare_category_fulldesc.
addl_parents
: Specify additional parents for the bare location category, in addition to the category or categories generated based on the immediate container(s). For example, Hawaii, USA
specifies Polynesia
as an additional parent category; both North Korea
and South Korea
specify Korea
(which is a specially handled location category) as an additional parent; and Earth
specifies nature
(not a location category, but still a topic category) as an additional parent (which in this case becomes the first parent, as Earth
has no container). The only restriction on the categories in addl_parents
is that they must be topic categories, because each language-specific version of the bare location category gets the corresponding language-specific versions of the categories in addl_parents
. FIXME: This should be renamed bare_category_addl_parents.
wp
: Spec describing how to construct the Wikipedia article for the location. Each spec is either true
(equivalent to "%l"
, i.e. use the full location placename directly) or a string containing formatting directives, indicating how to construct the article name. The allowed formatting directives are %l
(the full location placename), %e
(the elliptical location placename) and %c
(the full placename of the first immediate container). For example, the default value of wp
for the group of United States cities is "%l, %c"
since the city articles tend to be named e.g. Austin, Texas
(but with many exceptions, specified using wp
fields at the city level). Another example is Thai provinces, which specify a group-level default of "%e province"
as the Wikipedia articles have lowercase province
in their name but the Thai province keys specified in this module have uppercase Province
. Here we have to use %e
to get the placename without the word Province
in it. The default is true
, which simply uses the full location placename as the article name. Note that the Wikipedia article, along with the Wikipedia and Commons category pages, is shown in the upper right of bare category pages.
wpcat
: Spec describing how to construct the Wikipedia category page for the location (i.e. the page listing articles and categories relevant to the location). The format is the same as with wp
, and it defaults to the value of wp
. It rarely needs to be specified because the category page and the article page almost always follow the same format.
commonscat
: Spec describing how to construct the Commons category page for the location (i.e. the page on the Wikimedia Commons site listing articles and categories relevant to the location). It has the same format as wp
and wpcat
and defaults to wpcat
, which is usually (but not always) correct.
the
: Boolean specifying whether a location should be preceded by the
when following a preposition, e.g. in category names such as Category:Cities in the Northern Territory, Australia and in old-style place descriptions when the location occurs as the first holonym, such as the city Darwin described using {{place|city|terr/Northern Territory|c/Australia}}
. Note that the global default for this and all Boolean properties is nil
, which amounts to the same as false.
british_spelling
: Boolean indicating whether the location in question uses British spelling. Currently this only affects whether the spelling neighborhoods
or neighbourhoods
is used in categories such as Category:Neighborhoods of New York City and Category:Neighbourhoods of Sydney. This usually needs to be set only at the top level (i.e. country or country-like entity), because lower-level entities look up the container trail for any container that has british_spelling = true
set, and if found, assume that British spelling applies. The general principle used in setting this is that all countries in Europe, all dependent territories of any such country, all former British colonies, and any dependent territories of these former colonies, are assumed to use British spelling, while all other countries and associated dependent territories are assumed to use American spelling. This can potentially be modified on a case-by-case basis.
is_city
: Boolean indicating whether the location in question is a city. This is explicitly set to true
for city-states (e.g. Monaco and Vatican City), dependent territories that are cities (e.g. Hong Kong, Macau, Bonaire, Gibraltar, etc.), certain city-level administrative divisions (such as City of Belfast, Northern Ireland
) and (through a group-level setting) New York boroughs. In addition, it is set to true
in initialize_spec() whenever the group-level default_placetype == "city"
, so that all cities get it set without explicitly needing to add a group-level setting for this. Note that the condition default_placetype == "city"
intentionally excludes Chinese prefecture-level cities, which aren't really cities in that (for example) they don't directly contain neighborhoods, but do contain cities within them. This setting is used in various places: (a) to add cities, rivers, etc. to categories like Category:Rivers in Osaka Prefecture, Japan and Category:Cities in Wuhan for holonyms that are not cities; (b) to add districts, neighborhoods, and the like to categories like Category:Neighborhoods of Brooklyn and Category:Neighborhoods of Monaco for holonyms that are cities; (c) generally, to determine which "generic" placetypes (cities, rivers, neighborhoods, etc.) apply to the location. (Those that can occur with cities have a generic_before_cities
setting in Module:place/placetypes, and those that can occur with non-cities have a generic_before_non_cities
setting.)
is_former_place
: Boolean that should be set on former places such as the Soviet Union and the Roman Empire. For such places, categories such as Category:fr:Rivers in the Soviet Union are neither generated nor recognized (more generally, no "generic" placetypes apply except for places
), and category descriptions include the word former.
overriding_bare_label_parents
: Document me!
bare_category_parent_type
: Document me!
no_container_cat
: Document me!
no_container_parent
: Document me!
no_generic_place_cat
: Document me!
no_check_holonym_mismatch
: Document me!
no_auto_augment_container
: Document me!
no_include_container_in_desc
: Document me!
The divs
field of a location describes the recognized political division types of that location. Specifying a given division type causes any place defined as being of that division type, with the location as a holonym, to be categorized as placetypes in/of location
; for example, specifying that the United States has "states"
as a division will cause anything defined as {{place|fr|state|c/US}}
to be categorized under Category:fr:States of the United States. Note that you do not have to explicitly specify division types for "generic" placetypes (those that have a generic_before_non_cities
field if the location is not a city, or that have a generic_before_cities
field if the location is a city); this includes things like cities, towns, villages, neighbo(u)rhoods and rivers. A given element in the divs
list is usually a string naming a plural placetype; the placetype is automatically converted to the singular for recognizing the placetype in a {{place}}
spec, and irregular plurals such as kibbutzim
are handled correctly as long as the placetype specifies an appropriate plural
field (if the plural
isn't explicitly given, the default singularization algorithm in Module:en-utilities is run, which gets most things correctly but has problems with passes
and fortresses
, which are singularized to passe
and fortresse
; for this reason, an explicit plural entry is added to terms in -ss). In place of a string, an object can be given with the plural placetype in the type
field; this allows additional properties to be specified along with the placetype. An example of this is the divs
list for Canada:
["Canada"] = {divs = {
	{type = "provinces", cat_as = "provinces and territories"},
	{type = "territories", cat_as = "provinces and territories"},
	"counties", "districts", "municipalities", "regional municipalities",
	"rural municipalities", "parishes",
	"Indian reserves",
	"census divisions",
	{type = "townships", prep = "in"},
}, ...},
Here, both provinces and territories are set to categorize as provinces and territories
, meaning that there is a single category Category:Provinces and territories of Canada rather than separate categories for provinces and territories. Similar things are done for other countries that have more than one type of first-level administrative division (e.g. Australia, China, India and Pakistan). Note that any placetype listed under cat_as
must exist in the table of placetypes in Module:place/placetypes, and in fact there is a category-only entry there for `provinces and territories!` (the use of exclamation point following a plural placetype means that the placetype is present only for use in categories and won't be recognized as the placetype field in a {{place}}
description). In addition, townships are declared to use in
rather than of
as the preposition in the category; hence the category name will be Category:Townships in Canada rather than Category:Townships of Canada. (The use of in
vs. of
is somewhat related to whether a given placetype is an official administrative or statistical division of the location in question and comes in a defined list, in which case of
should be used, or is more ill-defined, in which case in
should be used; the default is of
, and the use of in
with townships
is probably by analogy with the use of in
with cities and towns.)
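The effect of cat_as and prep on the resulting category name can be illustrated with a small sketch. The `div_category` helper below is hypothetical and deliberately minimal: it assumes cat_as is a plain string and ignores everything else the real module does (such as inserting "the" before certain locations).

```lua
-- Uppercase the first letter if it is a lowercase ASCII letter.
local function ucfirst(s)
	return (s:gsub("^%l", string.upper))
end

-- Hypothetical helper: derive the category name for one `divs` entry.
-- `div` is either a plural placetype string or a table with a `type` field.
local function div_category(div, location)
	if type(div) == "string" then
		div = { type = div }
	end
	local label = div.cat_as or div.type -- cat_as (assumed to be a string here) wins
	local prep = div.prep or "of"        -- "of" is the documented default
	return "Category:" .. ucfirst(label) .. " " .. prep .. " " .. location
end

-- Provinces and territories share one category; townships use "in".
local provinces_cat = div_category({ type = "provinces", cat_as = "provinces and territories" }, "Canada")
local territories_cat = div_category({ type = "territories", cat_as = "provinces and territories" }, "Canada")
local townships_cat = div_category({ type = "townships", prep = "in" }, "Canada")
```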
Another more complex example is the divisions given for Quebec:
["Quebec, Canada"] = {divs = {
	"counties",
	{type = "regional county municipalities", container_parent_type = "regional municipalities"},
	{type = "regions", container_parent_type = false},
	{type = "townships", prep = "in"},
	{type = "parish municipalities", cat_as = {{type = "parishes", container_parent_type = "counties"}, "municipalities"}},
	{type = "township municipalities", cat_as = {{type = "townships", prep = "in"}, "municipalities"}},
	{type = "village municipalities", cat_as = {{type = "villages", prep = "in"}, "municipalities"}},
}, ...},
Here, container_parent_type
controls the second parent category of the placetype/location category associated with the entry. In this case, for example, Category:Counties of Quebec, Canada will have Category:Counties of Canada as its second or container-level parent. However, this doesn't make sense for regional county municipalities
, which exist only in Quebec (so the parent category Category:Regional county municipalities of Canada would have only one subcategory); but they are similar to regional municipalities in British Columbia, Nova Scotia and Ontario, so the container_parent_type = "regional municipalities"
spec causes the container-level parent of this category to be Category:Regional municipalities of Canada. Likewise, regions
as administrative divisions (as opposed to mere geographic regions) exist only in Quebec; they have no equivalent elsewhere, so we disable the container-level parent using container_parent_type = false
. The specs for parish municipalities
, township municipalities
and village municipalities
show both that multiple types can be specified under cat_as
(here, for example, we categorize parish municipalities
as both parishes
and municipalities
) and that these types can themselves have properties, just as for entries directly under divs
. Specifically, {type = "parishes", container_parent_type = "counties"}
means that any place defined as a parish municipality in Quebec will be categorized under both Category:Parishes of Quebec, Canada and Category:Municipalities of Quebec, Canada, and that the former will have a container-level parent of Category:Counties of Canada (rather than the default of Category:Parishes of Canada). Similarly, township municipalities
will be categorized under both Category:Townships in Quebec, Canada (not Category:Townships of Quebec, Canada) and Category:Municipalities of Quebec, Canada.
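To make the interaction of nested cat_as entries and container_parent_type more concrete, here is a hedged sketch that expands a single divs entry into its target categories and their container-level parents. The `expand_div` function and its return shape are invented for illustration, and it assumes the parent category uses the same preposition as the target; the real module may compute these differently.

```lua
local function ucfirst(s)
	return (s:gsub("^%l", string.upper))
end

-- Turn a bare string entry into the table form.
local function normalize(entry)
	if type(entry) == "string" then
		return { type = entry }
	end
	return entry
end

-- Hypothetical: returns a list of { cat = ..., container_parent = ... or nil }.
local function expand_div(div, location, container)
	div = normalize(div)
	local targets = div.cat_as
	if targets == nil then
		targets = { div }     -- categorize under the division type itself
	elseif type(targets) == "string" then
		targets = { targets } -- single cat_as string
	end
	local result = {}
	for _, target in ipairs(targets) do
		target = normalize(target)
		local prep = target.prep or "of"
		local parent_type = target.container_parent_type
		if parent_type == nil then
			parent_type = target.type -- default: same type at the container level
		end
		local parent
		if parent_type then -- `false` disables the container-level parent
			parent = "Category:" .. ucfirst(parent_type) .. " " .. prep .. " " .. container
		end
		table.insert(result, {
			cat = "Category:" .. ucfirst(target.type) .. " " .. prep .. " " .. location,
			container_parent = parent,
		})
	end
	return result
end

local parish = expand_div(
	{ type = "parish municipalities",
	  cat_as = { { type = "parishes", container_parent_type = "counties" }, "municipalities" } },
	"Quebec, Canada", "Canada")
```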
A fully canonicalized container spec for a given location consists of a list of canonicalized container objects, each with a key
and placetype
field. The key
field should name the canonical key of some other location at a higher level (e.g. French cities are contained in French departments, which are contained in French regions, which are contained in France, which is contained in Europe, which is contained in Eurasia, which is contained in the Earth). The placetype
field should correspond to the first (canonical) placetype listed for the key in question. The process of initializing a location spec converts the container spec in .container
into a canonicalized spec in .containers
and removes the spec from .container
. It works as follows:
If the container
field is missing, and there is a group-level default_container
field, it is used in its place. For example, none of the Brazilian states listed in brazil_states
specifies a container, but the group specifies default_container = "Brazil"
.default_container
, and there is a group-level canonicalize_key_container
field, it is assumed to be a one-argument function and is called on the string to get a canonicalized container object.key
, with placetype
set to "country".
Aliases can be provided for canonical keys using alias keys. Alias keys have a very different location spec structure from canonical keys. This structure does not, in general, have defaults at the group level and is not initialized using initialize_spec()
, but is used as-is. The following properties are recognized in an alias location spec:
alias_of
: The canonical key of which this key is an alias. Required.
the
: If true, this alias key is preceded by the
following a preposition. Defaults to the group-level default_the
but does not pay attention to the value of the
for the corresponding canonical key.
display
: This is a display alias, meaning that holonyms using the placename corresponding to this alias will be converted to the placename corresponding to the canonical key when formatting the holonym for display. (Otherwise, the aliasing applies only to categorization.) If the value is true, the display canonicalization is to the placename of the canonical key; otherwise, the value should be a key whose corresponding placename is used when display canonicalizing.
placetype
: The placetype of the alias. Rarely needs to be specified as it defaults to the canonical key's placetype, and if that is unspecified, to the group-level default placetype.
As mentioned above, associated with each location group is a metadata table listing group-level properties. The metadata table contains two types of keys: group-level defaults (named like the corresponding location-level keys but preceded by default_
, e.g. default_placetype
corresponding to the location-level placetype
key) and group-only keys, which are mostly functions. The following are the possible group-only keys:
data
: This points to the group data table for the group, as described above.
key_to_placename
: This is a function of one argument to transform the location's key (whether canonical or alias) into the full and elliptical placenames. The difference between full and elliptical placenames is described in the documentation for Module:place, but in essence, it applies for keys that include the placetype in them (e.g. Phuket Province, Thailand
or County Mayo, Ireland
), in which case the full placename includes the placetype and the elliptical placename does not. For keys that do not include the placetype in them (e.g. Arizona, USA
or Gloucestershire, England
), the full and elliptical placenames are identical. Note that neither the full nor the elliptical placename includes the container in it; hence, for Phuket Province, Thailand
, the full placename is Phuket Province
and the elliptical placename is just Phuket
. (Note that the full vs. elliptical placename distinction is intended only for handling cases where the placetype follows or precedes the raw placename and there is no difference between the two in whether they are normally preceded by the
. More complex situations, such as State of Mexico
(which normally takes the
) vs. just Mexico
(which doesn't), or Islamabad Capital Territory
vs. just Islamabad
, should be handled instead by aliases.) The key_to_placename
function takes one argument, the key, and returns two values, the full and elliptical placenames, respectively. If left undefined, the default is to chop off anything starting with a comma and return the result as both full and elliptical placename, and if specifically set to false
, the key is used directly as both full and elliptical placename. If it needs to be defined, it is best to use the helper function make_key_to_placename
, if possible (or make_irish_type_key_to_placename
in the case of Ireland and Northern Ireland, where County
precedes), rather than rolling your own. In addition, you should use the global key_to_placename
function (which takes care of the default implementation and such) rather than directly calling the function in the key_to_placename
field.
placename_to_key
: This is approximately the inverse of key_to_placename
, transforming a placename (which can be either in full or elliptical form) into the corresponding key. As with key_to_placename
, if you need to define this (generally, when the full and elliptical placenames are different), prefer using make_placename_to_key
(or make_irish_type_placename_to_key
for Ireland and Northern Ireland) to rolling your own. In addition, similarly to key_to_placename
, use the global placename_to_key
function to convert placenames to keys rather than directly invoking the function in the placename_to_key
field. If the field is set to false
, the placename is used unchanged as the key. Otherwise, the default algorithm works as follows:
If default_placetype == "city", use the placename unchanged as the key.
If default_container exists and is a string, append it to the placename after a comma + space and use the result as the key.
If default_container is a canonical container object (an object with key and placetype fields), and the placetype field is either country or constituent country, append the key field to the placename after a comma + space and use the result as the key.
canonicalize_key_container
: A function of one argument to convert the specified container
field, when a string, to canonical form. Described in more detail above under #Container spec canonicalization. It is preferable to construct the function using make_canonicalize_key_container
, if possible, rather than rolling your own.
addl_divs
: Additional political divisions appended, for all locations in the group, to the list of divisions derived from the location-level divs
or group-level default_divs
fields to get the final list of divisions for the location. See #Location divisions for more details.function export.process_error(fmt, ...)
Throw an error. fmt
is a format string and the remaining arguments are passed through mw.dumpObject
and then used to format the format string as if fmt:format(...)
were called. In general, callers should use internal_error
unless the error was due to bad user input rather than a logic error (which usually isn't the case in deep back-end code like this).
function export.internal_error(fmt, ...)
Throw an internal error (a logic error that should never happen unless there is a bug in the code, as opposed to a user error triggered by bad input or a system error due to something like running out of memory or hitting a time limit). fmt
is a format string and the remaining arguments are passed through mw.dumpObject
and then used to format the format string as if fmt:format(...)
were called.
function export.key_to_placename(group, key)
Call the location group's key_to_placename
function if it exists (see the comment at the top of Module:place for the distinction between keys and placenames). Two values are returned, the full and elliptical placenames (e.g. full "County Durham"
vs. elliptical "Durham"
). If the group does not define key_to_placename
, both full and elliptical placenames are computed by chopping off anything starting with a comma.
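The fallback just described (no group-level function, so everything from the first comma onward is dropped) can be sketched as follows. This is a minimal illustration, not the module's actual code; the helper name default_key_to_placename is hypothetical.

```lua
-- Hypothetical helper illustrating the documented default only:
-- when a group defines no key_to_placename function, the full and
-- elliptical placenames are both the key with the comma suffix removed.
local function default_key_to_placename(key)
	-- Remove the first comma and everything after it, if present.
	local placename = key:gsub(",.*$", "")
	return placename, placename
end

local full, elliptical = default_key_to_placename("Normandy, France")
print(full, elliptical)
```

A key without a comma, such as `Georgia`, passes through unchanged.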
function export.placename_to_key(group, placename)
Call the location group's placename_to_key
function if it exists (see the comment at the top of Module:place for the distinction between keys and placenames) and return the result. If placename_to_key
exists with the value false
, return the placename unchanged. If the group does not define placename_to_key
, and it defines a default_container
whose placetype is either country
or constituent country
, the container name is appended to the placename after a comma and a space. Otherwise the placename is returned unchanged.
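As a rough sketch of the lookup order described for placename_to_key, the following Lua combines the explicit-function case, the false case and the default algorithm. It is an illustration under assumptions, not the module's implementation: the name sketch_placename_to_key is hypothetical, and it is assumed here that a group-defined function takes the placename as its only argument.

```lua
-- Hypothetical sketch of the documented behaviour; not the module's own code.
local function sketch_placename_to_key(group, placename)
	local fn = group.placename_to_key
	if fn == false then
		-- Explicit `false`: the placename is used unchanged as the key.
		return placename
	elseif fn ~= nil then
		-- Assumption: the group-level function takes just the placename.
		return fn(placename)
	end
	-- Default algorithm.
	if group.default_placetype == "city" then
		return placename
	end
	local container = group.default_container
	if type(container) == "string" then
		return placename .. ", " .. container
	elseif type(container) == "table" and
		(container.placetype == "country" or container.placetype == "constituent country") then
		return placename .. ", " .. container.key
	end
	return placename
end

print(sketch_placename_to_key(
	{default_container = {key = "France", placetype = "country"}}, "Normandy"))
```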
function export.initialize_spec(group, key, spec)
Initialize the location spec spec
, augmenting it with default values taken from group
if the spec itself doesn't specify values for the properties. This sets containers
to a canonicalized list of objects, each with key
and placetype
keys, describing the immediate containers of the location, and erases (sets to nil) the original non-canonicalized container
field. (Most locations have only one immediate container but some, e.g. Russia, have more than one. Containers should be carefully distinguished from category parents. Generally the container is the first category parent, or the first n
parents if there are n
containers, but there may be additional category parents, which indicate some sort of relation between the category parent and the location but not necessarily one of containment.)
This function is idempotent in that nothing happens if called more than once on the same spec.
FIXME: Consider reimplementing this in a more standardly object-oriented way using metatables.
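The defaulting and idempotency described above might look roughly like the sketch below. It is deliberately simplified and makes two labeled assumptions: the `initialized` marker field is hypothetical (the documentation does not say how repeated calls are detected), and container canonicalization is omitted entirely.

```lua
-- Hypothetical, simplified sketch; not the module's actual initialize_spec().
local function sketch_initialize_spec(group, spec)
	if spec.initialized then -- assumed marker field, for illustration only
		return spec
	end
	for field, value in pairs(group) do
		-- A group-level `default_foo` supplies the default for property `foo`.
		local prop = type(field) == "string" and field:match("^default_(.+)$")
		if prop and spec[prop] == nil then
			spec[prop] = value
		end
	end
	spec.initialized = true
	return spec
end

local group = {default_placetype = "state", default_the = false, data = {}}
local spec = sketch_initialize_spec(group, {the = true})
print(spec.placetype, spec.the)
```

Comparing against nil rather than testing truthiness lets a location override a group default with an explicit false.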
function export.find_canonical_key(key)
If key
is a canonical known location key (i.e. not an alias), return the corresponding group and initialized spec. If no such key exists, return nil
. This throws an internal error if two locations with the same key are found.
function export.iterate_matching_location(data)
Iterator that returns all locations matching a given description, where the description consists of either a placename or a key along with a list of possible placetypes. Usually there will be at most one such location. The iterator returns three values at each iteration: the location group, canonical key by which the location is known and the spec object describing the location. data
contains the following possible fields:
placetypes: A list of possible placetypes, one of which must match one of the location's placetypes; or a string specifying a placetype, which must match one of the location's placetypes. This must be specified.
placename: The placename of the location. Either this or key must be specified.
key: The key of the location. Either this or placename must be specified.
alias_resolution: If specified, it behaves the same as for find_matching_key_in_group.
The spec is normally initialized using initialize_spec() prior to it being returned (but may not be if alias_resolution is given and the specified key or placename is an alias; see the documentation for find_matching_key_in_group).
function export.get_matching_location(data)
Return the location matching a given description, where the description consists of either a placename or a key along with a list of possible placetypes. This is similar to iterate_matching_location()
but throws an internal error if there is not exactly one location found; as such, it is for use with internally specified locations (such as the containers of known locations) rather than externally specified locations, which may not match a known location and in some cases may match multiple known locations. For finding an externally specified location, consider using find_matching_holonym_location
, which returns nil
rather than throwing an error if the location isn't found, but also (more importantly) checks to make sure there are no conflicting holonyms among the user-specified holonyms (e.g. {{place|city|s/Delaware|c/USA|t=Newark}}
will not match the known location Newark
, which is in New Jersey, not Delaware).
function export.iterate_containers(group, key, spec)
Successively iterate over a location's containers, and then the containers of those containers, etc. Keep in mind that locations may have multiple containers (e.g. Russia has both Europe and Asia as containers, and both Europe and Asia have Eurasia as their container). A given container will never be returned twice (e.g. in the case where a specific location A has locations B and C as containers, and B has C as its container, C will not be returned twice). An internal error happens if a container loop is detected. The return value is a list of location objects, each of which contains group
, key
and spec
fields.
function export.construct_linked_placename(spec, placename, display_form)
Given a placename, convert it into a link (two-part if display_form
is given and differs from placename
) and add "the "
to the beginning if called for in spec
.
export.locations
List of all known locations, in groups. The first group lists continents and continental regions, followed by three groups listing top-level locations: countries, "country-like entities" (de-facto/unrecognized/etc. countries and dependent territories) and former polities (countries, empires, etc.). After that come first-level subpolities (administrative divisions) of several, mostly large, countries, followed by groups of cities. China and the United Kingdom include second-level subpolities (in the case of China, only the largest ones as the full list runs in the hundreds).
local export = {}
export.force_cat = false -- set to true to force category generation even on non-mainspace pages
local m_table = require("Module:table")
local string_utilities_module = "Module:string utilities"
local en_utilities_module = "Module:en-utilities"
local insert = table.insert
local concat = table.concat
local dump = mw.dumpObject
local unpack = unpack or table.unpack -- Lua 5.2 compatibility
--[==[ intro:
This module contains data on all known locations, along with some lower-level code to process them (higher-level
known-location code is in [[Module:place/placetypes]]). You must load this module using require(), not using
mw.loadData().
===Location data===
'''NOTE: In order to understand the following better, first read the introductory documentation in [[Module:place]],
especially the section `More about known locations`.'''
The bulk of the code in this module (after some helper functions and placetype tables) describes the known locations
and their relationships. Locations are grouped into ''location groups'' that share some common properties (examples are
states of the United States and cities in Brazil). Each location group is associated with two tables, a ''data table''
that lists the locations and their individual properties, and a ''metadata table'' that lists group-level properties and
defaults for the location properties. Each metadata table points to the associated data table (i.e. contains the data
table as its `data` field), and the global `locations` variable holds a list of all group metadata tables. A given
location is generally described by three values: (a) the group metadata table for the group the location is part of; (b)
the location's canonical ''key'', which is the actual key in the group's data table and is globally unique across all
locations; and (c) the location's ''spec'', which is the initialized object describing the properties of the location
and comes from the value in the data table corresponding to the canonical key, transformed by the `initialize_spec()`
function. These are typically named `group`, `key` and `spec`, respectively and in that order, and are found in the
arguments to many functions.
In a per-group data table, the keys are either ''canonical keys'' describing locations (which, as mentioned above, must
be globally unique) or ''alias keys'' specifying an allowed alias for a given location. There may be multiple aliases
for a given location and the alias keys only need to be unique within a particular group data table, not across all
groups. It is also possible for the same string to serve as an alias key in one group and a canonical key in another
group. (For example, `Newcastle` appears as an alias key in two different groups, referring to two different locations,
canonically known as `Newcastle upon Tyne`, for the city in England, and `Newcastle, New South Wales`, for the city in
New South Wales, Australia; and `Birmingham` appears both as a canonical key in the group of English cities and an alias
key for canonical `Birmingham, Alabama` in the group of US cities.) The corresponding value objects are different for
canonical and alias keys. Corresponding to canonical keys are ''location specs'', describing the properties of the
location that cannot be derived from default properties of the group or global defaults. Corresponding to alias keys
are ''alias specs'', which are highly restricted in the properties they can contain, and whose properties do not have
per-group defaults, but only global defaults.
The canonical key is always the same as the bare category corresponding to the location, which is one of the reasons it
must be globally unique. For example, the country of Georgia uses the canonical key `Georgia` and corresponding bare
category [[:Category:Georgia]], while the US state of Georgia uses the canonical key `Georgia, USA` and corresponding
bare category [[:Category:Georgia, USA]]. The following conventions are followed in naming keys:
* Countries, ''country-like entities'' (which are a mixture of unrecognized de-facto states and dependent territories)
and ''former countries'' (which also includes other types of polities, such as the Roman Empire) use their unqualified
placename as the canonical key. (See the documentation for [[Module:place]] for the distinction between keys and
placenames, which is critical to understand when working with location data.) This also applies to constituent
countries (such as England, Aruba and the Faroe Islands) and constituent parts of grouped dependent territories (such
as the island of Saint Helena, which is administratively part of the British overseas territory of Saint Helena,
Ascension and Tristan da Cunha).
* Cities (including prefecture-level cities in China, which behave in most respects more like non-city administrative
divisions) also normally use their unqualified placename as the canonical key, but if this causes name conflicts or
ambiguities, they use a ''qualified key'' containing either the country name or immediate containing division (if
different) following a comma, such as the case of `Newcastle, New South Wales` and `Birmingham, Alabama` above.
Examples of name conflicts are the two cities just given; examples of ambiguities are the major cities of León and
Mérida in Mexico and city of Cartagena, Colombia, which are given the respective canonical keys of `León, Guanajuato`,
`Mérida, Yucatán` and `Cartagena, Colombia` to avoid ambiguity with the well-known respective cities of the same name
in Spain, even though none of those Spanish cities is large enough to be included as a known location in this module. (The
cutoff is generally having a metro area of at least 1,000,000 inhabitants, although there are exceptions.)
* Administrative divisions of countries, other than the exceptions noted above for constituent countries and dependent
territories, use a qualified key that contains the name of the country or constituent country in it, e.g.
`Normandy, France` (a region), `Calvados, France` (a department in the region of Normandy), `Herefordshire, England`
(a ceremonial county), `Northwest Territories, Canada` (a territory), `Central Finland, Finland` (a region),
`Antalya Province, Turkey` (a province), `Cluj County, Romania` (a county), `County Cork, Ireland` (a county) and
`New York, USA` (a state). As shown in these various examples, (a) first and second-level divisions are sometimes both
included (as in France, the United Kingdom and China); (b) the qualifier after the comma is sometimes a constituent
country (England) instead of a country (United Kingdom), and is sometimes abbreviated (USA rather than United States
or United States of America); (c) the word `the` is not normally included in the key even if the location is normally
preceded by `the` when following a preposition (there is a property in the location and alias specs to indicate this),
except in a very few cases (most notably `The Hague`); (d) the country is included as a qualifier even if it creates
an apparent redundancy, as with `Central Finland, Finland`; and (e) sometimes the placetype is included in the key, as
with provinces in Turkey and several other countries; states in Nigeria; and counties in Ireland, Romania and several
other countries. Whether the placetype is included, and whether it follows or precedes the placename, depends on
per-country conventions. For example, provinces in Turkey, Iran and several other countries (likewise for states in
Nigeria, oblasts in Russia, etc.) conventionally include the word "Province", "State", "Oblast" etc. in their name
because they are normally named after the largest city in the division, which would otherwise lead to ambiguity; and
counties in Ireland and Northern Ireland (and likewise County Durham, England) normally have the word "County"
preceding rather than following them in their conventional name, so we follow this practice. The Wikipedia article
naming scheme for a given administrative division is a strong clue as to how the division is normally referred to,
and we usually follow this practice. (A minor exception is that the Wikipedia articles for provinces in Iran, Laos and
Thailand include the word `province` with an initial lowercase letter while provinces elsewhere, e.g. North and South
Korea, Saudi Arabia and Turkey, use uppercase `Province`; we normalize to uppercase `Province` in all cases.)
As mentioned above, associated with canonical keys in the group data table are location specs, which are objects
containing properties. It is important here to distinguish ''initialized specs'' from ''uninitialized specs''.
Uninitialized specs are as directly specified in ], containing only those properties that
differ from the per-group or global defaults. Initialized specs result from calling `initialize_spec()` on an
uninitialized spec (it is idempotent in that it will do nothing if encountering an already-initialized spec). This
copies all group-level defaults that are not overridden in the location spec itself from the group-level metadata table
into the location spec, so that in general, no more reference need be made to the group to fetch the correct value of a
given location property. (The initialization process also does more transformations in a few cases, noted below.) Note
that the default value of a given property is stored under a key in the group metadata table that is preceded by the
string `default_`; for example, the default value corresponding to the `placetype` property of a given location is
specified in the `default_placetype` key in the group metadata table.
The following are the properties of the location spec.
* `placetype`: String specifying the placetype of the location (e.g. "country", "state", "province"). This can also be a
table of such types; in this case, the first listed type is the canonical type that will be used in descriptions, but
the location will be recognized (e.g. in a holonym, or for categorizing into the bare category) when tagged with any
of the specified types. The placetype '''must''' be either specified on an individual location or defaulted at the
group level, or an error occurs.
* `container`: Either a string, a ''canonicalized container'' structure or a list of either type, specifying the
immediate ''container'' (or containers) of the given location. A container is another location which this location is
considered to be directly part of, either politically or (above the country level) geographically. Some locations
belong to multiple immediate containers; this applies especially to transcontinental countries such as Russia and
Turkey. Containers can themselves have containers, forming a tree (or more correctly, a [[w:directed acyclic graph|directed acyclic graph]])
of locations. The list of immediate container(s), followed by the container(s) of the container(s), etc., is termed
the ''container trail'', and some functions compute and return this trail as part of their operation. When a location
spec is initialized, the given container spec is canonicalized into ''canonical container form'', which consists of a
list of canonicalized container structures, each of which is of the form
`{key = "``container_key``", placetype = "``container_placetype``"}`, where ``container_key`` is a canonical location
key and ``container_placetype`` should be the listed placetype for the location, or the first listed placetype if
there are multiple. (FIXME: Since the key uniquely identifies the container location, we should eliminate the
placetype from the container structure.) The list of canonicalized container structures is stored into the
`.containers` field of the location spec (this happens even if the container value is unset in its uninitialized spec
form, causing it to default to the corresponding group-level value), and the `.container` field is set to {nil}. The
canonicalization process is described in more detail below under [[#Container spec canonicalization]].
* `divs`: List of recognized political divisions; e.g. for the Netherlands, a specification of the form
`divs = {"provinces", "municipalities"}` will allow categories such as ]
and ] to be created. Any division that appears here must also be
found in `placetype_data`, or an error occurs. The entities appearing in the `divs` list can be structures as well as
just strings; this is explained more below under [[#Location divisions]]. Additional political divisions that apply to
all locations in a group can be specified at the group level using the group-only property `addl_divs`, which has the
same format as `divs`. This is intended to be used in the situation where some division types are shared among all
locations in the group and others differ from location to location. An example where this is used is the United
States, where `census-designated places` is specified in the group-level `addl_divs` so that all 50 states have
census-designated places categorized as e.g. ], but `counties`
and `county seats` are specified in the group-level `default_divs` because not all states have counties and county
seats (Alaska has boroughs and borough seats and Louisiana has parishes and parish seats), and some states have
additional divisions (New Jersey and Pennsylvania also have boroughs, while Colorado and Connecticut have
municipalities). Note that under most circumstances (particularly, if `container_parent_type` is not set as a property
associated with the division type), any division type specified on a sub-country-level location must also be specified
on all containers up through the country. For example, since French departments specify `communes` and
`municipalities` in `default_divs`, the same division types must be (and are) specified on French regions and for
France itself.
* `keydesc`: String directly specifying a description of the location, for use in generating the contents of category
pages related to the location. In place of a string, a function of three arguments (`group`, `key`, `spec`, as is
normal for locations) that computes the location description can also be given. This is used, for example, for
Russian federal subjects; see `construct_russia_federal_subject_keydesc`. The special string `+++` contained in the
keydesc is replaced with the default value of the location description, which specifies the location's placename,
placetype, and the corresponding values for each container in the container trail, generally up through (but not
beyond) the country level; see `no_include_container_in_desc` below. The location description is used to construct
the full description of various categories, such as bare location categories, whose description generally reads
`"{{(((}}langname}}} terms related to the people, culture, or territory of ``keydesc``."` where ``keydesc`` is the
specified or auto-constructed location description.
* `fulldesc`: String overriding the full description for the bare location category (but not for any other category).
This is currently used only for the location `Earth`, at the very top of the tree (because the standard `people,
culture or territory of ...` text doesn't make sense here), and for `Antarctica` (because it has no permanent
inhabitants). FIXME: This should be renamed `bare_category_fulldesc`.
* `addl_parents`: Specify additional parents for the bare location category, in addition to the category or categories
generated based on the immediate container(s). For example, `Hawaii, USA` specifies `Polynesia` as an additional
parent category; both `North Korea` and `South Korea` specify `Korea` (which is a specially handled location category)
as an additional parent; and `Earth` specifies `nature` (not a location category, but still a topic category) as an
additional parent (which in this case becomes the first parent, as `Earth` has no container). The only restriction on
the categories in `addl_parents` is that they must be topic categories, because each language-specific version of the
bare location category gets the corresponding language-specific versions of the categories in `addl_parents`. FIXME:
This should be renamed `bare_category_addl_parents`.
* `wp`: Spec describing how to construct the Wikipedia article for the location. Each spec is either `true` (equivalent
to `"%l"`, i.e. use the full location placename directly) or a string containing formatting directives, indicating how
to construct the article name. The allowed formatting directives are `%l` (the full location placename), `%e` (the
elliptical location placename) and `%c` (the full placename of the first immediate container). For example, the
default value of `wp` for the group of United States cities is `"%l, %c"` since the city articles tend to be named
e.g. `Austin, Texas` (but with many exceptions, specified using `wp` fields at the city level). Another example is
Thai provinces, which specify a group-level default of `"%e province"` as the Wikipedia articles have lowercase
`province` in their name but the Thai province keys specified in this module have uppercase `Province`. Here we have
to use `%e` to get the placename without the word `Province` in it. The default is `true`, which simply uses the full
location placename as the article name. Note that the Wikipedia article, along with the Wikipedia and Commons category
pages, are shown in the upper right of bare category pages.
* `wpcat`: Spec describing how to construct the Wikipedia category page for the location (i.e. the page listing articles
and categories relevant to the location). The format is the same as with `wp`, and it defaults to the value of `wp`.
It rarely needs to be specified because the category page and the article page almost always follow the same format.
* `commonscat`: Spec describing how to construct the Commons category page for the location (i.e. the page on the
Wikimedia Commons site listing articles and categories relevant to the location). It has the same format as `wp` and
`wpcat` and defaults to `wpcat`, which is usually (but not always) correct.
* `the`: Boolean specifying whether a location should be preceded by `the` when following a preposition, e.g. in
category names such as ] and in old-style place descriptions
when the location occurs as the first holonym, such as the city ] described using
{{tl|place|city|terr/Northern Territory|c/Australia}}. Note that the global default for this and all Boolean
properties is {nil}, which amounts to the same as {false}.
* `british_spelling`: Boolean indicating whether the location in question uses British spelling. Currently this only
affects whether the spelling `neighborhoods` or `neighbourhoods` is used in categories such as
] and ]. This usually needs to be set
only at the top level (i.e. country or country-like entity), because lower-level entities look up the container trail
for any container that has `british_spelling = true` set, and if found, assume that British spelling applies. The
general principle used in setting this is that all countries in Europe, all dependent territories of any such country,
all former British colonies, and any dependent territories of these former colonies, are assumed to use British
spelling, while all other countries and associated dependent territories are assumed to use American spelling. This
can potentially be modified on a case-by-case basis.
* `is_city`: Boolean indicating whether the location in question is a city. This is explicitly set to `true` for
city-states (e.g. Monaco and Vatican City), dependent territories that are cities (e.g. Hong Kong, Macau, Bonaire,
Gibraltar, etc.), certain city-level administrative divisions (such as `City of Belfast, Northern Ireland`) and
(through a group-level setting) New York boroughs. In addition, it is set to `true` in initialize_spec() whenever
the group-level `default_placetype == "city"`, so that all cities get it set without explicitly needing to add a
group-level setting for this. Note that the condition `default_placetype == "city"` intentionally excludes Chinese
prefecture-level cities, which aren't really cities in that (for example) they don't directly contain neighborhoods,
but do contain cities within them. This setting is used in various places: (a) to add cities, rivers, etc. to
categories like ] and ] for holonyms that
are ''not'' cities; (b) to add districts, neighborhoods, and the like to categories like
] and ] for holonyms that ''are'' cities;
(c) generally, to determine which "generic" placetypes (cities, rivers, neighborhoods, etc.) apply to the location.
(Those that can occur with cities have a `generic_before_cities` setting in ], and those
that can occur with non-cities have a `generic_before_non_cities` setting.)
* `is_former_place`: Boolean that should be set on former places such as the Soviet Union and the Roman Empire. For such
places, categories such as ] are neither generated nor recognized (more
generally, no "generic" placetypes apply except for `places`), and category descriptions include the word `former`.
* `overriding_bare_label_parents`: Document me!
* `bare_category_parent_type`: Document me!
* `no_container_cat`: Document me!
* `no_container_parent`: Document me!
* `no_generic_place_cat`: Document me!
* `no_check_holonym_mismatch`: Document me!
* `no_auto_augment_container`: Document me!
* `no_include_container_in_desc`: Document me!
====Location divisions====
The `divs` field of a location describes the recognized political division types of that location. Specifying a given
division type causes any place defined as being of that division type, with the location as a holonym, to be
categorized as ` ``placetypes`` in/of ``location`` `; for example, specifying that the United States has `"states"` as
a division will cause anything defined as {{tl|place|fr|state|c/US}} to be categorized under
[[:Category:States of the United States]]. Note that you do not have to explicitly specify division types for
"generic" placetypes (those that have a `generic_before_non_cities` field if the location is not a city, or that have a
`generic_before_cities` field if the location is a city); this includes things like cities, towns, villages,
neighbo(u)rhoods and rivers. A given element in the `divs` list is usually a string naming a plural placetype; the
placetype is automatically converted to the singular for recognizing the placetype in a {{tl|place}} spec, and irregular
plurals such as `kibbutzim` are handled correctly as long as the placetype specifies an appropriate `plural` field
(if the `plural` isn't explicitly given, the default singularization algorithm in ] is run, which
gets most things right but has problems with `passes` and `fortresses`, which are singularized to `passe` and
`fortresse`; for this reason, an explicit plural entry is added to terms in ''-ss''). In place of a string, an object
can be given with the plural placetype in the `type` field; this allows additional properties to be specified along with
the placetype. An example of this is the `divs` list for Canada:
{
	["Canada"] = {divs = {
		{type = "provinces", cat_as = "provinces and territories"},
		{type = "territories", cat_as = "provinces and territories"},
		"counties", "districts", "municipalities", "regional municipalities",
		"rural municipalities", "parishes",
		"Indian reserves",
		"census divisions",
		{type = "townships", prep = "in"},
	}, ...},
}
Here, both provinces and territories are set to categorize as `provinces and territories`, meaning that there is a
single category [[:Category:Provinces and territories of Canada]] rather than separate categories for provinces and
territories. Similar things are done for other countries that have more than one type of first-level administrative
division (e.g. Australia, China, India and Pakistan). Note that any placetype listed under `cat_as` must exist in the
table of placetypes in ], and in fact there is a category-only entry there for `provinces and
territories!` (the use of exclamation point following a plural placetype means that the placetype is present only for
use in categories and won't be recognized as the placetype field in a {{tl|place}} description). In addition, townships
are declared to use `in` rather than `of` as the preposition in the category; hence the category name will be
[[:Category:Townships in Canada]] rather than [[:Category:Townships of Canada]]. (The use of `in` vs. `of` is somewhat
related to whether a given placetype is an official administrative or statistical division of the location in question
and comes in a defined list, in which case `of` should be used, or is more ill-defined, in which case `in` should be
used; the default is `of`, and the use of `in` with `townships` is probably by analogy with the use of `in` with cities
and towns.)
Another more complex example is the divisions given for Quebec:
{
	["Quebec, Canada"] = {divs = {
		"counties",
		{type = "regional county municipalities", container_parent_type = "regional municipalities"},
		{type = "regions", container_parent_type = false},
		{type = "townships", prep = "in"},
		{type = "parish municipalities", cat_as = {{type = "parishes", container_parent_type = "counties"}, "municipalities"}},
		{type = "township municipalities", cat_as = {{type = "townships", prep = "in"}, "municipalities"}},
		{type = "village municipalities", cat_as = {{type = "villages", prep = "in"}, "municipalities"}},
	}, ...},
}
Here, `container_parent_type` controls the second parent category of the placetype/location category associated with the
entry. In this case, for example, [[:Category:Counties of Quebec, Canada]] will have [[:Category:Counties of Canada]] as
its second or ''container-level'' parent. However, this doesn't make sense for `regional county municipalities`, which
exist only in Quebec (so the parent category [[:Category:Regional county municipalities of Canada]] would have only one
subcategory); but they are similar to regional municipalities in British Columbia, Nova Scotia and Ontario, so the
`container_parent_type = "regional municipalities"` spec causes the container-level parent of this category to be
[[:Category:Regional municipalities of Canada]]. Likewise, `regions` as administrative divisions (as opposed to mere
geographic regions) exist only in Quebec; they have no equivalent elsewhere, so we disable the container-level parent
using `container_parent_type = false`. The specs for `parish municipalities`, `township municipalities` and
`village municipalities` show both that multiple types can be specified under `cat_as` (here, for example, we categorize
`parish municipalities` as both `parishes` and `municipalities`) and that these types can themselves have properties,
just as for entries directly under `divs`. Specifically, `{type = "parishes", container_parent_type = "counties"}`
means that any place defined as a parish municipality in Quebec will be categorized under both
[[:Category:Parishes of Quebec, Canada]] and [[:Category:Municipalities of Quebec, Canada]], and that the former will
have a container-level parent of [[:Category:Counties of Canada]] (rather than the default of
[[:Category:Parishes of Canada]]). Similarly, `township municipalities` will be categorized under both
[[:Category:Townships in Quebec, Canada]] (''not'' [[:Category:Townships of Quebec, Canada]]) and
[[:Category:Municipalities of Quebec, Canada]].
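As a rough illustration of how `prep` and `cat_as` combine, the following sketch computes the location-level category
names for a single `divs` entry. The function `div_categories` is hypothetical and is not part of this module; it
assumes that category names have the form ``placetype`` + ``prep`` + ``key`` (as in `Parishes of Quebec, Canada`) and it
ignores `container_parent_type`, which affects only the parent of the category rather than its name.

```lua
-- Hypothetical helper (not part of this module): list the category names for one `divs` entry.
local function div_categories(div, location_key)
	if type(div) == "string" then
		div = {type = div}
	end
	-- `cat_as` may be a single string, a single object with a `type` field, or a list of either.
	local targets = div.cat_as or div
	if type(targets) == "string" or targets.type then
		targets = {targets}
	end
	local cats = {}
	for _, target in ipairs(targets) do
		if type(target) == "string" then
			target = {type = target}
		end
		local capitalized = target.type:gsub("^%l", string.upper)
		-- The preposition defaults to `of` when the entry does not specify `prep`.
		table.insert(cats, ("%s %s %s"):format(capitalized, target.prep or "of", location_key))
	end
	return cats
end

local cats = div_categories(
	{type = "township municipalities", cat_as = {{type = "townships", prep = "in"}, "municipalities"}},
	"Quebec, Canada"
)
-- cats[1] is "Townships in Quebec, Canada" and cats[2] is "Municipalities of Quebec, Canada"
```

The sketch takes `prep` from each `cat_as` target rather than from the enclosing entry, which is an assumption based on
the Quebec example above, where each target carries its own `prep`.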
====Container spec canonicalization====
A fully canonicalized container spec for a given location consists of a list of ''canonicalized container objects'',
each with a `key` and `placetype` field. The `key` field should name the canonical key of some other location at a
higher level (e.g. French cities are contained in French departments, which are contained in French regions, which are
contained in France, which is contained in Europe, which is contained in Eurasia, which is contained in the Earth). The
`placetype` field should correspond to the first (canonical) placetype listed for the key in question. The process of
initializing a location spec converts the container spec in `.container` into a canonicalized spec in `.containers` and
removes the spec from `.container`. It works as follows:
# If the `container` field is missing, and there is a group-level `default_container` field, it is used in its place.
For example, none of the Brazilian states listed in `brazil_states` specifies a container, but the group specifies
`default_container = "Brazil"`.
# A single string or canonicalized container object is allowed and made into a one-element list.
# If a list element is a string that did ''not'' come from `default_container`, and there is a group-level
`canonicalize_key_container` field, it is assumed to be a one-argument function and is called on the string to get
a canonicalized container object.
# Any remaining strings are assumed to be countries and are used directly as the `key`, with `placetype` set to
`"country"`.
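The four steps above can be restated as a small standalone function. This is only a sketch for experimenting outside
the module: `canonicalize_containers` and the two group tables are made up for the example, and the real work is done
inside `initialize_spec()`.

```lua
-- Illustrative restatement of the four steps; not the module's own code.
local function canonicalize_containers(spec, group)
	local container = spec.container
	local from_default = false
	if not container then                                    -- step 1: fall back to the group default
		container = group.default_container
		from_default = true
	end
	if not container then
		return nil
	end
	if type(container) == "string" or container.key then     -- step 2: wrap a single value in a list
		container = {container}
	end
	local containers = {}
	for _, cont in ipairs(container) do
		if type(cont) == "string" then
			if group.canonicalize_key_container and not from_default then
				cont = group.canonicalize_key_container(cont)    -- step 3: group-specific canonicalization
			else
				cont = {key = cont, placetype = "country"}       -- step 4: assume a country
			end
		end
		table.insert(containers, cont)
	end
	return containers
end

-- Hypothetical groups: one relying on `default_container`, one with a canonicalization function.
local brazil_like_group = {default_container = "Brazil"}
local canada_like_group = {
	canonicalize_key_container = function(name)
		return {key = name .. ", Canada", placetype = "province"}
	end,
}

local from_default = canonicalize_containers({}, brazil_like_group)
local from_string = canonicalize_containers({container = "Ontario"}, canada_like_group)
-- from_default[1] is {key = "Brazil", placetype = "country"}
-- from_string[1] is {key = "Ontario, Canada", placetype = "province"}
```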
====Alias keys====
Aliases can be provided for canonical keys using ''alias keys''. Alias keys have a very different location spec
structure from canonical keys. This structure does not, in general, have defaults at the group level and is not
initialized using `initialize_spec()`, but is used as-is. The following properties are recognized in an alias location
spec:
* `alias_of`: The canonical key of which this key is an alias. Required.
* `the`: If true, this alias key is preceded by `the` following a preposition. Defaults to the group-level `default_the`
but does not pay attention to the value of `the` for the corresponding canonical key.
* `display`: This is a display alias, meaning that holonyms using the placename corresponding to this alias will be
converted to the placename corresponding to the canonical key when formatting the holonym for display. (Otherwise,
the aliasing applies only to categorization.) If the value is true, the display canonicalization is to the placename
of the canonical key; otherwise, the value should be a key whose corresponding placename is used when display
canonicalizing.
* `placetype`: The placetype of the alias. Rarely needs to be specified as it defaults to the canonical key's placetype,
and if that is unspecified, to the group-level default placetype.
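The following sketch shows what an alias entry might look like next to its canonical key, and how the fallbacks for
`placetype` and `the` described above could be resolved. The group, its keys and the two helper functions are all made
up for illustration and are not taken from the real data tables.

```lua
-- Made-up group for illustration; these keys and values are not taken from the real data tables.
local example_group = {
	default_placetype = "state",
	default_the = false,
	data = {
		["Mexico, Mexico"] = {},
		["State of Mexico, Mexico"] = {alias_of = "Mexico, Mexico", the = true},
	},
}

-- Hypothetical helper: alias placetype, then canonical placetype, then group-level default.
local function alias_placetype(group, alias_key)
	local alias_spec = group.data[alias_key]
	local canonical_spec = group.data[alias_spec.alias_of]
	return alias_spec.placetype or canonical_spec.placetype or group.default_placetype
end

-- Hypothetical helper: the alias's own `the`, else the group default; the canonical key's `the` is ignored.
local function alias_takes_the(group, alias_key)
	local alias_spec = group.data[alias_key]
	if alias_spec.the ~= nil then
		return alias_spec.the
	end
	return group.default_the or false
end

local placetype = alias_placetype(example_group, "State of Mexico, Mexico")  -- "state"
local takes_the = alias_takes_the(example_group, "State of Mexico, Mexico")  -- true
```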
====Location group metadata tables====
As mentioned above, associated with each location group is a ''metadata table'' listing group-level properties. The
metadata table contains two types of keys: group-level defaults (named like the corresponding location-level keys but
preceded by `default_`, e.g. `default_placetype` corresponding to the location-level `placetype` key) and group-only
keys, which are mostly functions. The following are the possible group-only keys:
* `data`: This points to the group data table for the group, as described above.
* `key_to_placename`: This is a function of one argument to transform the location's key (whether canonical or alias)
into the full and elliptical placenames. The difference between full and elliptical placenames is described in more
detail elsewhere in the documentation, but in essence, it applies to keys that include the placetype in them (e.g.
`Phuket Province, Thailand` or `County Mayo, Ireland`), in which case the full placename includes the placetype and
the elliptical placename does not. For keys that do not include the placetype in them (e.g. `Arizona, USA` or
`Gloucestershire, England`), the full and elliptical placenames are identical. Note that neither the full nor the
elliptical placename includes the container in it; hence, for `Phuket Province, Thailand`, the full placename is
`Phuket Province` and the elliptical placename is just `Phuket`. (Note that the full vs. elliptical placename
distinction is intended only for handling cases where the placetype follows or precedes the raw placename and there
is no difference between the two in whether they are normally preceded by `the`. More complex situations, such as
`State of Mexico` (which normally takes `the`) vs. just `Mexico` (which doesn't), or `Islamabad Capital Territory` vs.
just `Islamabad`, should be handled instead by aliases.) The `key_to_placename` function takes one argument, the key,
and returns two arguments, the full and elliptical placenames, respectively. If left undefined, the default is to
chop off anything starting with a comma and return the result as both full and elliptical placename, and if
specifically set to `false`, the key is used directly as both full and elliptical placename. If it needs to be
defined, it is best to use the helper function `make_key_to_placename`, if possible (or
`make_irish_type_key_to_placename` in the case of Ireland and Northern Ireland, where `County` precedes), rather than
rolling your own. In addition, you should use the global `key_to_placename` function (which takes care of the default
implementation and such) rather than directly calling the function in the `key_to_placename` field.
* `placename_to_key`: This is approximately the inverse of `key_to_placename`, transforming a placename (which can be
either in full or elliptical form) into the corresponding key. As with `key_to_placename`, if you need to define this
(generally, when the full and elliptical placenames are different), prefer using `make_placename_to_key` (or
`make_irish_type_placename_to_key` for Ireland and Northern Ireland) to rolling your own. In addition, similarly to
`key_to_placename`, use the global `placename_to_key` function to convert placenames to keys rather than directly
invoking the function in the `placename_to_key` field. If the field is set to `false`, the placename is used unchanged
as the key. Otherwise, the default algorithm works as follows:
*# If the group-level `default_placetype == "city"`, use the placename unchanged as the key.
*# Otherwise, if the group-level `default_container` exists and is a string, append it to the placename after a comma +
space and use the result as the key.
*# Otherwise, if the group-level `default_container` is a canonical container object (an object with `key` and
`placetype` fields), and the `placetype` field is either `country` or `constituent country`, append the `key` field
to the placename after a comma + space and use the result as the key.
*# Otherwise, use the placename unchanged as the key.
* `canonicalize_key_container`: A function of one argument to convert the specified `container` field, when a string,
to canonical form. Described in more detail above under [[#Container spec canonicalization]]. It is preferable to
construct the function using `make_canonicalize_key_container`, if possible, rather than rolling your own.
* `addl_divs`: Additional political divisions appended, for all locations in the group, to the list of divisions derived
from the location-level `divs` or group-level `default_divs` fields to get the final list of divisions for the
location. See the discussion of `divs` above for more details.
]==]
-----------------------------------------------------------------------------------
-- Helper functions --
-----------------------------------------------------------------------------------
--[==[
Throw an error. `fmt` is a format string and the remaining arguments are passed through `mw.dumpObject` and then used to
format the format string as if `fmt:format(...)` were called. In general, callers should use `internal_error` unless the
error was due to bad user input rather than a logic error (which usually isn't the case in deep back-end code like
this).
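A minimal sketch of the calling convention, runnable outside Scribunto. Here `dump` is a stand-in for `mw.dumpObject`
(an assumption made so the example is self-contained), and `process_error` is re-declared locally in simplified form.
Because every argument is converted to a string before formatting, `%s` is the natural format specifier to use.

```lua
-- `dump` is a stand-in for `mw.dumpObject`, which is available only inside Scribunto.
local function dump(value)
	return type(value) == "string" and ("%q"):format(value) or tostring(value)
end

local unpack = unpack or table.unpack  -- `unpack` is a global in Lua 5.1, `table.unpack` in later versions

-- Simplified local copy of the function documented above, for illustration only.
local function process_error(fmt, ...)
	local args = {...}
	for i = 1, select("#", ...) do
		args[i] = dump(args[i])
	end
	error(fmt:format(unpack(args)))
end

-- Callers normally let the error propagate; `pcall` is used here only to inspect the message.
local ok, message = pcall(process_error, "No placetype found in key %s", "Foo")
-- `ok` is false and `message` ends with: No placetype found in key "Foo"
```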
]==]
function export.process_error(fmt, ...)
local args = {...}
for i = 1, select("#", ...) do
args[i] = dump(args[i])
end
return error(string.format(fmt, unpack(args)))
end
--[==[
Throw an internal error (a logic error that should never happen unless there is a bug in the code, as opposed to a user
error triggered by bad input or a system error due to something like running out of memory or hitting a time limit).
`fmt` is a format string and the remaining arguments are passed through `mw.dumpObject` and then used to format the
format string as if `fmt:format(...)` were called.
]==]
function export.internal_error(fmt, ...)
export.process_error("Internal error: " .. fmt, ...)
end
local internal_error = export.internal_error
-- Return whether `list_or_element` (a list of strings, or a single string) "contains" `item` (a string). If
-- `list_or_element` is a list, this returns true if `item` is in the list; otherwise it returns true if `item`
-- equals `list_or_element`.
local function list_or_element_contains(list_or_element, item)
if type(list_or_element) == "table" then
return m_table.contains(list_or_element, item) and true or false
end
return list_or_element == item
end
--[==[
Call the location group's `key_to_placename` function if it exists (see the comment at the top of this module for
the distinction between keys and placenames). Two values are returned, the full and elliptical placenames (e.g. full
`"County Durham"` vs. elliptical `"Durham"`). If `key_to_placename` exists with the value `false`, the key is returned
unchanged as both placenames. If the group does not define `key_to_placename`, both full and elliptical placenames are
computed by chopping off anything starting with a comma.
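The three cases (field undefined, field set to `false`, field set to a function) can be compared side by side with the
sketch below. It re-declares a simplified `key_to_placename` without error checking so that it runs on its own; the
three group tables are hypothetical.

```lua
-- Simplified restatement of the dispatch, for illustration; error checking is omitted.
local function key_to_placename(group, key)
	if group.key_to_placename == false then
		return key, key
	elseif group.key_to_placename then
		return group.key_to_placename(key)
	end
	local chopped = key:gsub(",.*", "")
	return chopped, chopped
end

-- Hypothetical groups, one for each of the three cases.
local default_group = {}
local verbatim_group = {key_to_placename = false}
local thai_like_group = {
	key_to_placename = function(key)
		local full = key:gsub(", Thailand$", "")
		local elliptical = full:gsub(" Province$", "")
		return full, elliptical
	end,
}

local full1, ell1 = key_to_placename(default_group, "Arizona, USA")
local full2, ell2 = key_to_placename(verbatim_group, "Arizona, USA")
local full3, ell3 = key_to_placename(thai_like_group, "Phuket Province, Thailand")
-- full1, ell1: "Arizona", "Arizona"
-- full2, ell2: "Arizona, USA", "Arizona, USA"
-- full3, ell3: "Phuket Province", "Phuket"
```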
]==]
function export.key_to_placename(group, key)
if group.key_to_placename == false then
return key, key
end
if group.key_to_placename then
local full_placename, elliptical_placename = group.key_to_placename(key)
if type(full_placename) ~= "string" then
internal_error("Key %s returned a non-string full placename: %s", key, full_placename)
end
if type(elliptical_placename) ~= "string" then
internal_error("Key %s returned a non-string elliptical placename: %s", key, elliptical_placename)
end
return full_placename, elliptical_placename
end
key = key:gsub(",.*", "")
return key, key
end
--[==[
Call the location group's `placename_to_key` function if it exists (see the comment at the top of this module for
the distinction between keys and placenames) and return the result. If `placename_to_key` exists with the value `false`,
return the placename unchanged. If the group does not define `placename_to_key`, the placename is returned unchanged
when the group's `default_placetype` is `city`; otherwise, if the group defines a `default_container` that is either a
string or a container object whose placetype is `country` or `constituent country`, the container key is appended to
the placename after a comma and a space. In all other cases the placename is returned unchanged.
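A few worked cases of the default behavior (that is, when the group leaves `placename_to_key` undefined) are sketched
below. The function `default_placename_to_key` is a standalone restatement written for this example, and the group
tables are hypothetical.

```lua
-- Standalone restatement of the default algorithm, for illustration only.
local function default_placename_to_key(group, placename)
	if group.default_placetype == "city" then
		return placename
	end
	local defcon = group.default_container
	if type(defcon) == "string" then
		return placename .. ", " .. defcon
	elseif type(defcon) == "table" and (defcon.placetype == "country" or
		defcon.placetype == "constituent country") then
		return placename .. ", " .. defcon.key
	end
	return placename
end

-- Hypothetical groups covering each branch.
local state_group = {default_placetype = "state", default_container = "Brazil"}
local county_group = {default_container = {key = "England", placetype = "constituent country"}}
local city_group = {default_placetype = "city", default_container = "Brazil"}
local nested_group = {default_container = {key = "Ontario, Canada", placetype = "province"}}

local k1 = default_placename_to_key(state_group, "Bahia")             -- "Bahia, Brazil"
local k2 = default_placename_to_key(county_group, "Gloucestershire")  -- "Gloucestershire, England"
local k3 = default_placename_to_key(city_group, "Recife")             -- "Recife"
local k4 = default_placename_to_key(nested_group, "Example")          -- "Example"
```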
]==]
function export.placename_to_key(group, placename)
if group.placename_to_key == false then
return placename
elseif group.placename_to_key then
local key = group.placename_to_key(placename)
if type(key) ~= "string" then
internal_error("Placename %s returned a non-string key: %s", placename, key)
end
return key
elseif group.default_placetype == "city" then
return placename
else
local defcon = group.default_container
if not defcon then
return placename
elseif type(defcon) == "string" then
return placename .. ", " .. defcon
elseif type(defcon) == "table" and (defcon.placetype == "country" or
defcon.placetype == "constituent country") then
return placename .. ", " .. defcon.key
else
return placename
end
end
end
--[==[
Initialize the location spec `spec`, augmenting it with default values taken from `group` if the spec itself doesn't
specify values for the properties. This sets `containers` to a canonicalized list of objects, each with `key` and
`placetype` keys, describing the immediate containers of the location, and erases (sets to nil) the original
non-canonicalized `container` field. (Most locations have only one immediate container but some, e.g. Russia, have more
than one. Containers should be carefully distinguished from category parents. Generally the container is the first
category parent, or the first ``n`` parents if there are ``n`` containers, but there may be additional category parents,
which indicate some sort of relation between the category parent and the location but not necessarily one of
containment.)
This function is idempotent in that nothing happens if called more than once on the same spec.
FIXME: Consider reimplementing this in a more standardly object-oriented way using metatables.
]==]
function export.initialize_spec(group, key, spec)
if spec.initialized then
return
end
local container = spec.container
local containers
local container_from_default
if not container then
container = group.default_container
container_from_default = true
end
if container then
if type(container) == "string" or container.key then
container = {container}
end
containers = {}
for _, cont in ipairs(container) do
if type(cont) == "string" then
if group.canonicalize_key_container and not container_from_default then
cont = group.canonicalize_key_container(cont)
else
cont = {key = cont, placetype = "country"}
end
end
insert(containers, cont)
end
end
spec.containers = containers
spec.container = nil
local function value_with_default(val, default_val)
if val == nil then
return default_val
else
return val
end
end
local function set_or_default(prop)
spec[prop] = value_with_default(spec[prop], group["default_" .. prop])
end
set_or_default("placetype")
if not spec.placetype then
internal_error("No placetype found in key %s for spec %s or in group `default_placetype`", key, spec)
end
set_or_default("divs")
spec.addl_divs = group.addl_divs
for _, prop in ipairs {
"keydesc",
"fulldesc",
"addl_parents",
"overriding_bare_label_parents",
"bare_category_parent_type",
"wp",
"wpcat",
"commonscat",
"british_spelling",
"the",
"no_container_cat",
"no_container_parent",
"no_generic_place_cat",
"no_check_holonym_mismatch",
"no_auto_augment_container",
"no_include_container_in_desc",
"is_city",
"is_former_place",
} do
set_or_default(prop)
end
-- `default_placetype == "city"` is correct; if `default_placetype` has something else like `prefecture-level city`
-- as the canonical placetype but also lists `city` (as Chinese prefecture-level cities do), don't mark as
-- is_city.
spec.is_city = value_with_default(spec.is_city, group.default_placetype == "city")
spec.initialized = true
end
--[=[
Given a location group, key and possible placetypes that the placename must match, check if the key exists in the group
with at least one of the group's key's placetypes matching one of the passed-in placetypes. If so, return two values:
the group key (which potentially could differ from the passed-in key due to aliases) and the corresponding spec object,
which (as with all functions that return spec objects) has been initialized using `initialize_spec()` (i.e. default
property values have been copied from the group into the spec, if the spec doesn't itself specify a value for the
property in question).
`alias_resolution` controls how aliases are resolved. Normally, both display and category aliases are followed, and
the returned key will reflect the canonical location key. However, if `alias_resolution` is {"none"}, no alias following
happens. In that case, if the key specifies an alias, the spec for the alias rather than the spec for the canonical
location is returned, and importantly, it is returned uninitialized, meaning that properties from the group are not
copied into the spec. (If the key specifies a canonical location, its spec is returned initialized, as in the normal
case where `alias_resolution` is unspecified.) The caller needs to check whether the returned spec is an alias by
looking for an `alias_of` property. If `alias_resolution` is {"display"}, the behavior is the same as for {"none"}
except that if the alias contains a setting `display = true`, the returned key will reflect the canonical location key,
and if the alias contains a setting `display = ``string`` `, the returned key will reflect that string.
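The effect of `alias_resolution` on the returned key can be summarized with the sketch below. The data table and the
helper `returned_key` are made up for illustration; the helper models only which key comes back, not the placetype
checks or the spec initialization.

```lua
-- Made-up data table: one canonical key and three aliases with different `display` settings.
local data = {
	["Canonical"] = {placetype = "city"},
	["Category alias"] = {alias_of = "Canonical"},
	["Display alias"] = {alias_of = "Canonical", display = true},
	["Redirected alias"] = {alias_of = "Canonical", display = "Other name"},
}

-- Hypothetical helper: which key is returned for `key` under the given `alias_resolution`.
local function returned_key(key, alias_resolution)
	local spec = data[key]
	if not spec.alias_of then
		return key
	elseif alias_resolution == "none" then
		return key
	elseif alias_resolution == "display" then
		if spec.display == true then
			return spec.alias_of
		end
		return spec.display or key
	end
	return spec.alias_of  -- nil or "all": every alias is followed
end

local r1 = returned_key("Category alias")                -- "Canonical"
local r2 = returned_key("Category alias", "display")     -- "Category alias"
local r3 = returned_key("Display alias", "display")      -- "Canonical"
local r4 = returned_key("Redirected alias", "display")   -- "Other name"
local r5 = returned_key("Display alias", "none")         -- "Display alias"
```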
This is a low-level function meant for internal use; external callers should generally use `get_matching_location` (for
internally-derived locations), `find_matching_holonym_location` (for externally-derived locations) or
`find_canonical_key` (for known-canonical locations where the placetype isn't known).
]=]
local function find_matching_key_in_group(group, placetypes, key, alias_resolution)
if alias_resolution ~= nil and alias_resolution ~= "none" and alias_resolution ~= "display" and
alias_resolution ~= "all" then
internal_error("Bad value for 'alias_resolution': %s", alias_resolution)
end
local spec = group.data[key]
if not spec then
return nil
end
local function check_correct_placetype(placetype)
if type(placetype) == "table" then
for _, pt in ipairs(placetype) do
if list_or_element_contains(placetypes, pt) then
return true
end
end
return false
else
return list_or_element_contains(placetypes, placetype)
end
end
if spec.alias_of then
local resolved_key = spec.alias_of
local resolved_spec = group.data[resolved_key]
if not resolved_spec then
internal_error("Key %s is an alias of %s, which doesn't exist", key, resolved_key)
elseif resolved_spec.alias_of then
internal_error("Key %s is an alias of %s, which is itself an alias; indirect aliasing not allowed",
key, resolved_key)
end
if alias_resolution == "none" or alias_resolution == "display" then
-- We could be working with non-initialized/defaulted spec, since we're pulling it directly from the group.
local placetype = spec.placetype or resolved_spec.placetype or group.default_placetype
if not placetype then
internal_error("No placetype found for key %s in any of spec %s, alias-resolved spec %s or in group " ..
"`default_placetype`", key, spec, resolved_spec)
end
if not check_correct_placetype(placetype) then
return nil
end
if alias_resolution == "display" then
if spec.display == true then
key = resolved_key
elseif spec.display then
key = spec.display
end
end
return key, spec
end
key = resolved_key
spec = resolved_spec
end
-- We could be working with non-initialized/defaulted spec, since we're pulling it directly from the group.
local placetype = spec.placetype or group.default_placetype
if not placetype then
internal_error("No placetype found for key %s in spec %s or group `default_placetype`", key, spec)
end
if not check_correct_placetype(placetype) then
return nil
end
export.initialize_spec(group, key, spec)
return key, spec
end
--[=[
Given a location group, placename and possible placetypes that the placename must match, check if the placename exists
in the group with at least one of the placetypes of the key in the group that corresponds to the placename matching one
of the passed-in placetypes. If so, return two values: the key corresponding to the passed-in placename and the
corresponding spec object. This is similar to `find_matching_key_in_group()` but works with placenames rather than keys.
`alias_resolution` is as in `find_matching_key_in_group()`.
This is a low-level function meant for internal use; external callers should generally use `get_matching_location` (for
internally-derived locations), `find_matching_holonym_location` (for externally-derived locations) or
`find_canonical_key` (for known-canonical locations where the placetype isn't known).
]=]
local function find_matching_placename_in_group(group, placetypes, placename, alias_resolution)
local key = export.placename_to_key(group, placename)
return find_matching_key_in_group(group, placetypes, key, alias_resolution)
end
--[==[
If `key` is a canonical known location key (i.e. not an alias), return the corresponding group and initialized spec.
If no such key exists, return {nil}. This throws an internal error if two locations with the same key are found.
]==]
function export.find_canonical_key(key)
local found_locations = {}
for _, group in ipairs(export.locations) do
local spec = group.data[key]
if not spec then
-- do nothing
elseif spec.alias_of then
mw.log(("Skipping alias '%s' of canonical '%s'"):format(key, spec.alias_of))
else
insert(found_locations, {group, spec})
end
end
if not found_locations[1] then
return nil
elseif found_locations[2] then
internal_error("Found multiple matching locations for canonical key %s: %s", key, found_locations)
else
local group, spec = unpack(found_locations[1])
export.initialize_spec(group, key, spec)
return group, spec
end
end
--[==[
Iterator that returns all locations matching a given description, where the description consists of either a placename
or a key along with a list of possible placetypes. Usually there will be at most one such location. The iterator
returns three values at each iteration: the location group, canonical key by which the location is known and the spec
object describing the location. `data` contains the following possible fields:
* `placetypes`: A list of possible placetypes, one of which must match one of the location's placetypes; or a string
specifying a placetype, which must match one of the location's placetypes. This must be specified.
* `placename`: The placename of the location. Either this or `key` must be specified.
* `key`: The key of the location. Either this or `placename` must be specified.
* `alias_resolution`: If specified, it behaves the same as for `find_matching_key_in_group`.
The spec is normally initialized using `initialize_spec()` prior to it being returned (but may not be if
`alias_resolution` is given and the specified key or placename is an alias; see the documentation for
`find_matching_key_in_group`).
]==]
function export.iterate_matching_location(data)
local i = 0
local n = #export.locations
return function()
while true do
i = i + 1
if i > n then
break
end
local group = export.locations[i]
local key, spec
if data.placename then
key, spec = find_matching_placename_in_group(group, data.placetypes, data.placename,
data.alias_resolution)
else
if not data.key then
internal_error("'.placename' or '.key' must be defined: %s", data)
end
key, spec = find_matching_key_in_group(group, data.placetypes, data.key, data.alias_resolution)
end
if key then
return group, key, spec
end
end
end
end
--[==[
Return the location matching a given description, where the description consists of either a placename or a key along
with a list of possible placetypes. This is similar to `iterate_matching_location()` but throws an internal error if
there is not exactly one location found; as such, it is for use with internally specified locations (such as the
containers of known locations) rather than externally specified locations, which may not match a known location and in
some cases may match multiple known locations. For finding an externally specified location, consider using
`find_matching_holonym_location`, which returns {nil} rather than throwing an error if the location isn't found, but
also (more importantly) checks to make sure there are no conflicting holonyms among the user-specified holonyms (e.g.
{{tl|place|city|s/Delaware|c/USA|t=Newark}} will not match the known location `Newark` (in New Jersey, not Delaware)).
]==]
function export.get_matching_location(data)
local all_found = {}
for group, key, spec in export.iterate_matching_location(data) do
insert(all_found, {group, key, spec})
end
if not all_found[1] then
internal_error("Couldn't find matching location for data %s", data)
elseif all_found[2] then
internal_error("Found multiple matching locations for data %s: %s", data, all_found)
else
return unpack(all_found[1])
end
end
--[==[
Successively iterate over a location's containers, and then the containers of those containers, etc. Keep in mind that
locations may have multiple containers (e.g. Russia has both Europe and Asia as containers, and both Europe and Asia
have Eurasia as their container). A given container will never be returned twice (e.g. in the case where a specific
location A has locations B and C as containers, and B has C as its container, C will not be returned twice). An
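internal error happens if a container loop is suspected (more than 10 levels of containers). The return value is an
iterator function; each call returns a list of the location objects found at the next level up, each of which contains
`group`, `key` and `spec` fields, or {nil} when there are no more containers.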
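The level-by-level traversal with duplicate suppression can be tried out with the toy sketch below. The table
`containers_of` and the function `iterate_container_levels` are invented for the example and work on bare keys; the
real function looks up each container through `get_matching_location` and additionally guards against loops.

```lua
-- Toy containment data for illustration; not the module's real data structures.
local containers_of = {
	Russia = {"Europe", "Asia"},
	Europe = {"Eurasia"},
	Asia = {"Eurasia"},
	Eurasia = {"Earth"},
	Earth = {},
}

-- Return an iterator that yields one list of container keys per level, never repeating a key.
local function iterate_container_levels(start_key)
	local keys_seen = {[start_key] = true}
	local last_level = {start_key}
	return function()
		local next_level = {}
		for _, key in ipairs(last_level) do
			for _, container_key in ipairs(containers_of[key] or {}) do
				if not keys_seen[container_key] then
					keys_seen[container_key] = true
					table.insert(next_level, container_key)
				end
			end
		end
		if not next_level[1] then
			return nil  -- no more containers: ends a generic `for` loop
		end
		last_level = next_level
		return next_level
	end
end

local levels = {}
for level in iterate_container_levels("Russia") do
	table.insert(levels, table.concat(level, ", "))
end
-- levels is now {"Europe, Asia", "Eurasia", "Earth"}
```

Note that `Eurasia` appears only once even though it is reachable through both `Europe` and `Asia`.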
]==]
function export.iterate_containers(group, key, spec)
local keys_seen = {}
keys_seen[key] = true
local iterations = 0
local last_iteration_containers = {{group = group, key = key, spec = spec}}
return function()
iterations = iterations + 1
if iterations > 10 then
internal_error("Probable loop in containers when processing key %s", key)
end
local next_iteration_containers = {}
for _, location in ipairs(last_iteration_containers) do
local containers = location.spec.containers
if containers then
for _, container in ipairs(containers) do
local container_group, container_key, container_spec = export.get_matching_location {
placetypes = container.placetype,
key = container.key,
}
if not keys_seen[container_key] then
insert(next_iteration_containers, {
group = container_group, key = container_key, spec = container_spec
})
keys_seen[container_key] = true
end
end
end
end
if not next_iteration_containers[1] then
return nil
end
last_iteration_containers = next_iteration_containers
return next_iteration_containers
end
end
--[==[
Given a placename, convert it into a link (two-part if `display_form` is given and differs from `placename`) and add
`"the "` to the beginning if called for in `spec`.
]==]
function export.construct_linked_placename(spec, placename, display_form)
local linked_placename = display_form and placename ~= display_form and ("[[%s|%s]]"):format(placename,
display_form) or ("[[%s]]"):format(placename)
if spec.the then
linked_placename = "the " .. linked_placename
end
return linked_placename
end
--[=[
This is typically used to define `key_to_placename`. It generates a function that chops off parts of a string (a
location key), typically at the end, in order to get the full and elliptical versions of a placename. (See the
documentation above for `key_to_placename` under "Location group metadata tables" for the difference between full and
elliptical placenames.) `container_patterns` is a Lua pattern or a list of possible patterns matching the container at
the end of the key, which will be used to remove that container. If multiple patterns are specified, each one is tried
until one matches. If `container_patterns` is omitted, this part of the process is skipped. The resulting string becomes
the full
placename. If `divtype_patterns` is specified, it is likewise either a Lua pattern or list of possible patterns to match
and remove the political division affixed onto the end (or possibly the beginning) of the key in the keys of certain
countries (such as South Korean and North Korean counties, which include the word "County" in the key). The resulting
chopped string becomes the elliptical placename. If `divtype_patterns` is omitted, this part of the process is skipped
and the full and elliptical placenames are the same.
Typical usage is as follows:
```
key_to_placename = make_key_to_placename(", England$"),
```
or (when the political division is part of the key)
```
key_to_placename = make_key_to_placename(", South Korea$", " County$")
```
]=]
local function make_key_to_placename(container_patterns, divtype_patterns)
if type(container_patterns) == "string" then
container_patterns = {container_patterns}
end
if type(divtype_patterns) == "string" then
divtype_patterns = {divtype_patterns}
end
return function(key)
local full_placename = key
if container_patterns then
for _, container_pattern in ipairs(container_patterns) do
local nsubs
full_placename, nsubs = full_placename:gsub(container_pattern, "")
if nsubs > 0 then
break
end
end
end
local elliptical_placename = full_placename
if divtype_patterns then
for _, divtype_pattern in ipairs(divtype_patterns) do
local nsubs
elliptical_placename, nsubs = elliptical_placename:gsub(divtype_pattern, "")
if nsubs > 0 then
break
end
end
end
return full_placename, elliptical_placename
end
end
--[=[
This is typically used to define `placename_to_key`. It generates a function that appends a string to the end of a given
placename to get the key (see the definition of `placename_to_key` above in the documentation under "Location group
metadata tables"). Optional `divtype_suffix` is a raw string (which should not contain hyphens or other characters that have
special meaning in Lua patterns) to be appended first to the placename; if already present at the end, it is not
appended. `container_suffix` is then added in the same fashion if given. Typical usage is like this:
```
placename_to_key = make_placename_to_key(", England")
```
(which will convert e.g. `"Hampshire"` into `"Hampshire, England"`)
or
```
placename_to_key = make_placename_to_key(", South Korea", " County")
```
(which will convert e.g. `"Gangwon"` or `"Gangwon County"` into `"Gangwon County, South Korea"`).
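The restriction on hyphens exists because the "already present at the end" check treats `divtype_suffix` as a Lua
pattern. The sketch below shows the failure mode with a made-up suffix, together with a plain-string tail comparison
that would avoid it; `ends_with` is a hypothetical helper, not something this module defines.

```lua
local suffix = " Sub-District"                 -- made-up suffix containing a hyphen
local key = "Example Sub-District"

-- Interpreted as a pattern, `b-` means "zero or more `b`, as few as possible", so the literal hyphen is never
-- matched and the suffix is wrongly considered absent.
local pattern_match = key:find(suffix .. "$")  -- nil

-- A plain comparison of the tail treats every character literally.
local function ends_with(str, tail)
	return tail == "" or str:sub(-#tail) == tail
end
local plain_match = ends_with(key, suffix)     -- true
```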
]=]
local function make_placename_to_key(container_suffix, divtype_suffix)
return function(placename)
local key = placename
if divtype_suffix then
if not key:find(divtype_suffix .. "$") then
key = key .. divtype_suffix
end
end
if container_suffix then
key = key .. container_suffix
end
return key
end
end
--[=[
This is typically used to define `canonicalize_key_container`, which converts a container as specified in the location
data into the canonical form containing both the full container key and its placetype. It generates a function to do
the canonicalization of a given container. If the container is a string, `suffix` is appended onto the string (use {nil}
or {""} if there is no suffix to append), and the placetype is set to `placetype`. Otherwise the container is left
as-is. Typical usage is like this:
```
canonicalize_key_container = make_canonicalize_key_container(", Canada", "province")
```
which will convert e.g. `"Ontario"` into `{key = "Ontario, Canada", placetype = "province"}`.
]=]
local function make_canonicalize_key_container(suffix, placetype)
return function(container)
if type(container) == "string" then
return {key = container .. (suffix or ""), placetype = placetype}
else
return container
end
end
end
-----------------------------------------------------------------------------------
-- Top-level tables --
-----------------------------------------------------------------------------------
export.continents = {
	["Earth"] = {the = true, placetype = "planet", addl_parents = {"nature"},
		fulldesc = "=the planet [[Earth]] and the features found on it"},
	["Africa"] = {placetype = "continent", container = {key = "Earth", placetype = "planet"}},
	["America"] = {placetype = {"supercontinent", "continent"}, container = {key = "Earth", placetype = "planet"},
		keydesc = "[[America]], in the sense of [[North America]] and [[South America]] combined",
		wp = "Americas"},
	["Americas"] = {alias_of = "America", the = true},
	["North America"] = {placetype = "continent", container = {key = "America", placetype = "supercontinent"}},
	["Caribbean"] = {the = true, placetype = {"continental region", "region"}, container = {key = "North America", placetype = "continent"}},
	["Central America"] = {placetype = {"continental region", "region"}, container = {key = "North America", placetype = "continent"}},
	["South America"] = {placetype = "continent", container = {key = "America", placetype = "supercontinent"}},
	["Antarctica"] = {placetype = "continent", container = {key = "Earth", placetype = "planet"},
		fulldesc = "=the territory of [[Antarctica]]"},
	["Eurasia"] = {placetype = {"supercontinent", "continent"}, container = {key = "Earth", placetype = "planet"},
		keydesc = "[[Eurasia]], i.e. [[Europe]] and [[Asia]] together"},
	["Asia"] = {placetype = "continent", container = {key = "Eurasia", placetype = "supercontinent"}},
	["Europe"] = {placetype = "continent", container = {key = "Eurasia", placetype = "supercontinent"}},
	["Oceania"] = {placetype = "continent", container = {key = "Earth", placetype = "planet"}},
	["Melanesia"] = {placetype = {"continental region", "region"}, container = {key = "Oceania", placetype = "continent"}},
	["Micronesia"] = {placetype = {"continental region", "region"}, container = {key = "Oceania", placetype = "continent"}},
	["Polynesia"] = {placetype = {"continental region", "region"}, container = {key = "Oceania", placetype = "continent"}},
}
export.continents_group = {
default_overriding_bare_label_parents = {}, -- container parents should be used
default_divs = {{type = "countries", prep = "in"}},
-- It's enough to mention the first-level continent or continent group. It seems excessive to write e.g.
-- "El Salvador, a country in Central America, a continental region in North America, a continent in America, ...".
default_no_include_container_in_desc = true,
default_no_container_cat = true,
default_no_container_parent = true,
default_no_auto_augment_container = true,
default_no_generic_place_cat = true,
-- French Guiana is in France but not in Europe, which should not be an issue, so don't check holonym mismatches at
-- this level. We also run into problems with supercontinents, which have "continent" as the fallback and cause
-- mismatches.
default_no_check_holonym_mismatch = true,
data = export.continents,
}
-- Countries: including those with partial recognition that are normally considered countries (e.g. Kosovo, Taiwan).
export.countries = {
= {container = "Asia", divs = {"provinces", "districts"}},
= {container = "Europe", divs = {"counties", "municipalities", "communes",
{type = "administrative units", cat_as = "communes"},
}, british_spelling = true},
= {container = "Africa", divs = {"provinces", "communes", "districts", "municipalities"}},
= {container = "Europe", divs = {"parishes"}, british_spelling = true},
= {container = "Africa", divs = {"provinces", "municipalities"}},
= {container = "Caribbean", divs = {"provinces"}, british_spelling = true},
= {container = "South America", divs = {"provinces", "departments", "municipalities"}},
= {container = {"Europe", "Asia"}, divs = {"provinces", "districts"}, british_spelling = true},
= {alias_of = "Armenia", the = true}, -- differs in "the"
-- Both a country and continent
= {container = "Oceania", divs = {
{type = "states", cat_as = "states and territories"},
{type = "territories", cat_as = "states and territories"},
{type = "ABBREVIATION_OF states", cat_as = "abbreviations of states and territories"},
{type = "ABBREVIATION_OF territories", cat_as = "abbreviations of states and territories"},
"local government areas", "dependent territories",
}, british_spelling = true},
= {container = "Europe", divs = {"states", "districts", "municipalities"}, british_spelling = true},
= {container = {"Europe", "Asia"}, divs = {"districts", "municipalities"}, british_spelling = true},
= {the = true, container = "Caribbean", divs = {"districts"}, british_spelling = true, wp = "The %l"},
= {container = "Asia", divs = {"governorates"}},
= {container = "Asia", divs = {"divisions", "districts", "municipalities"}, british_spelling = true},
= {container = "Caribbean", divs = {"parishes"}, british_spelling = true},
= {container = "Europe", divs = {"regions", "districts"}, british_spelling = true},
= {container = "Europe", divs = {"regions", "provinces", "municipalities"}, british_spelling = true},
= {container = "Central America", divs = {"districts"}, british_spelling = true},
= {container = "Africa", divs = {"departments", "communes"}},
= {container = "Asia", divs = {"districts", "gewogs"}},
= {container = "South America", divs = {"provinces", "departments", "municipalities"}},
= {container = "Europe", divs = {"entities", "cantons", "municipalities"}, british_spelling = true},
= {alias_of = "Bosnia and Herzegovina", display = true},
= {alias_of = "Bosnia and Herzegovina", display = true},
= {container = "Africa", divs = {"districts", "subdistricts"}, british_spelling = true},
= {container = "South America", divs = {
"states", "municipalities", "macroregions",
{type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"},
}},
= {container = "Asia", divs = {"districts", "mukims"}, british_spelling = true},
= {container = "Europe", divs = {"provinces", "municipalities"}, british_spelling = true},
= {container = "Africa", divs = {"regions", "departments", "provinces"}},
= {container = "Africa", divs = {"provinces", "communes"}},
= {container = "Asia", divs = {"provinces", "districts"}},
= {container = "Africa", divs = {"regions", "departments"}},
= {container = "North America", divs = {
{type = "provinces", cat_as = "provinces and territories"},
{type = "territories", cat_as = "provinces and territories"},
{type = "ABBREVIATION_OF provinces", cat_as = "abbreviations of provinces and territories"},
{type = "ABBREVIATION_OF territories", cat_as = "abbreviations of provinces and territories"},
"counties", "districts", "municipalities", "regional municipalities",
"rural municipalities", "parishes",
-- Don't change the following to something more politically correct (e.g. "First Nations reserves") until/unless
-- the Canadian government makes a similar switch (and note that as of Apr 18 2025, the Wikipedia article is
-- still at [[w:Indian reserve]]).
"Indian reserves",
"census divisions",
{type = "townships", prep = "in"},
},
british_spelling = true},
= {container = "Africa", divs = {"municipalities", "parishes"}},
= {the = true, container = "Africa", divs = {"prefectures", "subprefectures"}},
= {container = "Africa", divs = {"regions", "departments"}},
= {container = "South America", divs = {"regions", "provinces", "communes"}},
= {container = "Asia", divs = {
{type = "provinces", cat_as = "provinces and autonomous regions"},
{type = "autonomous regions", cat_as = "provinces and autonomous regions"},
"special administrative regions", "prefectures", "prefecture-level cities",
{type = "counties", cat_as = "counties and county-level cities"},
{type = "county-level cities", cat_as = "counties and county-level cities"},
-- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities.
"districts", "subdistricts", "townships", "municipalities",
{type = "direct-administered municipalities", cat_as = "municipalities"},
}},
= {alias_of = "China", the = true}, -- differs in "the"
= {container = "South America", divs = {"departments", "municipalities"}},
= {the = true, container = "Africa", divs = {"autonomous islands"}},
= {container = "Central America", divs = {"provinces", "cantons"}},
= {container = "Europe", divs = {"counties", "municipalities"}, british_spelling = true},
= {container = "Caribbean", divs = {"provinces", "municipalities"}},
= {container = {"Europe", "Asia"}, divs = {"districts"}, british_spelling = true},
= {the = true, container = "Europe", divs = {"regions", "districts", "municipalities"}, british_spelling = true},
= {alias_of = "Czech Republic"}, -- differs in "the"
= {the = true, container = "Africa", divs = {"provinces", "territories"}},
= {alias_of = "Democratic Republic of the Congo", display = true, the = true},
= {container = "Europe", divs = {"regions", "municipalities", "dependent territories"},
british_spelling = true,
-- Wikipedia separates ] (constituent country) from ] (country)
},
= {container = "Africa", divs = {"regions", "districts"}},
= {container = "Caribbean", divs = {"parishes"}, british_spelling = true},
= {the = true, container = "Caribbean", divs = {"provinces", "municipalities"},
keydesc = "the ], the country that shares the ] island of ] with ]"},
= {container = "Asia", divs = {"municipalities"}, wp = "Timor-Leste"},
= {alias_of = "East Timor", display = true},
= {container = "South America", divs = {"provinces", "cantons"}},
= {container = "Africa", divs = {"governorates", "regions"}, british_spelling = true},
= {container = "Central America", divs = {"departments", "municipalities"}},
= {container = "Africa", divs = {"provinces"}},
= {container = "Africa", divs = {"regions", "subregions"}},
= {container = "Europe", divs = {"counties", "municipalities"}, british_spelling = true},
= {container = "Africa", british_spelling = true},
= {alias_of = "Eswatini", display = true},
= {container = "Africa", divs = {"regions", "zones"}},
= {the = true, container = "Micronesia", divs = {"states"}},
= {alias_of = "Federated States of Micronesia"},
= {container = "Melanesia", divs = {"divisions", "provinces"}, british_spelling = true},
= {container = "Europe", divs = {"regions", "municipalities"}, british_spelling = true},
= {container = "Europe", divs = {"regions", "cantons", "collectivities",
"communes",
{type = "municipalities", cat_as = "communes"},
"departments",
{type = "prefectures", cat_as = {"prefectures", "departmental capitals"}},
{type = "French prefectures", cat_as = {"prefectures", "departmental capitals"}},
"dependent territories", "territories", "provinces",
}, british_spelling = true},
= {container = "Africa", divs = {"provinces", "departments"}},
= {the = true, container = "Africa", divs = {"divisions", "districts"}, british_spelling = true, wp = "The %l"},
= {container = {"Europe", "Asia"}, divs = {"regions", "districts"},
keydesc = "the country of ], in ]", british_spelling = true, wp = "%l (country)"},
= {container = "Europe", divs = {"states", "municipalities", "districts"}, british_spelling = true},
= {container = "Africa", divs = {"regions", "districts"}, british_spelling = true},
= {container = "Europe", divs = {"regions", "regional units", "municipalities",
{type = "peripheries", cat_as = {"regions"}},
}, british_spelling = true},
= {container = "Caribbean", divs = {"parishes"}, british_spelling = true},
= {container = "Central America", divs = {"departments", "municipalities"}},
= {container = "Africa", divs = {"regions", "prefectures"}},
= {container = "Africa", divs = {"regions"}},
= {container = "South America", divs = {"regions"}, british_spelling = true},
= {container = "Caribbean", divs = {"departments", "arrondissements"}},
= {container = "Central America", divs = {"departments", "municipalities"}},
= {container = "Europe", divs = {"counties", "districts"}, british_spelling = true},
= {container = "Europe", divs = {"regions", "municipalities", "counties"}, british_spelling = true},
= {container = "Asia", divs = {
{type = "states", cat_as = "states and union territories"},
{type = "union territories", cat_as = "states and union territories"},
{type = "ABBREVIATION_OF states", cat_as = "abbreviations of states and union territories"},
{type = "ABBREVIATION_OF union territories", cat_as = "abbreviations of states and union territories"},
"divisions", "districts", "municipalities",
}, british_spelling = true},
= {container = "Asia", divs = {"regencies", "provinces",
{type = "ABBREVIATION_OF provinces", cat_as = "abbreviations of provinces"},
}},
= {container = "Asia", divs = {"provinces", "counties"}},
= {container = "Asia", divs = {"governorates", "districts"}},
= {container = "Europe", addl_parents = {"British Isles"},
divs = {"counties", "districts", "provinces"}, british_spelling = true, wp = "Republic of %l"},
= {alias_of = "Ireland", the = true}, -- differs in "the"
= {container = "Asia", divs = {"districts"}},
= {container = "Europe", divs = {
"regions", "provinces", "metropolitan cities", "municipalities",
{type = "autonomous regions", cat_as = "regions"},
}, british_spelling = true},
= {container = "Africa", divs = {"districts", "regions"}},
-- We should really be using Ivory Coast (common name) but there are political ramifications to the use of
-- Côte d'Ivoire so don't make it a display alias.
= {alias_of = "Ivory Coast"},
= {container = "Caribbean", divs = {"parishes"}, british_spelling = true},
= {container = "Asia", divs = {"prefectures", "subprefectures", "municipalities"}},
= {container = "Asia", divs = {"governorates"}},
= {container = {"Asia", "Europe"}, divs = {"regions", "districts"}},
= {container = "Africa", divs = {"counties"}, british_spelling = true},
= {container = "Micronesia", british_spelling = true},
= {container = "Europe", british_spelling = true},
= {container = "Asia", divs = {"governorates", "areas"}},
= {container = "Asia", divs = {"regions", "districts"}},
= {container = "Asia", divs = {"provinces", "districts"}},
= {container = "Europe", divs = {"municipalities"}, british_spelling = true},
= {container = "Asia", divs = {"governorates", "districts"}},
= {container = "Africa", divs = {"districts"}, british_spelling = true},
= {container = "Africa", divs = {"counties", "districts"}},
= {container = "Africa", divs = {"districts", "municipalities"}},
= {container = "Europe", divs = {"municipalities"}, british_spelling = true},
= {container = "Europe", divs = {"counties", "municipalities"}, british_spelling = true},
= {container = "Europe", divs = {"cantons", "districts"}, british_spelling = true},
= {container = "Africa", divs = {"regions", "districts"}},
= {container = "Africa", divs = {"regions", "districts"}, british_spelling = true},
= {container = "Asia", divs = {"states", "federal territories", "districts"}, british_spelling = true},
= {the = true, container = "Asia", divs = {"provinces", "administrative atolls"}, british_spelling = true},
= {container = "Africa", divs = {"regions", "cercles"}},
= {container = "Europe", divs = {"regions", "local councils"}, british_spelling = true},
= {the = true, container = "Micronesia", divs = {"municipalities"}},
= {container = "Africa", divs = {"regions", "departments"}},
= {container = "Africa", divs = {"districts"}, british_spelling = true},
= {container = "North America", addl_parents = {"Central America"}, divs = {
"states", "municipalities",
{type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"},
}},
= {container = "Europe", divs = {
{type = "districts", cat_as = "districts and autonomous territorial units"},
{type = "autonomous territorial units", cat_as = "districts and autonomous territorial units"},
"communes", "municipalities",
}, british_spelling = true},
= {placetype = {"city-state", "country"}, container = "Europe",
-- We want the first placetype to be 'city-state' so the description of Monaco says it's a city-state, but we
-- want its parent to be "countries in Europe".
bare_category_parent_type = {type = "countries", prep = "in"},
is_city = true, british_spelling = true},
= {container = "Asia", divs = {"provinces", "districts"}},
= {container = "Europe", divs = {"municipalities"}},
= {container = "Africa", divs = {"regions", "prefectures", "provinces"}},
= {container = "Africa", divs = {"provinces", "districts"}},
= {container = "Asia",
divs = {"regions", "states", "union territories",
{type = "self-administered zones", cat_as = "self-administered areas"},
{type = "self-administered divisions", cat_as = "self-administered areas"},
"districts"}},
= {alias_of = "Myanmar"}, -- not display-canonicalizing; has political connotations
= {container = "Africa", divs = {"regions", "constituencies"}, british_spelling = true},
= {container = "Micronesia", divs = {"districts"}, british_spelling = true},
= {container = "Asia", divs = {"provinces", "districts"}},
= {the = true, placetype = {"country", "constituent country"}, container = "Europe",
divs = {"provinces", "municipalities",
{type = "FORMER municipalities", cat_as = "former municipalities"},
"dependent territories", "constituent countries"}, british_spelling = true,
-- Wikipedia separates [[w:Netherlands]] (constituent country) from [[w:Kingdom of the Netherlands]]
-- (country)
},
= {container = "Polynesia", divs = {"regions", "dependent territories", "territorial authorities"},
british_spelling = true},
= {container = "Central America", divs = {"departments", "municipalities"}},
= {container = "Africa", divs = {"regions", "departments"}},
= {container = "Africa", divs = {
"states",
-- Categorize the Federal Capital Territory as a state because there's only one of it; we could categorize
-- everything under 'states and territories' but that seems a bit pointless.
{type = "federal territories", cat_as = "states"},
"local government areas",
}, british_spelling = true},
= {container = "Asia", addl_parents = {"Korea"}, divs = {"provinces", "counties"}},
= {container = "Europe", divs = {"regions", "municipalities"}, british_spelling = true},
= {alias_of = "North Macedonia", display = true},
= {alias_of = "North Macedonia", the = true}, -- differs in "the"
= {alias_of = "North Macedonia", the = true}, -- differs in "the"
= {container = "Europe",
divs = {"counties", "municipalities", "dependent territories", "districts", "unincorporated areas"},
british_spelling = true},
= {container = "Asia", divs = {"governorates", "provinces"}},
= {container = "Asia", divs = {
{type = "provinces", cat_as = "provinces and territories"},
{type = "administrative territories", cat_as = "provinces and territories"},
{type = "federal territories", cat_as = "provinces and territories"},
{type = "territories", cat_as = "provinces and territories"},
"divisions", "districts",
}, british_spelling = true},
= {container = "Micronesia", divs = {"states"}},
= {container = "Asia", divs = {"governorates"}},
= {alias_of = "Palestine", the = true}, -- differs in "the"
= {container = "Central America", divs = {"provinces", "districts"}},
= {container = "Melanesia", divs = {"provinces", "districts"}, british_spelling = true},
= {container = "South America", divs = {"departments", "districts"}},
= {container = "South America", divs = {"regions", "provinces", "districts"}},
= {the = true, container = "Asia", divs = {"regions", "provinces", "districts", "municipalities", "barangays"}},
= {divs = {"voivodeships", "counties",
{type = "Polish colonies", cat_as = {{type = "villages", prep = "in"}}},
}, container = "Europe", british_spelling = true},
= {container = "Europe", divs = {
{type = "autonomous regions", cat_as = "districts and autonomous regions"},
{type = "districts", cat_as = "districts and autonomous regions"},
"provinces", "municipalities"}, british_spelling = true},
= {container = "Asia", divs = {"municipalities", "zones"}},
= {the = true, container = "Africa", divs = {"departments", "districts"}},
= {alias_of = "Republic of the Congo", display = true, the = true},
= {container = "Europe", divs = {
"regions", "counties", "communes",
{type = "ABBREVIATION_OF counties", cat_as = "abbreviations of counties"},
}, british_spelling = true},
= {container = {"Europe", "Asia"}, divs = {
"federal subjects", "republics", "autonomous oblasts", "autonomous okrugs", "oblasts", "krais", "federal cities",
"districts", "federal districts"},
british_spelling = true},
= {container = "Africa", divs = {"provinces", "districts"}},
= {container = "Caribbean", divs = {"parishes"}, british_spelling = true},
= {container = "Caribbean", divs = {"districts"}, british_spelling = true},
= {container = "Caribbean", divs = {"parishes"}, british_spelling = true},
= {container = "Polynesia", divs = {"districts"}, british_spelling = true},
= {container = "Europe", divs = {"municipalities"}, british_spelling = true},
= {container = "Africa", divs = {"districts"}},
= {container = "Asia", divs = {"provinces", "governorates"}},
= {container = "Africa", divs = {"regions", "departments"}},
= {container = "Europe", divs = {"districts", "municipalities"}},
= {container = "Africa", divs = {"districts"}, british_spelling = true},
= {container = "Africa", divs = {"provinces", "districts"}, british_spelling = true},
= {container = "Asia", divs = {"districts"}, british_spelling = true},
= {container = "Europe", divs = {"regions", "districts"}, british_spelling = true},
= {container = "Europe", divs = {"statistical regions", "municipalities"}, british_spelling = true},
-- Note: the official name does not include "the" at the beginning, but it sounds strange in
-- English to leave it out and it's commonly included, so we include it.
= {the = true, container = "Melanesia", divs = {"provinces"}, british_spelling = true},
= {container = "Africa", divs = {"regions", "districts"}},
= {container = "Africa", divs = {
"provinces",
"districts",
{type = "district municipalities", cat_as = "districts"},
{type = "metropolitan municipalities", cat_as = "districts"},
"municipalities",
}, british_spelling = true},
= {container = "Asia", addl_parents = {"Korea"}, divs = {"provinces", "counties", "districts"}},
= {container = "Africa", divs = {"regions", "states", "counties"}, british_spelling = true},
= {container = "Europe", divs = {"autonomous communities", "provinces", "municipalities",
"comarcas", "autonomous cities"},
british_spelling = true},
= {container = "Asia", divs = {"provinces", "districts"}, british_spelling = true},
= {container = "Africa", divs = {"states", "districts"}, british_spelling = true},
= {container = "South America", divs = {"districts"}},
= {container = "Europe", divs = {"provinces", "counties", "municipalities"}, british_spelling = true},
= {container = "Europe", divs = {"cantons", "municipalities", "districts"}, british_spelling = true},
= {container = "Asia", divs = {"governorates", "districts"}},
= {container = "Asia", divs = {"counties", "districts", "townships", "special municipalities"}},
= {alias_of = "Taiwan", the = true}, -- differs in "the", different political connotations
= {container = "Asia", divs = {"regions", "districts"}},
= {container = "Africa", divs = {"provinces", "districts"}, british_spelling = true},
= {container = "Asia", divs = {"provinces", "districts", "subdistricts"}},
= {container = "Africa", divs = {"provinces", "prefectures"}},
= {container = "Polynesia", divs = {"divisions"}, british_spelling = true},
= {container = "Caribbean", divs = {"regions", "municipalities"}, british_spelling = true},
= {container = "Africa", divs = {"governorates", "delegations"}},
= {container = {"Europe", "Asia"}, divs = {"provinces", "districts"}},
-- Foreign names generally get display-canonicalized.
= {alias_of = "Turkey", display = true},
= {container = "Asia", divs = {"regions", "districts"}},
= {container = "Polynesia", divs = {"atolls"}, british_spelling = true},
= {container = "Africa", divs = {"districts", "counties"}, british_spelling = true},
= {container = "Europe", divs = {
{type = "oblasts", cat_as = "oblasts and autonomous republics"},
{type = "autonomous republics", cat_as = "oblasts and autonomous republics"},
"raions", "hromadas",
}, british_spelling = true},
= {the = true, container = "Asia", divs = {"emirates"}},
-- Abbreviations get display-canonicalized.
= {alias_of = "United Arab Emirates", display = true, the = true},
= {alias_of = "United Arab Emirates", display = true, the = true},
= {the = true, container = "Europe", addl_parents = {"British Isles"},
divs = {"constituent countries", "counties", "districts", "boroughs", "territories", "dependent territories",
"traditional counties"},
keydesc = "the ] of Great Britain and Northern Ireland", british_spelling = true},
-- Abbreviations get display-canonicalized.
= {alias_of = "United Kingdom", display = true, the = true},
= {alias_of = "United Kingdom", display = true, the = true},
= {the = true, container = "North America",
divs = {"counties", "county seats", "states", "territories", "dependent territories",
{type = "ABBREVIATION_OF states", cat_as = "abbreviations of states"},
{type = "DEROGATORY_NAME_FOR states", cat_as = "derogatory names for states"},
{type = "NICKNAME_FOR states", cat_as = "nicknames for states"},
{type = "OFFICIAL_NICKNAME_FOR states", cat_as = "official nicknames for states"},
{type = "boroughs", prep = "in"}, -- exist in Pennsylvania and New Jersey
"municipalities", -- these exist politically at least in Colorado and Connecticut
{type = "census-designated places", prep = "in"},
{type = "unincorporated communities", prep = "in"},
-- Don't change the following to something more politically correct until/unless the US government makes a
-- similar switch (and note that as of Apr 18 2025, the Wikipedia article is still at
-- [[w:Indian reservation]]).
"Indian reservations",
}},
-- Abbreviations and long forms (when possible) get display-canonicalized.
= {alias_of = "United States", display = true, the = true},
= {alias_of = "United States", display = true, the = true},
= {alias_of = "United States", display = true, the = true},
= {alias_of = "United States", display = true, the = true},
= {alias_of = "United States", display = true, the = true},
= {container = "South America", divs = {"departments", "municipalities"}},
= {container = "Asia", divs = {"regions", "districts"}},
= {container = "Melanesia", divs = {"provinces"}, british_spelling = true},
= {placetype = {"city-state", "country"}, container = "Europe",
-- We want the first placetype to be 'city-state' so the description of Vatican City says it's a city-state,
-- but we want its parent to be "countries in Europe".
bare_category_parent_type = {type = "countries", prep = "in"},
addl_parents = {"Rome"}, is_city = true, british_spelling = true},
= {alias_of = "Vatican City", the = true}, -- differs in "the"
= {container = "South America", divs = {"states", "municipalities"}},
= {container = "Asia", divs = {"provinces", "districts", "municipalities"}},
= {placetype = {"territory", "country"}, container = "Africa",
bare_category_parent_type = {type = "countries", prep = "in"},
},
-- Not display-canonicalizable both due to differences in 'the' and the sovereignty dispute over Western Sahara
= {alias_of = "Western Sahara", the = true},
= {container = "Asia", divs = {"governorates", "districts"}},
= {container = "Africa", divs = {"provinces", "districts"}, british_spelling = true},
= {container = "Africa", divs = {"provinces", "districts"}, british_spelling = true},
}
local function canonicalize_continent_container(key)
	if type(key) ~= "string" then
		return key
	end
	-- Look up the continent-level entry for this key and attach its placetype.
	if export.continents[key] then
		return {key = key, placetype = export.continents[key].placetype}
	end
	internal_error("Unrecognized key %s in `canonicalize_continent_container`", key)
end
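--[=[
A minimal standalone sketch of the intended behavior of `canonicalize_continent_container`: a string container is
looked up in the continent-level table and paired with that entry's placetype, a non-string container is returned
unchanged, and an unknown string raises an error. The `continents` table and the `internal_error` stub below are
hypothetical stand-ins for `export.continents` and this module's real error helper, and "Atlantis" is a made-up key.
```lua
-- Hypothetical miniature stand-in for `export.continents`.
local continents = {
	["Asia"] = {placetype = "continent"},
	["Eurasia"] = {placetype = {"supercontinent", "continent"}},
}

-- Hypothetical stub; the real helper is defined elsewhere in this module.
local function internal_error(fmt, ...)
	error(string.format(fmt, ...))
end

local function canonicalize_continent_container(key)
	if type(key) ~= "string" then
		return key
	end
	if continents[key] then
		return {key = key, placetype = continents[key].placetype}
	end
	internal_error("Unrecognized key %s", key)
end

local asia = canonicalize_continent_container("Asia")       -- {key = "Asia", placetype = "continent"}
local eurasia = canonicalize_continent_container("Eurasia") -- placetype is the whole list from the entry
local ok, err = pcall(canonicalize_continent_container, "Atlantis") -- ok is false; err names the key
```
The placetype is copied verbatim from the continent entry, so for an entry with several placetypes the resulting
container carries the full list rather than a single string.
]=]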
export.countries_group = {
canonicalize_key_container = canonicalize_continent_container,
default_overriding_bare_label_parents = {"+++", "countries"},
default_placetype = "country",
default_no_container_cat = true,
default_no_container_parent = true,
-- No need to augment country holonyms with continents; not needed for disambiguation.
default_no_auto_augment_container = true,
data = export.countries,
}
-- Country-like entities: typically overseas territories or de-facto independent countries, which in both cases
-- are not internationally recognized as sovereign nations but which we treat similarly to countries.
export.country_like_entities = {
-- British Overseas Territory
= {
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Cyprus", "Europe", "Asia"},
british_spelling = true,
},
-- Åland: Listed as a region of Finland. Wikipedia lists this under "dependent territories" in
-- ].
-- unincorporated territory of the United States
= {
placetype = {"unincorporated territory", "overseas territory", "territory"},
container = "United States",
addl_parents = {"Polynesia"},
},
-- British Overseas Territory
= {
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- de-facto independent state, internationally recognized as part of Georgia
= {
placetype = {"unrecognized country", "country"},
addl_parents = {"Georgia", "Europe", "Asia"},
divs = {"districts"},
keydesc = "the de-facto independent state of ], internationally recognized as part of the country of ]",
british_spelling = true,
},
-- Australian external territory
= {
the = true,
placetype = {"external territory", "territory"},
container = "Australia",
addl_parents = {"Asia"},
},
-- constituent country of the Netherlands
= {
placetype = {"constituent country", "country"},
container = "Netherlands",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- British Overseas Territory
= {
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"North America"},
british_spelling = true,
},
-- special municipality of the Netherlands
= {
placetype = {"special municipality", "municipality", "overseas territory", "territory"},
container = "Netherlands",
addl_parents = {"Caribbean"},
is_city = true,
british_spelling = true,
},
-- British Overseas Territory
= {
the = true,
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Asia"},
british_spelling = true,
},
-- British Overseas Territory
= {
the = true,
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- Norwegian dependent territory
= {
placetype = {"dependent territory", "territory"},
container = "Norway",
addl_parents = {"Africa"},
british_spelling = true,
},
-- British Overseas Territory
= {
the = true,
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- Australian external territory
= {
placetype = {"external territory", "territory"},
container = "Australia",
addl_parents = {"Asia"},
british_spelling = true,
},
-- Sui generis French "state private property" per Wikipedia; classify as overseas territory like the
-- French Southern and Antarctic Lands.
= {
placetype = {"overseas territory", "territory"},
container = "France",
addl_parents = {"North America"},
},
-- Australian external territory; also called the Keeling Islands or (officially) the Cocos (Keeling) Islands
= {
the = true,
placetype = {"external territory", "territory"},
container = "Australia",
addl_parents = {"Asia"},
wp = "Cocos (Keeling) Islands",
british_spelling = true,
},
= {alias_of = "Cocos Islands", display = true, the = true},
= {alias_of = "Cocos Islands", display = true, the = true},
-- self-governing but in free association with New Zealand
= {
the = true,
placetype = {"country"},
container = "New Zealand",
addl_parents = {"Polynesia"},
british_spelling = true,
},
-- constituent country of the Netherlands
= {
placetype = {"constituent country", "country"},
container = "Netherlands",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- special territory of Chile
= {
placetype = {"special territory", "territory"},
container = "Chile",
addl_parents = {"Polynesia"},
},
-- British Overseas Territory
= {
the = true,
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"South America"},
british_spelling = true,
},
-- autonomous territory of Denmark
= {
the = true,
placetype = {"autonomous territory", "territory"},
container = "Denmark",
addl_parents = {"Europe"},
british_spelling = true,
},
-- overseas department and region of France
= {
placetype = {"overseas department", "department", "administrative region", "region"},
container = "France",
divs = {"communes"},
addl_parents = {"South America"},
british_spelling = true,
},
-- overseas collectivity of France
= {
placetype = {"overseas collectivity", "collectivity"},
container = "France",
addl_parents = {"Polynesia"},
british_spelling = true,
},
-- French overseas territory
= {
the = true,
placetype = {"overseas territory", "territory"},
container = "France",
addl_parents = {"Africa"},
},
-- British Overseas Territory
= {
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Europe"},
is_city = true,
british_spelling = true,
},
-- autonomous territory of Denmark
= {
placetype = {"autonomous territory", "territory"},
container = "Denmark",
addl_parents = {"North America"},
divs = {"municipalities"},
british_spelling = true,
},
-- overseas department and region of France
= {
placetype = {"overseas department", "department", "administrative region", "region"},
container = "France",
addl_parents = {"Caribbean"},
divs = {"communes"},
british_spelling = true,
},
-- unincorporated territory of the United States
= {
placetype = {"unincorporated territory", "overseas territory", "territory"},
container = "United States",
addl_parents = {"Micronesia"},
},
-- self-governing British Crown dependency; technically called the Bailiwick of Guernsey
= {
placetype = {"crown dependency", "dependency", "dependent territory", "bailiwick", "territory"},
container = "United Kingdom",
addl_parents = {"British Isles", "Europe"},
british_spelling = true,
wp = "Bailiwick of %l",
},
= {alias_of = "Guernsey", the = true},
-- Australian external territory
= {
the = true,
placetype = {"external territory", "territory"},
container = "Australia",
addl_parents = {"Africa"},
},
-- special administrative region of China
= {
placetype = {"special administrative region", "city"},
container = "China",
is_city = true,
british_spelling = true,
},
-- self-governing British Crown dependency
= {
the = true,
placetype = {"crown dependency", "dependency", "dependent territory", "territory"},
container = "United Kingdom",
addl_parents = {"British Isles", "Europe"},
british_spelling = true,
},
-- Norwegian unincorporated area
= {
placetype = {"unincorporated area", "dependent territory", "territory", "island"},
container = "Norway",
addl_parents = {"Europe"},
british_spelling = true,
},
-- self-governing British Crown dependency; technically called the Bailiwick of Jersey
= {
placetype = {"crown dependency", "dependency", "dependent territory", "bailiwick", "territory"},
container = "United Kingdom",
addl_parents = {"British Isles", "Europe"},
british_spelling = true,
},
= {alias_of = "Jersey", the = true},
-- special administrative region of China
= {
placetype = {"special administrative region", "city"},
container = "China",
is_city = true,
british_spelling = true,
},
-- overseas department and region of France
= {
placetype = {"overseas department", "department", "administrative region", "region"},
container = "France",
divs = {"communes"},
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- overseas department and region of France
= {
placetype = {"overseas department", "department", "administrative region", "region"},
container = "France",
divs = {"communes"},
addl_parents = {"Africa"},
british_spelling = true,
},
-- British Overseas Territory
= {
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- special collectivity of France
= {
placetype = {"special collectivity", "collectivity"},
container = "France",
addl_parents = {"Melanesia"},
british_spelling = true,
},
-- dependent territory of New Zealand
= {
the = true,
placetype = {"dependent territory", "territory"},
container = "New Zealand",
addl_parents = {"Antarctica"},
british_spelling = true,
},
-- self-governing but in free association with New Zealand
= {
placetype = {"country"},
container = "New Zealand",
addl_parents = {"Polynesia"},
british_spelling = true,
},
-- Australian external territory
= {
placetype = {"external territory", "territory"},
container = "Australia",
addl_parents = {"Polynesia"},
british_spelling = true,
},
-- de-facto independent state, internationally recognized as part of Cyprus
= {
placetype = {"unrecognized country", "country"},
addl_parents = {"Cyprus", "Turkey", "Europe", "Asia"},
divs = {"districts"},
keydesc = "the de-facto independent state of ], internationally recognized as part of the country of ]",
british_spelling = true,
},
-- commonwealth, unincorporated territory of the United States
= {
the = true,
placetype = {"commonwealth", "unincorporated territory", "overseas territory", "territory"},
container = "United States",
addl_parents = {"Micronesia"},
},
-- British Overseas Territory
= {
the = true,
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Polynesia"},
british_spelling = true,
},
-- commonwealth of the United States
= {
placetype = {"commonwealth", "overseas territory", "territory"},
container = "United States",
addl_parents = {"Caribbean"},
divs = {"municipalities"},
},
-- overseas department and region of France
= {
placetype = {"overseas department", "department", "administrative region", "region"},
container = "France",
divs = {"communes"},
addl_parents = {"Africa"},
british_spelling = true,
},
-- special municipality of the Netherlands
= {
placetype = {"special municipality", "municipality", "overseas territory", "territory"},
container = "Netherlands",
addl_parents = {"Caribbean"},
is_city = true,
british_spelling = true,
},
-- overseas collectivity of France
= {
placetype = {"overseas collectivity", "collectivity"},
container = "France",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- British Overseas Territory
= {
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
divs = {{type = "constituent parts", container_parent_type = false}},
addl_parents = {"Atlantic Ocean", "Africa"},
british_spelling = true,
},
-- constituent parts of the combined overseas territory
= {
placetype = {"constituent part", "territory", "island"},
container = {key = "Saint Helena, Ascension and Tristan da Cunha", placetype = "overseas territory"},
addl_parents = {"Atlantic Ocean"},
overriding_bare_label_parents = {},
no_container_cat = false,
no_container_parent = false,
no_auto_augment_container = false,
},
= {
placetype = {"constituent part", "territory", "island"},
container = {key = "Saint Helena, Ascension and Tristan da Cunha", placetype = "overseas territory"},
addl_parents = {"Atlantic Ocean"},
overriding_bare_label_parents = {},
no_container_cat = false,
no_container_parent = false,
no_auto_augment_container = false,
},
= {
placetype = {"constituent part", "territory", "archipelago"},
container = {key = "Saint Helena, Ascension and Tristan da Cunha", placetype = "overseas territory"},
addl_parents = {"Atlantic Ocean"},
overriding_bare_label_parents = {},
no_container_cat = false,
no_container_parent = false,
no_auto_augment_container = false,
},
-- overseas collectivity of France
= {
placetype = {"overseas collectivity", "collectivity"},
container = "France",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- overseas collectivity of France
= {
placetype = {"overseas collectivity", "collectivity"},
container = "France",
divs = {"communes"},
addl_parents = {"North America"},
british_spelling = true,
},
-- special municipality of the Netherlands
= {
placetype = {"special municipality", "municipality", "overseas territory", "territory"},
container = "Netherlands",
addl_parents = {"Caribbean"},
is_city = true,
british_spelling = true,
},
-- constituent country of the Netherlands
= {
placetype = {"constituent country", "country"},
container = "Netherlands",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- de-facto independent state, internationally recognized as part of Somalia
= {
placetype = {"unrecognized country", "country"},
addl_parents = {"Somalia", "Africa"},
keydesc = "the de-facto independent state of ], internationally recognized as part of the country of ]",
british_spelling = true,
},
-- British Overseas Territory
-- FIXME: We should form the group "South Georgia and the South Sandwich Islands" like we did for
-- "Saint Helena, Ascension and Tristan da Cunha".
= {
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Atlantic Ocean"},
british_spelling = true,
},
-- de-facto independent state, internationally recognized as part of Georgia
= {
placetype = {"unrecognized country", "country"},
addl_parents = {"Georgia", "Europe", "Asia"},
keydesc = "the de-facto independent state of ], internationally recognized as part of the country of ]",
british_spelling = true,
},
-- British Overseas Territory
= {
the = true,
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Atlantic Ocean"},
wp = true,
wpcat = "South Georgia and the South Sandwich Islands",
british_spelling = true,
},
-- Norwegian unincorporated area
= {
placetype = {"unincorporated area", "dependent territory", "territory", "archipelago"},
container = "Norway",
addl_parents = {"Europe"},
british_spelling = true,
},
-- dependent territory of New Zealand
= {
placetype = {"dependent territory", "territory"},
container = "New Zealand",
addl_parents = {"Polynesia"},
british_spelling = true,
},
-- de-facto independent state, internationally recognized as part of Moldova
= {
placetype = {"unrecognized country", "country"},
addl_parents = {"Moldova", "Europe"},
keydesc = "the de-facto independent state of ], internationally recognized as part of ]",
british_spelling = true,
},
-- British Overseas Territory
= {
the = true,
placetype = {"overseas territory", "territory"},
container = "United Kingdom",
addl_parents = {"Caribbean"},
british_spelling = true,
},
-- unincorporated territory of the United States
= {
the = true,
placetype = {"unincorporated territory", "overseas territory", "territory"},
container = "United States",
addl_parents = {"Islands", "Micronesia", "Polynesia", "Caribbean"},
},
-- FIXME: We should add entries for the other minor outlying islands.
-- Baker Island (Oceania)
-- Howland Island (Oceania)
-- Jarvis Island (Oceania)
-- Johnston Atoll (Oceania)
-- Kingman Reef (Oceania)
-- Midway Atoll (Oceania)
-- Navassa Island (Caribbean)
-- Palmyra Atoll (Oceania)
-- Wake Island (Oceania)
= {
placetype = {"unincorporated territory", "overseas territory", "territory"},
container = "United States",
addl_parents = {"Micronesia"},
},
-- unincorporated territory of the United States
= {
the = true,
placetype = {"unincorporated territory", "overseas territory", "territory"},
container = "United States",
addl_parents = {"Caribbean"},
},
= {alias_of = "United States Virgin Islands", display = true, the = true},
= {alias_of = "United States Virgin Islands", display = true, the = true},
-- overseas collectivity of France
= {
placetype = {"overseas collectivity", "collectivity"},
container = "France",
addl_parents = {"Polynesia"},
british_spelling = true,
},
}
export.country_like_entities_group = {
-- don't do any transformations between key and placename; in particular, don't chop off anything from
-- "Saint Helena, Ascension and Tristan da Cunha".
key_to_placename = false,
placename_to_key = false,
canonicalize_key_container = make_canonicalize_key_container(nil, "country"),
default_overriding_bare_label_parents = {"country-like entities"},
default_no_container_cat = true,
default_no_container_parent = true,
-- These entities often aren't really part of their container; a village in Wallis and Futuna (an overseas
-- collectivity of France in Polynesia), for example, shouldn't be treated as a village in France, nor as a village
-- in Europe.
default_no_auto_augment_container = true,
data = export.country_like_entities,
}
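--[=[
Illustrative sketch only; it is kept inside a comment so it does not affect the module. It shows one way the
group-level `default_*` properties above could be merged into an individual location spec. The function name
`apply_group_defaults` is hypothetical, and the module's real initialize_spec() is not shown here and may well
behave differently. The explicit nil check is the point of the example: the constituent parts of
"Saint Helena, Ascension and Tristan da Cunha" set `no_container_cat = false`, and under this assumption that
explicit false has to win over the group default of true.

local function apply_group_defaults(group, spec)
	-- Copy the spec so the data table entry itself is left untouched.
	local result = {}
	for k, v in pairs(spec) do
		result[k] = v
	end
	-- For every `default_X` in the group, fill in `X` only if the spec did not set it at all.
	for k, v in pairs(group) do
		local prop = type(k) == "string" and k:match("^default_(.+)$")
		if prop and result[prop] == nil then
			result[prop] = v
		end
	end
	return result
end

local group = {default_no_container_cat = true, default_no_container_parent = true}
local plain = apply_group_defaults(group, {container = "France"})
local part = apply_group_defaults(group, {no_container_cat = false})
assert(plain.no_container_cat == true)    -- inherited from the group
assert(part.no_container_cat == false)    -- explicit false is preserved
assert(part.no_container_parent == true)  -- unset property still inherits
]=]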
-- Former countries and such; we don't create "Cities in ..." categories because these countries no longer exist
export.former_countries = {
-- de-facto independent state of Armenian ethnicity, internationally recognized as part of Azerbaijan
-- (also known as Nagorno-Karabakh)
-- NOTE: Formerly listed Armenia as a parent; this seems politically non-neutral so I've taken it out.
= {
placetype = {"unrecognized country", "country"},
addl_parents = {"Azerbaijan", "Europe", "Asia"},
keydesc = "the former de-facto independent state of ], internationally recognized as part of ]",
british_spelling = true,
},
= {alias_of = "Artsakh"},
= {container = "Europe", british_spelling = true},
= {container = "Europe", addl_parents = {"Germany"}, british_spelling = true},
= {container = "Asia", addl_parents = {"Vietnam"}},
= {placetype = {"empire", "country"}, container = "Asia", divs = {"provinces"}},
= {
the = true, placetype = {"empire", "country"}, container = {"Europe", "Africa", "Asia"},
addl_parents = {"Ancient Europe", "Ancient Near East"},
divs = {
"provinces", "themes",
}},
= {
the = true, placetype = {"empire", "country"}, container = {"Europe", "Africa", "Asia"}, addl_parents = {"Rome"},
divs = {
"provinces",
{type = "FORMER provinces", cat_as = "provinces"},
}},
= {container = "Asia", addl_parents = {"Vietnam"}},
= {
the = true, container = {"Europe", "Asia"}, divs = {"republics", "autonomous republics"},
british_spelling = true},
= {container = "Europe", addl_parents = {"Germany"}, british_spelling = true},
= {container = "Europe", divs = {"districts"},
keydesc = "the former ] (1918–1943) or the former ] (1943–1992)", british_spelling = true},
}
export.former_countries_group = {
canonicalize_key_container = canonicalize_continent_container,
default_overriding_bare_label_parents = {"former countries and country-like entities"},
default_is_former_place = true,
default_placetype = "country",
default_no_container_cat = true,
default_no_container_parent = true,
-- No need to augment country holonyms with continents; not needed for disambiguation.
default_no_auto_augment_container = true,
data = export.former_countries,
}
-----------------------------------------------------------------------------------
-- Subpolity tables --
-----------------------------------------------------------------------------------
export.australia_states_and_territories = {
= {the = true, placetype = "territory"},
= {the = true, placetype = "territory"},
= {},
= {the = true, placetype = "territory"},
= {},
= {},
= {},
= {},
= {},
}
-- states and territories of Australia
export.australia_group = {
default_container = "Australia",
default_placetype = "state",
default_divs = "local government areas",
data = export.australia_states_and_territories,
}
export.austria_states = {
= {},
= {},
= {},
= {},
= {wp = "Tyrol (state)"},
= {},
= {wp = "Salzburg (state)"},
= {},
= {},
}
-- states of Austria
export.austria_group = {
default_container = "Austria",
default_placetype = "state",
default_divs = "municipalities",
data = export.austria_states,
}
export.bangladesh_divisions = {
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
}
-- divisions of Bangladesh
export.bangladesh_group = {
key_to_placename = make_key_to_placename(", Bangladesh$", " Division$"),
placename_to_key = make_placename_to_key(", Bangladesh", " Division"),
default_container = "Bangladesh",
default_placetype = "division",
default_divs = "districts",
data = export.bangladesh_divisions,
}
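--[=[
Illustrative sketch only; it is kept inside a comment so it does not affect the module. It shows the kind of
suffix stripping that the two patterns passed to make_key_to_placename() above suggest. The helper names
`strip_suffixes` and `add_suffixes` are hypothetical, the key "Dhaka Division, Bangladesh" is an assumed example,
and the real make_key_to_placename() / make_placename_to_key() are defined elsewhere and may return something
different.

local function strip_suffixes(key, container_pattern, divtype_pattern)
	-- gsub() returns (string, count); assigning to a single local keeps only the string.
	local full = key:gsub(container_pattern, "")
	local elliptical = full:gsub(divtype_pattern, "")
	return full, elliptical
end

local function add_suffixes(placename, container_suffix, divtype_suffix)
	-- Only append the division-type suffix if it is not already present (plain-text comparison).
	if placename:sub(-#divtype_suffix) ~= divtype_suffix then
		placename = placename .. divtype_suffix
	end
	return placename .. container_suffix
end

local full, elliptical = strip_suffixes("Dhaka Division, Bangladesh", ", Bangladesh$", " Division$")
assert(full == "Dhaka Division")
assert(elliptical == "Dhaka")
assert(add_suffixes("Dhaka", ", Bangladesh", " Division") == "Dhaka Division, Bangladesh")
assert(add_suffixes("Dhaka Division", ", Bangladesh", " Division") == "Dhaka Division, Bangladesh")
]=]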
export.brazil_states = {
= {wp = "%l (state)"},
= {},
= {},
= {wp = "%l (Brazilian state)"},
= {},
= {},
= {wp = "Federal District (Brazil)"},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {wp = "%l (state)"},
= {},
= {},
= {wp = "%l (state)"},
= {},
= {},
= {},
= {},
= {wp = "%l (state)"},
= {wp = "%l (state)"},
= {},
= {},
}
-- states of Brazil
export.brazil_group = {
default_container = "Brazil",
default_placetype = "state",
default_divs = "municipalities",
data = export.brazil_states,
}
export.canada_provinces_and_territories = {
= {divs = {
{type = "municipal districts", container_parent_type = "rural municipalities"},
}},
= {divs = {
{type = "regional districts", container_parent_type = false},
"regional municipalities",
}},
= {divs = {"rural municipalities"}},
= {divs = {"counties", "parishes", {type = "civil parishes", cat_as = "parishes"}}},
= {},
= {the = true, placetype = "territory"},
= {divs = {"counties", "regional municipalities"}},
= {placetype = "territory"},
= {divs = {"counties", "regional municipalities", {type = "townships", prep = "in"}}},
= {divs = {"counties", "parishes", "rural municipalities"}},
= {divs = {"rural municipalities"}},
= {divs = {
"counties",
{type = "regional county municipalities", container_parent_type = "regional municipalities"},
-- administrative regions have an official (but non-governmental) function but there don't appear to be any
-- equivalent regions elsewhere in Canada, so disable the ] grouping
{type = "regions", container_parent_type = false},
{type = "townships", prep = "in"},
{type = "parish municipalities", cat_as = {{type = "parishes", container_parent_type = "counties"}, "municipalities"}},
{type = "township municipalities", cat_as = {{type = "townships", prep = "in"}, "municipalities"}},
{type = "village municipalities", cat_as = {{type = "villages", prep = "in"}, "municipalities"}},
}},
= {placetype = "territory"},
= {alias_of = "Yukon, Canada", the = true},
}
-- provinces and territories of Canada
export.canada_group = {
default_container = "Canada",
default_placetype = "province",
data = export.canada_provinces_and_territories,
}
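--[=[
Illustrative sketch only; it is kept inside a comment so it does not affect the module. It shows how an
`alias_of` entry in a group's data table, such as the alias of "Yukon, Canada" above, could be followed to its
canonical key and spec. The function name `resolve_alias` and the alias key "Yukon Territory, Canada" are
hypothetical, and the module's real lookup code is not shown here.

local function resolve_alias(data, key)
	local spec = data[key]
	if spec and spec.alias_of then
		-- Aliases point at a canonical key in the same data table.
		return spec.alias_of, data[spec.alias_of]
	end
	return key, spec
end

local data = {
	["Yukon, Canada"] = {placetype = "territory"},
	["Yukon Territory, Canada"] = {alias_of = "Yukon, Canada", the = true},
}
local key, spec = resolve_alias(data, "Yukon Territory, Canada")
assert(key == "Yukon, Canada")
assert(spec.placetype == "territory")
-- A canonical key resolves to itself.
assert(resolve_alias(data, "Yukon, Canada") == "Yukon, Canada")
]=]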
export.china_provinces_and_autonomous_regions = {
-- direct-administered municipalities are not here but below under prefecture-level cities
= {},
= {},
= {alias_of = "Fujian, China", display = true},
= {},
= {},
= {placetype = "autonomous region"},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {placetype = "autonomous region"},
= {},
= {},
= {},
= {},
= {placetype = "autonomous region"},
= {},
= {},
= {},
= {},
= {},
= {placetype = "autonomous region", wp = "Tibet Autonomous Region"},
= {placetype = "autonomous region"},
= {},
= {},
}
-- provinces and autonomous regions of China
export.china_group = {
default_container = "China",
default_placetype = "province",
default_divs = {
"prefectures", "prefecture-level cities",
"districts", "subdistricts", "townships",
{type = "counties", cat_as = "counties and county-level cities"},
{type = "county-level cities", cat_as = "counties and county-level cities"},
},
data = export.china_provinces_and_autonomous_regions,
}
export.china_prefecture_level_cities = {
-- In China, a "prefecture-level city" is not a city in any real sense. It is rather a prefecture, which is an
-- administrative unit smaller than a province but bigger than a county, which is administratively controlled by
-- the chief city of the prefecture (which bears the same name as the prefecture), in a unified government. Prior
-- to the mid-1980s, in fact, prefecture-level cities *were* prefectures, and a few of them (especially in the
-- western portion of China) have not yet been converted. Generally a given province is entirely tiled by
-- prefecture-level cities, another indication that they should be treated as prefectures and not cities per se.
-- Yet another indication is that prefecture-level cities can contain counties and county-level cities (which, much
-- like prefecture-level cities, are effectively counties surrounding a chief city of the county, again which bears
-- the same name as the county-level city).
--
-- For this reason, we treat prefecture-level cities as non-city political divisions, and separately enumerate the
-- most populous so we can separately categorize districts and counties under them instead of lumping them at the
-- province level.
--
-- Note also that China separately distinguishes "urban area" from "metro area". Sometimes the two figures are
-- identical but sometimes the metro area is larger (and very occasionally smaller, which I assume is an error). I'm
-- guessing that the "urban area" is the contiguous urban area over a certain density while the metro area includes
-- all urban areas above a certain density; when the latter is greater, it's because of satellite cities in the
-- metro area separated by suburban/exurban or rural land.
-- At first I chose all prefecture/province-level cities with a total prefecture/province-level population of at
-- least 6,000,000 per the 2020 census with data taken from https://www.citypopulation.de/en/china/admin/ (a total
-- of 67, including the four direct-administered municipalities), and also chose all prefecture/province-level
-- cities whose "urban population" was at least 2,000,000 per the 2020 census with data taken from Wikipedia
-- ] (a total of 61 cities; if we cut off
-- at 1.5 million we'd have 84 cities, and if we cut off at 1 million we'd have 105 cities). Merging them produces
-- 87 cities. Note that this leaves off a few well-known cities (Guilin, Qiqihar, Kashgar, Lhasa, ...) but includes
-- a lot of obscure cities.
--
-- At a later date I added all cities from citypopulation.de whose "urban" population per the 2020 China census was
-- >= 1 million, and then finally added all urban agglomerations from citypopulation.de whose 2025-01-01 estimate
-- was >= 1 million. These are sorted below by the urban agglomeration value (which is generally of the "adm-urb" =
-- "administrative area (urban population)" type). The agglomeration sometimes groups nearby cities together, most
-- notably in the Pearl River Delta, which is grouped under Guangzhou with an agglomeration population of
-- 72,700,000 and includes a large number of nearby large cities (although for some reason not Hong Kong, maybe
-- due to the administrative issues involved). In addition, citypopulation.de includes divisions
-- under a prefecture-level city if they are city-like and have an agglomeration population of at least 1 million;
-- this includes several county-level cities, one county and one district (Wanzhou, a "district" of Chongqing
-- despite being 142 miles away). None of the county-level cities or counties have districts under them, only
-- subdistricts, towns and townships.
= {container = "Guangdong"}, -- 18.7 prefectural, 18.8 urban; sub-provincial city; 16.097 urban (72.700 adm-urb including Dongguan, Foshan, Huizhou, Jiangmen, Shenzhen, Zhongshan) per citypopulation.de
= {container = "Guangdong"}, -- 10.5 prefectural, 10.5 urban; 9.645 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
= {container = "Guangdong"}, -- 9.5 prefectural, 9.5 urban; 9.043 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
= {container = "Guangdong"}, -- 6.0 prefectural, 2.5 urban; 2.900 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
= {container = "Guangdong"}, -- 4.798 prefectural, 2.7 urban; 1.795 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
= {container = "Guangdong"}, -- 17.5 prefectural, 14.7 urban; sub-provincial city; 17.445 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
= {container = "Guangdong"}, -- 4.418 prefectural, 4.4 urban; 3.842 per citypopulation.de; included by citypopulation.de in Guangzhou agglomeration
= {placetype = {"direct-administered municipality", "municipality", "city"}}, -- 24.9 prefectural, 29.9 urban; 21.910 urban (41.600 adm-urb including Changshu, Changzhou, Suzhou, Wuxi) per citypopulation.de
= {container = "Jiangsu"}, -- 1.231 urban per citypopulation.de; included by citypopulation.de in Shanghai agglomeration
-- NOTE: Not to be confused with Cangzhou in Hebei
= {container = "Jiangsu"}, -- 5.278 prefectural, 3.6 urban; 3.187 urban per citypopulation.de; included by citypopulation.de in Shanghai agglomeration
-- NOTE: There is also a prefecture-level city Suzhou in Anhui with 5.3 million prefectural inhabitants
= {container = "Jiangsu"}, -- 12.8 prefectural, 4.3 urban; 5.893 urban per citypopulation.de; included by citypopulation.de in Shanghai agglomeration
= {container = "Jiangsu"}, -- 7.5 prefectural, 3.3 urban; 3.957 per citypopulation.de; included by citypopulation.de in Shanghai agglomeration
= {placetype = {"direct-administered municipality", "municipality", "city"}}, -- 21.9 prefectural, 21.9 urban; 18.961 urban (21.500 adm-urb) per citypopulation.de
= {container = "Sichuan"}, -- 20.9 prefectural, 16.9 urban; sub-provincial city; 13.568 urban (18.100 adm-urb) per citypopulation.de
= {container = "Fujian"}, -- 5.163 prefectural, 5.2 urban; sub-provincial city; 4.617 urban (15.400 adm-urb including Jinjiang, Quanzhou, Putian) per citypopulation.de
= {container = "Fujian"}, -- 1.416 urban per citypopulation.de; included by citypopulation.de in Xiamen agglomeration
= {container = "Fujian"}, -- 8.8 prefectural, 1.7 urban (6.7 metro); 1.469 urban per citypopulation.de; included by citypopulation.de in Xiamen agglomeration
= {container = "Fujian"}, -- 3.210 prefectural, 2.0 urban; 1.539 urban per citypopulation.de; included by citypopulation.de in Xiamen agglomeration
= {container = "Zhejiang"}, -- 11.9 prefectural, 10.7 urban; sub-provincial city; 9.236 urban (14.600 adm-urb including Shaoxing) per citypopulation.de
= {container = "Zhejiang"}, -- 5.270 prefectural, 2.5 urban; 2.333 urban per citypopulation.de; included by citypopulation.de in Hangzhou agglomeration
= {container = "Shaanxi"}, -- 12.1 prefectural, 11.9 urban; sub-provincial city; 9.393 urban (13.400 adm-urb including Xianyang) per citypopulation.de
= {container = "Shaanxi"}, -- 1.193 urban per citypopulation.de; included by citypopulation.de in Xi'an agglomeration
= {placetype = {"direct-administered municipality", "municipality", "city"}}, -- 32.1 prefectural, 16.9 urban; 9.581 urban (12.900 adm-urb) per citypopulation.de
= {container = "Hubei"}, -- 12.4 prefectural, 12.3 urban; sub-provincial city; 10.495 urban (12.600 adm-urb) per citypopulation.de
= {placetype = {"direct-administered municipality", "municipality", "city"}}, -- 13.9 prefectural, 13.9 urban; 11.052 urban (11.700 adm-urb) per citypopulation.de
= {container = "Hunan"}, -- 10.0 prefectural, 6.0 urban; 5.630 urban (11.500 adm-urb including Xiangtan, Zhuzhou) per citypopulation.de
-- Changsha County -- 1.024 urban per citypopulation.de
= {container = "Hunan"}, -- 1.510 urban per citypopulation.de; included by citypopulation.de in Changsha agglomeration
= {container = "Henan"}, -- 12.6 prefectural, 6.7 urban; 6.461 urban (10.300 adm-urb) per citypopulation.de
= {container = "Jiangsu"}, -- 9.3 prefectural, 9.3 urban; sub-provincial city; 7.520 urban (9.500 adm-urb including Ma'anshan) per citypopulation.de
= {container = "Liaoning"}, -- 9.1 prefectural, 7.9 urban; sub-provincial city; 7.026 urban (8.800 adm-urb including Fushun) per citypopulation.de
= {container = "Liaoning"}, -- 1.229 urban per citypopulation.de; included by citypopulation.de in Shenyang agglomeration
= {container = "Anhui"}, -- 9.4 prefectural, 4.2 urban; 5.056 urban (8.200 adm-urb) per citypopulation.de
= {container = "Guangdong"}, -- 5.502 prefectural, 4.3 urban; 3.839 urban (8.050 adm-urb including Chaozhou, Jieyang, Puning) per citypopulation.de
= {container = "Guangdong"}, -- 1.254 urban per citypopulation.de; included by citypopulation.de in Shantou agglomeration
= {container = "Guangdong"}, -- 1.243 urban per citypopulation.de; included by citypopulation.de in Shantou agglomeration
= {container = "Shandong"}, -- 10.1 prefectural, 7.1 urban; sub-provincial city; 6.165 urban (7.700 adm-urb) per citypopulation.de
= {container = "Zhejiang"}, -- 9.4 prefectural, 5.1 urban; sub-provincial city; 3.731 urban (7.600 adm-urb including Cixi, Yuyao) per citypopulation.de
= {container = "Zhejiang"}, -- 1.458 urban per citypopulation.de; included by citypopulation.de in Ningbo agglomeration
= {container = "Zhejiang"}, -- 1.014 urban per citypopulation.de; included by citypopulation.de in Ningbo agglomeration
-- Hong Kong 7.500 agglomeration per citypopulation.de 2025-01-01 estimate including Kowloon, Victoria
= {container = "Zhejiang"}, -- 9.6 prefectural, 3.6 urban; 2.582 urban (7.000 adm-urb including Rui'an, Cangnan, Pingyang) per citypopulation.de
-- Rui'an is a "county-level city" of the "prefecture-level city" of Wenzhou but in fact is 19 miles away from Wenzhou city proper (urban core to urban core).
= {placetype = "county-level city", container = {key = "Wenzhou", placetype = "prefecture-level city"}, divs = {"subdistricts", "townships"}}, -- 1.013 urban per citypopulation.de; included by citypopulation.de in Wenzhou agglomeration
= {container = "Yunnan"}, -- 8.5 prefectural, 6.0 urban; 5.273 urban (6.800 adm-urb) per citypopulation.de
-- includes Láiwú city
= {container = "Shandong", wp = "%l, %c"}, -- 9.2 prefectural, 8.4 urban; sub-provincial city; 5.648 urban (6.750 adm-urb) per citypopulation.de
-- includes Xīnjí city
= {container = "Hebei"}, -- 11.2 prefectural, 4.1 urban; 5.090 urban (6.450 adm-urb) per citypopulation.de
= {container = "Shanxi"}, -- 5.304 prefectural, 4.5 urban; 4.304 urban (6.150 adm-urb) per citypopulation.de
= {container = "Heilongjiang"}, -- 10.0 prefectural, 7.0 urban; sub-provincial city; 5.243 urban (5.550 adm-urb) per citypopulation.de
= {container = {key = "Guangxi, China", placetype = "autonomous region"}}, -- 8.7 prefectural, 3.8 urban; 4.583 urban (5.550 adm-urb) per citypopulation.de
= {container = "Liaoning"}, -- 7.5 prefectural, 5.7 urban; sub-provincial city; 4.914 urban (5.400 adm-urb) per citypopulation.de
= {container = "Guizhou"}, -- 5.987 prefectural, 3.5 urban; 4.021 urban (5.300 adm-urb) per citypopulation.de
= {container = "Jilin"}, -- 9.1 prefectural, 5.7 urban; sub-provincial city; 4.557 urban (5.200 adm-urb) per citypopulation.de
= {container = "Jiangxi"}, -- 6.3 prefectural, 3.6 (3.9?) urban, 5.3 metro; 3.519 urban (5.150 adm-urb) per citypopulation.de
= {container = {key = "Xinjiang, China", placetype = "autonomous region"}}, -- 4.054 prefectural, 4.3 urban; 3.843 urban (5.000 adm-urb) per citypopulation.de
= {alias_of = "Ürümqi", display = true},
= {container = "Fujian"}, -- 8.3 prefectural, 4.1 urban; 3.723 urban (4.775 adm-urb) per citypopulation.de
= {container = "Shandong"}, -- 11.0 prefectural, 2.3 urban; 2.744 urban (4.650 adm-urb) per citypopulation.de
= {container = "Shandong"}, -- 4.704 prefectural, 2.6 urban; 2.750 urban (3.975 adm-urb) per citypopulation.de
= {container = "Henan"}, -- 7.1 prefectural, 2.4 urban; 2.231 urban (3.750 adm-urb) per citypopulation.de
= {container = "Gansu"}, -- 4.359 prefectural, 3.1 urban; 3.013 urban (3.575 adm-urb) per citypopulation.de
= {container = "Jiangsu"}, -- 7.7 prefectural, 2.3 urban; 2.988 urban (3.475 adm-urb) per citypopulation.de
= {container = "Shandong"}, -- 9.4 prefectural, 2.7 urban; 1.998 urban (3.325 adm-urb) per citypopulation.de
= {container = "Jiangsu"}, -- 1.331 urban (3.200 adm-urb including Zhangjiagang) per citypopulation.de
= {container = "Jiangsu"}, -- 1.056 urban per citypopulation.de; included in Jiangyin figures
= {container = "Jiangsu"}, -- 9.1 prefectural, 2.6 urban; 2.846 urban (3.150 adm-urb) per citypopulation.de
= {container = "Hebei"}, -- 9.4 prefectural, 2.8 urban; 2.095 urban (2.925 adm-urb) per citypopulation.de
= {container = {key = "Inner Mongolia, China", placetype = "autonomous region"}}, -- 3.446 prefectural, 2.7 urban; 2.373 urban (2.850 adm-urb) per citypopulation.de
= {container = "Hainan"}, -- 2.873 prefectural, 2.3 urban; 2.349 urban (2.800 adm-urb) per citypopulation.de
= {container = "Hebei"}, -- 7.7 prefectural, 3.4 urban; 2.550 urban (2.750 adm-urb) per citypopulation.de
= {container = "Henan"}, -- 6.3 prefectural, 1.2 urban, 2.7 metro; 1.271 urban (2.700 adm-urb) per citypopulation.de
= {container = "Zhejiang"}, -- 1.481 urban (2.700 adm-urb) per citypopulation.de
= {container = "Guangdong"}, -- 2.439 prefectural, 2.4 urban; 2.207 urban (2.675 adm-urb) per citypopulation.de
= {container = "Zhejiang"}, -- 6.6 prefectural, 1.6 urban; 1.486 urban (2.625 adm-urb) per citypopulation.de
= {alias_of = "Taizhou, Zhejiang"},
= {container = "Shandong"}, -- 7.1 prefectural, 2.5 urban; 2.312 urban (2.550 adm-urb) per citypopulation.de
= {container = {key = "Ningxia, China", placetype = "autonomous region"}}, -- 1.663 urban (2.525 adm-urb) per citypopulation.de
= {container = {key = "Guangxi, China", placetype = "autonomous region"}}, -- 4.157 prefectural, 2.2 urban; 2.205 urban (2.500 adm-urb) per citypopulation.de
= {container = "Liaoning"}, -- 1.480 urban (2.350 adm-urb including Liáoyáng) per citypopulation.de
= {container = "Jiangsu"}, -- 2.067 urban (2.300 adm-urb) per citypopulation.de
= {container = "Zhejiang"}, -- 1.188 urban (2.275 adm-urb) per citypopulation.de
= {container = "Qinghai"}, -- 1.677 urban (2.250 adm-urb) per citypopulation.de
-- includes Dìngzhōu city and Xióngān Xīnqū
= {container = "Hebei"}, -- 11.5 prefectural, 2.0 urban; 1.940 urban (2.225 adm-urb) per citypopulation.de
= {container = {key = "Inner Mongolia, China", placetype = "autonomous region"}}, -- 2.709 prefectural, 2.2 urban; 2.104 urban (2.200 adm-urb) per citypopulation.de
= {container = "Jiangxi"}, -- 9.0 prefectural, 1.6 urban; 1.778 urban (2.150 adm-urb) per citypopulation.de
= {container = "Henan"}, -- 1.046 urban (2.100 adm-urb) per citypopulation.de
= {container = "Guizhou"}, -- 6.6 prefectural, 2.4 urban/metro; 1.675 urban (2.025 adm-urb) per citypopulation.de
= {container = "Anhui"}, -- 1.078 urban (2.000 adm-urb) per citypopulation.de
= {container = "Shanxi"}, -- 3.105 prefectural, 2.0 urban; 1.810 urban (2.000 adm-urb) per citypopulation.de
= {container = "Henan"}, -- 1.188 urban (1.960 adm-urb) per citypopulation.de
= {container = "Jiangsu"}, -- 4.556 prefectural, 2.6 urban; 1.805 urban (1.940 adm-urb) per citypopulation.de
= {container = "Shandong"}, -- 1.350 urban (1.900 adm-urb) per citypopulation.de
= {container = "Guangdong"}, -- 7.0 prefectural, 1.9 urban; 1.401 urban (1.890 adm-urb) per citypopulation.de
= {container = "Anhui"}, -- 1.256 urban (1.880 adm-urb) per citypopulation.de
= {container = "Shandong"}, -- 8.4 prefectural, 1.5 urban; 1.700 urban (1.880 adm-urb) per citypopulation.de
= {container = "Heilongjiang"}, -- 1.604 urban (1.860 adm-urb) per citypopulation.de
= {container = "Anhui"}, -- 1.598 urban (1.850 adm-urb) per citypopulation.de
= {container = {key = "Guangxi, China", placetype = "autonomous region"}}, -- 1.361 urban (1.830 adm-urb) per citypopulation.de
= {container = "Sichuan"}, -- 1.549 urban (1.800 adm-urb) per citypopulation.de
= {container = "Hubei"}, -- 1.686 urban (1.800 adm-urb) per citypopulation.de
= {container = "Zhejiang"}, -- 1.084 urban (1.750 adm-urb) per citypopulation.de
= {container = "Henan"}, -- 0.824 urban (1.750 adm-urb) per citypopulation.de
= {container = "Henan"}, -- 7.8 prefectural, 1.9 urban (2.8 metro); 1.031 urban (1.750 adm-urb) per citypopulation.de
= {container = "Hebei"}, -- 1.520 urban (1.740 adm-urb) per citypopulation.de
= {container = "Hebei"}, -- 7.1 prefectural, 971,000 urban; 1.5 urban (1.700 adm-urb) per citypopulation.de
= {container = "Henan", wp = "%l, %c"}, -- 9.7 prefectural, 2.1 urban/metro; 1.481 urban (1.680 adm-urb) per citypopulation.de
= {container = "Henan"}, -- 0.875 urban (1.640 adm-urb) per citypopulation.de
= {container = "Jilin"}, -- 1.509 urban (1.610 adm-urb) per citypopulation.de
= {alias_of = "Jilin City"},
= {container = "Zhejiang"}, -- 7.1 prefectural, 1.5 urban; 1.041 urban (1.590 adm-urb) per citypopulation.de
= {container = "Jiangxi"}, -- 6.5 prefectural, 2.1 urban, 1.3 metro; 1.342 urban (1.580 adm-urb) per citypopulation.de
= {container = "Shandong"}, -- 8.8 prefectural, 1.3 urban; 1.294 urban (1.570 adm-urb) per citypopulation.de
= {container = {key = "Guangxi, China", placetype = "autonomous region"}, wp = "%l, %c"}, -- 0.878 urban (1.570 adm-urb) per citypopulation.de
= {container = "Shandong"}, -- 1.417 urban (1.560 adm-urb) per citypopulation.de
= {container = "Shandong"}, -- 1.340 urban (1.510 adm-urb) per citypopulation.de
-- Taizhou, Jiangsu would be here (1.490 adm-urb) but moved to china_prefecture_level_cities_2 to avoid clash
= {container = "Jiangsu"}, -- 6.7 prefectural, 1.6 urban; 1.353 urban (1.460 adm-urb) per citypopulation.de
= {container = "Hebei"}, -- 1.339 urban (1.450 adm-urb) per citypopulation.de
= {container = "Guangdong"}, -- 6.2 prefectural, 2.5 urban; 1.308 urban (1.440 adm-urb) per citypopulation.de
= {container = "Sichuan"}, -- 1.254 urban (1.440 adm-urb) per citypopulation.de
= {container = "Anhui", wp = "%l, %c"}, -- 8.2 prefectural, 2.1 urban; 1.191 urban (1.410 adm-urb) per citypopulation.de
= {container = "Henan"}, -- 0.850 urban (1.390 adm-urb) per citypopulation.de
= {container = "Hubei"}, -- 1.284 urban (1.390 adm-urb) per citypopulation.de
= {container = "Sichuan"}, -- 1.136 urban (1.380 adm-urb) per citypopulation.de
= {container = "Henan"}, -- 1.194 urban (1.340 adm-urb) per citypopulation.de
= {container = "Sichuan"}, -- 1.128 urban (1.340 adm-urb) per citypopulation.de
= {container = "Guangdong"}, -- 1.198 urban (1.340 adm-urb) per citypopulation.de
= {container = "Anhui"}, -- 0.831 urban (1.330 adm-urb) per citypopulation.de
= {container = "Sichuan"}, -- 1.101 urban (1.310 adm-urb) per citypopulation.de
= {container = "Anhui"}, -- 1.070 urban (1.300 adm-urb) per citypopulation.de
= {container = "Shandong"}, -- 0.843 urban (1.290 adm-urb) per citypopulation.de
= {container = "Shandong"}, -- 1.147 urban (1.270 adm-urb) per citypopulation.de
= {container = "Shanxi"}, -- 1.047 urban (1.250 adm-urb) per citypopulation.de
= {container = "Hunan"}, -- 6.6 prefectural, 1.5 urban; 1.185 urban (1.250 adm-urb) per citypopulation.de
= {container = "Liaoning"}, -- 1.021 urban (1.240 adm-urb) per citypopulation.de
= {container = "Shandong"}, -- 1.020 urban (1.240 adm-urb) per citypopulation.de
= {container = "Hunan"}, -- 1.101 urban (1.230 adm-urb) per citypopulation.de
= {container = "Jiangsu"}, -- 1.082 urban (1.230 adm-urb) per citypopulation.de
= {container = "Henan"}, -- 6.2 prefectural, 1.4 urban/metro; 1.015 urban (1.230 adm-urb) per citypopulation.de
= {container = "Shaanxi"}, -- 1.108 urban (1.220 adm-urb) per citypopulation.de
= {container = "Hunan"}, -- 1.125 urban (1.220 adm-urb) per citypopulation.de
= {container = "Jiangsu"}, -- 1.124 urban (1.210 adm-urb) per citypopulation.de
-- Wanzhou is a "district" of the "direct-administered municipality" of Chongqing but in fact is 142 miles away from Chongqing city proper.
= {placetype = "district", container = {key = "Chongqing", placetype = "direct-administered municipality"}, divs = {"subdistricts", "townships"}, wp = "%l, %c"}, -- 1.078 urban (1.190 adm-urb) per citypopulation.de
= {container = {key = "Inner Mongolia, China", placetype = "autonomous region"}}, -- 1.093 urban (1.180 adm-urb) per citypopulation.de
= {alias_of = "Ulanhad"},
= {alias_of = "Ulanhad", display = true},
= {container = "Hubei"}, -- < 0.750 urban (1.180 adm-urb) per citypopulation.de
= {container = "Guangdong"}, -- 1.036 urban (1.160 adm-urb) per citypopulation.de
= {container = "Jiangsu"}, -- 4.599 prefectural, 2.0 urban; 1.071 urban (1.150 adm-urb) per citypopulation.de
= {container = "Yunnan"}, -- 0.976 urban (1.150 adm-urb) per citypopulation.de
-- Shuyang is a "county" of the "prefecture-level city" of Suqian but in fact is 38 miles away from Suqian city proper (urban core to urban core).
-- The county itself is 37 miles by 34 miles.
= {placetype = "county", container = {key = "Suqian", placetype = "prefecture-level city"}, divs = {"subdistricts", "townships"}, wp = "%l County"}, -- 0.986 urban (1.120 adm-urb) per citypopulation.de
-- Yongkang is a "county-level city" of the "prefecture-level city" of Jinhua but in fact is 32 miles away from Jinhua city proper (urban core to urban core).
= {placetype = "county-level city", container = {key = "Jinhua", placetype = "prefecture-level city"}, divs = {"subdistricts", "townships"}, wp = "%l, Zhejiang"}, -- < 0.750 urban (1.110 adm-urb) per citypopulation.de
= {container = "Henan"}, -- 9.0 prefectural, 721,000 urban (1.6 metro); < 0.750 urban (1.100 adm-urb) per citypopulation.de
= {container = {key = "Guangxi, China", placetype = "autonomous region"}}, -- < 1 urban (1.090 adm-urb) per citypopulation.de
= {container = "Jiangxi"}, -- < 0.750 urban (1.080 adm-urb) per citypopulation.de
= {container = "Hunan"}, -- 6.6 prefectural, 802,000 urban, 1.4 metro; < 1 urban (1.080 adm-urb) per citypopulation.de
= {container = "Anhui"}, -- < 0.750 urban (1.070 adm-urb) per citypopulation.de
= {container = "Hebei"}, -- 0.885 urban (1.070 adm-urb) per citypopulation.de
= {container = "Hubei"}, -- 0.955 urban (1.070 adm-urb) per citypopulation.de
= {container = "Liaoning"}, -- 0.764 urban (1.060 adm-urb) per citypopulation.de
= {container = "Shandong"}, -- 0.961 urban (1.050 adm-urb) per citypopulation.de
= {container = {key = "Guangxi, China", placetype = "autonomous region"}}, -- 0.921 urban (1.050 adm-urb) per citypopulation.de
-- Liuyang is a "county-level city" of the "prefecture-level city" of Changsha but in fact is 47 miles away from Changsha city proper (urban core to urban core).
= {placetype = "county-level city", container = {key = "Changsha", placetype = "prefecture-level city"}, divs = {"subdistricts", "townships"}}, -- 0.886 urban (1.040 adm-urb) per citypopulation.de
-- NOTE: Not to be confused with Changzhou in Jiangsu
= {container = "Hebei"}, -- 7.3 prefectural, 621,000 urban; 0.759 urban (1.030 adm-urb) per citypopulation.de
= {container = "Guizhou"}, -- < 0.750 urban (1.030 adm-urb) per citypopulation.de
= {container = "Liaoning"}, -- 0.980 urban (1.030 adm-urb) per citypopulation.de
= {container = "Heilongjiang"}, -- 1.030 urban (1.030 adm-urb) per citypopulation.de
= {container = "Shanxi"}, -- < 0.750 urban (1.010 adm-urb) per citypopulation.de
-- Tengzhou is a "county-level city" of the "prefecture-level city" of Zaozhuang but in fact is 30 miles away from Zaozhuang city proper (urban core to urban core).
= {placetype = "county-level city", container = {key = "Zaozhuang", placetype = "prefecture-level city"}, divs = {"subdistricts", "townships"}}, -- 0.937 urban (1.010 adm-urb) per citypopulation.de
-- Three extra cities that were added in earlier incarnations and aren't found on the "major agglomerations of the
-- world" page (https://citypopulation.de/en/world/agglomerations/, reference date 2025-01-01).
= {container = "Jiangsu"}, -- 1.652 urban (2020 China census) per citypopulation.de
= {container = "Henan"}, -- 7.0 prefectural, 722,000 urban per Wikipedia; 0.754 urban per citypopulation.de
= {container = "Guizhou"}, -- 6.9 prefectural, ? urban, ? metro (not listed in Wikipedia); < 0.750 urban per citypopulation.de
}
export.china_prefecture_level_cities_group = {
-- don't do any transformations between key and placename; in particular, don't chop off anything from
-- "Taizhou, Zhejiang" or "Suzhou, Anhui".
key_to_placename = false,
placename_to_key = false, -- don't add ", China" to make the key
default_container = "China",
canonicalize_key_container = make_canonicalize_key_container(", China", "province"),
-- Prefecture-level cities aren't really cities but allow them to be identified that way, as many people
-- don't understand how Chinese administrative divisions work.
default_placetype = {"prefecture-level city", "city"},
default_divs = {
-- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities,
-- and prefecture-level cities (as well as county-level cities) are considered non-cities.
"districts", "subdistricts", "townships",
{type = "counties", cat_as = "counties and county-level cities"},
{type = "county-level cities", cat_as = "counties and county-level cities"},
},
data = export.china_prefecture_level_cities,
}
-- Needed to avoid problems with two cities called Taizhou and Suzhou.
export.china_prefecture_level_cities_2 = {
-- NOTE: There is also a larger and better-known prefecture-level city Taizhou in Zhejiang.
= {container = "Jiangsu"}, -- 1.3 urban (1.490 adm-urb) per citypopulation.de 2020 census
= {alias_of = "Taizhou, Jiangsu"},
-- NOTE: There is also a larger and better-known prefecture-level city Suzhou in Jiangsu.
= {container = "Anhui"}, -- 5.3 prefectural, 1.766 metro and "urban"; < 1 urban (1.010 adm-urb) per citypopulation.de 2020 census
-- hopefully this will work because we also have Suzhou as a key by itself for the larger, better-known Suzhou in Jiangsu
= {alias_of = "Suzhou, Anhui"},
}
export.china_prefecture_level_cities_group_2 = {
-- don't do any transformations between key and placename; in particular, don't chop off anything from
-- "Taizhou, Jiangsu" or "Suzhou, Anhui".
placename_to_key = false, -- don't add ", China" to make the key
default_container = "China",
canonicalize_key_container = make_canonicalize_key_container(", China", "province"),
-- Prefecture-level cities aren't really cities but allow them to be identified that way, as many people
-- don't understand how Chinese administrative divisions work.
default_placetype = {"prefecture-level city", "city"},
default_divs = {
-- "towns" (but not "townships") are automatically added as they are specified as generic_before_non_cities,
-- and prefecture-level cities (as well as county-level cities) are considered non-cities.
"districts", "subdistricts", "townships",
{type = "counties", cat_as = "counties and county-level cities"},
{type = "county-level cities", cat_as = "counties and county-level cities"},
},
data = export.china_prefecture_level_cities_2,
}
export.finland_regions = {
= {wp = "%l (%c)"},
= {},
= {alias_of = "North Ostrobothnia, Finland", display = true},
= {},
= {},
= {},
= {alias_of = "Northern Savonia, Finland", display = true},
= {},
= {alias_of = "Southern Savonia, Finland", display = true},
= {},
= {},
= {},
= {alias_of = "South Ostrobothnia, Finland", display = true},
= {wp = "%l (region)"},
= {},
= {},
= {},
= {},
= {alias_of = "Päijänne Tavastia, Finland", display = true},
= {},
= {alias_of = "Tavastia Proper, Finland", display = true},
= {},
= {},
= {},
= {the = true, wp = "Åland"},
= {alias_of = "Åland Islands, Finland"}, -- differs in "the"
}
-- regions of Finland
export.finland_group = {
default_container = "Finland",
default_placetype = "region",
default_divs = "municipalities",
data = export.finland_regions,
}
export.france_administrative_regions = {
= {},
= {},
= {wp = "%l (administrative region)"},
= {},
= {},
-- overseas departments are handled in `export.country_like_entities`
-- = {},
= {},
-- = {},
= {},
= {},
-- = {},
-- = {},
= {wp = "%l (administrative region)"},
= {},
= {wp = "%l (administrative region)"},
= {alias_of = "Occitania, France", display = true},
= {},
= {},
-- = {},
}
-- administrative regions of France
export.france_group = {
default_container = "France",
-- Canonically these are 'administrative regions' but also treat as 'region' ('administrative region' falls back
-- to 'region').
default_placetype = "region",
default_divs = {
"communes",
{type = "municipalities", cat_as = "communes"},
"departments",
{type = "prefectures", cat_as = {"prefectures", "departmental capitals"}},
{type = "French prefectures", cat_as = {"prefectures", "departmental capitals"}},
},
data = export.france_administrative_regions,
}
export.france_departments = {
= {container = "Auvergne-Rhône-Alpes"}, -- 01
= {container = "Hauts-de-France"}, -- 02
= {container = "Auvergne-Rhône-Alpes"}, -- 03
= {container = "Provence-Alpes-Côte d'Azur"}, -- 04
= {container = "Provence-Alpes-Côte d'Azur"}, -- 05
= {container = "Provence-Alpes-Côte d'Azur"}, -- 06
= {container = "Auvergne-Rhône-Alpes"}, -- 07
= {container = "Grand Est", wp = "%l (department)"}, -- 08
= {container = "Occitania", wp = "%l (department)"}, -- 09
= {container = "Grand Est"}, -- 10
= {container = "Occitania"}, -- 11
= {container = "Occitania"}, -- 12
= {container = "Provence-Alpes-Côte d'Azur"}, -- 13
= {container = "Normandy", wp = "%l (department)"}, -- 14
= {container = "Auvergne-Rhône-Alpes"}, -- 15
= {container = "Nouvelle-Aquitaine"}, -- 16
= {container = "Nouvelle-Aquitaine"}, -- 17
= {container = "Centre-Val de Loire", wp = "%l (department)"}, -- 18
= {container = "Nouvelle-Aquitaine"}, -- 19
= {container = "Corsica"}, -- 2A
= {container = "Corsica"}, -- 2B
= {container = "Bourgogne-Franche-Comté"}, -- 21
= {alias_of = "Côte-d'Or, France", display = true},
= {container = "Brittany"}, -- 22
= {alias_of = "Côtes-d'Armor, France", display = true},
= {container = "Nouvelle-Aquitaine"}, -- 23
= {container = "Nouvelle-Aquitaine"}, -- 24
= {container = "Bourgogne-Franche-Comté"}, -- 25
= {container = "Auvergne-Rhône-Alpes"}, -- 26
= {container = "Normandy"}, -- 27
= {container = "Centre-Val de Loire"}, -- 28
= {container = "Brittany"}, -- 29
= {container = "Occitania"}, -- 30
= {container = "Occitania"}, -- 31
= {container = "Occitania"}, -- 32
= {container = "Nouvelle-Aquitaine"}, -- 33
= {container = "Occitania"}, -- 34
= {container = "Brittany"}, -- 35
= {container = "Centre-Val de Loire"}, -- 36
= {container = "Centre-Val de Loire"}, -- 37
= {container = "Auvergne-Rhône-Alpes"}, -- 38
= {container = "Bourgogne-Franche-Comté", wp = "%l (department)"}, -- 39
= {container = "Nouvelle-Aquitaine", wp = "%l (department)"}, -- 40
= {container = "Centre-Val de Loire"}, -- 41
= {container = "Auvergne-Rhône-Alpes", wp = "%l (department)"}, -- 42
= {container = "Auvergne-Rhône-Alpes"}, -- 43
= {container = "Pays de la Loire"}, -- 44
= {container = "Centre-Val de Loire"}, -- 45
= {container = "Occitania", wp = "%l (department)"}, -- 46
= {container = "Nouvelle-Aquitaine"}, -- 47
= {container = "Occitania"}, -- 48
= {container = "Pays de la Loire"}, -- 49
= {container = "Normandy"}, -- 50
= {container = "Grand Est", wp = "%l (department)"}, -- 51
= {container = "Grand Est"}, -- 52
= {container = "Pays de la Loire"}, -- 53
= {container = "Grand Est"}, -- 54
= {container = "Grand Est", wp = "%l (department)"}, -- 55
= {container = "Brittany"}, -- 56
= {container = "Grand Est", wp = "%l (department)"}, -- 57
= {container = "Bourgogne-Franche-Comté"}, -- 58
= {container = "Hauts-de-France", wp = "%l (French department)"}, -- 59
= {container = "Hauts-de-France"}, -- 60
= {container = "Normandy"}, -- 61
= {container = "Hauts-de-France"}, -- 62
= {container = "Auvergne-Rhône-Alpes"}, -- 63
= {container = "Nouvelle-Aquitaine"}, -- 64
= {container = "Occitania"}, -- 65
= {container = "Occitania"}, -- 66
= {container = "Grand Est"}, -- 67
= {container = "Grand Est"}, -- 68
= {container = "Auvergne-Rhône-Alpes", wp = "%l (department)"}, -- 69D
= {container = "Auvergne-Rhône-Alpes", the = true}, -- 69M
= {alias_of = "Metropolis of Lyon, France"},
= {alias_of = "Metropolis of Lyon, France"},
= {container = "Bourgogne-Franche-Comté"}, -- 70
= {container = "Bourgogne-Franche-Comté"}, -- 71
= {container = "Pays de la Loire"}, -- 72
= {container = "Auvergne-Rhône-Alpes"}, -- 73
= {container = "Auvergne-Rhône-Alpes"}, -- 74
= {container = "Île-de-France"}, -- 75
= {container = "Normandy"}, -- 76
= {container = "Île-de-France"}, -- 77
= {container = "Île-de-France"}, -- 78
= {container = "Nouvelle-Aquitaine"}, -- 79
= {container = "Hauts-de-France", wp = "%l (department)"}, -- 80
= {container = "Occitania", wp = "%l (department)"}, -- 81
= {container = "Occitania"}, -- 82
= {container = "Provence-Alpes-Côte d'Azur", wp = "%l (department)"}, -- 83
= {container = "Provence-Alpes-Côte d'Azur"}, -- 84
= {container = "Pays de la Loire"}, -- 85
= {container = "Nouvelle-Aquitaine", wp = "%l (department)"}, -- 86
= {container = "Nouvelle-Aquitaine"}, -- 87
= {container = "Grand Est", wp = "%l (department)"}, -- 88
= {container = "Bourgogne-Franche-Comté"}, -- 89
= {container = "Bourgogne-Franche-Comté"}, -- 90
= {container = "Île-de-France"}, -- 91
= {container = "Île-de-France"}, -- 92
= {container = "Île-de-France"}, -- 93
= {container = "Île-de-France"}, -- 94
= {container = "Île-de-France"}, -- 95
-- = {container = "Guadeloupe"}, -- 971
-- = {container = "Martinique"}, -- 972
-- = {container = "French Guiana", wp = "French Guiana"}, -- 973
-- = {container = "Réunion", wp = "Réunion"}, -- 974
-- = {container = "Mayotte"}, -- 976
}
export.france_departments_group = {
placename_to_key = make_placename_to_key(", France"),
canonicalize_key_container = make_canonicalize_key_container(", France", "region"),
default_placetype = "department",
default_divs = {
"communes",
{type = "municipalities", cat_as = "communes"},
},
data = export.france_departments,
}
export.germany_states = {
= {},
= {},
-- Berlin, Bremen and Hamburg are effectively city-states and don't have districts, so override
-- the default_divs setting. Better not to include them at all since they're included as cities down below.
-- = {divs = {}},
= {},
-- = {divs = {}},
-- = {divs = {}},
= {},
= {},
= {},
= {alias_of = "Mecklenburg-Vorpommern, Germany", display = true},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
}
-- states of Germany
export.germany_group = {
default_container = "Germany",
default_placetype = "state",
default_divs = {"districts", "municipalities"},
data = export.germany_states,
}
export.greece_regions = {
= {wp = "%l (region)"},
= {wp = "%l (administrative region)"},
= {},
= {},
= {},
= {wp = "%l (region)"},
= {the = true, wp = "%l (region)"},
= {the = true},
-- I would expect 'the Peloponnese' but Wikipedia's category names mostly omit the article; only one of them
-- has "the" in it.
= {wp = "%l (region)"},
= {the = true},
= {},
= {},
= {},
= {placetype = {"autonomous region", "region"}, wp = "Monastic community of Mount Athos"},
}
-- regions of Greece
export.greece_group = {
default_container = "Greece",
default_placetype = "region",
default_divs = {"regional units", "municipalities"},
data = export.greece_regions,
}
local india_polity_with_divisions = {"divisions", "districts"}
local india_polity_without_divisions = {"districts"}
-- States and union territories of India. Only some of them are divided into divisions.
export.india_states_and_union_territories = {
=
{the = true, placetype = "union territory", divs = india_polity_without_divisions},
= {divs = india_polity_without_divisions},
= {divs = india_polity_with_divisions},
= {divs = india_polity_with_divisions},
= {divs = india_polity_with_divisions},
= {placetype = "union territory", divs = india_polity_without_divisions},
= {divs = india_polity_with_divisions},
= {placetype = "union territory", divs = india_polity_without_divisions},
= {placetype = "union territory", divs = india_polity_with_divisions},
= {divs = india_polity_without_divisions},
= {divs = india_polity_without_divisions},
= {divs = india_polity_with_divisions},
= {divs = india_polity_with_divisions},
= {placetype = "union territory", divs = india_polity_with_divisions,
wp = "%l (union territory)"},
= {divs = india_polity_with_divisions},
= {divs = india_polity_with_divisions},
= {divs = india_polity_without_divisions},
= {placetype = "union territory", divs = india_polity_with_divisions},
= {placetype = "union territory", divs = india_polity_without_divisions},
= {divs = india_polity_with_divisions},
= {divs = india_polity_with_divisions},
= {divs = india_polity_without_divisions},
= {divs = india_polity_with_divisions},
= {divs = india_polity_without_divisions},
= {divs = india_polity_with_divisions},
= {divs = india_polity_with_divisions},
= {placetype = "union territory", divs = india_polity_without_divisions,
wp = "%l (union territory)"},
= {alias_of = "Puducherry, India", display = true},
= {divs = india_polity_with_divisions, wp = "%l, %c"},
= {divs = india_polity_with_divisions},
= {divs = india_polity_without_divisions},
= {divs = india_polity_without_divisions},
= {divs = india_polity_without_divisions},
= {divs = india_polity_without_divisions},
= {divs = india_polity_with_divisions},
= {divs = india_polity_with_divisions},
= {divs = india_polity_with_divisions},
}
-- states and union territories of India
export.india_group = {
default_container = "India",
default_placetype = "state",
data = export.india_states_and_union_territories,
}
export.indonesia_provinces = {
= {},
= {},
= {the = true},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {wp = "%l"},
= {the = true, wp = "Jakarta"},
= {alias_of = "Special Capital Region of Jakarta, Indonesia"},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {wp = "%l (province)"},
= {},
= {the = true},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {wp = "%l (province)"},
= {},
= {},
= {the = true},
= {alias_of = "Special Region of Yogyakarta, Indonesia"},
}
-- provinces of Indonesia
export.indonesia_group = {
default_container = "Indonesia",
default_placetype = "province",
-- per https://www.quora.com/Does-Indonesia-use-British-or-American-English, Indonesia tends to use American
-- spellings.
data = export.indonesia_provinces,
}
export.iran_provinces = {
= {}, -- abbreviation AL
= {}, -- abbreviation AR
= {}, -- abbreviation BU
= {}, -- abbreviation CB
= {}, -- abbreviation EA
= {}, -- abbreviation FA
= {alias_of = "Fars Province, Iran", display = true},
= {}, -- abbreviation GN
= {}, -- abbreviation GO
= {}, -- abbreviation HA
= {}, -- abbreviation HO
= {}, -- abbreviation IL
= {}, -- abbreviation IS
= {}, -- abbreviation KN
= {}, -- abbreviation KE
= {}, -- abbreviation KH
= {}, -- abbreviation KB
= {}, -- abbreviation KU
= {}, -- abbreviation LO
= {}, -- abbreviation MA
= {}, -- abbreviation MN
= {}, -- abbreviation NK
= {}, -- abbreviation QA
= {}, -- abbreviation QM
= {}, -- abbreviation RK
= {}, -- abbreviation SE
= {}, -- abbreviation SB
= {}, -- abbreviation SK
= {}, -- abbreviation TE
= {}, -- abbreviation WA
= {}, -- abbreviation YA
= {}, -- abbreviation ZA
}
-- provinces of Iran
export.iran_group = {
key_to_placename = make_key_to_placename(", Iran$", " Province$"),
placename_to_key = make_placename_to_key(", Iran", " Province"),
default_container = "Iran",
default_placetype = "province",
-- There aren't nearly enough counties of Iran currently entered in any language to allow for categorizing them
-- per-province. (As of 2025-05-09, there are only 6 counties in each of the three categories examined.)
-- default_divs = "counties",
-- For obscure reasons, Wikipedia article titles for provinces of Iran, Laos, Thailand and Vietnam use lowercase
-- 'province'.
default_wp = "%e province",
data = export.iran_provinces,
}
export.ireland_counties = {
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
}
local function make_irish_type_key_to_placename(container_pattern)
return function(key)
key = key:gsub(container_pattern, "")
local elliptical_key = key:gsub("^County ", "")
return key, elliptical_key
end
end
local function make_irish_type_placename_to_key(container_suffix)
return function(placename)
if not placename:find("^County ") and not placename:find("^City ") then
placename = "County " .. placename
end
return placename .. container_suffix
end
end
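-- Illustration only (a sketch, not part of the module's logic). Assuming the two factories above are called with
-- the arguments that `export.ireland_group` passes below, and taking "County Cork, Ireland" as a sample key, the
-- generated functions behave as follows:
--   local to_placename = make_irish_type_key_to_placename(", Ireland$")
--   to_placename("County Cork, Ireland")   --> "County Cork", "Cork" (full placename, then elliptical placename)
--   local to_key = make_irish_type_placename_to_key(", Ireland")
--   to_key("Cork")                         --> "County Cork, Ireland" ("County " is prepended when missing)
--   to_key("County Cork")                  --> "County Cork, Ireland" (already prefixed, only the suffix is added)
-- A placename beginning with "City " is likewise left unprefixed and only gets the container suffix.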
-- counties of Ireland
export.ireland_group = {
key_to_placename = make_irish_type_key_to_placename(", Ireland$"),
placename_to_key = make_irish_type_placename_to_key(", Ireland"),
default_container = "Ireland",
default_placetype = "county",
data = export.ireland_counties,
}
export.italy_administrative_regions = {
= {},
= {placetype = {"autonomous region", "administrative region", "region"}},
= {},
= {},
= {},
= {},
= {},
= {placetype = {"autonomous region", "administrative region", "region"}},
= {},
= {},
= {},
= {},
= {},
= {},
= {placetype = {"autonomous region", "administrative region", "region"}},
= {placetype = {"autonomous region", "administrative region", "region"}},
= {placetype = {"autonomous region", "administrative region", "region"}},
= {},
= {},
= {},
}
-- administrative regions of Italy
export.italy_group = {
default_container = "Italy",
default_placetype = "region",
data = export.italy_administrative_regions,
}
-- table of Japanese prefectures; interpolated into the main 'places' table, but also needed separately
export.japan_prefectures = {
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {divs = "subprefectures", wp = "Hokkaido"},
= {},
= {alias_of = "Hyōgo Prefecture, Japan", display = true},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {alias_of = "Kōchi Prefecture, Japan", display = true},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {alias_of = "Ōita Prefecture, Japan", display = true},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
}
-- prefectures of Japan
export.japan_group = {
key_to_placename = make_key_to_placename(", Japan$", " Prefecture$"),
placename_to_key = make_placename_to_key(", Japan", " Prefecture"),
default_container = "Japan",
default_placetype = "prefecture",
data = export.japan_prefectures,
}
export.laos_provinces = {
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {placetype = "prefecture", wp = "%l"},
= {},
= {},
= {},
= {},
}
local function laos_placename_to_key(placename)
if placename == "Vientiane Prefecture" then
return placename .. ", Laos"
end
if placename:find(" Province$") then
return placename .. ", Laos"
end
return placename .. " Province, Laos"
end
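-- Illustration only (a sketch, not part of the module's logic). The function does not consult the data table, so
-- any placename works as input; "Champasak" is used here purely as a sample name:
--   laos_placename_to_key("Vientiane Prefecture")  --> "Vientiane Prefecture, Laos" (special-cased prefecture)
--   laos_placename_to_key("Champasak Province")    --> "Champasak Province, Laos" (suffix already present)
--   laos_placename_to_key("Champasak")             --> "Champasak Province, Laos" (" Province" is appended)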
-- provinces of Laos
export.laos_group = {
key_to_placename = make_key_to_placename(", Laos$", {" Province$", " Prefecture$"}),
placename_to_key = laos_placename_to_key,
default_container = "Laos",
default_placetype = "province",
-- For obscure reasons, Wikipedia article titles for provinces of Iran, Laos, Thailand and Vietnam use lowercase
-- 'province'.
default_wp = "%e province",
data = export.laos_provinces,
}
export.lebanon_governorates = {
= {},
= {},
= {},
= {},
= {},
= {},
= {},
-- These two are generic enough that we don't want to automatically augment a use of `gov/North Governorate` or
-- `gov/South Governorate` with `c/Lebanon`.
= {no_auto_augment_container = true},
= {no_auto_augment_container = true},
}
-- governorates of Lebanon
export.lebanon_group = {
key_to_placename = make_key_to_placename(", Lebanon$", " Governorate$"),
placename_to_key = make_placename_to_key(", Lebanon", " Governorate"),
default_container = "Lebanon",
default_placetype = "governorate",
data = export.lebanon_governorates,
}
export.malaysia_states = {
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
}
-- states of Malaysia
export.malaysia_group = {
default_container = "Malaysia",
default_placetype = "state",
default_wp = "%l, %c",
data = export.malaysia_states,
}
export.malta_regions = {
-- Some of the regions are generic enough that we don't want to automatically augment a use of e.g.
-- `r/Northern Region` with `c/Malta`. In particular:
-- * "Eastern Region" also occurs at least in Ghana, Uganda, Iceland, Nigeria, Venezuela, North Macedonia and
-- El Salvador;
-- * "Northern Region" also occurs at least in Ghana, Uganda, Malawi, Nigeria, Canada and South Africa;
-- * "Western Region" also occurs at least in Abu Dhabi, Bahrain, South Africa, Ghana, Iceland, Nepal, Nigeria,
-- Serbia and Uganda;
-- * "Southern Region" also occurs at least in Nigeria, Eritrea, Iceland, Ireland, Malawi and Serbia.
= {no_auto_augment_container = true},
= {wp = "%l"},
= {no_auto_augment_container = true},
= {},
= {no_auto_augment_container = true},
= {no_auto_augment_container = true},
}
-- regions of Malta
export.malta_group = {
key_to_placename = make_key_to_placename(", Malta$", " Region$"),
placename_to_key = make_placename_to_key(", Malta", " Region"),
default_container = "Malta",
default_placetype = "region",
default_wp = "%l, %c",
default_the = true,
data = export.malta_regions,
}
export.mexico_states = {
= {},
= {},
-- not display-canonicalizing because the "Norte" could be for emphasis
= {alias_of = "Baja California, Mexico"},
= {},
= {},
= {},
= {wp = "%l (state)"},
= {},
= {},
= {},
= {},
= {},
= {wp = "%l (state)"},
= {},
= {the = true},
= {alias_of = "State of Mexico, Mexico"}, -- differs in "the"
-- = {}, doesn't belong here because it's a city
= {},
= {alias_of = "Michoacán, Mexico", display = true},
= {},
= {},
= {},
= {alias_of = "Nuevo León, Mexico", display = true},
= {},
= {},
= {},
= {alias_of = "Querétaro, Mexico", display = true},
= {},
= {},
= {alias_of = "San Luis Potosí, Mexico", display = true},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {alias_of = "Yucatán, Mexico", display = true},
= {},
}
-- Mexican states
export.mexico_group = {
default_container = "Mexico",
default_placetype = "state",
data = export.mexico_states,
}
export.moldova_districts_and_autonomous_territorial_units = {
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {placetype = "municipality"},
= {placetype = "municipality"},
= {placetype = {"autonomous territorial unit", "autonomous region", "region"}}, -- capital ]
-- the remainder are under the de-facto control of the unrecognized state of Transnistria
= {placetype = "municipality"},
= {alias_of = "Bender, Moldova"},
= {placetype = {"autonomous territorial unit", "autonomous region", "region"}}, -- capital ]
= {alias_of = "Transnistria, Moldova", the = true},
= {alias_of = "Transnistria, Moldova", the = true},
}
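local function moldova_placename_to_key(placename)
-- Use the bare name if it is itself a key in the data table (e.g. "Transnistria, Moldova").
local elliptical_key = placename .. ", Moldova"
if export.moldova_districts_and_autonomous_territorial_units[elliptical_key] then
return elliptical_key
end
-- Otherwise assume a district, appending " District" if the user omitted it.
if placename:find(" District$") then
return placename .. ", Moldova"
end
return placename .. " District, Moldova"
end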
-- Moldovan districts (raions) and autonomous territorial units
export.moldova_group = {
key_to_placename = make_key_to_placename(", Moldova$", " District"),
placename_to_key = moldova_placename_to_key,
default_container = "Moldova",
default_placetype = {"district", "raion"},
default_divs = "communes",
data = export.moldova_districts_and_autonomous_territorial_units,
}
export.morocco_regions = {
= {},
= {wp = "%l (%c)"},
= {alias_of = "Oriental, Morocco", display = true},
= {},
= {wp = "Rabat-Salé-Kénitra"},
= {alias_of = "Rabat-Sale-Kenitra, Morocco", display = true},
= {wp = "Béni Mellal-Khénifra"},
= {alias_of = "Beni Mellal-Khenifra, Morocco", display = true},
= {},
= {wp = "Marrakesh–Safi"}, -- WP title has en-dash
= {alias_of = "Marrakesh-Safi, Morocco", display = true},
= {wp = "Drâa-Tafilalet"},
= {alias_of = "Draa-Tafilalet, Morocco", display = true},
= {},
= {
keydesc = "+++. '''NOTE:''' This region lies partly within the disputed territory of ]"
},
= {
wp = "Laâyoune-Sakia El Hamra",
keydesc = "+++. '''NOTE:''' This region lies almost completely within the disputed territory of ]",
},
= {alias_of = "Laayoune-Sakia El Hamra, Morocco", display = true},
= {
keydesc = "+++. '''NOTE:''' This region lies completely within the disputed territory of ]",
},
}
-- regions of Morocco
export.morocco_group = {
default_container = "Morocco",
default_placetype = "region",
data = export.morocco_regions,
}
export.netherlands_provinces = {
= {},
= {},
= {},
= {},
= {wp = "%l (province)"},
= {wp = "%l (%c)"},
= {},
-- Foreign forms get display-canonicalized.
= {alias_of = "North Brabant, Netherlands", display = true},
= {},
= {alias_of = "North Holland, Netherlands", display = true},
= {},
= {},
= {alias_of = "South Holland, Netherlands", display = true},
= {wp = "%l (province)"},
= {},
}
-- provinces of the Netherlands
export.netherlands_group = {
default_container = "Netherlands",
default_placetype = "province",
default_divs = "municipalities",
data = export.netherlands_provinces,
}
export.new_zealand_regions = {
-- North Island regions
= {wp = "%l Region"}, -- ISO 3166-2 code NZ-NTL, number 1, capital ]
= {wp = "%l Region"}, -- ISO 3166-2 code NZ-AUK, number 2, capital ]
= {}, -- ISO 3166-2 code NZ-WKO, number 3, capital ]
= {the = true, wp = "%l Region"}, -- ISO 3166-2 code NZ-BOP, number 4, capital ]
= {placetype = {"region", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-GIS, number 5, capital ]
= {}, -- ISO 3166-2 code NZ-HKB, number 6, capital ]
= {}, -- ISO 3166-2 code NZ-TKI, number 7, capital ]
= {}, -- ISO 3166-2 code NZ-MWT, number 8, capital ]
= {alias_of = "Manawatū-Whanganui, New Zealand", display = true},
= {alias_of = "Manawatū-Whanganui, New Zealand", display = true},
= {wp = "%l Region"}, -- ISO 3166-2 code NZ-WGN, number 9, capital ]
-- South Island regions
= {placetype = {"region", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-TAS, number 10, capital ]
= {placetype = {"region", "city"}, wp = "%l, %c", is_city = true}, -- ISO 3166-2 code NZ-NSN, number 11, capital ]
= {placetype = {"region", "district"}, wp = "%l District"}, -- ISO 3166-2 code NZ-MBH, number 12, capital ]
= {the = true, wp = "%l Region"}, -- ISO 3166-2 code NZ-WTC, number 13, capital ]
= {wp = "%l Region"}, -- ISO 3166-2 code NZ-CAN, number 14, capital ]
= {}, -- ISO 3166-2 code NZ-OTA, number 15, capital ]
= {wp = "%l Region"}, -- ISO 3166-2 code NZ-STL, number 16, capital ]
}
-- regions of New Zealand
export.new_zealand_group = {
default_container = "New Zealand",
default_placetype = "region",
data = export.new_zealand_regions,
}
export.nigeria_states = {
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {
-- not a state but allow it to be referenced as one in holonyms
placetype = {"federal territory", "territory", "state"}, the = true, wp = "%l (%c)",
},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
}
-- states of Nigeria
export.nigeria_group = {
key_to_placename = make_key_to_placename(", Nigeria$", " State$"),
placename_to_key = make_placename_to_key(", Nigeria", " State"),
default_container = "Nigeria",
default_placetype = "state",
data = export.nigeria_states,
}
export.north_korea_provinces = {
= {},
= {},
= {},
= {},
= {},
= {wp = "%l (%c)"},
= {},
= {},
= {},
}
-- provinces of North Korea
export.north_korea_group = {
key_to_placename = make_key_to_placename(", North Korea$", " Province$"),
placename_to_key = make_placename_to_key(", North Korea", " Province"),
default_container = "North Korea",
default_placetype = "province",
data = export.north_korea_provinces,
}
export.norwegian_counties = {
= {},
= {},
= {},
= {},
= {},
= {},
= {},
-- the following two were merged into Innlandet
-- = {},
-- = {},
= {},
= {},
= {},
-- the following two were merged into Agder
-- = {},
-- = {},
= {},
-- the following two were merged into Vestland
-- = {},
-- = {},
= {},
= {},
= {},
= {},
}
-- counties of Norway
export.norway_group = {
default_container = "Norway",
default_placetype = "county",
data = export.norwegian_counties,
}
export.pakistan_provinces_and_territories = {
= {
placetype = {"administrative territory", "autonomous territory", "territory"},
},
= {alias_of = "Azad Kashmir, Pakistan", display = true},
= {wp = "%l, %c"},
= {
placetype = {"administrative territory", "territory"},
},
= {
the = true,
divs = {}, -- no divisions
placetype = {"federal territory", "administrative territory", "territory"},
},
-- Islamabad is an accepted alias for Islamabad Capital Territory given the above placetypes
= {alias_of = "Islamabad Capital Territory, Pakistan"},
= {},
= {wp = "%l, %c"},
= {},
}
-- provinces and territories of Pakistan
export.pakistan_group = {
default_container = "Pakistan",
default_placetype = "province",
default_divs = "divisions",
data = export.pakistan_provinces_and_territories,
}
export.philippines_provinces = {
= {wp = "%l (province)"},
= {},
= {},
= {},
= {},
= {wp = "%l (province)"},
= {},
= {wp = "%l (province)"},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {the = true},
= {},
= {},
= {},
= {},
= {},
= {},
= {wp = "%l (province)"},
= {wp = "%l (province)"},
= {},
= {wp = "%l (province)"},
= {},
= {},
= {wp = "%l (province)"},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {wp = "%l (province)"},
= {},
= {wp = "%l (province)"},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
-- not a province but treated as one; allow it to be referred to as a province in holonyms
= {placetype = {"region", "province"}},
}
-- provinces of the Philippines
export.philippines_group = {
default_container = "Philippines",
default_placetype = "province",
default_divs = {"municipalities", "barangays"},
data = export.philippines_provinces,
}
export.poland_voivodeships = {
= {}, -- abbr DS, code 02, capital Wrocław
= {}, -- abbr KP, code 04, capital Bydgoszcz (seat of voivode), Toruń (seat of sejmik and marshal)
= {}, -- abbr LU, code 06, capital Lublin
= {}, -- abbr LB, code 08, capital Gorzów Wielkopolski (seat of voivode), Zielona Góra (seat of sejmik and marshal)
= {wp = "Łódź Voivodeship"}, -- abbr LD, code 10, capital Łódź
= {alias_of = "Lodz Voivodeship, Poland", display = true, display_as_full = true},
= {}, -- abbr MA, code 12, capital Kraków
= {}, -- abbr MZ, code 14, capital Warsaw
= {}, -- abbr OP, code 16, capital Opole
= {}, -- abbr PK, code 18, capital Rzeszów
= {}, -- abbr PD, code 20, capital Białystok
= {}, -- abbr PM, code 22, capital Gdańsk
= {}, -- abbr SL, code 24, capital Katowice
= {wp = "Świętokrzyskie Voivodeship"}, -- abbr SK, code 26, capital Kielce
= {alias_of = "Holy Cross Voivodeship, Poland", display = true, display_as_full = true},
= {}, -- abbr WN, code 28, capital Olsztyn
= {}, -- abbr WP, code 30, capital Poznań
= {}, -- abbr ZP, code 32, capital Szczecin
}
-- voivodeships of Poland
export.poland_group = {
key_to_placename = make_key_to_placename(", Poland$", " Voivodeship$"),
placename_to_key = make_placename_to_key(", Poland", " Voivodeship"),
default_container = "Poland",
default_placetype = "voivodeship",
default_divs = {
-- "counties", -- not enough of them currently
{type = "Polish colonies", cat_as = {{type = "villages", prep = "in"}}},
},
data = export.poland_voivodeships,
}
export.portugal_districts_and_autonomous_regions = {
= {the = true, placetype = {"autonomous region", "region"}},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {alias_of = "Lisbon District, Portugal", display = true},
= {placetype = {"autonomous region", "region"}},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
}
local function portugal_placename_to_key(placename)
if placename == "Azores" or placename == "Madeira" then
return placename .. ", Portugal"
end
if placename:find(" District$") then
return placename .. ", Portugal"
end
return placename .. " District, Portugal"
end
-- districts and autonomous regions of Portugal
export.portugal_group = {
key_to_placename = make_key_to_placename(", Portugal$", " District$"),
placename_to_key = portugal_placename_to_key,
default_container = "Portugal",
default_placetype = "district",
default_divs = "municipalities",
data = export.portugal_districts_and_autonomous_regions,
}
export.romania_counties = {
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
-- Bucharest: not in a county
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
}
-- counties of Romania
export.romania_group = {
key_to_placename = make_key_to_placename(", Romania$", " County$"),
placename_to_key = make_placename_to_key(", Romania", " County"),
default_container = "Romania",
default_placetype = "county",
default_divs = "communes",
data = export.romania_counties,
}
local function make_russia_federal_subject_spec(spectype, use_the, wp)
return {
placetype = spectype,
the = not not use_the,
bare_category_parent_type = {"federal subjects", spectype .. "s"},
wp = wp,
}
end
local russia_autonomous_okrug_no_the =
{placetype = {"autonomous okrug", "okrug"}, bare_category_parent_type = {"federal subjects", "autonomous okrugs"}}
local russia_autonomous_okrug_the =
{placetype = {"autonomous okrug", "okrug"}, bare_category_parent_type = {"federal subjects", "autonomous okrugs"},
the = true}
local russia_krai = make_russia_federal_subject_spec("krai")
local russia_oblast = make_russia_federal_subject_spec("oblast")
local russia_republic_the = make_russia_federal_subject_spec("republic", "use the")
local russia_republic_no_the = make_russia_federal_subject_spec("republic")
export.russia_federal_subjects = {
-- autonomous oblasts
=
{the = true, placetype = {"autonomous oblast", "oblast"},
bare_category_parent_type = {"federal subjects", "autonomous oblasts"}},
-- autonomous okrugs
= russia_autonomous_okrug_the,
= {alias_of = "Chukotka Autonomous Okrug, Russia"},
= russia_autonomous_okrug_the,
= {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"},
= {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"},
= {alias_of = "Khanty-Mansi Autonomous Okrug, Russia"},
= russia_autonomous_okrug_the,
= {alias_of = "Nenets Autonomous Okrug, Russia"},
= russia_autonomous_okrug_the,
= {alias_of = "Yamalo-Nenets Autonomous Okrug, Russia"},
-- krais
= russia_krai,
= russia_krai,
= russia_krai,
= russia_krai,
= russia_krai,
= russia_krai,
= russia_krai,
= russia_krai,
= russia_krai,
-- oblasts
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
= russia_oblast,
-- republics
--
-- We only need to include cases that aren't just shortened versions of the full federal subject name (i.e. where
-- words like "Republic" and "Oblast" are omitted but the name is not otherwise modified; these are handled by
-- key_to_placename). Non-display-canonicalizing aliases are generally due to differences in the presence or absence
-- of "the".
= russia_republic_no_the,
= {alias_of = "Adygea, Russia", the = true},
= russia_republic_no_the,
= {alias_of = "Bashkortostan, Russia", the = true},
= {alias_of = "Bashkortostan, Russia"},
= russia_republic_no_the,
= {alias_of = "Buryatia, Russia", the = true},
= russia_republic_no_the,
= {alias_of = "Dagestan, Russia", the = true},
= russia_republic_no_the,
= {alias_of = "Ingushetia, Russia", the = true},
= russia_republic_no_the,
= {alias_of = "Kalmykia, Russia", the = true},
= make_russia_federal_subject_spec("republic", nil, "Republic of Karelia"),
= {alias_of = "Karelia, Russia", the = true},
= russia_republic_no_the,
= {alias_of = "Khakassia, Russia", the = true},
= russia_republic_no_the,
= {alias_of = "Mordovia, Russia", the = true},
= make_russia_federal_subject_spec("republic", nil, "North Ossetia–Alania"), -- with en-dash
= {alias_of = "North Ossetia-Alania, Russia", the = true},
= {alias_of = "North Ossetia-Alania, Russia", display = true},
= {alias_of = "North Ossetia-Alania, Russia", display = true},
= russia_republic_no_the,
= {alias_of = "Tatarstan, Russia", the = true},
= russia_republic_the,
= russia_republic_no_the,
= {alias_of = "Chechnya, Russia", the = true},
= russia_republic_no_the,
= {alias_of = "Chuvashia, Russia", the = true},
= russia_republic_no_the,
= {alias_of = "Kabardino-Balkaria, Russia", display = true},
= {alias_of = "Kabardino-Balkaria, Russia", the = true},
= {alias_of = "Kabardino-Balkaria, Russia",
display = "Kabardino-Balkarian Republic, Russia", the = true},
= russia_republic_no_the,
= {alias_of = "Karachay-Cherkessia, Russia"},
= make_russia_federal_subject_spec("republic", nil, "Komi Republic"),
= {alias_of = "Komi, Russia", the = true},
= russia_republic_no_the,
= {alias_of = "Mari El, Russia", the = true},
= make_russia_federal_subject_spec("republic", nil, "Sakha Republic"),
= {alias_of = "Sakha, Russia", the = true},
= {alias_of = "Sakha, Russia"},
= {alias_of = "Sakha, Russia", display = "Yakutia, Russia"},
= {alias_of = "Sakha, Russia", display = "Sakha Republic, Russia",
the = true},
= russia_republic_no_the,
= {alias_of = "Tuva, Russia", display = true},
= {alias_of = "Tuva, Russia", the = true},
= {alias_of = "Tuva, Russia", display = "Tuva Republic, Russia", the = true},
= russia_republic_no_the,
= {alias_of = "Udmurtia, Russia", the = true},
-- Not included due to being unrecognized and only partly controlled:
-- = make_russia_federal_subject_spec("republic", nil, "Republic of Crimea (Russia)")
-- = russia_republic_the,
-- = russia_republic_the,
-- = make_russia_federal_subject_spec("oblast", nil, "Russian occupation of Zaporizhzhia Oblast"),
-- = make_russia_federal_subject_spec("oblast", nil, "Russian occupation of Kherson Oblast"),
-- There are also federal cities (not included because they're cities):
-- Moscow, Saint Petersburg; Sevastopol (unrecognized; same status as for "Crimea, Russia" above)
}
local function russia_key_to_placename(key)
key = key:gsub(",.*", "")
local full_placename = key
if key == "Jewish Autonomous Oblast" then
return full_placename, full_placename
end
local elliptical_placename
for _, suffix in ipairs({"Krai", "Oblast"}) do
elliptical_placename = key:match("^(.*) " .. suffix .. "$")
if elliptical_placename then
return full_placename, elliptical_placename
end
end
return full_placename, full_placename
end
local function russia_placename_to_key(placename)
-- Use the placename as-is if it directly corresponds to a key in the data table.
local key = placename .. ", Russia"
if export.russia_federal_subjects[key] then
return key
end
-- We allow the user to say e.g. "obl/Samara" in place of "obl/Samara Oblast".
for _, suffix in ipairs({"Krai", "Oblast"}) do
local suffixed_key = placename .. " " .. suffix .. ", Russia"
if export.russia_federal_subjects[suffixed_key] then
return suffixed_key
end
end
return placename .. ", Russia"
end
local function construct_russia_federal_subject_keydesc(group, key, spec)
local placename = key:gsub(",.*", "")
local linked_placename = export.construct_linked_placename(spec, placename)
local placetype = spec.placetype
if type(placetype) == "table" then
-- When several placetypes are listed, use the first one.
placetype = placetype[1]
end
if placetype == "oblast" then
-- Hack: Oblasts generally don't have entries under "Foo Oblast"
-- but just under "Foo", so fix the linked key appropriately;
-- doesn't apply to the Jewish Autonomous Oblast
linked_placename = linked_placename:gsub(" Oblast%]%]", "%]%] Oblast")
end
return linked_placename .. ", a ] (]) of ]"
end
-- federal subjects of Russia
export.russia_group = {
key_to_placename = russia_key_to_placename,
placename_to_key = russia_placename_to_key,
default_container = "Russia",
default_keydesc = construct_russia_federal_subject_keydesc,
default_overriding_bare_label_parents = {"federal subjects of Russia", "+++"},
data = export.russia_federal_subjects,
}
export.saudi_arabia_provinces = {
= {},
= {},
-- Name is too generic to assume it's in Saudi Arabia if not specified.
= {no_auto_augment_container = true, wp = "%l, %c"},
= {wp = "%l (%c)"},
= {wp = "Asir"},
= {alias_of = "Aseer Province, Saudi Arabia", display = true},
= {},
= {wp = "Al-Qassim Province"},
= {alias_of = "Qassim Province, Saudi Arabia", display = true},
= {},
= {wp = "Ḥa'il Province"},
= {alias_of = "Hail Province, Saudi Arabia", display = true},
= {alias_of = "Hail Province, Saudi Arabia", display = true},
= {wp = "Al-Jawf Province"},
= {alias_of = "Al-Jouf Province, Saudi Arabia", display = true},
= {},
= {},
= {},
}
-- provinces of Saudi Arabia
export.saudi_arabia_group = {
key_to_placename = make_key_to_placename(", Saudi Arabia$", " Province$"),
placename_to_key = make_placename_to_key(", Saudi Arabia", " Province"),
default_container = "Saudi Arabia",
default_placetype = "province",
data = export.saudi_arabia_provinces,
}
export.south_africa_provinces = {
= {the = true},
= {the = true, wp = "%l (province)"},
= {},
= {},
= {},
= {},
-- per Wikipedia and other sources, `North West` doesn't normally have `the` before it
= {wp = "%l (South African province)"},
= {the = true},
= {the = true},
}
-- provinces of South Africa
export.south_africa_group = {
default_container = "South Africa",
default_placetype = "province",
default_divs = "municipalities",
data = export.south_africa_provinces,
}
export.south_korea_provinces = {
= {},
= {},
= {wp = "%l, %c"},
= {},
= {},
= {},
= {},
= {},
= {},
}
-- provinces of South Korea
export.south_korea_group = {
key_to_placename = make_key_to_placename(", South Korea$", " Province$"),
placename_to_key = make_placename_to_key(", South Korea", " Province"),
default_container = "South Korea",
default_placetype = "province",
data = export.south_korea_provinces,
}
export.spain_autonomous_communities = {
= {},
= {},
= {},
= {the = true},
= {the = true, wp = "%l (autonomous community)"},
= {the = true},
= {},
= {},
= {wp = "Castilla–La Mancha"}, -- with en-dash
= {},
= {the = true},
= {},
= {wp = "%l (Spain)"},
= {},
= {wp = "Region of %l"},
= {},
= {wp = "Valencian Community"},
= {alias_of = "Valencia, Spain", the = true},
}
-- autonomous communities of Spain
export.spain_group = {
default_container = "Spain",
default_placetype = "autonomous community",
default_divs = {"municipalities", "comarcas"},
data = export.spain_autonomous_communities,
}
export.taiwan_counties = {
= {},
= {},
= {},
= {},
= {wp = "Kinmen"},
= {wp = "Matsu Islands"},
= {},
= {},
= {wp = "Penghu"},
= {},
= {},
= {wp = "%l, %c"},
= {},
}
-- counties of Taiwan
export.taiwan_group = {
key_to_placename = make_key_to_placename(", Taiwan$", " County$"),
placename_to_key = make_placename_to_key(", Taiwan", " County"),
default_container = "Taiwan",
default_placetype = "county",
default_divs = {"districts", "townships"},
data = export.taiwan_counties,
}
export.thailand_provinces = {
-- Bangkok (special administrative area)
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
}
-- provinces of Thailand
export.thailand_group = {
key_to_placename = make_key_to_placename(", Thailand$", " Province$"),
placename_to_key = make_placename_to_key(", Thailand", " Province"),
default_container = "Thailand",
default_placetype = "province",
default_divs = "districts",
-- For obscure reasons, provinces of Iran, Laos, Thailand and Vietnam use lowercase 'province'
default_wp = "%e province",
data = export.thailand_provinces,
}
export.turkey_provinces = {
= {}, -- code 01
= {}, -- code 02
= {}, -- code 03
= {}, -- code 04
= {}, -- code 05
= {}, -- code 06
= {}, -- code 07
= {}, -- code 08
= {}, -- code 09
= {}, -- code 10
= {}, -- code 11
= {}, -- code 12
= {}, -- code 13
= {}, -- code 14
= {}, -- code 15
= {}, -- code 16
= {}, -- code 17
= {}, -- code 18
= {}, -- code 19
= {}, -- code 20
= {}, -- code 21
= {}, -- code 22
= {}, -- code 23
= {alias_of = "Elazığ Province, Turkey", display = true},
= {}, -- code 24
= {}, -- code 25
= {}, -- code 26
= {}, -- code 27
= {}, -- code 28
= {}, -- code 29
= {}, -- code 30
= {alias_of = "Hakkâri Province, Turkey", display = true},
= {}, -- code 31
= {}, -- code 32
= {}, -- code 33
-- = {}, -- code 34; this is coextensive with the city itself
= {}, -- code 35
= {alias_of = "İzmir Province, Turkey", display = true},
= {}, -- code 36
= {}, -- code 37
= {}, -- code 38
= {}, -- code 39
= {}, -- code 40
= {}, -- code 41
= {}, -- code 42
= {}, -- code 43
= {}, -- code 44
= {}, -- code 45
= {}, -- code 46
= {}, -- code 47
= {}, -- code 48
= {}, -- code 49
= {}, -- code 50
= {}, -- code 51
= {}, -- code 52
= {}, -- code 53
= {}, -- code 54
= {}, -- code 55
= {}, -- code 56
= {}, -- code 57
= {}, -- code 58
= {}, -- code 59
= {}, -- code 60
= {}, -- code 61
= {}, -- code 62
= {}, -- code 63
= {}, -- code 64
= {}, -- code 65
= {}, -- code 66
= {}, -- code 67
= {}, -- code 68
= {}, -- code 69
= {}, -- code 70
= {}, -- code 71
= {}, -- code 72
= {}, -- code 73
= {}, -- code 74
= {}, -- code 75
= {}, -- code 76
= {}, -- code 77
= {}, -- code 78
= {}, -- code 79
= {}, -- code 80
= {}, -- code 81
}
-- provinces of Turkey
export.turkey_group = {
key_to_placename = make_key_to_placename(", Turkey$", " Province$"),
placename_to_key = make_placename_to_key(", Turkey", " Province"),
default_container = "Turkey",
default_placetype = "province",
default_divs = "districts",
data = export.turkey_provinces,
}
export.ukraine_oblasts = {
= {}, -- capital ], license plate prefix CA, IA
= {}, -- capital ], license plate prefix CB, IB
= {}, -- capital ], license plate prefix CE, IE
-- apparently will be renamed to 'Dnipro Oblast'
= {}, -- capital ], license plate prefix AE, KE
= {}, -- capital ''] (])'', license plate prefix AH, KH
= {}, -- capital ], license plate prefix AT, KT
= {}, -- capital ], license plate prefix AX, KX
= {}, -- capital '']'', license plate prefix ''BT, HT''
= {}, -- capital ], license plate prefix BX, HX
-- apparently will be renamed to 'Kropyvnytskyi Oblast'
= {}, -- capital ], license plate prefix BA, HA
= {}, -- capital ], license plate prefix AI, KI
= {alias_of = "Kyiv Oblast, Ukraine", display = true},
= {}, -- capital ''] (])'', license plate prefix BB, HB
= {}, -- capital ], license plate prefix BC, HC
= {}, -- capital ], license plate prefix BE, HE
= {}, -- capital ], license plate prefix BH, HH
= {alias_of = "Odesa Oblast, Ukraine", display = true},
= {}, -- capital ], license plate prefix BI, HI
= {}, -- capital ], license plate prefix BK, HK
= {}, -- capital ], license plate prefix BM, HM
= {}, -- capital ], license plate prefix BO, HO
= {}, -- capital ], license plate prefix AB, KB
= {}, -- capital ], license plate prefix AC, KC
= {}, -- capital ], license plate prefix AO, KO
= {}, -- capital '']'', license plate prefix AP, KP
= {alias_of = "Zaporizhzhia Oblast, Ukraine", display = true},
= {}, -- capital ], license plate prefix AM, KM
}
-- oblasts of Ukraine
export.ukraine_group = {
key_to_placename = make_key_to_placename(", Ukraine$", " Oblast$"),
placename_to_key = make_placename_to_key(", Ukraine", " Oblast"),
default_container = "Ukraine",
default_placetype = "oblast",
default_divs = {"raions", "hromadas"},
data = export.ukraine_oblasts,
}
export.united_kingdom_constituent_countries = {
= {divs = {
"counties",
"districts",
{type = "local government districts", cat_as = "districts"},
{
type = "local government districts with borough status",
cat_as = {"districts", "boroughs"},
},
{type = "boroughs", cat_as = {"districts", "boroughs"}},
{type = "civil parishes", container_parent_type = false},
}},
= {
placetype = {"constituent country", "province", "country"},
divs = {"counties", "districts"},
},
= {divs = {
{type = "council areas", container_parent_type = false},
"districts",
}},
= {divs = {
"counties",
{type = "county boroughs", container_parent_type = false},
{type = "communities", container_parent_type = false},
{type = "Welsh communities", cat_as = {{type = "communities", container_parent_type = false}}},
}},
}
-- constituent countries and provinces of the United Kingdom
export.united_kingdom_group = {
placename_to_key = false,
default_container = "United Kingdom",
default_placetype = {"constituent country", "country"},
addl_divs = {
"traditional counties",
{type = "historical counties", cat_as = "traditional counties"},
},
-- Don't create categories like 'Category:en:Towns in the United Kingdom'
-- or 'Category:en:Places in the United Kingdom'.
default_no_container_cat = true,
data = export.united_kingdom_constituent_countries,
}
export.england_counties = {
-- NOTE: We used to have various other "no longer" counties commented out, apparently referring to counties that
-- existed officially at some point between 1889 and 1974; I have removed those. I have kept only the three
-- ceremonial counties that existed from 1974 (when ceremonial counties were created) to 1996, as well as those
-- still considered "historic counties" per ].
-- = {wp = "%l (county)"}, -- no longer (1974 to 1996)
= {},
= {},
-- = {}, -- city
-- = {}, -- city
= {},
= {},
= {},
-- = {wp = "%l (county)"}, -- no longer (1974 to 1996)
= {},
-- = {}, -- no longer (historic county)
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
-- = {}, -- no longer (1974 to 1996)
-- = {}, -- no longer (historic county)
= {the = true},
= {},
= {},
= {},
= {},
= {},
-- = {}, -- no longer (historic county)
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
-- = {}, -- no longer (historic county)
= {},
= {},
= {the = true, wp = "%l (county)"},
-- = {}, -- no longer (historic county)
= {},
= {},
= {},
= {},
-- = {}, -- no longer (historic county)
= {the = true},
}
-- counties of England
export.england_group = {
default_container = {key = "England", placetype = "constituent country"},
default_placetype = "county",
default_divs = {
"districts",
{type = "local government districts", cat_as = "districts"},
{
type = "local government districts with borough status",
cat_as = {"districts", "boroughs"},
},
{type = "boroughs", cat_as = {"districts", "boroughs"}},
"civil parishes",
},
data = export.england_counties,
}
export.northern_ireland_counties = {
= {},
= {},
= {the = true, is_city = true, wp = "Belfast"},
= {},
= {},
= {},
= {the = true, is_city = true, wp = "Derry"},
= {},
}
-- counties of Northern Ireland
export.northern_ireland_group = {
key_to_placename = make_irish_type_key_to_placename(", Northern Ireland$"),
placename_to_key = make_irish_type_placename_to_key(", Northern Ireland"),
default_container = {key = "Northern Ireland", placetype = "constituent country"},
default_placetype = "county",
data = export.northern_ireland_counties,
}
export.scotland_council_areas = {
= {},
= {wp = "%l, %c"},
= {},
= {the = true, wp = "Aberdeen"},
= {alias_of = "City of Aberdeen, Scotland"},
= {alias_of = "City of Aberdeen, Scotland"},
= {the = true, wp = "Dundee"},
= {alias_of = "City of Dundee, Scotland"},
= {alias_of = "City of Dundee, Scotland"},
= {the = true, wp = "%l council area"},
= {alias_of = "City of Edinburgh, Scotland"},
= {the = true, wp = "Glasgow"},
= {alias_of = "City of Glasgow, Scotland"},
= {},
= {},
= {},
= {},
= {},
= {},
= {wp = "%l council area"},
= {},
= {wp = "%l council area"},
= {},
= {},
= {},
= {},
= {},
= {the = true},
= {},
= {},
= {the = true},
= {the = true},
= {},
= {},
= {wp = "%l council area"},
= {},
= {},
= {the = true, wp = "Outer Hebrides"},
= {alias_of = "Western Isles, Scotland"},
}
-- council areas of Scotland
export.scotland_group = {
default_container = {key = "Scotland", placetype = "constituent country"},
default_placetype = "council area",
data = export.scotland_council_areas,
}
export.wales_principal_areas = {
= {},
= {wp = "%l County Borough"},
= {wp = "%l County Borough"},
-- = {placetype = "city"},
= {placetype = "county"},
= {placetype = "county"},
= {wp = "%l County Borough"},
= {placetype = "county"},
= {placetype = "county"},
= {placetype = "county"},
= {the = true, placetype = "county"},
= {alias_of = "Isle of Anglesey, Wales"}, -- differs in "the"
= {wp = "%l County Borough"},
= {placetype = "county"},
= {},
-- = {placetype = "city", wp = "%l, %c"},
= {placetype = "county"},
= {placetype = "county"},
= {},
-- = {placetype = "city"},
= {},
= {the = true},
= {wp = "%l County Borough"},
}
-- principal areas (cities, counties and county boroughs) of Wales
export.wales_group = {
default_container = {key = "Wales", placetype = "constituent country"},
default_placetype = "county borough",
data = export.wales_principal_areas,
}
export.united_states_states = {
= {},
= {divs = {
{type = "boroughs", container_parent_type = "counties"},
{type = "borough seats", container_parent_type = "county seats"},
}},
= {},
= {},
= {},
= {divs = {"counties", "county seats", "municipalities"}},
= {divs = {"counties", "county seats", "municipalities"}},
= {},
= {},
= {wp = "%l (U.S. state)"},
= {addl_parents = {"Polynesia"}},
= {},
= {},
= {},
= {},
= {},
= {},
= {divs = {
{type = "parishes", container_parent_type = "counties"},
{type = "parish seats", container_parent_type = "county seats"},
}},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {divs = {
"counties", "county seats",
{type = "boroughs", prep = "in"},
}},
= {},
= {wp = "%l (state)"},
= {},
= {},
= {},
= {},
= {},
= {divs = {
"counties", "county seats",
{type = "boroughs", prep = "in"},
}},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {},
= {wp = "%l (state)"},
= {},
= {},
= {},
}
-- states of the United States
export.united_states_group = {
placename_to_key = make_placename_to_key(", USA"),
default_container = "United States",
default_placetype = "state",
default_divs = {"counties", "county seats"},
addl_divs = {
{type = "census-designated places", prep = "in"},
{type = "unincorporated communities", prep = "in"},
},
data = export.united_states_states,
}
export.vietnam_provinces = {
-- ] region
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
-- ] region
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {alias_of = "Hoà Bình Province, Vietnam", display = true},
= {}, -- capital ]
= {}, -- capital ]
-- ] region
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
-- = {placetype = {"municipality", "city"}}, -- capital ]
-- = {placetype = {"municipality", "city"}}, -- capital ]
-- ] region
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {alias_of = "Thanh Hoá Province, Vietnam", display = true},
-- = {placetype = {"municipality", "city"}, wp = "Huế"}, -- capital ]
-- ] region
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
-- ] region
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {alias_of = "Khánh Hoà Province, Vietnam", display = true},
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
-- = {placetype = {"municipality", "city"}}, -- capital ]
-- ] region
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
-- = {placetype = {"municipality", "city"}}, -- capital ]
-- ] region
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
= {}, -- capital ]
-- = {placetype = {"municipality", "city"}, wp = "Cần Thơ"}, -- capital ]
}
-- provinces of Vietnam
export.vietnam_group = {
key_to_placename = make_key_to_placename(", Vietnam$", " Province$"),
placename_to_key = make_placename_to_key(", Vietnam", " Province"),
default_container = "Vietnam",
default_placetype = "province",
-- There may not be enough districts to subcategorize like this.
-- default_divs = "districts",
-- For obscure reasons, Wikipedia article titles for provinces of Iran, Laos, Thailand and Vietnam use lowercase
-- 'province'.
default_wp = "%e province",
data = export.vietnam_provinces,
}
-----------------------------------------------------------------------------------
-- City data --
-----------------------------------------------------------------------------------
export.australia_cities = {
= {container = "South Australia"}, -- 1,450,000 (Agglomeration)
= {container = "Queensland"}, -- 3,450,000 (Conglomeration; including the Gold Coast)
= {container = {key = "Australian Capital Territory, Australia", placetype = "territory"}}, -- 510,641 (2024 estimate)
= {container = "Victoria"}, -- 5,200,000 (Agglomeration)
= {container = "New South Wales", wp = "%l, %c"}, -- 534,033 (2024 estimate)
= {alias_of = "Newcastle, New South Wales"},
= {container = "Western Australia"}, -- 2,350,000 (Agglomeration)
= {container = "New South Wales"}, -- 5,100,000 (Agglomeration)
}
export.australia_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Australia", "state"),
default_placetype = "city",
data = export.australia_cities,
}
export.brazil_cities = {
-- Figures from citypopulation.de; retrieved 2025-04-27; reference date 2025-01-01.
= {container = "São Paulo"}, -- 22,600,000 (Consolidated Urban Area; including Guarulhos)
= {alias_of = "São Paulo", display = true},
= {container = "Rio de Janeiro"}, -- 13,600,000 (Consolidated Urban Area)
= {container = "Minas Gerais"}, -- 5,300,000
= {container = "Pernambuco"}, -- 4,100,000
= {container = "Rio Grande do Sul"}, -- 3,950,000 (Consolidated Urban Area)
= {container = "Distrito Federal"}, -- 3,850,000
= {alias_of = "Brasília", display = true},
= {container = "Ceará"}, -- 3,825,000
= {container = "Bahia", wp = "%l, %c", commonscat = "%l (%c)"}, -- 3,400,000
= {container = "Paraná"}, -- 3,375,000
= {container = "São Paulo"}, -- 3,250,000
= {container = "Goiás"}, -- 2,525,000
= {alias_of = "Goiânia", display = true},
= {container = "Amazonas"}, -- 2,275,000
= {container = "Pará"}, -- 2,200,000
= {alias_of = "Belém", display = true},
= {container = "Espírito Santo", wp = "%l, %c"}, -- 1,870,000
= {alias_of = "Vitória", display = true},
= {container = "São Paulo", wp = "%l, %c"}, -- 1,760,000
= {container = "Maranhão", wp = "%l, %c"}, -- 1,530,000
= {alias_of = "São Luís", display = true},
= {container = "Rio Grande do Norte", wp = "%l, %c"}, -- 1,360,000
= {container = "Santa Catarina"}, -- 1,260,000
= {alias_of = "Florianópolis", display = true},
= {container = "Alagoas"}, -- 1,220,000
= {alias_of = "Maceió", display = true},
= {container = "Paraíba", wp = "%l, %c"}, -- 1,210,000
= {alias_of = "João Pessoa", display = true},
= {container = "São Paulo"}, -- 1,090,000
= {alias_of = "São José dos Campos", display = true},
= {container = "Paraná"}, -- 1,050,000
= {container = "Piauí"}, -- 1,040,000
}
export.brazil_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Brazil", "state"),
default_placetype = "city",
data = export.brazil_cities,
}
export.canada_cities = {
-- Figures from citypopulation.de; retrieved 2025-04-27; reference date 2025-01-01.
= {container = "Ontario"}, -- 7,850,000 (Consolidated Urban Area; including Hamilton)
= {container = "Quebec"}, -- 4,500,000 (Consolidated Urban Area)
= {container = "British Columbia"}, -- 3,175,000 (Consolidated Urban Area)
= {container = "Alberta"}, -- 1,510,000 (Consolidated Urban Area)
= {container = "Alberta"}, -- 1,460,000 (Consolidated Urban Area)
= {container = "Ontario"}, -- 1,390,000 (Consolidated Urban Area)
= {container = "Quebec"}, -- 839,311 metro per Wikipedia (2021 census)
= {container = "Manitoba"}, -- 834,678 metro per Wikipedia (2021 census)
= {container = "Ontario", wp = "%l, %c"}, -- 785,184 metro per Wikipedia (2021 census)
= {container = "Ontario", wp = "%l, %c"}, -- 575,847 metro per Wikipedia (2021 census)
}
export.canada_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Canada", "province"),
default_placetype = "city",
data = export.canada_cities,
}
export.france_cities = {
-- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01.
= {container = "Île-de-France"}, -- 11,500,000 (Conglomeration)
= {container = "Auvergne-Rhône-Alpes"}, -- 2,050,000 (Conglomeration)
= {alias_of = "Lyon", display = true},
= {container = "Provence-Alpes-Côte d'Azur"}, -- 1,710,000 (Conglomeration)
= {alias_of = "Marseille", display = true},
= {container = "Hauts-de-France"}, -- 1,320,000 (Conglomeration)
= {container = "Nouvelle-Aquitaine"}, -- 1,160,000 (Conglomeration)
= {container = "Occitania"}, -- 1,150,000 (Conglomeration)
= {container = "Provence-Alpes-Côte d'Azur"},
= {container = "Pays de la Loire"},
= {container = "Grand Est"},
= {container = "Brittany"},
}
export.france_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", France", "region"),
default_placetype = "city",
data = export.france_cities,
}
export.germany_cities = {
-- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01.
-- listed under Rhein-Ruhr Area, total population 10,900,000 (Consolidated Urban Area)
= {container = "North Rhine-Westphalia"},
= {alias_of = "Cologne", display = true},
= {container = "North Rhine-Westphalia"},
= {alias_of = "Düsseldorf", display = true},
= {container = "North Rhine-Westphalia"},
= {container = "North Rhine-Westphalia"},
= {container = "North Rhine-Westphalia"},
= {}, -- 4,700,000
= {container = "Hesse"}, -- 3,225,000
= {alias_of = "Frankfurt"}, -- not a display alias as it's longer
= {}, -- 2,900,000
= {container = "Bavaria"}, -- 2,300,000
= {container = "Baden-Württemberg"}, -- 2,300,000
= {container = "Baden-Württemberg"}, -- 1,550,000
= {container = "Bavaria"}, -- 1,120,000
= {container = "Lower Saxony"}, -- 1,090,000
= {container = "North Rhine-Westphalia"}, -- 1,080,000
= {container = "Saxony"}, -- 1,080,000
= {container = "North Rhine-Westphalia"}, -- 1,000,000
= {alias_of = "Aachen"}, -- historical; not a display alias
= {},
}
export.germany_cities_group = {
default_container = "Germany",
canonicalize_key_container = make_canonicalize_key_container(", Germany", "state"),
default_placetype = "city",
data = export.germany_cities,
}
export.india_cities = {
-- This lists the 65 metro areas per Demographia's 2023 estimates, as found in
-- ]. The last census in India (as of April 2025) was
-- conducted in 2011, and its results are no longer accurate.
= {container = {key = "Delhi, India", placetype = "union territory"}}, -- 31,190,000
= {container = "Maharashtra"}, -- 25,189,000
= {container = "West Bengal"}, -- 21,747,000
= {container = "Karnataka", wp = "Bengaluru"}, -- 15,257,000
= {alias_of = "Bangalore"},
= {container = "Tamil Nadu"}, -- 11,570,000
= {container = "Telangana"}, -- 9,797,000
= {container = "Gujarat"}, -- 8,006,000
= {container = "Maharashtra"}, -- 6,819,000
= {container = "Gujarat"}, -- 6,601,000
= {container = "Uttar Pradesh"}, -- 4,661,000
= {container = "Rajasthan"}, -- 4,360,000
= {container = "Uttar Pradesh"}, -- 4,350,000
= {container = "Madhya Pradesh"}, -- 3,765,000
= {container = "Maharashtra"}, -- 3,493,000
= {container = "Bihar"}, -- 3,331,000
= {container = "Uttar Pradesh"}, -- 3,229,000
= {container = "Kerala"}, -- 3,049,000
= {container = "Kerala"}, -- 2,851,000
= {container = "Uttar Pradesh"}, -- 2,737,000
= {container = "Madhya Pradesh"}, -- 2,562,000
= {container = "Tamil Nadu"}, -- 2,551,000
= {container = "Uttar Pradesh", wp = "Prayagraj"}, -- 2,438,000
= {alias_of = "Allahabad"},
= {container = "Kerala"}, -- 2,381,000
= {container = "Punjab"}, -- 2,205,000
= {container = "Gujarat"}, -- 2,182,000
= {container = {key = "Chandigarh, India", placetype = "union territory"}}, -- 2,168,000
= {container = "Tamil Nadu"}, -- 2,048,000
= {container = "Uttar Pradesh"}, -- 2,011,000
= {container = "Andhra Pradesh"}, -- 2,005,000
= {container = "Jharkhand"}, -- 1,925,000
= {container = "Kerala"}, -- 1,868,000
= {container = "Maharashtra"}, -- 1,810,000
= {container = "West Bengal"}, -- 1,720,000
= {container = "Uttar Pradesh"}, -- 1,660,000
= {container = "Jharkhand"}, -- 1,638,000
= {container = "Kerala"}, -- 1,578,000
= {container = "Kerala"}, -- 1,576,000
= {container = "Madhya Pradesh"}, -- 1,533,000
= {container = "Jharkhand"}, -- 1,503,000
= {container = "Rajasthan"}, -- 1,497,000
= {container = "Maharashtra"}, -- 1,490,000
= {alias_of = "Aurangabad"},
= {container = "Gujarat"}, -- 1,487,000
= {container = "Madhya Pradesh"}, -- 1,477,000
= {container = "Chhattisgarh"}, -- 1,429,000
= {container = "Uttar Pradesh"}, -- 1,410,000
= {container = "Kerala"}, -- 1,360,000
= {container = "Uttar Pradesh"}, -- 1,355,000
= {container = "Assam"}, -- 1,355,000
= {container = "Uttar Pradesh"}, -- 1,345,000
= {container = "Punjab"}, -- 1,313,000
= {container = "Karnataka"}, -- 1,296,000
= {container = "Chhattisgarh"}, -- 1,293,000
= {alias_of = "Bhilai"},
= {alias_of = "Bhilai"},
= {alias_of = "Bhilai"},
= {alias_of = "Bhilai"},
= {container = "Andhra Pradesh"}, -- 1,232,000
= {container = {key = "Jammu and Kashmir, India", placetype = "union territory"}}, -- 1,212,000
= {container = "Tamil Nadu", wp = "%l, %c"}, -- 1,189,000
= {container = "Rajasthan"}, -- 1,172,000
= {container = "Punjab"}, -- 1,165,000
= {container = "Uttar Pradesh"}, -- 1,152,000
= {container = "Uttarakhand"}, -- 1,136,000
= {container = "Tamil Nadu"}, -- 1,131,000
= {container = "Odisha"}, -- 1,112,000
= {container = {key = "Jammu and Kashmir, India", placetype = "union territory"}}, -- 1,103,000
= {container = "Maharashtra"}, -- 1,082,000
= {container = "Karnataka", wp = "Hubli–Dharwad"}, -- 1,062,000; wp with en dash
= {alias_of = "Hubli-Dharwad"},
= {alias_of = "Hubli-Dharwad"},
= {container = {key = "Puducherry, India", placetype = "union territory"}}, -- 1,024,000
= {alias_of = "Puducherry", display = true},
-- satellite/secondary cities of metro area (none in citypopulation.de)
= {container = "Uttar Pradesh"}, -- 1,729,000 city, 2,358,525 urban agglomeration per 2011 census; 3,406,061 2025 estimate from official website; part of Delhi metro area
= {container = "Haryana"}, -- 1,414,050 city per 2011 census; part of Delhi metro area
= {container = "Maharashtra"}, -- 1,841,488 city per 2011 census; part of Mumbai metro area
= {container = "Maharashtra"}, -- 1,246,381 city per 2011 census; part of Mumbai metro area
= {alias_of = "Kalyan-Dombivli", display = true},
= {alias_of = "Kalyan-Dombivli"},
= {alias_of = "Kalyan-Dombivli"},
= {alias_of = "Kalyan-Dombivli"},
= {container = "Maharashtra"}, -- 1,221,233 city per 2011 census; part of Mumbai metro area
= {alias_of = "Vasai-Virar"},
= {alias_of = "Vasai-Virar"},
= {container = "Maharashtra"}, -- 1,120,547 city per 2011 census; part of Mumbai metro area
= {container = "West Bengal"}, -- 1,077,075 city ("metropolis"), 2,811,344 "metro" per 2011 census; part of Kolkata metro area
= {container = "Maharashtra"}, -- 1,727,692 per 2011 census; part of Pune metro area
= {alias_of = "Pimpri-Chinchwad", display = true},
}
export.india_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", India", "state"),
default_placetype = "city",
data = export.india_cities,
}
export.indonesia_cities = {
-- cities where the city proper has more than 1,000,000 people as of mid-2023 estimate
= {container = "Special Capital Region of Jakarta", divs = {
{type = "subdistricts", container_parent_type = false},
}},
= {container = "East Java"},
= {container = "West Java"}, -- part of Jakarta metro area
= {container = "West Java"},
= {container = "North Sumatra"},
= {container = "West Java"}, -- part of Jakarta metro area
= {container = "Banten"}, -- part of Jakarta metro area
= {container = "South Sumatra"},
= {container = "Central Java"},
= {container = "South Sulawesi"},
= {container = "Banten"}, -- part of Jakarta metro area
= {container = "Riau Islands"},
= {container = "West Java"}, -- part of Jakarta metro area
= {container = "Riau"},
= {container = "Lampung"},
-- other metro areas over 1,000,000 people
= {container = "West Sumatra"},
= {container = "East Kalimantan"},
= {container = "East Java"},
= {container = "Special Region of Yogyakarta"},
= {container = "Bali"},
= {container = "West Java"},
= {container = "Central Java"},
= {container = "South Kalimantan"},
= {container = "West Java"},
}
export.indonesia_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Indonesia", "province"),
default_placetype = "city",
data = export.indonesia_cities,
}
export.italy_cities = {
-- Data per ]. There are several lists given; the most recent one, used
-- here, only gives estimates as of Jan 1, 2014.
= {container = "Lombardy"}, -- 6,623,798
= {container = "Campania"}, -- 5,294,546
= {container = "Lazio"}, -- 4,447,881
= {container = "Piedmont"}, -- 1,865,284
= {container = "Veneto"}, -- 1,645,900
= {container = "Tuscany"}, -- 1,485,030
= {container = "Apulia"}, -- 1,257,459
= {container = "Sicily"}, -- 1,183,084
-- Include a few whose metro area was just below 1,000,000 but may exceed it by now (depending on the definition).
= {container = "Sicily"}, -- 988,240
= {container = "Lombardy"}, -- 924,090
= {container = "Liguria"}, -- 861,318
}
export.italy_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Italy", "region"),
default_placetype = "city",
data = export.italy_cities,
}
export.japan_cities = {
-- Population figures from ]. Metro areas from
-- ].
= {keydesc = "] Metropolis, the ] and a ] of ] (which is a country in ])",
placetype = {"city", "prefecture"},
divs = {
{type = "special wards", container_parent_type = false},
{type = "cities", prep = "in"},
},
},
= {container = "Kanagawa"}, -- 3,697,894
= {container = "Osaka"}, -- 2,668,586
= {container = "Aichi"}, -- 2,283,289
-- FIXME, Hokkaido is handled specially.
= {container = "Hokkaido"}, -- 1,918,096
= {container = "Fukuoka"}, -- 1,581,527
= {container = "Hyōgo"}, -- 1,530,847
= {container = "Kyoto"}, -- 1,474,570
= {container = "Kanagawa", wp = "%l, Kanagawa"}, -- 1,373,630
= {container = "Saitama", wp = "%l (city)", commonscat = "%l, %c"}, -- 1,192,418
= {container = "Hiroshima"}, -- 1,163,806
= {container = "Miyagi"}, -- 1,029,552
-- the remaining cities are considered "central cities" in a 1,000,000+ metro area
-- (sometimes there is more than one central city in the area).
= {container = "Fukuoka"}, -- 986,998
= {container = "Chiba", wp = "%l (city)", commonscat = "%l, %c"}, -- 938,695
= {container = "Osaka"}, -- 835,333
= {container = "Niigata", wp = "%l (city)", commonscat = "%l, %c"}, -- 813,053
= {container = "Shizuoka"}, -- 811,431
= {container = "Shizuoka", wp = "%l (city)", commonscat = "%l, %c"}, -- 710,944
= {container = "Kanagawa"}, -- 706,342
= {container = "Okayama"}, -- 701,293
= {container = "Kumamoto"}, -- 670,348
= {container = "Kagoshima"}, -- 605,196
-- Skipped 6 cities (Funabashi, Hachiōji, Kawaguchi, Himeji, Matsuyama, Higashiōsaka)
-- with populations in the range 509k - 587k because they are not central cities in any
-- 1,000,000+ metro area.
= {container = "Tochigi"}, -- 507,833
}
export.japan_cities_group = {
default_container = "Japan",
canonicalize_key_container = make_canonicalize_key_container(" Prefecture, Japan", "prefecture"),
default_placetype = "city",
data = export.japan_cities,
}
export.mexico_cities = {
= {}, -- its own state
= {container = "Nuevo León"},
= {container = "Jalisco"},
= {container = "Puebla", wp = "%l (city)"},
= {container = "State of Mexico"},
= {container = "Baja California"},
-- Include the state in the category for León due to possible confusion with León, Spain.
= {container = "Guanajuato", wp = "%l, %c"},
= {alias_of = "León, Guanajuato"},
= {alias_of = "León, Guanajuato", display = true},
= {container = "Querétaro", wp = "%l (city)"},
= {alias_of = "Querétaro", display = true},
= {container = "Chihuahua"},
= {alias_of = "Ciudad Juárez"},
= {alias_of = "Ciudad Juárez", display = "Juárez"},
= {container = "Coahuila"},
= {alias_of = "Torreón", display = true},
-- Include the state in the category for Mérida due to possible confusion with Mérida, Spain or
-- Mérida, Venezuela.
= {container = "Yucatán", wp = "%l, %c"},
= {alias_of = "Mérida, Yucatán"},
= {alias_of = "Mérida, Yucatán", display = true},
= {container = "San Luis Potosí", wp = "%l (city)"},
= {alias_of = "San Luis Potosí", display = true},
= {container = "Aguascalientes", wp = "%l (city)"},
= {container = "Baja California"},
}
export.mexico_cities_group = {
default_container = "Mexico",
canonicalize_key_container = make_canonicalize_key_container(", Mexico", "state"),
default_placetype = "city",
data = export.mexico_cities,
}
export.nigeria_cities = {
-- Figures from citypopulation.de unless otherwise indicated; retrieved 2025-04-26; reference date 2025-01-01.
= {container = "Lagos"}, -- 21,300,000 (unindicated; population of low reliability)
= {container = "Kano", wp = "%l (city)"}, -- 5,350,000 (unindicated; population of low reliability)
= {container = "Oyo"}, -- 3,400,000 (unindicated; population of low reliability)
= {container = {key = "Federal Capital Territory, Nigeria", placetype = "federal territory"}}, -- 3,050,000 (unindicated; population of low reliability)
= {container = "Rivers"}, -- 2,250,000 (unindicated; population of low reliability)
= {container = "Kaduna"}, -- 1,980,000 (unindicated; population of low reliability)
= {container = "Edo"}, -- 1,790,000 (unindicated; population of low reliability)
= {container = "Abia", wp = "%l, Nigeria"}, -- 1,280,000 (unindicated; population of low reliability)
= {container = "Anambra"}, -- 1,230,000 (unindicated; population of low reliability)
= {container = "Borno"}, -- 1,190,000 (unindicated; population of low reliability)
= {container = "Kwara"}, -- 1,160,000 (unindicated; population of low reliability)
= {container = "Sokoto", wp = "%l (city)"}, -- 1,140,000 (unindicated; population of low reliability)
= {container = "Plateau"}, -- 1,110,000 (unindicated; population of low reliability)
= {container = "Kaduna"}, -- 1,050,000 (unindicated; population of low reliability)
= {container = "Enugu", wp = "%l (city)"}, -- 1,010,000 (unindicated; population of low reliability)
}
export.nigeria_cities_group = {
default_container = "Nigeria",
canonicalize_key_container = make_canonicalize_key_container(" State, Nigeria", "state"),
default_placetype = "city",
data = export.nigeria_cities,
}
export.pakistan_cities = {
-- Figures from citypopulation.de; retrieved 2025-04-26; reference date 2025-01-01.
= {container = "Sindh"}, -- 21,000,000 (Consolidated Urban Area)
= {container = "Punjab"}, -- 14,600,000 (Consolidated Urban Area)
= {container = "Punjab"}, -- 5,600,000 (Consolidated Urban Area; including Islamabad)
= {container = {key = "Islamabad Capital Territory, Pakistan", placetype = "federal territory"}}, -- 5,600,000 (Consolidated Urban Area; including Rawalpindi)
= {container = "Punjab"}, -- 4,125,000 (Consolidated Urban Area)
= {container = "Punjab"}, -- 3,450,000 (Consolidated Urban Area)
-- there is also Hyderabad in India (very confusing)
= {container = "Sindh", wp = "%l, %c"}, -- 2,475,000 (Consolidated Urban Area)
= {alias_of = "Hyderabad, Pakistan"},
= {container = "Punjab"}, -- 2,425,000 (Consolidated Urban Area)
= {container = "Khyber Pakhtunkhwa"}, -- 2,150,000 (Consolidated Urban Area)
= {container = "Balochistan"}, -- 1,720,000 (Urban Area)
= {container = "Punjab"}, -- 1,080,000 (Urban Area)
= {container = "Punjab"}, -- 1,050,000 (Consolidated Urban Area)
}
export.pakistan_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Pakistan", "province"),
default_placetype = "city",
data = export.pakistan_cities,
}
export.philippines_cities = {
-- Skipped some cities in Metro Manila (Taguig, Pasig) that don't have districts.
-- Other cities outside Metro Manila are skipped because they are not the central city of their urban area.
= {container = {key = "Metro Manila, Philippines", placetype = "region"}},
-- Don't display-canonicalize Foo to Foo City as it may make the display weird.
= {alias_of = "Quezon City"},
= {container = {key = "Metro Manila, Philippines", placetype = "region"}},
= {container = "Davao del Sur"},
= {alias_of = "Davao City"},
= {container = {key = "Metro Manila, Philippines", placetype = "region"}},
= {container = "Zamboanga del Sur"},
= {alias_of = "Zamboanga City"},
= {container = "Cebu"},
= {alias_of = "Cebu City"},
= {container = "Rizal"},
= {container = "Misamis Oriental"},
= {container = "Cavite"},
= {alias_of = "Dasmariñas", display = true},
= {container = "South Cotabato"},
= {container = "Bulacan"},
= {container = "Negros Occidental"},
= {container = "Laguna", wp = "%l, %c"},
= {container = "Pampanga", wp = "Angeles City"},
= {alias_of = "Angeles"},
= {container = "Iloilo"},
= {alias_of = "Iloilo City"},
}
export.philippines_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Philippines", "province"),
default_placetype = "city",
data = export.philippines_cities,
}
export.russia_cities = {
-- Figures from citypopulation.de; retrieved 2025-04-26; reference date 2025-01-01.
= {}, -- 18,800,000 (Agglomeration)
= {}, -- 6,350,000 (Agglomeration)
= {container = "Novosibirsk Oblast"}, -- 1,820,000 (Agglomeration)
= {container = "Sverdlovsk Oblast"}, -- 1,810,000 (Agglomeration)
= {container = "Nizhny Novgorod Oblast"}, -- 1,620,000 (Agglomeration)
= {container = {key = "Tatarstan, Russia", placetype = "republic"}}, -- 1,560,000 (Agglomeration)
= {container = "Chelyabinsk Oblast"}, -- 1,430,000 (Agglomeration)
= {container = "Rostov Oblast"}, -- 1,390,000 (Agglomeration)
= {alias_of = "Rostov-on-Don", display = true},
= {container = {key = "Krasnodar Krai, Russia", placetype = "krai"}}, -- 1,370,000 (Agglomeration)
= {container = "Samara Oblast"}, -- 1,350,000 (Agglomeration)
= {container = {key = "Krasnoyarsk Krai, Russia", placetype = "krai"}}, -- 1,270,000 (Agglomeration)
= {container = {key = "Bashkortostan, Russia", placetype = "republic"}}, -- 1,230,000 (Agglomeration)
= {container = "Saratov Oblast"}, -- 1,170,000 (Agglomeration)
= {container = "Omsk Oblast"}, -- 1,140,000 (Agglomeration)
= {container = "Voronezh Oblast"}, -- 1,130,000 (Agglomeration)
= {container = "Volgograd Oblast"}, -- 1,080,000 (Agglomeration)
= {container = {key = "Perm Krai, Russia", placetype = "krai"}, wp = "%l, Russia"}, -- 1,070,000 (Agglomeration)
}
export.russia_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Russia", "oblast"),
default_container = "Russia",
default_placetype = "city",
data = export.russia_cities,
}
export.saudi_arabia_cities = {
-- Figures for the first five from ] as of 2022. Unclear if these are
-- metro, urban or city proper figures.
= {container = "Riyadh"}, -- 7,000,100; 7,700,000 per citypopulation.de 2025-01-01 (Agglomeration)
= {container = "Mecca"}, -- 3,751,917; 3,950,000 per citypopulation.de 2025-01-01 (Agglomeration)
= {alias_of = "Jeddah", display = true},
= {alias_of = "Jeddah", display = true},
= {alias_of = "Jeddah", display = true},
= {container = "Eastern"}, -- 2,638,166; 2,925,000 per citypopulation.de 2025-01-01 (Agglomeration)
= {container = "Mecca"}, -- 2,385,509; 2,675,000 per citypopulation.de 2025-01-01 (Agglomeration)
= {alias_of = "Mecca", display = true},
= {container = "Medina"}, -- 1,477,023; 1,530,000 per citypopulation.de 2025-01-01 (City)
= {container = "Eastern"}, -- 1,060,000 per citypopulation.de 2025-01-01 (Agglomeration)
= {container = "Aseer"}, -- 1,030,000 per citypopulation.de 2025-01-01 (Agglomeration)
= {alias_of = "Khamis Mushait", display = true},
}
export.saudi_arabia_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(" Province, Saudi Arabia", "province"),
default_placetype = "city",
data = export.saudi_arabia_cities,
}
export.south_korea_cities = {
-- None of the cities listed is associated with a county.
= {},
= {},
= {},
= {},
= {},
= {},
= {},
}
export.south_korea_cities_group = {
default_container = "South Korea",
canonicalize_key_container = make_canonicalize_key_container(" County, South Korea", "province"),
default_placetype = "city",
data = export.south_korea_cities,
}
export.spain_cities = {
= {container = "Community of Madrid"},
= {container = "Catalonia"},
= {container = "Valencia"},
= {container = "Andalusia"},
= {container = "Basque Country"},
}
export.spain_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", Spain", "autonomous community"),
default_placetype = "city",
data = export.spain_cities,
}
export.taiwan_cities = {
= {},
= {alias_of = "New Taipei City", display = true},
= {},
= {wp = "%l, Taiwan"},
= {},
= {},
= {},
-- these last three are not special municipalities
= {placetype = "city"},
= {placetype = "city"},
= {placetype = "city"},
}
export.taiwan_cities_group = {
placename_to_key = false, -- don't add ", Taiwan" to make the key
canonicalize_key_container = make_canonicalize_key_container(", Taiwan", "county"),
default_container = "Taiwan",
default_placetype = {"special municipality", "municipality", "city"},
default_is_city = true,
default_divs = {"districts"},
data = export.taiwan_cities,
}
-- NOTE: It's OK to mix cities from different constituent countries; as long as the immediate container is correct,
-- everything else will be figured out.
export.united_kingdom_cities = {
= {container = "Greater London"},
= {container = "Greater Manchester"},
= {container = "West Midlands"},
= {container = "Merseyside"},
= {container = {key = "City of Glasgow, Scotland", placetype = "council area"}},
= {container = "West Yorkshire"},
= {container = "Tyne and Wear"},
= {alias_of = "Newcastle upon Tyne"},
= {container = {key = "England", placetype = "constituent country"}},
= {container = {key = "Wales", placetype = "constituent country"}},
= {container = "Hampshire"},
= {container = {key = "City of Edinburgh, Scotland", placetype = "council area"}},
-- under 1,000,000 people but principal areas of Wales; requested by ]
= {container = {key = "Wales", placetype = "constituent country"}},
= {container = {key = "Wales", placetype = "constituent country"}, wp = "Newport, Wales"},
}
export.united_kingdom_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(", England", "county"),
default_placetype = "city",
data = export.united_kingdom_cities,
}
export.united_states_cities = {
-- top 50 CSAs by population, listing the largest city and sometimes the 2nd or 3rd
= {container = "New York", wp = "%l", divs = {
{type = "boroughs", container_parent_type = false},
}},
-- Don't display-canonicalize as it may make the display weird (e.g. in the context New York, New York).
= {alias_of = "New York City"},
= {container = "New Jersey"},
= {container = "California", wp = "%l"},
= {container = "California"},
= {container = "California"},
= {container = "Illinois", wp = "%l"},
= {wp = "%l"},
= {alias_of = "Washington, D.C.", display = true},
= {alias_of = "Washington, D.C.", display = true},
= {alias_of = "Washington, D.C.", display = true},
-- Don't display-canonicalize as it may make the display weird (e.g. if the holonym is followed by a District of
-- Columbia holonym).
= {alias_of = "Washington, D.C."},
= {container = "Maryland", wp = "%l"},
-- to avoid conflict with San Jose in Costa Rica
= {container = "California"},
= {alias_of = "San Jose, California"},
= {container = "California", wp = "%l"},
= {container = "California"},
= {container = "Massachusetts", wp = "%l"},
= {container = "Rhode Island"},
= {container = "Texas", wp = "%l", commonscat = "%l, %c"},
= {container = "Texas"},
= {container = "Pennsylvania", wp = "%l"},
= {container = "Texas", wp = "%l"},
= {container = "Florida", wp = "%l", commonscat = "%l, %c"},
= {container = "Georgia", wp = "%l"},
= {container = "Michigan", wp = "%l"},
= {container = "Arizona", wp = "%l", commonscat = "%l, %c"},
= {container = "Arizona"},
= {container = "Washington", wp = "%l"},
= {container = "Florida"},
= {container = "Minnesota", wp = "%l"},
= {container = "Ohio", wp = "%l", commonscat = "%l, %c"},
= {container = "Colorado", wp = "%l", commonscat = "%l, %c"},
= {container = "California", wp = "%l", commonscat = "%l, %c"},
= {container = "Oregon"},
= {container = "Florida"},
= {container = "Missouri", wp = "%l", commonscat = "%l, %c"},
= {alias_of = "St. Louis", display = true},
= {container = "North Carolina"},
= {container = "California"},
= {container = "Pennsylvania", wp = "%l"},
= {container = "Utah", wp = "%l"},
= {container = "Texas", wp = "%l", commonscat = "%l, %c"},
= {container = "Ohio"},
= {container = "Missouri", wp = "%l metropolitan area", commonscat = "%l, %c"},
= {container = "Indiana", wp = "%l"},
= {container = "Nevada", wp = "%l"},
= {container = "Ohio", wp = "%l", commonscat = "%l, %c"},
= {container = "Texas"},
= {container = "Wisconsin", wp = "%l", commonscat = "%l, %c"},
= {container = "North Carolina"},
= {container = "Tennessee"},
= {container = "Virginia"},
= {container = "Virginia"},
= {container = "North Carolina"},
= {container = "North Carolina"},
= {container = "Florida"},
= {container = "Louisiana", wp = "%l"},
= {container = "Kentucky"},
= {container = "South Carolina"},
= {container = "Connecticut"},
= {container = "Oklahoma", wp = "%l"},
= {container = "Michigan"},
= {container = "Tennessee"},
= {container = "Alabama"},
= {alias_of = "Birmingham, Alabama"},
= {container = "California"},
= {container = "Virginia"},
= {container = "Pennsylvania"},
-- any major city of the top 50 MSAs that was missed above
= {container = "New York"},
-- any of the top 50 cities by city population that was missed above
= {container = "Texas"},
= {container = "New Mexico"},
= {container = "Arizona"},
= {container = "Colorado"},
= {container = "Nebraska"},
= {container = "Oklahoma"},
-- skip Arlington, Texas; too obscure and likely to be interpreted as Arlington, Virginia
}
export.united_states_cities_group = {
default_container = "United States",
canonicalize_key_container = make_canonicalize_key_container(", USA", "state"),
default_placetype = "city",
default_wp = "%l, %c",
data = export.united_states_cities,
}
export.new_york_boroughs = {
= {the = true, wp = "The Bronx"},
= {},
= {},
= {},
= {},
}
export.new_york_boroughs_group = {
default_container = {key = "New York City", placetype = "city"},
default_placetype = "borough",
default_is_city = true,
data = export.new_york_boroughs,
}
export.vietnam_cities = {
-- Figures from citypopulation.de (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated.
= {}, -- 14,300,000 (Agglomeration; including Bien Hoa)
= {alias_of = "Ho Chi Minh City"},
= {}, -- 7,350,000 (Agglomeration)
= {}, -- 1,500,000 (Agglomeration)
= {alias_of = "Da Nang", display = true},
= {}, -- 1,450,000 (Agglomeration)
= {alias_of = "Haiphong", display = true},
-- This is the one entry in this list that is not a province-level municipality; instead it's a "provincial city"
-- meaning it is directly under its province as opposed to being contained in a district.
= {placetype = "city", container = "Đồng Nai", wp = "Biên Hòa"}, -- 1,272,235 (2022 city population per Wikipedia)
= {alias_of = "Bien Hoa", display = true},
= {alias_of = "Bien Hoa", display = true},
-- These two are not in citypopulation.de because the urban population may be slightly under 1,000,000, but they are
-- both province-level municipalities and close to the 1,000,000 mark.
= {wp = "Cần Thơ"}, -- 1,456,000 municipality (2019 census), 994,704 urban (2022 General Statistics Office of Vietnam estimate); capital ]
= {alias_of = "Can Tho", display = true},
= {wp = "Huế"}, -- 1,257,000 municipality (2019 census), 840,000 urban (2022 General Statistics Office of Vietnam estimate); capital ]
= {alias_of = "Hue", display = true},
}
export.vietnam_cities_group = {
placename_to_key = false, -- don't add ", Vietnam" to make the key
default_container = "Vietnam",
canonicalize_key_container = make_canonicalize_key_container(" Province, Vietnam", "province"),
-- Most of the cities listed are province-level municipalities in addition, which contain a certain amount of
-- rural territory surrounding the city, but not enough to separate the municipality from the city as distinct
-- known locations.
default_placetype = {"municipality", "city"},
default_is_city = true,
-- There may not be enough districts to subcategorize like this.
-- default_divs = "districts",
data = export.vietnam_cities,
}
export.misc_cities = {
------------------ Africa -------------------
-- Sorted by country and then within the country, by decreasing population; figures from citypopulation.de
-- (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated; combined with data from
-- ].
= {container = "Algeria"}, -- 4,325,000 (Consolidated Urban Area)
= {container = "Algeria"}, -- 1,640,000 (Consolidated Urban Area)
= {container = "Angola"}, -- 9,650,000 (Urban Area)
= {container = "Angola"}, -- 1,420,000 (Urban Area)
= {container = "Benin"}, -- 2,150,000 (Agglomeration)
= {container = "Burkina Faso"}, -- 3,425,000 (Agglomeration)
= {container = "Burkina Faso"}, -- 1,100,000 (Agglomeration)
= {container = "Burundi"}, -- 1,143,202 (Urban Area 2023 per PopulationStat, cited in Wikipedia)
= {container = "Cameroon"}, -- 3,975,000 (City)
= {alias_of = "Yaoundé", display = true},
= {container = "Cameroon"}, -- 3,900,000 (City)
= {container = "Central African Republic"}, -- 1,680,000 (Agglomeration)
= {container = "Chad"}, -- 1,950,000 (City)
= {alias_of = "N'Djamena", display = true},
= {container = "Democratic Republic of the Congo"}, -- 16,300,000 (City; population of low reliability)
= {container = "Democratic Republic of the Congo"}, -- 2,875,000 (City; population of low reliability)
= {container = "Democratic Republic of the Congo"}, -- 2,500,000 (City; population of low reliability)
= {container = "Democratic Republic of the Congo"}, -- 1,370,000 (City; population of low reliability)
= {container = "Democratic Republic of the Congo"}, -- 1,300,000 (City; population of low reliability)
= {container = "Democratic Republic of the Congo"}, -- 1,100,000 (City; population of low reliability)
= {container = "Democratic Republic of the Congo"}, -- 1,010,000 (City; population of low reliability)
= {container = "Democratic Republic of the Congo"}, -- 1,020,468 (2023 Wikipedia ] from populationstat.com; not in citypopulation.de)
= {container = "Egypt"}, -- 22,800,000 (Agglomeration, including Giza and Shubra El Kheima)
= {container = "Egypt"}, -- 6,250,000 (Agglomeration)
= {container = "Egypt"}, -- 4,458,135 (2023 from citypopulation.de)
= {container = "Egypt"}, -- 1,240,239 (2021 from citypopulation.de)
= {container = "Eritrea"}, -- 1,090,000 (City; population of low reliability)
= {alias_of = "Asmara", display = true},
= {container = "Ethiopia"}, -- 4,825,000 (Agglomeration)
= {container = "Gambia"}, -- 1,170,000 (Agglomeration)
= {container = "Ghana"}, -- 6,800,000 (Agglomeration)
= {container = "Ghana"}, -- 2,900,000 (Agglomeration)
= {container = "Guinea"}, -- 2,975,000 (Consolidated Urban Area)
= {container = "Ivory Coast"}, -- 7,050,000 (Agglomeration)
= {container = "Kenya"}, -- 6,900,000 (unindicated)
= {container = "Kenya"}, -- 1,370,000 (City)
= {container = "Liberia"}, -- 1,940,000 (Urban Area)
= {container = "Libya", wp = "%l, %c"}, -- 1,870,000 (unindicated)
= {container = "Madagascar"}, -- 3,150,000 (Agglomeration)
= {container = "Malawi"}, -- 1,210,000 (City)
= {container = "Mali"}, -- 5,700,000 (Agglomeration)
= {container = "Mauritania"}, -- 1,500,000 (City)
= {container = {key = "Casablanca-Settat, Morocco", placetype = "region"}}, -- 4,450,000 (Municipality (urban population))
= {container = {key = "Rabat-Sale-Kenitra, Morocco", placetype = "region"}}, -- 2,125,000 (Municipality (urban population))
= {container = {key = "Tangier-Tetouan-Al Hoceima, Morocco", placetype = "region"}}, -- 1,410,000 (Municipality (urban population))
= {alias_of = "Tangier", display = true},
= {alias_of = "Tangier", display = true},
= {container = {key = "Fez-Meknes, Morocco", placetype = "region"}, wp = "%l, Morocco"}, -- 1,310,000 (Municipality (urban population))
= {alias_of = "Fez", display = true},
= {alias_of = "Fez", display = true},
= {container = {key = "Souss-Massa, Morocco", placetype = "region"}}, -- 1,270,000 (Municipality (urban population))
= {container = {key = "Marrakesh-Safi, Morocco", placetype = "region"}}, -- 1,140,000 (Municipality (urban population))
= {alias_of = "Marrakesh", display = true},
= {container = "Mozambique"}, -- 2,575,000 (Agglomeration)
= {container = "Niger"}, -- 1,530,000 (City)
= {container = "Republic of the Congo"}, -- 2,475,000 (Agglomeration)
= {container = "Republic of the Congo"}, -- 1,480,000 (City)
= {container = "Rwanda"}, -- 1,960,000 (Municipality (urban population))
= {container = "Senegal"}, -- 4,225,000 (Agglomeration)
= {container = "Senegal"}, -- 1,320,000 (Agglomeration)
= {container = "Sierra Leone"}, -- 1,420,000 (Agglomeration)
= {container = "Somalia"}, -- 2,250,000 (unindicated; population of low reliability)
= {container = {key = "Gauteng, South Africa", placetype = "province"}}, -- 14,800,000 (Consolidated Urban Area; including Pretoria, Soweto, etc.)
= {container = {key = "Western Cape, South Africa", placetype = "province"}}, -- 5,100,000 (Consolidated Urban Area)
= {container = {key = "KwaZulu-Natal, South Africa", placetype = "province"}}, -- 3,900,000 (Consolidated Urban Area)
= {container = {key = "Gauteng, South Africa", placetype = "province"}}, -- 2,921,488 (2011 census)
= {container = {key = "Eastern Cape, South Africa", placetype = "province"}, wp = "Gqeberha"}, -- 1,200,000 (Consolidated Urban Area)
= {alias_of = "Port Elizabeth"}, -- official name; not a display alias
= {container = "Sudan"}, -- 7,200,000 (unindicated; population of low reliability)
= {container = "Tanzania"}, -- 6,650,000 (Agglomeration)
= {container = "Tanzania"}, -- 1,340,000 (Agglomeration)
= {alias_of = "Mwanza", display = true},
= {container = "Tanzania"}, -- 1,190,000 (Agglomeration)
= {container = "Tanzania"}, -- 1,030,000 (Agglomeration)
= {container = "Togo"}, -- 2,625,000 (unindicated)
= {alias_of = "Lomé", display = true},
= {container = "Tunisia"}, -- 2,725,000 (Municipality (urban population))
= {container = "Tunisia"}, -- 1,180,000 (Municipality (urban population))
= {alias_of = "Sousse", display = true},
= {container = "Uganda"}, -- 4,300,000 (unindicated)
= {container = "Zambia"}, -- 3,000,000 (Consolidated Urban Area)
= {container = "Zimbabwe"}, -- 2,675,000 (Agglomeration)
------------------ Asia -------------------
-- sorted by country and then within the country, by decreasing population; figures from citypopulation.de
-- (retrieved 2025-04-26; reference date 2025-01-01) unless otherwise indicated.
= {container = "Afghanistan"}, -- 5,250,000 (Agglomeration)
= {container = "Azerbaijan"}, -- 3,725,000 (Administrative Area (urban population))
= {container = "Bahrain"}, -- 1,560,000 (unindicated)
= {container = {key = "Dhaka Division, Bangladesh", placetype = "division"}}, -- 23,100,000 (Agglomeration)
= {alias_of = "Dhaka", display = true},
= {container = {key = "Chittagong Division, Bangladesh", placetype = "division"}}, -- 5,050,000 (Agglomeration)
= {container = {key = "Dhaka Division, Bangladesh", placetype = "division"}}, -- 2,674,697 (City per 2022; counted in citypopulation.de as part of Dhaka metro area)
= {container = {key = "Khulna Division, Bangladesh", placetype = "division"}}, -- 1,210,000 (Agglomeration)
= {container = "Cambodia"}, -- 2,925,000 (Agglomeration)
= {container = {key = "Tehran Province, Iran", placetype = "province"}}, -- 16,800,000 (Agglomeration)
= {alias_of = "Tehran", display = true},
= {container = {key = "Razavi Khorasan Province, Iran", placetype = "province"}}, -- 3,475,000 (Agglomeration)
= {alias_of = "Mashhad", display = true},
= {alias_of = "Mashhad", display = true},
= {alias_of = "Mashhad", display = true},
= {container = {key = "Isfahan Province, Iran", placetype = "province"}}, -- 3,425,000 (Agglomeration)
= {alias_of = "Isfahan", display = true},
= {container = {key = "East Azerbaijan Province, Iran", placetype = "province"}}, -- 1,970,000 (Agglomeration)
= {container = {key = "Fars Province, Iran", placetype = "province"}}, -- 1,950,000 (Agglomeration)
= {container = {key = "Khuzestan Province, Iran", placetype = "province"}}, -- 1,550,000 (Agglomeration)
= {container = {key = "Qom Province, Iran", placetype = "province"}}, -- 1,450,000 (City)
= {container = {key = "Kermanshah Province, Iran", placetype = "province"}}, -- 1,130,000 (City)
= {container = "Iraq"}, -- 7,800,000 (Administrative Area (urban population))
= {container = "Iraq"}, -- 1,710,000 (Administrative Area (urban population))
= {container = "Iraq"}, -- 1,550,000 (Administrative Area (urban population))
= {container = "Iraq"}, -- 1,220,000 (Administrative Area (urban population))
= {container = "Iraq"}, -- 1,160,000 (Administrative Area (urban population))
= {container = "Iraq"}, -- 1,050,000 (Administrative Area (urban population))
= {container = "Israel"}, -- 3,000,000 (Agglomeration)
-- Jerusalem is not recognized internationally as part of either Israel or Palestine, but as a
-- ], so put the container as "Asia" and list Israel and Palestine as additional parents for
-- categorization purposes.
= {container = {key = "Asia", placetype = "continent"},
addl_parents = {"Israel", "Palestine"}}, -- 1,080,000 (Agglomeration)
= {container = "Jordan"}, -- 6,150,000 (unindicated)
= {container = "Jordan"}, -- 1,070,000 (unindicated)
= {container = "Kazakhstan"}, -- 2,700,000 (Agglomeration)
= {alias_of = "Almaty"}, -- former name, sometimes still used; don't display-canonicalize
= {container = "Kazakhstan"}, -- 1,600,000 (Agglomeration)
= {container = "Kazakhstan"}, -- 1,370,000 (Agglomeration)
= {container = "Kuwait"}, -- 5,050,000 (Agglomeration)
= {container = "Kyrgyzstan"}, -- 1,540,000 (Agglomeration)
= {container = "Lebanon"}, -- 1,930,000 (unindicated; population of low reliability)
-- Kuala Lumpur is a federal capital city, not in any state
= {container = "Malaysia"}, -- 9,550,000 (Agglomeration)
-- there are various George Towns and Georgetowns
= {container = {key = "Penang, Malaysia", placetype = "state"}, wp = "%l, %c"}, -- 2,075,000 (Agglomeration)
= {alias_of = "George Town, Malaysia"},
= {container = "Mongolia"}, -- 1,610,000 (City)
= {alias_of = "Ulaanbaatar", display = true},
= {container = "Myanmar"}, -- 5,650,000 (Municipality (urban population))
= {alias_of = "Yangon", display = true},
= {container = "Myanmar"}, -- 1,600,000 (Municipality (urban population))
= {container = "Nepal"}, -- 3,175,000 (Agglomeration)
-- Pyongyang is a directly governed city, not in any province
= {container = "North Korea"}, -- 3,025,000 (Administrative Area (urban population))
= {container = "Oman"}, -- 1,620,000 (Agglomeration)
= {container = "Palestine", wp = "Gaza City"}, -- 2,275,000 (unindicated)
= {alias_of = "Gaza"},
= {container = "Qatar"}, -- 2,650,000 (Agglomeration)
= {container = "Sri Lanka"}, -- 4,975,000 (unindicated)
= {container = "Syria"}, -- 3,975,000 (unindicated; population of low reliability)
= {container = "Syria"}, -- 1,980,000 (unindicated; population of low reliability)
= {container = "Tajikistan"}, -- 1,270,000 (City)
= {container = "Thailand"}, -- 21,800,000 (Agglomeration)
-- Chiang Mai not in citypopulation.de, but 1,198,000 urban population in 2021 per Wikipedia
-- ]
= {container = {key = "Chiang Mai Province, Thailand", placetype = "province"}},
= {container = {key = "Chonburi Province, Thailand", placetype = "province"}}, -- 1,570,000 (Agglomeration; including Pattaya)
-- metro area population stats from https://www.statista.com/statistics/255483/biggest-cities-in-turkey/ as of 2021;
-- second source is citypopulation.de reference date 2025-01-01.
= {placetype = {"city", "province"}, divs = {"districts"}, container = "Turkey"}, -- 15.2 million; 16,000,000 (Agglomeration)
= {alias_of = "Istanbul", display = true},
= {container = {key = "Ankara Province, Turkey", placetype = "province"}}, -- 5.15 million; 5,200,000 (Agglomeration)
= {container = {key = "İzmir Province, Turkey", placetype = "province"}, wp = "İzmir"}, -- 2.95 million; 3,025,000 (Agglomeration)
= {alias_of = "Izmir", display = true},
= {container = {key = "Bursa Province, Turkey", placetype = "province"}}, -- 2.02 million; 2,200,000 (Agglomeration)
= {container = {key = "Adana Province, Turkey", placetype = "province"}}, -- 1.77 million; 1,780,000 (Agglomeration)
= {container = {key = "Gaziantep Province, Turkey", placetype = "province"}}, -- 1.71 million; 1,750,000 (Agglomeration)
= {container = {key = "Antalya Province, Turkey", placetype = "province"}}, -- 1.3 million; 1,400,000 (Agglomeration)
= {container = {key = "Konya Province, Turkey", placetype = "province"}}, -- 1.35 million; 1,390,000 (Agglomeration)
= {container = {key = "Diyarbakır Province, Turkey", placetype = "province"}}, -- 1.07 million; 1,100,000 (Agglomeration)
-- Diyarbakır is more common per Ngrams and Google Scholar, but Diyarbakir is the Kurdish form, so we should not
-- display-canonicalize to the Turkish form Diyarbakır.
= {alias_of = "Diyarbakır"},
= {container = {key = "Mersin Province, Turkey", placetype = "province"}}, -- 1.03 million; 1,060,000 (Agglomeration)
= {container = "Turkmenistan"}, -- 1,150,000 (Agglomeration)
= {container = "United Arab Emirates"}, -- 6,050,000 (Agglomeration; including Sharjah)
= {container = "United Arab Emirates"}, -- 1,850,000 (City)
= {container = "United Arab Emirates"}, -- 1,800,000 (Metro area 2022-2023 per Wikipedia; separate from Dubai)
= {container = "Uzbekistan"}, -- 3,850,000 (unindicated)
= {container = "Yemen"}, -- 3,275,000 (City; population of low reliability)
= {alias_of = "Sanaa", display = true},
= {container = "Yemen"}, -- 1,079,060 (?; 2023 estimate from World Population Review per Wikipedia)
------------------ Europe or Europe-like (Caucasus etc.) ---------------------
= {container = "Armenia"}, -- 1,520,000 (Agglomeration)
= {container = "Austria"}, -- 2,375,000 (Agglomeration)
= {container = "Belarus"}, -- 2,100,000 (unindicated)
= {container = "Belgium"}, -- 2,800,000 (Consolidated Urban Area)
= {container = "Belgium"}, -- 1,270,000 (Consolidated Urban Area)
= {container = "Bulgaria"}, -- 1,260,000 (Agglomeration)
= {container = "Croatia"},
= {container = "Czech Republic"}, -- 1,470,000 (Agglomeration)
= {container = "Czech Republic"}, -- 729,405 (metro area per Wikipedia as of 2024-01-01 Czech Statistical Office)
= {container = "Czech Republic"}, -- 102,293 (city; included only because someone went crazy creating Olomouc-related terms)
= {container = "Denmark"}, -- 1,800,000 (Consolidated Urban Area)
= {container = {key = "Uusimaa, Finland", placetype = "region"}}, -- 1,560,000 (Consolidated Urban Area)
= {container = "Georgia"}, -- 1,430,000 (Agglomeration)
= {container = "Greece"},
= {container = "Greece"},
= {container = "Hungary"},
-- FIXME, per Wikipedia "County Dublin" is now the "Dublin Region"
= {container = {key = "County Dublin, Ireland", placetype = "county"}},
= {container = "Latvia"},
= {container = {key = "North Holland, Netherlands", placetype = "province"}},
= {container = {key = "South Holland, Netherlands", placetype = "province"}},
= {container = {key = "South Holland, Netherlands", placetype = "province"}},
-- Christchurch (metro 546,600) and Wellington (metro 439,800) are too small to make it.
= {container = {key = "Auckland, New Zealand", placetype = "region"}},
= {container = {key = "Oslo, Norway", placetype = "county"}},
= {container = {key = "Masovian Voivodeship, Poland", placetype = "voivodeship"}},
= {container = {key = "Silesian Voivodeship, Poland", placetype = "voivodeship"}},
--- Ngrams (up through 2022) and Google Scholar (>= 2024) confirm the common form "Krakow" without accent.
= {container = {key = "Lesser Poland Voivodeship, Poland", placetype = "voivodeship"}, wp = "Kraków"},
= {alias_of = "Krakow", display = true},
= {alias_of = "Krakow", display = true},
--- Ngrams (up through 2022) and Google Scholar (>= 2024) confirm "Gdańsk" and "Poznań" with accent.
= {container = {key = "Pomeranian Voivodeship, Poland", placetype = "voivodeship"}},
= {alias_of = "Gdańsk", display = true},
= {container = {key = "Greater Poland Voivodeship, Poland", placetype = "voivodeship"}},
= {alias_of = "Poznań", display = true},
--- Ngrams (up through 2022) and Google Scholar (>= 2024) confirm the common form "Lodz" without accents.
= {container = {key = "Lodz Voivodeship, Poland", placetype = "voivodeship"}, wp = "Łódź"},
= {alias_of = "Lodz", display = true},
= {container = {key = "Lisbon District, Portugal", placetype = "district"}},
= {container = {key = "Porto District, Portugal", placetype = "district"}},
= {alias_of = "Porto", display = true},
= {container = "Romania"},
= {container = "Serbia"},
= {container = "Sweden"},
= {container = "Switzerland"},
--- Ngrams (up through 2022) and Google Scholar (>= 2024) confirm the common form "Zurich" without umlaut.
--- Even Wikipedia uses the form without umlaut.
= {alias_of = "Zurich", display = true},
= {container = "Ukraine"}, -- not in Kyiv Oblast
-- Don't display-canonicalize Kiev -> Kyiv because in ancient contexts, Kiev is still more common.
= {alias_of = "Kyiv"},
= {container = {key = "Kharkiv Oblast, Ukraine", placetype = "oblast"}},
= {container = {key = "Odesa Oblast, Ukraine", placetype = "oblast"}, wp = "Odesa"},
-- Don't display-canonicalize Odesa -> Odessa because it may be interpreted as a political statement.
= {alias_of = "Odessa"},
------------------ North America, South America ---------------------
-- Primary figures from citypopulation.de retrieved on 2025-04-26 (reference date 2025-01-01);
-- Wikipedia metropolitan figures from ] based on per-country data;
-- Wikipedia city limits figures from ].
= {container = "Argentina"}, -- 16,800,000 (Consolidated Urban Area; 13,985,794 metropolitan area per Wikipedia)
= {container = "Argentina", wp = "%l, %c"}, -- 1,810,000 (Consolidated Urban Area; 1,505,25 city limits per Wikipedia)
-- to avoid confusion with Córdoba in Spain
= {alias_of = "Córdoba, Argentina"},
= {alias_of = "Córdoba, Argentina", display = "Córdoba"},
= {container = "Argentina", wp = "%l, Santa Fe"}, -- 1,510,000 (Consolidated Urban Area; 1,348,725 metropolitan area per Wikipedia)
= {container = "Argentina", wp = "%l, %c"}, -- 1,180,000 (Consolidated Urban Area)
= {container = "Argentina"}, -- 1,110,000 (Consolidated Urban Area)
= {alias_of = "San Miguel de Tucumán"},
= {alias_of = "San Miguel de Tucumán", display = "Tucumán"},
= {container = "Bolivia"}, -- 1,960,000 (Consolidated Urban Area); 1,606,671 (city limits per Wikipedia)
= {alias_of = "Santa Cruz de la Sierra"},
= {container = "Bolivia"}, -- 1,870,000 (Consolidated Urban Area; composed of El Alto, now slightly larger, and La Paz)
= {container = "Bolivia"},
= {container = "Bolivia"}, -- 1,280,000 (Consolidated Urban Area)
= {container = "Chile"}, -- 8,400,000 (Consolidated Urban Area; 6,903,479 city limits? per Wikipedia)
= {container = "Chile"}, -- 1,060,000 (Consolidated Urban Area)
= {alias_of = "Valparaíso"}, -- 1,060,000 (Consolidated Urban Area)
= {container = "Colombia"}, -- 10,600,000 (Agglomeration; 12,772,828 metropolitan area per Wikipedia)
= {alias_of = "Bogotá", display = true},
= {container = "Colombia"}, -- 4,350,000 (Agglomeration; 4,068,000 metropolitan area per Wikipedia)
= {alias_of = "Medellín", display = true},
= {container = "Colombia"}, -- 2,975,000 (Agglomeration; 2,837,000 metropolitan area per Wikipedia)
= {container = "Colombia"}, -- 2,375,000 (Agglomeration; 1,341,160 city limits per Wikipedia)
= {container = "Colombia"}, -- 1,380,000 (Agglomeration)
= {container = "Colombia", wp = "%l, %c"}, -- 1,250,000 (Agglomeration)
-- to avoid confusion with Cartagena, Spain
= {alias_of = "Cartagena, Colombia"},
= {container = "Colombia"}, -- 1,130,000 (Agglomeration)
= {alias_of = "Cúcuta", display = true},
-- to avoid conflict with San Jose, California
= {container = "Costa Rica", wp = "%l, %c"}, -- 2,450,000 (Municipality (urban population); 3,160,000 metropolitan area per Wikipedia)
= {alias_of = "San José, Costa Rica"},
= {alias_of = "San José, Costa Rica"}, -- display = "San José"; causes error due to San Jose alias for California city; FIXME
= {container = "Cuba"}, -- 2,150,000 (City; 2,137,847 city limits? per Wikipedia)
= {container = "Dominican Republic"}, -- 3,900,000 (Municipality (urban population); 4,274,651 ??? per Wikipedia)
= {container = "Ecuador"}, -- 3,350,000 (Agglomeration; 3,092,000 metro area? per Wikipedia)
= {container = "Ecuador"}, -- 2,875,000 (Agglomeration; 2,889,703 metro area? per Wikipedia)
= {container = "El Salvador"}, -- 1,580,000 (Municipality (urban population))
= {container = "Guatemala"}, -- 3,375,000 (Municipality (urban population); 3,160,000 metro area? per Wikipedia)
= {container = "Haiti"}, -- 3,050,000 (Agglomeration; population of low reliability; 2,915,000 metro area? per Wikipedia)
= {container = "Honduras"}, -- 1,330,000 (Consolidated Urban Area)
= {container = "Honduras"}, -- 1,220,000 (Urban Area)
= {container = "Nicaragua"}, -- 1,400,000 (Consolidated Urban Area)
= {container = "Panama"}, -- 1,430,000 (Urban Area)
= {container = "Paraguay"}, -- 2,350,000 (Municipality (urban population))
= {container = "Peru"}, -- 12,000,000 (Agglomeration; 11,283,787 ??? per Wikipedia)
= {container = "Peru"}, -- 1,210,000 (Agglomeration)
= {container = {key = "Puerto Rico", placetype = "commonwealth"}, wp = "%l, %c"}, -- 1,910,000 (Consolidated Urban Area)
= {container = "Uruguay"}, -- 1,810,000 (Agglomeration; 1,302,954 ??? per Wikipedia)
= {container = "Venezuela"}, -- 3,850,000 (Consolidated Urban Area; 5,243,301 ??? per Wikipedia)
= {container = "Venezuela"}, -- 2,825,000 (Consolidated Urban Area; 5,278,448 ??? per Wikipedia)
-- to avoid confusion with Valencia (city and autonomous community of Spain)
= {container = "Venezuela", wp = "%l, %c"}, -- 2,100,000 (Consolidated Urban Area)
= {alias_of = "Valencia, Venezuela"},
= {container = "Venezuela"}, -- 1,480,000 (Consolidated Urban Area)
= {container = "Venezuela"}, -- 1,360,000 (Consolidated Urban Area)
}
export.misc_cities_group = {
canonicalize_key_container = make_canonicalize_key_container(nil, "country"),
default_placetype = "city",
data = export.misc_cities,
}
--[==[ var:
List of all known locations, in groups. The first group lists continents and continental regions, followed by three
groups listing top-level locations: countries, "country-like entities" (de-facto/unrecognized/etc. countries and
dependent territories) and former polities (countries, empires, etc.). After that come first-level subpolities
(administrative divisions) of several, mostly large, countries, followed by groups of cities. China and the United
Kingdom include second-level subpolities (in the case of China, only the largest ones, as the full list runs into the
hundreds).
]==]
export.locations = {
export.continents_group,
export.countries_group,
export.country_like_entities_group,
export.former_countries_group,
export.australia_group,
export.austria_group,
export.bangladesh_group,
export.brazil_group,
export.canada_group,
export.china_group,
export.china_prefecture_level_cities_group,
export.china_prefecture_level_cities_group_2,
export.finland_group,
export.france_group,
export.france_departments_group,
export.germany_group,
export.greece_group,
export.india_group,
export.indonesia_group,
export.iran_group,
export.ireland_group,
export.italy_group,
export.japan_group,
export.laos_group,
export.lebanon_group,
export.malaysia_group,
export.malta_group,
export.mexico_group,
export.moldova_group,
export.morocco_group,
export.netherlands_group,
export.new_zealand_group,
export.nigeria_group,
export.north_korea_group,
export.norway_group,
export.pakistan_group,
export.philippines_group,
export.poland_group,
export.portugal_group,
export.romania_group,
export.russia_group,
export.saudi_arabia_group,
export.south_africa_group,
export.south_korea_group,
export.spain_group,
export.taiwan_group,
export.thailand_group,
export.turkey_group,
export.ukraine_group,
export.united_kingdom_group,
export.united_states_group,
export.england_group,
export.northern_ireland_group,
export.scotland_group,
export.wales_group,
export.vietnam_group,
export.australia_cities_group,
export.brazil_cities_group,
export.canada_cities_group,
export.france_cities_group,
export.germany_cities_group,
export.india_cities_group,
export.indonesia_cities_group,
export.italy_cities_group,
export.japan_cities_group,
export.mexico_cities_group,
export.nigeria_cities_group,
export.pakistan_cities_group,
export.philippines_cities_group,
export.russia_cities_group,
export.saudi_arabia_cities_group,
export.south_korea_cities_group,
export.spain_cities_group,
export.taiwan_cities_group,
export.united_kingdom_cities_group,
export.united_states_cities_group,
export.new_york_boroughs_group,
export.vietnam_cities_group,
export.misc_cities_group,
}
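--[==[
Illustrative sketch only; not part of this module's API and deliberately left commented out. It shows one way a caller
might walk `export.locations` to find the group, canonical key and raw (uninitialized) data-table entry for a given
key, following a single `alias_of` hop. The name `find_location` is hypothetical. The sketch assumes that `alias_of`
names a canonical key in the same data table, and it simply returns the first group that matches, even though the same
string may be an alias key in one group and a canonical key in another. The real lookup code also canonicalizes
containers and initializes specs via initialize_spec(), which is ignored here.

local function find_location(key)
	for _, group in ipairs(export.locations) do
		local entry = group.data[key]
		if entry then
			if entry.alias_of then
				-- Alias key: hop to the canonical key it points at (assumed to be in the same group).
				key = entry.alias_of
				entry = group.data[key]
			end
			return group, key, entry
		end
	end
	return nil
end
]==]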
return export