Latest comment: 15 years ago, 1 comment, 1 person in discussion
For cases where we consider the macrolanguage to be the individual language and the subdivisions to be dialects, I think we should move the subdivision language code templates to the etyl: namespace. Similar to language families & other dialects, this is where we house codes that should only be used in Etymologies and not as valid L2 languages. Sound fine? --Bequw → ¢ • τ 21:59, 20 January 2010 (UTC)
Treatment by SIL
Latest comment: 15 years ago, 1 comment, 1 person in discussion
I thought it may be interesting to post what SIL's (the Registration Authority for ISO 639-3) criteria are for determining if language varieties are dialects or distinct languages. It can be found on their Change Request Form (page 3).
For this part of ISO 639, judgments regarding when two varieties are considered to be the same or different languages are based on a number of factors, including linguistic similarity, intelligibility, a common literature (traditional or written), a common writing system, the views of users concerning the relationship between language and identity, and other factors. The following basic criteria are followed:
Two related varieties are normally considered varieties of the same language if users of each variety have inherent understanding of the other variety (that is, can understand based on knowledge of their own variety without needing to learn the other variety) at a functional level.
Where intelligibility between varieties is marginal, the existence of a common literature or of a common ethnolinguistic identity with a central variety that both understand can be strong indicators that they should nevertheless be considered varieties of the same language.
Where there is enough intelligibility between varieties to enable communication, the existence of well-established distinct ethnolinguistic identities can be a strong indicator that they should nevertheless be considered to be different languages.
Latest comment: 15 years ago, 1 comment, 1 person in discussion
Sometimes (as with Latvian and Estonian) we treat the subdivisions of a macrolanguage as individual languages, but we use the macrolanguage name/code in place of the "standard" dialect name/code. I just added this option to the table. Are there other macrolanguages where this is the case (possibly Arabic and Malay)? --Bequw → ¢ • τ 17:09, 23 January 2010 (UTC)
Aramaic
Latest comment: 14 years ago, 2 comments, 2 people in discussion
Apparently some have been treating "Jewish Babylonian Aramaic" (aka "Talmudic Aramaic", code=tmr) as a variety of Aramaic. Does anyone know if this is standard, or if this is true of other ISO 639-3 coded Aramaic varieties? --Bequw → τ 18:34, 8 February 2010 (UTC)
Use the templates, please, because they standardize the possible texts, and standardization is good. Another way to contribute to the page is typing here what you need, so I may update the table. --Daniel. 14:15, 22 April 2010 (UTC)
I've changed the table to a regular wikitable so that anyone can edit it and so that it can handle more complex situations and the presence of deleted codes. Cheers, - -sche (discuss) 21:05, 23 May 2013 (UTC)
Aramaic redux
Latest comment: 11 years ago, 1 comment, 1 person in discussion
Because at least one RFM is ongoing(?), I'll list this here rather than on the main page: oar (Old Aramaic, up to 700 BCE) is not used, as it has been superseded by arc and syc. tmr (Jewish Babylonian Aramaic, circa 200-1200 CE) is not used, as it has been superseded by arc and etyl:tmr. - -sche (discuss) 00:11, 16 July 2013 (UTC)
Montagnais/Innu
Latest comment: 11 years ago, 1 comment, 1 person in discussion
Currently, some main-namespace pages use Montagnais/Innu's language code (probably mostly in translations tables) while a few use other Cree dialects' language codes. Innu is different enough from Cree that Innu is regularly considered side-by-side with (rather than subordinated under) Cree; e.g. the Linguistic Atlas of Canada speaks of "different Cree and Innu dialects". OTOH, they're not that different, and splitting them at the L2 level would raise questions of what to do with e.g. Naskapi. I'm curious whether we should (a) allow Innu its own L2, (b) merge it completely into Cree, or (c) leave it subordinated under / merge into cr at the L2 level, but let it keep its code (it currently still has one, as no-one ever deleted it) so that it can be used in translations tables (like the Romani lects' codes). The translations could be nested under Cree/cr, or could be separate, sorted under M or I depending on which name we end up using for the lect. - -sche(discuss)22:26, 20 July 2013 (UTC)Reply
East Frisian: frs, stq
Latest comment: 13 years ago, 10 comments, 3 people in discussion
This is an old, old mistake in ISO. Both codes refer to the very same language, namely the Frisian dialect spoken in Saterland, which is an Eastern Frisian dialect. I have no idea how that was overlooked, but it means the two codes should be merged somehow. I'd prefer {{frs}}, since that one is in 639-2. -- Liliana•14:24, 17 October 2011 (UTC)Reply
But what about etymologies involving Eastern Frisian, at the time it still existed? With no code, how should they be entered? Or even, how should the etymologies that already exist be fixed? —CodeCat11:23, 24 October 2011 (UTC)Reply
If it warrants a distinction, it should get one of these constructed codes. It isn't covered by the code frs anyway, which ISO classifies as a "living" language, not an extinct one. -- Liliana•12:13, 24 October 2011 (UTC)Reply
This doesn't explain why ISO assigned two codes to one language. We do not have that for any other language of the world. Using frs for a different language than what ISO intended would make a precedent case, and almost certainly require a vote. Another problem is that the current name "East Frisian" is really confusing, since there's an (unrelated) Low German dialect which is also called East Frisian. So in any case, you would have to sort out the erroneous uses. -- Liliana•16:15, 24 October 2011 (UTC)Reply
I agree with Liliana, we need a separate code of our own for non-Saterland varieties of East Frisian (or we need to clearly indicate that we are using "frs" to refer to a language other than the one the ISO refers to as "frs"). If a word is derived from a variety of East Frisian other than the one the ISO calls "stq", it cannot be derived from what the ISO calls "frs", because "frs" is living, and the only living East Frisian lect is "stq". - -sche(discuss)00:34, 26 October 2011 (UTC)Reply
Proposed additions / clarifications
Latest comment: 11 years ago, 4 comments, 2 people in discussion
These are all from translation tables, which I will edit to reflect consensus for any of these cases:
Macro languages:
Chinese: dng, ltc, och
Sorbian: dsb, hsb
Apache: apw, apm, apj, apl, apk
Sami: smn, smj, sms, sma, se
Frisian: fy, ofs, frr
Berber: shi
Marquesan: mrq, mrm
Dialects / script group:
sq: als does not exist any more, change to just Tosk
cop: Bohairic, Sahidic, Fayyumic
lt: Aukštaitian
ms: Rumi, Jawa
sc: Nugorese
tly: Asalemi, Anbarani, Masali
sh: Cyrillic, Roman, Arebica, Latin
arc: Hebrew, Syriac
ks: Arabic, Devanagari
cu: Cyrillic, Glagolitic
ro: mo no longer exists; Latin, Cyrillic
os: Digor, Iron
kea: Badiu, São Vicente, ALUPEC, Sotavento, Barlavento, Santo Antão
az: Cyrillic, Roman, Perso-Arabic, Arabic, Persic
avd: Vidari
egy: Archaic Egyptian, Old Egyptian, Middle Egyptian, Late Egyptian
tt: Cyrillic, Roman
lad: Roman, Hebrew, Latin
pa: Gurmukhi, Shahmukhi (has its own code?)
nso: Sepedi
vot: Roman, Cyrillic
rom: table says that rmc, rmf, rml, rmn, rmo, rmw, rmy are deprecated but they still exist in the languages module
Nice. Some additional things that I noticed after a quick read: okm should be under ko, pgl should be under ga, zlw-opl should be under pl, there are tons of missing Arabic sublects that should be under ar, and grc (and possibly some other lects) should be under el. —Μετάknowledgediscuss/deeds02:40, 20 August 2013 (UTC)Reply
Latest comment: 10 years ago, 2 comments, 2 people in discussion
A lot of the language codes in the table don't have a name next to them, but if we added the name it would become very hard to see. Would it be useful to turn it into title text, so that the name is shown when you hover the mouse over the code? —CodeCat 19:36, 25 August 2013 (UTC)
Hmm. One downside to that is that it would no longer be possible (would it?) to hit Ctrl+F and search the page for a particular dialect's name. Given that one of the reasons this page exists is so that people can see if the reason we don't have a code is because we've merged it into something else (vs we just haven't added it yet), that's a significant downside. - -sche(discuss)05:15, 26 February 2014 (UTC)Reply
This is causing some script errors because some of the codes have since been deleted. I'm not sure what to do about that. —CodeCat 13:34, 23 May 2013 (UTC)
It needs to be redesigned so that the table can contain/mention codes that have been deleted, for the reason you mention and several other reasons. - -sche (discuss) 19:03, 23 May 2013 (UTC)
Retired codes which were not used on Wiktionary in February 2014
Codes which were retired from the ISO and which were not used on Wiktionary as of February of 2014. (Since then, several other codes which were retired from the ISO by that date have also been retired on Wiktionary; see the following sections.)
List
nln — Durango Nahuatl - split into Eastern Durango Nahuatl and Western Durango Nahuatl
noo — Nootka - split into Ditidaht and Nootka / Nuu-chah-nulth
unp — Worora - split into Worrorra and Unggumi
wiw — Wirangu - split into Wirangu and Nauo
aay — Aariya - retired as nonexistent
acc — Cubulco Achí - merged with
aex — Amerax - retired as nonexistent
agp — Paranan - split into Pahanan Agta and Paranan (the lects are extremely similar, but only due to convergence; their different grammar reveals their different origins)
ahe — Ahe - merge into Kendayan / Salako
aiz — Aari - split into Aari (new identifier) and Gayil
akn — Amikoana - retired as nonexistent
amd — Amapá Creole - retired as nonexistent
arf — Arafundi - split into three languages: Andai ; Nanubae ; Tapei
atf — Atuence - retired as nonexistent
auv — Auvergnat - merge into Occitan (post 1500)
ayx — Ayi (China) - merge into Anong as duplicate / nonexistent as separate lect
azr — Adzera - split into three languages: Adzera (new identifier), Sukurum and Sarasira
baz — Tunen - split into Tunen and Nyokon
bcx — Pamona - split into Pamona (new identifier) and Batui
bgh — Bogan - merge into Bugan as duplicate
bhk — Albay Bicolano - split into Buhi'non Bikol ; Libon Bikol ; Miraya Bikol ; West Albay Bikol
bii — Bisu - split into Bisu (new identifier) and Laomian
bjq — Southern Betsimisaraka Malagasy (later granted another code which we ourselves retired)
bkb — Finallig
bke — Bengkulu
blu — Hmong Njua
bnh — Banawá
boc — Bakung Kenyah
bqe — Navarro-Labourdin Basque
bsd — Sarawak Bisaya
bsz — Souletin Basque
btb — Beti (Cameroon)
bvs — Belgian Sign Language
bwv — Bahau River Kenyah
bxt — Buxinhua
byu — Buyang
cbm — Yepocapa Southwestern Cakchiquel
ccx — Northern Zhuang
ccy — Southern Zhuang
chs — Chumash - Extinct
cit — Chittagonian
cjr — Chorotega - Extinct
cka — Khumi Awa Chin
ckc — Northern Cakchiquel
ckd — South Central Cakchiquel
cke — Eastern Cakchiquel
ckf — Southern Cakchiquel
cki — Santa María De Jesús Cakchiquel
ckj — Santo Domingo Xenacoj Cakchiquel
ckk — Acatenango Southwestern Cakchiquel
ckw — Western Cakchiquel
cmk — Chimakum - Extinct
cnm — Ixtatán Chuj
cru — Carútana - Extinct
cti — Tila Chol
cun — Cunén Quiché
daf — Dan
dap — Nisi (India)
dat — Darang Deng
dha — Dhanwar (India)
dkl — Kolum So Dogon
drh — Darkhat
drw — Darwazi
dyk — Land Dayak - actually a family
elp — Elpaputih - "Lack of information may have cause Elpaputih to be considered different from (Amahai) and (Paulohi) "
eml — Emiliano-Romagnolo - split into Emilian and Romagnol
eni — Enim - merge into Central Malay
eur — Europanto - constructed - hoax / joke
fiz — Izere
flm — Falam Chin
fri — Western Frisian
gav — Gabutamon - merge into Domung
gen — Geman Deng - merge into Miju-Mishmi as duplicate
ggh — Garreh-Ajuran - split between Borana and Somali (!)
ggm — Gugu Mini - extinct - nonexistent
gmo — Gamo-Gofa-Dawro - split into three languages: Gamo, Gofa, and Dawro
gsc — Gascon - merge into Occitan (post 1500)
hsf — Southeastern Huastec
hva — San Luís Potosí Huastec
itu — Itutang
ixi — Nebaj Ixil
ixj — Chajul Ixil
jai — Western Jacalteco - merge with Popti'
jap — Jaruára - merge into Jamamadí
jar — Jarawa (Nigeria) - split into Gwak and Bankal
kds — Lahu Shi
knh — Kayan River Kenyah
kob — Kohoroxitari
krg — North Korowai
krq — Krui
kxg — Katingan
leg — Lengua
lmm — Lamam
lms — Limousin
lmt — Lematang
lnc — Languedocien
lnt — Lintang
mbg — Northern Nambikuára
mdo — Southwest Gbaya
mhh — Maskoy Pidgin
mhv — Arakanese
miv — Mimi
mja — Mahei
mld — Malakhel
mly — Malay (- language) - actually a family
mms — Southern Mam
mob — Moinba
mof — Mohegan-Montauk-Narragansett - extinct
mol — Moldavian
mpf — Tajumulco Mam
mqd — Madang
mst — Cataelano Mandaya
mtz — Tacanec
muw — Mundari
mvc — Central Mam
mvj — Todos Santos Cuchumatán Mam
myq — Forest Maninka
myt — Sangab Mandaya
mzf — Aiku
nbf — Naxi
nfg — Nyeng
nfk — Shakara
nhj — Tlalitzlipa Nahuatl
nhs — Southeastern Puebla Nahuatl
nky — Khiamniungan Naga
nlr — Ngarla
nxj — Nyadu
occ — Occidental - constructed - merge into Interlingue as Duplicate
ogn — Ogan - merge into Central Malay
ope — Old Persian - ancient - merge into Old Persian (ca. 600-400 B.C.) as duplicate
ork — Orokaiva - split into Orokaiva (new identifier), Aeka and Hunjara-Kaina Ke
paj — Ipeka-Tapuia
pbz — Palu
pec — Southern Pesisir
pen — Penesak
pgy — Pongyong
plm — Palembang
poa — Eastern Pokomam
pob — Western Pokomchí
poj — Lower Pokomo
pou — Southern Pokomam
ppv — Papavô
prv — Provençal
pun — Pubian
puz — Purum Naga
quj — Joyabaj Quiché
qut — West Central Quiché
quu — Eastern Quiché
qxi — San Andrés Quiché
rae — Ranau
rjb — Rajbanshi
rmr — Caló
rws — Rawas
sap — Sanapaná
sdd — Semendo
sdi — Sindang Kelingi
sgl — Sanglechi-Ishkashimi
sic — Malinguat
skl — Selako
slb — Kahumamahon Saluan
srj — Serawai
stc — Santa Cruz
suf — Tarpia
suh — Suba
sul — Surigaonon
sum — Sumo-Mayangna
suu — Sungkai
szk — Sizaki
tkk — Takpa
tle — Southern Marakwet
tlz — Toala'
tmx — Tomyang
tnf — Tangshewi
tnj — Tanjong
tot — Patla-Chicontla Totonac
ttx — Tutong 1
tzb — Bachajón Tzeltal
tzc — Chamula Tzotzil
tze — Chenalhó Tzotzil
tzs — San Andrés Larrainzar Tzotzil
tzt — Western Tzutujil
tzu — Huixtán Tzotzil
tzz — Zinacantán Tzotzil
ubm — Upper Baram Kenyah
vky — Kayu Agung
vlr — Vatrata
vmo — Muko-Muko
wgw — Wagawaga
wit — Wintu
wre — Ware
xah — Kahayan
xkm — Mahakam Kenyah
xmi — Miarrã
xsk — Sakan - ancient
xst — Silt'e
xuf — Kunfal
yib — Yinglish - merged into English
yio — Dayao Yi
ymj — Muji Yi
ypl — Pula Yi
ypw — Puwa Yi
yus — Chan Santa Cruz Maya - merge with Yucateco
yuu — Yugh - Yugh is a duplicate of Yug
ywm — Wumeng Yi - merge into Wusa Yi, renamed Wumeng Nasu (cf. 2007-038)
yym — Yuanjiang-Mojiang Yi - split into Southern Nisu and Southwestern Nisu
ztc — Lachirioag Zapotec - merge into Yatee Zapotec
Retired codes which have been discussed since February 2014
Some codes which were retired from the ISO but which are still used on Wiktionary. (This list is not necessarily comprehensive.) Some codes in the list have been discussed, and these have been intentionally retained: sh "Serbo-Croatian", gio "Gelao", kzh "Dongolawi" / "Kenuzi-Dongola", mnt "Maykulan". Meanwhile, these have not yet been discussed.
List of ISO 639 codes absent from Wiktionary
Latest comment: 8 years ago, 3 comments, 1 person in discussion
Most of the 7865 codes present in ISO 639 are present on Wiktionary; most of those which are not are recorded on WT:LT. The only ones which have slipped between those two cracks are these, which should be investigated and discussed in the coming weeks. In many cases, the exclusion is likely nothing more than an oversight; in some cases, it's clearly because a naming conflict prevented importation of the codes back when Wiktionary bot-imported ISO 639 en masse (something we can now solve with disambiguators):
cek — Eastern Khumi Chin - a dialect of cnk (Khumi Chin)
dda — Dadi Dadi
dgw — Daungwurrung
dja — Djadjawurrung
deq — Dendi (Central African Republic) - presumably failed to be included because of the naming conflict with ddn — Dendi (Benin)
dmd — Madhi Madhi (Muthimuthi)
dth — Adithinngithigh - compare rrt, which is said to be a different language
dty — Dotyali
gku — ǂUngkue
gll — Garlali
gpe — Ghanaian Pidgin English - probably to be combined with other African Pidgin English (see RFM)
gwm — Awngthim
gmz - Mgbo
hna — Mina (Cameroon) - presumably failed to be included because of the naming conflict with myi Mina (India), which, however, is spurious
ihw — Bidhawal - a dialect of/with unn
jan — Jandai
jbi — Badjiri - possibly not even Karnic; cf my notes about ekc above and on User:-sche/retired codes
jbk (Barikewa) and jmw (Mouwase) — varieties of {{mgx}} Omati/Mini, said to be quite divergent from each other: but we should either have mgx or have jbk+jmw, not all three
The ISO added a code for Bidhawal, which we never got around to adding. That seems to be OK; Robert M. W. Dixon says in Australian Languages: Their Nature and Development (2002) that "Bidhawal appears not to constitute a separate language, but rather to be the most eastern dialect of Q, Muk-thang (or Kurnai). The grammatical forms given by Mathews for Bidhawal are almost identical to those for Muk-thang, as are most of the verbs and a good proportion of nouns." - -sche (discuss) 03:02, 21 August 2016 (UTC)
Treatment of reconstructed languages?
Latest comment: 9 years ago, 2 comments, 2 people in discussion
We merged Proto-Finno-Ugric and Proto-Finno-Permic into Proto-Uralic, and Proto-Baltic into Proto-Balto-Slavic. The original languages remain as etymology codes. Should this be mentioned here? —CodeCat18:48, 21 August 2015 (UTC)Reply
Sure. Maybe in a separate table, though? Since those aren't cases where we deprecated, split, or broadened an ISO code, but rather cases where we assigned a code of our own devising and then went "wait, on second thought, nah". - -sche(discuss)19:10, 21 August 2015 (UTC)Reply
Akan and its subdivisions
Latest comment: 8 years ago, 2 comments, 2 people in discussion
Oh my~ got to extirpate the remnants of truth from Wiktionary! This is just about pretending Chinese is a language when we know it's a macrolanguage. Just treat Chinese like any other macrolanguage group. So sad. --Geographyinitiative (talk) 05:05, 9 June 2020 (UTC)Reply
If I may inquire, what other macrolanguage group is treated like Chinese is treated on Wiktionary? If you can give me a good answer on this, I could be much more convinced that the current system for covering Chinese languages on Wiktionary is not a disaster. --Geographyinitiative (talk) 05:24, 9 June 2020 (UTC)Reply
@Geographyinitiative: Zhuang is one among many. Please see the main page - any macrolanguage in the table that is marked with "Only the macrolanguage is treated as a language" would be the same situation (more or less). — justin(r)leung 05:35, 9 June 2020 (UTC)
If I may ask, what are the different Zhuang languages? Are there any other macrolanguage groups not associated with Chinese characters or influenced by Chinese politics that are not split up by language? I think every language should have its own header on Wiktionary, don't you? Atitarev, please don't hate me man! I am bringing a perspective that represents the opinions of many others and I am trying to make honest inquiries about really important things. There were no Wade Giles or Tongyong Pinyin derived geo terms before I came here, and I helped add an important perspective which was being neglected. I am a 'troll' because I bring an outsider perspective, but I am not a troll because I am actively working and negotiating to make the dictionary better with tangible results. Geographyinitiative (talk) 22:43, 20 June 2020 (UTC)Reply
You don't bring anything new. All valid forms are welcome and nobody blocked any language or any script or dialect or transliteration scheme. Yes, that includes Wade-Giles, Tongyong Pinyin, Min Nan in Chinese characters and Min Nan in POJ. Your conspiracy theories have no grounds at all. Bring your perspective but don't poison people's minds about the achievements of this site. You don't raise any awareness, everyone is aware of what's out there. You just don't want to see it. You're slinging dirt around, then apologise or start praising people, which I find hypocritical. You talk a lot about your own achievements but nobody does it here, this is called narcissism. If there is not enough coverage for anything, then there were not enough contributors. Languages are somewhat like currencies. If the value of the currency of a small third-world country is low, nobody is interested in it but people of that country have to use it. Even if you do add Wade-Giles, Tongyong Pinyin, you pose it as opposition to Mandarin and Hanyu Pinyin domination, which you blame this site for, not accepting the reality but it's still someone's fault, isn't it? And you keep blaming someone and no-one in particular for that. Everything is doable and achievable. You want to make the distinction between Min Nan and Mandarin, just do it within the existing infrastructure. Nobody stops you from defining specific senses, usage examples, etc. You want to add alternative English spellings, varieties of Chinese, go ahead, just do it in a positive way. Stop blaming everyone or the site. You just turn away people from your cause. All the work is welcome, if it's not breaking agreed conventions or rules. In short, add your Wade-Giles forms, POJ, Min Nan, whatever but start making sense, stop attacking pinyin, Mandarin, this site, etc.
The Zhuang situation is a good example of a macrolanguage but it's harder to demonstrate at Wiktionary as the Zhuang coverage is very low at Wiktionary. The unified approach for languages other than Chinese is better demonstrated by Serbo-Croatian, which combines two to four different standards, depending on how you count - Croatian, Serbian, Bosnian and Montenegrin, two scripts - Cyrillic and Roman (Latin), two major dialects - Ekavian and Ijekavian (+Kajkavian). I don't want to cause more trolling from you but Serbo-Croatian "unification" had much stronger opposition and hate. You can imagine the passions after the Yugoslav war where language identity was a reason to be shot at or imprisoned. Nevertheless, at Wiktionary, the scientific and technical reasoning prevailed over hate. Don't imagine for a second that Chinese varieties and Serbo-Croatian standards and dialects are comparable. No way. They are not. Chinese varieties are mostly not mutually comprehensible. However, the rationale for the unified approach was presented and it won. You won't achieve anything by whinging and trolling negative messages. Yes, I consider your mentioning that this site may be a complete disaster or similar at every opportunity to be trolling. --Anatoli T. (обсудить/вклад) 23:37, 20 June 2020 (UTC)
Let me mull it over a little more. However, I think that it would be wildly difficult to reach the conclusion "the Chinese macrolanguage header, including all languages within modern China's borders from Cangjie to to-day, should be portrayed as equivalent to the Danish, Norwegian, Swedish, English, German, French etc. headers, implying they are all equally "languages"." I would think that in my worst case scenario some kind of further disclaimer should be added automatically to every page that has the "Chinese" header so we know it means "any Chinese characters used in China since the Shang dynasty til today, including numerous unintelligible dialects with independent Wikipedia versions". What an expansive header it is! --Geographyinitiative (talk) 00:20, 22 June 2020 (UTC)
Wiktionary doesn't have to apologise on every page on how it works. The votes on Chinese and Serbo-Croatian unifications defined the dictionary policies, which is or should be mentioned on appropriate About pages. If it's too hard to accept, which is understandable, you have two options: 1. get a new vote and win it or 2. leave, which was the case with some unhappy Croatian and Serbian editors. We didn't have a precedent in my memory with unhappy Chinese editors wanting to reverse the change and leaving, you may be the first. If you decide to stay, my personal advice is, you have to stop complaining at every opportunity in talk pages or edit summaries, bite the bullet and contribute in your favourite area, including enhancing dialectal coverage, strive to make it work, so that all promises in the vote to adequately cover all Chinese varieties are kept. --Anatoli T.(обсудить/вклад)00:53, 22 June 2020 (UTC)Reply
Stale but unresolved discussions of languages to add or remove
Latest comment: 1 year ago, 26 comments, 3 people in discussion
Because they are so stale, I am unilaterally moving these off the RFM page because that page has grown too massive (800,000 bytes) to be usable; however, because they are unresolved, I don't want to hide them away on WT:LTD... so here they go... - -sche(discuss)06:52, 28 December 2023 (UTC)Reply
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
Even more languages without ISO codes, part 6
This next batch is of languages from lists other than Ethnologue and LinguistList. As before, I've tried to vet them all beforehand, but I will have doubtlessly made some mistakes. NB if you want to find more: I've avoided dealing with most of the Loloish languages, because all the literature seems to be in Chinese. —Μετάknowledgediscuss/deeds04:54, 6 July 2016 (UTC)Reply
Wikipedia (and Lyle Campbell, Anna Belew, Cataloguing the World's Endangered Languages, 2018) says this is dix "Dixon Reef". Is it not? (Or if it is, should the name associated with that code be changed?) - -sche(discuss)20:10, 1 August 2020 (UTC)Reply
Based on feedback there, not added at this time, although I note that content in the language seems to exist, which suggests we would eventually need to figure out a header to include it under. - -sche(discuss)20:44, 2 August 2020 (UTC)Reply
Perhaps better prm-kya? Also while I am not convinced treating the Komi varieties as separate languages altogether is the best solution, as long as we do so, we might moreover need Old Komi. --Tropylium (talk) 18:44, 11 July 2016 (UTC)Reply
I'm not sure... the very language is "reconstructed" by Bowern on the assumption that three wordlists (of which only two make it into the name) attest the same language, although apparently none of the three bothered to name the language. The chance of someone "would run across it and want to know what it means" seems nonexistent. If we wanted to host the wordlists, we could do that in an appendix or on Wikisource. - -sche(discuss)16:09, 9 August 2016 (UTC)Reply
Bowern's methods are scientific; but I would feel better if more than one scholar was saying there was one language in this set of wordlists, the way that for e.g. Port Sorrell, Dixon & Crowley and Glottolog agree that there is a unit/lect there. - -sche(discuss)16:55, 4 June 2017 (UTC)Reply
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
Some more missing American languages
Here are a few more North American languages for which we could add codes:
Akokisa (nai-ako). WP says it is attested certainly in two words in Spanish records (Yegsa "Spaniard", which Swanton suggests is similar to Atakapa yik "trade" + ica "people"; and the female name Quiselpoo), and possibly in more words in a wordlist by Jean Béranger in 1721 (if the wordlist is not some other language).
Labrador Inuit Pidgin French, less often called Belle-Isle Pidgin, was spoken in Labrador from the late 1600s (probably since before the 1660s, but first written down in 1694) until at least the mid 1760s, based on Inuktitut, French, Basque, Montagnais, and possibly Spanish and Breton. Louis-Jacques Dorais, An Inuit Pidgin around Belle-isle Strait (1996; with reference to "Clermont - Martijn 1980; Dorais 1980; Bakker 1988"), covers the records:
Louis Jolliet recorded words at Baie Saint-Louis in 1694, including the 'greeting' thou tcharacou, saying the latter word is "peace", which Dorais says is "corroborated by two other sources, from 1717 (characoua) and 1720 (characo). But a text from 1743 (Privy Council 1927: 3284), written by the French merchant Louis Fornel, gives to characo the meaning 'war'." Thou is probably from tu. The other word could be Basque txarrakoa "bad", thus "are you bad?".
Le Cour in 1742 records some more words: bons camaras "good comrades", tous camaras "all comrades", capitaine "captain", kellanoré (which Dorais says "seems to be Le Cour's rendering of Inuktitut kinaunali 'but who is he?'"), the personal name Amargo (a rendering of Amaqqut "Wolves"), rénombek "bead" (probably a loanword), maumek "file" (probably a loanword), monkoumek "knife" (probably a loanword from Montagnais mukuma:n, as spelled in Marguerite Ellen MacKenzie Towards a Dialectology of Cree-Montagnais-Naskapi).
Louis Fornel in 1743 recorded more: tout camara "all comrades", troquo balena "let us trade whale" (from French troquons!), non characo "no war" (sic, per Fornel).
Jens Haven wrote other words in 1764-5: makagua "peace" (perhaps from Basque bake "peace" plus a suffix -koa), kutta (French couteau "knife"), memek "to drink" (from Inuktitut imiq "drinking water").
Few references discuss the lect and it is difficult to judge whether it is really a language or just something like broken French or like Spanglish (which I think we exclude), but the fact that the Inuit apparently changed the meaning and even part of speech of words in their own language when speaking pidgin suggests it is more on the pidgin-language side of that continuum than the code-switching side.
Algonquian–Basque pidgin (crp-abp). Wikipedia has a sample. The Atlas of Languages of Intercultural Communication, citing Bakker, says it was spoken from at least 1580 (and perhaps as early as 1530s) through 1635, and "only a few phrases and less than 30 words attributable to Basque were written down" (though apparently more words, attributable to other sources, were also recorded).
Guachichil (Cuauchichil, Quauhchichitl, Chichimeca) (nai-gch or, if Guachí is added as sai-gch, perhaps nai-gcl to prevent the two similarly-named lects from being mixed up by only typoing the initial n vs s), apparently sparsely attested.
Concho (nai-cnc). The Handbook of North American Indians, volume 10, says "three words of Concho were recorded in 1581 look like they may be Uto-Aztecan".
Jumano (Humano, Jumana, Xumana, Chouman, Zumana, Zuma, Suma, and Yuma) (nai-jmn). The Handbook says "It has been established that the Jumano and Suma spoke the same language. Three words have been recorded" of it.
and from South America:
Peba / Peva (sai-peb), said by Erben to more properly be called Nijamvo, Nixamvo. Spoken in "the department of Loreto" in Peru. Attested in wordlists by Erben and Castelnau, which Loukotka provides, and which disagree with each other substantially: munyo (Erben) / money (Castelnau) "canoe, small boat"; nero (E) / yuna (C) "demon"; nebi (E) / nemey (C) "jaguar"; teki (E) / tomen-lay (C) "one"; manaxo (E) / nomoira (C) "two"; etc. I would even consider that one might not be the same language as the other... what's with these languages that survive in disparate wordlists? lol.
possibly Saynáwa: fr.Wikt grants a code to this variety of the Yaminawá language, described here.
Support all except possibly Akokisa. I think it's a dialect of Atakapa, and that the wordlist is very likely not being linked correctly. That said, it's so few words, that there's no real reason not to accept it as a separate language, just to be conservative about it. —Μετάknowledge discuss/deeds 04:08, 16 August 2016 (UTC)
Good point about Akokisa. (I am reminded that you had mentioned its dialectness earlier; sorry I forgot!) The wordlist, labelled only with a tribal name per WP, is possibly plain Atakapa, but Yegsa is supposedly recorded as specifically Akokisa; OTOH that doesn't rule out that Akokisa is a dialect. Indeed, M. Mithun's Languages of Native North America treats as dialects Akokisa, Eastern ("the most divergent, known from a list of 287 entries") and Western ("the best documented. Gatschet recorded around 2000 words and sentences, as well as texts Swanton recorded a few Western forms", all published in 1932 in a dictionary). I suppose the benefit to treating it as a dialect would be that we could context-label Yegsa and Quiselpoo as {{lb|aqp|Akokisa}} and then Béranger's forms as {{lb|aqp|possibly|Akokisa}} without needing to agonize over which header to put them under. - -sche (discuss) 15:31, 16 August 2016 (UTC)
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
More unattested languages
The following languages have ISO codes, but those codes should be removed, as there is no linguistic material that can be added to Wiktionary. This list is taken from Wikipedia's list of unattested languages, but I have excluded languages which are not definitively extinct (and thus which may have material become available). If there was any reliable source I could find corroborating the WP article's claim of lack of attestation, it is given after the language. —Μετάknowledge discuss/deeds 04:15, 4 April 2017 (UTC)
Unclear if it even existed per The Indigenous Languages of South America: A Comprehensive Guide (Campbell and Grondona).
Barbacoas language (the Wikipedia article has a discussion of the conflation of this unattested language with Pasto, which needs a code; for clarity, I think this should be retired and an exceptional code made explicitly for Pasto)
AIATSIS has the following to say: "According to Ian Green (2007 p.c.), this language probably died before the 1920's and neighbouring groups in the Daly claim it was the language of Peron Island which was linguistically and perhaps culturally distinctive from the nearby mainland societies. Black & Walsh (1989) say that this may or may not have been a dialect of Wadiginy N31." —Μετάknowledge
The 1992 International Encyclopedia of Linguistics, v. 1, p. 337, says "Giyug: 2 speakers reported in 1981, in the Peron Islands in Anson Bay, southwest of Darwin." The 2003 edition repeats the claim that "2 speakers remain". Wikipedia says it's extinct and unattested, but Glottolog, although having no resources on it, suggests it's not extinct. Might be best to leave it alone for now. - -sche (discuss) 01:13, 6 August 2020 (UTC)
Removed, and mcw renamed. Glottolog had only one reference to support the existence of Mawa, Temple (1922), which does not even include a section under that header. There may be confusion with the section on the "Marawa", but that does not even mention what language those people speak. (Temple also knows very little about linguistics; while skimming through, I found that Margi (a Chadic language) was said to be similar to the languages of South Africa.) —Μετάknowledge discuss/deeds 01:39, 6 August 2020 (UTC)
Appendix I in The Indo-Aryan Languages records this language as being a subdialect of Dhundari and the 1901 Indian Census concurs; this is at odds with its description as an unattested Dravidian language, but the geographical specifications seem to match up.
AIATSIS says: "Harvey (PMS 5822) treats Ngomburr as a dialect of Umbukarla N43, but in Harvey (ASEDA 802), it is listed as a separate language." Nicholas Evans confirms in The Non-Pama-Nyungan Languages of Northern Australia that it is unattested.
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
Some spurious languages to merge or remove, 2
remove Adabe
Geoffrey Hull, director of research for the Instituto Nacional de Linguística in East Timor, notes (in a 2004 Tetum Reference Grammar, page 228) that "the alleged Atauran Papuan language called 'Adabe' is a case of the mistaken identity of Raklungu," a dialect (along with Rahesuk and Resuk) of Wetarese. He notes (in The Languages of East Timor, Some Basic Facts) that only Wetarese is spoken on the island, and Studies in Languages and Cultures of East Timor likewise says "The three Atauran dialects—with the northernmost of which the dialect of nearby Lirar is mutually intelligible—are unquestionably Wetarese, and not dialects of Galoli, as Fox and Wurm suggest for two of them (n. 32). The same authors refer (ibidem) to a supposedly Papuan language of Atauro, the existence of which appears to be entirely illusory." (The error appears to have originated not with Fox and Wurm but with Antonio de Almeida in 1966.) - -sche (discuss) 01:45, 31 May 2017 (UTC)
We could repurpose the code into one for those three Atauran varieties of Malayo-Polynesian Wetarese, Rahesuk, Resuk, and Raklu Un / Raklungu (the last of which Ethnologue does list as an alt name of adb, despite their erroneous family assignment of it), perhaps under the name "Atauran Wetarese" for clarity. - -sche (discuss) 01:52, 31 May 2017 (UTC)
Arma (aoh) is also said to be "a possible but unattested extinct language"; I am trying to see if that means it is entirely unattested, or if there are personal/ethnic/place names, etc. - -sche (discuss) 09:45, 3 June 2017 (UTC)
The VU Amsterdam report linked to here seems to indicate that one lect has been given multiple codes, and that "Jair" at least is spurious. Further research wouldn't hurt. —Μετάknowledge discuss/deeds 00:24, 3 October 2019 (UTC)
RFM
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
{{ping|Surjection|Tropylium|Kaarkemhveel}}
After a while of deliberation with Kaarkemhveel and two other future Selkup editors, we have come to the conclusion that it's best to split Selkup into two codes: Northern Selkup (sel-nor) and Southern Selkup (sel-sou), which will both be part of the Selkup family (sel).
These two dialect areas are so different that treating them as a single language would be too bothersome. All subdialects are going to be marked with labels, and provided as languages in descendants sections (much like the two Karelian proper varieties are, or the Zyrian dialects).
The two branches are often treated as distinct: Glottolog splits Selkup into "Kety-Central-Southern Selkup" (Southern) and "Taz-Turukhan" (Northern); The Oxford Guide to the Uralic Languages also shows a split between "Northern Selkup" and "Tomsk region Selkup" (p. 778). A few more examples of papers that do this include Wurm (1997), Budzisch (2015), Vorobeva et al. (2017)...
There is precedent for treating these as different languages: ELP splits the family into three full-fledged languages. On the pages there is the following reasoning for this split: "The three main varieties of Selkup have traditionally been counted as dialects of a single language; their differences are, however, comparable to those between, for instance, Ket, Yug, and Pumpokol".
The Russian institute RAN also splits Selkup into Northern and Southern, as two full-fledged languages.
The Wikipedia article also mentions a Central Selkup. What are you doing with that one? Does it belong to Southern Selkup? —Mahāgaja · talk 14:03, 19 June 2023 (UTC)
No opposition on this much: Northern Selkup is by now clearly distinct from non-Northern and has its own literary standard. Bridging historical data exists but would probably be better handled in Proto-Selkup entries anyway; about all of it is field records and not direct literary use by the speaker community.
Depending on how work on non-Northern Selkup develops, further division could be eventually meaningful too. The other recent handbook, Routledge's The Uralic Languages, Second Edition discusses things from a primarily tripartite Southern / Central / Northern perspective and notes that, though the sharpest modern boundary is Central vs. Northern, the most taxonomically significant difference is Southern vs. {Central, Northern}. I believe currently Southern is better-documented than Central, but the latter is what still has some attempts at literary usage and revival. --Tropylium (talk) 14:48, 19 June 2023 (UTC)
I propose we split Carpathian and Pannonian Rusyn into two codes (rue and rsk respectively, in line with their ISO 639-3 codes), and then set Old Slovak as the ancestor of Pannonian Rusyn. I have made a list of typical Slavic developments on User:Thadh/Rusyn and given both a Pannonian Rusyn form (from Ramač 1995, Српско-русински речник) and a Carpathian Rusyn form (from Kercha 2012, Словник русько-русинськый). I think this proves beyond much of a doubt that Pannonian Rusyn belongs to the West Slavic group, and specifically to the Slovak dialects, while Carpathian Rusyn is part of the East Slavic group. This is also a view that is supported by many scholars. Thadh (talk) 13:28, 14 December 2023 (UTC)
@Thadh would it be possible to add an Eastern Slovak column to your tables (presumably the variety of Slovak that Pannonian Rusyn would be closest to) for comparison? I'm not sure how much extra work that would be, but if it's not a huge amount, it would be helpful. Chernorizets (talk) 13:44, 14 December 2023 (UTC)
Strong support. The reflexes are clear, there are language codes, and it's the right moment to do this as Rusyn isn't highly developed yet, so splitting will be easier. Vininn126 (talk) 13:49, 14 December 2023 (UTC)
@Atitarev: Yes (which is kind of the point). Similarly reflexes of PS palatals, strong yers, and other things. Everything points to Pannonian being West Slavic and Carpathian being East Slavic. Thadh (talk) 22:34, 14 December 2023 (UTC)
@Thadh: I see, thanks. I have yet to digest other differences.
@Atitarev: The language has been influenced by Czech, Ruthenian (> Ukrainian/Rusyn), Hungarian and Serbo-Croatian quite intensively for the last two hundred years, so some inconsistencies due to borrowings are expected. For гарло, this might be a language-specific innovation (I can imagine grdl- and -rdl- overall not being a very easy cluster, and for this specific example Slovincian also does some simplification). дороги is undoubtedly a borrowing though. Thadh (talk) 23:36, 14 December 2023 (UTC)
@Thadh: I think it's worth addressing possible loanwords for your case (e.g. дороги, etc.). Compare English, which has more Romance words than native words, and Korean, which has more Sinitic words than native ones, yet that doesn't change which language family they belong to. Those languages are well described, though; for Pannonian Rusyn, we need to make it explicit, IMO, in case someone questions it. Anatoli T. (обсудить/вклад) 23:49, 14 December 2023 (UTC)
@Atitarev I think the words chosen are unlikely to have been borrowed. Or at least there are enough that are unlikely to have been borrowed that it's even more unlikely that we chose only borrowed words. Vininn126 (talk) 09:49, 15 December 2023 (UTC)
It's been a month and there's been overall support for this. I'm going to mark this thread as closed and lang codes for Carpathian and Pannonian should be assigned. Vininn126 (talk) 12:49, 14 January 2024 (UTC)