Wiktionary:Language treatment requests/Archives/2015-19

Language treatment requests: Archive index

An archive of language treatment discussions from 2015 to 2019. Do not start or continue discussions here; do that in WT:LTR.

Miscellaneous code changes

Nota bene the following discussions

Merging Twi and Fanti into Akan

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Merging Twi and Fanti into Akan

It's bizarre that we currently have all three of Akan (the macrolanguage) and Twi and Fanti (its two dialects). References, even old ones, tend to treat Akan as one language with two or three dialects — from Johann Gottlieb Christaller's 1875 A Grammar of the Asante and Fante Language all the way through Florence Abena Dolphyne's 1988 The Akan (Twi-Fante) Language and the 2011 Modern Akan: A Concise Introduction to the Akuapem, Fanti and Twi Language. The dialects have always been mutually intelligible when spoken, and in 1978 speakers established a unified orthography to make them intelligible in writing, too. - -sche (discuss) 01:06, 1 September 2015 (UTC)

Yeah, I support this. I meant to propose a merger when I posted WT:RFV#gyeografi, but I never got around to it. —Μετάknowledgediscuss/deeds 01:18, 1 September 2015 (UTC)
Merged. - -sche (discuss) 01:35, 27 September 2015 (UTC)



The Asu and Pare languages

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


@-sche (and anyone else who loves language-name conflicts), we currently call aum "Abewa" and asa "Asu". Firstly, it seems that practically nobody calls aum by that name; Wikipedia and Ethnologue both title their entries for it "Asu", and Roger Blench, who may be the only person ever to study it, calls it "Asu" as well. As for asa, which currently occupies that name, Wikipedia and most hits on Google Books call it "Pare" (the remainder call it "Asu" or "Chasu", for the most part). As you may have guessed, there is already a language that we call "Pare", namely ppt, but this New Guinean language is also called "Akium-Pare" and (Wikipedia's choice) "Pa", which thankfully appears to be untaken. The chain of changing language names does seem rather silly, but the overall purpose of this is to move the only language out of these three that actually has any literature on it (and thus the one I just added a translation in), namely asa, to its most commonly used name. —Μετάknowledgediscuss/deeds 05:40, 9 September 2015 (UTC)

I'm down with renaming aum away from Abewa.
On a balance, I'm also OK with renaming asa away from Asu. I can find a decent amount of references to "Asu": it's hard to say whether more or less than "Pare" because both terms turn up so much chaff. There is some documentation of its vocabulary available, incidentally; The Making of a Mixed Language: The Case of Ma'a/Mbugu by Maarten Mous mentions "Pare (Chasu)" muruke "sweat" and tika "lift" (in the context of their having been borrowed without change into Normal Mbugu, and then glottalized into Inner Mbugu muru'u and ti'i); Mbugu also borrowed Pare ku-kasha "hunt", Zigua ku-kala "to hunt", and Shambaa u-kalá "hunting" and ngwilizi "eagle" (source of Normal + Inner Mbugu ngwirizi, variation of l/r being dialectal in Shambaa; contrast Pare ngwirini).
Isaria N. Kimambo's Political history of the Pare of Tanzania, c. 1500-1900 (1969) implies Pare and Asu are different: "Other naming procedures include the use of u- for territorial names, e.g. Upare for the Pare country; and ki- for language, e.g. Kipare for the Pare language. The only exception here is Chasu which refers to the Asu language." John D. Kesby's Rangi of Tanzania: an introduction to their culture (1981) seems to clarify, however: "in the northeast of Tanzania, the people called Pare in Swahili refer to themselves as Asu". (And Maarten Mous provides a bit more detail, that Pare/Asu has at least two dialects, north and south, with the north one apparently also being called Vudee and several spelling variations thereof.)
As for ppt, some works refer to it as "Pari", e.g. The Abandoned Narcotic: Kava and Cultural Instability refers to "the Pari (Pa) language" (not to be confused with lkr Päri - Lokoro). I guess we can rename it "Pa" for now, and switch to "Pari" later if something else called Pa comes up. - -sche (discuss) 05:48, 10 September 2015 (UTC)
Renamed as proposed: aum from Abewa to Asu, asa from Asu to Pare, ppt from Pare to Pa. - -sche (discuss) 00:09, 27 September 2015 (UTC)


Chidigo to Digo (dig)

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming dig

From "Chidigo" to just "Digo", partly because we should try to purge prefixes from our language names where appropriate and partly because the latter name is vastly more used by linguists. —Μετάknowledgediscuss/deeds 19:35, 5 September 2015 (UTC)

Support. - -sche (discuss) 16:55, 8 September 2015 (UTC)
Renamed. - -sche (discuss) 00:26, 27 September 2015 (UTC)


Pol Pomo

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


We currently call pmm "Pomo", which makes it sound like the Pomoan "language" which some people hypothesize exists, and for this reason I almost added a translation (with nested translations for Northern and Central Pomo) using the code in this way. I propose it be renamed "Pol" or "Pol Pomo", the name Wikipedia and the International Encyclopedia of Linguistics use. - -sche (discuss) 13:57, 19 June 2015 (UTC)

Damn, that's very rightfully confusing. Support "Pol", since that's what Wikipedia uses. —Μετάknowledgediscuss/deeds 07:17, 11 August 2015 (UTC)
Renamed. - -sche (discuss) 22:25, 27 September 2015 (UTC)


Aramanik (aam), Aasax (aas)

The ISO merged aam "Aramanik" into aas "Aasax", saying here "Aramanik is listed as a Southern Nilotic language of the Nandi group, presumably because the Aramanik people assimilated to the Nandi. The original Aramanik language was a Cushitic language (or a non-Nilotic language with heavy Cushitic overlay) usually called Aasax (Fleming 1969) and is already included in a separate Aasax entry. Maarten Mous, in A Grammar of Iraqw, also gives them as synonyms."
- -sche (discuss) 22:11, 2 November 2015 (UTC)

Gallurese (sdn), Sassarese (sdc)

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


These are currently named "Gallurese Sardinian" and "Sassarese Sardinian", which has led to them sometimes being nested under Sardinian in translations tables, but this is erroneous because they are (transitional) dialects of Corsican spoken on Sardinia, not dialects of Sardinian. I propose to drop "Sardinian" from their names. See also WT:RFM#Sardinian_templates. - -sche (discuss) 17:52, 17 August 2015 (UTC)

Or we could just merge them into co. —Aɴɢʀ (talk) 19:09, 17 August 2015 (UTC)
We could. They're subject to the same LDL CFI whether we keep them independent or merge them into Corsican (as contrasted with dialects of Italian, for example, which would find themselves subject to much higher CFI if merged into it), and a merger would reduce duplication while we could still note differences with {{label}}s and {{qualifier}}s... so perhaps we even should merge them. But they occupy grey areas. Gallurese is transitional between Corsican and Sardinian; Sassarese is transitional between Corsican, Sardinian and Tuscan (which I guess we consider it?). Ah, dialect continua... - -sche (discuss) 01:45, 18 August 2015 (UTC)
Renamed. Not merged at this time. - -sche (discuss) 21:23, 19 October 2015 (UTC)


Kiyaka language

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


There are two problems here: First, we currently have the code yaf referring to a language we call Kiyaka, but which Wikipedia calls Yaka (without the noun class prefix) and which authors on Google Books seem to agree to call Yaka as well. There are a couple other languages sometimes called Yaka, but fortunately they all have other names that are more common and therefore there is no conflict, and the principal name of yaf should therefore be modified; there are very few categories associated with this one, so it should be easy to change.
Secondly, the Wikipedia article states that the codes noq, ppp, and lnz refer to its dialects, but Glottolog seems to consider them separate languages (possibly just following the ISO rather than actually making a judgement). Therefore we should consider merging these codes, if in fact there are not enough differences (some data would be helpful). @-sche —Μετάknowledgediscuss/deeds 02:16, 18 June 2015 (UTC)

axk currently goes by "Yaka"; if yaf were renamed "Yaka", yaf would need to use a disambiguator or axk would need to be renamed, but to what? axk's alt name of "Aka" is taken (by soh), and I haven't offhand found evidence that anyone calls it by its alt name "Beka".
Ethnologue, although it grants Ngoongo a separate code (noq), labels it a dialect of Yaka in its entry on Yaka. I can find a German reference stating "Eng mit den Yaka verwandt sind die Lonzo, Pelende und Suku." ("Closely related to the Yaka are the Lonzo, Pelende and Suku", the last of which WP and Ethnologue consider to speak a separate language.) A French reference says "Les Pelende ont un accent linguistique propre, mais ils s'entendent avec les Yaka, Suku, Lonzo, Luwa, Hungana, Tsamba, Ngongo, Mbala et Kongo." ("The Pelendes have their own linguistic accent, but they get along with / can understand the Yaka, Suku, Lonzo, Luwa, Hungana, Tsamba, Ngongo, Mbala and Kongo.") Another says "Le kipelênde comme le kiyaka est un dialecte du kikongo commun. Plus répandu, «Le kiyaka comprend quelques neuf dialectes distincts, présentant parfois des variantes assez considérables.»" ("The Kipelênde like Kiyaka is a dialect of the common Kikongo. More widespread, "The Kiyaka includes some nine separate dialects, sometimes with quite considerable variations.")
I can find a small Yaka corpus, but not any comparison of the different dialects.
A conservative approach might leave the codes separate until such time as someone comes along with words in them. - -sche (discuss) 16:20, 18 June 2015 (UTC)
Hmm, re names: soh is also called (Jebel) Silak, or (Jebel) Sillok; Wikipedia uses the name Sillok, but I'm not finding many resources to assess how common that is (and some refer to it by a hyphenated string of dialects). If we can move soh (and perhaps should anyway), then we could move the rest down without disambiguating (so axk would be Aka, and yaf would be Yaka). —Μετάknowledgediscuss/deeds 23:22, 18 June 2015 (UTC)
I sometimes wonder if we should start preferring disambiguators to alt names: they'd be annoying to type, but I think the mere fact that we're discussing a three-link chain of renames (language A takes B's name, B takes C's name, C takes D's name) shows how much clearer they'd be. I pity the new user who e.g. adds aja content under an ==Adja== header, and I pity the veteran user who has to notice that that has happened.
In this case, I can't find evidence of soh being called Sillok, but the people who speak it and the place they live are called Sillok, so at least it wouldn't be unclear. Perhaps we could just rename axk and soh to have disambiguators, though: "Aka (Congo)" and "Aka (Sudan)". - -sche (discuss) 02:02, 19 June 2015 (UTC)
I suggest renaming axk and soh to "Aka (Central Africa)" and "Aka (Sudan)", and then renaming yaf as originally proposed. - -sche (discuss) 06:52, 4 July 2015 (UTC)
Renamed as proposed. What to do with the dialects remains to be determined. - -sche (discuss) 03:47, 12 July 2015 (UTC)
@-sche: I've generally followed the guideline that we avoid such parenthetical geographic locators; were we to use them in general, it would change a great deal of our names. I know few others care, but perhaps we ought to put this to the community at large in the BP? —Μετάknowledgediscuss/deeds 19:56, 28 July 2015 (UTC)
In this specific case, I think there is compelling reason to deviate from the general practice/guideline of preferring alt names to parentheticals even without changing that guideline: in order to use alt names here, we'd have to chain-rename ≥3 languages such that the name each one most often went by was assigned to a different one, and one of them would end up with an unattested name, which would all be extremely confusing.
As for whether/how to change the general guideline: I'll think the matter through more thoroughly before I post anything in the BP. I don't think I'd propose switching to parentheticals in all cases (I think, for instance, that Pyu/Tircul and Riang/Reang use different scripts and so are unlikely to be mixed up). I would only prefer parentheticals where people would be likely to mix up which language was meant by a given name, and where the mix-up would be likely to go unnoticed (e.g. because the script was the same). - -sche (discuss) 21:23, 28 July 2015 (UTC)
Convenient links to previous discussions:
--WikiTiki89 21:38, 28 July 2015 (UTC)


The Tonga languages

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


There are far more Tonga languages than anyone would want to have to deal with, but I am excluding all but two of them in this discussion for the sake of ease. We currently call toi and tog "Tonga" and "Chitonga", but both languages are called by both names, and use of alternative names seems to be vanishingly rare. Our current system is leaving me (and at least one person who tried to give a translation in one of the languages) thoroughly confused, so as much as I find them ungainly, I'd much rather we use parenthetical geographic identifiers than have to go through this madness. (Pinging @-sche as usual (and you ought to take a look at the other ones I've posted recently on this page when you get a chance, if you are so inclined).) —Μετάknowledgediscuss/deeds 05:28, 20 September 2015 (UTC)

I'm all in favor of parenthetical disambiguators, but what should they be? Wikipedia calls toi Tonga language (Zambia and Zimbabwe) and tog Tonga (Nyasa) language, but that seems suboptimal to me since the parentheticals aren't parallel. Ethnologue suggests the majority of toi speakers are in Zambia and all tog speakers are in Malawi, so how about "Tonga (Zambia)" and "Tonga (Malawi)"? —Aɴɢʀ (talk) 13:25, 20 September 2015 (UTC)
Support a rename of toi to "Tonga (Zambia)" and of tog to "Tonga (Malawi)". While we're at it, I think we prefer (do we?) to drop "ki-", "chi-", "gi-" and such African language-name prefixes, so toh could be renamed from "Gitonga" to "Tonga (Mozambique)". Happily, to is distinct as Tongan, and we don't have tnz yet, but it seems to be consistently called "Ten'edn" or "Maniq" (the latter being properly an ethnonym) by its speakers, who SIL says are totally unfamiliar with "Tonga" as the name of a language. - -sche (discuss) 23:50, 26 September 2015 (UTC)
@-sche: I don't know if we've talked about it before, but I think in general we should avoid language-name prefixes. However, there are some exceptions; I prefer "Luganda" to "Ganda", for example, because it's far more commonly used. I'm fine with renaming Gitonga as you suggest. By the way, thanks for dealing with some of these language issues; Kikuyu and Rwanda-Rundi are still lingering on this page, so please give them some love/research when you have a chance. —Μετάknowledgediscuss/deeds 04:07, 27 September 2015 (UTC)
Done. - -sche (discuss) 10:08, 18 February 2016 (UTC)


Wolof

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Gambian Wolof (wof) should be merged into Wolof (wo), IMO. Ethnologue says "Senegalese Wolof intelligible by speakers of Gambian Wolof but with significant enough differences to require adaptation of materials", which seems to have been their motive in splitting these lects (like so many others). - -sche (discuss) 06:51, 25 February 2016 (UTC)

Support. We can make Gambian Wolof a regional dialect of Wolof and tag relevant words {{lb|wo|Gambia}}. —Aɴɢʀ (talk) 08:38, 25 February 2016 (UTC)
Done (no entries used the code). - -sche (discuss) 05:11, 29 February 2016 (UTC)


Raga or Hano

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming lml

It seems to be more common to call this language "Raga", as Wikipedia does. Compare e.g. google books:"Raga" "vavine" (several books mentioning the language) vs google books:"Hano" "vavine" (no relevant hits). - -sche (discuss) 09:11, 27 February 2016 (UTC)

That's a good way to demonstrate commonness of use. Support renaming to Raga. —Μετάknowledgediscuss/deeds 01:44, 4 March 2016 (UTC)
Renamed. Compare also google books:"Raga" "wai" "water" vs the same with "Hano". - -sche (discuss) 07:23, 22 March 2016 (UTC)


Sauk or Ma Manda

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming skc

Currently called "Sauk", but Wikipedia, SIL publications on the language, and work by Alexandra Aikhenvald all call it "Ma Manda". A happy side effect of the move would be that we could add "Sauk" as an alias for sac, as it is probably the second most common name for that language. —Μετάknowledgediscuss/deeds 06:34, 3 November 2015 (UTC)

Renamed per nom. Sure enough, the only entries with "Sauk" translations are referring to sac. - -sche (discuss) 09:07, 27 February 2016 (UTC)


Waray-Waray and Warray

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Considering war and wrz

Our current situation is to call war "Waray-Waray" and wrz "Waray"; this is not necessarily an optimal solution. Wikipedia chooses to call war "Waray" and wrz "Warray"; although "Warray" is less common than "Waray" to refer to wrz (as far as I can tell), this gives the commonest name of war to that language, which probably deserves priority due to being much more studied. At Template talk:war, you can see that the idea to rename war to "Winaray" was rejected and Liliana's choice of "Waray-Waray" won out. However, it's clear that our current situation has caused some confusion (User:DTLHS/cleanup/mismatched translation codes shows a lot of misuse of wrz when war was intended). Basically, what we have now isn't bad, but the fact is that it's resulted in mismatched codes, so we might want to try a different approach. —Μετάknowledgediscuss/deeds 20:46, 14 September 2015 (UTC)

I'd prefer minimizing ambiguity by calling war "Waray-Waray" and wrz "Warray" so that no language at all is called by the ambiguous name "Waray". —Aɴɢʀ (talk) 14:56, 15 September 2015 (UTC)
To be fair I think most of the mistakes were caused when the language was renamed but the translations weren't edited, not by whoever added them in the first place. DTLHS (talk) 23:49, 26 September 2015 (UTC)
I've renamed wrz in the manner Angr suggested. - -sche (discuss) 09:04, 27 February 2016 (UTC)


Sura or Mwaghavul

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming sur

The name "Sura" is ambiguous (as noted on Talk:am), and the name "Mwaghavul" seems to be more common (compare google books:"Sura language", google books:"Mwaghavul language") — and is, in any case, quite common — so I propose to rename sur from "Sura" to "Mwaghavul". - -sche (discuss) 20:25, 23 August 2015 (UTC)

Support. —Μετάknowledgediscuss/deeds 04:40, 31 August 2015 (UTC)
Renamed. - -sche (discuss) 05:06, 28 February 2016 (UTC)


Tingal and Tegali

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


In 2011, the ISO retired the code for Tingal, merging it into Tegali. I think we should follow suit. Wikipedia notes that there is dialectal variation in Tegali, but it's not between Tegali proper and Tingal, it is rather between Tegali proper and Rashad (but even those dialects are "nearly identical"). - -sche (discuss) 06:15, 11 August 2015 (UTC)

Done. - -sche (discuss) 03:26, 29 February 2016 (UTC)


Batak or Palawan Batak

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming bya

-sche has pointed out that we call this language "Batak"; that is an awful idea, due to the existence of the Batak languages. To reduce confusion, we should do as Wikipedia does, and call it "Palawan Batak". —Μετάknowledgediscuss/deeds 06:18, 29 February 2016 (UTC)

@Μετάknowledge: I support this renaming. — I.S.M.E.T.A. 01:30, 4 March 2016 (UTC)
Support, naturally. "Palawan Batak" still sounds like it might be a Batak language, but at least it stops people from seeing one of the Batak languages and entering it into Wiktionary as bya (compare the recent rename of "Pomo" to "Pol Pomo"). - -sche (discuss) 02:03, 4 March 2016 (UTC)
Done. - -sche (discuss) 07:09, 14 April 2016 (UTC)


Shuadit

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Shuadit (sdt) aka Judeo-Occitan or Judeo-Provençal is most definitely not an independent language. The literature seems to be quite sure on this point: Banitt refers to it as a "langue fantôme", Vouland as a "non-langue", and Alessio as a "langue imaginaire", to quote three especially scathing francophone scholars. One main scholar is Szajkowski, who seems to have made up a great deal about it (including the name Shuadit, of which there is no evidence of use) and who "was no linguist, and his knowledge of Occitan was quite poor" (Strich and Jochnowitz). Moreover, the so-called last speaker, and a chief primary source, Armand Lunel, was evidently a semi-speaker who was not actually fluent in the "language". Glottolog sums all this up by saying: "This entry is spurious. This means either that the language denoted cannot be asserted to be/have been a language distinct from all others, or that the language denoted is covered in another entry." To the extent that anyone wants to enter the paltry Hebrew-script text, it can be done as Old Provençal or Occitan, depending on how old it is. —Μετάknowledgediscuss/deeds 06:33, 14 April 2016 (UTC)

@Metaknowledge: Good to know. Is there any evidence that it was a dialect, or was it basically just a script variant like Judeo‐French? --Romanophile (contributions) 06:41, 14 April 2016 (UTC)
Pretty much just a script variant. That was a popular thing to do, because using the Latin script was seen as too associated with the Church and distant from the Jewish educational tradition. There have been many other isolated incidences of languages like Urdu and Samogitian being written in Hebrew script as well. —Μετάknowledgediscuss/deeds 06:45, 14 April 2016 (UTC)
@Metaknowledge: do you think that much of Judaeo‐Romance contains only superficial differences? Most of them are considered ‘extinct’ by Wikipedia, barring Ladino and Judeo‐Italian. Judaeo‐Italian notwithstanding, Ladino appears to be the language of Latin Jews around the world. --Romanophile (contributions) 07:13, 14 April 2016 (UTC)
I think so, but I'd like to read through the chapter on each in the Handbook and consult some other sources before deciding. I obviously like Jewish languages, but I think that the line between language and dialect is being abused, and that we should find better ways to document these. —Μετάknowledgediscuss/deeds 07:20, 14 April 2016 (UTC)
Merge into (Old) Provençal / Occitan per nom. - -sche (discuss) 03:23, 19 April 2016 (UTC)
Merged. - -sche (discuss) 02:56, 28 May 2016 (UTC)


Zarphatic

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


This name is used rather uncommonly, and really only by Jewish philologists, not mainstream linguists. It should be changed to the much more common "Judeo-French".

Unrelated to the name change, I'm not fully sure that we should have this as a separate language. Kiwitt and Dörr (2015) say: "It should be noted, however, that the major part of linguistic data attested in Judeo-French sources is simply common Old French written in Hebrew script, with some texts showing little to no register variation in comparison with Christian Old French sources." They go on to discuss one extensive text, a biblical glossary, where only 6% of words were not attested in Christian Old French texts. Basically, this is similar to Hindi and Urdu — is it worth keeping separate? (And if you think it'd be strange to have Hebrew-script entries under ==Old French==, remember that we have Arabic-script entries under ==Afrikaans==.) @-sche, Renard Migrant, Wikitiki89 —Μετάknowledgediscuss/deeds 04:52, 13 April 2016 (UTC)

Interesting. I’d be fine with a merge or a rename. Script variants and dialectisms can simply be marked with Judeo-French. I would love to work on this dialect, but I have no idea where to find any texts. --Romanophile (contributions) 05:08, 13 April 2016 (UTC)
Before commenting here, I was planning to read the Judeo-French chapter of my new copy of the Handbook of Jewish Languages, but I now realize that Metaknowledge also recently acquired this book and has probably already read this chapter and probably only started this discussion because of that. So I'll just assume that his conclusion is the same that I would have drawn and say that I agree that this should be merged with Old French. --WikiTiki89 21:10, 13 April 2016 (UTC)
Precisely. I intend to work through the entire book, improving how we cover Jewish languages. —Μετάknowledgediscuss/deeds 01:10, 14 April 2016 (UTC)
Rename per nom; see also ngrams. Also merge per nom; perusing the references that turn up if I just search for references that mention both lects (google books:"Judeo-French" "Old French"), I find that they agree:
  • Raphael Patai, Encyclopedia of Jewish Folklore and Traditions (2015, →ISBN), page 316: "Judeo-French is Medieval (Old) French as spoken and written by French and Rhenish Jews. It differs from the other “Judeo” languages in that there were no dialectal differences between it and the Old French spoken by the non-Jews"
  • Aaron D. Rubin, Lily Kahn, Handbook of Jewish Languages (2015, →ISBN), page 139: "However, this term does not imply the existence of a set of linguistic features common to these sources that would allow identifying a 'Judeo-French' language or dialect distinct from the varieties of Old French encountered in Christian sources."
Yes, this will result in Hebrew-script Old French (alternative-form-of) entries, and that is OK. - -sche (discuss) 00:16, 14 April 2016 (UTC)
If I remember correctly, a number of Old French words are attested first in Rashi's writings, and those writings have been used in recent years by scholars of Old French to fill in gaps in knowledge of other aspects of the language. I'm sure some of our etymologies already include those Zarphatic words as Old French, so we might as well make it official. I agree we should both rename the lect and merge it with Old French. Chuck Entz (talk) 02:29, 14 April 2016 (UTC)
Merged. - -sche (discuss) 03:01, 28 May 2016 (UTC)


the Lega lects

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


We currently have separate codes for three lects considered part of the Lega macrolanguage: lea, lgm, and khx. These are reasonable to separate into two languages, lgm (which should be called Lega-Ntara) and lea (which should be called Lega-Malinga) as there is 67% mutual intelligibility, but khx is clearly a variety of lea. All this is per the treatment of the Beya dialect of lea in The Bantu Languages. —Μετάknowledgediscuss/deeds 04:23, 22 November 2015 (UTC)

Support; merge khx into lea, with lgm remaining separate, and name everything per nom. I'm not sure why Wikipedia names the two main dialects using placenames. The full array of alternate names I encountered in (cursorily) researching the matter:
  • Mwenga Lega = Lega-Ntara / Lega Ntara (variously translated in refs as "Lower Lega", "Upper Lega" or "Eastern/Northern Lega") = Isile, Ishile, Kisile; Mwenda-Liga
  • Shabunda Lega = Lega-Malinga / Lega Malinga (variously translated in refs as "Upper Lega", "Lower Lega" or "Forest Lega" or "Western/Southern Lega") = Lega (Kilega) / Liga (Kiliga) proper; dialects: Kanu (Kikanu), Gala (Kigala), Yoma (Kiyoma), Sede (Kisede), Gonzabale, Beya (Beia), and possibly (Ki)Nyamunsange and Banagabo and Kabango and Bene
- -sche (discuss) 22:22, 25 November 2015 (UTC)
I've merged khx into lea. However, as names go, "Lega-Shabunda" and "Lega-Mwenga" seem to be more common than "Lega-Malinga" or "Lega-Ntara". - -sche (discuss) 05:06, 29 February 2016 (UTC)
(Re)named "Lega-Shabunda" and "Lega-Mwenga". - -sche (discuss) 18:55, 22 March 2016 (UTC)


Kikuyu

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming ki

This is a hard one, and I'm not advocating one way or the other, just wishing to raise the issue. Ngrams reveal that in 1992, "speaking Gikuyu" and "Gikuyu language" became more common than "speaking Kikuyu" and "Kikuyu language" (as well as becoming the linguistic standard), yet overall the spelling "Kikuyu" is still more common, presumably in speaking of the people, who are less obscure than their language. We follow Wikipedia in using the spelling "Kikuyu", but this spelling is clearly no longer favoured for the language (if you're curious, the "g" is to reflect the etymon, Kikuyu Gĩkũyũ). Should we change it? —Μετάknowledgediscuss/deeds 19:55, 5 September 2015 (UTC)

  • @-sche, Liliana-60, Angr, Chuck Entz, Stephen G. Brown: I would really like to get some opinions on this. —Μετάknowledgediscuss/deeds 06:01, 27 February 2016 (UTC)
    • I really don't have a strong opinion on this. Personally, I still think of the language as Kikuyu, which makes it difficult for me to come out and say "Yes, we should rename it", but the reasons you mention make it difficult for me to come out and say "No, we shouldn't rename it". So I abstain. I'll be happy if we continue to call it Kikuyu, but I won't be unhappy if we start calling it Gikuyu. (I will be unhappy if we start calling it Gĩkũyũ, though, since that really isn't an English word.) —Aɴɢʀ (talk) 08:19, 27 February 2016 (UTC)
      • I am very familiar with the English name Kikuyu for both the language and the people (especially for the language), but I have never seen Gikuyu used in English. I think Gikuyu is the native name (more properly Gĩkũyũ). —Stephen (Talk) 22:50, 27 February 2016 (UTC)
As you note, there are phrases where ngrams suggest Gikuyu is now more common as a name for the language, but there are also phrases (1, 2) where Kikuyu is still more common even as a language name. I'd stick with the current spelling. - -sche (discuss) 05:00, 28 February 2016 (UTC)
Not renamed at this time. - -sche (discuss) 23:40, 2 July 2016 (UTC)


Kristang

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Judging by Ngrams, "Malaccan Creole Portuguese" is the least common name for this language; more common is "Malacca Creole Portuguese" with no n, and most common is "Kristang". In Glottolog's list of materials on it, I note that most of the modern material on it (by Baxter and Marbeck) calls it Kristang. I suggest renaming. That entails updating several entries and moving several categories. - -sche (discuss) 19:01, 20 March 2016 (UTC)

Support. —Μετάknowledgediscuss/deeds 00:19, 2 April 2016 (UTC)
Kristang gets a misleadingly high number of hits because it’s also the name of the people that speak it, and part of the synonym Papia/Papiah/Papiá Kristang. — Ungoliant (falai) 01:05, 2 April 2016 (UTC)
@Ungoliant MMDCCLXIV: Given that you're the most knowledgeable about PT-based creoles, what would you prefer? —Μετάknowledgediscuss/deeds 17:36, 3 April 2016 (UTC)
Kristang, but Papia Kristang or Malacca Creole Portuguese are also good options. The most important works about this language use Kristang more prominently. — Ungoliant (falai) 19:30, 3 April 2016 (UTC)
Done. - -sche (discuss) 07:30, 3 July 2016 (UTC)


Loreto-Ucayali Spanish

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


We currently have a code spq for this dialect of Spanish; Wikipedia has an article for it at Amazonic Spanish which states that "Ethnologue's reasons for doing this are poorly documented." Although it has some mild differences, it is clearly a dialect of es and should be merged into it. (There are no entries, but we should record the merger.) —Μετάknowledgediscuss/deeds 17:35, 3 April 2016 (UTC)

Support. — Ungoliant (falai) 20:27, 3 April 2016 (UTC)
Merge, IMO. (The only thing that gives me pause is the difference in attestation requirements: if Amazonian Spanish isn't well documented, then treating it as a separate lect allows its entries to be held to a lower attestation requirement. But the same could be said of a lot of varieties that don't have their own codes, e.g. New Mexico and Southern Colorado Spanish, for which there are several published references.) - -sche (discuss) 00:50, 4 April 2016 (UTC)
Done. - -sche (discuss) 19:23, 2 July 2016 (UTC)


Shempire Senoufo

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


The Senoufo lects are a mess, and we have no consistency in naming them (some are Senoufo, some Sénoufo, some without the Senoufo at all). This one, seb, seems not even to be a separate lect at all, but instead what Supyire spp is called in Côte d'Ivoire. —Μετάknowledgediscuss/deeds 20:25, 3 April 2016 (UTC)

Yes, Wikipedia and Omniglot agree they are the same. (The old 2003 International Encyclopedia of Linguistics said their "Relationship undetermined" at that time.) As for the other Senoufo lects: I noticed one while checking translations at water, and removed "Senoufo" from its name before I added entries in it because I saw how rarely it was actually referred to with "Senoufo" in the name. Supyire too seems to be mostly referred to without "Senoufo". - -sche (discuss) 00:45, 4 April 2016 (UTC)
Eventually we'll have to get to renaming them. For now, I just wanted to excise duplicates. Also, thanks for the archiving. Μετάknowledgediscuss/deeds 01:55, 4 April 2016 (UTC)
Merged. - -sche (discuss) 19:26, 2 July 2016 (UTC)


Brythonic to Brittonic

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


@Angr According to w:Brittonic languages, the name "Brittonic" is far more common than "Brythonic", which is apparently rather outdated. We should use the more common name. —CodeCat 17:30, 12 April 2016 (UTC)

@CodeCat This is pretty funny. I'm fairly certain I used to use "Brittonic" and then you corrected me to "Brythonic" (rightfully, since it is the one we use currently), but it's humorous to me that we're now suggesting the change. I definitely think we should change it to "Brittonic". —JohnC5 18:21, 12 April 2016 (UTC)
@CodeCat, JohnC5: What about the (unsourced) "Some authors reserve the term Brittonic for the modified later Brittonic languages after about AD 600." statement? — I.S.M.E.T.A. 14:45, 13 April 2016 (UTC)
Google Books Ngrams shows what looks to me like virtually a statistical tie since 1950, though Brythonic has been more common since the turn of the century. I really don't have a strong preference either way. —Aɴɢʀ (talk) 09:28, 14 April 2016 (UTC)
@Angr: That Ngrams result is very interesting and makes me lean towards keeping it as “Brythonic.” —JohnC5 14:56, 14 April 2016 (UTC)
@CodeCat, JohnC5, Angr: Yes, keep as “Brythonic”. — I.S.M.E.T.A. 15:27, 14 April 2016 (UTC)
Yes, based on the ngram, keep as "Brythonic". - -sche (discuss) 03:02, 28 May 2016 (UTC)
Not renamed at this time. - -sche (discuss) 04:49, 3 July 2016 (UTC)


Extinct languages of the Marañón River basin

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


I reckon that we should add exceptional codes for the ones that have words directly recorded from them, even though it's precious few for each.

  • Palta (separate article) could be qfa-jiv-pal, in imitation of Linguist list's jiv-pal, but given that its family assignment is not one hundred percent certain and a three-part code is annoying and abnormal, we could also settle for sai-pal. I don't know why Jivaroan's language family uses qfa rather than sai, which seems to be our default for unsorted South American languages.
  • Rabona could be sai-rab.
  • Patagón could be sai-pat.
  • Bagua could be sai-bag.
  • Copallén could be sai-cop.
  • Tabancale could be sai-tab.
  • Chirino could be sai-chi.
  • Sácata could be sai-sac.

I'm not sure I see a point in adding codes for languages where the only words are elements in toponyms or names, rather than directly recorded. However, Puruhá language, Cañari language, Panzaleo language, Caranqui language, and others can be added if there is interest. After all, ISO already has codes for European languages like Dacian that aren't much better attested, and I suppose we could have an entry or two. @-sche Μετάknowledgediscuss/deeds 05:13, 26 May 2016 (UTC)

Re "why Jivaroan's language family uses qfa rather than sai": presumably human error. It could be updated to "sai-jiv". Re whether to use "sai-pal" or "sai-jiv-pal": for consistency with other codes ("nai-yuc-yav", etc) and with the schema described in WT:LANG, we should use "sai-jiv-pal" if we accept that it was a Jivaroan language. But as you say, the family identification is speculative (although the evidence which does exist is consistent with it). I suppose we could use "sai-pal" to be 'safe' / 'conservative' about the family identification and get a shorter code... this also lets us add the others as "sai-" codes without worrying we ought to reassign them if we later create a family code for the families they belong to. (Indeed, we already have a code for the Cariban family which Campbell and Grondona say Patagón belonged to.) - -sche (discuss) 22:27, 26 May 2016 (UTC)
@-sche: I think it's better not to make any assumptions for these languages' genetic affiliations. And do you want to add the languages I mentioned in my last paragraph? —Μετάknowledgediscuss/deeds 04:02, 27 May 2016 (UTC)
I've added Palta as sai-pal, added Rabona, Patagón, Bagua, Copallén, Tabancale, Chirino and Sácata, and also renamed the Jivaroan code to sai-jiv and changed Esmeralda's code from qfa-und-esm to sai-esm to fit the usual naming scheme. - -sche (discuss) 15:42, 29 June 2016 (UTC)
I have added Puruhá as sai-prh. There was at one point a grammar of it (though it has been lost), so we know it existed as a discrete lect. And based on personal- and place-names, words in it have been reconstructed by scholars. Even if the only words in it we can add are words in the Reconstruction: namespace, that does seem worth having a code for (a code also lets us reference it when giving the etymologies of those personal- and place-names). The other languages could probably be given codes on the same basis (if words in them have been reconstructed). - -sche (discuss) 00:56, 1 July 2016 (UTC)
I've also added codes and entries for Cañari, Panzaleo and Caranqui. Everything here is done, I think. - -sche (discuss) 20:34, 2 July 2016 (UTC)


Saliba languages

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


We currently have a language called Saliba (sbe) and one called Sáliba (slc). In my mind, having two languages' names differ only in a diacritic is not acceptable. It confuses automated programs as much as human editors. I'm not sure there are any acceptable alternative names, so I propose using geographic disambiguation for sbe as "Saliba (Papua New Guinea)". @-sche Μετάknowledgediscuss/deeds 04:15, 26 June 2016 (UTC)

And slc should be "Saliba (Colombia)". As a side note, w:Sáliba language redirects to w:Saliba language. —Aɴɢʀ (talk) 13:30, 26 June 2016 (UTC)
Alternately we could call slc "Saliva". This seems to get about 10× more Ghits for both the ethnic group and the language (though interference from saliva is possible). --Tropylium (talk) 04:34, 27 June 2016 (UTC)
Indeed the current names are so confusing that I got them backwards when I added the only three entries we have in the two languages. I think that disambiguating them both with parentheticals is clearer than allowing one to keep the ambiguous name (either while renaming the other to "Saliva" or while disambiguating the other). Current practice suggests that we should rename slc to "Saliva" or "Sáliva", but I wouldn't mind if we started making more frequent use of disambiguators instead. - -sche (discuss) 08:50, 27 June 2016 (UTC)
My first choice is "Saliba (Colombia)", my second choice is "Sáliva". It's bad enough we have Anus language to protect from puerile vandalism without also having Saliva language. —Aɴɢʀ (talk) 09:40, 27 June 2016 (UTC)
Alright, I'll rename them both to use parenthetical disambiguators. PS, don't forget Category:Anal language. - -sche (discuss) 05:16, 28 June 2016 (UTC)
Done, except that per previous discussion on using geographic disambiguators rather than national ones, I used "New Guinea" rather than "Papua New Guinea". (Other languages also use "New Guinea" as a parenthetical disambiguator in their canonical names, and only mention "Papua New Guinea" in alt names because SIL uses national disambiguators like that.) - -sche (discuss) 05:24, 28 June 2016 (UTC)


Loarki and Gade Lohar

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Ethnologue encoded this language twice: once as lrk "Loarki" (the name its 20,000 Pakistani speakers call it), and a second time as gda "Gade Lohar", the name its ~500 speakers on the Indian side of the border call it. (The International Encyclopedia of Linguistics entry on Gade Lohar conservatively only says the languages "may be the same" as Loarki, and notes its long list of other names: Gaduliya Lohar, Lohpitta Rajput Lohar, Bagri Lohar, Bhubaliya Lohar, Lohari, Gara, Domba, Dombiali, Chitodi Lohar, Panchal Lohar, Belani, and Dhunkuria Kanwar Khati. The IEL entry on Loarki is more explicit, breaking down the population by country and counting Gade Lohar's 500 speakers as Loarki speakers, because Loarki is "probably the same as Gade Lohar in Rajasthan, India, a Rajasthani language.") I propose to merge gda into lrk. - -sche (discuss) 20:50, 11 August 2015 (UTC)

Done. - -sche (discuss) 03:23, 11 July 2016 (UTC)


Dhuwal

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


The ISO has retired the code duj and replaced it with dwu (in the process of splitting off dwy "Dhuwaya"). If we want to follow the ISO, all our Dhuwal entries and categories need to be switched from duj to dwu, which seems like a lot of unnecessary bother. (Whoever does this should also add dwy to the module.) - -sche (discuss) 05:54, 24 February 2016 (UTC)

On one hand, it's unnecessary; on the other hand, I think it would be ideal to follow ISO in all cases where we don't have a well articulated reason not to do so. —Μετάknowledgediscuss/deeds 06:00, 24 February 2016 (UTC)
Support deprecation of duj. —Μετάknowledgediscuss/deeds 06:19, 29 February 2016 (UTC)
Done. (I also recoded Elfdalian from the nonstandard dlc to the standard ovd, per a BP discussion.) - -sche (discuss) 02:56, 7 July 2016 (UTC)
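For illustration only, the two recodings mentioned in this discussion can be pictured as a lookup from a retired code to its replacement. This is a minimal sketch, not the actual module code: the names RETIRED_CODES and canonical_code are hypothetical, and the table holds only the two pairs named above.

```python
# Hypothetical table of retired codes and their replacements,
# limited to the two recodings mentioned in this discussion.
RETIRED_CODES = {
    "duj": "dwu",  # Dhuwal: ISO retired duj and replaced it with dwu
    "dlc": "ovd",  # Elfdalian: nonstandard dlc replaced by standard ovd
}


def canonical_code(code: str) -> str:
    """Return the replacement for a retired code, or the code unchanged."""
    return RETIRED_CODES.get(code, code)


if __name__ == "__main__":
    for old in ("duj", "dlc", "dwy"):
        print(old, "->", canonical_code(old))
```

A code that was never retired, such as dwy, passes through unchanged, so the same helper could in principle be applied to every entry and category without special-casing.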


Kiriri

The following discussion has been moved from the page User talk:-sche.

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


What do you think about adding a code for this language, and under what name? Wikipedia describes it at Katembri language; they cite Fabre for the claim that this language is only preserved in a single brief wordlist, where it is called Kiriri (the wordlist is on page 22 (section 3.4) of this pdf). Regardless, that document does seem to be a good place to find more words for water. —Μετάknowledgediscuss/deeds 03:24, 2 March 2016 (UTC)

There's a Kiriri attested in a single wordlist collected from an elder from the 1960s that's a Katembri language, and another Kiriri attested in a single wordlist collected from an elder from the 1960s that's a Xukuru language? Well, that's confusing.
The fact that there's only a limited number of words is no reason not to include the language, but it would be good to avoid the ambiguous name Kiriri. How about calling them Katembri and Xukuru, like Wikipedia does? Wait, (as a minor point of curiosity,) if the wordlist is labelled Kiriri, where'd the alternative name come from?
The difficult part will be assigning codes, given that the family affiliation is unclear. - -sche (discuss) 03:45, 2 March 2016 (UTC)
Given the naming issues, I'm suddenly confused about whether I have correctly identified the wordlist being referred to. I don't know anything about any of these languages, so I feel lost (it's so much better in Austronesian, for example, where I at least feel like I have a hold on what goes where). Anyway, that naming scheme makes sense; we can use qfa codes and not worry about the families, no? —Μετάknowledgediscuss/deeds 05:01, 2 March 2016 (UTC)
qfa is the prefix for exceptional family codes. All of our exceptional language codes which start with qfa do so because they start with a family code that starts with qfa, like qfa-ctc-col. There's been at least one case where we've created a family code for an accepted family (qfa-len, the Lencan languages) in order to use it in constructing a language code (qfa-len-slv for Salvadoran Lencan), but Wikipedia notes that scholars aren't certain what family either Kiriri belonged to, so we couldn't do that here because we couldn't accurately, confidently assign either one a family code (even an exceptional family code). I suppose we could construct codes starting with qfa-und, like qfa-und-ktm for Katembri. I wouldn't want to use bare qfa-___ (e.g. qfa-ktm for Katembri) because it would look like a family code. - -sche (discuss) 08:21, 3 March 2016 (UTC)
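The convention described in the comment above can be sketched roughly as follows. This is an illustration under stated assumptions, not the site's actual validation logic: the function name describe_qfa_code is hypothetical, and reading "und" as "undetermined family" is an assumption based on how qfa-und-ktm is proposed here.

```python
def describe_qfa_code(code: str) -> str:
    """Classify a qfa- code by its number of hyphen-separated parts.

    Assumed reading of the convention discussed above:
      qfa-xxx      -> exceptional family code
      qfa-xxx-yyy  -> exceptional language code inside family qfa-xxx
      qfa-und-yyy  -> language code whose family is undetermined
    """
    parts = code.split("-")
    if parts[0] != "qfa":
        return "not a qfa code"
    if len(parts) == 2:
        return f"family code {code}"
    if len(parts) == 3:
        if parts[1] == "und":
            return "language code, family undetermined"
        return f"language code in family qfa-{parts[1]}"
    return "unrecognised shape"


if __name__ == "__main__":
    for c in ("qfa-len", "qfa-len-slv", "qfa-und-ktm", "qfa-ktm"):
        print(c, "=>", describe_qfa_code(c))
```

The last example shows the concern raised above: a bare qfa-ktm has the same shape as a family code such as qfa-len, so a reader (or a script) would take it for one.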


More languages without ISO codes, part 1

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


I have gone through w:Category:Languages without ISO 639-3 code but with Linguist List code (thanks, Angr), and the languages listed below still need exceptional codes. I have not listed those that have no recorded material or toponyms, or those that are treated as a dialect of another language in the linguistic literature (like Akokisa). I have put suggested codes after them, and notes where I'm unsure (please correct me if I made any mistakes). —Μετάknowledgediscuss/deeds 20:17, 3 July 2016 (UTC)


Comments:
  • Amarizana: add per nom. Alexandra Y. Aikhenvald in Languages of the Amazon cautions that "many languages in Amazonia have 'namesakes' more than one group may hide behind the same name", and more than one language has been called Amarizana: "one of the clans of the Piapoco is the Amarizanes (and the name is sometimes applied to the whole group). A now extinct language, also called Amarizana, and from the same Arawak family, used to be spoken in the Meta territory of modern Colombia." Nonetheless, the only references I find unambiguously mean this language. Čestmír Loukotka, Johannes Wilbert, Classification of South American Indian Languages (1968), page 131, lists some Amarizana words alongside and hence obviously distinct from Piapoco, including nuita "head", notuy "eye", nukagi "hand", kaxü "house", sietai "water", eriepi "fire" and keybin "sun". Julian Granberry's A Grammar and Dictionary of the Timucua Language even provides some etymology, connecting Amarizana eri(-...) "fire" to Achagua eri "sun, day", Arekena ale "sun".
  • Amasi: this happens to highlight what a mess our African language family codes are. Several codes use the prefix nic- even though their most immediate superfamily is alv, e.g. nic-vco should be alv-vco. Fortunately, fixing the nic- codes should not require updating very many pages. Once that is done, precedent would have us use alv-bco-... rather than alv-... (compare nai-yuc-tip, qfa-ctc-cat), although the argument in favor of a shorter code is obvious. :-/ Some words are listed in a 1973 article in Africana Marburgensia ('AM') and in a pre-draft working paper cited by WP ('B'), including (AM) / bu (B) "dog", ázɔ́lí (AM) / azɔle (B) "tree", ɣà-nēm (AM) / ɣanim (B) "man", ɣà-zhyī (-zhyì?) (AM) / ɣaʒɛ (B) "woman", mwɔ̄ (AM) / muɔ (B) "water".
  • Anauyá: add per nom. Also called Anauya, but the version with diacritic is more common. has uni "water" and ahiri "sun", the latter confirmed by I Simposio Antonio Tovar sobre Lenguas Amerindias: Tordesillas... (Emilio Ridruejo Alonso, ‎Mara Fuertes, ‎Carlos González-Espresati; 2003) and both are seemingly in the aforementioned Classification of South American Indian Languages, although I can't see the exact snippet.
  • Atanque(s): out of the various names WP mentions, namely "Atanque (Atanques) or Cancuamo (Kankuamo), also known as Kankwe and Kankuí", plus others I ran across (Atanke), "Atanques" seems to be most common, at least as the name of the language. ("Kankuamo" is quite common as a placename(?) that forms part of the designation of a tribe.) A 1962 article in Anthropological Linguistics has some words, including jo̱ke "gourd cup", cognate to cho̱kue (gourd cup), and mo̱ga "two", cognate to mo̱ga (two), and the 1981 Comparative Chibchan Phonology has more words (and may drop the underline from the os of those words; it is hard to see, because all words are underlined), including ji "worm", jinua "six".
  • Ayomán (rarely also Ayoman): I've added a code for Jirajaran, sai-jir, so this language's code should be sai-jir-ayo.
- -sche (discuss) 07:32, 4 July 2016 (UTC)
@-sche: Can't we just do two-part codes so we don't have to feel obligated to create these horribly long ones? It wouldn't clash with all of our preëxisting practice, despite there being some precedent. Also, I'm worried that your careful work on this is going to make this RFM section far too long, and also cause you to burn out. Perhaps this should be a user page that this section links to? —Μετάknowledgediscuss/deeds 07:57, 4 July 2016 (UTC)
Where an existing ISO family code like alv exists, I suppose we could go with two-part codes, but then what should be done with languages that have no ISO family code but instead belong to families for which we've had to create qfa- codes? I suppose they can be treated the same as they are now. But if we accept nai and sai as family codes for this purpose, I suppose that means some qfa- things like Salvadoran Lenca and Catacao can be re-coded. I will update the existing three-part non-proto-language codes if we go that route. I've started Wiktionary:Beer_parlour/2016/July#Shortening_some_.27exceptional.27_language_codes.
Yes, I considered it and probably should start storing long comments and information on addable vocabulary in userspace. I wouldn't worry too much about burnout; we can take time; the only reason there would be a rush to add these codes ASAP is if we wanted to add words in particular ones of them, and if we wanted to add words, we'd need to do some research to find words to add. - -sche (discuss) 17:40, 4 July 2016 (UTC)
Regarding Burgundian, what should we do about Burgundian language (Oïl)? Add it, too? - -sche (discuss) 01:48, 5 July 2016 (UTC)
I'm not sure. I assumed that it could be subsumed under fr, as the Oïl languages usually are, but we probably ought to address it separately. —Μετάknowledgediscuss/deeds 09:01, 5 July 2016 (UTC)
I mean, is it separate from fr? I don't know. In any case, I guess on further consideration it doesn't stop us from adding the Germanic language as "Burgundian", because if the Romance language needs to be added, it can be Bourguignon. (And since they're from different (sub)families, it should be easy to tell which one was meant if someone enters a word from one incorrectly as the other.) I'll collect information about them at User:-sche/Burgundian (Wiktionary:About Bourguignon). - -sche (discuss) 16:50, 5 July 2016 (UTC)
Btw, I found and added another one we were missing, Macoris, attested in one word (baeza) and some placenames. - -sche (discuss) 06:27, 5 July 2016 (UTC)
Oh, there are tons more. This is only the low-hanging fruit; a lot more languages with paltry data are waiting to be dealt with. I'm avoiding the Bantu ones for now, because pretty much all of them are in dialect continua and probably should be left alone unless good scholarship on their mutual intelligibility can be found (which I suppose I should go about finding). There's a pile of Australian ones that I'll get around to listing at some point (I thought maybe I'd give you time to digest all this first), and then even messier ones from South America. —Μετάknowledgediscuss/deeds 09:01, 5 July 2016 (UTC)
FWIW, re 'time to digest', I'd feel free to post any others you have the data to post (except the ones you mention are parts of dialect continua ... might as well leave them alone, as long as some part of the continuum has a code, although if no part does, then we should probably rectify that). There's no harm in it sitting around on the site unattended-to for a while, whereas letting it sit on one's computer sometimes (at least for me) means forgetting where one put it. (I can no longer find the information I thought I had collected on the separability vs mergeability of Haida dialects.) - -sche (discuss) 16:50, 5 July 2016 (UTC)
Regarding Ch'olti': The oldest stage of the language (written in so-called hieroglyphs), sometimes confusingly just called "Ch'olti'" but more often called by other names, has its own code emy. The Colonial- and post-Colonial-era stage is considered distinct from emy, and also considered distinct from the more recent stages of Ch'orti', e.g. one reference says "In the Mayan classificatory tradition, the Ch'olti' language, as recorded in the 1695 grammar of Pedro Moran, is generally held to be related to but separate from the modern language of Ch'orti' (see Kaufman's 1976 classification, for example)." Post-Epigraphic-era Ch'olti' and modern-era Ch'orti' (caa) are theoretically distinguished from each other as different branches of Eastern Ch'olan (and from ctu, as it is a Western Ch'olan language), but the size of the difference between Ch'olti' and Ch'orti' is hard to ascertain, especially because, quoth WP, "the post-colonial stage of the language is only known from a single manuscript written between 1685 and 1695" (as afore-mentioned). For that matter, the size of the difference between Epigraphic Ch'olti'an and Colonial Ch'olti' is not obvious to me; Søren Wichmann, The Linguistics of Maya Writing (2004), page 271, says "In this section we show how Classic Ch'olti'an became seventeenth-century Ch'olti'. The chief grammatical difference between the grammars of Classic Ch'olti'an and Ch'olti' is the difference between straight- and split-ergativity." As an example, mi "father" is used in Classic and Colonial Ch'olti'(an) and in Ch'orti'. Nonetheless, given that the corpus of post-emy Ch'olti' is small and well-defined, it shouldn't be that hard to include it separately from emy and caa. - -sche (discuss) 21:42, 5 July 2016 (UTC)
I also added Wanham. - -sche (discuss) 22:07, 10 July 2016 (UTC)
I've split the discussion so that ones that are done can be archived. - -sche (discuss) 19:28, 10 July 2016 (UTC)


The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


We seem to have this as a single macrolanguage, yok, despite the fact that the constituent lects seem to constitute at least a few languages. -sche added some entries, but it's all tagged by (dia)lect, so it will be easy to separate them. I think we should retire yok and replace it with exceptional codes to reduce confusion, but I am not sure what those divisions should be. —Μετάknowledgediscuss/deeds 20:17, 3 July 2016 (UTC)

Yes, the difficulty is in deciding how to divide it. Christopher Loether cautions that although e.g. "Gayton (1948) listed 26 named groups or 'tribelets' many of these named groups speak dialects which are nearly identical phonologically, lexically and syntactically, while others speak varieties which are indeed quite distinct from their neighbours' speech. Kroeber divided language into two main divisions: Valley and Foothill Yokuts. He further divided the Valley Branch into Northern and Southern, and the Foothill branch into Kings, Tule-Kaweah, Poso, and Buena Vista. Newman (1944: 5-3) agreed with Kroeber's analysis of a single Yokuts language and stated that his data corroborated Kroeber's dialect divisions."
Kroeber lists 20+ dialects, of which 21 are named in ]. Wikipedia has a tree/bush diagram from Whistler and Golla of 23-28 dialects, including all of those 21 plus Koyeti, Merced "(?)", Noptinte (Nopchinche(s), Nopthrinthre(s), Nopṭinṭe, Nopthrinte, Noptinci), Yachikumne a.k.a. Chulamni, Lower San Joaquin Yokuts, and Lakisamni "(?)", and Tawalimni. (Several have multiple names, e.g. Ayticha is also called Kocheyali as well as Ayitcha; Palewyami is also Altinin and Poso Creek Yokuts in addition to Paleuyami. And Hometwoli is also Taneshach?)
For Yawelmani and Chukchansi, decent resources exist; in addition to those two, Wikchamni and Tachi are also being taught according to WP, and in addition to those four, Choinimni and Kechayi also have at least some speakers according to WP.
WP says the Yokutsan family consists of "half a dozen" languages, but evidently not the six just named, because those six leave out several major branches that Kroeber, Newman, and Whistler and Golla all agree on.
I suggest we create a family code nai-yok for the Yokutsan languages, and then distinguish the following branches which Kroeber, Newman, and Whistler and Golla all consider distinguishable, without splitting them further at this time:
  1. Palewyami (nai-ply — or putting the y at the start so the codes sort together and are more apparently connected — on further thought, nah) a.k.a. Poso a.k.a. Poso Creek
  2. Buena Vista Yokuts (nai-bvy) a.k.a. Tulamni-Hometwoli
  3. Tule-Kaweah Yokuts (nai-tky) a.k.a. Wikchamni, Yawdanchi
  4. Kings River Yokuts (nai-kry) a.k.a. Choinimni, etc
  5. Gashowu (nai-gsy), which Kroeber and Whistler/Golla agree is intermediate between Kings River and Northern Valley, though Kroeber considers it ultimately/genetically Kings
  6. Southern Valley Yokuts (nai-svy) a.k.a. Yawelmani, Tachi, etc
  7. Northern Valley Yokuts (nai-nvy) a.k.a. Chukchansi, Kechayi, etc
  8. Delta Yokuts (nai-dly) a.k.a. Far Northern Valley Yokuts
A more conservative approach would keep Southern Valley Yokuts, Northern Valley Yokuts, and Delta Yokuts together as "Valley Yokuts", but Delta Yokuts is relatively divergent. - -sche (discuss) 22:16, 3 July 2016 (UTC)
Pinging @Chuck Entz in case you have insight or input on this Californian language family. - -sche (discuss) 04:01, 5 July 2016 (UTC)
Not much, I'm afraid. A couple of decades ago I read everything I could find on the Wikchamni, who used to live in the area where my brother lives now. I read all of the sources you mentioned above, but I was more interested in the ethnobotany of the Yokuts than their languages, per se, and that was a long time ago.
On a side note, I remember one of my professors at UCLA back in the 80s saying that Yawelmani was one of the best-understood languages in the world at the time from a theoretical perspective, because so many linguists had been publishing papers on it- it was sort of the linguistic equivalent of a model organism. Chuck Entz (talk) 04:25, 5 July 2016 (UTC)
Done. - -sche (discuss) 06:51, 15 July 2016 (UTC)


Merging Malay

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Malay lects

I propose to merge the following Malay lects into Malay:

  • Jakun
  • Orang Kanaq
  • Orang Seletar
  • Temuan

These are all mere dialects of Malay with no written tradition and perfectly mutually intelligible. Even Ethnologue says they should be considered dialects of Malay rather than separate languages. -- Liliana 23:39, 28 February 2016 (UTC)

Support. - -sche (discuss) 02:36, 29 February 2016 (UTC)
Support. — I.S.M.E.T.A. 01:32, 4 March 2016 (UTC)
Done. - -sche (discuss) 04:20, 3 July 2016 (UTC)


Ngadha

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming nxg

This language is far more commonly called "Ngadha" than "Ngad'a"; the latter spelling is so rare that when I was trying to verify our "Ngad'a" translation of water using that spelling for the language name, I couldn't find any references at all (they all spell it "Ngadha"). This rename entails moving a few categories and updating a handful of entries. - -sche (discuss) 02:32, 29 February 2016 (UTC)

Done. Should Eastern Ngad'a be merged? - -sche (discuss) 04:47, 3 July 2016 (UTC)


Bontoc

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


We currently include both the macrolanguage Bontoc (code: bnc) and its dialects, particularly Eastern Bontok (not even using the same spelling, you notice! code: ebk) which we have about 35 translations in. IMO, it rarely makes sense to include both a macrolanguage and also all of its dialects; we should usually have one or the other but not both. Ethnologue says the dialects are "reportedly similar", as if they split bnc into dialects in 2010 without knowing enough about them to tell whether they were similar or distinct. The International Encyclopedia of Linguistics considers Central Bontoc to be only 56% intelligible with Eastern Bontoc, which is only a few percentage points better than the intelligibility of the various Bontocs with Ilocano, suggesting that at least Central and Eastern Bontocs, if not the others, are different languages. Our ~15 "Bontoc" (bnc) entries seem to be Central (Igorot) Bontoc and could be relabelled accordingly if we deprecated bnc. - -sche (discuss) 07:48, 29 February 2016 (UTC)

@-sche: Relabelled "Central Bontoc" or "Igorot Bontoc"? And is it "Bontoc" or "Bontok"? Whatever the details, I support the idea of reducing the macrolanguage Bontoc (bnc) to an etymology-only language in favour of having translations and entries for the various Bontoc languages. — I.S.M.E.T.A. 01:30, 4 March 2016 (UTC)
"Central Bonto(c|k)" is more common than "Igorot Bonto(c|k)" or "Bontok Igorot", and "Bontoc" is more common than "Bontok". I've tweaked the canonical spelling of Central Bontoc (lbk) accordingly; I suppose the other Bontocs which are currently spelled with k should also be updated. - -sche (discuss) 01:58, 4 March 2016 (UTC)
@-sche: "Central Bontoc" it is, then. — I.S.M.E.T.A. 02:14, 4 March 2016 (UTC)
A number of works refer to "the Bontoc language" without specifying which of the Bontoc languages they mean, and we couldn't easily include words from these works if we deprecated bnc; there are even books like Clapp's Vocabulary of the Igorot Language as Spoken by the Bontok Igorots which conflate all the languages of the Igorot people (perhaps understandably, given the point above that the Bontocs and e.g. Ilocano are equally different from each other). However, if we accept that being barely halfway mutually intelligible makes Central vs Eastern Bontoc separate languages, then we're not losing anything of quality by not following (and not being able to easily add content from) books that fail to distinguish such different lects. - -sche (discuss) 03:01, 4 March 2016 (UTC)
@-sche: Quite. Though, there will be occasions when it will not be possible to work out easily from which of the Bontoc languages a given term in a borrowing language will have been derived. — I.S.M.E.T.A. 03:28, 4 March 2016 (UTC)
Done Split. - -sche (discuss) 04:15, 15 July 2016 (UTC)


Matanawi

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Another language we need a code for, presumably sai-mat. Is there a more efficient way to find languages we've missed than my current method, which is simply happening upon them? —Μετάknowledgediscuss/deeds 08:33, 28 June 2016 (UTC)

I suppose you could go through w:Category:Languages without ISO 639-3 code but with Linguist List code. —Aɴɢʀ (talk) 13:54, 28 June 2016 (UTC)
Done - -sche (discuss) 04:51, 15 August 2016 (UTC)


Btw, Native South Americans: Ethnology of the Least Known Continent lays out the case from Nimuendaju (who documented Mura) that Rivet, at least, if not others, was too hasty in grouping this with Mura. Nimuendaju considers Mat. and Mur to be isolates. - -sche (discuss) 05:12, 15 August 2016 (UTC)

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


These are three extinct languages of Argentina that lack ISO codes, but two of them have recorded material (the third, Puntano, seems pointless to add). The only problem is that some linguists consider these to be dialects of the same language, although that is debated and cannot be satisfactorily resolved with the limited preserved lexica from each. I would prefer we follow es.wiktionary's lead in adding separate codes for Allentiac (sai-all) and Millcayac (sai-mil). —Μετάknowledgediscuss/deeds 05:36, 28 June 2016 (UTC)

Done. - -sche (discuss) 06:11, 15 August 2016 (UTC)


renaming Atong

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


I suggest renaming

  • ato from Atong to Atong (Africa) or Atong (Cameroon)
  • aot from A'tong to Atong (Asia) or Atong (India) if we accept the non-inclusion in the latter name of the border areas of Bangladesh where it is also spoken

References seem to prefer the spelling "Atong" to "A'tong" for aot, and Wikipedia says "The correct spelling Atong is based on the way the speakers themselves pronounce the name of their language. There is no glottal stop in the name and it is not a tonal language." - -sche (discuss) 04:37, 29 July 2016 (UTC)

@-sche: I support renaming the languages to "Atong (Cameroon)" and "Atong (India)", respectively; in the latter, "India" can be taken to refer to the subcontinent, rather than the country. — I.S.M.E.T.A. 18:55, 31 July 2016 (UTC)
Done. - -sche (discuss) 20:36, 15 August 2016 (UTC)


More languages without ISO codes, part 2

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


(—Μετάknowledgediscuss/deeds 20:17, 3 July 2016 (UTC))

  • Dorasque language (cba-dor) Done
    Glottolog mentions a dialect Chánguena (Changuina); Linguist List gives separate codes to Chumulu and Gualaca. Do we know how distinct these are? - -sche (discuss) 07:31, 6 July 2016 (UTC)
    see below
  • Duit language (cba-dui) Done
  • Ewarhuyana language (sai-ewa) Done
    Sorry it took so long before I added this language... I just had to get around Duit. - -sche (discuss) 22:40, 6 August 2016 (UTC)
  • Gayón language (sai-gay) Done
    Do we need a code for the fourth of the languages Oramas covers (in the same work where he covers Gayon, Ayaman, and Jirajira), Ajagua (Axagua, Jagua)? Loukotka says: "once spoken on the Tocuyo River near Carera, state of Lara, Venezuela." - -sche (discuss) 23:02, 10 August 2016 (UTC)
    Done. This Ajagua is to be distinguished from Achagua, which is also called Ajagua and Achawa, but is spoken 1500+ kilometers away. - -sche (discuss) 05:07, 15 August 2016 (UTC)
  • Haush language (sai-hau) Done
  • Huetar language (cba-hue) Done
  • Jirajara language (sai-jir) — if we're not doing three-part codes, then this will need to be sai-jrj to avoid clashing with the family code Done
  • Katembri language (sai-kat) Done
    Cf User talk:-sche#Kiriri, where a few other needed additions are mentioned. This language is also known as Catrimbi or Kariri de Mirandela, or Kiriri, and is described by Loukotka as the "lost language of the ancient mission of Saco dos Morcegos, now the city of Mirandela". AFAICT, we should add this language (which is also known as Kariri), we already have Xukuru (which is also known as Kirirí and Kirirí-Xokó), and we should possibly add Xukuru-Kariri (postscript: Done), which is also known as Xocó. Keeping them all straight is going to be difficult and is complicated by the fact that each is known only from short words elicited from elders in 1961. - -sche (discuss) 20:21, 14 July 2016 (UTC)
    Done All done. - -sche (discuss) 22:10, 6 August 2016 (UTC)
  • Natú language (sai-nat) Done
  • Otomaco language (sai-oto) Done
  • Paeonian language (ine-pae) Done
  • Pamigua language (sai-pam) Done
  • Purukotó language (sai-pur) Done
    I have also added Sapará language. - -sche (discuss) 19:38, 16 August 2016 (UTC)
  • Sanavirón language (sai-san)
    Done but without the accent, since English-language sources seem to drop it at least as often as they retain it. Cabrera (1929) is said to record a few words and Serrano (1945) five more. I've also added a code for Comechingón / Comechingon / Comechingona. - -sche (discuss) 17:51, 15 August 2016 (UTC)
  • Sechura language (sai-sec) Done
    Btw, Matthias Urban lays out a number of arguments that Spruce's semi-well-known wordlist is definitely Sechura and not Tallan. - -sche (discuss) 06:03, 14 August 2016 (UTC)
  • Shebaya language (awd-she)
    Also called Shebaye, Shebayo, per Campbell, American Indian Languages: The Historical Linguistics of Native America; he says David Payne "adduces persuasive evidence from the scant fifteen words recorded in extinct Shebayo (Shebaye) of Trinidad to show that it belongs with the Caribbean group (for example, it appears to have da- ‘my’, and these languages are the only ones which have an alveolar stop and not a nasal for 'first person singular')". - -sche (discuss) 05:04, 14 August 2016 (UTC)
    Done as "Shebayo", the most common spelling. Douglas MacRae Taylor, Languages of the West Indies (1977), page 15, has: The Shebayo list, taken from De Laet's Novus Orbis, is as follows (divergent spellings found in different editions are shown in parentheses): heia (heja) 'pater', hamma 'mater', wackewijrrij 'caput', wackenoely (wackenoey) 'auris', noeyerri (noeyerii) 'oculus', wassibaly (wassi) 'nasus', darrymaily 'os', wadacoely 'dentes', watabaye 'crura', wackehyrry 'pedes', ataly 'arbor', hoerapallii 'arcus', hewerry 'sagittae', kyrizyrre 'luna', and wecoelije 'sol'. - -sche (discuss) 19:22, 15 August 2016 (UTC)
  • Tallán language (sai-tal) Done
  • Taparita language (sai-tpr) Done
  • Tataviam language (azc-tat) Done
  • Teushen language (sai-teu) Done
  • Voto language (cba-vot) Not done
    Loukotka says nothing of Voto is attested (which is ironic, because the subfamily is apparently named after it). The Indigenous Languages of South America: A Comprehensive Guide considers it a variety of Rama, possibly based on the fact that the Ramas were also called Votos. - -sche (discuss) 05:04, 14 August 2016 (UTC)
    Glottolog doesn't list it or have any resources on it, either. Therefore, not added until content in it can be found, or at least determined to exist. - -sche (discuss) 05:45, 15 August 2016 (UTC)
  • Wajumará language (sai-waj) Done as Wayumará, a more common name
    (also called Wayumara per several sources, and Azumara and Guimara per Loukotka)
  • Wamo language (sai-wam) Done as Guamo, a more common name

Comments:
I've split the discussion so that ones that are done can be archived. - -sche (discuss) 19:28, 10 July 2016 (UTC)
As of now, with the addition of Tataviam, we include codes for 7900 languages, and will soon have codes for over 8000, including artificial languages and proto-languages. Wiktionary talk:Milestones#We_have_words_from_21.25_of_languages. - -sche (discuss) 07:48, 11 July 2016 (UTC)
As to how different the dialects of Dorasque are: A. L. Pinart's Vocabulario Castellano-Dorasque, Dialectos Chumulu, Gualaca y Changuina gives si (and ji) as the Chumulu form(s), and gives ti as the Gualaca form and ji as the Changuina form for "water". Overall, it's hard to tell how intelligible the dialects would be; some words are quite similar (Chumulu utká, Gualaca utkál "yellow"; Chumulu katuvá, Gualaca katavá "bow"), others are very different: Chumulu sagúsaña, Gualaca θake "blue"; Chumulu sérkala, Gualaca okiyigua "asiento" (OTOH, all three dialects use sérkala for "bench"). "Woman" is biá in Chumulu and Changuina, ωiá in Gualaca. I suppose we can follow Pinart in entering it as one language. - -sche (discuss) 01:12, 10 August 2016 (UTC)
I've also added a code for Yuri language (Amazon) and Culle language. - -sche (discuss) 21:08, 13 August 2016 (UTC)
I've also added three Shastan languages:
  • Okwanuchu language (nai-okw): Berkeley.edu's Indian Languages project says some (<100) words were recorded in the early 20th century; the Handbook of North American Indians: California refers readers to Kroeber (1925) and Voegelin (1942); Glottolog cites Victor Golla California Indian languages (2011). Kroeber says "The dialect is peculiar. Many words are practically pure Shasta; others are distorted to the very verge of recognizability, or utterly different." Victor Golla, California Indian Languages, speculates at length that Okwanuchu may have been "a bilingual mix of Shasta and some other language". There was a people "whom the McCloud River Wintu considered Wintu and called Waymaq ('north people') Du Bois believed were closely related to, if not identical with, the Shastan Okwanuchu; the survivor of the group whom she interviewed gave her a short vocabulary that included words of Shasta origin (Du Bois 1935:8)." These words included atsa 'water', au-u 'wood', katisuk 'bring'. Golla also says "Okwanuchu speech may also be attested in words identified as 'Wailaki on McCloud' (cf. Wintu waylaki 'north people') that Jeremiah Curtin recorded" in 1884, namely: gü'ru 'man', ki'rikega 'woman', hänumaqa 'old man', apci 'old' (in ki'rikega apci 'old woman'), ä'toqe'äqa 'young man', kewatcaq 'young woman', tse'gwa 'one', hoka 'two', qätsi 'four', tseapka 'five' = 'one hand', hukaapka 'six', tsuwara 'sun', kapqu'wara 'moon', kau 'snow', atsa 'water', gri'tuma 'thunder', itsa 'rock', tarak (Golla's note: "") 'earth'. Golla notes: "Curtin employed the BAE transcription system, in which q represents a velar fricative, not a velar stop." "Of these forms, five (man, old man, four, thunder) are not attested in any other variety of Shasta."
  • Konomihu language (nai-knm): The Handbook of North American Indians: California says "An unpublished Konomihu word list was collected by Angulo (1928a)." Glottolog cites Shirley Silver Shasta and Konomihu (1980), Roland B. Dixon The Shasta-Achumawi: A New Linguistic Stock, with Four New Dialects (1905), Lars J. Larsson Who Were the Konomihu? (1987). Kroeber says "it is still questionable whether their speech is more properly a highly specialized aberration of Shasta or of an ancient and independent but moribund branch of Hokan from which Karok and Chimariko are descended together with Shasta. Konomihu is their own name." Silver's work documents a number of words, including kihínàpxī́k "woman".
  • New River Shasta language (nai-nrs): "the language is attested only in a few wordlists" per Berkeley.edu's Indian Languages project. Kroeber, who mentions the alternative exonym Amutahwe, says they were "perhaps rather nearest to the major group in speech, although at that their tongue as a whole must have been unintelligible to the Shasta proper. he tribe melted away without a survivor, leaving only a fragmentary vocabulary."
- -sche (discuss) 23:11, 15 August 2016 (UTC)


Bole

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming bvx

The di- in its current name, Dibole, is one of those language prefixes, but curiously enough, the most commonly used name for this language is actually Babole, with a different prefix. (Luckily, unprefixed Bole isn't used, because it's taken up already by bol.) We should rename this to Babole accordingly. —Μετάknowledgediscuss/deeds 17:32, 29 June 2016 (UTC)

If I know anything about Bantu languages, "Babole" refers to the people who speak it rather than the language itself. Is this really what they call it? —CodeCat 18:24, 29 June 2016 (UTC)
Yes, that's why I noted that it was odd. You're welcome to compare google books:"Dibole language" and google books:"Babole language" yourself. Not much work is done on it, but when it is, it's called Babole. —Μετάknowledgediscuss/deeds 22:46, 29 June 2016 (UTC)
Indeed, searching google books:"Dibole" language turns up nothing relevant, and searching google books:"Bole" language and google books:"Bole" language Nigeria OR Congo finds only references to the Nigerian Bole. Whereas, google books:"Babole" language is well attested. Rename. - -sche (discuss) 03:11, 30 June 2016 (UTC)
Done - -sche (discuss) 00:08, 17 August 2016 (UTC)


Gandhari

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming pgd

We currently have this as "Gāndhārī". This spelling is indeed used frequently in the literature, but we try to avoid difficult diacritics, and this spelling can be seen as using IAST to render the native name of the language, whereas "Gandhari" is the corresponding English. I think that switching to "Gandhari" would be the better choice. @Aryamanarora, -sche Μετάknowledgediscuss/deeds 08:52, 5 July 2016 (UTC)

I agree, the IAST diacritics are unnecessary. —Aryamanarora (मुझसे बात करो) 15:34, 5 July 2016 (UTC)
Yes, rename. - -sche (discuss) 22:16, 5 July 2016 (UTC)
Done - -sche (discuss) 00:45, 17 August 2016 (UTC)


The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


We follow the ISO in covering this as a single macrolanguage, ers. Following Yu (2012), we should keep ers as Ersu, but also create sit-tos for Tosu and sit-liz for Lizu. —Μετάknowledgediscuss/deeds 20:17, 3 July 2016 (UTC)

Done. And wow! What a writing system — the colour of the writing changes the meaning! - -sche (discuss) 20:52, 28 August 2016 (UTC)


More languages without ISO codes, part 3

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


(—Μετάknowledgediscuss/deeds 20:17, 3 July 2016 (UTC))


Comments:
I've split the discussion so that ones that are done can be archived. - -sche (discuss) 19:28, 10 July 2016 (UTC)

Quimbaya

Loukotka, and Campbell citing him (and Wikipedia citing Campbell), says only a single word is attested, but none of them say what that word is. Other sources, none of which strike me as reliable, provide different(!) words, of which the most common is quindio (one book has quindus, one has kindiyo, parallelling the spellings Quimbaya - Kimbaya), meaning "paradise", but other words mentioned include chascará, batatabatí, and fihisca "spirit" (the last of which may actually be from some other language). proel.org says eight words are attested, but doesn't say what they are. Several books provide placenames which may attest the language. - -sche (discuss) 23:57, 20 August 2016 (UTC)

Yupua

An old source (Daniel Garrison Brinton, 1898) says "The Jupua and Curetu dialects are properly one and the same, the difference which appears in their vocabularies arising simply from inequality in the ears and the orthographies of observers. This is evident by the following comparison..." and then proceeds to offer a comparison which IMO doesn't actually conclusively demonstrate anything. In any case, Cueretú / Curetu itself does not currently have a code. - -sche (discuss) 04:20, 14 July 2016 (UTC)
I can find a number of sources attesting Yupua words, but none seem like reliable primary sources: Peruvia Scythica: The Quichua Language of Peru mentions "wui" as the word for house, but does so as part of trying to connect a huge number of unrelated languages based on chance sound correspondences. Brinton's 1898 Studies in South American Native Languages has a list of words which he sources to Martius (surely referring to the Wörtersammlung Brasilianischer Sprachen) and compares to Curetu words sourced to Wallace (surely referring to A Narrative of Travels on the Amazon and Rio Negro), but I haven't located the Yupua words in the originals; the words Brinton gives are: blood: thik (Yupua), dü (Wallace); bow: patopai, patueipei; earth: thitta, ditta; flesh, ga'hi', se'hea'; finger, moh-asoing, mu-etshu; fire, pieri, piure; flower, pagari, bagaria; foot, göaphoe, giapa; hair, poa, phoa; hand, moho, muhu; head, co'ëre, cuilri; house, wu'i, wee; mouth, thischüh, dishi; sun, hauvä, aoué; tongue, toro, dolo; tooth, gobâckaa', gophpecuh; water, thäco, deco; woman, nomöa, nomi; he also offers the additional Yupua words hóggoa "water" (sic), göaphae "foot", ga'hi "meat", jih "jaguar", ikama "deer", jocheo' "star". Ruhlen has manapẹ "husband / man", apara "we two", ti "this", -mai- "we", tsīngeē "boy", pilo "fire", poa "feather". - -sche (discuss) 05:29, 14 July 2016 (UTC)
Martius confirms thäco is "water" and pieri is "fire", and has wúi "house", pohjá "feather". - -sche (discuss) 07:54, 20 August 2016 (UTC)


Merging Berawan

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Berawan

It does not make sense to have both the macrolanguage (lod, which the ISO retired) and the dialects (zbc, zbe, zbw). We should either merge the dialects or retire lod. However, Berawan is composed of more than three varieties, which the literature often speaks of, though there are some works that use the same coarser division as Ethnologue. - -sche (discuss) 09:57, 1 September 2016 (UTC)

Being a "lumper" rather than a "splitter", I support merging the dialects into the macrolanguage. Dialect labels can be added with {{lb}} as needed. —Aɴɢʀ (talk) 11:24, 1 September 2016 (UTC)
Oddly, Ethnologue does not provide a claim of how mutually intelligible vs unintelligible they are. Blust, in The Consonants of Long Terawan, asserts four dialects, all spoken within 25 miles of each other: Long Terawan (Ethnologue's zbw) and Batu Belah (part of zbc), and Long Teru (part of zbc) and Long Jegan (zbe); he considers e.g. Long Pata (which WP names, and which is noted by Ray's 1913 Languages of Borneo) to be a variant of one of these, though he is nonspecific as to which one. When I compare Blust's Terawan data and Burkhardt's The historic evolvement of true triphthong phonemes in Long Jegan Berawan, the differences are not large, especially considering the obvious differences in style of notation: e.g. Blust has Terawan gitoh "hundred", Burkhardt has Jegan getoʔ (Blust noted some uncertainty in his transcription of vowels); Terawan dimmeh "five", Jegan dimmiᵊy; T buluh "feathers or body hair", J bullǝw, T iciw "day", J iciᵊw, T iko "tail", J eko, T puté "white", J pote, T depeh "fathom", J dǝppiǝ̈. A larger difference is Terawan lebbih "two" vs Jegan duβiᵊy, T puʔ "head hair", J uk, T manoʔ "bird", J manǎuk. The only entry we have in the dialects is pi "water", which is (per our entry) the same in all of them. It does seem that these could be merged. - -sche (discuss) 18:19, 1 September 2016 (UTC)
Done Merged. - -sche (discuss) 05:33, 11 September 2016 (UTC)


Sanglechi and Warduji

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


It appears from Sanglechi language that these names are synonymous. As "Sanglechi" is vastly more common, I move that we remove Warduji. —Μετάknowledgediscuss/deeds 06:12, 6 July 2016 (UTC)

Done. - -sche (discuss) 17:17, 18 September 2016 (UTC)


Even more languages without ISO codes, part 1

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Μετάknowledgediscuss/deeds 04:54, 6 July 2016 (UTC)

Done as sai-cno since -chn was taken. In addition to the catechism, Fitz-Roy (1839) has three words, and Jose Garcia (1889) has three words, per Loukotka. - -sche (discuss) 06:18, 17 August 2016 (UTC)
  • Damin (art-dam) — this might be better off considered as a natural language instead so it can be in mainspace (--Meta)
    If we want Damin in mainspace, I'd rather consider it a dialect of Lardil than a separate language. —Aɴɢʀ (talk) 20:43, 13 August 2016 (UTC)
    Hmm, this is an interesting case. On the one hand, they seem to be mutually unintelligible, and speakers seem to refer to them as separate languages, and outside linguists seem to treat them as separate things (compare Eskayan)... and we do consider even such very similar, often-overlapping things as Norwegian and a slightly different orthography of Norwegian to be entirely separate languages. On the other hand, Damin is said — by researchers with access to its full vocabulary, as opposed to access to only a limited surviving corpus — to have a vocabulary of "perhaps no more than 250 words in all", which makes it hard to consider it a language, and does make it seem similar to pandanus avoidance registers or especially thick cant or jargon. According to Wikipedia, some markers are different, but others are the same, e.g. genitive -kan and future -ur. I suppose treating it as a dialect of sorts does make the most sense. - -sche (discuss) 17:35, 14 August 2016 (UTC)
    Not done per the above discussion. - -sche (discuss) 08:51, 9 September 2016 (UTC)

also this: - -sche (discuss) 05:21, 29 August 2016 (UTC)

  • Mamluk-Kipchak (see w:fr:Kiptchak mamelouk) (trk-mam? -mmk?) Done
    Water is سو (su); the word also means 'tempering (of a sword)'. 'Ebçi' (apparently containing the occupational suffix '-çi'; ç = č) is 'woman', mentioned in one military treatise when it says that Indian swords are "useful only for hanging on the neck of a woman who cannot give birth to a son". 'sovuq' is 'cold' and 'sol' is 'left': andın songra sol egining ūzārā salgıl taqı boynuñ ūzārā tezgindūrgil "after that, place it over your left shoulder and make it rotate over your neck". See Munytu'l-Ghuzāt: a 14th-century Mamluk-Kipchak military treatise and Vocabulaire arabe-kiptchak de l'époque de l'État mamelouk. - -sche (discuss) 00:47, 31 August 2016 (UTC)


splitting rGyalrong

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Based on their limited mutual intelligibility, and following the references on these languages (see w:Rgyalrong languages' bibliography), we should split jya, and its category and any entries, into four languages:

  • Situ (sometimes called Eastern rGyalrong) (sit-sit?)
  • Japhug (sit-jap?)
  • Tshobdun (Caodeng, Sidaba) (sit-tsh)
  • Zbu (Rdzong'bur, Showu, Sidaba) (sit-zbu)

- -sche (discuss) 05:11, 29 August 2016 (UTC)

Done. - -sche (discuss) 04:44, 4 October 2016 (UTC)


Merge Qashqai and Sonqor into Azeri?

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Sonqor Turkic is an Oghuz lect spoken in northwestern Iran. The Turkic Languages by Fuchs Christian et al speaks of it, Qashqai/Kashkay (code qxq) and Afshar as "deviating" dialects of Azeri. The Oxford Handbook of Inflection, edited by Matthew Baerman, speaks of Sonqor as "a minority language of Iran heavily influenced by Kurdish". (The Handbook provides some Sonqor example words: ušaġ-ækæ-le' "child-SPEC-PL" "those children" (compare ušaġ to Azeri uşaq), mincuġ-ækæ-re "bead-SPEC-ABL" "of those pearls" (Azeri muncuq-), šéʕr-eke-sin-ne "poem-SPEC-POSS-ABL" "about that poem by him" (Azeri şeir-).) Other sources note that "Sonqor and others" are "transitional dialects". How should Sonqor and Qashqai be handled? Ethnologue claims there are "significant differences South Azerbaijani in phonology, lexicon, morphology, syntax, and loanwords", but we merged them (but not qxq) after this discussion, based on references saying " dialects of Azerbaijani do not differ substantially. Speakers of various dialects normally do not have problems understanding each other" (heck, even Azeri and Turkish speakers do not have that much difficulty understanding one another). Gilles Authier, in New strategies for relative clauses in Azeri and Apsheron Tat, in Clause Linkage in Cross-Linguistic Perspective (2012), similarly says "There is an almost perfect mutual intelligibility between Azeri and Kashkai speakers. I tested this personally by submitting recordings to both audiences." It sounds to me as if Kashkay should be merged into Azeri, and as if we don't need a code for Sonqor. However, I'm pinging our only recently active editor who speaks some Azeri, User:123snake45. - -sche (discuss) 03:33, 11 September 2016 (UTC)

As you noted, only political/orthographic considerations compel us to keep Turkish and Azeri apart. I'd keep all these merged in Azeri. While we're at it, have we ever discussed whether the name "Azeri" should be changed to "Azerbaijani"? —Μετάknowledgediscuss/deeds 03:46, 11 September 2016 (UTC)
BGC Ngrams shows a statistical dead heat between the two over the years. I personally prefer the shorter name. —Aɴɢʀ (talk) 12:51, 12 September 2016 (UTC)
I do not speak anything Turkic; but I keep getting the general impression we make too little use of the "X is a dialect of Y" functionality. This would be well-suited for handling cases where one variety is mostly intelligible with a bigger standard language, but has its share of its own quirks (loanwords, occasional diverging phonetics, etc). --Tropylium (talk) 13:32, 12 September 2016 (UTC)
Done. - -sche (discuss) 05:06, 4 October 2016 (UTC)


Belize Kriol

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming bzj

The name we use, "Belize Kriol English", is certainly among the more uncommon names of this language. Much better would be just "Belize Kriol"; there might also be argument to be made for the more dated term "Belizean Creole" (which is the title of the Wikipedia entry). There is one entry which would have to be changed. —Μετάknowledgediscuss/deeds 20:25, 3 April 2016 (UTC)

On Ngrams and in the 'raw' Google Books results, and when I check academic sites on Google (site:.edu) and academic papers (filetype:pdf — most results for most non-famous languages' names are academic papers, dictionaries, etc), the most common name is "Belizean Creole" (which doesn't seem dated), then "Belize Creole", with "Belize Kriol" bringing up the rear. Approximately the same pattern holds when I page through the results looking only for primary reference works about this language in particular, as opposed to works that just mention it. There is some interference from the fact that "Belizean Creole" and "Belize Creole" are apparently also used to name the people who speak it and not just the language, but I suggest a rename to "Belizean Creole". - -sche (discuss) 18:27, 20 August 2016 (UTC)
Renamed to "Belizean Creole". - -sche (discuss) 17:40, 11 September 2016 (UTC)


Even more languages without ISO codes, part 2

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Μετάknowledgediscuss/deeds 04:54, 6 July 2016 (UTC)

Gallaecian and Ivernic

@Angr, since you are knowledgeable of Celtic languages: do you agree that the Gallaecian language and Ivernic language should be added? Should "Ivernic" be called that, or another name? And do you know if any words in it are attested? (It's not clear to me if ond and fern, which Wikipedia mentions, are Ivernic words, or words borrowed by some other language from (potentially slightly different) Ivernic words.) - -sche (discuss) 20:18, 12 August 2016 (UTC)

@-sche:, AFAIK "Ivernic" is unattested; ond and fern are Old Irish words which Cormac mac Cuilennáin (who lived in the 9th century) believed to be of "Ivernic" origin. I don't think we need to add it. Gallaecian, on the other hand, is attested. If it were to be subsumed under anything else, it would be Celtiberian (xce), but I'd rather keep it separate. —Aɴɢʀ (talk) 20:38, 13 August 2016 (UTC)


Here are other languages we might need codes for: - -sche (discuss) 05:21, 29 August 2016 (UTC)

  • Light Warlpiri, a small mixed language (350 speakers, all also fluent in Warlpiri). I'm not sure if we should have this or not.
    It's gotten a lot of press, but I lean toward saying it's not worth including — mixed languages are always a little hard to justify, and one that's just come into being is covered by Warlpiri and English sections well enough, I reckon. —Μετάknowledgediscuss/deeds 05:30, 4 October 2016 (UTC)


Bushi

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Apparently, we merged all the Malagasy dialects except this one. I don't see any reason to keep it separate at this point (and I'm not even sure that this oversight merits an RFM). @-sche —Μετάknowledgediscuss/deeds 06:19, 28 September 2016 (UTC)

It apparently has its own orthography and is spoken on a different island, which might have motivated a non-merger, although I can find no discussion of it so it seems most likely that it simply went entirely unnoticed. Reconstruction:Proto-Austronesian/daNum and water make use of it. The only references I can find that mention it do speak of it as a dialect of Malagasy. Merged. - -sche (discuss) 01:24, 4 October 2016 (UTC)


renaming Pu Xian

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


I propose renaming cpx to either "Pu-Xian", "Puxian", "Pu-Xian Min", or "Puxian Min". (at least remove the space, per google books:"pu xian" "fujian"). I do not know which name is most common, but Wikipedia's article is titled Pu-Xian Min. —suzukaze (tc) 09:46, 15 August 2016 (UTC)

On Google Books, "Puxian" and "Pu-Xian" seem to be the most common names, about equally common: but on Google Scholar, about 3.5 times as many works use "Puxian" as "Pu-Xian", so I will rename the language to "Puxian". "Pu Xian" does get a few hits. Incorporating "Min" ("Puxian Min") seems to be uncommon. Another attested spelling (not common) is "Putian". Wikipedia says Xinghua and Hinghwa are additional alt names. - -sche (discuss) 13:50, 1 November 2016 (UTC)
Done. - -sche (discuss) 13:57, 1 November 2016 (UTC)


Renaming Timne

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Searches for "speak X" and "X language" on Google Books reveal that the language we currently call "Timne" is referred to as "Temne" far more frequently than by any other spelling. The native spelling, Themnɛ (or the ASCII-friendly Themne), sees extremely limited use. See also w:Temne language. —Μετάknowledgediscuss/deeds 05:40, 4 February 2017 (UTC)

I support changing what we call the language to Temne. — I.S.M.E.T.A. 15:32, 15 February 2017 (UTC)
Done. - -sche (discuss) 17:56, 8 March 2017 (UTC)


Some more missing South American languages, 1

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Here are a few more South American languages for which we could add codes:

  • Cacán (Kakán, Diaguita) was documented, but the document was then lost, so it seems unnecessary to add.
    Update: Added. Some words are recorded elsewhere and collected by Nardi (1979). - -sche (discuss) 14:28, 29 August 2016 (UTC)
  • Coroado Puri language (sai-crd), of which Revista de la Universidad Nacional de Cordoba, volume 6, includes two quite disparate wordlists, with "sun" alternately given as Hope or Aaam, "moon" as pethara or kesha, "water" as something illegible (nhanan?) or gioi. The Viagem a Curitiba e Província de Santa Catarina also has some words, including goio "water". Greenberg, from who knows where, sources ope as "sun", oron as "new moon", and also e.g. čama "animal", puara "chest", teke "belly", bo, ambo "tree", ke "wood, fire", tong, ton "nape of the neck" (compare Greenberg's Puri thong), uka, høka "stone" (compare Puri ukhua), tima "love/want/enjoy" (compare Puri tamathi), baj "to live".
  • Menien (sai-men), attested in a Vocabulario Menien by Príncipe von Wied. He wrote in German, was translated with only a few alterations into French, and then was translated badly with a number of serious errors into Portuguese (says Loukotka, giving some examples from all three editions). Words (agreed on by the German and French editions) include hioi "wax", koinin "child", and kohira "red".

- -sche (discuss) 21:18, 16 August 2016 (UTC)

More languages without ISO codes, part 4

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


—Μετάknowledgediscuss/deeds 20:17, 3 July 2016 (UTC)


Comments:
I've split the discussion so that ones that are done can be archived. - -sche (discuss) 19:28, 10 July 2016 (UTC)
Note that some discussion of Yumana, Mariaté (awd-mrt), Wainumá, Wiriná, Guinau, Baré, and Marawá is found on User talk:-sche#Proto-Nawiki. - -sche (discuss) 04:40, 21 November 2016 (UTC)

Corobici

Whether Corobici existed as a separate language, and whether words in it are attested, is unclear to me (and apparently to a lot of writers).
Native-languages.org, based on who knows what, identifies Corobicí as bzd. They give four words, which I would not trust without other confirmation: tun "man", si "water", unsat "earth" and guá "house". (For comparison, Wikipedia gives the following Rama words: kiikna "man", kumaa "woman", nguu "house", sii "water".)
The 2013 Native Languages of the Americas, volume 2, citing Mason, envisions a "Rama-Corobici" family which includes the two languages "Rama" and "Guatuso (incl. Corobici)"; the Classification and Index of the World's Languages by Charles Frederick Voegelin and Florence Marie Voegelin also includes Corobici in Guatuso. Guatuso is also called Maléku Jaíka, and Ethnologue says it's specifically not intelligible with bzd.
However, the 1950 Handbook of South American Indians says "Guatuso, with its variety Corobici or Corbesi, and Rama with its dialect Melchora, are obviously very different from each other and from other Central American Chibchan languages, and Mason (1940) was evidently in error in making a Rama-Corobici subfamily." It goes on to place Guatuso and Rama in separate (sub)families.
Comparative Chibchan Phonology (1981), speaking about someone who "gives a small sample of" Corobici, says "The 'Corobici' words are words from the dialect of Rama that was spoken in the region of Upala, Costa Rica, up to the 1920s. Really, not a single word of the language of the Corobicies (an extinct group) was recorded". An 1890s report by Daniel Brinton to the US Congress agrees, "Nothing remains of the Corobicies or Corvesies except the name Corobici or Curubici" used in placenames.
Other authors say the Corobici were the ancestors of both the Rama and Guatuso. Tozzer says "The modern Guatuso are probably descendants of, and synonymous with, the ancient Corobici." Lothrop says "It is generally assumed that the Rama were once a tribe identical in language and speech with the Corobici."
- -sche (discuss) 21:55, 9 August 2016 (UTC)
Therefore, not added. - -sche (discuss) 22:00, 2 November 2016 (UTC)

Cumana

It's hard to find information on this Chapacuran language, firstly because it's not discussed much in the literature, and secondly because it has the same name, in the same range of spellings, as the Cariban language of the Cumanagotos (cuo). Some sources speak of it as a dialect of Abitana (Wanham), and Loukotka's wordlist is quite similar to his Abitana wordlist; indeed, the two are more similar than some wordlists of the same language (e.g., of Peba) are to each other. Glottolog only has resources on Kuyubi / Kaw Ta Yo, and although the equation of that with Cumana is non-obvious, it would at least be a less confusing name (any spelling of Cumana is likely to be misinterpreted as cuo). - -sche (discuss) 03:36, 31 August 2016 (UTC)

Guachí

There may be two lects called Guachí. Wikipedia places a Guachí / Wachí, which it considers possibly Guaicuruan, in Argentina. Glottolog places a possibly-Guaicuruan Guachí in southwestern Brazil, in Mato Grosso do Sul, where Opaie is also spoken. The index to Čestmír Loukotka, Johannes Wilbert, Classification of South American Indian Languages (1968), says they describe Guachí in two places: pages 51-52, which I can't see, and page 66, which says: "Opaie or Ofaie-Chavante - spoken on the Ivinhema, Pardo and Nhandui Rivers in the Brazilian state of Mato Grosso, now by only a few individuals. The so-called language 'Guachi' on the Vaccaria River in the same state is only a dialectal version of Opaie." Opaie is not Guaicuruan. It is possible that pages 51-52 describe a different lect. It is separately possible that the more recent (2004) Guaicurú no, macro-Guaicurú sí: Una hipótesis sobre la clasificación de la lengua Guachí (Mato Grosso do Sul, Brasil) knows more than Nimuendaju in 1932 and Loukotka in 1968 about the separateness and familial relationships of Guachí. lists two Guachí words, ‘diente(s)’ and ‘pierna’. - -sche (discuss) 04:46, 11 July 2016 (UTC)

Guanaca

I'm having trouble finding evidence that this lect was recorded. Greenberg says "In Paezan, we find Guambiana, Guanaca, and Totoro with the noun plurals -ele, -el, and -le, respectively", but existence of the Paezan family has been discredited and it's not clear where Greenberg gets his Guanaca data; Glottolog apparently doesn't list the lect and I have not yet found anyone other than Greenberg who mentions words in it. Loukotka cites Castillo y Orozco's 1877 Vocabulario páez-castellano as a source for the lect, but I haven't found it in that work; maybe it's under an alternative name or a spelling I couldn't think of to check? I've found the spellings Guanaca, Guanáca, Guanaco, Guanuco, Guanukco(?), Wuanaka mentioned or used in various sources. Huanca and Huanuco are names of a variety of Quechua (whereas WP says Guanaca is "perhaps Barbacoan"). - -sche (discuss) 15:51, 18 September 2016 (UTC)

Kuwani

I'm ambivalent about this. WP says it's known only from one wordlist "and even its exact location is unknown. Smits and Voorhoeve (1998) assumed it to be equivalent to Kalabra". There are, as WP notes, lexical differences, but compare other languages attested (with more certainty) in multiple wordlists that show great differences, including Tapachultec and Coroado Puri. OTOH, we could include it and just merge it later if more information came out that it was Kalabra. - -sche (discuss) 01:44, 20 August 2016 (UTC)

With wordlist-only languages that thus have small, finite lexica, it's best to be conservative and keep them separate when there's no clear case for merger. —Μετάknowledgediscuss/deeds 04:39, 13 September 2016 (UTC)
OK, added. - -sche (discuss) 22:00, 2 November 2016 (UTC)


Baïnounk Gubëeher

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


This obscure lect does seem like a separate language, but wasn't described until Cobbinah (2013). The Wikipedia article follows his lead and calls it thus, but it seems odd for us to do so, given that all the other Bainouk languages are given names starting with Bainouk. We also need to decide on a code, perhaps alv-bgu. —Μετάknowledgediscuss/deeds 00:19, 2 April 2016 (UTC)

Added. Wikipedia says Gubëeher (which I can also find referred to as that single word, or with the alternative "prefixes"/first-elements Nyun Gubëeher or Nun Gubëeher) is merely closely related to the two Bainouks, so it's probably tolerable that the spelling is Baïnounk here and Bainouk there. Incidentally, WP citing Glottolog says bcb is a duplicate code for an alternative name of bab. Should they be merged? and should we drop the "Bainouk"/"Baïnounk" prefixes? - -sche (discuss) 03:25, 13 September 2016 (UTC)
Incidentally, there's a book Le gúbaher, parler baïnouck de Djibonker by Noël Bernard Biagui which apparently uses yet another spelling. - -sche (discuss) 14:52, 18 September 2016 (UTC)


Antigua and Barbuda Creole English

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming aig

We currently call this "Antigua and Barbuda Creole English" — names containing and are suspect, in my opinion. Wikipedia calls it "Leeward Caribbean Creole English". The most common name in literature seems to be "Antiguan Creole", which is what I suggest renaming it. - -sche (discuss) 03:22, 19 April 2016 (UTC)

The country is called Antigua and Barbuda, so I don't see anything suspicious about the name per se. Nevertheless, I prefer Wikipedia's name, since it's also spoken in Montserrat, Saint Kitts and Nevis, and Anguilla. —Aɴɢʀ (talk) 17:53, 19 April 2016 (UTC)
I concur with Angr, although I have hesitations about changing it. The use of "Antiguan Creole" seems to be mainly by scholars who are just studying the lect spoken on that island and not worrying about what is and isn't considered the same language. —Μετάknowledgediscuss/deeds 04:12, 26 June 2016 (UTC)
Meh, left as-is. - -sche (discuss) 06:45, 27 March 2017 (UTC)


Kela-Yela language

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


This language got encoded twice, once as Kela and once as Yela, based on its two names in two different provinces of DR Congo. Wikipedia opts for "Kela-Yela" as the compromise name, and I can't find a better one, unfortunately. (This also opens up "Kela" for , which we currently call "Kala", though that name seems to be less commonly used.) —Μετάknowledgediscuss/deeds 06:00, 4 October 2016 (UTC)

I've merged yel into kel. - -sche (discuss) 04:20, 27 March 2017 (UTC)


Sena language merger

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Sena is mutually intelligible with Chichewa, so it really shouldn't be considered a separate language, but that's probably not worth a merger, as the speakers seem to disagree. What the speakers do seem to agree with is that there is one Sena language, despite it having gotten two ISO codes, seh in Mozambique and swk in Malawi, which we call "Sena" and "Malawi Sena" respectively. We should probably just do away with swk. —Μετάknowledgediscuss/deeds 23:48, 8 October 2016 (UTC)

I've merged swk into seh. No action taken with regard to Chichewa at this time. - -sche (discuss) 04:15, 27 March 2017 (UTC)


Old Uighur language (oui) to Old Uyghur

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Spelling is inconsistent (we spell the modern language Uyghur). DTLHS (talk) 23:28, 26 December 2016 (UTC)

Renamed. It seems to be as common or perhaps more common to spell it with a y, although searching is complicated by chaff from the SOP phrases "old Uighur", "old Uyghur". - -sche (discuss) 01:53, 28 March 2017 (UTC)


Chimwiini

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


This is pretty decidedly too distinct to be considered a dialect; I was just reading a grammatical description and it's heavily different from Swahili. Maho, Kisseberth, and others seem to agree that it is a separate language. As for the name, it seems that "Chimwiini" (sadly, with the language prefix) is a better option than "Bravanese", used at w:Bravanese dialect. I suppose the code could be bnt-mwi. @-sche, I noticed this because of muke; what other Chimwiini entries did you add? —Μετάknowledgediscuss/deeds 03:28, 25 March 2017 (UTC)

I know what your reasoning is for choosing bnt-mwi as the language code, but I think that it's better if the code matches the English name, thus bnt-chi or bnt-cmw. —CodeCat 18:56, 25 March 2017 (UTC)
I agree it merits separation; indeed, I see articles going back to at least the 1960s pointing out that "Certainly Bravanese and standard Swahili are not mutually intelligible" (Morris Goodman's 1967 Prosodic Features of Bravanese, J. of African Languages 6). And yes, for better or worse codes have tended to be named based on the English names. Did you search for "Mwini" with one i? I find roughly comparable numbers of Google Books hits for "Mwini" Swahili, "Chimwiini" Swahili and "Bravanese" Swahili. (Ngrams, catching I-don't-know how much chaff, finds Mwini to be the most common spelling recently, followed by Bravanese — the most common spelling until recently — and then Mwiini, with Chimwiini not common enough for it to plot.) It looks like we could avoid the prefix if you wanted, though I defer to you if you checked more literature than the cross-section Google has digitized. Google Scholar has ~120 articles on "Chimwiini" Swahili compared to ~90 for "Mwini" Swahili, some of which are the irrelevant last name Mwini, and also ~90 for "Bravanese" Swahili, but many are bibliographies mentioning the same few articles. Other spellings I see, besides Chimwiini, Bravanese, Mwiini, and Mwini, include Chimwini, Chimini, and Brava. - -sche (discuss) 23:37, 25 March 2017 (UTC)
In my own files, and likely elsewhere, the Chichewa word mwini gets in the way. But in any case, most of the works I see dedicated to the language that aren't on the older end use "Chimwiini". Anyway, bnt-cmw is fine. —Μετάknowledgediscuss/deeds 23:43, 25 March 2017 (UTC)
I've added the code bnt-cmw. Btw, I didn't add muke: that was User:Muke themself all the way back in 2006. :-p - -sche (discuss) 03:06, 27 March 2017 (UTC)
I'm shocked that it wasn't part of your water-and-woman effort. I guess I'll add the word for water now. —Μετάknowledgediscuss/deeds 03:25, 27 March 2017 (UTC)


Aramaic

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


The same language in two different scripts with two different literary standards, and yet each quite similar to each other. I think I see a pattern. —Μετάknowledgediscuss/deeds 07:49, 21 December 2012 (UTC)

"Arc" should only be used for "Imperial Aramaic" (aka "Official Aramaic"), ideally written in the Old Aramaic script rather than Hebrew. Current usage does not reflect that though and there are a whole mix of dialects intertwined within the "arc" code, so that one at least should be split. "Syc" should stay as it is. --334a (talk) 08:04, 21 December 2012 (UTC)
I think SIL has done a really bad job with the classification of Aramaic languages. ARC was an umbrella code that was used to describe all later Aramaic varieties in ISO 639-2; in ISO 639-3 they introduced SYC for classical Syriac, by far the most widespread form of literary Aramaic. --Rafy (talk) 17:38, 28 December 2012 (UTC)


East Frisian lects

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


This is an issue that needs to be resolved soon: there are 90 module errors related to it.

First let me lay out the linguistic details:

East Frisian, a language of Lower Saxony in Germany, is, along with West Frisian in the Netherlands and North Frisian in Germany, one of the w:Frisian languages. It has historically consisted of two dialect groups: one near the Ems River, and the other near the Weser River. The Weser dialects are all now extinct, with the w:Wursten Frisian dialect surviving into at least the 1700s, and the last speaker of w:Wangerooge Frisian dying in 1953. Although most of the Ems dialects died out by the 18th century, Saterland Frisian has survived to the present day. The extinct dialects were absorbed into Low German to become East Frisian Low Saxon.

The ISO has typically made a massive mess out of the Germanic languages, and they really screwed up in this case: until recently, it was impossible to tell if their code for "East Frisian", frs, referred to Frisian East Frisian or to East Frisian Low Saxon. If the first were true, it would overlap with Saterland Frisian, stq. If the second were true, it would be just another of the Low German lects that we decided earlier to treat as German Low German, nds-de (not Dutch Low Saxon, nds-nl, in spite of having "Saxon" in the name). Because of this ambiguity, the frs code was pressed into service for Frisian East Frisian in upwards of 140 etymologies and translations (there might have been entries, too, but I have no way to tell).

A few weeks ago, I mentioned to User:-sche that the online description of another lect had been updated. In the process of checking this out, he discovered that frs was now unambiguously described as East Frisian Low Saxon, and thus redundant to nds-de. After making the usual checks of the categories and changing uses of frs that he knew about, he removed frs from the data module. Unfortunately, East Frisian is mostly only mentioned in etymologies as cognates, and cognates don't show up in any of the categories, nor do redlinked translations. User:Leasnam (who added most of the frs references in the first place), -sche and I have been able to whittle it down from 137 entries in Category:Pages with module errors, to the present 90 just by getting rid of unnecessary ones and by changing recognizable instances of Saterland Frisian and East Frisian Low Saxon to the correct language codes.

As I see it, there are two halfway-decent options:

  1. Merge all of Frisian East Frisian into Saterland Low Frisian, stq, since the latter is the only surviving dialect of the former.
  2. Create an exception code for Frisian East Frisian, such as gmw-efr or gmw-fre.

There's also the possibility of restoring frs as Frisian East Frisian, but that would put us in direct contradiction to the ISO standard, and leave things open for all kinds of confusion.

The first probably fits the linguistic facts best, but the second may be more practical, at least in the short run.

I've found no references on East Frisian Low Saxon, and very little on Saterland Frisian (there's a Saterland Frisian Wiktionary, but most of the remaining terms aren't mentioned there). There is, at least, one good dictionary of Frisian East Frisian available online. Those with better sources apparently don't have the time to work on this right now. Chuck Entz (talk) 02:41, 11 April 2015 (UTC)

I don't think restoring frs is a good idea for the confusion you mentioned. I think option 1 is the best; we would then treat Saterland Frisian as the main dialect of East Frisian. Option 2 would just introduce more ambiguity, since one language would suddenly become a part of another. This is why we eliminated the Low German varieties in the first place. —CodeCat 17:28, 11 April 2015 (UTC)
Yes, it wouldn't make sense to have both "East Frisian" and "Saterland Frisian" (they are not sufficiently distinct), so option 2 would only make sense if we changed instances of stq to the new pan-East-Frisian code and {{label}}-ed them. But we do need to recognize that there are inclusible East Frisian words which are not Saterland Frisian (namely, all the words and forms that we know — from records — existed in the non-Saterland varieties of East Frisian). Precedent exists of us repurposing codes to refer to slightly more things than the ISO intended them to refer to, e.g. we use gcf to refer to both gcf and acf and we used en to refer to both en-proper and hwc (Hawai'ian Creole English) and pld (Polari). Hence, I would add "East Frisian", "Eastern Frisian" etc as alternative names of stq, and then change all remaining uses of frs to stq to solve the module errors. Then, at leisure, we can go back through the affected pages and specify, whenever possible, which precise dialect they are from. (It may look a bit ugly to have e.g. "Wursten Saterland Frisian foo" or "Saterland Frisian foo (Wursten dialect)", but it's probably the best we can do, since changing the canonical name of stq to "East Frisian" would just invite people to become confused about it and misuse it again.) - -sche (discuss) 16:58, 13 April 2015 (UTC)
By the way, with all these module errors on highly visited pages, we can't really wait for this to go through RFM procedure. I think whoever sees this next, if (s)he has the time, should implement -sche's temporary solution. (I myself will do it if I can get my work done in meatspace quickly enough.) —Μετάknowledgediscuss/deeds 06:36, 16 April 2015 (UTC)
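
The temporary fix proposed above (keep "Saterland Frisian" as the canonical name of stq, add "East Frisian" and "Eastern Frisian" as alternative names, then change the remaining uses of frs to stq) can be sketched roughly as follows. This is only an illustration in Python: the `LANGUAGES` table, its field names, both helper functions and the sample wikitext are hypothetical stand-ins, not the actual data module or bot code used on the site.

```python
import re

# Hypothetical, simplified stand-in for a language data table.
LANGUAGES = {
    "stq": {"canonical_name": "Saterland Frisian", "other_names": []},
}

def add_other_names(languages, code, names):
    """Record extra names for a code without changing its canonical name."""
    entry = languages[code]
    for name in names:
        if name != entry["canonical_name"] and name not in entry["other_names"]:
            entry["other_names"].append(name)

def retarget_code(wikitext, old, new):
    """Replace `old` with `new` only where it is the first positional
    parameter of a template, e.g. {{cog|frs|...}} -> {{cog|stq|...}}."""
    pattern = re.compile(r"(\{\{[^{}|]+\|)" + re.escape(old) + r"(?=[|}])")
    return pattern.sub(lambda m: m.group(1) + new, wikitext)

add_other_names(LANGUAGES, "stq", ["East Frisian", "Eastern Frisian"])

# Illustrative wikitext only; "foo" is a placeholder term.
sample = "Compare {{cog|frs|foo}} and the plain text frs."
print(retarget_code(sample, "frs", "stq"))
```

The pattern deliberately leaves running text alone, so a later pass can still go through the affected pages by hand and label which dialect each form comes from.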


Weyto language

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


This language's Wikipedia article helpfully explains that this extinct language of Ethiopia is unattested. However, since Amharic is an LDL and wordlists of the Weyto dialect of Amharic have been made that contain words purported to derive from it, we should make it an etymology-only language code. —Μετάknowledgediscuss/deeds 02:23, 4 April 2017 (UTC)

I wonder why the SIL/ISO assigned a full code in the first place to an unattested extinct language of unclear family association that is not even certain to have existed... - -sche (discuss) 03:16, 4 April 2017 (UTC)
I dunno! But after my efforts above on this page to identify new codes that need adding, I reckon I need to do some work identifying codes that need removing. —Μετάknowledgediscuss/deeds 03:22, 4 April 2017 (UTC)
Done. - -sche (discuss) 06:59, 12 May 2017 (UTC)


Dupaningan Agta

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


At present, Dupaningan Agta is the language evoked by language code duo, but Dupaninan Agta is also a valid spelling. User:Mar vin kaiser has been adding entries in this language under Dupaninan Agta. —Stephen (Talk) 09:57, 5 April 2017 (UTC)

The only reason though why I used Dupaninan Agta instead of Dupaningan Agta is because that's the spelling that came out of Wiktionary when I inputted duo. --Mar vin kaiser (talk) 10:18, 5 April 2017 (UTC)
That's true: {{subst:\|duo}} yields "Dupaninan Agta", as listed at Module:languages/data3/d. —Aɴɢʀ (talk) 11:38, 5 April 2017 (UTC)
Which, in turn, is because that's the spelling the Ethnologue/SIL/ISO used when we copied codes over, years ago. But the spelling with g seems slightly more common. Should it be made the canonical spelling, with the other one retained as another name? The names without "Agta" (i.e. "Dupaninan", "Dupaningan") should also be otherNames, since they are sometimes encountered. - -sche (discuss) 17:04, 5 April 2017 (UTC)
@Mar vin kaiser, Stephen G. Brown I have updated things so that the more common spelling is the one used in all the entries and translations tables and categories now: Dupaningan Agta. I'm sorry if this causes you to have to relearn how to type the header, etc. The "joys" of muscle memory! :-p - -sche (discuss) 07:27, 10 May 2017 (UTC)


Gail/Gayle (gic)

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


The Gayle language is a gay argot lexicon that can be used in English or Afrikaans. Words should be under whatever language they are found in, as argots are not independent languages, and this code should be removed. We have precedent for excluding gay argots based on our deletion of the code for Polari (discussion, most of which is off-topic). —Μετάknowledgediscuss/deeds 06:09, 6 July 2016 (UTC)

Removed. It can be reinstated as an etymology-only code if the need arises. —Μετάknowledgediscuss/deeds 02:52, 16 May 2017 (UTC)


the Kewa lects

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


I propose to merge "South Kewa"/"Erave Kewa", "East Kewa" and "West Kewa"/"Pasuma Kewa" into "Kewa". AFAICT most literature treats Kewa as a single language, and the only effect having three codes has had upon us so far is that our Kewa content is duplicated under several headers (as in ipa and utyali). There seems to be a far more marked difference between normal Kewa and its pandanus avoidance register than between South, East and West. - -sche (discuss) 06:00, 11 August 2015 (UTC)

Done. - -sche (discuss) 05:37, 14 May 2017 (UTC)


(Proto-)Western Malayo-Polynesian into (Proto-)Malayo-Polynesian

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


As discussed earlier this year. Western Malayo-Polynesian is a solely geographic group, it is not recognized by our language categorization system, and a proto-language appendix seems to be superfluous.

In addition to the mentioned durian issue, we have currently 13 Proto-WMP lemmas, with a breakdown as follows:

  • 10 entries are fully identical to corresponding Proto-MP entries (e.g. *huaji = *huaji, *wada = *wada)
  • 2 entries (*huaji-ŋ, *qari-mauŋ) are reconstructed from very scarce data, and the most likely situation is that they were just randomly lost in Central-Eastern MP.
  • 1 entry (*azak) is, per the cited source (Blust's dictionary), probably a late Wanderwort originating in Malay(ic) and not inherited.

--Tropylium (talk) 15:03, 29 June 2015 (UTC)

Even if there are regional differences, you don't have to resort to separate protolanguages to explain them: for one thing, the substratum languages encountered off of Southeast Asia had to have been vastly different from those farther east. As for animal (and to a lesser extent plant) names, there's the matter of the Wallace Line and other such boundaries: the farther east you go, the fewer non-marine Asian species you find. By the time you get to Polynesia, the only flightless land animals (New Zealand is an exception, of course) are human-transported creatures such as pigs, chickens, dogs and rats, and the only widely-distributed plants are those with seeds that can drift on the currents, or Polynesian canoe plants. If you don't have wild beasts, you're not likely to preserve inherited names for them. Chuck Entz (talk) 02:30, 30 June 2015 (UTC)
I've started to merge these. - -sche (discuss) 05:14, 29 May 2016 (UTC)
Done. I've moved the remaining 8 PWMP appendices and updated the entries that referred to PWMP in their etymologies. - -sche (discuss) 09:39, 15 May 2017 (UTC)


Renaming Kxoe

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


There's a bit of noise on BGC, but it seems that "Khwe" is more common than "Kxoe" for this language, and continues to be used in modern books. See w:Khwe language. —Μετάknowledgediscuss/deeds 06:12, 4 February 2017 (UTC)

Perusing Glottolog's large bibliography for this language, I note that Kxoe predominated and Khwe was little used until about the year 2000, after which Khwe has predominated and Kxoe has been little used. Wikipedia uses Khwe and mentions that it is the preferred spelling. Renamed. - -sche (discuss) 07:51, 22 May 2017 (UTC)


More languages without ISO codes, part 5

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


—Μετάknowledgediscuss/deeds 20:17, 3 July 2016 (UTC)


Comments:
I've split the discussion so that ones that are done can be archived. - -sche (discuss) 19:28, 10 July 2016 (UTC)
I've added Pai-lang and Nicola. - -sche (discuss) 06:31, 30 March 2017 (UTC)
I've added Moran. - -sche (discuss) 04:15, 1 May 2017 (UTC)

Sinúfana

The variety of names this language has and their homography with the names of the people and the place they lived have made it hard to search for information on this language. And some scholars say there is nothing to find: The Indigenous Languages of South America (edited by Lyle Campbell and Verónica Grondona) quotes "Adelaar and Muysken (2004: 624): 'cannot be classified for absence of data.' Loukotka (1968: 257) grouped Zenú (Senú) with the Chocó Stock, though nothing was known of the language." However, I've been poking around for quite a while now, and finally found a copy of A. Oyuela-Caycedo's full article in the Handbook of South American Archaeology (edited by Helaine Silverman, William Isbell), which I'd previously only been able to see a page of. On the page I'd seen, Oyuela-Caycedo says "the descendants of the Sinú have lost their language, making it impossible to classify them in terms of known linguistic families (Adelaar and Muysken 2004). However, taking toponyms into consideration the area seems to have been occupied by Chibchan speakers." On the next page, however, Oyuela-Caycedo goes on to discuss Spanish records and mentions that the "name or title" of the Finzenú chief was "Tota", with which clue I tracked down Jaime Alberto Castro Núñez's Historia de la medicina en Córdoba: notas preliminares (2002), mentioning a few other words and names: "A la llegada de los cappunia, como los indios llamaban en su lengua a los españoles, la cacica de Finzenú era Tota; el Zenúfana era Nutibara y el Panzenú era Yapé." "At the arrival of the cappunia, as the Indians called the Spanish in their language, the chief of Finzenú was Tota, that of Zenúfana was Nutibara and that of Panzenú was Yapé." (Other mentioned names are Anunaybe and Quinuchu.)
The ambiguity over whether Tota was a proper name or title appears to be original; I find what seems to be a copy of Sebastián de Benalcázar's writings, saying " Tota, nombre que no sabemos si era el de su cargo o propio de la última reina de este riquísimo pueblo". - -sche (discuss) 05:21, 11 May 2017 (UTC)

Even more languages without ISO codes, part 3

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


04:54, 6 July 2016 (UTC)

Australian languages
other codes

Alungul, Angkula

Alungul and Angkula are very similar (but, OTOH, so are other related languages), with Barry Alpher mentioning the following words: Alungul oth:o, Angkula otho "liver" (but e.g. Ikarranggal otho is also similar); Alungul mbolvm, Angkula ombolvm "mosquito" (but also e.g. Ogunyjan ombolvm); Alungul orormv, Angkula otil "nape" (the latter "likely a loan from Uw-Olkola odel", and other languages use different words); Alungul obmo(gng), Angkula obmu "nose" (but also e.g. Athima ubmu); Alungul atïl, Angkula atï "see" (Ikarranggal ara); Alungul amadhv, Angkula amadh "shin" (Athima amadhv); Alungul anggul, Angkula angkul "tooth" (Athima angkul, Ogunyjan enggul). Many words were even recorded from the same informant, with one scholar writing "West recorded considerable material in Ogh Alungul and Ogh Anggula from the now deceased Mr Jack Burton, but subsequently I was able to record and transcribe not only a few fragments more of these but a quantity of another dialect." It is difficult to say if they should be treated as one language, and what it would be called. They have separate codes for now. They could always be merged later. - -sche (discuss) 18:35, 25 May 2017 (UTC)

Other comments

I also added a code for Yangkaal, which Ethnologue had conflated into nny for some reason. - -sche (discuss) 03:25, 26 May 2017 (UTC)

Languages of France

For discussion about the addition of codes for Angevin (roa-ang), Champenois (roa-cha), Lorrain (roa-lor), Franc-Comtois (roa-fcm), Orléanais (roa-orl), Poitevin-Saintongeais (roa-poi), and Tourangeau (roa-tou), and discussion of Bourbonnais and Berrichon, Mayennais and Sarthois and Percheron, see Wiktionary:Beer parlour/2017/May#Language_codes_for_Bourbonnais_and_Poitevin. - -sche (discuss) 21:57, 1 June 2017 (UTC)

Renaming Standard Moroccan Amazigh (zgh)

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


The preferred form in the ISO 639-2 is: Moroccan Amazigh. — This unsigned comment was added by YesIn (talkcontribs) at 17:34, 24 June 2017‎.

I have moved this from Wiktionary talk:Language treatment/Discussions, the out-of-the-way place where finished discussions are archived. - -sche (discuss) 18:00, 24 June 2017 (UTC)
In general, we do try to avoid using "Standard" in language names (e.g. we have "Estonian", not "Standard Estonian"), so a rename would be good, and "Moroccan Amazigh" does seem to be more common than "Moroccan Tamazight". - -sche (discuss) 18:00, 24 June 2017 (UTC)
Support per -sche. This language is currently our only one using "Standard" of all the coded languages with categories. —Μετάknowledgediscuss/deeds 14:33, 27 June 2017 (UTC)
You can find in the official Request for new language code:

"Name(s) of language (English): (Required)

Moroccan Amazigh, Amazigh, Common Moroccan Amazigh, Standard Moroccan Amazigh, Moroccan Berber

If giving variant names, indicate preferred form first
Name(s) of language (French):

Amazighe marocain, amazighe, amazighe marocain commun, amazighe marocain normalisé, berbère marocain, amazighe standard.

If giving variant names, indicate preferred form first
Reference where found:

The French name is used by Institut royal de la culture amazighe (IRCAM). Amazighe is the name found in the French version of the constitution approved by a referendum on the 1st of July 2011.
http://lematin.ma/Events/discours-royal/constitution-referundum.pdf (article 5) and http://www.sgg.gov.ma/bo5952F.pdf?cle=42 (Official Government Gazette)
The qualificative "marocain"/"Moroccan" is used to make sure it does not refer to a specific Moroccan dialect of Berber (Moroccan being broader than the Central Atlas qualitificative used for ) or to the Berber/Amazigh family as a whole (Moroccan being narrower than the whole family’s coverage area).

Name(s) of language (indigenous):

ⵜⴰⵎⴰⵣⵉⵖⵜ". ‒ YesIn (discuss) 03:08, 5 September 2017 (UTC)


Renaming Banggarla (bjb)

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Judging by Google Books and Scholar searches, and based on the references cited by Glottolog and Wikipedia, the most common name for this language in recent literature (since the 1990s?) seems to be Barngarla, while the most common name historically/overall (still found in some recent references) is Parnkalla. Some old references have been updated from Parnkalla to Barngarla, e.g. Mark Clendon's Clamor Schürmann’s Barngarla grammar: A commentary on the first section (where the referred-to original had Parnkalla), which argues based on recordings as well as etymology that the name is /parnkarla/, with the /ŋ/-form Banggarla as a northern dialect form "or even an exonymic pronunciation". Banggarla, and Barngala with no second 'r' and Parnkarla with 'P' and two 'r's, seem relatively uncommon, and many other spellings exist (see Glottolog). I suggest a rename to Barngarla, or Parnkalla (this entails moving the categories). (Fr.Wikt has "banggarla" and "barngala" as separate languages, but this seems to be an oversight and I will let them know.) - -sche (discuss) 15:43, 26 June 2017 (UTC)

I support a rename to something, with "Barngarla" being my preference, but we seem to vacillate in general on whether we should use the more traditional spelling or the one that is becoming the standard. Compare "Kikuyu" vs "Gikuyu". —Μετάknowledgediscuss/deeds 19:22, 26 June 2017 (UTC)
Renamed. - -sche (discuss) 03:14, 9 July 2017 (UTC)


East Franconian (vmf), High Franconian (gmw-hfr)

Noting here that gmw-hfr was retired and vmf restored after becoming usable; see Wiktionary:Beer parlour/2017/November#Restoring_vmf_and_eliminating_gmw-hfr. - -sche (discuss) 23:45, 29 December 2017 (UTC)

Splitting Monguor into Mangghuer and Mongghul

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


This was already discussed at Wiktionary:Beer_parlour/2016/December#Splitting_Monguor, but seeing as it's been a few days I've decided to use the circumstance of this being the preferred place for such requests to bump the topic. Crom daba (talk) 05:19, 12 December 2016 (UTC)

@Crom daba, what would happen to words in older literature such as de Smedt / Mostaert 1933 or Todaeva 1973 which are not clearly marked as being either Minhe or Huzhu? Now they can be added under Monguor. Someone can later label them as Mangghuer or Mongghul using {{lb|mjg|}}. After splitting, a lot of useful stuff from older sources will hang in the air. --Vahag (talk) 14:49, 22 December 2016 (UTC)

Most sources are identifiable as either Mangghuer or Mongghul, we could specify which resource contains what in the about page. A bigger problem for me is how do we even enter data from the old sources? Do we put in Todaeva's Cyrillic and Mostaert's pre-IPA phonetic symbols or do we transcribe it into Pinyin? I wasn't around long enough to see what the precedent here is. Crom daba (talk) 22:08, 22 December 2016 (UTC)
If you can specify which resource contains what in the about page, then I support the split. Otherwise, I would like to have three codes, like we do with ku (Kurdish macrolanguage), kmr (Northern Kurdish variety), ckb (Central Kurdish variety).
As for entering words from older sources, you can normalize them into Pinyin, as long as the rules of normalization are clearly defined. Look at what I have done with Udi at WT:AUDI. --Vahag (talk) 06:56, 23 December 2016 (UTC)
After some research, it appears that correspondence of Pinyin spellings (as written by Dpal-ldan-bkra-shis et al) and older attestations is non-trivial, so I will put off transcribing anything which isn't already in Pinyin, at least until I see an example of it supporting the full range of dialectal and historical Monguor variation. Crom daba (talk) 00:16, 24 December 2016 (UTC)
@Crom daba, Vahagn Petrosyan, Angr, Metaknowledge: I added codes for Mangghuer (xgn-mgr) and Mongghul (xgn-mgl). I suggest that we move as much content as possible to the new codes and update the orthography as we go, and that will give us an idea of whether or not it is feasible to retire the macrolanguage code yet. - -sche (discuss) 00:58, 2 June 2017 (UTC)


Splitting Evenki and Solon

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Solon is a language spoken by a Tungusic people living in Manchuria and Inner Mongolia. They consider themselves Evenks, but their language is somewhat different and is not usually counted among Evenki dialects in literature. We follow Ethnologue in categorizing it as Evenki, but most Russian Tungusic literature (Tsintsius, Vasilevich, ...) counts it as a separate language and I too think this would be preferable.

Also worth mentioning is that we already treat the language of the Oroqen as a separate language, even though Janhunen claims that they are "in fact much closer to the "Ewenki proper" (i.e., the Evenks of Siberia) than the Solon are" (quote from Wikipedia). Crom daba (talk) 08:37, 7 August 2017 (UTC)

Sadly (as the lack of response shows), I think you may be the only editor with knowledge of Tungusic. The English-language works I can find discussing the lects, while mostly general rather than specialist, also seem to mostly speak of them as separate languages, even though Wikipedia redirects "Solon language" to Evenki. I see that Solon currently has an etymology-only code ("evn-sol"); do you want it to have a "full" code and its own header and entries like дяви, @Crom daba? That seems reasonable, although the code should be "tuw-sol", I think, to fit the customary naming scheme described in WT:Languages; if that's OK, ping me to add it or add it yourself to Module:languages/datax (and then update the entries that refer to it and remove the etymology-only code "evn-sol"). - -sche (discuss) 06:44, 29 December 2017 (UTC)
Yes thank you, I'd like it to have its own header @-sche. Crom daba (talk) 17:02, 29 December 2017 (UTC)
Done. - -sche (discuss) 23:17, 29 December 2017 (UTC)


Renaming (A)Ngas, anc

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming anc

A pretty clear case. We call it "Angas", but the name "Ngas" has been more common for quite some time now. Compare google books:"Angas language" with google books:"Ngas language" for an example. —Μετάknowledgediscuss/deeds 20:53, 14 August 2017 (UTC)

Yes, perusing those searches and Glottolog's list of reference works about it, I see that some books do still use "Angas" but "Ngas" does seem to have been more common for at least a couple decades. (And you have access to better resources on African languages than I do and I trust your judgment.) I recall that we renamed the related Mwaghavul (from "Sura") only a couple years ago. Renamed. - -sche (discuss) 06:15, 29 December 2017 (UTC)


Removing Jorto, jrt

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Removing jrt

An anon has pointed out that Jorto language is apparently a spurious invention. There is a wordlist, and a case could be made for including it, but given that we chose to exclude the Oropom language, I don't see how this is any different. —Μετάknowledgediscuss/deeds 00:57, 28 December 2017 (UTC)

Remove before it lays eggs. Palaestrator verborum sis loquier 🗣 01:07, 28 December 2017 (UTC)
Done. - -sche (discuss) 23:39, 29 December 2017 (UTC)


Even more languages without ISO codes, part 4

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Even more languages without ISO codes, part 4

Μετάknowledgediscuss/deeds 04:54, 6 July 2016 (UTC)

Others:

  • Alazapa/Alasapa/Pinto (nai-ala), a fragmentarily attested language of northern Mexico, scarcely described in English and not that much better described in Spanish. Considered to be related to Quinigua and Cotoname by Norman A. McQuown's 1968 Handbook of Middle American Indians (volume 5: Linguistics, page 100), it is sometimes identified with or considered a dialect of Coahuilteco, apparently as part of the former belief that "the Coahuiltecans belonged to a single language family and that the Coahuiltecan languages were related to the Hokan languages of California, Arizona, and Baja California. Most modern linguists, however, discount this theory for lack of evidence and believe that the Coahuiltecans were diverse in both culture and language. At least seven different languages are known to have been spoken". Some of the scholars who responded to and followed up on del Hoyo's vocabulary of Quinigua provide a few Alazapa words, like axi "tobacco" (compared to Karuk úuh "tobacco", Esselen ka'a "tobacco"). - -sche (discuss) 03:48, 6 June 2017 (UTC)
    • I'm away from my books at the moment, but I seem to remember Yuman languages having something along the lines of /up/ for tobacco. Chuck Entz (talk) 04:01, 6 June 2017 (UTC)
      • Yes, Cocopa has ˀu·p "tobacco", and ˀu·p xyay "smoke tobacco". Quechan/Yuma itself has axta/ak’sa’ for a tobacco pipe; in a short search, I didn't find the word for tobacco itself, but the list I found the Alazapa word in was comparing it to mostly words for "tobacco" but some words for "pipe". (I updated the spelling of the Karuk and Esselen words to the spellings given in dedicated works on those languages.) - -sche (discuss) 04:30, 6 June 2017 (UTC)
    Added. - -sche (discuss) 19:48, 16 June 2017 (UTC)


Ossetian: RFM discussion: December 2012–January 2018

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Template:os

The Digor and Iron dialects of Ossetian seem quite different, and already many (most?) of our entries distinguish which is meant. It seems to me that there is a fair chance that the two are separate enough to deserve being called different languages here. —Μετάknowledgediscuss/deeds 04:46, 19 December 2012 (UTC)

I've seen them referred to as separate languages before, but there's still some debate over that. Doesn't matter to me. But would there still be plain Ossetian language entries or would all be sorted into the new languages? There are some that aren't labelled as either Iron or Digor.Word dewd544 (talk) 17:58, 20 December 2012 (UTC)
Iron is by far the more common dialect, and the literary Ossetian language is based on Iron. However, Digor is different enough that it could be considered a separate language. The main things against it here are the relatively small number of speakers and that it does not yet have a written standard, as far as I know. But there is now a Digor dictionary out there, and it’s probably just a matter of time before Digor develops a literary standard of its own. I think it’s unlikely that we will get enough Digor contributions to make a difference, but it is always possible that someone will start entering words from a Digor dictionary. The Digor language code is oss-dig. We could use os for Ossetian proper (and Iron), and oss-dig for Digor. —Stephen (Talk) 02:20, 21 December 2012 (UTC)
In that case, I support. — Ungoliant (Falai) 03:40, 21 December 2012 (UTC)
A standard language based on one widely-spoken dialect, and another lect sometimes considered a dialect and sometimes considered its own language? This reminds me of Tosk vs Gheg Albanian: some references say they're mutually unintelligible separate languages, speakers say their differences present no impediment to communication. Unfortunately, we lack speakers of the Ossetian lects, and the dictionary of Digor is said to waffle, the author calling it a language and the editor calling it a dialect. Stephen is probably right that it's just a matter of time before Digor develops its own standard (and merits separation as much as Luxembourgish and Limburgish do from each other and from German); OTOH, Wiktionary, like Wikipedia, is not a crystal ball. My preference would be to wait and not split them for now. If we do split them, I agree with using {{os}} for Iron (compare {{lt}} and {{sgs}}), and we should devise an exceptional code for Digor that fits our usual naming scheme (ira-odg or ira-dig), rather than using Linguist List's ersatz "oss-dig". - -sche (discuss) 03:42, 21 December 2012 (UTC)
Etymology-only codes have been created for Digor and Iron, per the last part of this January 2018 WT:ES thread. - -sche (discuss) 03:21, 10 January 2018 (UTC)
Not merged at this time; cf the WT:ES discussion; but especially if the developments with regard to resources in the lects over the past few years suggest they should, in fact, have separate codes, please feel free to reopen (or start a new, non-stale) discussion. - -sche (discuss) 14:47, 12 January 2018 (UTC)


Renaming (Tshi)Luba, lua

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming lua

This language is currently called "Tshiluba", which is a really awful choice. First of all, tshi- is that good ol' language prefix that we often try not to have in language names (which I think is ci- in modern orthography), and there are in fact two Luba languages (the other is lu "Luba-Katanga"). To avoid confusion, we rightfully give neither the name Luba, but this is not much better, and we should rename it to "Luba-Kasai", as Wikipedia does. —Μετάknowledgediscuss/deeds 19:28, 27 November 2015 (UTC)

On the one hand, we do try to avoid prefixes. On the other hand, "Luba-Kasai" seems to more often be a placename and an ethnonym than a language name, and "Tshiluba" seems to be about twice as common. - -sche (discuss) 21:15, 8 January 2016 (UTC)
The issue is that, AFAICT, "Tshiluba" is more commonly used because it refers to both Luba languages! This is not so much about prefixes as about the name being exceedingly ambiguous in its referent. —Μετάknowledgediscuss/deeds 01:28, 9 January 2016 (UTC)
OK, renamed. - -sche (discuss) 21:19, 14 January 2018 (UTC)


Taishanese and Teochew

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


@-sche I think the Taishanese dialect of Cantonese and the Teochew dialect of Min Nan both need language codes. They are covered by the Chinese pronunciation modules, but their transliterations are very different from those of Cantonese and Min Nan respectively. E.g. in 鉛筆铅笔 (qiānbǐ), "yon3 bit2" is not standard Jyutping and "ing5 big4" is Teochew Peng'im, not POJ. @Kc_kennylau, Wyang, Suzukaze-c, Justinrleung. Perhaps these subdialects need nesting in translations, and numbered tone marks (also for Gan, Jin, Xiang) need superscripts, just like Cantonese Jyutping. --Anatoli T. (обсудить/вклад) 11:32, 12 November 2016 (UTC)
typo fix→@Kc_kennylau, Wyang, Suzukaze-c, Justinrleung —suzukaze (tc) 12:05, 13 November 2016 (UTC)
Thanks, suzukaze! --Anatoli T. (обсудить/вклад) 12:08, 13 November 2016 (UTC)
They seem to have language codes already (yue-tai for Taishanese and nan-teo for Teochew); at least they work in the etymology templates. — justin(r)leung (t...) | c=› } 00:58, 14 November 2016 (UTC)
@Justinrleung Thanks, but if I try to add a translation using 'yue-tai' I get the error:
Lua error in Module:languages/templates at line 28: The language code 'yue-tai' is not valid.: Lua error in Module:parameters at line 110: The parameter "<strong class" is not used by this template. --Anatoli T. (обсудить/вклад) 06:15, 14 November 2016 (UTC)
Yes, those are etymology-only codes. They would standardly need different codes if we are to treat them as full languages, though. The point is that you guys need to decide what status you want these lects to have. @Atitarev, Justinrleung, suzukaze-c, Wyang —Μετάknowledgediscuss/deeds 06:24, 14 November 2016 (UTC)
If there's a need to enter them into translations tables (on account of their different transliteration and, according to Wikipedia, sometimes different vocabulary), they could be given full codes, which as Meta notes would be named a little differently (using the system described in WT:LANG): zhx-tai and zhx-teo. I await Wyang's input. As an aside, we should consider taking a Chinese approach to Arabic, i.e. not have separate headers for each dialect, but retain the option of listing each dialect's pronunciation and maybe having each dialect in translations tables. - -sche (discuss) 16:36, 14 November 2016 (UTC)
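
As a rough illustration of the distinction drawn above, here is a minimal Python sketch, assuming two separate lookup tables, of why a code such as yue-tai resolved in etymology templates but raised an error in translation templates at the time of this discussion. The names FULL_LANGUAGES, ETYMOLOGY_ONLY and resolve are hypothetical; this is not the actual Module:languages code, which is written in Lua, and the tables contain only the codes mentioned in this thread.

```python
# Hypothetical illustration only; the real data lives in Wiktionary's Lua modules.
# The codes and their statuses reflect this discussion as of November 2016.
FULL_LANGUAGES = {"yue": "Cantonese", "nan": "Min Nan"}
ETYMOLOGY_ONLY = {
    "yue-tai": ("Taishanese", "yue"),  # (name, parent language code)
    "nan-teo": ("Teochew", "nan"),
}


def resolve(code, allow_etymology_only=False):
    """Return the name for `code`, or raise ValueError if it is not usable here."""
    if code in FULL_LANGUAGES:
        return FULL_LANGUAGES[code]
    if allow_etymology_only and code in ETYMOLOGY_ONLY:
        name, _parent = ETYMOLOGY_ONLY[code]
        return name
    raise ValueError(f"The language code '{code}' is not valid.")


# Etymology templates accept both kinds of code.
print(resolve("yue-tai", allow_etymology_only=True))  # Taishanese

# Translation templates accept only full languages, hence the error quoted above.
try:
    resolve("yue-tai")
except ValueError as err:
    print(err)  # The language code 'yue-tai' is not valid.
```

Under this reading, giving the lects full codes (zhx-tai, zhx-teo) amounts to adding them to the first table, which is what would make them usable in translation tables.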

@Kc_kennylau, Wyang, Suzukaze-c, Justinrleung, -sche, Metaknowledge Thanks all. Will further nesting open a Pandora's box of nesting subdialects if we do it like this? (pls note new rows for Taishanese and Teochew):

* Chinese:
*: Cantonese: {{t|yue|鉛筆|sc=Hani}}, {{t|yue|铅笔|tr=jyun4 bat1|sc=Hani}}
*:: Taishanese: {{t|zhx-tai|鉛筆|sc=Hani}}, {{t|zhx-tai|铅笔|tr=yon3 bit2|sc=Hani}}
*: Gan: {{t|gan|鉛筆}}, {{t|gan|铅笔|tr=nyyon4 'bit6}}
*: Hakka: {{t|hak|鉛筆}}, {{t|hak|铅笔|tr=yèn-pit}}
*: Jin: {{t|cjy|鉛筆}}, {{t|cjy|铅笔|tr=qie1 bieh4}}
*: Mandarin: {{t+|cmn|鉛筆|sc=Hani}}, {{t+|cmn|铅笔|tr=qiānbǐ|sc=Hani}}
*: Min Dong: {{t|cdo|鉛筆}}, {{t|cdo|铅笔|tr=iòng-bék}}
*: Min Nan: {{t+|nan|鉛筆}}, {{t|nan|铅笔|tr=iân-pit}}
*:: Teochew: {{t|zhx-teo|鉛筆|sc=Hani}}, {{t|zhx-teo|铅笔|tr=ing5 big4|sc=Hani}}
*: Wu: {{t|wuu|鉛筆}}, {{t|wuu|铅笔|tr=khe piq}}
*: Xiang: {{t|hsn|鉛筆}}, {{t|hsn|铅笔|tr=yan2 bi6}}

There's some work for translation adder as well. --Anatoli T. (обсудить/вклад) 21:27, 15 November 2016 (UTC)

In this case I would really prefer:

* Chinese: {{zh-l|鉛筆}}

where {{zh-l}} extracts and displays the simplified form from the entry, as well as extracts all the readings of the word from the entry, and (if on a computer) displays the readings on hover-over or (if on a mobile device) something. Additional topolect-specific translations can be added as:

* Chinese: {{zh-l|鉛筆}}
*: Cantonese: {{zh-l|鉛鉛筆}}
*: Mandarin: {{zh-l|筆鉛鉛}}

Wyang (talk) 21:34, 15 November 2016 (UTC)

We'd need a full-blown {{zh-t}} that can serve the function of linking to zh.wikt if an entry exists, and you'd also need to run a bot to update those every now and then (or convince Ruakh to do it). This would imply some changes to the translation adder as well, and possibly other gadgets and bits of code lying around. In short, that's a big jump from what Anatoli proposed. It does sound like a good idea, though, if you want to put the requisite work into it. —Μετάknowledgediscuss/deeds 00:10, 16 November 2016 (UTC)
@Wyang: so, do we need to add language codes for these varieties (in which case, please add them), or are they adequately covered by zh? - -sche (discuss) 23:19, 11 January 2018 (UTC)
@-sche I think having full language codes for these would be useful in translation tables. @Justinrleung, Atitarev, Suzukaze-c, Tooironic, Dokurrat What thinkest thou? Wyang (talk) 13:12, 12 January 2018 (UTC)
@Wyang: Thank you for pinging me but I'm not familiar with Cantonese and Banlamgu and their regional speeches and I currently have no idea about this... Dokurrat (talk) 13:35, 12 January 2018 (UTC)
@Dokurrat Thank you for your frankness... :) Wyang (talk) 13:42, 12 January 2018 (UTC)
Support. — justin(r)leung (t...) | c=› } 13:35, 12 January 2018 (UTC)
OK, thank you all for your input. Done. :) - -sche (discuss) 14:44, 12 January 2018 (UTC)
Hmm, having translations is nice but do we really want Category:Teochew lemmas? —suzukaze (tc) 23:18, 13 January 2018 (UTC)
I presume such categories would be populated the same way as Category:Min Nan lemmas (and Category:Old Chinese lemmas, etc), i.e. the entries use the consolidated "Chinese" L2 header... at which point, having Category:Teochew lemmas seems no better or worse to have than Category:Min Nan lemmas. (There are some other languages where it could be good to do similar, e.g. Arabic and possibly Romani.) - -sche (discuss) 23:38, 13 January 2018 (UTC)
That's true. It seems weird though, because Teochew is Min Nan, and Taishanese is Cantonese. —suzukaze (tc) 05:18, 14 January 2018 (UTC)


Diegueño

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Diegueño

I thought I might dig out a few references I have and add a few entries in one of the lects that go by this name, but I'm a bit confused by the way we have the language codes set up.

Diegueño is the name that anthropologists have traditionally used for the language originally spoken around Mission San Diego in the southwestern corner of California. Older references referred to it as a single language covering most of San Diego County, California and northern Baja California, Mexico, but I always understood it to be at least three languages:

  1. Northern Diegueño, known to its speakers as 'Iipay 'aa, and generally called Ipai in the literature
  2. Central Diegueño, now called Kumeyaay or Kumiai
  3. Southern Diegueño, now called Tipai or Tipay

Just to confuse things, Kumeyaay is sometimes used to refer to all three, and there are some scholars who have merged Tipay and Kumeyaay into a single language that they call Tipai. Then there's Kamia, which has been used in older literature for a variety of groups who all seem to have been Diegueño of one sort or another. The ISO has a single code, dih (which we call Kumiai) and our lone entry using that code is the 'Iipay 'aa word for water. That would make sense if we treated all of Diegueño as one language, but we have an exception code for Tipai: nai-tip, and a single entry. As far as I know, no one currently considers Ipai to be part of Kumiai unless they consider Tipai to be part of it, too- hence my confusion.

I've only studied Ipai (perhaps I should say "dabbled in"), so I can't judge for myself how different the lects are from each other. Based on what little I have read, it would seem to me that we have just a few credible options:

  1. Treat Diegueño as a single language, keeping dih and retiring nai-tip
  2. Treat Ipai as separate, but merge the rest of Diegueño into Tipai
  3. Treat each of the three lects as independent, preferably all with exception codes (nai-ipa, nai-kum and nai-tip, perhaps?).

I would recommend against using dih for anything but the single-language option- this isn't a macro-language with a standard or prestige lect, and it would probably just confuse things. Chuck Entz (talk) 06:12, 19 June 2017 (UTC)

(After digging into the history of the codes, I've refreshed myself that) I added the code for Tipai in diff, so I could add diff, after seeing that we called dih "Kumiai" and taking that to mean that it referred to the Central dialect and the others needed codes. (Apparently, the ISO/Ethnologue's actual reason for calling it "Kumiai" is that some people use "Kumiai" instead of "Diegueño" as the cover term, as you note.) I probably didn't add a code for the Northern variety at that time because I didn't want to bother figuring out what to call it ("'Iipay 'aa" struck me as a suboptimal name; do we have other names with spaces in them where the part after the space isn't capitalized?) at a time when no-one was coming forward with content that needed to be added in it. :p
Victor Golla, California Indian Languages (2011, →ISBN, page 120), says: "While Kroeber (1925) and others treated the Kamia as a Diegueño subgroup, there is no firm evidence in support of this approach, although the name they are known by appears to be a variant of Kumeyaay (Langdon 1975a). With this possible exception, all of the groups definitely known to have spoken varieties of Diegueño were located west of the present San Diego-Imperial County line or in Baja California west of the Sierra de Juarez. Although Ipai and Tipai are to some extent mutually intelligible, they show numerous differences in vocabulary and structure (for a comparison of Mesa Grande Ipai and Jamul Tipai see A. Miller 2001:359-363) and have sometimes been treated as separate languages. Winter judged to be no closer to (Northern) Diegueño than to Cocopa. The most recent classification (Langdon 1991; Miller 2001:1-4) distinguishes ."
Amy Miller's referred-to work, A Grammar of Jamul Tipay, says "A comparison of descriptions of Mesa Grande with the results of my own research reveals that differences between Mesa Grande and Jamul pervade the phonology, lexicon, derivational morphology, inflectional morphology, syntactic morphology, syntax, and discourse."
OTOH, Tipai Ethnographic Notes: A Baja California Indian Community (2001, →ISBN, edited by Langdon et al.), says: "These Mexican Diegueno, who call themselves Ipai or Tipai 'people,' cannot be described as now forming a tribe; they are a group of Indian families speaking mutually intelligible dialects (Northern and Southern) of a language"
- -sche (discuss) 18:51, 24 June 2017 (UTC)
Amy Miller has a comparison of Ipai and Tipai in her work cited above; I have put a shorter comparison of various words at User:-sche/Diegueño. - -sche (discuss) 20:19, 24 June 2017 (UTC)
@Chuck Entz I have split (and retired) dih into three codes as proposed above. - -sche (discuss) 22:02, 19 January 2018 (UTC)


Renaming Azeri to Azerbaijani

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming az

I propose to rename the language name Azeri to Azerbaijani. The reasons are (1) to distinguish it from the Iranian language Azeri, (2) Ethnologue, Glottolog and Wikipedia call the language Azerbaijani, (3) Azerbaijani is closer to the native form Azərbaycan dili, (4) Azeri has pejorative connotations in Armenian. @Allahverdi Verdizade, currently the most active Azerbaijani contributor, agrees. --Vahag (talk) 14:01, 24 January 2018 (UTC)

Support. Allahverdi Verdizade (talk) 15:52, 24 January 2018 (UTC)
Support. And to summon evidence more appropriate to our dictionarying efforts, Google Books searches for "speaking X" or "the X language" have invariably returned more hits with "Azerbaijani" than with "Azeri" when I have tried them. —Μετάknowledgediscuss/deeds 17:22, 24 January 2018 (UTC)
Pinging several potentially interested users for more opinions: @Atitarev, ZxxZxxZ, Crom daba, Anylai. --Vahag (talk) 15:23, 25 January 2018 (UTC)
Support. It took me some getting used to the English "Azeri" at Wiktionary, which also has negative connotations in Russian and many Azerbaijanis speak Russian. а́зер (ázer) is pejorative for азербайджа́нец (azerbajdžánec) in Russian. --Anatoli T. (обсудить/вклад) 15:28, 25 January 2018 (UTC)
Support. Azeri is a little "vague" while Azerbaijani is kind of long and sounds only limited to Azerbaijan, but I support the change, "Azeri" is used for people rather than the language itself in Turkish as well. --Anylai (talk) 17:33, 25 January 2018 (UTC)
Support --Z 20:06, 25 January 2018 (UTC)
Support for the reasons Vahag and Meta outlined. Someone with a bot will need to implement the rename, because at least three thousand entries will need to be updated, and that's just the ones where the L2 header needs to be changed; in other entries, translations tables and descendants lists and etymologies where the name is spelled out will need updating. - -sche (discuss) 23:51, 25 January 2018 (UTC)
Support. I never liked that we used that name. PseudoSkull (talk) 00:16, 26 January 2018 (UTC)
Looks like this is uncontroversial, so Done. I made a request at the Grease Pit for a rename by a bot. --Vahag (talk) 17:52, 9 February 2018 (UTC)

Has anyone modified the tool that assists in adding translations to translation tables so that it uses the word “Azerbaijani”? — SGconlaw (talk) 04:57, 10 February 2018 (UTC)

That happens automatically. DTLHS (talk) 05:04, 10 February 2018 (UTC)
Oh, good. — SGconlaw (talk) 05:54, 10 February 2018 (UTC)


Removing spurious Dororo, Guliguli, Yarsun, and Wares

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Some spurious languages to merge or remove, 1
merge Dororo and Guliguli into

In 1953, Lanyon-Orgill provided short wordlists of two languages he called Dororo and Guliguli. His lists are so similar to Kazukuru that subsequent scholars have suspected they are dialects, if not alternative transcriptions, if not jokes. Karen Davis, in A grammar of the Hoava language, Western Solomons, notes "there was no one in present day Kusaghe who had heard of , and Lanyon-Orgill does not identify his informant. As one of the names of the dialects, Guliguli, can mean 'masturbate' in Hoava-Kusaghe, I have my doubts about the existence of this language." Michael Dunn and Malcolm Ross expand on Davis's point in their 2007 article Is Kazukuru really non-Austronesian, with the view (accepted also by e.g. Harald Hammarström and Sebastian Nordhoff, in Melanesian Languages on the Edge of Asia: Challenges for the 21st Century) that if the lects were real, they were the same language as Kazukuru, which I propose to merge them into. (Sample words in Kazukuru, Guliguli, and Dororo: pito, bito, bito "arrow"; vinovo, vino, bino "banana"; viniti, vini, vinitini "body"; minata, minate, minate "die"; meta, mata, mata / meta "eye"; rano, rano, rano "head"; muni, moni, muni / moni "night".) - -sche (discuss) 06:44, 12 May 2017 (UTC)
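As a rough illustration of how close the sample words are, the following sketch scores each pair of lists with a normalised edit distance. This is not the method used by Davis, Dunn and Ross, or Hammarström and Nordhoff; edit distance on seven words is only a crude proxy for relatedness, and the helper names (`levenshtein`, `similarity`, `mean_similarity`, `WORDS`) are made up for this example. Where two Dororo forms are given, the closer one is used.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute, or keep if equal
        prev = cur
    return prev[-1]


def similarity(a, b):
    """1.0 for identical strings, approaching 0.0 as the strings diverge."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


LECTS = ("Kazukuru", "Guliguli", "Dororo")

# Sample words quoted in the comment above; each cell lists the recorded variants.
WORDS = {
    "arrow":  (["pito"],   ["bito"],   ["bito"]),
    "banana": (["vinovo"], ["vino"],   ["bino"]),
    "body":   (["viniti"], ["vini"],   ["vinitini"]),
    "die":    (["minata"], ["minate"], ["minate"]),
    "eye":    (["meta"],   ["mata"],   ["mata", "meta"]),
    "head":   (["rano"],   ["rano"],   ["rano"]),
    "night":  (["muni"],   ["moni"],   ["muni", "moni"]),
}


def mean_similarity(lect_a, lect_b):
    """Average best-variant similarity between two lects over all sample words."""
    ia, ib = LECTS.index(lect_a), LECTS.index(lect_b)
    scores = [
        max(similarity(x, y) for x in forms[ia] for y in forms[ib])
        for forms in WORDS.values()
    ]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    for a, b in (("Kazukuru", "Guliguli"), ("Kazukuru", "Dororo"), ("Guliguli", "Dororo")):
        print(f"{a} vs {b}: {mean_similarity(a, b):.2f}")
```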

Done. - -sche (discuss) 21:03, 19 January 2018 (UTC)
remove Yarsun and Wares

As noted by Hammarström, Melanesian Languages on the Edge of Asia: Challenges for the 21st Century, the existence of a Yarsun language seems to have been based on the confusion of language names with place names, which is not uncommon in Papua (Yarsun is near Anus Island, and that's not a joke); "no such language is attested". Likewise with Wares. - -sche (discuss) 06:44, 12 May 2017 (UTC)

Done. - -sche (discuss) 23:26, 29 December 2017 (UTC)


RFM discussion: July 2016–January 2018

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Bunurong

This Pama-Nyungan lect seems to have been given an exceptional code with no discussion, for its use at one transwikied entry, ngargee. It's by no means clear that we should be giving it a separate code rather than treating it as a dialect of Woiwurrung wyi; Wikipedia claims they are 90% mutually intelligible, but doesn't cite that claim directly. —Μετάknowledgediscuss/deeds 04:54, 6 July 2016 (UTC)

I noticed this a while ago, but left it alone because I could not at the time find a reference (i.e. outside WP) that confirmed that they were mutually intelligible, probably because the huge variety of spellings made searching for information difficult. However, I can now find references that suggest we should be merging more than just these two. Leigh Boucher and Lynette Russell's Settler Colonial Governance in Nineteenth-Century Victoria (2015, →ISBN, page 8), speaks of "the Woiwurrung (Wurundjeri), Boonwurrung, Wathaurung, Taungurong and Dja Dja Wurrung mutually intelligible languages that share up to 80 per cent of their terminology." A paper by Barry Blake and Julie Reid on Sound Change in Kulin, in the La Trobe Working Papers in Linguistics, v 6-8 (1993), speaks of a single Kulin language, with "material available on three dialects: Boonwurrung (B), Woiwurrung (W) and Thagungwurrung (T)" (emphasis mine). Dja Dja Wurrung = dja, and Taungurong / Thagungwurrung = dgw, and Wathaurung = wth, all of which we currently treat as separate. Kulin suggests that at least the four eastern ones, if not also Wathaurung, could be merged. - -sche (discuss) 06:48, 6 July 2016 (UTC)
I've merged Bunurong; the others still need to be merged. - -sche (discuss) 16:52, 18 September 2016 (UTC)
Bunurong was merged at the time this discussion was current. The discussion has since gone stale, and I am going to leave the others alone (unmerged). - -sche (discuss) 06:26, 29 January 2018 (UTC)


RFM discussion: January–March 2018

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


ISO code changes for 2017

Here is a document summarising the changes to ISO codes that have been made in 2017. I have gone through all the approved request documents and briefly summarised their findings and my conclusions about what I propose we should do. —Μετάknowledgediscuss/deeds 10:36, 31 January 2018 (UTC)

Deprecated codes

  • Mosiro
    Only a clan name, confirmed with fieldwork. I concur with removal.
  • Ndaktup
    Merge into Kwaja, confirmed with fieldwork. I concur with removal.
  • Lyons Sign Language
    Apparently spurious. I concur with removal.
  • Mediak
    Only a clan name, confirmed with fieldwork. I concur with removal.
Done I also agree, and have removed them all, adding Mosiro and Mediak and several other things as alt names of oki (our spelling of the canonical name, Okiek, seems more common than Wikipedia's Ogiek), and adding Ndaktup and several other forms as alt names of kdz. - -sche (discuss) 16:56, 31 January 2018 (UTC)

Added codes

  • Gyalsumdo
    Confirmed with fieldwork and dictionary creation is underway. I concur with creation.
    Done, tentatively with Latn, Deva and Tibt scripts listed, based on the two scripts Ethnologue lists for broader Manang and how I see linguists documenting it. - -sche (discuss) 23:35, 7 February 2018 (UTC)
    But is this consistent with how we treat other Tibetan/Tibetic varieties? E.g. kte is subsumed into bo, somewhat like with Chinese varieties. This may need further discussion, maybe separate from this big list of codes, with Wyang. - -sche (discuss) 23:58, 7 February 2018 (UTC)
    I realised this when reading your comment at Talk:གསར་པ. We decided on a Chinese-style merger, but this compounds the lack of interest and expertise that has plagued documenting anything other than Lhasa Tibetan (which is admittedly the only one I have formally studied either). —Μετάknowledgediscuss/deeds 00:35, 8 February 2018 (UTC)
  • Ngen
    Confirmed with a Swadesh list as worthy to be split from Beng. I cannot find any documentation after a quick search, nor do I find confirming fieldwork. I abstain on creation.
    Glottolog, citing Anna Maloletnyaya's 2014 Brief presentation of Ngen language, agrees that "Ngen, a Southeastern Mande language of the Ben-Gban group not intelligible to Beng". Therefore, added. - -sche (discuss) 03:55, 7 March 2018 (UTC)
  • Western Armenian
    Request document is chiefly concerned with social needs (e.g. a new Wikipedia). There are significant dialectal differences, but Western and Eastern Armenian are known to be mutually intelligible and have been successfully treated as at Wiktionary (though there has been little to no prior discussion). There is currently a parallel discussion of this issue in the Beer parlour. I disagree with creation.
    The BP discussion will take care of this (and seems to be reaching the conclusion that the code should not be added). - -sche (discuss) 23:27, 7 February 2018 (UTC)

Name changes

  • Helambu Sherpa
    Changed to Hyolmo, based on the current name being a misnomer (the speakers are not Sherpas and the language is not closely related to Sherpa) and preference of native speakers. I find that Yolmo is in fact the most-used spelling on Google Books and Google Scholar, so I suggest we use that name instead.
    Done (Yolmo). - -sche (discuss) 05:25, 13 February 2018 (UTC)
  • Dzodinka
    Changed to Lidzonka, based on preference of native speakers. This name has almost never been used in the linguistic literature, whereas Dzodinka has and the request document itself admits that "both are correct". I disagree with the name change.
    Ergo, not renamed. (But struck, for the sake of tracking which ones have been looked at.) - -sche (discuss) 05:25, 13 February 2018 (UTC)
  • Shixing
    Changed to Shuhi, based on local usage; the requester actually uses the name Xumi when documenting the language. All other use in the linguistic literature appears to follow the original name, Shixing. I disagree with the name change.
    Ha, weird. Not done at this time. - -sche (discuss) 05:25, 13 February 2018 (UTC)
  • Irigwe
    Changed to Rigwe, based on the current name being erroneous. What little linguistic literature there is seems to have fully shifted over. I support the name change.
    Done. - -sche (discuss) 16:50, 22 February 2018 (UTC)
  • Palor
    Changed to Paloor, based on the language's new orthographic norms. There is very little linguistic literature on it, most of it in French, and it is difficult to tell what name is or will be more common. I abstain on the name change.
    The new spelling may well take hold, but it hasn't yet (and we have no contributors in the language who might clamor to use the new spelling when adding entries), so, not renamed at this time. We should reexamine this in a few years. - -sche (discuss) 16:50, 22 February 2018 (UTC)
  • Australian Sign Language
    Changed to Auslan, based on popular and linguistic usage. This does seem to be more common as the standard name, and is preferred by native speakers. I support the name change.
    Has been done, apparently.

The name of was also changed from "Khmer, Central" to "Khmer", but we have chosen to exclude this code, so it does not affect us.

Μετάknowledgediscuss/deeds 10:36, 31 January 2018 (UTC)

Where should discussions like this be archived? — SGconlaw (talk) 03:40, 22 February 2018 (UTC)
They are archived to Wiktionary:Language treatment/Discussions. If you're asking so you can archive them, you'd be better off leaving them for -sche. —Μετάknowledgediscuss/deeds 03:48, 22 February 2018 (UTC)
*Thumbs up* — SGconlaw (talk) 03:57, 22 February 2018 (UTC)
Oh, no, the more people who know how to do this kind of thing, the better! :) You just have to be sure all the requests have been handled (done/accepted, or rejected). But yes, WT:LTD is the catch-all for anything that doesn't have a more logical place to be archived; and then, if there's a change in how we treat a language (e.g. a language has been split between two codes), update WT:LT and link to the discussion. If a language code has been removed (e.g. merged into something else), I find that it's helpful (and prevents accidental re-creation) to leave a --placeholder comment in the module (as I did for e.g. "frc" and am slowly doing for other codes). - -sche (discuss) 16:42, 22 February 2018 (UTC)
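The placeholder practice described above can be sketched as follows. The real language data lives in Wiktionary's modules, not in Python; this is only an illustration of the idea that a retired code keeps a "tombstone" entry so that nobody re-creates it by accident. The names `ACTIVE`, `RETIRED` and `add_language`, the code `xx-example`, and the wording of the tombstone notes are all assumptions made for this example.

```python
# Codes currently in use (two taken from the discussion above, for illustration).
ACTIVE = {
    "oki": "Okiek",
    "kdz": "Kwaja",
}

# Tombstones for removed codes: the note plays the role of the placeholder comment.
RETIRED = {
    "frc": "removed; see the archived discussion",           # wording is an assumption
    "dih": "retired after being split into separate codes",  # wording is an assumption
}


def add_language(registry, code, name, retired=RETIRED):
    """Add a code to the registry unless it is retired or already present."""
    if code in retired:
        raise ValueError(f"{code!r} was retired ({retired[code]}); do not re-create it")
    if code in registry:
        raise ValueError(f"{code!r} is already assigned to {registry[code]!r}")
    registry[code] = name
    return registry


if __name__ == "__main__":
    registry = dict(ACTIVE)
    add_language(registry, "xx-example", "Example")   # hypothetical code: accepted
    try:
        add_language(registry, "dih", "Kumiai")       # retired code: rejected
    except ValueError as err:
        print(err)
```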


RFM discussion: March 2018

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Songhai to Songhay

Our spelling for the name of the Songhay languages is simply much less common, on Google, Google Books, or in the literature. Note that we will also need to change the name of Proto-Songhai if this changes. —Μετάknowledgediscuss/deeds 01:10, 14 March 2018 (UTC)

Done. - -sche (discuss) 18:12, 17 March 2018 (UTC)


Proto-Sarmato-Alanic

Copied from WT:Etymology scriptorium/2018/April.

OK, let me try this again. I'd like to have Alanic xln renamed to Sarmato-Alanic and have entries use xln-pro. Alanic and Sarmatian (which has no language code otherwise) occupy a dialect continuum, and neither might be the direct ancestor of Ossetian or Jassic. Alanic would then be made into an etymology-only code, xln-ala. Alternatively, a new language code could be created, ira-sma-pro, and xln removed altogether, but I think the former is the better option. @-sche, Tropylium --Victar (talk) 20:30, 3 April 2018 (UTC)

There are a couple problems here that I see just from a brief reading of the discussion. You want us to change to a name that is very rarely used instead of a more common one, and also switch to considering it a protolanguage rather than an attested one, despite the fact that it is actually attested (yes, we do that for Proto-Norse, but it is not ideal and is in large part because, as is relevant here, we try to use the most commonly used names where possible). —Μετάknowledgediscuss/deeds 01:19, 5 April 2018 (UTC)
Thanks for the reply, @Metaknowledge. 99% of the entries I'll be entering will be reconstructions, and most will be derived from Ossetian, not Alanic or Sarmatian borrowings. I think there is a ton of precedent for using alternative names for codes on wikt, but I'll concede that I can't think of any example of using the language code of a dialect to refer to a whole dialect family. I'm not opposed to using ira-sma-pro instead, but I do still think that the xln code should then be discontinued, because Alanic, Sarmatian and Proto-Ossetic should all be under the same code. --Victar (talk) 04:13, 5 April 2018 (UTC)
To give an example of what I had in mind for formatting descendant trees:
* Sarmato-Alanic {{l|xln-pro}}
*: Alanic: {{l|xln-ala}}
*: Sarmatian: {{l|xln-sar}}
** Ossetian: {{l|os}}
--Victar (talk) 04:22, 5 April 2018 (UTC)
The proportion that are reconstructions is irrelevant unless it's 100%. When we have mainspace entries, we should avoid assigning them to protolanguages (which are technically hypotheses) wherever possible. We always try to use the most common, unambiguous name possible, and if you know of any exceptions, we should see if they ought to be fixed. Basically, I think you're conflating the needs of descendant trees and the criteria for determining what ought to be a separate language. Bear in mind that regardless of what codes and names are, you can always structure descendant trees to show distinct dialects or sublects (Crom daba has done this quite fruitfully). —Μετάknowledgediscuss/deeds 04:48, 5 April 2018 (UTC)
@Metaknowledge, I think you're missing the point of my need. I want to create reconstructions of Proto-Ossetic and Sarmatian. Sarmatian and Alanic are well established as two separate dialects. --Victar (talk) 04:59, 5 April 2018 (UTC)
And if they're dialects, they shouldn't have separate codes. Remember, you can still give them separate lines and reconstructions, when and where those are supported by scholarly sources, regardless of the situation with codes. —Μετάknowledgediscuss/deeds 06:18, 5 April 2018 (UTC)
Exactly, @Metaknowledge, which is why Alanic shouldn't have its own code. --Victar (talk) 07:37, 5 April 2018 (UTC)
Then why did your example above have them with two different codes? —Μετάknowledgediscuss/deeds 16:22, 5 April 2018 (UTC)
Sorry, I tried to indicate that xln-ala and xln-sar were etymology-only codes by the dash in them, but I guess that wasn't clear (though I did make that point in my opening statement). We also have oos for Old/Proto-Ossetic, which we could use instead for the parent of Alanic/Sarmatian/Ossetic. We currently list it below xln, and I've always considered it a stage between, yet MultiTree seems to use it as their catch-all. Again though, I'm not opposed to using a new ira-sma-pro code. --Victar (talk) 17:34, 5 April 2018 (UTC)
We create etymology-only codes for etymology sections. If there's a language that derives terms from both Alanic and Sarmatian with a meaningful difference between the two, then those codes should exist, but we shouldn't create them just for descendant lists, which can be freely formatted. —Μετάknowledgediscuss/deeds 18:39, 5 April 2018 (UTC)
  • Here's a different idea. Alanic and Sarmatian are both barely attested; from my brief reading of the literature, it seems to be unclear whether or not they represent dialects or fully separate speech communities, and whether they represent the ancestor of Ossetian or a close relative (and given the timescales over which they are attested, they cannot be the same thing as a protolanguage, which is a hypothesis of the most recent common ancestor). The resultant action would be to have separate codes for Alanic, Sarmatian, and Proto-Ossetic, with the former two only in mainspace (in original script, e.g. Greek) and the latter only in Reconstruction space (in normalised form). As always, you can format descendant lists however you like. Does that seem like it would make sense? —Μετάknowledgediscuss/deeds 18:39, 5 April 2018 (UTC)
@Metaknowledge: No, I don't agree with that solution. Whether they exhibit the same exact timeframe is irrelevant and labeling Sarmatian and Ossetic as Alanic is inaccurate. What my sources are reconstructing is a common ancestor of all three of these dialectal branches. See https://ibb.co/jFEnSH. --Victar (talk) 19:01, 5 April 2018 (UTC)
Your response is confusing; I did not suggest labelling Sarmatian and Ossetic as Alanic (in fact, I suggested separating all three), and I was under the impression that Proto-Ossetic is the unattested ancestor of all these languages. —Μετάknowledgediscuss/deeds 19:30, 5 April 2018 (UTC)
Am I understanding this correctly that it is similar to the problem that we have had/are having with Sanskrit and the Prakrits, namely that there is a dialectal continuum between Sarmatian, Alanic, and the unattested ancestor of Ossetian? —*i̯óh₁n̥C 19:24, 5 April 2018 (UTC)
@JohnC5: Not exactly. Sarmatian, Alanic and Ossetic are all largely unattested dialects of a single language of the Middle Iranian period. So what I'm suggesting is that we unify them under a single code and name, and have the dialects differentiated only by etymology-only codes. I'm recommending the name Sarmato-Alanic, which is what I mostly see in literature when referring to them as a whole, but if I had to choose to unify them under one name out of the three, it would be Ossetic, being the only one with modern descendants. What code we use, be it a repurposed one, or a new one, doesn't matter to me. --Victar (talk) 20:26, 5 April 2018 (UTC)
@Metaknowledge: Would you support the idea of unifying Sarmatian and Alanic under Old Ossetic oos (as per MultiTree) with both reconstructed and mainspace entries, and making xln an etymology-only code? That should resolve your Proto-Norse argument. --Victar (talk) 22:37, 5 April 2018 (UTC)
I see that the name "Old Ossetic" is broadly attested (when spelt correctly), and many sources seem to equate it with Alanic, but that doesn't mean Sarmatian should necessarily be merged as well. If they are indeed dialects as you claimed, then that would be perfectly fine. WP cites EB for the following: "The languages of the Scytho-Sarmatian inscription may represent dialects of a language family of which Modern Ossetian is a continuation, but does not simply represent the same language at an earlier time." If that is true, then Sarmatian should be kept separate. —Μετάknowledgediscuss/deeds 23:29, 5 April 2018 (UTC)
@Metaknowledge, there is no code for Sarmatian. If we put Alanic under Old Ossetic as part of a dialectal continuum, Sarmatian, as a dialect thought to be very similar to Alanic, should unequivocally be included. Otherwise it defeats the point. --Victar (talk) 23:49, 5 April 2018 (UTC)
I would be in favor of Old Ossetic for the continuum of Alanic and Ossetic and, if it can be demonstrated as true, Sarmatian. In what way can we adjudicate this Sarmatian situation? As mentioned before, I'm getting a little bit frustrated at continuously running across this "continuum problem" (attested languages descending from unattested near neighbors). There's been a fair amount of research saying that the phylogeny of language change tends to be binary in nature, but that depends on how you look at language continua versus dialectally diverse super-languages. I'd be interested to think about the principled use of language continua in our language data (like "oss-cnt" or the like), not just the "substrata" we use in the etymology-only language data. The question is in the utility of such a demarcation, but the inherent assumption of our current n-ary (or perhaps my theoretically binary) branching system tends to omit this subtlety of language change because frequently these continua are not protolanguages and exist clearly in the data... I dunno. —*i̯óh₁n̥C 00:07, 6 April 2018 (UTC)
Ignoring John's tangent... I know there is no code for Sarmatian. That's immaterial; we can make one if we deem it necessary. You claim that Sarmatian is very similar to Alanic, to the point of being a continuum; I know little about this, but found a scholarly source that claims otherwise. Can you respond to that with actual evidence? —Μετάknowledgediscuss/deeds 00:16, 6 April 2018 (UTC)
@JohnC5, Metaknowledge:
  1. "Sarmatian and Alanic represent a dialect continuum" and "it is difficult to draw the line between Sarmatian and Alanic".
  2. "Ossetic is the last remnant of the essentially unknown Middle Iranian dialect area that included Sarmatian, and is said to descend from Alanic."
  3. " is the sole surviving descendant of the Northeast Iranian dialects of the ancient Scythians and Sarmatians and medieval Alans".
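  4. "Deine klare linguistische Scheidung zwischen Sarmatisch und Alanisch aufgrund der Materiallage nicht möglich ist" (roughly: "a clear linguistic distinction between Sarmatian and Alanic is not possible given the state of the material")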
Even if Alanic and Sarmatian were divergent enough to call separate languages, that distinction isn't apparent in the little material we have, so to reconstruct them separately at this time would be folly. --Victar (talk) 01:20, 6 April 2018 (UTC)
Thanks, that seems like good evidence for merger. I am now satisfied with having "Old Ossetic" as an L2 header with categorising context labels for the dialects. I would like to wait a couple of days just in case anyone raises an objection, so please ping me with a reminder. Also, please clarify if there are any etymology sections that need to distinguish between the dialects; if not, we can dispense with etymology-only codes and simply remove xln. —Μετάknowledgediscuss/deeds 04:44, 6 April 2018 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── @Metaknowledge, if you have a moment, I would appreciate you making these changes. Thanks. --Victar (talk) 03:10, 16 April 2018 (UTC)

@Victar, could you please respond to the query in the last sentence of my last comment? —Μετάknowledgediscuss/deeds 04:39, 16 April 2018 (UTC)
@Metaknowledge, yes, we need the etymology-only codes as well, not really for the linguistic distinction, but for the historical one, i.e. the names of Alanic kings. --Victar (talk) 04:44, 16 April 2018 (UTC)
Done. —Μετάknowledgediscuss/deeds 05:33, 16 April 2018 (UTC)
Changes look good. Thanks, @Metaknowledge! --Victar (talk) 05:35, 16 April 2018 (UTC)

References

  1. ^ Novák, Ľubomír (2013) Problem of Archaism and Innovation in the Eastern Iranian Languages (PhD dissertation), Prague: Univerzita Karlova v Praze, filozofická fakulta
  2. ^ Fortson, Benjamin W. (2004) Indo-European Language and Culture: An Introduction, first edition, Oxford: Blackwell
  3. ^ Kim, Ronald (2013) “On the Historical Phonology of Ossetic: The Origin of the Oblique Case Suffix”, in Journal of the American Oriental Society 123.1
  4. ^ Schmitt, Rüdiger, editor (1989), Compendium Linguarum Iranicarum, Wiesbaden: Reichert Verlag

RFM discussion: August 2016–June 2018

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Some more missing South American languages, 2

Here are a few more South American languages for which we could add codes:

  • Coeruna / Koeruna (sai-coe), said to be attested.
  • Conambo language (sai-cnb) is sometimes considered a dialect of the Záparo language, but Loukotka has samples, which show differences: "head" is ku-anak in Zaparo, ku-anaka in Conambo, "eye" is nu-namits (Z) / ku-iyamixa (C), "fire" unamisok (Z) / umani (C), "woman" itumu (Z) / maxi (C). OTOH, telling it apart from what a number of references refer to as the Zaparo spoken in the Conambo river could be non-trivial, so perhaps treating it as a dialect would be easier...
    Meh, left as a dialect. - -sche (discuss) 20:28, 4 June 2018 (UTC)
  • Koihoma language should probably be put off until we can confirm it as a distinct lect; its alt names are all the names of other languages...
  • Yao language (Trinidad) (sai-yao), attested in a single wordlist.

- -sche (discuss) 21:18, 16 August 2016 (UTC)

RFM discussion: March–June 2018

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming siz

We currently call this "Siwa". That's the name of the oasis, but the proper name for the language, which sees far more use in the literature and in general, is "Siwi". —Μετάknowledgediscuss/deeds 01:26, 14 March 2018 (UTC)

Done. - -sche (discuss) 19:05, 4 June 2018 (UTC)


RFM discussion: March–June 2018

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming cak

We currently call this "Cakchiquel". Language-learning materials and linguistic literature generally prefer the de-hispanicised spelling "Kaqchikel", as can be seen at Google Books. —Μετάknowledgediscuss/deeds 04:41, 17 March 2018 (UTC)

Done, although I note that Kaqchikel seems to have displaced Cakchiquel only recently (circa 1995). Note also the naming of ckz. - -sche (discuss) 19:19, 4 June 2018 (UTC)


RFM discussion: July 2019

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Converting Proto-Tibeto-Burman to an etymology-only language

I propose that we treat Proto-Tibeto-Burman as an etymology-only variant of Proto-Sino-Tibetan as it appears that the consensus among linguists in the field is that the set of non-Sinitic Sino-Tibetan languages is not monophyletic. (For a parallel, we treat Proto-Baltic as an etymology-only variant of Proto-Balto-Slavic for the same reason.) By the same token, I propose that we change all languages and families currently listed as family = "tbq" to family = "sit" instead. Wyang has said on his talk page that he isn't opposed to the idea, and I don't know who else here is working on Sino-Tibetan issues. What do other people think? —Mahāgaja · talk 14:32, 19 July 2019 (UTC)

Three days with no response; I'm doing it now. —Mahāgaja · talk 19:05, 22 July 2019 (UTC)
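The family reassignment proposed above can be pictured with a minimal sketch. It assumes a hypothetical Python dictionary standing in for the language data (the real data lives in the site's modules, not in Python), and the codes `aaa`, `bbb` and `ccc` are placeholders rather than real languages.

```python
def reassign_family(languages, old="tbq", new="sit"):
    """Return a copy of the table in which every entry whose family is
    `old` has been moved to `new`; all other entries are left as they are."""
    updated = {}
    for code, data in languages.items():
        entry = dict(data)  # copy, so the input table is not mutated
        if entry.get("family") == old:
            entry["family"] = new
        updated[code] = entry
    return updated


# Placeholder entries (hypothetical, for illustration only).
sample = {
    "aaa": {"name": "Example A", "family": "tbq"},
    "bbb": {"name": "Example B", "family": "sit"},
    "ccc": {"name": "Example C", "family": "ine"},
}
result = reassign_family(sample)
```

Returning a copy rather than editing in place makes it easy to compare the table before and after the change.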


RFM discussion: July–August 2018

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Wanji languages

Currently the canonical name of ira-wnj is Wanji which is also used by wbi. One of them should be changed. We also have wny which is Wanyi. DTLHS (talk) 18:27, 22 July 2018 (UTC)

(Also similar: wdd Wanzi / Wandji.) I've renamed ira-wnj to "Vanji", which is the spelling Wikipedia uses anyway. We don't have much content in either language, just a few entries in descendant trees for the Iranian one and a few translations for the Bantu one, but yes, it causes problems (starting with conflated categories) if they have the same name. - -sche (discuss) 15:08, 24 July 2018 (UTC)
I was wondering what the heck happened to this. --Victar (talk) 22:43, 3 August 2018 (UTC)
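The clash described in this thread (two codes sharing one canonical name) is the kind of thing a small check can surface. The sketch below assumes a hypothetical mapping of codes to canonical names, filled with the three codes mentioned above; it is not how the site's own data checks work.

```python
from collections import defaultdict


def find_name_collisions(canonical_names):
    """Group codes by canonical name and return only the names that are
    shared by more than one code."""
    by_name = defaultdict(list)
    for code, name in canonical_names.items():
        by_name[name].append(code)
    return {name: sorted(codes) for name, codes in by_name.items() if len(codes) > 1}


# Names as they stood before the rename discussed above.
names = {"ira-wnj": "Wanji", "wbi": "Wanji", "wny": "Wanyi"}
collisions = find_name_collisions(names)

# After renaming ira-wnj to "Vanji" the collision disappears.
renamed = dict(names, **{"ira-wnj": "Vanji"})
collisions_after = find_name_collisions(renamed)
```

Near-homographs such as "Wanyi" are deliberately not flagged here; only exact duplicates conflate categories.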


RFM discussion: September–December 2019

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming from Inupiak to Inupiaq

Searches for "speak X" and "X language" show that the spelling Inupiaq is much more common than Inupiak, which is what we currently use. —Μετάknowledgediscuss/deeds 20:59, 12 September 2019 (UTC)

Done. —Mahāgaja · talk 10:09, 24 October 2019 (UTC)

I've moved all the categories to the new spelling, but it would be great if someone with a bot could change all L2 headers from ==Inupiak== to ==Inupiaq==. —Mahāgaja · talk 10:12, 24 October 2019 (UTC)
Also all instances of "Inupiak" in Translation sections (including |langname=Inupiak inside {{t-simple}}). —Mahāgaja · talk 22:31, 24 October 2019 (UTC)
@Mahagaja, Erutuon: Has this been done, and if not, can someone with a bot make it so? —Μετάknowledgediscuss/deeds 07:24, 24 December 2019 (UTC)
@Metaknowledge, Erutuon: I don't know if there are any L2 headings left using the old spelling, but there are definitely still translation sections calling it "Inupiak". I don't have a bot (or the remotest idea how to write one); @DTLHS, Benwing2, Ruakh, is this something one of y'all's bots would be interested in doing? —Mahāgaja · talk 08:02, 24 December 2019 (UTC)
Happily all the "Inupiak" headers were caught quickly because they showed up on my incorrect headers page. I went and fixed all cases in translation sections with JWB because there were less than a hundred. — Eru·tuon 10:15, 24 December 2019 (UTC)
Glad to hear it! I'm still hoping a bot runner will take care of the translations sections. —Mahāgaja · talk 15:10, 24 December 2019 (UTC)
Erutuon in his last comment: "I went and fixed all cases in translation sections". —Μετάknowledgediscuss/deeds 18:26, 24 December 2019 (UTC)
Oh right. Never mind. —Mahāgaja · talk 18:52, 24 December 2019 (UTC)
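For reference, the clean-up requested in this thread (L2 headers, translation-list labels, and `|langname=` parameters) could be sketched roughly as below. This is only an illustration using Python's standard `re` module on a page's wikitext; the sample page and the word `foo` are made up, and a real bot would also need to fetch and save pages, which is left out here.

```python
import re

OLD, NEW = "Inupiak", "Inupiaq"

# Level-2 header only: "==Inupiak==" alone on its line, not "===...===".
L2_HEADER = re.compile(r"^==[ \t]*" + re.escape(OLD) + r"[ \t]*==[ \t]*$", re.MULTILINE)
# Language label at the start of a translation-list line, e.g. "* Inupiak: ...".
TRANS_LINE = re.compile(r"^(\*:?[ \t]*)" + re.escape(OLD) + r"(?=:)", re.MULTILINE)
# Template parameter such as "|langname=Inupiak" before "|" or "}}".
LANGNAME = re.compile(r"(\|\s*langname\s*=\s*)" + re.escape(OLD) + r"(?=\s*[|}])")


def rename_language(wikitext):
    """Apply the three replacements and return the updated wikitext."""
    text = L2_HEADER.sub("==" + NEW + "==", wikitext)
    text = TRANS_LINE.sub(lambda m: m.group(1) + NEW, text)
    text = LANGNAME.sub(lambda m: m.group(1) + NEW, text)
    return text


# Made-up sample page for illustration.
sample = (
    "==Inupiak==\n"
    "\n"
    "===Noun===\n"
    "* Inupiak: {{t|ik|foo}}\n"
    "* {{t-simple|foo|langname=Inupiak}}\n"
    "===Inupiak notes===\n"
)
renamed = rename_language(sample)
```

The patterns are anchored narrowly on purpose, so that running text and lower-level headings that merely mention the old spelling are left for manual review.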


RFM discussion: August–September 2018

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Kamkata-viri merger

We have language codes for both Kamviri xvi and Kati/Kata-vari bsh, even though both are dialects of Kamkata-viri. I'd like to merge them under either code and change the canonical name to Kamkata-viri. @-sche, Metaknowledge --Victar (talk) 22:31, 3 August 2018 (UTC)

Support. The phrasing in Liljegren (2016) that you linked to leads me to think that perhaps bsh is the code we should keep, and xvi the one we should retire, although it's wholly arbitrary. Thanks for paying attention to Nuristani, in any case. —Μετάknowledgediscuss/deeds 23:07, 3 August 2018 (UTC)
Kata-vari is the larger of the two by at least five-fold, which on one hand would make it the logical encompassing one, but bsh is actually named for its eastern subdialect Bashgali (taken from Dardic), making it a somewhat inaccurate code to begin with, but I don't really care either way. Here is Strand's tree. --Victar (talk) 23:39, 3 August 2018 (UTC)
And here is from Strand's 1974 paper, back when he used to refer to Kati as being the parent language of all three dialects: "Kati (Bašgalī) has three major dialects: Katə́viri, Kamvíri, and Mumvíri. Katə́viri is spoken by members of the Katə́ tribe. It is divided into two major subdialects: Western Katə́viri and Eastern Katə́viri. Western Katə́viri is further subdivided into the dialects of Ramgə́l, Kulám, Ktívi (Kantivo), and Pə́řuk (Papruk)". I'm also fine keeping it named Kati, despite it perhaps being outdated, if that's preferable to others. --Victar (talk) 00:05, 4 August 2018 (UTC)
Support the merger. As for the name: on one hand, "Kati" seems to be about twice as common even when I search for the names together with "Nuristani" to filter out the New Guinea-area language, but on the other hand the absolute number of works that mention either name is small, so if "Kati" is dated and "Kamkata-viri" is preferred these days, and "Kamkata-viri" would also make it clearer to readers, etc., that we're considering all the dialects under that one header (and that we're not talking about kti, Muyu / Kati), then go ahead and use "Kamkata-viri" (with the other names as alt names). - -sche (discuss) 00:58, 7 August 2018 (UTC)

Done. --Victar (talk) 16:57, 5 September 2018 (UTC)

RFM discussion: December 2019–January 2020

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming

Currently "Ruund", which is also what WP uses, but "Ruwund" is more common in the literature. —Μετάknowledgediscuss/deeds 07:20, 24 December 2019 (UTC)

Renamed. Searching the site, the only pages I see using "Ruund" are categories (in family trees) and Reconstruction pages (using {{desc}}), which should update automatically; I think there is, therefore, nothing that needs changing(?) besides the three modules I just changed. - -sche (discuss) 08:35, 11 January 2020 (UTC)


RFM discussion: July 2016–January 2020

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Dungan is technically Mandarin, or a dialect of Mandarin

Hi, I'm not sure whether this is the right venue for this discussion, but I would like to bring up Dungan, which is spoken in Central Asia. According to Wikipedia, Ethnologue, and Glottolog, this is a Chinese language, specifically Mandarin. There are only 20 entries in Wiktionary that are for Dungan, and all of them are Mandarin words, with some from the Gansu and Shaanxi dialects. The difference, however, is that, Dungan is written in Cyrillic. In Wiktionary, all Chinese dialects are merged into one single Chinese entry, and pronunciations are listed. Shouldn't we do that, or at least partially, for Dungan? Please feel free to comment. Thanks. --Mar vin kaiser (talk) 14:54, 16 July 2016 (UTC)

Re venue: this is indeed a venue where discussions of merging/splitting language codes and categories and entries take place. I tend to put my "biggest" proposals (ones that needed votes in the past, or that concern major or controversial languages that I suspect will need votes) in the WT:BP, but here is OK.
Wiktionary:About Chinese#The_Chinese_lects and Wiktionary:Votes/pl-2014-04/Unified_Chinese intentionally left lects that don't use Hanzi separate from Unified Chinese, but I don't know if that was because Chinese editors felt they should never be merged, or just felt that merging them would be difficult and best attempted after everything else had been merged. It would obviously be possible to merge Dungan and other such lects if Chinese editors wanted to; we have plenty of other languages which use multiple scripts (e.g. Afrikaans). However, the various Chinese lects which are distinctive to the point of potentially being not-mutually-intelligible when spoken were able to be unified here because they share a written form in which they are theoretically mutually intelligible. If Dungan is potentially not intelligible with lects from other areas (lects that differ from Mandarin enough that speakers don't understand it without study) in either speech or writing, then what would be the basis for unifying them? - -sche (discuss) 15:42, 16 July 2016 (UTC)
Actually, Shaanxi and Gansu Mandarin are mutually intelligible with Dungan. Furthermore, a large majority of Dungan vocabulary is from Chinese and therefore has equivalent Chinese entries written in Chinese characters. There is also some Russian and Turkic vocabulary. My suggestion is to leave Dungan loanwords from Russian and Turkic as written in Cyrillic, and merge the Dungan Chinese words with Chinese entries, perhaps leaving the Cyrillic entries for those words in place, the way Chinese pinyin and Japanese romaji entries are left. --Mar vin kaiser (talk) 15:53, 16 July 2016 (UTC)
Also, I speak Mandarin, and I tried listening to Dungan videos on YouTube. They're actually understandable for the most part. As in, I can write down what they're saying in Chinese, except for some words (presumably Russian and Turkic loanwords). --Mar vin kaiser (talk) 16:04, 16 July 2016 (UTC)
  • I am not opposed to this, but it would require that Dungan orthography be incorporated into the relevant Chinese templates, so @Wyang's aid and support will be critical. —Μετάknowledgediscuss/deeds 16:21, 16 July 2016 (UTC)
  • I support this. Cyrillic Dungan forms can be added to {{zh-pron}}, under Mandarin. Wyang (talk) 00:25, 17 July 2016 (UTC)
    • The main caveats are that Cyrillic is apparently the standard script for Dungan, unlike with Pinyin or Romaji, and there may be some vocabulary that only exists in Cyrillic. I suppose the writing systems for Hokkien might be analogous, though. Chuck Entz (talk) 01:07, 17 July 2016 (UTC)
  • This doesn't feel right to me. I think it would be like folding Maltese into Arabic, or merging Hindi and Urdu. I foresee a lot of complaints from anons if entries like дянхуа and شِيَوْ عَر دٍ have a ==Chinese== heading, and I would find it disconcerting myself, too. And what would the definition then say? {{lb|zh|Dungan}} {{form of|Cyrillic script|電話|lang=zh}}? I think readers would find that more confusing than helpful. And then what about the Russian and Turkic loanwords that don't exist in China? They would have to have full definitions without a link to a Hanzi entry, and that would probably baffle readers even more, despite the {{lb|zh|Dungan}} tag. —Aɴɢʀ (talk) 14:14, 17 July 2016 (UTC)
    Hindi and Urdu should be merged — they are separate for political reasons. Remember, we allow for Afrikaans entries in Arabic script, Old French entries in Hebrew script, and other odd happenstances of historical script usage. We can continue to use the Dungan header for words only existing in Cyrillic form, just like I believe we do for Min Nan. —Μετάknowledgediscuss/deeds 16:10, 17 July 2016 (UTC)
Undecided for now. There are pros and cons. Cyrillic and Arabic spellings could potentially be added to each Mandarin standard pinyin syllable (non-standard could also be considered if confirmed). Multisyllabic only for confirmed ones.
Mandarin pinyin (with tone marks and monosyllabic tone numbered syllables), Min Nan POJ, zhuyin characters have not been "unified" under the Chinese umbrella for various reasons. Some are described above.--Anatoli T. (обсудить/вклад) 07:39, 18 July 2016 (UTC)
I think it would be more convenient for editors if Dungan were part of unified Chinese, since it would be easier to edit. It would just feel like too repetitive if I made a new Dungan entry that technically already has an equivalent Chinese entry. --Mar vin kaiser (talk) 08:32, 18 July 2016 (UTC)
Should I bring this somewhere else to a vote whether Dungan should be merged into the unified Chinese? --Mar vin kaiser (talk) 16:08, 22 July 2016 (UTC)
@Wyang: if you still support folding Dungan into Chinese, how best could that be accomplished? Make the attested Cyrillic (and Arabic?) forms soft redirects...? (Would the L2 header of e.g. фонзы be "Chinese" or remain "Dungan"?) What is to be done if, as several users worry above, there are loanwords that don't have Han-character representations? - -sche (discuss) 04:48, 21 January 2018 (UTC)
@-sche I'm now also concerned about the complexity of this... a possible solution is to have {{zh-pron}} link to the Dungan word, but keep the Dungan L2 heading, and treat Dungan as a full language with the Cyrillic entry showing full usage examples, etc. and linking back to the Hanzi form, perhaps on the headword line. Wyang (talk) 07:03, 21 January 2018 (UTC)
@Wyang, -sche, Justinrleung, Suzukaze-c, Tooironic: Perhaps attested Dungan Cyrillic and Arabic syllables and multisyllabic words could be allowed into {{cmn-pron}}? E.g. йүян (yüi͡an) and يُوْيًا (not sure if the latter is right) in 語言/语言 (yǔyán)? Each or almost each standard pinyin syllable should have a corresponding Dungan Cyrillic syllable, so monosyllabic entries may have them by default. --Anatoli T. (обсудить/вклад) 07:28, 21 January 2018 (UTC)
I think it is simple. For special words w/o hanzi, do it like the Min Nan lo͘-lài-bà entry, otherwise do it like the Min Nan phōng-kó entry.
{{zh-pron}} > Mandarin > Dungan (Cyrillic), Dungan (Xiao'erjin) doesn't seem problematic to me.
Cyrillic quotations are welcome on the hanzi entries. —suzukaze (tc) 07:45, 21 January 2018 (UTC)
(I just noticed, phōng-kó uses the Chinese header. I think it should use the Min Nan header, similar to how pinyin entries use the Mandarin header. But that's a separate discussion, I think. —suzukaze (tc) 07:46, 21 January 2018 (UTC))
We had a discussion on this but I don't remember where. I'm OK to use the Chinese L2 header on all romanised entries if they link to (and consequently, have a written form in) hanzi. --08:03, 21 January 2018 (UTC)
@Wyang, Suzukaze-c, Atitarev, Mar vin kaiser I have made some trial edits to Module:zh-pron, but there are currently some problems:
  1. As the Cyrillic script does not indicate tone and merges some consonants, the Cyrillic form is generated from a pinyin-like romanisation instead. We need Wiktionary:About Dungan to document it. Also, the current pinyin-like romanisation includes some redundant vowels.
  2. The IPA values are from Wikipedia and need checking.
  3. I don't know how the neutral tone works, nor whether there is tone sandhi or erhua in Dungan.
  4. Some values cannot be generated by the current pinyin-like romanisation, e.g. чў=q+u (not ü), which is not a valid Pinyin syllable, and ңыйлу (="ng+er+y" lou).
  5. We need a module to transliterate Cyrillic Dungan entries.

--Zcreator (talk) 21:17, 23 January 2018 (UTC)

My thoughts:
  • Instead of Template:docparam, it should be Template:docparam for consistency.
  • I don't think we should make a "pinyin". I think we should input the original Cyrillic directly, annotated with tones (somehow).
  • There is definitely erhua.
suzukaze (tc) 05:13, 24 January 2018 (UTC)
I have removed the pinyin-like romanisation. However, points 2 and 3 still need to be solved.--Zcreator (talk) 09:05, 24 January 2018 (UTC)
I oppose the use of tones in the transliteration for Dungan if it's reintroduced. The Cyrillic spellings have no tones and there should be no tone numbers or marks in the romanisation. Further, the transliteration should show what's actually written, not match the Mandarin pinyin, e.g. йүян (yüi͡an) is actually "yüyan" (without a space), not "yu3 yan2", but perhaps there's no need to provide any additional transliteration. --Anatoli T. (обсудить/вклад) 10:56, 24 January 2018 (UTC)
We have tone and length annotations for every other language whose orthography does not reflect relevant phonemic differences. Why should this be different? Korn (talk) 12:08, 24 January 2018 (UTC)
Just to be clear, I was talking about the transliteration, not IPA. Anatoli T. (обсудить/вклад) 20:19, 24 January 2018 (UTC)
@Atitarev: I don't think the Cyrillic in Dungan should be referred to as a transliteration, but as a script of its own. Therefore, I agree with you that tones shouldn't be put in the Cyrillic writing precisely because it is a script, and it is written without the tone, but the IPA should definitely provide the tone, since the script traditionally doesn't write it down. --Mar vin kaiser (talk) 06:30, 25 January 2018 (UTC)
@Mar vin kaiser: I don't think Atitarev is referring to the Cyrillic as transliteration. He's talking about the romanization of the Cyrillic. I found out from Omniglot that there are two conventions used for indicating tones: -, ъ, ь or I, II, III. We should probably adopt one of these two. — justin(r)leung (t...) | c=› } 07:05, 25 January 2018 (UTC)
@Justinrleung: I see. Then it's settled then, I guess. For cognates with other Chinese languages, a unified Chinese entry will be used with the Cyrillic entry still existing the same way POJ entries in Min Nan still exist. The Dungan portion of the pronunciation would then contain the Cyrillic entry or the Cyrillic script with tone marks? What do you think? Maybe the Cyrillic script with tone marks, but it would link to the Cyrillic entry without tone marks? --Mar vin kaiser (talk) 07:20, 25 January 2018 (UTC)

The module can't handle variants separated by commas, e.g. dg=Җун1гуй1,Җун1гуә2 in 中國/中国 (Zhōngguó). --Anatoli T. (обсудить/вклад) 14:01, 25 January 2018 (UTC)
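To make the request above concrete, here is a rough sketch of how a comma-separated parameter value could be split into variants and then into syllable/tone pairs. It is written in Python purely for illustration (the module itself is not Python), and it assumes that each syllable is followed by exactly one tone digit, as in the example value quoted above.

```python
import re

# One syllable = a run of characters that are neither digits nor commas,
# followed by a single tone digit (an assumption based on the example).
SYLLABLE = re.compile(r"([^\d,]+)(\d)")


def parse_variants(value):
    """Split a value such as 'A1B1,A1C2' into a list of variants, each a
    list of (syllable, tone) pairs."""
    variants = []
    for variant in value.split(","):
        variant = variant.strip()
        if variant:
            pairs = [(syl, int(tone)) for syl, tone in SYLLABLE.findall(variant)]
            variants.append(pairs)
    return variants


parsed = parse_variants("Җун1гуй1,Җун1гуә2")
tones = [[tone for _, tone in variant] for variant in parsed]
```

Splitting on commas first keeps each variant independent, so any later step (IPA generation, linking to the toneless Cyrillic entry) can run once per variant.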

@Atitarev I found a new source for Dungan. It shows more dialects and more tones compared to the Russian source. --Mar vin kaiser (talk) 16:32, 13 February 2018 (UTC)


RFM discussion: December 2017

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming fay

"Southwestern Fars" is a really awful name. Fars is not a macrolanguage as the name would suggest, but a province of Iran, and there are other lects spoken in southwestern Fars. We would do much better to use the unambiguous name "Kuhmareyi", as used by Wikipedia. —Μετάknowledgediscuss/deeds 19:44, 9 December 2017 (UTC)

@Eeranee, who has been adding entries in it, may have an opinion. —Μετάknowledgediscuss/deeds 20:53, 9 December 2017 (UTC)
Hi, Metaknowledge. When I first wanted to create the category for the language, I searched online to see if I could find Kuhmareyi. I could not find online sources mentioning Kuhmareyi other than Wikipedia. Online sources mostly mention Southwestern Fars. I know Fars is a province. Ethnologue and some other sites use this name. I am trying to find the name Kuhmareyi in the book that I have, A Treasury of the Dialectology of Fars. I should also mention that I am very cautious in adding new words: I double-check them, and I have not recorded some words that cannot be found in other sources. However, the book is written by a professional linguist and is funded or guided by the Iranian Academy of Persian Language and Literature. I think I saw Kuhmareyi in one online source. I have not found the name "Kuhmareyi" so far, but the name seems correct.--Eeranee (talk) 21:09, 9 December 2017 (UTC)
@Eeranee: It sounds like you're being a very conscientious editor, so thank you! Do you think "Southwestern Fars" really is the most used name? If so, I guess we'd be wrong to do otherwise. —Μετάknowledgediscuss/deeds 21:29, 9 December 2017 (UTC)
I can find online sources in Persian using Kuhmareyi, which might not be reliable, but I could not find a source written in English using Kuhmareyi. If anyone thinks Kuhmareyi is more correct, we can use it.--Eeranee (talk) 22:06, 9 December 2017 (UTC)
Searching Google Books for both names, I find nothing that would help us much to decide one way or the other: nothing using "Kuhmareyi", and only a couple of general reference works on world languages that mention "Southwestern Fars" in giant lists of languages, probably just copying Ethnologue. (Wikipedia curiously says "the southwestern dialects can be divided into three families of dialects according to geographical distribution and local names: Southwestern (Lori), South-central (Kuhmareyi) and Southeastern (Larestani)", as if calling it "southwestern" might be slightly confusing/misnomial.) @ZxxZxxZ, Vahagn Petrosyan, do either of you have a preference or knowledge of which name is more common or appropriate? - -sche (discuss) 19:19, 17 December 2017 (UTC)
I don't know anything about this. --Vahag (talk) 18:38, 18 December 2017 (UTC)


RFM discussion: June 2018–June 2020

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming the Manda languages

We currently call "Nyasa", and it came to my attention due to the entry mtua, which User:Wikitiki89 mistakenly created as a "Nyasa" entry, based on an obsolete dictionary of Chichewa that calls it "Kiniassa". Nowadays, Nyasa is usually a name for a group of closely related languages including Chichewa (Nyasa languages) but WP claims it is also used for mjh. Following WP, we could call it "Manda (Tanzania)", rendered problematic by the two other Manda language possibilities, which we call "Manda" and "Australian Manda" (!). I propose that we go all in for national disambiguation and make those parenthetical as well, as "Manda (India)" and "Manda (Australia)". @-sche —Μετάknowledgediscuss/deeds 06:18, 28 June 2018 (UTC)

Just for the record, I did a lot of research to determine what the correct language code was for the language described in that dictionary, even referring to this map (source) and strongly considering "mjh". Thanks for finally sorting it out! Note also, there is a "Nyasa" translation at water, which does not align with Chichewa "madzi" (which the dictionary I used gave as "madsi"). --WikiTiki89 15:07, 28 June 2018 (UTC)
@Wikitiki89: Next time, try Google instead of poring over maps. :) I was already aware of this dictionary (and its miserable orthography), but searching "Rebman Kiniassa" gets you the Wikipedia article for Johannes Rebmann, which in turn tells you this is Chichewa. As for the "Nyasa" translation, masi... I really don't know what language that should be. The word for "water" in mgs is máchi. —Μετάknowledgediscuss/deeds 17:16, 28 June 2018 (UTC)
I did Google quite a bit, and I probably did look at the w:Chichewa Wikipedia article, but didn't trust that it was accurately citing the dictionary. --WikiTiki89 17:21, 28 June 2018 (UTC)
WP says Australian Manda (zma) is also called "Menhthe", but the only published resources on it I could find used Manda. To distinguish it and and , the proposal of ="Manda (Tanzania)" and ="Manda (India)" and ="Manda (Australia)" sounds good. (As an aside, there is also a "Ma Manda" language.) - -sche (discuss) 03:11, 21 January 2020 (UTC)


RFM discussion: June 2018–June 2020

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming mjh > Merging mjh

Tangentially relevant to the discussion of languages named "Nyasa" above, mjh is not only one of those, but is also called "Mwera" (a name currently occupied by mwe, which can't even be distinguished by a parenthetical, because both languages are from Tanzania!). Maho's Guthrie List, the standard list used by Bantuists, calls it "Mbamba Bay Mwera", but the only hits for that string on Google Books are of that very list. WP chooses to simplify this as just "Mbamba Bay", which is both the most unambiguous option, and also an option that seems to be used only there. I'm really unsure what to call it — anything but our current name of "Nyanza", which refers to the lake and offers only more confusion. —Μετάknowledgediscuss/deeds 06:27, 28 June 2018 (UTC)

@-sche, Mahagaja: Any thoughts on this (or the above discussion)? —Μετάknowledgediscuss/deeds 03:19, 17 January 2020 (UTC)
@Metaknowledge Can they be distinguished by linguistic family, like WT:RFM#Canonical_name_of_"fan"? Say, "Mwera (Nyasa)" vs "Mwera (Ruvuma)" or whatever? Btw, we should standardize on putting the family in parentheses (where we also put geographic disambiguators), which will require renaming some languages that were named with the family first, like Papuan Mor (but not Sepik Iwam, which is apparently normally called that, to distinguish it from the other Iwam that is also Sepik). - -sche (discuss) 07:05, 17 January 2020 (UTC)
I don't know anything about these languages, but if indeed both mjh and mwe are usually called Mwera and both are spoken in Tanzania, then I'd support "Mwera (Nyasa)" and "Mwera (Ruvuma)". —Mahāgaja · talk 07:29, 17 January 2020 (UTC)


RFM discussion: April 2017–August 2020

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Misc language code retirements (2017)
  • Moksela language —Μετάknowledgediscuss/deeds 04:15, 4 April 2017 (UTC)
    Charles Grimes says in Spices from the East: Papers in Languages of Eastern Indonesia: "This speech variety has been extinct since 1974, when the last speaker died. No clues other than the name of a stream east of Kayeli called Moksela, give any indication as to where it was spoken or what it was like. If it was spoken from the stream by that name eastward, then chances are likely that it was also a variety of the Kayeli language. People in the Kayeli area remember nothing more than the name of the language, who in the community spoke it " (I cannot view beyond this in the Google Books preview.)
    "...who in the community spoke it before they died, and that it was somehow different enough to have its own identity." is the rest of the sentence, I managed to coax Google into telling me. The name seemed familiar, as if it had been in one of the wordlists I've been looking at recently, but I just went back over them and searched through various other sources and indeed the only mentions of it I find all just say it's extinct and not recorded; how sad. Removed. - -sche (discuss) 08:19, 10 May 2017 (UTC)
  • In this vein, Makolkol is claimed to be extinct (per Wurm 2003, after having 7 speakers in 1988) and apparently unattested (per Stebbins 2010). Harald Hammarström and Sebastian Nordhoff accept this conclusion in Melanesian Languages on the Edge of Asia, but it may be a cautionary tale instead, because an article in LoopPNG from 2016 says five Makolkol still live, and even provides words(!), saying it is related to Simbali: "mam, meaning father, and nan, meaning mother". A 2005 article in Anthropological Linguistics (volume 47, page 77) agrees on the relation to Simbali: "Makolkol (extinct), is locally understood to have been a 'mixed language' combining Simbali and Nakanai (an Austronesian language on the northern side of New Britain)." I suppose the code should be left alone for now, pending further data. (There were widely varying estimates of how many speakers it had earlier in the 20th century, and fanciful tales of who they were, "headhunters" or "giants" who "lived in trees" and whom no white person had at first survived meeting.) - -sche (discuss) 08:43, 10 May 2017 (UTC)
    (Ergo, code kept.) - -sche (discuss) 01:30, 6 August 2020 (UTC)
  • Maramba
    Also Maramba (myd)? (And many more at Spurious languages need to be checked, but some are not spurious, like Ammonite.) - -sche (discuss) 09:51, 3 June 2017 (UTC)
    myd has been retired by the ISO and hence now also by us. - -sche (discuss) 07:47, 23 February 2019 (UTC)


RFM discussion: July 2016–August 2020

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Even more languages without ISO codes, part 5

This next batch is of languages from lists other than Ethnologue and LinguistList. As before, I've tried to vet them all beforehand, but I will have doubtlessly made some mistakes. NB if you want to find more: I've avoided dealing with most of the Loloish languages, because all the literature seems to be in Chinese. —Μετάknowledgediscuss/deeds 04:54, 6 July 2016 (UTC)

Australian languages
Tasmanian languages
Western Tasmanian:
Northern Tasmanian:
Eastern Tasmanian:
Oyster Bay (Big River, Paredarerme/Paritarami, Lairmairrener, Lemerina)? - -sche (discuss) Done as aus-par
Little Swanport? - -sche (discuss) Done as aus-lsw
comments

@-sche, back when I suggested these Australian languages, I included the codes for the Tasmanian languages that Bowern (2012) teased out of various wordlists. At the time, I was ignorant of the fact that there is an ISO code, xtz, for a language called "Tasmanian", and we have a few words in it. There was no single Tasmanian language, so I think this code should be retired and the words sorted into their respective languages by Bowern's scheme. —Μετάknowledgediscuss/deeds 05:28, 3 May 2017 (UTC)

Other needed codes

Here are other languages we might need codes for: - -sche (discuss) 05:21, 29 August 2016 (UTC)

  • Indanga (Kɔlɔmɔnyi, Kɔlɛ, Kasaï Oriental) (bnt-ind?)
    It lacks a Wikipedia article but is documented by Jacobs, Texte et lexique indanga (2002). fr.Wikt already has a word from it. OTOH, fr.WP considers it a regional variant of Tetela. And fr.Wikt does have a tendency to treat dialects as languages, also splitting e.g. Alsatian German from Alemannic German, Hoanya from Papora, etc. - -sche (discuss) 05:21, 29 August 2016 (UTC)
    Well, it's definitely part of the dialect continuum known in Guthrie as C.70, which has 8 ISO codes that cover it rather poorly (this is a typical situation with Bantu languages, which really need their own overhaul at some point). I see that its word for "water" is bash in that reference, rather different than Tetela proper ashi. We have to draw lines somewhere, and I can't figure out where Indanga would be merged, so I suppose a new code is in order. —Μετάknowledgediscuss/deeds 05:30, 4 October 2016 (UTC)
    Done. Μετάknowledgediscuss/deeds 01:58, 5 April 2018 (UTC)


RFM discussion: March 2019–February 2021

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


I believe that Category:Nama language should be renamed to Category:Khoekhoe language. The Nama are an ethnic group, but other ethnic groups like the Damara, Haiǁom, and ǂĀkhoe also speak the same language. It's more accurate and inclusive to call it Khoekhoe. Smashhoof (talk) 18:44, 18 March 2019 (UTC)

On Google Books, "Nama language" and "Khoekhoe language" are about equally common. This is misleading, however, because many of the hits refer to one of multiple Khoekhoe languages, as the term is used somewhat more broadly by some authors. Further evidence of this is supplied by "speak Nama" being much more common than "speak Khoekhoe" on Google Books. If we restrict ourselves to linguistic literature, "Nama language" is significantly more common than "Khoekhoe language".
In short, we do sometimes use a less common name for a language in order to disambiguate or avoid a name widely considered offensive. The name "Nama" is less inclusive, but also slightly less ambiguous, and seems to be the most common name in usage, pace Wikipedia. —Μετάknowledgediscuss/deeds 18:52, 18 March 2019 (UTC)
The native name is Khoekhoegowab (literally "Khoekhoe language"). "Nama language" may have been more common in the past, but the standardized language today is called "Khoekhoe" or "Khoekhoegowab". The dictionary I have says that "Khoekhoegowab is the Language of mainly the Damara, Haiǁom and Nama." In The Khoesan Languages (2013, Routledge Language Family series), they exclusively refer to the language as Khoekhoe; however, they distinguish between Namibian Khoekhoe (Nama/Damara) and Haiǁom/ǂĀkhoe. "Khoekhoe(gowab)" does seem to be a more accurate and preferred term to me. Smashhoof (talk) 21:43, 18 March 2019 (UTC)
If you search a digital copy of The Khoesan Languages, you'll see that different authors use different terminology in the book. The sections concerning the language in question call it Khoekhoegowab, which is far less common than either Khoekhoe or Nama and little used in non-scholarly English. —Μετάknowledgediscuss/deeds 00:58, 19 March 2019 (UTC)
Looking in my copy of The Khoesan Languages, I see a few usages of "Khoekhoegowab", but "Khoekhoe" seems to be used more. "Khoekhoe (N.Kh.) is strictly a suffixing language...", "Khoekhoe categorizes nouns according to ...". The whole morphology section on the language seems to use "Khoekhoe". The syntax section uses "N.Kh." for "Namibian Khoekhoe(gowab)". Regardless, since the standard language is called Khoekhoegowab, I think it would be best to call it Khoekhoe, as that seems to be equivalent and more common than the full Khoekhoegowab in English. Also, Wikipedia uses the term Khoekhoe (see Khoekhoe language), so it would also make sense to keep the same terminology between wikis. Though, that article does use the term "Nama" more often than "Khoekhoe", which is a bit odd given the title is "Khoekhoe language". Smashhoof (talk) 02:49, 19 March 2019 (UTC)
I asked someone who knows more about this and they said that Khoekhoegowab is the standard language, taught in schools and used in media, but Nama/Damara is the colloquial spoken language. Locals refer to it as Damaranama, Namadamara, Namataal, or Namagowab. Smashhoof (talk) 23:06, 18 March 2019 (UTC)


RFM discussion: October 2018–February 2021

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Rename yrl

We currently call this "Nhengatu", but Nheengatu is where we've put our actual lemma for the language name, and it does seem to be more common in English per BGC. —Μετάknowledgediscuss/deeds 19:09, 21 October 2018 (UTC)

Support. - -sche (discuss) 17:59, 14 November 2018 (UTC)