Wiktionary:Etymology scriptorium/2017/July

boterham

Right now the etymology of boterham reads boter +‎ ham, which is confusing at best as the second part doesn't mean "ham". Philippa calls the origin of the second part uncertain while others are happy to go with a meaning like "cut, morsel" (Philippa mentions this of course). Any preferences for a certain approach? @CodeCat, Morgengave, KIeio Lingo Bingo Dingo (talk) 13:41, 1 July 2017 (UTC)

If it means something other than just "ham", then are there attestations for that sense? —CodeCat 14:17, 1 July 2017 (UTC)

I don't think it's attested, there only seems to be a mention by Kiliaan. Lingo Bingo Dingo (talk) 14:23, 1 July 2017 (UTC)

I think it somewhat rhymes with Bemme. 91.66.71.216 00:42, 3 July 2017 (UTC)

Well that would explain bammetje.. W3ird N3rd (talk) 01:18, 6 August 2017 (UTC)

Limburgish bleudse

Limburgish Wiktionary has an entry li:bleudse with the meaning "to heal by releasing blood". This word agrees perfectly in form with Proto-Germanic *blōþisōną, the origin of English bless. However, there are no other cognates of this word, and it's not found in any older Germanic languages other than Old English, including any that could be ancestral to Limburgish. Where could it possibly have come from? Is this really a gap in the attestation? I'm not sure what to make of it. —CodeCat 12:51, 2 July 2017 (UTC)

According to Kroonen, there was a Norse settlement near Beek-Elsloo in Limburgish territory. No idea if the dates match up, but if they do it could conceivably be from ON. KarikaSlayer (talk) 19:32, 3 July 2017 (UTC)

Never mind, I didn't see the -eu-. That would make me think it's inherited. KarikaSlayer (talk) 19:36, 3 July 2017 (UTC)

What's the Old Limburgish blöhdhsan they refer to? The spelling looks rather strange (although I admittedly don't know anything about "Old Limburgish spelling".) Otherwise, it could be a secondary derivative. In western Central Franconian there's a form blödije, bledije with the same sense. Not perfectly the same, but similar. Kolmiel (talk) 14:53, 7 July 2017 (UTC)

pooch

The senses "A bulge, an enlarged part" and "A distended or swelled condition" are listed under the etymology "From German putzig (“funny, cute, small”, adjective)", the same as the "dog" meaning, but wouldn't these senses more likely be related to "pouch"? Are they really the same etymology? Mihia (talk) 19:04, 2 July 2017 (UTC)

It might be a homonym sense unrelated to putzig, potsig. It might even be an old spelling of pouch. We can't know unless the author gives a source. The etymology is largely irrelevant for the meaning, though, if usage can be attested. I'm not sure whether the given example would be more likely with pouch and if so, pooch in that sense might be a calque. Especially as a dog name, and given the funny connotation and sound of it, a word play doesn't seem unlikely. 91.66.71.216 00:35, 3 July 2017 (UTC)

The etymology may be "largely irrelevant for the meaning" if you care only about the meaning and not about the etymology, but etymology is important and interesting in its own right, and is also the whole basis of the Wiktionary article organisation. It "might be" anything if one doesn't actually know, but, as I now see, various dictionaries agree with me that it is related to "pouch", so I'm going to split the article. Mihia (talk) 00:18, 8 July 2017 (UTC)

Can you please add the sources for the second etymology? 91.66.13.99 18:10, 15 July 2017 (UTC)

French ouate de phoque

This is a humorous nonsensical translation of what the fuck, which literally means "cotton wool of seal", but I wonder exactly what type of borrowing this is. We have Category:Phono-semantic matchings by language, but I don't think that's what this is. Wikipedia speaks of "Homophonic translation", could that apply here? --Barytonesis (talk) 23:08, 2 July 2017 (UTC)

Homophonic translation seems like a good fit to me. DTLHS (talk) 00:39, 3 July 2017 (UTC)

Here's another example of this sort of thing. DTLHS (talk) 00:59, 3 July 2017 (UTC)

@DTLHS: Nice :p --Barytonesis (talk) 13:11, 4 July 2017 (UTC)

I'm not so sure this is really French. I think it's more like Dog Latin: nonsense phrases chosen for their English homophones. It reminds me of the following fake Latin verse:

O civili, si ergo,

fortibus es in ero.

O nobili Deis trux.

Vatis in em, causan dux.

Which is supposed to sound like:

Oh, see, Billy, see 'er go-

forty buses in a row!

Oh, no, Billy, they is trucks.

What is in 'em? Cows and ducks.

The usage on Google Books is very limited, and in Usenet is mostly mentions in bilingual contexts. I'm not sure this would pass rfv as being used in French to convey meaning. Chuck Entz (talk) 01:09, 3 July 2017 (UTC)

@Chuck Entz: Mh, maybe not. However, as a French speaker, I think I've heard it before, and might even have used it myself. --Barytonesis (talk) 13:11, 4 July 2017 (UTC)

Anyway, I created Category:Homophonic translations by language. --Barytonesis (talk) 13:12, 4 July 2017 (UTC)

There's a whole book of Mots D'Heures: Gousses, Rames...

-- AnonMoos (talk) 06:53, 24 July 2017 (UTC)

I'm also reminded of one my French teacher told me for Latin a long time ago: “Mē tamen amābit” (he/she will love me) for the French “Met ta main à ma bite” whose translation I shall leave as an exercise for the reader. —John C5 07:20, 24 July 2017 (UTC)

FEW

Are there any alternative sources for the FEW (Französisches Etymologisches Wörterbuch)? https://apps.atilf.fr/lecteurFEW/ is down. --Victar (talk) 01:15, 3 July 2017 (UTC)

Not that I could find. Given the time and the day of the week, it's a good bet that it's just down for maintenance for a few hours. Chuck Entz (talk) 01:35, 3 July 2017 (UTC)

Dang. It's been down for a few days now. --Victar (talk) 01:54, 3 July 2017 (UTC)

struthiocamelus

There is no στρουθιοκάμηλος in Perseus (or in the rest of my sources). Does someone have another source? --Xoristzatziki (talk) 23:00, 3 July 2017 (UTC)

struthio

Can someone verify the etymology? --Xoristzatziki (talk) 23:01, 3 July 2017 (UTC)

I believe it comes from Ancient Greek στρουθίων (strouthíōn), diminutive of στροῦθος (stroûthos, “sparrow”). – GianWiki (talk) 21:01, 4 July 2017 (UTC)

Please, I asked for verification, not beliefs. Have you seen any source that states so, or is it just a hunch? Wikifriendly --Xoristzatziki (talk) 13:27, 11 July 2017 (UTC)

Starbucks

I think this information should be incorporated into the etymology:

Terry Heckler mentioned in an offhand way that he thought words that begin with "st" were powerful words. I thought about that and I said, yeah, that's right, so I did a list of "st" words.
Somebody somehow came up with an old mining map of the Cascades and Mount Rainier, and there was an old mining town called Starbo. As soon as I saw Starbo, I, of course, jumped to Melville's first mate in Moby-Dick. But Moby-Dick didn't have anything to do with Starbucks directly; it was only coincidental that the sound seemed to make sense.

— Ungoliant (falai) 12:45, 4 July 2017 (UTC)

temh₁- or temh₂-?

Our entries currently only mention the latter variant (as can be seen in the "What links here" of the page). But De Vaan 2008 has the former variant, as does LIV. Is there any particular evidence for one or the other laryngeal here? —CodeCat 22:33, 6 July 2017 (UTC)

I think the Doric perfect form of τέμνω (témnō) (that is, the Doric dialect of Ancient Greek for those who don't know), τετμάκει (tetmákei, “he has cut”), indicates *temh₂-. It likely has long ᾱ (ā), because it corresponds to the Attic form τετμήκει (tetmḗkei). *eh₂ developed into Doric ᾱ (ā) but shifted further to η (ē) in the Attic and Ionic dialects, while *eh₁ developed into η (ē) in both Doric and Attic–Ionic. So the correspondence of Doric ᾱ (ā) to Attic η (ē) in τετμᾱ́κει (tetmā́kei) and τετμήκει (tetmḗkei) would indicate that the root contains *h₂. — Eru·tuon 00:07, 7 July 2017 (UTC)

I wonder what De Vaan and LIV think of this point. They don't even choose the generic H, but specifically h₁, suggesting that there is also positive evidence for h₁ in particular. —CodeCat 00:14, 7 July 2017 (UTC)

@CodeCat: Do you have Beekes? He talks about the issue, and reconstructs *temh₁- as well. --Barytonesis (talk) 00:21, 7 July 2017 (UTC)

Beekes what? I only have his IE grammar thing, nothing specifically about Ancient Greek. —CodeCat 00:22, 7 July 2017 (UTC)

{{R:grc:Beekes}} --Barytonesis (talk) 00:23, 7 July 2017 (UTC)

No, I don't have that. —CodeCat 00:29, 7 July 2017 (UTC)

Whether to mention vowel grades in Ancient Greek etymologies

@CodeCat prefers not to mention vowel grades in Ancient Greek etymologies, and has been removing such mentions, as in ἀλοιφή (aloiphḗ).

I think it is a useful thing to mention. Ancient Greek verbal roots frequently have such grades, and explicitly saying so helps readers to understand why, for instance, ἀλοιφή (aloiphḗ) has a diphthong with ο (o) while ἀλείφω (aleíphō) has one with ε (e). I found it a fascinating topic in the discussions of vocabulary in my introductory Attic Greek course, Hansen and Quinn.

What are other people's opinions on this? I think @Barytonesis has also added mentions of vowel grades to etymologies.

What is your reasoning, @CodeCat? I feel like this has been discussed before, but I don't remember where. — Eru·tuon 23:43, 6 July 2017 (UTC)

I'm not opposed to mentioning it, but I think we should mention it as part of the suffix. After all, it's the suffix that triggers a particular grade. For example, Proto-Indo-European *-tós always triggers zero grade. The etymology should make this clear; often etymologies seem to treat the different grades as distinct entities that suffixes are then applied to, but in actuality the suffix is primary and the grade a consequence. —CodeCat 23:48, 6 July 2017 (UTC)

говаривать

гова́ривать (govárivatʹ), from говори́ть (govorítʹ) +‎ -ивать (-ivatʹ), interests me because it changes о (o) to а (a).

This change makes sense in a certain way, because говори́ть (govorítʹ) can be analyzed as /ɡavaˈrʲitʲ/, phonemically speaking, as if spelled гавари́ть (gavarítʹ), with the unstressed letters о (o) being pronounced as /a/. So the spelling change must be a result of the stress shift: the second unstressed о in говори́ть keeps the pronunciation /a/, but receives stress because of the addition of the suffix -ивать, and hence has to be spelled а (a), while the other two letters о need not change their spelling. It seems a rare case where an unstressed vowel merger is manifested in spelling (only because the unstressed vowel is now stressed), which is mostly not the case in Russian, as opposed to Belarusian.

Of course, I'm just speculating here. (Not sure if my explanation will be intelligible.)

I see a similar change in a few other words suffixed with -ивать (выма́щивать, вывола́кивать, выпра́шивать, just from the first page of the category), so perhaps it is a regular phenomenon. Unfortunately, I don't have access to sources on Russian phonology. I think there should be some kind of a note explaining what's going on, and a category for words of this type.

Does anyone interested in Russian have more information on this: @Benwing2, @Atitarev? — Eru·tuon 04:02, 7 July 2017 (UTC)

The feature is standard and it's called чередова́ние гла́сных (čeredovánije glásnyx) - vowel gradation; vowel interchange. A few examples are here. --Anatoli T. (обсудить/вклад) 05:39, 7 July 2017 (UTC)

Since Russian а [corrected from о] comes from (postlaryngeal) PIE *ā and *ō, while о [corrected from а] comes from *a and *o, I suppose this alternation goes all the way back to PIE ablaut. —Aɴɢʀ (talk) 15:13, 7 July 2017 (UTC)

You have that backwards. Slavic o is originally short, a is originally long. It's the reverse of Germanic. —CodeCat 15:29, 7 July 2017 (UTC)

And you have the reason wrong. This has nothing to do with PIE ablaut, but rather with vowel lengthening. This suffix lengthens the preceding vowel, so о becomes а, ъ becomes ы, and ь becomes и (and maybe е becomes ѣ, but I can't find solid examples of that one). --Wiki Tiki 89 16:50, 7 July 2017 (UTC)

@CodeCat: Fixed. I did know that, I was just typing faster than I was thinking. —Aɴɢʀ (talk) 17:48, 7 July 2017 (UTC)

@Wikitiki89: Ahh, so my theory was completely wrong. Thanks for the additional explanation. (It should probably be added to the entry -ивать. I guess it does say о changes into а, but not why.) I wonder, what is the origin of the lengthening: from PIE or from a post-PIE sound change? Perhaps Winter's law? — Eru·tuon 18:15, 7 July 2017 (UTC)

After looking a little closer, it seems that the situation is a bit trickier. I'm currently analyzing a bunch of verbs and will post more information later. --Wiki Tiki 89 18:59, 7 July 2017 (UTC)

It may be common knowledge, not sure, but in regard to Russian imperfective aspect, I think it's interesting that the Russian infix -ива-/-ыва- (very common marker of imperfective verbs, as in гова́ривать (govárivatʹ)) is found in various other Indo-European languages, such as Latin (amābat), Italian (amava), Spanish (amaba), and Lithuanian (mylėdavo). —Stephen (Talk) 19:03, 7 July 2017 (UTC)

I'm not sure those are related. The Slavic *v generally does not correspond to Latin b. --Wiki Tiki 89 20:23, 7 July 2017 (UTC)

@Erutuon: Balto-Slavic extended the existing PIE lengthened-grade ablaut to include i and u, while also extending it further for o and e. So the lengthening is an innovation specific to Balto-Slavic. I don't know exactly which derivations trigger the lengthening, but it seems you've already found one case. Since this is a Balto-Slavic phenomenon, you should be able to find cognate formations in Latvian and Lithuanian as well. I'm curious if there are any remnants of the o-a distinction visible in this, since these two vowels merged in Balto-Slavic. They should in theory lengthen to ō and ā respectively, and these vowels remain distinct in the non-Slavic languages, so you might find a-ā pairs next to a-ō, revealing the original quality of the short vowel. —CodeCat 20:36, 7 July 2017 (UTC)

Germanic also has exactly the /a~o/ vs. /ā~ō/ alternation in class VI strong verbs (shake/shook < *skakaną/*skōk). The obvious place for this contrast to originate is in zero grades with interconsonantal h₂/h₃ (> Gmc. a, BSl. a, Sl. o) vs. full grades with eh₂/eh₃ (> Gmc. ō, BSl. ā/ō, Sl. a). —Aɴɢʀ (talk) 07:33, 8 July 2017 (UTC)

Vowel lengthenings in certain derivations are very common in Balto-Slavic. As CodeCat notes, this is probably an analogical extension of PIE vrddhi (lengthened-grade) formations. Latin also independently generalized vrddhi into vowel lengthening in certain derivations (e.g. the perfect tense), although it seems more productive in Balto-Slavic. There's also a proposed late-PIE law that suggests that there was general pre-tonic vowel lengthening in many daughters; I forget what the name of this law was but I think it's one of many controversial sound changes endorsed by Kortlandt. Benwing2 (talk) 20:10, 8 July 2017 (UTC)

@Erutuon -ивать and -ывать do mention that they generally change о -> а in the stressed syllable; I added this. I'm not sure if it makes sense to add the etymological origin of this. Generally the usage notes I added for various suffixes take a synchronic approach, and the whole analysis of e.g. говаривать as говорить + -ивать may not be completely valid diachronically. Benwing2 (talk) 20:14, 8 July 2017 (UTC)

@Erutuon Note also that although the change о -> а is standard, there are some exceptions. One systematic one is with verbs in -овать, which become -о́вывать not *-а́вывать. Benwing2 (talk) 20:15, 8 July 2017 (UTC)

legitimate

What's the point of having two etymologies here? DTLHS (talk) 20:56, 7 July 2017 (UTC)

I am no etymology expert, but none at all, as far as I can see. Off topic, I also question the usage note that says "Forms of legitimate are somewhat more common than the forms of the verbs legitimize and legitimise in the UK combined". I have scarcely even heard of "legitimate" as a verb, whereas "legitimise" is very familiar. Mihia (talk) 00:33, 8 July 2017 (UTC)

I see no point in having 2 etymologies either, as the pronunciations can still be shown for each. Leasnam (talk) 01:21, 8 July 2017 (UTC)

I grouped them under the same etymology, and had to use the Pronunciation headers to further subgroup the P'sOS. It looks a little odd..., but I guess it works (?) Leasnam (talk) 01:36, 8 July 2017 (UTC)

Personally, I think this organisation gives the pronunciation differences more weight than they really deserve. If you don't want to list the pronunciations all under one heading at the top, I would put the pronunciations beneath the PoS headings, not the other way around. Mihia (talk) 01:51, 8 July 2017 (UTC)

I tried that initially, and it looked terrible. It put too much space between the Header and the senses and just ended up being too confusing :/ Leasnam (talk) 04:54, 8 July 2017 (UTC)

In that case, I would put them all under one heading at the top, which I see someone has now actually done. Mihia (talk) 13:07, 8 July 2017 (UTC)

The etymologies aren't exactly the same, since the verb comes from the adjective by conversion (Category:Conversions by language could maybe be created). It probably doesn't warrant two headers though. --Barytonesis (talk) 12:12, 8 July 2017 (UTC)

I think it does. In these situations I always include a separate etymology. Ideally every part of speech should have its own etymology. —CodeCat 12:15, 8 July 2017 (UTC)

I strongly disagree with this given the current layout where etymology headings are at the highest heading level. I think it is confusing and unhelpful for ordinary dictionary users. I believe that a high-level etymology division should be employed only for words that are unrelated (or at least not at all closely related) in origin. By all means explain any intricate issues to do with the development of different parts of speech, but under the same header. I guess another option would be to have the etymology beneath the PoS, but then a new way would have to be found to make the major etymology divisions for words that really are unrelated. Mihia (talk) 13:20, 8 July 2017 (UTC)

Words with different etymologies should have different etymologies, it's as simple as that. We don't include multiple etymologies in one etymology section. —CodeCat 14:36, 8 July 2017 (UTC)

Not if it creates a massive and completely misleading "Etymology 1" / "Etymology 2" top-level heading division for extremely closely related words, that looks exactly the same as the division for unrelated words. Another way has to be found. Mihia (talk) 17:14, 8 July 2017 (UTC)

When two related words have different spellings, we give them each their own etymology. So it makes sense to do the same when they happen to be homographs. —CodeCat 17:30, 8 July 2017 (UTC)

Yeah, but then the presentational problem doesn't arise (because the words are on different pages anyway, presumably). Even so, if a spelling difference is a trivial variation of what is fundamentally the same etymology, I wouldn't repeat the whole etymology in two different places, just as I wouldn't repeat the definitions for mere minor spelling variations. It just makes maintenance more of a nuisance, and things easily get out of sync. I would put it in one place and then have cross-reference from one to the other. Mihia (talk) 17:47, 8 July 2017 (UTC)

I agree with Mihia (and apparently Leasnam and DTLHS) here. A simple "the verb is from the adjective" at the end of the one etymology section is sufficient; compare how we treat cases where later senses are derived by extension from earlier ones. - -sche (discuss) 17:35, 8 July 2017 (UTC)

I agree with -sche and Mihia. It's absurd to have separate etymology sections for each POS in cases where one is clearly derived from the other, especially in isolating and analytic languages like English. —Aɴɢʀ (talk) 18:09, 8 July 2017 (UTC)

Me too. DCDuring (talk) 19:51, 8 July 2017 (UTC)

I agree with the above. I realize you prefer it differently, CodeCat, but it might be best to stick with the consensus rather than creating all sorts of inconsistencies. Andrew Sheedy (talk) 21:37, 10 July 2017 (UTC)

нужда

I've tried to add Old Church Slavonic нѹжда to the etymology of Bulgarian нужда and got an error message telling me Old Church Slavonic is not an ancestor to Bulgarian.

Given that 1. at present, Wiktionary has no language code/template for Old Bulgarian, and 2. Old Church Slavonic is also referred to as Old Bulgarian in academic circles, I wonder what to do.

And while we're at it, OCS also had a parallel form нѫжда. I believe this was just a spelling variant, as ѹ and ѫ had likely been merged in the spoken language at the time.

So... any advice on / help with what to do? --EstendorLin (talk) 23:17, 8 July 2017 (UTC)

Not directly related, but you shouldn't use ѹ. Not on Wiktionary, not anywhere else really either. It's a deprecated character. —CodeCat 23:41, 8 July 2017 (UTC)

Sorry, copied it from Sofia University's site where I found the etymology. What should I use instead? --EstendorLin (talk) 01:29, 9 July 2017 (UTC)

оу —CodeCat 11:19, 9 July 2017 (UTC)

I'm not an expert on Slavic languages, but if OCS is the ancestor of Bulgarian, it should be added as such in Module:languages/data2, where the data for Bulgarian is contained. There may be other Slavic languages that need to have their ancestors listed. I just added Old East Slavic as the ancestor of Belarusian, Ukrainian, and Rusyn. — Eru·tuon 23:45, 8 July 2017 (UTC)

Also, a code for Old Bulgarian could be added to Module:etymology languages, if editors who know more about Slavic think it is distinct enough to warrant that. — Eru·tuon 23:47, 8 July 2017 (UTC)

The problem with OCS is that it's not one language from one area. Writers from all over the Slavic area wrote in OCS, and they continue to write modern CS today. Each of them put their own local twists on it. So to call an OCS document written by a Czech "Old Bulgarian" just isn't right. —CodeCat 00:18, 9 July 2017 (UTC)

I know it was used over a wide area, and I wasn't proposing that all OCS be called "Old Bulgarian": that's why I said the code would be added to Module:etymology languages, not to the regular language data modules. — Eru·tuon 00:30, 9 July 2017 (UTC)

Sure, the relationship between OCS and modern Slavic languages is much like that between Latin and modern Romance languages. It was a literary language based on the Bulgarian vernacular, but standardized and extended with features from other early Slavic languages. My main issue here is that currently there is no way to add Old Bulgarian etymons. While in the case of, say, Croatian, nužda is considered a loanword from OCS (the regular reflex would be **nuđa), the modern Bulgarian word нужда is the direct continuation of Old Bulgarian нѹжда / нѫжда. --EstendorLin (talk) 01:29, 9 July 2017 (UTC)

I'd support making OCS the ancestor of Bulgarian, just because it was spread across a wider area doesn't change the fact that it developed naturally in Bulgaria. Crom daba (talk) 02:54, 14 July 2017 (UTC)

@-sche --Per utramque cavernam (talk) 17:18, 12 January 2018 (UTC)

Done. :) - -sche (discuss) 17:24, 12 January 2018 (UTC)

@Crom daba, Erutuon, -sche, Rua: I think we made the wrong decision to make Bulgarian a descendant of OCS. Crom daba says "just because it was spread across a wider area doesn't change the fact that it developed naturally in Bulgaria", but just because it developed in Bulgaria doesn't mean that modern Bulgarian is descended from it. Also, whatever we do with Bulgarian we'll also have to do with Macedonian, since they are closely related, and according to many linguists just dialects of one language. There are even some sound changes that took different directions in OCS and Bulgarian/Macedonian, for example, Proto-Slavic *vьlkъ metathesized in OCS to become влькъ (vlĭkŭ), but not in Bulgarian вълк (vǎlk) or Macedonian волк (volk) (or any other South Slavic language, for that matter). --Wiki Tiki 89 14:22, 4 April 2018 (UTC)

@Wikitiki89 Bulgarian and Macedonian are closely related, but they must have split off before earliest OCS since Macedonian doesn't reflect *tj as št, but as ḱ. If you're wondering, the shift of *št > ḱ is impossible not only phonetically, but also because OCS št merges *tj and *šč, a merger that isn't present in Macedonian.

Bulgarian ъl and Macedonian ol are not preservations of *ьl but reflexes of an intermediate syllabic l̥.

Some of the isoglosses between Macedonian and Bulgarian may be due to areal influence, Torlakian Serbo-Croatian also shares some of these despite taking part in earliest Shtokavian developments (merger of yers, *ǫ > u, *jego > (n)jega). Crom daba (talk) 15:04, 4 April 2018 (UTC)

@Crom daba: Regarding "they must have split off before earliest OCS": That is only under the assumption that Bulgarian is indeed descended from OCS, so let's establish that first. You can just as easily say that the št itself was due to areal influence, or due to literary influence from OCS. You could also say that the merger of *ť and *šč was incomplete in Bulgarian until much later. I find that the deep similarities in the verbal and nominal morphologies of Bulgarian and Macedonian are much more indicative of a common origin than a single phonological merger. --Wiki Tiki 89 15:27, 4 April 2018 (UTC)

@Wikitiki89 That's just, like, your opinion man.

I would have to know a lot more about South-East Slavic languages than I do now to discuss this effectively. On the face of it, presuming continuity here seems more parsimonious to me personally (I trust phonology relatively more I guess), but I see why you'd see it the way you do.

So what does the literature say? Wikipedia at least seems to imply continuity. Crom daba (talk) 22:26, 5 April 2018 (UTC)

@Crom daba: Right, so what I'm trying to say is that we rushed into a decision here without any of us having expert knowledge of the history of the Southeast Slavic languages and without consulting any literature written by such experts. And I definitely think it's better to treat Bulgarian as a sibling of OCS when it's actually a descendant than to treat it as a descendant when it is actually a sibling. --Wiki Tiki 89 13:40, 9 April 2018 (UTC)

@Wikitiki89 Fine, but Bulgarian being a descendant of OCS is conventional wisdom due to the aforementioned shared phonological innovations, and I have never heard of these theories of OCS being basal to Bulgarian and Macedonian before.

We could speculate whether these innovations represent literary influence as you did, but I consider this level of skepticism unwarranted seeing as we're dealing with languages spoken in roughly the same geographical location, by the same ethnos, and with some level of continuity.

One could wonder how many "unproblematic" cases of inheritance could withstand this much scrutiny. Latin and Romance? Sanskrit and Indic languages? Jurchen and Manchu? Even something like Old English and English could be problematized if we were to dwell on the West Saxon standard (when does y back and when does it unround?) 12:04, 10 April 2018 (UTC)

I did think about this with regard to Ancient Greek, since Modern Greek is not directly descended from any of the Ancient Greek literary dialects we know (I might be wrong about this, but in that case, just take it as a thought experiment). So it comes down to what we mean by "Ancient Greek". As I see it, if the direct ancestors of the Modern Greek dialects had been written down, they would have been considered part of Ancient Greek, therefore the term "Ancient Greek" covers the direct ancestors of the Modern Greek dialects. So the question then is this: If we have written attestations of the direct ancestor dialects of Macedonian or Neo-Štokavian, would we consider them to be OCS? If so, then we should treat OCS as the parent of all the South Slavic languages. If not, then we have to further ask ourselves, if we found medieval written attestation of some features of Modern Bulgarian that are not generally considered to be OCS features (such as loss of case, merger of ѫ (ǫ) and ъ (ŭ), etc.), would we consider this to be OCS? If yes, then we need to consider why we said no to the first question but yes to the second. If not, then OCS is not an ancestor of Modern Bulgarian. I hope this logic isn't too complicated to follow. My preferences would actually be to say yes to the first question and let OCS be the parent of more than just Bulgarian. --Wiki Tiki 89 15:22, 10 April 2018 (UTC)

There are South Slavic monuments that aren’t in OCS, like the Freising manuscripts. In the second case, yes, we definitely would, because those would then be OCS features, which would spread to other areas with the rest of the language. OCS arose as a written vernacular: the only reason these modern features weren’t a part of it is because they did not yet exist. Guldrelokk (talk) 18:28, 10 April 2018 (UTC)

Well let me prod a little further. What makes us say that the Freising manuscripts are not OCS? --Wiki Tiki 89 19:18, 10 April 2018 (UTC)

Dialect continua do not really split off unless something causes them to, which wasn’t the case for Bulgarian and Macedonian. The (so-called; they aren’t really a subgroup separate from anything, see Torlakian) Eastern South Slavic dialects are no doubt very close; in no way does this mean that they were identical in the Early Middle Ages, in fact they couldn’t be. Guldrelokk (talk) 09:22, 6 April 2018 (UTC)

These are good points, here's the general impression I had of the situation from tangentially related literature:

Old Church Slavonic refers exclusively to texts written in the first few centuries, there is only a small corpus of OCS texts and a lot of them are translations of Greek originals (think Ulfilas' Bible).
Church Slavonic was a literary language of Slavic lands (possibly also Romania?) from Bohemia and Russia to Croatia and Macedonia.
This literary language was based on Bulgarian, but adapted to local vernaculars.

But there is admittedly some nuance and problems with this simplification, some of which I've only discovered due to this discussion after opening Old Church Slavonic: An Elementary Grammar by S.C. Gardiner.

OCS is only attested from the middle of the 10th century and most of the manuscripts are found outside of the Balkans because a lot of the Balkan ones didn't survive the Ottoman conquests.
Manuscripts we have are not originals and it is presumed that copyists introduced some of their vernacular as copying errors.
Gardiner includes as OCS not only texts from Bulgaria but also texts of the Macedonian and Czech recension, some texts with SCr influence and even a Latin-script text in Old Slovenian (the above-mentioned Freising manuscripts).
OCS Cyrillic manuscripts write the reflex of *ť as щ (Unicode uses the same point to encode Russian and OCS letters, but OCS variant looked like the attached image). This is contrary to our decision to write шт in our OCS lemmas, which I understood (apparently mistakenly) had basis in manuscript evidence. I do not know how the reflex of *šč was spelled.
Also this letter is not a combination of ш and т but taken from Glagolitic Ⱋ, so maybe some scribes did use it to represent /c/ or /t͡ɕ/ rather than undoubtedly Bulgarian /ʃt/ (however, the Czech recension writes ц instead of using it).

However, and I feel this is the most important aspect for us as lexicographers, languages that have a Church Slavonic tradition (Serbian, and more visibly Russian) unequivocally have a layer of Old Bulgarian loan words, so we have opština/općina in Serbo-Croatian and город/град in Russian (also -ающий since you mentioned verbal morphology). If we made OCS their ancestor this would obscure an important point, that Bulgarian elements were imported with the liturgical language. Crom daba (talk) 21:00, 10 April 2018 (UTC)

Yes, there was even a Glagolitic letter for *ď, but 1) *ť could also be written ⱎⱅ št in two letters, 2) *sť could also be written Ⱋ… The chaos.

After thinking a bit, I’m not entirely sure what is better. The language origins are definitely in the Eastern South Slavic lands, that’s where most of the translations were made and that’s where it spread from. Also normalised OCS forms given in Wiktionary follow Bulgarian practice, so they have жд etc. Guldrelokk (talk) 01:05, 11 April 2018 (UTC)

It depends on what you mean by Old Church Slavonic. In the widely accepted understanding going back to Leskien, OCS is the ancestor of Bulgarian, being the language of a few oldest manuscripts mainly from the Bulgarian-Macedonian area, with other monuments rejected as they differentiated from the Bulgarian variant. Even in its home area the language functioned as a written koine, being subject to regional variation, scribe mistakes owing to differences from their vernacular, hypercorrection and such. However, the very same thing can be said about Old East Slavic; nobody seems to question its ancestry to the modern East Slavic dialects. Among the unique traits OCS shares with some of the modern Bulgarian-Macedonian dialects (and with no other) are the development of *ť and *ď, lack of iotation before *a- and the development of tense yers (they hadn’t yet merged with anything in OCS, but the spelling is indicative). The verbal system is a preservation, but mind that in Old East Slavic, for example, the aorist and imperfect had been probably already lost by that time (birch bark letters do not show any signs of them except some with a heavy OCS influence). It’s already been pointed out that OCS <лъ> and <ръ> represented syllabic sonorants – this is a universally accepted wisdom, and they were actually written as <ъл> and <ър> as well, because scribes felt no difference. I am not aware of a single feature that would prevent you from deriving the modern Bulgarian term from the OCS one, while this is not possible for any other Slavic language. Guldrelokk (talk) 08:48, 6 April 2018 (UTC)

Also, references seem to treat (or explicitly identify) (Old) Church Slavonic and Old Bulgarian as synonyms. Horace Gray Lunt's Old Church Slavonic Grammar speaks of the language "known as Church Slavonic. Since the majority of the early manuscripts which have survived were copied in the Bulgaro-Macedonian area and since there are certain specifically eastern Balkan Slavic features, many scholars have preferred to call the language Old Bulgarian, although Old Macedonian could also be justified." - -sche (discuss) 17:10, 9 April 2018 (UTC)

I'm pinging @Vorziblix too. --Per utramque cavernam (talk) 21:16, 10 April 2018 (UTC)

I don’t have all that much to contribute that hasn’t already been said, but a few points:

OCS was basically the standard literary register of Late Common Slavic (starting about a century later than the Proto-Slavic forms we reconstruct), standardized on the basis of a South Slavic dialect. It seems inconsistent to me that we treat Proto-Romance as unified with Classical Latin, yet treat Proto-Slavic as disunified from OCS, but I suppose it’s a consequence of how these lects have historically been treated in scholarship.
Different authors have different opinions on which texts are admissible as part of the OCS corpus, and which should be excluded. These differences of corpus can result in differences in how one views OCS.
Some of the developments of OCS that it shares with Bulgarian/Macedonian are not universal in OCS writings, and evidence suggests they may not have been present in the oldest OCS:
- It seems the original Glagolitic alphabet had the letters ⰼ and ⱋ to represent *ď and *ť (rather than ⱋ for št). This was already argued in Durnovo 1931 and Trubeckoj 1954; Lunt also notes that ‘the pronunciation probably varied from region to region’. The Kiev Missal, universally included in the OCS corpus, has z and c as the reflexes of *ď and *ť instead of žd and št.
- Word-initial iotation was non-phonemic except before u and ǫ (Cyrillic initial ꙗ- generally represents early OCS (j)ě-, and initial а- represents (j)a-, cf. the Glagolitic writings ⱑ- and ⰰ-, with a few exceptions) and often varied.
- The original Glagolitic alphabet also seems to have had only one yer (suggested by the acrostic of Constantine of Preslav, among other considerations); if this was indeed the case, whether this merger was purely orthographic is impossible to tell.
Thus, if we take the language as standardized by Cyril and Methodius as our model for OCS, it could possibly be considered ancestral to all South Slavic languages as far as phonology is concerned; but most surviving OCS manuscripts, coming from a century or two later, show distinctly Bulgarian phonological developments. (Yet others, like the Kiev Missal, show distinctly non-Bulgarian developments.)
Setting aside phonology, the lexis of OCS includes many learned words that may never have had currency in the spoken language, coined or borrowed to facilitate translation from Greek.

Overall, I vaguely lean toward considering OCS ancestral to Bulgarian, due to shared phonological history, but there are certainly many complications at play. — Vorziblix (talk · contribs) 04:03, 11 April 2018 (UTC)

@Vorziblix: Re: It seems inconsistent to me that we treat Proto-Romance as unified with Classical Latin, yet treat Proto-Slavic as disunified from OCS - It's because most languages had already formed by the time OCS came about. For example, Old East Slavic (especially its Eastern part) and early Russian were heavily influenced by OCS and they have many parallel forms, but neither Old East Slavic nor Russian is derived from OCS. See Appendix:Russian doublets, for example. --Anatoli T. (обсудить/вклад) 05:53, 11 April 2018 (UTC)

@Atitarev: Yes, of course, a handful of phonological splits already took place by the time of OCS, but going by the criteria of mutual intelligibility one wouldn’t call the resulting lects separate languages (yet). I agree, Old East Slavic and Old Church Slavonic are not derived from each other, but at least in their early stages (before c. 1000 AD, say) they function rather as two linguistic registers based on two dialects of a single Abstand language (Late Common Slavic). I definitely think the distinction is useful to make, though, and I’m not suggesting we unify Proto-Slavic, OES, and OCS; if anything, I’d be more inclined to disunify the other pair. Anyway, sorry for bringing this up; I didn’t mean to distract away from the discussion at hand. — Vorziblix (talk · contribs) 06:31, 11 April 2018 (UTC)

Also Български етимологичен речник by Вл. И. Георгиев treats OCS as an ancestor of Bulgarian. Guldrelokk (talk) 16:19, 11 April 2018 (UTC)

@Wikitiki89 Another update: I've done some reading of Codex Zographensis and Codex Marianus, Glagolitic manuscripts belonging to the Macedonian recension. It turns out that Zographensis doesn't use ⱋ (št) but exclusively ⰞⰕ (ŠT), and Marianus uses them interchangeably, even in the same words, and without regard for *ť vs. *šč (a merger which modern Macedonian, unlike modern Bulgarian, doesn't have, as I've mentioned).

Also, according to Wikipedia article on Glagolitic, Schenker apparently thinks that št is a ligature of š and t after all. Crom daba (talk) 23:58, 11 April 2018 (UTC)

Although the situation appears to be somewhat more complicated yet looking at this map... Crom daba (talk) 00:07, 12 April 2018 (UTC)

Keep in mind that people move. The ancestor dialects of a modern dialect are not necessarily located on the same territory. --Wiki Tiki 89 19:05, 12 April 2018 (UTC)

Washington, D.C.

Yeah, I understand it was named after George Washington, but why is it comma and then District of Columbia? That suggests that Washington is a name of a city inside of the District of Columbia, which is not the case, as both names are the same entity. PseudoSkull (talk) 20:22, 10 July 2017 (UTC) EDIT: I think we should explain why that is somewhere in this entry. I used to, as a child, mistakenly think that Washington was a city inside of DC, and I feel others may have the same misperception. PseudoSkull (talk) 20:27, 10 July 2017 (UTC)

Is it not more like "Elizabeth, the queen of England" then? —CodeCat 20:23, 10 July 2017 (UTC)

Washington is (or was, historically) a municipality inside the District; at the time Washington was founded, there were two other municipalities in the District, Georgetown and Alexandria. In 1871, Congress repealed the individual charters of Georgetown and Washington, and vested the power of government of them into a unitary territorial government for the whole District of Columbia. - -sche (discuss) 21:44, 10 July 2017 (UTC)

The original northern boundary of the municipality of Washington within the District of Columbia was Florida Ave.

There is a gravestone in an Alexandria, VA cemetery that reads Alexandria, DC. Neelthakrebew (talk) 18:25, 26 April 2024 (UTC)

Proto-Slavic word for "east".

I've been trying to find this one and it doesn't seem to exist on Wiktionary. Its descendants are quite diverse, with some having different forms of the same word. All I know is that the Polish word might be derived from a word meaning "to rise". I tried finding that word to no avail. A bit of help, if you may? 71.1.97.49 23:47, 11 July 2017 (UTC)

I found the etymology for Russian восто́к (vostók), but it's from Old Church Slavonic and was calqued from Greek. Polish wschód is a calque of Latin oriens. Both of these etymologies came from Vasmer. (He doesn't give any further morphological analysis of them.) These seem to be unrelated, so perhaps there is a third word that is actually from Proto-Slavic, or no word for "east" in Proto-Slavic at all. — Eru·tuon 03:32, 12 July 2017 (UTC)

As suggested above, many Slavic words look like "coinages" as opposed to tracing to a single, common origin (without knowing any Polish I can instantly conjecture that their word means "ascension" or "up-going" because восход (vosxod) means "up-going" in Russian).

I checked my Latvian etym source for aust (“to dawn”) (austrumi (“east”)) and it lists Old Church Slavonic za ustra, meaning "early in the morning," among cognates. My conjecture: the ustra element resembles Russian утро (utro, “morning”), so perhaps a "morning" sense could have displaced an "east" sense. In summary, if they are more recent coinages, modern Slav. words for "east" won't trace to a single parent, and words similar to "morning" may have been the "original" word for east (perhaps?). Neitrāls vārds (talk) 01:35, 14 July 2017 (UTC)

And here it is: *utro. Neitrāls vārds (talk) 01:35, 14 July 2017 (UTC)

朝

The current page for 朝 says that the right-side 月 is a graphical corruption of 川 (“river”). But other sources say that the 月 existed as 月 in the oracle bone script, which became 水 (some say 川, some say 中間有三點的水流之形) in the bronze script, which became 舟 in the seal script (some say small seal script), and was finally restored to the oracle-bone-script 月 in the regular script after 隸變.

“rice”: Middle Persian blnj, Sanskrit व्रीहि (vrīhi), Proto-Dravidian *wariñci, etc.

Two questions:

What are the origins of the nasal infix -n- in Iranian and Dravidian? From the same source, or a coincidence?
Are all of these (incl. other Indo-European descendants, e.g. English rice) ultimately related to Proto-Sino-Tibetan *b-ras (“rice”) > Tibetan འབྲས ('bras), Proto-Austronesian *bəʀas (“rice”) > Malay beras?

Wyang (talk) 01:52, 13 July 2017 (UTC)

The Middle Persian word is ultimately from the Proto-Dravidian word (or from the same source as the Proto-Dravidian word). Is it possible that although (some of?) the attested intermediaries lack the nasal, alternative forms existed which preserved the nasal of the Proto-Dravidian word (or of its source) and that Persian borrowed those forms?

Van Driem, citing Osada (1995) and Diffloth (2005) for the reconstruction, thinks Proto-Austro-Asiatic *rǝŋkoːʔ "rice grain" is the source of the Proto-Dravidian and Middle Persian words.

He also mentions that the Proto-Hmong-Mien word for "rice grain" was *n̥jeŋ (and mentions that this term may have been borrowed from, or loaned into, Old Chinese 饟 and/or 囊), but without reading further I'm not sure if he's suggesting that *rǝŋkoːʔ and *n̥jeŋ are connected or not.

- -sche (discuss) 16:31, 16 July 2017 (UTC)

cat

In our entry for ‘cat’, the ultimate origin is currently given thus:

Jean-Paul Savignac suggests it is from Late Egyptian čaute, feminine of čaus (“jungle cat, African wildcat”), from earlier Egyptian tešau (“female cat”).

But none of these words are even plausibly Egyptian, Late or ‘earlier’; Egyptian was not written with vowels, and the word for female cat would necessarily have a feminine suffix -t. As far as words for cats go, the fairly comprehensive Thesaurus Linguae Aegyptiae has only mjw (“tomcat”), mjwt (“female cat”), and wšft (“a cat-like animal”).

What seems to have happened to yield ‘tešau’ is that Savignac took a Coptic word ϣⲁⲩ (šau, “tomcat”) and slapped a feminine article ⲧⲉ (te) on the front, but I have no idea where ‘čaute’ or ‘čaus’ come from. Anyone else able to unravel what Savignac meant? Should I just remove it from the entry? — Vorziblix (talk · contribs) 13:46, 13 July 2017 (UTC)

I looked into various references on this when I edited the etymology a bit in May. As I understand it, the general view is that it comes from Afro-Asiatic, but each proposed etymon has problems. Without outright dropping any of the current content/theories, one might say:

from Latin catta (used around 75 AD by Martial), which is generally thought to be from an Afroasiatic language, although each specific proposed etymon has presented problems. Many references refer to "Berber kaddîska (“wildcat”)" and "Nubian (kadīs)" as etyma or cognates, but M. Lionel Bender opines that the Nubian term is a loan from Arabic. Jean-Paul Savignac suggests it is from a Late Egyptian term *čaute, feminine of *čaus ("jungle cat, African wildcat"), from a word *tešau ("female cat"), but such words are unattested and morphologically problematic.

^ Douglas Harper, Online Etymology Dictionary, s.v. "cat", retrieved on 29 September 2009.

^ John Huehnergard, Qitta: Arabic Cats, in Classical Arabic Humanities in Their Own Terms

^ Jean-Paul Savignac, Dictionnaire français-gaulois, s.v. "chat" (Paris: Errance, 2004), 82.

Of course, we could also engage in some more extensive trimming. :p

Btw, one writer makes the argument that the term went in the other direction, from Germanic into Afro-Asiatic, but that seems somewhat difficult to reconcile with how both the animal and the word are understood to have spread... - -sche (discuss) 21:08, 13 July 2017 (UTC)

Support for the Germanic theory can be found in Proto-Germanic *katazô (“male cat”) (> German Kater), a word which lacks the geminate t of *kattuz, and which is postulated to be from a much older form. Compare also Czech kocour (“male cat”). Leasnam (talk) 21:29, 13 July 2017 (UTC)

That's suspect, why didn't the -t- sibilate? Czech form is just *kotъ + -*erъ. Crom daba (talk) 02:46, 14 July 2017 (UTC)

I wondered that too. Could it be a central form (the word is also found in Middle Low German, Middle Dutch, and Middle English, all with a single t)? Additionally, it's found in West Slavic, if it's indeed the same word; M. Philippa mentions a Proto-Slavic *kot'urŭ for the Czech term... Anyway, Germanic seems to be rife with variations of this root, making it appear to be older than merely a LL borrowing. Leasnam (talk) 13:46, 14 July 2017 (UTC)

There's also Bulgarian котарак (kotarak) and the elusive Hungarian kandúr. The Hungarian form looks like it could be from unattested Slavic **kǫturъ with an n-infix although the voicing seems to be irregular. Its form doesn't look native in any case.

Bulgarian -ар- is also irregular, but it could possibly be a case of replacing a rare suffix (-*erъ~-*orъ~*urъ) with a more common one *-arjь.

Crom daba (talk) 14:27, 14 July 2017 (UTC)

I’ve just looked into Savignac’s dictionary, and it seems much of the etymology we have is not accurately taken from there. Under ‘chat’ he says:

Ce terme ne remonte pas nécessairement au latin cattus. Rappelons que le nom de cet animal, venu probablement d’Egypte en Europe assez tard, se dit chaou, fém. chaout en égyptien hiéroglyphique et en copte. Cf. v. h. a. kazza, v. norr. kǫttr, lituan. katė̃ « chat ».

So it seems he is indeed referencing Coptic ϣⲁⲩ (šau, “tomcat”), and then extrapolating it back to Egyptian in order to add feminine -t to the end for *chaout (*šwt? *ḫwt?). This is much more reasonable as far as Egyptian is concerned, although still unattested; I’ve no idea how it got so mangled. — Vorziblix (talk · contribs) 15:52, 14 July 2017 (UTC)

Not that it matters as far as evidence is required, but arguably, animal names are one of the early words a child learns, pets being rather prominent examples. If the use of the word was reduced to baby-speech, that would explain lack of written record.

Baby-speech would imply all sorts of irregularities, I guess, and presumably that would hint at a rather old root (as with mama). One possibility would be an onomatopoeia (hissing and meowing) as a common nickname. 91.66.13.99 19:07, 17 July 2017 (UTC)

Uralic origin of Latin cannabis

(More specifically: Uralic origin of the Scythian word because up to Scythian it's pretty uncontroversial.) Any sources/references for this?

An Estonian IP left a borderline-mocking HTML comment pointing out that Estonian kena is a Germanic borrowing according to “kena”, in Eesti etümoloogiasõnaraamat (in Estonian) (online version), Tallinn: Eesti Keele Sihtasutus (Estonian Language Foundation), 2012 (< click) (was removed.)

So, I'm curious if there's anything at all supporting a Uralic origin of the Scythian term? Alternatively, we could just truncate the etymology at Scythian and call it a day. Neitrāls vārds (talk) 02:34, 14 July 2017 (UTC)

For the Uralic origin refer to Schrader and Hehn. --Vahag (talk) 06:12, 14 July 2017 (UTC)

At any rate, cannabis#Latin and cannabis#English have totally different etymologies for the Scythian. —Aɴɢʀ (talk) 13:28, 14 July 2017 (UTC)

The origin is disputed. We should probably pick one of the early words (I prefer Ancient Greek) and treat the different theories there. The other cognates can refer to it for further discussion. --Vahag (talk) 14:01, 14 July 2017 (UTC)

Good idea. I've created κάνναβις (kánnabis) now; feel free to add an etymology section. —Aɴɢʀ (talk) 15:21, 14 July 2017 (UTC)

Uralic "*keńe" and "*piš" as reconstructed here are not recognized by any normal references in comparative Uralic research, and some of the alleged reflexes clearly cannot belong together (e.g. p- never occurs in native Hungarian vocabulary). UEW only accepts Mari-Permic *känɜ, and treats this as a Wanderwort (with no mention of the theory of a compound with 'nettle'). Permic *pyš 'hemp' (not 'nettle') is possibly from *pOčV 'layer'. --Tropylium (talk) 16:18, 14 July 2017 (UTC)

At risk of joining the Bright Shiny Object school of historical linguistics, I suppose it wouldn't hurt to mention Biblical Hebrew פִּשְׁתָּה (pishtá, “flax”) in connection with "*piš" — This unsigned comment was added by Chuck Entz (talk • contribs).

Ok, κάνναβις (kánnabis) is ready now. --Vahag (talk) 11:39, 15 July 2017 (UTC)

Great work, thank you! (Wikipedias seem to be full of off-the-wall, crazy etymologies for this word, but now I have something to work with.) Neitrāls vārds (talk) 01:10, 11 February 2018 (UTC)

While we're on the topic, @Wyang and others might be interested in adding an etymology to Thai บ้อง (bɔ̂ng). —Μετάknowledge (discuss/deeds) 04:06, 17 July 2017 (UTC)
I haven't had much luck at finding a Tai etymology for this word. Pinging other Thai-language editors @Octahedron80, หมวดซาโต้, Iudexvivorum, YURi, Alifshinobi, Atitarev You guys would be much more capable at Thai etymologies. Wyang (talk) 04:53, 17 July 2017 (UTC)
- It might be a corrupted form of บั้ง (bâng) (this is just my suggestion). Synonymous/(probably) cognate terms are:

Thai	foreign
กล้อง (glɔ̂ng)	Lao ກ້ອງ (kǭng) Shan ၵွင်ႈ (kāung)
บอก (bɔ̀ɔk)	Lao ບອກ (bǭk) Mon ၜံက် (bɔk) Shan မူၵ်ႇ (mùuk)
บั้ง (bâng)	Lao ບັ້ງ (bang)
ปล้อง (bplɔ̂ng)	Mon ပၠံၚ် (plɔŋ)

--iudexvivorum (talk) 05:36, 17 July 2017 (UTC)

I've run into a reference (cited by Marek Stachowski elsewhere) that may be useful to check out:

Marszewski, T. (1996): An ethnohistorical approach to the controversies concerning the provenance and diffusion of ancient Iranian and Indian names for hemp (Part I). — FO 32, pp.1–64.

"FO", I would guess, is probably the journal Folia Orientalia. --Tropylium (talk) 17:57, 24 July 2017 (UTC)

楽

"A white bird flapping its wings on top of a tree, having fun" sounds suspiciously like a mnemonic, especially considering 樂#Glyph origin, as well as shinjitai forms such as 攝 > 摂. —suzukaze (t・c) 04:43, 14 July 2017 (UTC)

Definitely just a mnemonic. It should just be simplified from 樂. — justin(r)leung { (t...) | c=› } 04:53, 14 July 2017 (UTC)

τάπης

RFV of the etymology. Isn't this a (Persian) loanword into Greek? Leasnam (talk) 22:56, 15 July 2017 (UTC)

My only source states that it is of uncertain etymology, and some believe that both the Persian and the Greek words are loans from some word of Asia Minor. --Xoristzatziki (talk) 10:41, 16 July 2017 (UTC)

I added an etymology with a source. --Vahag (talk) 06:46, 17 July 2017 (UTC)

There's also the Mycenaean word 𐀲𐀟𐀊 (ta-pe-ja), which must be related. --Barytonesis (talk) 04:27, 19 July 2017 (UTC)

I wish we had some comparative Iranists here, too many etymologies end at citing a Persian word. Crom daba (talk) 04:43, 19 July 2017 (UTC)

Horus and ḥr

Two things:

1) The etymon at Ancient Greek Ὧρος (Hôros) gives Hr (Egyptian 𓎛𓂋𓁷 (Ḥr)), but Horus gives Egyptian ḥr. Can the difference in the capitalization be resolved so that the two etymons can be merged?
2) ḥr has three entries, can the etymologies be merged? I am not even sure how to read the entry on the god. Because there is no translation of haru, it's not obvious if that's in contrast to Proto-Afroasiatic *x̣al. If the stem (haru) was related to *xal (which is rather obvious from the meaning and derivatives of 'above'), that could be made clearer. 91.66.13.99 20:40, 17 July 2017 (UTC)

The capitalization issue is now resolved. Regarding the etymologies at ḥr, they are all obviously related, but I hesitate to merge them without first knowing exactly how they are related for fear of getting something wrong. Basically, what is there at the entry right now is what it says in the cited sources; if you or someone else is confident enough to synthesize it all together into a coherent whole, feel free to merge them. — Vorziblix (talk · contribs) 13:27, 25 July 2017 (UTC)

hostis humani generis

An anonymous editor modified the etymology of hostis humani generis to state that hūmānī is the singular form of hūmānus. Could someone confirm if this is correct? Thanks. — SGconlaw (talk) 14:52, 18 July 2017 (UTC)

It's genitive singular neuter agreeing with generis, yes. —Aɴɢʀ (talk) 15:18, 18 July 2017 (UTC)

Thanks! — SGconlaw (talk) 16:16, 18 July 2017 (UTC)

καταστροφή

Can we tell which of the possibilities in the etymology is the correct one? —CodeCat 18:11, 18 July 2017 (UTC)

I'd say the second one (derivation from the prefixed verb) is more accurate than the first, and that goes for all other similar cases. --Barytonesis (talk) 01:04, 19 July 2017 (UTC)

Then there's the Far Side cartoon showing a feline derriere mounted on a wall plaque... Chuck Entz (talk) 02:16, 19 July 2017 (UTC)

Mycenaean 𐀒𐀵𐀙

Related to χθών (khthṓn)? --Barytonesis (talk) 02:45, 19 July 2017 (UTC)

Clearly a borrowing from kotona. ;) --Tropylium (talk) 03:31, 19 July 2017 (UTC)

I feel like Reconstruction:Proto-Uralic/kota#Etymology or Reconstruction:Proto-Slavic/kǫťa#Etymology might be relevant here. It's best to find a proper reference for the word though. Crom daba (talk) 04:19, 19 July 2017 (UTC)

@Barytonesis: It seems very likely. 𐀒𐀵𐀙 (ko-to-na) is exactly how both the accusative singular χθόνα (khthóna) and the accusative plural χθόνας (khthónas) would be spelled in Mycenaean. —Aɴɢʀ (talk) 07:48, 19 July 2017 (UTC)

ἀρχή

Is this an o-grade or a zero grade derivation from the root? Purely etymologically, I would expect the o-grade to descend from *h₂orgʰ-, which would presumably retain its o in Greek. However, it's possible that Greek modified this kind of formation to use the zero grade with laryngeal-initial roots. Are there any real o-grade nouns in -η that descend from such roots? —CodeCat 09:29, 19 July 2017 (UTC)

{{R:grc:Beekes}} says it's a Greek formation, not older. Do you still use tweeënveertig? --Barytonesis (talk) 13:15, 20 July 2017 (UTC)

Rhyming compounds

As I mentioned on the talk page of todger dodger, I think I once read a specific name used for these compounds made of two rhyming words. Would anyone know something about it? At any rate, shouldn't we have something like Category:English rhyming compounds? --Barytonesis (talk) 13:04, 19 July 2017 (UTC)

If you're thinking of a word that itself rhymes and denotes a specific kind of word, are you thinking of hobson-jobson? If you're thinking of a word that doesn't necessarily rhyme but that denotes rhyming compounds, "(reduplicative) rhyming compound" seems to be the phrase used by a number of sources including Merriam-Webster. - -sche (discuss) 02:58, 20 July 2017 (UTC)

@-sche: Not hobson-jobson, no. I'm definitely thinking of the latter. Would you be ok with Category:English reduplicative rhyming compounds? --Barytonesis (talk) 16:17, 20 July 2017 (UTC)

I think so, but BP would be a better place to test the waters. DCDuring (talk) 02:33, 21 July 2017 (UTC)

Well, are all of these compounds reduplicative, or are some just rhyming compounds? For example, "todger" and "dodger" are independently words, so I'm not sure if "todger dodger" is reduplicative per se. So your original suggestion of "rhyming compounds", which I wasn't meaning to contradict, seems best. - -sche (discuss) 09:01, 21 July 2017 (UTC)

I made Category:Mongolian_rhyming_compounds before I understood how our category system worked. Perhaps we could have a subcategory for compounds which are made by reduplication. Crom daba (talk) 13:08, 21 July 2017 (UTC)

Are there words that are simultaneously compounds and reduplicated? I mean, I thought reduplication was the addition of a meaningless repetition of part of the existing word, while compounds are the combination of two meaningful words. — Eru·tuon 17:15, 21 July 2017 (UTC)

"reduplicative compound" gets a lot of hits on Google Books so I guess they can. Crom daba (talk) 19:03, 21 July 2017 (UTC)

Reconstruction:Proto-Indo-European/ǵʰmṓ

The etymology here currently says that the dental is regularly lost in such a word-initial cluster. However, would it not rather be preserved as a thorn cluster? What causes it to be lost in this instance but preserved in e.g. Sanskrit क्षम् (kṣam) or the root *tḱey-? —CodeCat 16:50, 19 July 2017 (UTC)

The explanation by Lipp (2009) is that *TKC- > *KC- (the cluster simplifies before a following further consonant). --Tropylium (talk) 17:05, 19 July 2017 (UTC)

Ah, ok. Is this sound change also applicable for Anatolian, i.e. "PIE proper"? —CodeCat 17:15, 19 July 2017 (UTC)

Only for Indo-Iranian, actually. Maybe someone else has argued extending it for the rest of IE, though. --Tropylium (talk) 13:26, 20 July 2017 (UTC)

Can a case be made for renaming it to *dʰǵʰmṓ then? —CodeCat 19:32, 20 July 2017 (UTC)

σχίζω

A wild aspirate appeared! Aren't Ancient Greek aspirates from the PIE aspirate series? Anti-Gamz Dust (There's Hillcrest!) 18:26, 19 July 2017 (UTC)

{{R:grc:Beekes}} simply states that there is no explanation. --Barytonesis (talk) 12:14, 20 July 2017 (UTC)

Siebs' law? Claimed Sanskrit cognates also feature aspiration. Crom daba (talk) 16:56, 20 July 2017 (UTC)

hearsal

Curious how/why this word needed the re- prefix attached to it (rehearsal) if it already had the same definition in its original form. My guess is it's because the etymology of "rehearse" is from Middle English which likely predates "hearsal"? -- OlEnglish (Talk) 05:28, 22 July 2017 (UTC)

The atrocious intrusion of a false Gaelic cognate for cog.

The etymology intrusion (and it was an intrusion) that Metaknowledge thankfully saw and corrected concerned a Gaelic cognate for cog which, although I got confused, does not exist. If anyone had no excuse for getting it wrong, it was I! I had all the stringent guidelines set out painstakingly on (my) user page for my guidance and beyond. If I do not adhere to them by not checking an etymology properly before editing any entry main page again, I shall personally get a blocking administrator to block me permanently! Andrew H. Gray 20:08, 24 July 2017 (UTC)Andrew

Taken to user talk pages. Anti-Gamz Dust (There's Hillcrest!) 01:23, 25 July 2017 (UTC)

rube

A "rube" who "just fell off a turnip truck" would combine two American colloquialisms. The etymology given at rube is that it is from the name "Rube", while no explanation is given for the latter. But a rube is a turnip! I'm skeptical of the first derivation, but if it is true, then it certainly would explain where the latter phrase came from. Wnt (talk) 12:35, 25 July 2017 (UTC)

Words for Medes: Aramaic מָדַי, Coptic ⲙⲁⲧⲟⲓ, Egyptian mdy, …

In his Coptic Etymological Dictionary, Černý gives the etymology of ⲙⲁⲧⲟⲓ (matoi) as “ mdy; mty, ‘Persian’, ‘Persia’, lit. ‘Mede’, through Aramaic Māday.” I’m not sure how to interpret ‘through’ here; what was borrowed into what, and when? Was the Aramaic word borrowed from Egyptian/Demotic and then back into Coptic? (Seems pretty unlikely.) Was the original Egyptian word borrowed from Aramaic? Anyone know where the Aramaic word comes from, or the source of this ethnonym in general? Our Greek entry at Μῆδος (Mêdos) is a dead end. — Vorziblix (talk · contribs) 13:41, 25 July 2017 (UTC)

@Vorziblix: Although what I am aware of may seem controversial to some scientific minds, the true origin of Aramaic Māday is actually the name of the third son of Japheth - םדי (Māday) - meaning uncertain; around four thousand three hundred and sixty-five years ago. Regards. Andrew H. Gray 18:08, 25 July 2017 (UTC)Andrew (talk)

Medes seem to have become relevant during the Neo-Assyrian empire, which apparently coincides with the rise of Aramaic as a lingua franca, so if the Egyptian term isn't from Akkadian it's most probably from Aramaic. I can't help with the Coptic situation though, but the Aramaic form is definitely not from Egyptian.

It's "Māda-" in Old Persian (someone who understands cuneiform should find the original spelling), and "mada" in Elamite (ditto for cuneiform).

Mayrhofer suggests Proto-Indo-European *mag- (as in English make) and Skalmowski *médʰyos (as in middle) for the ultimate origin of the name.

Crom daba (talk) 21:07, 25 July 2017 (UTC)

Thanks, that was good to know; I’m guessing Černý then meant that the Egyptian comes from the Aramaic. It’d be good to add the other info to an etymology section somewhere. — Vorziblix (talk · contribs) 05:41, 26 July 2017 (UTC)

Without getting into the strength of the claim itself, I interpret "through" the same as many of our (en.Wikt) etymologies that say "via", i.e. saying that the Egyptian word came from Aramaic but Aramaic was only a middleman and had taken the word from some third language. - -sche (discuss) 23:34, 27 July 2017 (UTC)

χάος

I would mention χέω as the etymon, as mentioned by Liddell & Scott.

for the meaning: "pour" versare => renversement (in French) i.e. disorder

which also explains χάιος, "good", i.e. well versed, bien tourné (in French)

--Diligent (talk) 06:35, 26 July 2017 (UTC)

I'm not sure if LSJ is saying that the word actually derives from the root of χέω (khéō), or if it's saying that someone else said that. Regardless, I don't see a clear way for the PIE root *ǵʰew- (other vowel grades *ǵʰu-, *ǵʰow-) to yield χα- (kha-), so that etymology could be mentioned as a historical theory on the part of someone, but not as a clearly explained derivation. — Eru·tuon 19:09, 27 July 2017 (UTC)

Proto-Sara etymologies and reconstructions

Khu'hamgaba Kitap (talk • contribs) has been adding etymologies referencing "Proto-Sara", a language we do not possess, and has even created RC:Proto-Sara/blày. I don't really know whom to ask about this, but we should either sanction or remove these etymologies. @Metaknowledge, Chuck Entz —John C5 07:18, 26 July 2017 (UTC)

These seem to be sourced from work by John Keegan (see e.g. bángàw); I don't see the problem in including them, though they probably need more attention with formatting, sourcing etc. --Tropylium (talk) 12:42, 26 July 2017 (UTC)

I've been meaning to bring that up here, myself. Aside from an African Languages class at UCLA thirty years ago and minor dabbling in Swahili, I haven't dealt much with sub-Saharan languages. Since this deals with creating a language code, we should see if @-sche has anything on the subject. Chuck Entz (talk) 13:45, 26 July 2017 (UTC)

Sorry about not citing this, but I do have some things to say. For one, I'd be fine if RC:Proto-Sara/blày was deleted, since there is no Proto-Sara code, but I think that the etymologies should stay. It was a mistake on my part that I forgot to cite my source, and I will fix it now. John Keegan's work doesn't actually include anything about the etymologies; instead, I used the book An Analysis of Proto-Sara by Olukayode Mudiwa. It just slipped my mind to cite it for some reason. I will add it to the articles right now. --Khu'hamgaba Kitap^{ᐅᖃᕐᕕᐅᔪᖅ - talk} 13:56, 26 July 2017 (UTC)

There you go, I've added the citations - e.g. à̰ȳ or bàhāy --Khu'hamgaba Kitap^{ᐅᖃᕐᕕᐅᔪᖅ - talk} 14:06, 26 July 2017 (UTC)

So why don't we just add Proto-Sara? Crom daba (talk) 17:06, 26 July 2017 (UTC)

Addition of language codes is usually handled at WT:RFM. I'm not sure if sar-pro is a good name. A search for incategory:Language_data_modules sar yields the code sar for Saraveca and the code sem-sar for the South Arabian languages. The Sara languages currently don't have a code either. Ideally the code for Proto-Sara should be related to the code for the Sara languages. Pinging @-sche, who does a lot with language codes. — Eru·tuon 19:16, 27 July 2017 (UTC)

Thanks for the pings. :) (When I first started editing, the person who understood the language code system best was Liliana; I learned from them and tried to document how it worked, and I'm glad there are now several users who understand how to formulate and add codes, even if I seem to be the go-to expert, heh.) As mentioned above, we'll need to add a family code and then a proto-language code based on it. As documented in WT:Families, the family code should start with the nearest ISO code: the Sara languages are Central Sudanic languages, csu, so I've added the code csu-sar for the family of Sara languages, and csu-sar-pro for Proto-Sara. - -sche (discuss) 23:19, 27 July 2017 (UTC)
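The naming pattern described in the comment above can be sketched in a few lines of Python. This is only an illustration of the convention as stated there (nearest ISO family code as prefix, then "-pro" for the proto-language); the function names are hypothetical and are not part of any Wiktionary module.

```python
def family_code(nearest_iso_code: str, family_abbrev: str) -> str:
    """Build a family code by prefixing the nearest ISO family code."""
    return f"{nearest_iso_code}-{family_abbrev}"


def proto_code(family: str) -> str:
    """Build a proto-language code by appending '-pro' to a family code."""
    return f"{family}-pro"


# Sara languages are Central Sudanic (csu), as stated in the discussion.
sara_family = family_code("csu", "sar")
proto_sara = proto_code(sara_family)
print(sara_family, proto_sara)
```

Deriving the proto-language code from the family code keeps the two related, which is the property asked for earlier in the thread.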

etymology of Hungarian words ending in -áció, -ikus, etc

Many Hungarian words with suffixes have etymology sections written in a way that I think is incorrect, and I wanted to ask more experienced contributors whether they should be edited. Below is an example, taken from the word arisztokratikus:

From German aristokratisch, from French aristocratique, from Ancient Greek ἀριστοκρατικός (aristokratikós) +‎ -ikus.

which implies that the French word aristocratique has a Hungarian suffix, while it's actually the original Hungarian lemma that has a Hungarian suffix. Do you agree with me that they should be edited, maybe in the following way:

From German aristokratisch, from French aristocratique, from Ancient Greek ἀριστοκρατικός (aristokratikós). With +‎ -ikus ending.

I am not sure this is correct though, in particular whether the suffix template should be used here, and also whether just

Equivalent to Ancient Greek ἀριστοκρατικός (aristokratikós) with +‎ -ikus ending.

should rather be used. For more examples see -ikus Epantaleo (talk) 23:09, 26 July 2017 (UTC)

^ Tótfalusi, István. Idegenszó-tár: Idegen szavak értelmező és etimológiai szótára (’A Storehouse of Foreign Words: an explanatory and etymological dictionary of foreign words’). Budapest: Tinta Könyvkiadó, 2005. →ISBN

They weren't formed in Hungarian, so the suffix shouldn't be shown at all. —CodeCa t 23:14, 26 July 2017 (UTC)

I am fine with reformatting, but I'd like to show the suffix since it is valid in Hungarian words. See reference Attila Mártonfi: The System of the Hungarian Suffixes, Theses of PhD Dissertation, Budapest, 2006, bottom of page 8. --Panda10 (talk) 16:29, 27 July 2017 (UTC)

The common practice across Wiktionary is to only show the affixes that were used in the creation of the word. —CodeCa t 16:59, 27 July 2017 (UTC)

But it is used in the creation of the word. The Hungarian word is arisztokratikus and not aristokratisch. How can we explain the -ikus ending? --Panda10 (talk) 18:16, 27 July 2017 (UTC)

The Ancient Greek original has it, and the French has a descendant of it. —CodeCa t 18:23, 27 July 2017 (UTC)

But it came to Hungarian from German. --Panda10 (talk) 18:40, 27 July 2017 (UTC)

Then why did Hungarian replace -isch with -ikus? I find that quite baffling. German putting in -isch is at least understandable because it's a native suffix, but -ikus is not native to Hungarian as far as I know. How does Hungarian deal with other words from German that have -isch? Do they all get -ikus? —CodeCa t 18:43, 27 July 2017 (UTC)

All I can say is that out of the 66 adjectives in Category:Hungarian adjectives suffixed with -ikus, 29 came from German -isch, 26 from Latin -icus, 6 from English -ic, 3 are native Hungarian derivation, 2 are not indicated. It's a small sample. But the -ikus suffix is productive. --Panda10 (talk) 19:06, 27 July 2017 (UTC)
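As a quick arithmetic check of the tally above, the following Python sketch sums the counts as quoted and converts them to percentages. The dictionary labels are just shorthand for the groups named in the comment; nothing here is taken from the category itself.

```python
# Counts as quoted in the comment above (small sample, 66 adjectives).
origins = {
    "German -isch": 29,
    "Latin -icus": 26,
    "English -ic": 6,
    "native Hungarian derivation": 3,
    "not indicated": 2,
}

total = sum(origins.values())

# Share of each origin group, as a percentage rounded to one decimal place.
shares = {name: round(100 * count / total, 1) for name, count in origins.items()}

print(total)
for name, share in shares.items():
    print(f"{name}: {share}%")
```

The counts do add up to the stated 66, with the German and Latin groups together making up a little over four fifths of the sample.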

We should probably format these as being entirely from Latin where applicable (so e.g. publikus is not strictly speaking suffixed), and note that -ikus, where needed for words formed within Hungarian, originates from the words loaned from a Latin equivalent.

For words that come straight from Latin but are analyzable (as in atletikus as compared with atléta), we could still add a mention that they're analyzable as atlet- + -ikus, maybe with just a mention of the suffix, not necessarily derivation.

If we want to be really systematic about this type of a thing, we could consider adding an intermediate type of semi-suffix category, for "words ending in" some analyzable component. This comes up in just about all languages that have borrowed substantial amounts of foreign technical vocabulary after all, not just Hungarian (is action = act + -tion?), and it also comes up with native suffixes, both where etymologically justified (is néz = né- + -z?) and where not. --Tropylium (talk) 14:34, 28 July 2017 (UTC)

The reason why this form is used is quite simple. There is a transit between Greek and Latin declensions. Greek adjectives such as ἀριστοκρατικός, ἀριστοκρατική, ἀριστοκρατικόν can be latinised into aristocraticus, -a, -um (singular, nominative forms). Hungarian words of Latin or Greek origin appear in their non-inflected, singular nominative form (as an agglutinative language, unlike German, English or Neo-Latin languages, it does not perceive that the Latin ending "-us" has only morphological function). This is the same in the case of the nouns formed with -iō (cf. reflexio with reflexió, or the inflected reflexionem with réflexion).--Martinus Poeta Juvenis (talk) 19:15, 27 July 2017 (UTC)

Thank you all who contributed their expert opinion above. I can reformat the entries; I just need to know the standardized approach that would be supported by the community. Two possible options:

Create Appendix:Hungarian words ending in -ikus. List all words ending in -ikus grouped by their etymological origin. Reference this appendix in each entry under See also header. Do not mention the -ikus ending anywhere in the entry.
Create Category:Hungarian words ending in -ikus. Add this category at the bottom of each entry. Do not mention the -ikus ending anywhere in the entry.--Panda10 (talk) 18:59, 28 July 2017 (UTC)

some calques

Entries like Papal States should use the calque template. Is that correct?

Would you think that a correction from

Translation of Italian Stati Pontifici, from Latin Status Pontificius

to

Calque of English Stati Pontifici, from calque of English Status Pontificius

is an improvement? Epantaleo (talk) 23:33, 26 July 2017 (UTC)

The calque template is nice because it automatically categorizes the entry as a calque, so yes, I’d say it’s preferable to use it. However, you have the language codes mixed up in your example, and using “from” between the calque templates sounds awkward; try something like this instead: Calque of Italian Stati Pontifici, which is in turn a calque of Latin Status Pontificius. (Edit: I don’t know if a calque template should be used for the second one at all; do we only use it for direct calques? If so, entries like Holy Ghost need to be changed.) — Vorziblix (talk · contribs) 00:12, 27 July 2017 (UTC)

Italian Stati Pontifici isn't an exact calque of Latin Status Pontificius because the former is plural and the latter is singular. Is Statūs Pontificiī also attested in Latin? —Aɴɢʀ (talk) 18:45, 27 July 2017 (UTC)

The calque template by default adds categories. These categories should only be added when the language of the entry is the one that is calquing. Since Italian is not the language of the entry, you have to either add the parameter |nocat=1 to suppress the categories, or not use the calque template at all. — Eru·tuon 19:23, 27 July 2017 (UTC)
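The rule in the comment above (only categorize when the calquing language is the language of the entry) can be sketched as a small Python helper that builds the template call. The helper name is hypothetical, and the positional parameter order (calquing language, source language, term) is an assumption about the template's syntax; only the nocat=1 parameter is taken from the discussion.

```python
def calque_markup(entry_lang: str, calquing_lang: str,
                  source_lang: str, term: str) -> str:
    """Return wikitext for a calque template call.

    Assumed parameter order: calquing language, source language, term.
    nocat=1 is added when the calquing language is not the entry's
    language, so the entry is not categorized as a calque wrongly.
    """
    parts = ["calque", calquing_lang, source_lang, term]
    if calquing_lang != entry_lang:
        parts.append("nocat=1")
    return "{{" + "|".join(parts) + "}}"


# English entry "Papal States": English calques Italian directly.
direct = calque_markup("en", "en", "it", "Stati Pontifici")
# Italian in turn calques Latin; Italian is not the entry's language.
indirect = calque_markup("en", "it", "la", "Status Pontificius")
print(direct)
print(indirect)
```

The alternative mentioned in the same comment, not using the calque template at all for the second step, avoids the question entirely.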

pal etymology

The article gives:

Angloromani phal, from Romani phral, from Sanskrit भ्रातृ (bhrātṛ), from Proto-Indo-European *bʰréh₂tēr.

Wouldn't पाल (pāla) be a more likely origin? -- Q Chris (talk) 11:09, 27 July 2017 (UTC)

Have come across a similar etymology to your first one, that is Angloromani phal, from Romani phral, from Sanskrit भ्रातृ (bhrātṛ) and it is more likely to be logical, since the latter idea raises doubts due to the absence of gradations over such a period of time gap. Andrew H. Gray 11:40, 27 July 2017 (UTC)Andrew (talk)

@Q Chris: No, that's completely wrong. How would Sanskrit reach Britain/America/any modern English-speaking population? In New Indo-Aryan languages pal =/= पाल (pāl, “protector”) either. —Aryaman ^{(मुझसे बात करो)} 14:28, 8 August 2017 (UTC)

@Aryamanarora: The etymology via Anglo-Romani (an Indic diaspora language) is widely accepted. I've heard this etymology for many years in many reputable sources. —John C5 15:04, 8 August 2017 (UTC)

@JohnC5: Oh no, of course that is the right etymology. I think Q Chris was suggesting पाल (pāla) as the source for pal (which is wrong, how would the "r" be explained in the Romani lemma?). I know what Angloromani is :) —Aryaman ^{(मुझसे बात करो)} 16:19, 8 August 2017 (UTC)

hit

In the etymology of English hit, we show Proto-Germanic *hitjaną. Just a sanity check (for me): Wouldn't PGmc *hitjaną produce Old Norse *hitja instead of hitta ? I'm having uncertainties about the PGmc reconstruction... Leasnam (talk) 16:46, 27 July 2017 (UTC)

*hittijaną is also a possibility I guess. —CodeCa t 16:58, 27 July 2017 (UTC)

~~Would *hititjaną/*hitatjaną, etc. also be a possible reconstructions ? Just wondering if the -ta on hitta is suffixal...~~ nm, I see now that in ON the verb was class 1 weak Leasnam (talk) 17:09, 27 July 2017 (UTC)

isimbongi

If there are any Zulu speakers out there, isimbongi could do with some help. It appears as Word of the Day in September. (Also, I can only find citations for the plural form, izimbongi, and not the singular isimbongi.) — SGconlaw (talk) 08:10, 28 July 2017 (UTC)

Quick check of a trusted dictionary plus a sweep of Google gives imbongi as the isiZulu singular (class 9) form with izimbongi as the class 10 plural. User:MDCorebear (talk)

Yeah, it looks like isimbongi is an English creation, probably an incorrect back-formation from the plural izimbongi. The real Zulu singular is imbongi. —CodeCa t 20:34, 22 August 2017 (UTC)

I've created a Zulu entry now. —CodeCa t 20:41, 22 August 2017 (UTC)

Could you please edit the etymology section of isimbongi to note this erroneous construction, together with a reference to the trusted dictionary? Also, given what you've found, is it still correct to say izimbongi is the plural form of isimbongi in English? — SGconlaw (talk) 01:56, 23 August 2017 (UTC)

@Metaknowledge, you created this entry. However, I can't seem to find any quotations supporting isimbongi as a singular form in English. I can only find izimbongi as a plural form, presumably of imbongi. The OED only lists imbongi and not isimbongi. Do you have any objections if I delete isimbongi, and instead create izimbongi as a plural of imbongi? Note that the word is scheduled to appear as the Word of the Day on 22 September 2017. (@CodeCat, for your information and comments, if any.) — SGconlaw (talk) 02:51, 18 September 2017 (UTC)

Nope, just a mistake. Thanks for fixing it. —Μετάknowledge^{discuss/deeds} 04:29, 18 September 2017 (UTC)

OK, thanks for confirming. I'll proceed as indicated above, then. — SGconlaw (talk) 04:32, 18 September 2017 (UTC)

śledź

Is the "tent peg" sense an extension of the "herring" sense (based on some resemblance of a shiny metal tent peg to a shiny, long, thin herring), or the verb, or does it have a different etymology? - -sche (discuss) 15:10, 29 July 2017 (UTC)

Dutch haring also has this combination of senses, so it may be a calque. —CodeCa t 11:40, 31 July 2017 (UTC)

a#Irish

Are all the senses from PIE *éy? Anti-Gamz Dust (There's Hillcrest!) 18:53, 30 July 2017 (UTC)

@Hillcrest98: Not all of them. The senses meaning "his", "her", and "their" certainly are from various genitive forms of it; the sense meaning "how" is probably ultimately the same as "his"; the relative particles and the pronoun for "all that, whatever" might be from it as well. The vocative particle and the preposition with verbal noun definitely aren't; the numeral particle probably isn't. —Aɴɢʀ (talk) 09:35, 31 July 2017 (UTC)

apple of Sodom

Our etymology does not include a supposed derivation from Hebrew. "Tapuah Sdom". See w:Calotropis procera. DCDuring (talk) 01:58, 31 July 2017 (UTC)

English leaf

The etymology at English leaf currently lists a derivation from Proto-Indo-European *lewbʰ-. However, that entry lists a meaning of love, with nothing at all about leaf.

Does anyone have any further insight into what's going on with our entries, and what the actual PIE etymon is of English leaf? ‑‑ Eiríkr Útlendi │^{Tala við mig} 06:00, 31 July 2017 (UTC)

Check Albanian labë and Latin liber, most presumed cognates point to *lewbʰ-, but it can't explain Latvian lapa, Albanian lapë, Russian лапоть (lapotʹ) or Ancient Greek λοπός (lopós) which Beekes claims is "pre-Greek" Crom daba (talk) 10:26, 31 July 2017 (UTC)

Could leaf be from a different PIE stem, say *lewbʰ-₂ ? Leasnam (talk) 17:21, 31 July 2017 (UTC)

Yeah, this is surely just some accidental homophony, we already have a few roots like that. Crom daba (talk) 18:09, 31 July 2017 (UTC)

Entries for PIE extensions

I think it would be useful to have entries for some PIE extensions. I'm wondering how to format them though. Here's an attempt. --Victar (talk) 06:14, 31 July 2017 (UTC)

Is Reconstruction:Proto-Indo-European/-ey- something that is actually recognised as an affix by linguists? Also, how is it an infix? All the examples are suffixal. —CodeCa t 09:50, 31 July 2017 (UTC)

This is the same suffix as in typical *-eye-iteratives, right?

But yeah, I don't think a suffix stops being a suffix just because it's followed by other suffixes. An infix would be something like *-n-, which attaches within a root (*TeK- → *Te-n-K- etc.) --Tropylium (talk) 22:40, 31 July 2017 (UTC)

Whoops, my mistake calling it an infix. Fixed. @Tropylium, see the examples here. --Victar (talk) 06:18, 1 August 2017 (UTC)

I cleaned it up and sourced Sihler. --Victar (talk) 07:21, 1 August 2017 (UTC)

Thank you. I can't find the passage in Sihler though. That page covers numerals in my edition. Also, our practice is to cite verb suffixes in the third person singular, like *-éyeti. Should this be moved to *-éyti? —CodeCa t 09:56, 1 August 2017 (UTC)

@Dghmonwiskos edited it to, I think, something closer to what you had in mind. If that's how it should actually look, I wonder if *-yéh₁- should be modeled in the same fashion. What do you think, @CodeCat:? Re:Sihler, you have the right page. *-ey- is just mentioned in passing, so we probably need some better sources. --Victar (talk) 14:16, 1 August 2017 (UTC)

*-yéh₁- is not an independent verb-forming suffix though, although it likely was one in earlier PIE. —CodeCa t 14:53, 1 August 2017 (UTC)

Hmm, OK. I thought *-ey- was similar to *-yéh₁-, forming the lexical aspect rather than the optative mood. --Victar (talk) 15:00, 1 August 2017 (UTC)

Aspects were still independent verbs in PIE. That's why their formation is so haphazard, they are essentially derivational in origin. It's like in modern Slavic. See also w:PIE verb which elaborates on the subject quite a bit. —CodeCa t 15:02, 1 August 2017 (UTC)

Thanks for explaining. So yes, let's move it to *-éyti, no? --Victar (talk) 15:08, 1 August 2017 (UTC)

I'm fine with that. User:Dghmonwiskos provided a thematic inflection in the entry, but I'm not sure if that is warranted. Athematic verbs were converted to thematic all the time. —CodeCa t 15:21, 1 August 2017 (UTC)

So then should pages like *tḱey- be reconstructed as *tḱéyti instead, and not as roots? --Victar (talk) 17:06, 1 August 2017 (UTC)

I'm not sure. In principle, these verbs aren't root verbs. But there are many derivations listed that typically only occur with roots, such as *tḱéy-tis ~ *tḱi-téy-s. So I'm not sure what the situation actually is. It is, maybe, telling that this suffix doesn't survive as a productive element in any language. Verbs with this suffix might not have been recognised as such by later speakers, and could have been reanalysed as root verbs. These new "roots" would have then had new words formed from them. Such reanalysis of certain elements as root is not unknown elsewhere in Indo-European. English stand is a nasal-infix present formed from a root that did not exist in PIE. The fact that it's a nasal-infix present shows that it was analysed as a root at some point, since such presents were only formed from roots. —CodeCa t 17:32, 1 August 2017 (UTC)

So the entries for *tḱey- and *dʰgʷʰey- should not be reanalyzed as verbs. Whatever may have caused their appearance (a reanalysis of a verb or some sort of verbal extension), they seemed to be functioning like full-fledged roots in even the oldest descendants and in PIE itself. They certainly deserve a root entry. —John C5 21:02, 1 August 2017 (UTC)

@CodeCat, JohnC5: Should the etymology be constructed as so?:

Reanalysed root of *tḱéyti, from *teḱ- (“to sire, beget”) +‎ *-éyti (*éy-present suffix).

--Victar (talk) 21:53, 1 August 2017 (UTC)

PIE *-eh₁i-

@CodeCat, JohnC5: I've seen some roots with what appear to be *-eh₁i- suffixes, like *skeh₁i- from *sek- (“to cut”). Are these extensions on reanalysed roots from the stative *-éh₁ti, ex. *sek- > *skéh₁ti > *skeh₁- > *skeh₁-i-? --Victar (talk) 22:27, 2 August 2017 (UTC)

Are there any sources on it? —CodeCa t 16:02, 8 August 2017 (UTC)

That's what I'm asking. I'm trying to understand the above transition. --Victar (talk) 20:14, 8 August 2017 (UTC)

Which sources say that there is a transition? —CodeCa t 20:15, 8 August 2017 (UTC)

Mallory/Adams cites *skeh₁i(-d)- as from *sek- ‘cut’ on pages 373-374, and 510, for example. We also have *pteh₁- (“to fall”) from *ped- (p. 401), *h₂meh₁- (“to mow”) from *h₂em- (p. 482). Mallory/Adams actually goes on to call *-eh₁- a deadjectival verb suffix on page 57. @JohnC5, any thoughts as well? --Victar (talk) 21:09, 8 August 2017 (UTC)

Reanalysed roots

@CodeCat, JohnC5: I had a couple of questions on reanalysed roots:

Are reanalysed roots actually based on a form that existed and was in use, or were they simply extensions, and intentionally added to roots to create new roots? And if that last statement is wrong, please correct me.
Should we place reanalysed roots in the extensions section on entries, with extensions like *-dʰh₁-, which don't modify the root as reanalysed roots do? Or do these belong in the Derived terms? See Reconstruction:Proto-Indo-European/kes-.

Thanks. --Victar (talk) 15:58, 15 August 2017 (UTC)

Roots never existed as such, because they weren't words any more than -ness is. I think that speakers would have treated any single-syllable stem as a root, though they must have understood that the thematic vowel was not part of the root. If a root was nonsyllabic while some verbal suffix had an e-grade alternating with zero grade, then that looked to speakers very much like a plain root verb, and reanalysis was possible. The next step would have been for speakers to create new formations derived from such roots, using derivations that were normally restricted to being directly from roots. Examples are the *-tis and *-tus nouns, the *-tós adjective, and characterised verbs such as nasal-infix presents or causatives. —CodeCa t 14:02, 16 August 2017 (UTC)

@CodeCat: Sorry, I think maybe I wasn't clear, or maybe I'm not following your reply. So, for example, take the reanalyzed root of the *éy-present suffix above, *tḱey-. Are we sure that an actual *tḱéyti verb ever existed and was in use, or did people just use *éy as an extension on the basis of the existing -éyti verb suffix, along with noun suffixes like *ksew- from *kés-u-s ~ *ks-éw-s? --Victar (talk) 18:09, 16 August 2017 (UTC)

That's one reason why I find the explanations given by sources a bit ad-hoc. There aren't really many instances of the intermediate verb existing directly. At the same time, though, if these kinds of formations were falling out of use, that would be a big reason for the reanalysis happening in the first place. If the formations remained transparent and productive to speakers, they wouldn't have been motivated to reanalyse them as roots. So I think we're digging into the earlier history of PIE here, looking at formations that were moribund in late PIE already. —CodeCa t 18:21, 16 August 2017 (UTC)

@CodeCat: Right. I suppose a prime example would be r/n-stems which became unproductive very early in PIE, requiring speakers to tack more productive suffixes onto them.

So to my second question though, should these reanalysed roots be in the extensions section or in the descendants list, as I've done on Reconstruction:Proto-Indo-European/kes-? --Victar (talk) 19:07, 16 August 2017 (UTC)

I think the "extensions" section is a cop-out. It's basically saying "there's just some random element stuck on the end of the root, but we don't understand it". I think it's bad linguistics to say that two terms are related, without also saying how they are related. —CodeCa t 19:11, 16 August 2017 (UTC)

Hah, OK. Glad I asked. *Victar is now off to better understand "dʰh₁-extensions"* --Victar (talk) 19:19, 16 August 2017 (UTC)

Slander

Have just been listening to a lecture by a Msr. Sagot, who immediately corrected a slanderous comment by a lady present that most of the etymologies on Wiktionary were from old-fashioned sources. That is certainly not true, as I can confirm that many are updated. In any case, most of the hardback dictionaries of the last century present safer etymologies than some of the recent, more scientific ones, which wander into wild connections and false assumptions such that even intelligent visitors to the site would negate! I feel that this should come to the notice of this organisation, in case of potential infamy. Andrew H. Gray 17:54, 28 November 2017 (UTC)Andrew (talk)
