I can definitely write a script that can be run to add empty translation tables to a list of French terms. The only thing is that I'd have to run the script because it requires a bot account and various things set up on my computer. The idea would be that you'd give me a list of pages and I'd run the script on those pages. We can repeat this as many times as is needed. If you're looking for something more interactive, you might want something written in JavaScript. For that, ask User:Erutuon, who is much better than I am at JavaScript. Benwing2 (talk) 10:55, 13 January 2020 (UTC)
Believe it or not, this along with its ca, es and pt children, has shown up with "out of time" module errors. If it helps any, none of these shows any categories, so it must be timing out before that part. The only thing I can think of is that there may be some kind of recursion or conflict due to Valencia being both a region/state in Spain and a city in California. Please have a look. Thanks! Chuck Entz (talk) 01:10, 27 January 2020 (UTC)
@Chuck Entz This was because I had "Valencia" as both an autonomous community of Spain and a city within that autonomous community. For now I've removed it from the city list. Benwing2 (talk) 01:19, 27 January 2020 (UTC)
@Koavf Yes, I know about that and I'm planning on deleting it. I've added support to {{place}} for major cities all over the world, and I've decided not to include the country name after the city. The logic is as follows: (a) nearly every existing city category does not include the country, and Category:en:Washington, D.C., USA is the only exception I know of; (b) including the country name causes problems e.g. for the city of São Paulo, where there is already Category:São Paulo, Brazil for the state of São Paulo (similarly for Rio de Janeiro). Benwing2 (talk) 06:13, 29 January 2020 (UTC)
re: "shiretown" parameter
I'm replacing a bunch of templates with new tone-requiring versions of them. For each, the former parameter |h= (or |head=) is now going to be the first positional parameter in the new template, with the other positional parameters being incremented by 1 when present. I need some help with the ones I can't easily do by hand, and Erutuon told me you have a script to do this. The replacements are:
I have another one, if you don't mind. I want to replace {{head|ny|noun plural form}} with {{ny-plural noun}}: the former parameter |h= (or |head=) is now going to be the first positional parameter in the new template (if neither of these parameters is present, the first parameter should just be the pagename), and the former |g= is going to be the second positional parameter (but it needs the first character stripped, so |g=c12 becomes |12). Sorry for the extra trouble with this one. —Μετάknowledge discuss/deeds 18:19, 2 March 2020 (UTC)
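For illustration only, the rewrite requested above could be sketched in Python roughly as follows. This is not the bot script that was actually run; the function name `convert_plural_head` and the regular expression are assumptions made for the example, and the sketch only handles flat template calls (no nested templates or links inside the parameters) and ignores any extra positional parameters.

```python
import re

# Matches {{head|ny|noun plural form|...}} with flat (non-nested) parameters.
# Hypothetical pattern for this sketch; real wikitext may need a proper parser.
_HEAD_RE = re.compile(r"\{\{head\|ny\|noun plural form((?:\|[^{}|]*)*)\}\}")


def convert_plural_head(wikitext, pagename):
    """Rewrite {{head|ny|noun plural form}} calls as {{ny-plural noun}}."""

    def rewrite(match):
        # Collect the named parameters (|head=, |h=, |g=) of the old call.
        named = {}
        for chunk in match.group(1).split("|"):
            if "=" in chunk:
                key, value = chunk.split("=", 1)
                named[key.strip()] = value.strip()
        # |head= or |h= becomes parameter 1; fall back to the page name.
        head = named.get("head") or named.get("h") or pagename
        parts = ["ny-plural noun", head]
        # |g=c12 becomes parameter 2 with its first character stripped.
        gender = named.get("g", "")
        if gender:
            parts.append(gender[1:])
        return "{{" + "|".join(parts) + "}}"

    return _HEAD_RE.sub(rewrite, wikitext)
```

Called with the page name as the second argument, a call such as `{{head|ny|noun plural form|g=c12}}` would come out with the page name as parameter 1 and `12` as parameter 2.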
@Metaknowledge OK, it is running. As there are over 40,000 pages using the template, it will take a little while. I'm running 5 processes, each one handling 8,000 pages running at 1 per second, so it should be done in a little over 2 hours. Benwing2 (talk) 04:22, 23 March 2020 (UTC)
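The estimate above can be checked with a little arithmetic. The helper below is only a sketch (the name `estimated_hours` is made up for this example) and assumes the pages are split evenly across processes that all run at the same steady rate.

```python
def estimated_hours(total_pages, processes, pages_per_second):
    """Wall-clock hours when pages are split evenly across parallel processes."""
    pages_per_process = total_pages / processes
    seconds = pages_per_process / pages_per_second
    return seconds / 3600


# 40,000 pages over 5 processes is 8,000 pages each; at 1 page per second
# that is 8,000 seconds, i.e. a little over 2 hours.
print(round(estimated_hours(40_000, 5, 1.0), 2))
```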
@Chuck Entz Sorry about that, I should have checked CAT:E after pushing those changes. I try to do that but occasionally forget. BTW, any time you see a sporadic error like this, it means I used the bot to push changes that I manually made to a text file, and the error is because I mistyped something. When I write a script to make the changes, typically you won't see any such errors, or if you do, you will see a lot :) Benwing2 (talk) 04:36, 4 March 2020 (UTC)
Bulgarian headwords and manual transliterations - bot request
It looks like there are some good changes mixed in with the bad ones. Can you identify the bad ones? If you can do that, I have a script I can use to undo those changes (even if there are subsequent changes from other users, although it looks like mostly there aren't). Ideally, save the HTML from the User contributions page and edit it; that will preserve the diff IDs. But that might be painful. If so, just select the lines from the User contributions page that contain the bad changes and paste into a text file, and I'll parse out the page names and undo the changes to those pages from this user. Note that there are some bad changes to pages that don't have чн in the page name, e.g. the change to тряпица. Benwing2 (talk) 01:36, 6 March 2020 (UTC)
@Victar I could write a script to do that, but (a) it would take longer than just manually reviewing the commits, (b) some of the changes (at least for Russian) were correct. Benwing2 (talk) 04:31, 6 March 2020 (UTC)
@Victar Yes. There are the most recent 19 contribs, and about 30 more on March 1st, and that's it. Do you have rollback privileges? If not, I can give it to you, it will make your life easier. Benwing2 (talk) 05:11, 6 March 2020 (UTC)
@Benwing2: Yes, thank you. The rest of the edits were OK. I wonder if manual (required) translits should have a hidden comment or something. I feel that I have to do it on a regular basis. I still feel that manual transliterations (even with commas) are important, like this one: Кузьми́нична f (Kuzʹmínična, Kuzʹmínišna). @86.134.66.200. --Anatoli T. (обсудить/вклад) 04:25, 6 March 2020 (UTC)
@Victar Damn, this guy is persistent. There are an awful lot of Iranian-related changes; if they need to be mass-reverted and can be done by a script, let me know how, and I'll see if I can write the script. Benwing2 (talk) 05:57, 6 March 2020 (UTC)
Yeah... annoying. I would actually even just settle for reverting any edits with pal and xpr. Those are the worst edits. They don't seem to realize that some characters can represent multiple transcriptions, which is why manual ones were set. --{{victar|talk}} 06:19, 6 March 2020 (UTC)
@Romanophile Yeah I don't think this is a good idea. Doing this makes things more obscure as you won't find e.g. Russia under either Europe or Asia when you'd expect it under both. BTW I don't think any country except Turkey and Russia should be listed as being in both Europe and Asia; the rest are only in Asia. Benwing2 (talk) 03:54, 16 March 2020 (UTC)
I was surprised to see these words marked with a long vowel, and looking through the edit history for verno, I noticed that you had added a macron to it in September 2019. I see that Lewis 1890 marks these words with a macron, and Bennett 1907's entry for vernus does also, simply stating "from vēr" (page 66).
However, vowels before a consonant cluster starting with a resonant are presumed to have been shortened in Latin at some point by a sound change, often called "Osthoff's Law" (a name that also applies to a similar sound change in Greek). Some exceptions are thought to be present in Classical Latin, but I don't know of any firm basis for supposing that the vern- words were such an exception. According to de Vaan 2008, the root of ver itself originally had a short vowel, and the long vowel found in Classical Latin vēr is secondary, resulting from compensatory lengthening when s was lost before n in the genitive: vesnos > ve:nos (with later replacement of ve:nos > ve:ros). Our entry in Wiktionary agrees with this account. De Vaan says that the vern- in vernus might come from either vesin- (in which case the e would have been short all along) or from ve:ri-n-. In the second case, Osthoff's shortening still seems like a possibility. "Osthoff’s Law in Latin", by Ollie Sayeed, assumes that vernus has a short vowel (page 157, in Indo-European Linguistics 5 (2017) 147–17).
So the etymological situation seems inconclusive, and as far as I know, there is no non-etymological evidence of a long e in the vern- words (e.g. in the form of either Latin-era inscriptions with apices, or distinctive Romance reflexes). Do you know more about this? --Urszag (talk) 05:04, 19 March 2020 (UTC)
@Urszag I added the long vowel based on Alatius's web page, which quotes Bennett but contains corrections from several later authors. vērnus isn't corrected so I took it as correct. I have heard the arguments about Osthoff's Law but AFAIK it applied long before the Classical Latin period; at least that's how I learned this law worked. There are several exceptions like fōrma (as shown by Spanish horma not *huerma, French fourme), vēndō, quīnque, probably vāllum, ūllus, sūrsum, etc. So I am skeptical there was an Osthoff's Law that applied late enough to make a big difference in Classical Latin. De Vaan is of the Leiden school, which has its own peculiar ideas about Indo-European linguistics, so I wouldn't take everything he says at face value. I don't know about Ollie Sayeed but I see he's a PhD student. Benwing2 (talk) 05:33, 19 March 2020 (UTC)
Yes, there certainly were words in Classical Latin that for various reasons had long vowels before resonant-initial clusters, as I mentioned. The date of the shortening law would only matter if vernus was a late formation from vēr: if it was formed early on, then one possible scenario is the vowel being shortened by Osthoff's Law and remaining short after Osthoff's Law ceased being actively applied. Is there any evidence that it was a late formation? --Urszag (talk) 06:23, 19 March 2020 (UTC)
@Urszag I don't have any evidence one way or other. But the assumption that Osthoff's Law continued to be applied well into the Classical period seems dubious to me given the large number of exceptions. It seems more logical to me that it applied very early on and then ceased to be active even before the Old Latin stage. Is there specific evidence that Osthoff's Law continued to apply into and past the Old Latin period? The only cases I know of where shortening before resonant + consonant seems to have occurred are before 'nt' and 'nd'.
Ultimately it seems that the best we can do is add a note indicating that there is some disagreement in the sources as to the length of 'vērnus/vĕrnus', maybe by writing it as 'vē̆rnus' with a note indicating which sources say it's long and which ones say it's short. Benwing2 (talk) 06:53, 19 March 2020 (UTC)
For many specific consonant clusters of the form RC, there aren't many examples of OL in Latin. Sayeed mentions shortening before -rn- in perna (>Spanish pierna), and before -mb- (from -ms- or -ns-) in membrum and in month names ending in -ember. Perna, membrum and -ember are supposed to be from PIE roots with long e (pages 156-157). --Urszag (talk) 07:25, 19 March 2020 (UTC)
@Urszag The problem is that perna and membrum look to be very old constructions, meaning Osthoff's Law might have applied at the Proto-Italic stage or even earlier (in fact, the Wiktionary entry for membrum effectively dates Osthoff's Law to Proto-Italic or earlier by assuming shortening already at the Proto-Italic stage), and it's far from obvious how the -ember nouns evolved. In fact, Wiktionary's etymology for september assumes that the -em comes from the end of septem, not from the originally long ē of mēns. All of this is not to say that Osthoff's Law couldn't have applied later as well, but I feel we need better evidence. Benwing2 (talk) 04:10, 20 March 2020 (UTC)
@Ilawa-Katawa Thanks. This happened because I accidentally deleted the line that demarcated the division between two adjacent pages when manually editing the text. This is not indicative of a larger issue, just human error :) Benwing2 (talk) 00:54, 27 March 2020 (UTC)
Bulgarian and Ukrainian pronunciation modules - minor fixes, hopefully
Hi,
When you have a chance (I know you're busy), could you please fix some of the most striking problems:
Handling of "я" and "ю" by Module:bg-pronunciation/testcases when they are not following consonants, should be , not like in Russian.
There are no cases yet but Module:uk-pronunciation shouldn't make "и" as in unstressed positions but always . I don't have a good reference on Ukrainian pronunciation. Ivan Štambuk must have based it on some old book, which is not so valid.
Thanks for that, also please make Ukrainian "е" to always be , also unstressed. This will make it a bit more phonemic but also more generic for most speakers.
Hiya, I've changed my mind about the Ukrainian module changes. It's better to follow some model, rather than mixing. Fixing the Bulgarian module's errors is essential. --Anatoli T. (обсудить/вклад) 21:43, 29 March 2020 (UTC)
WingerBot creating errors
I noticed a mass-creation of Bulgarian entries by your bot. They look good - with the right senses, stresses and inflections! How?! It takes a long time to create entries manually. Did you extract all of them from a dictionary? Did it take you long?
I want to ask if it's possible to repeat this feat in the future (theoretically) with Ukrainian and Belarusian entries. The Ukrainian and Belarusian inflections exist on public sites, and a smart bot would be able to load them. Russian might use some missing feminine forms, for example. Ivan Štambuk was able to mass generate Ukrainian entries with inflections, which is not easy to do if you have to do it manually. --Anatoli T. (обсудить/вклад) 09:11, 19 April 2020 (UTC)
@Atitarev I manually create a text file specifying the declensions, stresses, meanings, synonyms, antonyms, derived and related terms, and then use a script to generate the entries, and another script to push the entries using my bot. It takes a long time: the recent batch of 259 entries took 2 days to create. The time goes into looking up the entries, figuring out the inflection and meaning, and finding related terms. Running the actual scripts is fast. If I didn't have to worry about definitions or derived/related terms, it would go faster, for sure. The first few entries in the manually-created text file look like this:
n роб raw:From_West_Slavic,_from_{{inh|bg|sla-pro|*orbъ||slave}},_from_{{inh|bg|ine-pro|*h₃órbʰos||orphan}}._{{doublet|bg|раб}}. <+и+ове>|f=роби́ня|adj=ро́бски (also)(f)];];(f)];ux:]_]_'''роб'''|your_obedient_'''servant''' der:ро́бство
n ро́бство роб+-ство </n:sg> ],];syn:робу́ване;];syn:и́го rel:роб:роби́ня
n ро́ба - <> ],]
n хълм inh:sla-pro:*xъlmъ <+ове+и> ] der:хълми́ст
n щрих de:Strich <+и>|adj=щри́хов (l)],] der:щрихи́рам,щрихо́вам
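As a rough illustration of how lines like the ones above might be read by a generator script, here is a minimal Python sketch. It is not the actual script: the function name `parse_entry_line`, the field names, and the reading of the format (whitespace-separated fields for part of speech, headword, etymology, declension and definitions, followed by optional `key:value` fields such as `der:` and `rel:`, with `_` standing for a space and `-` for "no etymology") are all assumptions inferred from the examples.

```python
def parse_entry_line(line):
    """Split one entry-spec line into named fields (assumed layout, see above)."""
    fields = line.split()
    if len(fields) < 5:
        raise ValueError("expected at least 5 whitespace-separated fields")
    pos, term, etym, decl, defs = fields[:5]
    entry = {
        "pos": pos,
        "term": term,
        # Assumption: "-" means no etymology; "_" encodes a space.
        "etymology": None if etym == "-" else etym.replace("_", " "),
        "declension": decl,
        "definitions": defs.replace("_", " "),
        "extras": {},
    }
    # Remaining fields are assumed to look like "der:term1,term2".
    for field in fields[5:]:
        key, _, value = field.partition(":")
        entry["extras"].setdefault(key, []).extend(value.split(","))
    return entry
```

Each parsed dictionary could then be handed to a separate step that renders the wikitext for the entry, which matches the two-stage workflow described above (generate, then push with the bot).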
Hiya. How is it going? Do you want any help from me with Bulgarian, something I could do? The verbs are going to be a bit more complex but I hope there will be many more commonalities by type, for example, a missing verb ми́сля (míslja, “to think”) has many verbs with similar or identical conjugations of type 173ti in https://rechnik.chitanka.info/type/173ti. We will need to check if a common type also includes stress patterns. --Anatoli T. (обсудить/вклад) 23:29, 21 April 2020 (UTC)
@Atitarev Hey. I've written the code for multiword nouns and adjectives and I'm testing it now. Afterwards comes the verbs. It would help if you could sort out the different possible stress patterns of verbs, as I have little idea of the possible variations. Maybe make a table similar to User:Benwing2/test-bg-ndecl that lists various verbs and certain forms. I think Bulgarian verbs don't have stress movement within a single tense, unlike Russian, so it may be enough to list the following forms:
the first-person singular present (future for perfective verbs)
the first-person singular aorist
the first-person singular imperfect
singular imperative
singular imperative
masculine indefinite singular present active participle
masculine indefinite singular past active aorist participle
indefinite plural past active aorist participle
masculine singular past active imperfect participle
plural past active imperfect participle
masculine indefinite singular past passive participle
indefinite plural past passive participle
adverbial participle
singular indefinite verbal noun
Apologies for requesting so many forms per verb. I think it may be possible to do without the plural variants of the participles, which would eliminate three forms (14 -> 11). But it looks from Bulgarian conjugation that Bulgarian verbs are just very complicated. Benwing2 (talk) 02:15, 22 April 2020 (UTC)
This Latin noun is documented only in the dative and accusative. Is there a way to reflect this fact in the declension table? I could find no instructions in the template documentation for missing forms. --EncycloPetey (talk) 16:35, 22 April 2020 (UTC)
Incorrect bot edits
Hi,
You have done an amazing piece of work for Bulgarian. Thank you! There's always room for improvement but it seems the infrastructure is now in a very good shape.
Please let me know if you're interested in improving Belarusian and/or Ukrainian contents as well. I am more familiar with these languages but I can't give much technical advice and I don't think we'll get much help from others. Resources are somewhat better for Ukrainian. Some work on conjugation templates is currently going on at the Russian and Ukrainian Wiktionaries. The inflections are more complex and there is more variety than in Russian but even small improvements would be appreciated. I am OK to continue to add inflections manually but some templates don't even allow that, for example, Ukrainian templates can't handle reflexive verbs. --Anatoli T. (обсудить/вклад) 02:00, 13 May 2020 (UTC)
@Atitarev Sure, I can do some work on Ukrainian, esp. since the resources are better than for Belarusian. I will start by looking into the issue with reflexive verbs. If you could point me to any resources on Ukrainian declensions or conjugations, it would be helpful. Benwing2 (talk) 03:03, 13 May 2020 (UTC)
Thanks. You can start by analysing our currently used templates. For example, a transitive verb говори́ти (hovorýty) has a full paradigm. If you add any term in https://goroh.pp.ua/Словозміна, it will give you inflections, sometimes more than one for different senses. It's very good for nouns and adjectives. The verb conjugation is not 100% complete there. We will need to fill in the rest, or leave it unpopulated to be filled later. E.g. https://goroh.pp.ua/говорити doesn't provide participles, and gives only one type of the future tense, e.g. говори́тиму - I will talk (infinitive + му ending for 1st pers. sg). The missing type is formed the same way as in Russian, e.g. бу́ду говори́ти - I will talk. --Anatoli T. (обсудить/вклад) 03:29, 13 May 2020 (UTC)
BTW, many Ukrainian terms have audio files, so you can get a feel for how Ukrainian sounds. You may wonder if unstressed "и" is represented correctly by the pronunciation module. E.g. говори́ти is but you will hear . It is based on a more classical accent, which is fading away. --Anatoli T. (обсудить/вклад) 03:39, 13 May 2020 (UTC)
I have found another online resource for Ukrainian conjugations, the dictionary I have been using for a while - https://www.lingvolive.com/en-us/translate/uk-ru/давати - you need to scroll down. It doesn't give stresses here but it gives MORE forms missing at https://goroh.pp.ua/Словозміна. (The dictionary needs some getting used to. It annoys you with requests to log on but it won't do it again after the first query - you can switch languages and look up other words. Clicking refresh is always better and adding the term to the URL). --Anatoli T. (обсудить/вклад) 08:02, 13 May 2020 (UTC)
@Atitarev It looks like Ukrainian grammar is pretty similar to Russian grammar, so as a first approximation maybe we can use Zaliznyak's system of notating nouns and verbs. Benwing2 (talk) 01:50, 14 May 2020 (UTC)
Yes, there are many similarities.
For verbs, there are more forms - alternative future, роби́тиму/бу́ду роби́ти ("I will do"). Ukrainian is unique in this among Slavic languages, forming is easy but care should be taken for reflexive verbs - -сь/-ся are also used, like in Russian but the rules are different.
Many verbs in -ти have alternative infinitives in -ть, the tables could use those: розмовля́ти/розмовля́ть but not verbs like нести́.
Pluperfect tense is sometimes described, e.g. "чита́в був" (I had read), no need to include in tables, IMO.
Present active participle is normally missing, it may be confused with a noun, e.g. даючий. So, I am not sure if it should be included. In any case, it's a bit hard to find for each verb and find the right stress. Perhaps as a parameter, rather than an automatic feature.
"past_pasv_part_impers", is currently not displayed, e.g. "ро́блено" in tables but it should. It's very typical and is also used with intransitive verbs.
All nouns have vocative forms, even if they are not used in the real life, that's what grammar books do.
Some challenges will be with alterations, such о/і, г/з, к/ц, х/с, л/в, у/в, which are not present in Russian.
You can request inflections for different terms and I will try to make entries or just inflection tables. If you improve the current conjugation tables, it will make it easier for me, so that I could use reflexive as well. I will be able to add multiple different verb conjugations on a page, so that you could build a module, if this is what you're planning. Just let me know. If you want to focus on nouns first, that's fine. They are easier. You'll get exposed to many sound changes, which are common for Ukrainian. Some things are still very unfamiliar to me and even current adjective templates include a number of archaic/very rare forms. It's definitely more complex than Russian.
You can also take a look at Ukrainian Wiktionary inflection templates, some are good but far from comprehensive in coverage. --Anatoli T.(обсудить/вклад)
@PUC: Thanks, it would be good to label them for attention but maintenance categories for Belarusian are now quite large. I personally have no issue adding a stress mark at all. I am using SC Unipad, which decomposes diacritics and I just copy/paste the stress mark in the right place. At http://www.slounik.org/, you can see the underlined stress marks but you can't copy it. Belarusian headwords need a lot of rework too. --Anatoli T. (обсудить/вклад) 10:15, 31 May 2020 (UTC)
I am your student
Hello from el.wiktionary. I know nothing about Lua or computers, but I am fascinated by your work. And I learn a lot from your comments. I have been trying to unify templates for 'places'. Nothing complicated like your Module:place. I tried a /data page at el:Module:sarritest but I cannot make even one little link work, I cannot make them speak to each other. Is there a magic word for it? ‑‑Sarri.greek ♫ | 17:45, 23 May 2020 (UTC)
@Sarri.greek Can you explain exactly how it's not accepting terms with spaces or diacritics? I'm not quite sure how the module is being used and what error you're seeing. Benwing2 (talk) 16:44, 6 June 2020 (UTC)
It was a silly bug?? @Benwing2 I am so sorry to have bothered you. I thought it was some very difficult font issue. I have asked Lua support for small wikis at meta.wikimedia.org but no one answers. I thank you very much. I would never have solved it. ‑‑Sarri.greek ♫ | 21:48, 6 June 2020 (UTC)
Virile and nonvirile
@Benwing2 Definition of nonvirile from Appendix:Glossary: "In Slavic languages, a plural gender used for all groups that do not contain men, as well as plurals of masculine animate, masculine inanimate, feminine and neuter nouns. Contrast virile." Nonvirile and virile are typically abbreviated as "nv" and "vr" on templates and "nonvir" and "vir" in glossing (according to the Oxford Handbook) respectively by the way. İʟᴀᴡᴀ–Kᴀᴛᴀᴋᴀ (talk) (edits) 01:59, 26 May 2020 (UTC)
@Benwing2 There are three problems with the current execution. Firstly, the category descriptions are misleading in that they sound like they are singular whereas virility is strictly plural. I would write for the virile and nonvirile categories respectively: "Polish nouns that refer to a group with at least one male human." and "Polish nouns that refer to a group without male humans." Also, I think both categories should be (also?) subcategories of Category:Polish pluralia tantum since these categories by their nature refer to plural-only nouns. Lastly, Module:pl-headword can neither handle virility categorisation nor display the virile and nonvirile genders. While it looks to be a straightforward edit adding a couple of lines between lines 190 and 191, I am not remotely confident in my moduling skills. Thank you in advance. İʟᴀᴡᴀ–Kᴀᴛᴀᴋᴀ (talk) (edits) 16:58, 26 May 2020 (UTC)
@Ilawa-Kataka Apologies for not getting to it faster. I'm not sure it's correct to put virile nouns under pluralia tantum, though; this is a particularity of Polish. I think it should be added by the Polish module. Benwing2 (talk) 01:01, 28 May 2020 (UTC)
@Benwing2 Go ahead and change it, I am uncertain of any application of virility outside of Polish. Also, there is no need to be sorry, I was glad to get a bit of moduling experience. İʟᴀᴡᴀ–Kᴀᴛᴀᴋᴀ (talk) (edits) 01:12, 28 May 2020 (UTC)
Ojibwe verb categorization
Hi there,
I saw your recent creation of the Ojibwe verb "gaanda'an". I've been trying to sort out the Ojibwe verb classes, and was aiming for a 4-way classification: VII-VAI-VTI-VTA. I'm no expert in wiktionary editing, but this 4-way classification matches the consensus in Ojibwe grammar. That said, your "gaanda'an" entry (and i assume others) doesn't follow that, separating the transitivity from the animacy, which i don't think works because those two elements can't be disassociated from one another. I don't want to assume my classification (following the Ojibwe People's Dictionary and others) is most appropriate for wiktionary. So, before continuing, i thought i would check with you.
@SteveGat Hi. I guess you are referring to the categories that classify a verb separately by animacy and transitivity? I'm confused as to why this doesn't work. What is it about these two properties in Ojibwe that makes them so intimately bound? In many languages, nouns, verbs, etc. have multiple properties, and just because a verb is e.g. identified in the dictionary as "transitive inanimate" doesn't mean it needs to be categorized with both at once and not with the two separately. That said, I'm not opposed to creating a category like "Ojibwe transitive inanimate verbs" but in that case I think we should also categorize at least under "Ojibwe transitive verbs" for consistency with other languages. This can be accomplished by creating a special {{oj-verb}} template that takes a parameter to specify the transitivity and animacy; I can create that template for you if you want. Benwing2 (talk) 01:37, 29 May 2020 (UTC)
You can see my overall response in the Beer Parlour. Specifically on animacy versus transitivity: intransitive verbs have different paradigms depending on the animacy of the subject, while transitive verbs (which by definition have animate subjects) have different paradigms based on the animacy of the object. In other words, animacy isn't a feature of the verb itself, so classifying a transitive verb as "animate" (VTA - animate object) has a completely different meaning than classifying an intransitive verb as "animate" (VAI - animate subject). I hope this clarifies the issue, and i appreciate your helping me think this through. SteveGat (talk) 17:11, 29 May 2020 (UTC)
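To make the two proposals concrete, here is a small Python sketch of how a single class parameter (VII, VAI, VTI, VTA) could yield both a combined category and a plain transitivity category. It is only an illustration of the idea, not the {{oj-verb}} template or its module; the function name is invented, and apart from "Ojibwe transitive inanimate verbs" and "Ojibwe transitive verbs", which are mentioned above, the category names are assumptions.

```python
# Four-way classification: class code -> (combined label, transitivity).
# In VII/VAI the animacy is that of the subject; in VTI/VTA, of the object.
VERB_CLASSES = {
    "vii": ("inanimate intransitive", "intransitive"),
    "vai": ("animate intransitive", "intransitive"),
    "vti": ("transitive inanimate", "transitive"),
    "vta": ("transitive animate", "transitive"),
}


def ojibwe_verb_categories(verb_class):
    """Return the category names for one of the four verb classes."""
    combined, transitivity = VERB_CLASSES[verb_class.lower()]
    return [f"Ojibwe {combined} verbs", f"Ojibwe {transitivity} verbs"]


print(ojibwe_verb_categories("VTI"))
```

Keeping the combined label as a single unit avoids a bare "animate verbs" category, which would mix animate subjects (VAI) with animate objects (VTA).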
Small bot request
@Nueva normalidad: Hey. I don't know if you realised that it wasn't Benwing2 asking you the question. "You" must be referred to him.
The modules are complex but it's the grammar, variations and inconsistencies that are ridiculously complex in Ukrainian, especially nouns. I can attest that Benwing2 is bloody efficient at this and the modules are very powerful if used correctly. It's not only Ukrainian this year - Bulgarian modules for all inflections are also done this year. I think with the experience (Russian, Arabic, Latin modules) and the approach he has taken, many things are possible, you don't really have to speak the language well. That's why it's even more amazing. In any case, I can't complain about the development speed or accuracy, LOL. --Anatoli T. (обсудить/вклад) 02:03, 27 June 2020 (UTC)
Module:ang-verb
@Mahagaja I did this because User:Atitarev has been consistently making changes of the same sort manually. I figured it makes it easier to copy text from one entry to another because you don't have to manually enter the pagename. However, I didn't think about file name changes. Under what circumstances does this happen, and who runs the bots to handle these changes? Can you point to an example diff, as I don't recall having seen such changes? If it ends up making these bots not work, I could certainly do a run to undo all the changes, although it might be easier to fix the bots to handle this. Benwing2 (talk) 14:35, 19 July 2020 (UTC)
Here is an example where the bot CommonsDelinker changed the name of a file at Wiktionary because the name of the file was changed at Commons. That bot is run by c:User:Magnus Manske; I guess you'd have to ask him whether the bot would still recognize the name if it uses {{PAGENAME}}. Of course, files don't get their names changed very often anyway; it usually only happens when there's a typo in the old name. —Mahāgaja · talk 18:10, 19 July 2020 (UTC)
@Mahagaja Thanks. All of these audio files have a very simple and consistent naming format, e.g. Ru-альтернатива.ogg so they've probably been put there by bot and are unlikely to change. However, I'd definitely like to hear from c:User:Magnus Manske. Benwing2 (talk) 18:54, 19 July 2020 (UTC)
There's also the chance of them being deleted, since removing links to deleted files is CommonsDelinker's other task. Again, it's unlikely to happen, but it is theoretically possible that some such file was later found to be a copyvio or something and deleted, and if CommonsDelinker didn't know we were using it, we'd be left with a red link. —Mahāgaja · talk 19:36, 19 July 2020 (UTC)
@Mahagaja: I, in turn, adopted this method when I saw User:PUC using it in Belarusian and later Ukrainian and Russian entries. It is a big time-saver when mass-creating entries and should be embraced by others, IMO. If audio or image files get renamed, deleted or replaced, it is a good practice to change all linked entries and I've seen conscientious editors do exactly that with edit summaries. If it doesn't happen, then well, we'll have to fix it ourselves. There's no full protection from this, whether we copy the page name or just replace it with {{PAGENAME}}. I'd like to continue doing it, if there are no objections, and recommend doing the same for other languages where the same pattern works. If bots don't work with {{PAGENAME}}, then I guess they need to be trained to work with it. --Anatoli T. (обсудить/вклад) 02:51, 20 July 2020 (UTC)
@Atitarev: But surely saving time when mass-creating entries is no reason to change existing file names so that they use {{PAGENAME}} instead of the actual name? Can't you subst it in when mass-creating the entries? —Mahāgaja · talk 07:51, 20 July 2020 (UTC)
@Ilawa-Kataka If you wish. Not sure it will get anywhere, though; we had a discussion about a similar topic awhile ago and the upshot was there are differing opinions on which template name is preferred. Benwing2 (talk) 08:51, 22 July 2020 (UTC)
I think I remember that one, so never mind. But anyways when I categorise a page it is with {{c}} and I will assume that is not a problem unless I hear otherwise. In an unrelated matter, I left a template edit request at the Grease pit about the Kyrgyz declension template. Could you add singulare and plurale tantum support when you have some extra time? İʟᴀᴡᴀ–Kᴀᴛᴀᴋᴀ (talk) (edits) 19:27, 22 July 2020 (UTC)
@Atitarev: That is the one to which I was referring, though my memory failed me this morning. I admire your and Benwing's recent work on the Slavic languages and have been following it closely, though I do not know enough about those languages to really contribute. If you get to Polish I would be more active, but its infrastructure is in a much better state than several other Slavic languages so it's not a priority. İʟᴀᴡᴀ–Kᴀᴛᴀᴋᴀ (talk) (edits) 01:25, 27 July 2020 (UTC)
New requests
Latest comment: 4 years ago, 17 comments, 2 people in discussion
Hi,
Thank you for all the work - Bulgarian, Ukrainian and Belarusian inflections and headwords are so much better this year!
Are you able to run a bot to replace grave accents to acute accents for all Bulgarian templates?
Would you be interested in much smaller improvements for Macedonian (headword) and Hindi (inflections are simple)? --Anatoli T.(обсудить/вклад)
@Atitarev How much programming experience do you have? If you know how to program, writing modules isn't that hard, although you have to learn Lua (which is similar in many ways to Javascript and somewhat similar to Python). You can refer to Wiktionary:Scribunto to get started, and copy an existing module. Benwing2 (talk) 05:45, 5 August 2020 (UTC)Reply
@Atitarev I ran a bot to check for grave accents in Bulgarian headword templates but it found only one instance, азе. What other templates are you thinking of? I also added Module:mk-headword. I'm looking into Hindi nouns; do any nouns require manual translit? It looks like it's enough in most cases to specify the gender of the noun, and the rest can be inferred automatically. Benwing2 (talk) 02:27, 6 August 2020 (UTC)Reply
@Benwing2: Thanks. I come across grave accents all over the place. E.g. User:Benwing2/sla-pro-bg-redlinks shows a few like "бадàкам". I meant {{t}}, {{t+}}, {{l}}, {{m}}, {{cog}}, {{desc}}, etc. Many sorts of templates.
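For illustration, a bot pass of the kind requested here might look roughly like the sketch below. It is only a sketch under stated assumptions: the template list is taken from the comment above, the helper names are made up, nested templates are not handled, and only the combining grave accent (U+0300) is touched, so precomposed letters such as ѝ are deliberately left alone.

```python
import re

GRAVE = "\u0300"  # combining grave accent
ACUTE = "\u0301"  # combining acute accent

# Template names taken from the comment above; the list is illustrative only.
TEMPLATES = ("t", "t+", "l", "m", "cog", "desc")

# Matches {{name|bg|...}} for the listed templates (nested templates not handled).
TEMPLATE_RE = re.compile(
    r"\{\{(?:%s)\|bg\|[^{}]*\}\}" % "|".join(re.escape(t) for t in TEMPLATES)
)


def fix_bulgarian_accents(wikitext: str) -> str:
    """Swap combining grave for combining acute inside Bulgarian templates only."""
    return TEMPLATE_RE.sub(
        lambda m: m.group(0).replace(GRAVE, ACUTE), wikitext
    )
```

Restricting the replacement to the template span keeps accents in other languages on the same page untouched.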
Thanks for Module:mk-headword. I see that older features are preserved too. What functionality has been added? I see dim, adj, m2/f2, etc. are working now. Great!
As for Hindi transliterations, adding them for inflections seems cumbersome. I think you can use the automated transliteration in 99% of cases. The only problem is the shwa-dropping and dropping on the right syllable, which is the work of the translit module. In cases where a correct automated transliteration is impossible, an invisible virama symbol ् could be used to manipulate it, or a phonetic respelling could be given. Say I want अलार्म क्लॉक(alārm klŏk, “alarm clock”) to be displayed and declined in all forms as "alārm klŏk" (dropping "a"); it could be phonetically described as अलार्म् क्लॉक(alārm klŏk) (with a virama at the end of the first word). Perhaps the phonetic respellings could be used as an additional optional parameter. E.g. कॉफ़ी(kŏfī) can be spelled in a number of ways, with or without nuqta़, with ॉ or with ा, but |phon= could force the desired reading. So, I think respellings (exposed or non-exposed, to be discussed?) are better than manual transliterations. I am not sure yet if we want the respelling to be shown to the user, e.g. अलार्म् क्लॉक(alārm klŏk) for अलार्म क्लॉक(alārm klŏk).
@Atitarev I see. I think I can trick the translit module into tracking cases where grave accents occur. As for Hindi, the manual transliterations would be needed only where the translit module gets it wrong, similar to Russian transliterations, and only on the lemma, not on the inflections. The declension module will take care of adding the translits to the inflections. So it looks like the choice is between manual translits and respellings. It's the same amount of work in the module in either case. As a first effort I will probably not worry about this, as the existing declension templates don't support it; I can add it afterwards. BTW I don't see the virama at all that you added, either in the normal display or when editing. What do you want the nuqtaless-form-of category to be called? Something like Category:Hindi nuqtaless forms? Benwing2 (talk) 03:14, 6 August 2020 (UTC)Reply
OK, just automating the inflections is a good start.
Virama is a little diacritic at the end of अलार्म् (at the bottom) - letter by letter: अ ल ा र ् म ्. How about ◌्? Can you see it here? I am using Chrome. You can download SC UniPad, which is very good for decomposing conjuncts, diacritics, accents, etc. You can copy/paste most of the symbols you need (good for vocalising Arabic sentences, for example).
@Atitarev For the example you give, I see viramas after the fourth and fifth letters and under the dotted circle. But it looks like you're putting them under spaces instead of letters, which is probably why I see them. I am also using Chrome, on a MacBook Pro. Benwing2 (talk) 03:41, 6 August 2020 (UTC)Reply
I have only put virama under spaces in this example अ ल ा र ् म ्, after र (r) and after म. Normally अलार्म(alārm) is spelled with one virama after r. You can't see it, as it's all merged into a conjunct र्म, with र appearing as a little hook on the top. Without a virama, रम looks like two separate letters and would be pronounced as "ram" (with the final inherent "a" dropped). अलार्म् with two viramas, after r and m, is a (possible) respelling to force the final inherent "a" to also drop, i.e. "alārama" becomes "alārm". Virama is also called a vowel/shwa killer, I think.
Just FYI. I just found a good example where a respelling or manual transliteration would be required: मेहनती(mehantī), spelled म(ma) + े(e) + ह(ha) + न(na) + त(ta) + ी(ī), should be pronounced as "mehnatī", not "mehantī". The way to tell which shwa (a) should be dropped could be मेह्नती(mehnatī) (not the actual spelling), just applying a virama after the consonant ह(ha) where the inherent vowel "a" should be silent - ha -> h.
Hindi Devanagari is very phonetic. The few issues with spellings are that shwa-dropping rules (not complicated for humans) sometimes don't work with automated transliterations, or syllabification can't be predicted, like in this case or in loanwords.
Another issue is that Hindi speakers (writers) typically fail to write nuqta (dot), not unlike how Russians change ё to е, or don't always use a couple of other diacritics: ॊ = 'ǒ', ॆ = 'ě', ॉ = 'ŏ', creating cases where pronunciations differ from spellings. At Wiktionary we make terms with nuqta and with strict spellings the main forms, just like we do with Arabic (hamza-less alif, etc.) or Russian - less formal native spellings. --Anatoli T.(обсудить/вклад)01:11, 7 August 2020 (UTC)Reply
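The virama-respelling idea above can be illustrated with a toy transliterator. This is a hedged sketch, not Wiktionary's actual Hindi translit module: it covers only the handful of letters needed for the example, it does no schwa deletion apart from dropping the word-final inherent "a", and the function name is hypothetical. It only shows that an explicit virama in the respelling pins down which inherent vowel is silent.

```python
VIRAMA = "\u094d"  # ्

CONSONANTS = {
    "\u0915": "k",  # क
    "\u0924": "t",  # त
    "\u0928": "n",  # न
    "\u092e": "m",  # म
    "\u0930": "r",  # र
    "\u0932": "l",  # ल
    "\u0939": "h",  # ह
}
INDEPENDENT_VOWELS = {"\u0905": "a", "\u0906": "ā"}          # अ, आ
VOWEL_SIGNS = {"\u093e": "ā", "\u0940": "ī", "\u0947": "e"}  # ा, ी, े


def naive_translit(word: str) -> str:
    """Transliterate letter by letter; a virama suppresses the inherent 'a'."""
    out = []
    for i, ch in enumerate(word):
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            # Inherent vowel, unless word-final or followed by a vowel sign/virama.
            if nxt and nxt != VIRAMA and nxt not in VOWEL_SIGNS:
                out.append("a")
        elif ch in INDEPENDENT_VOWELS:
            out.append(INDEPENDENT_VOWELS[ch])
        elif ch in VOWEL_SIGNS:
            out.append(VOWEL_SIGNS[ch])
        elif ch != VIRAMA:
            raise ValueError(f"letter not covered by this sketch: {ch!r}")
    return "".join(out)


SPELLING = "\u092e\u0947\u0939\u0928\u0924\u0940"          # म े ह न त ी
RESPELLING = "\u092e\u0947\u0939\u094d\u0928\u0924\u0940"  # virama added after ह

print(naive_translit(SPELLING))    # mehanatī (no schwa deletion attempted)
print(naive_translit(RESPELLING))  # mehnatī
```

A real module would apply its schwa-deletion rules to the plain spelling; the respelling simply removes the ambiguity before those rules run.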
Hi, @Benwing2 and @AryamanA. Thank you very much for improving Hindi modules and templates! The complexity actually exceeded my original knowledge and I can't access good grammar resources, so I wasn't able to contribute much at the end, especially with verbs and adjectives. I am sure I will get back to Hindi work later. I am currently busy learning Korean. --Anatoli T.(обсудить/вклад)07:36, 30 September 2020 (UTC)Reply
Kyrgyz
Latest comment: 4 years ago, 4 comments, 2 people in discussion
@Ilawa-Kataka Hi. I'll get to these soon. The issue with ь can maybe be handled by an argument, something like |keep_soft_sign= for those cases where the soft sign should not be dropped. Benwing2 (talk) 01:43, 7 August 2020 (UTC)Reply
How about |soft_sign=b(oth)/y(es)/n(o) or just |soft=? For Тайвань I found both in use (with Тайвандын getting a few thousand results and Тайваньдын a few hundred—both used in professional contexts). İʟᴀᴡᴀ–Kᴀᴛᴀᴋᴀ (talk) (edits) 00:13, 8 August 2020 (UTC)Reply
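A rough sketch of how such a parameter could behave is given below. The function names are hypothetical, the suffix is passed in rather than derived by vowel harmony, and the reading of the values (y = keep the soft sign, n = drop it, b = give both forms) is an assumption based on the two comments above.

```python
def stems(lemma: str, soft_sign: str = "n") -> list[str]:
    """Return the stem(s) to attach suffixes to, depending on soft_sign."""
    if soft_sign not in ("y", "n", "b"):
        raise ValueError("soft_sign must be 'y', 'n' or 'b'")
    if not lemma.endswith("ь"):
        return [lemma]
    dropped = lemma[:-1]
    if soft_sign == "y":
        return [lemma]
    if soft_sign == "n":
        return [dropped]
    return [dropped, lemma]  # 'b': both variants, dropped form first


def inflect(lemma: str, suffix: str, soft_sign: str = "n") -> list[str]:
    """Attach a ready-made suffix to every stem variant."""
    return [stem + suffix for stem in stems(lemma, soft_sign)]


print(inflect("Тайвань", "дын", soft_sign="b"))  # ['Тайвандын', 'Тайваньдын']
```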
Latest comment: 4 years ago, 1 comment, 1 person in discussion
I was wondering if, for example, you know a good formal source of the conjugation rules for Russian verbs?
I see you're the author of the rather impressive ru-conj table, and was wondering if that work was based on some formal set of rules codified somewhere?
I'd like to have a crack at rewriting the conjugation system for my own learning, and was wondering if there's an authoritative source, rather than interpreting your code.
Thanks
The authoritative reference is called Грамматический Словарь Русского Языка by A.A. Zaliznyak. It's in Russian so you have to be able to read it at least somewhat, or make liberal use of Google Translate. Benwing2 (talk) 00:48, 22 August 2020 (UTC)Reply
Latest comment: 4 years ago, 8 comments, 5 people in discussion
Hello Benwing2!
I was wondering if it would be possible to write a module that creates automatic German IPA pronunciations, just like Module:fr-pron does for French words. German pronunciations seem to follow certain fixed rules, at least for Hochdeutsch, but certainly it would require quite some effort to write it. LinguisticMystic (talk) 11:06, 28 August 2020 (UTC)Reply
@LinguisticMystic: But you wouldn’t know before pleongraphs whether the vowel is the long or the short one. And it can’t know the stress in prefixed verbs where the prefix can be either separable or inseparable, like umfahren. Orthographic depth is lower than in French but with these things there is this more left unexpressed in the script. Fay Freak (talk) 11:23, 28 August 2020 (UTC)Reply
@LinguisticMystic, Fay Freak, Atitarev Not impossible. I implemented a similar module for Old English, which has many of the same concerns as German. As Anatoli notes, you would need respellings to accommodate unpredictable elements. This would include long vowels where short vowels are expected (indicated by adding a macron over the vowel or an h afterwards), short vowels where long vowels are expected (indicated by adding a breve over the vowel or doubling the following consonant), unpredictable stress due e.g. to prefixes like um- or loanwords (indicated by an acute accent over the appropriate vowel), secondary stress (indicated by a grave accent and/or by adding a hyphen to divide components of compound words), glottal stops in the middle of words, etc. In Old English respelling, for example, I used < to separate an unstressed prefix that can't be predicted, > to separate an unstressed suffix that can't be predicted, + to prevent prefix separation that would normally occur, and - to separate components of compound words that receive individual stresses. More respellings are required than in French, for sure, but it's not impossible. Benwing2 (talk) 01:55, 29 August 2020 (UTC)Reply
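To make the marker conventions described above a little more concrete, here is a minimal sketch of how a respelling could be split into labelled pieces before any sound rules are applied. It is not the code of the Old English module; the function name, the labels and the sample respelling are hypothetical, and the + marker (which prevents prefix separation) is not handled.

```python
import re


def split_respelling(respelling: str) -> list[tuple[str, str]]:
    """Split on '<', '>' and '-' and label each piece.

    '<' marks the piece before it as an unstressed prefix,
    '>' marks the piece after it as an unstressed suffix,
    '-' separates compound components that each keep their own stress.
    """
    parts = re.split(r"([<>-])", respelling)
    pieces, separators = parts[0::2], parts[1::2]
    labels = ["stressed component"] * len(pieces)
    for i, sep in enumerate(separators):
        if sep == "<":
            labels[i] = "unstressed prefix"
        elif sep == ">":
            labels[i + 1] = "unstressed suffix"
    return list(zip(pieces, labels))


# Hypothetical respelling, for illustration only.
print(split_respelling("be<cum>an"))
```

Later stages could then assign stress per component and run the letter-to-sound rules on each piece separately.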
I see the main problem being education about the need for respellings. There have been far too many bad edits consisting of replacing hardcoded IPA with empty language-specific IPA templates. Module errors are one option, but the worse offenders are mass-editors (often IPs) who don't preview and don't look at the entry after they click "Publish changes". There's also using the technique by which {{taxlink}} alerts editors that there's already an entry for the taxonomic name, but there again, it only shows in preview. The question then becomes whether we're creating an attractive nuisance, like an unprotected pit that pedestrians fall into if they're not paying attention. Chuck Entz (talk) 02:26, 29 August 2020 (UTC)Reply
The situation differs per language and depends on how many conscientious and thorough editors exist per language. Manual IPA is no guarantee against errors, either. We do correct such bad edits when we see them. Modules can trap a few errors or slack editing, like headwords requiring accents; the same could be done for pronunciation modules. German is not very different from many languages already automated and it has a high level of predictable pronunciation. Like Slavic languages, it may require stress marks, a list of prefixes that may determine the stress, respellings, and marks for morpheme or component-word boundaries.
Module:de-IPA/testcases already uses |orig=. (German is known to retain partial or full pronunciation from source languages.)
There hasn't been much support from native speakers, the obstacle being variations in pronunciations, even for Hochdeutsch. I personally think it would be good to choose a variety/standard and stick to it for the purpose of module development. Regionalism/versions could be added later, if they are really needed. --Anatoli T.(обсудить/вклад)02:53, 29 August 2020 (UTC)Reply
Latest comment: 4 years ago, 11 comments, 2 people in discussion
A lot of people forget to use the |nobycat=1 parameter for people who have only ever made one coinage, causing a bunch of bot-created categories that will only ever have one entry (in which case they shouldn't exist). Conversely, it is possible that people may use |nobycat=1 in error, not knowing that somebody does in fact have multiple coinages to their name. Would it be possible for you to do regular bot runs to add or subtract this parameter as necessary, and delete the empty categories that will result? —Μετάknowledgediscuss/deeds02:39, 30 August 2020 (UTC)Reply
@Metaknowledge This is running. It output some warnings that you should probably look at:
Page 159 star-crossed: WARNING: Lang en, coiner William Shakespeare has 6 total words coined but has nocat=1: {{coin|en|William Shakespeare|in=1597|nocat=1}}
Page 256 brave new world: WARNING: Lang en, coiner William Shakespeare has 6 total words coined but has nocat=1: {{coin|en|William Shakespeare|in=1610|nocap=1|nocat=1}}
Page 258 commonty: WARNING: Lang en, coiner William Shakespeare has 6 total words coined but has nocat=1: {{coin|en|William Shakespeare|in=1594|nocat=1}}
Page 376 primrose path: WARNING: Lang en, coiner William Shakespeare has 6 total words coined but has nocat=1: {{coin|en|William Shakespeare|in=1609|nocat=1}}
Page 531 cold fish: WARNING: Lang en, coiner William Shakespeare has 6 total words coined but has nocat=1: {{coin|en|William Shakespeare|in=1611|nocat=1}}
Page 647 all the world's a stage: WARNING: Lang en, coiner William Shakespeare has 6 total words coined but has nocat=1: {{coin|en|William Shakespeare|in=1599|nocat=1}}
Page 684 Ailurus: WARNING: Lang mul, coiner Frédéric Cuvier has 2 total words coined but has nocat=1: {{coin|mul|Frédéric Cuvier|in=1825|nat=French|nocap=1|nocat=1|occ=zoologist|occ2=paleontologist}}
Page 721 Ailurus fulgens: WARNING: Lang mul, coiner Frédéric Cuvier has 2 total words coined but has nocat=1: {{coin|mul|Frédéric Cuvier|in=1825|nat=French|nocat=1|occ=zoologist|occ2=paleontologist}}
Page 101 ambivalence: WARNING: Lang en, coiner Eugen Bleuler has 2 total words coined but has nocat=1: {{coinage|en|Eugen Bleuler|in=1910|nat=Swiss|nocat=1|occ=psychiatrist}}
Page 215 autism: WARNING: Lang en, coiner Eugen Bleuler has 2 total words coined but has nocat=1: {{coinage|en|Eugen Bleuler|in=1912|nat=Swiss|nocap=1|nocat=1|occ=psychiatrist}}
Thank you! I think it would be helpful to both add and subtract the nocat parameter in an automated way; these should all have it removed, and I don't think a human needs to look over this kind of report in general. —Μετάknowledgediscuss/deeds06:16, 22 November 2020 (UTC)Reply
@Metaknowledge Done. I purged all the subcategories of 'English coinages' so they get re-sorted by last name, but it will take a little while for subcategories of other languages to get sorted by last name. Benwing2 (talk) 04:56, 20 December 2020 (UTC)Reply
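The check behind the warnings above can be sketched roughly as follows. This is not the actual bot script: the function names are hypothetical, pages are passed in as a plain dictionary of title to wikitext, nested templates are not handled, and the flag name follows the log output (nocat), whereas the request speaks of |nobycat=1.

```python
import re
from collections import defaultdict

COIN_RE = re.compile(r"\{\{(?:coin|coinage)\|([^{}]*)\}\}")
FLAG = "nocat"  # name as it appears in the log; treat as an assumption


def parse_args(arg_string: str):
    """Split template arguments into positional and named ones."""
    positional, named = [], {}
    for arg in arg_string.split("|"):
        if "=" in arg:
            key, value = arg.split("=", 1)
            named[key.strip()] = value.strip()
        else:
            positional.append(arg.strip())
    return positional, named


def plan_flag_changes(pages: dict[str, str]):
    """Return (titles needing the flag added, titles needing it removed)."""
    counts = defaultdict(int)
    uses = []  # (title, (lang, coiner), has_flag)
    for title, text in pages.items():
        for match in COIN_RE.finditer(text):
            positional, named = parse_args(match.group(1))
            if len(positional) < 2:
                continue  # no language/coiner pair to count
            key = (positional[0], positional[1])
            counts[key] += 1
            uses.append((title, key, FLAG in named))
    add = [t for t, key, has in uses if not has and counts[key] == 1]
    remove = [t for t, key, has in uses if has and counts[key] > 1]
    return add, remove
```

Counting first and deciding afterwards means a single pass over the pages is enough for both the "add" and the "remove" direction.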
Latest comment: 4 years ago, 16 comments, 3 people in discussion
Hello! Sometimes we need to quote only the title of a work to illustrate words, and in such cases, it is desirable, while using the {{quote-book}} or the {{quote-journal}} template, to hide the colon that appears at the end of the bibliographical information, as we are not giving the passage in such cases. So, could you possibly create a parameter something like |nocolon= to that effect? Thank you. —inqilābī10:58, 31 August 2020 (UTC)Reply
My two cents: (1) try not to use just a title as an example wherever possible; (2) just repeat the title as the text to be quoted using the |passage= parameter, and make the cited word boldface there. That way, there won't be unusual boldface in the title of the work. — SGconlaw (talk) 12:10, 31 August 2020 (UTC)Reply
@Sgconlaw: Thanks for your suggestions. I myself have used a title as an example only once, but I have seen quite a few ones that other users have added earlier (such as this); and I have seen that they did not use any templates: so, would it not be better to not use templates while quoting titles, as a solution to the problem? But I do not like the idea of repeating the title as the passage, because that would give the impression that the author themself wrote that way. —inqilābī11:13, 1 September 2020 (UTC)Reply
@Lbdñk: My main point is the first one – titles are not really good illustrations of terms as they are usually very short and don’t provide enough context of the terms. Sometimes they do no more than repeat the term, so I don’t see much value in them. — SGconlaw (talk) 11:38, 1 September 2020 (UTC)Reply
I also wanted such a thing, albeit not because of quoting titles but to give other editions/loci containing the same text, although it would also be useful for the aforedescribed purpose because I often refrain from typing out a whole passage and link to a scan instead (transcribing the editions is extra work that begins at even deciding where to begin the quote if it is a medieval text not using punctuation, or worse if it is a tablet in a fragmentary state); for the purpose of multiple loci the template should also not cause a line break. Fay Freak (talk) 20:21, 1 September 2020 (UTC)Reply
@Lbdñk: For colon and line break options? Why, this is an uncontroversial functionality extension. It just enables to do the things we do anyway more orderly. Fay Freak (talk) 11:50, 2 September 2020 (UTC)Reply
@Fay Freak: I meant the parameter to hide the colon: let us see if the community supports this proposal... But I did not get what you mean by "line break" & "multiple loci". —inqilābī11:59, 2 September 2020 (UTC)Reply
@Fay Freak: What is the need to provide multiple editions? Never seen such a thing before. Do not take so much trouble: just give the earliest or the major edition. —inqilābī15:43, 2 September 2020 (UTC)Reply
@Lbdñk: Readings, variants? Not everyone has the same edition? It is good for SEO? Wrong question. It is the superobligatory. There is no need, but we like to do things better or more than is required. It’s free! @Sgconlaw also often adds bibliographic information to the quotation templates which are “not needed” but he does it because he can. Fay Freak (talk) 16:05, 2 September 2020 (UTC)Reply
Providing multiple editions and providing full bibliographical informations are wholly otherly things. The latter is meedful, the former is frowned upon. —inqilābī16:27, 2 September 2020 (UTC)Reply
@Fay Freak: if there's a need for some reason to state more than one edition (for example, a reprint of a first edition), you can just use the additional parameters ending with a "2" (|title2=, |location2=, |publisher2=, etc.) as described at {{quote-book}}; it isn't necessary to use a new quotation template. But note that "Wiktionary:Quotations" states: "The year should be that of the earliest edition known to use the word. Where feasible, the page number should be taken from the first edition, but if a later edition is used (found in a library or digitised by Google Books, etc), then the publication date should be added in parentheses after the publisher’s name. In these cases, publication details should reflect the work actually cited: do not give the name, location etc. of the publisher of the first edition if you are not citing it directly." — SGconlaw (talk) 17:43, 2 September 2020 (UTC)Reply
@Sgconlaw This is all aimed at English quotes. There technically isn’t a first edition if the text circulated in manuscript, no reprint of works written before the age of printing (which is for Arabic-script lands before 1800). The year in |year= should of course be that of the composition – which is often approximated –, not the publication year of the earliest edition as claimed by Wiktionary:Quotations, that is also |year_published= (if the first edition is given). And usually modern Arabic editions of medieval texts are better than the 19th-century editions of Orientalists who had few manuscripts, which are however nonetheless quoted, but the editions stand side by side. Fay Freak (talk) 18:06, 2 September 2020 (UTC)Reply
@Fay Freak: gotcha, in which case you can still use the “2” parameters as I mentioned. That would avoid the problem of the extra colon and line break in your scenario. — SGconlaw (talk) 18:47, 2 September 2020 (UTC)Reply
, is an oversimplification; the supine ōsus does exist, and the verb is also optionally semi-deponent—ōsus sum was used for ōderam (e.g. Plautus, Amphitryo 900). Therefore, the header should be
. This, however, creates problems: ōdisse disappears and ōsus sum is shown to be in the same tense as ōdī. I honestly don’t know how it should be displayed in the end (whether the present infinitive should still be displayed or instead of it ōderam and ōsus sum). Thorny stuff. --Biolongvistul (talk) 14:34, 15 September 2020 (UTC)Reply
Sanskrit reverse transliteration
Latest comment: 4 years ago, 2 comments, 1 person in discussion
Latest comment: 4 years ago, 8 comments, 5 people in discussion
Hey, you know some Arabic. Is this IP actually trying to contribute or just messing with stuff? It looks like the latter to me, but I don't know what I'm talking about. —Globins (yo) 05:09, 2 October 2020 (UTC)Reply
@Globins Most of the changes by this IP look fine, but not the change to صهيون. I don't know why the IP keeps edit-warring over this. The vocalization ending in -awn is not normal in Arabic, for sure. Benwing2 (talk) 05:18, 2 October 2020 (UTC)Reply
@Fenakhay, Benwing2, Atitarev: Since one does not learn it systematically, and I have only become cognizant of it because of adding many of these words, I note that there is a pattern where Standard Arabic allegedly has KiLMawN or KiLLawM, and the classical dictionaries list only this, but the most real pronunciation, which the Saudi IP despises and which you virtually always hear when someone on TV speaks Modern Standard Arabic, is KaLMūN or KaLLūM. Other examples include حِرْذَوْن(ḥirḏawn), خِنَّوْص(ḵinnawṣ), بِرْذَوْن(birḏawn), سِنَّوْر(sinnawr), فِرْجَوْن(firjawn) (the last word is not found often so I cannot speak of its real pronunciation, but of Zionists one speaks often); with a remarkable difference خِرْوَع(ḵirwaʕ). You may see that there is no word with such a pattern which is not to be deemed a borrowing. In some other words like تَنُّور(tannūr), زَرْجُون(zarjūn), زَيْتُون(zaytūn), كَمُّون(kammūn) for some reason that KiLMawN or KiLLawM pattern is somehow never mentioned. On other occasions, and more often, the alleged norm is KuLMūN or KuLLūM as opposed to KaLMūN or KaLLūM, e.g. زُنْبُور(zunbūr). Which works analogously with ي(y), as in بِطْرِيق(biṭrīq) instead of بَطْرِيق(baṭrīq) which latter the Saudi IP hates much, قِنْدِيل(qindīl), بِطِّيخ(biṭṭīḵ). Due to borrowings in other languages and the source forms it is also certain that the allegedly colloquial forms have always been of greatest use. I wonder in which contexts one could actually and unironically hear the alleged norm forms. Fay Freak (talk) 12:44, 8 October 2020 (UTC)Reply
Female equivalents
Latest comment: 4 years ago, 10 comments, 2 people in discussion
While you're on these I have a request. Can you remove |m= parameters from the headword line on these pages? I think the redundancy at e.g. mercière is pointless and distracting. Ultimateria (talk) 06:37, 3 October 2020 (UTC)Reply
Thanks. Also, I cleaned up the ~120 Portuguese female equivalents that didn't have {{pt-noun}}, but the accelerated entry script will keep creating them. Can you take a look at it? Ultimateria (talk) 08:17, 4 October 2020 (UTC)Reply
Latest comment: 4 years ago, 3 comments, 2 people in discussion
You've renamed Kurdish entries to Northern Kurdish and by doing so you broke some links. See for example mereq. The link in "Please also see mereq in the Northern Kurdish Wiktionary" is wrong because it links to kmr.wiktionary.org. The Kurmanji Kurdish Wikipedia and Wiktionary use the ku code, not kmr and since it is just called Wîkîferhenga kurdî (translation: Kurdish Wiktionary. Just "Kurdish" is used in many other places referencing the Wikipedia or Wiktionary too) the "Northern Kurdish Wiktionary" is partly wrong. I know that the dialects differ a lot and perhaps specifying the dialect is good but the name "Northern Kurdish" is pretty uncommon (I know that it's not made up by you, it is used in other templates too). -- Guherto (talk) 19:14, 24 October 2020 (UTC)Reply
@Guherto I see, you're referring to Tbot links. I can fix them to use ku as the Wiktionary code. It will still be referred to in the entries as the "Northern Kurdish Wiktionary", which isn't completely wrong since that's what it really is. Meanwhile, if you think we should rename "Northern Kurdish" -> Kurmanji and/or "Central Kurdish" -> Sorani, feel free to bring that up in the beer parlour. Benwing2 (talk) 19:41, 24 October 2020 (UTC)Reply
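The fix described here amounts to a small override table between the language code used in entries and the subdomain of the corresponding Wiktionary. A minimal sketch, assuming a hypothetical helper name and covering only the one override mentioned in this thread:

```python
from urllib.parse import quote

# Language code used in entries -> Wiktionary subdomain, where the two differ.
# Only the case discussed in this thread is listed.
WIKI_CODE_OVERRIDES = {"kmr": "ku"}


def wiktionary_url(lang_code: str, title: str) -> str:
    """Build a link to the entry on the Wiktionary for lang_code."""
    subdomain = WIKI_CODE_OVERRIDES.get(lang_code, lang_code)
    return f"https://{subdomain}.wiktionary.org/wiki/{quote(title.replace(' ', '_'))}"


print(wiktionary_url("kmr", "mereq"))  # https://ku.wiktionary.org/wiki/mereq
```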
Latest comment: 4 years ago, 2 comments, 2 people in discussion
Just wanted to say again, thanks for all your work on declension and conjugation! Looking back now, it is seriously impressive that you managed to learn the whole noun and verb paradigms so quickly :) Really a huge improvement on what we had before. —AryamanA(मुझसे बात करें • योगदान)03:52, 27 October 2020 (UTC)Reply
Why prefer [[LINK]] to {{l|en|LINK}} for non-english definitions?
Latest comment: 4 years ago, 2 comments, 2 people in discussion
Hi there Benwing2! I noticed that your bot changed some links from the {{l|en|LINK}} format to the [[LINK]] format. I thought that when specifically the English definition of a word is being referenced as a definition to a non-English word, it would've been preferable to directly link to the English section of the entry, which is achieved by {{l|en|LINK}} (or also [[LINK#English|LINK]], I suppose). I was just wondering if you had any particular reasoning why it might be preferable to link to the top of the article instead? – Guitarmankev1(talk)12:20, 30 October 2020 (UTC)Reply
@Guitarmankev1 Hi, this is a good question. The general practice I've seen across Wiktionary is to use bare links in English definitions except when the foreign lemma is spelled identically to the English definition, as e.g. on Welsh cobra (in which case the bare link will display in bold and not be clickable). AFAIK the reason for this is that it's much easier to type a bare link, and it's visually less intrusive in the source code, and the English definition is already at the top of the page so it usually doesn't matter very much whether you link to the top of the page or to the English definition. I think there may have been votes on this but I'm not sure. I know that some editors don't even like using {{l|en|...}} links in synonyms/antonyms/derived terms/related terms sections of English entries, but here the practice of using {{l|en|...}} seems more established. Benwing2 (talk) 14:15, 30 October 2020 (UTC)Reply
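The practice described above (bare links in definitions, except where the foreign lemma is spelled the same as the English gloss) could be expressed as a small rewrite rule. This is a hedged sketch with a hypothetical function name, not the bot's actual code; it only handles the simplest {{l|en|X}} form with no extra parameters.

```python
import re

L_EN_RE = re.compile(r"\{\{l\|en\|([^{}|]+)\}\}")


def to_bare_links(definition_line: str, pagename: str) -> str:
    """Turn {{l|en|X}} into [[X]] unless X is the page's own title."""

    def repl(match: re.Match) -> str:
        target = match.group(1)
        # A bare self-link would render bold and unclickable, so keep the template.
        return match.group(0) if target == pagename else f"[[{target}]]"

    return L_EN_RE.sub(repl, definition_line)


print(to_bare_links("# {{l|en|snake}}, {{l|en|cobra}}", "cobra"))
# # [[snake]], {{l|en|cobra}}
```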
Request for mechanical fixing of several dozen misworded Jeju entries
Latest comment: 4 years ago, 6 comments, 2 people in discussion
Currently a lot of Jeju entries have "Sino-Jeju word from {{m|jje|}}" in their etymology sections. This is pretty bad because first, "Sino-Jeju" is a term nobody uses either in Western academia or in Korea itself, and second, because Jeju does not have its own character readings that correspond to e.g. Sino-Korean, Sino-Vietnamese, Sino-Japanese.
The correct way to handle these should be {{ko-etym-sino||nocat=y}}, because these words derive from Sino-Korean readings.
Could you set up a bot so the Wikitext sequence "Sino-Jeju word from {{m|jje|foo}}" is automatically changed to "{{ko-etym-sino|foo|nocat=y}}"?
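The requested substitution is a single pattern replacement. A minimal sketch with Python's re module, assuming the wording in the entries matches the quoted sequence exactly and the {{m}} call has no further parameters (the function name is made up):

```python
import re

SINO_JEJU_RE = re.compile(r"Sino-Jeju word from \{\{m\|jje\|([^{}|]*)\}\}")


def fix_sino_jeju(wikitext: str) -> str:
    """Rewrite 'Sino-Jeju word from {{m|jje|X}}' as '{{ko-etym-sino|X|nocat=y}}'."""
    return SINO_JEJU_RE.sub(r"{{ko-etym-sino|\1|nocat=y}}", wikitext)


print(fix_sino_jeju("Sino-Jeju word from {{m|jje|foo}}."))
# {{ko-etym-sino|foo|nocat=y}}.
```

Entries whose wording deviates from the quoted sequence would not match and would need a manual look.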
@Karaeng Matoaya Apologies, it took me several days to get to this. I fixed things up, and also replaced Sino-Jeju -> Sino-Korean in some synonym sections. Can you review the pages with these latter changes? They are as follows:
Having looked at those Jeju entries and tried to fix some of them (mostly by removing the false "from Middle Korean" etymology and just linking to Korean cognates instead), I am now of the strong opinion that Jeju should be merged back into Korean:
The etymology section for Jeju is unmanageable separate from Korean etymology sections, because 1) Jeju is only attested in the twentieth century but 2) Jeju is not directly descended from fifteenth-century Middle Korean, so all that every etymology section can really say is "Cognate to Middle Korean blah blah; see Korean blah blah".
It is aesthetically very unpleasing to have Jeju above Korean in entries like 나 (na). The definition "A Hangul syllabic block made up of ㄴ and ㅏ" should absolutely come at the top of the page.
The significant majority of Jeju lemmas are also attested in some form of Early Modern or modern dialectal Korean. So far the Jeju-Korean distinction works because 99.5% of our Korean lemmas are only from Contemporary Standard Seoul, but our coverage of the dialects and of the historical forms has been improving lately and eventually there will come a point when most Jeju entries have a Korean entry below them with identical pronunciation and identical semantics. This seems very unhelpful.
With just a single tweak for the Jeju /ɔ/ vowel, this will allow us to use Module:ko-translit for Jeju forms instead of transliterations being manually inputted.
People in Jeju Island today speak a Jeju-influenced Korean, but a lot of these Korean words seem to have been added as Jeju entries (presumably because a lot of online resources fail to make this distinction). Merging Jeju into Korean solves this issue.
@Karaeng Matoaya Yes, definitely. Similar considerations led me to suggest merging Scots and English. Nearly all the distinctive Scots vocabulary is also attested in dialectal northern England English, and since we do have coverage of those dialectal terms we end up with a huge amount of duplication, which just adds a lot of unnecessary load onto editors. On top of this there aren't really any Scots editors to maintain the Scots terms in any case. In this case, someone else created a vote which was shot down; there was a lot of huffing and puffing about how Scots was obviously a separate language and to suggest merging them was to deny the linguistic identity of Scots, which in reality wasn't and isn't anyone's intention. But for Jeju vs. Korean you might get more traction since (a) it's not so familiar as with English vs. Scots so less likely to elicit emotional reactions, (b) there's already the example of the merged Chinese varieties. Benwing2 (talk) 15:03, 9 November 2020 (UTC)Reply
Removing macrons from Latin words built on signum (and other words containing -ign-)
Latest comment: 4 years ago, 4 comments, 2 people in discussion
I started out trying to do this manually, but there are too many for me to be able to do it well. Could you please feed WingerBot a list of all the Latin words containing -sign- that are now marked with a macron (-sīgn-) to have it remove the macrons? It added them in August 2019, but the idea that these words had a long vowel is not well supported, and the general current consensus is that they didn't. Please also remove the macrons from any words built on other roots that we transcribe with -īgn-, e.g. from dignus.
Per W. Sidney Allen: "The change of e to i indicates a short vowel for an early period in ignis, dignus, lignum, signum, ilignus (cf. p. 23) and Romance evidence points to a short vowel at a later period in dignus, pignus, pugnus, lignum, signum (e.g. Italian degno, French poing)." (Vox Latina page 72)
Likewise, Carl Darling Buck writes in "The Quantity of Vowels before gn" that "For the cultivated language, which is what we aim to represent in our pronunciation and spelling a long vowel before gn is to be recognized only where it is long in origin, as, for example, in rēgnum." (page 314, The Classical Review, Vol. 15, No. 6 (Jul., 1901)).
@Urszag Yeah I'm not sure why I added those macrons originally. My primary source has been here , and under section 38 they mention Buck's views, saying "Buck’s argument is a very strong one, and his conclusions deserve at least provisional acceptance. It should be noted, however, that three words, rēgnum, stāgnum, abiēgnus, being derived from stems with a long vowel, were legitimately entitled to their long quantity and always retained it." Are the macrons only on -sign- words and not on any other words with -ign-? If so, I was probably influenced by this evidence: sIgnum, CIL. vi. 10234; seignvm, xiv. 4270; sIgnificabo, vi. 16664;. But I will fix this. It would help if you could make a list of all the lemmas in question, or at least the classes of lemmas (e.g. "words in -sign-", "words in -dign-", etc.), so that I can look for them. Here is what appears to be the complete list of lemmas in -sign-:
Thanks. I found some words built on ignis that also need to be fixed:
ignesco
igneus
ignifer
ignio
ignipes
ignitus
ignivagus
--Urszag (talk) 21:24, 8 November 2020 (UTC)Reply
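The requested cleanup can be sketched roughly as follows. This is only an illustration of the idea, not WingerBot's actual code: the function name is hypothetical, and it assumes that every ī directly before gn should be shortened, as the request above states. The words with an etymologically long vowel mentioned in the thread (rēgnum, stāgnum, abiēgnus) have ē or ā, so a replacement limited to ī leaves them alone.

```python
import unicodedata


def remove_macron_before_gn(word: str) -> str:
    """Shorten a long i (with macron) that stands directly before 'gn'.

    Hypothetical helper: it only touches the sequence 'īgn', so words
    such as rēgnum or stāgnum are returned unchanged.
    """
    # Normalize first so that 'i' + combining macron becomes the single
    # precomposed character 'ī' and is caught by the replacement below.
    word = unicodedata.normalize("NFC", word)
    return word.replace("īgn", "ign").replace("Īgn", "Ign")


for lemma in ["sīgnum", "dīgnus", "īgnifer", "rēgnum"]:
    print(lemma, "->", remove_macron_before_gn(lemma))
```

Other macrons in the same word are kept, so a form like sīgnificō would only lose the macron on the first vowel.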
Latest comment: 4 years ago4 comments2 people in discussion
Sir, the primitive auto cat trials are going well at el.wikt. Thank you for your help, so much! If you could help me with splitting the data pages? I tried but failed as explained at :el:Module talk:yy/alldata. I do not understand why :( ‑‑Sarri.greek♫|20:13, 8 November 2020 (UTC)Reply
Latest comment: 4 years ago2 comments2 people in discussion
Hello Benwing. I don't oppose the redirect per se, but bring to book isn't the "standard" form: afaict, there is absolutely no consistency in the way we lemmatize such entries.
In the meantime, I wouldn't delete the original page, because with a red link, someone is bound to create a duplicate entry, not knowing that we already have one at bring to book. PUC – 17:39, 12 November 2020 (UTC)Reply
@PUC I made the change because when I looked at lots of English multiword verbal expressions, it became clear that the cases without someone are much more common than the cases with someone. Feel free to recreate the someone variant as a hard (or soft) redirect if you prefer. Benwing2 (talk) 05:11, 13 November 2020 (UTC)Reply
{{RQ:Dryden Meta}} and {{RQ:Byron Harold}}
Could you please do the following replacements?
#* {{RQ:Dryden Meta|12}}
#*: And where man ended, the continued vest,<br>Spread on his back, the '''houss''' and trappings of a beast.
→
#* {{RQ:Dryden Metamorphoses|book=XII|passage=And where man ended, the continued vest, / Spread on his back, the '''houss''' and trappings of a beast.}}
#* {{RQ:Byron Harold|3|1}}
#*: When last I saw thy young blue eyes, they '''smiled'''.
→
#* {{RQ:Byron Childe Harold|canto=III|stanza=I|passage=When last I saw thy young blue eyes, they '''smiled'''.}}
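A replacement of this shape could be sketched with a regular expression along the following lines. This is a minimal sketch and not the bot's real code: the `RENAMES` table, the function names, and the assumption that positional parameters are plain Arabic numerals to be converted to Roman numerals are all inferred from the two examples above. Passages that contain a pipe character would need extra care that is not handled here.

```python
import re

ROMAN = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"),
         (90, "XC"), (50, "L"), (40, "XL"), (10, "X"), (9, "IX"),
         (5, "V"), (4, "IV"), (1, "I")]


def to_roman(n: int) -> str:
    """Convert a positive integer to an uppercase Roman numeral."""
    out = []
    for value, numeral in ROMAN:
        while n >= value:
            out.append(numeral)
            n -= value
    return "".join(out)


# Old template name -> (new name, names for its positional parameters).
# Only the two templates from the request are listed (assumed mapping).
RENAMES = {
    "RQ:Dryden Meta": ("RQ:Dryden Metamorphoses", ["book"]),
    "RQ:Byron Harold": ("RQ:Byron Childe Harold", ["canto", "stanza"]),
}

# A quotation line followed by its passage line.
QUOTE_RE = re.compile(
    r"^#\* \{\{(RQ:[^|}]+)((?:\|[^|}]*)*)\}\}\n#\*: ?(.*)$", re.MULTILINE)


def convert(text: str) -> str:
    def repl(m: re.Match) -> str:
        name, raw_args, passage = m.group(1), m.group(2), m.group(3)
        if name not in RENAMES:
            return m.group(0)  # leave unknown templates untouched
        new_name, param_names = RENAMES[name]
        args = raw_args.split("|")[1:] if raw_args else []
        if len(args) > len(param_names) or not all(a.isdigit() for a in args):
            return m.group(0)  # unexpected parameters: skip for manual review
        parts = [new_name]
        for pname, arg in zip(param_names, args):
            parts.append(f"{pname}={to_roman(int(arg))}")
        # Line breaks inside verse become " / " in the passage parameter.
        passage = re.sub(r"\s*<br\s*/?>\s*", " / ", passage.strip())
        parts.append(f"passage={passage}")
        return "#* {{" + "|".join(parts) + "}}"
    return QUOTE_RE.sub(repl, text)
```

Anything the pattern does not recognise is returned unchanged, which keeps the run conservative and leaves odd cases for manual review.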
Latest comment: 4 years ago3 comments2 people in discussion
Hi, another one:
#* {{RQ:Dryden AZ}}
#*: We are both love's captives, but with fates so '''cross''', / One must be happy by the other's loss.
→
#* {{RQ:Dryden Aureng-zebe|passage=We are both love's captives, but with fates so '''cross''', / One must be happy by the other's loss.}}
Latest comment: 4 years ago3 comments2 people in discussion
Another one, when you have the time:
#* {{RQ:Spenser FQ|3|2|stanza=8}}
#*: of which great worth and '''worship''' may be won
→
#* {{RQ:Spenser Faerie Queene|book=III|canto=II|stanza=8|passage=of which great worth and '''worship''' may be won}}
#* {{RQ:Clarendon Rebellion}}
#*: They discerned a body of five '''cornets''' of horse very full, standing in very good order to receive them.
→
#* {{RQ:Clarendon History|passage=They discerned a body of five '''cornets''' of horse very full, standing in very good order to receive them.}}
Latest comment: 4 years ago5 comments3 people in discussion
Hello Benwing2, I noticed that you recently removed the code |nocat=1 from an instance of Template:coinage at autism and some other pages. Most of the removals look good and I thank you for making them. For autism and ambivalence, though, my understanding is that they are borrowings of German coinages and were not coined natively in English. Because of that I think keeping |nocat=1 is appropriate on the pages. I do think the equivalent German terms, Autismus and Ambivalenz, should be categorized as coinages. Let me know what you think, and I wish you the best. —The Editor's Apprentice (talk) 03:57, 24 November 2020 (UTC)
Besides which, the only reason to use the template is for the categories. If you just want to display the text, type the text in directly- you get the same results without having to type |nocaps=1, |nodot=1 and |nocat=1. Chuck Entz (talk) 07:46, 24 November 2020 (UTC)Reply
{{RQ:Browne Errors}}, {{RQ:L'Estrange Fables}}, and {{RQ:Chapman Odyssey}}
When you have time:
#* {{RQ:Browne Errors}}
#*: Preventive physic preventeth sickness in the healthy, or the '''recourse''' thereof in the valetudinary.
→
#* {{RQ:Browne Pseudodoxia Epidemica|passage=Preventive physic preventeth sickness in the healthy, or the '''recourse''' thereof in the valetudinary.}}
----
#* {{RQ:L'Estrange Fables|passage=}} or
#* {{RQ:L'Estrange Fables}}
#*:
→
#* {{RQ:L'Estrange Fables of Aesop|passage=}}
----
#* {{RQ:Chapman Odyssey}}
#*: The doors of plank were; their '''close''' exquisite.
→
#* {{RQ:Homer Chapman Odysseys|passage=The doors of plank were; their '''close''' exquisite.}}
Latest comment: 4 years ago4 comments2 people in discussion
Another replacement of a redundant quotation template:
#* {{RQ:Chapman Iliad}}
#*: Childish, unworthy '''dares''' / Are not enough to part our powers.
→
#* {{RQ:Homer Chapman Iliads|passage=Childish, unworthy '''dares''' / Are not enough to part our powers.}}
@Sgconlaw BTW this change has caused a bunch of errors, e.g. on inflame, tappish, etc.: A big red "Unexpected < operator". Presumably this is because the volume or something isn't explicitly given. Could you look into this? Benwing2 (talk) 20:49, 5 December 2020 (UTC)Reply
I didn't realize they got deleted, lol. Feel free to replace them all again when I've finished up adding a whole bunch of Milton quotes, if you want. Oxlade2000 (talk) 19:56, 13 February 2021 (UTC)Reply
Latest comment: 4 years ago2 comments1 person in discussion
Hi,
Thanks for all the work. How difficult would it be to generate a maintenance category of entries that are simultaneously using {{ko-etym-sino}} and {{ko-IPA|com=}}? The majority of such entries are missing a parameter in their etymology sections and need to be manually fixed.--Karaeng Matoaya (talk) 09:01, 14 December 2020 (UTC)Reply
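One way to find such entries offline, before any tracking category exists, might look like the sketch below. It is an assumption-laden illustration: the function name is hypothetical, and it simply checks raw wikitext for a {{ko-etym-sino}} call together with a {{ko-IPA}} call that carries a com= parameter.

```python
import re

# {{ko-etym-sino}} with or without parameters.
ETYM_SINO_RE = re.compile(r"\{\{ko-etym-sino\s*[|}]")
# {{ko-IPA}} that has a com= parameter somewhere among its arguments.
IPA_COM_RE = re.compile(r"\{\{ko-IPA\b[^{}]*\|\s*com\s*=")


def uses_both(wikitext: str) -> bool:
    """Return True if the page text uses both templates in question."""
    return bool(ETYM_SINO_RE.search(wikitext)) and bool(IPA_COM_RE.search(wikitext))
```

Pages for which `uses_both` returns True would be the candidates for manual review.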
Latest comment: 4 years ago4 comments2 people in discussion
Hi,
Sorry to bother you again, but when you have the time, could you add a new parameter ("hangul=y") to Module:ko-etym so that {{ko-etym-native|hangul=y}} produces "In the Hangul script, first attested in..." instead of just "First attested in..."? There are a fair number of Korean words which are attested before Hangul, but the earliest Hangul form is always still important to note, typically because it's the first phonologically transparent orthography.
Additionally, could you have the module put the language as "Late Old Korean" instead of "Middle Korean" if the year is before 1300?
@Karaeng Matoaya Apologies for taking so long! I did the first part. For the second part, we should probably add an etymology-only language "Late Old Korean" that has Old Korean as its parent language. However, per Wikipedia, there is disagreement about whether to categorize the Korean period from 900-1300 as Old Korean or Middle Korean. I'm not familiar enough with Korean to know what to do here. @TAKASUGI Shinji, Atitarev, HappyMidnight, LoutK, Quadmix77, Suzukaze-c Anyone else have comments? Benwing2 (talk) 22:29, 26 December 2020 (UTC)Reply
Thanks so much!!
For the second part, I actually wrote the Wikipedia article on Old Korean, and while I tried to be neutral WRT the periodization question there, there really is a growing consensus in South Korean academia that the periodization up to 1300 is correct/more useful. The previous 900 periodization still holds weight IMHO mainly because of academic inertia, but it was formulated at a time when we knew much less about Old Korean than we do now—almost nothing, really. I've actually already added many thirteenth-century words as Old Korean entries, and explained the reasons for doing so on Wiktionary:About Korean/Historical forms#Periodization.--Karaeng Matoaya (talk) 00:28, 27 December 2020 (UTC)Reply
Latest comment: 4 years ago7 comments2 people in discussion
Hi, you deleted {{categoryboiler}} for being obsolete, but it appears on every empty category when one starts to create it. We should either restore the template or arrange it (somehow) that categories in the process of being created display something else. —Mahāgaja · talk21:32, 26 December 2020 (UTC)Reply
Latest comment: 3 years ago1 comment1 person in discussion
The mistake in this Module is that at one point it attempts to wrap a TD with a DIV, whereas it should be the reverse. As you are more experienced in writing Lua code, you might want to figure out how to reverse this behaviour, as doing so will considerably reduce the number of LintErrors being generated.
<div class="th-reading"><td style="border-right:0px"><span lang="th" class="Thai ">ลำ-ไส้-ไหฺย่</span><br><small>l å – <span title="Vowel sign appearing in front of the initial consonant." style="border:1px dotted gray;border-radius:50%;cursor:help">ai</span> s ˆ – <span title="Vowel sign appearing in front of the initial consonant." style="border:1px dotted gray;border-radius:50%;cursor:help">ai</span> h ̥ y ˋ</small></td></tr></div>
Surely the class should be applied to the TR? Or the relevant class should be applied to the TR or TD rather than the DIV, and the DIV tag removed.
I don't understand how this is rendered in the Lua, so perhaps you can make more sense of it than I can?
ShakespeareFan00 (talk) 23:12, 17 January 2021 (UTC)Reply
Latest comment: 3 years ago2 comments2 people in discussion
Hello. Recently, {{root-sa}} has been created to categorise Sanskrit terms by root (see CAT:Sanskrit terms by root). I was just asking that can you use your bot to add {{root-sa|<root>}} below the etymology sections of Sanskrit entries for words which are derived from roots (which have something like From the root {{m|sa|<root>}} in their etymologies)? Thanks, 🔥𑀰𑀩𑁆𑀤𑀰𑁄𑀥𑀓🔥05:45, 18 January 2021 (UTC)Reply
Latest comment: 3 years ago2 comments2 people in discussion
The pages for some gerund forms (e.g. addicendo) list the form under the "Verb" heading and treat it as an inflection of the verb lemma (e.g. addico), with the basic {{inflection of}} template to link back to this lemma. Others (e.g amando) list the form under the "Gerund" heading, treat it as a form of the gerund head form (in the accusative, e.g. amandum), and use the {{la-gerund-form}} template to link them back to this gerund head form; in turn, the page for the gerund head form (again, e.g. amandum) sometimes uses the {{la-gerund}} template to mark this as the gerund head form, and the {{la-decl-gerund}} template to generate a gerund declension table. Which of these is the intended correct treatment of Latin gerunds? Wewebber (talk) 06:04, 18 January 2021 (UTC)Reply
@Wewebber I'm not really sure, but I think probably the former treatment is better because it's not obvious which case of the gerund should be basic. It feels strange to me to say that amandum is a "gerund" and the others are "gerund forms"; in reality, all of them are on equal footing. Benwing2 (talk) 04:49, 23 January 2021 (UTC)Reply
Module italics and recognising escaped brackets?
The 'problem' I was trying to solve was the italic brackets issue. In comments made by others regarding {{nb...}}, it was indicated that Module:italics fails to recognise that &#91; and &#93; are equivalent to [ and ] when used as a replacement. The code I attempted to add to Module:italics was my attempt to add a recogniser for that situation. However, this is only my third or fourth attempt to modify Lua code, and thus I'd like a second view on the suitability of the code concerned.
Apart from the Linter errors which are apparently being created (but which don't appear to lead to any visual effects), we've identified the following:
The current version of the template which uses square brackets creates a clash in quotation templates if used in combination with an external link, for example, if {{nb...}} is used in |chapter= and |chapterurl= is used to apply an external link to the chapter.
This problem is solved if the brackets in {{nb...}} are replaced with &#91; and &#93;. However, the quotation templates don't recognize these codes and so are unable to unitalicize the brackets and ellipses in titles. (I think this is handled by Module:italics.)
Typing "" within {{nb...}}, for example |title=Just Testing:{{nb...| A Book about How to Carry out Tests}}, breaks {{quote-book}}. However, a workaround is to type "[...] instead. If we're going to implement the workaround, then we'll need your help to replace occurrences of "" within {{nb...}} and {{...}}.
Latest comment: 3 years ago4 comments2 people in discussion
Given that Module:form of/cats is restricted to template editors, and you're the one who worked on it the most recently, could you make some fixes to it? {{inflection of}} isn't assigning the proper categories to Middle English verb forms (see stod, stode, stoden); I think I've identified what needs to be changed in Module:form of/cats:
At line 304, {"13","s","past"}, should be replaced with {"1","3","s","past"},
Lines 306-317 need to be entirely rewritten; replace with:
{"hasall",{"p","pres","sub"},"plural subjunctive forms"},
{"hasall",{"p","pres","ind"},"plural forms"},
{"hasall",{"p","past","sub"},"plural subjunctive past forms"},
{"hasall",{"p","past","ind"},"plural past forms"},
@Mahagaja Don't worry. All uses of {{etyl}} were eliminated by hand; I reviewed every case and decided manually whether to substitute {{inh}}, {{der}}, {{bor}}, {{cal}}, etc. The only purpose of the bot was to push the changes (that's what "manually assisted" means in the commit message). Benwing2 (talk) 02:43, 2 February 2021 (UTC)Reply
@Hazarasp I pushed the code to fix this. However, there are various existing issues with Middle English verb forms, so you won't necessarily see what you expect. For one thing, the category spec for Category:Middle English first-person singular forms and such is specified as {"hasall", {"1", "s", "pres"}, "first-person singular forms"}, so it will trigger on subjunctive forms as well as indicative forms. If you want it to trigger only on indicative forms, you should probably add the tag "ind" to the spec; but it looks like not all existing verb forms include this tag (e.g. fole does but wones doesn't). An alternative, then, is to use {"tags=", {"1", "s", "pres"}, "first-person singular forms"}, but that won't trigger if there are additional tags like ind present. You could use something like this:
but IMO you're better off just fixing the verb forms to consistently include the ind tag.
Another thing is the naming of the categories themselves. If you want the category to only include first-person singular present indicative forms, IMO it should be named Category:Middle English first-person singular present indicative forms, so that the name correctly reflects what's in the category.
A third thing is this spec:
{"hasall", {"1", "3", "s", "past"}, "first/third-person singular past forms"}.
This should now be changed back to
{"hasall", {"1//3", "s", "past"}, "first/third-person singular past forms"}
as you had it before; the latter now works, and the former doesn't.
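The difference between the two kinds of spec described above can be illustrated with a small sketch. This is not the Lua in Module:form of/cats; it is a Python model of the behaviour as described in this comment, with hypothetical function names, and it assumes that "tags=" means the tag set must match exactly.

```python
def hasall(required, tags):
    """'hasall' behaviour as described: fires when every required tag is present."""
    return set(required) <= set(tags)


def tags_equal(required, tags):
    """'tags=' behaviour as described: fires only when no extra tags are present."""
    return set(required) == set(tags)


spec = ["1", "s", "pres"]

# 'hasall' also fires on subjunctive forms, since it ignores extra tags.
print(hasall(spec, ["1", "s", "pres", "sub"]))      # True
# 'tags=' does not fire once an additional tag such as 'ind' is present.
print(tags_equal(spec, ["1", "s", "pres", "ind"]))  # False
# Adding 'ind' to the spec restricts 'hasall' to indicative forms.
print(hasall(spec + ["ind"], ["1", "s", "pres", "sub"]))  # False
```

The last line shows why consistently tagging the verb forms with ind is the simpler fix: the spec can then require it.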
Thanks for doing this; to solve the continuing miscategorisation, I've chosen to implement a solution entirely different from anything you've suggested (but which was of course made possible by your work!), so nothing should have to be recategorised (as far as I know); see my edits to Module:form of/cats. Hazarasp (parlement · werkis) 03:49, 15 February 2021 (UTC)
Latest comment: 3 years ago4 comments3 people in discussion
It may not have been obvious due to Chinese terms flooding the category (whatever error it was has been cleared- now it's just the queue), but there are a half-dozen+ Welsh and Norwegian entries that have module errors in the {{inflection of}} template that seem to be due to your edits on "form of" modules. It's not a huge issue- I'm only bringing it up because I wasn't sure if you were aware of it. Chuck Entz (talk) 02:18, 15 February 2021 (UTC)Reply
@BlaueBlüte, Akletos Most verbs that are both separable and inseparable already had this system of etymology. I simply systematized it across all such verbs. To me, this makes sense, because separable prefixes are fundamentally distinct words that just happen to be written attached to the verb in some circumstances, whereas inseparable prefixes are fundamentally prefixes that don't exist as distinct words. This is clearer on the one hand with prefixes like ver- and ent- and on the other hand with collocations like kennen lernen and spazieren gehen (formerly written kennenlernen and spazierengehen). It's true that for some pairs like ich durchziehe vs. ich ziehe durch the meanings can overlap, but in many they don't at all, as in ich umfahre vs. ich fahre um. Benwing2 (talk) 07:02, 17 February 2021 (UTC)Reply
@BlaueBlüte Without having come to a final conclusion on the issue, in my eyes the distinction between durch and durch- is artificial, but if we make it, we should be consistent and I agree with Benwing that the separable prefixes could be written as separate words, so Benwing's approach seems reasonable to me. I now almost regret that I've created the entries -fähig and -gleich, and am considering if it wouldn't be better to treat the words with these and similar elements as compounds with fähig and gleich etc. Same with durch- et al. --Akletos (talk) 07:29, 17 February 2021 (UTC)Reply
@Akletos +1 re: artificial and consistency. Although, consistency probably shouldn’t stop at how we analyze such verbs in terms of etymology, but also look at the entries for those prefixes. For example, the ‘prefix’ entry for um- claims to pertain both to the separable and inseparable case, and the ‘distinct word’ entry um pretty much mentions the meanings for both ich umfahre and ich fahre um (even more clearly so for ich umgehe (‘go around, circumvent’) vs. ich gehe um (‘go about, haunt’)). It would seem that if such a distinction in etymologies is to be made on the lemma level, further consistency issues will arise. I’m not sure I’m clear on the ‘written as a separate word’ part yet. Would the test for prefix vs. compound be whether the verb can be rewritten with an adverb/a preposition that is ‘more separate’ than a separable prefix? Ex.: ich umfahre das Hindernis ⟶ ich fahreum das Hindernis ich fahre den Baum um ⟶ *ich fahreum den Baum (except maybe poetic) (a similar test could be devised for -fähig and -gleich: widerstandsfähig ⟶ zum Widerstand fähig) At any rate, maybe the etymology for separable and inseparable forms could be explained distinctly in an overall less involved way if the etymology, while referring to the same lemmas as the compound elements, were more specific about the distinct (albeit perhaps artificially so) mechanism of compound formation. This used to be the case for durchströmen (see erstwhile source; I can’t reproduce what the rendered page looked like at the time with templates expanded). Also I wonder if this should perhaps be discussed more widely—or maybe that has happened already? ―BlaueBlüte (talk) 08:38, 17 February 2021 (UTC)Reply
Wingerbot's Latin macron edits from September 2020
Hi, Benwing2. I see that some people have discussed a couple issues with the Wingerbot edits to Latin macrons above (regarding vērnus/vĕrnus and -ign- words). I recently noticed cornīx was probably erroneously given a macron... Is *ḱorh₂- > *ko:r- a possible vowel compensation? Spanish has cuervo < cŏrvus at least, and the Alatius page with the discussion by Bennett also has cornīx without a macron. Is there also a way to get a list of the words WingerBot changed at the time (did it include all short -rn- words?)? I would like to try examining them... The discussion on vērnus/vĕrnus seems interesting.--Ser be être 是talk/stalk14:19, 21 February 2021 (UTC)Reply
@Ser be etre shi I think cōrnīx was an error on my part, I can't find any evidence for the long ō. The full list is here: User:Benwing2/latin-macrons It's a concatenation of 7 files that served as the direct input to my macron-frobbing script and you have to edit the source in order to make sense of it. Benwing2 (talk) 16:46, 21 February 2021 (UTC)Reply
Ah, thanks! It's a lot of words. Looking quickly at the list I now see why you'd do it though, to improve the coverage of less obvious macrons. Some of these really aren't obvious, like the ū of rūrsum/rūrsus... or the ē of comprehendō > comprēndō. Of course they're vastly correct, and it's only cōrnīx and maybe a few others that annoyingly got through the cracks. Hm.--Ser be être 是talk/stalk00:08, 22 February 2021 (UTC)Reply
Request for simple botting
Hello, when you have the time, could you run your bot to change all instances of "deriving transitive verbs" to "deriving active verbs" and "deriving intransitive verbs" to "deriving passive verbs" in Korean etymologies (there should be around fifty of both)? Thanks, and hopefully this isn't too much of a bother!--Tibidibi (talk) 14:47, 22 February 2021 (UTC)Reply
@Tibidibi Apologies, I overlooked this. I'm looking now and it seems you've already made all the changes? I only see one term with "deriving transitive verbs" in it and none with "deriving intransitive verbs". Benwing2 (talk) 23:33, 7 March 2021 (UTC)Reply
Latest comment: 3 years ago1 comment1 person in discussion
Today 16 Georgian entries showed up there, and I can't figure out why. None of them has been edited recently, nor has any of the main forms pointed to by the first argument in the template. I went through the entire transclusion list of one of those entries: aside from one unrelated edit to Module:scripts/data on the 16th, absolutely nothing has been edited since your edits to the "form of" family of modules.
Latest comment: 3 years ago6 comments3 people in discussion
Hi! Hope you are doing well! Just came here to say maybe it's time to move "ku" to a family. Your bot keeps creating a lot of cats like Category:Kurdish terms borrowed from Turkish even when there is only a single page, so we can get rid of this problem. Most of the pages in cat:Kurdish language right now are there since we don't know from which Kurdish language a word originates. Once we make a "Kurdish languages" code, we will be able to tag those pages appropriately. Thank you! --Balyozxane (talk) 04:04, 7 March 2021 (UTC)
@Balyozxane, Chuck Entz I have done this. It may flush out some errors; let's be on the lookout over the next day for this. Most of the issues should already have been fixed, as I have done various bot runs to convert uses of 'ku' to either 'kmr' (Northern Kurdish) if the term is in Latin script or 'ckb' (Central Kurdish) if the term is in Arabic script. Any remaining issues will be things that were added using 'ku' since I did the last bot run. Benwing2 (talk) 18:48, 7 March 2021 (UTC)Reply
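The script-based rule described above (Latin script to 'kmr', Arabic script to 'ckb') could be sketched like this. It is only an illustration and not the bot's actual code; the function name is hypothetical, and detecting the script from Unicode character names is an assumption made for the sake of a self-contained example.

```python
import unicodedata


def kurdish_code_for(term: str):
    """Pick a replacement for 'ku' based on the script of the term.

    Returns 'kmr' for Latin-script terms, 'ckb' for Arabic-script terms,
    and None when the script is mixed or unknown, so that such cases
    can be left for manual review.
    """
    scripts = set()
    for ch in term:
        if not ch.isalpha():
            continue  # ignore spaces, hyphens, digits, combining marks
        name = unicodedata.name(ch, "")
        if name.startswith("LATIN"):
            scripts.add("Latn")
        elif name.startswith("ARABIC"):
            scripts.add("Arab")
        else:
            scripts.add("other")
    if scripts == {"Latn"}:
        return "kmr"
    if scripts == {"Arab"}:
        return "ckb"
    return None
```

Returning None for anything ambiguous keeps the automated pass conservative.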
@Balyozxane, Chuck Entz, Fenakhay OK, most of the remaining errors are in translation tables (which should use 'kmr' or 'ckb' as above) or in descendants tables. Since 'ku' is no longer a language, it can't be used in {{desc}}, even in a section header. I have changed them to write out 'Kurdish:' as raw text. The alternative is to avoid grouping the different Kurdish languages and put each one alphabetized appropriately, e.g. Northern Kurdish under N. Benwing2 (talk) 23:20, 7 March 2021 (UTC)Reply
Note to Benwing and @Chuck Entz, since I suspect at least the former is not even acquainted with the possibility and the latter followed lead, that one also puts the borrowing arrow as raw text, or with {{tooltip}}, or because one has done it so often there is {{→}}. Such as in front of “Hindustani” and ”Aramaic”. Now you could argue that unlike with Hindustani you put the arrow in front of the individual Kurdishes as there is not one borrowing but separate borrowings as there are multiple languages, but one can say alike that one just puts it once in front of the brackets like in mathematics (what is the equivalent idiom for vor die Klammer ziehen? In law probably none due to the structure of common law, but else?) and it does not even depend on whether something is really one or multiple languages but it should just look nicer, on one level with the other mentioned languages in the same table and uniform. Sometimes one also puts a gloss at the top, if the meaning is the like. Fay Freak (talk) 13:48, 8 March 2021 (UTC)Reply
@Chuck Entz, Fay Freak I have a script to clear these errors that moves the |bor=1 down to the individual languages, on the theory that they're individual languages and potentially each one could have borrowed it separately. If it's agreed to do it the other way, I'll write another script to undo the change. Benwing2 (talk) 16:26, 8 March 2021 (UTC)Reply
Latest comment: 3 years ago4 comments2 people in discussion
@Chuck Entz Heads up, I am in the process of pushing a bunch of changes I made to Spanish lemmas. As these were mostly done manually there may be some errors. It is running overnight (UTC-6); if there are any errors I'll try to fix them in the morning if they are still around. Benwing2 (talk) 06:16, 8 March 2021 (UTC)Reply
BTW the bug in acribillar appears to be an underlying bug in Module:compound, but I don't have time to fix it right now, have to go to sleep. Will fix tomorrow. Also, errors are continuing to appear due to 'ku' appearing in {{desc}}; I have a script to fix these as they come up but it will take a little while for them all to appear. Benwing2 (talk)
@Vox Sciurorum On the general theory that there shouldn't be periods in foreign-language defns. I agree it doesn't look perfect as-is; either we can put the periods back or add an argument to lowercase the initial a. Benwing2 (talk) 16:28, 8 March 2021 (UTC)Reply
Another observation: When linking to a different page, {{l|en|a}} => a is better than ] => a because it skips over a potentially long table of contents. On the same page the #English form is better. Vox Sciurorum (talk) 16:46, 8 March 2021 (UTC)Reply
{{RQ:Addison Freeholder}}
Hello, is it feasible to do the following bot edit?
#* {{RQ:Addison Freeholder|50|June 11 1715}}
#*:The enemies of our happy establishment seem at present to copy out the piety of this seditious prophet , and to have recourse to his laudable method of '''club-law''', when they find all other means of enforcing the absurdity of their opinions to be ineffectual.
→
#* {{RQ:Addison Freeholder|issue=50|date=11 June 1716|passage=The enemies of our happy establishment seem at present to copy out the piety of this seditious prophet , and to have recourse to his laudable method of '''club-law''', when they find all other means of enforcing the absurdity of their opinions to be ineffectual.}}
In particular, all dates after 1 January should have the year 1716 instead of 1715; WF indicated the year incorrectly. (Addison published The Free-holder between December 1715 and June 1716.) Thanks. — SGconlaw (talk) 18:18, 8 March 2021 (UTC)Reply
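The date correction described here can be sketched as a small helper. This is a hedged illustration only: the function name is hypothetical, and it assumes the old positional dates always have the shape "Month Day Year" and that every date outside December belongs to 1716, in line with the publication span given above.

```python
MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def fix_freeholder_date(old_date: str) -> str:
    """Turn 'June 11 1715' into '11 June 1716'.

    The year is only changed for 1715 dates outside December, on the
    assumption that December issues really were published in 1715.
    """
    month, day, year = old_date.replace(",", " ").split()
    if month not in MONTHS:
        raise ValueError(f"unrecognised month in {old_date!r}")
    year_number = int(year)
    if year_number == 1715 and month != "December":
        year_number = 1716
    return f"{int(day)} {month} {year_number}"


print(fix_freeholder_date("June 11 1715"))      # 11 June 1716
print(fix_freeholder_date("December 23 1715"))  # 23 December 1715
```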
Kurdish language code
@Victar We had several discussions a few months back, in November, I think, and all the Kurdish editors were in agreement. See the Beer Parlour. I made almost all the changes at the time to convert 'ku' to 'kmr' or 'ckb', but never finished it, and a Kurdish editor (see just above) asked me now to finish the job, so I did. Benwing2 (talk) 15:27, 9 March 2021 (UTC)Reply
I remember there being talk about moving pages from the Kurdish header, but I don't recall any discussion about making ku a family code. If we make ku a family code, we should have a ku-pro to replace all the instances of {{desc|ku|-}}. This isn't a good solution to the problem. --{{victar|talk}}19:57, 9 March 2021 (UTC)Reply
Yes, that would be implied. In the case exampled in the link, they certainly weren't borrowed from Aramaic multiple times. Using ku-pro would be the best one-to-one replacement for all such examples. --{{victar|talk}}20:53, 9 March 2021 (UTC)Reply
Oof, I see WingerBot already replaced {{desc|ku|-}} with Kurdish: everywhere. Again, I don't recall any discussion on that choice. --{{victar|talk}}20:58, 9 March 2021 (UTC)Reply
@Victar, Fay Freak, Balyozxane I can certainly do a bot run to replace Kurdish: with something else if needed, but I strongly disagree with a blanket replacement to ku-pro. Declaring that a borrowing happened in Proto-Kurdish is only appropriate if the term was borrowed 1000 or more years ago (or whatever the age of Proto-Kurdish is). This may be appropriate for some Aramaic borrowings but hardly for the majority of borrowings into Kurdish. It's much more likely IMO that terms were borrowed into a single Kurdish language and then diffused through the others, or borrowed simultaneously into multiple Kurdish languages around the time the originating term was created. In both cases a noncommittal "Kurdish:" header is much better than "Proto-Kurdish". I have no problem with using the arrow template {{→}}, as Fay Freak suggested, before "Kurdish:" rather than separately adding |bor=1 to each language; either way works fine for me. Benwing2 (talk) 02:42, 10 March 2021 (UTC)Reply
It was likely borrowed in the Middle Iranian period from Middle Aramaic, so yes, Proto-Kurdish is probably appropriate. The scenario you're giving sounds more like a much later Arabic borrowing. Just because we have a proto language code, doesn't mean it should be used for entries, i.e. Proto-Armenian. That said, if someone wanted to come along and start creating reconstructed entries using the proper sources and methodology, I wouldn't object. --{{victar|talk}} 06:02, 10 March 2021 (UTC)
Those are examples to be remedied, not exemplified. I haven't gotten around to cleaning up the branches of SWIr, and we haven't decided on a name for the Caspian branch. Also, some families are areal, not genetic, like Southeastern Iranian, and I agree that those should be limited to family codes. You reply to my last comment, but you haven't said why a Proto-Kurdish code shouldn't exist. --{{victar|talk}}07:16, 11 March 2021 (UTC)Reply
@Victar I never said a code for Proto-Kurdish shouldn't exist. Feel free to create ku-pro if you wish. What I said was I object strongly to a blanket replacement of Kurdish: in Descendants tables with Proto-Kurdish. This should *only* happen if the term was actually borrowed into Proto-Kurdish, not if it was borrowed at a later period and diffused through the various Kurdish languages. In other words, you have to go case by case deciding whether to use ku-pro. For Iranian terms that were inherited, of course it's ok to use Proto-Kurdish, but that's not the majority of the cases. I also disagree with the idea that every line in a Descendants table needs to use a language code, that seems just an arbitrary assertion on your part. Areal families, for one, should not have any associated code; nor should "Hindustani" even though it's a convenient grouping of what is essentially a single language. Benwing2 (talk) 07:41, 12 March 2021 (UTC)Reply
If you have no objections to a ku-pro language code, I'll create it. As I said, areal codes are one thing, but there are cases where Proto-Kurdish did borrow terms into it, which surely applies to those borrowed from Middle Aramaic. --{{victar|talk}}04:50, 15 March 2021 (UTC)Reply
Semi-related: is the format at بئر#Descendants considered ideal? Personally, I'm not fond of how "Kurdish" is missing an arrow (making it look like an inheritance), and then there is an arrow before each Kurdish language (making them look like borrowings from "Kurdish"). It's not a problem in the sense that humans can easily intuit what we actually mean, but it still feels like we could be doing a better job of presenting this. —Μετάknowledgediscuss/deeds21:36, 16 March 2021 (UTC)Reply
@Metaknowledge I see your point. I had my bot make the changes in this fashion but User:Fay Freak thinks it's better to put the arrow before the word "Kurdish:" (even though it's not a language) rather than before each language. If you agree, I can do a bot run to change things in this fashion. Benwing2 (talk) 03:41, 17 March 2021 (UTC)Reply
I agree, but I think some kind of template is better than just typing in an arrow, especially for automated programs trying to read our etymologies. I also feel it might be useful to get more community engagement on this, rather than just running with my and FF's preference. —Μετάknowledgediscuss/deeds03:45, 17 March 2021 (UTC)Reply
This is an odd one: the problem is caused by the "LL." in the {{af}} |lang2= parameter: if I change it to "la", the problem goes away. If etymology-only languages aren't allowed in such cases, the code should detect that and give a real error message instead of Lua error in Module:compound at line 164: attempt to call method 'makeEntryName' (a nil value). Chuck Entz (talk) 15:57, 9 March 2021 (UTC)Reply
Hi, kindly replace:
#* {{RQ:Thackeray VF|37}}
#*: His jaw was '''underhung''', and when he laughed, two white buckteeth protruded themselves and glistened savagely in the midst of the grin.
→
#* {{RQ:Thackeray Vanity Fair|chapter=37|passage=His jaw was '''underhung''', and when he laughed, two white buckteeth protruded themselves and glistened savagely in the midst of the grin.}}
@Sgconlaw I fixed this issue and the previous Addison Freeholder issue except that it's too hard to change the date in the Addison Freeholder usages, because the date format is so varied. I would suggest you make the changes by hand; there are only about 30 usages. Benwing2 (talk) 06:42, 11 March 2021 (UTC)Reply
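A replacement of the kind requested above could be sketched as a regex substitution. This is only an illustration, assuming the old format always has exactly the two-line shape shown in the request; the function name `convert_quote` is hypothetical and this is not the bot's actual code. Passages containing `|` or `=` would need extra escaping that is not handled here.

```python
import re

# Matches the old two-line quotation format:
#   #* {{RQ:Thackeray VF|37}}
#   #*: passage text
OLD_QUOTE = re.compile(
    r"^(?P<prefix>#+\*) \{\{RQ:Thackeray VF\|(?P<chapter>\d+)\}\}\n"
    r"(?P=prefix): (?P<passage>.+)$",
    re.MULTILINE,
)


def convert_quote(wikitext: str) -> str:
    """Fold the chapter and passage into a single template call."""

    def repl(m: re.Match) -> str:
        return (
            f"{m['prefix']} {{{{RQ:Thackeray Vanity Fair"
            f"|chapter={m['chapter']}|passage={m['passage']}}}}}"
        )

    return OLD_QUOTE.sub(repl, wikitext)
```

Text that does not match the old two-line shape is returned unchanged, so the function can be run over a whole page.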
Hi, Benwing. For Sanskrit terms in other scripts (see Template talk:sa-sc#Sanskrit terms in other scripts), do you know of any template (or can you create one if not?) that simply detects and returns the script name/code of a particular text? I am looking for a template like {{script detection template|देवनागरी}} which returns "Devanagari" or "Deva". Even a module would do, if you can tell me which of its functions to invoke. I think this is possible because:
which implies that "some module" has the ability to detect script by text. I tried really hard to find something like this, but I could not. Sorry for bothering you. 🔥शब्दशोधक🔥04:43, 12 March 2021 (UTC)Reply
@शब्दशोधक If you know the language of the text in question, you can use findBestScript in Module:scripts/templates; you can invoke this from template code using {{#invoke:scripts/templates|findBestScript|देवनागर|sa}}. This should return the script code Deva. If you don't know the language of the text, you have to call the module function findBestScriptWithoutLang in Module:scripts from another module. I could create a template interface for this in Module:scripts/templates if you need it. Benwing2 (talk) 07:47, 12 March 2021 (UTC)Reply
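For readers who want to see the general idea of detecting a script from text outside the wiki, here is a minimal Python sketch. It is not how Module:scripts works; it simply assumes that the first word of each character's Unicode name (e.g. "DEVANAGARI") is a good enough proxy for the script, and the function name `guess_script` is hypothetical. It returns a Unicode name word, not a Wiktionary script code such as Deva.

```python
import unicodedata
from collections import Counter
from typing import Optional


def guess_script(text: str) -> Optional[str]:
    """Return the most frequent script-like word among the characters' Unicode names."""
    counts: Counter = Counter()
    for ch in text:
        # Only letters and combining marks carry script information here;
        # digits, spaces and punctuation are skipped.
        if not (ch.isalpha() or unicodedata.category(ch).startswith("M")):
            continue
        name = unicodedata.name(ch, "")
        if name:
            counts[name.split()[0]] += 1
    return counts.most_common(1)[0][0] if counts else None
```

A caller would still need its own mapping from these name words to script codes.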
@Metaknowledge, शब्दशोधक I implemented |sccat=1 in {{head}}; this automatically adds the correct 'LANG POSes in FOO script' category. I fixed the above Sanskrit headword templates to use this param. Metaknowledge, if you want the same thing done to the Ladino headword templates, I can do it too. Benwing2 (talk) 18:50, 13 March 2021 (UTC)Reply
Bot operation to change {{head|sa|noun}} to {{sa-noun}} and others
Hi, will this be possible? I've seen many (especially the older ones) Sanskrit entries use {{head|sa|<pos>}}, which doesn't categorise by scripts, which is why this should be fixed. If this will be too tedious to do, can you do something like this to the template {{head}}: if parameter 2=sa, then sccat=1? Thanks. 🔥शब्दशोधक🔥12:15, 31 March 2021 (UTC)Reply
@शब्दशोधक Hi. This is easy to do by bot. However, what about the parameters that need to go into {{sa-verb}}? They can't be filled in very easily automatically. I can certainly provide you the list of verbs that are missing parameters to {{sa-verb}}, but you'll have to either fill them in or provide me the correct parameters and I can fill them in. Benwing2 (talk) 01:35, 1 April 2021 (UTC)Reply
@शब्दशोधक See User:Benwing2/head-sa-verb. This is a list of all the cases that use {{head|sa|verb}}. There are 56 entries. Each of them is in the form <from> ORIG <to> NEW <end>. Here, ORIG is the current text of the headword, and NEW is currently a copy of this same text but you should change it to the appropriate call to {{sa-verb}} while leaving the ORIG text alone. If you can do this, I will use my bot to push all these changes to the appropriate entries. This should be much easier for you to do than having to go through and edit and save each page by hand. Benwing2 (talk) 04:47, 1 April 2021 (UTC)Reply
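The `<from> ORIG <to> NEW <end>` format described above lends itself to simple parsing. The following is a rough sketch of how such a file could be read, not the bot's real code; the function names are hypothetical, and `{{sa-verb}}` in the usage below is a placeholder with its real parameters left out.

```python
import re

# One directive: <from> ORIG <to> NEW <end>, possibly spanning several lines.
DIRECTIVE = re.compile(r"<from>\s*(.*?)\s*<to>\s*(.*?)\s*<end>", re.DOTALL)


def parse_directives(text: str) -> list:
    """Return (orig, new) pairs in the order they appear."""
    return DIRECTIVE.findall(text)


def pending_changes(text: str) -> list:
    """Keep only the pairs where the NEW side was actually edited."""
    return [(orig, new) for orig, new in parse_directives(text) if orig != new]
```

Because NEW starts as a copy of ORIG, comparing the two sides is enough to tell which entries have been filled in and are ready to push.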
I did the same for nouns; see User:Benwing2/head-sa-noun. There are 334 entries here, but I've already converted {{head|sa|noun}} to {{sa-noun}} on the NEW side and made a few other fixups, e.g. removing the |sc= param if present, so all you need to do is review the existing entries and make any changes, e.g. adding missing genders and translits. Thanks! Benwing2 (talk) 05:30, 1 April 2021 (UTC)Reply
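The mechanical part of the noun conversion (renaming the template and dropping |sc=) might look roughly like the sketch below. It assumes that the remaining parameters can be carried over unchanged, which may not match the real parameter names of {{sa-noun}}; the function name is hypothetical.

```python
import re

# {{head|sa|noun}} followed by any number of simple |param segments.
HEAD_NOUN = re.compile(r"\{\{head\|sa\|noun((?:\|[^{}|]*)*)\}\}")


def convert_head_noun(wikitext: str) -> str:
    """Rewrite {{head|sa|noun|...}} as {{sa-noun|...}}, dropping any |sc= parameter."""

    def repl(m: re.Match) -> str:
        params = [
            p for p in m.group(1).split("|")[1:]
            if not p.strip().startswith("sc=")
        ]
        return "{{" + "|".join(["sa-noun"] + params) + "}}"

    return HEAD_NOUN.sub(repl, wikitext)
```

Missing genders and transliterations still have to be reviewed by hand, as noted above.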
Help with subcategorizing Latin adjectives of the third declension
Hello, thanks for all your work on the modules and templates for Latin. I would find it helpful to have categories for Latin third declension adjectives that have one ending, two endings, or three endings based on gender. I think the auto-categorization of Latin words based on declension type is currently handled by Module:la-nominal, but I don't understand the code there well enough yet to know how it could be changed to add subcategories "Category:Latin third declension adjectives of one termination", "Category:Latin third declension adjectives of two terminations" and "Category:Latin third declension adjectives of three terminations". Could you possibly help me with this?--Urszag (talk) 06:52, 14 March 2021 (UTC)Reply
This form without implicit vowels is a spelling error for สวากขาตะ(svākkhāta), but the misspelling is commoner than the correct form without implicit vowels. I'm not sure if the spelling difference implies a difference in tone for the second phonetic syllable (this form IPA(key): /waːk̚˥˩/, but the correct spelling IPA(key): /waːk̚˨˩/) - perhaps @Octahedron80 can advise on that issue. I have found a Thai blog complaining about the misspelling. The misspelling very occasionally turns up in the Roman script, but I don't think at a frequency often enough to record. Besides, I have no durable quotation for the Roman script misspelling. Contrariwise, I don't recall finding a durable quotation for the correct Thai script spelling without implicit vowels. RichardW57 (talk) 20:03, 14 March 2021 (UTC)Reply
@RichardW57, Octahedron80 Sounds like a very weird edge case. Instead of resorting to {{head}} and a transliteration that just makes things even more mysterious, why don't you leave it using {{pi-adj}} and add a usage note explaining exactly what's going on, including the Latin translit and why exactly it differs from the Latin spelling? {{pi-adj}} doesn't generate a translit so it isn't wrong. Benwing2 (talk) 20:09, 14 March 2021 (UTC)Reply
A usage note would be an odd place to put the transliteration. Doesn't an explanation of the misspelling belong in the etymology? The explanation is that the epenthetic (I think 'svarabhakti' would be more precise) vowel in the Thai pronunciation of the onset (ancient /sv/) has been wrongly treated as a full vowel in the misspelling. RichardW57 (talk) 20:29, 14 March 2021 (UTC)Reply
@RichardW57 The problem is that no other Pali word even has a transliteration so it's far from clear what "transliteration" even means with Pali if not for the Latin spelling. I personally have no idea what the distinction is. So including a "transliteration" that's different from the Latin spelling is IMO worse than not including it; it's just confusing. Better to not include it, and include an explanation of what's going on (a usage note is generally the best place for such explanations, as a synchronic explanation is not an etymology; but it can go in the etymology if you prefer). Benwing2 (talk) 22:05, 14 March 2021 (UTC)Reply
@Octahedron80 I've put forward a Pali transliteration module, but I've had no progress in getting it adopted (or sent back for improvement). There are interface issues for the Thai and Lao scripts - I occasionally need to know whether I'm dealing with an abugida or an 'alphabet'. The glosses for non-Latin entries almost always say "xxx script form of blah", so it seems redundant to have the transliteration in the header, and without automatic translation, it is an invitation to error. There is also a specific transliteration issue with Lao. When it uses only the character set for Lao, d, dh, ḍ and ḍh are all represented by the same letter, ທ. Do I transliterate these all as 'd'? So far I've ducked the issue. RichardW57 (talk) 22:51, 14 March 2021 (UTC)Reply
The point here is that the word "savākkhāta" does not exist in Latin script Pali. It is a Thai script error, induced by Thai phonology. The correct form of the word is "svākkhāta". (Remember that the West has gone 'native', and accordingly uses Pali in its own script, i.e. that of the West. The East seems to have largely adopted European punctuation, though the Indians might be doing their own thing.) I repeated my Google frequency determinations. It turns out that, if I actually chase down the Google hits, correct and incorrect spellings get roughly equal numbers of hits in both systems, namely with and without implicit vowels. Without implicit vowels, the correct spelling gets about 7 times as many raw hits as the wrong one; the preponderance seems to be due to multiple copies of the same text. RichardW57 (talk) 22:51, 14 March 2021 (UTC)Reply
I've found a verifiable quote for yet another spelling of the word, "ส๎วากขาโต ภะคะวะตา ธัมโม". I hadn't seen yamakkan in alphabetic Pali before! It's in the Wat Concord chanting book. RichardW57 (talk) 00:38, 20 March 2021 (UTC)Reply
@RichardW57, Octahedron80 Octahedron80, you may have missed the ping. RichardW57, I still don't think the "transliteration" in the headword belongs there, whether you write it manually or use {{head}}, since no other Pali entries have transliteration in the headword. Benwing2 (talk) 03:47, 17 March 2021 (UTC)Reply
So how do you stop them from automatically appearing once a transliteration module has been accepted? Are you implying that there is a policy against having a transliteration module for Pali? If so, I suggest you reply at my request on the Information Desk for Module:pi-translit to be adopted. RichardW57 (talk) 08:07, 17 March 2021 (UTC)Reply
I mentioned in Wiktionary:Grease pit/2020/October#Experimental template deployed but not finished that the template {{+preo}} is marked experimental and not to be widely deployed. This did not get much response so I thought the template was, or should be made obsolete. But when I tried to replace the template with what I thought was more appropriate format, WingerBot undid the change, see the change history for denken. My main problems with the +preo template are that it is rather cryptic and that it puts grammatical information in a gloss when it would normally go in the label. Plus, as mentioned, it's marked experimental and it's undocumented so I don't know how to fix it if it comes out wrong.
There really should be some standard way to deal with prepositional verbs, but German, the language I'm mainly working on, has a number of variations on the idea and I'm not sure how a template can deal with them all rationally; is the preposition required or optional, are different prepositions possible, and in which case is the prepositional phrase if the preposition allows more than one? Plus I'm still not sure how to make the whole thing readable to someone who may not be terribly familiar with German grammar and doesn't want to have to figure out a string of cryptic symbols. I'd like to see some kind of consensus on the format first, then if possible have it implemented with templates. At least document the template, so someone other than the person who created it can use it correctly, before deploying it. --RDBury (talk) 02:26, 15 March 2021 (UTC)Reply
Lang. code
Hi, again. I just thought you might be able to change pra from an etymology-only language code to full-fledged language code, so that any Prakrit entries from now on will be "Prakrit" instead of "Maharastri Prakrit". The bot implementation can definitely wait until AryamanA has the time to do so. The ancestor of this new "Prakrit" should be Ashokan Prakrit -> Sanskrit -> PIA -> PII -> PIE. I'm not thinking of altering the already existing codes like psu, pmh, inc-psc, inc-kha, elu-prk, inc-mgd, etc. into etymology-only-codes (it'll be done later). If this is done, I'll be able to make some of its basic templates. Thanks. 🔥शब्दशोधक🔥03:23, 15 March 2021 (UTC)Reply
@Victar, शब्दशोधक AFAIK, there was a discussion among the South Asian contributors who agreed to group all the Prakrits similarly to how the Chinese languages are currently grouped. This would imply creating an overarching Prakrit language, I think. I did not take part in this discussion and I forget where it occurred; User:SodhakSH can you link it? Benwing2 (talk) 06:01, 15 March 2021 (UTC)Reply
@Victar Ordinarily I would agree with you but I think a lot of the Prakrit language codes were recently created so there isn't really any longstanding consensus on how to handle them. Benwing2 (talk) 06:33, 15 March 2021 (UTC)Reply
@Benwing2: While I think the idea of merging some of the Prakrits is warranted, it does seem like something that deserves a vote, especially since some Prakrits will be merged and some will not. Regardless, the current language code distribution is way too messy and seems to be hindering progress for MIA. —*i̯óh₁n̥C09:37, 15 March 2021 (UTC)Reply
Greetings! Can you do something to Module:typing-aids/data/sa so that aï changes into अइ on using {{chars|sa|aï}}? We all are currently working on Prakrit alternative forms in Devanagari script. I'll tell you why this is needed- see 𑀭𑀇 - on automatic Devanagari conversion, the spelling is given "रï" instead of the correct one "रइ". {{chars|sa|a-i}} and {{chars|sa|a-u}} already give अइ and अउ, so can you make it so that {{chars|sa|aï}} and {{chars|sa|aü}} are also able to? Thanks. 🔥शब्दशोधक🔥02:48, 30 April 2021 (UTC)Reply
Wherever, in etymologies, there is {{inh|LANG|psu/pmh/pka/inc-psc/inc-kha/inc-mgd}} (compare {{cog|psu/pmh/pka/inc-psc/inc-kha/inc-mgd|TERM}}), replace it with {{inh|LANG|inc-pra|TERM}}.
Replace the following codes in etymology sections of entries:
@SodhakSH I can help but why are you suggesting replacing the language codes? If you are trying to convert full languages into etym-only languages, it's not necessary or correct to change the codes. You should keep the same codes; just fix the cases that won't work when they are etym languages (I can help you find those cases) and then directly make them etym languages and fix any errors generated. Benwing2 (talk) 03:02, 7 May 2021 (UTC)Reply
Now Kutchkutch and Bhagadatta have already made the new codes. These older codes have a lot of lemmas so can't straightaway change them into etym-only. Once the lemmas of the old codes are cleared, we can simply remove them. 🔥शब्दशोधक🔥03:05, 7 May 2021 (UTC)Reply
@SodhakSH, Kutchkutch, Bhagadatta IMO it's still not correct to change the codes. (Furthermore, the new codes are not ideal. It's preferred to use the first three letters of each language name in the code whenever possible, rather than a random subset of letters.) For example, when I changed the code 'ku' for Kurdish from a language to a family I did not need to change the code. Instead I used my bot to move all the lemmas to Northern Kurdish or Central Kurdish (depending mostly on the script). In this case, a bot could potentially move the lemmas to be under Prakrit. (I don't know if this is realistic, it depends on how much reformatting is needed and whether this can be done automatically.) Benwing2 (talk) 03:10, 7 May 2021 (UTC)Reply
Still, we've started using these codes and I think these are just ok. Before too, such codes were there : Maharastri Prakrit = pmh, Magadhi (Indic) = inc-mgd, etc. 🔥शब्दशोधक🔥03:17, 7 May 2021 (UTC)Reply
I know that moving the codes psu, pmh etc to the etymology only module was the simpler solution, but doing so without first changing all instances of {{head|pmh}} and {{head|psu}} to {{head|inc-pra}} etc would cause errors. I'd have suggested waiting for these entries to be converted to Prakrit first, but by then the update of the descendants and etymology sections of different languages were already underway, which made these etym only codes necessary. Now I would suggest first replacing all pmh and psu entries with inc-pra so that pmh psu etc have zero lemmas and THEN replacing the etym only codes inc-pse, inc-pmh etc with psu, pmh respectively. A lot fewer entries have inc-pmh instead of pmh so converting inc-pmh to pmh will be easier than converting pmh to inc-pmh. -- 𝓑𝓱𝓪𝓰𝓪𝓭𝓪𝓽𝓽𝓪(𝓽𝓪𝓵𝓴)03:28, 7 May 2021 (UTC)Reply
@Benwing2: Yes, the duplicate language codes are unnecessary in the long run. However, until a bot can operate on the entries affected by the merger, the duplicate codes are/were needed as a temporary measure to prevent errors while editors edit Prakrit-related terms with the merged Prakrit language. Prakrit editors could have waited for the bot operations before creating the merged Prakrit language, but waiting would lead to a loss of valuable time that could be used to improve Prakrit coverage.
@Bhagadatta If it's preferred to use the first three letters of each language name in the code whenever possible and the initial p is disregarded, then the codes for the lects in ISO 639-3 would be:
inc-ard: Ardhamagadhi
inc-mah: Maharastri
inc-sau: Sauraseni
The codes for the lects not in ISO 639-3 would be:
@Kutchkutch, Bhagadatta, SodhakSH If you need to switch the codes of these lects from the original ones used as full languages, I'd strongly recommend using the codes suggested just above by @Kutchkutch rather than the "new existing" codes like inc-pmg. We can come up with a different scheme for Apabhramsa variants, maybe inc-apa-foo or just apa-foo. I can help rename any uses of the "new existing" codes to these regularized codes. Benwing2 (talk) 05:12, 8 May 2021 (UTC)Reply
Fine by me. First codes like pmh, psu, pka, inc-mgd etc in etymologies and descendants section will have to be replaced by inc-pra. Then, etym-codes like inc-pse, inc-pmh etc need be replaced by the new proposed codes like inc-sau, inc-mah and so on. Right? -- 𝓑𝓱𝓪𝓰𝓪𝓭𝓪𝓽𝓽𝓪(𝓽𝓪𝓵𝓴)05:22, 8 May 2021 (UTC)Reply
Hi, template guy. As you're currently working on a new Template:es-adj, I was wondering if you could work Template:es-adj-inv into it. That template is very simple, and only used for invariable Spanish adjectives. Ideally, the entries would have instead {{es-adj|inv=yes}} with the same output. Oxlade2000 (talk) 11:30, 20 March 2021 (UTC)Reply
@Oxlade2000 I did break this, but I added new functionality to replace it. Use the value +first for the plural to pluralize only the first word, e.g. {{es-noun|m|+first}}, and the value +first-last to pluralize the first and last word, e.g. {{es-noun|m|+first-last}} for abanico aluvial plural abanicos aluviales. There's also +second to pluralize only the second word but I'm not sure how useful this is. I added it originally for adjectives for expressions like más lento que el caballo del malo (plural más lentos que el caballo del malo). In adjectives you can write e.g. {{es-adj|sp=second}} to have it automatically generate the plurals and feminine based on only the second word, for example. I will document this all shortly. Benwing2 (talk) 11:48, 20 March 2021 (UTC)Reply
@Oxlade2000 The problem is that the old algorithm was very fragile and didn't work well in a lot of cases, and the code was very hard to work with. It worked OK if there was a 'de' or 'a' in the word but in two-word expressions it was wrong much of the time and it was hard to predict how it would function. I prefer making it a bit more explicit which words need to get pluralized. Benwing2 (talk) 12:24, 20 March 2021 (UTC)Reply
@Oxlade2000 I restored the old plural handling of words with 'de(l), a(l)' in them, now also 'con' and 'por'. You should still use +first, +first-second or +first-last for multiword expressions without a preposition in them. Benwing2 (talk) 15:10, 20 March 2021 (UTC)Reply
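To make the +first / +second / +first-second / +first-last values concrete, here is a small Python sketch of what they select. It is not the template's Lua code, and the pluralization rule is deliberately simplified (vowel-final words take -s, everything else takes -es; changes such as z to c and accent shifts are ignored). All names are hypothetical.

```python
VOWELS = "aeiouáéíóú"


def pluralize_word(word: str) -> str:
    """Simplified Spanish plural: -s after a vowel, -es otherwise."""
    return word + ("s" if word[-1].lower() in VOWELS else "es")


def pluralize_phrase(phrase: str, spec: str) -> str:
    """Pluralize only the words selected by spec."""
    words = phrase.split()
    targets = {
        "+first": [0],
        "+second": [1],
        "+first-second": [0, 1],
        "+first-last": [0, len(words) - 1],
    }[spec]
    # set() avoids pluralizing twice when first and last are the same word.
    for i in set(targets):
        words[i] = pluralize_word(words[i])
    return " ".join(words)
```

With this sketch, "abanico aluvial" with +first-last gives "abanicos aluviales", matching the example above.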
Hello. Since you have lately been doing bot edits to standardise the Etymology section of some language entries by inserting the terms “Inherited from …” and suchlike stuff, do you not think it would be worthwhile editing the templets {{inh}}, {{bor}} so that the full sentence appears: “Inherited / Borrowed from …”, just as other templets as {{clq}}, {{lbor}}, etc. function? If that be acceptable, then the parameters |nocap=, |notext= could be revisited. Thanks! -⸘- inqilābī‹inqilāb·zinda·bād› 21:17, 7 April 2021 (UTC) P.S. Though, |notext= is now not really needed, given the said standardisation. -⸘- inqilābī‹inqilāb·zinda·bād›21:23, 7 April 2021 (UTC)Reply
@Inqilābī At one point {{bor}} generated the text "Borrowing from ..." before the term. It was changed to remove the text to make it consistent with {{inh}} and {{der}} and because people preferred to write "Borrowed from" instead of "Borrowing from". If we were to add the text back to {{inh}} and {{bor}} it would need to be discussed at the Beer Parlour to make sure people are on board with it. Benwing2 (talk) 03:23, 8 April 2021 (UTC)Reply
In diff, English {{m|de|schnapps}} (with the right language name but wrong code) was changed to {{cog|de|schnapps}} when it should've been {{cog|en|schnapps}}. I don't know if you have a way of telling whether the bot changed any other cases where a language name and code didn't match...? (Ideally there wouldn't have been any such mismatches, but if there were, removing the language name will have made the potentially incorrect code harder to detect.) - -sche(discuss)03:15, 8 April 2021 (UTC)Reply
@-sche Thanks for pointing that out. That change was made by hand by me (in a text file containing all the German lemmas); the only thing the bot did was push the change to Wiktionary. (That's what "(manually assisted)" means in the changelog.) So it's unlikely there are very many more like this. I did changes like that fairly quickly, and I imagine I overlooked the context since "schnapps" looks like a German word, and just assumed the preceding text was wrong rather than the language code. Benwing2 (talk) 03:27, 8 April 2021 (UTC)Reply
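A check for mismatches between a written language name and the following template's code could be sketched as below. It has to run on the text as it was before the language name was removed. The name table is a tiny hypothetical subset, only single-word language names are handled, and the function name is made up for illustration.

```python
import re

# Hypothetical subset of the code-to-name table; a real check would need the full list.
LANG_NAMES = {"en": "English", "de": "German", "nl": "Dutch"}

# A capitalized word directly followed by {{m|CODE|TERM or {{cog|CODE|TERM.
NAME_THEN_TEMPLATE = re.compile(r"\b([A-Z][a-z]+) \{\{(?:m|cog)\|([a-z-]+)\|([^|}]+)")


def find_mismatches(wikitext: str) -> list:
    """Return (name, code, term) where a known language name disagrees with the code."""
    hits = []
    for name, code, term in NAME_THEN_TEMPLATE.findall(wikitext):
        if name in LANG_NAMES.values() and LANG_NAMES.get(code) != name:
            hits.append((name, code, term))
    return hits
```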
Minor issue with Accelerated Generation in Welsh
Hi Benwing! Hope everything is going well for you. Not sure if you're the right one to bring this up to, but maybe you can be some help.
I was on the page tlodaidd, and was about to create an accelerated entry for the softly mutated form "dlodaidd". However, when I went to do so, the auto-generated page incorrectly included that "dlodaidd" was also the equative degree. I believe this is because of another accelerated link for "dlodaidd" in the page for tlodaidd, where it mentions the equative degree as "cyn dlodaidd". However, "dlodaidd" alone is not an equative form, and is just the softly mutated form in the "cyn + _" equative construction. Compare with cyfforddus / cyngyfforddus, and contrast with cyfoethog / cyfoethoced.
Also note that I haven't created dlodaidd yet, so you can try it out and see for yourself... Hopefully nobody rushes in to generate it before that happens!
This isn't a showstopper or anything since I can easily remove the lines stating that "dlodaidd" is the standalone equative, but I thought I ought to raise that it's an issue, in case it's a simple fix with a template somewhere.
There's a third option, but I'm a bit fuzzy on the details. You can use Special:Search with the insource: keyword. The tricky part is escaping any characters that have a function in the search syntax so you don't get false results. The idea is to search for the string "|pa-old" in the wikitext in various namespaces. You would have to use some other search syntax for the Template and Module namespaces, but I'm sure there wouldn't be much use of hard-coding for an obscure code like that. I would suspect that most of the usage would be in mainspace and the Talk, User Talk, Reconstruction, Reconstruction Talk and Wiktionary namespaces. See mw:Help:CirrusSearch for documentation on the search syntax. Chuck Entz (talk) 03:25, 13 April 2021 (UTC)Reply
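As a rough illustration of the escaping problem mentioned above, the helper below builds an insource: regex query for a literal string by backslash-escaping every character that is not a letter, digit or space. This is a conservative assumption rather than the documented minimum set of characters that need escaping, and the function name is hypothetical; mw:Help:CirrusSearch remains the authority on the syntax.

```python
import re


def insource_regex_query(literal: str) -> str:
    """Build an insource:/.../ query that matches the given string literally."""
    # Escape everything except ASCII letters, digits and spaces.
    escaped = re.sub(r"([^A-Za-z0-9 ])", r"\\\1", literal)
    return f"insource:/{escaped}/"
```

The resulting string would be pasted into Special:Search with the desired namespaces selected.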
@Chuck Entz: Yes, the code is so obscure that I'm the one who added almost all instances of it. I tried looking for documentation on the search syntax, but wasn't aware of mw:Help:CirrusSearch.
Hi Benwing. Can you do something so that {{sa-decl-noun-m}}, {{sa-decl-noun-f}}, and {{sa-decl-noun-n}} are able to decline a few more Sanskrit consonant-stem terms like those ending in -t (-at, -it, -ut, -ṛt, etc.) -d, -dh, -bh, -j, -c and more? If it is possible, I can help with how these words are declined. Thank you. 🔥शब्दशोधक🔥04:35, 25 April 2021 (UTC)Reply
@SodhakSH Yes, I can look into this. I will respond to your pings tomorrow morning (it's late night where I am). BTW I hope you are staying safe, the news out of India now is pretty terrifying. Benwing2 (talk) 06:09, 25 April 2021 (UTC)Reply
Yes, I'll tell you. For now, though, see शिच्#Declension - declension is pretty simple for those ending in -c. Decl of -ic will be -ik -icau -icaḥ, -uc will be -uk -ucau -ucaḥ and same for -c. Masculine is just like feminine in these. Neuter is a little different. See below neuter endings (apply for all -ac, -ic, -uc, -ṛc, etc.):
k cī ñci (nom)
k cī ñci (voc)
k cī ñci (acc)
(rest - dat-loc is like masculine/feminine)
I want the table which appears on using {{sa-decl-noun-m}} to appear like it does in any normal a-stem declension.
Pitch accent is usually on the same vowel throughout except the vocative case, where it shifts to the first vowel only. 🔥शब्दशोधक🔥04:21, 2 May 2021 (UTC)Reply
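The nominative endings listed above for stems in -c can be written out as a small function. This is only a transcription of the endings given in this comment (masculine/feminine -k, -cau, -caḥ; neuter -k, -cī, -ñci), working on romanized stems; it is not the declension module, the function name is hypothetical, and accent and the other cases are left out.

```python
def nominative_c_stem(stem: str, gender: str) -> tuple:
    """Return (singular, dual, plural) nominative forms of a romanized stem ending in -c."""
    if not stem.endswith("c"):
        raise ValueError("expected a stem ending in -c")
    base = stem[:-1]
    if gender in ("m", "f"):
        # Masculine and feminine decline alike.
        return (base + "k", stem + "au", stem + "aḥ")
    if gender == "n":
        return (base + "k", stem + "ī", base + "ñci")
    raise ValueError("gender must be 'm', 'f' or 'n'")
```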
@SodhakSH I notice that the vocative singular of neuter nouns is different from the nominative singular, e.g. कमल. Is that correct? I always thought all IE languages had neuter vocative and nominative singular (and dual and plural) the same. Benwing2 (talk) 05:31, 2 May 2021 (UTC)Reply
No it is correct. The nominative singular is 'kamalam' while the vocative is 'kamala' (similar to the actual word). For some other words like manas, both the nominative and vocative are same - manas (contracted form : manaḥ). 🔥शब्दशोधक🔥02:46, 3 May 2021 (UTC)Reply
@SodhakSH I added support for -c nouns but for the others I need exact declension tables along with several examples so I can properly test on them. Benwing2 (talk) 20:46, 2 May 2021 (UTC)Reply
Hi! I've found your User name in the history of {{cuneiform}}. I've started to revise Akkadian, Sumerian and cuneiform entries these days, but I'm not familiar at all with templates or any programming language. I'd really like to clean up Akkadian and Sumerian entries and give them a better structure so that they might become useful to, if not scholars, at least people who are studying those languages. I've been trying to see what I can do with existing templates, but being quite a newbie, I'm between "I think I got it now" and "Please god kill me". Would you have time and energy to help/guide me on this endeavour? Thank you! Sartma (talk) 18:08, 25 April 2021 (UTC)Reply
Hey Ben. You've been working on IPA templates for various European languages, so I was curious whether you wanted to take on a project with Italian that I've been thinking about for a while (but which no bot-runner had the time and interest to take on). The problem is not with {{it-IPA}}, which is good, but with the fact that it is so underused. We can, in theory, bot-add {{it-IPA}} with the appropriate parameters to all Italian entries that lack IPA, have {{rhymes}}, lack <z>, and lack spaces. We can also bot-add it to entries that lack {{rhymes}}, but do have {{hyph}} where stress is marked (so where one of the arguments to {{hyph}} contains a vowel with a grave accent). Finally, we can bot-add it to any disyllabic Italian entries that lack both {{rhymes}} and {{hyph}}, but lack diacritics in the pagetitle and have any of <a i u> as their first vowel — but these would have to be scanned by a human first to catch borrowings with unchanged orthography, mostly from English. That would cover a lot of entries with minimal human effort. —Μετάknowledgediscuss/deeds21:50, 7 May 2021 (UTC)Reply
@Metaknowledge I will look into it. I have also thought of writing a bot to add {{es-IPA}} to all Spanish pages without it, except for those that are likely or possibly unadapted English words. I used a complicated regex to look for words that don't fit normal patterns of spelling in Spanish:
This pulls out 3,561 words that might have a nonstandard pronunciation (including some false positives), and 52,049 words that probably have standard pronunciation (including a few false negatives that have nonstandard pronunciations, which we might be able to pick out by hand). A sample, from booster to brownie, most of which I would not trust to have standard pronunciation:
@Metaknowledge, Erutuon I looked into this. First of all, I do think changes should be made to {{it-IPA}}. For one, I don't really like how it defaults to penultimate stress, closed <e> and <o>, and unvoiced <z>. I would prefer it to throw an error rather than defaulting to any of these, forcing the user to specify explicitly whether they want penultimate or antepenultimate stress, closed or open <e> and <o>, and unvoiced or voiced <z>. This is similar to how {{ru-IPA}} operates (w.r.t. stress) and {{pt-IPA}} operates (w.r.t. open or closed vowels and <x> except in certain positions). Otherwise, it's too easy for lazy editors to just add {{it-IPA}} to a page without looking into whether the output is wrong. What I'd prefer to do is run a bot to add the missing pronunciations whenever {{it-IPA}} is given without an argument, and then make specifying these features mandatory. I'd also prefer to use respelling to indicate whether <z> is voiced or unvoiced (e.g. using <ts> for unvoiced, <dz> for voiced), in case for example there is a combination of voiced and unvoiced <z> in a word, or multiple words with different pronunciations of <z> in them. Also, it looks like {{it-IPA}} doesn't yet handle multiple words. These are all things I can fix.
As for automatically adding {{it-IPA}}, I think this is a good idea. I generated some counts:
Total number of Italian lemmas: 117,079
Number of current lemma pages with {{it-IPA}}: 8,363
Number of current lemma pages with {{IPA|it}}: 7,848
Number of current lemma pages with either {{it-IPA}} or {{IPA|it}}: 16,193
Number of current lemma pages with {{hyph|it}}/{{hyphenation|it}}: 9,619
Number of current lemma pages with {{hyph|it}}/{{hyphenation|it}} with stress marked: 9,293
Number of current lemma pages with {{rhymes|it}}/{{rhyme|it}}: 10,014
Number of current lemma pages with either {{hyph|it}}/{{hyphenation|it}} or {{rhymes|it}}/{{rhyme|it}}: 14,516
Number of current lemma pages with either {{hyph|it}}/{{hyphenation|it}} with stress marked, or with {{rhymes|it}}/{{rhyme|it}}: 14,270
Number of current lemma pages with either {{hyph|it}}/{{hyphenation|it}} with stress marked, or with {{rhymes|it}}/{{rhyme|it}}, and without {{it-IPA}}: 11,362
Number of current lemma pages with either {{hyph|it}}/{{hyphenation|it}} with stress marked, or with {{rhymes|it}}/{{rhyme|it}}, and without either {{it-IPA}} or {{IPA|it}}: 10,706
So we have around 11,000 pages where the pronunciation can probably be autogenerated. I haven't yet looked into including disyllabic words with stressed <a i u> or excluding pages with <z> or pages that look like English words. I also think we should exclude pages with 'gli' + consonant (cf. aglina with /ʎ/, but aglifo with /gl/), and with 'sci' + vowel (cf. sciame with /ʃa/, but sciare with /ʃi.a/). I also think it's fairly safe to include words with certain well-known endings, e.g. verbs ending in '-are' or '-ire', nouns ending in '-mento', adverbs ending in '-mente', etc. I think the Catalan Wiktionary module for Italian pronunciation has a long list of these things (although it's not able to look into the contents of the page to see what the part of speech is, which my bot can do). Benwing2 (talk) 19:30, 9 May 2021 (UTC)Reply
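The selection criteria discussed above could be sketched roughly as follows. This is only an illustration, not the bot that was actually run: the function names are hypothetical, the input is assumed to be the wikitext of the Italian section alone, and "stress marked" is assumed to mean that a {{hyph}} argument contains an accented vowel.

```python
import re
import unicodedata

# Template names come from the discussion; everything else is an assumption.
HYPH_RE = re.compile(r"\{\{(?:hyph|hyphenation)\|it\|([^{}]*)\}\}")
RHYME_RE = re.compile(r"\{\{(?:rhymes|rhyme)\|it\|")
IPA_RE = re.compile(r"\{\{(?:it-IPA\b|IPA\|it\|)")
STRESSED = set("àèéìòóù")


def has_marked_stress(hyph_args):
    """True if any hyphenation argument contains an accented vowel."""
    return any(ch in STRESSED for ch in unicodedata.normalize("NFC", hyph_args))


def is_risky_spelling(title):
    """Spellings the thread proposes to exclude from automatic handling."""
    t = title.lower()
    if " " in t or "z" in t:
        return True
    if re.search(r"gli[bcdfghjklmnpqrstvwxyz]", t):  # 'gli' + consonant
        return True
    if re.search(r"sci[aeiou]", t):  # 'sci' + vowel
        return True
    return False


def is_candidate(title, italian_wikitext):
    """True if the page looks safe for a bot-added {{it-IPA}}."""
    if IPA_RE.search(italian_wikitext):
        return False  # already has a pronunciation
    if is_risky_spelling(title):
        return False
    if RHYME_RE.search(italian_wikitext):
        return True
    return any(has_marked_stress(m.group(1))
               for m in HYPH_RE.finditer(italian_wikitext))
```

A human pass would still be needed for unadapted borrowings, which this filter makes no attempt to detect.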
Agreed on all counts. I don't think the defaults in {{it-IPA}} are a big issue, but the logic behind changing them is sound. The only other thing I have to add is that this is a fantastic resource, if you don't know about it already. —Μετάknowledgediscuss/deeds19:49, 9 May 2021 (UTC)Reply
Thanks for taking this on. I like how {{ca-IPA}} handles this: it detects ambiguous patterns and automatically throws an error if no input is provided, and in most cases you can simply specify one vowel without spelling out the whole word again. – Jberkel21:09, 9 May 2021 (UTC)Reply
@Jberkel That is a good idea but needs to be modified for Italian, because there is no default stress (it can either be penultimate or antepenultimate). I'm thinking you can specify a single vowel, but you'll get an error if the same vowel occurs in the penultimate and antepenultimate syllable (or if there are multiple words). Benwing2 (talk) 21:51, 9 May 2021 (UTC)Reply
@Metaknowledge, Jberkel, Erutuon What endings should the module automatically handle? Currently only -izzare, automatically respelled -iddzàre. I'm thinking -mento (respelled -ménto), -mente (respelled -ménte), -ezza (respelled -éttsa), -zione (respelled -tsióne), -tore (respelled -tóre), -trice (respelled -trìce). Other possibilities: -are (either verb or adjective, but in both cases -àre), -ale, -ire, -oso (respelled -óso), -ello (is this always -èllo?), -ella (likewise), maybe -etto, -etta, -evole, -ibile. Maybe this is less necessary if we support the single-vowel notation and/or a partial-word notation like for words in -are. Benwing2 (talk) 22:06, 9 May 2021 (UTC)Reply
I think it might be overkill to stop treating penultimate stress as default. More respelling than it's worth, in the end. But I will accede to those actually doing the work, of course. —Μετάknowledgediscuss/deeds22:17, 9 May 2021 (UTC)Reply
@Metaknowledge, Jberkel, Erutuon Erutuon, can you speak to the section in Module:it-pronunciation that handles z? The logic is rather complex in deciding whether to default to voiceless or voiced, and whether to generate a single z as double. When looking through the existing manually specified pronunciations using {{IPA|it}}, there is no consistency as to whether a single z between vowels is specified as single or double. It actually appears that most voiced z between vowels are given as single, although some are double. Cf. alcazar given as /alˈkad.dzar/ or /al.kadˈdzar/, but apozema given as /aˈpɔ.d͡ze.ma/|, including with vowel lengthening before single z. There are even weirdnesses like azoto given as /adˈd͡zɔ.to/|, with double pronounced zz in the phonemic notation but single pronounced z in the phonetic notation. Who are the active editors here who are native speakers? Perhaps they can speak to what's going on? Benwing2 (talk) 01:32, 11 May 2021 (UTC)Reply
Not a native speaker, but my understanding is that there is no normative distinction in pronunciation between singleton and geminate z in Italian (whether voiced or voiceless), and different speakers have different intuitions about whether they are neutralized to long or short consonants. The same goes for /ʃ/, /ʎ/, /ɲ/; this set of consonants is sometimes referred to as "inherent" geminates. (According to Martin Maiden, some northern speakers who don't have robust consonant length distinctions in their native accent may make a spelling-pronunciation distinction in consonant length between words like spazi and spazzi, but this is not a feature of the established standard description.) Since there is no distinction, Wiktionary could transcribe the phonemes either way, but should be consistent (for each phoneme, and probably, in the treatment of the set of length-neutralized/"inherent geminate" phonemes). The correct phonetic length transcription is a tricky question since phonetic length is not dichotomous.--Urszag (talk) 02:07, 11 May 2021 (UTC)Reply
BTW I'm pretty sure I'm going to discard the current logic in Module:it-pronunciation to decide whether to default z to voiced or voiceless, and require that all z be explicitly respelled either ts or dz, except in certain recognized endings (probably -izzare, -zione, -izzazione, -ezza, -izia). Benwing2 (talk) 01:40, 11 May 2021 (UTC)Reply
I'm going to fix the module to correct many of the existing bugs (e.g. lack of multiword support, lack of support for falling diphthongs, incorrect handling of monosyllables including lack of recognition for which monosyllables are stressed vs. unstressed).
I'm going to discard the current complex logic that defaults <z> to either /t͡s/ or /d͡z/, and also discard the |voiced= param. Instead, you'll have to respell <z> as either <ts> or <dz>, and respell <zz> as either <tts> or <ddz>, except in certain recognized suffixes (see below). If you don't, you'll get an error. Note, the current logic that automatically converts single <z> between vowels (respelled <ts> or <dz>) to double /t.t͡s/ or /d.d͡z/ will be kept.
I'm going to require that, in cases with 3 or more syllables, the stress is marked using an acute or grave accent (in the process specifying the quality of <e> and <o>), except in a fairly large set of recognized suffixes (see below). If you don't mark it, you'll get an error.
I'm going to require that, if the stress falls on <e> or <o>, the quality is marked with <é> <è> <ó> <ò>, even if otherwise the stress wouldn't need to be marked (e.g. in monosyllables and disyllables). Again, exceptions for a large set of recognized suffixes (see below). If you don't mark it, you'll get an error.
I'm going to add the ability to specify only the stressed vowel, as long as the stress falls on the penultimate or antepenultimate and both vowels aren't the same. E.g. in anfiteatro you can just write {{it-IPA|à}}, which is the same as writing {{it-IPA|anfiteàtro}}. This won't work in e.g. angioedema because the penultimate and antepenultimate vowels are both <e>; if you write {{it-IPA|è}}, you'll get an error. This works for glides as well, e.g. for adenopatia you can just write {{it-IPA|ì}}.
The idea behind the changes above is to reduce the errors that are likely to appear when one of two fairly arbitrary choices is defaulted (penultimate vs. antepenultimate, é vs. è, ó vs. ò, /t͡s/ vs. /d͡z/), since lazy editors, esp. those less familiar with Wiktionary and with templates like {{it-IPA}}, are likely to write {{it-IPA}} without params and without properly checking the output.
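The single-vowel shorthand described above can be illustrated with a small sketch. It is not the logic of Module:it-pronunciation: the function name is hypothetical, and as a simplification it treats the second- and third-to-last vowel letters of the page name as the penultimate and antepenultimate syllables, ignoring diphthongs.

```python
import unicodedata

VOWELS = "aeiou"


def expand_single_vowel(pagename, accented):
    """Expand a lone accented vowel (e.g. 'à') into a full respelling.

    Simplifying assumption: the penultimate and antepenultimate syllables
    are approximated by the second- and third-to-last vowel letters.
    """
    base = unicodedata.normalize("NFD", accented)[0]  # 'à' -> 'a'
    positions = [i for i, ch in enumerate(pagename) if ch in VOWELS]
    candidates = positions[-3:-1]  # antepenultimate and penultimate vowels
    matches = [i for i in candidates if pagename[i] == base]
    if len(matches) != 1:
        raise ValueError(
            f"cannot place {accented!r} unambiguously in {pagename!r}")
    i = matches[0]
    return pagename[:i] + accented + pagename[i + 1:]


print(expand_single_vowel("anfiteatro", "à"))  # anfiteàtro
print(expand_single_vowel("adenopatia", "ì"))  # adenopatìa
```

With angioedema and è the two candidate vowels are both <e>, so the sketch raises an error, matching the behaviour proposed in the list.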
The current set of suffixes that will be recognized is as follows (they are given in their automatically respelled form, but will be recognized without any accents and with <z> rather than <ts> or <dz>): -ménte, -ménto, -ènte, -ènto, -iddzàre/-iddzàrsi, -àre/-àrsi, -ìre/-ìrsi, -iddzatóre, -sóre/-tóre, -iddzatrìce, -trìce, -iddzatsióne, -tsióne, -óne, -àcchio, -àccia/-àccio, -àggine/-ìggine/-ùggine, -àglia/-àglio, -ìglia/-ìglio, -àia/-àio, -àntsa/-èntsa, -àrio, -sòrio/-tòrio, -àstra/-àstro, -èlla/-èllo, -étta, -éttsa, -fìcio, -ièra/-ièro, -ìfero, -ìsmo, -ìsta, -ìzia/-ìzio, -logìa, -tùdine, -ùra, -ùro when not directly following a vowel (as in e.g. centauro), -iddzànte, -ànte, -iddzàndo, -àndo/-èndo, -àbile/-ìbile, -ànico/-ènico/-ìnico/-ònico/-ùnico, -àstica/-àstico/-ìstica/-ìstico, -àto/-àta, -àtica/-àtico/-ètica/-ètico, -ènse, -ésca/-ésco, -évole, -iàna/-iàno, -ìva/-ìvo, -òide, -óso. I'm specifically excluding <-etto>, <-otto>, <-osa> as having too much ambiguity in their pronunciation. I'm also focusing here on lemmas specifically; my assumption is that nonlemma pronunciations will likely be generated by bot, where the defaulted suffix rules are less of an issue. For this reason, I currently include <-iano>, which is frequently unstressed or <-ìano> in third person plural verb forms. Maybe I should exclude it instead.
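As a rough illustration of how such suffix defaults might be applied, here is a sketch using a handful of the endings listed above. The table is deliberately incomplete, the function name is hypothetical, and the real module may order or condition its rules differently; the only special case modelled is -ùro, which is skipped directly after a vowel.

```python
import re

# (plain ending, respelled ending); a small subset of the list above.
SUFFIXES = sorted(
    [
        ("izzazione", "iddzatsióne"),
        ("izzare", "iddzàre"),
        ("mente", "ménte"),
        ("mento", "ménto"),
        ("zione", "tsióne"),
        ("trice", "trìce"),
        ("ezza", "éttsa"),
        ("tore", "tóre"),
        ("oso", "óso"),
        ("uro", "ùro"),
    ],
    key=lambda pair: -len(pair[0]),  # longest first: -izzazione before -zione
)


def default_respelling(word):
    """Return the respelling implied by a recognized suffix, or None.

    Raises ValueError if a bare <z> (one not written ts/dz) is left over,
    mirroring the proposed requirement that <z> be respelled explicitly.
    """
    for plain, respelled in SUFFIXES:
        if not word.endswith(plain):
            continue
        stem = word[: -len(plain)]
        if plain == "uro" and stem and stem[-1] in "aeiou":
            continue  # e.g. centauro: -uro directly after a vowel
        result = stem + respelled
        if re.search(r"(?<!d)z", result):
            raise ValueError(f"<z> in {word!r} must be respelled as ts or dz")
        return result
    return None  # no recognized suffix: stress must be marked by hand
```

For example, organizzazione comes out as organiddzatsióne, while zuccheroso raises an error because the suffix -oso fixes the stress but says nothing about the initial <z>.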
The plan for how to implement this is first to use a bot to explicitly specify the pronunciation (including accent mark and respelled <z>) for all current instances of {{it-IPA}}, then change the module according to the plans above, and then do another bot run to convert instances to the default form or short vowel-only form as appropriate.
I'm pleased with the idea of using more straightforward morpheme-based rules for z. It was a cute idea to rely on generalizations, but because they were complicated and not always correct, it was hard to ensure the output was correct in all cases. About using dz, ts, I'm wondering if Italian ever has /t.s/ contrasting with /t͡s/ or /d.z/ contrasting with /d͡z/. I guess probably not, but they could be written t.s and d.z anyway. — Eru·tuon18:49, 17 May 2021 (UTC)Reply
Latest comment: 3 years ago2 comments2 people in discussion
In this edit: two sets of definitions with different etymologies were added as if they all sprang from the same etymology. I have corrected the page, but I thought this might be relevant info about the operation of WingerBot. --Geographyinitiative (talk) 20:25, 15 May 2021 (UTC)Reply
@Geographyinitiative Thanks, this is my own doing, not my bot's. The text "manually assisted" in the commit message means that the only use of the bot was to push changes I manually made in a text file into the entry. I don't remember this exact word but I imagine I looked up Huaxi in Wikipedia and added definitions accordingly, not correctly accounting for the different characters in the different uses. Benwing2 (talk) 19:05, 16 May 2021 (UTC)Reply
Latest comment: 3 years ago1 comment1 person in discussion
Cheers! I was making some edits and I noticed the result of Template:pt-IPA for batom. First of all, I don't know about the Brazilian pronunciation (yours says it's /baˈtõ/, while the page states /baˈtõw̃/; I have no idea which one should be correct). However it was the European pronunciation that raised my curiosity. The pronunciation should be /baˈtõ/ with an open "a" (see Infopédia and Priberam). I've edited the page accordingly, but I refrained from using the template.
The main reason I'm writing this comment is that I'm wondering if it's a rule to have an open vowel in the penultimate syllable, preceding the "-om". Alternatively, could it just be an artifact from French? Other examples are (I personally read all of them with an open "a"; I also read the /õ/ as /ɐ̃w̃/, but I think that's unrelated):
I can't think of examples with other vowels, but at least I get the impression that the "om" opens the vowels in the previous syllables. At least that's how I would pronounce them. I'm sorry I can't be more specific or accurate in describing the issue, but I hope this helps you somehow. - Sarilho1 (talk) 15:22, 18 May 2021 (UTC)Reply
(New) Sanskrit (neologisms)
Latest comment: 3 years ago9 comments5 people in discussion
Hi. The categories CAT:New Sanskrit and CAT:Sanskrit neologisms are essentially the same. One can say that a newly coined Sanskrit term is either a neologism or part of the New Sanskrit vocabulary. There should be one proper category that contains all such Sanskrit terms, rather than half in one category and half in the other. I tried to do something, but then all the neologism labels crashed. Either of these 2 can be done:
Bot operation to change all {{lb|sa|neologism}}s to {{lb|sa|New Sanskrit}}
Altering the label module for turning {{lb|sa|neologism}} to an alias of {{lb|sa|New Sanskrit}}, as I had tried (preferably without causing the failure of the other LANGs neologisms)
@SodhakSH If there's no essential difference between "New Sanskrit" and "Sanskrit neologisms", I'd prefer to go with the latter as we have "Foo neologisms" for various languages. I think it should be possible to change the label 'New Sanskrit' to be an alias of 'neologism' for Sanskrit. If we do that, it seems to me there's no reason for a category Category:New Sanskrit to exist. Benwing2 (talk) 00:55, 20 May 2021 (UTC)Reply
sa-neo is an etymology-only language code (and also pretty unnecessary IMO) for New Sanskrit. I don't think it is always necessary to give in an etymology that the source term is newly-coined or a thousand-year-old. So perhaps sa-neo could be removed. All "sa-neo"s in etymologies would have to be changed to "sa"s and then CAT:Terms derived from New Sanskrit would be empty. If New Sanskrit should be deleted, then even a label "New Sanskrit" should not exist. Would you be able to manage these 2 bot operations - etym. sa-neo to sa and label change {{(t)lb|sa|New Sanskrit}} and {{(t)lb|sa|Neo-Sanskrit}} to {{(t)lb|sa|neologism}}? 🔥शब्दशोधक🔥03:55, 20 May 2021 (UTC)Reply
@SodhakSH, Bhagadatta, AryamanA, Kutchkutch Hmm. I thought about this and I'm not sure "New Sanskrit" and "neologism" are the same. By analogy with Latin, for example, "New Latin" (also known as "Modern Latin") includes any Latin used after 1500 or so. A neologism, on the other hand (per Wikipedia), is a term coined in the last 20 years or so that hasn't gained wide acceptance. In the case of Latin, there are tons of New Latin terms (e.g. the names of almost all chemical elements, terms of plants and animals, etc.) that aren't neologisms in the sense that they're widely accepted, frequently borrowed into other languages, etc. An example is kalium (“potassium”). On the other hand, terms like pedilūdium (“soccer”) or ars robotica (“robotics”) might well be neologisms (although the latter term was used by Pope Francis in a Vatican encyclical). The fact that a term like autovehiculum (“car”) has three synonyms autocinēta, autoraeda, autocurrus suggests to me that all are neologisms. Wiktionary isn't always good at marking the distinction but it clearly exists. I don't know enough about Sanskrit to say whether the same applies. Benwing2 (talk) 06:01, 20 May 2021 (UTC)Reply
@SodhakSH: Conceptually, there could be a distinction between "New Sanskrit terms" and "Sanskrit neologisms". "Sanskrit neologisms" could refer to terms coined within the last "generation", while "New Sanskrit terms" could refer to terms coined after a certain point in time but before the current "generation". However, maintaining such a distinction may be difficult. Kutchkutch (talk) 09:24, 21 May 2021 (UTC)Reply
There is Sanskrit literature from 300 or 400 years ago which is different from the Sanskrit used in school textbooks which has words for modern appliances, etc. Benwing is right. -- 𝓑𝓱𝓪𝓰𝓪𝓭𝓪𝓽𝓽𝓪(𝓽𝓪𝓵𝓴)02:10, 22 May 2021 (UTC)Reply
@SodhakSH: Not necessarily. Sanskrit neologisms come under ‘Contemporary Sanskrit’ rather than ‘New Sanskrit’; compare Contemporary Latin vs. New Latin. By tradition these ought to be classified differently, so a subcat. is not a good idea. @Benwing2: If need be, we can even delete the neologism categories for classical languages and instead use the label “Contemporary LANG” for them. Would that be okay? -⸘- dictātor·mundī15:19, 27 May 2021 (UTC)Reply
Quotation template replacements
Latest comment: 3 years ago3 comments2 people in discussion
Hi, when you have time could you please do the following replacements?
@Sgconlaw Done. My apologies about the other request of yours, I have been meaning to get to it for a while. Template renamings are easy, while other sorts of formatting may require more work because they need a custom script. Benwing2 (talk) 03:08, 27 May 2021 (UTC)Reply
Latest comment: 3 years ago16 comments4 people in discussion
Hey, you said in the BP discussion that you were in favour of creating the two templates {{inh+}} & {{bor+}}. So would you like to cast your vote in the crucial vote?— given that you cannot implement something when there’s no consensus for it. Thank you. -⸘- dictātor·mundī00:30, 27 May 2021 (UTC)Reply
@Inqilābī I voted; however, I disagree with the general premise of the vote. If everyone needed a vote to add any new feature, nothing would get done. Benwing2 (talk) 03:06, 27 May 2021 (UTC)Reply
Benwing, I agree with you, but you forgot about the (in)famous Victarian opposition! And, some more users were opposed to the proposal as well. I really wish we could ignore their oppositions— is that achievable? -⸘- dictātor·mundī15:37, 27 May 2021 (UTC)Reply
Since the vote has failed, I am thinking of an alternative way to make the etymological text appear: can you please add a new parameter |withtext= for {{inh}} & {{bor}}? That would be very nice, and we can preferably have a shorter one, |wt=. There’s even no need to ask approval for this as only a new parameter is to be added. -⸘- dictātor·mundī02:04, 29 May 2021 (UTC)Reply
@Inqilābī: That seems to be ignoring the result of the vote (i.e., against no consensus to add the text; the vote was for the templates, however, the point was to add text in etymology sections). J3133 (talk) 00:50, 30 May 2021 (UTC)Reply
@J3133: I am not aware of any user who was entirely against having the text. Only one or two, did say (even then they did not put much emphasis on it) that no wording was necessary for inheritances (inh only). On the whole, a formal vote on this was needless (as Benwing & others have stressed), and most people support the proposal anyway. Lastly, as a side note I would welcome comments from people who took part in the vote (folk from all sides, not just the supporters) rather than take your honourable advice (you did not even take part in the BP discussion(s)!), so I bid farewell to your bureaucratic self. -⸘- dictātor·mundī01:12, 30 May 2021 (UTC)Reply
Apparently I am bureaucratic when you needlessly created a vote only to do what you wanted after it ended. Lastly, I will quote Mahagaja (who voted and therefore should be “welcomed”) instead of being hushed after your farewell made in bad faith to ignore opposers: “These are unnecessary. If you want to say "Borrowed from" or "Inherited from", just write it in.” This seems to be against your idea, are we bidding farewell to Mahagaja? J3133 (talk) 01:29, 30 May 2021 (UTC)Reply
As you could only read the first words, I believe yours is what we should worry about. “If you want to say "Borrowed from" or "Inherited from", just write it in.” See write. J3133 (talk) 03:13, 30 May 2021 (UTC)Reply
Not sure what you really want to tell or prove. The opposers were against what MK called a ‘templatisation creep’ or what Jberkel called ‘redundant/overlapping templates’. I have no time to explain the same thing over and over again; why not ask those opposers themselves if you have comprehension disability? @Metaknowledge would be a good person to start with. The lost vote has no implication of any prohibition to introduce a single, new parameter. I am just waiting for Benwing’s response, no one had invited you for a debate in his talk page. Lastly, to analyze your bearing in this post, it is very natural you would behave thus: having neither taken part in the BP discussion(s) nor cast your vote, you have become all-knowing, right? -⸘- dictātor·mundī03:58, 30 May 2021 (UTC)Reply
Next time your disability prevents you from understanding etymologies perhaps do not suggest more ‘templatisation creep’ and immaturely shoving your opinions on everyone who opposed (as can be seen above). If you were all-knowing, you would understand if your templates were opposed (and caused you to hurriedly create a vote before thinking), your parameters, also, are not immune, rather than being salty and mindlessly blathering. J3133 (talk) 04:30, 30 May 2021 (UTC)Reply
@Inqilābī: |withtext= is too long to type, so please choose |tx=1 or |wt=1. As I said, link to the glossary is very important, so still I advocate a bot operation. I hope you have not forgotten the old discussion on this — "Inherited" is not as obvious and self-explanatory as you think it to be since I couldn't understand what it was. 🔥शब्दशोधक🔥13:55, 27 June 2021 (UTC)Reply
Such headstrongness is unwholesome, I have said sundry times that I am against a bot operation, and I have stated the reason clearly. By the way, putting the parameter should produce the link automatically, of course. ·~dictátor·mundꟾ14:09, 27 June 2021 (UTC)Reply
Replacement requests
Latest comment: 3 years ago3 comments2 people in discussion
Hello, could you please carry out the following replacements?
pt-IPA is close to being ready. I need to make some changes to the handling of hiatuses. Please don't use it until then. As I make those changes I'll document it.
it-IPA is ready and in use. What do you mean by manual parameters?
I'm referring to edits like diff; having |1=giocàre produces the same thing as raw it-IPA. I assumed you were reviewing/getting pronunciations uniform before removing these parameters when they're the expected output from the pagename alone. I guess it's fine to keep any redundant parameters, but my question is whether there's a reason to add them now in cases where raw it-IPA is correct. Ultimateria (talk) 16:51, 3 June 2021 (UTC)Reply
@Ultimateria I see. Yeah, originally I had to make all the parameters explicit before changing the handling of the module. At this point it's fine to use {{it-IPA}} by itself when it works. I've been habitually adding the explicit pronunciation but I'll probably stop doing this, and do a bot run to remove the redundant params. Benwing2 (talk) 03:38, 4 June 2021 (UTC)Reply
Bot's Italian mistakes
Latest comment: 3 years ago4 comments2 people in discussion
Okay fixed those two individual entries. I don't understand the grave vs acute accent distinction, so have used the same grave accent for the corrected participles. Kritixilithos (talk) 11:07, 3 June 2021 (UTC)Reply
Latest comment: 3 years ago7 comments3 people in discussion
Hello,
while I don't mind having the qualifiers before the referred-to items, I think you need to be aware that it necessitates hundreds of changes to be made manually in cases when the qualifier referred to more than one term and now it gives wrong information, changes like this one. Could you help me find such instances in Hungarian with some smart search? Adam78 (talk) 06:08, 13 June 2021 (UTC)Reply
@Adam78 I can go through the latest dump (of 2021-06-01) looking for uses of {{syn}} and {{ant}} in Hungarian lemmas (maybe also non-lemma entries) that have a qualifier attached to the last entry. Would that work? BTW my instinct for handling cases like this is to duplicate the qualifier on each entry needing a qualifier, but your solution seems to work fine too. Benwing2 (talk) 06:12, 13 June 2021 (UTC)Reply
This happens in English entries too (or at least I've used it in that way before). I'm a little uncertain whether it's more common for the qualifier to be in front of rather than after the term, though. In "Derived terms", "Related terms" and "Translations" sections, we seem to place it after the term. — SGconlaw (talk) 06:33, 13 June 2021 (UTC)Reply
@Sgconlaw How have you used it, similarly to what User:Adam78 has done in saying "the last two are colloquial" (or whatever)? If you just say "colloquial", it becomes ambiguous whether you're referring to the preceding entry only or the preceding two/three/etc. entries. My thought for putting it before is that (a) most (all?) templates that take a |q= or |qual= param put it before (the templates you mention above don't have such a param, it may be just convention to put a separate qualifier after, and in those sections the ambiguity doesn't appear since the terms usually stand by themselves), (b) if you have a list of several entries and the last one needs a qualifier, putting it before makes it unambiguous. The idea is that it's more likely the things at the end of the list need qualifiers than the things at the beginning of the list. But I could reconsider if several people think it's better to go after. Benwing2 (talk) 06:47, 13 June 2021 (UTC)Reply
@Benwing2 yes, I've sometimes used expressions like "the last two obsolete" or "all obsolete", if I recall correctly. I don't really know whether the qualifier should be before or after the term, but I suppose we ought to be consistent one way or another. One possible difficulty of having the qualifier in front is how it would look in combination with the use of {{sense}} – you'd have the sense in parentheses, followed by another set of parentheses with the qualifier which might be a bit odd. Putting the qualifier in front on rhymes pages might also look a bit odd. (On a related point, perhaps {{l}} should have a |qualifier= parameter built in so that it isn't necessary to separately add {{qualifier}}.) — SGconlaw (talk) 07:06, 13 June 2021 (UTC)Reply
I'm not sure it only concerns those where there is a qualifier referring to the last term; I think it would be more expedient to find strings expressing English numerals (including "both") in any qualifier field within nym invocations. But maybe we'd better postpone it to early July so that cases from the first half of June can be included, and also we can see by that time if it's really the final solution or perhaps reverted. Adam78 (talk) 12:45, 13 June 2021 (UTC)Reply
@Adam78 I'll go ahead and pull out all cases of {{syn|hu}} where there's a qualifier and see what's going on. Also, the dump is generated twice a month (on the 1st and the 20th), so we don't have to wait till July. Benwing2 (talk) 17:01, 13 June 2021 (UTC)Reply
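The search discussed above (qualifiers that talk about several terms at once) might look something like the sketch below. It is an assumption-laden illustration rather than the script that was run: the function name is hypothetical, qualifier parameters are assumed to be named like |q=, |q2= or |qual2=, and templates nested inside {{syn}} or {{ant}} are simply not handled.

```python
import re

# Matches flat {{syn|hu|...}} / {{ant|hu|...}} calls (no nested templates).
NYM_RE = re.compile(r"\{\{(?:syn|ant)\|hu\|([^{}]*)\}\}")
# Assumed qualifier parameter names: q, q1, q2, ..., qual, qual1, ...
QUAL_RE = re.compile(r"^(?:q|qual)\d*=(.*)$")
# Words suggesting the qualifier covers more than one term.
COUNT_WORDS = re.compile(
    r"\b(?:both|all|last|first|two|three|four|five|\d+)\b", re.IGNORECASE)


def suspicious_qualifiers(wikitext):
    """Return qualifier texts that seem to refer to several terms."""
    hits = []
    for call in NYM_RE.finditer(wikitext):
        for param in call.group(1).split("|"):
            qual = QUAL_RE.match(param.strip())
            if qual and COUNT_WORDS.search(qual.group(1)):
                hits.append(qual.group(1))
    return hits


sample = "{{syn|hu|kutya|eb|q2=the last two are colloquial}}"
print(suspicious_qualifiers(sample))  # ['the last two are colloquial']
```

Run over each Hungarian entry in a dump, the hits would give a worklist for manual review; a plain "colloquial" qualifier is not flagged.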
Quotation template replacements
Latest comment: 3 years ago3 comments2 people in discussion
Hello, when you are free kindly carry out the following quotation template replacements:
@Chuck Entz Yes, the issue is somewhere in the spelling of "neighborhoods" vs. "neighbourhoods". I made all European countries and all former British colonies use the -our- spelling and everywhere else use the -or- spelling. Let me look into this and see why it's generating the -or- spelling. Benwing2 (talk) 02:49, 14 June 2021 (UTC)Reply
@Inqilābī: mmm, why not? I’ve been creating quotation templates which link to first or early editions of works which are available online for use in our entries for some time now. When I come across a template that has not been linked, I see if an online version of the actual work is available, and if so I link the work to the template. — SGconlaw (talk) 16:51, 19 June 2021 (UTC)Reply
@Sgconlaw: No no, I was talking about changes like Bronte Wuthering → Emily Bronte Wuthering Heights, i.e., those changes of the names of the templets (sorry my wording was unclear) that you are asking Wingerbot to make. ·~dictátor·mundꟾ17:03, 19 June 2021 (UTC)Reply
@Inqilābī: oh I see. In many cases there was already a pre-existing template like {{RQ:Emily Bronte Wuthering Heights}}. However, the editor Wonderfool (who goes by a changing series of usernames), who has been helping to locate quotations in entries tagged with {{rfquotek}}, has a tendency to create new templates or redirects which aren't really necessary. When I come across these, I ask Benwing2 for help to replace the unnecessary uses with the primary quotation template. (For example, I recently found that {{RQ:Hamilton Lectures on Metaphysics and Logic}} had the redirects {{RQ:Hamilton ML}} and {{RQ:Hamilton LML}} which are difficult to interpret, so I shortened the primary template to {{RQ:Hamilton Metaphysics and Logic}} and asked for replacement of all uses with the primary template so the redirects can be deleted.) — SGconlaw (talk) 17:57, 19 June 2021 (UTC)Reply
Removing ‘Old Galician’
Latest comment: 3 years ago6 comments3 people in discussion
That might not be the best solution. From a Galician point of view, "Old Portuguese" is but an alias of "Old Galician". It might be better to substitute "from Old Galician/Old Galician-Portuguese". At any rate, I think the Galician editors should have a say in this- or at least be informed about it before it's done. Chuck Entz (talk) 16:36, 19 June 2021 (UTC)Reply
@Inqilābī, Chuck Entz @Inqilābī, be aware that this issue is politicized in modern Galicia. There are two camps, one which thinks of Galician as a separate language from Portuguese and spells it using Spanish conventions (sanctioned by the government of Spain) and another which thinks of Galician as a dialect of Portuguese and spells it using Portuguese conventions. Both grew from the common Old Galician-Portuguese language, which Wikipedia calls Galician-Portuguese and Wiktionary calls "Old Portuguese". The tricky thing here is that Old Galician-Portuguese was physically spoken in the area of modern Galicia up until about 1100 or so, at which point it expanded south during the Reconquista. Since Wiktionary uses the term "Old Portuguese", it's arguable that the etymology sections should just read the same, but the Galician editors understandably don't seem to like that much. Benwing2 (talk) 19:00, 19 June 2021 (UTC)Reply
We're talking about Galician entries here, not Fala. Besides which, nobody talks about "Old Falan" in English. My main point, though is that you were advocating something that might very well be seen by Galician editors as a sneaky attempt to make a political point- sort of like taking "Myanmar" out of the template at Category:Rohingya language. Right or wrong, such things should only be changed by consensus. Chuck Entz (talk) 19:28, 19 June 2021 (UTC)Reply
Well… then I guess it would be noncontroversial to change the name of the ancestor language to Old Galician-Portuguese, or maybe even Old Galician since it originated from Galicia. It just looks so odd to have ‘Old Galician’ written manually beside the templatised ‘Old Galician-Portuguese’. Let us finalise what name to choose, and then perhaps I would suggest that in the BP. (The name Galician-Portuguese is not a goodun, as it sounds like a modern dialect continuum rather than an ancestor language.) ·~dictátor·mundꟾ22:22, 19 June 2021 (UTC)Reply
Lua local environment
Latest comment: 3 years ago3 comments2 people in discussion
Latest comment: 3 years ago2 comments1 person in discussion
eg. မၞိဟ်တြုံ
I usually use bzgrep at zhwikt, but I don't download the enwikt dump; can you help remove the links? bzgrep -e '\
However, in Italian razzi is the second-person present conjugation of razziare, not razzii, which would be pronounced /rat͡s.ˈt͡siː/ (razzìi) rather than /ˈrat͡s.t͡si/ (ràzzi) like the verb; as you can see here, in Italian razzii is the plural of razzio. I advise you to use this site for the conjugations, because it is reliable.--BandiniRaffaele2 (talk) 00:52, 30 June 2021 (UTC)Reply
@BandiniRaffaele2 The site you quote is not so reliable, and doesn't even give the position of the stress or the quality of written e and o. DiPi agrees with the two dictionaries I cited. Benwing2 (talk) 01:01, 30 June 2021 (UTC)Reply
However, in this case (aussäen), the auxiliary verb "sein" is wrong. In German, there are two auxiliary verbs that can be combined with the past participle: "haben" and "sein". The former is the prevalent one, whereas the latter is used less often. So you can't do this conversion job with a bot. Instead, in each individual case you first have to check which of the two auxiliary verbs is correct.--2003:CF:3F3F:6CD:9908:7B97:F9FA:9CD2 10:44, 4 July 2021 (UTC)Reply
@2003:CF:3F3F:6CD:9908:7B97:F9FA:9CD2 Hi. Not sure if I can ping an IP; you should create a user account. It may not be obvious, but in this case the former template had an s in its params indicating that the auxiliary should be sein. The conversion merely preserved the auxiliary that was already there. If there was an error to begin with, it got propagated; my bot didn't make any new mistakes. Benwing2 (talk) 13:00, 4 July 2021 (UTC)Reply
Hi Benwing2. Well, I'm glad to hear that this error wasn't caused by your bot, which implies that no further action is needed. It was a single, isolated error made by the creator of this entry. I couldn't find out what caused it because the previous template had already been deleted. Thanks for your quick reply.--2003:CF:3F3F:65E:DD8A:3F93:D3F6:C3D9 15:56, 5 July 2021 (UTC)Reply
New CJK Pages
By creating these pages from scratch, you've taken them off of everybody's watchlists. They should have been created by the method used for the monthly pages. I'll have to get myself back up to speed on what I did at the beginning of the year to see if we can fix this. I believe it would involve something along the lines of temporarily deleting the CJK pages, moving the N pages on top of them, then moving the N pages back with redirects left behind, then restoring the CJK pages over the redirects. Chuck Entz (talk) 13:09, 6 July 2021 (UTC)Reply
I went ahead and did it for the rfv page. I forgot to delete the redirect before restoring the other edits and had to redelete and selectively restore, but it should be back to normal. I see I was mistaken about your having already created the rfd page. I went ahead and moved rfdn to rfdcjk, then moved it back. Rfdcjk is now a redirect that's on everybody's watchlists. Whenever we get around to creating the new page, it will simply be a matter of replacing the redirect. Chuck Entz (talk) 13:57, 6 July 2021 (UTC)Reply
Everything I know I pieced together from examining how @Rua used to do it. At the beginning of the year it looked like no one was going to create the new monthly pages, so I looked at the edit histories from years past and figured it out just well enough to get it all done- with a bit of trial and error. Apparently, when you move a page the system updates all of the watchlists with the new page name, but doesn't remove the old pagename. That means you can add a new pagename to everybody's watchlist by moving a page that's already on their watchlists to the new pagename, then moving it back over the redirect. That leaves you with the original page still in the same place and on all the same watchlists, but now you have a redirect page that's also on the same watchlists- in effect, you've cloned the watchlist membership of the first page onto the redirect page, which you then replace with the content for the new page. If you need more pages, you can clone the clone, then the clone of the clone, etc. I believe all the monthly pages we've ever had were created using this method. Chuck Entz (talk) 15:17, 7 July 2021 (UTC)Reply
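The move-and-move-back trick described above can be illustrated with a toy model. This is only a sketch of the behaviour as Chuck Entz describes it, not MediaWiki code; the `Wiki` class, its `move` method, and the page titles in the usage note are all hypothetical.

```python
class Wiki:
    """Toy model: pages plus per-title sets of watching users (hypothetical)."""

    def __init__(self):
        self.pages = {}      # title -> content
        self.watchers = {}   # title -> set of user names

    def move(self, old, new):
        """Move a page, leaving a redirect behind at the old title."""
        self.pages[new] = self.pages.pop(old)
        self.pages[old] = f"#REDIRECT [[{new}]]"
        # As described above: watchers gain the new title,
        # and the old title is NOT removed from their watchlists.
        self.watchers.setdefault(new, set()).update(self.watchers.get(old, set()))


def clone_watchlist(wiki, existing, new_title):
    """Give new_title the same watchers as existing, via two moves."""
    wiki.move(existing, new_title)   # content and watchers now at new_title
    wiki.move(new_title, existing)   # content back home; new_title is a redirect
    return wiki
```

After `clone_watchlist(wiki, "RFV/Non-English", "RFV/CJK")`, the second title is a redirect that is watched by everyone who watched the first, and it can then be overwritten with the new page's content.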
Hello, Benwing. Would you be able to do a bot operation related to {{sa-sc}}?
Basically, this has to be done: change {{sa-sc|SCRIPTCODE|TERM}}, {{sa-sc||TERM}} to {{sa-sc|TERM}} (as now the script can be auto-detected, thanks to your help) and remove any |tr= parameters. Thanks! 🔥शब्दशोधक🔥07:09, 9 July 2021 (UTC)Reply
@SodhakSH This is possible but are you sure you want all script code and translit params gone? What about non-standard transliterations and cases where the script recognition doesn't work? Benwing2 (talk) 05:06, 10 July 2021 (UTC)Reply
@Benwing2: The transliteration is once already shown at the {{head}} level, so showing it again is redundant. Your point about the scripts is correct, I'll fix it with an |sc= parameter. 🔥शब्दशोधक🔥05:18, 10 July 2021 (UTC)Reply
@SodhakSH I'm confused now because the first param of {{sa-sc}} isn't a script code, it's the Devanagari equivalent. You'll need a param for this to override it; maybe your intention is to add a param other than |1= to specify the Devanagari and move all params down by 1? Benwing2 (talk) 05:40, 10 July 2021 (UTC)Reply
@SodhakSH OK but you've made a mess of things by hand-converting some of the usages to the new format but not all of them. How can my bot know which ones have been hand-converted, so it won't try to further convert them? I would recommend either you make a list of all the entries you already converted, so I can ignore them, or undo those changes. Benwing2 (talk) 06:12, 10 July 2021 (UTC)Reply
@Benwing2: The sc code is in Latin script, so perhaps best is if you can do: if parameter 1 contains Latin script, move to |sc=. If this is not possible, I'm thinking of another way. 🔥शब्दशोधक🔥06:19, 10 July 2021 (UTC)Reply
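The rule proposed here (drop an empty first parameter, move a Latin-script first parameter to |sc=, and remove |tr=) could be sketched roughly as follows. This is an illustration only, not the bot's actual code: it assumes flat template calls with no nested templates or links, and the Bengali sample term in the usage note is made up for the example.

```python
import re

LATIN = re.compile(r"[A-Za-z]")


def convert_sa_sc(template: str) -> str:
    """Rewrite one flat {{sa-sc|...}} call; anything else is returned unchanged."""
    if not (template.startswith("{{sa-sc|") and template.endswith("}}")):
        return template
    parts = template[2:-2].split("|")
    name, params = parts[0], parts[1:]
    positional = [p for p in params if "=" not in p]
    # Keep named parameters except the transliteration override.
    named = [p for p in params if "=" in p and not p.startswith("tr=")]
    if positional:
        first = positional[0]
        if first == "":
            positional = positional[1:]       # {{sa-sc||TERM}} -> {{sa-sc|TERM}}
        elif LATIN.search(first):
            positional = positional[1:]       # Latin script code -> |sc=
            named.append("sc=" + first)
    return "{{" + "|".join([name] + positional + named) + "}}"
```

For example, `convert_sa_sc("{{sa-sc|Beng|রাম|tr=rāma}}")` gives `{{sa-sc|রাম|sc=Beng}}`. Usages that were already converted by hand pass through unchanged, which sidesteps the question of tracking which entries have been touched.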
@SodhakSH Everything converted. If the Devanagari isn't needed in most cases, it should probably be made a named param and the remaining params moved down by one. Benwing2 (talk) 18:46, 10 July 2021 (UTC)Reply
Please make a list of uses for this template like User:Benwing2/head-sa-noun. Still some uses are not correct, which I'll manually check. Also for {{pra-sc}}. Hope it's not a very tedious task. If you can, then I'll check them and then it can be replaced by bot. 🔥शब्दशोधक🔥09:47, 12 July 2021 (UTC)Reply
@Benwing2: If it is not hard, will you be able to do this for some more templates I want to fix? These are just optional fixes, which would make the templates better if made. But if it is very time-consuming for you, we can do without these changes. 🔥शब्दशोधक🔥04:59, 18 July 2021 (UTC)Reply
This is because of diff. Now the template can make the first parameter the page (if number) and headword (if string). If it is string, the second parameter is the page.
@SodhakSH I can do this. However, it is a bit strange to overload arguments the way you've done so that the page number goes in either 1 or 2; I would prefer that the page number goes in a consistent argument. If you put the page number in 2=, there's only one extra character (a vertical bar) to type compared with putting it in 1. What do you think? There are only about 33 current uses of page numbers in {{R:hi:Dasa}}. Also, what you call |disp= is standardly called |alt=. There are no current uses of |disp= so I'll go ahead and change it to |alt=. Benwing2 (talk) 16:30, 18 July 2021 (UTC)Reply
Okay. Here are the basics. There are 26 letters in the alphabet. Each letter represents one sound. Same sounds as IPA: a, b, d, f, h, k, l, m, n, p, r, s, t, v, w, z. Different: c /t͡ʃ/, g /ɡ/, j /d͡ʒ/, q /k/, y /j/. Clusters: ny /ɲ/, ng /ŋ/, sy /ʃ/, kh /x/. ―Rex Aurōrum「Disputātiō」05:00, 10 July 2021 (UTC)Reply
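Taken at face value, the correspondences listed above amount to a lookup table with the two-letter clusters matched before single letters. The sketch below encodes only what is stated in the comment; letters not mentioned there (such as e, i, o, u) are passed through unchanged as an assumption, and a real pronunciation module would need far more than this.

```python
# Two-letter clusters must be tried before single letters.
DIGRAPHS = {"ny": "ɲ", "ng": "ŋ", "sy": "ʃ", "kh": "x"}
# Letters whose IPA value differs from the letter itself.
SINGLE = {"c": "t͡ʃ", "g": "ɡ", "j": "d͡ʒ", "q": "k", "y": "j"}


def to_ipa(word: str) -> str:
    """Naive letter-to-sound conversion using only the table above."""
    w = word.lower()
    out = []
    i = 0
    while i < len(w):
        pair = w[i:i + 2]
        if pair in DIGRAPHS:
            out.append(DIGRAPHS[pair])
            i += 2
        else:
            # Unlisted letters fall through unchanged (assumption).
            out.append(SINGLE.get(w[i], w[i]))
            i += 1
    return "".join(out)
```

With this table, a string such as `nyanyi` comes out as `ɲaɲi`. Stress, vowel quality and any context-dependent rules are outside what the table can express.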
@Rex Aurorum I need a lot more info than that, including a native speaker who also has a linguistics background. I don't really have time to devote a lot of energy to this right now. Benwing2 (talk) 05:05, 10 July 2021 (UTC)Reply
Hi Benwing2,
Do you know how one would go about adding an additional pronunciation to the Latin module? For instance there is already a Classical one, an Ecclesiastical one, and a 'Vulgar' one. Let's say I want to add an additional one named 'X', with its own phonetic rules and such: how would I do this? I have tried to do so but ran into some trouble when the phonetic output of 'X' would ignore phonetic_rules_X no matter what I tried.
@Tibidibi My apologies. I have not forgotten about this. It's just that I need to find 3 hours or so to sit down and implement and debug this, and my RL work keeps getting in the way ... Benwing2 (talk) 06:53, 23 July 2021 (UTC)Reply
{{link}} replacement
Hi, see Special:Diff/61964501. The bot replaced {{l|en|] ]}} with ] ]]]. The former was certainly an incorrect way to use the template, but nevertheless rendered sensibly unlike the latter. It would be good to check the content of the template argument and guard against that. –mwgamera (talk) 22:32, 20 July 2021 (UTC)Reply
@MwGamera Hi. Yes, I discovered this issue awhile ago and fixed it in my script. I tried to correct the mistakes of this sort that the bot made, but I seem to have missed some. Benwing2 (talk) 04:58, 21 July 2021 (UTC)Reply
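A guard of the kind suggested above might look like the following sketch. It assumes the bot was turning `{{l|en|TERM}}` into a bare `[[TERM]]` link; the function names are hypothetical and this is not the actual fix that was applied to the script.

```python
def safe_to_unwrap(arg: str) -> bool:
    """True only if the argument is a plain term with no wiki markup in it."""
    return not any(ch in arg for ch in "[]{}|")


def unwrap_link(lang: str, arg: str) -> str:
    """Replace {{l|LANG|ARG}} with [[ARG]] only when that is safe."""
    if safe_to_unwrap(arg):
        return f"[[{arg}]]"
    # Leave unusual usages alone (or log them for manual review).
    return "{{l|" + lang + "|" + arg + "}}"
```

`unwrap_link("en", "word")` yields `[[word]]`, while an argument that already contains bracketed links is returned as the original template call instead of being wrapped in a second pair of brackets.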
{{RQ:Bentley COA}} et al.
Hi, kindly carry out the following replacements when you're free:
@Sgconlaw Are you sure about removing the page number from {{RQ:EHough PrqsPrc}}? The documentation says it's mandatory in some cases to make the link work, and the template code definitely makes use of the page number. Also if we rename |1= to |chapter= and rename or delete |2=, we should also rename |3= to |text= so it doesn't end up stranded. Benwing2 (talk) 02:42, 22 July 2021 (UTC)Reply
Yes, that is not a page number but is an artefact left over from an old version of the template which used to link to some other website. You can certainly rename any |3= to |text= or |passage= if you find any. In fact, if you find:
#* {{RQ:Bentley COA}}
#*:
you can change it to:
#* {{RQ:Bentley Confutation of Atheism|passage=}}
However, if this is troublesome to program the bot to do, then don't worry about that part – I guess it can be done manually as and when someone comes across it. — SGconlaw (talk) 05:49, 22 July 2021 (UTC)Reply
I've come to the conclusion that Index:Korean, Index:Korean/Hanja and their subpages should be moved to the Appendix namespace, just like the corresponding Chinese indexes. Though I've barely touched Wiktionary's Korean content, I feel obligated to note that they appear to contain useful content that isn't found elsewhere on WT; their deletion would thus be a detriment. Alternatively, the content of the indexes could be transferred to a template/category system (this is probably the eventual desideratum); all that matters to me is the preservation of the content. Do you have any thoughts? Hazarasp (parlement · werkis) 10:02, 2 August 2021 (UTC)Reply
The thing is, this reference is used not only for Bengali but also for Middle Bengali, Old Bengali, other Eastern Indo-Aryan languages like Oriya and Chittagonian, Magadhi Prakrit, Ashokan Prakrit, etc. The language code bn is just needless. Thank you. ·~dictátor·mundꟾ09:41, 13 August 2021 (UTC)Reply
See . Though this can be done by a bot who only takes care of pages before 2021-07, I strongly recommend checking the hyphenations with this Module by concatenating all the syllables. For example, zh:ako is in zh:Category:與詞彙不一致的斷字(hyphenation different from word), so I found {{hyphenation|id|a|keo}} in both zh:ako and ako. EdwardAlexanderCrowley (talk) 03:34, 27 August 2021 (UTC)Reply
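The check recommended above (concatenate the syllables and compare with the page name) is simple to sketch. The parser below is an assumption: it handles only a flat `{{hyphenation|LANG|...}}` call with a single hyphenation and ignores named parameters.

```python
def parse_hyphenation(template: str) -> list[str]:
    """Return the syllables from a flat {{hyphenation|LANG|syl|syl|...}} call."""
    parts = template.strip("{}").split("|")
    # parts[0] is the template name and parts[1] the language code.
    return [p for p in parts[2:] if "=" not in p]


def hyphenation_matches(title: str, template: str) -> bool:
    """True if the syllables, joined together, spell the page title."""
    return "".join(parse_hyphenation(template)) == title
```

For the entry mentioned above, `hyphenation_matches("ako", "{{hyphenation|id|a|keo}}")` is false, because the syllables join to `akeo`, so the page would be flagged for review.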
Category:English irregular plurals ending in "-ra"
I restored this because there are plenty of English irregular plural categories, I don't see a linked deletion discussion or post to the talk, and this category is still generated from the template. —Justin (koavf)❤T☮C☺M☯18:07, 28 August 2021 (UTC)Reply
Replacer
Why did you remove these? I recreated one but want to get an explanation before I make more. Is there some conversation about this? If so, linking it in your edit summary would have been helpful. Your only deletion rationale was that it was empty but you emptied it without any explanation. —Justin (koavf)❤T☮C☺M☯17:19, 6 September 2021 (UTC)Reply
@Koavf My apologies, I have not been intentionally ghosting you. RL has been interfering for several months and taking me away from Wiktionary. I think I posted to the Beer Parlour before removing noun plural forms of Romance languages, and got some agreement on this (and no disagreement). The basic reason I removed them is that 'noun plural forms' are entirely redundant in languages (as in most Romance languages, and almost certainly including Mirandese), where plurals are the only possible noun inflection. (Diminutives and augmentatives do not count as inflections. They are derivational forms that are their own lemmas, rather than non-lemma forms.) In other words, in these languages all 'noun plural forms' are 'noun forms' and vice versa so there's no point in having 'noun plural forms' as a category distinct from 'noun forms'. I did not remove 'noun plural forms' for any Romance languages where 'noun forms' existed that were not plurals. The same logic was used before me (by User:Rua) for removing 'noun plural forms' as a category in English, and maybe in other languages as well. Apologies for not including a better explanation in the commit message. Benwing2 (talk) 06:09, 7 November 2021 (UTC)Reply
@PJTraill I think it actually got confused by the ==Verb== header instead of ==Participle==. You can see in the changelog that it says pos=Participle. The {{lb|ru|dated}} tag shouldn't make a difference. I will see about fixing the code to handle this situation. Benwing2 (talk) 03:50, 12 November 2021 (UTC)Reply
I just happened to see this conversation and wondered, per this last point, if First Nations might also include tribes from South America? Which Native American doesn't cover, according to the government definition. At least, it wouldn't include Australia etc. DAVilla11:46, 29 November 2021 (UTC)Reply
Your comment was "delete bad Spanish verb form". I don't know Spanish too well, but this word is linked in the conjugation table and appears in other dictionaries, so I've restored it. DAVilla11:41, 29 November 2021 (UTC)Reply
The Brazilian phonetic transcription shows in the template but the more common realisation is . Please look into it
@RonnieSingh You should use ~~~~ to sign your comments. As for , I think this is in fact correct. I have spent time in Brazil and I definitely heard in words like carro; if it had been I would have noticed it. does occur but primarily before voiced consonants as in carne, pardo. Benwing2 (talk) 03:59, 2 December 2021 (UTC)Reply
Oh, sorry, I left that message from my phone and thought there was an automatic sign. RonnieSingh (talk)
@Fytcha This is because Module:links is inserting categories directly instead of using Module:utilities to format the categories, which will not put userspace pages and such in categories. This should be changed although carefully so it doesn't use more memory, i.e. it should respect the list of high-memory pages in Module:links/data. Benwing2 (talk) 03:53, 2 December 2021 (UTC)Reply
Changing Prakrit's script
Hello. Recently the Prakrit editors have decided to change the main script of Prakrit from Brahmi to Devanagari. Is it possible for you to run a bot to change all instances of, say, {{l|inc-pra|FOO}} to {{l|inc-pra|{{subst:chars|{{subst:xlit|inc-pra|FOO}}}}}}? Similar change has to be done with such occurrences of some other templates also, but first let me know if you can do it. —Svārtava03:15, 2 December 2021 (UTC)Reply
How much ever time it takes isn't really a problem. If you can do it, please replace |inc-pra|FOO in these templates: {{l}}, {{link}}, {{m}}, {{mention}}, {{cog}}, {{cognate}}, {{nc}}, {{ncog}}, {{noncog}}, {{noncognate}}, {{desc}}, {{descendant}}. Same for {{alt}}, {{alter}} but if there are multiple continuous parameters like {{alter|inc-pra|FOO|BAR}} then it'll have to be {{alter|inc-pra|{{subst:chars|{{subst:xlit|inc-pra|FOO}}}}|{{subst:chars|{{subst:xlit|inc-pra|BAR}}}}}}. In etymology templates ({{der}}, {{inh}}, {{inh+}}, {{bor}}, {{bor+}}), for example, {{inh|LANG|inc-pra|FOO}} would have to be changed to {{inh|LANG|inc-pra|{{subst:chars|{{subst:xlit|inc-pra|FOO}}}}}}. —Svārtava04:38, 2 December 2021 (UTC)Reply
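The request above boils down to knowing, for each template, which positional parameter holds the Prakrit term. A rough sketch of that lookup follows; the grouping mirrors the lists in the comment, while the function name and the handling of an empty parameter as the end of the term list in {{alt}}/{{alter}} are assumptions. The actual script conversion is left out.

```python
LINK_LIKE = {"l", "link", "m", "mention", "cog", "cognate", "nc", "ncog",
             "noncog", "noncognate", "desc", "descendant"}
MULTI_TERM = {"alt", "alter"}
ETYMOLOGY = {"der", "inh", "inh+", "bor", "bor+"}


def term_indices(name: str, positional: list[str], lang: str = "inc-pra") -> list[int]:
    """Return 0-based indices into `positional` of the terms to convert."""
    if name in LINK_LIKE and positional[:1] == [lang]:
        return [1] if len(positional) > 1 else []
    if name in MULTI_TERM and positional[:1] == [lang]:
        indices = []
        for i in range(1, len(positional)):
            if positional[i] == "":
                break                 # assumed: an empty parameter ends the terms
            indices.append(i)
        return indices
    # Etymology templates: destination language first, source language second.
    if name in ETYMOLOGY and positional[1:2] == [lang]:
        return [2] if len(positional) > 2 else []
    return []
```

So `term_indices("inh", ["LANG", "inc-pra", "FOO"])` returns `[2]`, and `term_indices("alter", ["inc-pra", "FOO", "BAR"])` returns `[1, 2]`; each selected value would then be passed through the transliteration-based conversion.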
@Svartava2 Here are some additional considerations regarding this change:
Would the format at پیوݨ and descendants trees using |sclb= be affected by your suggested implementation scheme, or would they remain as is?
(RichardW57, RichardW57m) Do you have an opinion regarding the Latin script used by European scholars for Prakrit similar to Pali? There is a resource that advocates using the Latin script since a distinction could possibly be made between ṃ and nasalisation ̃ when the anusvara is used in a Brahmic script. Despite the potential convenience, it was decided not to have Latin script for Sanskrit. Would the same still hold true for Prakrit?
I haven't attempted to assess what the script distribution is for Prakrit encountered by English speakers, or indeed by L1 English speakers. For Pali, a quick survey of the totality of Pali on the Internet put the Roman script in third place, behind the main two Thai writing systems. (I've even seen some surprising evidence that the Sinhalese, or more precisely those who could afford to buy, preferred the Roman script to the Sinhalese script for Pali! Maybe the Roman script printing was more legible.) I wouldn't be surprised if the Roman script were commoner than the Kannada script, though. IAST already allows bindu and candrabindu to be distinguished. We could also just have the Roman script as a 'soft redirect', but with transliterations themselves made into links for looking up inflections. --RichardW57 (talk) 22:47, 2 December 2021 (UTC)Reply
Although early European scholars thought Maharastri and Jain Maharastri were two distinct lects, it was later found that they constitute a single lect. So would you consider adding Kannada script for Jain Maharastri so that ಗುಜ್ಜರತ್ತಾ appears on the headword line of the main entry? Kutchkutch (talk) 18:27, 2 December 2021 (UTC)Reply
@RichardW57, RichardW57m: After Template talk:pra-noun, Svartava2 designed headword templates to show the Kannada script only when the lect is Maharastri since Bhagadatta wanted to be cautious about which lects are attested in the Kannada script and which lects are not. However, see Talk:𑀓𑀬#Lect: Jain and non-Jain varieties are both instances of a single lect. Therefore, at a minimum the Kannada script should display in the headword line for both Maharastri and Jain Maharastri lects. Should this cautious approach be discarded, and should the Kannada script be used for all lects? Kutchkutch (talk) 03:13, 4 December 2021 (UTC)Reply
I would favour the cautious approach. I'm confused by the documentation for the headword template. If a word is attested in some form of Maharashtri Prakrit, is that to be explicitly recorded? Do we need some way of saying, 'All dialects, so also in Kannada script'? There might be merit in allowing |knda=+ to mean to give it in Kannada form, using automatic transliteration. --RichardW57m (talk) 11:19, 16 December 2021 (UTC)Reply
For now, the Kannada script is displayed for both Maharastri and Jain Maharastri. I think it's okay to continue with the cautious approach and let the Kannada script form be there for (Jain/) Maharastri for now at least. The headword-line template is coded to generate the Kannada form if the value j or m is entered in any of its parameters to specify that the term is attested in those lects. —Svārtava15:31, 16 December 2021 (UTC)Reply
FYI I already have code that knows about most of the templates out there with foreign-script text in them and which parameters need to be changed; I originally wrote it several years ago for vocalizing Arabic text based on the transliteration and have used it for various purposes since. As for پیوݨ and descendants trees with |sclb=, I may have to add special-case code for this; it would help if you could list as many examples as possible that have this sort of situation in them. Benwing2 (talk) 01:44, 3 December 2021 (UTC)Reply
Categories for gendered nouns by language
Would it be appropriate to integrate categories of the following type into Category:Fundamental?
@Kutchkutch Sure, although I'd prefer nouns with gendered equivalents or nouns with gender equivalents or maybe nouns with other-gender equivalents. "Forms" normally refers to non-lemma forms, and that's what I thought of when I first saw these category names. Note that we already have a category female equivalent nouns such as Category:Italian female equivalent nouns, which is generated by the {{female equivalent of}} template. Benwing2 (talk) 01:36, 3 December 2021 (UTC)Reply
Thanks for the implementation! Once all the hidden categories above have corresponding categories in Category:Fundamental, they would be deleted. For English, the equivalency of actor vs. actress could be considered more morphological than king vs. queen. Perhaps it should be left to the editors to decide the exact nature of equivalency. Kutchkutch (talk) 01:09, 7 December 2021 (UTC)Reply
@Kutchkutch actor/actress and king/queen are just examples of nouns that may go into this category. It's common in descriptions of categories to list examples to help understand the nature of the category, and they're usually in English because it's not easily possible to make the category description be customized per language. I think it's reasonable to include an example of both morphologically-related pairs and morphologically-unrelated pairs because both sorts will typically end up in the category; but if you really don't like them we can remove them. Benwing2 (talk) 03:21, 7 December 2021 (UTC)Reply
Those examples are fine since it's true that both sorts will probably end up in the category. I've used {{cln}} in entries, but I wasn't aware that it ensures that userspace and talk pages don't end up in the categories. I'm not proficient at coding, but Module:hi-pa-headword is certainly very interesting. Kutchkutch (talk) 05:45, 7 December 2021 (UTC)Reply
@Kutchkutch I did something similar in Module:uk-be-headword, since as with Hindi and Punjabi these two languages are very similar. BTW can you take a look at खत्रप? I tried to fix all the bad params in Punjabi terms but this one has the gender specified as neuter and I don't know what the proper gender is. Benwing2 (talk) 05:59, 7 December 2021 (UTC)Reply
@Kutchkutch, Svartava2 What is the purpose of this param? I intentionally removed it because it had no effect, and you've added it back without effect. Is it to mark ordinal numbers? If so we should create {{hi-num-ord}} or similar. BTW the current state of Hindi numerals is messed up; e.g. we have both {{hi-num}} and {{hi-num-card}} doing the same thing, and they are being used for both cardinal and ordinal numbers even though they explicitly label the term as cardinal. Benwing2 (talk) 15:56, 7 December 2021 (UTC)Reply
@Benwing2: Do you think it's right to change the "adjective" to "numeral"? Even English entries like seventh use "adjective" POS header. —Svārtava02:52, 8 December 2021 (UTC)Reply
@Sgconlaw Fixed; there were 67 templates needing fixing. If you need other templates to include this, I can do that. In the longer run I should write some Lua code to facilitate creating quotation templates of this sort; this will make it so you don't have to forward all the parameters but will only need to specify those that differ from template to template. Benwing2 (talk) 04:13, 3 December 2021 (UTC)Reply
Great, thank you. In the past I used to include |footer= on a case-by-case basis, but I think it’s probably better to just include it in all quotation templates.
Speaking about updates to Module:quote, other possible changes are (1) to check that all external links use https:// where possible (I noticed the WorldCat one is still using http://); and (2) in |location= and |publisher= (and |location2=, etc), to automatically link s.l. (or “S.l.”) and s.n. to the glossary. I think I may have mentioned some of these things at Module talk:quote. — SGconlaw (talk) 05:25, 3 December 2021 (UTC)Reply
@Sgconlaw Linking s.l. and s.n. can be done in the module itself. http:// occurs in the following templates:
Thanks. I actually meant the URLs that are inside the module, like those that link to the Library of Congress, Open Library, and WorldCat. For URLs in quotation template, those linking to Google Books, HathiTrust, and the Internet Archive should use https://, but for some reason Project Gutenberg Australia doesn’t, which explains most of the templates you found above. — SGconlaw (talk) 06:20, 3 December 2021 (UTC)Reply
@Sgconlaw BTW the following garbagey templates don't even include |passage=; you might want to fix them:
@Sgconlaw Doing a run now to add |footer= almost everywhere, as well as to ensure that templates support both |text= and |passage= (and I standardized them to prefer |text= over |passage=, hope that is OK). Benwing2 (talk) 06:12, 3 December 2021 (UTC)Reply
Speaking about problems with quote templates, there are about 65 pages in CAT:PFE that need fixing. You changed the first positional parameter for {{RQ:Milton Paradise Regained}} from being the book name to being the page number, so all of the entries that don't use the |book= parameter now have errors. Do you think maybe "book=" should be inserted in front of all first positional parameters that aren't numeric? Chuck Entz (talk) 06:33, 3 December 2021 (UTC)Reply
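The fix suggested above (prefix a non-numeric first positional parameter with "book=") could be sketched like this. It is only an illustration for flat template calls, and the sample values in the usage note are invented rather than taken from real entries.

```python
def add_book_name(template: str, param_name: str = "book") -> str:
    """Turn a non-numeric first positional parameter into |book=...."""
    if not (template.startswith("{{") and template.endswith("}}")):
        return template
    parts = template[2:-2].split("|")
    for i, part in enumerate(parts[1:], start=1):
        if "=" in part:
            continue                  # named parameter, keep looking
        value = part.strip()
        if value and not value.isdigit():
            parts[i] = f"{param_name}={part}"
        break                         # only the first positional one matters
    return "{{" + "|".join(parts) + "}}"
```

A call such as `{{RQ:Milton Paradise Regained|II|passage=...}}` would become `{{RQ:Milton Paradise Regained|book=II|passage=...}}`, while a numeric first parameter (now read as the page number) is left alone.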
Nothing high-tech: just assembly-line techniques with tabbed browsing and text searching to find where to paste without scrolling. It took me 27 minutes, but it was worth it to clear everything up. If I was really in a hurry, I could have done it faster. It does take a lot longer than it would with a bot, but I can just do it in less time than it takes to get a bot to start on it. Null edits are a lot quicker- I've cleared hundreds of entries from CAT:E in less time. Chuck Entz (talk) 06:55, 4 December 2021 (UTC)Reply
You mentioned "garbagey" quotation templates - there are a load of them, probably all created by Wonderfool. I remember someone once searched for all RQ: pages without a "passage=" bit or anything else - there were over 1000. It's probably useful to have the de-garbagification of such templates on a long-term to-do list. Notusbutthem (talk) 16:22, 4 December 2021 (UTC)Reply
Discord
Hi Benwing,
In 2019, WingerBot made the a long, "per Bennett (with corrections by Allen and Michelson)." Do you have a link for that ref? It's not supported by Lewis and Short, who show all the vowels as short. (Please ping me if you answer.) kwami (talk) 21:58, 8 December 2021 (UTC)Reply
@Kwamikagami Hi Kwami. Haven't seen you much around Wiktionary. The ref is here: I don't trust Lewis and Short at all for hidden quantities; their info is from the late 1800's and quite out of date in this respect. Benwing2 (talk) 22:52, 8 December 2021 (UTC)Reply
Hello! Could you take a look at User:Kutchkutch/mr-decl, and let me know if this could be done eventually? I started this in 2017 using templates, but AryamanA pointed out that this has to be done by a module. Although there's no rush, it would be nice to at least be able to show the most common declension classes on entries. Kutchkutch (talk) 03:15, 10 December 2021 (UTC)Reply
Thanks for taking a look at it. Module:hi-noun is certainly very long, and I understand that it may take time to adapt it for Marathi. I just wanted to bring it to your attention.
AryamanA started Module:mr-decl based on an earlier version of the outline of declension paradigms at User talk:AryamanA/sandbox. However, as AryamanA was working on it, I realised that the outline had to account for subtleties that were not apparent at first. This is why I created a newer version based on your work for Hindi.
A year ago AryamanA started Module:mr-verb based on Module:hi-verb, after I created User:Kutchkutch/mr-conj, but he never had time to finish it. Although verbs are also important, the number of nouns would be expected to be greater than the number of verbs. I believe AryamanA intended to finish both declension & conjugation, but it seems that he’s been too busy in real life for the past year. Kutchkutch (talk) 15:13, 11 December 2021 (UTC)Reply
Thanks for accepting the request and adding it to your todo list. I intend to continue editing the outline to try to make it as clear/comprehensive/accurate as possible. Feel free to ask any questions. Hopefully, AryamanA will be back to work on it as well. Kutchkutch (talk) 18:32, 13 December 2021 (UTC)Reply
Replacement of unnecessary redirects
Hi, a while back you created a list of redirects to quotation templates at User:Benwing2/english-quotation-templates-redirects and suggested that I go through them and let you know which ones should be replaced. There are too many for me to handle at once, but if you don't mind I'll try to clear them in batches. Here is the first batch:
Also, if it is convenient, please (1) carry out the following replacement when you encounter it (probably best to do it only for the above templates and future batches that I post here rather than for every quotation template, as not all such templates may have the |passage= or |text= parameter yet):
#* {{RQ:XYZ}}
#*: This is the passage quoted.
→
#* {{RQ:XYZ|passage=This is the passage quoted.}}
and (2) on documentation subpages, please add:
* {{para|footer}} – a comment on the passage quoted.
after the |text= or |passage= line if it does not already exist. (In some cases I believe I typed "a comment about the passage quoted" – just wanted to point that out so that the bot doesn't insert another statement about |footer= in those cases. I'm trying to remember to standardize the wording!)
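Replacement (1) above, folding the line after a quotation template into |passage=, could be done with a regular expression along these lines. This is a sketch under the assumption that the template call is flat and sits alone on its line; it skips calls that already carry |passage= or |text=.

```python
import re

# "#* {{RQ:...}}" followed by a "#*: passage" line at the same list depth.
PATTERN = re.compile(
    r"^(#+\*) (\{\{RQ:[^{}|]+(?:\|[^{}]*)?)\}\}\n\1: (.+)$",
    re.MULTILINE,
)


def fold_passage(wikitext: str) -> str:
    """Move the quoted line into the template's |passage= parameter."""
    def repl(match: re.Match) -> str:
        prefix, head, passage = match.group(1), match.group(2), match.group(3)
        if "passage=" in head or "text=" in head:
            return match.group(0)     # already has a passage; leave untouched
        return f"{prefix} {head}|passage={passage}}}}}"
    return PATTERN.sub(repl, wikitext)
```

Applied to the two-line example above, this produces `#* {{RQ:XYZ|passage=This is the passage quoted.}}`. Multi-line passages and translations on following lines are not handled.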
Thank you for the incredible rewrite of {{es-conj}} and {{es-verb}}, it was a huge undertaking and a big improvement over what we had before. Using the foundation you built for Module:es-verb, do you think it would be reasonable to extend it to {{es-verb form of}}? I'm envisioning something like a single definition with {{es-verb form of|hablar}} that would replace the four very complicated {{es-verb form of}} definitions on hable. This would be much easier for humans to use, it would ensure that the page shows all of the correct verb forms, and it would enable future editors to improve the presentation without needing to edit 600,000 pages. JeffDoozan (talk) 12:51, 13 December 2021 (UTC)Reply
@JeffDoozan This is doable although the specification in {{es-verb form of}} would in some cases have to be more complex than just the verb infinitive; essentially you'd need to specify the same params as are passed to {{es-conj}}. It would work by generating the full conjugation and then searching through it to find forms that are the same as the page name. I've added it to User:Benwing2/todo, so it won't get lost. Benwing2 (talk) 02:59, 14 December 2021 (UTC)Reply
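The approach described above (generate the full conjugation, then search it for forms equal to the page name) can be sketched as follows. The slot names and the hand-written fragment of the conjugation of hablar are hypothetical stand-ins for whatever Module:es-verb actually produces.

```python
def matching_slots(conjugation: dict[str, list[str]], page_name: str) -> list[str]:
    """Return every slot whose generated forms include the page name."""
    return [slot for slot, forms in conjugation.items() if page_name in forms]


# Tiny hand-written fragment; a real table would be generated, not typed in.
HABLAR = {
    "pres_ind_1s": ["hablo"],
    "pres_sub_1s": ["hable"],
    "pres_sub_3s": ["hable"],
    "imp_formal_2s": ["hable"],
    "pret_1s": ["hablé"],
}
```

`matching_slots(HABLAR, "hable")` returns the three subjunctive and imperative slots, which a template could then render as a single combined definition line.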
Replacement of unnecessary redirects (part 3)
Hi! I've been working on Akkadian for a while now and I'd like to add declension and conjugation tables to Akkadian noun and verb entries. I've seen you've done an amazing job with Arabic and I was wondering if you could help me. Unfortunately I have zero knowledge of programming, so I'm quite useless in that regard, but I do have the linguistic knowledge. Could I interest you in a collaboration? Thank you! Sartma (talk) 13:47, 15 December 2021 (UTC)
@Sartma I can definitely help you, although I may not have much time right now. I would have concerns about implementing noun or verb inflection modules using cuneiform due to the inconsistency of cuneiform spellings, but I see that Wiktionary lemmatizes Akkadian using Latin spellings, which avoids this issue. Benwing2 (talk) 04:23, 17 December 2021 (UTC)
@Benwing2 That's OK, I won't have much time until January myself. We can discuss the cuneiform. Personally, I'm happy not to use it in inflection tables (I feel it makes everything heavier), but I know that other users here have stronger feelings about it and would like to keep it as well. If we limit ourselves to the basic cuneiform syllabary plus a couple of common triliteral signs and stick to the Old Babylonian syllabic writing style, it shouldn't be impossible. Are you on Discord? It might be easier to communicate there. Shall I get in touch again in a month's time or so (mid-January)? Thanks again! Sartma (talk) 21:19, 17 December 2021 (UTC)
Spanish autoplurals
Hi,
I wonder if you could write a bot to create missing Russian feminitives (noun entries) from noun headwords where |f= exists? This would cover nouns in -ница, -ица, -ка, -иня, -ха, -ша and adjectival nouns in -ая.
They should all be stress pattern "a", gender=f-an. Examples:
рабо́чая (rabóčaja), больна́я (bolʹnája) (the stress can be on the ending, as with adjectives)
I am also somewhat interested in generating feminine forms of surnames, but I don't know how hard it is and it's of lower priority. There's no rush, but maybe you could add it to your to-do list. Thanks! --Anatoli T. (обсудить/вклад) 00:55, 20 December 2021 (UTC)
The definition line could simply be something like {{feminine of|ru|ма́стер}}
@Atitarev Sure. I already did something like this a while ago for Arabic and can probably repurpose some of the code. The definition should probably use {{female equivalent of}}. As for generating feminine forms of surnames, you mean generating an entry for e.g. Страви́нская (Stravínskaja) and Ивано́ва (Ivanóva)? This is hardly any more difficult than doing it for common nouns. The hardest part of either, I think, is handling the insertion of the actual entry: i.e. do nothing if the entry already exists, create a page if no page exists, create a Russian-language entry if an entry exists for another language but not Russian, add a new etymology section if necessary, etc. This is pretty much the same thing that my script to create non-lemma forms does, and the logic will be the same whether I'm creating a common or proper noun. The only thing I'm uncertain about is what a feminine surname entry should look like. Should it be a lemma or non-lemma form? What should the definition look like (e.g. should it use {{female equivalent of}})? Should it have a declension table, and if so, what should the table look like? Should it be the same as the masculine table but missing the masculine column? Benwing2 (talk) 01:23, 20 December 2021 (UTC)
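The insertion cases listed above can be sketched as a small decision function. This is a simplified illustration under stated assumptions, not the actual script: the function names and action labels are hypothetical, a missing page is modelled as None, and "the entry already exists" is approximated by checking whether the definition line already appears in the Russian section.

```python
import re
from typing import Optional

# Level-2 headers such as "==Russian=="; "===Noun===" does not match.
L2_HEADER = re.compile(r"^==[ \t]*([^=\n]+?)[ \t]*==[ \t]*$", re.MULTILINE)


def language_section(page_text: str, lang: str) -> Optional[str]:
    """Return the body of the given language's section, or None if absent."""
    headers = list(L2_HEADER.finditer(page_text))
    for i, m in enumerate(headers):
        if m.group(1) == lang:
            end = headers[i + 1].start() if i + 1 < len(headers) else len(page_text)
            return page_text[m.end():end]
    return None


def plan_insertion(page_text: Optional[str], definition: str, lang: str = "Russian") -> str:
    """Decide what to do with a page before adding a new entry to it."""
    if page_text is None:
        return "create_page"            # no page exists yet
    section = language_section(page_text, lang)
    if section is None:
        return "add_language_section"   # page exists, but not for this language
    if definition in section:
        return "skip"                   # entry already exists
    return "add_etymology_section"      # language exists with a different entry


definition = "{{female equivalent of|ru|ма́стер}}"
```

Returning an action label keeps the decision separate from the page edit itself, so the same logic could serve common nouns and surnames alike.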
Yes, {{female equivalent of}} sounds good for both. Yes, inserting into existing entries will be harder, sorry. The matter is further complicated somewhat by feminitives being rare, slangy, (low) colloquial or even rude, e.g. докторша, канцлерша, извергиня, президентша, врачиха but masculine forms may not even have |f= for those. I just need to keep checking those entries.
I'm OK with referring to the main (masculine) form for the declension of female surnames, similar to {{ru-conj-verb-see}}, because I don't think we have a specific declension template for them and it would be a duplication. What do you think would be best? They are sort of on the border between lemmata and non-lemmata. --Anatoli T. (обсудить/вклад) 01:41, 20 December 2021 (UTC)
I was recently thinking of a new parameter |fail=1 for T:rfv and T:rfd, which would put a page into Cat:Candidates for speedy deletion. The purpose of this parameter would be to allow non-admins to close RFVs/RFDs which involve page deletion. You have said earlier that adding features without changing the existing features of a template doesn't require consensus beforehand, so for the {{rfv}} template, could you possibly lower the protection from template editor to autopatroller so that I can edit it accordingly? —Svārtava 06:23, 20 December 2021 (UTC)
Okay, thanks for the feedback. Do I have to take it to BP to get community approval, or just to let the community know about this? —Svārtava 05:15, 21 December 2021 (UTC)
@Benwing2: I've added the parameter to {{rfd}} also. It was a bit tricky choosing the wording; I finally settled on this one: "The voting and discussion is closed now, and this page is awaiting speedy-deletion by an administrator. If you think this page should not be deleted, please start an undeletion discussion for the same." This is supposed to make it clear that after this even admins can't vote since the RFD is closed; they just have to delete the page. Any suggestions or objections? Btw, I think since {{rfd}} is a Wiktionary process template, it deserves template-editor level protection. Before you changed it to "autopatroller", it was just "autoconfirmed" and anyone with a user account could edit it. —Svārtava 10:14, 21 December 2021 (UTC)
Hey, did you discuss this with other Russian editors? Russian's relations with other languages aren't ambiguous, so using these templates doesn't seem helpful. Thadh (talk) 23:02, 26 December 2021 (UTC)
The main other editor is User:Atitarev, who I seriously doubt will object; if so I'll undo the change. The text on many or most pages already read "Borrowed from ..." or "Inherited from ..." so this is mostly just templatizing the text. Benwing2 (talk) 23:06, 26 December 2021 (UTC)
User:Tetromino may also be interested in this. Anyway, I think it's just generally a good idea to ask first, act later. Of course if nobody objects, you can freely ignore my opinion. Thadh (talk) 23:13, 26 December 2021 (UTC)
Relatedly, what do you think about these 2 templates in Indo-Aryan languages? In most of the cases, it's already "Inherited/Borrowed from" typed out, so adding the + templates to those pages would mostly be just templatisation. —Svārtava 14:08, 28 December 2021 (UTC)
@Benwing2: The other Indo-Aryan editors can be pinged, of course, but I think the votes + related discussions, etc. made it clear enough that we all are in support of that (well, with the exception of Kutchkutch who is neutral but always types out the whole text). OTOH there are many more in support and some have already started using them: Taimoorahmed11, Rishabhbhat, Bhagadatta, Inqilabi, Imranqazi90, me, etc. —Svārtava 04:27, 29 December 2021 (UTC)
There're quite a few of them, but I'd say the "main" are hi, ur, pa, gu, mr, bn, as, ne, pi, inc-pra. Is it too difficult to do? —Svārtava 07:52, 29 December 2021 (UTC)
@Svartava2 Not too hard. I am saving the Assamese changes currently. These changes are done semi-automatically and I made a bunch of additional fixups to Urdu and esp. Bengali lemmas; many of the latter were really messy. But these additional fixups take significant time so I will probably skip them on the other languages. I included your username in the changes; hopefully you won't get pinged a zillion times ... Benwing2 (talk) 05:21, 30 December 2021 (UTC)
@Benwing2: Thanks a million for the amazing work!! :) But now that these 2 templates have been standardised in these languages, any template-to-text replacement would cause much, much greater disruption. Per Imetsia's essay, “… it is not time to rest on our laurels just because attacks on the templates have become less frequent. If they wanted to, any one template-opponent could effectively ban the templates under the aegis of Mahagaja’s holding. Even more troubling is how to deal with that potentiality …”, so at this point, where Victar's or any other opposer's template-replacement would be really devastating, I think a policy/rule might be needed to forbid such replacements before it's too late. Your thoughts? —Svārtava 12:20, 30 December 2021 (UTC)
@Svartava2 What sort of policy/rule are you thinking of? My concern is that bringing this up at this point would just stir up more trouble than it's worth. Maybe better to wait until it actually becomes an issue? Benwing2 (talk) 01:47, 1 January 2022 (UTC)