Wiktionary:Grease pit/2023/July
Bug: Links to Punjabi words with ݨ (U+0768) go to page with ن (U+0646)
I tried to make a link to the Punjabi word دشمݨ (duśmaṇ), but when I use a template it goes to دشمن (duśman).
I've used a workaround here by enclosing the word in double square brackets ([[ ]]). Then it works.
But as you can see at the Punjabi page for ਦੁਸ਼ਮਣ (Gurmukhi script), the link to the Shahmukhi spelling doesn't work.
For some reason, links to Saraiki words with ݨ seem to work just fine (for example the link here to Shahmukhi spelling). Exarchus (talk) 14:15, 1 July 2023 (UTC)
@Exarchus There's evidently an entry in the Punjabi-specific language data that maps ݨ to ن. In general this sort of mapping happens for things like macrons (e.g. in Latin and Old English), stress marks (in Russian, Ukrainian and Belarusian), vowel diacritics (in Hebrew and Arabic), tone marks (in Serbo-Croatian and Slovenian), etc.; specifically, for extra marks that are commonly found in dictionaries and contain useful pronunciation information, but which aren't normally found in the spelling of the language as naturally used by native speakers. This must have been added intentionally for Punjabi, but that doesn't necessarily mean it's correct. Unfortunately I don't know anything about Punjabi, so I can't say whether we should remove this mapping. User:Theknightwho or User:RichardW57, do you happen to know? Benwing2 (talk) 00:31, 4 July 2023 (UTC)
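The mapping described above can be sketched as follows. This is only an illustration in Python, not the actual Lua language-data modules; the table name ENTRY_NAME_DATA, its layout and the function make_entry_name are assumptions made for the example. It shows the two kinds of rule mentioned: stripping a dictionary-only diacritic (the Latin macron) and replacing one character with another (the Punjabi ݨ to ن mapping that caused the bug).

```python
import unicodedata

# Hypothetical per-language data; the real data lives in Lua modules
# and is structured differently.
ENTRY_NAME_DATA = {
    "la": {"strip": {"\u0304"}},               # COMBINING MACRON
    "pa": {"replace": {"\u0768": "\u0646"}},   # ݨ -> ن, the mapping discussed above
}

def make_entry_name(text: str, lang: str) -> str:
    """Turn a displayed form into the page title it should link to."""
    data = ENTRY_NAME_DATA.get(lang, {})
    # Character-for-character replacements first.
    for old, new in data.get("replace", {}).items():
        text = text.replace(old, new)
    # Then strip combining marks: decompose, drop the marks, recompose.
    strip = data.get("strip", set())
    if strip:
        decomposed = unicodedata.normalize("NFD", text)
        kept = "".join(ch for ch in decomposed if ch not in strip)
        text = unicodedata.normalize("NFC", kept)
    return text
```

A language with no entry in the table (such as Saraiki, code skr, in this sketch) passes through unchanged, which matches the observation that the Saraiki links work.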
ARABIC SMALL HIGH TAH ( ؕ ) U+0615 is an extra mark on ن /n/ to represent the retroflex nasal consonant /ɳ/. It isn't normally found in the Shahmukhi spelling of the language as naturally used by native speakers. Thus, page titles should contain ن instead of ݨ . The spelling with ݨ should be in the |head= parameter of the headword template and in {{m}} & {{l}} to indicate the retroflexion of the nasal consonant in the Shahmukhi script.
@Benwing2: Moved that particular term. There are perhaps more instances of Shahmukhi Punjabi page titles that contain ݨ instead of ن such as پاݨی and کھاݨا. Is there a way to search for all such instances? Kutchkutch (talk) 02:58, 4 July 2023 (UTC)
I had encountered a somewhat similar problem with Malayalam chillu letters, where ർ (U+0D7C) was encoded as a combination of characters (ര്). I had moved a few such entries, but thanks to your search query I can look for the others. Exarchus (talk) 10:27, 4 July 2023 (UTC)
@Exarchus: I don't know what you've been doing, so I may well be telling you what you already know. Remember that the multi-character sequences for chillus remain valid even though now dispreferred. The entries should be moved or merged, but the hard redirects should be kept. (Keeping is the default for ordinary mortals at least.) I think we should also make that the policy for the multi-character representations even for the newer chillus - there are fonts that support them, so text using the combinations may well be being created even now. So far as I am aware, our search functions don't recognise the equivalence. --RichardW57m (talk) 17:37, 5 July 2023 (UTC)
The problem was that links to the multi-character chillus didn't work. So I moved the multi-character pages to single-character pages, with redirects on the old ones. (And when there were already two pages, I merged them.) Exarchus (talk) 17:48, 5 July 2023 (UTC)
@Exarchus That's correct - if you put multi-character chillus into {{l}} (or any other link template), it will automatically correct them to a single-character chillu whenever possible. That means we shouldn't ever have actual entries that use multi-character chillus.
The reason it's set up that way is because we can't easily stop random editors from putting multi-character chillus in links, and the most important thing is that those links go to the right places. Otherwise we'd be forced to create tons of redirects, and newbie editors would keep creating pages for them when they see redlinks. Theknightwho (talk) 18:00, 5 July 2023 (UTC)
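For readers unfamiliar with the two encodings, the automatic correction described above amounts to something like the sketch below. It is written in Python purely for illustration and is not the code the link templates use; the names ATOMIC_CHILLUS and normalize_chillus are made up for the example. It covers only the six older chillus, whose legacy encoding is base consonant + virama + ZERO WIDTH JOINER; the newer, rarer chillus mentioned earlier are left out.

```python
VIRAMA = "\u0d4d"  # MALAYALAM SIGN VIRAMA
ZWJ = "\u200d"     # ZERO WIDTH JOINER

# Base consonant -> atomic chillu letter.
ATOMIC_CHILLUS = {
    "\u0d23": "\u0d7a",  # NNA -> CHILLU NN
    "\u0d28": "\u0d7b",  # NA  -> CHILLU N
    "\u0d30": "\u0d7c",  # RA  -> CHILLU RR
    "\u0d32": "\u0d7d",  # LA  -> CHILLU L
    "\u0d33": "\u0d7e",  # LLA -> CHILLU LL
    "\u0d15": "\u0d7f",  # KA  -> CHILLU K
}

def normalize_chillus(text: str) -> str:
    """Replace each legacy three-character chillu sequence with its atomic letter."""
    for base, chillu in ATOMIC_CHILLUS.items():
        text = text.replace(base + VIRAMA + ZWJ, chillu)
    return text
```

A consonant followed by a virama without the joiner is an ordinary conjunct-forming sequence, so the sketch leaves it alone.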
@Theknightwho, Exarchus But do we not need tons of hard redirects, so that a search will find the entry whichever way the URL is typed? Unicode does recommend that the two encodings for the older chillus should be recognised as equivalent, so permanent hard redirects should be fine. --RichardW57m (talk) 14:27, 6 July 2023 (UTC)
@Theknightwho, @RichardW57m I found another issue with broken links, this time it was a page which seemed to have a zero-width no-break space in front.
@Exarchus It might be worth us putting in an edit filter to prevent characters like that from being in page titles, as they're almost always due to people copy + pasting without realising.
The pages in that category all have links that contain discouraged character sequences (many of which are multi-character Malayalam chillus). They're probably in translation sections (for the English entries), etymology sections (for other Indic language entries), or could be anywhere in a Malayalam entry. Theknightwho (talk) 13:34, 7 July 2023 (UTC)
@Exarchus: There were almost unmentionable pages like ആണ് that were in the categories, with the multi-character chillu-encodings. (One can't easily access them with {{link}}, {{mention}}. Enclosing the name in double square brackets, and clicking the 'redirected from' breadcrumb on the resulting page gets one to them.) It seems they drop out of the category when they become redirects. I saved a list at User:RichardW57/discouraged and started fixing them by merging any content and making the pages with validly discouraged names into hard links. My criterion for 'validly discouraged' is almost that Unicode says treat them the same, but I also treat the multicharacter encodings of the newer (and much rarer) chillus as being equivalent to the atomic chillus, even though Unicode does not call for that. Basically, don't assume that only Malayalam uses the Malayalam script - I don't think Syriac and Sanskrit are the only other languages that use it. I found several pages with multi-character chillus that had more extensive content than the pages with atomic chillus. --RichardW57m (talk) 13:57, 7 July 2023 (UTC)
As @Theknightwho mentioned while I was typing, there are other links that promote inclusion in the category, which apart from translations are quite difficult to track down. --RichardW57m (talk) 13:57, 7 July 2023 (UTC)
In terms of validity, I think my original reasoning was that it's best for us to be consistent, as it obviates situations where we end up with duplicate entries. There is actually support for treating equivalent chillus as identical in MediaWiki's Malayalam-language version, in the same way it merges NFD page titles into NFC, though it isn't enabled by default (even if the wiki's language is set to Malayalam). It might be worth seeing if we can get it enabled for Wiktionary, as there are obvious benefits to it (e.g. redirects, automatic search support etc). Theknightwho (talk) 14:06, 7 July 2023 (UTC)
@Exarchus: Initial and final ZWSP in page names are no-nos. We need to move and request expungement of the resulting hard link. Do we have a tool to convert ZWSP to in text? We need it as an easy sheep dip for scriptio-continua languages. --RichardW57m (talk) 14:29, 7 July 2023 (UTC)
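The kind of check an edit filter might perform on page titles can be sketched as below. This is a hedged Python illustration only; real edit filters are written in the AbuseFilter rule language, and the names STRAY_INVISIBLES, has_stray_edge_invisible and clean_title are invented here. It looks only at the two characters mentioned in this thread and only at the start and end of the title, since joiners inside a word are legitimate in several scripts.

```python
# Invisible characters that should not begin or end a page title.
STRAY_INVISIBLES = "\ufeff\u200b"  # ZERO WIDTH NO-BREAK SPACE, ZERO WIDTH SPACE

def has_stray_edge_invisible(title: str) -> bool:
    """True if the title starts or ends with one of the stray characters."""
    return bool(title) and (title[0] in STRAY_INVISIBLES or title[-1] in STRAY_INVISIBLES)

def clean_title(title: str) -> str:
    """Strip the stray characters from both ends, leaving the interior untouched."""
    return title.strip(STRAY_INVISIBLES)
```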
@Catonif has graciously written the bulk of a new module for Polish IPA, as the current one is janky, inefficient, and sometimes weird to use. There are still a few steps to do, e.g. adding a more complete list of affixes and adding qualifiers for certain transcriptions, but it's more or less ready, and testcases can be seen here. One thing I would need help with is updating many of the respellings:
(1) Anything with |fs=1 is going to need to be switched to |^= (I believe the exact markup is going to be {{pl-p|^}}, please correct me if I'm wrong).
(2) A big one is going to be using the new respelling system:
Most morphemes have been "taught" to the module, so pages like przeświadczyć won't need any respelling (they do now). If the printed IPA is the same with and without the respelling, the respelling should be removed. This will also be true for words ending in -istka and -ystka and any of their declined forms.
Some pages might still need a respelling, but it will be different, e.g. pochwycić, which currently needs po'chwy.cić but can now be respelled as po.chwycić.
Multiword terms with prepositions (all listed in the module) will automatically have them cliticize; in the past we used - to cliticize them.
On that note, we will still be using - to break palatalization, and those instances should be left alone.
Some words will still have ' for forced stress, so as above, if the IPA is different without the respelling then the respelling should stay. Vininn126 (talk) 19:58, 1 July 2023 (UTC)
@Vininn126 (1) is not hard to implement, but for (2) I need specific instructions as to how to rewrite the respellings (or alternatively I can flag all pages that still have respellings after (1) is implemented, for you to fix, but there might be a lot of them). Benwing2 (talk) 21:04, 1 July 2023 (UTC)
Also, it turns out we're gonna be tweaking pl-p, too. As to 2, it's mostly going to be changing chains of .' to a single ., particularly after the affixes listed in the module. Vininn126 (talk) 21:32, 1 July 2023 (UTC)
@Benwing2 Basically there will be many words starting with u-, po- and o- that have respellings and we'd just need to check if there's a syllable breaker after them, replace it with . and remove any other syllable breakers and check if the IPA is the same. Vininn126 (talk) 21:49, 1 July 2023 (UTC)
@Vininn126 Examples would really help. I am not at all familiar with how {{pl-p}} currently works or how the new template works. Whenever you do a change like this there are always a bunch of edge cases that have to be dealt with, so the more examples you can give, the better it will enable me to figure those out. Benwing2 (talk) 21:51, 1 July 2023 (UTC)
@Benwing2 uchwycić, pochwycić, and ochwycić would all be respelled as u'chwy.cić, po'chwy.cić, and o'chwy.cić. Basically, this is needed when the next syllable has a consonant cluster, and all three syllables are separate, as the module won't print syllable breaks otherwise. Ideally, with the new module, it would be just u.chwycić, po.chwycić, and o.chwycić, with one breaker at the morpheme boundary. Vininn126 (talk) 21:53, 1 July 2023 (UTC)
@Vininn126 Thanks. What about if there are more or less than three syllables? Do all syllable breaks except those after u-, o-, po- go away? Are there other prefixes that I need to pay attention to? Benwing2 (talk) 21:59, 1 July 2023 (UTC)
@Benwing2 The number of syllables does not matter. It could be anything from two upwards. As for changing the respelling, I believe these should be the only affixes we are worried about. Vininn126 (talk) 22:02, 1 July 2023 (UTC)
@Benwing2 In multiword entries, prepositions will have a - after them, this should be replaced with a space. A list of prepositions can be found in the module. Vininn126 (talk) 22:02, 1 July 2023 (UTC)
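Taken together, the two rewrites described in this exchange might look like the sketch below. It is a Python illustration of the rules as stated in the thread, not the bot code that was eventually run. The prefix list is the one given above (u-, po-, o-); the preposition list is a hypothetical subset, since the real list lives in the module. Because some words keep ' for forced stress, any converted respelling would still need to be checked against the IPA the new module generates.

```python
import re

PREFIXES = ("po", "u", "o")               # the prefixes named in the thread
PREPOSITIONS = ("na", "do", "od", "bez")  # hypothetical subset of the module's list

def simplify_prefix_respelling(respelling: str) -> str:
    """po'chwy.cić -> po.chwycić: keep one breaker after the prefix, drop the rest."""
    for prefix in PREFIXES:
        rest = respelling[len(prefix):]
        if respelling.startswith(prefix) and rest[:1] in (".", "'"):
            stem = rest[1:].replace(".", "").replace("'", "")
            return prefix + "." + stem
    return respelling  # no breaker directly after a known prefix: leave as is

def respace_prepositions(respelling: str) -> str:
    """Replace the hyphen after a word-initial preposition with a space.

    Other hyphens (e.g. those that break palatalization) are left alone.
    """
    pattern = r"(?<!\S)(" + "|".join(PREPOSITIONS) + r")-"
    return re.sub(pattern, r"\1 ", respelling)
```

For example, simplify_prefix_respelling("u'chwy.cić") gives "u.chwycić", while a respelling such as "mar-znąć" passes through respace_prepositions unchanged because "mar" is not in the preposition list.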
@Vininn126 You said you'd need help with words with |fs=1. Can you give an example which uses that parameter as I can't think of anything right now. Btw. Can't a bot just change them all? Tashi (talk) 21:42, 1 July 2023 (UTC)
Hi @Benwing2. The update is not actually as ready as you may have been made to believe. For now what's more or less in place is the transcription into IPA, but the actual implementation of that into a working template with hyphenation and parameters like qualifiers, references, etc. is at its earliest stages. These things are traditionally handled by MOD:pl-pronunciation, for which I notify @Surjection as the original author, though there are some things which should probably be ported into the transcription module. For example, MOD:pl-pronunciation handles hyphenation, but some things which before required a respelling, like for example przechwycić because of its prefix, now are handled automatically by the IPA module through affix recognition, which means that the hyphenation may be better treated together with the transcription. Another example is the stress with -yka / -ika suffix (to see how it would visibly work, see the example gramatyka), handled by MOD:pl-pronunciation (this happening after Surj's original), though this shouldn't be the case, since a multiword term can contain an yka-suffixed word, and this should be handled accordingly. This would mean that qualifiers as well might need to be taken into consideration already in the transcription module. I'm hesitant on what is the best way to address these problems, especially since this would involve heavy changes to MOD:pl-pronunciation, which is still all Greek to me. I express some problems more thoroughly in the comments of the module. I'm not used to handling this kind of thing, so I'd be thankful for some help in the code when it comes to the heavily technical part, since I assume it won't be too different from the other languages' modules, so experience would seem to play a big role, and I wouldn't want to just cause a bigger mess for future editors to then untangle. Catonif (talk) 16:50, 3 July 2023 (UTC)
@Catonif I implemented something similar for {{es-pr}} and {{it-pr}}. It is handled all in one module in Module:es-pronunc and Module:it-pronunciation. You might want to take a look at the latter; the former is more complex due to handling multiple dialectal pronunciations. Essentially there's a function to generate the pronunciation itself, which is wrapped by code to handle the argument parsing, hyphenation/syllabification and display. By putting it together in one module, you can share things like the list of affixes. I was also able to share some of the hyphenation code since both the IPA generation and hyphenation generation have to do this; but there are some differences because one operates directly on the spelling and the other operates more or less on the IPA. I can help you with some of the coding although I seem to have a lot on my plate so I'm not sure how fast I can get to it. Benwing2 (talk) 20:05, 3 July 2023 (UTC)
I mean, to be honest, if we can fix the hyphenation to recognize affixes, then we could in theory use this code just for transcriptions. Vininn126 (talk) 20:36, 3 July 2023 (UTC)
@Vininn126 It's not quite that simple, e.g. you mentioned changing the handling of Cr (and Cl?) combinations; we'd need to make that change in the hyphenation code as well, along with any other changes in the IPA code that affect hyphenation. Benwing2 (talk) 00:21, 4 July 2023 (UTC)
@Benwing2 One of the last major things for this module is to convert the generated IPA strings into tables, which would allow for labels. Would you be able to take a look? There are a few other minor things but those should be easily handlable. Vininn126 (talk) 18:41, 10 July 2023 (UTC)
@Vininn126 I did take a look but it will require some significant work given that it has to work with {{pl-p}} or equivalent. I'll try to work on this over the next few days but it's a nontrivial task. Benwing2 (talk) 07:22, 19 July 2023 (UTC)
@Benwing2 understood. Once that is done a few other smaller tasks can be handled and then I'll make a template and I will be asking for help replacing the old template, I've been thinking about how exactly to do this. Vininn126 (talk) 07:42, 19 July 2023 (UTC)
@Benwing2 ah, got it. Thanks for informing of the size of the task, that helps. I understand this is a huge project but I think it will be worth it, you've seen the mess that is Polish code. Vininn126 (talk) 07:39, 19 July 2023 (UTC)
@Vininn126 Yes. BTW I think as a first pass we should forget about Northern Borderlands or other dialects and just focus on the standard. We can then add dialectal pronunciations afterwards. Benwing2 (talk) 07:42, 19 July 2023 (UTC)
@Benwing2 yes, those are not the focus at the moment, but in terms of handling the standard I think most things are handled. NBD should be a simple task and I'm not worried about getting it taken care of right away, plus I think I'd want to add SBD as well. Vininn126 (talk) 07:45, 19 July 2023 (UTC)
@Benwing2 Update: the replacement should be easier than we expected, because we've decided to remove syllable breaks except stress markers from the transcriptions, meaning that there should be many more pages where the only thing we care about is the placement of the stress marker. Do you have an estimate when you'll be able to look at it? Again no rush, just trying to figure out logistics. Vininn126 (talk) 15:23, 6 August 2023 (UTC)
@Benwing2 Actually, @Catonif says it might be better to keep the transcriptions as strings after all, so once a few more small changes are implemented, it might be ready to deploy. Vininn126 (talk) 09:20, 7 August 2023 (UTC)
Can we suppress the redundant second message generated on pages like instar?
The decl table at instar#Declension generates the message "Not declined; used only in the nominative and accusative singular, singular only." The relevant template code is {{la-ndecl|īnstar<indecl>}}. I wonder if it's three messages pieced together, of which we could suppress the last whenever it appears with the preceding message. I wasn't able to find anything in the code that seemed obvious, however. Thanks, —Soap—09:15, 2 July 2023 (UTC)
Sometimes people add error text to entries, like this: Special:Contributions/2804:30C:1364:4E00:653E:A55:AB1E:7907. They seem to be doing it via Gadget-TranslationAdder-Data.js, which does not check whether a language code is valid or not. So this should be fixed so that it only allows valid codes (as determined from a JSON list that we have somewhere, apparently). Equinox◑12:39, 2 July 2023 (UTC)
To add to this, I think I've made this same mistake twice, once with Swedish (when I assumed it was se) and once with Old English (when I assumed it was oe). The first time, even though I could see the error message on the screen, I assumed that it would go away when I added the same word with the correct language code. But apparently it not only doesn't delete, it "chokes" on its own error message and that leads to a bigger and much messier error message. Essentially there is no undo function ... maybe that's just a limitation of the software, but if we could stop the text from being added in the first place, it wouldn't need to be undone later. (As for why I made the same mistake a second time? I just forgot about what happened the first time. I'm that way sometimes.) —Soap—14:35, 2 July 2023 (UTC)
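The requested fix boils down to validating the code before any wikitext is generated. The sketch below shows the idea in Python for illustration only; the gadget itself is JavaScript, and the names VALID_CODES and build_translation are invented here. The small dictionary stands in for the full JSON list of language codes mentioned above.

```python
# Stand-in for the full list of valid language codes (assumed to be loaded from JSON).
VALID_CODES = {"sv": "Swedish", "ang": "Old English"}

def build_translation(lang_code: str, term: str, valid_codes=VALID_CODES) -> str:
    """Return wikitext for a translation, refusing unknown language codes."""
    if lang_code not in valid_codes:
        # Reject up front, so no error text is ever saved to the entry.
        raise ValueError("Unknown language code: " + repr(lang_code))
    return "{{t|" + lang_code + "|" + term + "}}"
```

With this check, entering oe for Old English would be refused before saving, rather than leaving error text in the entry that has to be cleaned up by hand.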
Adding {{senseid}} functionality into {{lb}}?
Would it make sense to add the functionality of {{senseid}} into {{lb}} so that if the latter is already in use in an entry we could just type "{{lb|en|botany|id=botany}}" instead of "{{senseid|en|botany}}{{lb|en|botany}}"? — Sgconlaw (talk) 19:02, 2 July 2023 (UTC)
I think the pros are outweighed by the cons.
Pros:
Quicker to type.
May encourage greater predictability of IDs.
Cons:
More maintenance for template editors and template documentors.
More difficult to search for senseids in large entries. Instead of just searching for 'senseid', one also has to look for '|id=' and then decide whether it applies to the sense or to a term linked to.
(debatable) Counterintuitive. |id= normally refines targets, rather than defining them.
In short, I think the maintenance and use costs outweigh the quick gains when creating new sets of definitions. --RichardW57m (talk) 12:23, 3 July 2023 (UTC)
Thanks. I don't have strong feelings either way, but thought it might be useful to add that functionality to {{lb}}. If |id= is thought to be a confusing parameter name, we could use |senseid= instead. Happy to hear other editors' thoughts on this. — Sgconlaw (talk) 14:24, 3 July 2023 (UTC)
I like the idea (with |senseid= or |sense_id=). In many cases the senses will already have labels when linking. Maybe the label itself could (optionally) become the id, so you don't have to write {{lb|en|music genre|sense_id=music genre}} ({{lb|en|music genre|sense_id==}} ?). Jberkel 15:17, 3 July 2023 (UTC)
@Jberkel: I'm thinking there might be situations where it makes sense, or one needs, to specify a different value for |senseid=—for example, if more than one sense has the same label. But we could certainly use the (first?) label as a default ID if, say, |senseid=1 is specified. — Sgconlaw (talk) 17:56, 3 July 2023 (UTC)
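The defaulting behaviour floated here could work roughly as in the following sketch. It is purely illustrative Python, assuming the proposed (and at this point hypothetical) |senseid= parameter; the function name resolve_sense_id is made up for the example.

```python
from typing import List, Optional

def resolve_sense_id(labels: List[str], senseid: Optional[str] = None) -> Optional[str]:
    """Work out the sense ID for a label template call.

    senseid=None  -> no ID is emitted
    senseid="1"   -> the first label doubles as the ID
    anything else -> used verbatim, e.g. when two senses share a label
    """
    if senseid is None:
        return None
    if senseid == "1":
        return labels[0] if labels else None
    return senseid
```

One consequence of this design is that a sense whose ID really should be the string "1" could not be expressed, which may be an argument for a different trigger value.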
"senseid" as the parameter would work. "sense_id" would not, because a simple search for "senseid" would not find it. I wasn't thinking of |id= as confusing, but just more work. But I suppose having more tasks to do is confusing in itself.
Does using labels as IDs work well for languages other than English? Some IDs for Pali looked bad when exposed in category names.
My first thought is that {{lb}} and {{senseid}} are logically two separate things so it makes sense to have two templates for them. However, if it's common enough to have identical labels and sense ID's next to each other, maybe we could create a combined template {{lbsenseid}} or something that creates a label and sense ID from the same tag. IMO if the sense ID tag is different from the label, we should just write {{senseid|en|FOO}}{{lb|en|BAR}}; clearer that way. Benwing2 (talk) 00:18, 4 July 2023 (UTC)
@Benwing2: |nocat=1 appears to work for {{causative of}}. Could we please have its use sanctioned by being documented? I couldn't work out how to add the parameter to the template's documentation. For future use, if you just go and add it, please tell us how you did it. --RichardW57 (talk) 21:14, 2 July 2023 (UTC)
A literal definition for pācayant (“having someone cooked”) such as
{{inflection of|pi|pācayati||present|participle}}, which is {{inflection of|pi|pacati||causative|t=to cook}}
appears to confess that pācayati is an inflection of rather than a causative of pacati (“to cook”), so I think I should replace it by
{{inflection of|pi|pācayati||present|participle}}, which is {{causative of|pi|pacati|nocat=1|t=to cook}}
@RichardW57 There are three entry points into Module:form of/templates. form_of_t is for templates that display arbitrary text before the lemma(s); this includes {{form of}} itself, as well as more specific versions like {{obsolete typography of}}. tagged_form_of_t is for templates that display a fixed set of one or more inflection tags before the lemma(s); this includes things like {{causative of}}. inflection_of_t is for templates that display a user-specified set of inflection tags before the lemma(s); this includes {{inflection of}} and certain variants of it like {{participle of}}. The latter two always accept |nocat=, because there may be categories generated internally by the inflection tags. The first one only accepts |nocat= if the |cat= invocation argument is given, which adds categories. The documentation for these is generated by {{form of/infldoc}}, which is well-documented; but it is missing support for the |nocat= param. (It is implemented by Module:form of doc; you can see around lines 174-177 where it adds the |nodot= and |nocap= params, but nowhere does it add |nocat=.) I need to add this. Benwing2 (talk) 21:47, 2 July 2023 (UTC)
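The rule for when |nocat= is accepted can be summarised in a few lines. The sketch below is a Python restatement of the description above, not a translation of Module:form of/templates; the function name accepts_nocat and its has_cat_arg flag are invented for the illustration.

```python
def accepts_nocat(entry_point: str, has_cat_arg: bool = False) -> bool:
    """Whether a template built on the given entry point accepts |nocat=.

    has_cat_arg says whether the template's invocation passes |cat=,
    which only matters for form_of_t.
    """
    if entry_point in ("tagged_form_of_t", "inflection_of_t"):
        # Inflection tags may generate categories internally, so always accepted.
        return True
    if entry_point == "form_of_t":
        # Only meaningful when the invocation adds categories via |cat=.
        return has_cat_arg
    raise ValueError("unknown entry point: " + entry_point)
```

On this reading, {{causative of}} (built on tagged_form_of_t) accepts |nocat=1, which agrees with the behaviour reported at the start of this topic.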
@RichardW57 I should add, just today I added support to {{inflection of}} so you can add language-specific "base lemma" params, such as |comp-of= (for inflections of comparatives), |sup-of= (for inflections of superlatives) or (in this case) |causative-of=. So we can add |causative-of= as a base lemma param for Pali, and then you could write this:
I'm not sure that that display is correct, because of what may be alternative forms:
Causative verbs come in pairs, in -eti and -ayati. I declare them lemmas, but treat them as alternative forms of one another - it seems that only the latter comes in the middle (ex-)voice. It can be argued that they are just a single verb with a multiplicity of forms. (For example, both forms form their own present active participles.)
There can be other differences. For example, the first vowel could have been short, and there are causatives where both short- and long-vowelled forms exist. According to the PTS PED, bhindati (“to break”) has two synonymous second causatives bhindāpeti (“to cause to be broken”) and bhedāpeti - one from the present stem and one directly from the root. Overall, I prefer 'a causative' to 'the causative', but hesitate because 'a' implies there are others, and the first pair might be one verb rather than two. It gets worse with the past participle, whose voice depends on the semantics, which may have unpredictable multiple forms, far more often than English.--RichardW57 (talk) 08:03, 3 July 2023 (UTC)
How confident are you that the reader won't misread that it is the term, namely pācayant, which is the causative, rather than pācayati? Definitions should not be comprehension tests. I use 'which is' a lot in such definitions so as to dispel the interpretation of sameness of reference. I'm treading an awkward compromise between keeping the number of clicks low and the maintenance costs of maintaining duplicated (or worse, transformed) sets of definitions. Contemplate an entry for the Sinhala script dative singular of the term, which is what we actually record a quotation for, sitting on the page for the contracted causative පාචෙති (pāceti). The previous word in the quotation is the corresponding form from the simple, non-causative verb. (In this context, 'cooking' appears to actually refer to boiling alive in oil, though I haven't found the quotations for that.) RichardW57m (talk) 11:40, 3 July 2023 (UTC)
It would be nice if you could add support for non-Devanagari Sanskrit and similar cases (e.g. Devanagari Prakrit) with links to both the same-script form (just in case, and for naturalness), and to the Devanagari form (for complete information, as the main lemma). With luck, this should make {{pi-nr-inflection of}} redundant, though it's not inconceivable that Pali could be harder as a degenerate case and also possessing multiple writing systems for several scripts. Khmer script examples could test a few things, such as different transliterations; it has such gems as potentially ambiguous gemination below repha, which seems fairly widespread in epigraphic Sanskrit and in Bengali-script Sanskrit of the Bengal Presidency.--RichardW57m (talk) 15:57, 3 July 2023 (UTC)
Any immediate thoughts on how to handle different senseids for the term's script and the language's main script? I presume the usual case would be that they would be the same (or both non-existent), but that won't always be so. Perhaps |idmain= for the main script if different, with a parameter value of '-' encoding non-existence? --RichardW57m (talk) 15:57, 3 July 2023 (UTC)
@RichardW57 You have written a lot of things here and I'm not sure I understand them all. Currently with |caus-of= and similar parameters, you can put multiple comma-separated base lemmas, each of which can have its own inline modifiers (which includes <id:...> for specifying the link ID). So you could write |caus-of=bhindāpeti,bhedāpeti or even |caus-of=bhindāpeti<t:to cause to be broken><id:some_id>,bhedāpeti<t:some other gloss><id:some_other_id>, etc. If there are multiple such base lemmas, they are separated using serialCommaJoin() in Module:table, which displays "FOO and BAR" if there are two, and "FOO, BAR, BAZ and BAT" etc. if there are more than two. The wording of the article preceding the tag can easily be changed from "the" to something else, or even made customizable if that would help. Documentation for language-specific tags of {{inflection of}} is still to come. As for the paragraph beginning "It would be nice if you could add support for non-Devanagari Sanskrit and similar cases ...", I don't understand this paragraph; it would be nice if you can give some examples and/or suggestions for {{inflection of}} syntax to support your use case. Benwing2 (talk) 21:44, 3 July 2023 (UTC)
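To make the syntax concrete, here is a rough Python sketch of how such a parameter value could be split and displayed. It is not the code in Module:form of or Module:table; the helper names are invented, and join_serial simply reproduces the output format described above ("FOO and BAR", "FOO, BAR, BAZ and BAT").

```python
import re
from typing import Dict, List

MODIFIER_RE = re.compile(r"<([a-z]+):([^<>]*)>")

def split_base_lemmas(value: str) -> List[str]:
    """Split on commas that are not inside <...> inline modifiers."""
    parts, current, depth = [], [], 0
    for ch in value:
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth = max(depth - 1, 0)
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts

def parse_base_lemma(spec: str) -> Dict[str, str]:
    """Separate the term from its inline modifiers such as <t:...> and <id:...>."""
    result = {"term": MODIFIER_RE.sub("", spec).strip()}
    result.update(dict(MODIFIER_RE.findall(spec)))
    return result

def join_serial(items: List[str]) -> str:
    """Join as 'A and B' or 'A, B, C and D', as described for serialCommaJoin()."""
    if len(items) <= 1:
        return "".join(items)
    return ", ".join(items[:-1]) + " and " + items[-1]
```

Splitting only outside the angle brackets means a gloss containing a comma, such as <t:to break, to split>, stays attached to its lemma.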
You've got it the wrong way round. To stick to the closely linked pair of causatives for the example, pācayati and pāceti, commonly described as the uncontracted and contracted forms, what you're suggesting would be to give pācayant the definition
It's slightly jarring to claim 'pācayant' as the present participle of pāceti when it has its own synonymous participle pācent, but it gets even worse with pairs such as the synonymous double causatives bhindāpeti and bhedāpeti of bhindati, each of which has its corresponding uncontracted form in -ayati. However, with this construction for the present participle of some causative of pacati, 'the' works. Perhaps 'some' works generally in the place of the article!
A cleaner case of synonymous forms is causatives sāreti and sarāpeti of sarati (“to flow”), though the second is morphologically indistinguishable from a double causative, which it may be semantically for sarati (“to remember”).
There is no direct link to the Khmer form of the stem in this, only a manual indirect link via the alternative scripts section of the Devanagari lemma; |alt= is used by position. Additionally, there is no transliteration of the standard Devanagari form that is given, though I suppose that could be fixed via an edit to {{sa-sc}}. Also pinging (Notifying AryamanA, Bhagadatta, Svartava, JohnC5, Kutchkutch, Inqilābī, Getsnoopy, Rishabhbhat): . --RichardW57 (talk) 05:57, 4 July 2023 (UTC)
@RichardW57 pācayant is clearly the participle only of pācayati, so IMO pāceti or similar alternative forms shouldn't be mentioned at all in this inflection line. This is not a Pali-specific issue; many languages have alternative forms for verbs (and other parts of speech). Also the issue of having multiple scripts for a given language is not specific to Sanskrit or Pali; e.g. Serbo-Croatian can be written in either Latin or Cyrillic. However in both these cases I'm still having trouble understanding exactly what you *WANT* to have happen; you're complaining about potential issues without presenting solutions. If you give me the expected outcome I can help you figure out how to get there. Benwing2 (talk) 06:04, 4 July 2023 (UTC)
I've almost renamed them and the old categories have emptied. The hyphen felt so unnatural that I accidentally omitted it. I've left the 'Pali irregular conjugation verbs' because one would naïvely think that Pali irregular verbs listed the irregular verbs. I'm wondering if I should entirely eliminate 'Pali irregular conjugation verbs' and put its sole member atthi in the first conjugation, as Warder does. It fits the definition I gave for the first conjugation. Apart from stray remnants, it has the only athematic stem left that ends in a consonant. The concept of Pali 'conjugation' refers to how the present stem is formed from the root, and not how verb forms of the present system are formed from the present stem. It's far less relevant than the Sanskrit verb classes, and the division into seven conjugations is not universal. We don't get discussion of the classification for forms that are unclear on the surface, such as gaṇhati, which historically is fifth conjugation but as far as I am aware functions as third conjugation. --RichardW57m (talk) 11:24, 4 July 2023 (UTC)
If I can hazard a suggestion, I think you want to write something like
dative singular of វុទ្ធ (vuddha), the Khmer script form of बुद्धाय (buddhāya)
Then you mention the Devanagari form of the term itself under ==Alternative forms== rather than in the definition line. This sort of thing can be accomplished already by adding a base lemma param something like |khmer-of= that displays "Khmer script form" as the tag. This could potentially be automated so that rather than having 12 (or however many) base lemma params, one per script, you could have a single |alt-of= param whose tag displays the current page's script (or the lemma's script, which should be the same). That would require adding the ability for tags to be defined using arbitrary functions, which look up properties of the lemma. I can implement this if it would be helpful. Benwing2 (talk) 06:18, 4 July 2023 (UTC)
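The idea of a single |alt-of= tag computed by a function, rather than one hard-coded parameter per script, can be sketched roughly as follows. This is only an illustration in Python, not the actual module code: `script_of` and `alt_of_tag` are hypothetical names, and the script is guessed from the first word of the Unicode character name, which is an assumption that fails for multi-word script names and cannot separate Wiktionary-specific script codes.

```python
import unicodedata


def script_of(term: str) -> str:
    """Guess the script of `term` from the Unicode name of its first letter.

    Hypothetical helper: it keeps only the first word of the character name
    (e.g. "KHMER LETTER VO" -> "Khmer"), so multi-word script names are
    truncated.
    """
    for ch in term:
        if unicodedata.category(ch).startswith("L"):
            return unicodedata.name(ch).split()[0].capitalize()
    raise ValueError("no letter found in term")


def alt_of_tag(lemma: str) -> str:
    """Build the tag text for a hypothetical single |alt-of= parameter."""
    return f"{script_of(lemma)} script form"


print(alt_of_tag("វុទ្ធ"))
```

With a function like this supplying the tag, the same parameter could serve every script, at the cost of needing manual overrides wherever the detection is unreliable.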
Why do you keep adding posts at the end of the topic, rather than after the post you're replying to, as automated by clicking on reply? Your practice is very confusing when the topic forks. You suggested giving an example or suggestions, so I gave an example and got some more sleep. --RichardW57m (talk) 12:50, 4 July 2023 (UTC)
What I had in mind for the solution was a layout conveying:
Khmer script form of बुद्धाय (buddhāya), dative singular of វុទ្ធ (vuddha), which is a Khmer script form of बुद्ध (buddha)
I would envisage this being done with a structure of {{sa-sc}}, {{inflection of}}. One has to be wary of commas, as in definitions they normally join adjacent meanings. It's very easy to go down a 'garden path'.
Now, for Pali I was usually able to combine a transliteration and 'Roman script form of' as a linked-to transliteration, e.g. ທັມມະ (damma, “trainable”); when different, I use an arrow to point to the Roman script form, with a basic glossed link format as in ທັມມະ (damma ⇨ dhamma, “dharma”). Similarly, one wouldn't want to give the transliteration twice when, as is usually the case, both the Devanagari and relevant Khmer script form of the stem or whatever were the same. If your suggested |alt-of= refers to the lemma being inflected, that just needs a few features to be addressed:
Normally being derived automatically from the inflected lemma e.g. by |alt-of==
Having its transliteration suppressed if it is the same as that of the inflected lemma after override, and we should have the facility to override it manually just in case. (Or usually? - what is the rule for showing the accent in the transliteration of Devanagari Sanskrit?)
Applying it as an inline qualifier with the likes of |caus-of= - or is this too recursive?
Luxury feature: selecting the article.
Should we try to make it snappier? Should we make |alt-of== be the default?
Note that for Pali we have a further simplification to make in the presentation because the transliteration and Roman script form usually coincide, even for Lao-repertoire Lao script Pali, i.e. when rejecting the Buddhist Institute's revivals/concoctions added in Unicode 12.0.
Note that Serbo-Croat is not a good example of multiple scripts - cannot one work entirely in Roman script or entirely in Cyrillic script? That is not so in other cases. --RichardW57m (talk) 12:50, 4 July 2023 (UTC)
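The transliteration rule described in this post (show the transliteration once when it coincides with the Roman script form, otherwise point to the Roman form with an arrow) can be sketched as below. This is a minimal Python illustration using the two ທັມມະ examples from the post; `format_link` is a hypothetical name, not an existing template or module function.

```python
def format_link(term: str, translit: str, roman: str, gloss: str) -> str:
    """Render a glossed link, suppressing a redundant transliteration.

    Hypothetical helper: when the transliteration equals the Roman script
    form it is shown once; otherwise an arrow points to the Roman form.
    """
    shown = translit if translit == roman else f"{translit} ⇨ {roman}"
    return f"{term} ({shown}, “{gloss}”)"


print(format_link("ທັມມະ", "damma", "damma", "trainable"))
print(format_link("ທັມມະ", "damma", "dhamma", "dharma"))
```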
@RichardW57: Thank you for the layout example above. I still think it's better to put the first clause (Khmer script form of बुद्धाय(buddhāya)) somewhere else. Usually such information is put in the headword. This is what is done in Hindi and Urdu, for example. This will simplify the definition and avoid some of the "garden path" effects you mention.
As for the "few features to be addressed", I need examples of the features you're proposing. For example I don't understand what |alt-of== is supposed to do (#1), and I don't understand what "Applying it as an inline qualifier with the likes of |caus-of=" (#3) means. #2 (suppressing redundant transliterations) should be possible although I need to think about it. #4 (selecting the article) is very easy to implement, although I need to know whether you want this done in the actual {{inflection of}} spec (which means it needs to be done on a case-by-case basis) or you want it automated through some logic in the language-specific data.
You also ask about showing accents in Sanskrit. It seems the current practice is to put the accents in the translit when possible, although this requires manual transliteration in many cases. I know that Devanagari itself has the ability to add accent marks in it, so I'm not sure why we don't just put the accents that way and have them automatically transliterated, similarly to what's done for Russian, Ancient Greek, etc.
Finally, yes of course in Serbo-Croatian you can work purely in one script or another but I don't know why that isn't the case in Sanskrit and Pali as well. AFAIK, all (or most of?) the various scripts have the ability to completely represent the phonology of Sanskrit and Pali, just like Devanagari does, so you could theoretically work entirely in one script and ignore the others. Benwing2 (talk) 21:05, 4 July 2023 (UTC)
Pali has about 18 writing systems that we acknowledge in the list of alternative forms, and Sanskrit about 28 scripts. Duplicating definitions across all these would be a maintenance nightmare until we moved to a database system. It would be even worse if we demanded confirmation for each sense in each writing system. And if we decided that a Sri Lankan sense should not be recorded in a writing system of Burma (Burmese, Burmese Mon, Thai Mon, Old Shan, New Shan, Tai Khuen, ...), it could get even worse.
The reply on phonology is rather long, but can be summarised as 'Don't be so sure', and the writing systems do not all have the same capability.
Last time I looked, standard-compliant Unicode Devanagari can only represent student-level Sanskrit phrases if all of yy, ll and vv are written as vertical stacks - it can't distinguish candrabindu applied to vowels and applied to consonants - you have to resort to the Latin script and use U+0310 COMBINING CANDRABINDU. Microsoft Unicode Devanagari can cope, but I don't know if HarfBuzz (which Microsoft Edge now uses!) yet supports the character sequences.
I'm not sure about the pitch marks that are beginning to show up in Roman script chanting books. Our local temple encodes them with IPA tone letters and then uses a special font. As far as I am aware, there is nothing corresponding to them in the non-Roman scripts. I've never seen the Thai script handle Vedic accents.
Notoriously, Lao-repertoire Lao script Pali, which is a real thing, only consistently represents Pali phonology that maps onto Lao phonology - it can't distinguish -ss- and -cch-, and it can't distinguish voiced and 'aspirated' voiced stops - it collapses each pair to a voiceless aspirate or sibilant with tone realisation rules to distinguish them from the voiceless aspirates and sibilants of the Pali of two thousand years ago. (I transliterate Lao script Pali: I don't transcribe it.) It's argued that no writing system completely captures the phonology of Pali of over two thousand years ago, which accounts for some of the vagaries of the writing of canonical Pali.
I've also found Thai-script Pali which does not transliterate easily - see attested kat'añjalin (“waiing”).
For Sanskrit, I don't know which scripts support jihvamuliya and upadhmaniya - IAST doesn't!
Notoriously, Bengali script Sanskrit does not distinguish <b> and <v>, and as you should have noticed today, Angkorian Sanskrit usually (but not always) wrote <v> for <b>. The latter is apparently because Malay (think w:Sri Vijaya) didn't have the appropriate contrast. --RichardW57 (talk) 22:54, 4 July 2023 (UTC)
@RichardW57 Thank you for all this info. This is a tangent though and doesn't change what I said above about needing examples and proposing to put the Devanagari form in the headword. Benwing2 (talk) 23:12, 4 July 2023 (UTC)
@Benwing2: One last tangent to address before I get to the main topic. Moving the script classification from {{pi-sc}} or {{sa-sc}} to the headword line should only be done with community agreement. For Pali, that is a bit of a problem, because we have very little communication and {{wgping}} is formally unfinished. I suggest we proceed on the basis that the statement is likely to be moved.
There are some complications with {{pi-sc}} that could cause complications with automatic movement for Pali:
The declaration may declare writing system rather than script, e.g. ภาโว (bhāvo). The only automatic detection of writing system I am aware of is embedded with Module:pi-translit, and is intrinsically unreliable; transliteration in inflection tables relies on being told the writing system when it matters for transliteration. @Octahedron80 was keen on recording writing system, but I felt it more important to get the correct script automatically detected and declared, rather than being subject to copy, paste and forget to edit errors. Thank you, @Svartava, for automating script detection and transliteration in {{pi-sc}}.
The different writing systems generally lack names rather than individuals' dubbings.
The senses may legitimately declare different writing systems; again, see ภาโว (bhāvo) for an example.
Automatic change for {{sa-sc}} could hit a similar problem with Assamese 'script' (code as_Beng) v. Bengali 'script'. These Wiktionary-scripts are not easily distinguishable. Now, one can declare the script in both the headword invocation and {{sa-sc}}, so with luck the only problem will be headword and {{sa-sc}} being inconsistent. I think I tried putting both soft redirects under the same headword, found that the categories looked wrong, and split the entry, e.g. at অন্ধো (andho).
There's currently the issue that for Pali the headword line often simply uses {{head}} directly; often, using the apparently appropriate Pali headword line template would be deleterious. I'm currently fixing much of that as part of the process of enabling the disabling of the suppression of transliteration other than by override.--RichardW57m (talk) 11:37, 5 July 2023 (UTC)
On second thoughts, I don't think we need to keep implicitly stating that the term is in the Khmer script. So, what I would suggest should do the job for this is:
The symbol '⇨' can be read as ', which in the main spelling for Wiktionary is', but without the parsing issues of lengthy English. There might be a better dingbat, conveying 'look here for full script-independent information'.
Now, for where the spellings in the two scripts are the 'same', we would want to simplify it, so I suggest
In 'alt-of==', the second '=' means the transliteration of the form being inflected into Wiktionary's main writing system for the script. The transliteration does not have to pivot through the Roman script.
I think it looks better with the sole example of the transliteration in final place.
For a more complicated system, but still just a one-step chain, for ទត្តាយ (dattāya, “to the given one”), we would somewhere have the information currently given by {{sa-sc}}, namely 'Khmer script form of दत्ताय', and the definition could go:
I think the gloss should naturally get associated with the main script form, though the glosses for the form in the entry script and the main script could both be displayed.
In a first build, the entry script to main script can often be pivoted through Roman script with only occasional losses; these can be manually corrected via the value for |alt-of=. --RichardW57m (talk) 14:15, 5 July 2023 (UTC)
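The behaviour proposed for |alt-of== (derive the main-script form automatically, pivoting through Roman script in a first build, with a manual value as the override) might look roughly like this. It is only a Python sketch under stated assumptions: `resolve_alt_of` is a hypothetical name, and the two converter callables are stand-ins supplied by the caller, not real transliteration modules.

```python
from typing import Callable


def resolve_alt_of(
    value: str,
    entry_form: str,
    to_roman: Callable[[str], str],
    from_roman: Callable[[str], str],
) -> str:
    """Resolve a hypothetical |alt-of= value to the main-script form.

    A bare "=" means "derive it": the entry form is transliterated to Roman
    script and then into the main script. Any other value is a manual
    override and is returned unchanged.
    """
    if value == "=":
        return from_roman(to_roman(entry_form))
    return value


# Toy converters, used only to show the control flow.
print(resolve_alt_of("=", "Abc", str.lower, str.upper))
print(resolve_alt_of("manual form", "Abc", str.lower, str.upper))
```

Passing the converters in keeps the sketch neutral on whether the conversion pivots through Roman script at all, which the post notes is not required.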
The Thai-script example above is for entry สุขี (sukhī). The final example, for ទត្តាយ (dattāya), is entirely concocted to give a straightforward example of complexity. --RichardW57m (talk) 15:08, 5 July 2023 (UTC)
What I want for article selection is customisation by language, so one can do a good fit by the language's tendency to have multiple forms at each derivation, and then overridable at each invocation of the template, as shown in the example with inline qualifier art. --RichardW57m (talk) 15:19, 5 July 2023 (UTC)
@RichardW57m I really don't think the use of an arrow ⇨ is a good idea. On first glance I'd have no idea what that means. Better to spell it out in words. Benwing2 (talk) 19:37, 5 July 2023 (UTC)
@Benwing2: As I suggested above, there are dingbats that would be better. How about ☞ or 🖛, possibly with a tool tip, such as 'see entry to the right for fuller script-independent information'. The obligatory commas break things up, whereas the notation should be a binding. --RichardW57m (talk) 08:15, 6 July 2023 (UTC)
@RichardW57: I have the same concerns with all such dingbats, and tooltips are not a good solution because they don't work on mobile or for people with pop-up blockers. The text can read ", which is a Khmer-script form of" or just ", which is an alternative-script form of" or ", for which the main-script form is" or whatever. This is more verbose but much clearer. Benwing2 (talk) 08:20, 6 July 2023 (UTC)
@Benwing2: The complex example above then expands to:
From that, I don't get any feeling that the transliteration applies to both forms, which is an immediate loss. The example with different spellings expands to:
On second thoughts, perhaps 'entry' would be better than 'script form'; 'main script form' might be construed as hate speech. We can also condense 'for which the' to 'whose', so shortening the more complicated form to:
It makes it clearer that the Devanagari is being referenced because that is how Wiktionary is organised. It would also work for languages in which Wiktionary's main script for a lemma is not predictable, such as Old Khmer. (Its crumbling morphology might not be suitable for avoiding duplication of senses down to derivatives.) --RichardW57m (talk) 09:01, 6 July 2023 (UTC)
@RichardW57 What about using brackets? Something like this:
@RichardW57: All right, I'll see if I can add support for this. Basically, certain places that are now hard-coded need to be replaceable with an arbitrary function, which can implement the relevant logic. Benwing2 (talk) 18:48, 8 July 2023 (UTC)
Harmonisation might cause confusion, as some works classify Pali verbs by the corresponding Sanskrit classes. The problems are usually alleviated by using Roman numerals for the Sanskrit classes, which I thought was the normal system when numbering them rather than naming them from representative roots. I think I can use Dhtm 103 to complete the justification of shoe-horning atthi into the irregulars. I also want to have at least a maintenance category for unclassified verbs, as a piece of editor-friendliness, rather than force guessing. --RichardW57m (talk) 10:11, 5 July 2023 (UTC)
There is a use at ceil#Verb that isn't displaying correctly at the moment.
Note: the Documentation page includes the chapter titles, and ideally the template would display the appropriate chapter title. The new template is designed to work with the scan-backed copy of the 2nd edition being proofread at Wikisource, and which is currently about 70% done. --EncycloPetey (talk) 19:53, 4 July 2023 (UTC)
That's gotten the basic functions to work (Thanks!) but it's still not linking to the chapter, and as I say, it would be nice to set things up to switch in the chapter name for display, but that's outside what I know how to do. --EncycloPetey (talk) 20:52, 4 July 2023 (UTC)
Perhaps, but the first edition does not exist at Wikisource; only the second edition does. So the manner of, and targets for, linking will be completely different for the first and second editions. One will involve wikilinks for both book and chapter, and using the page number following the hashtag to reach the appropriate page, while the other will involve pointing to an adjusted page from a PDF scan displayed at the Internet Archive, and linking the book title to its Wikipedia article instead of a work at Wikisource. Combining the two disparate templates would be overly, and unnecessarily, complicated. --EncycloPetey (talk) 22:26, 4 July 2023 (UTC)
@EncycloPetey I agree with User:Sgconlaw here; unless the external interface needs to be significantly different, it would be better to combine the two templates. You can just put an {{#if:}} clause in the template code. Benwing2 (talk) 22:38, 4 July 2023 (UTC)
@EncycloPetey You mention wikilinks differing; this is part of the implementation, not the external interface. Here by external interface I mean the parameters to the template calls, which appear to be exactly the same for the two templates. Benwing2 (talk) 23:14, 4 July 2023 (UTC)
Parameters aren't external nor are they interface. The external interface is the connection between what's happening inside the template and its relation to the space external to the template. That interface is completely different between the two editions, as is what the parameters would be required to do in terms of linking as part of that external interface. --EncycloPetey (talk) 23:20, 4 July 2023 (UTC)
@EncycloPetey: If the same template can handle 2 different Chaucer manuscripts in facsimile and a 19th-century print edition, as is true for at least one Canterbury Tales template I was looking at recently, a single template can handle this. Chuck Entz (talk) 00:47, 5 July 2023 (UTC)
@Chuck Entz I would be interested to see that template, so that I can determine whether this is an analogous case. Thus far, I do not believe it is, because the template would need to do completely different things with the parameters in each situation. --EncycloPetey (talk) 01:01, 5 July 2023 (UTC)
The template is {{RQ:Chaucer Canterbury Tales}}. I misremembered: there was a manuscript, an early printed edition rather than a second manuscript, and the 19th-century edition- but that doesn't affect what I wrote.
The current situation is not analogous. The Canterbury Tales template is only linking to a page, and only to scans. The desired template here would link to chapters and pages at Wikisource, and the first edition scans could not link to chapters unless a set of interpretive values were added for that scan that first interpreted the chapter values using a preset table for comparison before outputting a link target. The Wikisource copy simply needs the chapter number dropped into a fixed structure to generate the link. That is, all Wikisource chapters are in the form s:The Souls of Black Folk (2nd ed)/Chapter N where N is a value from 1 to 14.
The Canterbury Tales template links only to Wikipedia articles for each item: title, chapter, etc., with just the page number linking to the source for the quote. I am proposing the chapter text generated by the Souls of Black Folk template link to the chapter at Wikisource, as well as having the page number link to the page, because doing so would be a relatively simple matter because of the standard formatting at Wikisource. That wouldn't be possible for linking to the first edition, which is a scan, and would involve a completely different set of code unique to scan linking. --EncycloPetey (talk) 02:18, 5 July 2023 (UTC)
@EncycloPetey We're talking about a single top-level conditional. If you think that is too complicated, it mostly just indicates you aren't comfortable with template coding. Benwing2 (talk) 03:44, 5 July 2023 (UTC)
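The "single top-level conditional" under discussion can be pictured as follows. This is a Python sketch, not template code: the Wikisource target format and the range of 14 chapters come from the posts above, while the function names and the `scan_target` argument are hypothetical placeholders for whatever the first-edition branch would compute.

```python
WIKISOURCE_BASE = "s:The Souls of Black Folk (2nd ed)"
CHAPTER_COUNT = 14  # chapters 1 to 14, as stated in the discussion


def wikisource_chapter_target(chapter: int) -> str:
    """Drop the chapter number into the fixed Wikisource link structure."""
    if not 1 <= chapter <= CHAPTER_COUNT:
        raise ValueError(f"chapter must be between 1 and {CHAPTER_COUNT}")
    return f"{WIKISOURCE_BASE}/Chapter {chapter}"


def link_target(edition: str, chapter: int, scan_target: str) -> str:
    """One top-level branch on the edition; the parameters stay the same.

    `scan_target` is a hypothetical stand-in for the first-edition scan link.
    """
    if edition == "2nd":
        return wikisource_chapter_target(chapter)
    return scan_target


print(link_target("2nd", 3, "scan link placeholder"))
```

The caller passes the same parameters either way; only the branch taken inside differs, which is the distinction between interface and implementation drawn above.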
If the multiple and disparate editions are codeable, then by all means. If that will be the case, then I recommend a look at s:The Souls of Black Folk, which lists several of the early editions, with pub. dates and links to scans where I've been able to find them. --EncycloPetey (talk) 03:49, 5 July 2023 (UTC)
@Benwing2 @Sgconlaw Looking at the quotations that use the first edition, I would say scrap the first edition function entirely, and use only the second edition at Wikisource. Our copy is cleaner than the scan linked for the first edition. See for example the quote used on oasis#Noun, where the first edition scan is missing a significant portion of the bottom left-hand page 11 that is used in the quotation. Given the poor quality of the scan used for the first edition, with missing and distorted text, it would make little sense to juggle two different sets of parameter implementation. A single template, using only the second edition, which has clean text at Wikisource, and which has a better quality scan backing it, would make the most sense. Both editions were published in the same year, just months apart, and from the same publisher, so it makes little difference for the date of a quote, and several of the essays in the book had been previously published elsewhere. --EncycloPetey (talk) 01:14, 5 July 2023 (UTC)
@EncycloPetey, Benwing2, Chuck Entz: I have updated {{RQ:Du Bois Souls of Black Folk}} so that it can link to both the Internet Archive and English Wikisource versions of the 2nd edition of the work. To quote from the English Wikisource version, specify |edition=2nd and |version=WS. Please try it out. (Regarding the use of the quotation template at oasis, the scan of page 11 of the 1st edition is a little distorted but there is no missing text.) — Sgconlaw (talk) 15:28, 11 July 2023 (UTC)
@Sgconlaw: Uh, yes, parts of words are missing in the first edition scan. Please look again at what is quoted, and then read that portion from the bottom of page 11. Part of the quotation is not there because the left hand gutter has devoured some of the text. --EncycloPetey (talk) 15:34, 11 July 2023 (UTC)
Also, the template is not doing anything that was part of the discussion. It is not displaying the chapter number with title given the chapter number, nor is it linking to the WS chapter and page where the quotation comes from. The Chapter should link to the WS chapter, and should give both number and title of the chapter. The Page numbers should link to the transcribed page in the Mainspace like this, not to the copy in the Page namespace. The Page namespace is used for proofreading, and can be accessed from the transcribed copy by clicking on a page number, but it is the working space of Wikisource, not the primary space. Someone accessing a quote that spans pages will only get part of the quote in the Page namespace, but will have access to the full quote in the Mainspace. --EncycloPetey (talk) 15:40, 11 July 2023 (UTC)
Regarding the original version of the 1st edition, I see what you mean. I was looking at page 12, not page 11. The good news is that I have found a much better version of the 1st edition at the HathiTrust Digital Library and will upload it to the Internet Archive for use with the quotation template.
I have updated the page link as requested. (I originally linked to the Djvu version because that is the URL which shows up when you click on any of the page number links at Wikisource.) However, I do not think we need to display both the chapter number and the chapter name. Generally, if a chapter name is available we just display that, and if the chapters are not named we just display the chapter number.
@EncycloPetey: ah, I forgot that |chapter= is now mandatory with the updated page link. OK, the template now generates an error message if the chapter is not stated:
Now the link isn't working because it's generating https:/ twice at the start of the link. Please check that it works before pinging me again. --EncycloPetey (talk) 16:59, 11 July 2023 (UTC)
It's working now, thanks! One final question: would the reader expect the book title to point to the book itself, or an article about the book? For the Wikisource edition, it is possible to link the book title to the main page of the book on Wikisource. --EncycloPetey (talk) 17:20, 11 July 2023 (UTC)
@EncycloPetey: we generally link the book title to a Wikipedia article if one exists. I don't think it's necessary to provide a link to the main page of the book source (whether the Internet Archive or Wikisource)—after all the page number (or, for the Wikisource version, the chapter name if the page number is omitted) already links to the source. If you're fine with the updated version of {{RQ:Du Bois Souls of Black Folk}}, I'll delete {{RQ:Du Bois Souls of Black Folk 2nd ed}}. — Sgconlaw (talk) 17:28, 11 July 2023 (UTC)
I am fine with it, but I do think that, where a Wikisource copy of a text can be linked from the title, it should. The main Wikisource page will link to the corresponding WP article, if the reader needs additional context. As a reader, I expect to arrive at the work whose title I click, rather than a secondary description of that work; just as I would expect to arrive at the entry for a word I clicked on Wiktionary. Linking the author to their WP article makes sense to me, but not linking to a WP article about a text when it is the text of the work that is relevant for the quotation. The page link at the end (after an OCLC link) will not be the reader's first guess at where to arrive at the work being cited. --EncycloPetey (talk) 17:46, 11 July 2023 (UTC)
@EncycloPetey: OK, great. My point about the title is that it seems redundant to have two external links to the same place, one at the title and the other at the page number or chapter name. Since there's already a link at the page number or chapter name, the title is more usefully linked to a Wikipedia article about the work as readers may want to learn more about it. — Sgconlaw (talk) 18:36, 11 July 2023 (UTC)
I understand that a reader might want to know more about the work, and that's why Wikisource links to the WP article from the top page of its works. My point is the Principle of Least Surprise. If I pull something off the shelf labelled Moby Dick, I expect it to be the novel Moby Dick, and not an article about the novel. A reader clicking the link would reasonably expect to be taken to the novel, and finding instead an article about the book, might be disappointed. I believe that it is more reasonable for a reader to expect to be taken to the quoted work, rather than an article about the work. --EncycloPetey (talk) 18:45, 11 July 2023 (UTC)
Trying to add an example
Hello, at で I tried to add an important example that needs an English translation, the usage example "白水着で泳いでみた" (I tried swimming in a white swimsuit). It's not harmful. Frozen Bok (talk) 08:03, 5 July 2023 (UTC)
@Frozen Bok You should be able to see the abuse filters you triggered here: It seems the edit summary you used had keywords of a sort typically used by vandals. For the post a little ways up that triggered the abuse filter, it was because there were formatting issues in the page you created. Benwing2 (talk) 01:11, 6 July 2023 (UTC)
This showed up on my watchlist. It is not the only entry on my watchlist bearing the message "this page is included within other pages".
How does that happen? Is it harmful or wasteful, in actuality or potentially? If it isn't, why is such a message made to appear? DCDuring (talk) 21:50, 5 July 2023 (UTC)
It's not new. I had noticed it for months, maybe years. It shows up in both "what links here" and the listing of templates used at the bottom of edit preview. It doesn't always show up in my watchlist. I have looked at a few words from non-Roman scripts and the self-transclusion seems to occur there as well.
I notice that for Chinese entries there can be multiple transclusions of entries, which I have yet to see in other languages. DCDuring (talk) 00:03, 6 July 2023 (UTC)
I also note that self-transclusion occurs at Wikipedia. Also, it seems to show up regularly for entries that are on my watchlist because they are in a category on my watchlist. So the question is:
@DCDuring If you preview word, you'll see listed under "Templates used in this preview" word itself as well as a smattering of Chinese pages and a few others. I suspect the pages listed here are those for which content is fetched by Lua code on the page. Currently, for example, if you link to a Chinese-language term, the transliteration module looks up the page contents of the term to fetch its transliteration. Presumably there is some code run by most or all pages that is fetching the page contents of the page itself. This sounds like something User:Theknightwho might have added, or they might know what is going on. Benwing2 (talk) 01:03, 6 July 2023 (UTC)
@DCDuring @Sgconlaw Tons of pages will show this, because viewing the raw contents of a page via a module counts as a transclusion of that page - even if none of it ends up being actually displayed. The headword module does various checks on the raw text of any page it's on (which makes it straightforward to add pages with issues to maintenance categories), so any page with a headword "transcludes" itself, even if there aren't any problems with it.
@Benwing2 Your suspicion is correct, as mw.getCurrentTitle():getContent() will trigger this, which is run by Module:headword/data to check for various things like manual uses of {{DEFAULTSORT:}}, as it messes around with automatic sorting and sometimes causes a (non-Lua) error to display if it overrides automatic sorting. It's in the "data" module so that it's only done once for the whole page. See Category:Pages with DEFAULTSORT conflicts, which shows that almost all of them are Japanese entries (as it used to get added by {{ja-new}} until recently). Theknightwho (talk) 01:21, 6 July 2023 (UTC)
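The kind of check described here (scan the page's own raw text for a manual {{DEFAULTSORT:}}, and do it only once per page) can be sketched as below. This is a Python illustration of the idea, not the code in Module:headword/data: the function name and the regular expression are assumptions, and the cache merely stands in for the effect of placing the check in a data module.

```python
import re
from functools import lru_cache

# Assumed pattern for a manual DEFAULTSORT in raw wikitext.
DEFAULTSORT_RE = re.compile(r"\{\{\s*DEFAULTSORT\s*:")


@lru_cache(maxsize=None)
def has_manual_defaultsort(raw_wikitext: str) -> bool:
    """Report whether the raw page text contains a manual DEFAULTSORT.

    The cache means repeated calls with the same page text do the scan
    only once, mimicking a check that runs once for the whole page.
    """
    return DEFAULTSORT_RE.search(raw_wikitext) is not None


page = "{{DEFAULTSORT:ことば}}\n==Japanese==\n{{ja-noun}}"
print(has_manual_defaultsort(page))
```

Note that nothing from the scanned text is displayed; reading it is enough for the software to record the page as transcluding itself, as explained above.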
Whatever the cause, however unavoidable the underlying phenomenon, the message on the watchlist is just cruft. How can we (by which I probably mean you) get rid of it? DCDuring (talk) 11:38, 6 July 2023 (UTC)
It certainly doesn't consistently appear. Since I've been looking, all entries transclude themselves, but only sometimes does the message appear on my watchlist. Can't something be done using CSS to suppress it? I don't want to have to wait for the root cause to be solved next year. DCDuring (talk) 21:24, 6 July 2023 (UTC)
Note that example 3 does not have the message, but the line is the result of the category being on my watchlist. DCDuring (talk) 01:57, 7 July 2023 (UTC)
@DCDuring I couldn’t get it to show up for me, so I’m not sure how to diagnose the issue. I Googled “this page is included within other pages” and got a few results relating to wikis, but nothing I could make much sense of as it’s all buried in random code dumps. Theknightwho (talk) 02:37, 7 July 2023 (UTC)
Oh I see, you have to turn on "Category changes" from the filter list to see it.
I suspect the thinking behind it is, if page A (think: a template) is transcluded into other pages, then those other pages will all be categorised into the category that page A was categorised into, therefore, the watchlist entry is incomplete, because more than just page A was added to the category. Hence the link to WhatLinksHere.
I don't get the logic. If many entries are transcluded into themselves, then those many pages will be (redundantly?) categorized into the same categories.
I had searched for "this page is included within other pages" in template, module, and mediawiki namespaces without result.
For a naive user like me it is not clear whether "this page" refers to the Category page or the page being categorized or decategorized. DCDuring (talk) 13:02, 7 July 2023 (UTC)
@DCDuring They're not actually transcluded by any normal definition of the word, so there's no redundant categorisation going on - they're scanning their own page contents for certain problems, but none of it actually ends up getting displayed. Maybe the logic behind it that @This, that and the other suggests is simply outdated, as they never anticipated situations like this being counted as transclusion.
I suppose the reason it's useful for the software to keep track of it is because it signifies that a change to page X will affect page Y, or in the case of transcluding itself, that a change to the page might affect other things on the page in ways that are non-obvious. However, that doesn't change the fact that it's obviously not helpful for someone simply viewing their watchlist like you, as it's misleading. They should probably call it something else. Theknightwho (talk) 17:45, 7 July 2023 (UTC)
So, back to the main issue: how does one get this cruft eliminated from the watchlist of people like me, whatever the rationale for the misnomer?
For that matter, why not get rid of an entry's name on (1) the list of "Templates used on this page" at the bottom of the edit preview display and (2) the list of items linked by transclusion to itself, if an entry is not properly called a template or is not truly "used" by 'transclusion'?
@DCDuring Turning off category changes would obviously do it, but there's a clear downside to that. Unfortunately the things you suggest are determined by MediaWiki's software, and I don't think we have any control over them. You can make a feature request at the Phabricator (as I expect they wouldn't see this as a bug since it's technically doing what it's supposed to, even if it's not actually very helpful). Theknightwho (talk) 19:10, 7 July 2023 (UTC)
@DCDuring I am not a CSS expert but I don't think this is possible purely with CSS; it would have to be done with JavaScript (probably a one or two-line JavaScript action would do it but I am not familiar with how to do this). Benwing2 (talk) 21:04, 8 July 2023 (UTC)
Perhaps I wasn't clear. The logic that displays "this page is included within other pages" was clearly intended for when templates, modules, etc. are added to categories, as such categorisation changes will generally affect other pages besides the page that was actually edited. It's apparent enough that the logic was not intended to be activated for pages that are only transcluded by themselves, which is the situation we are facing here. And yes, the logic is in the MediaWiki software itself, which is why the text is not visible when searching this wiki. This, that and the other (talk) 02:30, 8 July 2023 (UTC)
To be clear, it only shows up when the 'watch category' "Filter" is enabled on the watchlist, and not for any other watchlist item to which the same underlying logic applies. That makes it spurious where it appears. Whether the term transclusion is erroneously applied or simply misleading (possibly only to me) is secondary to my concern, which is merely with decluttering my watchlist and making watching categories better for all. DCDuring (talk) 14:07, 8 July 2023 (UTC)
Within the source for the watchlist items with the offending text is the following:
<a href="https://dictious.com/en/Special:WhatLinksHere/neoracism" title="">this page is included within other pages</a>
On Wenzhou in the Translation box, we see: "Wu: 溫州/温州 (1un-tseu-- !--)". This is because on 溫州's pronunciation box, we see "|w=1un tseu !--|wz = ʔy33-1 tɕiɤu33-11 --". The obvious fix is to delete the hidden content. But what's more interesting or important is: can you prevent the Translation box from grabbing hidden content? Thanks! --Geographyinitiative (talk) 10:13, 6 July 2023 (UTC)
It's because there's a pipe in the hidden comment, and Module:zh-translit uses pipes to know where the end of the parameter is.
There was also a minor bug with Wu romanisation where starting and trailing spaces would get converted to hyphens in addition to medial ones, so I've added a bit to remove any padding.
There may be a few instances where a page has reading XXX specified multiple times but one is entered as (e.g.) "XXX ", which would have been treated as different readings (causing a translit fail). Removing padding means that's no longer the case.
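The two fixes described above (ignoring a pipe inside hidden content, and trimming padding before spaces become hyphens) can be sketched roughly as follows. This is only an illustration in Python, not the actual Lua in Module:zh-translit; the function names, the regular expressions, and the sample wikitext (which assumes the hidden content was an HTML comment) are all hypothetical.

```python
import re

# Hypothetical helper: matches HTML comments such as <!-- ... -->.
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def extract_param(wikitext, name):
    """Return the raw value of |name=...| after dropping hidden comments."""
    cleaned = COMMENT.sub("", wikitext)
    match = re.search(r"\|\s*" + re.escape(name) + r"\s*=([^|}]*)", cleaned)
    return match.group(1) if match else None


def romanise_wu(reading):
    """Trim padding first, then turn the remaining (medial) spaces into hyphens."""
    return re.sub(r"\s+", "-", reading.strip())


# Assumed input: a pipe sits inside the comment, as in the report above.
sample = "{{zh-pron|w=1un tseu <!--|wz = hidden reading -->|cat=pn}}"
raw = extract_param(sample, "w")   # "1un tseu " (note the trailing space)
print(romanise_wu(raw))            # 1un-tseu
```

Trimming also means that a reading entered as "XXX " and one entered as "XXX" compare equal, which is the duplicate-reading case mentioned above.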
Hello! Since gerrit:930625 was merged a couple of weeks ago and has now been deployed, the following pages are no longer necessary – the MediaWiki software will still have the "c" access key even if these pages don't exist. In other words, they can safely be deleted without any functionality changing. The pages are:
Awesome! There are plenty of Wiktionary-worthy entries mixed up in there too. I just made a couple of dozen. This en.wiktionary seems like a fine place to be. Not the famouse (talk) 12:26, 8 July 2023 (UTC)
"list" article broken on mobile site due to CSS shenanigans
As described on the talk page of the list article and the Reddit post where this bug was discovered, the "list" article is broken on the mobile site. That's to say the layout is broken and scrolling is impossible. The reason is that Wiktionary includes the name of the article in the name of an HTML class, and that name, in the case of the "list" article, collides with a pre-existing CSS class. Alpatron (talk) 18:09, 9 July 2023 (UTC)
There's nothing we can do to fix it (it affects all Wiktionaries). It has to be reported to the WMF devs, but don't mention Wiktionary or they'll probably close the ticket as "won't fix" with immediate effect. — SURJECTION/ T / C / L /19:46, 9 July 2023 (UTC)
Most MediaWiki-related tasks (bugs) in Phabricator get ignored; I don't think it is especially connected to Wiktionary. There are simply insufficient WMF and volunteer resources to address them all. Perhaps what we lack relative to other projects, like Wikipedia, are local volunteers with knowledge of the MediaWiki codebase and the time to help fix the bugs ourselves. I have the former but not the latter. This, that and the other (talk) 00:56, 19 July 2023 (UTC)
I have implemented the underlying code and changed the relevant Japanese categories to use {{auto cat}}. The same needs to be done for the other languages. Benwing2 (talk) 00:26, 11 July 2023 (UTC)
@Erutuon, This, that and the other I still see this. It's at the very top, where the title ought to be. Some JavaScript code is changing the title but I don't see any recent changes to our JavaScript code so I wonder if this is a MediaWiki bug. Benwing2 (talk) 22:20, 10 July 2023 (UTC)
Many languages have alternative forms of prefixes and suffixes ("variants" per User:RichardW57). Currently the {{affix}}/{{prefix}}/{{suffix}} handlers aren't very smart about this, and so e.g. if a Finnish term ends in -käs instead of -kas, the etymology specified using {{affix}} or the like either needs to write {{af|fi|foo|-kas}} (despite the term's spelling) or it needs a piped link something like {{af|fi|foo|]}}, or a display form something like {{af|fi|foo|-kas|alt=-käs}}. I'm thinking of making this smarter; this would entail adding language-specific data modules to Module:compound. There is precedent for this, since we have language-specific data modules for {{lb}} and {{infl of}}. This would also allow e.g. the category Category:Finnish terms suffixed with -kas to auto-display the alternative forms in the category description. Thoughts? Any other ideas? E.g. it would be nice to reduce the burden of specifying |id=, but I'm not quite sure how to do it. Benwing2 (talk) 20:02, 11 July 2023 (UTC)
For editors, it would be nice to have a documented option for displaying {{senseid}}, {{etymid}} and |id=. It would be off by default, and information on how to add its capability to templates and functions would not be restricted to the cognoscenti. (I suggest documenting it at, or linking to it from, the documentation of {{senseid}}.) As I edit with inflection tables displayed, I can't easily check that it is working in simple cases, as |id= doesn't work with collapsed tables expanded, at least not on Firefox. --RichardW57 (talk) 05:48, 12 July 2023 (UTC)
Users can customise their displays. If the option is enabled, {{senseid|cy|frog}} could, at its simplest, display as 'senseid=frog. '. For the display of |id=, an elegant method would be to display the ID in the same way as |tr= and |ts=, perhaps resorting to the visible prefix 'id='. An adequate method for |id=toad would be to resort to adding ' (id=toad)' after the link. I believe all this might use a class in HTML whose layout in the stylesheet depended on the user's settings. Now, it may be that an implementation in Module:links will actually enable the display of ID in all templates, but if not, I would want other implementations (perhaps using square braces and a knowledge of the fragment coding convention) to be able to add conditionally hidden text, with display controlled by the same switch. Adding it should not depend on your being around (and willing) to help --RichardW57 (talk) 06:26, 12 July 2023 (UTC).
But that shows the sense ID with the language name, colon, and underscores. There would need to be an additional attribute (say data-senseid="sense ID without language name and with spaces") generated by the template to get a nice display. The text around the sense ID can be changed in the CSS. — Eru·tuon15:55, 12 July 2023 (UTC)
Read the earlier post again and I am not sure how to display the |id= attribute in links. It's more complicated because there could be a parenthesized set of annotations, no annotations, or no HTML at all ({{ll}}, though that's rarely used). — Eru·tuon16:26, 12 July 2023 (UTC)
I've added the data-senseid="" attribute to the HTML generated by {{senseid}}, so you can display the sense ID with the following CSS in Special:MyPage/common.css:
This solution might be too rigid. For example, one might want to link to vowel harmony rules or the application of vowel modification rules such as umlaut or ablaut which affect the stem. Or have you already taken these ideas on board? --RichardW57 (talk) 06:33, 12 July 2023 (UTC)
This seems like the best way to go about this, to have a template/module just know that if a Finnish term is input as having the suffix -käs it should categorize as -kas, since IMO people are unlikely to stop finding it intuitive to input whatever suffix a term visually has (like -käs), unless perhaps we started adding an explicit indication of the vowel change as a separate step (e.g. foobar + -kas + vowel harmony change of a to ä, which is way wordier than foobar + ]). The main problems I can think of are a) already problems at present, and b) at least as much issues of editor decision-making as of template functionality. Namely: 1) need to still handle cases where there are two suffixes -foo, and one is an alt form but the other is the main or only form (or even just in general, cases where there are two suffixes; look at the state of Category:English terms suffixed with -n where forgiven and Arizonan are lumped together: there is nominally a subcategory forgiven is supposed to be in, but it is nigh-unused), which is a "users don't know / can't be arsed to use more specific links/parameters" problem more than any issue with the templates per se, and 2) deciding what to combine (e.g. how many, if any, of -ian, -an, -n, and -ean — as seen in e.g. Bangkok+ian vs Abu Dhabi+an vs Saudi Arabia+n, Java+n, Ecuador+ean vs Achebe+an vs Althea+n vs ?Achillean, ?Antillean — should the module consider alt/variant forms of the same underlying suffix?). - -sche(discuss)19:44, 12 July 2023 (UTC)
@-sche Thanks. Yeah it's not always clear where to draw the line with alt forms. My general thinking is that changes that are primarily phonologically motivated should count but other sorts of variants shouldn't, e.g. in -an vs. -ian both can occur with the same word (Arizonan or Arizonian) so this wouldn't count, although there are gray areas e.g. Latinate -al vs. -ar where the latter usually occurs with stems containing an l, hence regular vs. general, but there are exceptions like filial, as well as familiar and familial (with different meanings). Benwing2 (talk) 19:58, 12 July 2023 (UTC)
@Surjection, -sche This support is available now. Currently only Finnish mappings are available; see Module:affix/lang-data/fi. Feel free to add more and expand the language support. The mappings aren't applied if there's a separate display form set with |altN= or the <alt:...> inline modifier, or if embedded (including piped) links are found in the affix. Benwing2 (talk) 20:20, 16 July 2023 (UTC)
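A rough sketch of the behaviour described above, in Python rather than the Lua actually used: the variant spelling is kept for display while the link and category use the canonical affix, and the mapping is skipped when a display form or an embedded link is present. The data layout and the function name are assumptions for illustration; they are not taken from Module:affix/lang-data/fi.

```python
# Hypothetical data: variant spelling -> canonical affix (only -käs/-kas is from the discussion).
FI_VARIANTS = {"-käs": "-kas"}


def resolve_affix(affix, alt=None, variants=FI_VARIANTS):
    """Return (link_target, display_text) for one affix argument."""
    # An explicit display form or an embedded link means the editor is in control.
    if alt is not None or "[[" in affix:
        return affix, (alt if alt is not None else affix)
    # Otherwise link and categorise under the canonical form, but show what was typed.
    return variants.get(affix, affix), affix


print(resolve_affix("-käs"))              # ('-kas', '-käs')
print(resolve_affix("-kas", alt="-käs"))  # ('-kas', '-käs')
```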
Normally lemmas are listed in separate groups under their first letter when you look at a category list such as Category:Ket_lemmas. However, in that Ket list all words beginning with к are followed by all beginning with ӄ, without a big capital ӄ header. The two groups seem to be correctly sorted - there are all the к words in correct order then all the ӄ words, and a proper name Кънӄоʼ is correctly sorted ignoring case. It is just missing the big capital letter dividing them.
So I wondered if it was a peculiarity of ӄ, and what other language might use it? I guessed Uzbek. Curiouser and curiouser. In Category:Uzbek_lemmas if you page through to roman Z (sorry, I don't know how to link to that) you then get a list for o' and g' and sh and ch - it knows these digraphs are supposed to be separate letters (and capital Sh etc are correctly ordered within them), but it doesn't put letter headers for them. Then it goes through Cyrillic, gets to Я, and various other letters such as ў and қ then get listed without their own header. (And I learnt that ӄ and қ are different letters.) -- Hiztegilari (talk) 21:37, 11 July 2023 (UTC)
@Hiztegilari The sort key for Ket in Module:languages/data/3/k intentionally sorts ӄ together with but after к. The reason for this is that Unicode ӄ (code 1220 = 0x4c4) doesn't come directly after Unicode к (code 1082 = 0x43a), so by default they won't be grouped together, so this is being done to ensure that they are ordered correctly. The lack of a separate header is a side effect of this; User:Theknightwho can comment more on this but I don't think it's possible given the way the MediaWiki software works to ensure that ӄ comes after к but ends up with its own header. Benwing2 (talk) 23:18, 11 July 2023 (UTC)
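To make the side effect concrete, here is a small Python sketch of a sort key that places ӄ after every к. It assumes the key is built by replacing ӄ with к plus a very high code point; that detail, the function names, and the sample strings (made-up letter sequences, not attested Ket words) are illustrative assumptions, not the contents of Module:languages/data/3/k.

```python
# Assumption: a very high code point is appended so that ӄ sorts after every к-word.
HIGH_SENTINEL = "\U0010FFFF"


def ket_sort_key(term):
    """Fold case, then rewrite ӄ as к + sentinel."""
    return term.lower().replace("ӄ", "к" + HIGH_SENTINEL)


def header_for(term):
    """A category header is taken from the first character of the sort key."""
    return ket_sort_key(term)[0].upper()


words = ["ӄай", "лам", "кят", "ката"]   # made-up strings for illustration
print(sorted(words))                    # plain code-point order puts ӄ after л
print(sorted(words, key=ket_sort_key))  # ӄ now follows the к-words
print([header_for(w) for w in sorted(words, key=ket_sort_key)])
```

Because the rewritten key starts with к, the ӄ-words land under the К header, which is why no separate ӄ header appears.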
@Benwing2 It is (probably) possible, but we would need to get MediaWiki to enable sorting by the Unicode Collation Algorithm by default. There will still be edge-cases where a specific language will need further changes (which would still have the issue with headers), but it would solve a large majority of manual sorting we’re forced to do at the moment.
It’s just a matter of getting $wgCategoryCollation changed in LocalSettings.php for the site, so if we go to the Phabricator they should make the change. I suspect they’ll want to see consensus for it, but I don’t think a vote is necessary unless they ask for it. Theknightwho (talk) 00:01, 12 July 2023 (UTC)
@Theknightwho I think we had a discussion about this before, and there were concerns about what would break or become backward-incompatible if we do this. If you're looking for consensus on a change like this, you should lay out what those concerns are and how to alleviate them. Benwing2 (talk) 00:10, 12 July 2023 (UTC)
@Benwing2 The main concern is that we’d end up with a bunch of sortkeys double-compensating, so the easiest thing to do would be to switch makeSortKey in Module:languages off for a few days while we fix the sortkeys in the data modules up to be compatible with the new default.
There would still be double-compensation issues with manual sorting, but it’s rarely used anyway and the issue isn’t that big of a problem in the first place. Theknightwho (talk) 00:36, 12 July 2023 (UTC)
Lua errors: back to the bad old days
CAT:E has filled up with the usual suspects again. Worse, a exceeds the template include size - so even the non-Lua templates are failing towards the end of the page!
I think we seriously should consider splitting our very longest entries into multiple pages. Not only are we fighting a losing battle against two technical foes (Lua memory and template include size), but the entries are now so long that the user experience is very poor, especially on mobile (try navigating to the Scottish Gaelic entry for a on your phone).
@This, that and the other Before we go down this road I'd like to hear from User:Theknightwho and the status of their pre-parser. They seem to have had some great luck radically reducing both template include size and memory on certain heavy pages, so we might not need to go the radical step of splitting pages by language (at least not yet). Benwing2 (talk) 07:37, 12 July 2023 (UTC)
@This, that and the other That’s correct - it’s not quite fast enough yet (as it times out on a about 75% of the way through), but on the part it did load it used about 25MB even after converting all the lite templates to normal ones. On teacher, memory use goes down from 49MB to about 13.5MB.
There are still a bunch of edge-cases that need to be worked out, and I’m currently refactoring it to try to increase the speed, since the current design centres around maximising memory savings and seems to be overkill. Theknightwho (talk) 12:28, 12 July 2023 (UTC)
Just to add, by the way: there's a third limit a is very close to as well (the 10 second load time). The lite templates are a major contribution to that, as they're much more intensive for the parser (even though they don't call into Lua). Plus, certain things cause multipliers to be applied to text when calculating the post-expand include size limit - notably, parser functions - and these can compound. That means the lite templates are responsible for a butting up against that limit as well. If we can solve the Lua memory problems with the parser I'm designing, it should also help with the other two limits as well since it means we could ditch the lite templates. Theknightwho (talk) 12:45, 12 July 2023 (UTC)
@This, that and the other Can you expand on the usability issues for mobile? Is there a way to mitigate them other than splitting the page? I don't normally use the mobile site but can you not just search for Scottish Gaelic, or use the table of contents? Benwing2 (talk) 18:26, 12 July 2023 (UTC)
Mobile doesn't have a TOC. But yes, you could search the page for the heading, this is a good point. It actually works quite well on my phone from a performance standpoint! However, you do have to move through a couple of results (a descendant) before you get to the right place. I would still prefer a world where a "find" wasn't necessary, but perhaps there isn't as much to worry about as I thought. This, that and the other (talk) 02:03, 13 July 2023 (UTC)
I did a bit of testing, and it seems the post-expand inclusion limit can be avoided by replacing {{q-lite}} (and {{sense-lite}}, which is based on {{q-lite}}) with an even-more-lite template that only supports one argument; this only allows a few more sections though, up to Etymology 2 of a#Yola, and of course the Lua error messages are still there for the non-lite templates. I reckon that someone with more knowledge of how the lite templates work could probably figure out a way to bring down both the memory usage and the post-expand inclusion size (for example, it might be beneficial to add |langname= where needed in order to avoid loading {{langname-lite}}, which is stupidly large, or to remove the rarely used entries from {{langname-lite}}).
Obviously, this would be a futile effort that requires a lot of effort for very little gain, and would soon require more work as the page grows longer. Ideally we would want to move away from the current practices that are inefficient and bring an end to the lite-pocalypse - TKW's parser seems like a promising start on such work to me. – Wpi (talk) 19:29, 12 July 2023 (UTC)
Tried to add a citation to the word "singlet", got flagged as harmful
I decided to look up some old usages of the word singlet (sense 1.3 "a person who does not have a form of multiplicity"), found a Usenet citation from 1999, and tried to add it to the citations page with the appropriate template. However, my edit was flagged as harmful under "abuse 23", whatever that was. (I don't see any way to check what exactly it was.)
I've been told that if I wanted to dispute the decision I should go to this (sub)forum, so here I am. I am fairly sure that adding a citation is a constructive action, and I'm surprised that it got flagged as harmful. What exactly did I do wrong, and what could I do to fix it? 2.52.7.195 15:14, 12 July 2023 (UTC)
I see it now: it was because the citation template introduced an external link, and since my IP did not have previous edits, that flagged me as a possible spammer. Still not sure on the "what I could do to fix it" side, though. 2.52.7.195 15:51, 12 July 2023 (UTC)
You should be able to add it now. As it's a non-secret filter, I can tell you that it merely blocks users and IPs with one edit or no edits at all. Theknightwho (talk) 16:30, 12 July 2023 (UTC)
I tried and it still didn't work. I've saved the edit text on my laptop for now. Maybe I should have changed the text somehow? Not today, though. 2.52.7.195 19:15, 12 July 2023 (UTC)
It's triggering a global MediaWiki filter (metawiki:Special:AbuseFilter/214) that's not visible even to admins like me. I don't know what the purpose of the filter is, and I've copied your edit out of the abuse filter log and done it myself. Apparently the filter can be bypassed by admins. — Eru·tuon21:09, 12 July 2023 (UTC)
I think the reason @Theknightwho got confused is that the 23 in the "abuse 23" message isn't the number of the filter (that's 214 in MediaWiki's global abuse filter list), but a reference that only those who work with those filters know the meaning of. Our abuse filter 23 is designed to stop spambots that create bogus new user pages, so it doesn't apply to mainspace edits at all. Chuck Entz (talk) 03:49, 13 July 2023 (UTC)
Interesting, and I just noticed that having no prior edits didn't stop a different IP from adding someone's LinkedIn link to cumslut, which means global filter 214 can't just be stopping new users from adding links (which would've been my first guess and seems like a smart enough thing to do!); it must be something other than the mere addition of a link ... perhaps some long-term vandal spams links to that specific newsgroup, or perhaps the filter doesn't like the mention of dissociation or doctors plus a link (as if maybe you might be linking to quack cures), or perhaps the very gibberishy / spam-link-esque URL 5L4WQOpUWU/m/gyzds5osXwIJ is what's triggering it. Who knows. - -sche(discuss)05:09, 13 July 2023 (UTC)
@Chuck Entz So in the abuse log the IP above triggered filter 26 (which is for stopping new users adding external links), but I see right above it that the global filter you mention was also triggered. Theknightwho (talk) 05:21, 13 July 2023 (UTC)
@Einstein2 This is because of recent changes I made to the form-of templates to support multiple comma-separated lemmas. It expects actual embedded commas to be followed by a space, which isn't the case here. I can hack around this but ideally such templates should be formatted like this: {{initialism of|en|tense,mood,aspect}} instead of the hacky way of using the serial comma template. Benwing2 (talk) 00:11, 13 July 2023 (UTC)
Actually not sure about that; the comma-separated lemmas are intended to express the case where a given term is simultaneously the initialism (or whatever) of multiple lemmas (e.g. multiple lemmas each of which spells out TMA), which isn't the use case here. Benwing2 (talk) 00:14, 13 July 2023 (UTC)
If I understand your second comment correctly, it's neat to know that functionality exists. Reading your first comment, I had been about to say that expecting people to type {{initialism of|en|tense,mood,aspect}} (without spaces) whenever they want "Initialism of tense, mood, aspect" (with spaces) would be very unintuitive, but now I take it you mean the CSV approach is for the relatively few cases like WOC where it's equally the acronym of the singular woman of color and the plural women of color (and not that "Initialism of tense, mood, aspect or trimethoxyamphetamine or transmisogyny-affected" should be lumped together). - -sche (discuss) 07:22, 13 July 2023 (UTC)
@-sche Yes, that's basically correct. I added the ability to have multiple lemmas in form-of templates mostly for {{infl of}}, so that e.g. Czech jimi can be written {{infl of|oni,ony,ona||ins|p}} (these are respectively, the masculine animate plural, feminine + masculine inanimate plural, and neuter plural third-person pronouns in Czech) rather than having to list the three lemmas separately. It ends up applying to all form-of lemmas, but it's more useful for some than others. I'm not suggesting it makes sense or is required to group multiple unrelated abbreviations together just because the functionality is there, but it may be useful for related terms, as you suggest. Benwing2 (talk) 18:41, 13 July 2023 (UTC)
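The splitting rule mentioned in this thread (a comma separates lemmas unless it is followed by a space) can be sketched in Python as below. The function name and regular expression are illustrative assumptions; the real logic lives in the Lua form-of modules and may differ in detail.

```python
import re


def split_lemmas(arg):
    """Split on commas that are not followed by whitespace."""
    return re.split(r",(?!\s)", arg)


print(split_lemmas("oni,ony,ona"))          # three separate lemmas
print(split_lemmas("tense, mood, aspect"))  # one lemma containing literal commas
print(split_lemmas("tense,mood,aspect"))    # split into three, as in the case reported in this thread
```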
A lot of redirected categories are defined with {{category redirect}} or {{movecat}}. Why weren't these categories just deleted? They generally consist of cases where a language has been renamed, or are misspellings (e.g. 'adjetives' in place of 'adjectives', missing 'the' in place name categories, etc.). Sometimes the redirects themselves are broken. I'm thinking we should just delete all such categories. Thoughts? Benwing2 (talk) 02:10, 13 July 2023 (UTC)
@Benwing2 It's because pages added to redirected categories don't get added to the main category, so this makes it easier to check. It comes from Wikipedia, and is mostly pointless for us as we tend not to add pages to raw categories. Theknightwho (talk) 05:06, 13 July 2023 (UTC)
Some of these could be useful, especially in cases where automated tools might continue to generate new inclusions in these categories. It's also worth keeping in mind that third-party sites might link to English Wiktionary category pages, and it would be rude to break their links altogether. I'm inclined to delete the misspelled and other obvious junk categories and see what's left behind. This, that and the other (talk) 09:37, 21 July 2023 (UTC)
That's true for pages in our main namespace, but not for pages in, say, the Wiktionary: namespace. A separate message for the two different scenarios should be provided by Wikidata developers. This, that and the other (talk) 11:17, 13 July 2023 (UTC)
@Wpstatus I've fixed this. The issue was caused by the fact that if Wikipedia is merely set to true, the main module prioritises what's in the display field over the label name. It was easy to fix by simply setting Wikipedia to "Multicultural London English" instead.
I have to admit that I didn't find this very intuitive either, and had to check the main module logic to see what was going on. Theknightwho (talk) 01:25, 15 July 2023 (UTC)
@Theknightwho After digging a bit deeper into this, I think this is actually a broader problem with how aliases are handled.
The label variable declared at Module:labels#L-61 is just the second template argument. This variable is then used at Module:labels#L-137, which sets the Wikipedia_entry variable to the value of label.
This means that the value of Wikipedia_entry is determined directly by the second argument to the template. This creates a problem when an alias is used.
For example:
When you do {{label|en|Multicultural Toronto English}}, you get the expected result: (MTE)
In contrast, when you use the alias like {{label|en|MTE}}, you get this: (MTE). Notice that it links to w:MTE. This is because the logic which I described above sets Wikipedia_entry to "MTE"
I think the correct fix is to update the finalize_data function (Module:labels#L-251) - this is the function that sets up all the aliases. When the aliases are being finalized, the information about the original label name should be included in the entry.
Actually, after looking through the code a bit more, I think that's exactly what the alias_of field is for (Module:labels#L-92), except that it's not set properly. The fix might be as simple as just setting alias_of correctly within finalize_data. Wpstatus (talk) 02:44, 15 July 2023 (UTC)
@Wpstatus I've changed this so that alias_of gets automatically added if Wikipedia or glossary are set to true for a given label. alias_of is a bit of a misnomer, because finalize_data actually just makes all the relevant keys point to the exact same table (so they're all aliases of each other), which means it's not possible to know what the main key is supposed to be without doing it like this. Theknightwho (talk) 03:12, 15 July 2023 (UTC)
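The point about shared tables can be illustrated with a small Python sketch. Only the names finalize_data, alias_of, Wikipedia, glossary and display come from the discussion; the "aliases" field, the data layout, and the wikipedia_target helper are assumptions made for the example, not the actual Module:labels code.

```python
def finalize_data(labels):
    """Make every alias key point at the same dict as its main label."""
    finalized = {}
    for name, entry in labels.items():
        entry = dict(entry)                 # copy so the input is left untouched
        aliases = entry.pop("aliases", [])  # "aliases" is a hypothetical field name
        # Record the main key only where it is needed to resolve a bare True.
        if entry.get("Wikipedia") is True or entry.get("glossary") is True:
            entry["alias_of"] = name
        finalized[name] = entry
        for alias in aliases:
            finalized[alias] = entry        # same object, not a copy
    return finalized


def wikipedia_target(finalized, label):
    """Hypothetical lookup: which Wikipedia article should this label link to?"""
    entry = finalized[label]
    wp = entry.get("Wikipedia")
    if wp is True:
        return entry.get("alias_of", label)
    return wp or None


data = {"Multicultural Toronto English":
        {"aliases": ["MTE"], "Wikipedia": True, "display": "MTE"}}
final = finalize_data(data)
print(final["MTE"] is final["Multicultural Toronto English"])  # True
print(wikipedia_target(final, "MTE"))
```

Since both keys refer to one object, nothing in the object itself says which key was the main one unless a field such as alias_of records it.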
A reference can only be narrowed down to the part of speech, it seems (e.g. example#Noun). But there can sometimes be dozens of senses listed under that. How to link to one of those by number? Elfth (talk) 10:57, 15 July 2023 (UTC)
@Elfth: Yes, but it needs to be set up on both sides of the link. See {{senseid}}. So at example#Noun you could (e.g.) add {{senseid|en|representative}} in the top definition, typically after the # at the start of the line in the code, and then link to it with {{l|en|example|id=representative}}. —Al-Muqanna المقنع (talk) 11:12, 15 July 2023 (UTC)
You should prefer giving glosses instead if possible. Having links to specific definitions from definitions or even most etymologies is not usually ideal. — SURJECTION/ T / C / L /11:57, 15 July 2023 (UTC)
That's a fair point, lexicographically speaking, but Al-Muqanna's solution is not feasible for the end user who needs to cite Wiktionary via a simple URL. It would also be nice to have a unilateral way of linking to senses and not clutter the source. A graphical solution (like clicking on the sense's number to get its URL) wouldn't be perfect, but would be very convenient for the end user. Elfth (talk) 12:20, 15 July 2023 (UTC)
I think there is a feature that allows full links to highlight a particular fragment from a page, but it is a very obscure feature that few people even know exist (if it does at all, I might just be misremembering things). — SURJECTION/ T / C / L /13:50, 15 July 2023 (UTC)
I'm aware of that feature, and it works fine for me, but there's a trivial limitation. It highlights only the first instance of the text appended to the link (example#:~:text=Something that serves won't highlight the third sense), so the link has to be made longer for instances further down in the webpage. This works but isn't neat, especially if there are non-ASCII characters in the link, which get percent-encoded into an illegible mess. Elfth (talk) 14:49, 15 July 2023 (UTC)
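For anyone wanting to generate such links, here is a minimal Python sketch of building a text-fragment URL. The function name is made up for the example; the percent-encoding uses the standard library's urllib.parse.quote, with the hyphen escaped separately because it has a special meaning inside a text fragment.

```python
from urllib.parse import quote


def text_fragment_url(page_url, snippet):
    """Append a #:~:text= fragment that asks the browser to highlight `snippet`."""
    # quote() leaves "-" alone, so escape it by hand.
    encoded = quote(snippet, safe="").replace("-", "%2D")
    return f"{page_url}#:~:text={encoded}"


print(text_fragment_url("https://en.wiktionary.org/wiki/example",
                        "Something that serves"))
```

Non-ASCII text is encoded byte by byte (for instance "ä" becomes "%C3%A4"), which is what makes such links hard to read.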
@Elfth: I agree it's opaque, though other than the highlighting URL hack I'm not sure it's practicable to do it automatically with the existing software—as it stands, senses are just entries in a numbered list, and even their numbering will change as they get added, removed, and reshuffled. To generate reliable links automatically we'd really need to be treating every sense as an object with a fixed id, which implies a more advanced data structure than we've got right now. For now {{senseid}} is manual but has the advantage that the id is unlikely to ever change once it's set. —Al-Muqanna المقنع (talk) 20:00, 15 July 2023 (UTC)
Module:Quotations:425: bad argument #1 to 'concat' (table expected, got string)
^ That is what appears for: * {{Q|grc|Arist.|Pol.|1300|b|19|thru=33|refn=<sup>]</sup>|quote=ἔστι δὲ τὸν ἀριθμὸν ὀκτώ, ἓν μὲν εὐθυντικόν, ἄλλο δὲ εἴ τίς τι τῶν κοινῶν ἀδικεῖ, ἕτερον ὅσα εἰς τὴν πολιτείαν φέρει, τέταρτον καὶ ἄρχουσι καὶ ἰδιώταις ὅσα περὶ ζημιώσεων ἀμφισβητοῦσιν, πέμπτον τὸ περὶ τῶν ἰδίων συναλλαγμάτων καὶ ἐχόντων μέγεθος, καὶ παρὰ ταῦτα τό τε φονικὸν καὶ τὸ ξενικόν (φονικοῦ μὲν οὖν εἴδη, ἄν τ’ ἐν τοῖς αὐτοῖς δικασταῖς ἄν τ’ ἐν ἄλλοις, περί τε τῶν ἐκ προνοίας καὶ περὶ τῶν ἀκουσίων, καὶ ὅσα ὁμολογεῖται μέν, ἀμφισβητεῖται δὲ περὶ τοῦ δικαίου, τέταρτον δὲ ὅσα τοῖς φεύγουσι φόνου ἐπὶ καθόδῳ ἐπιφέρεται, οἷον Ἀθήνησι λέγεται καὶ τὸ ἐν '''Φρεαττοῖ''' δικαστήριον· συμβαίνει δὲ τὰ τοιαῦτα ἐν τῷ παντὶ χρόνῳ ὀλίγα καὶ ἐν ταῖς μεγάλαις πόλεσιν· τοῦ δὲ ξενικοῦ ἓν μὲν ξένοις πρὸς ξένους, ἄλλο δὲ ξένοις πρὸς ἀστούς), ἔτι δὲ παρὰ πάντα ταῦτα περὶ τῶν μικρῶν συναλλαγμάτων, ὅσα δραχμιαῖα καὶ πεντάδραχμα καὶ μικρῷ πλείονος.}} on the preview screen when I edit Φρεαττώ, whereas the published page just shows a bare bullet. Note that the other two transclusions of T:Q in the entry function properly. Line 425 of Module:Quotations reads: return table.concat(values, separator) but I'm otherwise at a loss to understand what the problem is. Could someone more knowledgable than me explain/correct this problem, please? 0D foam (talk) 22:59, 15 July 2023 (UTC)
@0D foam: it's a bit more complicated: @JoeyChen also made massive changes to Module:Quotations/grc/data at the same time. Previewing Φρεαττώ(Phreattṓ) from this version works fine, but the next edit broke everything and the following edit fixed most of the errors- but left it with this error. I don't know Lua well enough to easily spot the bug in the code, but I'm pretty confident that this is when and where it was introduced. Chuck Entz (talk) 06:19, 16 July 2023 (UTC)
@Chuck Entz: Thanks for your response. I tried fiddling around with it, but nothing I tried fixed Φρεαττώ on preview. Hopefully, @JoeyChen will be able to find and fix the problem. 0D foam (talk) 12:04, 16 July 2023 (UTC)
The 'separ' function introduced by JoeyChen as a wrapper for table.concat at Module:Quotations is only used (anywhere) by the Aristotle handling and is the source of the error. Each of the rlFormats for Aristotle calls .separ along the lines of {'.separ', {'.ref1', '.ref2', {'.digits', 2, '.ref3'}}, '.'}. Not a Lua expert but from the Module:Quotations docs the issue might be that separ isn't actually being given a table as a parameter since in rlFormats tables not beginning with a function are interpreted as nested variable addresses? Should probably revert the changes to Aristotle until a fix is found anyway. —Al-Muqanna المقنع (talk) 12:47, 16 July 2023 (UTC)
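The failure mode suggested above can be reproduced outside MediaWiki. This is a minimal sketch in Python rather than the module's Lua; `table_concat` and `separ` here are hypothetical stand-ins, not the actual code of Module:Quotations, and the reference string is made up for illustration. It only assumes that a strict concatenation function receives a single string where a list was expected.

```python
def table_concat(values, separator=""):
    """Rough analogue of Lua's table.concat: insists on a list argument."""
    if not isinstance(values, list):
        raise TypeError(
            "bad argument #1 to 'concat' (table expected, got "
            + type(values).__name__ + ")"
        )
    return separator.join(str(v) for v in values)


def separ(values, separator):
    """Hypothetical defensive wrapper: wrap a lone value in a list first."""
    if not isinstance(values, list):
        values = [values]
    return table_concat(values, separator)


# A nested address that was already resolved to one string
# triggers the error in the strict version...
try:
    table_concat("1300b19", ".")
except TypeError as err:
    print(err)

# ...while the defensive wrapper accepts both shapes.
print(separ(["1300", "b", "19"], "."))
print(separ("1300b19", "."))
```

Whether coercing the argument or fixing the caller is the better repair depends on how rlFormats is meant to resolve nested tables, which this sketch does not model.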
I'm not familiar with Module:Quotations, but it seems massively over-complicated for what it's doing. Table concatenation is a really basic function, so why does it need a whole new method in the main module? Theknightwho (talk) 14:07, 16 July 2023 (UTC)
@J3133 Fixed. I introduced |nosuffix= to disable this check, but also added a check for things beginning with more than one hyphen, which aren't suffixes in any case. Benwing2 (talk) 20:09, 16 July 2023 (UTC)
Best way to express vowel quality that’s not normally shown in regular writing
In Norwegian, we have two sets of vowels, “open” and “narrow”. The latter stem from Old Norse long vowels (í, é, ú etc.), and sound different from the “open” vowels (i, e, u etc.), so a word like ON hof became , while hófr became . However, while the vowels are clearly distinguished in normalised Old Norse writing, Norwegian generally doesn’t do this, and the pronunciation may be quite ambiguous if you don’t know the spoken form. Because of this, Norwegian dictionaries tend to add “ò” or “ó” in brackets after the word to show which vowel quality it has. This is essential to know how to say a word you don’t know.
A solution to this is to add the pronunciation in a new section, which I generally do anyway, but this needs proficiency in IPA, which not all editors have. Then there’s the fact that spellings like hòv and hóv aren’t actually prohibited. It’s an optional feature in spelling as well as just a way to show the pronunciation in dictionaries, so I think the ideal way to express it would be to show this optional accent, as opposed to only the IPA transcription. How could this best be done?
My ideas would be to either add the accent to the head form itself, or in brackets to the right:
@Eiliv Support for adding extra accents to headwords themselves is already available and used for many languages (e.g. macrons in Latin and Old English to indicate length, tone marks in Latvian, stress+pitch accent marks in Lithuanian, stress marks in East Slavic languages, stress+tone marks in Slovenian and Serbo-Croatian, etc.). So we could definitely add this to Norwegian. It would just be a change to the Nynorsk and Bokmal entries in Module:languages/data/2 to tell the link-processing code to strip off acute and grave accents. Benwing2 (talk) 20:14, 16 July 2023 (UTC)
See Serbo-Croatian example at slovo. So it is still searchable as slovo, while you can see the accent marks at the entry. (That is, the word can be looked up by typing it in the search field without diacritics, while the diacritics still appear in the entry itself. This is done, for example, with Serbo-Croatian slovo and Russian слово (slovo).) Tollef Salemann (talk) 21:35, 16 July 2023 (UTC)
@Eiliv I went ahead and added stripping of acute and grave accents from Nynorsk links. A side effect of this is that headwords with acutes and graves in them can't be so easily linked to, so hopefully there aren't any or they can be moved. I would add circumflex as well per Norwegian orthography but it seems we have some existing Nynorsk entries with circumflexes, e.g. vêr. Benwing2 (talk) 22:05, 16 July 2023 (UTC)
Let me know if this causes any issues. The idea is that Nynorsk headwords would use the |head= parameter or similar to specify the version with accents in it. Benwing2 (talk) 22:06, 16 July 2023 (UTC)
This seems good, thank you! For now, I think the circumflex should stay untouched as it’s frequently used in regular writing. Eiliv / ᛅᛁᛚᛁᚠᛦ (talk) 22:40, 16 July 2023 (UTC)
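For readers unfamiliar with how link targets get normalised, the accent stripping discussed in this thread can be sketched as follows. This is Python for illustration only, not the Lua configuration in Module:languages/data/2, and the function name is an assumption. It removes acute and grave accents and leaves the circumflex (and the ring in å) untouched.

```python
import unicodedata

ACUTE = "\u0301"  # combining acute accent
GRAVE = "\u0300"  # combining grave accent


def strip_acute_grave(text):
    """Remove acute and grave accents only, leaving other diacritics alone."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in (ACUTE, GRAVE))
    return unicodedata.normalize("NFC", kept)


print(strip_acute_grave("hóv"))  # link target without the acute
print(strip_acute_grave("hòv"))  # link target without the grave
print(strip_acute_grave("vêr"))  # circumflex is kept
```

Decomposing first and recomposing afterwards means only the two listed marks are dropped, so entries such as vêr keep their page titles.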
As there has been some recent disbelief as to the existence of some characters in communication, I have started using words recorded on Wiktionary as evidence of such use. For the quotations, I was using mentions with {{m+}}. As part of the emboldening, I want to embolden both the minimal rendering portion of the word containing the character and the minimal representation in its transliteration. Now, I had worked out a series of methods of doing that:
Embolden as small a region of the character in the word in script as I can and still get a reasonable display.
If the marked up word won't transliterate properly, supply an appropriately emboldened manual transliteration.
If too much of the automatic transliteration is emboldened, then supply a manual transliteration.
However, I hit a problem with a choice of Burmese word to illustrate the Burmese use of ရ with the intrinsic vowel overridden. I chose ရာဇာ(raja, “king”), for which I want an emboldened transliteration 'raja'. At step 1, I ended up with {{m|my|ရာဇာ}}, which yields ရာဇာ(raja). Step 2 isn't applicable. So for Step 3, I try to override the transliteration,
{{m+|my|ရာဇာ|t=king|tr=raja}}, but the override is ignored. What should I do?
though I know I need to replace the double quotes with the appropriate template invocations to honour user preferences. I can't induce {{quote}} to produce a display on one line with a link to the word being quoted, even on a desktop. --RichardW57 (talk) 15:21, 16 July 2023 (UTC)
@RichardW57 This is a Grease Pit discussion and doesn't belong here (I for one don't regularly follow the Information Desk, but I do check the Grease Pit often). IMO using a manual translit isn't the right approach, and the reason it's ignored is that evidently Burmese is one of the languages where this is being done intentionally (there's a setting that controls this). The reason this happens in the first place, as you probably know, is that single quote chars are evidently passed unchanged through the translit process. The correct approach is to modify the Burmese transliteration module so that you can use a special character of some sort to indicate that you want only the consonant, not the consonant + vowel, to be boldfaced. E.g. maybe you can use a % sign before the consonant to indicate this. Benwing2 (talk) 01:22, 17 July 2023 (UTC)
@Benwing2: So, I feed in a sequence that looks like '''ရာ'''ဇာ but has an invisible character after the final triple apostrophe that forces the final triple apostrophe one character to the left when passed through the transliteration module. Oh, and I also need the sequence of triple apostrophe and invisible character not to cause the transliteration to be done in parts. That feels horribly hacky and vulnerable to later breakage.
I suppose I should try one of INVISIBLE TIMES/SEPARATOR/PLUS. I hope no-one is feeling possessive about the transliterator. --RichardW57 (talk) 02:35, 17 July 2023 (UTC)
@RichardW57 That's not what I said. I suggested something like {{m|my|%ရာဇာ|t=king}} where the % sign causes the transliterator to boldface the following single Latin char that emerges. This is a bit hacky but IMO it's better than totally faking it the way you do above. Also if you could, please move this discussion to the Grease Pit so others can contribute. Benwing2 (talk) 02:58, 17 July 2023 (UTC)
@Benwing2: I'm now confused. {{m|my|%ရာဇာ|t=king}} produces %ရာဇာ(%raja, “king”) with direct display of the '%' in the Burmese script, which we do not want, and no emboldening of any portion of the Burmese script word. (Incidentally, we should be thinking about something like {{m|my|ရာဇာ|%ရာဇာ|t=king}} so that it links to the Burmese word.) Were you thinking of having an additional step ('exercise for the reader' to use Module:string) to strip the '%' from the display? So I'd need a template so as to avoid invoking a module directly from a mainspace page! --RichardW57 (talk) 03:36, 17 July 2023 (UTC)
By trying to work out how @Benwing2's solution could work even with changes to the transliteration module, I found the basis of a moderately readable solution.
and even links to the right place. This solution can even handle what I see as discontiguous transliterations - I see stacking as usually stripping the vowel from the subscripted consonant, not the one that remains unmoved. It also applies to rules such as the initial consonant determining the tone or register, though we then hit the Latin script problem that accents can't be emboldened in isolation.
The only problem is that I need to create a template to wrap the module invocation - the tooling for Module:string doesn't provide wrappers for its functions to be invoked directly from articles - unless the rules have been relaxed since I started toiling here. --RichardW57m (talk) 08:22, 17 July 2023 (UTC)
@RichardW57 As I said in my first post, you need to modify the Burmese transliteration module to support this. It looks like the code is in Module:my-pron. I don't know how difficult this is but it shouldn't be so hard; essentially, pass the % sign unchanged through the transliteration process, then as a postprocessing step, convert sequences of % followed by a Latin character to a boldfaced Latin char. Benwing2 (talk) 03:46, 17 July 2023 (UTC)
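The postprocessing step described here could look roughly like this. It is a sketch in Python rather than the Lua of Module:my-pron; the function name is hypothetical, and it assumes the % sign has survived transliteration unchanged and is directly followed by the Latin letter to be boldfaced.

```python
import re


def boldface_marked(translit):
    """Turn '%' + one Latin letter into that letter in wiki bold markup."""
    return re.sub(r"%([A-Za-z])", r"'''\1'''", translit)


print(boldface_marked("%raja"))
```

The character class covers only unaccented ASCII letters, so a transliteration that emits letters with diacritics would need a wider pattern.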
@Benwing2 I'd prefer we not use the percent character for this because it's already used by the Japanese templates for rubytext, meaning that we probably want to reserve % in the main link templates for that purpose to enable the smooth integration of {{ja-r}} down the line.
@Theknightwho Sure, any char would work, or we could just "fake it" like RichardW57 suggests. I don't know if as an alternative it makes sense to allow a way of overriding the override_translit flag by including some signal in the manual translit. Benwing2 (talk) 04:24, 17 July 2023 (UTC)
@Benwing2 Hmm - I'd be keen to avoid an override flag if possible, to avoid it being abused. One thing that has come up before is whether we should have a way to send optional flags to the transliteration module, which would avoid having to specify a whole manual transliteration but would allow for regular variations (like with е(je) in Russian). Perhaps we could have a standardised way of inputting these that can be interpreted by the language as necessary? (e.g. {{m|my|ရ<!>ာဇာ|t=king}}, where ! is some flag that the language transliteration module knows how to interpret. In Russian, it could be something like {{m|ru|фэ́нте<э>зи}} to give фэ́нтези(fɛ́ntɛzi). Possibly even something like {{m|ru|фэ́нт<э>зи}} would be workable, where <э> acts as a stand-in. Theknightwho (talk) 04:45, 17 July 2023 (UTC)
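One way to read the first Russian example above is that a character followed by an angle-bracketed stand-in is displayed as written but transliterated as the stand-in. The sketch below shows that reading in Python; the syntax, the function name and the splitting strategy are all assumptions made for illustration, not an existing feature of the link templates, and it does not cover the flag-only form such as <!>.

```python
import re

# One base character followed by an angle-bracketed stand-in, e.g. "е<э>".
OVERRIDE = re.compile(r"(.)<([^<>]+)>")


def split_overrides(text):
    """Return (display_text, text_to_transliterate) for inline overrides."""
    display = OVERRIDE.sub(lambda m: m.group(1), text)
    translit_source = OVERRIDE.sub(lambda m: m.group(2), text)
    return display, translit_source


display, source = split_overrides("фэ́нте<э>зи")
print(display)  # what the reader sees
print(source)   # what would be handed to the transliteration module
```

Keeping the override inline avoids retyping the whole transliteration, which is the point made in the post.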
@Theknightwho Yup, I've long wanted this. It's especially useful in Arabic with the tā' marbūṭa in multiword expressions, which needs to be transliterated as either nothing or as t depending on the syntax. Since the transliteration module can't reasonably work out the syntax of the expression, it renders it as (t), which is less than ideal and leads to the need for a whole lot of (often long) manual transliterations. Benwing2 (talk) 04:55, 17 July 2023 (UTC)
@Theknightwho: The particular application is likely to come up around thirty times in the next month or so, and then go back to rare. The particular application is highlighting the correct bit of Burmese text when it does not correspond to akshara boundaries.
A more related problem is that of highlighting whole words in quotations when the word boundaries occur inside aksharas. That happens for me with Pali about twice a quotation. In the Indic script, formatting boundaries have to occur at akshara boundaries. (Sometimes, as in the Thai script, breaks can occur between spacing mark and consonant.) As I am using quotations fairly intensely, selecting most words, it is quite regular. The workaround is to use a manual transliteration for the quotation - its generation can be automated. The problem with transferring this trick is that Burmese does not allow manual overrides for transliteration. --RichardW57m (talk) 08:02, 17 July 2023 (UTC)
@RichardW57 There's a display_text field on languages that lets you apply substitutions to the raw text prior to display. User:Theknightwho can comment on whether the raw text or display text gets sent for transliteration but if it's the raw text, all you need to do is add a substitution in the Burmese display_text field to boldface the Burmese character after $ or whatever special char you choose. Benwing2 (talk) 09:22, 17 July 2023 (UTC)
Actually, it's boldfacing the akshara after the control character, or else one will get a dotted circle, but it's doable once I get the editing privilege. For my purposes, it's a lot easier to apply substitutions at need - I can be as flexible as I need if the transliterations are now stable - cheap and cheerful! --RichardW57m (talk) 16:11, 17 July 2023 (UTC)
@RichardW57 I see you found another solution that doesn't require changing the language definition. The restriction on not directly invoking modules is for the mainspace; if you're only using this in discussion forums, I don't think it matters. Benwing2 (talk) 09:25, 17 July 2023 (UTC)
For any of you who regularly work with substrates (e.g. the pre-Greek substrate, the Balkan substrate, the BMAC substrate), I renamed the substrate codes to begin with sub- instead of qfa-sub- (the exceptional qfa-pyg for the Pygmy substrate is now sub-pyg). This was done to shorten the names, particularly so that I could eliminate the nonstandard pregrc alias for the pre-Greek substrate in favor of sub-grc, which is only one character longer; the old code qfa-sub-grc was 5 chars longer, which made it significantly more annoying to type and could have justified keeping the alias. Benwing2 (talk) 06:35, 17 July 2023 (UTC)
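Anyone updating old uses by script may find the renaming rule easier to see as code. This is a sketch in Python with a hypothetical function name; it covers only the cases named in the post (the qfa-sub- prefix, qfa-pyg, and the pregrc alias) and passes everything else through unchanged.

```python
def new_substrate_code(old_code):
    """Map an old substrate code to its renamed 'sub-' form."""
    # Exceptional codes that do not follow the qfa-sub- pattern.
    special = {"qfa-pyg": "sub-pyg", "pregrc": "sub-grc"}
    if old_code in special:
        return special[old_code]
    prefix = "qfa-sub-"
    if old_code.startswith(prefix):
        return "sub-" + old_code[len(prefix):]
    # Not a substrate code: leave it as it is.
    return old_code


print(new_substrate_code("qfa-sub-grc"))
print(new_substrate_code("qfa-pyg"))
print(new_substrate_code("pregrc"))
```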
Thanks for this. It would be good to take this a bit further by eliminating any other 10 & 11-character codes wherever possible (xx-xxx-xxx and xxx-xxx-xxx), as they're really unwieldy. Theknightwho (talk) 08:12, 17 July 2023 (UTC)
This probably makes some browsers / screenreaders / etc interpret them as varieties of Suku (ISO code sub), but I suppose it doesn't matter from anything but a technical-correctness standpoint, since fonts should be set by our own CSS that uses our codes, we don't normally tag text with any of these codes AFAIK (the way {{der|en|de|foo}} tags text as de) anyway, and I can't imagine anything that interprets these codes as signalling varieties of Suku will handle them any worse than if it interpreted them as private-use-range (q..) codes. However, I am not sold on the idea of updating all the xxx-xxx-xxxs, because if we start having a lot of cases where foo-bar is sometimes a variety of foo (like la-vul is a variety of la, etc) and sometimes completely unrelated but we just shortened our exception code to the valid ISO code foo + hyphen + three more letters even though it's not the language ISO code foo corresponds to, that strikes me as bad. I think most of our xxx-xxx-xxx are also proto-languages where it helps that the codes are just "family code" + "-pro", and breaking that system also strikes me as bad. I suppose we could try to rename all the xxx-xxx family codes to things in the limited ISO private-use q.. range, and then rename their corresponding protolanguage codes to qXX-pro, but it seems like that would reduce the intelligibility of the codes, since we have a lot of family codes. So I think that idea requires more thought. - -sche(discuss)00:45, 18 July 2023 (UTC)
@-sche Hmmm, I didn't think about the conflict with language code sub at the time. If this is an issue, we could always rename the substrate codes to something that won't clash; there weren't too many uses and I'm tracking the new substrate code uses. One possibility is something like qsub-foo; this is guaranteed not to clash with any ISO codes since they're all 3 letters. I agree that renaming all of the 10/11-char codes needs more thought. Benwing2 (talk) 00:49, 18 July 2023 (UTC)
Ah, yes, renaming them to something that's outside the ISO schema entirely could work. I guess we could rename all the long codes TKW mentions, and their family codes, to non-ISO-style codes, too, as a possibility ... but if we rename things to (codes that start with) four-letter codes, we should probably be careful that they don't look like / conflict with four-letter ISO 15924 script codes (for which the reserved range IIRC is Qaaa–Qabx, again quite small, not allowing for many intelligible custom codes); in theory the ISO could assign Qsub as the script code of some script and then qsub-foo might be interpreted as the foo-language version of that script(?). I suppose we could do something like, assign the substrates and our xxx-xxx-xxx-sized codes five letter codes (and just make sure they don't conflict with our five-letter script codes like Polyt)? Or six letter codes? IDK; it's an idea to ponder. - -sche(discuss)01:10, 18 July 2023 (UTC)
@-sche, Theknightwho Do either of you know if script codes directly correlate with CSS classes? Looking through the module code for 'polytonic', I see various places that reference a CSS class polytonic. Does this need to change to Polyt as well? Benwing2 (talk) 05:02, 18 July 2023 (UTC)
If I understand your question correctly, then yes. If you're talking about something like this with class="polytonic" lang="grc", it definitely seems to be assuming "polytonic" is the script code, so if we've updated the script from "polytonic" to "Polyt", that instance should also be updated. - -sche(discuss)07:32, 23 July 2023 (UTC)
other xxx-xxx-xxx codes
Spitballing an idea for shortening other xxx-xxx-xxx codes as requested above:
To keep a system where proto-languages' abc-def-pro codes and families' abc-def codes are derivable from each other, could we rename the 'second part' of the codes to "two letters + f for family or p for proto-language", i.e. rename aav-khs, aav-khs-pro → aav-khf, aav-khp?
What other 11-letter codes are there? Is it just the four qfa-xgx-... ones? (Anything else should use the nearest ISO i.e. three-letter family code, right? Like we have bnt-bal, not bnt-bbo-bal.) We could make an exception-to-the-qfa-exception and shorten qfa-xgx → qpm (still in the ISO private-use range, like qfa) and shorten qfa-xgx-tuh → qpm-tuh etc. Or we could make a general prefix qla- for ISO-family-code-less exceptional languages, like qfa- for families, and rename qfa-xgx-tuh → qla-tuh.
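The first idea above (two letters plus f or p) can be sketched as below. This is illustrative Python with a hypothetical function name, and it assumes the two letters are simply the first two of the existing second part, which the proposal does not spell out.

```python
def shorten_code(code):
    """Shorten 'abc-def' to 'abc-def'[:2]+'f' and 'abc-def-pro' to ...+'p'."""
    parts = code.split("-")
    is_proto = parts[-1] == "pro"
    if is_proto:
        parts = parts[:-1]
    if len(parts) != 2:
        raise ValueError("expected a code of the form xxx-xxx or xxx-xxx-pro")
    family, branch = parts
    return family + "-" + branch[:2] + ("p" if is_proto else "f")


print(shorten_code("aav-khs"))
print(shorten_code("aav-khs-pro"))
```

Under this assumption, two branches of one family whose second parts begin with the same two letters would receive the same short code, so the two letters would have to be chosen by hand in those cases.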
@-sche qpm would make sense as a family code, and would carry through to the language codes by extension. This will only ever come up in situations where we need to have wholly new family codes because there is no parent superfamily, which doesn't happen very often. In all other situations, new family codes can simply be derived from their superfamilies. Theknightwho (talk) 11:54, 25 July 2023 (UTC)
@-sche, Theknightwho My main concern with aav-khp is that things with -pro are instantly identifiable as proto-languages, while aav-khp isn't so obviously identifiable. But maybe we could use four-letter codes for proto-languages? E.g. aav-khsp, or gemp for Proto-Germanic, ine-bslp for Proto-Balto-Slavic in place of gem-pro, ine-bsl-pro. I think you've said there's a theoretical possibility of clash with four-letter script codes, but how realistic is that? For one thing, script codes begin with a capital letter (although I'm not sure whether CSS classes are case-sensitive). If this is an issue, we could use a special character to denote the proto-language, e.g. gem+, ine-bsl+ or gem-, ine-bsl- or something. Benwing2 (talk) 06:25, 26 July 2023 (UTC)
I agree it would be good to keep the proto-language codes recognizable. Testing, it seems like script codes are indeed case-sensitive such that "foo-arab" will not clash with "foo-Arab" — colour me surprised, since if you coloured me ff0000, that is not case-sensitive. I suppose that means we could indeed use aav-khsp or aav-khsP or something. But to back up for a moment, I suppose the obvious question we might should check is, how broad is the demand to shorten these? since renaming every proto-language will affect a lot of editing communities. (Creating qpm would take care of the para-Mongolian long codes independent of changes to proto-language codes.) - -sche(discuss)23:17, 27 July 2023 (UTC)
Blocking specific items on watchlist
My watchlist has become cluttered with certain topics. Is there a way that I can block items from a specific section of a specific page, say the German section of a specific entry? There are many pages, mostly user talk and WT discussion pages, I'd like to watch without having other items buried by too-frequent postings to a topic I do not much care about.
Is there some custom CSS or JS that I could insert that would suppress a specific section of a specific page or even just any section of any page that had a specific section name or a specific word in a section name?
Could I filter out contributions from certain users using similar means?
@DCDuring I don't actually use Watchlists because they get too cluttered. With all the filters added, it's better but still not good enough IMO. However, you can subscribe or unsubscribe to specific topics in discussion pages; I don't know if that helps you. As for custom CSS e.g. to suppress entries related to specific users or specific keywords, User:This, that and the other or User:Erutuon any ideas here? It would seem that such functionality should be present in the Watchlist filters (e.g. filters on AWS consoles have these sorts of things and it's apparently part of a general widget library that AWS uses; not sure if it's Amazon-specific or also available in open source libraries like Bootstrap, but I'm sure it's been asked about a bunch of times on the Phabricator). Benwing2 (talk) 21:35, 17 July 2023 (UTC)
I don't see how I could not watch BP, GP, ID, TR, RfVE, RfDE, RfDO. When I am watching the whole page, how would I unsubscribe to a specific section? That certainly isn't a visible option. DCDuring (talk) 22:03, 17 July 2023 (UTC)
In Preferences, under Recent changes, but not under Watchlist, appears a checkbox "Group changes by page in recent changes and watchlist". Checking this box reduces clutter, but at a cost in performance (it must be a large JS script). I don't know whether it will prove adequate over time. DCDuring (talk) 03:07, 18 July 2023 (UTC)
@Chuterix We need to have a proper discussion about how to romanise the Ryukyuan languages, because the transliteration module made by Huhu9001 doesn't work in a conventional way. Unless you're very confident with Lua, I'd advise that you explain what changes you want to make first. Theknightwho (talk) 21:12, 18 July 2023 (UTC)
@Chuterix: Are you sure it should be kuzï instead of kudzï? I remember there is a /z/~/dz/ contrast in Miyako, as they use さ行 + handakuten to represent the former. -- Huhu9001 (talk) 01:46, 19 July 2023 (UTC)
Speaking of a standard, there are at least two major systems to write Ryukyuan pronunciation in kana other than classical Okinawan (Omoro) spelling, as far as I know.
A system used in 現代日本語方言大辞典 (1992): This system covers all Japonic dialects, but the spelling system depends on the font. The laryngeal sounds and unvoiced nasals are written the same as their normal equivalents, differing only in font type; I think that is user-unfriendly for the web environment.
Systems fixed by Okinawa prefecture (2022): Those systems are officially fixed by the local government for the 5 areas within the prefecture. They lack an orthography for the dialects of Amami, since Amami belongs to Kagoshima prefecture rather than Okinawa.
@Eirikr Although it's undocumented, Huhu9001 has designed the Japanese transliteration module in a way that makes it very easy to modify it as needed for the Ryukyuan languages, which is really helpful. I'm oversimplifying, but the essence of it is that you only need to specify the differences from standard Japanese transliteration. This is done in a systematic way, so it means you can (for example) specify how certain letters should behave when geminated without having to do special logic for it. However, that does mean that we need to come up with standardised ways for transliterating the various Ryukyuan languages in the first place. Theknightwho (talk) 21:58, 18 July 2023 (UTC)
@Chuterix I would stop pinging people so much - you've pinged some people 3 times in this discussion. We need to have a full discussion before we can answer specifics like this. Theknightwho (talk) 22:56, 18 July 2023 (UTC)
If I'm reading the timestamps correctly, Chuterix created the faulty Module:ja-translit/data/ams (which Benwing2 later deleted) almost one hour after my comment above about how “we need to have a lot more discussion about that first, before we encode that in a module.” This again points to headstrong editing that deliberately ignores what other editors are saying. And this is after I applied a one-day block to Chuterix to try to get them to pay attention and slow down -- and stop pinging so much -- as discussed at User_talk:Chuterix#Block. @Chuterix, please take heed of what other editors are telling you. Be advised that any editor who engages in disruptive behavior, and who does not change that behavior after being advised about it, is subject to being blocked. ‑‑ Eiríkr Útlendi │Tala við mig17:58, 19 July 2023 (UTC)
Yes, but it caused a bunch of errors to happen, because the main module checks whether it exists. Creating a blank module like you did meant it registered as existing, but didn't have any of the things the main module actually needed for it to work. Theknightwho (talk) 19:20, 19 July 2023 (UTC)
@Chuterix: Sorry, I can barely understand what you are talking about. I suggest you first write a descriptive essay on the transliteration system of the language whose romaji you want to change, including what kana it uses, what sounds they represent and how they are written in Latin letters. It is important for us to grasp the whole picture so that we can get the work done with fewer problems. -- Huhu9001 (talk) 00:55, 19 July 2023 (UTC)
Creating an entry for Wish dot com
I've attested attributive usage of Wish.com (see Citations:Wish.com) but am unable to create an entry for it because the creation of pages containing ".com" is blocked. I presume this was necessarily implemented as an anti-spam measure. But there are always edge cases such as this. Could someone with the relevant know-how create an entry for Wish.com? I presume this would be done through Appendix:Unsupported titles but I don't want to try blindly mucking around there myself. WordyAndNerdy (talk) 14:51, 19 July 2023 (UTC)
I believe it's roughly equivalent to figurative uses of pound-shop or dollar-store. I'm not set on any particular definition. The one on the Citations page can be treated as a rough draft or placeholder. I'd be fine with a non-gloss def. WordyAndNerdy (talk) 15:54, 19 July 2023 (UTC)
There are now three entries for terms derived from this constructed language (dracarys, valar dohaeris, and valar morghulis). Would it be possible to add High Valyrian to the language database for the purposes of categorizing these entries? This was done with Dothraki (codified as "art-dtk") several years ago. WordyAndNerdy (talk) 19:37, 19 July 2023 (UTC)
@WordyAndNerdy WT:CFI is rather vague about what the criteria are for including Appendix-only constructed languages; it just says "at the community's discretion". Given that we seem to include other languages related to well-known fantasy series (e.g. CAT:Na'vi language) it is probably fine to include this as well. Benwing2 (talk) 22:54, 19 July 2023 (UTC)
Speaking of constructed languages, WT:CFI says only Esperanto, Ido, Volapük and Interlingua are allowed in the mainspace, but we also have CAT:Eskayan language. @-sche Do you know anything about this language? Presumably it should be moved to the Appendix? Benwing2 (talk) 23:38, 19 July 2023 (UTC)
All three of the entries referenced above were created (and attested) as English terms. I'm not proposing adding HV entries to mainspace other than those that can be attested as English or another natural language. I'm suggesting HV be added to the language database so a category such as Category:English terms derived from Dothraki may be created. WordyAndNerdy (talk) 00:05, 20 July 2023 (UTC)
Thanks! I see now that someone's created appendix entries for fiction-only Dothraki terms. Arakh is probably attestible as an English term at this point. There doesn't seem to be a single accepted term for sickle-like swords in fantasy media. Also, would Daenerys belong in Category:English terms derived from High Valyrian? This is established within the series as a given name derived from High Valyrian, but there isn't a canonical meaning/translation assigned to it, AFAIK. WordyAndNerdy (talk) 04:40, 20 July 2023 (UTC)
Can it accurately be described as having "derived from High Valyrian" when it hasn't been given a clear etymology in the real-world constructed language of High Valyrian? Its status as a High Valyrian name is in-universe information from the fictional world of ASOIAF/GOT (members of the Targaryen dynasty have names originating in Valyria with few exceptions). Not really sure how this particular case would fit under the WT:FICTION framework. WordyAndNerdy (talk) 06:18, 20 July 2023 (UTC)
Hmm... IMO, if the books/show/conlang materials affirmatively say Daenerys is a High Valyrian-language name, then it seems correct to say it's "derived from High Valyrian Daenerys", even if there's no further "meaning" of the High Valyrian term we can relay beyond "it's a name"; that seems like a common situation with names. Lee says it's derived from Chinese 李 (Lǐ); it doesn't (probably could, but doesn't need to) say anything about what "李" means beyond being a surname. OTOH, if the books/etc don't actually say Daenerys is a Valyrian-language name, and we'd just be assuming based on Targaryen names usually being Valyrian, then ... we probably shouldn't assume. - -sche(discuss)07:09, 20 July 2023 (UTC)
@Benwing, re Eskayan: I added it as outlined at Wiktionary:Beer parlour/2015/August#Eskayan; I have no strong feelings about it but intuitively it feels different to me (having longstanding use by a particular people-group like a natural language) from everybody's dog's failed proposal for a world language, or recent commercial/media-franchise inventions like Dothraki. But that prior discussion was just three people, so likely we should just have a new discussion. - -sche(discuss)07:09, 20 July 2023 (UTC)
@-sche Probably so. I also don't feel strongly about this but IMO either we should move it to the Appendix or update the section of WT:CFI that says only Esperanto, Ido, Volapük and Interlingua get to go in the mainspace. (Personally I would rather move all conlangs other than Esperanto to the Appendix; I'm making an exception for Esperanto not because I have any partiality towards it but because it seems qualitatively different in its reach vs. all the others. But if we're allowing more than Esperanto in the mainspace I won't object to Eskayan being there if those in the know feel it belongs there.) Benwing2 (talk) 21:06, 20 July 2023 (UTC)
@Equinox I changed this cross-linguistically to accord with how most languages work. Didn't realize English behaves differently; what POS category should it go in? Benwing2 (talk) 23:49, 20 July 2023 (UTC)
The new "pattern" seems okay. Thanks. If I was supposed to see a message about this, I didn't. But I'm happy now. Cf. The Smiths. Equinox◑04:54, 26 July 2023 (UTC)
@Equinox That was intentional; for most languages we prefer to mark participles with a Participle header to match the POS of the {{head}} template, and since backsawn is only a past participle (rather than a combination ed-form and past participle), I think it probably makes sense here. Benwing2 (talk) 21:52, 11 August 2023 (UTC)
With this change, all script codes are either of the form Xxxx (for ISO 15924 script codes) or Xxxxx (for Wiktionary-invented codes), or have a language prefix added to one of these codes (e.g. fa-Arab). Currently there are seven Wiktionary-invented codes of the Xxxxx format: the five above as well as Morse (= Morse code) and Semap (= flag semaphore). For the moment, the old script codes still work in that Module:scripts will accept the old names and automatically convert to the new ones, and MediaWiki:Common.css and MediaWiki:Mobile.css recognize both old and new names as CSS classes. I have tracking set up for any uses of the old names that go through Module:scripts (which means unfortunately that uses in the *-lite templates won't get tracked). After a while I will change Module:scripts to throw an error if it sees the old names, suggesting that the appropriate new names be used instead, and sometime after that (maybe well after) I will remove that special-handling code so you just get an "unrecognized script" error upon using the old names.
The first two on the list (Polyt and Latnx) are the biggies that are used in many places; the remainder are hardly used. If you see any new font-related weirdness esp. related to polytonic Greek characters or Latin characters with diacritics on them, please let me know. Benwing2 (talk) 07:23, 21 July 2023 (UTC)
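For anyone who wants to check script codes against the naming scheme described above, here is a minimal Python sketch. It only tests the shape of a code (Xxxx, Xxxxx, or one of these with a language prefix); it does not consult Module:scripts or the ISO 15924 registry, and the function name and the exact form allowed for the language prefix are assumptions made for illustration.

```python
import re

# Shape check only: an optional lowercase language prefix (assumed to be
# lowercase letters, possibly hyphenated), then a capitalised script code
# of four letters (ISO 15924 shape) or five letters (Wiktionary-invented shape).
_SCRIPT_CODE = re.compile(
    r"(?:(?P<lang>[a-z]+(?:-[a-z]+)*)-)?(?P<script>[A-Z][a-z]{3,4})"
)


def classify_script_code(code):
    """Return (kind, script, lang_prefix), or None if the shape is not recognised."""
    m = _SCRIPT_CODE.fullmatch(code)
    if m is None:
        return None
    script = m.group("script")
    kind = "iso15924" if len(script) == 4 else "wiktionary"
    return kind, script, m.group("lang")


if __name__ == "__main__":
    for code in ["Polyt", "Latnx", "Morse", "fa-Arab", "polytonic"]:
        print(code, "->", classify_script_code(code))
```

A code that has the right shape may still not exist, so this is useful for catching old-style names in bulk, not for validating new ones.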
Something I'm surprised isn't already possible (given all the options for other stuff!) is the ability to designate a translator, or indeed an editor, for a specific chapter in {{quote-book}}. This is relevant in edited volumes where a specific contribution might have its own translator independently of the book as a whole—and is reasonably important since the translator in that case is the one who's actually responsible for the quoted text. As it stands, specifying a chapter and a translator will get you something like "2000, Joe Bloggs, 'A chapter', in Mary Bloggs, transl., Book". With a chapter-specific translator this should be "2000, Joe Bloggs, Mary Bloggs, transl., ...". Would this be worth implementing? Alternatively maybe the second format should just be presumed anyway, since it applies just as well where the translator's rendered the entire book? —Al-Muqanna المقنع (talk) 16:30, 21 July 2023 (UTC)
@Al-Muqanna I support this. There was also a request (by User:GianWiki?) to support the ability to add per-chapter transliterations and such. IMO, however, if we do this we should rename the relevant parameters to be more standard. For example, there is currently a |trans= param for the book-level translator, but a |trans-chapter= param for a translation (i.e. gloss) of the chapter name, so we can't call the chapter translator |trans-chapter=, as you might expect. IMO the chapter gloss should be something like |chapter-t= or |chapter-gloss=, the chapter translit should be |chapter-tr= and the chapter translator should be maybe |chapter-trans=. Technically |chapter-trans= doesn't conflict with |trans-chapter= but having both with different meanings would be very confusing so I'd recommend renaming the existing uses of |trans-chapter=. Maybe User:Sgconlaw can comment, as they are the expert on these templates. Benwing2 (talk) 02:00, 22 July 2023 (UTC)
@Benwing2, Al-Muqanna: I don't have any objection either for a parameter for the translator for a chapter, but am wondering whether it is a good idea to name it |trans-chapter= and then rename the parameter for translations of chapter names to something else, if that's what the proposal is. If we did do that, then we'd have to think about renaming the parameter for translations of titles as well, as it's currently |trans-title= (or |trans-journal= for journals). — Sgconlaw (talk) 05:38, 22 July 2023 (UTC)
@Sgconlaw So my plan is to name it |chapter-trans= (in that order) for the chapter translator and rename |trans-chapter= to avoid confusion. In addition I'd like to support glosses and transliterations of titles and journal names, with more standard names like |title-t=/|title-gloss= and |title-tr= (or |journal-t=/|journal-gloss= and |journal-tr=). Presumably it doesn't make sense to have a translator for titles but it might for journals if they're part of a series. Alternatively we could rename |trans= to something else like |tlat= or |tlator=. My main concern is the confusion caused by having |trans= mean two different things in different param names as well as the fact that |trans= is not used anywhere else and is ambiguous between "translator", "translation" and "transliteration", so it might make sense to adopt |tlator= anyway. Then we'd have something like the following:
|tlator= for the overall translator (or something shorter e.g. |tlr=?)
|t= for the gloss/translation of the text/passage in question
|tr= for the manual transliteration of the text/passage in question
|1= for the lang code of the text/passage in question, if it's not in English, so it can be auto-transliterated
|chapter-tlator= for the chapter translator (or maybe shortened to |chapter-tlr=)
|chapter-t= for the gloss/translation of the chapter name
|chapter-tr= for the manual transliteration of the chapter name
|chapter-lang= for the lang code of the chapter name, if it's not in English, so it can be auto-transliterated
|section-tlator= for a section translator (or maybe shortened to |section-tlr=)
|section-t= for the gloss/translation of the section name
|section-tr= for the manual transliteration of the section name
|section-lang= for the lang code of the section name, if it's not in English, so it can be auto-transliterated
|title-t= for the gloss/translation of the book title
|title-tr= for the manual transliteration of the book title
|title-lang= for the lang code of the book title, if it's not in English, so it can be auto-transliterated
Generally I want to keep the params as consistent as possible both with each other and with the way params work in other templates. Benwing2 (talk) 05:56, 22 July 2023 (UTC)
@Sgconlaw, Al-Muqanna, GianWiki, RichardW57 I have implemented foreign-script handling for authors, titles and chapters in my sandbox module. See User:Benwing2/test-quote for some examples. You can specify the language by prefixing the author, title or chapter with a language code followed by a colon. If you do this, you get proper script handling and transliteration. If you don't do this, it still tries to figure out the correct script and do the right thing with that script, but it may not get certain language-specific handling correct (e.g. it won't be able to detect Persian vs. Arabic and use the special Persian version of the Arabic script), and it won't do auto-transliteration. The relevant params are named e.g. |title= (the text itself); |title-tr= (the translit); |title-ts= (the transcription); |title-sc= (the script, if not detected properly); |title-t= or |title-gloss= or (for compatibility) |trans-title= (the gloss/translation); and |title-lang= (an alternative to prefixing the title with the lang code). I am not completely sure of the correct formatting when both translit and gloss are present; currently they aren't clearly distinguished and both show up within the brackets following the text. Note that |first= and |last= are still supported as an alternative to |author= but don't come with any foreign-script support; it is better to use |author=. Only |title=, |author= and |chapter= currently have foreign-script support, but it's not hard to add it to other params (let me know what other params are so deserving). Chapter translator support (|chapter-tlr=) is not there yet but is coming. Benwing2 (talk) 07:50, 28 July 2023 (UTC)
They may be handled automatically, but |title2=, |2ndauthor= and |chapter2= would also need handling. |section= will need the same support as |chapter=. |author2= etc. need the same treatment as |author=, and I don't think editors should be overlooked either. --RichardW57 (talk) 22:29, 28 July 2023 (UTC)
@RichardW57 Thanks. I am adding this support now. I wasn't sure about |section= but it does look like it needs handling this way. I am also doing this for translators, editors, quotees, coauthors, etc. What about |other=, |others=, |quoted_in=, |publisher=, |city=, |location=, |original=, |by=? I assume yes but I'm not sure what all of these are for. What about |laysummary= and |laysource= (I have no idea what the purpose of these is)? How about |genre=, |format=, |edition=, |volume=, |volume_plain=, |series=, |seriesvolume=? Maybe no on these latter ones? I also need to add some hacks to handle more special-purpose params like |journal= (which don't occur in Module:quote but do occur in the template wikicode of specific {{quote-*}} templates). Benwing2 (talk) 22:44, 28 July 2023 (UTC)
BTW now that I think about it, it may make more sense to implement this exclusively using inline modifiers; it's going to be painful to handle all the stuff like |blog=, |site= and |work= (synonyms for |title= used specifically in {{quote-web}}) any other way. Benwing2 (talk) 22:51, 28 July 2023 (UTC)
For the latter ones, I would say that |page= had a higher claim. Not all of us read Burmese numbers fluently. There are some nasty subtleties - not all of us twigged that Thai ฉบับปรับปรุงครั้งที่๑ did not mean 'first edition', but first revision, i.e. second edition. --RichardW57 (talk) 23:01, 28 July 2023 (UTC)
@Al-Muqanna, RichardW57, GianWiki I switched title, chapter, author to use inline modifiers (although as a special case, |trans-title=, |trans-chapter= and |trans-author= are still supported for backward compatibility) and added the same treatment for all the other params mentioned above except the "latter" ones. I also added a chapter translator under |chapter_tlr=; if there is a "republished in" second set of params, the chapter translator for the republished/etc. book is |chapter_tlr2=. There's an example of both |chapter_tlr= and |chapter_tlr2= in User:Benwing2/test-quote along with examples using inline modifiers. I'd appreciate it if people could create some complex examples using non-English params, so we can test that everything is working before I push this live. Also, User:Sgconlaw I guessed about the significance of the |other= and |laysource=/|laysummary=/|laydate= params, which aren't documented anywhere that I can see; maybe you can document what they're supposed to do so I have a better idea of whether they can contain foreign text. Benwing2 (talk) 07:53, 29 July 2023 (UTC)
@RichardW57: I'm working on this; just last night e.g. I was doing a bunch of work rewriting the {{quote-*}} documentation. It's a bit more work than it would normally be because there are 12 such templates and I don't want to have to manually copy the text to all 12 doc pages. In the meantime, the modifiers currently supported are <t:...>, <gloss:...> (alias of <t:...>), <tr:...>, <ts:...> and <sc:...>. Benwing2 (talk) 21:55, 11 August 2023 (UTC)
@Benwing2: Thank you; I think the set was begging for an easier to maintain unified set of documentation. I feared you thought you had provisionally finished the task. I had suspected that the language of the section name or whatever might be one of the parameters.
@RichardW57 Hmm, I will remove <gloss:...>. The language of the text in question *IS* specifiable, you just prefix the text with the language code, like this: ru:Баллада о королевском бутерброде<t:Ballad of the King's Bread>. Sorry, forgot to mention this. Benwing2 (talk) 20:20, 12 August 2023 (UTC)
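To make the syntax above concrete, here is a rough Python sketch of how a value such as ru:Баллада о королевском бутерброде<t:Ballad of the King's Bread> could be split into a language code, the text itself and its inline modifiers. This is not the code in Module:quote; the function name, the returned structure and the language-prefix heuristic (two or three lowercase letters, optionally hyphenated, followed by a colon and no space) are assumptions. Only the modifiers t, tr, ts and sc mentioned in this thread are recognised.

```python
import re

# Modifiers named in the discussion (gloss is omitted, as it is due for removal).
KNOWN_MODIFIERS = {"t", "tr", "ts", "sc"}

_MODIFIER = re.compile(r"<([a-z]+):([^<>]*)>")
# Heuristic prefix check; a real implementation would look the code up
# in the language data instead of guessing from its shape.
_LANG_PREFIX = re.compile(r"^([a-z]{2,3}(?:-[a-z]{2,3})*):(?=\S)(.*)$", re.S)


def parse_annotated(value):
    """Split 'lang:text<mod:value>...' into its parts."""
    modifiers = {}

    def grab(match):
        key, val = match.group(1), match.group(2)
        if key not in KNOWN_MODIFIERS:
            return match.group(0)  # leave unknown tags in the text
        modifiers[key] = val
        return ""

    text = _MODIFIER.sub(grab, value).strip()
    lang = None
    m = _LANG_PREFIX.match(text)
    if m:
        lang, text = m.group(1), m.group(2)
    return {"lang": lang, "text": text, "modifiers": modifiers}


if __name__ == "__main__":
    print(parse_annotated(
        "ru:Баллада о королевском бутерброде<t:Ballad of the King's Bread>"
    ))
```

The shape-based prefix check is the weakest part of the sketch: an English title that begins with a short lowercase word and a colon would be misread as having a language code.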
@Benwing2 The (partially?) consolidated documentation seems a lot easier to use. Thank you.
You have picked up some outdated documentation for |1=. On its own, it does not prompt the generation of "(in LANG)", which rather pertains to |worklang=. I think there was a brief time, prior to the introduction of |worklang=, when that documentation was true. Of course, it might be that you are holding fire on fixing this. I remember the documented logic led to a brief time when we were claiming that Michael Everson had written some Unicode proposals in Pali! --RichardW57m (talk) 10:05, 18 August 2023 (UTC)
@RichardW57 Yes, the documentation isn't done. I still need to document inline modifiers, for example, and change all the templates to use the new Module:quote doc; so far only {{quote-book}} and {{quote-journal}} are using it. As for |1=, I will fix things so that if all three of |worklang=, |termlang= and |1= are different, it displays both |worklang= and |1=, and document accordingly. Benwing2 (talk) 20:18, 18 August 2023 (UTC)
@Benwing2: Here are my interpretations of some of the parameters. I'd held off hoping someone more knowledgeable would comment.
|genre= is surely meant to be English, as documented forms include 'fiction' and 'non-fiction', though we may see some barely adopted terms.
|format= is likely English too. We have examples of 'paperback' and 'hardback', but I think I've seen 'PDF' used.
|series= is surely going to be like |title=; it is after all, the title of a series of books.
As |edition= takes text and documentation gives "3rd corrected and revised", we should expect foreign language text to show up. In fact, I should probably include the Thai language edition identification in {{R:nod:MFL}} to dispel doubt as to what issue the page numbers refer to. (And some entries appear on significantly separated pages!) I do give a date, but 'when all else fails, read the manual' applies.
|volume= expects a number, but the earlier remarks on |page= apply. The same applies to |plain_volume= and |series volume=, where I think additional inline qualifiers start to look attractive and the law of diminishing returns is cutting in.
I think |city= is an obsolete synonym of |location=; it's not documented for {{quote-book}}. I think we could do with a check on the validity of parameters; one can't tell why a parameter doesn't affect the display - dismissed as excess detail, or a mistyped parameter name?
|original= seems well-documented to me, and mutatis mutandis should take the same immediate variants or supplementary parameters as |title=. However, |original2= seems an oxymoron.
|others= is documented for {{quote-book}} under |quoted in=, and looks like designer fatigue. It may need to be translated, but I've been conveying the information using |newversion=, which parameter I believe should itself take English text. I'd be inclined to discontinue support for it.
@Chuterix: Why ping all the Japanese editors about a technical issue that has nothing to do with Japanese? There was nothing wrong with the categories, but something about your template syntax was messing up the transclusion. I gave the template some dummy parameters wrapped in noincludes and wrapped the real parameters in includeonlys as a preliminary, but that unexpectedly seems to have fixed the problem all by itself. I don't know enough about template syntax and transclusion to say for sure what happened - it may even have had nothing to do with my edit at all, except as the equivalent of a null edit. Chuck Entz (talk) 05:24, 23 July 2023 (UTC)
On further thought, perhaps your moving the template and its documentation page without editing the template meant that the template was still transcluding the old location of the documentation subpage, which had been replaced by a redirect when you moved it. My edit would have forced the system to update all the internal references. Chuck Entz (talk) 05:36, 23 July 2023 (UTC)
@Chuterix You have been asked multiple times not to do these mass pings, as they're likely to just annoy people, and also mean it's harder to tell what's important and what isn't. Please stop. Theknightwho (talk) 16:12, 23 July 2023 (UTC)
Do these work together? I can't figure out how to do it. The word in question is Northern Thai ย้อนว่า(“because”), for which my quotation source has line-breaking between the constituent morphemes. For the headword, {{head|nod|conjunction|]]}} gives separate links to the two constituents, but I have distinguished the two etymologies for the first word by etymids. I want clicking the first part to definitely initially take one to the correct etymology.
Also, am I missing a trick for showing the line-breaking opportunity? The quotation source carefully marks up acceptable linebreaks, something that is fairly rare in the Thai script. A naive Northern Thai syllable boundary detector knowing only phonology could wrongly find the split ย้อ|นว่า. I can't use {{compound}} because this is not a single word, but an idiomatic phrase. --RichardW57 (talk) 13:31, 23 July 2023 (UTC)
If you're working with entire etym sections, use {{etymid}} instead of {{senseid}} -- the latter is best suited to single senses, rather than etym sections.
For instance, Japanese terms spelled in kanji often have multiple different readings (pronunciations), themselves with separate etymologies and other lexical details, so I make use of {{etymid}} with some frequency. I added a couple just a bit earlier to the 愚者 entry, which covers three terms -- gusha (derived from Chinese), oremono (from Japonic roots, borderline obsolete), and orokamono (from Japonic roots, still current).
In your example, I just added two etymids to the Thai ย้อน (yɔ́ɔn) entry, so to leverage these in your link to make sure that the first term in the compound links through to the first etym section, I'd use this syntax:
{{head|nod|conjunction|]]}}
Note that it is strongly advised to not use numbers for etymids -- etym sections move around as the entries are edited, so some relatively obvious non-numeric identifier is preferable. For example, for Japanese entry etymids, I'll use the reading in romanization for kanji headwords, and the main gloss for kana headwords -- I used the gloss just now for that Thai entry for purposes of this illustration (please adjust as appropriate).
@Eirikr: Thanks. I had contemplated using that structure, but was worried that @Theknightwho might reasonably think that that undocumented naming convention could freely be changed. As you probably saw, I have been using {{etymid}}; I said {{senseid}} because that's what most of the documentation talks about. I also think etymids will be more stable than senseids, especially for new or sketchy entries, where there can be a lot of later refinement. (Well, I hope so.) --RichardW57 (talk) 23:05, 25 July 2023 (UTC)
@Eirikr, Benwing2, Theknightwho Actually, it looks as though the convention for fragment IDs might be fixed by the documentation for argument id of function language_link exported from Module:links, were it correct, so my worry may be misplaced. (The bold text should also be taken as a bug report on Module:links, for misreporting the behaviour of function anchor exported from undocumented Module:senseid.) Alternatively, we need a template to follow any changes in convention - does one already exist? --RichardW57m (talk) 09:12, 26 July 2023 (UTC)
Recent changes by content language
Special:RecentChanges is quite advanced, with lots of filtering options for namespaces, tags etc. Is it also possible to get a version of RecentChanges for only Swedish words, i.e. for pages containing "==Swedish=="? (Or could tags be used for this?) -- LA2 (talk) 13:56, 23 July 2023 (UTC)
Of course, this doesn't show many Swedish entries that have problems with headword templates, so it's not complete. It's better than nothing, though. Chuck Entz (talk) 15:51, 23 July 2023 (UTC)
NFC v. SoP
(Wrong forum - moved to Beer Parlour.)
pedocon
I've tried to create an entry for "pedocon" (Internet slang for conservatives whom the speaker suspects of being pedophiles, or at least of tacitly supporting pedophiles), but I keep running into edit filters (either anti-vandalism or anti-spam). What do I do? 93.72.49.123 13:36, 24 July 2023 (UTC)
If you have three valid (i.e. other than Reddit and Twitter) citations you can provide I'll create it for you. - TheDaveRoss13:46, 24 July 2023 (UTC)
WT:CFI, specifically the section on attestation. Twitter and Reddit have not been agreed upon as acceptable, durably archived media. And based on recent events with each of those their durability is more in question than ever. - TheDaveRoss13:54, 24 July 2023 (UTC)
Request to adjust protection on discussion pages
Can an admin please adjust the move permissions on the newly-created August discussion pages, specifically Wiktionary:Grease pit/2023/August, Wiktionary:Tea room/2023/August, and Wiktionary:Information desk/2023/August. I wrote a script to automatically generate future discussion pages near the end of each month, but I'd prefer to run it from my bot account instead of my user account. If it's possible to add "bots" to the allowed users that would be ideal. If not, can we either drop the move protection or if the protection is important, can we grant Template editor permission to the bot account User:AutoDooz? Thanks! JeffDoozan (talk) 12:53, 25 July 2023 (UTC)
Thanks, Ben. The bot's scheduled to create the discussion pages on the 25th of every month, which should give human editors time to notice if something goes wrong and the pages need to be manually created, while also being close enough to the end of the month to give editors time to add or remove themselves from the current watchlist. Previously, all the pages were created en masse once a year so everyone watching the page in December was added to the watchlist of all of next year's pages, but new users had to manually add themselves to the watchlist every month for the rest of the year. JeffDoozan (talk) 14:56, 26 July 2023 (UTC)
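As an illustration of the schedule described above, the page titles for the coming month can be derived from the run date roughly as follows. This is a sketch in Python, not the bot's actual script; the function names are made up, and the forum list only includes the three pages named in the request.

```python
import datetime

MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Only the forums named in the request above; a real run may cover more.
FORUMS = ["Grease pit", "Tea room", "Information desk"]

RUN_DAY = 25  # day of the month on which the bot is scheduled


def should_run(today):
    """True on the scheduled day of each month."""
    return today.day == RUN_DAY


def next_month_pages(today):
    """Titles of next month's discussion pages, rolling over the year in December."""
    year, month = today.year, today.month + 1
    if month == 13:
        year, month = year + 1, 1
    return [f"Wiktionary:{forum}/{year}/{MONTHS[month - 1]}" for forum in FORUMS]


if __name__ == "__main__":
    today = datetime.date(2023, 7, 25)
    if should_run(today):
        for title in next_month_pages(today):
            print(title)
```

Month names are hard-coded so that the titles do not depend on the locale of the machine running the script.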
SIM карта
Hi, I've just created the Bulgarian entry for SIM card - SIM карта (SIM karta). I got a warning telling me that I'm mixing Latin and Cyrillic characters, which - while true - is not wrong in this case. There is a small number of Bulgarian compound words where one of the constituents is written in the Latin alphabet. For example:
The warning suggested that I discuss my edit on here. Personally, given the rarity of this use case, and the fact that the warning is non-blocking, I'm fine with the status quo. What I don't like are the bogus categories that got created because of the "SIM" part, e.g. Bulgarian terms spelled with I. Is there a way to suppress those?
@Chernorizets I don't think you need to worry about this - the warning tells people to come here because people might not realise that Latin and Cyrillic letters that look identical are encoded differently.
I'm not sure I understand your last point - Category:Bulgarian terms spelled with I is a completely valid category. When you say "there is a small number of Bulgarian compound words where one of the constituents is written in the Latin alphabet", that's the kind of term that categories like that are designed to contain. Theknightwho (talk) 04:14, 26 July 2023 (UTC)
@Theknightwho prior to me adding SIM карта, the only other "spelled with" subcategories under the parent Category:Bulgarian terms by their individual characters were for a couple of obsolete letters and a couple of non-alphabetic characters (as well as "E", which looks like a mistake I'll go fix). Is the intent of such categories to cover terms with "non-standard" letters w.r.t. the official alphabet of the language? If so - great, although IMO that's not obvious from the names of those categories, or from their descriptions. Chernorizets (talk) 04:31, 26 July 2023 (UTC)
Thanks @Theknightwho! I'd recommend that those category descriptions, in general, make it more explicit that they're intended solely for characters not in the language's current standard alphabet. Otherwise, I can see this question popping up again down the line.
These worked (or seemed to) yesterday, but now the quoted text doesn't appear. When I added text= on pin curl, the quotations showed up as before. Is this a bug or an intended change that hasn't been documented yet? Cnilep (talk) 05:42, 26 July 2023 (UTC)
Can we avoid/deprecate these long chains of unnamed parameters? They're bound to break eventually, and I've seen many half-displayed instances which needed fixing. Jberkel17:34, 26 July 2023 (UTC)
@Jberkel Yes, I agree. At some point I'm going to rename the translator-related parameters in the quote templates and I may see about renaming the numeric params at the same time. Benwing2 (talk) 19:23, 26 July 2023 (UTC)
While that is not obviously the wrong thing to do, it is not obviously the right thing to do, either. Pro: Avoiding unnamed parameters makes this sort of problem less likely. Con: Human-readable parameters make the code a bit heavier, and more of a pain to add by hand; arbitrary (less human-readable) parameters could be shorter, but more difficult to use, especially for newer editors. Both removing and keeping unnamed parameters have merit, and neither is obviously superior. Since it's a matter of preference, discussion toward achieving consensus seems in order. Cnilep (talk) 01:11, 27 July 2023 (UTC)
One suggestion (or set of them):
Keep the positional params to avoid breaking older instances of template use in the wikicode.
Add names for those positional params.
Provide longer-form names that are human-readable and obvious -- this supports newbies and anyone worried about wikicode legibility.
Also provide aliased short-form names that are guessable enough for experienced editors -- this supports power users and makes for more-compact code that is less onerous to type in by hand.
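The suggestions above amount to letting one logical parameter be reachable three ways: by position, by a long name and by a short alias. A small Python sketch of that resolution step follows. The parameter table is entirely hypothetical (it is not the real mapping of any {{quote-*}} template) and the function name is invented for illustration.

```python
# Hypothetical table: (position, long name, short alias).
# These names are placeholders, not the actual quote-template mapping.
PARAM_SPEC = [
    (1, "language", "lang"),
    (2, "year", "yr"),
    (3, "author", "au"),
]


def normalize_params(params):
    """Map positional, long and short spellings onto the long (canonical) name."""
    canonical = {}
    known = set()
    for position, long_name, short_name in PARAM_SPEC:
        spellings = (str(position), long_name, short_name)
        known.update(spellings)
        supplied = [key for key in spellings if key in params]
        if len(supplied) > 1:
            # The same logical parameter given twice is almost always a mistake.
            raise ValueError(f"conflicting spellings for {long_name}: {supplied}")
        if supplied:
            canonical[long_name] = params[supplied[0]]
    # Anything not in the table is passed through untouched.
    for key, value in params.items():
        if key not in known:
            canonical[key] = value
    return canonical


if __name__ == "__main__":
    print(normalize_params({"1": "en", "au": "Joe Bloggs", "title": "Book"}))
```

Raising on conflicting spellings is a design choice for the sketch; a template could instead silently prefer the named form, at the cost of hiding errors of the kind discussed in this thread.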
@Eirikr, Cnilep This is already the case. I wrote a script (but haven't run it yet) to rename numbered params, and e.g. for {{quote-book}} the mapping is as follows:
The longest named param is author at only 6 chars. For the other quote templates, it's similar. The longest name occurs in {{quote-video game}} where |5= is named |platform=. The only case that has several relatively long such params is {{quote-hansard}}:
However, having 10 numbered params is incredibly error-prone, esp. since the significance of the numbers varies from template to template, and there are 12 templates. Few people are going to be able to keep this straight, and there are in fact tons of existing errors due to numbered params. We could provide shorter aliases of params over say 5 chars if this would help, but I strongly believe we should not encourage people to use these numbered params and that in fact we should deprecate them. Note that most quote templates have several params and take significant work to enter, and in comparison the work required to type the names is very small (and while they may slightly increase the size of the wikicode, this is negligible). Benwing2 (talk) 21:45, 27 July 2023 (UTC)
Excellent, thank you for the explanation. Over the years, I've seen some unfortunate changes to templates in the JA space that involved aggressively short parameters with no documentation and sometimes confusing usage (such as one template using named parameter ko to indicate the "kan'on" reading type and another template using that same parameter name to indicate a combined "kun + on" reading). Very happy to read that usability and longer-term maintainability are considerations here. 😄 ‑‑ Eiríkr Útlendi │Tala við mig23:15, 27 July 2023 (UTC)
Thanks. But I am talking about people who want a permanent link to something, without having to edit anything to get it. Jidanni (talk) 00:08, 27 July 2023 (UTC)
@Jidanni This is hard because there may be more than one header of the same name for a given language, and most other aspects of the wikicode are liable to change over time. Benwing2 (talk) 00:17, 27 July 2023 (UTC)
@Jidanni, ya, as @Benwing2 describes, there is no clean means of automatically and algorithmically generating unique link anchors that are also human-readable.
We could use some kind of infrastructure (modules and templates, perhaps) to automatically generate globally unique identifiers (GUIDs) for anything and everything that someone might want to link to. However, these will consist of unwieldy, long strings of randomized alphanumerics and hyphens, things like 50025ae1-ef80-4542-a061-418e53aea6e5. Humans (at least, neuro-typical ones) will never be able to remember these, so they will have to edit the target page (or inspect in a browser's dev tools) to find the anchor string, before they can use it in a link from some other page.
Numeric positioning is unstable, as you commented -- as soon as anyone adds or removes any of various parts of the wikicode, ] suddenly becomes ] or ], and all the existing links to ] are now pointing at the wrong thing.
Even your proposed #Spanish_Suffix is prone to breakage. What if there is more than one such suffix? What if the part of speech is later changed?
Any approach that is human-readable and understandable, and reasonably stable, will require that humans edit the targeted page to add such anchors. ‑‑ Eiríkr Útlendi │Tala við mig17:16, 27 July 2023 (UTC)
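To illustrate the trade-off described above, here is a short Python sketch contrasting the two automatic schemes. The first function produces a random GUID of the kind quoted earlier; the second derives anchors from header names, numbering repeats with a _2, _3 suffix (this mirrors what I understand MediaWiki to do for duplicate headings, but treat it as an assumption). Both function names are made up for the example.

```python
import uuid


def guid_anchor():
    """A globally unique but unmemorable anchor, e.g. 50025ae1-ef80-..."""
    return str(uuid.uuid4())


def header_anchors(headers):
    """Anchors derived from header text; repeats get _2, _3, ... in page order."""
    seen = {}
    anchors = []
    for header in headers:
        seen[header] = seen.get(header, 0) + 1
        count = seen[header]
        anchors.append(header if count == 1 else f"{header}_{count}")
    return anchors


if __name__ == "__main__":
    print(guid_anchor())
    before = ["Noun", "Suffix"]
    after = ["Suffix", "Noun", "Suffix"]  # a new Suffix section added on top
    print(header_anchors(before))
    print(header_anchors(after))
```

In the second case the section that used to be reachable as Suffix is now Suffix_2, so every existing link silently points at the newly added section. That is the breakage described above, and it is why hand-placed identifiers are the stable option.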
Something is wrong with the top-level Beer Parlour display; it's not correctly transcluding the last two months. I don't see any recent changes to the top-level BP page or the transcluded pages. Benwing2 (talk) 04:45, 27 July 2023 (UTC)
By my count, the July BP page currently has 1,272 edits for a current size of 590,154 bytes, which averages out to something like 464 bytes per edit. This has been an epically verbose, high-stress month- and we've still got a day or two to go. Don't mind me- I think I'll go cower and whimper in the corner... Chuck Entz (talk) 03:26, 30 July 2023 (UTC)
It's still not appearing properly for me when viewing WT:BP, even though I assume it's transcluding only July & August now and that August is nearly empty. It might be a cache issue, but there's no way to tell and the usual hard refresh keystrokes didn't work, and neither does viewing it on another device which hasn't viewed the Beer Parlour lately (if ever). —Soap—14:37, 1 August 2023 (UTC)
It seems @Theknightwho's change to {{discussion recent months}} has made it worse. I tried previewing with the version with parser functions, and that seems to work. (at least for now - the post-expand include size is 1.6/2MB, so it will likely break after a week or two) – Wpi (talk) 15:10, 1 August 2023 (UTC)
@Wpi Yes - it seems the change I had to make to ensure section editing worked has (more than) cancelled out the benefit we were originally seeing. Unless we can find some way to get section edit links to work with the first method, we'll probably have to call this a failed experiment. Theknightwho (talk) 15:25, 1 August 2023 (UTC)
@Benwing2 @Wpi @Chuck Entz @Soap I've come up with a solution to this that seems to work, which is a slightly extended version of the solution someone else came up with on StackExchange: Module:User:Theknightwho/discussion. The short version is that the module grabs the page contents, iterates over the section headers and replaces them with (e.g.) <h2 data-source="palabra" data-section="38">Spanish</h2>. A simple JS script (as seen in User:Theknightwho/common.js) is able to convert this into a section edit link for every heading when rendering the page. This would be an extremely useful thing to have, because it also means we can parse pages in a single invoke without breaking section editing.
Currently it can only handle whole page transclusions (which is mostly what we need it for), but it wouldn't be difficult to extend it to parts of pages. However, it does support transcluding multiple pages, because the source pagename has to be included in each header tag. I also suspect the reply gadget won't work, but (a) I don't think that's a major problem, and (b) it may be fixable anyway.
The one assumption it makes (which I suspect is unlikely to change) is that the section edit links are numbered sequentially from 1 to n (i.e. the 38th heading is linked to with &action=edit&section=38). I don't think this is a documented feature. Theknightwho (talk) 17:36, 1 August 2023 (UTC)
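For readers who want to see the header-rewriting step in isolation, here is a rough Python sketch of the idea. The real implementation is a Lua module plus a JS script; this is neither, and the regular expression, the function name and the decision to leave the heading text unescaped are all assumptions. It relies on the same sequential-numbering assumption stated above.

```python
import html
import re

# A wikitext heading: the same run of 1-6 equals signs on both sides of a line.
_HEADING = re.compile(r"^(={1,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.M)


def tag_headings(wikitext, source):
    """Replace each heading with an <hN> tag carrying its page and section number."""
    counter = 0

    def repl(match):
        nonlocal counter
        counter += 1  # sections assumed to be numbered 1..n in page order
        level = len(match.group(1))
        page = html.escape(source, quote=True)
        return (
            f'<h{level} data-source="{page}" '
            f'data-section="{counter}">{match.group(2)}</h{level}>'
        )

    return _HEADING.sub(repl, wikitext)


if __name__ == "__main__":
    sample = "==Spanish==\n===Noun===\ntext\n==French==\n"
    print(tag_headings(sample, "palabra"))
```

A client-side script can then read data-source and data-section from each tag to build the edit link. Headings inside nowiki or comment blocks would be miscounted by this sketch, which a production version would need to handle.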
@Theknightwho, RichardW57 Maybe one of you two knows this. There's a special phoneticExtraction table in Module:links, where if a language is in this table (currently only Thai and Khmer), getTranslit is called in Module:th or Module:km in place of the regular transliteration mechanism. What the hell is going on here? Why does this exist, and why, if it's important, is it not integrated into the regular transliteration mechanism? I should add, transliteration ought to be as simple as calling lang:transliterate() but in fact Module:links and Module:headword both do significant (and different) logic related to transliteration. I'm asking about this because I'm implementing language tagging, transliteration and transcription for titles, chapters, etc. in Module:quote and it feels like I'm doing a lot more logic than I should need to. Benwing2 (talk) 04:06, 28 July 2023 (UTC)
@Benwing2 This pre-dates any of the work I did, and I completely agree it should be integrated: it seems like a horrible hack. From what I can tell, Wyang was terrible at integrating things into the core modules, so anything they worked on (e.g. Module:th and Module:km) is either not integrated at all, or uses special-case crap like this. Another case-in-point are all the Chinese modules. Theknightwho (talk) 04:17, 28 July 2023 (UTC)
@Theknightwho Oh yeah, now I remember, there was a wheel war between Wyang and Rua over this; take a look at the history of Module:links between June and August 2016. I think both got desysopped for awhile as a result. Do you know what the getTranslit code is actually doing for these two languages? I don't know anything about Thai or Khmer transliteration but I know it's complicated, and Richard has complained about multiword Thai expressions not getting transliterated correctly. If it "works" better for links than headwords, this is the cause. Benwing2 (talk) 04:28, 28 July 2023 (UTC)
One semi-related question: What are the categories that get returned as the third return value of lang:transliterate()? Do I need to worry about them when generating transliteration of titles, chapters, etc.? (I would use full_link() but the existing formatting of titles/chapters/etc. and their glosses is somewhat different from what full_link() generates, and I'm preserving that difference while incorporating translits and transcriptions, so I have to roll my own annotations.) Also what are the values of tr_fail, and what does it mean if both the translit and tr_fail are nil? And for the 25th time, can you actually document this shit so I don't have to keep asking you? Benwing2 (talk) 04:36, 28 July 2023 (UTC)
@Benwing2 I'm not sure beyond a basic understanding, but I think what it does is scrape the pages to see if there's a phonetic transcription available to use as the transliteration, and then it calls the translit module as a back-up if there isn't. It's a neat idea, but should be packaged up into the relevant translit modules. I have no idea why Wyang insisted on doing it like this, and to be honest I'm getting really sick of clearing up their spaghetti code. Theknightwho (talk) 04:51, 28 July 2023 (UTC)
It was decided to do transliteration as a form of transcription, and so it got bundled up with working out pronunciations. The translit module works off a scraper and the core of the pronunciation module. Unfortunately, it only addressed the immediate needs, so doesn't attempt phrases. There's also a strong antipathy towards Internet standards for marking word boundaries, which have to be marked for a scraper to handle phrases. It thus doesn't integrate with {{quote}} or {{quote-*}}, and also not with Module:languages:1681: The function getByCode expects a string as its first argument, but received nil.. The code doesn't like to concede that European letters and numbers end up in Thai texts (e.g. email addresses and acronyms such as VDO, as well as telephone numbers). We've therefore had to wrestle with obvious things like Western Arabic numerals (the use of Thai digits is generally a sign of hostility to foreigners) being rejected, resulting inter alia in the falsification of quotations.
I don't think I have anything else to add to the description of @Theknightwho. I think the code has a decent chance of surviving surgery. I think @Octahedron80 has actually worked on the code. As I say, its antipathy to non-Thai characters needs to be worked on, but remember that transliterating the Baht symbol (which is in the Thai block) would involve needless work. --RichardW57m (talk) 14:42, 28 July 2023 (UTC)
You don't need to worry about the categories. I don't think that value is actually returned by anything at the moment, since we decided to turn it off for Chinese. When I've got some time, I will see if there's a better way of implementing it. Also the second value (an explicit fail) could probably be replaced by simply returning false (as opposed to nil). If the translit and tr_fail are both nil, it means that the transliteration was an accidental fail (e.g. it was only partially completed), whereas an explicit fail means the module intentionally returned nothing (e.g. the Chinese module had nothing to scrape). Given I remembered it the wrong way around, I'll make sure to document it. I've now done the documentation, and both previous edits were partially wrong: if tr_fail is true then it means that maintenance action could be required (as it was an accidental fail), while if it's false then it means the expected output was nil (usually because the input was "-"). Theknightwho (talk) 04:57, 28 July 2023 (UTC)
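For callers that only need to decide what to do with the first two return values, the documented meaning above can be sketched as a small classifier. This is an illustration in JavaScript rather than the Lua of the actual modules; the function name and the four result labels are invented for the example, and the handling of a missing tr_fail is an assumption.

```javascript
// Hypothetical classifier for a (translit, trFail) pair, following the
// documented meaning described above. The result labels are invented here.
function classifyTransliteration(translit, trFail) {
  if (translit !== null && translit !== undefined) {
    return "ok"; // a transliteration was produced
  }
  if (trFail === true) {
    return "needs-maintenance"; // accidental failure, e.g. partial output
  }
  if (trFail === false) {
    return "expected-empty"; // e.g. the input was "-"
  }
  return "unspecified"; // assumption: no result and no failure flag at all
}

console.log(classifyTransliteration(null, true));  // needs-maintenance
console.log(classifyTransliteration(null, false)); // expected-empty
```

A caller generating annotations for titles or chapters could then skip the transliteration silently in the "expected-empty" case and add a tracking category only for "needs-maintenance".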
Ironically, Thai transliteration is one of the few that actually tries to return some categorisation when it fails, but I think that currently doesn't make it to the display. It would make sense as an error reporting mechanism - one needs it when only a small subset of the permutations of characters have any reading. I think a combination of a failed transliteration category - to be monitored by a keen human, so probably per language - and an error message masquerading as an uncreated category would be helpful. --RichardW57m (talk) 14:13, 28 July 2023 (UTC)
@Benwing2: Remember that Thai transliteration is technically impossible for an algorithm to get right, and that Thailand is important enough that therefore software rather than users take the strain of line-breaking between words. You may have to ask people to mark up Thai titles with word divisions, for which I recommend '<wbr>'. Some of us are horribly familiar with using ZWSP and WJ. You may also face the grief that Thai personal names are excluded from Wiktionary by policy, but then you don't really want the Wiktionary transcription in a source's references. It will look horribly amateur - and some Romanisations were bestowed by Rama VI along with the surname. Think of all the Thai names ending in 'porn' (e.g. Pittayaporn); that would generally come out as 'phɔɔn'. Part of the effort in using a Thai work is deciding on the Romanisation of the author's name! --RichardW57m (talk) 15:33, 28 July 2023 (UTC)
Correction: It comes out as 'pɔɔn'.
@Benwing2: "Don't use" isn't very helpful advice, even if it be the soundest. If you must do Thai transliterations as part of references, I think selecting RTGS is the most defensible option. Doing Thai automatic transliteration should generate some sort of warning; manual is almost always better. --RichardW57 (talk) 16:36, 28 July 2023 (UTC)
@RichardW57 OK I think as a first step we should just move the scraping stuff into the actual Thai transliteration algorithm, same for Khmer. Other fixes can come later. Benwing2 (talk) 19:04, 28 July 2023 (UTC)
@Benwing2: Please note that Northern Thai, Northeastern Thai and Southern Thai are all possible future users of this logic, at least for the Thai script. Their pronunciation modules are at least potentially all different, for at the very least they have different vowel and tone systems. If it's written and is truly Southern Thai, the Tak Bai dialect of Southern Thai is likely to merit a different pronunciation module, as its tone system is different yet again. RichardW57 (talk) 21:50, 28 July 2023 (UTC)
(A little late to this party, but hey...)
Ya, what I recall from the 2016 kerfuffle was that Thai-script spellings apparently don't align well with Latin-script romanizations, and some of the argument between Wyang and Rua / CodeCat was about how to deal with this discrepancy and the difficulties in any algorithmic approach to extracting pronunciation information out of Thai-script strings.
Looking back, I get the impression that CodeCat's point wasn't communicated very well, as the proponents of the current approach (Wyang and Metaknowledge) didn't seem to realise that no-one was opposing the mechanism, but merely the fact that it was bolted onto Module:links in a nonstandard way that isn't necessary and creates a maintenance headache. Module:zh-translit also scrapes translits, but it's all inside the module. Theknightwho (talk) 21:01, 30 July 2023 (UTC)
brackets=on not working in quote-web
In this diff, I add a quote that I feel should be in brackets since it is an alternative form of the entry term. Normally I put this kind of thing in brackets using brackets=on, but today, brackets=on does not work for quote-web. It seems to be working for quote-book. Thanks for looking at this. --Geographyinitiative (talk) 09:26, 28 July 2023 (UTC)
About citations tab at the top and Visibility title on the left
Hi!
I am an interface editor from Turkish Wiktionary. I made some changes that work with the new Vector skin. For people who use the new Vector skin, the Citations tab is not aligned with the Entry or Discussion tabs. Also, the mobile version of Wiktionary does not have this tab. So I made some changes to this gadget, and now on Turkish Wiktionary we have a properly aligned Citations tab on the new Vector skin and also on mobile. You can check it now: tr:göz. This does not mean the Citations tab won't work on the old Vector skin; it works there too. You can view the same page on mobile, and the Citations tab is there.
For the Visibility options, as we know, the new Vector skin has a second sidebar on the right which has "Tools" in it. Previously these tools were on the left. On this wiki, the Visibility title is not shown in the new Vector skin's style. You can see again, at tr:göz, that we have those options on the right.
I can make these changes here too. Someone just needs to contact me and ask me for the code, because I cannot edit the gadget or MediaWiki pages here.
@This, that and the other, OK. I am listing the code. You only need to copy and paste it. I already tested it from my common.js, so everything is OK.
1- The gadget for adding Citations and Documentation tabs. Copy this code, and replace the MediaWiki:Gadget-DocTabs.js page entirely with it:
3- Moving the visibility options to the right (for new Vector only) with adaptable styles (for old and new Vector). Copy this code, and replace the MediaWiki:Gadget-VisibilityToggles.js page entirely with it:
Gadget-VisibilityToggles
/* eslint-env es5, browser, jquery */
/* eslint semi: "error" */
/* jshint esversion: 5, eqeqeq: true */
/* globals $, mw */
/* requires mw.cookie, mw.storage */
(function VisibilityTogglesIIFE () {
"use strict";
// Toggle object that is constructed so that `toggle.status = !toggle.status`
// automatically calls either `toggle.show()` or `toggle.hide()` as appropriate.
// Creating toggle also automatically calls either the show or the hide function.
function Toggle (showFunction, hideFunction) {
this.show = showFunction;
this.hide = hideFunction;
}
Toggle.prototype = {
get status () {
return this._status;
},
set status (newStatus) {
if (typeof newStatus !== "boolean")
throw new TypeError("Value of 'status' must be a boolean.");
if (newStatus === this._status)
return;
this._status = newStatus;
if (this._status !== this.toggleCategory.status)
this.toggleCategory.updateToggle(this._status);
if (this._status)
this.show();
else
this.hide();
},
};
/*
* Handles storing a boolean value associated with a `name` stored in
* localStorage under `key`.
*
* The `get` method returns `true`, `false`, or `undefined` (if the storage
* hasn't been tampered with).
* The `set` method only allows setting `true` or `false`.
*/
function BooleanStorage(key, name) {
if (typeof key !== "string")
throw new TypeError("Expected string");
if (!(typeof name === "string" && name !== "")) {
throw new TypeError("Expected non-empty string");
}
this.key = key; // key for localStorage
this.name = name; // name of toggle category
function convertOldCookie(cookie) {
return cookie.split(';')
.filter(function(e) { return e !== ''; })
.reduce(function(memo, currentValue) {
var match = /(.+?)=(\d)/.exec(currentValue); // only to test for temporary = format
if (match) {
memo[match[1]] = Boolean(Number(match[2]));
} else {
memo[currentValue] = true;
}
return memo;
}, {});
}
// Look for cookie in old format.
var cookie = mw.cookie.get(key);
if (cookie !== null) {
this.obj = $.extend(this.obj, convertOldCookie(cookie));
mw.cookie.set(key, null); // Remove cookie.
}
}
BooleanStorage.prototype = {
get: function () {
return this.obj[this.name];
},
set: function (value) {
if (typeof value !== "boolean")
throw new TypeError("Expected boolean");
var obj = this.obj;
if (obj[this.name] !== value) {
obj[this.name] = value;
this.obj = obj;
}
},
// obj allows getting and setting the object version of the stored value.
get obj() {
if (typeof this.rawValue !== "string")
return {};
try {
return JSON.parse(this.rawValue);
} catch (e) {
if (e instanceof SyntaxError) {
return {};
} else {
throw e;
}
}
},
set obj(value) {
// throws TypeError ("cyclic object value")
this.rawValue = JSON.stringify(value);
},
// rawValue allows simple getting and setting of the stringified object.
get rawValue () {
return mw.storage.get(this.key);
},
set rawValue (value) {
return mw.storage.set(this.key, value);
},
};
// This is a version of the actual CSS identifier syntax (described here:
// https://stackoverflow.com/a/2812097), with only ASCII and that must begin
// with an alphabetic character.
var asciiCssIdentifierRegex = /^[a-zA-Z][a-zA-Z0-9_-]+$/;
function ToggleCategory (name, defaultStatus) {
this.name = name;
this.sidebarToggle = this.newSidebarToggle();
this.storage = new BooleanStorage("Visibility", name);
this.status = this.getInitialStatus(defaultStatus);
}
// Have toggle category inherit array methods.
ToggleCategory.prototype = [];
ToggleCategory.prototype.addToggle = function (showFunction, hideFunction) {
var toggle = new Toggle(showFunction, hideFunction);
toggle.toggleCategory = this;
this.push(toggle);
toggle.status = this.status;
return toggle;
};
// Generate an identifier consisting of a lowercase ASCII letter and a random integer.
function randomAsciiCssIdentifier() {
var digits = 9;
var lowCodepoint = "a".codePointAt(0), highCodepoint = "z".codePointAt(0);
// The "+ 1" makes the range include "z"; the result never falls below "a".
return String.fromCodePoint(
lowCodepoint + Math.floor(Math.random() * (highCodepoint - lowCodepoint + 1)))
+ String(Math.floor(Math.random() * Math.pow(10, digits)));
}
function getCssIdentifier(name) {
name = name.replace(/\s+/g, "-");
// Generate a valid ASCII CSS identifier.
if (!asciiCssIdentifierRegex.test(name)) {
// Remove characters that are invalid in an ASCII CSS identifier.
name = name.replace(/^[^a-zA-Z]+/, "").replace(/[^a-zA-Z0-9_-]+/g, "");
if (!asciiCssIdentifierRegex.test(name))
name = randomAsciiCssIdentifier();
}
return name;
}
// Add a new global toggle to the sidebar.
ToggleCategory.prototype.newSidebarToggle = function () {
var name = getCssIdentifier(this.name);
var id = "p-visibility-" + name;
var sidebarToggle = $("#" + id);
if (sidebarToggle.length > 0)
return sidebarToggle;
var listEntry = $("<li class='mw-list-item'>");
sidebarToggle = $("<a>", {
id: id,
href: "#visibility-" + this.name,
})
.click((function () {
this.status = !this.status;
this.storage.set(this.status);
return false;
}).bind(this));
listEntry.append(sidebarToggle).appendTo(this.buttons);
return sidebarToggle;
};
// Update the status of the sidebar toggle for the category when all of its
// toggles on the page are toggled one way.
ToggleCategory.prototype.updateToggle = function (status) {
if (this.length > 0 && this.every(function (toggle) { return toggle.status === status; }))
this.status = status;
};
// getInitialStatus is only called when a category is first created.
ToggleCategory.prototype.getInitialStatus = function (defaultStatus) {
function isFragmentSet(name) {
return location.hash.toLowerCase().split("_")[0] === "#" + name.toLowerCase();
}
function isHideCatsSet(name) {
var match = /^.+?\?(?:.*?&)*?hidecats=(.+?)(?:&.*)?$/.exec(location.href);
if (match !== null) {
var hidecats = match[1].split(",");
for (var i = 0; i < hidecats.length; ++i) {
switch (hidecats[i]) {
case name: case "all":
return false;
case "!" + name: case "none":
return true;
}
}
}
return false;
}
function isWiktionaryPreferencesCookieSet() {
return mw.cookie.get("WiktionaryPreferencesShowNav") === "true";
}
// TODO check category-specific cookies
return isFragmentSet(this.name)
|| isHideCatsSet(this.name)
|| isWiktionaryPreferencesCookieSet()
|| (function(storedValue) {
return storedValue !== undefined ? storedValue : Boolean(defaultStatus);
}(this.storage.get()));
};
Object.defineProperties(ToggleCategory.prototype, {
status: {
get: function () {
return this._status;
},
set: function (status) {
if (typeof status !== "boolean")
throw new TypeError("Value of 'status' must be a boolean.");
if (status === this._status)
return;
this._status = status;
// Change the state of all Toggles in the ToggleCategory.
for (var i = 0; i < this.length; i++)
this[i].status = status;
this.sidebarToggle.html((status ? "Hide " : "Show ") + this.name);
},
},
buttons: {
get: function () {
var buttons = $("#p-visibility ul");
if (buttons.length > 0)
return buttons;
buttons = $("<ul class='vector-menu-content-list'>");
var collapsed = mw.cookie.get("vector-nav-p-visibility") === "false";
var toolbox = $("<div>", {
"class": "vector-main-menu-action-item mw-sidebar-action vector-menu vector-menu-portal portal " + (collapsed ? "collapsed" : "expanded"),
"id": "p-visibility"
})
.append("<div class='mw-sidebar-action-item'>")
.append($("<h3 class='mw-sidebar-action-heading vector-main-menu-action-heading vector-menu-heading'>Visibility</h3>"))
.append("</div>")
.append($("<div>", { class: "vector-menu-content" }).append(buttons));
var insert = document.getElementById("p-coll-print_export") || document.getElementById("p-lang") || document.getElementById("p-feedback");
if (insert) {
$(insert).before(toolbox);
} else {
var sidebar = document.getElementById("vector-page-tools") || document.getElementById("vector-main-menu") || document.getElementById("mw-panel") || document.getElementById("column-one");
$(sidebar).append(toolbox);
}
return buttons;
}
}
});
function VisibilityToggles () {
// table containing ToggleCategories
this.togglesByCategory = {};
}
// Add a new toggle, adds a Show/Hide category button in the toolbar.
// Returns a function that when called, calls showFunction and hideFunction
// alternately and updates the sidebar toggle for the category if necessary.
VisibilityToggles.prototype.register = function (category, showFunction, hideFunction, defaultStatus) {
if (!(typeof category === "string" && category !== ""))
return;
var toggle = this.addToggleCategory(category, defaultStatus)
.addToggle(showFunction, hideFunction);
return function () {
toggle.status = !toggle.status;
};
};
VisibilityToggles.prototype.addToggleCategory = function (name, defaultStatus) {
return (this.togglesByCategory[name] = this.togglesByCategory[name] || new ToggleCategory(name, defaultStatus));
};
window.alternativeVisibilityToggles = new VisibilityToggles();
window.VisibilityToggles = window.alternativeVisibilityToggles;
})();
4- Inserting the feedback inside the sidebar, rather than outside (currently it is on the outside). Copy this code, find "WT:FEED" in MediaWiki:Common.js, and replace the code under "WT:FEED" with it:
Common
/* == [[WT:FEED]] == */
// used to be ]
if (true) {
$(function(){
var fb_comment_url = mw.util.getUrl('Wiktionary:Feedback', {
'action': 'edit',
'section': 'new',
'preload': 'Wiktionary:Feedback/preload',
'editintro': 'Wiktionary:Feedback/intro',
'preloadtitle': ']',
});
var fb_comment = "If you have time, leave us a note.";
var sidebar = document.getElementById("vector-main-menu") || document.getElementById("mw-panel") || document.getElementById("column-one");
$(sidebar).append("<div class='vector-main-menu-action-item mw-sidebar-action vector-menu vector-menu-portal portal expanded' id='p-feedback'>"+
"<h3 class='mw-sidebar-action-heading vector-main-menu-action-heading vector-menu-heading'>Feedback</h3>"+
"<div class='vector-menu-content'>"+
"<ul class='vector-menu-content-list'>"+
"<li class='mw-list-item' id='commentHere'></li></ul></div>"+
"</div>");
$('<a>').attr('href', fb_comment_url).text(fb_comment).appendTo("#commentHere");
});
}
I hope I explained everything and all is ok. If anything goes wrong, you can ping me anytime. Thanks! ~ Z (m)21:30, 30 July 2023 (UTC)
Just noting that I didn't forget about this. It seems that our versions of these scripts (in 1 and 3) are quite different from those you are suggesting here, so I'd need to do some testing before I publish these changes, to make sure we don't lose any local functionality. I will do it at some stage. This, that and the other (talk) 10:09, 17 August 2023 (UTC)
Of course! And if there is anything wrong with the new codes, I will try to adjust them to the old ones. I want to see the new changes on this wiki. The current display of the Citations tab and visibility toggles do really annoy me. ~ Z (m)22:30, 18 August 2023 (UTC)
Sorry, I've been a bit careless lately, and keep missing things. So it should be back to normal in a few days? Good to know, thanks. —Soap—03:33, 29 July 2023 (UTC)
Is there a way we can generate a list of all uses of incorrect params inside quote templates? To find crap like this. Deliberately mistyping a few common parameters brought up a handful of errors, so there's bound to be a bunch. 280, I predict... Seoovslfmo (talk) 23:43, 29 July 2023 (UTC)
Thanks, WF, I have already renamed place -> location, co-authors= to coauthors= and archive-url= -> archiveurl=. Benwing2 (talk) 16:50, 30 July 2023 (UTC)
@Geographyinitiative Blah, looks like I'm gonna have to review those author1 changes manually. The code checks that there's no author= before renaming to author= but I didn't think about authors specified using first/last=. Benwing2 (talk) 16:53, 30 July 2023 (UTC)
Yeah, I have seen people input "last=Doe|first=John|author2=Richard Roe" or "last1=Doe|first1=John|author2=Richard Roe" and found it very weird. It seems like they should be entered all consistently using one format. - -sche(discuss)17:20, 30 July 2023 (UTC)
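The renames mentioned above, together with the guard that was missed, can be sketched roughly as follows. This is only an illustration: the bot's real code is not shown in this discussion, the function and table names are hypothetical, and it assumes the rename in question is author1= to author= and that first/last, first1/last1 are the only other ways to give a first author.

```javascript
// Renames taken from the discussion above; the table itself is illustrative.
var PARAM_RENAMES = {
  "place": "location",
  "co-authors": "coauthors",
  "archive-url": "archiveurl"
};

function has(obj, key) {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Hypothetical normalizer for a flat object of template parameters.
function normalizeQuoteParams(params) {
  var result = {};
  Object.keys(params).forEach(function (key) {
    var newKey = has(PARAM_RENAMES, key) ? PARAM_RENAMES[key] : key;
    // Never overwrite a parameter already present under the new name.
    if (newKey !== key && has(params, newKey)) {
      newKey = key;
    }
    result[newKey] = params[key];
  });
  // Rename author1 -> author only if no first author is given another way.
  var firstAuthorKeys = ["author", "last", "first", "last1", "first1"];
  var hasOtherFirstAuthor = firstAuthorKeys.some(function (key) {
    return has(result, key);
  });
  if (has(result, "author1") && !hasOtherFirstAuthor) {
    result.author = result.author1;
    delete result.author1;
  }
  return result;
}

console.log(normalizeQuoteParams({ place: "London", author1: "John Doe" }));
// { location: 'London', author: 'John Doe' }
```

With input such as last=Doe, first=John, author1=..., the sketch leaves author1= untouched, which is the case that would otherwise need manual review.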
When the translation displayed in {{quote-book}} is licensed under CC-BY-SA, and the attribution is short, how do I acknowledge the translation? If I use |translator=, it looks as though the book itself is based on the translations being acknowledged. You can see an example at Pali ယက္ခ(yakkha). The translations are stored centrally in Module:RQ:pi:Shan Paritta, but I'm not confident that the acknowledgement on that page suffices. --RichardW57 (talk) 04:19, 30 July 2023 (UTC)
Are you planning to split PEA content off to its own Reconstruction pages separate from PA? My main hesitation—why I haven't bothered adding PEA before—is that I wonder whether it's (un)helpful to readers to split things up too finely vs having all the roots of Algonquian terms in one place; in the extreme, that leads to things like Malayo-Polynesian and other Austronesian content being divided between the myriad nigh-identical Proto-Foo-Bar, Proto-Foo, Proto-Western-Foo, Proto-Southwestern-Foo entries for every stage of the Proto-Austronesian tree which Tropylium and some other editors tried to prune a while ago. The changes from PA to PEA were relatively modest, the loss of some vowel length distinctions and a few other changes, like *e to *ə and changes in the circumstances under which certain linking vowels were used in negatives, etc. OTOH, I see there has been more research into PEA even in just the last few years, and I suppose it is more like distinguishing just one level (PWGmc but not separate pages for Weser-Rhine, etc) from PGmc rather than umpteen levels of Austronesian. I'm ambivalent; @MiltonLibraryAssistant, Hk5183, do you have any opinions on whether PEA entries should be split off from PA? - -sche(discuss)18:16, 30 July 2023 (UTC)
Unlike Proto-Algonquian, there isn't a publicly-accessible dictionary for PEA online. Also, I can't seem to find much literature on the subject, let alone any attempts at a PEA reconstruction. Take a look at this Proto-Algonquian entry: *nepyi. We don't need to create separate entries for Proto-Plains Algonquian, Central Algonquian, Eastern Algonquian, etc. It would be too convoluted, and that's assuming that we already have a reliable source and standard for those proto-language reconstructions. Against. MiltonLibraryAssistant (talk) 04:44, 31 July 2023 (UTC)
So there are some PEA reconstructions in "Algonquian linguistic change and reconstruction" (I. Goddard, 1991), but only for very specific words. There's not a lot to go by, I think we should just leave it as is, until we can gather a list of standardised reconstructions for more general words (e.g. "water", "land", etc.) . MiltonLibraryAssistant (talk) 05:18, 31 July 2023 (UTC)
That may have been once the case, but things have much progressed since then, beyond Goddard, from Costa (2007) to Cunningham (2022a), and the schema for reconstructing PEA is very well established (more than PWG TBH).
The issue that I'm running into is reconstructing PAlg with only PEA descendants, which is problematic because 1) I have to make dodgy guesses as to what the proto form was, e.g. *xk or *θk, and 2) many terms likely didn't even exist in PAlg at all, but are instead PEA constructions.
The whole situation is very reminiscent of PGmc and PWG, as you pointed out, and the morphological changes are just as significant. In summation, not reconstructing PEA is academically outdated, and both linguistically and chronologically problematic. --{{victar|talk}}08:27, 31 July 2023 (UTC)
Sorry for the late response. I would be interested to know if there are any particularly reliable sources for PEA orthography. I've come across quite variable spellings for the same phonemes (not that this is specific to PEA and not PA).
While I know that there are many quite firmly established PEA reconstructions, I am a bit concerned that there is often a rather fuzzy line between PA and PEA reconstructions.
Of course, my foremost concern is that without any formal style-guide or norms for adding PEA terms, they are likely to be added in a piecemeal manner which may not greatly improve upon existing PA pages.
From a personal point of view, while I believe that PEA should eventually be added, effort would likely be better spent adding new (and improving the quality of existing) PA entries. Hk5183 (talk) 22:44, 2 August 2023 (UTC)
grunduz
The decl table on Reconstruction:Proto-Germanic/grunduz is not for grunduz but rather for grumþuz. Since we list this as an alternate form, I suspect this mismatch is on purpose, but I think we should explain it. If it is in fact a template bug, I have no idea how to fix it or even identify the problem, since the template call on that page is a simple {{gem-decl-noun}} with no parameters whatsoever. By contrast, Reconstruction:Proto-Germanic/handuz appears as normal. —Soap—08:10, 31 July 2023 (UTC)
As a general point, it seems unwise and unnecessary to house inflection table data for sui generis words in a Lua module. The inflections should just be ordinary parameters in the entry itself. This, that and the other (talk) 12:09, 31 July 2023 (UTC)
The nominative singular was intentional indeed. From Kroonen's EDPG intro page xxxii:
When Proto-Germanic still had a mobile accent, these ti- and tu-stems probably had root-stress in the nominative, and suffix-stress in the genitive, e.g. nom. *gʰrḿ-tu-s, gen. *gʰrḿ-té/ó-us. After the Germanic sound shifts, the nominative developed into *grumfþuz, whence G Cimb. grumf, while the genitive *grundauz ultimately served as the basis for Go. grundus and the aforementioned West Germanic forms. ON grunnr, on the other hand, goes back to *grunþuz, and appears to be a secondary variant with analogical n or þ. The fact that this analogy was possible proves that the paradigmatic Verner alternation must have remained intact until after the breaking up of Proto-Germanic and survived into Proto-Norse.
At the time I hardcoded the paradigm since I was too inexperienced in Lua to do otherwise. But the hardcoding can be minimized later, yes. — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 16:38, 31 July 2023 (UTC)
Just FYI I agree with User:This, that and the other that we should avoid hard-coding individual paradigms in Lua if possible. I do do this for Romance verbs e.g. Italian fare but that is because they often show up in collocations and with prefixes, so it makes sense to put the conjugation info in one place rather than on each page using it. Benwing2 (talk) 18:46, 31 July 2023 (UTC)
Update: I have replaced many hardcoded Proto-Germanic declensions (namely irregular n-stems and a-stems) with a couple of multi-parameter declension routines. More will come. — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 08:20, 1 August 2023 (UTC)
@Mellohi! It's never calling your tis-f routine. If you put an error() statement right before where it calls the declension routine, you'll see that the decl_type is i-mf. Presumably there's a bug in the code in lines 74-84 that is supposed to be taking the declension from the |stem= argument. Benwing2 (talk) 21:26, 1 August 2023 (UTC)
Hyphenation/syllabification for East Asian languages (Japanese and Korean) romanization
Hello,
Can East Asian languages (Japanese and Korean) use hyphenation/syllabification for romanized words (e.g. 鉄道, tetsudō and 만화, manhwa)? Yuliadhi (talk) 23:54, 31 July 2023 (UTC)
So long as it's implemented correctly. There are Korean terms with romanized medial -nh- as in 만화(manhwa), and it isn't always algorithmically obvious where the "H" belongs. And for Japanese, we'd need to pay attention to Japanese morphophonemic rules; in the 鉄道(tetsudō) example, hyphenating as tet-sudō would be incorrect, as the medial tsu is a single integral phonemic unit in Japanese. ‑‑ Eiríkr Útlendi │Tala við mig18:18, 14 August 2023 (UTC)
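To make the tetsudō point concrete, here is a minimal sketch of a splitter for Hepburn-style romanized Japanese that never separates an onset such as "ts" from its vowel. It is not an existing Wiktionary module: the function name and the rule set are assumptions for illustration, and it does not handle geminate consonants (e.g. "tt"), apostrophes after moraic n, or Korean romanization, where the -nh- problem above would need dictionary or hangul information.

```javascript
// Vowels of Hepburn-style romanization, including macron long vowels.
var ROMAJI_VOWELS = "aeiouāēīōū";

// Hypothetical splitter: each unit is an optional consonant onset, one vowel,
// and an optional moraic "n" that is not followed by a vowel or "y".
function splitRomajiSyllables(word) {
  var unit = new RegExp(
    "[^" + ROMAJI_VOWELS + "]*[" + ROMAJI_VOWELS + "]" +
    "(?:n(?![" + ROMAJI_VOWELS + "y]))?", "g");
  // Normalize to NFC so that a vowel plus combining macron becomes one letter.
  return word.normalize("NFC").toLowerCase().match(unit) || [];
}

console.log(splitRomajiSyllables("tetsudō").join("-"));
// te-tsu-dō
```

Splitting on "consonants up to the next vowel" is what keeps tsu together; a production version would more likely derive the breaks from the kana spelling than from the romanization.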