Hello. I see you are an administrator who deals with flags so maybe you can help me. We currently use only the flag of Portugal to represent Portuguese and here it was requested twice that it be replaced by this flag, which represents Brazil as well (consider that Brazil has twenty times more Portuguese speakers than Portugal). Both requests, the first in May 2016 by myself and the other one in June 2018, have been largely ignored. I'm here to request that change for the third time. - Alumnum (talk) 22:50, 3 January 2019 (UTC)
This is about this change. IE9 and older browsers get grade C support which means our js does not even get to run on them. more info. Giorgi Eufshi (talk) 06:37, 4 January 2019 (UTC)
I noticed you added a block of code to Module:nyms that begins with do and ends with end, but it doesn't seem to loop at all. Is this some Lua construct I'm not aware of? There's nothing on https://www.mediawiki.org/wiki/Extension:Scribunto/Lua_reference_manual about it. —Rua (mew) 21:33, 6 January 2019 (UTC)
thesaurus_links inaccessible below where it's actually used. I'm not sure I've ever seen it on Wiktionary before. — Eru·tuon 21:50, 6 January 2019 (UTC)
{{der3}}
At ] Derived terms the control that expands the list does not appear. Clicking the place it should appear does expand the list. DCDuring (talk) 19:36, 7 January 2019 (UTC)
{{hyp4}} was used to {{hyp3}} because of odd behaviour; it doesn't need four columns anyway. DonnanZ (talk) 21:17, 7 January 2019 (UTC)
{{der3}}, right? That's one of the templates that I redid recently and that the discussion and vote in November's Beer Parlour was about. I don't have the RHS ToC so I will have to enable that to see if I can reproduce what you're talking about. — Eru·tuon 02:07, 8 January 2019 (UTC)
float: right;), which was putting the control into the stack of elements floating on the right side of the page, below the TOC and most of the images. Removing that property fixed the problem. — Eru·tuon 03:17, 8 January 2019 (UTC)
float: right;) could be an issue with {{WOTD}} causing problems at the top of the clear: right; or clear: both; to the "Etymology" header. Maybe there is a gadget that adds that CSS property to the header. — Eru·tuon 01:01, 11 January 2019 (UTC)
{{wp}}, {{swp}}, {{wikipedia}}, so maybe {{WOTD}} should too, and not use (float: right;). I don't know, I'm not a programmer. DonnanZ (talk) 10:32, 11 January 2019 (UTC)
clear: right; CSS property on the HTML element that encloses the image (<div class="thumb tright">...</div>). For some reason, Microsoft Edge thinks that the property means that the etymology heading has to be below the image, but Firefox and Chrome don't. Just removing the property isn't desirable; then the image appears to the left of the "WOTD" text. — Eru·tuon 18:39, 17 January 2019 (UTC)
Hi! Could you please have a look here? Thank you very much, --Epìdosis (talk) 10:20, 9 January 2019 (UTC)
Wonder if you can see if I did something wrong. I updated {{WOTD}} so that it would recognize audio files in the format "File:En-au-.ogg" which Commander Keane has been diligently uploading and inserting into entries. However, it works for some entries and not others. For example, if you look at the January 2019 WOTDs at "Wiktionary:Word of the day/Archive/2019/January", the audio file of emu appears but that of Tiggerish doesn't. I tried resetting the transcode of File:En-au-Tiggerish.ogg but that didn't make a difference. Any idea what might be going wrong? Thanks. — SGconlaw (talk) 06:47, 15 January 2019 (UTC)
action=purge in the URL), different from reloading the browser. — Eru·tuon 07:32, 15 January 2019 (UTC)
I know you're trying to clean up my (admittedly) poorly constructed category, but you do realize that 糹 is not a triplication? Johnny Shiz (talk) 16:19, 9 February 2019 (UTC)
Eru, you don't have to answer this... But if you ever have time: I'm trying to understand lua (at my age, impossible), which is needed at el.wiktionary, because the last person who could handle it, disappeared last year. I know that neither is correct, but which one is the worst? the 1st or the 2nd? sarri.greek (talk) 23:56, 13 February 2019 (UTC)
Regarding the IP’s removal of the quote on غَزَا (ḡazā), this follows a long line of removing anything implying usage of Arabic words for computing, regard the history of قُرْصَان (qurṣān), English hacker, خ ر ق (ḵ-r-q), and others I cannot name off the cuff. The removal of such references may also be the only motivation for layout changes, this IP appears to frequently camouflage removals by changes in other respects of dubious worth. Informing also @Chuck Entz, Surjection who have previously tackled this IP. Fay Freak (talk) 23:30, 23 February 2019 (UTC)
Why doesn't mw.clone() work on loadData'd tables? What is the error? What is the proper way to clone a table? Maybe table.shallowClone() and/or table.deepcopy()? Module:parameters should *DEFINITELY* not be side-effecting the params table passed into it; that's bad juju and can lead to all sorts of subtle and hard-to-debug errors. Benwing2 (talk) 01:36, 1 March 2019 (UTC)
deepcopy from Module:table is intended to copy tables loaded with mw.loadData, but when I tried plugging it into your edit, there was a stack overflow. Not sure how that happened. — Eru·tuon 01:40, 1 March 2019 (UTC)
mw.clone is that it copies the metatable, and the metatable makes the copied table read-only, and prevents mw.clone from writing any keys to it. deepcopy allows the metatable not to be copied. — Eru·tuon 01:43, 1 March 2019 (UTC)
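The copying problem discussed above can be illustrated with a rough Python analogue. This is not Scribunto code: MappingProxyType only stands in for the read-only table that mw.loadData returns, and deep_copy_plain is a hypothetical helper that plays the role of a deep copy which leaves the read-only wrapper behind.

```python
from types import MappingProxyType


def deep_copy_plain(value):
    """Recursively copy mappings and sequences into plain, mutable dicts and lists.

    The read-only wrapper is deliberately NOT copied, which is the point of
    the analogy: the result can be modified without touching the source.
    """
    if isinstance(value, (dict, MappingProxyType)):
        return {key: deep_copy_plain(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [deep_copy_plain(item) for item in value]
    return value


# Hypothetical read-only data, standing in for a table loaded with mw.loadData.
loaded = MappingProxyType({
    "params": MappingProxyType({"1": "required"}),
    "aliases": ("a", "b"),
})

# Writing to the read-only data fails, much like writing to a loadData'd table.
try:
    loaded["extra"] = True
    write_failed = False
except TypeError:
    write_failed = True

# The plain copy is writable, and changing it does not affect the original.
params_copy = deep_copy_plain(loaded)
params_copy["params"]["2"] = "optional"
```

The same idea applies to a function that receives a parameter table: working on a plain copy avoids side effects on the caller's data.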
I am not familiar with the particular standard used here, but I do not believe "alternative" is an accurate term for these terms. I have previously used "alternative form" for slightly different forms of the same word that are more or less equal in the standard language, like "kaitsema" and "kaitsma". In this case, these forms are not entirely equal in meaning. "Jauhatama" for instance is the Võro word, and would not be considered an "Estonian" word by most. While "jahvama" is listed in the ÕS as a dialectal term, I would still not consider it an "alternative form", but rather a dialectal synonym. If used, it carries a dialectal connotation which makes it different from "jahvatama". Strombones (talk) 09:55, 6 March 2019 (UTC)
Frankly, I do not think {{grc-noun form}} is necessarily preferable to {{head|grc|noun form}}. --Dan Polansky (talk) 06:25, 23 March 2019 (UTC)
{{head}} does not. — Eru·tuon 06:45, 23 March 2019 (UTC)
Can you rerun your script checking for any of the following templates? Some of them don't end in 'of' (particularly the shortcut aliases). Thanks:
language_specific_alt_form_of_templates = [
u"be-Taraškievica",
"bg-pre-reform",
"ceb-superseded spelling of",
"egy-alt",
"egy-alternative transliteration of",
"en-ing form of",
"fr-post-1990",
"fr-pre-1990",
#"ga-lenition of",
"hy-reformed",
"jbo-rafsi of",
"morse code abbreviation",
"morse code for",
"morse code prosign",
"my-ICT of",
u"pt-superseded-paroxytone-éi",
u"pt-superseded-paroxytone-ói",
"ru-abbrev of",
"ru-acronym of",
"ru-clipping of",
"ru-initialism of",
"ru-pre-reform",
"uk-pre-reform",
"yi-alternatively pointed form of",
"yi-phonetic spelling of",
"yi-unpointed form of",
]
alt_form_of_templates = [
"abbreviation of", "abb", "abbreviation", "ao",
"acronym of",
"archaic form of",
"archaic spelling of",
"aspirate mutation of",
"clipping of", "clipped form of", "clip",
"contraction of",
"dated form of",
"dated spelling of",
"deliberate misspelling of",
"eclipsis of", "eclipsed",
"eggcorn of", "eggcorn",
"elongated form of",
"euphemistic form of",
"euphemistic spelling of",
"former name of",
"hard mutation of",
"informal form of",
"informal spelling of",
"initialism of", "io",
"lenition of", "lenited",
"misromanization of",
"misspelling of", "common misspelling of", "misspell",
"mixed mutation of",
"mutation of",
"nasal mutation of",
"nomen sacrum form of",
"nonstandard form of",
"nonstandard spelling of",
"obsolete form of",
"official form of",
"rare form of", "rareform",
"rare spelling of", "rarespell", "rarspell",
"short for", "short form of", "short of", "shortfor",
"soft mutation of",
"standard form of",
"standard spelling of", "standspell",
"superseded spelling of", "deprecated spelling of", "superseded form of",
"uncommon form of",
"uncommon spelling of",
]
language_specific_form_of_templates = [
"ar-act-participle",
"ar-adj-inf-def",
"ar-noun-inf-cons",
"ar-noun-pl-coll-cons",
"ar-pass-participle",
"ar-verb-form",
"ar-verbal noun of",
"bg-adjective extended of",
"bg-adjective feminine definite of",
"bg-adjective feminine indefinite of",
"bg-adjective masculine definite object of",
"bg-adjective masculine definite subject of",
"bg-adjective neuter definite of",
"bg-adjective neuter indefinite of",
"bg-adjective plural definite of",
"bg-adjective plural indefinite of",
"bg-plural count of",
"bg-singular definite object form of",
"bg-singular definite subject form of",
"blk-past of",
"cs-imperfective form of",
"cu-Glag spelling of",
"da-pl-genitive",
"de-du contraction",
"de-inflected form of",
"egy-verb form of",
"el-comp-form-of",
"el-form-of-verb",
"el-super-form-of",
"en-archaic second-person singular of",
"en-comparative of",
"en-irregular plural of",
"en-past of",
"en-simple past of",
"en-superlative of",
"fy-NPL",
"fy-noun-entry-pl",
"ga-emphatic of",
"ga-lenition of",
"hu-exaggerated of",
"hy-traditional",
"ia-form of",
"ie-past and pp of",
"ja-new/r",
"ja-past of verb",
"ja-romaji",
"ja-romanization of",
"ja-te form of verb",
"ja-verb form of",
u"kyūjitai spelling of",
"la-comp-form",
"la-part-form",
"lb-inflected form of",
"mn-verb form of",
"pt-cardinal form of",
"pt-pronoun-with-l",
"pt-pronoun-with-n",
"ro-adj-form of",
u"ru-alt-ё",
"ru-participle of",
"sa-desiderative of",
"sa-frequentative of",
"sa-root form of",
"sce-verb form of",
"sco-past of",
"sco-simple past of",
"sga-verbnec of",
"sino-vietnamese reading of",
"ug-latin",
"ug-uly of",
"ug-uyy of",
"yi-inflected form of",
"za-sawndip form of",
]
form_of_templates = [
"abessive plural of",
"abessive singular of",
"abstract noun of",
"accusative of",
"accusative plural of",
"accusative singular of",
"active participle of",
"agent noun of",
"alternative case form of", "alternative capitalisation of", "alternative capitalization of", "altcaps", "altcase",
"alternative form of", "alternate form of", "alt form", "altform", "alt form of", "alt-form",
"alternative plural of",
"alternative reconstruction of",
"alternative spelling of", "alternate spelling of", "altspelling", "altspell", "alt-sp", "alt spell of",
"alternative typography of",
"ancient form of",
"aphetic form of",
"apocopic form of",
"associative plural of",
"associative singular of",
"attributive form of", "attributive of",
"augmentative of",
"broad form of",
"causative of",
"combining form of",
"comitative plural of",
"comitative singular of",
"comparative of", "comparative form of",
"comparative plural of",
"comparative singular of",
"dative dual of",
"dative of",
"dative plural definite of",
"dative plural indefinite of",
"dative plural of",
"dative singular of",
"definite of",
"distributive plural of",
"distributive singular of",
"dual of",
"e-form of", "definite and plural of",
"early form of",
"elative of",
"ellipsis of", "anapodoton of", "ellipse of",
"equative of",
"exclusive plural of",
"exclusive singular of",
"female form of", "fem form",
"feminine noun of",
"feminine of",
"feminine plural of",
"feminine plural past participle of",
"feminine singular of",
"feminine singular past participle of", "feminine past participle of",
"form of",
"frequentative of",
"future participle of",
"genitive of",
"genitive plural definite of",
"genitive plural indefinite of",
"genitive plural of",
"genitive singular definite of",
"genitive singular indefinite of",
"genitive singular of",
"gerund of",
"harmonic variant of",
"honorific alternative case form of", "honoraltcaps",
"imperative of",
"imperfective form of",
"inflected form of",
"inflection of", "conjugation of",
"iterative of",
"late form of",
"masculine animate plural past participle of",
"masculine inanimate plural past participle of",
"masculine noun of",
"masculine of",
"masculine plural of",
"masculine plural past participle of",
"masculine singular past participle of",
"medieval spelling of",
"men's speech form of", "men's form of",
"misconstruction of",
"monotonic form of",
"negative of",
"neuter plural of",
"neuter plural past participle of",
"neuter singular of", "neuter of",
"neuter singular past participle of", "neuter past participle of",
"nominalization of",
"nominative plural of",
"nominative singular of",
"nuqtaless form of",
"oblique plural of",
"oblique singular of",
"obsolete spelling of", "obssp", "obs-sp",
"obsolete typography of",
"participle of",
"passive of", "passive form of",
"passive participle of",
"passive past tense of", "past passive of", "passive past of",
"past active participle of",
"past participle of", "past participle",
"past passive participle of",
"past tense of", "past of",
"paucal of",
"pejorative of",
"perfect participle of",
"perfective form of",
"plural definite of", "definite plural of",
"plural indefinite of", "indefinite plural of",
"plural of", "plural form of",
"present active participle of",
"present participle of",
"present tense of", "present of",
"reflexive of",
"rfform",
"second-person singular of",
"second-person singular past of",
"singular definite of", "definite singular of",
"singular of",
"singulative of",
"slender form of",
"spelling of",
"substantivisation of", "substantivization of",
"superlative attributive of",
"superlative of", "superlative form of",
"superlative predicative of",
"supine of",
"syncopic form of",
"synonym of", "alternative term for", "altname", "synonym", "alternative name of", "synof", "syn-of", "syn of",
"terminative plural of",
"terminative singular of",
"verbal noun of",
"vocative plural of",
"vocative singular of",
]
Benwing2 (talk) 05:00, 25 March 2019 (UTC)
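A check of the kind requested above could be sketched as follows. This is a minimal illustration, not the script actually used: TEMPLATES holds only a small sample of names from the lists above, find_listed_templates is a hypothetical helper, and the regular expression does no name normalization (first-letter case, underscores), which a real run would need.

```python
import re

# Small sample of names taken from the lists above; a real run would use the
# union of all four lists, including the aliases that do not end in "of".
TEMPLATES = {"plural of", "altform", "en-past of", "ru-pre-reform"}

# Matches the template name at the start of a {{...}} invocation.
TEMPLATE_NAME = re.compile(r"\{\{\s*([^|{}]+?)\s*(?:\||\}\})")


def find_listed_templates(wikitext, names=TEMPLATES):
    """Return the listed template names invoked in wikitext, in order of appearance."""
    return [
        match.group(1)
        for match in TEMPLATE_NAME.finditer(wikitext)
        if match.group(1) in names
    ]


sample = "# {{plural of|en|cat}}\n# {{altform|en|colour}} {{head|en|noun}}"
found = find_listed_templates(sample)
```

Matching against a set of exact names rather than a suffix such as "of" is what lets shortcut aliases like altform be caught.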
|dot=<nowiki/>. Benwing2 (talk) 09:53, 25 March 2019 (UTC)
{{form of}} template with the |dot= param? Thanks! Benwing2 (talk) 01:34, 28 March 2019 (UTC)
{{form of}} was on the preceding list. Thanks! Benwing2 (talk) 02:21, 28 March 2019 (UTC)
Sir, have it your way, if you must. But, with respect, you need to understand that language and poetic scansion are not the same thing. You seem to be confusing the two. Traditional metric scansion can sometimes do violence to language, forcing it to do abnormal things, but that does not mean that those abnormalities then become part and parcel of normal everyday speech. Thus, we can be quite certain that, while forced to scan κᾱλός while reciting certain kinds of poetry, speakers of Attic-Ionic never said κᾱλός in actual normal speech.
Now, I am not saying that information about the linguistic abnormalities of metric scansion is not germane to the Wiktionary. On the contrary, I think it is very useful to a student of Classical Greek poetry, provided that it is placed in the appropriate context, such as in a Usage Note (as it is now), and stating very clearly that what applies to the traditional metric scansion of poetry does not apply to normal everyday language.
However, that kind of metric information certainly does not belong under Pronunciation. “Epic Greek” scansion is not a Greek dialect, as Doric, Attic, Ionic, Aeolian or Boeotian are. Nor are “certain other cases” forms of language in the way that dialects are.
Perhaps, as a student of Classical Greek poetry (a highly commendable endeavor to be sure), you have little concern for language outside of metric scansion. But I very much doubt that ancient Greeks went about their day reciting Homer all the time (thus saying κᾱλός much more often than κᾰλός). I am certain that they actually spoke their Greek as a real everyday language.
The Wiktionary is primarily a dictionary, not a guide to metric scansion. And the purpose of a dictionary is to record actual language, not the abnormalities of metric scansion. Pasquale (talk) 16:02, 26 March 2019 (UTC)
Hmm, I agree that the post-Classical Attic pronunciations of καλός (kalós) are probably inaccurate: it's not clear when Greek speakers would have stopped reciting Homer metrically. So it would make more sense to only show the Classical Attic pronunciation for any special Epic forms.
As mentioned above, we tend to include {{grc-IPA}} in every entry, and some entries are for special metrical forms that are spelled differently from the normal forms. So to apply your preferred policy, pronunciations would have to be removed altogether from certain entries. Another option is to label these forms and to only show the Classical Attic pronunciation, because it is not clear how the pronunciation used in reciting poetry would have evolved, but it is plausible that poetry known to the Athenians would have been recited in something like an Athenian accent. I'm more inclined to the latter because I think it is helpful to provide some kind of transcription of poetic words or pronunciations. — Eru·tuon 18:06, 27 March 2019 (UTC)
I am interested in the inline links to Wikipedia that we have enclosed in ], {{w}} (96K pages of transclusions, mostly multiple), and {{w2}} (3K). There is an argument to be made that the frequency with which we feel compelled to have such a link is an indication that it might make a useful entry, provided, of course, that it meets CFI.
I could extract these using modifications of the Perl scripts that I use for {{taxon}} (15K) and {{taxlink}} (25K). Are you aware of any such compilation or of any well-designed code to do the same thing?
Does it make sense to do it from the tool server? How? DCDuring (talk) 23:33, 29 March 2019 (UTC)
{{w}} and {{w2}} instances and could make a program to find wikilinks if you'd like. I figure it would be faster to generate the data from a list of templates and wikilinks than the whole dump. — Eru·tuon 00:12, 30 March 2019 (UTC)
{{w2}} until I was looking for the number of pages that transcluded {{w}}. My {{vern}} does a link to WP, but obviously I know all about them. Are there other templates that link to WP?
{{taxlink}}.) Ideally, I would like to merge all the lists of instances and get counts. Specifically, for a given link, I would like the wikipedia link and the display. The WP links might include links to headers and there could be different displays for a given WP link. The sort should group instances by WP link page, and subsort by any header links and then by display. The groups should sort by decreasing frequency of the entire group. DCDuring (talk) 01:55, 30 March 2019 (UTC)
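The grouping and sorting described above might look roughly like this in Python. It is a sketch under assumptions: each instance is taken to be a (target, display) pair already extracted from the templates and wikilinks, a header is whatever follows "#" in the target, and group_wp_links is a hypothetical name.

```python
from collections import Counter, defaultdict


def group_wp_links(instances):
    """Group (target, display) pairs by Wikipedia page.

    Returns a list of (page, total, rows), where rows are
    (header, display, count) tuples sorted by header and then display,
    and the groups are ordered by decreasing total frequency.
    """
    counts = Counter(instances)
    groups = defaultdict(list)
    for (target, display), n in counts.items():
        page, _, header = target.partition("#")
        groups[page].append((header, display, n))

    result = []
    for page, rows in groups.items():
        total = sum(n for _, _, n in rows)
        result.append((page, total, sorted(rows)))

    # Most frequent pages first; ties are broken alphabetically by page title.
    result.sort(key=lambda item: (-item[1], item[0]))
    return result


# Hypothetical sample data.
sample = [
    ("Oak", "oak"),
    ("Oak", "oaks"),
    ("Oak#Uses", "oak"),
    ("Elm", "elm"),
    ("Oak", "oak"),
]
grouped = group_wp_links(sample)
```

Counting identical pairs first keeps the output compact, since one row then represents every occurrence of a given link and display combination.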
{{w}} and {{w2}} files on my computer. Unfortunately the {{w}} file is too big to save on a Wiktionary page (about 3.5 MB), the other one a bit smaller and available here. If you'd like the {{w}} file, you can send me an email via Special:Emailuser. (I'm not super familiar with file-sharing sites.) — Eru·tuon 02:22, 30 March 2019 (UTC)
Hi, if I edit {{etyl}} to show that a certain language is done, but there are actually still a few forms left, please don't undo my edit to the template. I often list language codes as done that don't actually have an "etyl cleanup/xx" subcategory precisely so that I can find the last few stragglers via CAT:E. Obviously if I've really jumped the gun and there are suddenly dozens of pages in CAT:E, you can and should revert me, but if there's only a handful of pages, then please let it be. I'll find and fix the module errors soon. Thanks! —Mahāgaja · talk 14:43, 9 April 2019 (UTC)
Thought you might find this useful ... I created a programmatic list of all the non-language-specific form-of templates and their properties. Not sure if you use Python but if you do it should be very easy to fetch whatever you want out of this list. Each template has an associated dict of properties:
"aliases": List of aliases
"deprecated-aliases": List of deprecated aliases (should no longer be used)
"withcap": If true, template displays a default initial capital and supports |nocap=
"withdot": If true, template displays a default final period and supports |nodot= and |dot=
"withfrom": If true, template supports |from=, |from2=, etc. to specify regional dialects or whatever
"withPOS": If true, template supports |POS= to control the part of speech of the category
"cat": If present, non-language-specific portion of the category to which the page belongs (prepended with the canonical name of the language to form the actual category name); value could potentially be a list of multiple categories, but no such entries exist among the non-language-specific templates
I'm still working on the corresponding language-specific list. These templates are much messier, often work in idiosyncratic ways, and are often defined manually instead of using a function in Module:form of/templates. I'm gradually converting them and cleaning them up.
form_of_templates = [
("abbreviation of", {"aliases": , "withcap": True, "withdot": True, "cat": "abbreviations"}),
("abstract noun of", {"withcap": True, "withfrom": True, "cat": "abstract nouns"}),
("accusative of", {}),
("accusative plural of", {}),
("accusative singular of", {}),
("acronym of", {"withcap": True, "withdot": True, "cat": "acronyms"}),
("active participle of", {}),
("agent noun of", {"cat": "agent nouns"}),
("alternative case form of", {"aliases": , "withcap": True}),
("alternative form of", {"aliases": , "deprecated-aliases": , "withcap": True, "withfrom": True}),
("alternative plural of", {}),
("alternative reconstruction of", {}),
("alternative spelling of", {"aliases": , "withcap": True, "withfrom": True}),
("alternative typography of", {}),
("aphetic form of", {"withcap": True, "cat": "aphetic forms"}),
("apocopic form of", {"cat": "apocopic forms"}),
("archaic form of", {"withcap": True, "withdot": True, "cat": "archaic forms"}),
("archaic spelling of", {"withcap": True, "withdot": True, "cat": "archaic forms"}),
("aspirate mutation of", {"withcap": True, "withdot": True, "cat": "aspirate-mutation forms"}),
("attributive form of", {}),
("augmentative of", {"withcap": True, "withPOS": True, "cat": "augmentative {{{POS|noun}}}s"}),
("broad form of", {"withfrom": True}),
("causative of", {"cat": "causative verbs"}),
("clipping of", {"aliases": , "withcap": True, "withdot": True, "cat": "clippings"}),
("combining form of", {"cat": "combining forms"}),
("comparative of", {"withPOS": True, "cat": "comparative {{{POS|adjective}}}s"}),
("construed with", {}),
("contraction of", {"withcap": True, "withdot": True, "cat": "contractions"}),
("dated form of", {"withcap": True, "withdot": True, "cat": "dated forms"}),
("dated spelling of", {"withcap": True, "withdot": True, "cat": "dated forms"}),
("dative of", {}),
("dative plural of", {}),
("dative singular of", {}),
("definite of", {}),
("deliberate misspelling of", {"withcap": True, "withdot": True, "cat": "misspellings"}),
("diminutive of", {"aliases": , "withPOS": True, "cat": "diminutive {{{POS|noun}}}s"}),
("diminutive plural of", {"withPOS": True, "cat": "diminutive {{{POS|noun}}}s"}),
("dual of", {}),
("eclipsis of", {"withcap": True, "withdot": True, "cat": "eclipsed forms"}),
("eggcorn of", {"withcap": True, "withdot": True, "cat": "eggcorns"}),
("elative of", {}),
("ellipsis of", {"withcap": True, "cat": "ellipses"}),
("elongated form of", {"withcap": True, "withdot": True, "cat": "elongated forms"}),
("endearing form of", {"withPOS": True, "cat": "endearing {{{POS|noun}}}s"}),
("equative of", {"withPOS": True, "cat": "{{{POS|adjective}}} equative forms"}),
("euphemistic form of", {"withcap": True, "withdot": True, "cat": "euphemisms"}),
("euphemistic spelling of", {"withcap": True, "withdot": True, "cat": "euphemisms"}),
("eye dialect of", {"withcap": True, "withdot": True, "withfrom": True, "cat": "eye dialect"}),
("feminine noun of", {}),
("feminine of", {}),
("feminine plural of", {}),
("feminine plural past participle of", {"cat": "past participle forms"}),
("feminine singular of", {}),
("feminine singular past participle of", {"cat": "past participle forms"}),
("form of", {}),
("former name of", {"withcap": True, "withdot": True}),
("frequentative of", {"cat": "frequentative verbs"}),
("future participle of", {}),
("genitive of", {}),
("genitive plural definite of", {}),
("genitive plural indefinite of", {}),
("genitive plural of", {}),
("genitive singular definite of", {}),
("genitive singular indefinite of", {}),
("genitive singular of", {}),
("gerund of", {"cat": "gerunds"}),
("h-prothesis of", {"cat": "h-prothesized forms"}),
("hard mutation of", {"withcap": True, "withdot": True, "cat": "hard-mutation forms"}),
("harmonic variant of", {}),
("honorific alternative case form of", {"aliases": , "withcap": True}), # FIXME, rewrite with withdot=
("imperative of", {}),
("imperfective form of", {"cat": "imperfective verbs"}),
("inflected form of", {}),
("inflection of", {"deprecated-aliases": }),
("informal form of", {"withcap": True, "withdot": True, "cat": "informal forms"}),
("informal spelling of", {"withcap": True, "withdot": True, "cat": "informal forms"}),
("initialism of", {"aliases": , "withcap": True, "withdot": True, "cat": "initialisms"}),
("iterative of", {"cat": "iterative verbs"}),
("lenition of", {"withcap": True, "withdot": True, "cat": "lenited forms"}),
("masculine noun of", {}),
("masculine of", {}),
("masculine plural of", {}),
("masculine plural past participle of", {"cat": "past participle forms"}),
("medieval spelling of", {"cat": "medieval spellings"}),
("men's speech form of", {"cat": "men's speech terms"}),
("misconstruction of", {"withcap": True, "cat": "misconstructions"}),
("misromanization of", {"withcap": True, "withdot": True, "cat": "misromanizations"}),
("misspelling of", {"aliases": , "withcap": True, "withdot": True, "cat": "misspellings"}),
("mixed mutation of", {"withcap": True, "withdot": True, "cat": "mixed-mutation forms"}),
("nasal mutation of", {"withcap": True, "withdot": True, "cat": "nasal-mutation forms"}),
("negative of", {}),
("neuter plural of", {}),
("neuter singular of", {}),
("neuter singular past participle of", {"cat": "past participle forms"}),
("nomen sacrum form of", {"withcap": True, "withdot": True, "cat": "nomina sacra"}),
("nominalization of", {"cat": "nominalized adjectives"}),
("nominative plural of", {}),
("nominative singular of", {}),
("nonstandard form of", {"withcap": True, "withdot": True, "cat": "nonstandard forms"}),
("nonstandard spelling of", {"withcap": True, "withdot": True, "cat": "nonstandard forms"}),
("nuqtaless form of", {}),
("obsolete form of", {"withcap": True, "withdot": True, "cat": "obsolete forms"}),
("obsolete spelling of", {"aliases": , "withcap": True, "cat": "obsolete forms"}),
("obsolete typography of", {"cat": "obsolete forms"}),
("official form of", {"withcap": True, "withdot": True, "cat": "official forms"}),
("participle of", {"cat": "participles"}),
("passive of", {"cat": "verb passive forms"}),
("passive participle of", {}),
("passive past tense of", {}),
("past active participle of", {"cat": "past active participles"}),
("past participle form of", {"cat": "past participle forms"}),
("past participle of", {"cat": "past participles"}),
("past passive participle of", {"cat": "past passive participles"}),
("past tense of", {}),
("pejorative of", {"withcap": True, "cat": "derogatory terms"}),
("perfect participle of", {"cat": "perfect participles"}),
("perfective form of", {"cat": "perfective verbs"}),
("plural definite of", {}),
("plural indefinite of", {}),
("plural of", {"deprecated-aliases": }),
("present active participle of", {"cat": "present active participles"}),
("present participle of", {"cat": "present participles"}),
("present tense of", {}),
("pronunciation spelling of", {"withcap": True, "withdot": True, "withfrom": True, "cat": "pronunciation spellings"}),
("pronunciation variant of", {"withcap": True, "withdot": True, "withfrom": True, "cat": "pronunciation variants"}),
("rare form of", {"withcap": True, "withdot": True, "cat": "rare forms"}),
("rare spelling of", {"aliases": , "withcap": True, "withdot": True, "cat": "rare forms"}),
("reflexive of", {"cat": "reflexive verbs"}),
("rfform", {}),
("romanization of", {"withcap": True}),
("short for", {"withcap": True, "withdot": True, "cat": "short forms"}),
("singular definite of", {}),
("singular of", {}),
("singulative of", {}),
("slender form of", {"withfrom": True}),
("soft mutation of", {"withcap": True, "withdot": True, "cat": "soft-mutation forms"}),
("spelling of", {"cat": "{{#if:{{{lang|}}}|{{{1}}} forms|{{{2}}} forms}}"}),
("standard form of", {"withcap": True, "withdot": True}),
("standard spelling of", {"aliases": , "withcap": True, "withdot": True}),
("superlative attributive of", {}),
("superlative of", {"withPOS": True, "cat": "superlative {{{POS|adjective}}}s"}),
("superlative predicative of", {}),
("superseded spelling of", {"withcap": True, "withdot": True, "cat": "superseded forms"}),
("supine of", {}),
("syncopic form of", {"cat": "syncopic forms"}),
("synonym of", {"aliases": , "withcap": True}),
("t-prothesis of", {"cat": "t-prothesized forms"}),
("uncommon form of", {"withcap": True, "withdot": True, "cat": "uncommon forms"}),
("uncommon spelling of", {"withcap": True, "withdot": True, "cat": "uncommon forms"}),
("verbal noun of", {"cat": "verbal nouns"}),
("vocative plural of", {}),
("vocative singular of", {}),
]
Benwing2 (talk) 02:48, 10 April 2019 (UTC)
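Fetching information out of a list shaped like the one above could be done along these lines. The sketch uses only four entries copied from the list (none with aliases), and the helper names templates_with and category_for are hypothetical. Categories containing a {{{POS|...}}} placeholder would need extra substitution that is not shown here.

```python
# Four entries copied from the list above; the full list has the same shape.
form_of_templates = [
    ("abstract noun of", {"withcap": True, "withfrom": True, "cat": "abstract nouns"}),
    ("accusative of", {}),
    ("acronym of", {"withcap": True, "withdot": True, "cat": "acronyms"}),
    ("agent noun of", {"cat": "agent nouns"}),
]

# A dict gives direct lookup of a template's properties by canonical name.
properties = dict(form_of_templates)


def templates_with(flag):
    """Return the names of templates whose property `flag` is set and true."""
    return [name for name, props in form_of_templates if props.get(flag)]


def category_for(name, language_name):
    """Prepend the language's canonical name to the template's category, if any."""
    cat = properties[name].get("cat")
    return f"{language_name} {cat}" if cat else None
```

For example, templates_with("withdot") lists the templates that print a default final period, and category_for("agent noun of", "English") builds the category name as described in the property notes.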
Here is my current list of lang-specific form-of templates and their aliases (if there are multiple comma-separated template names listed on a single line, the first one is the canonical name and the remainder are aliases). I haven't gotten around yet to classifying them by behavior, which is difficult in any case because each one is so idiosyncratic and my plan is to obsolete as many as possible.
ar-instance noun of ar-verbal noun of be-Taraškievica spelling of bg-adj form of bg-noun form of bg-pre-reform bg-verb form of blk-past of br-noun-mutation of,br-noun-mutated br-noun-plural ca-adj form of ca-form of ca-verb form of caret notation of ceb-superseded spelling of chm-inflection of cmn-erhua form of,zh-erhua form of cu-Glag spelling of cu-form of da-e-form of da-pl-genitive de-du contraction de-form-adj de-form-noun de-inflected form of de-superseded spelling of,de-deprecated spelling of de-umlautless spelling of de-verb form of de-zu-infinitive of egy-alternative transliteration of,egy-alt egy-verb form of el-Cretan dialect form of el-Cypriot dialect form of el-Italiot dialect form of el-Katharevousa form of el-Maniot dialect form of el-Pontian dialect form of el-form-of-adv el-form-of-nounadj,el-form-of-pronoun el-form-of-verb,el-verb form of el-monotonic form of el-participle of el-polytonic form of en-archaic second-person singular of en-archaic second-person singular past of en-archaic third-person singular of en-comparative of en-ing form of en-irregular plural of en-past of en-simple past of en-superlative of en-third-person singular of,en-third person singular of enm-first-person singular of enm-first/third-person singular past of enm-inflected form of enm-plural of enm-plural past of enm-plural subjunctive of enm-plural subjunctive past of enm-second-person singular of enm-second-person singular past of enm-singular subjunctive of enm-singular subjunctive past of enm-third-person singular of eo-form of eo-root of es-adj form of es-compound of es-note-noun-mf es-verb form of es-verb form of/adverbial es-verb form of/conditional es-verb form of/imperative es-verb form of/indicative es-verb form of/participle es-verb form of/subjunctive es-verb form of/subtense-name es-verb form of/subtense-pronoun et-nom form of et-participle of et-verb form of fa-adj form of,fa-adj-form fa-form-verb ff-fuc-form of fi-form of fi-infinitive of fi-participle of 
fi-verb form of fr-post-1990 fr-pre-1990 fy-pronadv of ga-emphatic of ga-lenition of gl-verb form of gl-verb form of/conditional gl-verb form of/doWork gl-verb form of/error gl-verb form of/imperative gl-verb form of/indicative gl-verb form of/participle gl-verb form of/pronoun gl-verb form of/subjunctive gl-verb form of/subtense-name gl-verb form of/subtense-pronoun gmq-bot-verb-form-sup got-compound of got-nom form of got-verb form of han tu form of,vi-hantu form of he-adj form of he-defective spelling of he-excessive spelling of he-infinitive of he-noun form of he-prep form of he-verb form of hi-form-adj hi-form-adj-verb hi-form-noun hi-form-verb hit-broad transcription of hit-transliteration of hu-exaggerated of hu-inflection of hu-participle hy-form-noun hy-reformed hy-traditional ia-form of ie-past and pp of io-form of is-conjugation of is-inflection of it-adj form of iu-spel ja-form of ja-kyujitai spelling of,kyu,ja-kyu sp ja-past of verb ja-romanization of,ja-romanization-of ja-te form of verb ja-verb form of jbo-rafsi of jyutping reading of ka-form of ka-verb-form-of ka-verbal for,ka-verbal of ko-hanja form of,hanja form of ko-mixed form of ko-root of ku-verb form of la-praenominal abbreviation of lb-inflected form of liv-conjugation of liv-inflection of liv-participle of lt-būdinys,lt-budinys lt-dalyvis-1,lt-dalyvis lt-dalyvis-2 lt-form-adj lt-form-adj-is lt-form-noun lt-form-part lt-form-pronoun lt-form-verb lt-padalyvis lt-pusdalyvis lv-adv form of lv-comparative of lv-definite of lv-inflection of lv-negative of lv-participle of lv-reflexive of lv-superlative of lv-verbal noun of mfe-medial of,mfe-short of mn-verb form of morse code abbreviation morse code for morse code prosign mr-form-adj mt-prep-form my-ICT of nb-noun-form-def-gen nb-noun-form-def-gen-pl nb-noun-form-indef-gen-pl nb-noun-form-indef-pl nl-adj form of nl-noun form of nl-pronadv of nl-verb form of nn-verb-form of nn-verb-form-imp nn-verb-form-past nn-verb-form-pastpart nn-verb-form-pre 
no-noun-form-def no-noun-form-def-pl ofs-nom form of osx-nom form of pi-sc pinyin reading of,pinread,pinof pt-adj form of pt-adv form of pt-apocopic-verb pt-article form of pt-cardinal form of pt-noun form of pt-obsolete-differential-accent pt-obsolete-hellenism pt-obsolete-sc pt-obsolete-secondary-stress pt-obsolete-silent-letter-1911 pt-obsolete-éia pt-obsolete-ôo pt-obsolete-ü pt-ordinal form,pt-ordinal def pt-pron def pt-pronoun-with-l pt-pronoun-with-n pt-superseded-hyphen pt-superseded-paroxytone pt-superseded-silent-letter-1990 pt-verb form of pt-verb-form-of ro-Cyrillic of ro-adj-form of,ro-form-adj ro-form-noun ro-form-verb ro-superseded spelling of roa-opt-noun plural of ru-abbrev of ru-acronym of ru-alt-ё ru-clipping of ru-initialism of ru-participle of ru-pre-reform sa-desiderative of,sa-desi sa-frequentative of,sa-freq sa-root form of sce-verb form of sco-past of sco-simple past of sco-third-person singular of sga-verbnec of sh-form-noun sh-form-proper-noun sh-verb form of,sh-form-verb sino-vietnamese reading of sl-form-adj sl-form-noun sl-form-verb,sl-verb form of sl-participle of sv-adj-form-abs-def sv-adj-form-abs-def+pl sv-adj-form-abs-def-m sv-adj-form-abs-indef-n sv-adj-form-abs-pl sv-adj-form-comp sv-adj-form-comp-pl sv-adj-form-sup-attr sv-adj-form-sup-attr-m sv-adj-form-sup-pred sv-adj-form-sup-pred-pl sv-adv-form-comp sv-adv-form-sup sv-noun-form-adj sv-noun-form-def sv-noun-form-def-gen sv-noun-form-def-gen-pl sv-noun-form-def-pl sv-noun-form-indef-gen sv-noun-form-indef-gen-pl sv-noun-form-indef-pl sv-proper-noun-gen sv-verb-form-imp sv-verb-form-inf-pass sv-verb-form-past sv-verb-form-past-pass sv-verb-form-pastpart sv-verb-form-pre sv-verb-form-pre-pass sv-verb-form-prepart sv-verb-form-pres-pass sv-verb-form-subjunctive sv-verb-form-sup sv-verb-form-sup-pass sw-adj form of tg-adj form of,tg-adj-form tg-form-verb tl-superseded spelling of tl-verb form of tr-copulative form of tr-inflection of tr-possessive form of ug-uly of ug-uyy of 
uk-pre-reform ur-form-adj ur-form-noun ur-form-verb vi-Nom form of,Nom form of,nomof xh-combining stem of yi-alternatively pointed form of yi-inflected form of yi-phonetic spelling of yi-unpointed form of za-sawndip form of zh-alt-form zh-altname,zh-alt-name zh-altterm,zh-alt-term zh-misspelling of,zh-misspelling zh-old-name zh-only used in,zh-only zh-original zh-short,zh-abbrev zh-subst-char zh-sum of parts zh-synonym of,zh-synonym zu-combining stem of zu-verb inf of
Benwing2 (talk) 23:55, 13 April 2019 (UTC)
{{inflection of}}; i.e. any cases where "mp" (possibly with spaces on either end) occurs in param 3 or greater in a call to {{inflection of}}. BTW I missed two templates in the list above (now corrected): Template:he-infinitive of (I just forgot it) and Template:fy-pronadv of (recently added). Benwing2 (talk) 21:24, 14 April 2019 (UTC)
{{inflection of}} containing |mp|. There shouldn't be many cases in which mp isn't a grammar label because it isn't a language code and isn't very likely to be a word, and no instances with explicitly numbered parameters include mp as a grammar tag. — Eru·tuon 21:56, 14 April 2019 (UTC)
BTW as part of my cleanup of the lang-specific form-of templates I wrote some general scripts to rewrite templates in various ways. One of them lets you do fairly simple things like rename templates or remove or rename parameters using command-line arguments; e.g. I used the following:
python rewrite_template.py -t 'e-form of' -n 'da-e-form of' -r lang --filter lang=da --save
to rename {{e-form of}} to {{da-e-form of}} and remove the |lang= parameter, with a filter added saying to operate only when |lang=da, for safety's sake.
Another one lets you specify complex rewrite specifications in code. An example is for rewriting {{et-verb form of}} to {{Inflection of|et|...}} (this latter template doesn't exist yet but it will):
    ("et-verb form of", (
        # The template code supports m=ptc and categorizes specially, but
        # it never occurs.
        "Inflection of",
        ("error-if", ("present-except", )),
        ("set", "1", [
            "et",
            ("copy", "1"),
            "",
            ("lookup", "p", {
                "1s": ,
                "2s": ,
                "3s": ,
                "1p": ,
                "2p": ,
                "3p": ,
                "pass": "pass",
                "": ,
            }),
            ("lookup", "m", {
                "pres": "pres",
                "past": "past",
                "cond": "cond",
                "impr": "impr",
                "quot": "quot",
                "": ,
            }),
            ("lookup", "t", {
                "da": "da-infinitive",
                "conn": "conn",
                "": ,
            }),
        ]),
    )),
This will, for example, rewrite {{et-verb form of|foobar|p=1p|t=conn}} to {{Inflection of|et|foobar||1|p|conn}}, but will complain and refuse to do anything if it sees an unfamiliar parameter or an unexpected value for a known parameter. I also have lots of other scripts to do things like regex-based lookups and rewrites, lists of pages in a given category or namespace or referencing a given page, etc. All of these scripts operate online, although most of them can be passed a list of pages to operate on, making it possible to interface them with scripts that search through a dump. If you're interested, I can make these scripts available. Benwing2 (talk) 22:34, 14 April 2019 (UTC)
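For anyone following along, here is a minimal Python sketch of how a lookup-driven rewrite like the {{et-verb form of}} example could work. It is not the script described above: the function names are hypothetical, the two lookup tables contain only the p=1p and t=conn values confirmed by the example, and m= is left out entirely (so it is rejected as unfamiliar). It also splits on | naively, so nested templates or piped links would need a real wikitext parser.

```python
# Hypothetical sketch; only values confirmed by the example above are filled in.
PERSON = {"1p": ["1", "p"]}   # other p= values deliberately omitted, not guessed
TENSE = {"conn": ["conn"]}    # other t= values deliberately omitted, not guessed


def parse_call(wikitext):
    """Split '{{name|a|k=v}}' into (name, positional list, named dict)."""
    if not (wikitext.startswith("{{") and wikitext.endswith("}}")):
        raise ValueError("not a template call: " + wikitext)
    name, *args = wikitext[2:-2].split("|")
    positional, named = [], {}
    for arg in args:
        key, sep, value = arg.partition("=")
        if sep:
            named[key] = value
        else:
            positional.append(arg)
    return name, positional, named


def rewrite_et_verb_form(wikitext):
    """Rewrite one call, refusing anything unfamiliar instead of guessing."""
    name, positional, named = parse_call(wikitext)
    if name != "et-verb form of":
        return wikitext  # leave other templates untouched
    if len(positional) != 1:
        raise ValueError("expected exactly one numbered parameter")
    unknown = set(named) - {"p", "t"}
    if unknown:
        raise ValueError("unfamiliar parameter(s): " + ", ".join(sorted(unknown)))
    tags = []
    for key, table in (("p", PERSON), ("t", TENSE)):
        if key in named:
            if named[key] not in table:
                raise ValueError("unexpected value %s=%s" % (key, named[key]))
            tags += table[named[key]]
    return "{{" + "|".join(["Inflection of", "et", positional[0], ""] + tags) + "}}"
```

Raising on anything unexpected mirrors the "complain and refuse" behaviour described in the message, which seems the safer default for a bot run.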
{{R:itc:EDL}} or move a numbered parameter to a named one, and realized I might have saved some effort by using your scripts instead, because it turned out to be more complex than I thought. — Eru·tuon 19:59, 23 December 2019 (UTC)
Hi Erutuon, I appreciate your Latin>Cyrillic edits for the terms in Turkic languages.
Just wanted to ask: are you sure those terms prior to your edits were actually typed using Latin characters? Each time I took the effort to use the actual Cyrillic characters using the respective character sets. If so, then I will have those character sets corrected.
Regards, Borovi4ok (talk) 09:07, 19 April 2019 (UTC)
Lat2CyrMap here to automatically replace Latin characters with Cyrillic. (I also sometimes verify using a program that I paste text into to see the names of the characters.) If you have trouble finding the characters, you can use the "Cyrillic" menu under the edit box (also available here) as a reference; all the letters there are Cyrillic except in the "Transliteration" section. — Eru·tuon 09:36, 19 April 2019 (UTC)
Thanks. I actually routinely use the "Cyrillic" menu under the edit box. So I am confused now. Can I be sure that it actually has all the correct characters in it? Borovi4ok (talk) 10:12, 19 April 2019 (UTC)
Hey ... one of the side effects of my adding a whole bunch of inflection tags is that some pages are now running out of memory. One way to attack this is to separate the tags into more and less common ones, and only load the less common set if an unknown tag is encountered. To do this I need a list of all tags by usage; is this something you can produce? Benwing2 (talk) 00:45, 21 April 2019 (UTC)
{{inflection of}}, convert the tags from shortcuts to full forms if possible, and count them. — Eru·tuon 01:29, 21 April 2019 (UTC)
{{inflection of}} involving tags not in Module:form of/data? That way I can fix them up appropriately or add the missing tags to Module:form of/data. Benwing2 (talk) 15:38, 21 April 2019 (UTC)
Fraction of templates with bad tags = 3165 / 56980 = 5.55%
Bad tags:
other = 1138
autonomous = 314
{{lb|ga|archaic}} = 125
Epic = 121
Attic = 107
copulative = 50
negative conjugation = 49
duoplural = 42
definite form = 41
resultative = 40
variant = 39
Doric = 37
unaugmented = 36
Verbal noun = 34
Passive participle = 32
inalienable = 32
possession = 30
(multiple possessions) = 30
indefinite form = 25
{{lb|ga|obsolete}} = 25
...
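As a rough illustration of the counting step discussed above (collect the tags of every {{inflection of}} call, expand shortcuts where known, and tally them), here is a Python sketch. Everything in it is an assumption for illustration: the regex only matches calls without nested templates, the shortcut table is passed in by the caller rather than read from Module:form of/data, and tags are taken to start at the third numbered parameter as in the calls quoted on this page.

```python
import re
from collections import Counter

# Matches only calls with no nested {{...}} inside; a real run would need a parser.
CALL = re.compile(r"\{\{inflection of\|([^{}]*)\}\}")
SEPARATORS = {";", "and"}  # joiners between tags, not tags themselves


def count_tags(wikitext, shortcuts=None):
    """Count inflection tags, mapping shortcuts to full forms when possible."""
    shortcuts = shortcuts or {}
    counts = Counter()
    for call in CALL.finditer(wikitext):
        # Drop named parameters such as lang=ga; keep numbered ones in order.
        positional = [arg.strip() for arg in call.group(1).split("|") if "=" not in arg]
        for tag in positional[2:]:  # params 1 and 2 are the lemma and display form
            if not tag or tag in SEPARATORS:
                continue
            for part in tag.split("//"):  # multipart tags such as nom//acc
                counts[shortcuts.get(part, part)] += 1
    return counts


def split_by_usage(counts, n_common):
    """Split tags into the n most used and the rest, e.g. for lazy loading."""
    common = [tag for tag, _ in counts.most_common(n_common)]
    rare = [tag for tag in counts if tag not in common]
    return common, rare
```

The second function is one way to get the "more common" and "less common" sets mentioned earlier in the thread; where to draw the cut-off is a judgment call the counts alone do not settle.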
{{inflection of}}. All of these are Polish past-tense forms like abonowałyśmy, where "other" means "not masculine personal", but this is far from clear without context. I'd like to replace the "other" tag with something more specific, do you agree? Benwing2 (talk) 14:16, 10 May 2019 (UTC)
ϝείδω and ϝοράω warrant unique inclusion, as they are the common Ancient Greek ancestors of ὁράω, εἴδομαι, and εἶδον. Their existence explains the weirdness of ὁράω, εἶδον, and οἶδα, from two common verbs of origin, and warrants an exception to the usual tendency to skip reconstructed Ancient Greek forms. Indeed ϝείδω's mention in ὁράω is very useful, and instantly explains why its imperfect is ἐώρων. Wing gundam (talk) 00:29, 25 April 2019 (UTC)
Are you sure you got each one or should I revert everything to Rua's edit of 29 minutes ago? DCDuring (talk) 22:59, 28 April 2019 (UTC)
Hello. I remember a while ago you wondered if we could convert uses of |and| in {{inflection of}} to |//|. I wrote a script to do that. It's careful only to combine things of the same type, and I have special exceptions for certain cases where combining doesn't make sense. The script also combines things like |nom|p|;|acc|p|;|voc|p| to |nom//acc//voc|p| and |2|p|pres|indc|;|2|p|pres|subj| to |2|p|pres|indc//subj|. A couple of issues that I'd like your input on:
|and| can be ambiguous in how loosely or tightly it joins. There are cases like |nom|and|voc|and|dat|and|strong|gen|p| (in Modern Irish, which should be read as "(nominative + vocative + dative + strong-genitive) plural") and |def|s|and|p| (in Norwegian, which should be read as "(definite singular) + plural") and |1|s|and|3|p|aor|act|ind| (in Ancient Greek, which should be read in the obvious way). I propose to introduce the code _ to bind more tightly than //, so that the above three examples could be written as |nom//voc//dat//strong_gen|p|, |def_s//p|, and |1_s//3_p|aor|act|ind|. I'm not sure how to display this to indicate the binding, maybe nominative, vocative, dative and strong_genitive plural (with an underscore) or definite-singular and plural (with a hyphen). What do you think? Benwing2 (talk) 02:19, 4 May 2019 (UTC)
_ and //, the tag set should be split into multiple tag sets. For example, litear currently has {{inflection of|ligh||pres|indc|and|pres|subj|and|impr|autonomous|lang=ga}}. This could be expressed as |pres_indc//pres_subj//impr|autonomous|, but might better be expressed as |pres|ind//sub|autonomous|;|impr|autonomous|. I think this especially goes for cases like paca, which has {{inflection of|pacare||3|s|pres|indc|and|2|s|impr|lang=it}}, where the two things being joined share almost nothing; why not use {{inflection of|pacare||3|s|pres|indc|;|2|s|impr|lang=it}}? Benwing2 (talk) 03:02, 4 May 2019 (UTC)
def:s//p. The advantage of having *some* code like this is that the underlying template code has an unambiguous interpretation (even if the output doesn't show it), which can enable various use cases. The interpretation of either underscore or colon as a separator would be inhibited if the tag contains either a link (i.e. any of the [, ] or | chars) or HTML (i.e. the < or > chars). It isn't necessary to inhibit interpretation of // in this fashion because // doesn't normally occur in links or HTML (which is why I chose it); this allows things like {{lb|grc|Epic}}//{{lb|grc|Attic}}, which occurs frequently.
inflection-of-sep and inflection-of-conjoined. If the separator ends or begins with an ASCII space, I think the space has to be replaced with
 
to prevent the MediaWiki parser from moving the space outside of the HTML tag.
(as well as aliases like  
) is replaced with an ASCII space in the HTML emitted by the parser.
nom//acc//voc, this would look roughly like <span class="inflection-of-conjoined">nominative<span class="inflection-of-sep">, </span>accusative<span class="inflection-of-sep"><span class="serial-comma">,</span><span class="serial-and"> and</span></span>vocative</span> if the linking is omitted. Then JavaScript can iterate over each .inflection-of-conjoined element and find the child .inflection-of-sep elements and change their displayed text. — Eru·tuon 18:58, 10 May 2019 (UTC)
tag|tag|and|tag, like nom|acc|and|voc, can safely be changed to tag//tag//tag. This list should include all the templates that need to be fixed. — Eru·tuon 07:33, 4 May 2019 (UTC)
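The safe case described above (a comma-style list such as nom|acc|and|voc) can be sketched in Python as follows. This is only an illustration, not the script that was actually run: the dimension table is a tiny assumed stand-in, and the function deliberately merges only when at least two tags of the same dimension precede "and", so that ambiguous inputs like def|s|and|p are left alone.

```python
# Assumed, deliberately tiny table mapping a tag to its grammatical dimension.
DIMENSION = {
    "nom": "case", "acc": "case", "voc": "case", "gen": "case", "dat": "case",
    "s": "number", "p": "number",
    "1": "person", "2": "person", "3": "person",
}


def join_and(tags):
    """Turn tag|tag|and|tag into tag//tag//tag when all joined tags share a dimension."""
    out = []
    i = 0
    while i < len(tags):
        if tags[i] == "and" and i + 1 < len(tags):
            dim = DIMENSION.get(tags[i + 1])
            # Walk back over the run of already-emitted tags in the same dimension.
            j = len(out)
            while j > 0 and dim is not None and DIMENSION.get(out[j - 1]) == dim:
                j -= 1
            if len(out) - j >= 2:  # comma-style list: at least two tags before "and"
                out[j:] = ["//".join(out[j:] + [tags[i + 1]])]
                i += 2
                continue
        out.append(tags[i])
        i += 1
    return out
```

For example, join_and("nom|acc|and|voc|p".split("|")) gives the tags for nom//acc//voc|p, while def|s|and|p and 3|s|pres|indc|and|2|s|impr come back unchanged and would need the manual or special-case handling discussed in the thread.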
{{inflection of}}
Hey ... sorry to see all the vandalism on your page. I wrote a script to combine adjacent calls to {{inflection of}} into a single call with semicolon separators, and then apply combination logic when sets of inflections differ along only one axis (the same thing I already did to existing calls to {{inflection of}} with semicolons in them). I am thinking of running it, what do you think? Benwing2 (talk) 01:20, 8 May 2019 (UTC)
{{inflection of}} and combine syncretisms as much as possible. I first went through the latest dump and identified subsections where such combination is potentially possible (producing 442,504 subsections on 420,214 pages), and then ran my script on those subsections. The script first combines adjacent calls to {{inflection of}} that can be combined (same language, same lemma, etc.), using |;|, and then seeks to further combine tag sets that differ in a single dimension. Some stats after all combining is done:
Num tag sets seen = 691737
Num tag sets with 1 multipart tags = 342350 (49.49%)
Num tag sets with 0 multipart tags = 300938 (43.50%)
Num tag sets with 2 multipart tags = 48445 (7.00%)
Num tag sets with 3 multipart tags = 4 (0.00%)
Tag sets by ordered dimensions of multipart tags:
= 300938 (43.50%)
case = 169362 (24.48%)
gender = 65031 (9.40%)
person = 52322 (7.56%)
mood = 47584 (6.88%)
case, gender = 44323 (6.41%)
tense-aspect = 5490 (0.79%)
person, mood = 2947 (0.43%)
number = 2146 (0.31%)
person, number = 792 (0.11%)
gender, case = 318 (0.05%)
animacy = 204 (0.03%)
voice-valence = 122 (0.02%)
state = 75 (0.01%)
case, number = 34 (0.00%)
person, tense-aspect = 7 (0.00%)
voice-valence, mood = 7 (0.00%)
unknown = 7 (0.00%)
class, case = 6 (0.00%)
state, case = 4 (0.00%)
class = 4 (0.00%)
case, gender, number = 4 (0.00%)
grammar = 3 (0.00%)
number, case = 2 (0.00%)
number, mood = 1 (0.00%)
number, gender = 1 (0.00%)
person, grammar = 1 (0.00%)
animacy, case = 1 (0.00%)
gender, number = 1 (0.00%)
nom//acc//voc, i.e. it denotes syncretism along an axis), while 300,938 (43.5%) had no multipart tags, 48,445 (7%) had two multipart tags, and only 4 had three multipart tags. The rest of the info specifies the dimensions of the multipart tags: 169,362 (24.48%) of the tag sets had a single multipart tag along the case dimension; 44,323 (6.41%) of the tag sets had two multipart tags, with the earlier one along the case dimension and the later one along the gender dimension (this accounts for almost all the cases of two-axis syncretism); etc.
{{inflection of}} aren't getting formatted right but you get the idea.)
You seem to be a JS “poweruser”. What do you recommend for adding small JavaScript-based refactoring tools? I came across meta:TemplateScript, is this any good? I have some Python scripts I use for formatting but I always need to switch back to the terminal, copy&paste etc, I'd like to streamline this. – Jberkel 15:45, 9 May 2019 (UTC)
Thanks for taking a look at my addition. I admit to being a little out of my depth when it comes to some of the finer details like the declension- I basically combined information from an entry starting with πολύ- and one ending with -γονον after checking the LSJ at Perseus and the Gaffiot entry. I also checked as many of the forms as I could get the word study tool at Perseus to show me, though for some reason I couldn't get the genitive to display.
I was wondering if we have any reference template for pages from the Naples Dioscurides. It seems to be an alphabetized condensation of De Materia Medica from the Middle Ages, but it has very nice illustrations, and it's viewable online here. If we do, it would be nice to link to folio 121 for this entry. Chuck Entz (talk) 05:31, 11 May 2019 (UTC)
Although, https://www.unicode.org/versions/Unicode12.1.0/ did come out in May, specifically for 令和 :p —Suzukaze-c◇◇ 02:39, 14 May 2019 (UTC)
Dear Erutuon! May I bother you again... it is not urgent. I am writing this little module: if a Greek word begins with x, x, x letters, then write article την.
I do not know exactly how to write them. I know, I should not use commata, and that they need U+ codes (I have them) and something like local gsub = mw.ustring.gsub. Is there a module where I can see examples? I've looked at transliteration modules, but they substitute letters which is a bit different. --sarri.greek (talk) 11:52, 15 May 2019 (UTC)
mw.ustring.find. (string.find does not always work for Greek letters because it looks at bytes and Greek letters are two or three bytes long in UTF-8.) It returns a number (actually two numbers, but that doesn't matter in the code that you showed me) if the letter was found or nil if it was not, so it can be used in the protasis of an if-statement (if mw.ustring.find(...) then ... end or if mw.ustring.find(...) ~= nil then ... end if you want to explicitly convert to a boolean). To check if a term begins with α, you can use mw.ustring.find(term, '^α'). To check if a term begins with one of multiple characters, put them in square brackets: mw.ustring.find(term, '^') checks if term begins with a lowercase vowel letter. ^ at the beginning of the pattern forces the pattern to match only at the beginning of the term, so mw.ustring.find('τη', '^') returns nil but mw.ustring.find('τη', '') returns a number.
term = mw.ustring.toNFD(term) before using mw.ustring.find. When decomposed, for instance ά (U+03AC GREEK SMALL LETTER ALPHA WITH TONOS) becomes ά (U+03B1 GREEK SMALL LETTER ALPHA, U+0301 COMBINING ACUTE ACCENT), and mw.ustring.find(mw.ustring.toNFD('ά'), '^') will return a number while mw.ustring.find('ά', '^') returns nil.
Hello, could you help me out with Akkadian traditional transcription and IPA. I could use a template that could convert the transcription to IPA. Luckily, it's pretty straightforward. Each letter has a single correspondence except for e which would have to be input manually. – Tom 144 (𒄩𒇻𒅗𒀸) 22:09, 26 May 2019 (UTC)
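Going back to the Greek-initial check in the thread just above (the mw.ustring.find advice), the same two ideas, matching code points rather than bytes and decomposing accented letters first, can be tried outside Scribunto. The sketch below is a Python analogue, not the Lua module code; the vowel set and the function name are assumptions for illustration.

```python
import re
import unicodedata

# Assumed set of unaccented lowercase Greek vowels: α ε η ι ο υ ω.
VOWELS = "\u03b1\u03b5\u03b7\u03b9\u03bf\u03c5\u03c9"


def begins_with_vowel(term):
    """True if the term starts with a vowel letter, accented or not."""
    # NFD splits e.g. U+03AC into U+03B1 + U+0301, so the base letter can match.
    decomposed = unicodedata.normalize("NFD", term)
    # re.match anchors at the start of the string, like ^ in a Lua pattern.
    return re.match("[" + VOWELS + "]", decomposed) is not None
```

Without the normalization step, a word beginning with precomposed ά (U+03AC) would not match the character class, which is the same pitfall described for mw.ustring.find.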
Cheers, I made a request for Franc-Comtois. --Lo Ximiendo (talk) 03:40, 3 June 2019 (UTC)
I know a specific edit number for a WP edit that allegedly triggered a ban of a veteran user. I'd like to see it and the context and judge for myself. I don't know what page was being edited, nor the date. If you don't know how to do this, do you have any idea where I can look? DCDuring (talk) 12:03, 12 June 2019 (UTC)
https://wiki domain/w/index.php?diff=edit number. You don't need the page name because all edit numbers on a wiki are unique. If you then want to look at the history for more context, you can note the date, click the History tab, and enter the date to view edits around that time. — Eru·tuon 18:16, 12 June 2019 (UTC)
Lots of errors in documentation pages of translit modules, which you seem to have introduced. Benwing2 (talk) 05:01, 28 June 2019 (UTC)
@Erutuon! THANK YOU. What you have taught me at this module, I applied here annnddd it works wonderfully! (el:λύση, gen.sg). Ι will expand now! You are my hero. sarri.greek (talk) 05:44, 25 July 2019 (UTC)
I'm creating a new module designed to implement a template that would replace {{fi-IPA}}, {{fi-hyphenation}} and {{rhymes|fi}}. I was planning to name this Module:fi-pronunciation, Template:fi-pronunciation (after I realized {{fi-pron}} was taken for pronouns). However, Module:fi-pronunciation is used too, by a module of yours that seems to be an unused template meant to replace (?) Module:fi-IPA. Mind if I (eventually) take the name for my module? — surjection ⟨?⟩ 20:00, 3 August 2019 (UTC)
I pressed ENTER too fast and now my message in the edit summary of antun appears rude. But what I mean is that such sentences do not look marked enough, not enough herausgehoben (set off). There are also usage examples on new lines with various templates, for example in erinnern, but it seems excessive and I imagine the templates are abused this way. All that I know is not satisfactory. Are there better methods? Fay Freak (talk) 22:12, 11 August 2019 (UTC)
: containing the usage examples. Having notes in paragraph tags and examples in unordered lists (as in the linked diff) makes sense HTML-tag-wise, though I can't speak for whether it looks good or not.
You made a Wonderfool Module? That's so lame. --Gibraltar Rocks (talk) 15:38, 15 August 2019 (UTC)
I was under the influence of măceș. When I pass a multiple-word term as third parameter (altdisplay parameter) of the normal linking templates and use square brackets to link the separate terms the diacritic strip does not run, so I added {{der|ro|bg||*] (])}}. Though I could pass the same thing with the desired effect to the second parameter so it does not make sense to use the third, something else does not make sense either. I remember I had this problem unrelated to Bulgarian, I think it was Latin diacritics did not get stripped in such an environment – I only now see the pattern, and yep the test code works with Latin content; before, because of the described error I thought that ѝ does not get stripped because of special handling, so people can link ѝ. But how do people link ѝ anyway if the diacritic is stripped? It’s another thing somewhere here that does not make sense. (Arguably, the page should not exist, but the content abide on и with the diacritic in the headerlines.)
And I do remember that there was that discussion about stuff removed from Arabic-script links, differently in Arabic and Persian, I remember the mechanism left much to be desired. Fay Freak (talk) 06:31, 21 August 2019 (UTC)
{{m|bg||]}}.) Perhaps they should be refined to leave the accent on this word. Without hardcoding anything in Module:languages, that could be done by replacing the word ѝ with a placeholder, removing grave accents, then putting ѝ back again. Hacky, but it would work. The other option is moving the Bulgarian entry from ѝ to и – if the word is usually spelled without the grave accent, outside of teaching materials or dictionaries. — Eru·tuon 07:12, 21 August 2019 (UTC)
Hello - I added inflection templates to अल्प, but am not sure how to add the irregular masc. nom. pl. in -e. Do you know how to override the adjective templates? Hölderlin2019 (talk) 23:33, 28 August 2019 (UTC)
{{sa-decl-adj-mfn}}, which is just the templates {{sa-decl-noun-m}}, {{sa-decl-noun-f}}, and {{sa-decl-noun-n}} concatenated together. It would be possible to add parameters like |m_nom_s=, |m_gen_s=, etc., which punch through to the |nom_s=, |gen_s= parameters of the {{sa-decl-noun-m}} template inside. Regardless, this is not how I wanted to build this template since Sanskrit adjectives sometimes decline differently from the nouns. So... yeah. —*i̯óh₁n̥C 05:01, 29 August 2019 (UTC)
Hi :)
Could you help with this?
Hi, I found this via your common.js: User:Erutuon/scripts/simpleTranslations.js. It contains this: {{]}} for Latin-script terms with just lang, term, and gender, to reduce Lua memory usage, using ]
Is this still relevant? If yes, would it not be a good idea to improve the TranslationAdder.js to insert these for da, no, nb, etc.? WDYT?
I saw that some pages have sub-pages /translations to work around the Lua memory issue. Can massive use of t-simple avoid that?--So9q (talk) 10:40, 9 September 2019 (UTC)
{{t-simple}}. It's just a workaround on pages that are in CAT:E because they are using too much Lua memory. And {{t-simple}} doesn't always reduce memory enough to remove the error messages; that's why there are translation subpages. — Eru·tuon 17:05, 9 September 2019 (UTC)
Share your experience in this survey
Hi Erutuon/2019,
The Wikimedia Foundation is asking for your feedback in a survey about your experience with Wiktionary and Wikimedia. The purpose of this survey is to learn how well the Foundation is supporting your work on wiki and how we can change or improve things in the future. The opinions you share will directly affect the current and future work of the Wikimedia Foundation.
Please take 15 to 25 minutes to give your feedback through this survey. It is available in various languages.
This survey is hosted by a third-party and governed by this privacy statement (in English).
Find more information about this project. Email us if you have any questions, or if you don't want to receive future messages about taking this survey.
Sincerely,
RMaung (WMF) 14:34, 9 September 2019 (UTC)
In {{context}}
, I restored the version that does not show the long red message. The point of deprecation as opposed to deletion is to make page histories legible. I did that after I noticed in page histories illegibility that I did not expect to be there, and then found the source of the illegibility.
I understand this was an attempt to prevent people from using the template. There is a better way, preserving history legibility: create an edit filter that is going to prevent people from saving an entry that contains a deprecated template. No one created such a filter yet and I don't know why; I fear I do not have enough user rights to edit these filters.
In any case, we have deprecation under control via Category:Pages using deprecated templates, which now contains 4 pages. I am cleaning up the category once in a while, and I remember similar counts. It is very manageable. With the edit filter, it would be even easier. --Dan Polansky (talk)
You do a lot of valuable work with templates and modules. Would you consider becoming an administrator? — SGconlaw (talk) 11:51, 13 September 2019 (UTC)
Hi, I just discovered that these entries have been converted by you to t-simple because of the Lua memory bug but in a way that does not show the information about gender.
* Danish: {{t-simple|da|næse|c|langname=Danish|interwiki=1}}
This is correct:
* Danish: {{t-simple|da|næse|g=c|langname=Danish|interwiki=1}}
--So9q (talk) 11:33, 16 September 2019 (UTC)
|g= and change my script. — Eru·tuon 18:24, 16 September 2019 (UTC)
{{t-simple}} from the latest dump:
|1=: 16129
|2=: 16129
|3=: 3716
|4=: 1
|alt=: 141
|g=: 323
|interwiki=: 6342
|lang=: 1
|langname=: 15341
|lit=: 1
|sc=: 66
|tr=: 317
|3= is so common (because of me no doubt), {{t-simple}} now accepts the gender in either |3= or |g=. I also checked and there was only one instance with both |3= and |g=, which I corrected. — Eru·tuon 20:28, 16 September 2019 (UTC)
Concerning this do you have a link to a policy or vote stating this norm? I found nothing in wt:EL and other style pages I looked at.--So9q (talk) 08:05, 18 September 2019 (UTC)
Congratulations! Chuck Entz (talk) 13:01, 30 September 2019 (UTC)
Re your reversion, removal of images: The word majolica has been dogged with confusion since it is used for two distinctly different products in different countries in different periods of time. All other dictionaries than Wiktionary define it inaccurately or omit one sense of the word. Hard to believe but true. The two products, the two meanings of majolica, the two majolicas are visibly different. I feel the deleted images assist understanding and warrant an exception to the 'minimal images' rule.
Davidmadelena (talk) 23:10, 15 October 2019 (UTC)
{{commons}}
. — Eru·tuon 23:33, 15 October 2019 (UTC)
Some of these should not be removed, but rather replaced with an em dash, e.g. . Equinox ◑ 21:25, 18 October 2019 (UTC)
Regarding diff, I thought the whole point of translation subpages was that they would avoid Lua memory problems without the need for the clumsy {{t-simple}} template. That's why I've been going through them and removing it from them. If you're readding it though, then we're working at cross purposes. —Mahāgaja · talk 09:40, 24 October 2019 (UTC)
{{t-simple}} because it was running out of memory. In general I'm in favor of having translation subpages use {{t}}, {{t+}} if they can without running out of memory; if fire/translations can be switched back (maybe I should make a script for this), it should be. — Eru·tuon 16:07, 24 October 2019 (UTC)
{{t-simple}} in that case is unavoidable. —Mahāgaja · talk 19:50, 24 October 2019 (UTC)
I'm extremely new to Lua. Having a solid background in JavaScript has helped me transition, but I appreciate the improvements you've offered. I just wanted to tell you that I've been working on a major update to the module script, which I've been editing offline because...Wiktionary's editor isn't as convenient as EditPad for indentation, regular expressions text search and replacement, etc.
Some background information: I know ideally, if I can get more people to help me out with Marshallese maintenance on Wiktionary (and on Wikipedia, where I'm mostly responsible for it, too), I can't just treat scripts like something I can write and maintain unilaterally. But for now, the script is still very much in flux, not just in the state of code but in the wisdom of coding decisions, etc. For instance, I think I made a huge mistake embedding separate MED vs. Choi vs. Willson IPA symbols, because they don't actually represent different dialects, but merely different published researchers' occasionally conflicting phonological analyses of the language. Honestly, the state of Marshallese linguistics publications can be a bit of a mish-mash of different researchers doing their own things and not always agreeing on conventions, which has led to me occasionally having to get a tad...creative. Lately I've been asking for more peer review on w:Talk:Marshallese language to help improve the occasionally confused and OR-prone state of the article and pronunciation templates, and the scripting I write here is something I hope can eventually be used there as well where appropriate. That effort on Wikipedia, like this script and the Wiktionary:About Marshallese proposal, is still very much a work in progress, and for the most part I've had to maintain it all myself, and inadequate peer review means the mistakes I make tend to become the decisive word in how the wikis describe the language, sometimes for years on end until someone (or myself) notices the problem.
So thank you for your help with scripting and setting up some simple test cases, etc. While I'm still improving the script offline, I've made note of your improvements and am trying to add them in the offline editing before I submit and test features of a new update, all while trying not to break currently deployed invocations in the process. - Gilgamesh~enwiki (talk) 08:03, 31 October 2019 (UTC)
{{mh-ipa-rows}} and provided more informative module errors, and then possibly made the errors useless by removing u from the supported characters. (All the erroring instances had u.) Wiktionary:About Marshallese still needs updating though. — Eru·tuon 17:18, 2 November 2019 (UTC)
a.method() is a method call and passes an implicit this, equal to a, to the method. In Lua, a:method() is the closest equivalent; it passes a as the first argument to the method. a.method() would call the method with no arguments. The functions in the string library are available when a string value is indexed (via the __index field in the metatable for strings), so if text is a string, text.gsub gives a function equal to string.gsub, and text:gsub(pattern, replacement) is equivalent to string.gsub(text, pattern, replacement), and is analogous to text.replace(regex, replacement) in JavaScript. text.gsub(a, b) would fail to pass text as the first argument to the function, so is equivalent to string.gsub(a, b): a is the string, and b is the Lua pattern. (Lua will throw a runtime error because the replacement value is required: "string/function/table expected".) In JavaScript, it would be sort of similar to do { const replace = text.replace; replace.call(a, b); }.
nil value for map and such indexings. If local map = {} and local a = "u", then accessing map will cause the error "attempt to index field '?' (a nil value)" because map is nil (there is no value indexed by a) and nil values can't be indexed in vanilla Lua. So I added a check that will prevent the "indexing of nil" error message, since I like error messages to be somewhat understandable (even though average users can't fix them). The error message might be wrong, since I was writing it quickly, and it's possible the check is no longer needed, if the module ensures that the transcription has correct phonotactics or syntax before that point. — Eru·tuon 20:29, 2 November 2019 (UTC)
For arg:func() to work, indexing arg.func (or arg["func"]) has to yield a function. So, setting func as a field in a table with local arg = { func = table.insert } enables it to be used as a method: arg:func("elem"). (The same can be done by setting the metatable for the table: local arg = setmetatable({}, { __index = { func = table.insert } }).)
string library as methods, but it can't be modified. — Eru·tuon 15:35, 3 November 2019 (UTC)
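The colon-versus-dot behaviour described in this exchange can be checked with a short sketch. It is a minimal illustration in plain Lua that uses only the standard string and table libraries; the variable names (text, list, viaMeta) are made up for the example and do not come from the module.

```lua
local text = "banana"

-- Colon syntax passes `text` as the first argument, so both calls agree.
local viaMethod = text:gsub("a", "o")
local viaLibrary = string.gsub(text, "a", "o")
print(viaMethod, viaLibrary)

-- Dot syntax finds the same function through the string metatable,
-- but does not pass `text` to it.
print(text.gsub == string.gsub)

-- text.gsub("a", "o") is therefore string.gsub("a", "o"): the replacement
-- argument is missing, so Lua raises an error.
local ok, err = pcall(function() return text.gsub("a", "o") end)
print(ok, err)

-- A function stored in a table field can be called as a method.
local list = { func = table.insert }
list:func("elem") -- same as table.insert(list, "elem")
print(list[1])

-- The same lookup routed through a metatable's __index table.
local viaMeta = setmetatable({}, { __index = { func = table.insert } })
viaMeta:func("elem")
print(viaMeta[1], rawget(viaMeta, "func"))
```

In the metatable variant the table itself stays empty of function fields, which is why rawget returns nil there.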
this thing and the difference between prototypes and metatables. — Eru·tuon 17:24, 4 November 2019 (UTC)
Also, if you don't mind my asking, are there any thoughts or critiques you could offer on how I structure the module code, the things I'm doing in the functions, etc.? I'm trying not to make my code too convoluted, but I'm also consciously aware I'm exercising some degree of feature creep. And when I realized you were also exporting the internal conversion functions, I changed the export naming convention so that all such functions are prefixed with an underscore to indicate they are internal functions not intended for normal exported use, as opposed to the actual export functions. - Gilgamesh~enwiki (talk) 19:00, 2 November 2019 (UTC)
{{IPA}} to have the separate inputs in numbered parameters, rather than separate them with commas in a single parameter, and to bracket them separately: for instance, {{mh-ipa-rows|j&ngw&wil|jengwewil}} instead of {{mh-ipa-rows|j&ngw&wil, jengwewil}}, yielding /tʲeŋʷewilʲ/, /tʲɛŋʷɛwilʲ/ as the phonemic transcription instead of /tʲeŋʷewilʲ, tʲɛŋʷɛwilʲ/. But this might complicate {{mh-ipa-rows}} or the module, so you should be the one to decide. — Eru·tuon 18:51, 5 November 2019 (UTC)
Okay, so to be clear...calling gsub with tbl is equivalent to function(match) return tbl[match] or match end? I thought if the item wasn't in the table, it might return nil or something, which is why I wrote it as a function that returns the item or match. Also, I noticed you replaced all those substitutions with "("..V..")(ː*)%1". I was honestly not aware it was possible to reference a capture within the same pattern. - Gilgamesh~enwiki (talk) 20:40, 4 November 2019 (UTC)
If the function or table passed to gsub returns nil for a particular match, no change will be made to that match. For instance, both ("bat"):gsub(".", { b = "c" }) and ("bat"):gsub(".", function(char) if char == "b" then return "c" end end) return "cat". (Whereas in JavaScript if you do "bat".replace(/./g, function(char) { if (char === "b") { return "c"; } }) you get "cundefinedundefined". Heh.) — Eru·tuon 20:55, 4 November 2019 (UTC)
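A small sketch of the two gsub behaviours mentioned in this exchange: a table or function replacement that yields nil leaves the match unchanged, and a pattern can refer back to its own capture with %1. The sample strings are invented for illustration and use ASCII only, so the byte-based string library is enough.

```lua
-- Table replacement: only "b" has an entry, so "a" and "t" are kept.
local tableResult = ("bat"):gsub(".", { b = "c" })

-- Function replacement: returning nothing (nil) also keeps the match.
local functionResult = ("bat"):gsub(".", function(char)
  if char == "b" then
    return "c"
  end
end)

-- Back-reference inside the pattern: "(a)%1" matches "a" followed by the
-- same captured text again, and the pair is rewritten as "a:".
local lengthened = ("kaak"):gsub("(a)%1", "%1:")

print(tableResult, functionResult, lengthened)
```

This is why function(match) return tbl[match] or match end and passing tbl directly give the same result.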
{{mh-ipa-rows}} templates, which I made two days ago with Pywikibot. I can regenerate it soon if you like. — Eru·tuon 16:41, 6 November 2019 (UTC)
Since you've been helping me maintain the module code, I thought I should let you know that I made some major changes to the code structure. I wrote a new local function, gsubBatch, to help reduce boilerplate in the source, since gsub is called a lot and I wanted to streamline it. - Gilgamesh~enwiki (talk) 23:55, 13 November 2019 (UTC)
My gsubBatch function may not have been as wise as I once thought. Though it makes code more elegant to read, it can actually make it harder to debug, because errors raised inside anonymous functions don't seem to report their line numbers, which in a long batch makes it harder to determine where the error came from. I may find myself restructuring code again, but if a lot of sequential gsub calls are necessary, I think I'd rather reduce the length of some variable names, because the sheer amount of boilerplate can be awful. - Gilgamesh~enwiki (talk) 00:55, 18 November 2019 (UTC)
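One possible shape for such a batch helper is sketched below. It is not the module's actual gsubBatch, whose source is not quoted here; the name gsub_batch and the { pattern, replacement } rule format are assumptions for illustration. Wrapping each call in pcall lets the helper report which rule failed, which is one way around the missing line numbers.

```lua
-- Hypothetical helper: applies { pattern, replacement } rules in order.
local function gsub_batch(text, rules)
  for index, rule in ipairs(rules) do
    -- pcall returns false plus the error message if the rule fails.
    local ok, result = pcall(string.gsub, text, rule[1], rule[2])
    if not ok then
      error("rule " .. index .. " (" .. rule[1] .. ") failed: " .. tostring(result), 2)
    end
    text = result
  end
  return text
end

local batched = gsub_batch("bat", {
  { "b", "c" }, -- "bat" -> "cat"
  { "t", "b" }, -- "cat" -> "cab"
})
print(batched)
```

The same wrapper would work with mw.ustring.gsub in place of string.gsub, at the cost discussed elsewhere in this thread.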
gsubBatch mechanism again, I'll look into it. - Gilgamesh~enwiki (talk) 17:05, 19 November 2019 (UTC)
I just noticed a strange abundance of words in the table spelt "Wiktionary:About Marshallese", with six different phonological forms. :) Also, been adding more words up to moments ago. - Gilgamesh~enwiki (talk) 17:05, 19 November 2019 (UTC)
{{mh-ipa-rows}} in Wiktionary:About Marshallese as well as in entries; then I have to remove the unwanted titles. I added a list of titles to exclude so that in the future the unwanted titles can be automatically removed. Perhaps alternative spelling entries could just be soft redirects using {{alternative spelling of}}, without any definition or pronunciation (because both of those are the same for all spellings). I changed M̧ajōļ to an alternative spelling entry for M̧ajeļ based on something you said in the Wikipedia discussion, but am not sure about the others. — Eru·tuon 17:18, 19 November 2019 (UTC)
I may have significantly increased the module's execution time, which may be extending table load times. I changed it so that forRemainder is actually (pretty much unconditionally) called twice and the duplicate result discarded. This is for careful mode (variable name subject to change), to satisfy inconsistencies between the way Bender (1968) and Willson (2003) described the language, and the more careful pronunciations prescribed by Naan (2014). Basically, in careful mode, the nasal consonant cluster assimilations are avoided, there's a handful more cases where clusters have epenthesis instead of assimilation, and the behavior of epenthetic vowels neighboring glides has changed. I don't necessarily see an inconsistency in including both, since most languages (including English) have words or phrases that differ notably in pronunciation when spoken more rapidly or more slowly, and can change how people perceive the word in their own speech. Compare "ornge" vs. orange, where some people primarily speak it as two syllables, and some (like me) say it as one syllable. - Gilgamesh~enwiki (talk) 20:02, 20 November 2019 (UTC)
mw.ustring.gsub. It's not a very efficient function because it's implemented using PHP regex and calls go over the Lua–PHP boundary. Sometimes the number of calls can be reduced by generalizing the patterns (regexes) and using a function replacement. — Eru·tuon 20:17, 20 November 2019 (UTC)
mw.ustring.gsub call to handle all assimilations, and perhaps the overhead of calling a function for every series of two consonants is less than the overhead of multiple calls to mw.ustring.gsub. I think that's plausible because of all that PHP has to do for each mw.ustring.gsub call.
C2 already exists as a separate higher scope variable, and using a different variable name may reduce the risk of variable name confusion and make the code more readable.
a, b, c, d as a sequence of captures, and easier on the eyes than letter-numbering them like c1, c2, c3, c4, etc. Anyway, I think I know what you're trying to accomplish. Your code broke some of the (as of yet unused) nʷtˠ logic, but what you're doing here looks very, very clever and I think I know how to take it and run with it with other parts of the code. - Gilgamesh~enwiki (talk) 22:58, 20 November 2019 (UTC)
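The idea of generalizing the pattern so that one gsub call covers several rules can be sketched as follows. The consonant pairs and their outputs are invented ASCII stand-ins, not real Marshallese assimilation rules, and string.gsub is used in place of mw.ustring.gsub so the sketch runs outside Scribunto.

```lua
-- Hypothetical rule table: cluster -> assimilated cluster.
local ASSIMILATIONS = { np = "mp", nk = "Nk", tp = "pp" }

-- One call per rule: three passes over the string.
local function assimilate_many_calls(text)
  text = text:gsub("np", "mp")
  text = text:gsub("nk", "Nk")
  text = text:gsub("tp", "pp")
  return text
end

-- One generalized pattern: every two-consonant sequence is looked up in the
-- table, and sequences without an entry are left unchanged.
local function assimilate_one_call(text)
  return (text:gsub("[nt][pk]", ASSIMILATIONS))
end

print(assimilate_many_calls("anpankatpa"), assimilate_one_call("anpankatpa"))
```

The two versions only agree when the rules do not feed one another; if one rule's output is meant to be another rule's input, the single pass would have to encode that explicitly.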
C1, A1, C2, A2 were abbreviations of "consonant 1", "articulation 1", "consonant 2", "articulation 2" (though that's not completely accurate terminology, since it's more like primary and secondary articulation), so more descriptive than either a, b, c, d or c1, c2, c3, c4. — Eru·tuon 23:03, 20 November 2019 (UTC)
x, xx, y, yy. It helps that neither X nor Y are in the standard new orthography. And when I realized what you were doing, I rewrote your function. May I demonstrate...? - Gilgamesh~enwiki (talk) 23:36, 20 November 2019 (UTC)
In response to your question, "Why did the epenthetic vowel disappear between the p and the k in Āneeļļapkaņ?", the pattern is not matching the /pʲkˠ/ when mw.ustring.gsub is called the second time, because /lˠlˠ/ is not changed when mw.ustring.gsub is called the first time, and is matched both times. Here is a technique for cases like this that also allows mw.ustring.gsub to be called only once. (Gah, in the edit summary I meant to say "getting the surrounding consonants with mw.ustring.sub", not "mw.ustring.gsub".) — Eru·tuon 02:54, 21 November 2019 (UTC)
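A rough sketch of the general idea, as far as it can be inferred from this exchange: instead of capturing the neighbouring consonants (which consumes them, so adjacent matches cannot share one), capture only positions with empty captures and read the neighbours with sub. The "@" placeholder, the fill rule and the sample string are hypothetical, and the byte-based string library stands in for mw.ustring.

```lua
-- Hypothetical rule: the placeholder "@" becomes "o" next to a "k", else "e".
local function fill(left, right)
  if left == "k" or right == "k" then
    return "o"
  end
  return "e"
end

local text = "p@k@l"

-- Capturing the neighbours consumes them: after "p@k" is matched, the "k"
-- is no longer available as the left neighbour of the second "@".
local consuming = text:gsub("(.)@(.)", function(left, right)
  return left .. fill(left, right) .. right
end)

-- Empty captures "()" yield positions instead of text, so only "@" itself
-- is consumed and the neighbours are read from the original string.
local positional = text:gsub("()@()", function(startPos, afterPos)
  local left = text:sub(startPos - 1, startPos - 1)
  local right = text:sub(afterPos, afterPos)
  return fill(left, right)
end)

print(consuming, positional)
```

With this approach a single gsub pass handles both placeholders, whereas the consuming pattern would need to be run twice.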
i and j indices was clever. (I renamed them xvi and yvi.) It all...seems to work now. Now let's see if I can rewrite the logic of another expensive regex batch without breaking it too badly.
mw.ustring.gsub in Module:mh-pronunc in the generation of the testcases table (counted thus) has been reduced from 228,294 to 156,516.
I've been considering an alternative approach to programming the phonetic algorithm. As it currently stands, the regex approach is effective in thoroughly processing the input text, but it's also proven a lot more inefficient than I predicted. Putting more logic into substitutor functions improves the performance somewhat, but in a process where regex replaces matches one by one, it's not as practical in making necessary adjustments to vowels that were already replaced. For example, this existing code:
-- {yekʷey, yewan} are , not
text = gsub(text,
	"(ɦʲ@*)()(@*.ʷ.?ʷ?@*)", function(a, b, c)
		return a..VOWELS_Y..c
	end)
Unlike other logic that replaces text based on what already exists to the match's left-hand side, this replacement can only be made if the stable value of the vowel on the right is already known. This is how I earlier solved the Ānewātak problem so that its phonetics were properly displayed as instead of . In a more optimized approach, that could be fixed in a second regex pass, but I think I have a better idea—I just don't know beforehand how practical it will be.
Basically, my idea is, instead of relying so much on regex, just parse the input text and represent its data as a doubly linked list of table objects, where each node represents either a consonant or a vowel. Code could loop through the link nodes, make changes in them informed by nodes that come before or after, and can make secondary changes to previous node data as needed. Then, when the linked list is done being manipulated, convert it back to text.
But can this all be done in Lua using only linked lists and logic, more efficiently than batches of regex replacements can do it? - Gilgamesh~enwiki (talk) 18:46, 22 November 2019 (UTC)
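A minimal sketch of the linked-list idea, to make the proposal concrete. Everything here is hypothetical: the node fields, the ASCII vowel set and the toy rule (a vowel directly before "w" is rewritten as "o") are invented for illustration and are not taken from the module. It assumes one byte per segment; real input would first need splitting into code points.

```lua
local VOWELS = { a = true, e = true, i = true, o = true, u = true }

-- Parse a string into a doubly linked list of segment nodes.
local function to_nodes(text)
  local head, tail
  for char in text:gmatch(".") do
    local node = { value = char, is_vowel = VOWELS[char] == true, prev = tail }
    if tail then
      tail.next = node
    else
      head = node
    end
    tail = node
  end
  return head
end

-- Toy rule: on reaching "w", go back and change the preceding vowel.
local function round_before_w(head)
  local node = head
  while node do
    if node.value == "w" and node.prev and node.prev.is_vowel then
      node.prev.value = "o"
    end
    node = node.next
  end
  return head
end

-- Convert the list back to text.
local function to_text(head)
  local parts = {}
  local node = head
  while node do
    parts[#parts + 1] = node.value
    node = node.next
  end
  return table.concat(parts)
end

print(to_text(round_before_w(to_nodes("tawa"))))
```

Whether this beats batches of pattern replacements in practice would have to be measured; the sketch only shows that earlier nodes can be revised once later context is known.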
mw.ustring.gsub calls is considerable. It could also reduce memory because fewer intermediate strings would be created. But I'm speculating.
make_tokens in Module:grc-utilities and tr in Module:grc-utilities. The former processes Greek characters into "tokens" (sub-sequences, mainly to handle diphthongs and single vowels correctly), and uses objects to represent the characteristics of the Greek characters, and the latter processes the tokens to create a transliteration. Not super elegant, but my version of the tokenization function was much faster than the previous one, probably because it got rid of most of the calls to mw.ustring functions.
mw.ustring functions were inefficient. Does that include mw.ustring.sub? - Gilgamesh~enwiki (talk) 14:40, 24 November 2019 (UTC)
mw.ustring.sub is noticeably inefficient when there are many calls, for instance when you iterate through strings using for i = 1, mw.ustring.len(str) do local character = mw.ustring.sub(str, i, i) end. In the previous version of the tokenization function, mw.ustring.sub was called up to about three times for every code point in the string. My impression is that that explained most of the inefficiency in the old version of the function, though it's not a great testcase because the old and new versions are so different. The overhead is probably not as noticeable in the function replacement in Module:mh-pronunc though, where it currently has only 2,028 calls, as opposed to 115,872 for mw.ustring.gsub to create the testcases table. (And I guess mw.ustring.gsub probably has greater overhead.) It's not so inefficient that the function should be avoided altogether.
{{mh-ipa-rows}} takes about a twentieth of a second in entries), so don't feel obligated to remodel it for that reason at least. (Not to discourage you from rewriting it if you want to – I do quite a bit of random rewriting of modules for various reasons.) — Eru·tuon 23:08, 24 November 2019 (UTC)
toPhonetic would certainly be called multiple times in an article like that. I'd rather not add that much extra load time there. - Gilgamesh~enwiki (talk) 00:12, 25 November 2019 (UTC)
mw.string, but immensely more bloated code. I get the impression that functions like mw.string.sub are so expensive because the strings are probably encoded in UTF-8, but logic required to seek codepoint indices—or worse, conceivably to convert between UTF-8 and UTF-16 and back—may involve a lot of overhead if called often enough (I'm not sure which, if any of these things, is actually being done). Obviously we're working with a lot of Unicode text and the data needs to be preserved in that format.
parse and passed to the other internal functions) to use only ASCII surrogates and byte-based string functions for the text-crunching, and then convert them to Unicode forms to represent their final forms? Are there also byte-based functions available for regex that are more efficient? - Gilgamesh~enwiki (talk) 00:12, 25 November 2019 (UTC)
mw.ustring.sub can be expensive, right? But most of the time I only need a single Unicode character. What if I...split a string into an array of characters first, and just reference the array's indices? No dynamic linear behavior involved in retrieving an indexed Unicode code point from a byte string. - Gilgamesh~enwiki (talk) 02:00, 25 November 2019 (UTC)
mw.ustring.sub doesn't do any conversion between UTF-8 and UTF-16. That would be madness. I found that the implementation of mw.ustring.sub calls mb_substr in PHP, which calls mbfl_substr, but I didn't figure out what it does to UTF-8.
string library functions (the ones that can be called as methods on strings). They are much more efficient because they call directly into C and don't have to deal with UTF-8 or Unicode categories. But using ASCII replacements for the Unicode characters sounds like a bit of a pain; it could make the intermediate forms a bit harder to understand.
mw.ustring.sub to get multiple characters from the same string. To be super cheap, I would use string.gmatch: function get_character_array(str) local arr, i = {}, 1 for char in string.gmatch(str, "*") do arr[i] = char i = i + 1 end return arr end. — Eru·tuon 05:38, 25 November 2019 (UTC)
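A runnable sketch of the character-array idea. The pattern in the quoted snippet did not survive in full, so the one below is an assumption: a commonly used byte pattern for one UTF-8 code point (a lead byte followed by any continuation bytes). It assumes valid UTF-8 input and, for simplicity, skips NUL bytes.

```lua
-- Assumed pattern: an ASCII byte or a multi-byte lead byte (194-244),
-- followed by zero or more continuation bytes (128-191).
local UTF8_CHAR = "[\1-\127\194-\244][\128-\191]*"

local function get_character_array(str)
  local arr, i = {}, 1
  for char in string.gmatch(str, UTF8_CHAR) do
    arr[i] = char
    i = i + 1
  end
  return arr
end

-- "a", then a two-byte, a three-byte and a four-byte code point,
-- written as decimal byte escapes.
local chars = get_character_array("a\195\177\225\184\183\240\159\153\130")
print(#chars)
```

After the one-time split, chars[n] is a constant-time table lookup, which is the property asked about above.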
mw.string.split(text, ""), called only once before a major mw.string.gsub operation whose substitutor function would have otherwise needed mw.string.sub multiple times per match. I hadn't considered your string.gmatch approach before, but it looks interesting—might there be a way to expand it to work with three- and four-byte UTF-8 code points?
{EeiAV7MQOou, but when writing regex sequences, { would have to become %{, so I could just replace it with a instead. The secondary articulations is where it gets trickier, as the equivalents of are ' _G _w. Since I only use as a final phonetic presentation form, I could conceivably just use j G w, but it's again complicated where the X-SAMPA equivalent of is h\. Lots of these little things call for lots of little simplifications, until you get to the point where the internal string /ɦʲænʲeɦʲelˠlˠæpʲkˠænˠ/ (Āneeļļapkaņ) has a pseudo-X-SAMPA appearance of hjanjehjelGlGapjkGanG, and...I end up kinda not wanting to go that route anymore. Regex and the algorithm can already get complex enough without making the internal IPA so much harder to read. - Gilgamesh~enwiki (talk) 16:27, 25 November 2019 (UTC)
"*" does support three- and four-byte code points. - Gilgamesh~enwiki (talk) 16:36, 25 November 2019 (UTC)
array.push(element). You sure that doesn't hurt array storage efficiency on the JIT site? (Or does Scribunto/Lua not use a JIT anyway?) I'd probably find myself writing it with push's Lua equivalent, table.insert. - Gilgamesh~enwiki (talk) 16:41, 25 November 2019 (UTC)
x? x* x+ etc.), as it would test for the byte rather than the codepoint. But stuff like simple substring replacements and multi-character captures (xyz) could be fine even with UTF-8 code points included. - Gilgamesh~enwiki (talk) 17:02, 25 November 2019 (UTC)
table.insert isn't any more efficient than t. As mentioned in the link, it's actually slower because of the two meanings that table.insert has (table.insert(t, val) vs. table.insert(t, i, val)). Scribunto doesn't use LuaJIT. It would probably improve performance to allocate the entire array at once with { nil, nil, nil, ... }, but that requires knowing the number of code points and having a function that can return that many nils.
string library doesn't work with multi-byte characters; also several of the character classes like %s are Unicode-dependent in the mw.ustring library. I wrote a little about this at WT:LUA § Ustring patterns and created Module:User:Erutuon/patterns, which contains a function that tests whether a pattern will match correctly (according to UTF-8 and Unicode semantics) in the string library functions.
mw.ustring.sub is implemented that way. Certainly indexing UTF-8 by code point is slower than byte indexing, but I imagine with this decoding technique it could be fairly fast. — Eru·tuon
I've given the theoretical Unicode-to-ASCII-pseudo-X-SAMPA cipher more thought, and I believe if I were to use it, it would look something like this:
p | b | t | d | z | k | ɡ | m | n | ŋ | r | l | ĭ | ī | ɣ | ɦ | ɧ | _ | ʲ | ˠ | ʷ | æ | ɛ | e | i | ï | ɑ | ʌ | ɤ | ɯ | ɒ | ɔ | o | u | ◌̯ | ː | ◌͡◌ |
p | b | t | d | d | k | g | m | n | N | r | l | y | Y | H | h | H | _ | j | G | w | a | E | e | i | I | A | V | 7 | M | Q | O | o | u | ^ | : | = |
Because, on second thought, hjanjehjelGlGapjkGanG is rather hard to read, but then, so is /ɦʲænʲeɦʲelˠlˠæpʲkˠænˠ/. These are internal formats, not display formats (even the internal IPA is pseudo-IPA), and at least X-SAMPA is well documented enough for a pseudo-X-SAMPA approach to be viable. I'm still working with code ideas offline. - Gilgamesh~enwiki (talk) 21:23, 26 November 2019 (UTC)
I've tried a variety of coding approaches, and I'm realizing there may be no real substitute for batches of regex. Regexp can be written fairly concisely, and the more bloated code becomes, the harder it is to read. And after multiple attempted rewrites, I've found that I've stopped writing comments to reduce mental gear-shifting. Well-written code doesn't need many comments anyway. I just want to write something that balances readability with efficiency. Fortunately, I've had decent success with the pseudo-X-SAMPA approach in concept, and I can minimize the use of UTF-8 regex functions and rely more on faster functions like string.gsub. (At least I hope it's faster...) - Gilgamesh~enwiki (talk) 08:16, 2 December 2019 (UTC)
string.gsub, string.find, etc. I cringe to think that the engine has to compile a new regex edifice every time the regex code is passed to one of these functions. I hope they are at least being cached between calls, either in an internal hashtable or attached to the internalized pattern strings themselves. - Gilgamesh~enwiki (talk) 02:08, 3 December 2019 (UTC)
string-library pattern-matching functions, except string.find when the plain flag is set, here. — Eru·tuon 04:15, 3 December 2019 (UTC)
I finished writing the new draft and ironing out the bugs, and replaced the non-sandbox version with it. How does the performance compare now with the previous version? - Gilgamesh~enwiki (talk) 21:32, 5 December 2019 (UTC)
c J h H y Y a I @
which do not represent their conventional X-SAMPA counterparts, for the sake of being more regex-pattern-friendly and single-character-friendly. The way I use them, c
is actually , J
is , h
and H
are transitional representations of unsurfaced and surfaced glides, y
is {yi'y} (), Y
is {'yiy} (), a
is ({
isn't as readably regex-friendly), I
is a dotless replace ı with ɪ, invalid IPA characters (ı) that is friendlier to IPA tie bars, and @
is the diacritic . Otherwise (unless I've forgotten any), the symbols are the same as their X-SAMPA counterparts (or _
-notated forms thereof), which are mostly the same as their IPA counterparts when they are plain Latin lowercase letters. The system works well. (Right now, in edit preview, it complains that replace ı with ɪ, invalid IPA characters (ı) is invalid IPA, but the choice is really just to keep the tie bar from hovering so much higher than over other pairs of vowels when is present— vs. replace ı with ɪ, invalid IPA characters (ı). If it proves problematic, it can be reverted to —I just wanted to polish the presentation a bit, which makes a different with certain IPA typefaces like Gentium and certain browsers like Firefox.) - Gilgamesh~enwiki (talk) 01:35, 6 December 2019 (UTC)
string.gsub actually seems faster than trying to do the same thing procedurally, even if you try to do it all with arrays of one-character strings. These calls are actually a lot faster than I gave them credit for—I knew they would be faster than mw.ustring.gsub, but not that they might actually be faster than my attempts to do the same thing procedurally. I suppose it also helps that, this time, I eliminated most throwaway lookup tables, and instead generate them only once and cache them.
thens and nots and not enough curly braces, and arrays starting at 1 instead of 0 is consistently maddening. I miss JavaScript. Would love to write modules in modern JS. - Gilgamesh~enwiki (talk) 05:09, 6 December 2019 (UTC)
I made a small change that could significantly improve performance, at least for some regex replacements, but I don't know how well. The change is:
local function string_gsub2(text, pattern, subst)
	local result = text
	result = string.gsub(result, pattern, subst)
	-- If it didn't change the first time, it won't change the second time.
	if result ~= text then
		result = string.gsub(result, pattern, subst)
	end
	return result
end
Still looking for small ways I can improve efficiency. - Gilgamesh~enwiki (talk) 19:44, 21 January 2020 (UTC)
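A short illustration of why a second pass can be needed at all, and of what the early exit saves. The function body follows the change quoted above; the "aaa" example is invented. Because gsub matches do not overlap, a pattern whose matches share a character can leave work for a second pass.

```lua
local function string_gsub2(text, pattern, subst)
  local result = string.gsub(text, pattern, subst)
  -- If the first pass changed nothing, a second pass cannot either.
  if result ~= text then
    result = string.gsub(result, pattern, subst)
  end
  return result
end

-- One pass only separates the first pair, since the middle "a" is consumed.
local onePass = string.gsub("aaa", "(a)(a)", "%1.%2")

-- The second pass catches the pair that the first pass could not reach.
local twoPasses = string_gsub2("aaa", "(a)(a)", "%1.%2")

-- No match at all: the second gsub call is skipped entirely.
local untouched = string_gsub2("xyz", "(a)(a)", "%1.%2")

print(onePass, twoPasses, untouched)
```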
I wrote a simple new function, toMOD, that I need tested, perhaps with a new column in the table. It converts standard orthographic spelling to the format used by the Marshallese-English Online Dictionary, converting ĻļM̧m̧ŅņN̄n̄O̧o̧ to ḶḷṂṃṆṇÑñỌọ. This has potential applications in Marshallese reference templating, where a word in standard orthographic spelling can be automatically converted to MOD's spelling so that references can link directly to dictionary entry anchors on that site without us needing to directly embed a differently-spelt word in the external link. No such template has been written yet. It may be a good idea for each row of the "term" column and a potential MOD column to share a table cell where the forms have identical spelling. And, in any event, the separate MOD spelling should probably not link to a Wiktionary entry with that spelling, as it is and always was a non-standard alteration to Marshallese orthography which is largely limited to the MOD, Naan and associated media intended for offline distribution to available computers in the Marshall Islands. I imagine that, if the standard orthography were considered friendlier to older Windows and Mac computers and their available font rendering, MOD and Naan would be using the standard orthography out of the box, but for the time being they are what they are. - Gilgamesh~enwiki (talk) 07:44, 10 December 2019 (UTC)
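For testing purposes, the conversion described above could look roughly like the sketch below. It is not the module's toMOD; the name to_mod_sketch and the pair-list layout are assumptions. It relies on the letters being encoded exactly as written in the pairs (a text that uses a different Unicode normalization would need extra entries), and it uses the byte-based gsub, which is safe here because the search strings contain no pattern magic characters.

```lua
-- Standard orthography -> MOD spelling, one pair per letter.
local MOD_REPLACEMENTS = {
  { "Ļ", "Ḷ" }, { "ļ", "ḷ" },
  { "M̧", "Ṃ" }, { "m̧", "ṃ" },
  { "Ņ", "Ṇ" }, { "ņ", "ṇ" },
  { "N̄", "Ñ" }, { "n̄", "ñ" },
  { "O̧", "Ọ" }, { "o̧", "ọ" },
}

-- Hypothetical stand-in for toMOD: applies each replacement in turn.
local function to_mod_sketch(text)
  for _, pair in ipairs(MOD_REPLACEMENTS) do
    text = text:gsub(pair[1], pair[2])
  end
  return text
end

print(to_mod_sketch("M̧ajeļ"))
```

A testcase column could then compare this output against the dictionary's own spelling for a handful of known entries.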
{{mh-ipa-rows}} to a template.
{{mh-head}} for now. At least the MOD spelling is being displayed, though. And I don't think it's the best idea to put the MOD spelling in an alternative forms section, because it may prompt a naive third-party editor to turn the unlinked term into a linked term and create a word entry. My concern is that it may motivate an unnecessary duplication of many entries with the non-standard orthographic variants. It also doesn't help that some sources for the language write Marshallese words without any diacritics, and it seems dan was created from one of these sources as an unknowing duplicate of dān. - Gilgamesh~enwiki (talk) 08:05, 11 December 2019 (UTC)
If I may ask, could you please update the table? I was updating it manually, but then I added so many new entries that I got behind. Most of the new entries are words that start with ri-—demonyms, mainly. - Gilgamesh~enwiki (talk) 05:08, 15 December 2019 (UTC)
Marshallese doesn't have all the complex noun cases of an agglutinative language, but it does have some inflected forms, and {{mh-head}} would seem to be the appropriate place to list these. I have an idea of what I want to accomplish, but it may require some additional Scribunto/Lua API I'm not that familiar with, since I think template-only logic would become unnecessarily bloated. I was wondering if you could help me write such a template and backing script. I need to figure out how vanilla {{head}} creates its inflection list and handles the appropriate automatic categories with language-sensitive sorting keys, and how I can extend or replicate that in a script, with possibilities like default inflected forms, more than one of the same kind of inflected form, etc. I can conceptualize what I want to achieve, but API-wise I'm in over my head. - Gilgamesh~enwiki (talk) 02:14, 24 December 2019 (UTC)
I think I found some resources to start with, chiefly Module:headword. - Gilgamesh~enwiki (talk) 18:02, 24 December 2019 (UTC)
full_headword in Module:headword and if necessary format_categories in Module:utilities to format extra categories that don't begin with the language name. In the Marshallese module there could be a main function that generates the MOD spelling and it can call one of the pos_functions to handle part-of-speech-specific stuff. I'm not sure what is a good module to base the Marshallese one on though. Much of Module:eo-headword is probably understandable because the morphology is simple at least. — Eru·tuon 19:52, 24 December 2019 (UTC)
I think sometimes I forget just how much technical work you do here at Wiktionary, beyond just helping me with a Marshallese module. I created a new category, Category:Marshallese distributive verbs, but {{auto cat}} shows this category is not supported. What would be involved in creating new grammar categories? - Gilgamesh~enwiki (talk) 13:45, 14 January 2020 (UTC)
Some brief background: Marshallese distributive verbs basically modify a noun or verb with the rough inflected meaning of "there are a lot of s." This particular grammatical form is demonstrated extensively in example sentences throughout the Marshallese-English Online Dictionary. - Gilgamesh~enwiki (talk) 13:53, 14 January 2020 (UTC)
@Erutuon Wow, you are a busy bee. I think I have even greater respect for what you do here than I did even just 24 hours ago. As much as I would appreciate your continued feedback in my ongoing endeavors, I can still wait. - Gilgamesh~enwiki (talk) 23:28, 15 January 2020 (UTC)
@Erutuon There's a bug in the module's debug table, most noticeable with words whose Bender spellings start with "yiy" and a vowel. In line with references explaining how Marshallese words can be enunciated phoneme by phoneme, I'm testing an experimental enunciate-mode, where short prosodic breaks are inserted in the middle of consonant clusters. The problem is...the International Phonetic Alphabet specifies these as pipe characters |. I already tried hard-coding {{!}}
in the module output, but it only looks like {{!}}. So now I'm using a normal pipe character, but there's a bug in the way the module's debug table displays it. What's only displaying æ.e.kʷwɤtʲ] should actually be displaying - Gilgamesh~enwiki (talk) 19:03, 16 January 2020 (UTC)
How do I set this up? So things work in {{lb}}
, and so forth. I know similar categories exist for Category:Indian English, Category:New Zealand English, etc. The Ratak Chain and Rālik Chain dialects of Marshallese are mutually intelligible, and differ mainly by some regular variations in pronunciation reflex, and some vocabulary differences. But many of the different forms are often still written differently depending on dialect. For instance, m̧m̧an "good" is the common stem, em̧m̧an is the Rālik reflex, and m̧ōm̧an is the Ratak reflex, but in both dialects the prothetic vowel vanishes if the stem takes a bare vowel prefix: rūm̧m̧an (ri- + m̧m̧an) means "good person." I want to start making articles for the stem forms, and have their dialect reflex entries (by spelling) automatically categorized through {{lb|mh|Ratak}}, {{lb|mh|Ralik}}/{{lb|mh|ālik}}, etc. I should add that I don't know if the dialects themselves have supplemental language codes, the same way Tosk Albanian is "als" (Albanian, South) and Gheg Albanian is "aln" (Albanian, North).
I'm not sure what to name the categories, though—"Rālik Marshallese"? "Rālik dialect Marshallese"? "Rālik Chain Marshallese"? I'm not sure what the most stable nomenclature would be. In the Marshallese-English Online Dictionary, they're also frequently just called "Dial. W" and "Dial E.", since Rālik ("sunset") is the western chain and Ratak ("sunrise") is the eastern chain, but the two dialects' native isogloss line still runs between the two chains themselves.
I should probably additionally add...I'm not 100% sure that I know what I'm doing. It's one thing to know how templating and scripting languages work (which I increasingly know), and another thing entirely to know how existing templates and scripts are set up so I extend them for specific editing needs. - Gilgamesh~enwiki (talk) 01:14, 20 January 2020 (UTC)
{{lb|mh|Ratak}} and {{lb|mh|Ralik}} there, with categories and linked display text if desired. Personally, I like the shorter category name: "Rālik Marshallese". The category page can explain what it means. It looks like there aren't ISO codes for Rālik and Ratak, but if they might be referred to in etymologies (for instance, {{der|en|<code for ralik>|word}}), then they could be given Wiktionary codes in Module:etymology languages/data too. — Eru·tuon 19:34, 20 January 2020 (UTC)
In addition to the previous section I just wrote, I was wondering...do we risk the module timing out if we add additional enunciated columns to it? Seeing that enunciated mode has since been fully deployed to articles wherever a consonant cluster exists in the phonemic form, acting on previously unread documents that Austronesier and I discussed at wikipedia:Talk:Marshallese language—see kajin M̧ajeļ for a good example of how normal phonetic and enunciated IPA can differ. And it's not just the absence of consonant assimilations or epenthetic vowels, but also some different vowel reflexes simply as a consequence of the last vowel before a consonant cluster being the last vowel of its prosodic fragment and the first vowel after a consonant cluster being the first vowel of its prosodic fragment—see eakeak, tuen̄ and utut to see what I mean. (Incidentally, you may be pleased to see that Arņo now shows two different consonants when enunciated.)
As for how the added columns would work, enunciated forms would only differ between dialects if their normal phonetic forms already differ (because of the limits in the differences between dialect reflexes), so I'm thinking something like: phonetic (Rālik), enunciated (Rālik), phonetic (Ratak), enunciated (Ratak), with each dialect's phonetic and enunciated columns merging if they're the same, and all four columns merging if all four forms are the same.
If we'd be taxing our Scribunto/Lua allowances too much for the one table, I could instead set it to show enunciated mode in the sandboxed version as a temporary visual aid during relevant discussions, but still there are now effectively four different phonetic modes to debug. - Gilgamesh~enwiki (talk) 16:11, 20 January 2020 (UTC)
I saw that you've recently edited a bunch of Old English entries to replace /z/ with /s/, leaving the comment that is an allophone of /s/ in Old English. That is arguably true, but I think the removal of /z/ from Old English transcriptions brings up a few more issues that ought to be addressed. First, the reason I say the allophonic status of is "arguable" is because there are in fact some contexts where the use of a voiced vs. a voiceless fricative may not be completely predictable from the phonological context. See "Phonemically Contrastive Fricatives in Old English?", by Donka Minkova, for a description of some of the relevant evidence and references to prior literature that discusses the topic (Minkova does support the interpretation that the voiced and voiceless fricatives were allophones in Old English). The other issue, more important in my opinion, is a matter of consistency: two other voiced fricatives, and , are commonly analyzed as allophones of /f/ and /θ/. So a transcription like "/ˈt͡ʃiyvese/" for ciefese seems fairly problematic: if we decide to use /s/ here, I think it would be better to also use /f/, giving /ˈt͡ʃiyfese/. And in fact, considering that the allophonic realization of voiceless fricative phonemes as voiced fricatives doesn't come naturally to modern English speakers, and that (as mentioned above), the distribution of the voiced and voiceless allophones in Old English is somewhat complicated, I think it would be worthwhile to include a phonetic transcription using and in addition to a phonemic transcription with /f/ and /s/ for words like this.--Urszag (talk) 21:36, 31 October 2019 (UTC)
WDYT about the result? Should I move the function processor() and function setup_click_keyup() out of the setup_infl()?--So9q (talk) 19:17, 4 November 2019 (UTC)
nec- prefix to the NEC parameters in the URL, to avoid collisions, and it's traditional to use hyphens in class names rather than underscores. I've made the script use mw.util.getParamValue instead of a custom function.
You've done a lot of work on this. Now that we have aliases for etymology languages, I'd like to display them, either in the family tree or in an info box, similar to what we have with {{langcatboiler}}. Maybe we should have {{etym lang cat}} for etymology language categories; currently these categories, when they exist, aren't standardized in name or contents. Benwing2 (talk) 05:40, 15 November 2019 (UTC)
ksh). Entries are added to the categories using {{lb}} and {{tlb}}. Ideally lemmas and non-lemma forms would be in different categories, but I didn't know how to do that. It would be weird to have to specify lemmas or non-lemma forms in {{tlb}}, like having {{tlb|grc|Epic Greek lemmas}} or {{tlb|grc|Epic Greek non-lemma forms}} display as "(Epic)" but add different categories, and I didn't know how to accommodate that in Module:labels and couldn't think of another good way to add the categories. So I never came up with any kind of action plan. Maybe this issue doesn't have to be solved right away though. — Eru·tuon 19:52, 15 November 2019 (UTC)
{{head}}, which knows about the POS and hence whether it's a lemma or not. The only other way I can think of without having the POS or lemma status marked explicitly in {{tlb}} is for {{tlb}} to look through the page text, which is expensive and likely error-prone. Benwing2 (talk) 18:11, 16 November 2019 (UTC)
Herbert Weir Smyth, A Greek Grammar for Colleges http://www.perseus.tufts.edu/hopper/text?doc=Perseus%3Atext%3A1999.04.0007%3Apart%3D2%3Achapter%3D13%3Asection%3D13 Smyth grammar 2.13.13 FIRST DECLENSION (STEMS IN α_)
214. The dialects show various forms.
214 D. 1. For η, Doric and Aeolic have original α_; thus, νί_κα_, ϝί_κα_ς, ϝί_κᾳ, νί_κα_ν; πολί_τα_ς, κριτά_ς, Ἀτρείδα_ς.
2. Ionic has η for the α_ of Attic even after ε, ι, and ρ; thus, γενεή, οἰκίη, ἀγορή, μοίρης, μοίρῃ (nom. μοῖρα^), νεηνίης. Thus, ἀγορή, -ῆς, -ῇ, -ήν; νεηνίης, -ου, -ῃ, -ην. But Hom. has θεά_ goddess, Ἑρμεία_ς Hermes.
3. The dialects admit -α^ in the nom. sing. less often than does Attic. Thus, Ionic πρύμνη stern, κνί_ση savour (Att. πρύμνα, κνῖσα), Dor. τόλμα_ daring. Ionic has η for α^ in the abstracts in -είη, -οίη (ἀληθείη truth, εὐνοίη good-will). Hom. has νύμφα^ oh maiden from νύμφη.
8. Gen. plur.—(a) -ά_ων, the original form, occurs in Hom. (μουσά_ων, ἀγορά_ων). In Aeolic and Doric -ά_ων contracts to (b) -ᾶν (ἀγορᾶν). The Doric -ᾶν is found also in the choral songs of the drama (πετρᾶν rocks). (c) -έων, the Ionic form, appears in Homer, who usually makes it a single syllable by synizesis (60) as in βουλέων, from βουλή plan. -έων is from -ήων, Ionic for -ά_ων. (d) -ῶν in Hom. generally after vowels (κλισιῶν, from κλισίη hut).
Perseus Greek Word Study Tool:
http://www.perseus.tufts.edu/hopper/morph?l=arpa&la=greek#lexicon ἅρπα noun sg fem nom doric aeolic ἅρπα noun sg fem nom doric aeolic
http://www.perseus.tufts.edu/hopper/morph?l=arpas&la=greek#lexicon ἅρπας noun sg fem gen doric aeolic
Greek morphological index (Ελληνική μορφολογικούς δείκτες):
Nominative: https://morphological_el.academic.ru/687234/%E1%BC%85%CF%81%CF%80%CE%B1%CF%82#sel=10:3,10:3 ἅρπας
ἅρπᾱς , ἅρπη bird of prey fem acc pl ἅρπᾱς , ἅρπη bird of prey fem gen sg (doric aeolic)
Accusative: https://morphological_el.enacademic.com/687226/%E1%BC%85%CF%81%CF%80%CE%B1%CE%BD ἅρπαν
ἅρπᾱν , ἅρπη bird of prey fem acc sg (doric aeolic)
Inqvisitor (talk) 08:24, 16 November 2019 (UTC)
You reverted my edit on the page ışık. Why is that? The declension adds nothing to the article (the nominative declension is the word itself and the accusative declension is already given in the {{tr-noun}} template: "ışık (definite accusative ışığı, plural ışıklar)"). In my opinion, the templates {{tr-infl-noun-c}} and {{tr-infl-noun-v}} shouldn't be used anywhere on Wiktionary, as they provide no information that {{tr-noun}} doesn't already provide and only bloat the site. --Fytcha (talk) 18:16, 6 December 2019 (UTC)
I noticed that you are working in Rust. It has become my favourite language recently, although for Wiktionary bot work I still use Python. —Rua (mew) 11:01, 9 December 2019 (UTC)
I hate to bother you all the time. If you ever have time, could you check el:Module:sarritest? The only person in el.wikt who knew Lua is now a 'vanished' user. sarri.greek (talk) 00:00, 11 December 2019 (UTC)
Thank you so much! sarri.greek (talk) 18:48, 11 December 2019 (UTC)
args. That's not uncommon with me.
Hi Erutuon. Can you run a bot to do this:
also this:
also we shouldn't allow ppl to add translations with ku code; they should use Kurdish dialects codes (kmr, ckb, ...) instead of using ku code directly. Thanks.--Calak (talk) 16:50, 13 December 2019 (UTC)
ckb, kmr, or sdh instead of ku? I might be able to figure out how to do that but I've mostly stayed away from that gadget because its code confuses me. — Eru·tuon 09:14, 17 December 2019 (UTC)
It is OK Erutuon. I will be thankful if you can apply any one of them.--Calak (talk) 07:12, 21 December 2019 (UTC)
Hello, it is not an "odd alternative pronunciation". Several million people pronounce it that way, whereas the mispronunciation of "decade" has about five variants on the site for about 10 speakers. ABAlphaBeta (talk) 08:39, 17 December 2019 (UTC)
{{fr-IPA}}: {{fr-IPA|écuidistant|équidistant}}. I know very little about the fine details of French pronunciation and you may be right. Words with équi- (or ultimately derived from aequus) are transcribed with either /e.kɥi/ or /e.ki/ on Wiktionary, and while the soundfiles of équidistant on the French Wiktionary and on Forvo have /e.kɥi/, perhaps some people pronounce it with /e.ki/ like équilibre and other words because it may be as confusing for French speakers as it is for foreigners like me. — Eru·tuon 09:10, 17 December 2019 (UTC)
Hi. In October, you added "Incorrect title: a mixture of Latin- and Cyrillic-script characters". Do you think this could be merged into the existing "Bad entry title"? How do they differ? Equinox ◑ 08:05, 20 December 2019 (UTC)
Hi Erutuon, can you help me with the Lua Module:number list on simple.wikt? Minorax (talk) 05:10, 29 December 2019 (UTC)