If you want to add multiple suggestions at once, please make them separate sections, it is easier to discuss.
Latest comment: 15 years ago2 comments2 people in discussion
Hi there! I would like to ask (maybe again) if your bot could add the lang= parameter to all templates under the Pronunciation header, including {{SAMPA}}, {{X-SAMPA}}, {{rhymes}}, {{homophones}} and {{hyphenation}}, like it does for {{IPA}}. Normally we add this parameter while editing, but I lose a lot of time filling in all the templates previously written without lang=. Do you think that would be possible? Many thanks for now, have a nice day. Pharamp 20:07, 29 December 2009 (UTC)
Per WT:TODO, ideally {{etyl}} and {{context}} labels should have a lang specified as well. Not for things like {{by extension}}, but if someone then adds another context like {{fishing}} it does need a lang parameter, so if it is already there it doesn't do any harm. I already add lang=fr (in my case) to transitive and intransitive, as they may become categorizing in the future. Mglovesfun (talk) 20:40, 18 January 2010 (UTC)
removing empty checktrans tables
Hi, RU, any chance AF could remove:
<!--Remove this section once all of the translations below have been moved into the tables above.-->
{{checktrans-top}}
{{trans-mid}}
{{trans-bottom}}
yes, I have to think about the combinations. The comment is irritating ... AF does have a specific code path at that point, so I don't have to use fancy regex. Although the regex would not be hard. Robert Ullmann 11:25, 19 April 2010 (UTC)
hmm
import re
# (pattern, replacement) rule: drop an empty checktrans table and its comment
Regex = (re.compile(r'(<!-- ?Remove this.*-->\n|)'
    r'\{\{checktrans-top}}\n*\{\{(check|)trans-mid}}\n*\{\{(check|)trans-bottom}}\n',
    re.M), r'') # must not have re.S!
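A minimal sketch of how a (pattern, replacement) pair like the one above might be applied. The helper name `strip_empty_checktrans` and the sample wikitext are assumptions made for illustration; this is not AF's actual code path.

```python
import re

# Same pattern as quoted in the thread. Without re.S, ".*" cannot run past
# the end of the comment line, so only the comment itself is swallowed.
CHECKTRANS_RULE = (
    re.compile(
        r'(<!-- ?Remove this.*-->\n|)'
        r'\{\{checktrans-top}}\n*\{\{(check|)trans-mid}}\n*\{\{(check|)trans-bottom}}\n',
        re.M,
    ),
    r'',
)


def strip_empty_checktrans(text):
    """Remove an empty checktrans table (hypothetical helper name)."""
    pattern, replacement = CHECKTRANS_RULE
    return pattern.sub(replacement, text)


# Assumed sample: the tail of a translations section followed by an empty
# checktrans table and the next header.
sample = (
    "{{trans-bottom}}\n"
    "<!--Remove this section once all of the translations below have been moved into the tables above.-->\n"
    "{{checktrans-top}}\n"
    "{{trans-mid}}\n"
    "{{trans-bottom}}\n"
    "\n===See also===\n"
)
cleaned = strip_empty_checktrans(sample)
```

A table that still contains translation lines between the templates does not match the pattern and is left untouched.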
Latest comment: 15 years ago2 comments2 people in discussion
I'm curious as to why the bot is reverting these recent changes I made. {{ja-readings}} already adds the * to each line automatically, and the talk page description doesn't require * before the template in entries. Putting * in the actual entry is redundant code, and removing it doesn't change the way the information is rendered. By that same note, keeping it doesn't do anything either, but it seems like it is unnecessary. I'm not planning any mass changes, just those that I've stumbled on where I thought the * was misplaced. If AutoFormat could be changed so it doesn't revert these kinds of minor corrections in the future, that would be great.
I've reviewed the similar discussion with respect to other templates needing leading bullets, but I think this template is different. {{ja-readings}} generates a list of at least two to six lines, so removing the bullets from the template is not an option. Otherwise, the leading bullet in the entry would only bullet the first reading in the list, leaving the others unbulleted. I appreciate your taking the time to consider this request. Thanks. Dcmacnut 17:10, 4 March 2010 (UTC)
There are a couple of good reasons to leave it (the *) in the wikitext as well. One is to make it easier on both programs reading the wikitext and on users to see that it is part of a list. The ability to do this redundantly is intentional in wikitext, not accidental. Also consider this:
* some reading
* {{ja-readings}}
* some other reading
renders correctly:
some reading
some other reading
But:
* some reading
{{ja-readings}}
* some other reading
does not:
some reading
some other reading
The difference is only just visible, but affects the HTML:
<pre>
* some reading
* {{ja-readings}}
* some other reading
</pre>
<p>renders correctly:</p>
<ul>
<li>some reading</li>
<li>some other reading</li>
</ul>
<p>But:</p>
<pre>
* some reading
{{ja-readings}}
* some other reading
</pre>
<p>does not:</p>
<ul>
<li>some reading</li>
</ul>
<ul>
<li>some other reading</li>
</ul>
in which the latter is two separate lists. This example is somewhat contrived in this case, as {ja-readings} with no parameters is unusual, but does work correctly. (And listing other readings first is a bit odd, but it might be the most important reading, and not a usual one.) However, it does illustrate the principle. It is easier to have the users (bots etc.) always use * on list items, and not worry about how multiple bullet lines might be generated.
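The "always use * on list items" convention described above could be enforced with a rule along these lines. This is only a sketch: the function name and the tuple of template names are assumptions, and AF's real rule may look quite different.

```python
import re

# Template names assumed (for illustration) to need a leading bullet.
BULLETED_TEMPLATES = ("ja-readings",)


def add_leading_bullet(text, templates=BULLETED_TEMPLATES):
    """Prefix '* ' to lines that start directly with a listed template."""
    names = "|".join(re.escape(name) for name in templates)
    # "^" with re.M anchors at each line start; "[|}]" makes sure the
    # template name is complete (followed by a parameter or the closing braces).
    pattern = re.compile(r'^(\{\{(?:' + names + r')[|}])', re.M)
    return pattern.sub(r'* \1', text)


source = "* some reading\n{{ja-readings}}\n* some other reading\n"
fixed = add_leading_bullet(source)
```

Lines that already start with `*` are not matched, so running the rule twice changes nothing further.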
It will, sooner or later. As in this case. It looks at entries after each (non-bot) edit, so it picks most of them up quickly. Bots adding a section add the {rfc-auto} tag. Robert Ullmann 11:15, 19 April 2010 (UTC)
Latest comment: 15 years ago5 comments2 people in discussion
Malabarista has been listed in the request for AutoFormat category for a week. I can only assume it's using a deprecated template but AutoFormat doesn't know what to replace with what. Nor do I; can we track this down? Mglovesfun (talk) 12:39, 25 March 2010 (UTC)
The template {mf} and a few others can get pages stuck there when AF doesn't have a rule to eliminate them, typically when the call is indirect. I clean these up manually once in a while; meanwhile they cause no harm. Robert Ullmann 11:15, 19 April 2010 (UTC)
Latest comment: 15 years ago2 comments2 people in discussion
I was wondering what this bot does if it finds multiple language sections for the same name. For example, two ==Dutch== sections. Does it append the second to the first, or just give up and put a maintenance template on it? I would like my bot to be able to append to an existing Dutch section, but it's harder for me to write all the section splitting code than it is to just put the additional contents at the end and let AutoFormat sort it out. Is this a viable option? —CodeCat 09:04, 27 March 2010 (UTC)
Latest comment: 15 years ago4 comments2 people in discussion
Hi Robert,
AutoFormat has recently edited some entries by converting ===Prepositional phrase=== to ===Preposition phrase=== and adding {{rfc-xphrase}}. This doesn't seem to be specified by User:AutoFormat/Headers; and as of this vote, it's no longer correct behavior. Can it be changed?
Latest comment: 15 years ago3 comments3 people in discussion
Does the bot check those in this category to see if the issues have been resolved? I fixed one header in the category but didn't remove the template; there could be other problems. Should they be removed, or will the bot do it? -- 124.171.169.189 06:47, 17 April 2010 (UTC)
AF will remove the tag and re-check if it looks at the page; normally when you edit a page it will pick it up in RC and do that. If you'd like to force it, change -level to -auto. It doesn't scan the category, but it does pick up things from the XML after a while. Robert Ullmann 11:15, 19 April 2010 (UTC)
In all cases, only when they appear at the start of a line in the synonyms or antonyms sections and are followed by one or more wikilinked words. It might be possible/desirable in other sections as well, but I'm not certain.
It's not something I think should be gone through specifically, but done as and when it encounters them in its routine checking. Thryduulf 16:32, 20 April 2010 (UTC)
Certainly Hypernyms sections should be included, probably Hyponyms and other *nyms too; WT:ELE suggests Coordinate terms should be part of this set. I can see the benefit to them in usage notes, alternative spellings and see also sections as well but I don't know whether this method of marking which sense the links apply to is widely used enough to make it worth coding.
We generally don't want it in pronunciation sections, as they use a different structure. See the new section below for my incomplete thoughts regarding similar patterns in the translations section. Thryduulf 15:41, 22 April 2010 (UTC)
Err, that doesn't make sense? This isn't about changing the order of the entries, just replacing one type of markup - with another. Thryduulf (talk) 17:13, 8 May 2010 (UTC)
But you've suggested inserting inappropriate markup where it can't be used. If Related terms and Derived terms are arranged alphabetically (as we normally do), then a {{sense}} tag is illogical, because it is meant to group items by sense. Inserting that tag into the Derived terms section (as you have done) is not a step forward, but sideways; it replaces one error with another. --EncycloPetey 19:18, 8 May 2010 (UTC)
I still don't understand. Where a term has multiple senses, and one or more terms derived from a specific sense, then the derived terms are grouped by sense and listed alphabetically with that sense - nothing will change except the markup used to note the sense. Where the terms are not grouped by sense, then it is a straight alphabetical list and I'm not proposing to change that - indeed in this situation, nothing will change. Where not all terms in a list are marked as derived from a specific sense, then nothing will change except the markup used to note the sense. I note that you've still not fixed the entry you reverted my edits to - either to remove the error I pointed out or to fix what you say is an error in the structure. Thryduulf (talk) 20:57, 8 May 2010 (UTC)
EP: he's just talking about the markup syntax, changing the syntax at the top of this section to {sense}. It is NOT about re-arranging things or changing any semantics. The question is: are all uses of (e.g.) {{italbrac}}: at the start of a line in these sections reasonably convertible to {sense}? In whichever cases it (the contents of italbrac) is not a "sense", if any, is it harmful (given that the syntax was already "wrong")?
Thank you. Now if we could fix mis-uses of "it's". ("it's own colon"? Indeed. ;-) Rule: read "it's" as "it is", if that sounds wrong, it's. (<smirk>) Yes, "it is" has a verb, in "it's", the "'s" doesn't count as the required verb ... English is crazy. But so is everything else. Robert Ullmann 16:14, 16 July 2010 (UTC)
It seems I'm finding more stuff with this pronunciation cleanup, and {{shavian}} now exists too - see WT:GP#Template:shavian. If you want to include it in your sorting algorithm, it should go at the end of the line (after SAMPA) as that is its alphabetical position (it's also probably the least useful!). AIUI from the WP article, it should only appear on English entries (if it's ever used elsewhere than quadrillion that is!). Thryduulf 15:17, 28 April 2010 (UTC)
My "sorting algorithm" is 3 regex rules, that can be fired repeatedly on a line. The number of rules required is n(n-1)/2; for 3 things (X- and SAMPA as one pattern) that is 3. For 4 things it is 6, for 5 it is 10 ... at which point I have to write some code to take the line apart and sort it ;-) I'll keep them in mind. Robert Ullmann 14:21, 29 April 2010 (UTC)
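The pairwise-rule idea can be sketched as follows: one swap rule per out-of-order pair, which gives n(n-1)/2 rules, fired repeatedly until the line stops changing. The `ORDER` list, the `, ` separator, and the assumption that template arguments contain no nested braces are all illustrative choices here, not AF's actual rules.

```python
import re
from itertools import combinations

# Desired left-to-right order; X-SAMPA and SAMPA are treated as one pattern.
ORDER = [r"enPR", r"IPA", r"(?:X-)?SAMPA"]
# Assumes template arguments contain no nested braces.
TEMPLATE = r"\{\{%s\|[^{}]*\}\}"


def build_rules(order):
    """One swap rule per unordered pair: n(n-1)/2 rules in total."""
    rules = []
    for first, second in combinations(order, 2):
        # If `second` sits immediately before `first`, swap the two.
        pattern = re.compile("(%s)(, )(%s)" % (TEMPLATE % second, TEMPLATE % first))
        rules.append((pattern, r"\3\2\1"))
    return rules


def sort_line(line, rules):
    """Fire the rules repeatedly until none of them changes the line."""
    changed = True
    while changed:
        changed = False
        for pattern, replacement in rules:
            line, count = pattern.subn(replacement, line)
            if count:
                changed = True
    return line


rules = build_rules(ORDER)
sorted_line = sort_line("* {{IPA|/dɒɡ/}}, {{SAMPA|/dQg/}}, {{enPR|dŏg}}", rules)
```

Each firing swaps one adjacent out-of-order pair, so the loop behaves like a bubble sort and terminates. With four or five kinds of template the rule count grows to 6 and 10, which is the point at which splitting the line and sorting it in code becomes the simpler design.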
Latest comment: 15 years ago3 comments2 people in discussion
Following on from the section above re template:sense I've been wondering about where similar markup appears in translations sections.
Here the template should be {{qualifier}} not {{sense}}, as the separate trans-tables should already separate the different senses of the English word; the qualifiers are used to distinguish different aspects of the foreign word. An example is cousin#Translations, where different languages have a greater number of terms (e.g. Arabic has 8 terms to English's one).
In the following cases it should be fairly simple to determine what text actually is the qualifier. In all cases where it is just a single letter (e.g. m), it is most likely the grammatical gender and not a qualifier.
Where the existing entry uses a template like {{italbrac}}, e.g. the Arabic translations at cousin
Where it is the only part of a line not in a {{t}} template, e.g. the Persian translations at cousin
More complex possibly are these:
Where it is the only part of a line not an internal link, e.g. the Latin translations at cousin
Where it is the only part of a line not an internal link or a gender template, e.g. the Icelandic translations at cousin
Where it is the only part of a line italicised and/or parenthesised, e.g. the Hebrew, Ewe and Russian translations at cousin (for Hebrew here the hyphens should also be removed). Note though that the parenthesised part of the Hebrew translation is not a qualifier but a transliteration.
There appears to be no standard whereby the qualifier comes before or after the translation (although the latter is more common), nor whether one or multiple lines are used. Qualifiers can therefore come before, between and/or after the translations. Some translations are explicitly qualified while others on the line are only implicitly qualified (e.g. the Basque translations at cousin).
A clue is possibly that the qualifier should be (almost?) entirely written in Latin script, so if it isn't then it's not autoformattable as a qualifier. This might be an easyish check? It wouldn't help with other languages that use the Latin script though.
Another is that all/the majority of the words should be in English - I'm not a programmer, but I guess checking to see whether the words have English definitions here would be very processor/server intensive? Would there be an easier way? It also isn't guaranteed to work as there might be non-English words with the same spelling, or we might not have an entry yet.
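The Latin-script check suggested above is cheap to do locally. A rough sketch, assuming Python's `unicodedata` character names are a good enough proxy for script; the function name and the 0.9 threshold are hypothetical and only stand in for the "(almost?) entirely" wording.

```python
import unicodedata


def mostly_latin(text, threshold=0.9):
    """Return True if at least `threshold` of the letters are Latin-script."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    latin = sum(
        1 for ch in letters
        if unicodedata.name(ch, "").startswith("LATIN")
    )
    return latin / len(letters) >= threshold


candidate_ok = mostly_latin("paternal uncle's son")
candidate_arabic = mostly_latin("\u0627\u0628\u0646 \u0639\u0645")
```

As the comment itself notes, this cannot tell an English qualifier apart from a word in another Latin-script language, so it could only ever rule candidates out.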
Don't panic! I'm not asking you to add this yet, this is little more than a brain dump of my current vague thinking on the subject in the hope that additional brains will help! Do feel free to move this whole section to somewhere else if you don't want it cluttering the talk page. Thryduulf 16:17, 22 April 2010 (UTC)
The first trans-table at cousin is a good example of various different formats. I'll go through them manually tomorrow if I get time if that would help. This edit at business is an example of a change. Thryduulf 22:42, 22 April 2010 (UTC)
Make minor copyediting edits
Hi there Ullmann. I was wondering if it would be possible for your bot AutoFormat to automatically do the following when looking at entries:
Just English language entries:
Capitalize the first letter of the definition in the definition line and add a period to the end of the definition line if one is missing.
All entries:
Put a line between headers and line breaks.
Put a line between inflection line and definition lines.
Move any {{wikipedia}} templates anywhere else on the page to right under the appropriate headers for each language.
Put spaces in between the asterisks and any words in any Alternative spellings/forms, Derived, Related terms, and See also.
Languages other than English entries:
Convert any links to other entries under Related Terms, Derived terms, and See also sections from plain wikilinks to use the correct {{l}} template.
capitalizing: as mentioned, it is being debated. Yet again. Once again we will have to point out that sentences are only appropriate some of the time; forcing phrases and words to be "sentences" is bad construction, and loses important semantic information. Consider the difference between:
the all entries things: AF does do 1,2,4 as noted. Moving {wikipedia} is troublesome; if there are multiple languages it must stay in language section, if in "prolog" (section 0) it can only be moved inside a language section if there is only one.
converting links: maybe ... requires some work to see if there are or are not links that are other things (comments, linked qualifiers or glosses).
Ah, I looked and looked and still didn't see that there were three braces there. Sorry to have bothered you. —msh210℠ 16:22, 29 April 2010 (UTC)
They can be difficult to spot, which is why there are so many problems to clean-up. I suspect it's because I've been doing a lot of that that I was able to spot it on this occasion, having got my eye in so to speak. Thryduulf 22:18, 29 April 2010 (UTC)
I saw one of those too, and couldn't figure it out. I had introduced a bug fixing the Prep. phrase case above. Had a data structure with ill-defined semantics, now fixed. Robert Ullmann 23:44, 2 May 2010 (UTC)
AF not doing everything it says
With this edit AF said it had done a long list of things, but only actually appears to have done one of them. I haven't looked at the entry in detail, but at first glance it seems the other things didn't actually need to be done? Thryduulf 18:48, 3 May 2010 (UTC)
Yes, transient bug while I was making an improvement. I had had code to change Pronunciation from level 4 to 3 if it was the first header in a language section; I extended this to any L3 header. It is fairly common for people to use L4 headers in error just after the language header, as it looks okay in the TOC, and seems to render fine unless you notice the small difference in font size.
Anyway, in that case it had a bug, promoted two headers to L3, then pushed them back to where they belonged (;-). So no net effect. Fixed now, is working properly. (Good to see someone noticing! Thanks!) Robert Ullmann 04:23, 4 May 2010 (UTC)
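The promotion rule described in this reply (an L4 header that is the first header after the language header becomes L3) might look roughly like this. The `L3_HEADERS` set is an assumed, incomplete stand-in for whatever list AF actually consults, and the function name is hypothetical.

```python
import re

# Matches "==Title==" through "======Title======" with balanced markers.
HEADER = re.compile(r'^(={2,6})\s*(.*?)\s*\1\s*$')
# Assumed, incomplete set of headers that normally sit at level 3.
L3_HEADERS = {"Pronunciation", "Etymology", "Noun", "Verb", "Adjective"}


def promote_first_header(text):
    """Promote an L4 header to L3 when it directly follows the language header."""
    out = []
    after_language = False  # True until the first header after an L2 header
    for line in text.split("\n"):
        match = HEADER.match(line)
        if match:
            level, title = len(match.group(1)), match.group(2)
            if level == 2:
                after_language = True
            else:
                if after_language and level == 4 and title in L3_HEADERS:
                    line = "===%s===" % title
                after_language = False
        out.append(line)
    return "\n".join(out)


entry = "==English==\n====Pronunciation====\n* {{IPA|/x/}}\n===Noun===\n====Synonyms===="
promoted = promote_first_header(entry)
```

Only the first header after each language header is considered, so a legitimate L4 header further down (such as Synonyms under Noun) is never touched.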
Partially linked language names
Latest comment: 15 years ago6 comments3 people in discussion
If there is one line in a pronunciation section with one or more IPA, SAMPA, etc templates and one audio template, could AF sort the lines so they occur:
enPR / IPA / SAMPA / X-SAMPA
audio
Things get more complicated when there are multiple pronunciations involved, and writing a specification to cover every possibility would be more hassle than it is worth, so with one exception (below) I think AF should just ignore these.
The one exception is that where an audio template appears on the line above a line that starts with {{a}}, and the audio description matches a parameter of the {{a}} template (e.g. {{a|UK}} and {{audio|example.ogg|UK}} match, as do {{a|Ca}} and {{audio|example.ogg|Canada}}), then this should be flagged for human attention as there could be multiple ways to sort it out.
That's going to be very tricky to code for, especially since there are {{a}} parameters besides the usual UK and US. How many entries will it help and could you give an example? I'm not sure I know what the problem is that you're hoping to clean up. --EncycloPetey 16:24, 8 May 2010 (UTC)
For an example, see . If the {{a}} parameters do not match then AF will just not do anything and the finding and sorting will have to be done manually (we're no worse off). The problem is that although transcription before audio is significantly more common and is also what WT:ELE#Pronunciation specifies, the opposite way round still occurs frequently. Fixing them where there is exactly one of each shouldn't be difficult to code, and for the latter it should just be a case of comparing strings and doing one thing if they match and nothing if they don't. Thryduulf (talk) 16:32, 8 May 2010 (UTC)
OK, that makes sense now. Your original description focused so much on the exceptions, I couldn't see what it was trying to fix. --EncycloPetey 16:44, 8 May 2010 (UTC)
Probably could be: if there are exactly two lines, and first is "audio" and second is transcription, swap them. In most cases it will be the right thing to do, in the exception cases it isn't going to make anything worse. However, this doesn't fit the model of any of the other transforms that AF does, so it isn't just adding a rule. (There is one simple way to add a rule that might do, since we don't see sections with multiple lists except when they have subheads, and that would work. confused yet? ;-) Robert Ullmann 18:51, 8 May 2010 (UTC)
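The "exactly two lines, audio first" rule proposed here is small enough to sketch directly. The two line patterns and the function name are assumptions; in particular, a transcription line that starts with {{a}} is deliberately not matched, so those cases are left alone.

```python
import re

# Assumed line shapes: a bullet followed directly by the template.
AUDIO = re.compile(r'^\*\s*\{\{audio\|')
TRANSCRIPTION = re.compile(r'^\*\s*\{\{(?:enPR|IPA|SAMPA|X-SAMPA)\|')


def swap_audio_first(section_lines):
    """Swap the lines only if there are exactly two: audio, then transcription."""
    if (
        len(section_lines) == 2
        and AUDIO.match(section_lines[0])
        and TRANSCRIPTION.match(section_lines[1])
    ):
        return [section_lines[1], section_lines[0]]
    return list(section_lines)


swapped = swap_audio_first(["* {{audio|en-us-cat.ogg|Audio (US)}}", "* {{IPA|/kæt/}}"])
```

Anything that does not fit the exact two-line shape is returned unchanged, which matches the "no worse off" reasoning in the thread.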
Being the complicator I am, I've just thought that we could include {{rhymes}} lines in this. The order should be:
transcription
audio
rhymes
When there is exactly one each of two or more of these, AF should sort them into this order. If they occur in combination with exactly one {{homophones}} line, then AF should sort into this order, but keep rhymes and homophones in the same relative order as they were (whatever that is) - if there is a homophones line but no rhymes line, then homophones should come last. Thryduulf (talk) 16:33, 9 May 2010 (UTC)
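One way to read the ordering proposed above is as a stable sort in which rhymes and homophones share a rank, so their relative order is preserved. The sketch below assumes each line is a bullet followed directly by one template; the names `KINDS`, `RANK` and `sort_pronunciation` are illustrative only.

```python
import re

KINDS = [
    ("transcription", re.compile(r'^\*\s*\{\{(?:enPR|IPA|SAMPA|X-SAMPA)\|')),
    ("audio", re.compile(r'^\*\s*\{\{audio\|')),
    ("rhymes", re.compile(r'^\*\s*\{\{rhymes\|')),
    ("homophones", re.compile(r'^\*\s*\{\{homophones\|')),
]
# Rhymes and homophones share a rank, so a stable sort keeps their order.
RANK = {"transcription": 0, "audio": 1, "rhymes": 2, "homophones": 2}


def classify(line):
    for kind, pattern in KINDS:
        if pattern.match(line):
            return kind
    return None


def sort_pronunciation(lines):
    """Sort only when every line is recognised and each kind occurs once."""
    kinds = [classify(line) for line in lines]
    if None in kinds or len(set(kinds)) != len(kinds):
        return list(lines)
    paired = sorted(zip(kinds, lines), key=lambda pair: RANK[pair[0]])
    return [line for _, line in paired]


result = sort_pronunciation([
    "* {{homophones|x}}",
    "* {{audio|a.ogg|Audio}}",
    "* {{rhymes|æt}}",
    "* {{IPA|/kæt/}}",
])
```

Python's `sorted` is stable, which is what keeps homophones ahead of rhymes in this example; sections with repeated or unrecognised lines are returned untouched.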
* list
Hello. Can AF please automatically add an asterisk before {{list}} when necessary? For instance, please compare these two entries: and . --Daniel. 08:13, 12 May 2010 (UTC)
Latest comment: 15 years ago4 comments2 people in discussion
Quite a lot of the recent bad language translation table problems seem to be where there is an entry called: Chinese traditional/simplified. The line format is always exactly the same, so I was wondering if it could be bot-fixed. One example of what needs to be done is .
If the romanization is Mandarin (essentially always), then it should be fixed to * Mandarin: (etc.). Really needs to be fixed by a native speaker, or by someone (like me) who can get it right by cross-checking the entries (with or w/o an external reference ;-). Look at the first two trans tables at budget where someone has done it correctly.
There are apparently about 120 of these. Might be best to do an ordinary bot op; take out the tra/sim part and tag the end of the line with {attention|zh}. Will think about it. (Time now is 20:20, time for cricket! Yay!) Robert Ullmann 17:23, 20 May 2010 (UTC)
In that case it might be an idea to mark {attention|zh} on any of the ones I've fixed recently (this week) (look for edit summaries that mention Chinese). I don't have time atm to find them quickly myself (and any automation that does better than browse will be quicker anyway!) Thryduulf (talk) 18:05, 20 May 2010 (UTC)
Basa language
In this edit AF is objecting to the language name "Basa", however from what I can see this is the correct name for the language with ISO-639 code bzw, and is what {{bzw}} outputs. Thryduulf (talk) 10:12, 20 May 2010 (UTC)Reply
As noted below, it does that, but not for language/L2 headers. It doesn't try to correct near-matches on languages and so on as that runs a considerable risk of changing a valid but unknown language into a known but incorrect language.
It does sentence-case headers at L3+ unless they are completely unknown, in which case they are tagged. And it picks them up in XML screen so anything missed in RC (which sometimes happens) is fixed later. Robert Ullmann 09:11, 24 May 2010 (UTC)
Latest comment: 14 years ago10 comments3 people in discussion
I've just spent the past half-hour or so correcting misuses of {{q}}. They were being used as a substitute for '''{{subst:PAGENAME}}''' to create inflexion lines for initialisms, acronyms, and symbols. When {{q}} occurs at the beginning of a line, could you convert it to '''{{subst:PAGENAME}}''', please? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 01:01, 12 June 2010 (UTC)Reply
Would you give me a specific example or two? (Not that critical in this case, but always useful.) I'll look at it a bit later today. Robert Ullmann05:48, 12 June 2010 (UTC)Reply
I'll give you thirty-three: […]. Also note […] and […], which are misuses that would not be picked up by AF. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 10:13, 12 June 2010 (UTC)
I've been replacing {{i}} and {{i-c}} (as well as their non-abbreviated forms) with {{qualifier}}, {{sense}}, {{gloss}}, {{a}}, etc. so that we can make any changes to the display of any of these at some future time if we want - {{i}} really shouldn't be being used in the main namespace. As it's all context dependent, I don't think this is automatable. Thryduulf (talk) 08:17, 30 June 2010 (UTC)
I consistently use {{i}} only where {{qualifier}} is appropriate (because the former redirects to, and is therefore equivalent to, the latter); otherwise, I use the more specific templates. Again, since {{i}} = {{qualifier}} already, it would be pointless to make {{q}} = {{qualifier}} too. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 09:38, 30 June 2010 (UTC)Reply
I'm not suggesting {{q}} should redirect to {{qualifier}} (generally I think we should be using template names that are more descriptive than a single letter - {{qual}} perhaps). {{i}} is really not a good name as it refers to the style it produces rather than anything semantic - other than {{i-c}} and synonyms, I can't think of another example of where this is the case. {{i}} is fine for use in discussion pages where all you want is italics without any semantics, but we should always want semantics in entries. And as others use/used {{i}} differently to you, fixing this is something that can never be automated. I'd quite happily deprecate {{i}} for all namespaces, replacing it with specific templates for the mainspace and something like {{ital}} for discussion pages. Thryduulf (talk) 10:29, 30 June 2010 (UTC)Reply
I'm indifferent to deprecating {{i}}, just as long as {{qualifier}} be abbreviated (as you suggest, to something like {{qual}}), because fourteen characters is an excessive number to input when, for the vast majority, it is identical in appearance to the six characters (''…''). — Raifʻhār Doremítzwr ~ (U · T · C) ~ 12:18, 30 June 2010 (UTC)Reply
I've created {{qual}} as a redirect to {{qualifier}}, for anyone that wants to use it, regardless of any deprecation or not. I'll probably use it as well as I seem to have an intermittent inability to type "qualifier" without making at least one typo. Thryduulf (talk) 13:12, 30 June 2010 (UTC)Reply
Translation nesting
Has AF been doing any translation nesting formatting? I ask because I'm trying to reformat User:Atelaes/TargetedTranslations.js to work around some of the nesting issues. So, for example, if the user selects "Greek", it might automatically look up "Greek" and "Modern Greek", or if the user selects "Chinese" it might tell them that this won't return any translations, and they might consider "Mandarin". Also, have you seen Wiktionary:BP#Accelerated Nested Translations? Thanks. -Atelaes λάλει ἐμοί 13:21, 13 June 2010 (UTC)
Latest comment: 14 years ago4 comments2 people in discussion
Hi AF/RU. Would it be possible for AutoFormat to add an entry's PAGENAME to {{q}} (as, for example, for wateriness: {{q|wateriness}}) when said {{q}} is empty (i.e., when the first parameter isn't specified)? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 22:41, 14 June 2010 (UTC)
Using the same criteria, could AF also find and fix any uses of Greek ε (U+3B5) instead of the Latin (and IPA) ɛ (U+25B). This was noted as being a problem on the French Wiktionary, so it's not unlikely to have happened here also. Thryduulf (talk) 12:04, 27 June 2010 (UTC)Reply
Again with the same criteria, ǝ (U+1DD "Latin small letter turned e") should be replaced by ə (U+259 "Latin small letter schwa").
To clarify the whole request: In {{IPA}}, {{IPAchar}} and {{rhymes}} templates (excluding lang= parameters) the following substitutions should be made by AF on an ongoing basis:
(I am here, and I've been reading and thinking about it; just haven't had a reply. And there are all kinds of network problems at the present time.) One concern is that AF is supposed to be about all sorts of syntax "errors", and not about making changes that are semantic, or bordering on it. I'd be more inclined to do something separate, that can look at the IPA strings from a current XML dump, and do a number of things. That could include matching to the SAMPA, some comparison to enPR, and so on. Or perhaps not, perhaps these should just be AF rules (which would not be hard, it has a ruleset for pronunciation section lines). Robert Ullmann 13:53, 28 June 2010 (UTC)
This seems like just the thing AF was designed to do, to be honest. Something small and trivial that's not worth starting a discussion over, but just needs to get fixed. I imagine especially the automatic replacement for g would be welcomed by many people, as it's an easier character to type, so they can be lazy and let AF do the fixing. —CodeCat 14:29, 28 June 2010 (UTC)
I'd equate these changes to typo fixing, such as the s/See alo/See also/ correction AF made to a header I erred in typing yesterday. Changes like /r/ to /ɹ/ (for English only) that I've suggested previously are closer to semantic (although in that specific instance we have de facto consensus at least), which is why I've not included them in this request. In the table above the characters in the left column are not IPA characters and so should not appear in IPA strings. Thryduulf (talk)
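The character fixes named in this thread (Greek ε to ɛ, turned e to schwa, ASCII g to ɡ, colon to the length mark) could be applied with a translation table restricted to the requested templates. This is a sketch under assumptions: the table holds only the four pairs mentioned in the discussion, not the full list from the request, and template arguments are assumed to contain no nested braces.

```python
import re

# Only the substitutions explicitly named in the discussion.
IPA_FIXES = str.maketrans({
    "\u03b5": "\u025b",  # Greek small epsilon -> Latin small open e
    "\u01dd": "\u0259",  # Latin small turned e -> schwa
    "g": "\u0261",       # ASCII g -> IPA script g
    ":": "\u02d0",       # colon -> IPA length mark
})
TEMPLATE = re.compile(r'\{\{(IPA|IPAchar|rhymes)\|([^{}]*)\}\}')


def fix_params(match):
    name, body = match.group(1), match.group(2)
    params = [
        # lang= parameters are excluded, as the request specifies.
        p if p.strip().startswith("lang=") else p.translate(IPA_FIXES)
        for p in body.split("|")
    ]
    return "{{%s|%s}}" % (name, "|".join(params))


def fix_ipa(text):
    """Apply the character fixes inside IPA, IPAchar and rhymes templates only."""
    return TEMPLATE.sub(fix_params, text)


fixed = fix_ipa("{{IPA|/g\u01dd:/|lang=grc}}")
```

Text outside the three templates is never touched, so an ordinary "g" or ":" elsewhere in the entry is safe.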
(hi CodeCat!) (that comment above took me 26 tries to save ... net better now.)
Related to this, could you check if there are any pages in the Rhymes: namespace using one of these characters in their page name? If so, and there are any, then please could you produce a list of them. I doubt that there's going to be enough to bother with automation to fix, but if I'm wrong we can revisit that. Thryduulf (talk) 16:51, 30 June 2010 (UTC)Reply
Oh good. I'd noted the first case as AF changed dragon and the link wasn't either made or broken. (There are two pages?) A number of them seemed to have been moved to the correct forms and then Conrad snapped the redirects, so AF isn't finding that many corrections to rhymes templates. Robert Ullmann 04:40, 1 July 2010 (UTC)
I've gone through and fixed/moved/deleted as appropriate. I've changed those pages that linked to them that I felt needed changing - leaving most main namespace pages to AF. It will be worth looking through the Rhymes namespace pages to check that they are consistently using the right character, but as there is a lot of untemplated usage this will be harder. Thryduulf (talk) 12:21, 1 July 2010 (UTC)Reply
Related to the above, Wiktionary's (de-facto) policy is not to use the following (deprecated) ligatures in IPA transcriptions, so could AF make the following replacements please:
I'd not advise doing this. A t-esh ligature is semantically different from single t + esh, and while the IPA has a tie bar for this purpose, it causes major problems with Arial Unicode MS. -- Prince Kassad 16:02, 28 July 2010 (UTC)
I don't see that as relevant for -
We shouldn't continue to use deprecated IPA (and I'm not the only one who converts the ligatures to separate characters when I encounter them)
We already specify non-broken fonts for IPA (I believe), so tie bars can be added where needed, however
The tie-bars are not explained in any of the pronunciation keys we link to (at least last time I looked), which thus make no distinction between and (even the Wikipedia article simply says they "may be separate or joined by a tie bar")
(afaik) none of our pronunciation sections make a distinction between and ] (or any other such pair)
So given that the distinction between and / is not made, making this change has no effect but to standardise on the standard IPA throughout Wiktionary.
Alternatively, if you really feel the need, we could replace the ligatures with the tie-barred characters, however I think we'd need to explain the tie bars on the pronunciation keys, note they aren't used universally and add them to the toolbar (I think they'd need to be precomposed pairs to make them clickable) first. Thryduulf (talk)
Prince Kassad, can you give us an example (entry) where a distinction between and the ligature or tied characters is made (and needed)? From what I have found, the ligature and the separate characters are considered identical in IPA, with the former deprecated, and the tie optional. (and we apparently opt-out in our keys ;-) Mind you, I'm not going to have AF do something that is questionable here. (and no hurry) Robert Ullmann 08:30, 29 July 2010 (UTC)
I'd love some kind of list about what characters are actually permitted in enPR, just like what I did for IPA. For SAMPA it should be pretty obvious, as that takes only ASCII. -- Prince Kassad 14:45, 4 August 2010 (UTC)
Report is using the list for IPA. There are a few corrections to the list, as /, are not valid characters within the IPA strings, and are treated as invalid. (Which means there is something to be fixed.)
SAMPA treats everything > 127 as invalid; this isn't quite right, there are several characters that aren't part of SAMPA or X-SAMPA. We use parentheses as in enPR and IPA. There are a couple of others, but I don't know which.
I also fixed the parsing a bit, it should have been stripping spaces at ends of parameters, so now there are many fewer cases of / apparently occurring inside the string. Robert Ullmann12:30, 9 August 2010 (UTC)Reply
apostrophe
As noted U+0027 apostrophe should not occur anywhere in IPA. A lot of observed use should be the primary stress marker (U+02C8), and a small amount should be the modifier apostrophe (U+02BC) used to form ejectives.
An interesting example is ბიჭი which uses the stress marker correctly, and then uses the 0027 apostrophe incorrectly.
Suppose AF were to always convert 0027 to the stress marker.
Pro: common use of ' for ˈ would be automatically corrected, this makes this easier for editors much as the g->ɡ and :->ː conversions.
Con: editors adding ejectives would be required to know and use the modifier apostrophe.
(and we have to sort the existing cases of ejectives, but that needs to be done anyway ;-)
I also just noticed there are many uses of the typographical apostrophe (U+2019), which is always wrong in IPA context. You should check that out too. -- Prince Kassad14:09, 2 August 2010 (UTC)Reply
IMO the con outweighs the pro. You've been admirably careful, Robert, to avoid your bots' guessing so as to correct ambiguities, and I see no reason you should change that practice now.—msh210℠ (talk) 16:27, 2 August 2010 (UTC)Reply
Well I think we can automatically correct a subset of the uses - for example where the apostrophe occurs in a position that a modifier apostrophe legitimately can't (for example as the first character of a pronunciation transcription, or for languages that do not use ejectives). Any occurrence that cannot be unambiguously corrected should of course not be so, but should be marked. The reverse occurrences should also be flagged as wrong (e.g. the ejective modifier apostrophe appearing as the first character of a transcription). Thryduulf (talk) 17:37, 2 August 2010 (UTC)Reply
As the first character of the transcription, or after . (a period/full stop), can be corrected IMO. After a consonant in a language that doesn't have ejectives I'm warier about, as there's always a chance someone transcribed something oddly, as an ejective, in a language that phonemically has none. Maybe I'm being too cautious, though.—msh210℠ (talk) 17:59, 2 August 2010 (UTC)Reply
I added a few columns to the table. In particular "pos eject" is the number that are possibly ejectives, depending on the language. There are, as you see, almost none. "pos error" is the number of possible errors that could occur if a rule converting any 0027 or 2019 followed by alpha was used. The other way to read this is it is the number of cases that would be missed if a rule of (not ptkqsɬʃ) (0027 or 2019) (any alpha) was followed. That may be a usable rule? Robert Ullmann10:10, 3 August 2010 (UTC)Reply
It looks okay for me, at least. However, you should also include tɬ' and tʃ', which are common ejectives in Native American languages. Whoops, I got confused by the HTML entities. -- Prince Kassad10:24, 3 August 2010 (UTC)Reply
I used the numeric references to make sure I got them right (arrant pedantry: there are "named entities" and "numeric references" ;-) sorry for confusion. Perhaps I'll try this rule after lunch. Robert Ullmann11:22, 3 August 2010 (UTC)Reply
Numeric character references, you mean. (As opposed to "named entity references", or "character entity references" as HTML 4.01 calls them.) —RuakhTALK12:30, 3 August 2010 (UTC)Reply
If we keep explicit non-IPA pronunciations, we can go through and fix any non-valid characters in them later; both SAMPA and enPR contain fewer legitimate characters, so it would be a smaller job anyway. Thryduulf (talk) 13:28, 3 August 2010 (UTC)Reply
It just came to my mind you should probably check for palatalized and labialized ejectives (02B2 and 02B7). They're uncommon but can occur. -- Prince Kassad14:59, 3 August 2010 (UTC) (addendum: pharyngealized ejectives too: 02C1)Reply
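The rule floated above, (not ptkqsɬʃ) (0027 or 2019) (any alpha), could be sketched roughly as follows in Python. This is only an illustration and not AF's actual code: the name fix_stress_apostrophes is made up, "any alpha" is approximated by a Unicode letter test, and adding the modifiers U+02B2, U+02B7 and U+02C1 to the excluded set is just one possible reading of the suggestion about palatalized, labialized and pharyngealized ejectives.

```python
import re

STRESS = "\u02c8"                      # primary stress marker
EJECTIVE_BASES = "ptkqsɬʃ"             # consonants after which ' may mark an ejective
SECONDARY = "\u02b2\u02b7\u02c1"       # assumption: ʲ ʷ ˁ may sit between base and '

# U+0027 or U+2019, not preceded by a possible ejective base, followed by a letter.
# A negative lookbehind also succeeds at the very start of the string.
_APOS = re.compile(
    r"(?<![" + EJECTIVE_BASES + SECONDARY + r"])['\u2019](?=[^\W\d_])"
)


def fix_stress_apostrophes(ipa):
    """Replace apostrophes that can only be stress marks; leave the rest alone."""
    return _APOS.sub(STRESS, ipa)
```

Anything the pattern skips (an apostrophe after p, t, k and so on) is left untouched, so it would still have to show up in a report for a human to sort out.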
Now for something that should be fairly uncontroversial, dotless-i ı (U+0131) should be corrected to small capital i ɪ (U+026A). -- Prince Kassad10:24, 3 August 2010 (UTC)Reply
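For unambiguous one-character fixes like this one, a plain translation table may be enough. The sketch below is hypothetical (the names IPA_FIXES and normalize_ipa_chars are invented here) and simply combines the g->ɡ and :->ː conversions mentioned earlier with the proposed dotless-i fix. It assumes it is only ever applied to the IPA string itself, never to template names or other parameters.

```python
# Hypothetical table of one-to-one IPA corrections.
IPA_FIXES = str.maketrans({
    "g": "\u0261",       # ASCII g -> script g
    ":": "\u02d0",       # colon -> length mark
    "\u0131": "\u026a",  # dotless i -> small capital i
})


def normalize_ipa_chars(ipa):
    """Apply the one-to-one character corrections to an IPA transcription."""
    return ipa.translate(IPA_FIXES)
```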
Most of the entries in the parsing errors list are caused by an incorrectly closed template. A one-time run finding the string: {{IPA|lang=fr|/*/} (where * is a wildcard matching any number of characters that are not a slash) and appending } to it would fix these (and remove them from the mismatched syntax workload). The transcriptions should then be checked to see if there are other problems.Thryduulf (talk) 10:39, 3 August 2010 (UTC)Reply
They are almost all Dawnraybot. Yes, pretty simple to fix. Should run that. It is effectively moving one } from the end to where it belongs. Robert Ullmann10:49, 3 August 2010 (UTC)Reply
I think I've now fixed all the remaining parsing errors. The IPAchar inside IPA was seemingly caused by an unauthorised bot run in early 2008 that simply converted every instance of IPA:... to {{IPA|/.../}} in French entries, without taking into account that some entries already used {{IPAchar}}. Thryduulf (talk) 12:59, 4 August 2010 (UTC)Reply
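For reference, the one-time fix discussed above (find {{IPA|lang=fr|/*/} with a single closing brace and append the missing one) could look something like this. It is a sketch only: close_ipa_template is an invented name, and it does not try to remove the stray } that may be sitting elsewhere on the line, which would still need checking by hand.

```python
import re

# {{IPA|lang=fr|/.../} where the slashes enclose no further slash,
# and the single closing brace is not already followed by a second one.
_UNCLOSED = re.compile(r"(\{\{IPA\|lang=fr\|/[^/]*/\})(?!\})")


def close_ipa_template(wikitext):
    """Append the missing closing brace to half-closed French IPA templates."""
    return _UNCLOSED.sub(r"\1}", wikitext)
```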
Semicolons
There appear to be two uses of semicolons.
semicolons where there should be multiple parameters (e.g. at absorbing) should be fixable using the existing code where commas separate the parameters.
html entities e.g. /ˈɛɾ̃ɹ̩pɹɑjz/ at enterprise. Both &#nnn; and &#xnnn; forms seem to be used. Could AF convert these to the proper unicode character? I've got a vague feeling that it does/did this elsewhere before? Thryduulf (talk) 16:02, 3 August 2010 (UTC)Reply
yes, the code can easily handle ; as well as , will do soonish
there was some conversion of both entities and numerics (see above ;-) done a while ago by DoddeBot. AF never did this. Should be more general than IPA of course; to be looked at? Robert Ullmann16:10, 3 August 2010 (UTC)Reply
There are some cases where we want HTML entities, for example in cases where MediaWiki screws with characters due to Unicode normalization. However, this should never be the case for IPA. -- Prince Kassad16:14, 3 August 2010 (UTC)Reply
Conversion of HTML entities/references/whatevers should IMO, to aid human-readability, not be done for those characters that look like other, more common (especially 7-bit) characters, like dashes (en, em, minus, et al.) and spaces (non-break, et al.) and also (for the same reason) not for invisible characters (zero-width joiner, et al.).—msh210℠ (talk) 14:52, 4 August 2010 (UTC)Reply
With the exception of invisible characters, I disagree - inclusion of the HTML code significantly decreases human-readability. For example, which is easier to read "јануар" or "&#1112;&#1072;&#1085;&#1091;&#1072;&#1088;"? Also, which glyphs are similar depends on font and size, which vary between systems. Thirdly, "&#1112;" is going to sort differently to "ј" and confuse bots that are looking for that character. Thryduulf (talk) 15:49, 4 August 2010 (UTC)Reply
Hm, I suppose you're right. Perhaps just dashes and spaces? It's a shame to have a non-break space added by an editor on purpose and another editor replace it by a space, not realizing. (Or, worse, copy it to another page, not realizing it's not a space.) Letters and numbers and things like that, which look like one another, I suspect people are less likely to err on.—msh210℠ (talk) 16:12, 4 August 2010 (UTC)Reply
Non-breaking spaces, I can see the benefits of, yes. Dashes I'm not so certain about, while – and — look pretty much identical in the editing window, they're easily differentiable on preview and in entry display (- is easily differentiable in both). What do others think? Thryduulf (talk) 18:00, 4 August 2010 (UTC)Reply
Still, it's pretty confusing if all dashes look the same in the edit window. I've experienced it before. Therefore, it should really be &ndash; or &mdash; -- Prince Kassad18:05, 4 August 2010 (UTC)Reply
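One way to reconcile the positions above would be to decode numeric character references in general but keep a short, configurable list of code points in their escaped form. The Python sketch below is an assumption-laden illustration, not something AF or DoddeBot is known to do: the KEEP set (non-break space, en and em dash, minus, zero-width joiner and non-joiner) and the name decode_numeric_refs are invented, and named entities are deliberately left untouched.

```python
import re

# Assumption: code points that stay escaped because they are invisible
# or easily confused with ASCII characters in the edit window.
KEEP = {0x00A0, 0x2013, 0x2014, 0x2212, 0x200C, 0x200D}

# Decimal (&#1112;) and hexadecimal (&#x458;) numeric character references.
_NUMREF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def decode_numeric_refs(text):
    """Turn numeric character references into characters, except those in KEEP."""
    def repl(match):
        hex_digits, dec_digits = match.group(1), match.group(2)
        cp = int(hex_digits, 16) if hex_digits else int(dec_digits)
        # Leave protected, out-of-range and surrogate values exactly as written.
        if cp in KEEP or cp == 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            return match.group(0)
        return chr(cp)
    return _NUMREF.sub(repl, text)
```

Whether dashes belong in the protected set is exactly the open question in this thread, so that set is the part most likely to change.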
I don't know if it's possible to make special rules for SAMPA, but script G should be converted to plain G in SAMPA transcriptions. -- Prince Kassad18:05, 4 August 2010 (UTC)Reply
we have special rules for IPA, right? But two things: one is that SAMPA uses =, which makes the syntax nastier; and the other is that this error is more likely to be something mis-identified as SAMPA, and should be looked at. Robert Ullmann14:06, 5 August 2010 (UTC)Reply
Essentially, this would do the reverse of what AutoFormat is doing for IPA. The only errors I've seen with SAMPA is enPR being misidentified as SAMPA, and I already cleaned up all or most of them. -- Prince Kassad15:40, 5 August 2010 (UTC)Reply
It mostly seems to be just people who are unfamiliar with SAMPA and thus leave in IPA characters, because they don't know their SAMPA equivalent. -- Prince Kassad18:34, 5 August 2010 (UTC)Reply
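A SAMPA check along these lines might be sketched as below. It reads "script G to plain G" as the reverse of the g->ɡ conversion done for IPA, i.e. U+0261 back to ASCII g, and that reading is an assumption. The name clean_sampa is made up, and the "everything above 127 is suspect" test is the same rough criterion the report uses, so it will not catch ASCII characters that are not valid SAMPA.

```python
SCRIPT_G = "\u0261"  # IPA script g, which has no place in an ASCII-only notation


def clean_sampa(sampa):
    """Return the corrected string plus any remaining non-ASCII characters."""
    fixed = sampa.replace(SCRIPT_G, "g")
    # Leftover non-ASCII characters usually mean IPA was left in by mistake,
    # or that the transcription is not SAMPA at all; report rather than guess.
    leftovers = sorted({ch for ch in fixed if ord(ch) > 127})
    return fixed, leftovers
```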
commas to multiple parameters in pronunciation transcriptions
Would it be possible for AF to edit pronunciation templates that contain a comma or space to show multiple transcriptions in a single parameter and convert them to multiple parameters please? The following rules should catch all cases:
Replace "/,/" or "/, /" or "/ /" with "/|/"
Do not match "|/ /" but report as an empty parameter.
Replace "],, |["
Replace "/..., .../" with "/.../|/.../"
Replace "" with "|"
Replace "{{enPR|..., ....}}" with "{{enPR|...|...}}"
In all cases ... represents any string of characters that are not one of "|", "/", ",", "", "{{" or "}}", and the replacement should be able to cope with more than two transcriptions (e.g. /..., ..., .../}}.
If this isn't possible, can you add these cases to the pronunciation exceptions report.
It already does this in the possible cases. The cases like "/..., .../" can't be done, as the pronunciation can be for a phrase with a comma in it (I've seen this somewhere!) The first two, which are done, is most of the cases. I'll look at some point at getting others into the exception report. Robert Ullmann13:38, 13 July 2010 (UTC)Reply
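The safe subset mentioned in the reply (the first two rules) can be illustrated like this. It is not AF's code; split_transcriptions is an invented name, and the function works on the text of one template call. The "/..., .../" case is deliberately not attempted, for the reason given above.

```python
import re

# "/,/", "/, /" or "/ /" between two transcriptions, unless the first
# slash directly follows a pipe (that would be an empty "/ /" parameter).
_SEPARATOR = re.compile(r"(?<!\|)/(?:, ?| )/")
_EMPTY_PARAM = re.compile(r"\|/ /")


def split_transcriptions(template):
    """Return (new_text, flagged); flagged means an empty parameter was seen."""
    if _EMPTY_PARAM.search(template):
        # Leave the text alone and let the exceptions report pick it up.
        return template, True
    return _SEPARATOR.sub("/|/", template), False
```

For example, split_transcriptions("{{IPA|/a/,/b/ /c/}}") yields "{{IPA|/a/|/b/|/c/}}", so more than two transcriptions are handled by the same substitution.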
Latest comment: 14 years ago9 comments3 people in discussion
{{rfp}} now takes a lang= parameter, so it would be useful if AF could add this where it isn't already there. As AF already adds this to {{IPA}}, I'm guessing it wouldn't be too difficult. Thryduulf (talk) 01:24, 20 July 2010 (UTC)Reply
Given that AF also does this for things like context templates, rhymes, {{plural of}}, etc, would it be perhaps simplest to not mark each template in the code, but to have a page which lists all the templates that AF should add a lang= parameter to if one isn't there already?
Also, does AF do any checks to see if the lang= parameter is correct? e.g. would it do anything about {{plural of|foo|lang=cy}} in a German L2 section? Thryduulf (talk) 11:57, 10 August 2010 (UTC)Reply
the code to add lang= is already there; but only operative inside a Pronunciation section. So missed the reasonable case where someone adds it w/o the header. I've added it to the general list for adding the parameter, and moved IPA to that list as well. And added rfv-pronunciation and homophones. Could export this list to a page at some point as noted.
Thank you. Thinking a bit more on the mismatch of lang= and L2 headers, perhaps it would be better to flag this for human attention rather than assuming the L2 is correct in every case? I guess that in most cases that a mismatch will be a typo or thinko between similar codes (eg de/se, en/enm, etc), but it's possible that the opposite has happened and the L2 is wrong. This is going to be much more obvious though, so perhaps it is safe to do it automatically? Perhaps we should get an idea of the scale of the issue first - if it's only a handful of instances we might as well just do them manually, but if there are lots it might be worth doing at least a subset automatically (e.g. where there are other templates in the L2 section with lang= parameters that match the header). Any thoughts? Thryduulf (talk) 18:49, 11 August 2010 (UTC)Reply
Scope can't hurt, but I think that in any event the inflection line's using {infl|langcode} or {langcode-foo} (matching the header) suffices to allow automation, as long as the entry isn't in a "...that lack inflection template" category for that language.—msh210℠ (talk) 18:55, 11 August 2010 (UTC) 17:40, 12 August 2010 (UTC)Reply
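To make the two ideas in this thread concrete (a single list of templates that should get lang=, and flagging rather than rewriting a lang= that disagrees with the L2 header), here is a rough Python sketch. Everything in it is hypothetical: the ADD_LANG set is only a sample of the templates named above, add_lang is an invented name, and the simple pattern ignores nested templates.

```python
import re

# Assumption: sample of templates that should carry lang=; on the wiki this
# could be exported to a page instead of living in the code.
ADD_LANG = {"IPA", "rfp", "rfv-pronunciation", "homophones", "rhymes", "plural of"}

_TEMPLATE = re.compile(r"\{\{([^{}|]+)((?:\|[^{}]*)?)\}\}")  # non-nested calls only
_LANG = re.compile(r"\|\s*lang\s*=\s*([^|]*)")


def add_lang(section, code):
    """Add |lang=code where missing; collect calls whose lang= disagrees."""
    mismatches = []

    def repl(match):
        name, args = match.group(1).strip(), match.group(2)
        if name not in ADD_LANG:
            return match.group(0)
        found = _LANG.search(args)
        if found is None:
            return "{{" + match.group(1) + args + "|lang=" + code + "}}"
        if found.group(1).strip() != code:
            mismatches.append(match.group(0))  # flag for a human, do not edit
        return match.group(0)

    return _TEMPLATE.sub(repl, section), mismatches
```

Called with the text of a German section and "de", {{plural of|foo|lang=cy}} would come back unchanged but listed in the mismatches.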
Latest comment: 14 years ago1 comment1 person in discussion
Perhaps less easy, and less important (but a nice to have if it's not too tricky) would be to move {{rfp}} to within a pronunciation section if it isn't in one already. I think the following rules should cover it:
If {{rfp}} is already in a Pronunciation section (at any level) then leave it where it is
If there is one L3 Pronunciation section, move it to the end of that section
If the {{rfp}} is within an Etymology N section, and a pronunciation section exists within the same Etymology N section, move it to the end of that pronunciation section
If no pronunciation section exists, and the {{rfp}} is not inside an L3 Etymology N section, create an L3 Pronunciation section immediately before the first L3 POS section or immediately before the first L3 Etymology N section and move the {{rfp}} there.
If the {{rfp}} is within an L3 Etymology N section, create an L4 Pronunciation section immediately before the first L4 POS section if one doesn't exist, and move the {{rfp}} there
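The five rules above amount to a small decision procedure, which could be written down as follows. This is only a sketch of the decision step: the function name, its four inputs and the returned labels are invented for illustration, "no pronunciation section exists" is approximated as "no L3 Pronunciation section", and anything the rules do not clearly cover is sent to review rather than guessed.

```python
def rfp_action(in_pronunciation, in_etymology_n, pron_in_same_etymology, l3_pron_count):
    """Decide what to do with an {{rfp}}, applying the rules in the order listed."""
    if in_pronunciation:
        return "leave"                 # already in a Pronunciation section
    if l3_pron_count == 1:
        return "move_to_l3"            # end of the single L3 Pronunciation section
    if in_etymology_n and pron_in_same_etymology:
        return "move_to_l4"            # end of the Pronunciation in this Etymology N
    if l3_pron_count == 0 and not in_etymology_n:
        return "create_l3"             # new L3 section before first POS / Etymology N
    if in_etymology_n:
        return "create_l4"             # new L4 section before the first L4 POS
    return "review"                    # e.g. several L3 Pronunciation sections
```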
Latest comment: 14 years ago2 comments2 people in discussion
Hello,
See this diff without further explanation:
It wasn't easy to notice and not easier to find. Would be wise to check all the possible, similar problems and kindly revert them. Thanks. --grin11:08, 14 September 2010 (UTC)Reply
There are not supposed to be two consecutive identical headers. It removed one, as it was designed to do. I have moved the content to Talk:hallo pending some support for the claimed origin. Looking forward to continuing the discussion there. DCDuringTALK20:34, 24 September 2010 (UTC)Reply
Linked parameter
O_O Autoformat adds links to {{plural of}}?? It's not supposed to have a link, linked parameters break the section anchor and accomplish nothing. AF should remove linked parameters, not add them. --Yair rand (talk) 18:51, 17 September 2010 (UTC)Reply
The edit comment says "...to make page count" and there is a discussion relating to this in December 2008, User talk:AutoFormat/2008#wikilinking lemma terms in form-of templates. Maybe it was considered a useful edit at the time, but why did it only cover some of the template calls? And as we now think the parameter should not be linked, why have these links not been removed? Is this just by mistake or is there still a good reason? --LA219:05, 17 September 2010 (UTC)Reply
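If the outcome is that the links should go, the removal itself would be simple. The sketch below is hypothetical (unlink_first_param is an invented name) and only handles a plain [[...]] link in the first parameter of {{plural of}}; piped links are left for a human, since it is not obvious which half should be kept.

```python
import re

# {{plural of|[[word]]...  with a plain, unpiped link as the first parameter.
_LINKED = re.compile(r"(\{\{plural of\|)\[\[([^\[\]|{}]+)\]\]")


def unlink_first_param(wikitext):
    """Strip the wikilink brackets from the first parameter of {{plural of}}."""
    return _LINKED.sub(r"\1\2", wikitext)
```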
Latest comment: 14 years ago1 comment1 person in discussion
Hi. May I ask that, when an entry does not contain a "Translations" section, AutoFormat automatically adds it? Apparently your bot still does not have that helpful function, and I would appreciate if it were introduced. --Daniel.22:07, 7 October 2010 (UTC)Reply
Has AF gone on Wikibreak?