(January) Decision
Not that this is a real vote, but with 70%, I think the first option passes. Can we get a bot to start moving these? @CodeCat? —JohnC5 22:26, 28 February 2016 (UTC)
- I'll need to make some changes to Module:links first so that links don't break too much. —CodeCat 22:58, 28 February 2016 (UTC)
- Thanks! :) —JohnC5 23:01, 28 February 2016 (UTC)
- I've made a small edit to Module:links. It now links to the Reconstruction: or Appendix: based on the existence of pages with that name. If the Reconstruction page exists, or if neither page exists, it links to that. It only links to the Appendix page if it exists but the Reconstruction page does not. So it should be no problem to move entries without leaving redirects behind, as links will update as soon as the move is made. Red links will automatically take you to the Reconstruction namespace. —CodeCat 00:50, 29 February 2016 (UTC)
- Looks good to me! —JohnC5 01:15, 29 February 2016 (UTC)
- I modified {{reconstruction}} with a link that automatically takes you to the "move page" form with everything already filled in. That should make it easier to move the pages. —CodeCat 01:20, 29 February 2016 (UTC)
- @CodeCat: It seems that the box for leaving a redirect behind is unchecked by default. I strongly believe that there should be redirects left behind, because otherwise we are leaving the internet with a bunch of dead links for no good reason. There will be no conflicts if we leave them, so why wouldn't we? —Μετάknowledgediscuss/deeds 02:28, 29 February 2016 (UTC)
- I made it unchecked on purpose. I don't like leaving leftovers. —CodeCat 02:29, 29 February 2016 (UTC)
- Okay, but it seems more important that we not produce dead links to Wiktionary, because that reduces traffic in the short term. —Μετάknowledgediscuss/deeds 02:31, 29 February 2016 (UTC)
- Dead links are inevitable, they're a normal part of website evolution. I don't think we should give ourselves the duty to clean up other people's links for them. —CodeCat 02:34, 29 February 2016 (UTC)
- It's not a matter of duty; it's a matter of whether we want more people using this website or fewer. —Μετάknowledgediscuss/deeds 02:36, 29 February 2016 (UTC)
- We can delete the redirects in a year or so. Let's leave them for now. --WikiTiki89 16:30, 29 February 2016 (UTC)
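The fallback that CodeCat describes for Module:links in the thread above is a small decision rule. A minimal Python sketch of that rule follows. The real module is written in Lua and is not reproduced here, and `existing_pages` is a hypothetical stand-in for whatever page-existence check the software performs:

```python
def choose_namespace(reconstruction_exists: bool, appendix_exists: bool) -> str:
    """Pick the namespace a reconstructed-term link should point to.

    Follows the rule as described in the discussion: link to Reconstruction:
    unless the Appendix: page exists and the Reconstruction: page does not.
    """
    if appendix_exists and not reconstruction_exists:
        return "Appendix"
    return "Reconstruction"


def link_target(title: str, existing_pages: set) -> str:
    # `existing_pages` is an assumption made for illustration: a set of
    # full page names that currently exist on the wiki.
    namespace = choose_namespace(
        f"Reconstruction:{title}" in existing_pages,
        f"Appendix:{title}" in existing_pages,
    )
    return f"{namespace}:{title}"
```

Under this rule a red link (neither page exists) points to the Reconstruction namespace, which is why pages could be moved without leaving redirects for internal links.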
(January) Officializing automated romanizations
Wiktionary:Votes/pl-2015-12/Translations is probably going to fail. I have the intention of creating a new vote with the same proposal, but improved/fixed based on the multiple points raised by the opposers.
One of the points is: "switching Russian: {{t|ru|апельсин|m|tr=apelʹsín}} to Russian: {{t|ru|апельси́н|m}} is a topic for a separate vote, deserving its own discussion". WT:EL#Translations currently uses some Russian examples with manual romanizations (tr= parameter). But Russian can do it automatically, so as requested in Wiktionary talk:Votes/pl-2015-12/Translations#Transliteration, in my proposed change, I want to use Russian examples without the tr= parameter, which implies that automatic romanizations are official policy. Is there any problem or controversy here?
I have no problem creating a separate vote officially "Allowing automatic romanizations", if that's what people want. --Daniel Carrero (talk) 12:25, 26 January 2016 (UTC)
- By the current convention, not ALL Russian words are automatically transliterated but over 95%. Only languages under "override_translit" in Module:links have automatic transliteration overriding manual. So, Russian is a bad example. --Anatoli T. (обсудить/вклад) 12:31, 26 January 2016 (UTC)
- Current rule in WT:EL#Translations:
- Do add a transliteration or romanization of a translation into a language that does not use the Roman alphabet. Note however that only widespread romanization systems may be used. See Wiktionary:Transliteration.
- My proposed change, as per Wiktionary:Votes/pl-2015-12/Translations:
- If there's some controversy, I could further edit the sentence this way:
- You can add a transliteration or romanization of a translation into a language that does not use the Latin script. In some languages, the romanization can be supplied automatically by the software, but there's no consensus as of yet concerning the acceptability of automatic romanizations and exactly what languages should use them. See Wiktionary:Transliteration and romanization.
- Looks good? Of course, if there's consensus in favour of generally having automatic transliterations (I'd vote support on that), then that last change would be unnecessary. --Daniel Carrero (talk) 16:19, 26 January 2016 (UTC)
- I'm not clear as to what you're actually proposing. Or are you not proposing anything yet? Renard Migrant (talk) 18:37, 26 January 2016 (UTC)
- Proposal 1: For some languages, allowing automatic romanizations.
- Proposal 2: In some WT:EL examples of wiki markup of Russian translations in the translation tables, using automatic romanizations.
- Reason: I assumed that was a given (i.e., that people generally are supportive of automatic romanizations and that it would be okay mentioning one or two examples in WT:ELE using them), but in Wiktionary:Votes/pl-2015-12/Translations#Oppose, @Dan Polansky complained about it. --Daniel Carrero (talk) 19:22, 26 January 2016 (UTC)
- OK, I'm a bit confused about what is specified as current policy and what is being proposed, but I actually wrote and ran a bot to automatically convert {{t|ru|апельсин|m|tr=apelʹsín}} to {{t|ru|апельси́н|m}}, and no one has complained about it; in fact, the main Russian editors here were happy with the results. Actual current policy doesn't agree much at all with the rule that Daniel quoted above. In particular:
- "do add a transliteration or romanization" isn't really right. It should ideally only be added when automatic transliteration either doesn't exist for a language or would be wrong. In particular, writing апельсин without a stress mark and then including manual translit apelʹsín with a stress mark is wrong; instead, апельси́н should be written and the auto-translit allowed to work. As Anatoli mentioned, most of the time (I would say over 99%) the automatic transliteration for Russian is correct (provided of course that stress marks are added to the Russian). Pretty much the only time when manual translit is needed for Russian is in cases like тест (tɛst), where the auto-translit would be test. For other languages, it may be needed more often; e.g. for Arabic, it's often needed to specify how a tāʾ marbūṭa should be transliterated (as t or as nothing).
- "only widespread romanization systems may be used" gives far too much latitude. This kind of attitude created a huge mess in the Arabic transliterations in translation entries, which took a lot of work on my part to fix (and may have gotten messed up again in more recent entries). Properly, the transliterations must follow the particular translit system used by Wiktionary for that language.
- I would propose something like:
- Add a transliteration or romanization of a translation into a language that does not use the Latin script, except for those languages where the romanization is supplied automatically by the software (but do add a transliteration/romanization if the automatically-provided one is wrong). The transliteration should follow the appropriate Wiktionary-established conventions for the language in question (see Category:Transliteration appendices); do not use any other romanization system.
- Benwing2 (talk) 05:37, 27 January 2016 (UTC)
- @Benwing2: That sounds great to me. I think these are actually our current "unspoken rules", which you could articulate well. Since @Dan Polansky asked for that specific issue to be voted separately, I don't mind creating a separate vote for it. --Daniel Carrero (talk) 13:33, 27 January 2016 (UTC)
- Oh you're trying to get common practice codified somewhere. Excellent. Go for it. Renard Migrant (talk) 15:43, 27 January 2016 (UTC)
I created Wiktionary:Votes/pl-2016-01/Automated transliterations. --Daniel Carrero (talk) 03:58, 28 January 2016 (UTC)
- Proposed wording:
- Translations not written in the Latin script should have romanizations. In some cases, the romanization is supplied automatically by the software. Supply the romanization manually if it is not supplied by the software or if the romanization supplied by the software is wrong. The transliteration should follow the appropriate Wiktionary-established conventions for the language in question (see Category:Transliteration policies); do not use any other romanization system.
- --Daniel Carrero (talk) 00:45, 30 January 2016 (UTC)
- Perhaps the use of manual transliteration for automatically transliterated languages should be mentioned (exceptions as in Korean, Russian, etc.). The necessity to provide word stresses for Cyrillic-based Slavic (Serbo-Croatian accents?), diacritics for Arabic (Hebrew?).
- Some languages are in a transition and a unified transliteration hasn't established yet, due to complexities - such as Khmer and Thai. Some transliteration modules are in the process of development or fixing - Lao, maybe Burmese. Thai module may never work the way other transliteration modules do, it will need phonemic spelling, split by syllables, just like Japanese requires kana readings and PoS info (plus morpheme boundaries in some cases) to determine the correct transliteration. Just commenting. --Anatoli T. (обсудить/вклад) 01:18, 30 January 2016 (UTC)
- @Atitarev: IMO, ideally I would want a comprehensive list of transliteration circumstances as you described, but for the moment I'll probably just try to update WT:EL to officially allow transliterations in the first place. --Daniel Carrero (talk) 10:06, 30 January 2016 (UTC)
- @Daniel Carrero Sorry, I have been pre-occupied with testing new Thai transliterations and fixes with Russian. Quite busy at work too. There ARE changes currently happening with Thai transliteration methods and headwords and situations with transliterations and requirements with languages are indeed different. Make a list of questions, if you need for transliteration policies/issues and I'll try to answer them. --Anatoli T. (обсудить/вклад) 00:56, 1 February 2016 (UTC)
- @Atitarev Thank you very much. :) There's absolutely no need to apologize, language-specific transliteration issues and changes are valuable information to be documented, it's just that WT:EL technically does not even allow automated transliterations. It says: "Add a transliteration or romanization of a translation into a language that does not use the Roman alphabet." If we were to obey that pre-Lua rule, we would have to throw away all transliteration modules, so I'll try to update that rule first, before working on the language-specific issues. (at least that's my plan at the moment) --Daniel Carrero (talk) 01:07, 1 February 2016 (UTC)
- I'll just describe briefly how I understand the situation with automated transliterations, not manual transliterations.
- Slavic, Cyrillic-based languages should normally use accent marks, especially, Russian, Ukrainian and Belarusian.
- Russian: User:Benwing2 kindly converted all Russian translations to have accents when they were present in the manual transliteration. For cases when manual transliterations are required, word stresses are still required. Exceptions requiring manual translit are described, many are now partially automated.
- Ukrainian and Belarusian: These don't require manual transliterations, if fully accented Cyrillic forms are provided. There is an unresolved issue with monosyllabic Belarusian words with "ё", fixed in Russian. Manual transliterations shouldn't be simply removed before Cyrillic words get accents.
- The above three - currently deciding if we need to use grave accents for the secondary stress. Its usage is inconsistent.
- Bulgarian: No manual transliteration is required, if accents are provided. The use of the grave accent is not very clear but normally used with accented vowel "ъ". More info is required on the rules.
- Macedonian: No manual transliteration is required. The stress position is predictable but accents should be given for words when they differ from expected.
- Serbo-Croatian: (Cyrillic) No transliteration is required. Another nested Roman form should be given to match the Cyrillic. The headwords use accents. (I personally find them problematic but they can be copied from entries if they exist).
- Arabic: Automatic transliteration works only with fully vocalised Arabic forms. Loanwords that are pronounced irregularly still need to have accents; manual transliteration is required (or can be provided) for some loanwords, words with "ة" between words, and some words with silent letters.
- Korean: Automated but words of certain etymologies need manual transliterations.
- Manual translit overrides automatic for all the above.
- Greek, Armenian, Georgian, Kazakh, Kyrgyz, Tajik, etc. - fully automated. The list can be found in Module:links
- Lao, Burmese: - fully automated. The transliteration is complex, sometimes doesn't work.
- Khmer: - needs more work, can't officialise yet.
- Hindi, Sanskrit, Nepali - almost there, the modules look good but occasional manual translit is required.
- Japanese and Thai are special cases.
- Japanese: transliteration works with headwords on kana, there are some exceptions and additional parameters are sometimes needed to get a correct transliteration. Can't be used to automatically transliterate Japanese words.
- Thai: (new) only works in pronunciation sections. It needs phonetic respellings by syllables. Can't be used to automatically transliterate Thai words.
- Feel free to add on transliteration policies. --Anatoli T. (обсудить/вклад) 02:02, 1 February 2016 (UTC)
- Special thanks from me to User:Wyang and User:Benwing2 (aka Benwing) for making some complex transliterations happen! --Anatoli T. (обсудить/вклад) 02:14, 1 February 2016 (UTC)
- That is an amazing list! Thank you! :) --Daniel Carrero (talk) 02:20, 1 February 2016 (UTC)
- OK. With Hindi, there seems to be an agreement to provide nuqta and chandra when they affect pronunciations, even if Hindi speakers normally omit them in writing. (Very similar to Russian writing "е" instead of "ё" but dictionaries use "ё", so does Wiktionary). Some Sanskrit lovers prefer to provide word stresses, even if there is no native method to that (also Hebrew) and add hyphens to show morpheme boundaries. (I personally oppose that but I need to mention).
- Mongolian (Cyrillic): Fully automated, overrides manual translit but it's known that Mongolian Cyrillic is not fully phonetic. Some textbooks and phrasebooks provide a more phonetic transliteration but we don't - no data or editors.--Anatoli T. (обсудить/вклад) 02:30, 1 February 2016 (UTC)
- Overall, I would find it hard to set the rules to vote for and I am not sure how I am going to vote. I'd like to officialise the use of "^" to capitalise Korean romanisations (romaja officially capitalises proper nouns). I haven't described all situations, of course. e.g. Tamil, Malayalam, Telugu and Sinhalese don't require manual overrides but Amharic, Tigrinya do (rules for schwa-dropping are not defined and consonant geminations may need to be provided manually). Yiddish words of Hebrew origin are often transliterated and pronounced irregularly. --Anatoli T. (обсудить/вклад) 03:07, 1 February 2016 (UTC)
- We don't need to set the rules for each language, we just need to mention that transliterations should appear for non-Latin script languages (with the exception of languages such as Serbo-Croatian, which are exempt as long as an equivalent Latin-script form is supplied), regardless of whether they are manually entered or automatically generated. Then each language can worry about how this happens on its own without policy getting in the way. And even this much should not be part of the Translation Table policy, but general transliteration in links policy. --WikiTiki89 19:14, 3 February 2016 (UTC)
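The precedence between manual and automatic romanizations described in this thread (manual normally wins, except for languages listed under "override_translit") can be sketched as follows. This is only an illustration in Python, not the Lua code of Module:links. The function name is hypothetical, and the contents of `OVERRIDE_TRANSLIT` are an assumption based on the remark that Mongolian's automatic transliteration overrides manual input:

```python
from typing import Optional

# Hypothetical stand-in for the "override_translit" list mentioned above.
# "mn" (Mongolian) is included only because the discussion says its
# automatic transliteration overrides manual input.
OVERRIDE_TRANSLIT = {"mn"}


def pick_romanization(lang: str,
                      manual: Optional[str],
                      automatic: Optional[str]) -> Optional[str]:
    """Choose which romanization to display for a term."""
    # Override languages: the automatic result wins whenever it exists.
    if lang in OVERRIDE_TRANSLIT and automatic is not None:
        return automatic
    # Otherwise a manual romanization, if supplied, takes precedence.
    if manual is not None:
        return manual
    # Fall back to the automatic result (None if no module exists).
    return automatic
```

With this rule, the тест example keeps its manual form: `pick_romanization("ru", "tɛst", "test")` yields the manual value, while an accented Russian term with no `tr=` simply uses the automatic one.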
(February) Changes to Thai headwords, new semi-automatic transliteration, abstract nouns with ความ, etc.
@Wyang, Iudexvivorum, Octahedron80, หมวดซาโต้, Alifshinobi
There are a lot of rapid changes currently happening to Thai entries. Wyang has developed a semi-automatic module to transliterate phonemic Thai using Paiboon publisher's method. It seems to work in most cases and has been accepted by Thai editors and more changes introduced. Good job!
Now, the original Thai transliterations get out of date and become dispreferred. That's fine but entries still use them. It's OK to remove the old transliteration when the new transliteration is added in the Pronunciation section - only in one place, to keep it in sync with any future changes.
However, the majority of entries have not been converted to use {{th-pron}}. Can we really change the headword, so that most entries don't have ANY transliteration? I don't think users will be happy with that. Maybe some transliteration is better than nothing.
{{th-noun}}, {{th-verb}}, etc. now ignore manual transliterations and don't display automatic - the reason is that manual is non-standard and automatic one may be wrong for non-phonetic short words. (only monosyllabic words are transliterated automatically but irregular words are transliterated incorrectly).
BTW, {{th-l}} will correctly transliterate words that are defined and use {{th-pron}} (or if phonetic respelling is added with |p=).
Also, Octahedron80 changed the Thai headword to add "abstract noun ความ" to EACH verb, adjective and adverb. Many such PoS would benefit from it but many have become wrong, such as ไทย, etc., which now shows ไทย (abstract noun ความไทย). That's wrong for this and many other entries. The fix is easy, just add "|-" but someone would have to check hundreds of other verbs, adjectives and adverbs!
We need to decide on the rules and what should be allowed. E.g. adding ความ to all verbs, adjectives and adverbs was probably a bad or rash idea if no-one knowledgeable checks ALL these entries.
Rather than removing transliterations from the headword, maybe auto-translit should be turned off? --Anatoli T. (обсудить/вклад) 13:50, 25 February 2016 (UTC)
{{look}}
@Wyang, Iudexvivorum, Octahedron80, หมวดซาโต้, Alifshinobi I still need to hear native users' and Wyang's (as the main developer of the module) opinion on the two matters. Please don't ignore this important topic. Your attention is required as there are hundreds, if not thousands, of entries affected by changes. My knowledge of the language is not enough and it's also very time-consuming. I have suggestions but I want to hear your opinions first. If you don't respond, I am afraid I have to revert some changes by Octahedron80 and do something about missing transliterations. --Anatoli T. (обсудить/вклад) 22:08, 25 February 2016 (UTC)
- Abstract noun (อาการนาม) is a type of noun taught in school. (Others are common noun, proper noun, collective noun, and classifier.) It is formed by putting การ or ความ before the stem. General rule is: การ for verb and ความ for adj & adv. This noun is unavoidably mapped with English vocab, for example, การเดิน=walking, การใช้=using/usage, ความสุข=happiness, ความพอใจ=satisfaction. (ความไทย may mean Thai-ness but people usually use ความเป็นไทย instead.) Most words apply this rule. But sometimes the rule is broken because some words can use both and some words cannot be prepended; this can be determined by how people or literature use it. (This is like RFV. Googling is an option.) Another condition I have noticed recently is that adj & adv which come from proper noun (like countries) won't become abstract noun. However, every rule has exception, right? If you are afraid that checking their existence will be a big task, don't worry. This is not an urgent matter. I can volunteer to check them all. Th-headword module has been used on th-wiktionary for months and it works well, therefore I copy it here and hope it is useful. --Octahedron80 (talk) 02:08, 26 February 2016 (UTC)
- PS. Which parts of my changes you want to revert? Please discuss to me each case. --Octahedron80 (talk) 02:08, 26 February 2016 (UTC)
- About transliteration, I can't say much because I didn't co-develop the module. I just did data correction. In my opinion, Thai translit will never be perfect since there's a lot of exceptions. Direct logic can't take it all. By the way, there is another system by the Royal Institute that is officially used for naming and documents. Should this be included too? --Octahedron80 (talk) 02:54, 26 February 2016 (UTC)
- @Octahedron80 Thank you for responding! No, I won't revert anything without an agreement but I was worried about the impact and if nobody looked after the changed entries, then there wouldn't be anyone to fix the wrong ones. Have you assessed how many adv, adj and verb entries need "-"? Are you able to check ALL of them over time?
- As for transliteration. You may be right that the automatic translit may not work for all cases, all the more it's important to keep the old one in place, rather than disabling tr= parameter altogether. --Anatoli T. (обсудить/вклад) 03:26, 26 February 2016 (UTC)
- @Octahedron80 this revision of แขก (kɛ̀ɛk) doesn't show any transliteration, although it has "|tr=kàek". I am fixing it now but there are many entries to be fixed. --Anatoli T. (обсудить/вклад) 04:06, 26 February 2016 (UTC)
Th-headword module improved. Headword can also show transliteration by these conditions:
- If tr parameter is set, show its text.
- If mono (monosyllable) parameter is set, transliteration will be made with its text via th-pron module. (may suggest better parameter name)
- If nothing above is set, transliteration will be made with its pagename via th-pron module.
--Octahedron80 (talk) 08:29, 26 February 2016 (UTC)
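The three conditions listed in this comment form a simple fallback chain. A minimal Python sketch of that chain follows. The actual th-headword module is Lua and is not shown in the discussion, so the function name and the `transliterate` callable (standing in for the th-pron module) are assumptions made for illustration:

```python
from typing import Callable, Optional


def headword_translit(pagename: str,
                      transliterate: Callable[[str], str],
                      tr: Optional[str] = None,
                      mono: Optional[str] = None) -> str:
    """Pick the transliteration shown in a Thai headword line.

    `transliterate` is a hypothetical stand-in for the th-pron module.
    """
    # 1. An explicit tr parameter is shown as-is.
    if tr:
        return tr
    # 2. Otherwise, transliterate the text given in the mono parameter.
    if mono:
        return transliterate(mono)
    # 3. Otherwise, transliterate the page name itself.
    return transliterate(pagename)
```

The order matters: a stale manual `tr` value would mask any later improvement to the automatic method, which is the concern raised in the replies that follow.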
- @Octahedron80 Thanks for the changes. #1 is important but when all entries are fixed to use {{th-pron}}, we won't need |tr=. The transliterations (the method itself) can change at any moment in the future, that's why Wyang suggested to remove tr= on new and converted entries. #2 may be flawed, as could be the case with ทราย, compare ทราย (saai) vs ทราย (saai) - the same word transliterated differently. The 2nd works correctly because the word is defined and it has a working {{th-pron}}. The best way, if we need transliteration in the headword as well, is to use method #3, monosyllabic or long words.
- Okay I will drop tr. And you see ทราย and จักรยานยนต์ again. --Octahedron80 (talk) 11:37, 26 February 2016 (UTC)
- Are you planning to go through all adj, adv and verbs to check if "abstract noun ..." is used correctly? I don't know enough Thai to do that myself but it's now the default for all entries using {{th-adj}}, {{th-adv}} and {{th-verb}}. Do you think there will be many entries which don't need it? --Anatoli T. (обсудить/вклад) 11:32, 26 February 2016 (UTC)
- Yes, I will. From my experience at th-wiktionary. About 10-20% of total verbs, adjs & advs can't be abstract noun. That is the reason why I can check. I also have a bot to collect new words list periodically. --Octahedron80 (talk) 11:41, 26 February 2016 (UTC)
- @Octahedron80 Thanks. It's better to remove the auto-translit from the headword (if there's no "|tr="), please see ฉลาด (chà-làat), which shows "chlàat", instead of "chà-làat". Unless you can make it use the phonetic Thai, the same as in the pronunciation section. Do you agree? --Anatoli T. (обсудить/вклад) 11:51, 26 February 2016 (UTC)
- Not sure if I explain myself well. tr= is still needed but during the transitional period because there are a lot of entries still, which don't have {{th-pron}}. NEW or converted entries may do without "tr=". --Anatoli T. (обсудить/вклад) 11:54, 26 February 2016 (UTC)
- @Octahedron80 I can see you're fixing it with "mono". Thanks. I guess it's OK (only some duplication effort, perhaps). A better name would be "phon" - for "phonetic" or "phonemic". Please note, that Chinese headwords don't have transliterations in the headword (but they cater for multiple topolects), so, if this requires too much effort, the pronunciation section would probably suffice. --Anatoli T. (обсудить/вклад) 12:07, 26 February 2016 (UTC)
I think I made it. The first parameter from th-pron is brought to transliterate in th-headword. Now we don't have to retype phon (or mono) as long as that page uses th-pron template. (case of needing, see สระ) Also we can track down Category:Thai terms without th-pron template; in that case, phon (or mono) will be used. Let's ignore those tr's and change to th-headword template. --Octahedron80 (talk) 04:57, 27 February 2016 (UTC)
Thanks @Octahedron80 for your various edits on Module:th-pron. I make combinations like h+m/n/l/r/ng/w/y interpreted as main+coda unless there is a vowelless sign in between. There may be existing entries containing such combinations undesirably interpreted as main+coda, and we can modify the module to categorise them into a maintenance category to track them down. I support the removal of headword transcriptions. While we are here, let's also discuss (1) Template talk:th-pron: a potential tabular display of {{th-pron}}; (2) and modification of Module:links, such that any attempt to link to a Thai word (whether it be via {{l}}, {{m}} or {{compound}}) is handed over to Module:th#link. Alternatively, we need to make {{th-compound}} for use in Etymology. Wyang (talk) 05:29, 27 February 2016 (UTC)
- @Wyang Do you want to use w:ISO 11940 or not? If not I will recover your complete old charTable. --Octahedron80 (talk) 05:44, 27 February 2016 (UTC)
- I liked the vowel part of ISO 11940 but not so much for the consonant part. The sound values are not very apparent. So I restored the table for consonants but left the one for vowels unchanged. Any thoughts on the tabular format and linking? Wyang (talk) 06:00, 27 February 2016 (UTC)
@Wyang, Iudexvivorum, Octahedron80, Atitarev Hello! I have some questions regarding transliteration of homographs (คำพ้องรูป). What we should do when a single entry contains several etymologies/parts of speech with different pronunciations? Examples: เจ้า can be pronounced "jâao" (chief, lord, master, etc.) and "jâo" (second person pronoun, particle, etc.); เช้า can be pronounced "cháao" (morning) and "cháo" (basket); etc. Can we manually add transliterations in the headwords when each etymology/part of speech pronounces differently? Or can we enable the templates to automatically generate different transliterations in a single entry? The newly fixed templates seem to automate only one transliteration (as seen in เจ้า). --หมวดซาโต้ (talk) 08:35, 27 February 2016 (UTC)
- It can detect only first one; this is also my intention. It doesn't know where th-noun was put before or after or how many they are or which one to show. In this case we have to put phonetic next to it. Take some action for less problem. --Octahedron80 (talk) 09:28, 27 February 2016 (UTC)
- PS. Please split dialects as another language section. Northern Thai which uses 'nod' code is not actually sub-set of Thai. Isan (tts), Southern Thai (sou), Pattani Malay (mfa) either. --Octahedron80 (talk) 09:32, 27 February 2016 (UTC)
(March) Reconstruction move count
In case anyone is interested:
- Pages in the Reconstruction namespace: 2,597
- Appendices to be moved (nonredirects starting with "Proto"): 5,963
Also, can the rest of the appendices be moved by bot? --Daniel Carrero (talk) 18:30, 4 March 2016 (UTC)
- One important thing to note is that not all reconstructions start with Proto: unattested Latin terms, for example. --WikiTiki89 18:42, 4 March 2016 (UTC)
- I can bot move every Appendix:Proto... page and corresponding talk page to the Reconstruction namespace if nobody else feels like doing so, but if there are more nuances to it than that I will leave it to someone who knows what is going on. - TheDaveRoss 18:45, 4 March 2016 (UTC)
- The nuances are that if a page is a redirect, it should be fixed to redirect to the Reconstruction namespace as well. For non-redirects, it would be good to make sure that the page has a {{reconstructed}} (or {{reconstruction}}) template at the top, otherwise it might not actually be a reconstruction page (and it would be nice to have a list of such pages if there are any). --WikiTiki89 18:51, 4 March 2016 (UTC)
- Those are all pretty straightforward, I can probably do it this afternoon if nobody beats me to it. Also see above for the list of pages. - TheDaveRoss 18:56, 4 March 2016 (UTC)
- Swadesh lists aren't entries, they're lists. I don't think they should be moved. Chuck Entz (talk) 20:02, 4 March 2016 (UTC)
- Thanks. I moved the ones that were actually reconstructions and left the ones that weren't. Another thing: If there are redirects from pages without a slash to pages with a slash (e.g. from "Appendix:Proto-Something foo" to "Appendix:Proto-Something/foo"), they can be deleted. --WikiTiki89 20:04, 4 March 2016 (UTC)
- @TheDaveRoss: Please confirm that you read my previous post about redirects. Thanking this edit would be enough. --WikiTiki89 22:09, 4 March 2016 (UTC)
- @Wikitiki89 Currently I am ignoring redirects altogether (only moving pages which are not redirects), once those are done I can circle back and fix redirects so that they all point to the right place. - TheDaveRoss 22:13, 4 March 2016 (UTC)
- Except for a few Old Prussian terms I'm not sure about, all of the non-protolanguage reconstruction pages should be done. KarikaSlayer (talk) 19:29, 4 March 2016 (UTC)
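The checks discussed in this sub-thread (skip non-"Proto" titles, defer redirects, and only move pages that carry a reconstruction template at the top) can be sketched as a pure planning function. This is not TheDaveRoss's bot, whose code does not appear in the discussion. The function name and the action labels are hypothetical, and no wiki API is called:

```python
# Template openings that mark a page as a reconstruction, per the discussion.
RECON_TEMPLATES = ("{{reconstructed", "{{reconstruction")


def plan_move(title: str, wikitext: str, is_redirect: bool):
    """Decide what a mover bot should do with one Appendix page.

    Returns (action, new_title); new_title is None unless action is "move".
    """
    if not title.startswith("Appendix:Proto"):
        return ("skip", None)
    if is_redirect:
        # Redirects are ignored in the first pass and retargeted afterwards.
        return ("fix_redirect_later", None)
    if not wikitext.lstrip().startswith(RECON_TEMPLATES):
        # No template at the top: may not be a reconstruction; list for review.
        return ("review", None)
    new_title = "Reconstruction:" + title[len("Appendix:"):]
    return ("move", new_title)
```

Separating the decision from the move itself makes it easy to produce the list of doubtful pages WikiTiki89 asks for before anything is changed.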
- Another nuance is that a lot of pages in the Index: namespace have simple wiki links to the Appendix: namespace rather than links provided by {{l}} or {{m}}. For example, Index:Proto-Indo-European/d has ] and ] instead of {{l|gem-pro|*talō|1}} and {{m|gem-pro|*talō}}. That means all those links will be broken whenever a reconstruction is moved to Reconstruction: space without leaving a redirect. I'm gradually going through these appendices and fixing the links to reconstructions, but there's a hell of a lot of them, and fixing them is time-consuming because it can't be done by simple search and replace. —Aɴɢʀ (talk) 19:11, 4 March 2016 (UTC)
- Didn't we vote on getting rid of the Index namespace? --WikiTiki89 19:17, 4 March 2016 (UTC)
- We didn't. I said I would create that vote later, but I didn't create it yet. --Daniel Carrero (talk) 19:20, 4 March 2016 (UTC)
- It is simple search and replace. You search "Appendix:Proto" and replace with "Reconstruction:Proto". No?--Dixtosa (talk) 19:22, 4 March 2016 (UTC)
- Well, that works if you're content with pages that say ], but I'm not. If I'm going to go to the trouble of tidying up these pages, I'm going to do it right and use {{l|gem-pro|*talō}}. —Aɴɢʀ (talk) 19:50, 4 March 2016 (UTC)
- I am not either. I am just saying that that is different unrelated low-priority (not intending to diminish your work) job that does not really interfere with this high-priority job. This message assumes that the bot is going to do the mentioned simple search and replace too. --Dixtosa (talk) 20:18, 4 March 2016 (UTC)
- It's relatively easy to do with any regex-based search and replace. --WikiTiki89 20:20, 4 March 2016 (UTC)
- Not sure how prevalent it is, but if there are lots it sounds like a good candidate for that bot to-do list page. - TheDaveRoss 21:12, 4 March 2016 (UTC)
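WikiTiki89's remark that the conversion is feasible with a regex-based search and replace can be illustrated with a short Python sketch. Several things here are assumptions rather than facts from the thread: the hard-coded links are taken to have the form `[[Appendix:<Language>/<term>|<display>]]`, the title is assumed to omit the leading asterisk, and the `LANG_CODES` mapping pairs the code `gem-pro` (which appears in the templates quoted above) with Proto-Germanic:

```python
import re

# Illustrative mapping from the language name in a page title to a language
# code; a real run would need an entry for every proto-language involved.
LANG_CODES = {"Proto-Germanic": "gem-pro"}

# Assumed link shape: [[Appendix:<Language>/<term>]] with optional |<display>.
LINK_RE = re.compile(r"\[\[Appendix:([^/\]|]+)/([^\]|]+)(?:\|[^\]]*)?\]\]")


def convert_links(text: str) -> str:
    """Rewrite hard-coded Appendix links as {{l}} template calls."""
    def repl(match: re.Match) -> str:
        code = LANG_CODES.get(match.group(1))
        if code is None:
            # Unknown language: leave the link untouched for manual review.
            return match.group(0)
        return "{{l|%s|*%s}}" % (code, match.group(2))

    return LINK_RE.sub(repl, text)
```

Leaving unrecognised links unchanged keeps the replacement conservative, so anything the pattern does not cover still has to be fixed by hand, as Aɴɢʀ describes.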
I've done a sweep to clean up all English Wikipedia links to our protolang entries (wasn't many of them really). Are there other Wikimedia projects where we can suspect links to these to be lurking about? --Tropylium (talk) 01:08, 6 March 2016 (UTC)
Done, apparently. Thank you!
- All appendices of reconstructed terms starting with "Proto" are redirects, so I take it all of them were successfully moved to the Reconstruction: namespace.
{{reconstructed}} (redirect: {{reconstruction}}) is not linked by any pages in the Appendix: namespace.
Are there any unattested Latin terms, or some other reconstructed terms in appendices yet? Can we remove {{reconstruction}} from all appendices now and remove the patch that makes {{l}} and {{m}} link to the appendix namespace? --Daniel Carrero (talk) 02:21, 6 March 2016 (UTC)
- There are no more Latin reconstructions in Appendix namespace. —Aɴɢʀ (talk) 07:34, 6 March 2016 (UTC)
- The only single-term entries that I've been able to find in the Appendix namespace are conlangs. For the purposes of moving and of the special code that CodeCat added to {{reconstructed}} and to Module:links, we're done. I'm sure there are still lots of hard-coded links to reconstructions in Appendix space, and there are no doubt obscure bits of code in modules and templates that will have to be tracked down and fixed (I believe the templates that link to the next/previous in a series, such as {{cardinalbox}}, need to be checked, for instance). Chuck Entz (talk) 07:54, 6 March 2016 (UTC)
(June) Automatic transliteration for Thai has been disabled for now
- Previous discussion: User talk:Wyang#Module:links
I disabled the automatic transliteration for Thai, because Module:th-translit isn't generating the right transliterations. Apparently, the code to generate the correct transliteration is located in Module:th in the getTranslit function, so this needs to be added to the transliteration module so that it generates the correct transliterations. User:Wyang had added workaround code to Module:links instead, but this is inappropriate, especially considering the code to generate a proper transliteration already exists, so I removed it again. Module:th-translit should be modified so that such workarounds are no longer necessary; then the automatic transliteration can be reinstated. —CodeCat 12:59, 4 June 2016 (UTC)
- What you are doing is a perfect manifestation of your arrogance, ignorance and mindlessness. "So this needs to be added to the transliteration module so that it generates the correct transliterations." – while Module:th-translit is working perfectly fine with phonetically respelled words. You are suggesting that I should turn a transliteration module into a module that actually parses the entire entry's Wikitext and extract certain parts of the text, because "this is what a transliteration module is supposed to do". Sigh! So much for Eurocentric hubris on Wiktionary. "I shall break it, and ask you plebs to explain to me why things broke after this." Wyang (talk) 13:23, 4 June 2016 (UTC)
- It's supposed to work with as many words as possible, not just phonetically respelled ones. The getTranslit function is capable of generating better transliterations, so this needs to be integrated into Module:th-translit. Right now, Module:th-translit only correctly transliterates a subset of the words that it could, in theory, but adding custom code to Module:links is not the way to fix that. Modifying Module:th-translit is the right way. User:Wikitiki89 even did so yesterday, and you just reverted it. Why? —CodeCat 13:36, 4 June 2016 (UTC)
- Because those codes do not belong to a transliteration module page. How many times do I need to iterate that? Wyang (talk) 13:43, 4 June 2016 (UTC)
- Yes they do. And they certainly do not belong on Module:links instead. —CodeCat 13:47, 4 June 2016 (UTC)
- Which definition of "transliteration" is for this? Wyang (talk) 13:58, 4 June 2016 (UTC)
- The same definition we apply across Wiktionary: generating a Latin-script version of a word, that can be understood by people who don't know the script. The accuracy of the transliteration, or its nature (pronunciation or spelling based) is up to the editors of the language and of the transliteration module. However, under no circumstances should a generic language-agnostic module be used to work around a deficiency of the transliteration module. —CodeCat 14:05, 4 June 2016 (UTC)
- In that sense Module:th-translit is working perfectly well. It's just that your Module:links failed to take into account the fact that some languages require another level of phonetic respelling extraction, and it is that phonetic respelling, rather than the entry title itself, that needs to be fed to the transliteration modules. Wyang (talk) 14:17, 4 June 2016 (UTC)
- Yes, and in those cases, we use the tr= parameter that is available on countless templates. But let's stick with the situation here. You have a function getTranslit that is clearly capable of generating the correct transliteration, albeit that it has to parse the page's content in order to extract it. The method used is completely irrelevant. It is clear that there exists a function that is capable of doing the transliteration better than Module:th-translit is currently doing. Therefore, it seems obvious that this function should be added to Module:th-translit so that its transliterations become more accurate. This is what Wikitiki89 tried to do, so what is your objection against having better transliterations? And why do you insist on putting inappropriate workarounds in Module:links instead? —CodeCat 14:29, 4 June 2016 (UTC)
- Regarding your latest attempt at editing Module:links, the edits are completely unnecessary. This module doesn't have to account for this "phonetic extraction". The transliteration module can perform "phonetic extraction" instead. So please, for the nth time, add it to Module:th-translit and stop edit warring in Module:links. —CodeCat 14:32, 4 June 2016 (UTC)
- I just fixed your Module:links, which you again reverted. Module:th-translit is functioning perfectly, given the right inputs. Stop insisting that this belongs at Module:th-translit; it does not. This is not transliteration.
Transliteration is not concerned with representing the sounds of the original, only the characters, ideally accurately and unambiguously. (Wikipedia)
- It belongs at Module:links, which is lacking this new functionality of extracting the phonetic respelling to feed into the transliteration module. So for the nth time, please mend your Module:links so that it is fully language-agnostic, not just European language-agnostic. Wyang (talk) 14:49, 4 June 2016 (UTC)
- The transliteration module itself should extract this information if it needs it. —CodeCat 14:55, 4 June 2016 (UTC)
- Then it is not a module that does transliteration any more. This is exactly why the transliteration module should not be responsible for extracting this. Transliteration module is for transliteration, which is faithfully and systematically converting one writing system to another. Module:th-translit is fully functional at what it does, which is transliteration. A module that tries to extract phonetic respellings is a pronunciation module, which would have to be defined in Module:languages/data2 and have the infrastructure built around it, i.e. mending Module:links. Either way Module:links has to incorporate additional functionalities for non-phonetic languages. Wyang (talk) 15:03, 4 June 2016 (UTC)
- I don't care if it doesn't do transliteration according to your narrow idea of what a transliteration is. Nobody else on Wiktionary cares either, I'd bet. What we all care about is that it generates transliterations according to what Wiktionary's idea of transliteration is, and has been for years, not what your idea of it is. —CodeCat 15:07, 4 June 2016 (UTC)
- You are arguing whatever you believe in is what Wiktionary believes in, allegedly in opposition to what I believe in. A bit tongue-tied, probably? Wyang (talk) 15:17, 4 June 2016 (UTC)
- I have restored automatic Thai transliteration. Remember that what you are doing is against the goal of this project - rather than improving the pages, removing information from numerous entries. Wyang (talk) 13:36, 4 June 2016 (UTC)
- I have removed it. It's still not fixed. Stop edit warring and reach a consensus first. —CodeCat 13:37, 4 June 2016 (UTC)
- Edit warring? Or undoing highly destructive edits to the project? Wyang (talk) 13:39, 4 June 2016 (UTC)
- You added unnecessary custom code to Module:links, and when reverted, you keep reinstating it over and over despite a clear lack of agreement. That is edit warring against consensus. Reach a consensus for your edit first, then it can be reinstated. —CodeCat 13:40, 4 June 2016 (UTC)
- It has been there for months. You abruptly removed it, causing all the Thai links to malfunction, prompting Thai editors to ask me to look into the problem and restore the original functionality. Can you be even further from the truth? Wyang (talk) 13:43, 4 June 2016 (UTC)
- It never should have been added in the first place. Not in a highly visible and widely used language-generic module like Module:links. Language-specific code belongs in language-specific modules. —CodeCat 13:45, 4 June 2016 (UTC)
- User:Wyang, again, please reach a consensus for your edit to Module:links rather than forcing the issue. Do not edit war to push your opinion through. Wait until there is a general agreement that your code belongs in the module. —CodeCat 13:53, 4 June 2016 (UTC)
- Stop vandalising the page! Your removal simply wiped out thousands of correct Thai transliterations from Wiktionary pages. Where is your protest when I added it back in February? And where is your explanation when you suddenly removed the code 6 days ago? If you would like to maintain the status quo, at least get the version right. Wyang (talk) 13:58, 4 June 2016 (UTC)
- Is there a time limit for contesting something? How long ago should an edit be before it's considered an automatic consensual status quo? Do we have a policy for this? I am contesting your edit now, as have two others so far, but you continue to ignore them and push your edit through. That is edit warring against consensus and I wouldn't be surprised if it got you blocked, though I won't be the one to do it because I'm involved in the dispute and people won't like that. —CodeCat 14:01, 4 June 2016 (UTC)
- Did you forget that your edit had been reverted twice by someone other than me? Taking out the block card now? A step-up from your threat to disable on my talk page? Four months seem like a much longer time than 6 days. Wyang (talk) 14:07, 4 June 2016 (UTC)
- Reverts aren't the only way to contest an edit. But in any case, your edit was reverted first by me, then by Wikitiki, then by me again, then you started edit warring, and Dixtosa has also contested your edit. In comparison, only you and Metaknowledge have supported it. According to our common practice, consensus requires a 67% majority in favour, which is clearly not the case. So your edit has no consensus. —CodeCat 14:09, 4 June 2016 (UTC)
- So stop your vandalism. The reason you dare to tackle Thai specifically is you simply don't care. You just don't care about what Thai editors think at all, hence destroying thousands of Thai entries is perfectly justified in your opinion. Wyang (talk) 14:17, 4 June 2016 (UTC)
- Please stop using personal attacks. Reverting an edit that has no consensus is not vandalism. Reinstating that edit over ten times despite being notified that your edit has no consensus is vandalism. —CodeCat 14:29, 4 June 2016 (UTC)
- Are you denying that your edit effectively eliminates valid Thai transliterations from thousands of entries? Repeatedly removing any one of those thousands of transliterations would lead to someone being blocked. So not vandalism you say? Wyang (talk) 14:34, 4 June 2016 (UTC)
- Only for as long as the transliteration module hasn't been fixed to compensate. The fact that you refuse to do so does not suddenly make my reversions vandalism. In fact, you also reverted Wikitiki89's edit to Module:th-translit, which did fix (or attempt to fix) the module. So it appears you are not actually interested in fixing the transliterations. —CodeCat 14:37, 4 June 2016 (UTC)
- I have now reinstated User:Wikitiki89's edit to Module:th-translit. Reverting this again would re-break the transliterations, thus doing the exact same thing that you accuse me of doing. So if you revert this too, then I can only assume you are not interested in finding a solution for this problem. —CodeCat 14:41, 4 June 2016 (UTC)
- It looks like พลเรือน (pon-lá-rʉʉan) once again has the correct transliteration. Why you reverted the edits by Wikitiki89 that restored this is beyond me. But please do not break it again. —CodeCat 14:45, 4 June 2016 (UTC)
- As I said numerous times before, this is not transliteration. It does not belong in a transliteration module. Transliteration is the faithful letter-to-letter correspondence performed between writing systems, which is obviously not the process you and Wikitiki89 would like to see implemented in Module:th-translit. Which is hence something that more properly belongs elsewhere, i.e. at your Module:links. Wyang (talk) 14:49, 4 June 2016 (UTC)
- Transliteration on Wiktionary is not the faithful letter-to-letter correspondence, and it never has been. Many languages have non-orthographic transliterations. Hindi, Chinese, Russian, just to name some. You cannot just unilaterally redefine what "transliteration" means on Wiktionary to suit your purposes, and then demand that everyone else accepts your edits to a generic module to work around it. It seems that this isn't a workaround for code, but a workaround for your own mental idea. —CodeCat 14:58, 4 June 2016 (UTC)
- Well, there has never been a Module:zh-translit! Because a Chinese-English transliteration system is never possible. Hindi and Russian have two sets of transliteration and pronunciation modules: Module:hi-translit vs Module:hi-IPA, and Module:ru-translit vs Module:ru-pron, with the former doing fairly strict transliteration and the latter IPA interpretation based on transcription. Thai also has two: Module:th-translit vs Module:th-pron. And yet you are suggesting that th-translit should take on the role of the latter. It is never my "own mental idea" - it is what the definition of transliteration is, and it forms the basis for its distinction from "transcription", whether you are willing to accept it or not. Wyang (talk) 15:17, 4 June 2016 (UTC)
- I do not think any module that is to be invoked in mainspace should EVER take content from the entry and parse it, because the entry can get arbitrarily large and this introduces a very difficult dependency. It is abusing Lua. As for code placement, it is about how you look at *-translit modules. CodeCat views them (a view shared by me) as general transliteration modules which should work independently (i.e. not necessarily through Module:links). But, again, I disapprove of the parsing part. @Wyang, why don't you just pass them as arguments? --Dixtosa (talk) 13:41, 4 June 2016 (UTC)
- I apparently disagree. There are huge benefits from the use of parsing, as Wiktionary's system is inherently cumbersome and unsuitable for building a dictionary without parsing. See {{zh-forms}} for an example. Wyang (talk) 13:52, 4 June 2016 (UTC)
- Wait, so you've done this for other languages too?? —CodeCat 13:58, 4 June 2016 (UTC)
- Ignorant - you must be reading European language entries only in this year and a half. Wyang (talk) 14:03, 4 June 2016 (UTC)
Recap for us outsiders: Did I understand correctly that the way Thai editors handled the situation worked but was incompatible with some stuff CodeCat's robots do, so CodeCat changed it to make it comply with his/her robots, which in turn broke it for Thai editors? And now you can't decide which way to go because you do not agree on whether the module should scan the entire entry? Korn [kʰũːɘ̃n] (talk) 15:00, 4 June 2016 (UTC)
- No this has nothing to do with bots. What happened, it seems, was that Wyang insisted that transliteration modules should only give letter-for-letter transliterations. But doing that would generate incorrect transliterations in many entries because Thai script is rather haphazard. So rather than adjusting their dogma - and the transliteration module - they instead made an edit to Module:links, a generic language-agnostic module, to entirely bypass the defective transliteration module. This code was noticed a few months later by me, and removed, then removed by Wikitiki89 again, then removed a whole lot more by me again. Wikitiki made edits to Module:th-translit and Module:th which fixed the transliterations after removing the Thai-specific code from Module:links had broken them. However, this seemed to go against Wyang's dogma that transliteration modules must transliterate letter-for-letter (even though they don't, and never have, on Wiktionary), so he reverted the edits and again reverted me when I tried to reinstate the fixes Wikitiki made. —CodeCat 15:05, 4 June 2016 (UTC)
- The transliteration system for Thai was fully functional from its implementation in February until it was abruptly removed by User:CodeCat six days ago. A bit of investigation led to User:CodeCat's edits, which basically rendered all Thai transliterations on Wiktionary non-functional. Wyang (talk) 15:07, 4 June 2016 (UTC)
- Did you miss the fact that Wikitiki fixed the problem, and you undid his edits? Your undoing broke the transliterations again, but rather than putting Wikitiki's edits back in, you insisted that Module:links be edited to fit your dogma instead. —CodeCat 15:10, 4 June 2016 (UTC)
- You both claim that your way produces correct results and the system of the other breaks it. Can you each provide a specific example which works with your system and say how it gets broken by your opponent's method? Korn [kʰũːɘ̃n] (talk) 15:13, 4 June 2016 (UTC)
- Both methods work. However, I object to having extra code in Module:links that handles deficiencies in Module:th-translit, deficiencies that were readily remedied by Wikitiki. The issue seems to be that Wyang dislikes Wikitiki's remedies, but to undo them he has to reinstate the extra code that I object to. I think that problems with Module:th-translit ought to be fixed in that same module, as Wikitiki did, rather than introducing workarounds in another module that has nothing to do with Thai. —CodeCat 15:16, 4 June 2016 (UTC)
For many languages transliteration and transcription/pronunciation are very different concepts, and Thai is one of these languages. One can generate a transliterated outcome for a Thai word (Module:th-translit), but oftentimes this is different from the pronunciation. The core issue here is that Module:links provides no support for these non-phonetic languages, which is why I added the new functionality in the module. Such information does not belong to individual transliteration modules, as this is a widespread linguistic phenomenon and the addition would greatly benefit many non-European languages (for example, Chinese and Japanese). The lack of transcription support in the central linking templates/modules is exactly the reason these languages have been moving away from the standard linking templates, resulting in much confusion and repetition during editing. Wyang (talk) 15:43, 4 June 2016 (UTC)
- That's irrelevant. Module:links needs no additional support, transliteration modules (for Wiktionary's use of the word) are sufficient. If they are not, then you have to show why. So far you have failed to do so, since Wikitiki's edits (which you reverted) proved you wrong, it's perfectly possible for the existing infrastructure to handle Thai. Perhaps you don't want to be proven wrong? —CodeCat 16:31, 4 June 2016 (UTC)
I don't know what's going on. I am a native speaker, and I can only say that direct auto-transliteration from a "Thai word" could never be done due to the complexity and uncertainty of spelling. That's why we do it on basic syllables (which are more certain); it is also taught that way in school. --Octahedron80 (talk) 15:23, 4 June 2016 (UTC)
- Rather than this constant revert war that's going on: is it not possible to apply the code fixes in one single operation that will make the transliterations continue to work as they did before? Fixing only part of it, while leaving Thai users without useful content, seems like a problem. Equinox ◑ 16:39, 4 June 2016 (UTC)
- As of right now, things work just fine. Wyang keeps reverting it. —CodeCat 16:41, 4 June 2016 (UTC)
- Here's how it started (from User talk:Wyang):
- I don't understand it either. Why do those edits change the transliterations, even though none is given in the entry? —CodeCat 20:49, 2 June 2016 (UTC)
- Even after looking at the section above, I don't see what this edit does. In fact, it seems like it would break cases that have alt forms. --WikiTiki89 14:29, 3 June 2016 (UTC)
- I've undone it until we can establish how the special treatment actually changes anything. —CodeCat 14:35, 3 June 2016 (UTC)
- It looks like it, somehow, for some reason, changes the transliteration of พล (pon) between "pol" and "pon". But I have no idea why. I think the problem is with the Thai transliteration module here, not Module:links. —CodeCat 14:37, 3 June 2016 (UTC)
- I think I fixed the problem with these edits. --WikiTiki89 18:57, 3 June 2016 (UTC)
- This was part of a topic where it had been asked what the code was for, and everyone was waiting for Wyang's response. CodeCat acted without knowing what the consequences would be, without waiting to find out what the code was for. That was clearly wrong, and Wyang was understandably upset. Wikitiki89 helpfully came up with an alternative that seems to work.
- This whole episode is painful to watch: we have two strong-minded people who have both done great things for the project, but are now butting heads instead of discussing rationally.
- Wyang has a history of coming up with ingenious ways to make our system do things that no one would have thought possible. Our Chinese entries are infinitely better than they were, and they're getting better all the time. There are, however, times when the system gets brought to its knees, as at 一.
- CodeCat is as responsible as anyone for the current template, module and category infrastructure that runs this site. This prodigious work ethic and expertise is, however, marred by a willingness to break things in order to force people to fix things she sees as wrong (case in point: Module:parameters). She also has a tendency to ramrod things through, which has created deep resentment in some quarters that has poisoned a number of discussions on unrelated issues.
- On the one hand, we have Wyang, still furious about CodeCat's behavior and unwilling to allow anything that would let her get away with it. On the other hand, we have CodeCat, who has gone into Orwellian DoubleSpeak mode to shift attention from her initial, destructive action, portray Wyang as a dangerous loose cannon and portray herself as an innocent victim.
- We need to get past all of that and look at the merits of how we want to structure this. Our architecture isn't set up to handle the use of respellings in transliteration, so Wyang came up with a kludge to work around this. At the moment, the debate seems to be over where to put the kludge, not on whether there's a better way to do this. My question is: can we come up with a way to get the respellings to the modules without having the modules swallow an entry whole and rummage through it to find them (please forgive the mixed metaphor)? Chuck Entz (talk) 20:05, 4 June 2016 (UTC)
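One of the options raised in this thread, supplying the romanization explicitly (via `tr=`) and otherwise falling back to the per-language module, can be sketched in Python as below. This is only an illustration of the control flow, not the actual Lua code in Module:links; `th_translit`, `TRANSLIT_MODULES` and `link_romanization` are hypothetical names, and the one-entry table merely stands in for a real transliteration module.

```python
from typing import Callable, Dict, Optional


def th_translit(text: str) -> str:
    """Stand-in for a per-language module; a one-entry table for illustration."""
    table = {"พล": "pol"}  # letter-for-letter result quoted in this discussion
    return table.get(text, "")


# Hypothetical registry mapping language codes to romanization functions.
TRANSLIT_MODULES: Dict[str, Callable[[str], str]] = {"th": th_translit}


def link_romanization(lang: str, term: str, tr: Optional[str] = None) -> Optional[str]:
    """Return the romanization to show next to a link, or None if unavailable."""
    if tr:
        # An explicit tr= value supplied by the editor always wins.
        return tr
    module = TRANSLIT_MODULES.get(lang)
    if module is None:
        # No module registered for this language: show no romanization.
        return None
    return module(term) or None
```

In this arrangement the generic function never reads the destination entry; any respelling has to arrive either through `tr=` or through whatever the language-specific function chooses to do internally, which is the point the two sides disagree on.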
Wiktionary does not have a JSON-style dictionary system, which is why there is so much formatting nuisance with the use of different headers, headword templates, reduplicating etymologies, ectopic related terms and unsystematic pronunciation notations. Each word in a language should be defined by a JSON set, containing a series of qualities indicating the nature and relationships of subordination of various parts of the text. All the Wikitexts on a Wiktionary entry should be generated from scratch, from that JSON set using pre-defined formatting codes which tells the entry how the original core information should be displayed. All the JSON information from entries should be made rapidly indexable to other entries, so that there is no need to repeatedly define what the pronunciation of another word in the etymology is, or what the meanings of that word are.
What Wiktionary has is a very different system. A system that tends to make people think about "what are we eating tonight" rather than "how can we most efficiently make dinner for the next 20 years". You can create a magnificent, all-encompassing entry for a word in a language if you put into the entry everything that is known on Earth about the word, when in actual fact you should not have to do most of what you did because it can already be found elsewhere in the dictionary and should have been "extracted" rather than "generated or provided de novo". Say you want to link to another word in your perfect entry. Then in the perfect entry on Wiktionary you would have to put in: (1) the word you wish to link to, (2) the transliteration/transcription of that word, (3) the definition of that word, and (4) qualifiers for the definitions (e.g. derogatory, obsolete), although points (2-4) have already been stored in your destination entry. Previously all of (2-4) would have to be provided in the internal link. Things have improved in that point (2) is sometimes no longer necessary, as Module:links will attempt to generate the transliteration from a series of transliteration modules. This is a great leap forward, as we start to realise some of what we previously wrote was not necessary at all. However, the source of that omitted information (i.e. the regenerable information) is misunderstood. It is not the transliteration modules that are ultimately the source of the regenerable information; rather it is the destination entry where the regenerable information is stored. For languages where transliteration approximates fairly well the transcription/transliteration system we use for that language, this is an acceptable and quite efficient way of regenerating the information, despite the non-zero failure rate (e.g. link in коэффицие́нту (koefficijéntu) to коэффицие́нт (koefficijént)).
But for languages where transliteration approximates the transcription system we use very poorly (Thai etc.), or where transliteration is intrinsically impossible (Chinese, Japanese etc.), our hub of Module:links simply gives up, telling editors of these languages "sorry, there is nothing I can help you with here", when in fact it should have been set up to facilitate the extraction of the phonetic pronunciation in the destination entry. Languages lie on various parts of this transliteration–transcription continuum and it is outright inappropriate to call this process of phonetic extraction "transliteration" for languages that fall towards the transcription half of the continuum (Thai, Chinese, Japanese etc.), as that is an obvious oxymoron, and/or transcription vs transliteration are contrastive concepts for these languages. Mixing these two very different concepts or intentionally confusing them to achieve minimal effort could be very dangerous. Wyang (talk) 01:46, 5 June 2016 (UTC)
Word | Transliteration outcome | Transcription outcome | What should be returned if transliteration is desired | What should be returned if transcription is desired | What should be returned if IPA is desired
พล (Thai) | pol | pon | pol | pon | /pʰon˧/
십육 (Korean) | sib.yug | simnyuk | sib.yug | simnyuk | /ɕʰimɲjuk̚/
བརྒྱད (Tibetan) | brgyad | gyaew | brgyad | gyaew | /cɛʔ˩˧˨/
鬥 / 斗 (Chinese) | none | dòu | nil | dòu | /toʊ̯˥˩/
水 (Japanese) | none | mizu | nil | mizu | /mizɯᵝ/
ရှည် (Burmese) | hrany | she | hrany | she | /ʃè/
vs
Word | Transliteration outcome | Transcription outcome | What should be returned if transliteration is desired | What should be returned if transcription is desired | What should be returned if IPA is desired
дли́нный (Russian) | dlínnyj | dlínnyj | dlínnyj | dlínnyj | /ˈdlʲinːɨj/
ტორტი (Georgian) | ṭorṭi | ṭorṭi | ṭorṭi | ṭorṭi | /tʼɔrtʼɪ/
κέντρον (Ancient Greek) | kéntron | kéntron | kéntron | kéntron | /kéntron/
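The distinction drawn in the tables can be sketched as a lookup keyed by the kind of romanization requested. The Python below is a hypothetical illustration, not existing Wiktionary code: `ROMANIZATIONS` and `romanize` are made-up names, the values are copied from three rows of the tables, and `None` stands in for the "nil" entries.

```python
from typing import Dict, Optional

# Three rows copied from the tables above; None marks "no transliteration possible".
ROMANIZATIONS: Dict[str, Dict[str, Optional[str]]] = {
    "พล": {"transliteration": "pol", "transcription": "pon", "ipa": "/pʰon˧/"},
    "水": {"transliteration": None, "transcription": "mizu", "ipa": "/mizɯᵝ/"},
    "ტორტი": {"transliteration": "ṭorṭi", "transcription": "ṭorṭi", "ipa": "/tʼɔrtʼɪ/"},
}

MODES = ("transliteration", "transcription", "ipa")


def romanize(word: str, mode: str) -> Optional[str]:
    """Return the stored form of `word` for the requested mode, or None."""
    if mode not in MODES:
        raise ValueError("unknown mode: %s" % mode)
    entry = ROMANIZATIONS.get(word)
    if entry is None:
        # Word not in the sample data.
        return None
    return entry[mode]
```

For the first table the result depends on the mode (`romanize("พล", "transliteration")` and `romanize("พล", "transcription")` differ), while for the second table both modes return the same string.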
- ^According to the table, transliteration of Thai would be useless and would result in problem on difficult words, such as เศรษฐศาสตร์, รัฐธรรมนูญ. You could try to replace letter-by-letter but no one will understand it. I prefer transcription. --Octahedron80 (talk) 09:46, 5 June 2016 (UTC)
- On Wiktionary, the term "transliteration" encompasses transliteration, transcription and general romanization. It's just a historical accident that we call it "transliteration", but it's not transliteration in the strict sense. See Wiktionary:Transliteration and romanization. So it is not an argument that the module can only supply transliterations in the strict sense just because it's called a transliteration module. It's a romanization module, but it's called a transliteration module for historical reasons. —CodeCat 12:35, 5 June 2016 (UTC)
- It's not just a historical accident. It is Eurocentrism in Wiktionary at its best. As a consequence of this historical confusion, the central system just assumes that all languages use transliteration as their romanisation method, and Module:links sends words of all languages indiscriminately to their transliteration modules to generate their romanisations. This leaves languages with both transliteration and transcription outcomes unsupported. Thai already has a functioning transliteration module (Module:th-translit), and in addition it also has a transcription module (Module:th). Module:links should relay the 'tr' parameter to the correct place so that it is truly language-agnostic, and this includes distinguishing between the transliterative and transcriptive modules used for a particular language and rendering support to languages that use a transcriptive method of romanisation. Wyang (talk) 12:54, 5 June 2016 (UTC)
- The correct place for transcriptions is Module:th-translit, so there is no need for additional code. Are you suggesting that we set up an entirely separate system to deal with transcriptions as opposed to transliterations, and have separate transcription and transliteration modules for languages? What's the benefit? And if you are so passionate about it, why don't you start a vote to change the current practice of including transcription in transliteration, rather than edit warring over it for days? Right now you have yet to display any kind of consensus for your views. —CodeCat 13:00, 5 June 2016 (UTC)
- Never mind, I've done it for you. —CodeCat 13:08, 5 June 2016 (UTC)
- For whom? You ignored the points I raised above and therefore completely misunderstood what I was saying. Again I feel like I am talking to someone who did not care to read my comments at all. The answers to your questions are: No and no, and you would not have asked these questions if you had read my replies above. I'm not suggesting that we set up an entirely separate system to deal with transcriptions as opposed to transliterations, nor am I interested in having separate transcription and transliteration modules for any other languages that do not differentiate between the two concepts on a romanisation level. Likewise I am absolutely uninterested in changing the current practice of including transcription in transliteration. Wyang (talk) 13:19, 5 June 2016 (UTC)
- Then why do you keep edit warring? All Wikitiki's edits did was change Module:th-translit to supply a transcription. If you are fine with this practice, your edits say otherwise. —CodeCat 13:22, 5 June 2016 (UTC)
(June) Oppose
- Oppose There is no need for separate systems, this overly complicates things without any obvious benefit. The point of transliteration modules currently is to supply a version of a word in the Latin alphabet, without regard to how closely it maps to the orthographic form of the original language. In other words, they are romanization modules, that are called transliteration modules by historical accident. I see no value in being pedantic about the meaning. If we want to display transliteration and transcription side by side whenever applicable, we should be able to demonstrate that users benefit from this information overload. —CodeCat 13:05, 5 June 2016 (UTC)
- What's the use of phonetic transliteration when we already have dedicated Pronunciation section?--Dixtosa (talk) 13:13, 5 June 2016 (UTC)
- The point here is that this involves a change in the status quo. If we want transliterations to be strict transliterations, then we have to change the practices of all languages whose transliteration is not a strict transliteration currently, and make changes to Wiktionary:Transliteration to reflect the new practice. Russian editors have strongly opposed this in the past. @Benwing, Atitarev. —CodeCat 13:19, 5 June 2016 (UTC)
This is the most stupid response I have ever seen. You would not have acted so bizarrely if you were more attentive and respectful, and this includes completely misconstruing my reasoning and thus creating this poll, and abusing your admin rights to block me. I would like to request to have your admin rights reviewed. Wyang (talk) 13:19, 5 June 2016 (UTC)
- I blocked you because you keep making edits that have no consensus. This poll is an attempt to establish a consensus, but you continue to revert without awaiting the results of the poll. —CodeCat 13:23, 5 June 2016 (UTC)
- Pot, meet kettle; kettle, pot. DCDuring TALK 13:42, 5 June 2016 (UTC)
- Aha. Whatever. I'd rather not be compared with this crazy user. Next time I'll just redirect all the Thai complaints to her. Wyang (talk) 13:59, 5 June 2016 (UTC)
- It seems the status quo ante of Module:links is what CodeCat reverted to, and that therefore CodeCat's edit should be reinstated but probably not by CodeCat. Wyang should be prevented from reinstating his edits. Wikitiki's edits to Module:th and Module:th-translit should be reinstated and then we should see which Thai entries, if any, display a problem with transliteration or transcription. --Dan Polansky (talk) 20:47, 5 June 2016 (UTC)
- Not really. Wyang added the code in February, and CodeCat removed it in May as part of an extensive rework of the module. Wyang was just restoring it under the assumption that it had been removed accidentally. Thai editors have been basing their edits for three months on the presence of that feature. Wyang made his June edit only because Thai editors were complaining about it not working any more. Chuck Entz (talk) 21:21, 5 June 2016 (UTC)
- Wyang's February edit cannot be traced to a discussion showing consensus, AFAIK. The edit is now challenged. The status quo ante is the status before the challenged edit. Three months have elapsed between the edit and its challenge, probably because the challenging editor did not notice the edit earlier. Now as before, I propose that CodeCat and Wikitiki edits are reinstated, and that specific problems in Thai entries that are a result of that are clearly stated, including stating at least one Thai entry that has the problem. --Dan Polansky (talk) 21:32, 5 June 2016 (UTC)
- The word "consensus" is thrown around here too much. —Aryamanarora (मुझसे बात करो) 23:55, 9 June 2016 (UTC)
Reason: Abuse of admin rights – misusing her admin power to block the other party of a personal dispute. Block log: . Wyang (talk) 13:28, 5 June 2016 (UTC)
- I blocked you to put an end to the continuous edits which forced Wyang's point of view without a consensus for that view. We block other editors for such behaviour, so why not Wyang? —CodeCat 13:29, 5 June 2016 (UTC)
- Well your edit simply removed thousands of correct Thai transliterations on Wiktionary and caused uproar among our Thai editors, which is why it was reverted. Repeated removal of any one of those thousands of transliterations is sufficient to warrant a block. Wyang (talk) 13:31, 5 June 2016 (UTC)
- No it didn't. The edits you've been edit warring on for the past day did not break any entry. Please demonstrate that Wikitiki's edits, which you continued to revert, broke or removed thousands of transliterations. —CodeCat 13:34, 5 June 2016 (UTC)
- I have once again reapplied Wikitiki's edits. Please show an entry that is currently broken. —CodeCat 13:35, 5 June 2016 (UTC)
- Why have you undone Wikitiki's edits yet again? There is no consensus for having transliteration and transcription separate. You should wait for the poll to finish. —CodeCat 13:37, 5 June 2016 (UTC)
- I ask that Wikitiki's edits be restored until 1. it is established that a consensus exists for separating transliteration from transcription, or 2. it is established that Wikitiki's edits break anything. —CodeCat 13:39, 5 June 2016 (UTC)
- Nor is it appropriate or does it have consensus. You seem to be in denial of your repeated vandalism – let me refresh your memory: diff, diff, diff, diff. These are the first four of your edits - did they remove useful content en masse? Wyang (talk) 13:40, 5 June 2016 (UTC)
- Wikitiki also made that same edit diff, so should he also be blocked? Wikitiki in fact made additional edits to fix the problems caused by this edit, and you then reverted his edits too. —CodeCat 13:43, 5 June 2016 (UTC)
- Circumventing the question huh? Did your edits repeatedly remove useful content en masse? Wyang (talk) 13:46, 5 June 2016 (UTC)
- No, they did not, once Wikitiki had provided an appropriate fix. Which you then reverted. So again, please demonstrate that Wikitiki's trio of edits to Module:links, Module:th and Module:th-translit broke something, and that it is therefore warranted to desysop me for restoring those edits. You have yet to show even a single entry that was broken by it, yet you continue to revert these edits. —CodeCat 13:47, 5 June 2016 (UTC)
- Go to the time points (1) 12:56, 4 June 2016; (2) 13:34, 4 June 2016; (3) 02:22, 4 June 2016 and (4) 01:01, 4 June 2016. Preview the page พลเรือน. Were the Thai romanisations there? Wyang (talk) 13:51, 5 June 2016 (UTC)
- Please stop dodging the question. Did Wikitiki's trio of edits break any entries? Please restore his edits and then show us a broken entry. If you can't demonstrate that his edits broke an entry, how can you ask me to be desysopped for restoring them? —CodeCat 13:53, 5 June 2016 (UTC)
- Looks like you are unable to answer my question. You did not restore his edits. You restored your edit, which wiped out thousands of Thai transliterations. Wyang (talk) 13:57, 5 June 2016 (UTC)
- For the past day, you have been reverting those three edits Wikitiki made, one of which included the edit I also made. I have been trying to restore those edits because there is no consensus for your views and no evidence that those three edits break anything. —CodeCat 13:59, 5 June 2016 (UTC)
Your continued edit warring shows a severe lack of professionalism and responsibility. You both are perfectly aware that edit warring warrants an admin stepping in if the users can't get a hold of themselves. You both seem to be admins and abuse your positions to keep ranting where other users would long have been shut up. (Read: Prevented from editing the entry in question.)
You both continuously accuse the other of having no consensus, but your endless bickering makes it harder and harder for people to get an overview over the situation, and thus makes it more and more difficult for the community to actually reach a consensus. Please keep your hands still for a while so that the rest of the community, or at least those parts who understand the techno babble, can actually debate this matter. Korn [kʰũːɘ̃n] (talk) 15:28, 5 June 2016 (UTC)
- +1. I can't even figure out what the primary point of contention is. (I agree very strongly with Dixtosa's point above that no module invoked in the mainspace should ever take content from the entry and parse it, though. Seriously, the devs are going to regret ever giving us Lua if we go in that direction.) Can someone please explain the difference between transliteration and transcription, and where they're each used in entries? --Yair rand (talk) 20:38, 5 June 2016 (UTC)
- Whether we want to allow modules invoked from the main namespace to parse other entries should be a separate discussion, if anyone wants to start it. I believe the Chinese modules extensively use this paradigm. DTLHS (talk) 21:45, 5 June 2016 (UTC)
- The distinction which seems to be being made by those who are making a distinction is : transliteration takes a set of characters and renders them letter-for-letter into another script (in this case, the Latin script), whereas transcription renders the word itself into another script; the difference being that e.g. 鬥 cannot be 'transliterated' per se, but it can be transcribed (as dòu, IPA: /toʊ̯˥˩/), and that if e.g. พล is transliterated, it is pol, but if it is transcribed, it is pon (in IPA it is /pʰon˧/). In practice, the argument here seems to be (1) not over which of these systems should be used (since I haven't actually seen someone suggest that พล should be rendered pol), but over which word should be used, and (2) not over whether or not a module should parse a page, but over which module should host the code. - -sche (discuss) 21:01, 5 June 2016 (UTC)
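The distinction described above can be illustrated with a toy sketch. This is Python rather than the Lua used by the actual modules, and the tables below are assumptions that cover only the letters of the example word พล; they are not a real Thai romanisation scheme, and the function names are hypothetical.

```python
# Toy tables covering only the letters of the example word; not a real scheme.
LETTER_TO_LATIN = {"พ": "p", "ล": "l"}
IMPLICIT_VOWEL = "o"      # assumed: vowel supplied between two bare consonants
FINAL_SOUND = {"l": "n"}  # assumed: spoken value of a syllable-final letter


def transliterate(word: str) -> str:
    """Render each written letter with a fixed Latin equivalent."""
    letters = [LETTER_TO_LATIN[ch] for ch in word]
    # Two bare consonants are treated as one syllable around the unwritten vowel.
    if len(letters) == 2:
        return letters[0] + IMPLICIT_VOWEL + letters[1]
    return "".join(letters)


def transcribe(word: str) -> str:
    """Start from the transliteration, then replace the final with its spoken value."""
    latin = transliterate(word)
    final = latin[-1]
    return latin[:-1] + FINAL_SOUND.get(final, final)


print(transliterate("พล"))  # pol
print(transcribe("พล"))     # pon
```

The only difference between the two functions is the last step: transliteration stops at the written letters, while transcription adjusts them towards the pronunciation.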
Module:links is protected so that only administrators can edit it; this prevents non-admins from editing or edit-warring over it, and it means the edit-war between admins User:CodeCat and User:Wyang is a wheel war. If the two of you continue to wheel-war, I will ask a bureaucrat such as User:Chuck Entz or a global 'crat to make emergency and hopefully temporary desysoppings to stop the war. - -sche (discuss) 21:14, 5 June 2016 (UTC)
- I was already considering doing so, but I've been hoping they would start acting like adults without being forced to. Unfortunately, the action has been taking place while I've been offline (I do sleep, occasionally), so I'm left to wonder whether it's over or it's just waiting to flare up again when both are back online. Chuck Entz (talk) 21:36, 5 June 2016 (UTC)
- A proposal for desysopping amounts to harassment, in my opinion. DonnanZ (talk) 22:04, 5 June 2016 (UTC)
- Preventing such proposals would seem to be creating an untouchable ruling elite... Equinox ◑ 22:21, 5 June 2016 (UTC)
- Yes, but no one seems to be backing the proposal, so it's not the brightest of ideas, just a desperate measure. DonnanZ (talk) 22:37, 5 June 2016 (UTC)
- Each party has suggested the other's desysopping (above), and given that both parties are wheel-warring using admin tools/privileges, and that one blocked the other while edit-warring with him (as noted above), following both proposals and emergency-desysopping both may be in order if the warring continues. - -sche (discuss) 22:40, 5 June 2016 (UTC)
- So blocking the other side of the argument is completely justified and one should not lodge a complaint after such abuse of rights? Ridiculous. Very disappointed in the Wiktionary community; seems to be a place for admin bullies who wilfully block others and maintain their modules without the slightest consideration of the consequences. Will greatly reduce the amount of time spent here. Considering quitting. Wyang (talk) 23:14, 5 June 2016 (UTC)
- No one is excusing CodeCat's behavior, but de-sysopping is a very serious step, and one best not considered in the midst of a dispute, unless circumstances demand it. Chuck Entz (talk) 03:37, 6 June 2016 (UTC)
(June) What happened
Back in February Wyang put code into Module:links that checked for Thai, then called a function in a different module than that used for the transliteration. This function basically checked if there was an entry for the term, and if there was, looked in the source of the entry for the {{th-pron}} wikicode. If it found the template, it took the template's (respelled) parameters and substituted them for the actual spelling of the entry name, then called the same module that the transliteration module did. Whatever the module returned was returned in turn to Module:links (sorry), which used that instead of calling the regular transliteration module.
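For readers who find the prose hard to follow, the lookup described above might be sketched roughly as follows. This is an illustrative Python sketch, not the Lua code that was in Module:links; the PAGES dictionary, the regular expression and both function names are assumptions made for the example, and the sample entry is the กรรเชียง case with its respelling กัน-เชียง.

```python
import re
from typing import Callable, Optional

# Hypothetical stand-in for the wiki: page title -> raw wikitext.
PAGES = {
    "กรรเชียง": "==Thai==\n===Pronunciation===\n{{th-pron|กัน-เชียง}}\n",
}

# Matches a {{th-pron|...}} call that carries at least one parameter.
TH_PRON = re.compile(r"\{\{th-pron\|([^{}]*)\}\}")


def respelling_for(title: str) -> Optional[str]:
    """Return the first respelling passed to {{th-pron}} on the entry, if any."""
    source = PAGES.get(title)
    if source is None:
        return None  # no entry exists for this term
    match = TH_PRON.search(source)
    if match is None:
        return None  # entry exists but supplies no respelling
    # Keep positional parameters only; skip named ones such as key=value.
    params = [p for p in match.group(1).split("|") if p and "=" not in p]
    return params[0] if params else None


def romanise_link(title: str, romanise: Callable[[str], str]) -> str:
    """Romanise the respelling when one is found, otherwise the title itself."""
    text = respelling_for(title) or title
    return romanise(text)
```

With an identity function standing in for the real romaniser, `romanise_link("กรรเชียง", lambda s: s)` hands back the respelling rather than the entry name, which is the substitution step the paragraph describes.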
Nobody but the Thai editors noticed this for 3 months, until, at the end of May, CodeCat reworked that part of the module and, in the process, removed Wyang's code- perhaps without realizing it had been there. Thai editors asked Wyang why the link transliterations weren't working right anymore, so he put his code back in to fix the problem.
This time, CodeCat noticed the code and couldn't immediately figure out what it did, so she left a message on Wyang's talk page. In the meanwhile she reverted Wyang's edit. Soon after that, Wikitiki89 came up with a compromise that incorporated Wyang's code from Module:links into the Thai transliteration module.
When Wyang responded to the comments on his talk page 11 hours later, he explained his code and the rationale for it in detail, and expressed his annoyance at CodeCat's reverting his edit before finding out what it did.
Having explained himself, he went back and reverted CodeCat's revert to reinstate his edit.
CodeCat then responded by explaining on Wyang's talk page why she thought it was a bad idea to put custom code in Module:links, but then went on to say that the problem was all due to deficiencies in the transliteration module and tell him that his code wouldn't be allowed back until she was convinced it was necessary. She then reverted his revert of her revert of his edit.
If you don't already have a headache from this- it gets worse. They then proceeded to revert-war back and forth, stopping every once in a while to argue and denounce each other angrily (see above). Then CodeCat blocked him for edit-warring- which accomplished nothing, since he immediately unblocked himself. Then Wyang called for CodeCat to be de-sysopped, and CodeCat called for Wyang to be de-sysopped.
(June) The issues
Filtering out the misunderstandings and trash talk, here's what I see the basic core arguments are (my formulation, not theirs):
- CodeCat
- A general-purpose, high-traffic module like Module:links shouldn't have special cases hardwired into it- language-specific code should go in the language-specific modules.
- The transliteration modules aren't just for transliteration- they can provide transcriptions, if that's what's right for the language.
- Wyang
- Thai and other languages like it need special treatment, because they need transcriptions rather than transliterations
- The version of the modules that CodeCat keeps reverting to isn't the same as his version.
- Concerns from others
- Modules getting data from entries is a very bad idea.
(June) My 2 cents
I agree more with Wyang's view of the events, but agree more with CodeCat on the substance.
CodeCat was wrong to revert Wyang's edit without knowing what it did. Her response to Wyang was too confrontational and demanding. Her poll wasn't really an accurate reflection of what Wyang was asking for, and the block did nothing but make things worse- much worse. On top of that, her characterization of the dispute is rife with spin and trash talk.
Of course, once the revert-war started, Wyang was a full partner in the mudfight, so I'm not giving him a pass, either.
I think the place to deal with Thai's peculiarities is in the Thai transliteration module, not in Module:links. Is there any module other than Module:links that gets the name of the transliteration module from our language data modules (in this case Module:languages/data2)? If not, we should take the function called by Wyang's code (Module:th.getTranslit) and use it as the basis for the transliteration module that Module:links calls (basically what Wikitiki89 did).
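A rough sketch of that arrangement, in Python rather than Lua and under stated assumptions: the registry, the decorator and the placeholder Thai function are all hypothetical, and only the translit_module field name comes from the language data discussed here. The point is that the generic link code contains no language check at all.

```python
from typing import Callable, Dict, Optional

# Hypothetical registry: module name -> romanisation function.
TRANSLIT_MODULES: Dict[str, Callable[[str], str]] = {}

# Cut-down stand-in for the language data.
LANGUAGES = {
    "th": {"canonicalName": "Thai", "translit_module": "th-translit"},
    "en": {"canonicalName": "English"},
}


def register(name: str):
    """Decorator that files a romanisation function under a module name."""
    def wrap(func: Callable[[str], str]) -> Callable[[str], str]:
        TRANSLIT_MODULES[name] = func
        return func
    return wrap


@register("th-translit")
def th_translit(text: str) -> str:
    # Placeholder: any Thai-specific handling (respelling lookup and so on)
    # would live here, not in the generic link code.
    return f"[th:{text}]"


def link_romanisation(lang_code: str, text: str) -> Optional[str]:
    """Generic link code: look up the language's module name and call it."""
    module_name = LANGUAGES[lang_code].get("translit_module")
    if module_name is None:
        return None  # language has no automatic romanisation
    return TRANSLIT_MODULES[module_name](text)
```

Under this layout, changing how Thai is romanised means editing only the function registered as th-translit.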
Except... I'm not qualified to say much about the concerns expressed over going to other entries to get data. After thinking about it, I can see why Wyang felt he needed to do it: most people linking to Thai entries know nothing about respelling, so it's unrealistic to require passing it as a parameter, and creating a data module with all the terms needing respelling would be a monumental and possibly fruitless task. Still, I think the module should eliminate as much as possible of the straightforward stuff before resorting to such tactics, in order to keep them to an absolute minimum.
Sorry for the encyclopedic length of this, but I wanted to make sure I didn't miss anything. Chuck Entz (talk) 04:17, 6 June 2016 (UTC)
- This is a fairly good summary of the past events. By looking at the Thai frequency list, I think it is safe to say that more than half of the 4000 most commonly used Thai words require some phonetic respelling. This number will only go up if we consider the entire set of Thai words, meaning that only relying on the Thai title linked to is quite hopeless at generating the correct transcription. So it boils down to the problem of whether to analyse the link destination to extract the correct pronunciation, or make it compulsory to supply the romanisation every time. I'm highly biased towards the former as I think page parsing is the best functionality on Wiktionary, and I would imagine the native Thai editors to be not very welcoming to the idea of the latter either.
- Regarding transliteration vs transcription, this is an issue that extends to many languages beyond Thai. Tibetan and Burmese are good examples that come to mind. I wrote Module:bo-translit (Tibetan) and Module:my-translit (Burmese) a while back, which form the backend for the Wiktionary transliterations of these two languages. The schemes used are the Wylie transliteration and MLCTS schemes respectively, both of which are transliteration schemes, and transliterated outputs of Tibetan and Burmese texts from these schemes have been used wherever the native script appears, whether it be in a Tibetan or Burmese language entry, in the etymologies of other languages or in translation sections.
- The universal use of these transliteration schemes is confusing to many unfamiliar with the languages, especially casual visitors to the site. Consequently, there should be additional transcription modules developed for the two languages, used to generate the appropriate romanisation in some circumstances on Wiktionary. The most important circumstance under which transcriptions are desired is probably in translation sections. At the moment someone looking to say "eight" in Tibetan would be absolutely clueless when the person saw the following result on the page eight:
- བརྒྱད (brgyad)
- Same with someone trying to say "long" in Burmese:
- ရှည် (hrany)
- The pronunciations of these two words are /cɛʔ˩˧˨/ (Transcription: gyaew) and /ʃè/ (Transcription: she), which the person reading the pages eight and long would not have guessed if (s)he only stayed on those pages. For other circumstances, such as ordinary inter-entry linking, the use of a transliteration method of romanisation is probably better (especially in etymologies), although the decision is to be made by all active editors. The realisation that romanisations used in translation sections should resemble the pronunciation as much as possible has been present on Wiktionary. Compare the Wikitext in the Russian translation of catheter:
- {{t+|ru|кате́тер|m|tr=katɛ́tɛr}}
- This is despite the fact that there is a Russian transliteration module on Wiktionary, which in this case would generate a correct transliteration but an incorrect transcription outcome. On the whole, the distinction between transliteration/transcription in Western languages is very minor compared to languages of the East, for which no infrastructure for this distinction is provided on Wiktionary at the moment. This is how Module:languages/data2 appears currently:
m = {
canonicalName = "Tatar",
scripts = {"Cyrl", "Latn", "Arab", "tt-Arab"},
family = "trk-kip",
translit_module = "tt-translit",
}
- This works well with alphabetic languages. For many languages of the East, the section should be more detailed:
m = {
canonicalName = "Tibetan",
scripts = {"Tibt"},
family = "tbq",
ancestors = {"xct"},
translit_module = "bo-translit",
transcript_module = "bo-...",
transcript_in_links = false, --optional
transcript_in_translations = true,
}
- This is the reason I regarded this problem as a lack of support from the central modules, and did not consider changing Module:th-translit into a transcription module as an appropriate way to tackle this. Wyang (talk) 08:36, 6 June 2016 (UTC)
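How the proposed fields might be consumed can be sketched as below. This is a hedged Python illustration, not existing Wiktionary code: the function name and the context strings are assumptions, the field names are taken from the proposal above, and the transcription module name is left as the elided "bo-..." because the proposal does not name it.

```python
from typing import Optional

# Stand-in for the proposed language data entries.
LANGUAGES = {
    "bo": {
        "translit_module": "bo-translit",
        "transcript_module": "bo-...",  # elided in the proposal; left as-is
        "transcript_in_links": False,
        "transcript_in_translations": True,
    },
    "tt": {"translit_module": "tt-translit"},
}


def romanisation_module(lang_code: str, context: str) -> Optional[str]:
    """Pick the module name for a context such as "links" or "translations"."""
    data = LANGUAGES[lang_code]
    # A missing flag is treated as False, so existing entries keep working.
    wants_transcription = data.get(f"transcript_in_{context}", False)
    if wants_transcription and "transcript_module" in data:
        return data["transcript_module"]
    return data.get("translit_module")


print(romanisation_module("bo", "translations"))  # bo-...
print(romanisation_module("bo", "links"))         # bo-translit
print(romanisation_module("tt", "translations"))  # tt-translit
```

Languages that declare no transcription fields fall through to their transliteration module, so an entry like the Tatar one would behave exactly as it does today.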
- @Wyang: One thing I'm confused about, is if you are planning to use the transcription instead of the transliteration, why do you need a transliteration module? --WikiTiki89 18:21, 6 June 2016 (UTC)
- Different languages have different uses of transliteration modules. For Thai, editors have agreed on the use of transcriptions in translation sections and in normal links, although transliteration may be the better option of romanisation of Thai terms in etymologies of other languages, when the module calling Module:links is Module:etymology. For Tibetan and Burmese, transcription should be used in translations, whereas transliteration is the better mode of romanisation in generic links, as there is good one-to-one script correspondence and makes etymologies much more apparent. The modules should be kept and named accordingly for languages where the distinction is important on a romanisation level. Wyang (talk) 00:47, 7 June 2016 (UTC)
- @Wyang: Ok, now I understand better what your intentions are. However, I don't think it's a good idea to use different transliteration/transcription systems in different places. This is something the Wiktionary community should agree on as a whole, and not just the Thai editors (and the Tibetan and Burmese editors). The other issue is that parsing a linked-to entry to determine the word's phonetic transcription is a really bad idea for a number of reasons that have already been pointed out in the above discussions. What would be wrong with manually supplying these transcriptions? You can even add the manual transcriptions with a bot, which is similar to what User:Benwing2 did for Russian accent marks. Changing the logic of Module:links is not the right solution to either of these problems. --WikiTiki89 14:21, 7 June 2016 (UTC)
- From the experience with parsing in the past one and a half years, I would say that the associated harm is very minimal and benefits are extensive. This is somewhat similar to the case of the deletion of Template talk:str index (used in py-to-ipa then) that I contested about five years ago, well before the advent of this Lua system, and the difference is that the benefit-to-harm ratio in this case is even higher. People were not even that warm to the idea of automatic transliteration back then. The earliest and most important use of parsing is in {{zh-forms}}, and it has resulted in dramatic changes in the way that Chinese entries are formatted. Code is much more succinct, and as a consequence efficiency and productivity have exponentially increased (examples of use: 安眠藥 / 安眠药 (ānmiányào), 暗物質 / 暗物质 (ànwùzhì), 報酬遞減定律 / 报酬递减定律 (bàochóu dìjiǎn dìnglǜ)).
- Tools should only be used in situations where they must be. In the case of parsing for transcriptions, it is irrelevant to most of the languages hosted on Wiktionary and therefore most editors on the site. Most people have no experience and will have no experience with this. People tend to show aversion to the unfamiliar, and when the aversive mentality is voiced collectively by similar-minded peers, the disinclination is irrationally amplified and may as well convincingly mask the reality, which may only be visible to those centrally involved. (This may well underlie some political phenomena and explain the difficulty experienced with the Chinese entry format change here.) I would be arguing that new technology should be actively embraced and not feared (Wikipedia:Don't worry about performance). Likewise, transcription should be achieved automatically and people/bots should not have been manually supplying the transcriptions since the infrastructure is fully functional with no demonstrated risks. Even if there are, the focus should be on how to solve it, not on how to disable it.
- With regard to the partial change to transcriptive romanisation, I argued for what I consider as appropriate for Tibetan and Burmese and would be happy to hear about other ideas. On a historical note, before the creation of Module:my-translit, most formatted Burmese entries were using the BGN/PCGN system for romanisation, which is a transcription system, and the change to a transliteration system (MLCTS) occurred due to the higher success rate of automation of the latter, which allowed a much wider coverage of romanisation for the Burmese content. It is a decision to be made by Burmese-language editors collectively, and people should have the freedom to choose a practice of romanisation that is most appropriate for the language, with modules using the two modes (transliteration and transcription) of romanisation for this language already recorded in the backend database, and infrastructure in place for determining which system should be used where. For instance, if Burmese uses transcription in links I would still suggest that any calls to Module:links by Module:etymology use the Burmese transliteration module to generate romanisations, as Burmese transcriptions are much less informative for this purpose. Wyang (talk) 08:53, 8 June 2016 (UTC)
- You make some good points. I'll need to think about this for a bit. But also note that {{Wikipedia:Don't worry about performance}} does not apply here. The page states "You, as a user, should not worry about site performance. In most cases, there is little you can do to appreciably speed up or slow down the site's servers. The software is, on the whole, designed to prohibit users' actions from slowing it down much." But the concern is not slowing down site performance, but that since the site's performance is protected by time and memory limits, we have frequently seen on Wiktionary these limits being reached and producing errors. Thus, performance is still an issue, even though its consequences do not affect the site's performance overall. --WikiTiki89 14:40, 8 June 2016 (UTC)
- So, what happens now? Can we please get rid of the Thai code from Module:links now, or do we need some more edit warring? —CodeCat 12:06, 11 June 2016 (UTC)
- Do you have any constructive suggestions? DCDuring TALK 14:25, 11 June 2016 (UTC)
- Reinstate Wikitiki's original 3 edits and be done with it. —CodeCat 15:30, 11 June 2016 (UTC)
- I note that Wikitiki's comment of three days ago made it seem that he hadn't come to that final conclusion. DCDuring TALK 00:21, 12 June 2016 (UTC)
- User:Chuck Entz has described the situation very well. User:Wyang has created a working code for Thai transliterations/transcriptions and character sequencing. It is another commendable achievement of his. Few people attempted to work with scripts of such complexity as Thai. The majority of developers think that Thai is simply not transliteratable, even the phonetic respelling. User:CodeCat has broken the code for the reasons she mentioned. So, Thai transliteration modules stopped working and no alternative was offered. Thai editors were left wondering what was going on. User:Wikitiki89 has provided a workaround (later). I don't really know if it's a good fix. It should, of course, be considered but Wikitiki89 is not sure himself. There could be many other solutions but breaking an existing code without really offering a working solution is wrong. It seems CodeCat simply doesn't care about thousands of Thai entries, translations, editors and tremendous work put into this. I fully understand Wyang's frustration. I hope this conflict will be resolved peacefully. I don't want anyone desysopped but I encourage more consideration of other people's work. I'll leave the final technical solution to the people who understand it better. I don't see a huge reason for Module:links not to take some of the work (language-specific customisations) and/or accommodate handling of complex scripts with various levels of possible transliteration/transcription. For example, we capitalise transliterations of Korean proper nouns with a symbol "^" using the module.
- As for the transliteration/transcription for Thai - a graphical (literal) transliteration for the Thai script is not used anywhere, no Thai dictionary uses non-phonetic transliteration, it would produce nonsensical garbage, even for many words with regular or predictable spellings, just like many English words would if they were transliterated graphically into another script, e.g. "light" (l-i-g-h-t) - Cyrillic лигхт (ligxt). A phonetic Thai transliteration is not only popular but it's also standard. There are various Thai transliteration standards but none of them is graphical (showing sequence of symbols). A graphical spelling can also be provided, please see กรรเชียง (gan-chiiang), which shows the actual orthography (including the phonetic respelling of the term - "กัน-เชียง"). The one adopted here is based on Paiboon publisher of dictionaries, phrasebooks and textbooks. Royal Thai General System of Transcription is also phonetic but not very useful for learners - no tones, no long vowels, etc. --Anatoli T. (обсудить/вклад) 04:27, 14 June 2016 (UTC)
I have tried to explain the situation to him on his talk page, but he doesn't seem to want to understand that he can't just change common practice regarding transliteration to suit his own personal tastes. Big changes to common established practice like this need discussion and consensus, and I consider this a big enough change to require a vote, but I am having a hard time getting him to actually do so and wait for consensus. Instead he edit wars over it to try and force his change through, since he thinks he is right, anything is warranted and any opposition is apparently shortsighted and Eurocentric and therefore it's ok to ignore consensus. Can someone else please try explaining it to him and try to get him to stop messing with the modules? The only thing I can do is continue to revert him. Thank you. —CodeCat 02:01, 17 August 2016 (UTC)
- It has been very frustrating interacting with User:CodeCat - unreceptiveness to suggestion, poor participation in discussions at the Beer Parlour, blocking wilfully, replying with completely irrelevant comments, and impetuous reverts without any input to the topic at hand. The word being thrown around is consensus, when there is not even one to begin with. I repeatedly asked for consensus for treating romanisation and transliteration as equivalent in Module:links, but User:CodeCat's response is plain simple - evasion, evasion and evasion. Without any clear and thorough discussion showing your edit is consensus, why are you throwing around the consensus as if there is one? If you are not willing to discuss, you should not be making any changes, let alone reverting impetuously. Disappointing that such blatant bullying is condoned. Wyang (talk) 02:11, 17 August 2016 (UTC)
(August) Wheel War- Action Taken
The conflict between User:CodeCat and User:Wyang has gone on long enough. They've been edit warring over an absolutely critical module used by huge numbers of entries. I'm not sure what that's doing to the edit queue- but it can't be good.
Both deserve to be blocked, but that would render them unable to contribute in discussions over the issue. It's also true that their misbehavior has been limited to editing protected modules and blocking each other.
Therefore, I have temporarily desysopped both of them, which will prevent them from editing the modules in question. I intend to restore them in one week, or when this is resolved- whichever comes first.
If edits need to be made to protected modules before then, I would appreciate it if our more-knowledgeable admins would make themselves available to help out- perhaps User:Wikitiki89 or User:DTLHS?
I hope we can resolve this conflict quickly and get back to building a dictionary.
I would appreciate your feedback on my actions, since such things should only be done with community consensus.
Thanks! Chuck Entz (talk) 05:49, 18 August 2016 (UTC)
- I can't think of any other action that would have been more appropriate. SemperBlotto (talk) 06:10, 18 August 2016 (UTC)
- I think the desysopping was appropriate. I would even strongly propose that community consensus be obtained before reinstating the tools. CodeCat and Wyang have wheel-warred before, and each has blocked the other at least once, among other questioned actions. CodeCat, Wyang: you two are knowledgeable contributors to our content, and you are valuable contributors to our technical infrastructure, but you've both long (and not necessarily in equal measure) shown a tendency towards using your abilities to implement faits accomplis and get your way on e.g. module and entry layout or on treatment of Chinese, respectively. For instance, although on this page CodeCat calls on Wyang "to find a consensus for his proposed changes to Wiktionary practice", mere days ago Benwing called her out for again using her bot to create many new entries inconsistent with our existing entries and practices. Wyang, in turn, has threatened a few times to take his ball and go home if we don't agree with an action or, long ago, the unification of Chinese. These attitudes have driven away other editors; for instance, User:Mkdw just recently left after calling out Wyang's use of admin tools in the BP, while User:Ruakh has been largely inactive since earlier disputes with CodeCat over modules (as noted e.g. here) and the presentation of module errors (then and still now I agreed with CodeCat that module errors should generate a visible error message, but the dispute cost us a knowledgeable technical editor). This particular wheel-war seems especially excessive because the dispute seems to be not over whether there should be an automatic translit feature for complex non-European scripts like Thai, but over where it is most elegant to put that feature. - -sche (discuss) 06:49, 18 August 2016 (UTC)
- As, what it feels like anyway, the only non-admin reading these discussion boards, I express my consensus and agree with -sche that the stripping should not be time-bound but powers should only be restored when the community is convinced that the issue is done with in such a way that neither will have any incentive to do something which sparks it up again. I also repeat my conviction that no party of an edit war (as defined by me above as a conflict where two reversals of an edit have happened) should have the right to block or unblock any participant. Korn [kʰũːɘ̃n] (talk) 10:14, 18 August 2016 (UTC)
- I pretty much agree with you on everything (not unusual, by the way). We have here two equally stubborn and overbearing people who have met their match- if it weren't for the stakes and the damage done, it might be satisfying to see both get their comeuppance. As for duration, I was careful to say "I intend", because the week was just an arbitrary time picked out of the air, and I was hoping we could come up with something better. Right now both are responding with stereotyped "talking points" about the failings of the other, which shows both are still dealing with this on a strictly emotional level. The truth is, both are basically right about each other for the most part, but it's irrelevant. We need to come up with a solution that makes sense and that both can live with. Chuck Entz (talk) 14:16, 18 August 2016 (UTC)
- I think the desysopping has to continue until the matter is resolved. As long as there is no sulking, the project will continue to benefit from their contributions. I hope that the project does not suffer from lingering bad feelings once these valued contributors regain their sysop status. DCDuring TALK 13:10, 19 August 2016 (UTC)
- I support the emergency temporary desysopping of both editors made by Chuck Entz on account of interminable wheel-warring. I believe a bureaucrat is authorized to take such temporary measures to eliminate this kind of wheel-warring, without a vote. --Dan Polansky (talk) 11:44, 21 August 2016 (UTC)
To be honest, I am not expecting any functional input from User:CodeCat regarding the topic at hand, based on her bullying behaviour and unwillingness to engage in discussions in the past few days. Her only argument has been that her edit was based on "consensus", which is obviously nowhere to be found, even when requested again and again.
Treating romanisation as equivalent to transliteration is clearly erroneous (since romanisation = transcription + transliteration), but she keeps reinstating this misinterpretation, with total disregard for the infrastructure of languages which make the distinction between transcription and transliteration on a romanisation level. For example, Module:th-translit does not even describe what it does after her edits, and she is apparently nonchalant about these languages ("It's a misnomer, but that's the way it is.").
This lack of regard for correctness, coupled with her previous heedless deletion of the indispensable code in Module:links (which precipitated all this), are acts of admin sloppiness. Her one-line response of "So, what happens now? Can we please get rid of the Thai code from Module:links now, or do we need some more edit warring?" to my detailed rationales for putting transcription support in the central modules, is exemplification of her uttermost apathy towards the actual topic ("would rather fight not explain") and disrespect to people.
This second episode was perfectly bound to happen, and bound to end tragically, when all that one side of the dispute cares about is "getting rid of the Thai code from Module:links now", even if she has to use "some more edit warring" for that. Yet, there are people cheering for her. Wyang (talk) 10:38, 18 August 2016 (UTC)
I support this action and wish that someone had done something sooner. I called for help above, but nobody responded, so I was very unsure what to do as I didn't feel like I had any options left, and it was all up to me. I'm sad to see that the community only cares when there is edit warring going on but is unwilling to help in solving the problem outside of that. At least now, people's attention is finally here so I can't complain too much.
As far as the dispute goes, I can summarise what I see:
- Wyang, in principle, believes that transliteration modules should only be used for transliteration in the strict sense: letter-by-letter conversion.
- Consequently, the Thai transliteration module does literal transliteration, which makes it pretty much useless for Thai.
- This goes counter to how the term "transliteration" is generally used on Wiktionary; we use the term to refer to transcription, transliteration and romanization in general. Transliteration modules perform all of these functions, and the
tr=
parameter that is present on many templates is frequently provided with something that is not strictly transliteration, but rather adheres to the Wiktionary usage of the word. Our policies with respect to the use of these parameters and modules are labelled "transliteration" as well, as evidenced by WT:RU TR, WT:EL TR and WT:JA TR for example. None of these transliteration policies describes transliteration in the strict sense (av rather than ay for Greek, ō rather than ou for Japanese, etc.).
- Because the transliteration module for Thai is useless by Wyang's own choice, Wyang decided that the best way around this was to insert special-purpose code into Module:links, a widely-used general-purpose module, to transliterate Thai correctly by using code present in another module, Module:th.
- This was disputed by me, arguing that such special language-specific code does not belong in a general purpose module, especially not when it can easily be put into the existing transliteration module and have everything work just fine.
- User:Wikitiki89, in the last war, did just this: he moved the code over to the transliteration module, where it belongs. This was immediately reverted by Wyang however, and his special code in Module:links reinstated despite it already having been disputed. My efforts to reapply Wikitiki's edits were repeatedly reverted by Wyang.
- Fast forward to now, when I once again noticed Wyang's special purpose code in Module:links, and got frustrated that the issue was never solved. I therefore once again moved the code to the Thai modules. This again resulted in a revert war.
- I attempted to explain on Wyang's talk page that in order for his alternative interpretation of transliteration, which involved creating separate modules and infrastructure for transliteration versus transcription/romanization modules, to be accepted, he would have to find a consensus with the community for it and seal it with a vote.
- Wyang showed no intention of doing this, instead arguing on the merits of his views as if to convince me that separating the two was the right way to go. In my view, this missed the point as it wasn't me he was supposed to convince, but the community at large. Thus, I ignored his arguments and instead tried to focus on stopping him from edit warring and trying to get community consensus first.
- Wyang refused to create a vote, instead telling me to create a vote for him. Two other editors also called for a vote, and even offered to make one if Wyang didn't. I welcomed this, but nothing has been done in this regard yet, and Wyang continued his edit war, rather than waiting on the outcome of the vote.
- I called for help on the Beer Parlour regarding the matter, hoping that other users would be better capable of solving the issue and, especially, to stop Wyang from reverting me each time and get him to wait for consensus. This call for help was entirely ignored, and thus the warring continued.
—CodeCat 14:21, 18 August 2016 (UTC)
- It seems the issue is a bit more complicated than that. Wyang seems to want to have both transliterations and transcriptions for Thai, used in different places. This is something that goes against the status quo and should need a vote before being implemented. Wyang has refused to draft this vote claiming that the consensus among Thai editors is enough. However, this impacts not only Thai editors, but our readers as well who may be confused by having two different romanization systems in different places. As long as Wyang continues to refuse to draft a vote, I don't think we should allow his system to be put in place. My personal opinion is that there should be one default romanization system, whether it be strictly a transliteration, or a transcription, and if it is necessary to use a different system in etymologies, this should either be done manually with
tr=
parameter, or potentially with a dedicated Thai template that would allow choosing a different automatic romanization. In either case, all the automatic Thai romanization code, both transliteration and transcription, should be located in Module:th-translit. --WikiTiki89 14:35, 18 August 2016 (UTC)
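The fallback order in this proposal (a manually supplied tr= value wins, otherwise the one default automatic romanization is used) could be sketched roughly as below. This is only an illustrative Python sketch, not the Lua in Module:links or Module:th-translit; `auto_romanize`, `display_romanization` and the toy lookup table are hypothetical names, and the sample values are the Tibetan ones quoted elsewhere in this thread.

```python
from typing import Callable, Optional

# Hypothetical stand-in for a language's single automatic romanization.
# A toy lookup table, not real romanization logic.
_TOY_TABLE = {"བརྒྱད": "brgyad"}


def auto_romanize(text: str) -> Optional[str]:
    """Return the default automatic romanization, or None if there is none."""
    return _TOY_TABLE.get(text)


def display_romanization(
    text: str,
    tr: Optional[str] = None,
    auto: Callable[[str], Optional[str]] = auto_romanize,
) -> Optional[str]:
    """A manually supplied tr= value always overrides the automatic default."""
    if tr:
        return tr
    return auto(text)
```

Under these assumptions, an etymology section that wants a different system simply passes its own value, for example `display_romanization("བརྒྱད", tr="gyaew")`, while every other caller gets the single default.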
- I'd like Thai to follow the pattern we've already established for Burmese: one automatically generated transliteration system used everywhere outside of Thai entries (translation sections, etymology sections, etc.), and Thai entries with additional transliteration systems (both spelling-based and sound-based). Ideally the automatically generated one should be ISO 11940-2 or at least based on it. —Aɴɢʀ (talk) 15:05, 18 August 2016 (UTC)
- @Angr Burmese entries are nowhere near the level of current Thai entries. The current Burmese transliteration is much closer to the spelling, which doesn't help users much with the pronunciation. Ideally, we should have a system created for Thai - with phonetic respellings but for that we need more native knowledge or reliable data available. With Thai, we are luckier - we have native speakers, phonetic respellings from some dictionaries and "Paiboon" or other transcriptions from published dictionaries sometimes can help reverse-engineer the phonetic respelling (for non-natives). I'd like to see the same methods used for Burmese and Tibetan. --Anatoli T. (обсудить/вклад) 02:32, 21 August 2016 (UTC)
Transliteration is not concerned with representing the sounds of the original, only the characters, ideally accurately and unambiguously. (Wikipedia)
Romanization, in linguistics, is the conversion of writing from a different writing system to the Roman (Latin) script, or a system for doing so. Methods of romanization include transliteration, for representing written text, and transcription, for representing the spoken word, and combinations of both. (Wikipedia)
- Transliteration is not the same as romanisation. Romanisation = transliteration (script conversion by spelling) + transcription (script conversion by pronunciation). It is quite embarrassing that we as a dictionary are getting this basic concept wrong in some places, and seem proud of propagating and not rectifying the error. The contrast between transliteration and transcription is a fundamental concept in basic linguistics (and I do not even have a linguistics background). The distinction is strictly adhered to when one discusses romanisation schemes for languages which have a noticeable script-pronunciation discordance, i.e. languages which distinguish between transliteration and transcription on a romanisation level. If one wishes to talk about conversion to the Latin script using any mapping, it is romanisation. There are some places on Wiktionary which have confused these two concepts - for example WT:RU TR - which is precisely why people have complained that "this looks like a mess of transliteration and transcription". If we search for "transliteration of Russian" (or other languages) on Wikipedia, we will quite sensibly be redirected to "romanisation of Russian".
- So far the confusion has been largely non-disastrous, since (1) most content has been for Latin-script languages, and (2) for languages which do not use the Latin script and have had romanisation systems devised for them on Wiktionary, the difference in the romanisation outcomes generated by transliteration and transcription is comparatively minimal, for example Russian. Then we deal with languages of the East, which are renowned for keeping the spelling forms used hundreds or thousands of years ago, and therefore have a high script-pronunciation discordance. If we go back to the comparison table of various languages in transliteration and transcription outcomes that I created in June, we can see that transliteration and transcription are two visibly dissimilar modes of romanisation, and this confusion of transliteration and transcription is destined to have disastrous consequences.
- People may not be aware of this, but this distinction has been faithfully adhered to when we designed the module infrastructure for Oriental languages, until this incident. From the aforementioned table, we can see that transliteration as a concept is inherently impossible for Chinese and Japanese since there is no script-to-script mapping, and appropriately there is no Module:zh-translit or Module:ja-translit on Wiktionary. What we have in place for Chinese and Japanese is Module:zh/Module:zh-pron and Module:ja/Module:ja-pron which help interpret auxiliary or native phonetic representations of these languages to generate romanisations in a transcriptive manner. On the other hand, transliteration is possible and contrastive with transcription for Tibetan and Burmese, therefore we have Module:bo-translit and Module:my-translit to generate transliterations and Module:bo/Module:bo-pron to generate transcriptions. For Thai, the distinction has also been faithfully observed until the incident: we have Module:th-translit which deals with transliteration, and Module:th/Module:th-pron dealing with transcription of Thai. It has been customary practice to devise module infrastructure for languages observing the transcription-transliteration distinction prudently, and this includes naming modules the way they should be named, to avoid future complications.
- Why do we have to be prudent in devising the module infrastructure for these languages, and what are the complications of imprudent and misnomeric handling of languages observing this transcription-transliteration distinction? I explained this in detail in the previous discussions (discussion 1, discussion 2). In short, the two divisions of romanisation (transliteration and transcription) symbolise the two polarising ends of the romanisation spectrum in a dictionary-building context:
- etymology (transliteration) ———— pronunciation (transcription).
- The reasons we use romanisations on Wiktionary are different in various parts of the project. In translation sections, the purpose of romanisations is to inform readers how the word in another language is pronounced. In etymological comparisons, the purpose of romanisations is to inform readers how the term is spelled, i.e. how it used to be pronounced. Among languages which observe the transliteration-transcription distinction, there is variation in how acceptable it is to approximate one romanisation with the other and use it in all places. This really needs to be decided on a language-by-language basis. For some languages (e.g. Tibetan, Burmese), it is not advisable to use one mode of romanisation in all places. Again using the Tibetan example of བརྒྱད (transliteration: brgyad; transcription: gyaew): It makes no sense to say:
- བརྒྱད (gyaew) is a cognate of Old Chinese 八 (OC *preːd)
- when the word is actually spelt as brgyad; and it similarly makes no sense to put:
- བརྒྱད (brgyad)
- as the Tibetan translation of eight. The next day after the faithful Wiktionary user downloads our app, he/she is found at a Lhasa stall, trying to bargain by whipping out their phone and awkwardly pronouncing /brɡjad/.
- The transcription-transliteration distinction is a pan-linguistic phenomenon not just limited to Thai lacking support in the core module system, and more consideration and acknowledgement that many script-pronunciation discordant languages use two methods of romanisation needs to go into the infrastructure. A system is never perfect (e.g. Module:links and Module:translations already contain language-specific adaptations), but changes are gradual and have to be initiated at the correct end. A step in the wrong direction may precipitate amplified counterproductivity before eventual rectification takes place, like the misinterpretation of transliteration here. Wyang (talk) 00:41, 19 August 2016 (UTC)
- @Angr: ISO 11940-2 is a transcription system. Wyang (talk) 00:43, 19 August 2016 (UTC)
- @Wyang: You're starting to sound like a broken record. We all understand that in linguistics there is a distinction between transliteration and transcription, both of which can be called romanization (when this is done into the Latin alphabet). But the issue here is not of terminology. Yes, we use transliteration incorrectly according to the linguistic definition (although the common non-linguistic definition would include phonetic transcriptions as transliterations), but if we "corrected" ourselves and replaced the word transliteration with romanization everywhere that we misuse it in our templates, modules, "about" pages, etc., you still would not be happy. Why? Because your problem is that you want to use two different automatic romanization systems (one a "transcription" and the other a "transliteration"), when our templates only support using one automatic romanization system. So let's talk about that issue and not the terminology. --WikiTiki89 01:09, 19 August 2016 (UTC)
- Well, I have to sound like a broken record because much of this has already been said two months ago in discussions poorly tended to by User:CodeCat (aside from the one-liner). The core issue is the confusion of transcription and transliteration by people who designed Module:links and the consequent awkwardness in its support for transcriptions. If transliteration = romanisation = transliteration + transcription in the central infrastructure, then where does transcription fit? It merely becomes a transliteration2, which it is not and should instead be contrasted with. The technical side is easy to fix - the shorthand "tr" is perfect already. We only need to store the transliteration and transcription modules as separate in language_data, and turn on the transcription modules at the appropriate point. For example, my revision at Module:links. Obviously there are more rigorous ways, but the approach has to be central to start with, not by confusing this concept even further in languages which truly distinguish them. Wyang (talk) 01:30, 19 August 2016 (UTC)
- Ok, so let's say that we do this. Now, which of the two modules is called when our modules need a romanization to be auto-generated? And what use is the module that does not get called?
- Also, aside from all this, why have you never thrown up a BP discussion or vote to discuss this proposal? Why did you edit war to put it in place instead? —CodeCat 01:37, 19 August 2016 (UTC)
- Personally I think it would be hella confusing to display brgyad in one place and gyaew in another place when referring to the same word. If we are to make a systematic distinction between transliteration (in the proper sense) and transcription, we should include both forms consistently. Perhaps we write བརྒྱད (brgyad ・gyaew) where the dot in the middle links to a page explaining what the two romanizations mean. Mind you, I'm not convinced it's worth the trouble, but if we are to do it, something like this would be the way. Benwing2 (talk) 01:50, 19 August 2016 (UTC)
- And in fact that suggestion is already possible without Wyang's changes to Module:links. --WikiTiki89 01:53, 19 August 2016 (UTC)
- "བརྒྱད (trlit. brgyad; trscr. gyaew)"? (a bit more explicit; the blue dot in headwords is not terribly intuitive IMO) —suzukaze (t・c) 02:08, 19 August 2016 (UTC)
- This is fine with me, and I agree is more intuitive. Benwing2 (talk) 06:01, 19 August 2016 (UTC)
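The labelled display suggested above could be produced by something like the following. This is a rough Python sketch for illustration only; `format_link` is a hypothetical helper and not part of any existing Wiktionary module, and the "trlit."/"trscr." labels are taken from the suggestion itself.

```python
def format_link(term: str, translit: str = "", transcr: str = "") -> str:
    """Render a term followed by whichever romanizations are available.

    Both values are optional, so a language with only one system
    still gets a sensible display.
    """
    parts = []
    if translit:
        parts.append(f"trlit. {translit}")
    if transcr:
        parts.append(f"trscr. {transcr}")
    if not parts:
        return term
    return f"{term} ({'; '.join(parts)})"
```

Called as `format_link("བརྒྱད", "brgyad", "gyaew")`, this yields the exact string proposed above; with neither romanization it returns the bare term.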
- CodeCat, all of this was in the original discussions (discussion 1, discussion 2). "Why did you edit war to put it in place instead" - this is an irresponsible and unnecessary accusation. Please have a look at the page history of Module:links; the first revert was your heedless revert which paralysed the Thai entries.
- བརྒྱད (brgyad ・gyaew) in translations is too confusing for newcomers. The technical support for transcriptions is not difficult to put in place. A simple parallel function of
Language:transcribe
can be added in Module:languages. This function can be called by Module:links (i.e. to turn on transcription support) unconditionally for language A, or conditionally for language B (e.g. only when Module:links is called by Module:translations, or unless Module:links is called by Module:etymology). Wyang (talk) 04:26, 19 August 2016 (UTC)
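To make the proposal above concrete, here is a minimal Python sketch of storing the two kinds of romanizer separately per language and picking one based on who is asking. It is an illustration under assumptions, not the actual Lua of Module:languages or Module:links: `LANGUAGE_DATA`, `romanize` and the `caller` strings are hypothetical, and the lambdas are toy lookups using the Tibetan values from this thread.

```python
from typing import Callable, Dict

Romanizer = Callable[[str], str]

# Hypothetical per-language data: the two romanizers are stored separately.
LANGUAGE_DATA: Dict[str, Dict[str, Romanizer]] = {
    "bo": {
        "transliterate": lambda text: {"བརྒྱད": "brgyad"}.get(text, ""),
        "transcribe": lambda text: {"བརྒྱད": "gyaew"}.get(text, ""),
    },
}


def romanize(lang: str, text: str, caller: str) -> str:
    """Use the transcription in translation sections when one exists,
    and fall back to the transliteration everywhere else."""
    data = LANGUAGE_DATA.get(lang, {})
    transcribe = data.get("transcribe")
    if caller == "translations" and transcribe is not None:
        return transcribe(text)
    transliterate = data.get("transliterate")
    return transliterate(text) if transliterate is not None else ""
```

In this sketch a translation table would show gyaew and an etymology section brgyad for the same word, which is exactly the behaviour the replies below argue about.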
- What about suzukaze's suggestion? Do you still think it's too confusing for newcomers? IMO displaying different romanizations in different places is far more confusing than displaying both and I would be strongly against that. Benwing2 (talk) 06:01, 19 August 2016 (UTC)
- What about བརྒྱད (gyaew )? We already do this for Akkadian, for example: 𒆍𒀭𒊏𒆠 (bābili ). (Although I don't understand why gyaew is even needed, none of the IPA transcriptions at the page look anything like it; they all look much more like brgyad.) --WikiTiki89 09:20, 19 August 2016 (UTC)
- gyaew is the Lhasa pronunciation: gy /c/, ae /ɛ/, w /˩˧˨/. Frankly, I would be quite confused by the Akkadian word if I saw it in translations (I still am after reading the entry, especially the etymology). It may be less unsatisfactory for Akkadian, as people may be less interested in the spoken aspects of a dead language. I don't think putting transliteration in translations is a good idea for any of the non-small living languages with a high level of script-language discordance.
{{bo-pron}}
has more examples of transliteration-transcription correspondences in Tibetan. Wyang (talk) 09:40, 19 August 2016 (UTC)
- The fact that you are confused by our Akkadian romanizations is not really a problem. We shouldn't necessarily expect people to automatically understand these things. We need to have appendix pages explaining our romanization scheme for each language, just like any other dictionary would do. Such an appendix would explain to you that bābili is the transcription and KA2.DINGIR.RAKI are the names of each character in the word, named by their usual phonetic value, with capital letters indicating Sumerian logograms (Sumerograms; kind of like Kanji) and superscript indicating determinatives. --WikiTiki89 12:26, 19 August 2016 (UTC)
- I would love to read some statistics regarding the traffic of our help pages - I have always been under the impression that very few people are able to navigate to our Wiktionary:About... pages, since we do not have an obvious or subtle link on the entry itself linking to the language help page. We do not have a "translate!" tool alongside the search box that helps a reader check if translations of word A in language X exist (i.e. a simple interface with two fields "word" and "language" (dropdown by #speakers), which parses through the content of the entry A to see if it has the translation of any sense in language X), and prompt the user to suggest that we add this translation if there is none. We also do not have a fuzzy search function, or a reverse transcription search, and many other things. Personally, the reason I look up translations is because I want to know how to say the equivalent in another language. Like the common phrase "How do you say ... in the ... language?", not "How do you spell?". I imagine most readers are expecting to find out the pronunciation of a foreign non-Latin-script word on the translation page itself, which is why I'm suggesting simple, straight-to-the-point phonetic transcriptions inside translation boxes. Wyang (talk) 13:07, 19 August 2016 (UTC)
- I'm all for making the "about" pages more easily accessible. As for pronunciation, you're supposed to click on the entry and not simply look at the table. The entry should have all the pronunciation information. Someone unfamiliar with Tibetan will not know how to pronounce gyaew anyway. Someone who knows a little bit about Tibetan would realize that the word might not be pronounced brgyad in Lhasa and click on the entry for further pronunciation information. I don't know why you're bringing up search features, they do not seem relevant to this discussion. --WikiTiki89 13:29, 19 August 2016 (UTC)
- Yes, people are supposed to click on them, but people (especially casual visitors) often don't. People may not know how to pronounce gyaew initially, but if the display in translations consistently uses transcription and people are pointed to the correct help page, they are more likely to become regular users and use the translation functionality more frequently. The point about the search features was to lament that our user friendliness is (excuse me) crap... and yet, we are here arguing whether or not we should give support to transcriptions which prominently contrast with transliterations in many languages, and whether or not it is worthy to improve user experience with more consideration. Wyang (talk) 14:24, 19 August 2016 (UTC)
- Giving a user a piece of unexplained information without a link or even a name for that information, thus effectively blocking the user from figuring out what that information is, is a problem. Because that means you have not given the user any information at all, you just blurted some nonsensical text. Korn [kʰũːɘ̃n] (talk) 15:11, 19 August 2016 (UTC)
- Perhaps all of our transliterations should automatically link to a description page, like this: обезья́на (obezʹjána (key), “monkey”)? --WikiTiki89 15:53, 19 August 2016 (UTC)
- That's actually how I handle it for Middle Low German grammar. Though I'm not sure it needs to happen for plain transliteration, which should be more intuitive than Sumerograms. In case of doubt, better safe than sorry, though. Korn [kʰũːɘ̃n] (talk) 16:08, 19 August 2016 (UTC)
- The only downside to that idea is that it puts too much emphasis on the transliteration, rather than on the word itself. Another idea I've always contemplated was to just get rid of all transliterations in links and have them only in entries and etymologies, but that's a very radical change. Another idea I just had is what if we have links to transliteration keys after the language name in translation tables and at the top of each language header. --WikiTiki89 16:43, 19 August 2016 (UTC)
- What if we just made the transliterations themselves the links to the keys? (Languages where the transliterations have entries (e.g. Gothic) could continue to link to those entries, since they contain, or link to the main entries which contain, much the same information as the key would.) - -sche (discuss) 17:15, 19 August 2016 (UTC)
- I did consider that. My first thought is that it would look weird for all transliterations to be colored as links. Also, would the reader know what he would get from clicking the transliteration? But maybe it's not such a bad idea. We should limit this to link templates, though. Usage examples and other such things probably don't need to have their transliterations linked. --WikiTiki89 17:54, 19 August 2016 (UTC)
- Strong oppose for any move to remove transliterations / romanizations from links. That would greatly reduce the usability of all Japanese entries. ‑‑ Eiríkr Útlendi │Tala við mig 00:42, 20 August 2016 (UTC)
- A lot of online dictionaries have significantly better interfaces than us. Some use hover over for all links to show a sneak peek of the linked-to entry; examples are Moedict, CantoDict, Thai-language. These are all impressive tricks which we can potentially implement to greatly improve the user experience. The link in translations can be turned into a hover-over link which previews the pronunciation and first sense of the term, and on mobiles it can be simple link with transcription following it in parentheses. The point is we need to suitably name and record our utility modules, so that we can easily call on them and not come to the realisation we have mixed up all the transliteration and transcription modules when there is a need to use transcriptions. Wyang (talk) 00:46, 20 August 2016 (UTC)
- But at what point has there ever been a need to choose between them or display them both? If we have both a transcription and a transliteration module, would they ever both be used for anything? —CodeCat 01:10, 20 August 2016 (UTC)
- In translations. The purpose of having romanisations in translation sections is to inform readers how to say something in another language. Transcription modules, if they exist, should be preferentially called upon when romanising terms in translation sections. Wyang (talk) 02:08, 20 August 2016 (UTC)
- I think presenting both romanisations simultaneously in translations is confusing - readers are unlikely to understand what the difference between transliteration or transcription is, or the difference between Wylie transliteration and Tibetan Pinyin. I would prefer presenting the information in the entry itself, and presenting only what is necessary in translations, e.g. བརྒྱད (pr. gyaew). Wyang (talk) 09:13, 19 August 2016 (UTC)
- You can always make the words give a one line explanation of the difference on hover over. Korn [kʰũːɘ̃n] (talk) 09:37, 19 August 2016 (UTC)
- We have to be careful with using hover over though - it does not seem to be well-supported on mobile devices. Wyang (talk) 10:01, 19 August 2016 (UTC)
- I'd be fine with making the de-sysopping of CodeCat permanent. This is the latest in a series of abuses of the tools, ranging from bad blocks to making major changes without community consensus. Purplebackpack89 18:59, 18 August 2016 (UTC)
- IMO, any action like this needs to be by formal vote. (Note that there was already a vote to desysop CodeCat, which failed.) Benwing2 (talk) 20:22, 18 August 2016 (UTC)
- There should be no double standard. Either both Wyang and CodeCat have their sysop powers restored upon resolution of this problem, or they both have to reapply and be voted on. I do not understand, however, why CodeCat's edits are no longer autopatrolled. That should be fixed as soon as possible. —Aɴɢʀ (talk) 21:55, 18 August 2016 (UTC)
- I overlooked that detail. Fixed. Chuck Entz (talk) 02:11, 19 August 2016 (UTC)
- I can always trust you to make everything be about you and your grievances, no matter the subject. That type of attitude is a large part of what caused this mess in the first place- we need less of it, not more. Chuck Entz (talk) 02:11, 19 August 2016 (UTC)
- I oppose CodeCat's recent desysop. First off, where is the formal vote? Second of all, I've not really had any problems with her. I think her intentions really are good, but she may have made a mistake, just like all of us have. Jeez if I had a penny for every mistake I've made on the internet, I'd have like 10 bucks (which is a lot of pennies!). I feel like it's only if a person continues to make such mistakes somewhat consistently over a long period of time, or do something really bad (like delete the main page, for instance), that they should be desysopped because of behavior. I'd be willing to put up a vote to get her resysopped (hey look I made up a new word!) if necessary. Philmonte101 (talk) 22:33, 18 August 2016 (UTC)
- A vote would be required for a permanent desysop, but in this case, the desysopping was temporary in order to stop an ongoing edit war. Normal users would have received a temporary block for this, but admins can unblock themselves, making such a block useless if the admin is determined to circumvent it (and both of them did so in this case, before they were desysopped). Thus, I think the temporary desysopping was justified. --WikiTiki89 23:36, 18 August 2016 (UTC)
- It seems like you completely misunderstand the entire situation. The desysop was the emergency countermeasure to a serious edit war, not "a mistake". —suzukaze (t・c) 01:09, 19 August 2016 (UTC)
This topic must not die again. How are we going to set up the transcription/transliteration infrastructure? —suzukaze (t・c) 00:39, 21 August 2016 (UTC)
- Agreed. I think most people are in agreement that the status quo of one single transliteration is OK, and it's also OK to display two transcriptions/transliterations for languages like Tibetan and Burmese where the pronunciation and written script are far from each other and where the written form carries important etymological information that's missing from the modern pronunciation. This potentially could be done for Thai and Khmer as well although here I think it's less useful, as the difference between the two isn't so much, and the extra information in the written form is mostly only present in Sanskrit loanwords, which are fairly unproblematic etymologically. The main issue here is that Wyang disagrees and wants to impose a system where we show transcriptions in some places and transliterations in others, but I think pretty much everyone else is opposed to this so it won't fly. We could vote on this but Wyang has to be willing to accept the result, since he seems to be the main one who would implement it. Benwing2 (talk) 01:07, 21 August 2016 (UTC)
- My main points were: 1) transliteration and transcription should not be confused; 2) for languages which can both be romanised with transliterations and transcriptions, the functional modules should be distinguished and named appropriately; 3) using multiple romanisations is very confusing in translations and readers will not understand the difference; and 4) translation sections should use transcriptions to romanise terms, if transcriptions are contrastive with transliterations. I do not believe I am the only one who is in favour of this. Discussions should involve effective argumentation, not by merely accusing others of being outlandish. Wyang (talk) 01:36, 21 August 2016 (UTC)
- Eh, I find it reasonable to display only the relevant romanization to reduce clutter. The entry itself could show which is a transcription and which is a transliteration. —suzukaze (t・c) 01:55, 21 August 2016 (UTC)
- I prefer to see transcriptions as is currently done by the Thai module. Transliterations or symbol sequence can still be found in Thai entries. --Anatoli T. (обсудить/вклад) 02:32, 21 August 2016 (UTC)
- I suggest recording the transcription modules in language_data, creating a parallel Language:transcribe function in Module:languages, and making Module:links call on this function unconditionally or conditionally for certain languages. Wyang (talk) 01:26, 21 August 2016 (UTC)
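To make the proposal above concrete, here is a minimal sketch of what a parallel transcribe function might look like. It is written in Python for illustration only (the actual modules are Lua), and every name in it (`Language`, `transcr`, `PREFER_TRANSCRIPTION`, `romanize_for_link`) is a hypothetical stand-in, not the real Module:languages or Module:links API. The two lookup tables replace real romanisation modules and contain only values quoted in this thread.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Romanizer = Callable[[str], Optional[str]]

# Toy stand-ins for real romanisation modules (values quoted in this thread).
WYLIE = {
    "བརྒྱད": "brgyad",
    "རྣ་བ": "rna ba",
    "བཀྲ་ཤིས་བདེ་ལེགས": "bkra shis bde legs",
}
TIBETAN_PINYIN = {
    "བརྒྱད": "gyaew",
    "རྣ་བ": "naf-waf",
}


@dataclass
class Language:
    """Hypothetical language record with two separately recorded utilities."""
    code: str
    translit: Optional[Romanizer] = None  # letter-based romanisation
    transcr: Optional[Romanizer] = None   # pronunciation-based romanisation

    def transliterate(self, text: str) -> Optional[str]:
        return self.translit(text) if self.translit else None

    def transcribe(self, text: str) -> Optional[str]:
        return self.transcr(text) if self.transcr else None


LANGUAGES = {
    "bo": Language("bo", translit=WYLIE.get, transcr=TIBETAN_PINYIN.get),
    # A language with only one scheme recorded (placeholder romaniser).
    "xx": Language("xx", translit=str.upper),
}

# Assumption: the set of languages that call transcribe is configurable.
PREFER_TRANSCRIPTION = {"bo"}


def romanize_for_link(lang: Language, text: str) -> Optional[str]:
    """Use the transcription where configured, else fall back to transliteration."""
    if lang.code in PREFER_TRANSCRIPTION:
        result = lang.transcribe(text)
        if result is not None:
            return result
    return lang.transliterate(text)
```

With this layout, a language that records no transcription utility behaves exactly as it does under a single-romanisation setup, so the conditional call only changes output for languages that opt in.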
- Thanks for summarizing your position. Benwing2 (talk) 01:41, 21 August 2016 (UTC)
- Note that you haven't answered whether you will accept the community's consensus if it goes against yours. Benwing2 (talk) 01:41, 21 August 2016 (UTC)
- Fine. Bye bye. Wyang (talk) 01:42, 21 August 2016 (UTC)
- Christ. I was trying to play mediator but seem not to have been successful. Wyang, I do hope you will reconsider. No one wants to see you leave. Benwing2 (talk) 03:53, 21 August 2016 (UTC)
- Repeatedly using imagined “consensus” (your opinion) as majority tyranny to intimidate others is hardly mediation. I am perplexed how the above discussion could be interpreted as me spewing out nonsense and needing to be brought under control. I elaborated my various points in the discussion and there isn't really any opposing argument regarding either using transcriptions in translations or separating transliteration and transcription utilities for certain languages. Then there was your “summary” which identified the need to smother me without providing any counterarguments whatsoever. To my technical proposal, instead of commenting on the feasibility or reasonableness of this, you again tried to smother me by labelling whatever you believe in as “consensus” and coercing me to accept it. This is opposing for the sake of opposing, without bringing in any intelligent arguments to the discussion. This is bullying. How is བཀྲ་ཤིས་བདེ་ལེགས (zhacf-xih-dev-leh ) not confusing as the Tibetan translation of hello? It is frustrating to try to have people think sensibly and analytically about topics with the future in mind on Wiktionary. Look at how long it took for the community to come to its senses with the Chinese merger and now this; time and time again, it is regression led by the unfamiliar majority, without critically analysing proposals for what they are. Wyang (talk) 02:16, 22 August 2016 (UTC)
- Wyang, I am sorry things have gotten to the point that you think I am smothering you, bullying you, tyrannizing you, etc. It was not my intention to do any of these things, and I apologize for giving the wrong impression. How about we simply hold a vote on what is the best way to handle this? This is the Wiktionary way of doing things, and will more clearly reveal the consensus. Are you willing to lead that vote? Benwing2 (talk) 03:12, 22 August 2016 (UTC)
- Thank you. Wikipedia:Polling is not a substitute for discussion; Wikipedia:What_Wikipedia_is_not#Wikipedia_is_not_a_democracy has more arguments why decision making should be achieved by discussion and consensus, not votes. It is not sensible to expect voting to be the most suitable method of decision making, when the great majority of eligible voters are uninvolved and perhaps have no prior familiarity with how the transliteration-transcription distinction manifests itself in the romanisation of certain languages. It is akin to believing that User:Wyang will be a responsible voter on the topic of Akkadian romanisation; quite the contrary I have no previous experience with this and any stance I take regarding Akkadian romanisation could be very unwise for the project's future. In this discussion we should be appraising whether the preferential use of transcriptions in translations is favourable over transliterations for certain oriental languages (Tibetan, Burmese, etc.; example below), and consequently whether transliteration and transcription modules should be kept separate for these languages. Wyang (talk) 04:27, 22 August 2016 (UTC)
| | “eight” | “ear” | “hello” |
|---|---|---|---|
| Transcription | བརྒྱད (pr. gyaew) | རྣ་བ (pr. naf-waf) | བཀྲ་ཤིས་བདེ་ལེགས (pr. zhacf-xih-dev-leh) |
| Transcription (without tone letters) | བརྒྱད (pr. gyae) | རྣ་བ (pr. na-wa) | བཀྲ་ཤིས་བདེ་ལེགས (pr. zhac-xi-de-le) |
| Transliteration | བརྒྱད (brgyad) | རྣ་བ (rna ba) | བཀྲ་ཤིས་བདེ་ལེགས (bkra shis bde legs) |
- A vote is a good way out. The above speaker seems to underestimate the ability of voters to inspect and evaluate evidence and to consider arguments presented. Instead, he seems to commit what looks like an authority fallacy, the erroneous notion that only those already familiar with Thai can make a sound judgment about Thai romanization, be it transcription or "transliteration" narrowly construed.
- How consensus, mentioned in the above post, could ever be anything different from the result of a vote is beyond me. Consensus is general agreement, even if not unanimity, and I fail to see how a passing vote could ever show anything other than consensus. --Dan Polansky (talk) 08:02, 22 August 2016 (UTC)
- A vote, where the topic is unfamiliar to most of the eligible voters, can easily produce ill-advised consensus (“collective stupidity”). And yes, only those familiar with how the transliteration-transcription distinction manifests itself in the romanisation of certain languages can make a sound judgment about the issue. Calling for a vote is not the way out if that side shows no willingness to engage in discussions and present counterarguments to reach consensus. Cases of collective intelligence are when those familiar and knowledgable about the topic critically appraise the arguments for and against the proposal to attempt to reach consensus, not when the proposal is relayed to a vote to see which side is more numerous. Wyang (talk) 10:22, 22 August 2016 (UTC)
- You have the option to present "how the transliteration-transcription distinction manifests itself in the romanisation of certain languages". In fact, you have just done that in a table above. I trust most of the voters to consider such presentation in their voting decision. Discussion alone is not a mechanism of decision making; indeed, in this dispute, both parties think they are right and that they have presented the right arguments. Strength of argument is not a mechanism of decision making since there is no simple mechanism to assess strength of argument. --Dan Polansky (talk) 11:13, 22 August 2016 (UTC)
- Having information presented can never, ever supplant being knowledgeable about a topic. For instance, if I were provided with a comprehensive overview of the Akkadian language, I would still feel that I am in no position to provide any judgment on Akkadian romanisation. In fact, Wiktionary has witnessed many lessons learnt from having the unfamiliar collectively voice opinions and make decisions. The merger of Chinese is one - it was only adopted in 2014, more than 10 years after the launching of Wiktionary. So much more work could have been done in the meantime, and so much work still remains to be done to rectify the initial step in the wrong direction. The misinterpretation of transliteration is another one. All of these resulted from the lack of intelligent decision making from people who are familiar and knowledgeable about the topics. Discussion is a mechanism of decision making, arguably the most important one, and I would argue that no decision should be reached without substantial discussion. In this dispute, there has been a paucity of argumentation from one side throughout, and a paucity of active discussion of the topics at hand (whether the preferential use of transcriptions in translations is favourable over transliterations for certain oriental languages, and consequently whether transliteration and transcription modules should be kept separate for these languages). Wyang (talk) 12:05, 22 August 2016 (UTC)
- If you submit that after being presented the table, I still don't appreciate the difference between letter or character based transliteration and pronunciation based transcription, you are drastically underestimating my capacity to understand very simple things. Nor do I think other readers have failed to appreciate the distinction. Your fallacy is grave. --Dan Polansky (talk) 12:58, 22 August 2016 (UTC)
- Basically, you're fighting against any way for other editors to disagree with you. Is there any way Wiktionary editors could decide on a course that you disagree with that you'd accept?--Prosfilaes (talk) 13:43, 22 August 2016 (UTC)
- Transliteration / transcription is about providing a Latin-script handle for people who don't read the script. Anywhere but the entry for the word itself we should be using one consistent Latin-script version, and there we can and should provide every transcription/transliteration version now or once in standard use.--Prosfilaes (talk) 09:57, 22 August 2016 (UTC)
- Romanisation is indeed about providing a Latin-script handle for people who don't read the script. Nonetheless, the reasons we want to use romanisations are different in various parts of the project. It could be to show how the foreign-script word is pronounced (for example in translation sections), or how it is spelt literally (in etymological comparisons). At the moment, Tibetan is romanised with a transliteration method (Wylie transliteration), which is 100% automatable and is fantastic in etymological comparisons, as it faithfully represents how the word is spelt. However, there is no point showing brgyad as the Tibetan translation of eight, as readers will automatically assume the romanisation in translations is the word's pronunciation and attempt to pronounce it as such when communicating with locals. It makes more sense to simply use transcriptions to inform readers of the pronunciation in translation sections. Wyang (talk) 10:22, 22 August 2016 (UTC)
- Romanization is not for showing how words are pronounced. That's what the pronunciation section in the entry is for. If you're using Wiktionary's translation tables as a pronunciation key for communicating in that language, epic fail. In any case, showing བརྒྱད shows six rather different pronunciations; giving me gyaew instead of brgyad doesn't really help me pronounce the word.--Prosfilaes (talk) 13:43, 22 August 2016 (UTC)
- Are you sure?! You should then talk to Japanese editors and tell them they should transliterate こんにちは as "konnichiha". They've been doing it wrong all these years! Also, get in touch with some other dictionary publishers and tell them their Korean and Thai transliterations are wrong. --Anatoli T. (обсудить/вклад) 13:49, 22 August 2016 (UTC)
- Sarcasm aside, the rest of my statement still stands. If you want to know how a word is pronounced, look at the pronunciation key, not the translation table. Readers who "automatically assume the romanisation in translations is the word's pronunciation" are going to be consistently lost, and I fail to see how gyaew is going to help any reader who doesn't know Tibetan figure out the pronunciation is /cɛʔ¹³²/ or /bɡjat/ or /dʑɛʔ⁵³/ or /dʑed/ or /wɟjal/ or /hdʑal/. I in fact feel that any reader who could use gyaew to derive the correct pronunciation probably knows enough to figure out that brgyad isn't the pronunciation transcription they were looking for.
- Romanize as you will, but the value of having one romanization throughout Wiktionary and giving readers a consistent Latin-script name for a word outweighs the value of having different romanizations in different places.--Prosfilaes (talk) 15:53, 22 August 2016 (UTC)
- Now that CodeCat and her supporters have successfully driven away Wyang from the project, someone has to take over all the work he has been doing. Congratulations! I am disgusted with community's reaction to the problem. --Anatoli T. (обсудить/вклад) 02:21, 21 August 2016 (UTC)
- Anatoli, what do you think should have been done (or should be done) differently? Benwing2 (talk) 04:17, 21 August 2016 (UTC)
- I don't think it's anyone's fault but Wyang's, given that the leaving was in response to "Note that you haven't answered whether you will accept the community's consensus if it goes against yours."--Prosfilaes (talk) 13:07, 21 August 2016 (UTC)
- I know the technical subject is not the subject of this thread but anyway: could not the naming disagreement be solved by placing CodeCat code in Module:th-transcr rather than Module:th-translit? Then, the misnomer argument would no longer apply, and other argument against CodeCat's solution would have to be sought. --Dan Polansky (talk) 08:06, 22 August 2016 (UTC)
- The code was originally placed in Module:th (function getTranslit). Either Module:th or Module:th-transcript is fine, though either way the transcription module needs to be recorded in addition in Module:links or Module:languages/data2, as translit_module is a misnomeric parameter. Wyang (talk) 10:22, 22 August 2016 (UTC)
- A further question: are the modules currently present in Category:Transliteration modules in general transcription modules or are they overwhelmingly transliteration modules in the narrow sense, transcribing on the letter or character level? --Dan Polansky (talk) 08:13, 22 August 2016 (UTC)
- Just call it all 'Romanisation modules' and be done with it. Korn [kʰũːɘ̃n] (talk) 10:57, 22 August 2016 (UTC)
- I oppose calling those modules "Romanisation modules". There are elements of "transcription" (more or less) in many languages, most of them are standard transliteration. Here are examples of transliterations with elements of transcription, "the translit" shows more graphical transliterations of the same word (the actual spelling):
- Arabic: عربى (ʿarabiyy), translit: "ʿrbā", vocalised Arabic: عَرَبِيّ (ʕarabiyy)
- Greek: Μπούρμα (Boúrma), translit: "Mpoúrma"
- Russian: легкого (ljóxkovo) (phonetic respelling: "лёхково"), translit: "legkogo", spelling with "ё": лёгкого (ljóxkovo)
- Japanese: こんにちは (konnichiwa), translit: "konnichiha"
- Korean: 십육 (simnyuk) (phonetic respelling: "심뉵"), translit: "sibyuk"
- Hindi: फिल्म (film), translit: "philma", spelling with "nuqta": फ़िल्म (film)
- One can argue that abjad languages like Arabic, Persian, Urdu, Hebrew, etc. can't be transliterated but romanisations are still called transliterations. Persian and Urdu are seldom fully vocalised, so their graphical transliterations would be completely useless for someone wanting to know how to pronounce Persian or Urdu words. Some irregularities are handled by transliteration modules, for some terms manual (hard-coded) transliteration is required. If someone accuses Wyang for making up transliterations for Thai, check Paiboon dictionaries for terms like ชาติ (châat) (graphical transliteration: "châa-dtì") and see how these terms are transliterated there. --Anatoli T. (обсудить/вклад) 13:18, 22 August 2016 (UTC)
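The gap between a graphical transliteration and the romanisation actually in use can be shown with the Japanese example from the list above. The following is a minimal Python sketch, not the code of any existing module: the kana table covers only the five characters needed here, and the `OVERRIDES` table is a hypothetical stand-in for the manual (hard-coded) romanisations mentioned above.

```python
# Partial kana table: only the characters needed for this one example.
KANA = {"こ": "ko", "ん": "n", "に": "ni", "ち": "chi", "は": "ha"}

# Hypothetical per-term overrides for words whose pronunciation
# differs from their spelling.
OVERRIDES = {"こんにちは": "konnichiwa"}


def transliterate(word: str) -> str:
    """Character-by-character romanisation; unknown characters pass through."""
    return "".join(KANA.get(ch, ch) for ch in word)


def romanize(word: str) -> str:
    """Prefer a hard-coded override, else fall back to the graphical form."""
    return OVERRIDES.get(word, transliterate(word))


print(transliterate("こんにちは"))  # konnichiha
print(romanize("こんにちは"))       # konnichiwa
```

A purely character-level mapping cannot produce "konnichiwa", because the final は is written "ha" but pronounced "wa" in this word; some extra lexical information has to come from somewhere.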
- I see no problem with calling any rendering of a non-Latin word in Latin script a romanisation as a hypernym and only referring to it as a transliteration/transcription specifically when it's important to underline the difference. (Maybe leave a note in the documentation.) If a module is made which does both, as some parties propose, or if only one is reasonable for a language, why not go with an indiscriminating 'romanisation' so you can categorise them all easier and don't have to waste debate time on naming conventions? Korn [kʰũːɘ̃n] (talk) 14:37, 22 August 2016 (UTC)
- A propos of the voting/consensus matter, I am particularly well qualified to contribute as one of those ignorant of most aspects of the matters under discussion.
- The rationales for requiring a consensus of more than those knowledgeable about the languages in question is that it might interfere with the module architecture as currently designed, that the translation tables might become cluttered, and that some users (including those not intending to learn the languages and scripts in question) might be confused/overwhelmed/put-off by the transliteration-transcription distinction.
- What makes sense for entries in the languages in question is a matter best left to the contributors in those languages IMO. If our module architecture did not anticipate the need for a transliteration-transcription distinction, then so much the worse for the architecture. We cannot have the module architecture unreasonably preventing contributors from contributing in the manner that is best for the languages in question by their lights. IOW, we should not have the tail wagging the dog. How to apply this principle is left as an exercise to the reader.
- I can only beg that the translation table matters do not make the tables cluttered and confusing for all to deliver a questionable benefit to some. DCDuring TALK 12:40, 22 August 2016 (UTC)
- @DCDuring I agree! But wait, maybe CodeCat is eager to make a new module for Khmer or Burmese language and apply their "best practice" there? Well, Wyang has started, somebody can make those modules perfect!
- Seriously, I perfectly understand Wyang's frustration. He created a WORKING SOLUTION for complex Asian languages that nobody had even attempted before. Now, someone starts changing modules without any discussion with him. I would be very upset if someone tried to change my work without first checking with me. Why do people even think they should both be blocked? How would YOU feel if you were in the same situation? I don't want CodeCat blocked but I think she is absolutely wrong here. Yes, the location of the code can be reviewed and discussed, agreed first and only THEN changed, if agreement is reached.--Anatoli T. (обсудить/вклад) 13:29, 22 August 2016 (UTC)
- They were both blocked/desysopped because they both used their admin powers to continue an edit war. It's not a punishment for being on the wrong side of the argument, it's a method for suppressing disruptive behaviour. Korn [kʰũːɘ̃n] (talk) 14:43, 22 August 2016 (UTC)
- @Dan Polansky Dan, can you create a vote to short-circuit endless arguing? The vote should have two choices: (1) Continue the current situation where Module:links enforces the constraint that a single romanization (which may be a two-part transcription/transliteration romanization, on a language-specific basis) is used for all types of links; (2) Modify Module:links to allow different romanizations for different types of links (e.g. etym links vs. translation links). The former is User:CodeCat's position, the latter is User:Wyang's. Set the discussion period and vote start/end dates however you think most appropriate. Benwing2 (talk) 15:34, 22 August 2016 (UTC)
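The two vote options above differ only in whether the kind of link is allowed to influence the romanisation. Here is a minimal Python sketch of that difference; the function names, the `LinkKind` enum and the "transliteration, transcription" ordering of the two-part form are all assumptions made for illustration, not the behaviour of Module:links.

```python
from enum import Enum
from typing import Optional


class LinkKind(Enum):
    ETYMOLOGY = "etymology"
    TRANSLATION = "translation"


def romanize_option_1(translit: str, transcr: Optional[str],
                      two_part: bool) -> str:
    """Option 1: one romanisation for every kind of link.

    For languages flagged as two-part, both forms are shown everywhere
    (the ordering here is an assumption).
    """
    if two_part and transcr:
        return f"{translit}, {transcr}"
    return translit


def romanize_option_2(translit: str, transcr: Optional[str],
                      kind: LinkKind) -> str:
    """Option 2: the kind of link selects the romanisation."""
    if kind is LinkKind.TRANSLATION and transcr:
        return transcr
    return translit


# Tibetan "eight", with the values quoted earlier in this thread.
print(romanize_option_1("brgyad", "gyaew", two_part=True))
print(romanize_option_2("brgyad", "gyaew", LinkKind.TRANSLATION))
print(romanize_option_2("brgyad", "gyaew", LinkKind.ETYMOLOGY))
```

Under either option, a language with no separate transcription falls back to its single romanisation, so the choice only affects languages that have both schemes recorded.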
- I agree with what User:DCDuring said above. My proposal is to keep transliteration and transcription utilities modules separate in Module:languages/data2 and similar modules, for languages possessing two contrastive sets of romanisation schemes. Notable examples include Tibetan (Wylie transliteration vs Tibetan Pinyin), Burmese (MLCTS vs BGN-PCGN), Thai (ISO 11940 vs Paiboon) and Korean (Yale vs RR). The rationale is that the module infrastructure should anticipate the need for a transliteration-transcription distinction in certain languages, and not unreasonably prevent contributors of these languages from contributing in a manner that is best for the languages in question by their lights. I am in no position to singlehandedly advocate that language X should use romanisation Y for a certain purpose without meticulous discussion having taken place surrounding language X, which needs to happen in separate language-specific discussions. Still, there is a lack of adequate in-depth discussion concerning the issue, especially from arguments against - why do the harms outweigh the benefits if we keep the transliteration and transcription modules separate for these languages? Wyang (talk) 23:56, 22 August 2016 (UTC)
- If they are kept separate, then there needs to be a functional reason. The distinction needs to have a consequence in how our modules work and treat each one, and where each one appears. I don't think it particularly desirable to have multiple romanization schemes in different parts of Wiktionary, this just confuses users. The system we have now, with a consistent representation across Wiktionary, is just fine. We don't need two systems when one suffices. —CodeCat 00:13, 23 August 2016 (UTC)
- The whole point of Wyang's argument is that two systems are already in broader use for certain languages, with each system used for specific purposes. I.e., one romanization scheme doesn't suffice, for certain specific languages. ‑‑ Eiríkr Útlendi │Tala við mig 00:40, 23 August 2016 (UTC)
- The question is “why do the harms outweigh the benefits if we keep the transliteration and transcription modules separate for these languages”? It does not make sense to use multiple romanisation schemes for Greek, Russian, Georgian, Armenian, etc., but the languages of question are languages which contrast these modes of romanisation prominently. Does it make sense to keep transliteration and transcription modules separate in the module infrastructure for these languages? Yes. Many editors of these languages have been conscious of the need to use the appropriate romanisation in certain contexts. See for example, how User:Angr changed the romanisation of the Burmese word to a transcription at elephant. Our Korean romanisation scheme is the transcriptive Revised Romanisation scheme, which is official in South Korea (also hidden under the misnomer Module:ko-translit). User:Visviva, our first prolific Korean contributor, created the entry 미끄럽다. Note the differential use of a transcriptive romanisation in the main text (mikkeureopda) and a transliterative romanisation in etymology (muys.kulepta). Considerations of the arguments for and against need to be made in the context of these script-pronunciation-discordant languages. Provided the romanisation is well-annotated, such as 믯그럽다 (Yale: muys.kulepta) at 미끄럽다, the appropriate, purpose-oriented use of romanisations is hugely beneficial to dictionary building, for these languages. Wyang (talk) 00:54, 23 August 2016 (UTC)
- I will just add, Wyang, that you keep claiming that the people opposed to you are giving no real reasons for doing so, but you yourself have given no reasons why consistently using a dual romanization scheme, like I've suggested as a compromise between your view and CodeCat's, is unacceptable, other than the unsupported claim that it's confusing for new users. Benwing2 (talk) 01:09, 23 August 2016 (UTC)
- FWIW, my bias is to using the more-phonetic romanization (I recognize this is not IPA-grade, but it *is* generally closer to how an English speaker would say something) in translation lists and similar locations, and using the strict transliteration (i.e. letter-for-letter) romanization in etymology sections and other discussions of the term's etymology and historical development. I.e., I disagree with Benwing's suggestion, and I don't want to see both systems used in all cases. This would be similar to what is already in practice for Korean. ‑‑ Eiríkr Útlendi │Tala við mig 01:18, 23 August 2016 (UTC)
- Exactly. The information presented in a dictionary should be succinct; dual romanisation in translations is infoxication. The reason users look up translations is to answer their questions of “how do you say ... in ...?”, and romanisation in translations should cater to the need of users. Let's look at how other translation dictionaries do this: the only previewable English-Tibetan dictionaries on Google Books are 1, 2 and 3 and all are using transcriptions only. Transcriptions answer the users' questions directly, without additional romanisations to complicate their information processing (below). Wyang (talk) 02:57, 23 August 2016 (UTC)
- “hello”: བཀྲ་ཤིས་བདེ་ལེགས (pr. zhacf-xih-dev-leh) vs བཀྲ་ཤིས་བདེ་ལེགས (zhacf-xih-dev-leh )
- “birthday”: འཁྲུངས་སྐར (pr. chungf-gaaf) vs འཁྲུངས་སྐར (chungf-gaaf )
- “brain”: ཀླད་པ (pr. laef-baf) vs ཀླད་པ (laef-baf )
- Do you have any evidence that users look up translations solely for the pronunciations of things? The main reason that the system with multiple romanisations is confusing is that not a single one of them is explained.
- “brain”: ཀླད་པ (pr. laef-baf, sp. ) Korn [kʰũːɘ̃n] (talk) 10:08, 23 August 2016 (UTC)
- There is plenty of evidence for that. Many transliterations are the efforts of a few people who took part in their development over a period of time. Apart from Wyang and myself, you can ask people like Eirikr, Haplology (Japanese), TAKASUGI Shinji (Japanese and Korean), Aryamanarora (Hindi), Benwing2 (Russian), Saltmarsh (Greek) why transliterations are the way they are. They use some phonetic elements; they are not IPA and are not supposed to convey the pronunciation accurately. --Anatoli T. (обсудить/вклад) 10:26, 23 August 2016 (UTC)
- Anatoli, I don't understand what you are trying to tell me with your comment. Wikipedia says that Tibetan Pinyin does not mark tone. Our pronunciation sections use the label 'Tibetan Pinyin' but according to Wyang, the superscript letters we see are tone marks. What kind of system is that? It seems in need of relabeling. Korn [kʰũːɘ̃n] (talk) 10:34, 23 August 2016 (UTC)
- Korn: See Tibetan pinyin#References. This is the modified version of the official Tibetan Pinyin, with tone letters. The Wylie transliteration scheme is the gold standard of Tibetan romanisation; it and its variants are used in almost all scholarly publications. But it is interesting that all of the three previewable English-Tibetan dictionaries on Google Books use transcriptions only to romanise their Tibetan translations. Wyang (talk) 10:56, 23 August 2016 (UTC)
- My point was that for each language, the interested and knowledgeable editors decided what and how to go about transliterations for specific languages. I could bring series of discussions about Korean. Wyang implemented most of it. The phonetic transliteration (RR) was adopted - officially recommended in South Korea. There were relevant long discussions, decisions made. Now with the argument between Wyang and CodeCat everybody joined with their opinions but cared little when the actual problems were discussed and solved. --Anatoli T. (обсудить/вклад) 11:08, 23 August 2016 (UTC)
- Wyang, there should be a note about what the symbols mean on About: Tibetan, including what the tone marks mean, this information is not easily retrievable. Anatoli, I assume the reason for that is that now the community is forced to take note of the situation because it's brought to the Beer Parlour whereas before it was discussed amongst editors of the language. Korn [kʰũːɘ̃n] (talk) 14:55, 23 August 2016 (UTC)
- Absolutely; have to work on it later. Wyang (talk) 21:16, 23 August 2016 (UTC)
(September) Sysop
Can I have my sysopship back please? It's getting very frustrating not being able to properly patrol or edit protected pages. I also ask for Module:links, Module:th and Module:th-translit to be restored to the version that puts the transliteration code in Module:th-translit (where it ought to be) rather than Module:links, and ask that this be enforced by all editors. There are currently negotiations for a vote for Wyang's proposal, so it would be inappropriate for him to restore his version and continue the edit war before a vote on the matter has been held. —CodeCat 20:42, 5 September 2016 (UTC)
- For the record, negotiations are happening at Wiktionary:Votes/2016-08/Enabling different kinds of romanization in different locations and the vote talk page.
- I support giving back the tools to CodeCat, and to Wyang too. I support restoring modules and templates to the previous version. Whatever the merits of having two separate romanizations (I might even vote support!), I believe the status quo should prevail and that the new proposal should be properly discussed before implementation, especially in case of a huge disagreement like the one that we have now. --Daniel Carrero (talk) 20:48, 5 September 2016 (UTC)
- Agreed. This may also be a good reason to implement Template Editor privileges here. —Justin (koavf)❤T☮C☺M☯ 21:55, 5 September 2016 (UTC)
- Support. Why was CodeCat ever desysopped? --Florian Blaschke (talk) 22:20, 5 September 2016 (UTC)
- There are two things that have to happen before I restore sysop rights:
- There has to be support from the community for it. This has been trickling in, and probably won't be a barrier.
- I have to be convinced that both parties will refrain from any actions that might start the edit war again.
- The negotiations at Wiktionary talk:Votes/2016-08/Enabling different kinds of romanization in different locations are a start, but they mostly consist of some variant of "what about this?", followed by some variant of "you're not getting my point". We need to get beyond talking past each other and start talking about serious proposals. We also need to avoid dwelling on past behavior and start discussing what the future is going to look like. Chuck Entz (talk) 22:30, 5 September 2016 (UTC)
- FWIW I am OK with restoring sysop privileges, provided both Wyang and CodeCat agree not to resume edit warring. I also think that Module:links should be restored to the status quo ante, with an appropriate vote to resolve the matter. In fact I asked Dan to create this vote in order to try to resolve what I thought was the root of the conflict between CodeCat and Wyang. As it happens, Wyang has objected to the vote for various reasons, some of which concern whether the issue of the vote is the right one to be voting on and some of which object to having a vote at all. The amount of contention here indicates we clearly need a vote, but I'm open to rewording it. However, this issue is orthogonal to the issue of sysop privileges. Benwing2 (talk) 22:32, 5 September 2016 (UTC)
- My only concern is the restoration of existing practice to the Thai transliteration module, and the elimination of custom code from Module:links. If that is accepted then there won't be any edit warring from me, though I do ask what course of action I should take if Wyang restores his version of the modules without a vote to support it. The reason the edit war happened in the first place was because Wyang kept reverting me and no steps were taken to stop him, and he ignored all attempts I made to convince him to stop and wait for consensus/vote. So if Wyang is sysopped again, there needs to be a contingency plan in case he does the same again; some kind of guarantee that others will also step in instead of just me. —CodeCat 22:42, 5 September 2016 (UTC)
- Translation: You want us to take your side on the edit war and enforce it for you. I happen to prefer your version, but this kind of talk isn't very helpful. Chuck Entz (talk) 23:27, 5 September 2016 (UTC)
- Pretty much, yes. The alternative would be endorsing Wyang's edits without a vote to show such endorsement by the wider community. That doesn't seem like a proper option given how contentious the issue is. Major changes that are contentious should be voted on, yes? —CodeCat 23:55, 5 September 2016 (UTC)
- (edit conflict) One part of the problem is figuring out exactly what the status quo ante would be: this started when Wyang added his code to Module:links to implement a very useful change for Thai transliterations/romanizations. CodeCat later extensively reworked the module, in the process removing the code (I'm not sure whether she noticed the code or recognized what it was at the time). This broke a number of Thai entries and several Thai editors asked what was going on, so Wyang added the code back. It's possible that CodeCat, if she was unaware of the earlier code, thought this was something entirely new; she certainly acted as if it were. She reverted his edit, and didn't handle the dispute very well. Wyang got upset and the edit war started. Wikitiki89 came up with a compromise that moved the code out of Module:links, which CodeCat adopted, but Wyang didn't.
- Do we revert it to:
- The state before Wyang's first edit? That would wipe out CodeCat's reworking of the module.
- The state before Wyang's second edit? (Dan Polansky's choice, if I understand correctly). That would break a number of Thai entries.
- The state after Wyang's second edit? (Wyang's choice)
- The state after Wikitiki89's edit? (CodeCat's choice)
- The last two are the only ones that don't break anything, and either could be considered the status quo ante, depending on how you interpret Wyang's first edit. Chuck Entz (talk) 23:15, 5 September 2016 (UTC)
- I don't see any point in restoring anyone's admin rights until the substance of the disagreement is resolved. As I see it, the destructive turn the conflict took is a serious matter, affecting important core software. If the talent involved in the matter cannot resolve it, perhaps someone else should. DCDuring TALK 23:44, 5 September 2016 (UTC)
- There's already a vote that attempts to propose Wyang's changes so that a formal consensus can be made. But Wyang doesn't seem very cooperative in formulating the proposal, so it's mostly stuck. Since Wyang thus has no consensus for his proposed reinterpretation of transliteration modules, the status quo remains, which is that transliteration modules provide any kind of romanisation deemed desirable. This is what my and Wikitiki's edits attempted to do. If Wyang does not agree to a vote but forces his own interpretation through edit warring, what can be done? —CodeCat 23:59, 5 September 2016 (UTC)
- @Chuck Entz: Hmm, when I wrote my comment I didn't check out the whole history carefully. Since the argument is about the presence or absence of a particular piece of Thai-specific code in Module:links, and if I'm not mistaken this didn't exist before the whole edit war started, then logically the status quo ante shouldn't include it. However, I don't completely understand the ramifications of this. Wyang obviously put the code there for a reason; but CodeCat and Wikitiki seem to believe that the same functionality can be achieved with this code in Module:th-translit. If this is true, then it should be taken out pending a vote to decide the underlying issues. Benwing2 (talk) 00:19, 6 September 2016 (UTC)
- The reason the code was placed there by Wyang is because he believes that transliteration modules should only transliterate strictly: character by character. He therefore objects to the modification Wikitiki made, but at the same time, his reinterpretation of transliteration modules is not the agreed status quo. I argue that under the consensus interpretation, a vote is necessary for Wyang's proposal to restrict transliteration modules to just strict transliteration, and have an alternative module system/infrastructure for non-transliterative romanizations. I also believe that under this interpretation, the Thai transliteration code should be placed in Module:th-translit until a vote shows consensus to the contrary. And additionally, even if a vote passes to have separate infrastructures in our modules for transliteration and other types of romanization, the specific code for Thai does not belong in Module:links, but should be handled by said proposed infrastructure in a more general manner. —CodeCat 00:34, 6 September 2016 (UTC)
- There was no consensus. What is being repeatedly cited as "consensus" is how people perceive romanisations from the angle of languages not making such a distinction. Truth is, appropriate and purpose-oriented romanisation has been the norm in languages with a script-pronunciation discordance, and it has been the consensus for these languages. See for example the differential use of transcriptions and transliterations ({{ko-etym-native}}) in 미끄럽다 (mikkeureopda), by User:Visviva who created the bulk of our Korean entries. The core issue is “why do the harms outweigh the benefits if we keep the transliteration and transcription modules separate for these languages”, and the conclusion from the previous discussion is: "the envisageable harm is minimal and benefits are extensive". There is a demonstrable need to keep the systems separate - our language editors routinely apply different romanisations when editing these languages, and printed dictionaries of these languages show that authors regard the different modes of romanisation as suited to different purposes. The issue is not whether we should use romanisation X in translations right now; the issue is whether the system should be maintained to take this need into consideration and not deliberately confuse the concepts "transliteration" and "transcription" (where they truly make a difference), so that future edits in these languages are not discouraged. Wyang (talk) 03:20, 6 September 2016 (UTC)
- What happens now? —CodeCat 19:57, 9 September 2016 (UTC)
- This is up to Chuck. I'm not sure where things stand currently. Benwing2 (talk) 16:17, 11 September 2016 (UTC)
(September) Restoration of Sysop Privileges
Given the amount of time with no action on the disputed issue, I'm prepared to restore sysop privileges to @CodeCat and to @Wyang if they will commit to not editing Module:links except for changes both agree to beforehand, at least until both agree that the conflict is resolved.
Please state here whether you agree to this. Thanks! Chuck Entz (talk) 23:58, 11 September 2016 (UTC)
- Can someone else make the changes, then? If neither of us is allowed to edit it, that implies that there is a consensus for Wyang's preferred version. The reason I continue to press this is because I fear that if I don't, nothing will be done about it yet again. —CodeCat 01:39, 12 September 2016 (UTC)
- @CodeCat, maybe you could provide a link to the exact revision of the module which you would say is the correct status quo? --Daniel Carrero (talk) 01:45, 12 September 2016 (UTC)
- , , . These three revisions ensure that the Thai transliteration code is placed in the Thai transliteration module where it belongs (according to the current consensus on treatment of transliteration modules), rather than in Module:links where it does not belong. —CodeCat 01:49, 12 September 2016 (UTC)
- Do other people agree with reverting the modules to these exact versions?
- I'll repeat what I said in another discussion:
- I support restoring sysop privileges to both CodeCat and Wyang.
- I support reverting the modules to the status quo, and in the face of this huge disagreement, I urge @Wyang to help in the creation of the vote before implementing any new proposal.
- Correct me if I'm wrong: I seem to remember that some entries were already edited based on Wyang's system and reverting the modules to the status quo would break the entries. Still, IMO the status quo should prevail and the entries should be fixed. --Daniel Carrero (talk) 02:03, 12 September 2016 (UTC)
- I also support restoring sysop privileges to both CodeCat and Wyang. In addition, I support restoring the modules to the status quo. Unfortunately, as Chuck pointed out, it's not totally obvious what this is, but in my mind, since the edit war specifically concerned references to Module:th in Module:links (+ supporting code), and since the references to Module:th weren't present in the module beforehand, the status quo should not include them: Specifically, it shouldn't include Module:th, 'phonetic_extraction' or the code that references 'phonetic_extraction'. Benwing2 (talk) 02:50, 12 September 2016 (UTC)
- Back then there wasn't even any automated romanisation for Thai; restoring the previous version would simply wipe out the romanisations in thousands of Thai entries. I'm really confused. There was no consensus for CodeCat's edit, despite her claiming there is. I was only adding in transcription support at Module:links (which was lacking transcription support) per the consensus of the Thai editors, in a manner that is most appropriate for further editing in Thai and other similar languages. If you do not agree, voice your arguments other than voicing “I don't like it”! I spent so much effort arguing for why storing transcription and transliteration modules separate is beneficial in the long run, and what I got was non-participation and the indifferent “so what happens now?” (1, 2). Decision-making should not be like this - having people voice their opinions without having a critical appraisal of the arguments for and against makes the decisions arrived at highly prone to unintelligence. It shouldn't be the case that you can say your preference and expect it to be enacted without giving a reason. Why do the harms outweigh the benefits if we keep the transliteration and transcription modules separate for these languages, when our language editors routinely apply different romanisations when editing these languages, and printed dictionaries of these languages show that authors regard the different modes of romanisation as suited to different purposes? If it cannot be demonstrated that the harms do outweigh the benefits for these languages and there is no willingness to demonstrate, there is no justification for enacting this opinion or restoring the “previous version” which abolishes the functionality altogether. Wyang (talk) 03:54, 12 September 2016 (UTC)
- (edit conflict) We're trying to achieve a compromise here. In my book, adopting a version more heavily weighted against one side than the other side even asked for isn't a compromise. What you're asking for basically breaks a large number of Thai entries that were modified in good faith by the Thai community after Wyang provided the capability for it with his first edit. Regardless of how things are going to end up eventually, that's too much collateral damage to make it a reasonable first step toward a compromise. Remember the story of how Solomon pretended he was going to cut a baby in half in order to see from the reaction of the two claimants which was the real mother? This is like cutting the baby in half first. Chuck Entz (talk) 12:41, 12 September 2016 (UTC)
- So, over at the Grease pit, @Vahagn Petrosyan had mentioned that many languages require both transliteration and transcription. Do we think that the inclusion of both, if the transcription differs, could kill two birds with one stone? —JohnC5 17:05, 12 September 2016 (UTC)
- That's what Wiktionary:Votes/2016-08/Enabling different kinds of romanization in different locations is supposed to address. But it's not going anywhere. —CodeCat 17:13, 12 September 2016 (UTC)
- @Wyang: I have a question for you, and I'm sorry if you already explained it somewhere. I'm going to ask anyway: Given the benefits of your proposal that you explained, don't you think that Wiktionary:Votes/2016-08/Enabling different kinds of romanization in different locations has a good chance of passing? More importantly, is the linked vote satisfactory for you, or would you change something in the proposal? --Daniel Carrero (talk) 04:03, 12 September 2016 (UTC)
- @Daniel Carrero: I believe the answer to your question is on the vote's talk page. —suzukaze (t・c) 04:05, 12 September 2016 (UTC)
- OK, but Wyang may still choose to help build the vote. If the vote explains the proposal correctly and passes, it will mean we are all on the same page and understand the implemented proposal.
- In the previous discussion, Chuck Entz presented a few possible versions of the status quo to choose from. Is anyone interested in discussing what exactly is the right one? If no one objects, I'll just trust CodeCat and revert the three modules to the revisions that she mentioned. --Daniel Carrero (talk) 10:04, 12 September 2016 (UTC)
- Why? I have explained the reasons for my objection well enough above, and in the previous discussions. Why do the harms outweigh the benefits if we keep the transliteration and transcription modules separate for these languages, when there is ample evidence suggesting the contrary? Nobody was interested in engaging in discussion to argue for the version that you are trying to restore. Why is reverting to a version which cannot be justified even being considered? Wyang (talk) 11:35, 12 September 2016 (UTC)
- Please understand: It's not about whether the proposal is good, it's about whether other people agree with it, and are on the same page. That's why some of us are interested in having a vote, which would explain and record the proposal, and let others judge its merits. To put it another way: if the proposal is really good, the vote is probably going to pass and we'll do exactly as you proposed. --Daniel Carrero (talk) 11:57, 12 September 2016 (UTC)
- We haven't had votes on the architecture of the modules, so I don't see what makes the "status quo ante" Wyang so sacred. If Wyang took the initiative to overcome a language(s)-relevant limitation of the module architecture, it seems to me that it merits our respect. If our architecture doesn't provide the required flexibility without some kind of kludges, so much the worse for the existing architecture. In this and on many other matters I favor accommodating decentralized decision-making. DCDuring TALK 12:33, 12 September 2016 (UTC)
- Wyang's changes don't do anything that could not be achieved within our existing module framework. The three edits Wikitiki made to the modules, and which I proposed they be restored to, show that. The only reason he did it is because he doesn't like the framework (specifically, that transliteration modules do other kinds of romanization too). Therefore, I proposed that if he doesn't like our current consensus on what transliteration modules do and how they are used in other modules/templates, he should make a vote to change it. So far he hasn't shown any interest. Most of what has happened since then is several editors trying to get Wyang to cooperate on formulating a vote, while Wyang himself is skirting around the issue and avoiding a vote. Is this appropriate behaviour when someone's changes have been challenged? And would it be appropriate to allow said changes to remain in place when they have been challenged so heavily and the user is not prepared to let the community decide per vote on the issue? —CodeCat 13:56, 12 September 2016 (UTC)
- As I said above, the only point revolved around in the “no”-camp is “I don't like it”, without any explanation given. Why do the harms outweigh the benefits if we keep them separate in these languages, when there is ample evidence suggesting the contrary? You keep citing your version as consensus, but where is the vote showing that? Using purpose-suited romanisation is the consensus for languages with a transcription-transliteration distinction ({{ko-etym-native}}, etc.). If you do not like this practice, you should bring this up in a discussion and explain your reasoning, aside from saying “I don't like it”. There is no point blaming the implementer for implementing what was already a custom in languages you are not involved in, and barring the improvement in the module infrastructure for these languages. Wyang (talk) 22:55, 12 September 2016 (UTC)
- As I said before, there is no "no"-camp, just people that you need to convince. The burden of proof is on you. Once that's done, the vote should be able to pass. We are repeating the same arguments over and over. This discussion is going nowhere. I reverted the three modules to the revisions chosen by CodeCat. Feel free to discuss if I should have done something different. --Daniel Carrero (talk) 23:07, 12 September 2016 (UTC)
- I reverted the edits I could revert. Discussion is still ongoing; you cannot voice your opinion and expect it to be enacted without justifying it. Any unilateral measure taken constitutes disrespect to the participants of discussion. Wyang (talk) 23:14, 12 September 2016 (UTC)
- "you cannot voice your opinion and expect it to be enacted without justifying it" ... ha! I see some irony there, and it's amusing. But it may be just me. Seriously, if I did something wrong please someone step up and say what to do. I restored the modules again. --Daniel Carrero (talk) 23:21, 12 September 2016 (UTC)
- You are insane. You did not even know what the contention was, and yet you feel empowered to trample on whatever modules you can get your hands on simply because you can. Wyang (talk) 23:28, 12 September 2016 (UTC)
- Good grief. The diff you linked to does not indicate that I'm completely clueless about the contention. It does indicate that I was politely asking you for your opinion on the best way to word a vote. --Daniel Carrero (talk) 23:34, 12 September 2016 (UTC)
- Asking for my opinion on the best way to word a vote... when it should not be relayed to a vote at all, because there is no argument input from people arguing we should confuse transliteration and transcription. There are numerous arguments for keeping the modules separate being put forth in the discussion, such as (1) our editors in these languages already implement the practice of using purpose-suited romanisation; (2) printed dictionaries in these languages use differential romanisation and deem the different modes of romanisation as suited to different purposes; (3) it conforms to existing language-specific module infrastructure developed for these languages; (4) it is prospectively designed, and does not discourage further improvements in these languages. But the arguments against? One: "I don't like it". It is unfair to use a vote to end a discussion, when one side is only interested in expressing their opinion and not giving any rationales for it. It is facilitating mindless decision-making. Wyang (talk) 23:46, 12 September 2016 (UTC)
You have failed to provide an accurate view of the opinions of other people. "But the arguments against? One: 'I don't like it'" is a straw man.
Could you please change your mind and be willing to cooperate in the vote? We could add your 4 points in the rationale. --Daniel Carrero (talk) 23:53, 12 September 2016 (UTC)
- If I have failed to provide an accurate view of the opinions of other people, then could you please list the arguments against? We are still at a stage in the discussion where we are struggling to list any arguments from one side. This is way too immature to call on votes. Votes are evil. It allows such disproportionate argumentation to be easily distorted to produce an unintelligent consensus for the reason of sheer numbers only. Wyang (talk) 00:57, 13 September 2016 (UTC)
(September) What Needs to Happen
The main obstacle to resolving this dispute is that neither CodeCat nor Wyang trusts the process, and for good reason. In past disputes, we've had an unfortunate tendency to put out the immediate fires and then sweep the issue under the rug. Faced with this possibility, both have tried to get things the way they want them so that they don't lose out when everyone gets tired of the issue and moves on. The one thing we don't want to do is to jump in and take unilateral action: that will just confirm the worst fears of the one who loses out.
We need to resolve this now, before it becomes out of sight, out of mind. The way to do this is to get down to discussing what the new configuration should look like, in concrete terms.
Notice I said "discussing". We simply haven't gotten to the point of drafting votes, because we're still all talking past each other; any vote will most likely not address the issues needed to resolve the dispute and will just complicate things. The correct sequence is to come to a consensus, and then draft a vote, if necessary.
I can't do any more at the moment because I'm still at work and it's really late. I'll spend some time on my way home trying to come up with a way to get the discussion started. Please don't blow things up in the meanwhile... Chuck Entz (talk) 02:25, 13 September 2016 (UTC)
- I would support passing additional information (such as the name of the calling template and perhaps more) to the romanization module. This would make the Thai-specific code in Module:links that started this whole dispute unnecessary. I still think that there should only need to be one romanization module even if it provides both transliterations and transcriptions. --WikiTiki89 13:50, 13 September 2016 (UTC)
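As an illustration only, the idea of handing the calling template's name to a single romanization module could be sketched as below. This is Python rather than the Lua the modules are actually written in, and every name here (`RomanizationContext`, `TRANSLITERATION_CONTEXTS`, the `"etyl"` template name, the two stand-in functions) is a hypothetical placeholder, not the existing Module:links or Module:th-translit interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RomanizationContext:
    """Extra information the caller hands to the romanization module (hypothetical)."""
    lang: str
    calling_template: Optional[str] = None  # illustrative template name, e.g. "etyl"


def strict_transliteration(text: str) -> str:
    # Stand-in for a spelling-based romanization; a real module would map characters.
    return f"translit({text})"


def phonetic_transcription(text: str) -> str:
    # Stand-in for a pronunciation-based romanization.
    return f"transcr({text})"


# Assumed policy: which calling templates prefer the spelling-based form.
TRANSLITERATION_CONTEXTS = {"etyl"}


def romanize(text: str, ctx: RomanizationContext) -> str:
    """Single entry point; the choice of mode lives in the language module,
    so the generic linking code needs no language-specific branches."""
    if ctx.calling_template in TRANSLITERATION_CONTEXTS:
        return strict_transliteration(text)
    return phonetic_transcription(text)
```

Under these assumptions the generic caller only forwards the context, and any decision about which kind of romanization to return stays inside the language's own module.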
- Another detail that hasn't been mentioned much is that Wyang wants to pass the link target to the Thai module in order to find the transcription on the linked page. There are numerous reasons why this is a bad idea. Wyang has mentioned that the performance impact of reading the text of a page in a module is not as bad as people might assume at first, but that is not even the only issue. The romanization module must be able to romanize full unlinked sentences (such as in usage examples) and even redlinks. This cannot happen if the module depends on the existence of the link target. Not only that, but it would produce incorrect results for links with alt text, since it would transcribe the linked form and not the displayed form. --WikiTiki89 13:55, 13 September 2016 (UTC)
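The three failure cases named above (unlinked text, redlinks, and links with alt text) can be made concrete with a small sketch. It is a toy model in Python, not the actual module code: `PAGES` stands in for entries that exist and carry pronunciation data, and both function names are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical stand-in for existing entries and the transcription stored on each.
PAGES: Dict[str, str] = {"lemma": "LEM-ma"}


def transcribe_from_target(target: Optional[str]) -> Optional[str]:
    """Look the transcription up on the linked page; None when there is nothing to read."""
    if target is None:
        return None
    return PAGES.get(target)


def romanize_link(target: Optional[str], display: str) -> Optional[str]:
    # The lookup sees only the link target, never the displayed text,
    # so `display` cannot influence the result.
    return transcribe_from_target(target)
```

In this model, `romanize_link("missing", "missing")` and `romanize_link(None, "a full sentence")` both yield nothing, while `romanize_link("lemma", "inflected")` returns the transcription of the target rather than of the form the reader actually sees.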
- Is the reason for passing additional information such as the name of the calling template so that the Thai module can show a transliteration in etymologies and a transcription in translation sections? I'm opposed to doing that; I think it would be extremely confusing. Better to show both types of romanization in all places, as I've mentioned before. Allowing this would be a major user-facing change and needs a vote (that's why I had Dan create the vote). If this vote passes, then I think we should still require that transcriptions are always shown, and transliterations are also shown in the places where it's desired (e.g. etymology sections). Benwing2 (talk) 14:23, 13 September 2016 (UTC)
- According to Wyang, some entries already do this. It should probably be reversed if there is no consensus for it. Though with how Wyang is, he'll put up a fuss and start another edit war. —CodeCat 14:27, 13 September 2016 (UTC)
- I think that transcription and transliteration need to be separated on some level. First of all, one is conceptually an attribute of the script, and the other of the language. Thus changes to a transcription of a script will have to be applied to all trans* modules separately, making human errors likely. Second, transcription should be open to manual override, while transliteration should always be automatically generated. Also, in historical languages using abjads, it should be noted that having both of these would be useful, as one is the factual shape of the word as found in the text and the other an educated guess, and both are necessary to explain some etymologies.
- Regarding the question of whether both or one romanization should be displayed, I suggest that, no matter what is decided to be the default option, appropriate HTML tags be placed around the transliteration so that a custom CSS file can hide it for users who understand the script in question (seeing anything written in Cyrillic repeated in Latin can be slightly annoying when you are already native in the script).
- Yet I do not understand the details of our current implementation and why Wyang's changes are creating problems. If his way of doing this is indeed too harmful I support reverting it, but then please draft an alternative solution to this. Crom daba (talk) 17:36, 13 September 2016 (UTC)
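The suggestion about wrapping romanizations in HTML tags so that a user stylesheet can hide them might look roughly like this. It is a sketch in Python under assumed names: the class names `romanization`, `transliteration` and `transcription` are invented for the example and are not necessarily the classes the site's templates emit.

```python
from html import escape

ALLOWED_KINDS = ("transliteration", "transcription")


def format_romanization(text: str, kind: str) -> str:
    """Wrap a romanization in a span whose class records what kind it is."""
    if kind not in ALLOWED_KINDS:
        raise ValueError(f"unknown romanization kind: {kind!r}")
    return f'<span class="romanization {kind}">{escape(text)}</span>'


# A reader who does not want transliterations could then add a rule such as
#   .romanization.transliteration { display: none; }
# to a personal stylesheet (selector shown for the hypothetical classes above).
```

Marking the kind in the class attribute keeps the default display a separate decision: whichever option is chosen, individual readers can still hide or restyle one kind without affecting the other.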
- The alternative solution was Wikitiki's changes, which Wyang reverted over and over again and I reinstated over and over again. Contrary to what you might think, Wyang's changes actually did not establish separate transliteration and transcription. It merely bypassed the fact that the Thai transliteration module was called "translit" by putting the code that would have gone in there in Module:links instead. I argued that such code did not belong there, but it still remains there after months of bickering over it. —CodeCat 17:43, 13 September 2016 (UTC)
- So what was the issue that Thai editors were complaining about? Crom daba (talk) 17:58, 13 September 2016 (UTC)
- Wyang? He was complaining that transcription code should not go in a "transliteration" module, even though it's the normal practice on Wiktionary to do so. Because he didn't want to put the code where it belonged, he started messing with Module:links instead, and that's where I stepped in, and now we have this situation. —CodeCat 18:39, 13 September 2016 (UTC)
- The whole point is: transcription and transliteration utilities should be separately maintained in the module system, whenever there is a foreseeable possibility that purpose-suited romanisation may be useful for the language. The argument is how to design a module structure, specifically a romanisation infrastructure, that best supports the features of these languages and therefore the wishes of the language-editing community. We are not proposing that language A should use X format of romanisation, or that Akkadian/Tibetan romanisations should be written as such, or that different modes of romanisation should be used in different locations (cf. link); these are all highly language-specific questions that need to be addressed separately and individually in discussions among knowledgeable editors. Our role here is to envisage the language-specific romanisation requirements that may be proposed, and partition our stored romanisation utilities in a way that is most regular and easiest to invoke, and in a way that does not deter editors in these languages from contributing in a way they consider most appropriate for the language.
- The crux is “foreseeable possibility” of purpose-suited romanisation for a language. The reason purpose-suited romanisation is relevant is due to the different natures of the two modes of romanisation: transliteration is spelling-based, thus more etymology-oriented, and transcription pronunciation-based. The case of abjads is slightly different, but the benefit of storing utilities still applies. Why is purpose-suited romanisation and hence transliteration-transcription utility separation relevant on Wiktionary? Because:
- It is already being implemented in these languages ({{ko-etym-native}}). It is the consensus of the language community on how romanisations should be differentially applied. It is unreasonable to demand that the practice of using purpose-suited romanisation, which has been adopted universally in a language (you do not edit) for nearly ten years, be “reversed” without supplying any reason.
- Printed dictionaries do the same. The following are all the previewable Tibetan-English or English-Tibetan dictionaries on Google Books:
- Tibetan-English: 1, 2, 3
- English-Tibetan: 1, 2, 3.
- All the Tibetan-English dictionaries use transliterations to romanise, and all the English-Tibetan ones use transcriptions to romanise. Why? Because different modes of romanisation are suited to different purposes – transliteration for etymology and transcription for translation from English.
- It conforms to the existing module infrastructure for these languages. In languages observing a transliterative-transcriptive contrast or languages where transliteration is intrinsically impossible, the transliteration-transcription distinction was strictly adhered to when the language-specific modules were designed. Where transliteration is impossible, the term “transliteration” is not ambiguated to mean “transcription”; we do not have Module:zh-translit and Module:ja-translit, instead we use Module:zh/Module:zh-pron and Module:ja/Module:ja-pron to handle transcriptions. Where the transliteration-transcription distinction makes a difference on a romanisation level, modules are named and maintained unambiguously; there are Module:bo-translit and Module:th-translit for transliteration, and Module:bo/Module:bo-pron and Module:th/Module:th-pron for transcription. It is the consensus of how romanisation utilities are maintained in these highly script-pronunciation discordant languages.
- It makes maintenance easier. Maintaining the transliteration and transcription modules separately makes whatever preference there is for the romanisation output less difficult to achieve. Seeing that abjads were raised before, if we decide to apply juxtaposed transliteration-transcription for all abjads or languages X, Y, Z, we can just add in some brief code in the links module to concatenate the outputs of transcription and transliteration modules of these languages (one can also be manually supplied), as these modules have already been recorded appropriately in language_data. If one day we would like to remove transcriptions in romanisations for languages X, Y, Z, we could simply remove the brief code added in earlier, without having to go through all the *-translit modules and delete the transcription passages, wondering whether they should be kept somewhere before they vanish.
- Using page parsing to achieve romanisation has no demonstrable harm. Transcription is inherently more difficult than transliteration; it is nearly perfectly automatable for certain languages (e.g. Korean) but most of the time it needs to be achieved using additional tricks, and page parsing is one of the tricks. I cited w:Wikipedia:Don't worry about performance before and I still think it is also very relevant for the technical structure on Wiktionary. The possibility of using page parsing has made us realise that it is perfectly possible to obtain both the transliteration and transcription for a word when they differ greatly, and this is very exciting. I think all the Thai editors would agree that the implementation of parsing since early this year has made their work much easier (Wiktionary:Statistics, sorted by change in #gloss definitions), and I doubt anyone would be in favour of removing this functionality and having to supply romanisations manually. Likewise for Chinese templates.
- Having an additional functionality module which does something useful is always beneficial, as long as it is maintained adequately. This could be said of transcription modules using parsing to obtain the romanisations. Even though such a module will not be able to grab a transcription from uncreated entries, or entries which have no pronunciation information, this is an indication that those entries need to be improved. In the case of Thai, having some automatic romanisation is better than having none and having to supply one manually. In the end, we aim to encompass all words in all languages, and utilities have to be adapted to ensure we work as efficiently as possible while progressing towards that goal. I'm sure the functionalities of this site won't be limited to what is present at the moment. If we want to build a Thai transliterator and a Thai transcriber to romanise a Thai passage (similar to what Google Translate is doing simultaneously to the translation), or if we want to develop a tool to romanise a Tibetan text in different ways, having an infrastructure in place which does not confuse the utilities will be essential.
- Very few things are improved all of a sudden. While there is no transcription consideration in the central modules and the transcription modules are not recorded, it is most appropriate to name and maintain the romanisation utilities accurately. When the transcription modules can be recorded in language_data like the transliteration modules, the code should be migrated and rewritten. Above are my rationales for keeping the transcription and transliteration utilities separate for these languages where the different modes of romanisation are contrastive. Wyang (talk) 07:02, 14 September 2016 (UTC)
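The maintenance argument above (concatenate the two outputs with a little code, and drop that code later without touching the per-language modules) can be sketched roughly as follows. This is only an illustration in plain Lua: the tables `translit_modules`, `transcr_modules` and `juxtaposed`, the function `romanise`, and the dummy module bodies are all hypothetical and are not taken from Module:links or language_data. The fallback to the transliteration alone is likewise an assumption.

```lua
-- Hypothetical stand-ins for per-language modules; on-wiki these would be
-- separate modules loaded with require(). The bodies are dummies.
local translit_modules = {
    bo = { tr = function(text) return "translit(" .. text .. ")" end },
}
local transcr_modules = {
    bo = { tr = function(text) return "transcr(" .. text .. ")" end },
}

-- Languages for which both outputs are shown side by side (assumed list).
-- Removing an entry here switches juxtaposition off again without any
-- change to the transliteration or transcription modules themselves.
local juxtaposed = { bo = true }

local function romanise(lang, text)
    local tl = translit_modules[lang] and translit_modules[lang].tr(text)
    local tc = transcr_modules[lang] and transcr_modules[lang].tr(text)
    if juxtaposed[lang] and tl and tc and tl ~= tc then
        return tl .. " / " .. tc
    end
    -- Assumed default: the transliteration alone, else the transcription.
    return tl or tc
end
```

With `juxtaposed.bo` set, `romanise("bo", "x")` joins both outputs; after `juxtaposed.bo = nil` it returns the transliteration alone.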
(September) Separating transcription from transliteration
We seem to be at an impasse on this issue, with discussion having died out again. Here are a few ideas to start discussion with:
- Why don't we have a separate pronunciation parameter? Not only could this be used for transcriptions, it would also be useful for disambiguating homographs like wind. The main drawback is that it could be overused/stuffed with information best left to pronunciation sections.
- The reason I bring this up is that our current romanization method routes everything through the
|tr=
parameter. For languages that have both transcription and transliteration, that leaves no way to tell which is being displayed. Having a separate parameter also makes it easier to set it up as a parallel to our current treatment of transliteration.
|pr=
seems the most logical name for such a parameter.
- How would we distinguish between the two? I think we should leave transliteration as it is, and use a superscript in front for the transcription: (Transcr:fonɛtɪk spɛliŋ) (with the superscript linked to something informative)
- Either way, I don't think we should have language-specific special code in Module:links if we can avoid it: it's currently the seventh-most-transcluded page on Wiktionary, used by 4,889,303 pages. More importantly, it's often used dozens of times on a single page and in a few cases thousands of times. Just on general principles, the part of Module:links that's always executed should be only for things that are general in nature and can't be handled in more specialized routines. Even if the overhead is minimal, the clutter makes it harder to maintain. I can understand temporarily putting in a short-term kludge until a solution can be integrated into the regular module structure, but kludges have a way of growing as more special cases arise. They also are harder to understand/maintain: I don't think it would be obvious to most people that
local phonetic_extraction = { ["th"] = "Module:th" }
has anything to do with transcription, and I'm not sure someone wanting to make changes related to transcription would look for the code where it is now.
- I think the best approach to integrating transcription would be to have a separate value for transcription modules in the Module:languages data submodules to parallel "translit_module"
- I propose naming it "transcr_module"
- I propose naming the entry-point function in these modules "pr()" to parallel the translit modules' "tr()"
- It would then be a simple matter of adding parallel code to what we have in module:links for transliteration
I obviously like my proposals, but feel free to tweak, rework or replace any or all of it. The only thing I ask is that we arrive at something concrete, and not more theoretical or who-did-what-and-why-I-don't-like-it talk. Thanks! Chuck Entz (talk) 02:18, 19 September 2016 (UTC)
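To make the proposal above a little more concrete, here is a minimal sketch in plain Lua of how a "transcr_module" value and a "pr()" entry point could sit next to "translit_module" and "tr()". Everything here is an assumption for illustration: `language_data`, `fake_modules`, `load_module`, `romanisations` and the language code "xx" are hypothetical, the module bodies are dummies, and the real tr() functions take more arguments than shown.

```lua
-- Hypothetical language data: "transcr_module" parallels "translit_module".
local language_data = {
    th = { translit_module = "th-translit", transcr_module = "th" },
    xx = { translit_module = "xx-translit" },  -- made-up language, no transcription
}

-- Stand-in for require("Module:" .. name); the module bodies are dummies.
local fake_modules = {
    ["th-translit"] = { tr = function(text) return "<tr:" .. text .. ">" end },
    ["th"]          = { pr = function(text) return "<pr:" .. text .. ">" end },
    ["xx-translit"] = { tr = function(text) return "<tr:" .. text .. ">" end },
}
local function load_module(name)
    return fake_modules[name]
end

-- Generic lookup: no language code is hard-wired into this function.
local function romanisations(lang, text, args)
    args = args or {}
    local data = language_data[lang] or {}
    -- Manually supplied |tr= and |pr= take precedence over automation.
    local tr, pr = args.tr, args.pr
    if not tr and data.translit_module then
        tr = load_module(data.translit_module).tr(text)
    end
    if not pr and data.transcr_module then
        pr = load_module(data.transcr_module).pr(text)
    end
    return tr, pr
end
```

The point of the design is that adding transcription support for another language would mean editing the data table only, not the shared code path.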
- If we lack cooperation between our Lua module editors, we'll have the situation where transliterations and transcriptions are handled by separate modules for Japanese, Chinese, Thai, Burmese, Tibetan, etc., and have no integration with other main modules. Wyang's templates (linked to appropriate modules) like
{{th-l}}
, {{ja-r}}
, {{zh-usex}}
exist almost in a separate world. I'd like to be able to transliterate Thai or Japanese by passing Thai phonetic respelling/hiragana with spacing, capitalisation, etc., but also use the features common to other templates. --Anatoli T. (обсудить/вклад) 02:37, 19 September 2016 (UTC)
- As stated elsewhere, I am very much in favor of this, though for a different reason. Vahagn and I had discussed how many languages with abjads or other writing systems require both a transliteration and transcription (Hittite, Old Persian, Mycenaean Greek, etc.). This would greatly reduce the amount of
|tr=
overloading necessary to represent these languages. —JohnC5 02:46, 19 September 2016 (UTC)
- |tr= may mean either transliteration or transcription or a mixture of both. For most languages, including abjad-based, the transcription-like transliteration has been the preferred one. That is also the case for Thai but displaying the character sequence (i.e. the "real" transliteration) can still be used for various purposes.--Anatoli T. (обсудить/вклад) 02:54, 19 September 2016 (UTC)
- @JohnC5 Would you mind pointing me to the discussion, or perhaps an example of the overloading scenario you have in mind? Sorry to write in such an old conversation. Thanks, Isomorphyc (talk) 02:40, 20 November 2016 (UTC)
- @Isomorphyc: No problem at all! The Mycenaean under *h₁éḱwos and *(s)kleh₂w-, the Mycenaean and Old Persian under *tetḱ-, and the Hittite under *ǵónu, to name a few. If these are not sufficient, tell me. —JohnC5 02:56, 20 November 2016 (UTC)
- @JohnC5: Thank you, this is perfect. Isomorphyc (talk) 11:30, 20 November 2016 (UTC)
- I support this. Wyang (talk) 06:03, 19 September 2016 (UTC)
- Sounds good, only I'd prefer it if we didn't bind transcription to phonetics, because for some ancient languages it would be preferable to write for example: (Sogdian) ૣીીૡોૐ (pš'x'rycyk) (pašaxārēčik) without going into details of what exactly were 'a', 'ā', 'ē' or 'č'. Crom daba (talk) 08:33, 19 September 2016 (UTC)
- What do you mean by "preferable"? I want to know how to read/pronounce the word, so I want to see "pašaxārēčik", as would be the case for Persian and other abjads. The actual string of characters can also be useful for etymologies or for people interested in learning the script.--Anatoli T. (обсудить/вклад) 08:39, 19 September 2016 (UTC)
- Perhaps I wasn't clear. It is preferable to write "pašaxārēčik" rather than "pəʃɨxaret͡ʃjək" (don't quote me on this "reconstruction"). Obviously we need both transcription and transliteration (for one, because there still aren't any free fonts for Manichaean Unicode as far as I know). Crom daba (talk) 09:05, 19 September 2016 (UTC)
(October) Italics in Project-Link Templates
Some time ago, User:DCDuring suggested that an |i=
parameter be added to these templates so links to taxonomic names could be italicized according to standard practice for such names. That was never done, but he started adding it to the wikitext in entries in anticipation of someone getting around to doing it eventually. This wasn't a problem, though, since templates ignore undefined parameters. Then User:CodeCat decided to convert these templates to use a Lua module. Even that would be fine, since Lua adds useful capabilities that can be exploited later. She also incorporated Module:parameters, which lets you specify which parameters can be used in a template. That's where things went off the rails. Aside from maybe half a dozen unrelated errors, all of the 462 entries currently in Category:Pages with module errors are due to the previously-ignored |i=
parameter.
CodeCat had said that the luafied versions would work exactly like the un-luafied versions to start with. I don't know about you, but 450+ module errors seems like a big difference to me. It's not that Module:parameters is inherently evil, but in this case I would suggest it's been misused. Sure, I found a small number of typos that needed to be fixed, but that could have been done without breaking over 450 entries.
As I see it, we have four main options, in order of ease of implementation:
- Do nothing. Not recommended, because this has replaced links to Wikipedia, Wikispecies and Commons with alarming red error messages in close to 460 entries, and it's hard to spot real errors among the three pages of these errors.
- Change the templates to have Module:parameters ignore the
|i=
parameter
- Implement the
|i=
parameter. That's what I would recommend, because the codes governing taxonomic names explicitly say that at least genera and species should be italicized, and the system overhead is negligible.
- Remove the
|i=
parameter from all 450+ entries. Why? They were added in good faith and could serve a useful purpose.
- Pork. Sorry, I just wanted to see if anyone was paying attention.
What would everyone prefer? Chuck Entz (talk) 01:14, 15 October 2016 (UTC)
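For readers unfamiliar with why a previously ignored parameter suddenly produces errors, the difference between options 1 to 3 can be sketched as below. This is a hypothetical validator in plain Lua, loosely modelled on the idea of declaring allowed parameters; it is not the actual interface of Module:parameters, and `process`, `render` and the three spec tables are made-up names.

```lua
-- Hypothetical validator: every argument must be declared in the spec.
local function process(args, spec)
    local out = {}
    for name, value in pairs(args) do
        local rule = spec[name]
        if rule == nil then
            -- Undeclared parameter: fail loudly, as a strict checker would.
            error('The parameter "' .. tostring(name) .. '" is not used by this template.')
        elseif rule ~= "ignore" then
            out[name] = value
        end
    end
    return out
end

local args = { [1] = "Argentina", i = "1" }

-- Option 1 (do nothing): the spec does not know |i=, so an error is raised.
local strict = { [1] = "use" }
-- Option 2: accept |i= but discard it.
local tolerant = { [1] = "use", i = "ignore" }
-- Option 3: accept |i= and act on it.
local implemented = { [1] = "use", i = "use" }

local function render(args, spec)
    local a = process(args, spec)
    local text = a[1]
    if a.i then
        text = "''" .. text .. "''"  -- wikitext italics
    end
    return text
end
```

Here `render(args, strict)` raises an error, `render(args, tolerant)` gives the plain title, and `render(args, implemented)` wraps the title in wikitext italics.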
- Option 3 sounds like the best to me too. If there's some kind of obstacle to option 3, then option 2 seems like an acceptable temporary fix. —Mr. Granger (talk • contribs) 01:25, 15 October 2016 (UTC)
- I for one am willing to go for the pork option, if pork is a misspelling of fork. That would mean substituting a non-Lua template for
{{pedia}}
, {{specieslite}}
, and {{comcatlite}}
that supported italics in those entries that needed it, genus and subgeneric entries. At some future date we could add the ability to de-italicize portions of the taxa that should not be italicized like "subsp.", "subg.", "var.", etc. I am fairly sure that CodeCat will never work on that feature, having bigger fish to fry. I don't like my fish fried anyway; it's bad for my heart. DCDuring TALK 02:02, 15 October 2016 (UTC)
- I did option 2 (it's a trivial line of code). I tried doing option 3, but I think it better that someone who knows what he/she's doing handle it properly. Crom daba (talk) 02:50, 15 October 2016 (UTC)
- @Crom daba: I looked at the code for Module:wikipedia, and then at Module:links, and finally at Module:script utilities. Those are the modules that it uses to create links to Wikipedia. Apparently italics and bold are not supported (see § tag_text), so there is no way to make the links to Wikipedia articles on species names be italicized without removing boldface. That's quite irritating... — Eru·tuon 02:05, 16 October 2016 (UTC)
- Those modules are of "our" own creation. The lack of functionality is a self-inflicted wound. DCDuring TALK 15:29, 16 October 2016 (UTC)
- Thanks to @CodeCat:, the change has been made. A look at ] will show another class of interproject links that make a fork of the templates worth considering. A WP entry like w:Argentina (plant) or a commons link like [[commons:Category:Argentina (Rosaceae)]] have mixed character formatting. They should be ''Argentina'' (plant) and ''Argentina'' (Rosaceae) respectively. In general the taxonomic authorities prescribe that only a genus and subgeneric names and epithets should appear in italics in text.
- They also prescribe that such items should appear in ordinary typeface when embedded in italic running text, so templates that force italics or regular text force the appearance of such taxa to deviate from the prescription. We might not care about prescriptions, but these prescriptions are usually followed in scholarly works and often in popular science and nature books. DCDuring TALK 18:11, 18 October 2016 (UTC)
- I created a function to apply correct italics to species, subspecies, and variety names on Wikipedia (in w:Module:eFloras, which is used in a reference template). A similar thing could be created here – though the parameter that triggers it would have to have a name besides
|i=
(|genus=
, |subspecies=
, etc. or something else?). It could detect the abbreviations subsp., ssp., var., and f., and words in parentheses, all of which should not be italicized, and then apply italics to everything but them. Either that, or we manually enter link text with correct italics in every case. Thus, it would take Cupressus arizonica var. glabra
and display it as ''Cupressus arizonica'' var. ''glabra'', and the previously mentioned Argentina (plant)
would display correctly as ''Argentina'' (plant). The parameter |i=
could still be used in cases where the whole title should be italicized. — Eru·tuon 19:55, 18 October 2016 (UTC)
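The selective italicization described above can be sketched in a few lines of plain Lua. This is an independent illustration of the stated rule (leave the rank abbreviations and anything in parentheses upright, italicize the rest), not the code of w:Module:eFloras; the names `no_italics` and `italicize_taxon` are made up, and the output uses wikitext `''` for italics.

```lua
-- Rank abbreviations that stay upright (the list given in the comment above).
local no_italics = { ["subsp."] = true, ["ssp."] = true, ["var."] = true, ["f."] = true }

local function italicize_taxon(name)
    local out = {}          -- finished pieces, in order
    local run = {}          -- current run of consecutive words to italicize
    local in_parens = false -- true while inside a parenthetical

    -- Emit the pending italic run, if any, as one ''...'' span.
    local function flush()
        if #run > 0 then
            out[#out + 1] = "''" .. table.concat(run, " ") .. "''"
            run = {}
        end
    end

    for word in name:gmatch("%S+") do
        local opens = word:sub(1, 1) == "("
        local plain = in_parens or no_italics[word] or opens
        if opens then in_parens = true end
        if plain then
            flush()
            out[#out + 1] = word
        else
            run[#run + 1] = word
        end
        if word:sub(-1) == ")" then in_parens = false end
    end
    flush()
    return table.concat(out, " ")
end
```

Grouping consecutive italic words into one span keeps the generated wikitext short; whether a real module should do that or wrap each word separately is a design choice left open here.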
- Cool. That's exactly the kind of thing I was hoping for. Some questions remain in my mind:
- Is it worthwhile to provide for piped alternative formatting for whatever cases the logic you have implemented on WP doesn't cover? Have you found any exceptions at WP?
- We might discover other taxon elements that should not be italicized, eg, "morph.", perhaps "×". Would this be updateable by altering data in a Module?
- Should this be implemented here within the project-link system or with separate templates and/or modules for the big-box and inline versions of the templates for pedia, species, and commons?
- None of these are likely insurmountable and all but the last may be ignorable. DCDuring TALK 21:04, 18 October 2016 (UTC)
- @DCDuring: I haven't encountered any exceptions in the logic that I used in w:Module:eFloras, but it's unlikely there will be any, because the floras and lists that the eFloras template creates references to are all plants, and there is a fair amount of regularity in botanical names. For instance, all plant families end in -aceae, so the module searches for that and makes sure it's not italicized. The logic has to be different here on Wiktionary, because more than just plant names are involved, and the automatic italicization would have to be explicitly turned on in links to species, subspecies, variety, etc. pages and not turned on for links to family pages, which aren't ever italicized.
- It would be easy to add or remove elements to the module, if we discover any more that should not be italicized. There should just be a list of testcases somewhere (with one example each of genus, subspecies, form, etc.) that we can look at to check that the code is doing what it's supposed to.
- I'm not familiar with the structure of these interwiki link templates and modules, but I think I'll start work on a module that does the simple task of this automatic italicization, which can then be used in whatever interwiki link modules require it. — Eru·tuon 23:14, 18 October 2016 (UTC)
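As a rough illustration of the two points above (the -aceae check, and a small table of testcases with one example per rank), something like the following plain-Lua sketch could serve. The names `italic_ranks`, `is_plant_family`, `wants_italics` and `testcases` are hypothetical, the rank list is an assumption drawn from this discussion, and none of this is taken from w:Module:eFloras.

```lua
-- Ranks whose names get italics (assumed list, drawn from the discussion).
local italic_ranks = {
    genus = true, species = true, subspecies = true, variety = true, form = true,
}

-- Plant family names end in -aceae and are never italicized.
local function is_plant_family(name)
    return name:find("aceae$") ~= nil
end

-- Italics are only switched on explicitly by rank, never for families.
local function wants_italics(name, rank)
    if is_plant_family(name) then
        return false
    end
    return italic_ranks[rank] == true
end

-- One example per rank: { name, rank, expected result }.
local testcases = {
    { "Rosaceae", "family", false },
    { "Argentina", "genus", true },
    { "Cupressus arizonica", "species", true },
    { "Cupressus arizonica var. glabra", "variety", true },
}

-- Count mismatches so a change to the rules can be checked at a glance.
local failures = 0
for _, case in ipairs(testcases) do
    if wants_italics(case[1], case[2]) ~= case[3] then
        failures = failures + 1
    end
end
```

Keeping the ranks and the testcases as data means a newly discovered exception needs only a new table entry.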
- I greatly appreciate your undertaking this. DCDuring TALK 23:22, 18 October 2016 (UTC)
- Module:italics is now complete and seems to work. It has an array of things that shouldn't be italicized, and it's pretty easy to add another one if you think of any. The documentation page has a set of testcases that show how the module handles some of the un-italicized elements that we talked about. If there are no problems, someone can add this function to the interwiki link modules; not sure what the parameter that turns it on should be called, though. — Eru·tuon 01:19, 19 October 2016 (UTC)
- My choice: "taxi=". DCDuring TALK 11:09, 19 October 2016 (UTC)
- That might be fine, though I wonder: would there be any examples of page names with parentheses that should be italicized but are not taxonomical? I was thinking maybe the titles of plays, but I couldn't find any that are linked to. — Eru·tuon 15:53, 19 October 2016 (UTC)
- Examples: plays like Antigone (Sophocles play), which are named after characters and therefore have a disambiguator. Or Richard II (play), which is named after a real person. These currently don't have Wiktionary entries, or they are not mentioned in definitions (see Antigone; there's no entry on Richard II) yet, but perhaps they will be in the future, and then they would have to be italicized in the same way as genus or species names. — Eru·tuon 16:57, 19 October 2016 (UTC)
- For a broader reference for the parameter name how about "seli", for "selective italics"? DCDuring TALK 00:22, 20 October 2016 (UTC)