This is an archive page that has been kept for historical purposes. The conversations on this page are no longer live.

September 2009

Scripps National Spelling Bee winning words

Hi everyone, I'm bringing up this subject (again), here this time, because I really want to know what people want to do with National Spelling Bee winning words. (This is a subject that's really close to me, as I am a previous and hopefully future participant in the aforementioned Bee) There is previous discussion in the Tea Room. So far the suggestions are:

Put a concise explanation under a ===Trivia=== header in the entry, linking to Wikipedia's article on the Scripps National Spelling Bee. (This is what I had originally been doing.)

For those who don't like the Trivia header, merge ====Anagrams==== and ====Spelling bees==== (note L4 status) under one L3 header, ===Word play=== and still keep the information in the definition.

Put all the SNSB winning words into an appendix and/or category, also having info in the entries (or not).

Personally, I would do all three.
My argument for putting info in entries: A dictionary serves to give information about words and their usage, while an encyclopedia gives information about the concepts the words represent. This is information about the word itself and how it has been used, therefore it belongs in a dictionary. Opinions would be greatly appreciated on this ... Thanks! L☺g☺maniac chat? 00:38, 1 September 2009 (UTC)

The content belongs under ===Trivia===, and while anagrams might also be under Trivia, I don't like the heading "Word play" (and have other plans for Anagrams, see above). It would be best if there was a template for doing this, both so it's easy to add, and then it's easy to modify if we want to. Conrad.Irwin 00:59, 1 September 2009 (UTC)

Please see Appendix:Scripps winning words -- Prince Kassad 04:44, 1 September 2009 (UTC)

Ok, then, I didn't know that was already there . . . so I guess the question is now do we want the info in the entries or not, and how? I don't want to decide this all by my lonesome ... L☺g☺maniac chat? 15:57, 3 September 2009 (UTC)

Personally, I am weakly hostile to having them in the entry, simply because these don't seem to add enough value to justify a change to ELE. Plus ===Trivia=== is kind of a lame heading, and invites people to add... even more trivial things. :-) -- Visviva 19:00, 19 October 2009 (UTC)

I also dislike them. Anagrams are at least an objective property of the word, but having appeared in a particular competition is not. For me it's like having "words that were played in Scrabble games in tournament finals": arbitrary and non-useful. Equinox ◑ 19:07, 19 October 2009 (UTC)

What is clear is that they could be in appendices and that any WP entries about these words should link to Wiktionary's entries as well as any appendices that we have. We need to make the content of all appendices more accessible to not-so-knowledgeable users, but our better appendices merit special attention. We could make a special effort to make Wiktionary a site of interest to spelling-bee contestants, who might make good contributors, based on a sample of one so far. DCDuring TALK 19:24, 19 October 2009 (UTC)

The info is in an appendix (though that could be improved, and is on my to do list for when I get the fast computer) and the WP entry does link here. And I hope that other spelling bee competitors who come here to contribute have more free time on their hands. (To quote the faun Tumnus: Then you must have had a very poor sampling.) So if the consensus is to remove the info from the entries (since I've done like, the first half...) then I could do that I guess. L☺g☺maniac chat? 19:45, 19 October 2009 (UTC)

Index to appendices

We have many appendices which are pretty much lost in cyberspace because nothing links to them. I think it would be a great idea to have an Index on appendices, maybe sorted by subject. That way we would also be able to clean up that namespace much more easily. -- Prince Kassad 10:20, 1 September 2009 (UTC)

Excellent. Sorely needed. DCDuring TALK 14:28, 1 September 2009 (UTC)

Glosses on antonym, synonym and related words

The current policy says, essentially, that glosses on derived terms, synonymous terms and antonyms are not specifically allowed, ever.

This has the unfortunate side-effect that some of the few articles in Wikipedia that are dictionaric shame Wiktionary with their coverage of the word. Suffixes, for instance, have mostly been deleted from Wikipedia, but w:-gram had a long list of derived 'gram's, including a brief indication of their meaning, whereas -gram just has a pointless list, with nothing to help the user find the right gram for a particular purpose.

It's completely unreasonable to expect users to click on them to find out what they mean, and English is not regular enough to be able to guess. So the list of words becomes unusable, but would be able to be usable if glosses are allowed.

This is even a problem for synonymous terms; there's a saying that 'there are no true synonyms'- every word is subtly different in emphasis, usage or shades of meaning, but the current layout pretends they are exactly the same.

Basically I think that glosses should be permitted on synonyms, antonyms and derived words. Wolfkeeper 23:31, 1 September 2009 (UTC)

Which policy are you referring to? In practice, glosses are highly encouraged on all lists that need to be matched to senses, although obviously some pages are more thorough than others; see parrot for a well-glossed entry. Nadando 23:40, 1 September 2009 (UTC)

See: Wiktionary:Entry_layout_explained#Synonyms. It has a very specific layout, and all the other sections like antonyms and derived terms are supposed to follow it. Wolfkeeper 23:43, 1 September 2009 (UTC)

Also Wiktionary:Entry_layout_explained#Further_semantic_relations where it specifies about the others. Wolfkeeper 23:46, 1 September 2009 (UTC)

I believe Wolfkeeper (talk • contribs) is referring to glosses after a word, as in {{term|word||gloss}} or {{onym|xx|word|gloss=gloss}}, not to the glosses before a list, as in {{sense|gloss}}. With synonyms it's usually moot — the gloss that heads the list should really apply to each word, or else they're not really synonyms (at least in English; with foreign-language words, there are some good uses of glosses) — but with other semantic relations, it's less so. And with derived terms and related terms, we don't even have the {{sense}} up front. —Ruakh_TALK 00:53, 2 September 2009 (UTC)

I agree with Wolfkeeper (talk • contribs) and Ruakh (talk • contribs) to the extent that there are many cases where {{sense}} or something similar might be a good tool for grouping related terms and derived terms, which are our principal low-structure lists that are sometimes too long to be useful.

I strongly disagree with the idea of having glosses for each term in any of the semantic and etymological relations lists because:

It would make intimidatingly long entries even longer
It would require massive effort to keep the glosses coordinated with the definitions.

Wouldn't a more widespread utilization of "popups" largely solve both problems, whatever the resource costs and the barriers to implementation? DCDuring TALK 01:15, 2 September 2009 (UTC)

No, I actually don't think that would be as good: you would have to point at each in turn to get its meaning, and it interferes with searches. Wolfkeeper 18:58, 2 September 2009 (UTC)

Even with synonyms the words are never precisely the same; they usually have different emphasis. "There are no true synonyms". Wolfkeeper 19:03, 2 September 2009 (UTC)

(Unindent) To clarify what is at stake: Wolfkeeper proposes that pages look like in this revision of "-gram", in which a gloss or a definition is stated right next to each listed derived term. The first four items of the list of derived terms as posted by Wolfkeeper to -gram:

===Derived words===

correlogram - in the analysis of time series, a plot of the sample autocorrelations $r_{h}\,$ versus $h\,$
cosmogram - a flat geometric figure depicting a cosmology
engram - a hypothetical means by which memory traces are stored
engram - a term used in Scientology and Dianetics for a "recording" of a past painful event not normally

--Dan Polansky 07:13, 2 September 2009 (UTC)

Yes, where that is appropriate. I'm not saying we would make that mandatory though. The point is for the policies to permit better quality entries. I think the current situation where you are essentially forbidden from this is clearly wrong. Wolfkeeper 18:58, 2 September 2009 (UTC)

And those are not particularly good examples; correlogram should be something like "a plot of sample autocorrelations": brutally short. Wolfkeeper 19:01, 2 September 2009 (UTC)

I'm having trouble getting into the head of the user of the page configured this way. What question is the user asking? Of the current design for the Derived terms section, I imagine a small number of normal users to be asking "What is that "-gram" thing that I'm trying to remember?" or "I know it ends in "-gram", but how do I spell it?" I can also imagine Scrabble players using it. After that I get to questions that only special types of word nuts (like ourselves) would ask. DCDuring TALK 20:05, 2 September 2009 (UTC)

How about people just reading the list and having a way of remembering which are which? We're supposed to be writing articles to read, not just use for a particular directed purpose I think. The fact is that looking at a list of completely unlabelled -gram's is very hard particularly for younger people with smaller vocabularies. Wolfkeeper 15:52, 3 September 2009 (UTC)

To Wolfkeeper: I find the lists of derived terms useful even without glosses. I do click on the derived terms to find what they mean. From what I remember from past discussions that took place on Wiktionary, other people voiced similar sentiments. --Dan Polansky 11:47, 3 September 2009 (UTC)

If you really do think this, check out -logy. While I know a lot of those, some of them like enology or agrostology seem really obscure to me, and three or four words next to them would tell me whether I might be interested in going there or not. Maybe if you're a genius at Greek these are all obvious, but not to most people. Wolfkeeper 15:52, 3 September 2009 (UTC)

The real take-home point here is, I think, that from a user interface point of view any link that takes you anywhere that you don't have any clue where it's taking you is a bad link. Every link should give you some clue. That these links are by policy completely unlabelled is just wrong. Wolfkeeper 15:52, 3 September 2009 (UTC)

Then I have no clue what you are upset about. The user knows exactly where they are going because the link target is also the name displayed for the link. How can a user have "no clue where it's taking you" when the target entry of the link is displayed? --EncycloPetey 16:07, 6 September 2009 (UTC)

Words are used to give a particular meaning. If you don't have any idea what the meaning is, what is the purpose of knowing that that word is related to another word? Knowing that a random list of letters is connected to -logy is incredibly hard to do anything with at all. That's not the way people work, they don't just learn random words; people learn new words by associating words with other things, knowing that agrostology is about grass connects it to other things like 'agrarian'. A dictionary that permits gloss is better than one that doesn't, you can read the list and immediately have a good idea what the words mean, otherwise people, particularly young people will read the list and have no clue what the words mean. It helps build people's vocabulary. If you don't do this, it presents a scary and forbidding list of long words to them and probably makes them feel stupid for not immediately knowing them. You don't want people to feel like that, the dictionary should bend over backwards to help people know what words mean and how they are used. The way this guideline is written, it's more like a deliberate uncaring punch in the face. Wolfkeeper 12:01, 7 September 2009 (UTC)

But that's why we have entries about each of those words. The entry for ornithology (the study of birds) tells you about the component roots, the meaning, any grammatical or usage oddities, etc. There is no sense in duplicating that information in every location that the word ornithology (the study of birds) is used. If we did that every time ornithology (the study of birds) was used in a list of words, we'd be needlessly duplicating information that could just as easily be found by placing it at the entry for ornithology (the study of birds). Placing all the definitions into a list of synonyms or related terms swamps out the important information with lots of additional noise. Yes, it's noise, because it's not information about the current entry, but about another entry not being discussed on that page. If a long list of words is "scary", as you say, then how much scarier will it be when it's packed full of definitions as well as the words themselves? For the sake of our users, any list of Related terms should be kept visually simple so that a user can quickly see the portion of the word that is etymologically related, and not have to wade through lots of other information that should be on the entry for which it is relevant. --EncycloPetey 16:38, 7 September 2009 (UTC)

How is adding information that the reader would be likely to need, adding 'noise'? That's complete nonsense. Wolfkeeper 13:14, 11 September 2009 (UTC)

In the case of derived words, for example for a suffix, it is information about the current subject of the article; 'you can add this to this suffix to make a word that can mean X', or you can add this to make it that; this is exactly the kind of thing that people read suffix articles to find out. Simply having a list of words is, in general, really very useless. Wolfkeeper 13:30, 11 September 2009 (UTC)

WT:EDIT + `{{trreq}}`

By request, you can now add translation requests with the Translations editor; simply specify {{trreq}} in place of a translation. Conrad.Irwin 01:31, 2 September 2009 (UTC)

pharmaceuticals

While RC patrolling I ran across talarozole, which got me questioning a few things. Is it okay to be listing the applications for a drug? Isn't that a form of making medical claims? Should we include experimental pharmaceuticals which have not received any approval for use in humans? Do we need to include the health applications, or could we instead limit the information we provide to what portion of medicine prescribes the product? Is this a form of marketing, similar to the (Pharmacorp-produced) Nurses and Physicians drug reference books?

After some personal discussion about it with Equinox, I'm bringing it here to at least open a discussion about it.

In the interests of full disclosure, I've been involved in clinical trials research and research ethics, which might bias my point of view. - Amgine/^talk 04:00, 2 September 2009 (UTC)

If we say what Scholar articles and the financial press say about the product, especially emphasizing the intent of the developers/marketers, and stick to our usual attestation standards, how could this be a problem realistically? I don't entirely see how we can realistically keep up with the changing status of the drug, but that is a problem we have with all of our encyclopedic content. DCDuring TALK 04:17, 2 September 2009 (UTC)

Right. But to avoid any seeming "form of making medical claims" we can use (say) "...drug used for..." instead of "...drug for...".—msh210℠ 20:02, 2 September 2009 (UTC)

I'd be even happier with "drug being tested for" before it is approved. Even if our entry falls behind the real world, we would be erring on the side of caution. I don't know that we would want to be talking about off-label uses. Let WP do that kind of thing if they want. The shorter the entry the less likely anyone would actually rely on it, IMHO. DCDuring TALK 20:24, 2 September 2009 (UTC)

That was oddly prescient based on today's announcement about Pfizer (2.4 billion USD fine for off-label advertising.) Being paranoid, I think the structure "used in oncology" or whatever general portion of medicine regularly prescribes the product would be both less specifically marketing, and a strictly factual (rather than medical) claim. - Amgine/^talk 22:47, 2 September 2009 (UTC)

The financial press often reports that companies have a new drug for X. No one takes that as marketing. In addition, these drugs have to be prescribed. Are doctors going to be relying on us? People make all sorts of explicit and implicit claims for the efficacy of all sorts of products for achieving all sorts of desirable outcomes. We are in no position to attest to the truth or falsity of any of those claims. Saying that "X" is the name of a drug used to treat a disease is not a statement that X is effective against a disease.

What do you make of ] or ]? I find ] to be somewhat misleading. It seems to imply that many people can heal the sick, but that only doctors are licensed to do so. "Heal" implies a successful outcome. "Treat" the sick would be less misleading. The "heal" wording passed its fifth anniversary at Wiktionary just this week. Perhaps en.wikt bears responsibility for some of the excessive faith in the medical profession that has made US medical costs so high. DCDuring TALK 00:34, 3 September 2009 (UTC)

Actually, the press reporting about a new drug is marketing, and it's always assumed as such by anyone who knows much about financial news. Even when working on en.wikinews I was being targeted by marketers with press releases about pharmaceuticals. I suppose I should kick start another article, like the Senate staff one to show who is editing the drug articles on wikipedia. {{unsigned}}

My point was intended to be that the financial press (vastly more suable than WMF) doesn't seem too worried. The attack of scrupulousness that we are suffering about this seems over the top. We are not so aggressively skeptical about all other implausible claims that appear in our entries. Has the outrageously silly one in doctor generated any concern, any complaints, any lawsuits? I suppose I ought to edit that one into shape. After all what do I expect: that it heal itself? DCDuring TALK 18:34, 3 September 2009 (UTC)

The press is not suable in this regard; they have specific protections in the US Constitution which do not apply to Wiktionary. But your arguments disregard the question: is it appropriate for a dictionary to make medical claims? I don't know, but I suspect not. - Amgine/^talk 02:30, 4 September 2009 (UTC)

Actually, we don't have much skin in this and are not to be trusted to make this kind of decision. I suppose we really ought to kick this one upstairs to the folks with actual legal responsibility. Someone must have faced a related issue on WP. DCDuring TALK 03:02, 4 September 2009 (UTC)

This is still dodging the question. Surely we can address whether it is ethical - disregarding the legal ramifications? - Amgine/^talk 19:17, 4 September 2009 (UTC)

How about we simply punt the issue of keeping up with encyclopedic claims concerning drugs to Wikipedia, which is an encyclopedia. Something along the lines of:

==English==
{{wikipedia}}
===Noun===
{{en-noun|-}}
# A ] with the chemical formula (FO<sub>2</sub>)BAr.
]

Of course for drugs such as thalidomide that have been used attributively, as in thalidomide baby, fuller entries would be warranted. — Carolina wren discussió 01:56, 3 September 2009 (UTC)

Simply giving a chemical formula does not provide context in which it is likely to appear. Whatever we choose to say, it ought to include enough information for a reader to interpret appearances of the word, such as in a novel. If a character in a novel inquires, "Do you have any aspirin?" a person looking up that word here would be unenlightened by a chemical formula. The novel is perhaps using the quote to indicate indirectly that a character has a headache or body ache. So, a minimal definition of aspirin ought to indicate that aspirin is used for pain. We do not need to advocate a drug for a purpose, but we can certainly describe its use. For new or experimental-stage pharmaceuticals, I don't have as strong an opinion, as they are more likely to be neologisms and less likely to have meaningful connotations. --EncycloPetey 04:00, 3 September 2009 (UTC)

I agree: the chemical formula is (99 times out of 100) insufficient.—msh210℠ 16:28, 3 September 2009 (UTC)

I'd reverse that ratio, as far fewer than 1 out of 100 drugs will ever be widely known. There are thousands of pharmaceuticals these days. — Carolina wren discussió 02:22, 4 September 2009 (UTC)

Or they could be inquiring after an aspirin in that hypothetical novel because they have a fever, are suffering from a heart attack or stroke, are participating in a study as to whether it is effective in blocking the formation of cataracts, etc. Most times, the use to which a drug is being put should be evident from the source text. If quotes such as "My mistress is the aspirin I take for the aggravation my wife causes me." exist, I can see the need for a fuller entry than the bare bones one I described. But most drugs are not well known enough ever to be used metaphorically or figuratively, so for entries such as feprazone, miraprofen, nepafenac, tarenflurbil, etc., a bare bones entry such as I described suffices, as it would for talarozole. — Carolina wren discussió 02:22, 4 September 2009 (UTC)

Consider

gold—a heavy yellow elemental metal of great value, with atomic number 79 and symbol Au.

and its alternative

gold—a chemical element with atomic number 79.

The dictionary definition of gold is phrased not only in terms of underlying scientific properties such as its atomic number or chemical formula but also in terms of those properties or ways of interaction with the reader that are directly accessible to the perception and experience of the reader, such as color, heaviness, typical great value, or typical uses. As a consequence, many dictionary definitions are not 100% pure definitions in terms of logic but rather contain certain factual elements.

When a drug is seen as an artifact, its use is its key characteristic; the likely main uses of hammer are fit for the inclusion in the definition of hammer.

I do realize that with lesser-known drugs there is an additional concern to avoid their undue promotion or advertising, but only stating their chemical composition seems insufficient to me regardless of how widely known the drug is.

I see no problem with the particular definition:

talarozole—an investigational drug for the treatment of acne, psoriasis and other keratinization disorders.

It states that (a) the drug is investigational, and (b) that it is investigated for the treatment of specific conditions, without saying anything about the efficacy of the drug. An additional statement of chemical formula would be in order, though. --Dan Polansky 08:22, 4 September 2009 (UTC)

In general I'm in agreement with you, Dan. However, note that we don't state gold is toxic, building up in the bone structure with symptoms similar to lead poisoning. Another way to phrase the definition might be:

talarozole—an investigational drug for dermatology.
talarozole—an investigational drug for skin disorders.

Losing some specificity does not harm the informational value of the definition, imo. - Amgine/^talk 19:17, 4 September 2009 (UTC)

Relatedly (and not incidentally), I've just imported w:Wikipedia:Medical disclaimer to Wiktionary:Medical disclaimer (accessible indirectly via the "Disclaimers" link on every page). Please take a look, point out problems, and so on. We may also want to create some sort of {{medical-disclaimer}} template, for use on affected entries and categories. —Ruakh_TALK 18:15, 4 September 2009 (UTC)

Let's ask a lawyer with experience in food and drug law. That would be me. Drugs sold in the U.S. and Europe, at least, must be approved by the appropriate regulatory agency for the specific purpose for which they are used. When they are used for another purpose, this is called an off-label use, and is a violation of the law (although rarely punished if the drug is not hazardous when used for such a purpose). We could, with no legal or moral jeopardy, state that the drug has been approved for use in a particular area, or is being investigated for use in a particular area. For drugs approved in the U.S., we can link directly to the FDA page which contains all indications and cautions. bd2412 T 16:59, 5 September 2009 (UTC)

As our attorney (based on the $1 I will be forwarding to you as retainer once you give me your address or bank routing number), would you advise that we as individuals face any personal legal risk from carelessness in our wording of what the uses are? Do stewards, bureaucrats, or admins face more risk than registered or unregistered users? Does anyone face any risk apart from WMF? If WMF faces risk and we do not, are we not morally compelled to take advice from them? Are they in any way protected by not knowing of the risk that we think we see? DCDuring TALK 19:03, 5 September 2009 (UTC)

This gets us more into encyclopedia territory, but so long as we are not representing that a drug is in fact safe or effective for a particular purpose, we can identify any uses to which it is allegedly actually put by reference to sources supporting those allegations. We can incur no legal jeopardy whatsoever for accurately reporting that a drug has been approved by a government agency for a particular use, or that it has actually been used for that purpose, even if it is ineffective. For example, we can say that amygdalin or laetrile is a substance that has been used as a treatment for cancer, but has not been proven effective for that purpose. bd2412 T 19:46, 5 September 2009 (UTC)

To avoid legal problems, we should be able to leave the text to something like:

==English==
{{wikipedia}}
===Noun===
{{en-noun|-}}
# A ] (or ]) used for ] (or "used to treat ])

This is a dictionary, and a dictionary's job is to describe a word and its usage, so if we stick to something like that ... This is probably exactly what bd2412 just said but I like to reword things to be simpler.... :) L☺g☺maniac chat? 22:37, 5 September 2009 (UTC)

If we are going to say that, we need to cite an external source for the claim. bd2412 T 22:52, 5 September 2009 (UTC)

What to do with translations to nothing, null, nada, zilch, ...

While working on the entry for the Catalan preposition per, I came across a sense in which the equivalent English translation is a nullity. I used an unlinked empty set sign Ø to indicate that, but I'm not at all happy with this solution, yet I can't think of a better one. The problem is likely of broader application than this one entry, which is why I brought it here instead of the Tea room. Any suggestions? — Carolina wren discussió 02:41, 3 September 2009 (UTC)

Note the solutions used under the for languages which have no definite article. The solution there is to say not used. If we agree this (or something else) is a good general solution for this, I recommend creating a template to include in such cases to make it uniform, with a possible link to an explanation of this phenomenon in general. --EncycloPetey 03:53, 3 September 2009 (UTC)

Not quite an equivalent occurrence, but close. What you suggest works acceptably for translation tables when English -> FL yields nothing, since the lack of linking helps establish it as a usage note and not a translation, but in a sense line for a foreign language entry, I'd be worried that it might be interpreted literally as the word has the meaning of not used. — Carolina wren discussió 22:04, 3 September 2009 (UTC)

Why is any translation necessary? —Ruakh_TALK 18:19, 4 September 2009 (UTC)

A green background like in {{trreq}} might help make it stand out from the other text. -- Prince Kassad 13:00, 5 September 2009 (UTC)

I don't know from Catalan, but in the example given it looks to me like "to" is the translation of per. The "to" in "to vacation" is not the infinitive "to", so vacar would only match "vacation" AFAICT. But yes, for similar cases, if deemed necessary, I think you could just do something like:

# {{non-gloss definition|Blah blah. No corresponding word is used in English.}}

... and just leave it out when translating the example. The absence of the boldfaced term should speak for itself in most cases. The empty-set symbol is a bit distracting IMO. -- Visviva 02:49, 7 September 2009 (UTC)

See if you like my specific solution: translate per to in order to. —AugPi 03:02, 7 September 2009 (UTC)

By the way, in analogy to Spanish, the construction va anar sounds to me like being in the future tense, so El meu germà va anar a Tahití per vacar a la platja would be "My brother will go to Tahiti in order to vacation on the beach." —AugPi 03:08, 7 September 2009 (UTC)

I would like to see a uniform "tag", as it were, to categorise translations in which corresponding words cannot be found. I've encountered this before: in Finnish (TESL) and Chinese (solo), for example. Consistency is always good! What say you admin? Tooironic 01:54, 10 September 2009 (UTC)

word history saved with your log in

Wouldn't it be cool if Wiktionary saved all the words you've looked up? You could go back and quiz yourself.

While Wiktionary doesn't exactly do this, you can use your account's watchlist to save a list of articles yourself. Just click the "watch" tab at the top of an article if you want to save it. When you click on "My watchlist" at the top right of the screen, you can see the list of articles you have saved by clicking on "View and edit watchlist." You will need to be logged in. Dominic·t 08:45, 3 September 2009 (UTC)

Request for bot status

I am formally requesting bot status for my new bot, User:Di gama bot. It is not intended for any long-term 24×7 operations, but rather for short-term "projects" involving many edits. At the moment, it is being geared up for a mass page move of sign language entries following a change in the entry name style, in accordance with the proposal at Wiktionary talk:About sign languages. As of this message, the matter is not settled, but a change of some sort is likely and this bot is intended to do the heavy lifting.

As noted above, the job requires the ability to move pages and I don't think bots innately have that privilege, so it would need to be autoconfirmed as well. The program is not hashed out yet, but it will use a stock script of Pywikipedia in conjunction with a hand-gathered list of moves to be done. As I said earlier, it is a short-term project, and will be monitored by me for the duration. Any future projects using this bot will be requested similarly. —Di gama (t • c • w) 00:54, 30 September 2009 (UTC)
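The workflow described above (a hand-gathered list of moves fed to a script) might be sketched as follows. This is only an illustration under stated assumptions, not the bot's actual program: the "Old title -> New title" file format, the function names `parse_move_list` and `run_moves`, and the `move_page` callback are all hypothetical, and the real page move would be whatever the Pywikipedia script provides. The sketch validates the list and defaults to a dry run, so the list can be reviewed before any page is touched.

```python
from typing import Callable, Iterable, List, Tuple

Move = Tuple[str, str]


def parse_move_list(lines: Iterable[str]) -> List[Move]:
    """Parse lines of the (assumed) form 'Old title -> New title'.

    Blank lines and lines starting with '#' are skipped. Malformed lines,
    no-op moves, and duplicate source titles raise ValueError.
    """
    moves: List[Move] = []
    seen_sources = set()
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        old, sep, new = line.partition(" -> ")
        old, new = old.strip(), new.strip()
        if not sep or not old or not new:
            raise ValueError(f"line {number}: expected 'Old -> New', got {line!r}")
        if old == new:
            raise ValueError(f"line {number}: source and target are identical")
        if old in seen_sources:
            raise ValueError(f"line {number}: {old!r} is listed more than once")
        seen_sources.add(old)
        moves.append((old, new))
    return moves


def run_moves(
    moves: Iterable[Move],
    move_page: Callable[[str, str], None],
    dry_run: bool = True,
) -> List[str]:
    """Apply each move through the supplied callback and return a log.

    move_page is a hypothetical hook standing in for the real bot call.
    With dry_run=True nothing is moved; the log only says what would happen.
    """
    log: List[str] = []
    for old, new in moves:
        if dry_run:
            log.append(f"would move {old!r} to {new!r}")
        else:
            move_page(old, new)
            log.append(f"moved {old!r} to {new!r}")
    return log
```

Keeping the move itself behind a callback means the list can be checked and dry-run by the operator first, which fits a short-term, monitored project of the kind described.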

I don't imagine there would be any objections, but I believe this is technically supposed to go through a formal WT:VOTE after being mentioned here. -- Visviva 12:14, 1 October 2009 (UTC)

I've made a new vote page at Wiktionary:Votes/bt-2009-10/User:Di gama bot for bot status, but I notice that in the instructions at WT:VOTE, which state:

1. Replace “Title of vote” with what you’d like to start a vote on, or add the relevant user in one of the boxes below.
2. Ctrl-A, Ctrl-C (select all, copy) of the text in that input box (it is the new link).
3. Edit in a new tab that page and add the one line of magic text below the editbox. Again, replace “Title of vote” with your exact vote topic name. Remember to add {{ and }} around your pasted text.
4. For nominations that should be listed in multiple places (in essence, WT:A, WT:B or WT:C) open that page in another new tab and repeat the pasting of the transclusion line.
5. Click the button below the relevant box, fill out the form displayed and save it.

#'s 2 and 3 require editing WT:VOTE, a protected page (as of '06, apparently). —Di gama (t • c • w) 06:48, 2 October 2009 (UTC)

You should be able to edit WT:VOTE; it's only semiprotected, and you've had an account since early last month. Are you only getting the "view source" option? -- Visviva 07:20, 2 October 2009 (UTC)

Ha, I'm clever. I had an "edit" button, but I turned away when I saw the header which said the page was protected (it actually says protected, not semiprotected), not bothering to try actually saving the page. Just goes to show what glossing over notices can get you. Thanks! —Di gama (t • c • w) 08:26, 2 October 2009 (UTC)

The vote needs more input.—msh210℠ 18:25, 19 October 2009 (UTC)

Voting Farce

Voting as is currently happening cannot be allowed to continue; it is an utter fiasco. If you are planning on starting a new vote, please:

Don't rush.
Allow other people to review the wording for at least a week before the vote starts (this is not necessary for the formulaic votes such as adminship requests). This ensures that simple errors, such as the one with counting the voting eligibility vote, can be spotted.
Don't [inserted by Conrad.Irwin 07:00, 7 September 2009 (UTC)] change the vote after it has started. If a week is given for checking things over, it is unlikely there will be any need for this.
Accept the outcome of votes. If the outcome of a vote can be ignored then there is no point in voting. By allowing time before voting starts to discuss issues, most problems can be avoided. If a vote is still inconclusive, then start afresh considering the feedback, there's no point in whipping a dead horse - and it
Avoid using the vote's talk page for discussion of the issue being voted on; we have WT:BP for policy discussions, and changes to vote talk pages are hard to notice.

I don't know whether we should add these or similar points to a Wiktionary:Voting policy page. I'd like to hope it wasn't necessary, but it seems that we are beginning to become too big to run by common sense alone. Conrad.Irwin 00:33, 7 September 2009 (UTC)

Perhaps in the spirit of points one and two, you may want to præface № 3 with "Don't" -- unless I've misunderstood what you're trying to say. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 00:52, 7 September 2009 (UTC)

Indeed, with the Doremitzwr amendment and other editorial adjustments (like clarifying the referent to it in "and it" at the end of 4), this seems like a useful set of guidelines, possibly worth a vote once we are further along in this discussion. Whether guidelines alone are adequate is itself a worthwhile part of the discussion.

Regarding 5. I am not sure that the vote page itself is a good place for extended discussion either. In meatspace, there are laws like "No electioneering within 100 feet of the polling place".

Item 3. might be formalized that a vote commenced must proceed to its conclusion as originally worded or be deemed to have failed, requiring a restart of the process from the start of the one-week review of the wording.

Item 4. The declaration to disregard rules subject to vote or the systematic practice of disregarding those rules would seem to be the kind of thing that should summarily disqualify one from membership in the community. DCDuring TALK 02:21, 7 September 2009 (UTC)

Overall, this looks quite sane. DCDuring's fourth point also strikes true to me, though at this point in time it is slightly too late at night for me to make extended comments on this. However, I also would like to add that I am very, very sick of seeing hostility against people who are opposing a vote. The voting pages are perfectly okay for being a place for discussion, but I would like to see the hostility -- and that's from both sides, not just one -- ending. We don't need provocative comments in the explaining rationale for opposing a vote, but we also don't need provocative comments prompting what does something mean, or calling people imbeciles. The current voting farce, as it may be, prompts me to stick to my own little corner of the Wiktionary, despite the fact that there are numerous things that I do when the environment is pleasant to be around aside from Hiligaynon and Kapampangan. --Neskaya kanetsv? 05:34, 7 September 2009 (UTC)

I too would like to see some sort of policy statement on votes, as some of our past (not just recent) votes have been unmitigated disasters. In addition to the above points, some other topics which might be worth addressing include canvassing, eligibility, vote structure, etc. Some initial thoughts of mine: Prohibitions don't necessarily have to have consequences. If we say "canvassing is disallowed", but cite no specific result of engaging in such behaviour, we don't have to worry about trying to prove someone engaged in canvassing, and yet we still make note that it's frowned upon. That being said, a prohibition without sanctions is dependent upon civility, which is something we cannot always rely upon. Also, regarding pre-vote discussion (which I absolutely agree is a must for most votes, save admin votes etc.), maybe we could come up with a way to address the responsibilities of everyone else besides the vote creator. There have certainly been times where a discussion is started, and either no one participates, or everyone who participates seems hunky dory with it, and then the bill is voted down. Take Ivan's recent Serbo-Croatian disaster. The SC merger was not exactly done in a dark corner. A policy page was set up concerning the proposed merger well ahead of time, its existence was advertised on the BP, and very few concerns were raised on its talk page. Then the vote is set up and everyone and their mom crawls out of the woodwork with cries of "genocide." I think that certainly explains, although doesn't justify, the incivility seen from the proponents. I'm somewhat at a loss as to how this could be done, but I'd like to hear thoughts if anyone has them. Also, I would like to see a voting eligibility. Call me elitist if you will, but I think a minimum of experience and commitment to the project are absolutely reasonable conditions for a productive vote.
And vote structure... how do we go about things when there are multiple options, considering the fact that we have higher standards than a democratic vote? Must there always be a "none of the above" option? Finally, could we at least think about something a little looser than "support" (i.e. I support only this option and no other). I know that there have been votes before, where I'd like to see some sort of decision, but wasn't absolutely set on it being my preferred option. Perhaps we could introduce a "this is my first choice, but I'll vote on anything that achieves consensus" option? Sorry this is so long, but there are more issues we need to address here than have been presented thus far. Also, it should be noted that it would be absolutely unjust if anything decided was applied ex post facto. -Atelaes λάλει ἐμοί 06:01, 7 September 2009 (UTC)

I wonder if we could change the way we structure issue votes, just a little. Wikis are supposed to be about consensus, right? And votes are supposed merely to demonstrate consensus, except that for various reasons, that never seems to work around here. So maybe we should switch things up a little. Suppose we first have discussion, then have a vote only on whether to delegate the final resolution of the issue -- under certain ground rules -- to a particular page where everyone is welcome to participate. A consensus reached on the delegated page would then have policy authority; it could be brought back for a re-vote only if it could be shown either that no real consensus was established, or that the consensus reached was manifestly moronic. In any case, the burden of proof would be solidly on the person(s) seeking to overturn the established consensus. This would minimize the self-selecting effects of having a discussion on a specialized page -- since the page will already have been thoroughly brought to community attention -- but also avoid the usual mess that occurs when people want to try to reach a new consensus after a formal vote has started. Maybe we could trial this on some relatively non-controversial issue? -- Visviva 06:43, 7 September 2009 (UTC)

I'm not too much against a voting eligibility, honestly. I'd just like to see the vote for it be a bit longer after a disaster vote, as until then I'm really not sure it's been discussed enough and that people aren't being reactionary. I'm also against the "within X number of months before the vote" part because people like Connel and Dvorty and even occasionally myself take off from Wiktionary for periods of time longer than the X months, and that doesn't mean that we don't know enough about the project to come back if there is an important vote going on. I think that we do need to think a lot about the option-type votes, because those also seem to be a recipe of some sort for disaster. (Note, more coherent thoughts will be put together in the morning.) --Neskaya kanetsv? 06:55, 7 September 2009 (UTC)

I really like Cirwin's proposal, but also Visviva's suggestions. One really important point about Visviva's suggestions is that, if a consensus is reached, there is no need for a vote of any kind. Just a statement that consensus has been reached unless someone posts a disagreement within X, and then ask everyone who has had a dispute but dropped out of the discussion to review it again. Too many iterations of this, say more than two, would trigger a vote, of course. (Note that both these ideas and the voting model are susceptible to tyranny of the majority.) - Amgine/^talk 15:29, 7 September 2009 (UTC)

I think we need both a well-crafted voting system along the lines that Conrad suggests and mechanisms for reaching conclusions that are taken seriously but can be adjusted more readily than by vote of the type we've been having. Visviva's suggestion seems worth a try. We have had votes along the lines that Neskaya has suggested.

But we need to not be too creative about voting systems. We could have perhaps a few different voting models but they need to be clearly understood and be well named with the voting model that applies to a specific vote clearly shown. The models might vary as to the percentage of vote required, the number of votes for a quorum, the accommodation of multiple choices, and delegation to a virtual subcommittee. Each voting model itself should be subject to our most rigorous approval standard. A voting model perhaps could be testable once without requiring a rigorous vote. DCDuring TALK 16:23, 7 September 2009 (UTC)

I agree with much of what others have written (Conrad's original five points, DCDuring (02:21, 7 September)'s comment on item three, and Visviva (06:43, 7 September)'s comments). Regarding DCDuring (16:23, 7 September)'s comments: We already have templated vote forms linked to from WT:V.—msh210℠ 17:41, 9 September 2009 (UTC)

Danish compounds

In Danish, words like "baby carriage" are usually compounds of the two corresponding nouns, in this case barnevogn from barn, which is barne- in compounds, and vogn, which is unaltered. The article "barne-" exists as a Norwegian prefix, but I'm not really confident about calling it a prefix. My uneducated guess is that it is a compound of two nouns. Is there a parallel to this in other languages to emulate? If considered a prefix + postfix, we should make an affix template, which combines {{prefix}} and {{suffix}}. In the article wiener I added the prefix "wiener-", but again it doesn't feel right. Perhaps we should invent another term, like "compound prefix" and "compound postfix" to address this.--Leo Laursen – (talk · contribs) 09:15, 7 September 2009 (UTC)

By the way: barn can be either "barn-", "barne-", "barns-", "børne-" or -barn (-barnet, -børn, -børnene) in compounds.--Leo Laursen – (talk · contribs) 09:27, 7 September 2009 (UTC)

For what it's worth (I am unsure if it helps but let us see):

In Czech, words usually need to be inflected or modified before they can be compounded, like in "hlavolam" = "hlavo" (from "hlava") + "lam" (from "lámat"). I do not see the Czech "hlavo-" as a prefix, but have no sources to cite on this.

In German, there is often an additional "s" or "es" between two compounded words, like in Todesangst (=Tod+es+Angst), Inhaltsverzeichnis (=Inhalt+s+Verzeichnis).

Ancient Greek is rife with compounds, and I think it silly and counterproductive to make affix entries for every word which appears in a compound. I simply list the regular component etyma if a word appears in a compound, and follows normal compound rules (i.e. ν changes to μ before labial, stem forms are used instead of lemma forms, so we see the stems of ἀνήρ and πούς (ἀνδρ- and ποδ- respectively) in ἀνδράποδον, accent recesses, etc.) However, if an affix cannot be traced to any regular word via regular changes, I make an affix entry and note related words (e.g. ἀρχι-). -Atelaes λάλει ἐμοί 10:18, 7 September 2009 (UTC)

Atelaes, I totally agree. And thanks to Dan Polansky I found Category:Danish compound words and {{compound}}. Thanks a lot.--Leo Laursen – (talk · contribs) 10:32, 7 September 2009 (UTC)

There already is a template which combines {{prefix}} with {{suffix}}: it is {{confix}}. By the way, I am starting to grapple with the very same issue in Dutch, about whether to consider a morpheme as a word or an affix. For example: sfeer versus -sfeer. —AugPi 15:48, 7 September 2009 (UTC)

I've had similar problems in Latin. My approach has been to consider the element a word if it has an independent existence with the same meaning and the same (or nearly the same) spelling. I do this in Latin also for prepended prepositions, such as ad in adsum, since it is also an independent preposition. So, by my Latin criteria, I would call Dutch "-sfeer" a word rather than a suffix. --EncycloPetey 16:29, 7 September 2009 (UTC)

Your criterion seems reasonable, so I deleted -sfeer, since it can stand alone as sfeer. —AugPi 17:16, 7 September 2009 (UTC)

Just to note, I've actually taken this one step further. If a word is a very common affix, in addition to being a regular word, I'll note some of its affix properties in the word entry. See διά (diá) for an example of what I'm talking about. This is in line with what standard Ancient Greek dictionaries are doing, and I find it to be a far easier way to organize information, and more helpful for the end user. -Atelaes λάλει ἐμοί 17:40, 7 September 2009 (UTC)

My only comment at this time is that I'd not want to call it a Prefix. I'm not sure what heading or contextual tag I'd use, but I dislike the idea of having a "Prefix" entry that isn't under the usual Prefix entry name. --EncycloPetey 17:51, 7 September 2009 (UTC)

It is fairly common in Czech and German for prefixes to come in pairs with prepositions; in German: auf, auf-; an, an-; zu, zu-, etc. So I would create the entry διά- (diá-) for a prefix, as I would expect that most prepositions would have a same-spelled prefix worthy of standalone documentation. But I know no Ancient Greek, and this is just the way I would do it following the model of Czech and German. --Dan Polansky 19:51, 7 September 2009 (UTC)

Might our English chemistry terms be relevant? See e.g. methoxy and then methoxypyrazine, where we use the "compound" etymology template instead of "prefix". Equinox ◑ 01:57, 10 September 2009 (UTC)

A new place names proposal

Straw poll: Would you support an addition to WT:CFI along the lines of:

Place names

A place name should be included if it is attested with three durably archived citations spanning at least 150 years. Listings on maps, in gazetteers, or in geographic dictionaries may not count toward the required three citations, although such listings may be used to provide additional information.

I'm interested in providing some sort of place name criterion, and a longer span for supporting quotations seems a suitable way to do this objectively. I hesitate to extend the date span further, since there are major modern cities and nations that have been in existence for only a century or two. Nairobi was founded in 1899, for example. --EncycloPetey 16:24, 7 September 2009 (UTC)

What’s the rationale behind 150 years? Why not include new cities? --Vahagn Petrosyan 16:37, 7 September 2009 (UTC)

Many editors here complain about the inclusion of "small" or "unimportant" places, but we have repeatedly rejected "importance" criteria as encyclopedic. So, I figure that 150 years of citations establishes the durability of the word, which is more a lexical criterion. The choice of number is a bit arbitrary, but covers well more than a single century. It is also designed to push the upper limit of what might be usable, in order to (hopefully) appease the crowd that normally opposes the inclusion of place names. Note that the criterion only states what is to be allowed; an entry failing this criterion is not necessarily therefore excluded. Other considerations may permit its inclusion even when this criterion does not permit it. I would, for example, argue (in some other thread) that capital cities of nations should always be included. However, that is beyond the scope of this proposal. The goal of this proposal is not to exclude new cities, but to find an acceptable criterion for objectively including at least some of the place names we continually wrangle over. We can always debate additional place name criteria at a later date. --EncycloPetey 16:55, 7 September 2009 (UTC)

I’m for including all place names without additional preconditions, but if 150 years is the only way to appease placename-haters, I would vote for such a proposal. --Vahagn Petrosyan 19:27, 7 September 2009 (UTC)

I am of the same opinion. Good place name entries have linguistically interesting and important information. Can we at least include large place names, no matter how old they are, and famous place names (often used in news, literature, etc.)? Those are more likely to be sought by users. The "statistical" information about the size and the presence of universities, job opportunities, etc. matters here as these places are more likely to be used in the written form, therefore users will only welcome them. Anatoli 01:23, 8 September 2009 (UTC)

I agree. I think it would be absurd not to have London and Chicago, if only for the etymological value, and to note translations. bd2412 T 01:58, 8 September 2009 (UTC)

Proving the existence of a word should be the only criterion. I already object to the "3 years" mentioned in CFI: paper dictionaries have good reasons for such criteria, but not a wiki. Words may really exist and really be used a few days after their creation. Why not really adopt, once and for all, the principle "all words, all languages" and focus only on "what's a word?", "what's a language?", "when does a word begin to exist in a language?" Lmaltier 17:54, 7 September 2009 (UTC)

I fail to see why the "three independent durably archived citations" rule cannot apply to cities as well. That rule just by itself filters out all small villages which will never be mentioned in any books. -- Prince Kassad 19:05, 7 September 2009 (UTC)

Support, but just as long as we keep in mind that this is a dictionary, which serves to explain words and their origins and usage, not the concepts themselves. As long as we're just describing the name of any place and where it came from and what it describes (and not providing any information that really belongs in Wikipedia), then this is fine. And it really shouldn't matter whether the name is 150 years old or not, 'cuz of course, like Lmaltier said above, a word can gain usage really quickly. So if the name is widespread and has sufficient citations . . . sure, we can have 'em . . . :) L☺g☺maniac chat? 19:32, 7 September 2009 (UTC)

I support this, as I definitely want to see more placenames here, but mirror the concerns of Logomaniac, and still think we need to better define how we define placenames. I think we're still confusing different referents with different definitions. Truth be told, I still don't have a strong mental grasp on it myself. This is still a problem with other types of proper nouns. If two guys are named John, the word "John" does not have different meanings for the two of them. Likewise, there are multiple Bloomingtons, and yet I think the word "Bloomington" does not have a different sense for the different cities, simply different referents. Again, this is all pretty hazy in my own mind, but I think we need to sit down and have a good discussion about what defining a proper noun really means. If we figure this out, I think it will help us avoid the encyclopedic info problem we've got. -Atelaes λάλει ἐμοί 23:39, 7 September 2009 (UTC)

If you don't mind, Atelaes (and everyone else), I could do a little explaining right now, at least of my take on the situation. There's a difference between "word" and "concept" - a concept is any tangible (or intangible) thing, while a word is just a term used to refer to that concept. That's also some of the difference between a dictionary and an encyclopedia - a dictionary explains the origins etc. of words themselves and defines which concepts they refer to, while an encyclopedia defines the concepts and lists the different words used to refer to that concept. So in your example, two guys named John, the two different guys are two different concepts but they are both referred to with the same word. Ditto with "Bloomington" - the different cities are different concepts but they are both referred to with the same word. As long as we as a dictionary stick to describing the words it'll be good. So with St. Cloud, an entry I created semi-recently, even though the word generally refers to the city in central Minnesota (a concept), there are other cities referred to with that word so I had to leave the definition as something like "any of several cities in the U.S.". I think one of the problems we get into sometimes is that we tend to drift toward explaining the concepts themselves and not just the words. That's where it gets tricky. But for placenames, as long as we just explain the words themselves (and point people to Wikipedia), it should be fine. Wow that got long . . . But anyway, you now know my opinion on the subject :) L☺g☺maniac chat? 00:20, 8 September 2009 (UTC)

I think the important criterion, besides proven to exist, is that a place name in a non-English land should first be given in the original spelling in the language of that place. I don’t think it is useful to have an English transcription of an Oriya village when we don’t have the name in Oriya. In patrolling the language-cleanup category, probably the most common entry I encounter is that of an obscure village somewhere in India, written only in English and no hint of the native spelling or even the language of the village. I delete them out of hand if I can’t get the native name first. —Stephen 00:37, 8 September 2009 (UTC)

IMO we definitely should not list every use of a city name. (To take London as an example, there are tons of them with nothing in common beyond the name.) That's about as bad as listing under Smith all of the millions of individual people who happen to have that name. OTOH, saying that something is a place name, and giving its ety, might be okay (though to be honest I don't like the sound of it, because there are simply so many, including the tiniest of villages, and it seems like clutter). Equinox ◑ 00:41, 8 September 2009 (UTC)

Exactly. That is definitely not what we as a dictionary are supposed to do. Listing every use of a city name (or listing every person with the surname Smith) is purely and wholly encyclopedic, and will never belong in a dictionary. It just shouldn't happen here. So yes, we definitely should not do that. And we wouldn't have reasons to include the names of the tiniest of villages, because such would probably not have 3 independent durably archived citations and would therefore not meet CFI (or whatever EP is proposing to set up), as Prince Kassad noted above. L☺g☺maniac chat? 01:00, 8 September 2009 (UTC)

@Atelaes: I think that there is, for each person named John, a separate sense of the proper noun John. (However, all of them are covered by a single sense of the common noun John (“a person named John”), found in sentences like “There seem to be a lot of Johns in this town.”) We certainly don't want to include each such sense — instead, we give a non-gloss definition for a sort of meta-sense that covers them all — but the reasons for that don't necessarily apply to all place-names. If a name is only used for a tiny number of places, I don't see much need for that sort of artificially-vague meta-sense (except perhaps to avoid a "slippery slope"?). —Ruakh_TALK 02:56, 8 September 2009 (UTC)

Yes, there are infinitely different senses of the one proper noun, but a dictionary would never think of adding a list of such senses - i.e. 1) my brother's best friend, 2) my brother's other friend, 3) my uncle, etc... But they are all covered by a broader sense, still of the proper noun, referring to the name itself and then a common noun referring to anyone under this name. But even if a term (like a placename) is only used to refer to a small number of specific concepts (like towns/cities), we still shouldn't be listing those places as that is encyclopedic. We are just supposed to say that, hey, it is a placename (no matter how many cities it refers to) and point the user to Wikipedia if they want to find the specific places. Remember, this is a dictionary! (I like to say that) and a dictionary just gives the meaning of the word, not the different specific concepts that word refers to. It is almost as absurd as listing all the different types of televisions under the entry television. It just doesn't happen in a dictionary. L☺g☺maniac chat? 19:12, 8 September 2009 (UTC)

The meaning of John in John Smith and John Kennedy is the same, the meaning of Kennedy in John Kennedy and Edward Kennedy is the same. But the meaning of Paris (in France) and Paris (in Texas) is not the same, and this fact has linguistic consequences (e.g. possibly, pronunciation, etymology, or gentilic words). All senses should be listed, whether the word is a toponym or not. Lmaltier 20:27, 9 September 2009 (UTC)

You mean we're supposed to add a sense for every single city named "Rochester"?! Or "Paris"?! There are like, 10 different Rochesters in the U.S. and probably a lot elsewhere. Even if there's only 3 or 4 occurrences of a name, I would still prefer if we left it to "A placename common in the <insert region here>". Of course if everyone else decides otherwise, well, I guess I could just stay out of it. L☺g☺maniac chat? 21:02, 9 September 2009 (UTC)

Yes, I mean exactly that. If you don't think it's necessary, have a look at fr:Beaulieu. You'll see that this page doesn't mention any encyclopedic info such as population, but it mentions linguistic information specific to each sense. Lmaltier 21:09, 9 September 2009 (UTC)

There's a Beaulieu in Hampshire, in case you want to add that one too! Equinox ◑ 21:50, 9 September 2009 (UTC)

But the Paris in Texas (and presumably almost every other Paris) is named for the one in France; so we have only two senses - the first for the original in France, and a second for "any of a number of other cities and towns named after the city in France". bd2412 T 21:23, 9 September 2009 (UTC)

There may be only two different etymologies, but not only two senses. Are inhabitants of Paris (Texas) called Parisians? Maybe, but I'm not sure at all... This information should be mentioned, it's linguistic information. Lmaltier 21:37, 9 September 2009 (UTC)

Don't oppose. I would really prefer that we stuck to the policy that proper names in mainspace have to be used to indicate something other than the literal referent. On the other hand, we should have a policy that people will actually follow consistently. This seems like such a policy. Also, an advantage to the 150-year (or similar) criterion is that it rules out ephemeral region names (metropolitan statistical areas, forest districts, etc.) These are completely encyclopedic, and would be a serious concern if we had only the 1-year criterion. (If a particular sewage district has been around for 150 years, I suppose it's earned some kind of distinction.) There are going to be some new and messy corner cases, but I guess we'll deal with them when they arise.

It would be ideal if these were consistently tagged in some way, so that any future reuser who might want to use Wiktionary as a dictionary could filter this ~~junk~~ ~~bright shiny loveliness~~ -- oh, I'll just stick with "junk" -- out. -- Visviva 04:31, 8 September 2009 (UTC)

We need ALL of them - it's a MASSIVE PROJECT, WT, so no big deal.

E.g. a curious girl in the TINIEST OF PLACES needs to be ABLE TO GO TO WT and FIND OUT what her hamlet's name ACTUALLY MEANS (= etymology)!!
I myself am still not sure about the English name for people from the south of Belgium -- it's NEVER in those OFFENSIVELY B-A-D PRINT DICTIONARIES you guys so like. USERS' NEEDS!!!!!

A problem with 150 years is demonstrated by the fact that the name of Brasília wouldn't qualify for inclusion under this guideline.

Should prescriptive government documents be added to the list of maps & references? On the other hand, we already accept many technical terms with prescribed definitions, and typically put a restrictive label on them like chemistry, medicine, etc. (Or don't label them: look at the for-physicists-only definition of metre!). —Michael Z. 2009-09-08 16:19 z

I can't think of a good way to objectively define a "prescriptive government document". Do treaties and constitutions count as such? Why or why not? --EncycloPetey 05:09, 9 September 2009 (UTC)

I'm just thinking of official lists of place names and their spellings, electoral districts, etc., whose names may not be in general use, or may change or be renamed periodically as governments re-stack the voters' lists. We should go by attested use instead. —Michael Z. 2009-09-09 05:28 z

The proposal doesn't say what it means by "citations". Any use at all? I suspect half the tiny hamlets in Great Britain will meet that criterion (and many elsewhere, such as Colonie (w:)), not that that's necessarily a bad thing.—msh210℠ 17:29, 9 September 2009 (UTC)

One linguistic criterion in a multilingual dictionary such as ours might be that the place name has attestably a different name in at least two languages, or alternatively in at least one language that is not widely used by nationals of that place. This would automatically exclude "hamlets" as they are most likely known in only one language (and pretty unknown in that one, too!), but include important places like London (see Lontoo, Londres, Lundúnir). A foreign spelling is typically something that one would want to look up in a dictionary. This might be combined with other criteria such as "national capital cities always included". --Hekaheka 21:13, 9 September 2009 (UTC)

We don't have such criteria for any other kinds of words. This would just favour place names in colonies and along popular invasion routes. —Michael Z. 2009-09-10 02:00 z

How do we handle the fact that the identity (which is close to being the definition) of the referent changes? In the first instance, let's ignore formal names of the sovereign states and just focus on an area place name, say, Germany. Writings in different periods would necessarily be referring to the Germany as then or previously constituted. A translation of a Latin work that translated "Germania" as "Germany" is referring to something different from later definitions of Germany. Many of the referents do correspond to specific borders, but many don't. Are all the Germanies in the set of entities called Germany 1866-1870, 1871-1914, 1918-1938, 1945-reunification, and post unification the same? Note that even this omits the fluidity of the concept during wars and in the years before Bismarck. The referent, that is, seems to be very fluid, which fluidity is usually reflected in the history of the place or the ethnic or national identity involved. Our existing entry for Germany has three definitions and omits many periods. Inevitably, someone will attempt to insert a complete set of definitions of Germany and attempt to cite each one. Because we emphasize attestation from the written record and historians write copiously, I don't doubt that many senses will turn out to be citable. DCDuring TALK 00:44, 10 September 2009 (UTC)

We're talking about criteria for the inclusion of words here, not of the places they represent, and not even of their definitions or senses, so don't steer this conversation too far into the realm of defining.

But to address your concerns, the word Germany refers to the land of the Germans, and not to a particular set of surveyor's measurements or constitutional documents. —Michael Z. 2009-09-10 02:00 z

That just pushes the problem elsewhere. What is a "German"? Someone who lives in Germany? Well that's no use. Equinox ◑ 02:07, 10 September 2009 (UTC)

Maybe not. Germans in this sense are members of an ethnic group, with a shared language. They originally came from elsewhere, but the region they settled in is named after them. Germans in the civic sense, in turn, are people with certain legal rights in Germany.

But I admit, some dictionaries define these differently. The OED arguably cops out by not defining Germany, even though it says of German (adj) “The precise signification depends on the varying extension given to the name Germany”. —Michael Z. 2009-09-10 02:28 z

Well, it seems circular, but they are basically the same because they have the same name. (Contrast a country that changes its name without changing any geographical borders.) Otherwise couldn't we argue for prime minister having a different, separate referent each time a new one was elected, or cat when a new cross-breed was created? Equinox ◑ 01:10, 10 September 2009 (UTC)

Then, which two of the three current definitions of Germany do we strike or how do we rewrite the definition? Do we forbid other definitions?

I would really like to see examples of some model entries for

a current place that corresponds to a sovereign jurisdiction, say, Germany
a current place that corresponds to a non-sovereign jurisdiction, say, Nice
other inhabited places, say, Cote d'Azur and Hell's Kitchen
other named geographic features (if they are to be included)
1. An ocean
2. A marsh
3. A valley
4. A glacier
5. A plain

Are there any features of these entries as they are that would be excluded? How many senses are to be permitted? How attested? What about maps? Pictures? External links (Official websites, Tourist Bureau, Chamber of Commerce)? Hypernyms; hyponyms; coordinate terms? DCDuring TALK 01:38, 10 September 2009 (UTC)

This proposal is about inclusion of terms, and not about styles of defining (although we could certainly use some of the latter). —Michael Z. 2009-09-10 02:00 z

Oppose. If I cannot know the consequences, then it is a pig in a poke. To the extent that I can foresee the consequences, it appears likely to lead to further dilution of effort to improve quality of existing entries of other types. DCDuring TALK 14:47, 22 September 2009 (UTC)

Must-reads, for those who want to bring in proper names:

Salikoko S. Mufwene (1988) “Dictionaries and Proper Names”
Laurence Urdang (1996) “The Uncommon Use of Proper Names”

Both argue that there is no logical reason to omit proper names, but the former also says “since proper names function prototypically as referential indices, denotative descriptions beyond, e.g., 'personal name' or 'name of a city in GL' (where GL stands for geographic location) should be omitted.” —Michael Z. 2009-09-24 05:06 z

Pronunciations for dead languages

Just looking at σχολή among others, where do pronunciations for ancient dead languages like this come from? I see them for Latin and for Old French as well. Clearly I'm not against them if they're correct, but what sources can there possibly be that are reliable? Unless they were recorded at the time, trying to "reconstruct" pronunciations seems very dodgy. Mglovesfun (talk) 19:02, 8 September 2009 (UTC)

I wouldn't think it was noticeably more challenging than reconstructing Proto-Indo-European. While they may not be totally accurate, enough information can be obtained from verse (where the meter can imply the pronunciation) etc. to make good enough assumptions. Conrad.Irwin 07:47, 9 September 2009 (UTC)

Pronunciation sections for dead languages are, of course, of a different nature than pronunciations for living languages. However, they can be quite reliable. Ancient Greek, specifically, is well suited to fairly precise and confident reconstructions. There are a few things which are used to make such reconstructions, including poetic writings, borrowings, spelling mistakes, and probably a host of other things (phonology has never been my strong suit). Ancient Greek has plenty of all of them, because of its broad historical influence (both temporally and spatially). Our pronunciations are solely based on spellings, but Ancient Greek (especially Classical Greek) was almost certainly spelled based on phonetic principles (instead of etymological principles, as is often the case in English), as its writings came about so soon after the invention of its writing system (a writing system specifically devised for it, which is not the case for many other languages), and was not quite so rife with borrowings as some modern languages are. Later borrowings (e.g. Hebrew names in the Greek New Testament) are probably somewhat less reliable. -Atelaes λάλει ἐμοί 11:52, 9 September 2009 (UTC)

Likewise for Latin. For Classical Latin, there are additional helps in reconstruction beyond the ones Atelaes has mentioned for Greek. Some contemporary grammarians explicitly discuss phonology in clear terms; they compare sounds of different letters within Latin, among dialects, and with Greek. There are writers who bemoan "mis"-pronunciations that are common. There are writers who vary their spellings of certain words, thus indicating similarity of sound, and there are plays on words in some comedies that allow for similar analysis.

There are whole books written on the subject of phonology in Classical Latin, and the best ones (some of which I own) include a discussion of the problems that remain and some of the uncertainties. However, we have a fairly confident picture for most of the sounds, and even an idea concerning elision and other phonological changes that occur in context. The only serious points of contention concern the "long" and "short" vowels. Continental Classicists tend to favor an interpretation of long vowels as of strictly greater duration but with no change in vowel quality. English Classicists favor the view that at least a couple of the vowels differed in quality (different IPA symbol) in addition to being of longer duration. I've chosen to follow the Continental viewpoint as a result of discussion with Latin specialists on Wikipedia and elsewhere, but I have not seen a coherent data-based argument anywhere that I've looked. The difference in view seems to be largely traditional at this point, so I'm going with the people who speak descendant languages. --EncycloPetey 22:02, 9 September 2009 (UTC)

I don't know about Classical Latin, but certainly for Vulgar Latin there are good data-based arguments that short i and u are qualitatively different from long i and u, namely that in the Western Romance languages at least short i and u merge with long e and o, while long i and u are separate. Anyway, I do think it's feasible to give classical Latin pronunciations for Latin entries, and it wouldn't be amiss to give Italianate/Ecclesiastical Latin pronunciations as well. For Old Irish (the old language I'm working on adding forms of right now), it's harder because it's difficult to know what IPA symbol is most appropriate for certain sounds. But even so, it would be nice to do so, because pronunciation isn't easy to guess from spelling in Old Irish, and sometimes it makes a difference to meaning (for example, ingen (“fingernail”) and ingen (“daughter”) are pronounced differently). Angr 11:44, 10 September 2009 (UTC)
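The merger pattern Angr describes can be written out as a small lookup table. This is only an illustrative sketch: the table name, the function name, and the outcome symbols (plain "i", "e", "u", "o") are assumptions chosen for the example, not notation taken from the discussion, and only the six vowels mentioned above are covered.

```python
# Hypothetical illustration of the Western Romance merger described above.
# Keys are (Latin vowel, length); values are assumed outcome symbols.
WESTERN_ROMANCE_OUTCOME = {
    ("i", "long"): "i",   # long i stays separate
    ("i", "short"): "e",  # short i merges with long e
    ("e", "long"): "e",
    ("u", "long"): "u",   # long u stays separate
    ("u", "short"): "o",  # short u merges with long o
    ("o", "long"): "o",
}


def merged_with(vowel, length):
    """Return the other Latin vowels that share this vowel's outcome."""
    outcome = WESTERN_ROMANCE_OUTCOME[(vowel, length)]
    return sorted(
        key
        for key, value in WESTERN_ROMANCE_OUTCOME.items()
        if value == outcome and key != (vowel, length)
    )


print(merged_with("i", "short"))  # the vowel that short i falls together with
print(merged_with("i", "long"))   # empty: long i merges with nothing here
```

If short and long i differed only in duration, one would expect them to share an outcome once length was lost; the table shows them diverging instead, which is the data-based argument for a difference in quality.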

史凡’s complaints

wt=mORALITYbook?!?

re achterklap + https://en.wiktionary.org/w/index.php?title=User_talk:%E5%8F%B2%E5%87%A1&redirect=no

vulgar word=dito ex,which i/this case=REALIFE1[butfine,ilspare myhands the typin

thatwasLANGUAG ASIS USD,REGISTER-adjusted-o,but if from ast*buk,then ok>UGUYS=HILARIES DILETANTICAMATEURS,MYGAD!!
'gain:now anADMIN W/O BABELBOX>thisisNOT ON!![HIDN NOLEDG??orjust none?canhe reada SINGL IOTA evDUTCH,let alonBRABANTIAN??ifNOT,he'd'vRFV'it,esp.givn hisINEXPERIENCEasan editor!![ASMANY EDITS asme kinda,uppl IMPRESME EA&EVRI DAY,NOOT!

>IM SIK&TIRD evMORALITY-BOOK DICT.,leavdad'mision' 2theBIBL&consorts4chrstsake;OFENSIV lango=LANGUAGE2 nNEEDBE DESCRIBD!!!!--史凡>voice-MSN/skypeme!RSI>typin=hard! 02:46, 9 September 2009 (UTC)

If you want to complain about the actions of an editor, complain to them directly before complaining here. If you want to discuss a word, or its entry, start a section in {{rfc}} or {{rfv}}. The beer parlour is a discussion of general Wiktionary policy, and as far as I can see this is just whinging because an editor disagrees with you. Why not talk to Amgine about Amgine's edits? Secondly, could I ask you (again) to write in English? The problem with your style of writing is that any details in what you say are lost, so people have to guess what you mean. I don't see how having problems with your wrists should affect only your style of writing on discussion pages, and as everywhere else you type normally, it looks like you do it just to seek attention. By all means type short sentences, but please use real words. Conrad.Irwin 07:44, 9 September 2009 (UTC)

UR ADMINimplyd=POLICY>BP[wich ino utry2HOUNDME OUT!nicePOLICY
pROPOSAL:BABELBOXSrOBLIGATRY4ADMINS,nVERY WARMLYRECOMENDED4ACTIV EDITORS.
ppl here'vRUDIMENTRY LISTNIN'"SKILS",wichGREATLY IMPEDZ COMM.
to ur"2nd part":ENTRYS=END->NOCHOIS>C/P+WORDS,EX-SENTENCES V.HARD,notmore than 5/6perday posibl[butsure,BLUNTLY RV'em,evenwhen just i/need ofbein'movd orso,'gain: goodPOLICY.
DISC.PP,'gain,=MEANS+IFISIKELI C-A-N-O-T RITE EVRITHIN OUT.
i/realife u'dNOTDARE QUESTION SAY PARKIN4DADISABLD,yet HERE uthink=ok2state"WE DONT ACOMODATE DADISABLD",impresivPOLICY'gain[btw,IS UR PC-LEVL SUCH UR ALSO BLATANTLI RACIST ORSO ADITIONALY?!?
da use ofVOIS MSGS hasbeen requested,butc perabuv point[iwonder wotWMF'lthink boutsuch"policy"
"growup,atentionseeker"ifind suchwordin'OFENSIVnhens aBAD POLICY.
"wingin"iholdad4non-neutralINFLaMeTRY wordchois-great policy.
i'vMY COMUNICATION CHANELZ WIDE OPEN,urs rCLOZD,yet thinkin'ucanSHIFTDABLAME2ME??nice policy[ieBL-SHIFTIN.
nowgo'n'count daFREQUENCYoftheWORD POLICY ABUV."but sure,PUSHME SOLONGtil athe v.endI'LGET ABUSIV BAK{but nowuriz, asper"2weight/measurez"POLICY,ep wil'v dagoodnes evBLOKinme,eternal asurans},nthen umay askurself:isthis aresult of ourPOLICYnATITUDE--自己惹得禍/bROUGHTIT ONURSELFS!!![ButkeepBLAMIN'evcors..
ps atCEDICT iget aPOLITE,NICE'n'STREIT4WED reply uponmy INQUIRYs ,NObad faith asumption,NOsubliminalitys,NOdiscrimination,NOdoubtin'my arms'prob.,let aloneDERISIONlik here

Firstly, sorry, I did not intend to offend you. Secondly, it is your responsibility to tell people when they have made a mistake, you don't need to get the whole community involved every time. Admins have no special status, feel free to tell them when they do things wrong. Thirdly, people here should not care that you have a disability, we all know, it's in your signature, they are not going to be nicer to you because of it, though they should not be less nice either. If you want to record things to say, I'm sure you could upload them to Wikimedia Commons and link to them from here, asking people to talk to you over Skype is simply trading your convenience for their convenience - which people will understandably be reluctant to do; maybe audio from you with text replies is a better balance all round? You are calling for a policy, do you want to record a detailed proposal we can consider - bearing in mind that it should be fair to everyone, positive discrimination is just as annoying as negative? Do you honestly think I am racist, or assuming bad faith, if so, could you show me why as I'd be eager not to appear so in future. Conrad.Irwin 21:57, 10 September 2009 (UTC)

Is the above written by an MD? Well, in Sweden doctors are slightly more ... easy to understand. --Andreas Rejbrand 10:59, 9 September 2009 (UTC)

I must admit I can decipher very little of this editor's writing (and I don't know what to call him or her). Perhaps it would be easier if he/she used more conventional abbreviations and laid off of the caps lock key. --Michael Z. 2009-09-10 13:03 z

You do know, don't you, that the reason this editor writes this way is because of a medical condition that makes it extremely hard to type? I know a person who has this condition and I know what it's like. I agree that his writing is hard to understand but we don't need to get onto him for it. L☺g☺maniac chat? 13:55, 10 September 2009 (UTC)

Yes I do know, but that doesn't change the fact that I understand practically none of what is written above, or that all-caps strings just make communication worse. Better to say so than to nod and smile politely, methinks, or pretend that "史凡" conveys anything to me. --Michael Z. 2009-09-10 22:42 z

OK, I had no idea. Does this condition have a name? --Andreas Rejbrand 21:05, 10 September 2009 (UTC)

His condition is called RSI, as his signature says, and his name is Sven (史凡 Shǐfán - loose transliteration of the name "Sven" in Chinese characters). He (user Sven70) has been blocked in Wikipedia. Perhaps we should wait for 史凡/Sven to tell us more. Anatoli 07:27, 11 September 2009 (UTC)

So how does he manage to produce entries such as op stang jagen and tardive dyskinesia that are perfectly readable? SemperBlotto 21:11, 10 September 2009 (UTC)

I'm tempted to call bullshit, too, given that some of his abbreviations &c. take more keystrokes to input than the full forms would. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 21:50, 10 September 2009 (UTC)

How a user chooses to express xyrself and interact with others is a personal choice, and so long as xe is civil it's not a problem. However, that's not to say that everyone will take the time to attempt to explicate/translate xyr communiqué (and I don't think that can be either required or expected). - Amgine/^talk 22:02, 10 September 2009 (UTC)

Yes, but after receiving a reasonable request with a good reason behind it, I have modified the way I communicate to cut out certain spellings that cause Neskaya difficulty. I don't read any of 史凡's longer posts, because every line takes me ages and often the content (not just the typography and style) is so vague that I'm left understanding little. He says he has RSI; that sounds like it's a bitch, but that's just tough. He could do more to aid communication, including acquiring a faster computer to support a dictation program. I may seem unsympathetic, but I'm simply unwilling to go as far out of my way as I'd need to. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 22:52, 10 September 2009 (UTC)

I think that from reading the responses of people, this seems to be a nice little analogy to reading for me on bad days. Of course, on bad days I don't edit, or come online. I go off and do nice simple things like paint and sometimes play music. And I would like to introduce everyone to a recording of the way that this comes out to a text editor. While I understand that Sven possibly has concerns about accessibility, I wonder if he realises that his writing is literally impossible and therefore not going to be read by anyone with a processing disability, or dyslexia. —Neskaya kanetsv? 06:45, 11 September 2009 (UTC)

(after triple edit conflict) It's RSI, which basically means that one's arms and hands hurt like crazy. I don't know how he manages to contribute - my friend could barely hold things, let alone type on a keyboard. I know this user has voice-recognition software because some of his posts at the Grease pit were in perfectly legible English (but Grease was spelled Greece). And how are you managing to think that he writes this way just to confuse us?! L☺g☺maniac chat? 22:22, 10 September 2009 (UTC)

We have had some contributors who seemed to enjoy acting one or more roles and pulling our individual and collective chains. And, as the cartoon said, "On the Internet, no one knows you're a dog," so claims of any kind need be taken with a grain of salt, all while assuming good faith. DCDuring TALK 23:05, 10 September 2009 (UTC)

FWIW, 史凡's indicated gender is male. {{GENDER:史凡|MALE|FEMALE|UNSPECIFIED}} = MALE. --Bequw → ¢ • τ 14:29, 11 September 2009 (UTC)

section break

c alsoJUL/ACC.REQ[conr:"disbld=nrml,no "spec.trtmnt"
wt=reAL PPL usinMEDIUM~Hobyclub,wrk vid-conf>SOCIETY RULESr validhere!!
soc.:weelchair>ramp,blind>beep-trafic lites;here:did aREC.BUTTNget instald so ican 1-clik it???[c acc.req
conr:DISABLD=ATT.SEEKER,"GROWUP!as=nrml>makes mefeelSO HURTi'v2 get sumwhere quiet&CRY4FRUSTRATion
agf-NONE i'vrecvd;empathy=same[NICEwt>al the-BREAKS
sven70:BLOKDasof myINPUT PROB;ep then a.HOUNDEDme of wp[RSI-emblem rv'd
designated confict mediator"'d'vAVOIDEDsuch[i'vtold conr imopen2IRC/wm comns,butnotknow how2us'em>NOanswer,as4pr kasad2
" "=HARDESTkey;caps near a-s-e+SOFT
itypW/MU NUKLSwenbad likenow.[~impakt/fysics igues
FUTUR:electr.certificats'L BEthere+BUILD-INsp2text[evenwhen conr denys it
wholeKAOHSIUNG:NOergonomics shop[idoWAT ICANbut2put the ENTIRE ONUSonme~conr isNOTwat soc.duz[n imNOTaskin4da sky eitha
i'vALWAYSaskd2be judgd ontheMERITevmyDICT-EDITS.[butpplNEVA CARED.
u'v had "wondaful" i/my wt-break,imdaOPOSITEtho>what you see is what you get
tue/thu itypd alot>we/fr myhandSTRIKESn ibeterLISN2it
conr.:ised:DISABLD=nrml~LATENTracism,D-ABL"stay at home n'be outa daway ofHELTHYppl who'v things2do"~OVERTracism;ivNEVAsed urACTUALYracist'n'nordo iholdu4 1;ineither sedthatUwere part oftheNOTshowin'GFbunchithink its beterNOT2twist ea oth words ala ep when hewants2take aSHORTCUT2push hisviewsthru,asMIScomm. alredy galore;iAPRECIATEur techskilz'n'howuAPLY'em,URedits like german tr-l seem ok2mensopraps itsUNFORTUNATEthatmyFORM REQhitUindaBAK'VurNEK,me bein pushdby continuesIGNORANT/SILYpostsfromanyOTHEReditors;only,wodidunget iswi wotlukslikeaCOMPETENT,SERIOUS,COMMITTEDatleastSEEMSor,letaloneDAD,cosimDISABLD,itkeptreadin',inur case"WEDONT HELP DISABLD PPL2FUNCTION HEREwhichimite'vcaldACOMODAT/ENABLatvariestages,asuminthatwerethe precise engl.words",nsins itseemd ukeptREPEATINTHATNO-HELPstatement,idecided2taketheFORM REQaproach asaMEANS2asEND--ihopethatw/l.m HELPtheMANYNUANCESinmywothasbeenasesdasACTUALY NOTHATBADwilbe redilyavailable2GLEANDupinthe comunityzWONTEDWAYSnirelyrely hope this9mo ofAVOIDABLEDRAMA'lno anendingleadin2IMPROVEMENTS4me,odaDISABLDusers,MOREUNDERSTANDINindacomunitynBETA COMUNICATION--4darecerd,i'vNO PE:SNLGRUDGE gainstANY1 here:ep,sb,conr,dcd-msh-ruak,mgluvs--imFIRMLYCONVICTEDwemost evda/quitafewtimz canjust agre2disagre,atleastUNDERSTANDINea oth w/otryin2put ea oth down--itisnormaly i/mypesnl exp.NOTAGUDTHING2be i/the center ofscrutinitybut,sins itisnowso,ihopeda ocasionGETSEIZD2getquitsum misunderstandings outadaway,improvwotwecan,nmoveon4weds w/aCLEANSLATE2wedz abrite future!!

translation: "Conrad: I said that to take the disabled for entirely normal is comparable to the incorrectness of latent racism; saying that disabled "should stay at home and be out of the way of healthy people who have things to do" as f.e. sb seems to imply is the equivalent of overt racism; I've never said you're actually racist and nor do I hold you for one; similarly, I neither said you were part of the not showing good faith bunch. (though many editors here are guilty of that towards me and other newbies.) I think it's better not to twist each other's words a la EncycloPetey when he wants to take a shortcut to push his views through, as miscommunications are already galore; I appreciate your technical skills and how you apply them. (e.g. only since assisted editing for translations I'm able to contribute there, please continue the good work! :) Your language edits like german translations seem to be OK to me quality-wise (which i hold for no mean feat as you seem mainly monolingual, as many here, per your babel, and thanks for having one! :), and so perhaps it's unfortunate that my "formal request actually hits you in the back of your neck, me being pushed by continuous ignorant and silly posts from many other editors; only, what I don't get is why what looks like a competent, serious, committed person (all properties very high on my value scale ;) at least seems to keep saying the above refered to (I read your posts, and other people's, over and over again, thinking they'd not be really saying all that, but, as a near-native english speaker who seems to at last have won some or a degree of community approval and respect based on the merit of my entries and not 'cause perhaps i'm actually a nice guy, brown-nose with admins (who'd say that?! 
:P ) (sense of humor is an asset, i think, though even that is sometimes lost here, also from my side :( ) or, let alone that, 'cause I'm disabled (as I said before, I'm not very convinced about positive discrimination, and didn't invoke it for myself to say the least), approval thus of my work here these nine months, even from what I'd say are rather critical people (which I can be myself, so no first stone from my side here), still, it kept reading, in your case "We don't help disabled people to function here", concept which I likely have called accommodate/enable at various stages, assuming that were the proper and precise english words, and which I strongly expected, felt and feel you guys would and should, sure, within reason -- for example I'm not expecting you guys to have speech-to-text at your end as of yet since the software is not so far developed at this stage (who'd have believed in free internet-phoning 20 years ago? I communicate via that technology anno 2009, and thoroughly enjoy it! :p ), so, entry-wise I type everything out as I don't get Dragon Naturally Speaking (my speech recognition program) to run for now on my old beast of a laptop (can't be substandard there, I agree with sgb though about different accuracy-requirement on discussion pages, that exactly was my point all the time, and at least tolerance there seems due in my view (doesn't it feel I'm already and really doing all I can?!) and since it seemed you, Conrad (amidst others admittedly, but since you're a tech-whiz you're key to people like me here, and to WT in general), kept repeating that "no-help" statement, I decided to take the "formal request"-approach as a means to an end, which is: "Please help me function here" -- does it really need to be that i get pushed just that far I'm like now hanging crying for upsetness over my computer to state what I feel is only the very obvious in developed society?! 
In general in my experienced here I find WT-people very concerned with themselves, but not very thoughtful of their own actions, where I in contrast go by "give as good as you want and are willing and prepared to get", which means that if I've been a ass myself somehow, whether unknowingly or on purpose, I find it very much ok for people to tell me, including in an upset way, as long as they make sure to have me see why they feel like that and where i went wrong -- you guys with upset newbies, my gad, you people just block 'em all :( -- I hope that with Logomaniac's help (herculean effort, please let's try for a record button!!) the many nuances in my what has been assessed, to the community's apparent surprise, as actually not that bad English writing (of course it's not, as i told you guys over and again from the very beginning, the Keyboard is the problem!!) will be readily available to be gleaned up in the community's wonted and wanted style-ways, and I really really hope these nine months of avoidable drama will know an ending leading to improvements for me and all the other disabled users, more understanding in the community (I like, nay, loooove understanding things, but, am i the only one thus here?) 
and better communication -- for the record, I've no personal grudge against anyone on this forum: EncycloPetey (at times the pedanticness, sophistries and fallacies as well as limited outlook on language learning and how dictionaries could and should help in that process are a bit much), SemperBlotto (ignoring can be a lower form of tolerance to give it that twist, and thus not all that bad -- it (ignoring) is supposed to be also a rather rude thing in Flanders like say with friends, but we're mere collaborators here on a world-wide stage, and it's what I asked for myself anyway - to be at least tolerated - and I came across quite a few definitions from you I really liked and looove it when you help me make my entries work, even despite my discussion pages-shorthand!), Conrad (Note I didn't ask for a block which I'd hold for the very last and unconstructive thing to do -- I just like this to get solved and the unnice remarks from the community re my input problem (not "writing style"!) to stop; it's really very frustrating that one, me, has to actually ask such, comparable to an overweight person having to implore to please not call him/her fat -- see now what I mean with uncivil??), DCDuring, msh210, Ruakh (please be not so flowery guys, as many non-natives have signaled, like you with me I hardly ever get your meaning, especially DCDuring's) and have to guess), Mglovesfun (Was I un-nice about getting the nl wrong in the entry i made (over de kling jagen)? 
It was an attempt, and that is what we're here for, but persistingly so not wanting to understand the nature of my problem really gets at me, and keeping deleting useful entries for learners because of some misguided CFI IMHO (and mind it's mostly not my entries, which is irrelevant but the usefulness of WT), especially when just discovering about what goes on at RFD I can get mega-upset about that too) -- I'm firmly convicted we most of the/quite a few times can just agree to disagree, at least ensuring we understanding each other without (w/o - another user used that abbreviation too, so why then come to me and give me an additional hard time about it?) trying to put each other down -- It is normally in my personal experience not a good thing to be in the center of scrutiny (I actually try to avoid the attention Conrad reproached me to seek, and as almost 40 years old i can also do without "grow up!" comments really) -- I just wish, since it is now so anyway, my disability, RSI, would be understood, leading to WT be accessible also for disabled people; Furthermore, as said before: I do not think it matters who proposes etc. as long as that happens and wiktionary keeps progressing, so I hope now quite some misunderstandings will get and be out of the way, improving what we can, moving on forwards with a clean slate towards a bright future!! (I'm an actual optimist in disguise i guess :) (My poor hand, tomorrow will be a day off I reckon...) (sven 13.9.9)"

whew, that one took a long time...! Hope I didn't get it wrong... :) I KNOW , I HOPE THE COMMUNITY WILL SEE SOME SENSE IN TIME--IT REALLY CAN NOT GO ON LIKE THIS... (i just expanded a bit, ur LONGHAND CONVERSION is admirably accurate!! *electronic friendly and grateful hug* L☺g☺maniac chat? 19:39, 13 September 2009 (UTC)

mytalk:8.5/10=no-reasn coments[cf.polanskis
NOTevry1canSPLASHOUT onhardware justlikethat.
9mo here>wat baten kaars en bril als de uil niet zienen wil

--史凡>voice-MSN/skypeme!RSI>typin=hard! 08:37, 12 September 2009 (UTC)

Automated translation by Logomaniac: "see also July/accessibility request (conrad: "disabled=like normal person, no "special treatment here")
wiktionary = Real people using a software-enabled medium - similar to a hobbyclub, or at work video-conferences > Society rules are valid here!
society : wheelchair users get access through ramps, the blind are provided with beeping traffic lights, here: did a Recording button get installed so I can just 1-click it and function in the community? (see acc. request)
assume good faith - i have None received whatsoever; empathy has not been shown to me but on the rarest of occasions (the wiktionary is so nice, hence all the WT-breaks regulars need here)
sven70: Blocked as of my Input problem (Not "writing style" -that is a derogatory label affixed on me per extension) ep then hounded me of Wikipedia (even the RSI -disability emblem i created through pain got removed)
a designated conflict mediator" would've avoided such (I've told conrad I'm open to IRC/wikimedia commons, but don't know how to use 'em > No answer, same for Prince Kassad
" " - hardest key: caps lock near a, s & e + soft
I type with my knuckles when it's bad like now actually!?(~impact/physics-related i guess, and indeed, y'day i had problems holding my beloved teacup :( )
Future: electronic certificates will be there as built-in speech to text (even when Conrad denies it)
whole Kaohsiung: No ergonomics shop (I do what I can but to put the entire onus on me like Conrad is not what society does (and I'm not asking for the sky either)
I've always asked to be judged on the merit of my dictionary edits (but people never cared)
my talk: 8.5/10 - unreasonable comments (cf. Polansky's)
Not everyone can splash out on hardware just like that
9 months here > wat baten kaars en bril als de uil niet zienen wil

/Automated translationBETTER:LONGHAND, please tell me if wrong --PRETTY MUCH SPOT-ON,TA! SV L☺g☺maniac chat? 13:03, 12 September 2009 (UTC)

Bullet 3 — an accessible “record” button somewhere — sounds like a good idea. Is this feasible to implement? † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 09:47, 12 September 2009 (UTC)

conr.said'tWASthere,butDISCARDED

Automated translation: "Conrad said it was/used to be there, but discarded (and sorry for getting abusive myself at times, i just wish you guys would get just how far i feel pushed and dragged down by the described :("

I don't recall ever mentioning such a record button, though I may have forgotten. There are several projects, http://shtooka.net for example, that provide free audio recording software that works on all computers; however, I don't believe it would be easy to provide an online "record" button. (It is possible in Java, but the browser's security manager normally disables access to the microphone, and while it should be possible in Flash, a quick google turned up nothing useful.) Given that the best way to record the audio is a program on the user's computer, the only improvement I can think of would be to make uploading at commons easier (I still find it next to impossible to navigate the endless dialogs). As sound files are likely to have fairly similar conditions and requirements, it might be possible for us to implement a special audio uploader on toolserver, but I can't recall any attempts to do that (at least not just for Wiktionary audio; people do use a photo upload tool). If there is interest in this, I can have a go at it, but as I don't deal much with audio myself I would need a fairly detailed overview of what is required. Conrad.Irwin 19:53, 12 September 2009 (UTC)

pltry--史凡>voice-MSN/skypeme!RSI>typin=hard! 05:29, 14 September 2009 (UTC)

Just thought I'd say to you Sven, I fixed up that entry you made since you had to use shorthand to make it easier to do :) 50 Xylophone Players talk 17:15, 13 September 2009 (UTC)

Sven, keep up the good work but we need more Chinese entries :) I have to apologize again for commenting on your style initially. I was under the assumption that you were a young guy showing off. I still don't understand it without the translation. Good job, Logomaniac! Anatoli 05:41, 14 September 2009 (UTC)

So glad I can help. :) L☺g☺maniac chat? 14:21, 14 September 2009 (UTC)

Sven, keep up the good work but we need more Chinese entries :) I have to apologize again 1TIME WAS MORE THAN ENUF!;) One time was more than enough! (I just wish people'd understand) for commenting your style initially. I was under an assumption that you were a young guy showing off. I still don't understand it without the translation. ICANTRY CHANG IT'GEN,NOCAPS NSUCH,BUT SOLUTION 'LBE SOUND-BASD I/DA END I can tr to change it again, no caps and such, but solution will be sound-based in the end Good job, Logomaniac! JUSTHOPE LIFE WOn'T GIVU RSI2,THATBE2IRONIC..+ofn sore>pltry dns,tho notcheap.. I just hope life won't give you RSI too, that would be too ironic (if you're getting tense muscles when typing (not all people do > they tend to be fine) and often sore > please try dns, though not cheap Anatoli 05:41, 14 September 2009 (UTC)

- np-q:wi itsO HARD2notis[myRSI? - *chin:i'd2STOPcosSOMUCH TYPINmy hands started2inflame>makesme feel~w/flu/ general malaise [iso wishACAI'd push4edit mask2:( - *my shrthnd did change tho,twas muchmore fonetik,so ialredy changd alot ontheirfb,nowmore drop somleters/spaces--only,icantbreach myremainin functional reserve :( [which inow/2dayspent ondadiscsn>no entrys posibl:( [ijust hopeda investmnt pays of>hapier4evrybody i/future!:) - *stilfeel young at40tho!:) - ta4ur nice post!!:D--史凡>voice-MSN/skypeme!RSI>typin=hard! 11:11, 14 September 2009 (UTC)

Question: why is it so hard to notice? (my RSI I MEAN - IS THIS DISEASE so unknown?)

Chinese: I had to stop because it involves so much typing that my hands started to inflame, which makes me feel like with flu / general malaise, which means i'm in bed then, sick. (I so wish Acai'd push for edit mask or chinese too :(
My shorthand did change though, it was much more phonetic-based when i got here(EncycloPetey and Atelaes know), so I already changed a lot on their feedback; now i more drop some letters (vowels and double consonants) and spaces (including trying dns) -- only, I can't breach my remaining functional reserve :( (which I now have spent today on the discussion pages, so no entries possible :( (I just hope the investment will pay off > happier editing for everybody in the future! :)
I still feel young at 40 though! (and likely still will say so at 60, if that's on my cards ;)
Thanks for your nice post!! :D
TA'GAIN LM! L☺g☺maniac chat? 16:55, 14 September 2009 (UTC)

from palkia's talk,rel.sctns:

isjustTALKppMARATHONS'dbe madeACESIBLE2mesoICANTFISIKLI,evenwhen entrysofme rconsidrd ok[4wich imhapi!!:p
i'vno meanbone i/me,ijust wishtheyGET2UNDERSTANDwotim struglinw/nCANTDO,THEYppl--fr.wp=MOST CMN occ.diseas i/fr,inOz=epidemic,wasi mistakn i/c-inda goals v wmf ppl'dbeENLITENDherbut9m0runin,postin w:rsiNUTINworkd,BUTpraps sbw/HELTHYhands'n'resn canmakeaDIFRENS--inourshrt/onlyRECNTdealinz uimpresdme aSENSTIVnRSNBLwichiV.MUCH APRECIATE

From PalkiaX50's talk, related sections:

is just talk pages marathons (pretty much invariably so :( ) would be made accessible to me (my function reserve = about less than 1/2 an A4 page a day (every day being a bit different) so I can't physically, even when entries of mine are considered OK (For which I'm happy! :P)
I've "no mean bones" in me (dixit a friend), I just wish they'd get to understand what I'm struggling with and can't do, they (Wiktionary people in general) are surely not racist as they know it's wrong, and I bet the same would happen with me and RSI inflicted (It really is, though get the keyboard gone and I'm fine > sport, house chores, hell, work, yes!) people -- from French Wikipedia = most commonly occurring professional disease in France; in Oz (where I got it, though not infectious, just bad ergonomics :( ) they say they have an RSI epidemic; was I mistaken - I took seeing the goals of WMF that people would be enlightened here (really thought that would) but nine months running, posting w:RSI, explaining etc, nothing worked, but perhaps healthy hands and reason can make a difference (Logomaniac) - in our short/only recent dealings you impressed me as sensitive and responsible which I very much appreciate (like Mglovesfun's new comment, do they actually realize?? 2 inches going out of their way to help me function is asking too much, insisting on me typing > y'day my hands inflamed'gain :( ...
TA LM! SV 16.9.9]

L☺g☺maniac chat? 19:17, 14 September 2009 (UTC)

After some thinking about it, I'm also feeling like complaining a bit. Bear with me for a moment, please. I don't get why it is that I, a 13-year-old, am the only one who a) is able to read / takes the time to try to read 史凡's posts and b) have suggested and am doing something helpful?? This week I found a very apt song for this situation. I'm glad that I can help by translating his posts into readable English but please, does this have to be the only solution? :P L☺g☺maniac chat? 15:15, 16 September 2009 (UTC)

I don’t blame you for feeling exploited. Facilitating 史凡’s communication must take a significant amount of time and effort. Rest assured that your work is appreciated, by all parties, I’m sure. Certainly, this is no optimal solution. I think that the best hope lies in audio recordings, and that once 史凡 gets used to creating them and uploading them to ~~MediaWiki~~ Wikimedia Commons, he’ll find it a far preferable modus operandi to expending his “functional reserve” in typing here; whatever can be done to facilitate this recording-and-uploading process should be done (I, for one, can’t help with that, given my minimal technical expertise in that respect). Of course, no one could justifiably ask you to perform this translator/transliterator rôle; however, no one did ask you — you took it upon yourself to do this, and you have every right to withdraw your service, if you find the burden to be too great or the recompense to be too meagre. This is the harsh truth of volunteering, I’m afraid. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 15:40, 16 September 2009 (UTC)

It's not that I don't like translating his posts. I'm glad to be able to help. It's just that I intend it to be a temporary solution while we're trying to figure out audio. :) I just get a little discouraged at times when it seems like none of the adults around here are acting like one. I'm also pretty tired today so I apologize for being more irritable than usual. L☺g☺maniac chat? 16:00, 16 September 2009 (UTC)

agre +ta both ofu-iltry2nite

Your longest post took me about an hour. L☺g☺maniac chat? 13:59, 17 September 2009 (UTC)

wa-me>1/2day

Verb lemma forms for various languages

I would like to officially start to build a collection of which verb form is the lemma or dictionary form for each language.

For English, the Romance, and the Germanic languages, it's the infinitive, but for many languages it's not.

Some have no infinitive, some have multiple infinitives. For many languages there are more or less standard names for the verb forms or at least traits of the form can be given such as person, number, tense, aspect. Many languages especially of Asia are not inflected at all and hence verbs have only one form thus no need to name it.

If you know a language for which the form used as headword in dictionaries is special or noteworthy please list it here. I will add it as a resource to my language metadata server currently running on the toolserver.

Here are a few languages I know to have lemma forms which are not like the English infinitive:

Ancient Greek
Hebrew
Latin

— hippietrail 01:51, 10 September 2009 (UTC)

If you start a page where these can be tabulated, I suspect it will grow. The "About" pages usually list this information, as a starting point for someone who'd like to comb through them. --EncycloPetey 03:00, 10 September 2009 (UTC)

I'll start here and move them when somebody suggests a good page title. Here are the ones clearly and easily findable in the "About Language XYZ" pages:

Ancient Greek: present active indicative first singular
Arabic: third person masculine singular perfective (some dictionaries use the 3rd masculine singular imperfective)
Armenian: infinitive
Old Armenian: first-person singular present indicative
Belarusian: the base form is the imperfective infinitive, but the perfective infinitive, though secondary, gets its own page as well
Bulgarian: present simple-tense first-person singular
Catalan: infinitive
Chinese: not inflected: verbs have only one form
Czech: infinitive
Finnish: infinitive
French: infinitive
Galician: infinitive (impersonal)
German: infinitive
Greek: first person singular of the present tense and indicative mood
Hebrew: third-person masculine singular past
Haitian Creole: verbs are uninflected
Hungarian: third-person singular of the indefinite present
Ido: present infinitive
Italian: infinitive
Japanese: the non-past tense (verbs have no person, gender, or number)
Khmer: not inflected: verbs have only one form
Korean: infinitive (i.e. ending in 다)
Lao: not inflected (only one form)
Latin: first-person singular present active indicative (first principal part)
Limburgish: infinitive
Lingala: verb stem
Macedonian: third-person singular simple present (some dictionaries use the 1st person like Bulgarian)
Navajo: third-person singular present
Ojibwe: third-person singular present
Old English: infinitive
Old French: infinitive
Portuguese: infinitive (impersonal)
Quechua: third-person singular present
Romanian: infinitive
Russian: the base form is the imperfective infinitive, but the perfective infinitive, though secondary, gets its own page as well
Sioux: third-person singular present
Spanish: infinitive
Swahili: this has been severely depressed here and the situation is unlikely to change anytime soon, but if we were adding Swahili, normally the indicative root form of the verb is used...e.g., -peleki (to send).
Swedish: active infinitive
Thai: not inflected: verbs have only one form
Turkish: infinitive (some dictionaries use the stem instead: koş- rather than koşmak)
Ukrainian: the base form is the imperfective infinitive, but the perfective infinitive, though secondary, gets its own page as well
Vietnamese: not inflected: verbs have only one form
West Frisian: infinitive
Yup'ik: third-person singular present

— hippietrail 03:57, 10 September 2009 (UTC)
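For anyone wanting to reuse the list above as data, here is a minimal sketch of how a few of its rows might be stored and looked up. The dictionary layout and the function name `lemma_form` are assumptions made for illustration only; they do not describe the actual format of the language metadata server mentioned above.

```python
# Illustrative subset of the list above. The structure and names here are
# hypothetical, not those of any existing Wiktionary or toolserver resource.
LEMMA_FORMS = {
    "Ancient Greek": "present active indicative first singular",
    "Hebrew": "third-person masculine singular past",
    "Latin": "first-person singular present active indicative",
    "French": "infinitive",
    "Hungarian": "third-person singular of the indefinite present",
    "Chinese": None,  # not inflected: verbs have only one form
}


def lemma_form(language):
    """Return a readable description of the verb lemma form for a language."""
    if language not in LEMMA_FORMS:
        return "unknown"
    form = LEMMA_FORMS[language]
    # None marks languages whose verbs have a single, uninflected form.
    return form if form is not None else "uninflected (single form)"


if __name__ == "__main__":
    for name in ("Latin", "Chinese", "French"):
        print(name + ": " + lemma_form(name))
```

Using `None` for uninflected languages keeps "no form needs naming" distinct from "language not yet listed", which seems to match the distinction drawn in the opening post.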

And as far as I know, the lemma is the infinitive for any "modern" Romance language, although in Galician and Portuguese it's the impersonal infinitive that is used, since there are also personal forms of the infinitive in those languages. There are also oddities in the inflection line for Romanian verbs that you might ask Opiaterein about. There's a particle that appears before the infinitive in most Romanian grammars, but which isn't used for our page names or for the headword in dictionaries. It may or may not deserve a note of some kind. Note also: the Hungarian lemma form is actually the third-person singular present indefinite, which is also the form that most dictionaries use. --EncycloPetey 04:09, 10 September 2009 (UTC)

Okay when one of the links Wiktionary:About lemmata or Wiktionary:About lemmas becomes blue it means I have chosen a page name (-:

Yes English, Romanian, and Icelandic all use a "to" particle with verb infinitives which are used in grammars but not in dictionaries. Also apparently certain prefixing African languages use the stem as the lemma for verbs which as such would begin with a hyphen generally which is omitted though for dictionary headwords. — hippietrail 04:29, 10 September 2009 (UTC)

Er... The "About" titles are usually indicators of style guides for specific languages. Why not just Wiktionary:Lemmata or Wiktionary:Lemmas? --EncycloPetey 04:49, 10 September 2009 (UTC)

Why not? Because just as I saw nothing about the prefix "About" to prevent "About lemmas" being sensible, there may be someone who for some reason equally opaque to me objects to "Wiktionary:Lemmata". Perhaps there is some other unwritten "usually" that might apply to that. Hence I will "just" wait for now (-: — hippietrail 07:55, 10 September 2009 (UTC)

If it’s all the same to you, hippietrail, I’d prefer (like EP does, I think) such a page to be at Wiktionary:Lemmata. Wiktionary:Lemmas as well as the About-prefixed forms could (if need be) redirect thereto. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 09:58, 12 September 2009 (UTC)

Japanese: the conclusive form, known in Japanese 終止形; also colloquially known as dictionary form (辞書形) by some. Bendono 04:48, 10 September 2009 (UTC)

Japanese and Korean have very many ways of inflecting verbs besides the "tense" mentioned above. In particular there are the levels of politeness. Books that teach Japanese begin with the "-imasu" forms but it seems there is a "plain" form which dictionaries might prefer. — hippietrail 07:55, 10 September 2009 (UTC)

Anyway the canonical form will be 終止形 (shūshi-kei) which is unique to each verb, as Bendono mentioned above. Using terminology of w:Japanese grammar, it is the terminal form. --Tohru 08:31, 21 September 2009 (UTC)

whynot justINFINITIVSwhere posibl soitsSTANDEDIZD?!--史凡>voice-MSN/skypeme!RSI>typin=hard! 10:27, 10 September 2009 (UTC)

Automated translation: "Why not just infinitives (e.g. Latin) where possible so it's standardized?" L☺g☺maniac chat? 18:44, 12 September 2009 (UTC)

You could read an explanation on the various "About" pages for those languages. Latin has multiple infinitive forms (present, perfect, future both active and passive), so how do you pick which infinitive to use? The infinitive in Latin is often a grammatical noun, so using it to represent the verb introduces complications. Also, Latin textbooks, dictionaries, schools, etc. do not use "the infinitive" as the headword form; they use the first principal part instead. So why should Wiktionary use a form different from most textbooks and dictionaries? (the same goes for Ancient Greek, Hungarian, etc.) This would make it harder for our users to look up words and harder to coordinate between different language Wiktionary projects. There is a saying in English: "Foolish consistency is the hobgoblin of little minds." Trying to shoehorn all languages to one arbitrary verb form for the sake of "standardization" would be foolish consistency. --EncycloPetey 14:22, 10 September 2009 (UTC)

NOTso i/flanders:inf pr fromemory
engl:2have done sth>lets put def under s/he/it sees.
we'd getaway from pos-obsesion,we no gramabuk primarily--史凡>voice-MSN/skypeme!RSI>typin=hard! 17:46, 12 September 2009 (UTC)

Automated translation: "Not so in Flanders: infinitive present from memory

english: to have done something > let's put definition under she/he/it sees.

We'd get away from part of speech-obsession, we are no grammar book primarily [any add. grammar notes fine w/me tho

L☺g☺maniac chat? 18:44, 12 September 2009 (UTC)

Swedish: active infinitive. \Mike 16:15, 10 September 2009 (UTC)

I see hippietrail's mentioned Hebrew, but not listed it: we use the third-person masculine singular past.--msh210℠ 19:46, 10 September 2009 (UTC)

I have edited Ukrainian and added Belarusian - same approach as to Russian. Polish, Czech and Slovakian also have perfective and imperfective, which can be treated the same way, but I know Czech perfective is preferred over imperfective in Wiktionary. For Arabic, I would leave perfective as the official form. Anatoli 20:22, 10 September 2009 (UTC)

Tsalagi/Cherokee: root minus prefixes, suffixes, or anything else, for example, -e-, go. Which really reminds me that I should be actually working on that language. Stupid syllabary that I don't know well enough. --Neskaya kanetsv? 06:37, 11 September 2009 (UTC)

More on some Slavic languages. Some verbs are used only in one type of infinitives (perfective/imperfective) (e.g. интерпретировать, интегрировать) or the other type is very-very rare or, also, one type makes too different from the other to make a different word altogether: класть/положить ("to put" impf/pf) говорить/сказать ("to say" impf/pf, the former also means "to speak"). Anatoli 19:43, 11 September 2009 (UTC)

Yes, some verbs are imperfective only, a few are perfective only, a few are imperfective and perfective at the same time, and some are imperfective in one meaning, but perfective in another meaning. Most dictionaries do not treat perfective verbs that have imperfective counterparts, but merely redirect them to the imperfective. The reason that we treat both forms individually is that we are furnishing the conjugation, and it is confusing to put both conjugations on one page. I think this will apply to the Japanese as well...Japanese verbs are often quite different from the humble to neutral to exalted, honorific infinitives, the plain and the polite. Humble form of to eat is itadaku, neutral is taberu, exalted is meshiagaru, each with a different conjugation. Then there are the polite forms, such as tabemasu. —Stephen 13:43, 12 September 2009 (UTC)

We're doing something similar for Spanish verbs that have a reflexive counterpart, although in that situation the reflexive verb often has a different meaning. The reflexive and non-reflexive verbs have separate entries that are interlinked. We also do entry-splitting for Latin Participles, where the lemmata of the various Participles have separate pages, each with an inflection table for that participle. You can't really squeeze all the participial inflections into a verb conjugation table meaningfully. You have to worry about gender and case for a participle (but not for a verb) in Latin. Also, Latin participles sometimes have substantive meanings for the neuter participle forms as well, which would confuse the definitions section of a verb entry. --EncycloPetey 18:04, 12 September 2009 (UTC)

Haitian Creole: Verbs are uninflected.
Korean: infinitive (I think that's what it's called; they end with 다 e.g. 오다) —Internoob (Talk•Cont.) 23:10, 14 September 2009 (UTC)

No, the Korean infinitive is generally (by Martin &c.) considered to be the 하여, 와, etc. form -- what is sometimes called the "polite stem". IMX the dictionary form is usually just called the "dictionary form", though 기본형 ("basic form") also has some currency. -- Visviva 08:12, 21 September 2009 (UTC)

I think it would be worth noting in the list above that Turkish infinitives end in -mek and -mak. Also, the dictionary forms for Hindi and Urdu are the infinitive, which ends in -nā (ना and نا) Albanian uses the first-person singular present indicative verb form. Lithuanian is the infinitive, which always ends in -ti. — opiaterein — 23:02, 15 October 2009 (UTC)

Featured entry

Has a featured entry proposal ever come about? Promoting the best content Wiktionary has to offer seems like something that would be beneficial to our readers, and an ideal for our contributors to aspire to. –blurpeace ^(talk) 10:06, 10 September 2009 (UTC)

Isn't that exactly the role Word of the day fulfills? -- Prince Kassad 10:44, 10 September 2009 (UTC)

No, the WOTD does what most similar efforts from other sites do. A "Word of the Day" is selected for the interest/utility/etc. of the word itself, and not for the quality of the entry. We have discussed having a "featured" entry, but concluded it would be pointless. Who really cares about seeing a really great dictionary entry displayed as if it were a work of art? (Wow, look at the nesting of those section headers!) That's more an encyclopedia option than a dictionary option, because an encyclopedia is displaying topical content where the format has freedom to vary from entry to entry. A dictionary project like ours is trying to produce complete entries that will all have a largely uniform format. That's dull to look at every day, and featuring an entry on the basis of its really great format has no appeal to the general population. The French Wiktionary tried such a thing, and the project quickly aborted. We've chosen to stick to WOTD, which features an entry based on the meaning of the word and not on the basis of entry format. This is the sort of "featuring" that most people expect in a WOTD. --EncycloPetey 14:02, 10 September 2009 (UTC)

I think I agree with you. Anyway, most often, the best pages are the less impressive ones and the shortest ones. Lmaltier 21:16, 10 September 2009 (UTC)

I don’t think it would be so dire. A featured entry would probably be for a highly polysemic word, with multiple alternative spellings, etymologies, and pronunciations, allowing an enormous number of supporting quotations, synonyms, antonyms, related and derived terms, other terms of some applicability, translations, and so on. In such a case, there’s a lot to go wrong, so getting it all right is a lot of work and no mean feat; at the very least, having a category of such “featured entries” will be exemplars for editors to emulate. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 21:48, 10 September 2009 (UTC)

Or should it be one that is particularly helpful to translators or new ESLers or just be prettier or longer than competitors' entries? I'm fairly sure that we will decide that what a restricted category of eligible voters like will be deemed featured. That's how other dictionaries do it, isn't it? They all would like their entries to be just like the OED's but they just don't know how, so their approaches aren't worth learning from.

It might be good if we decide what we are proud of and then seek and get feedback from a good number of non-voters, but I'm fairly sure no one will feel inclined to do that because of our need to respect the privacy of users. And if we do get some feedback, we will discount it because it is a biased sample, not of our real target user; because users don't really know enough to tell us anything, etc.

I wouldn't mind being wrong about any or all of this. DCDuring TALK 22:57, 10 September 2009 (UTC)

Must you be so sarky and negative? Why don’t you propose some common-sense criteria based on the purpose of a dictionary? † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 23:19, 10 September 2009 (UTC)

Glad you asked. Why don't we put some English entries from each PoS up against the entries of competing monolingual on-line dictionaries and compare the coverage of the definitions. One article or conjunction or pronoun, 1 determiner, 2 prepositions, 3 phrasal verbs, 3 idioms, 5 adverbs, 10 each of nouns, adjectives, verbs, 2 proverbs, 2 idioms. Perhaps the numbers are wrong. Perhaps it could be done outside principal namespace by transclusion with links to the competing dictionaries to facilitate comparison without risking copyvio. We still won't have any direct user feedback but we'll have some facts to work with. DCDuring TALK 00:21, 11 September 2009 (UTC)

Personally, my idea of a "featured entry" would be something like COW, only it would be one month, someone would schedule entries to be featured the next month (possibly in the POS quantities DCD was mentioning above) and post them on some page and all contributors would work together to make those entries as good as they could possibly be (providing pictures, etymology, pronunciation, quotations, etc.) and then the next month they would each be featured for a day along with that day's WOTD. That's what I would do. As to where the words come from, I honestly don't care. Perhaps we should include non-English entries too. Maybe we should glean off the cleanup category or the definitionless words category. Again, I don't care. L☺g☺maniac chat? 00:49, 11 September 2009 (UTC)

But we've tried that sort of collaboration many times before. It doesn't happen. Even Wikipedia has had a huge drop-off in the collaborative selections. I've watched as several weekly/monthly collaboration efforts slowly died over there, even for large project groups like "Novels" and "Microbiology". On Wiktionary, we've had even less success with collaboration on entries. It's a nice idea, but it's never been successful in any form it's been tried. --EncycloPetey 04:08, 11 September 2009 (UTC)

While the concept of Wikipedia-style "featured content" is probably not applicable here, I do think we would benefit from having a category of "good entries", which have been vetted as being reasonably accurate and complete (covering all supportable definitions found in other dictionaries, having appropriate examples and citations, etc.). The problem is, of course, that it's rather hard to point out any such entries at the moment. If we were going to do a collaboration to this end, I would suggest that it focus initially on core vocabulary, such as the General Service List of English Words or a similar list. Our coverage of these words is somewhat embarrassing at present, and because our core English vocabulary is the "hub" for our core coverage of all other languages, this has an adverse effect on the project as a whole.

BTW, if anyone is interested, I recently put together a chart of the number of definitions in Wiktionary vs. the number in the Random House Unabridged for each GSL headword. Our median GSL-headword entry is 4 definitions short, but NB that doesn't include any of RHU's subsenses. (I could make a chart that includes subsenses, but it would just be depressing.) -- Visviva 02:21, 11 September 2009 (UTC)

The French Wiktionary tried doing just that. After four years of running the effort, they have accumulated a total of 17 entries. --EncycloPetey 17:56, 12 September 2009 (UTC)

ithink datmite HELP USERSfindin'wothey want,esp.i/LONGER ENTRYS;thoughts any1?--史凡>voice-MSN/skypeme!RSI>typin=hard! 10:14, 10 September 2009 (UTC)

But isn't that what the ===Noun===, ===Verb=== etc. are for? L☺g☺maniac chat? 15:24, 10 September 2009 (UTC)

Using the definitions as the header has been proposed before, the main problem is implementing it (we have nearly 1.5 million entries to convert) and ensuring that it works for all pages (from fudges to lead). If you can come up with a viable proposal, then there will be only a little politics before it can be implemented. Conrad.Irwin 19:39, 10 September 2009 (UTC)

(Unlike Conrad, I assume you refer to the following idea, not using the definitions as headers. Because of your style of writing, it's hard to tell.) I, too, think that we should have a Definitions header, or at least would like to have the community consider it: this would help, I suspect, those who frequently comment that they can't find the definitions. The order, to my mind, would be ===POS===, inflection line, ====Definitions====, #Definition lines, ====Etc.====.—msh210℠ 19:45, 10 September 2009 (UTC)

I also think 史凡 meant to add a ====Definitions==== header. That would be good for search engine optimization. —Rod (A. Smith) 19:55, 10 September 2009 (UTC)

Among the 11 monolingual dictionaries at OneLook only Encarta has "definitions" as a header. The only other dictionary that marks definitions with "definition" puts it to the left of the definitions it offers.

It is my hypothesis that those who cannot find the definitions are befuddled by the table of contents. They are probably similarly befuddled by MWOnline, but less often because MWOnline is monolingual. The cost of adding a definition header for the one or two parts of speech would be to force some of the definitions off the landing screen at which folks arrive from a search engine even if they had the table of contents on the right hand side and to force users to page down more than they do already.

If the definitions header could be rendered invisible for those who opt out (meaning folks like us) we might improve accessibility through the ToC at little cost to us in terms of scrolling.

Even better would be a means of improving accessibility without the ToC, for example, by putting it on the right-hand side by default and following the approach most dictionaries follow of not having the header material on separate lines (if they have it at all). DCDuring TALK 20:49, 10 September 2009 (UTC)

I pushed for something like this years ago. It would be a huge help for many foreign-language entries, because many terms in some other languages are very difficult to put an English label on. It’s similar to the way the Russian Wiktionary does it...there they have no POS headers, and POS info is given under a Morphological and Syntactical feature header, which allows for complex and varied input. Tamil substantives operate as noun and adjective, and there are no separate nouns or adjectives. Similarly, Thai has almost no adjectives, but some substantives may be considered modifiers. Russian is packed with predicatives that are a subset of adjective, but not adjectives (they may also be considered verbs). Polysynthetic languages have few words that match English POS...Ojibwe has no adjectives, but uses preverbs instead; and Ojibwe verbs could be considered phrases or entire sentences. These concerns are a huge headache for any editor trying to fit them into a neat English template. If we could use a more generic header such as Definition, it would be a Godsend. (This discussion was not even possible when Connel was active here, since he was dead certain that all languages were just like English.) —Stephen 00:46, 11 September 2009 (UTC)

I'm a bit lost as to how making defs into a header would alleviate this particular problem. Why not just use non-English POS headers? I, for one, have been using particle and participle in Ancient Greek entries, regardless of whether they're relevant to English entries. And yes, Connel was absolutely convinced that English contained every feature of language in existence. :-) -Atelaes λάλει ἐμοί 11:44, 11 September 2009 (UTC)

I don't see how this could work without completely breaking not only layout, but the internal logical structure of entries. It's not like we have a definitions section and we're just hiding it; in a multi-etymology and multi-POS entry, one group of definitions may be separated from the next by screenfuls of semantic relations, etymological information, translations, usage notes and references. Which group would "Definitions" be linked to? -- Visviva 06:50, 11 September 2009 (UTC)

imeant~msh{h/she c it actualy clearer than idid--buthen wot sgb said canot bedone-oyee..[conrad's-iheard boutit,butnevaREALYgot thatconcept nor advant.{ididnt folo the completethread}n yday ididntrealiz thusCONFUSIONmite arise.

visv:butsame goes4say SYN.-sections--howzthat denhandldnow?{a-urefrd2sgb's,sai,complicatd,sory:/--史凡>voice-MSN/skypeme!RSI>typin=hard! 08:13, 11 September 2009 (UTC)

Well, synonym (&c.) sections are nested under the POS heading, and if there are multiple senses then the synonym groups are labeled using {{sense}}. (That's the way it's supposed to work, anyway; compliance varies.) I don't quite see how this could be applied to a definitions heading. -- Visviva 09:09, 11 September 2009 (UTC)

Or, wait ... rereading the above, I may have misconstrued things. Would it work to just have the TOC say "Noun definitions," "Verb definitions", etc., instead of "Noun", "Verb"? Since POSs are a fairly small closed set, that ought to be doable in Special:Monobook.js, if there's a will. -- Visviva 09:16, 11 September 2009 (UTC)
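The relabelling idea above can be sketched in a few lines. This is only an illustration of the logic in Python, not the site JavaScript that would actually be needed; the set `POS_HEADERS` is an assumed, incomplete list and `relabel_toc` is a hypothetical name.

```python
# Assumed subset of part-of-speech headers; the real closed set is larger.
POS_HEADERS = {"Noun", "Verb", "Adjective", "Adverb"}


def relabel_toc(entries):
    """Append ' definitions' to ToC labels that are part-of-speech headers."""
    return [
        entry + " definitions" if entry in POS_HEADERS else entry
        for entry in entries
    ]


if __name__ == "__main__":
    print(relabel_toc(["Etymology", "Pronunciation", "Noun", "Verb"]))
```

Because only the displayed ToC labels change, the section headers in the entries themselves, and the nesting problem raised earlier in the thread, would be left untouched.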

I can't support this, as I've never seen a reasonable concept entry using defs as headers that didn't look awful. I agree that we need to put stuff "inside" a definition, but I've always thought some collapsing JS was the way to do it. -Atelaes λάλει ἐμοί 11:44, 11 September 2009 (UTC)

I hate our giant ===Noun===, ===Verb=== etc. headers. Also ===Etymology=== and ===Pronunciation=== . They are so distracting. If you read WT:FEED you'll see many readers complaining about how they can't find definitions because of these. No other dictionary has such gargantuan POS and other markers. Here is how I think a dictionary should look. --Vahagn Petrosyan 12:11, 11 September 2009 (UTC)

If we were able to use the anon-survey to get an idea of the most useful sections for users, then we could auto-hide all other sections that are over, say, 2 lines long (using some heuristic). W/o we're just prematurely optimizing. Also, right-hand siding the ToC by default would hopefully make it less confusing & obtrusive. --Bequw → ¢ • τ 14:54, 11 September 2009 (UTC)
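A rough sketch of the auto-hide heuristic described above, assuming the set of "useful" sections were already known from user feedback. No such data exists yet, so the example values, the function name `sections_to_hide`, and the default threshold of 2 lines are all placeholders.

```python
def sections_to_hide(sections, useful, max_lines=2):
    """Pick sections to collapse by default.

    sections: dict mapping section title -> number of lines it occupies.
    useful:   set of titles users are assumed to want visible.
    A section is hidden only if it is not in `useful` AND is longer
    than `max_lines` lines.
    """
    return [
        title
        for title, line_count in sections.items()
        if title not in useful and line_count > max_lines
    ]


if __name__ == "__main__":
    entry = {"Etymology": 5, "Noun": 8, "Anagrams": 1, "Translations": 12}
    print(sections_to_hide(entry, useful={"Noun"}))
```

Short sections stay visible even when they are not in the useful set, since collapsing a one-line section saves no space.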

When you write "the anon survey" you make it seem so - definite. Is there one scheduled, planned, or under discussion? DCDuring TALK 16:46, 11 September 2009 (UTC)

No, I just meant the feedback box that anonymous users see on the left-hand side. It could be modified periodically to collect different data, but there's nothing planned that I know of. --Bequw → ¢ • τ 15:06, 12 September 2009 (UTC)

formal request

4users:conrad,visv,polanski,mglovsfun&rejbrand2beCITED4DISPARAGINGaDISABLED PESN

Sven (史凡 Shǐfán), why are you writing like this? I suspect you are educated in proper English but I haven't seen a normal sentence from you. Nobody can read you. Sorry, I don't understand what you are complaining about. Anatoli 02:25, 11 September 2009 (UTC)

史凡 is apparently asking for a reprimand of User:Conrad.Irwin and User:Andreas Rejbrand for remarks made in the "wt=mORALITYbook?!? re achterklap + http...%E5%8F%B2%E5%87%A1&redirect=no" section above.

史凡, would it be OK with you if people here call you Sven? Few editors here are able to enter Chinese characters, so it may be easier for them to call you "Sven" if that's OK with you. I assume you use a regular keyboard like the one most of us use. If so, the caps lock key is probably no easier to press than the space bar, yet you frequently use caps lock, apparently to indicate word breaks. If that's the case, and the space bar is just as easy for you to press as the caps lock key, please consider using the space bar instead of caps lock to separate your English words. Also, it doesn't really help anyone when you wikilink capitalized versions of misspelled words here. Are you doing that to emphasize the words? I have no idea what you mean by "aBLOTCHon wt'sBLAZON" above, and other editors are probably similarly unable to parse some of the things you write here. So far as I can tell, nobody has been intentionally offensive toward you. There are probably some ways to make your communication more effective and understandable without further aggravating your condition. Please be open to the suggestions of other editors here. —Rod (A. Smith) 02:55, 11 September 2009 (UTC)

I'm sorry, Sven, I wasn't aware of the RSI (I failed to read it in your signature). Rod, thanks for explaining the topic. Anatoli 04:06, 11 September 2009 (UTC)

I'm sorry for upsetting you, and will not talk with you further. Clearly my attempts to be rational have failed. Conrad.Irwin 07:39, 11 September 2009 (UTC)

So let me see whether I can decipher the original post:

Original: 4users:conrad&rejbrand2beCITED4DISPARAGINGaDISABLED PESN[mylife asuch=alredyHARDENUF,n its aBLOTCHon wt'sBLAZON thatsuchULTRA-INCORECT STATEMENTS rTOLERATEDhere,BAAADPOLICY,nay,the ultimateABYSofINCIVILITYn daHEIGHTof MEDIEVAL IGNORANCE!!!!!-
- The number of characters: 242
Translation: For users: Conrad and Rejbrand to be cited for disparaging a disabled person. My life as such is already hard enough and it's a blotch on what blazon that such ultraincorrect statements are tolerated here. Bad policy, nay, the ultimate abyss of incivility and the height of medieval ignorance!
- The number of characters: 292 ≈ 1.2 * 242

Honestly, it took me a lot of attention to translate the thing, and the translation is not all that much longer than the original. --Dan Polansky 08:33, 11 September 2009 (UTC)

<semijoking> Would the community like me to volunteer to monitor his talk page contributions and translate them into readable English? </semijoking> And come on, let's try to be helpful and not just bulldoze our contributors. Please? L☺g☺maniac chat? 14:20, 11 September 2009 (UTC)

(Getting back on track) What exactly did these two editors do wrong? The MD comment was a little bit uncalled for, but Rejbrand later admitted that he had no idea about your condition (and has now apologized). I think you're overreacting a bit. —Internoob (Talk•Cont.) 22:50, 11 September 2009 (UTC)

Agree with former comments in that the use of the space bar and a lack of links would make your statements a lot easier to read, and, I would think, a bit easier to write. To address your initial comment, I don't think that anything Conrad said was out of line. Rejbrand's comment was perhaps a bit snarky, admittedly. However, the fact is that Wiktionary does not discriminate against disabled people, and I don't think that either of their remarks indicates otherwise. -Atelaes λάλει ἐμοί 00:14, 12 September 2009 (UTC)

史凡: Get a better speech-to-text program, please. — opiaterein — 00:36, 12 September 2009 (UTC)

If I understand correctly, he uses a certain speech-to-text program that does not do a very good job with text, writing upper case, misspelling, inserting weird characters, and so on, but it also allows for execution of commands such as are needed for opening sections for editing, navigation, saving, and so on. And better programs are either too slow or don’t execute commands well, or both. Apparently he performs regular manual typing when creating or editing an entry, which he can only manage to do a small amount of each day, and he uses the speech program for commentary, since accuracy is not so important in that context. —Stephen 13:53, 12 September 2009 (UTC)

No, I'm pretty sure that it is just his typing with his own two hands. . . L☺g☺maniac chat? 13:59, 12 September 2009 (UTC)

sinsreturn i/may:abandund DNSp mainly cos comp2slow[1paragraf/h:(

1,afal onmy ristwatch broke myscaph:/--史凡>voice-MSN/skypeme!RSI>typin=hard! 16:49, 12 September 2009 (UTC)

Automated translation: "Since my return in May i have abandoned DNSp 9.0 (did have issues too, like producing lots of crap, but did/does have potential, and now 2 updates down the line-9.5&10.0 i thought) mainly 'cuz computer was too slow (fe one paragraph/hour only at times:("

one arm now actually, since a fall on my wristwatch broke my left scaphoid bone ('gain, if not rfd'ed by now :/" L☺g☺maniac chat? 18:15, 12 September 2009 (UTC)

I don't think there's ever been an issue of "disability discrimination" here - we know that 史凡 writes in fully grammatical English when he edits or creates main space entries, but writes gibberish on discussion pages. It's very easy to cry "discrimination" for convenience rather than a genuine reason. Mglovesfun (talk) 11:09, 13 September 2009 (UTC)

There are several things I have wondered about your writing style. Feel free to leave my questions unanswered.

What benefit does it bring you to capitalize things instead of separating words by a space?
I cannot personally imagine what your RSI condition is like; I know personally no one with that condition. I imagine there is some typing cost per letter, as it were. Then, are spaces more expensive than non-spaces to type?
I would estimate that writing in normal English would double (multiply by two) the writing costs for you. Is that correct?ITS PRETIMUCH AN EXPONENTIAL REL-SHP~ATLET CANT OVERTAKE TORTOIS-FALACY Automated translation: "It's pretty much an exponential relationship,similar to the "can't overtake a tortoise-fallacy'"
If the third is correct, could not the increase in writing costs be compensated by writing shorter sentences?ICAN RUN ABLOK MOSTDAYS NOT A MILE. "I can run a block most days (see functional reserve before it's RFD'd) not a mile"

I have tried several times to decipher what you write, but I have mostly given up. I don't think I can be morally obliged to decipher something.I'V NEVA ASKD BUT TOLERANS['PART FROM CONR.>TECHN.SOLUTIONS "I've never asked but tolerance (apart from Conrad"

If someone did not speak English and would start writing in French, expecting me to adjust by learning French, I think I would refuse and insist that he learns English instead. MYSHRTHND=ENGL And by the way, in my opinion French and English are just dialects of the same language. And I mean that.

So I for one would very much appreciate if you (a) either write in plain English in spite of the additional cost for you, HUMANITIES-REASONIN:INTHE END IWANA GETMYWAY,RESONABL ORNOT[SCIENS PL MAKE URWAY2WT.. or (b) at least stop using capitalized letters and start using spaces. There are still a lot of tricks to make the writing shorter, tricks that you are already using, like writing "yr" instead of "your", "u" instead of "you", "lite" instead of "light", etc. "Humanities - reasoning: In the end I want to get my way, reasonable or not (science please make your way to wiktionary..)"

--Dan Polansky 07:52, 11 September 2009 (UTC)

I agree, I don't see how inserting gibberish IHATE THAT OFENSIV WORD "I've never used that one" :{}/ makes typing easier rather than harder. Plus when you create main space entries you write very well, C BP "see the Beer Parlour" making me think that you simply do not want to be understood. ICANONLY SHAKEMYHEAD,KINDERGARDN HERESIGH. "I can only shake my head, kindergarten here, *sigh*" Mglovesfun (talk) 07:55, 11 September 2009 (UTC)

(Unindent) Just to be clear about my position: I am mainly asking questions, and trying to explain the difficulties that I have with deciphering.INO>NEED=VOIS-RECORDINS "I know - I need voice-recordings." I am not saying anything about your intentions, about why you write in this style WHENDO UGUYS GET STH?????; "When do you guys get something?" I have no mind-reader device. I can imagine that habit is habit, and that you have already automatized the way of writing that you are currently using. I do realize that changing your way is to go out of your habit, which is a mental cost. PROB=ARMS,NOT HED

某人騷擾我為我RSI /ppl giveme aROUGHTIME cosev my-

jen says (9:13 PM):

史凡-Sven says (9:16 PM): 維基百科的-他們讓我覺得好難過 wt/theyREALY MAKE MEFEEL BAD jen says (9:18 PM):

jen says (9:19 PM): just forgive them... 史凡-Sven says (9:20 PM): 雖然我手好痛 EVENTHO MYHANDS REAL/bigtimeHURT--史凡>voice-MSN/skypeme!RSI>typin=hard! 16:13, 11 September 2009 (UTC)

Hm. I don't speak any Chinese. This looks like a record of a chat session. Anyway, I don't understand. V.STREIT4WED Q/very straightforward question..:(--Dan Polansky 18:00, 11 September 2009 (UTC)

Okay. I wonder why you choose MYHANDS DID!!!!! "My hands did!!!! (SV)" not to talk to me. Instead, you are posting a chat record, now also annotated in that shorthand. From what I can decipher: "TRANSLATIN"CAPS--PRAPS IJUST DONTGET HUMANITYSPPL:( "Translating" caps - perhaps I just don't get humanity people :( "

Sven: People give me a rough time because of my RSI.

Jen: ???

Sven: Wiktionary/They really make me feel bad.

Jen: forgive them...

Sven: Even though my hands real hurt.

Maybe I am a bit slow of understanding, Y! "yes!" but I see no effort on your side to explain why it hurts less to write in that hard-to-decipher shorthand than in plain English. C BP PROVERB "see Beer Parlour proverb" . The savings in the number of keystrokes are minimal, as far as I can see. U'V NOIDEA,NONE,WOTSOEVA,WHICH ISAD. "You have no idea, none whatsoever, which is sad"

But this must have already been dealt with before. Can you post a hyperlink to a website that explains how your shorthand provides a substantial advantage over normal English writing? --Dan Polansky 21:43, 11 September 2009 (UTC) NICE-RETARDED Q2CLOSE=SALT I/WOUND,BUTWELDUN DAN,/KEEPTRYIN HARD!

Translation provided automatically by Logomaniac (talk • contribs). Very sorry if it isn't accurate, there were a few things I couldn't figure out. L☺g☺maniac chat? 12:43, 12 September 2009 (UTC)

95%acurat,l.m,BIGTA+SORY

Then again, why don't you help me to get rid of my "self-elected ignorance"? Why don't you post a hyperlink to a website that explains how your shorthand provides a substantial advantage over normal English writing? --Dan Polansky 07:07, 14 September 2009 (UTC)

(bafld smily)NON-imjustTRYIN2COPE.[ndad form req was mostlyRITTENOUT asiwantedMOSTPPL2gETITgivnHOW IMPORTANTdat isue is2me!--史凡>voice-MSN/skypeme!RSI>typin=hard! 09:29, 14 September 2009 (UTC) as/myad.,itjustGOSON,hypocritbunchthatmaksmepuk!.--史凡>voice-MSN/skypeme!RSI>typin=hard! 01:59, 7 October 2009 (UTC)

inote d'abuse isufaboutmyRSI=unabated,so beherew/infrmd i'lconsidersendin'that leter2WMFbouthis repugnant situatn--isthis athreat?takit asumay wish;ivtried myuterhere pointinout the v.obvies4a yrnow,my amplpatiens=runin'out w/the abusiv ppl here--sokeep lafin'+givinme sht'boutmy input-prob n,hevenz,surlydontacomodat'it bycreatin'dt darnd rec butn,butrest asurd myeloquentlyhandritn leter 'lfors urhand indoin merely wotis civil,ie 2beNONDISCRIMINATORY[sav urbreth denyin the facts,as fctsr fcts bynatur,~as aspade=aspad,sosincerely iremainw/respectfl regrds2aDISrespctfl'comunity'.
ps nowe dono that noformlreactn ensued suit2myv.graveconcern underthis formalreqst,which'lbe an agravatin'factr, aswe alrealiz2WMFi/her considerations--u'vbeengivn ampl oportunity byme,tho n thru althe ridiculizin'cmnts sufrd,2sortout ur act,shape up n regulate urself which u,wt,refusd2do,sonow thetime hascom,unlesv.swift,in- n decisiv actn frm ur 'comnty'-side finaly'n belatedly ensues,4me 2walk ntake anothe avenu, insted ofkeep' wastinmy breth here as itabundantlyseems,coursof action ofmine which'lbe consequential to wt aswe no it4now,quagmire of utershamles abus+discriminatn,ipromis.--史凡>voice-MSN/skypeme!RSI>typin=hard! 23:44, 8 October 2009 (UTC)
usr polanski thinks he can rv thechanges imake2myOWN LONGHAND+keepspestrinme w/hiswilfl ignorans'boutmy RSI>aneficientcomunity'D DO STH'bouthis+thebeyond

I note that the abuse I suffer (and react to!) about my RSI has not abated so be here informed that I'll consider sending that letter to WMF about this repugnant situation - is this a threat? Take it as you may wish; I've tried my uttermost here pointing out the very obvious for a year now, my ample patience is running out (I'd have hoped after the BP discussion here, but Visviva had too thoroughly blown ridiculizing it, left uncorrected by other regulars) with the abusive people here - so keep laughing and giving me shit about my input problem, and heavens, surely don't accommodate it by creating that darned record button, but rest assured my eloquently "handwritten" letter will force your hand in doing merely what is civil, i.e. to be non discriminatory (save your breath denying the facts, as facts are facts by nature, as a spade is a spade, so sincerely I remain with respectful regards to a disrespectful "community".

p.s. no we don't know that no formal reaction ensued suit to my very grave concern under this formal request, which'll be an aggravating factor, as we all realize (there's only people here who "pretend" to be stupid, we all know) to WMF in her considerations (and don't think any "connections" will help, all the disparaging comments towards me and my RSI are there for all to see, never thought of that?) - you've been given ample opportunity by me, through and through all the comments I've suffered, to sort out your act, shape up and regulate yourself which you, Wiktionary, refuse to do, so now the time has come, unless very swift, decisive action from your "community"-side finally and belatedly ensues, for me to walk and take another avenue, instead of keep wasting my breath here as it abundantly seems, course of action of mine which'll be consequential to Wiktionary as we know it for now, quagmire of utter shameless abuse and discrimination, I promise.

User Polansky thinks he can remove the changes I make to my own longhand (see above) and keeps pestering me with his willful ignorance about my RSI > an efficient community would do something about this and the beyond (but let me guess, blaming the victim is easier huh!?)

Sorry I didn't get to this sooner :( L☺g☺maniac chat? 14:16, 9 October 2009 (UTC)

I'm not sure what you're saying. I'd ask for diffs if I were sure. If you are saying you changed your previous posting and someone reverted it, it might be appropriate to ask you to not make such changes if they are of a substantive nature (with the added benefit that then no one will be able to revert the change.) RJFJR 13:56, 9 October 2009 (UTC)

I'm sad that this has resulted in 史凡 being blocked. Now can we all move past it please? All I'm seeing (as a concerned outsider) is people bickering and blaming and tearing each other apart. Now it's my turn. I'm not going to blame anybody for anything. I know 史凡 thinks that the members of the community are trying to insult him all the time, and I don't blame him for thinking that. I know some other people (no names) are not being particularly nice to him, either. Now I will plead with you all. Can you please just smile, make up, and get over it? I know I'm a teenager and I shouldn't be exerting influence over anyone here, but it seems like you all are never going to get this over with and it's just going to get worse. It really, really saddens me when all that people do is bulldoze each other down. I'd like to see this community working together to build each other up and encourage each other. Please?!?! L☺g☺maniac chat? 14:16, 9 October 2009 (UTC)

I suppose I am one of those who is not being nice (probably so, as I am really out of patience at this point). But I don't really see this the way you do. This user was posting abusive/hostile screeds here for weeks (months?) without garnering any sort of reaction in kind. In fact, I would have to say he still hasn't received any reaction in kind. No one has told him that he sucks, that he's an idiot, or anything of the sort; on the other hand, he has had no compunctions about dishing out such abuse here and elsewhere. I was happy to simply ignore these postings for the most part -- and I would be happy to continue doing so, except when they progress into personal attacks on other users. But I do think that there comes a point when constructive mainspace contributions no longer outweigh destructive project-space contributions.

In short, I think that Ruakh's action was a reasonable exercise of judgment. And now I hope we can all get back to the rather massive task at hand. -- Visviva 14:52, 9 October 2009 (UTC)

I didn't say I was specifically blaming anyone for being abusive. I do know, however, that I haven't seen anyone being particularly encouraging and helpful. And yes, it would be nice if we just built the dictionary. L☺g☺maniac chat? 15:04, 9 October 2009 (UTC)

Well, for what it's worth, I would have to say you have been the most mature person in all of this. -- Visviva 16:32, 9 October 2009 (UTC)

And honestly, that saddens me. But thank you for the compliment. :) L☺g☺maniac chat? 16:53, 9 October 2009 (UTC)

I believe the reference is to this diff. I will hold my tongue on the questions at issue, if any. But I will say that this and allied discussions do not appear to be serving any purpose other than to waste the participants' time. And time is something we all have precious little of, relative to the task at hand. -- Visviva 14:13, 9 October 2009 (UTC)

`{{citedterm}}`

Hi all. I’ve been thinking about how we draw attention to the cited terms in a given quotation. At present, our method is emboldenment, and AFAIK that convention enjoys the universal support of the editing community. However, there are two problems with this:

Since emboldenment, like italicisation, is often used for emphasis, a sentence may contain other instances of emboldenment, thus obscuring somewhat the cited term.
The cited term may already be emboldened in the source text. There is no way to distinguish a quotation wherein the term is emboldened in the source text from one wherein it is not; this conflicts with one’s ability to reproduce literatim a given text.

As a solution to these two problems, I have created {{citedterm}}; you can see it in use hereat.
At the moment, the template is very simple, the entirety of its code comprising only {{{1}}}, which produces the same visible result as our current convention of emboldenment (using triple ASCII apostrophes). This solves the second problem, since now a term unemboldened in the source text is denoted by {{citedterm|term}}, whilst one that is emboldened is denoted by '''{{citedterm|term}}''' (ATM, they display identically — only the wikicode differs). This template also has the potential to solve the first problem: Its code may be simple and inflexible right now, but using it would allow us to introduce in future greater flexibility in how we draw attention to cited terms; this could be an option for further customisation via WT:PREFS. As an example of an alternative method, see Google Book Search, which draws attention to the search term by highlighting it in yellow. An additional advantage of using this template is that it could be used to add every entry in which it is transcluded to a hidden category, which would allow us to see exactly how many of our entries contain supporting quotations, and which lack them.
So, what do y’all think of this proposal? Does it sound like A Good Thing™? Is it necessary to take it to a vote? † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 17:55, 11 September 2009 (UTC)
I’ve just thought of another advantage. In some languages, it may be inappropriate to mark terms by emboldenment (such as, perhaps, in Chinese, where for some of the more complex characters, emboldenment may turn them into unreadable black blocks); {{citedterm}} could allow different methods for different languages, once it is amended to take the lang= parameter. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 17:58, 11 September 2009 (UTC)

Looks like a nice idea, upon the first impression. Anyway, I dislike the use of boldface for cited terms. The boldface makes the quotation stand out much more than the definition. --Dan Polansky 18:09, 11 September 2009 (UTC)

I like the idea, chiefly because I don't really like emboldening cited terms and it would be nice to turn it off via PREFS. Ƿidsiþ 18:35, 11 September 2009 (UTC)

I also like this idea, but would prefer it to be customizable as to color/intensity of the highlighting. On my screen, that yellow screams out at me like a banshee because most other colors are softer and lighter on my laptop's monitor than they would be on a typical stand-alone monitor. --EncycloPetey 17:52, 12 September 2009 (UTC)

The idea is good, but we don't need a template to do it. On all well-formatted entries, the CSS selector .ns-0 #bodyContent ol ul dl b matches the wikitext #*:'''. So if you want to un-bolden them, just put the following into your Special:MyPage/monobook.css:

.ns-0 #bodyContent ol ul dl b { font-weight: normal; }

In the same way, if we wanted to highlight them yellow, we could add a rule in the site's MediaWiki:Monobook.css. It goes without saying that '''word''' is easier to type than {{citedterm|word}}, and it also avoids the need to convert the existing entries. The only issue then would be that we would need to use <strong>existing emphasis</strong> instead of '''existing emphasis''' to avoid words that were emboldened in the source being captured by this rule. As emboldenings in the source are very much the exception, I don't see this as much of an issue. Conrad.Irwin 19:18, 11 September 2009 (UTC)

The “need to use <strong>existing emphasis</strong> instead of '''existing emphasis''' to avoid words that were emboldened in the source being captured by this rule”, however, does not “avoid the need to convert the existing entries”; moreover, your proposal prevents the adoption of any new system of highlighting until all such checking and converting is done. Conversely, using {{citedterm}} allows us to move gradually from one system to another, deprecating plain emboldenment in its favour. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 21:10, 25 September 2009 (UTC)

Interesting. I'm not especially in love with bold, though a lesser emboldenment for everything now bold except the inflection-line headwords would satisfy me.

Query 1: How many instances of bold in the citation are there?

Query 2: Can the CSS-only change be readily made available for anons after testing?

Query 3: Are two degrees of emboldenment possible, what we have plus one less bold than what we have? (Is it a question of what browsers support or of what Mediawiki supports?) DCDuring TALK 21:33, 11 September 2009 (UTC)

I did a quick search for /#\*:.*'''/ in the main namespace and that came up with ~21,000 and there are a further ~7,000 occurrences of /\|passage=.*'''/, so somewhere around 25-30 thousand. Conrad.Irwin 22:35, 11 September 2009 (UTC)
They can be added to WT:PREFS.
In theory there are 9, but I don't think that most fonts support all levels.
1. The quick brown fox jumped οωερ τηε λαζυ δογ. (normal for me)
2. The quick brown fox jumped οωερ τηε λαζυ δογ. ( " )
3. The quick brown fox jumped οωερ τηε λαζυ δογ. ( " )
4. The quick brown fox jumped οωερ τηε λαζυ δογ. ( " )
5. The quick brown fox jumped οωερ τηε λαζυ δογ. ( " )
6. The quick brown fox jumped οωερ τηε λαζυ δογ. (bold for me)
7. The quick brown fox jumped οωερ τηε λαζυ δογ. ( " )
8. The quick brown fox jumped οωερ τηε λαζυ δογ. ( " )
9. The quick brown fox jumped οωερ τηε λαζυ δογ. ( " )
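
The count quoted in the reply to Query 1 could be reproduced roughly as follows. This is only a minimal sketch in Python: the two patterns are the ones given above, but the function name `count_bold_citations` and the sample wikitext are hypothetical, and a real count would have to be run over a dump of the main namespace rather than a string.

```python
import re

# The two patterns quoted in the reply above.
QUOTE_LINE = re.compile(r"#\*:.*'''")          # bold inside a "#*:" quotation line
PASSAGE_PARAM = re.compile(r"\|passage=.*'''")  # bold inside a passage= parameter


def count_bold_citations(wikitext: str) -> dict:
    """Count lines matching each pattern (hypothetical helper, not an existing tool)."""
    counts = {"quote_lines": 0, "passage_params": 0}
    for line in wikitext.splitlines():
        if QUOTE_LINE.search(line):
            counts["quote_lines"] += 1
        if PASSAGE_PARAM.search(line):
            counts["passage_params"] += 1
    return counts


# Made-up sample entry text, for illustration only.
SAMPLE = """# A definition.
#* 2009, Some Author, ''Some Book''
#*: The '''term''' appears here.
#* {{quote-book|year=2009|passage=Another '''term''' here.}}
#*: No bold on this line.
"""

print(count_bold_citations(SAMPLE))
```

Note that both patterns count lines containing at least one bold marker, so they give an upper bound on affected quotations, not a count of source-text emboldenings.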

WT:PREFS is only available to registered users, right? For me, font weights 100–500 are normal, whereas font weights 600–900 are emboldened; there are only two distinct font weights AFAICS. BTW, the pangram is “The quick brown fox jumps over the lazy dog.” — the form with jumped doesn’t contain an ‘s’. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 21:10, 25 September 2009 (UTC)

FWIW, I agree that the highlighting is too bright & garish.--Tyranny Sue 07:28, 25 September 2009 (UTC)

There’s no need for us to stick to highlighting with a yellow background; I was just rather unimaginatively following {{b.g.c.}}’s lead on that one. I’ve changed it to emboldened crimson text now (per the OED, so equally unimaginative, I suppose), which may be better, at least æsthetically if not functionally. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 21:10, 25 September 2009 (UTC)

I’ve gathered some examples to show when this sort of thing would be useful. I’ve cited a number of words using the book Introducing Foucault, which uses emboldenment extensively for various kinds of emphasis throughout. Here’s the list of terms I’ve cited using the book (the ones marked with asterisks are especially good examples of where plain emboldenment fails us): leather queen, transdiscursive, Foucaldian, Foucaults, régime(s), oeuvre, billet, Fox, clientèle, *archaeology, Enlightenment, secreted, arachnoidian, dura mater, anatomical atlas, episteme, resemblance, *similitude(s), aemulation, *convenientia, *analogy, sympathy, signature, affinities, arcane, *mathesis, *taxinomia, *classification, *tabulation, arrested, tables, and age of judgement. When other terms around it are emboldened, the cited term is obscured if it receives only plain emboldenment; moreover, it is very difficult (if not impossible) to infer whether the cited term was emboldened in the source text (especially for texts like Introducing Foucault) if we use emboldenment to highlight our cited terms.
Introducing Foucault is touted as “The international bestseller” on its front cover. The entire Introducing… series is very popular and comprises a large and growing number of titles, all of which (in my experience, going on the ones on Marx, Postmodernism, Aristotle, Wittgenstein, Semiotics, and Nietzsche) make liberal use of emboldenment. “[E]mboldenings in the source” may very well be an exception, but they certainly aren’t a negligible exception.
My chief motivation in using {{citedterm}} is faithful reproduction of texts. Even if we stick to plain emboldenment as the default (whilst allowing customisation via WT:PREFS), that’d be fine, just as long as source-text emboldenment is also included in the entry’s source code. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 21:10, 25 September 2009 (UTC)

I can now see the need for something like this, if I might make constructive suggestions: The red looks very like a red-link, and as it also emboldens the word, you are still unable to tell whether the term was emboldened in the original. Highlighting is a bit naff, but might work if the colour is chosen sensitively. I quite like using gold underlines as they are both distinctive and unintrusive (though no-doubt everyone else has better taste than me :). I have modified the template to assume {{PAGENAME}} as that is the most common use-case. Conrad.Irwin 22:30, 27 September 2009 (UTC)

Great! :-) Adding {{PAGENAME}} as the default is a labour-saver; good thinking. I’ve made the change you suggested. I don’t mind very much how we mark these terms, TBH; however, that kind of underlining is better than emboldenment (as well as being better than plain … underlining, which I’ve seen in source texts exactly thrice in my experience of citing terms for Wiktionary). Though if I may venture from my position of apathy, I reckon highlighting is the best option (or an empty box drawn around the term, if that’s possible), since it most efficiently draws one’s attention to the {{citedterm}}. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 23:47, 27 September 2009 (UTC)

An empty box might work, though interaction with hanging letters isn't easy to rectify without pushing the lines further apart. Conrad.Irwin 23:52, 27 September 2009 (UTC)

How about making the border thicker and a different colour? Also, is it possible to make it translucent, so it doesn’t interfere much with letters’ ascenders and descenders? † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 00:48, 28 September 2009 (UTC)

Is that gold background any better than the yellow one? † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 23:51, 27 September 2009 (UTC)

Not hugely, if we are to use highlighting, I'd suggest something very pale, given that the highlighting is not to draw attention, merely to mark the place. Conrad.Irwin 00:00, 28 September 2009 (UTC)

That’s not too great for quotations that feature emboldened text, but I’ve checked it on typically-formatted entries and it does the job perfectly for them. I’m happy with that. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 00:48, 28 September 2009 (UTC)

With all respect to the thought that has gone into this, I prefer the proposed solution of using <strong>, or an equivalent template, to mark non-headword bolding. Since we don't use "strong" otherwise, it would be easy to modify the display of this bolding in CSS or turn it off entirely in the default skin (while preserving the information). Since quotations with non-headword bolding are a tiny fraction of the whole, it seems better to have a solution that is restricted to this tiny minority, rather than a solution that would effectively require reformatting all the tens of thousands of existing citations in order to preserve consistency. -- Visviva 08:40, 29 September 2009 (UTC)

That would require going through every citation we presently have, checking for source-text emboldenment, and then substituting <strong>…</strong> where appropriate; that solution is no less work, and has the added disadvantage of disallowing a gradual conversion. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 16:24, 29 September 2009 (UTC)

I guess I don't see anything as being broken in the current situation. Heretofore the practice has been to omit source-text bolding entirely, or to note it only in an HTML comment. Since the purpose of the citations is simply to illustrate and support the definition, there's nothing particularly wrong with this. So nothing needs to be changed. Going forward, we can encourage people to use <strong> in the quite rare cases where source text bolding is present. But beyond this, I don't see how {{citedterm}} allows for a gradual conversion at all. As presently configured, this template is a radical stylistic departure from the standard presentation of the headword. You can't tell me that having {{citedterm}} in one definition and ordinary bolding in the next won't look hideous. So not only would every citation in the wiki need to be rewritten (presumably by bot), but everyone who has gotten used to the existing system would have to change their habits (and, incidentally, all of the tens of thousands of citation lists that I have been working on would need to be revised). This is a lot of trouble to go through for a problem that affects perhaps 1% of citations, and even at its worst is not terribly serious. -- Visviva 17:53, 29 September 2009 (UTC)

My argument is that “the practice heretofore” has been lacking. I’m a great believer in faithful reproduction of source-text formatting. If emboldenment is as rare as you say it is (and it is sufficiently rare for the argument that follows), then learning editors will not pick up that we require <strong>…</strong> when reproducing source-text emboldenment, since they’ll run across it too rarely to pick up the pattern of usage. Even in the conversion process, there really needn’t exist any inconsistency of term-marking conventions within the same entry, since it’s a simple matter to apply {{citedterm}} to already-extant citations in an entry which use the old standard. Still, if you’re concerned about the stylistic difference, we can always change {{citedterm}} to use a term-marking convention visually closer to the old standard; preferably, it would be similar, but still subtly and perceptibly different, but it would also work to have an option in WT:PREFS available to change the term-marking scheme employed by {{citedterm}}. As for your citations lists, it should be a simple task (if I am right in inferring that you have not marked source-text emboldenment in them, if applicable) of using a bot to convert all instances of ''' immediately followed by a space or a piece of punctuation to }}, and then convert all the remaining instances of ''' to {{citedterm|; right? † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 03:10, 30 September 2009 (UTC)
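
The two-pass bot conversion described in the comment above might look something like this in Python. It is only a sketch under stated assumptions: the function name `convert_bold_to_citedterm` is hypothetical, the set of "punctuation" characters is a guess, and treating the end of a line as a closing position is an addition not stated in the comment.

```python
import re

# Pass 1 pattern: ''' immediately followed by whitespace or punctuation.
# The punctuation set and the end-of-line case ($) are assumptions.
CLOSING_BOLD = re.compile(r"'''(?=[\s.,;:!?)\]]|$)", re.MULTILINE)


def convert_bold_to_citedterm(wikitext: str) -> str:
    """Rewrite '''term''' as {{citedterm|term}} using the two-pass rule."""
    # Pass 1: every closing ''' becomes }}.
    converted = CLOSING_BOLD.sub("}}", wikitext)
    # Pass 2: every remaining ''' is taken to be an opening marker.
    return converted.replace("'''", "{{citedterm|")


print(convert_bold_to_citedterm("#*: The '''term''' appears, and '''term'''."))
```

The heuristic is fragile: bold text that itself ends before a letter (for example '''term'''s) or starts with punctuation would be converted wrongly, so any real run would presumably need manual review of the edge cases.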

I’ve moved {{citedterm}} to {{q}} to save editing labour: in entries where the cited term matches the {{PAGENAME}}, this reduces the number of keystrokes needed by at least two (a constant of 5 vs. the former {{PAGENAME}} + 6), whilst in entries where the cited term differs from the {{PAGENAME}}, {{q}} takes the same number of keystrokes as manual emboldenment (6 in both cases). See rôleplay for examples of both these uses. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 17:50, 11 October 2009 (UTC)

Is {{q}} now BCP, or is it still a trial? (I'm happy with it, fwiw.)—msh210℠ 18:36, 19 October 2009 (UTC)

Sorry for commenting so late, but rather than {{citedterm|…}} and {{q}}, I think it might be better to use MediaWiki's built-in self-link functionality: ] generates linktext. We can then customize the CSS for, say, ul strong.selflink. (I realize this is the exact opposite of Visviva's proposal, which goes against my general policy of never disagreeing with him — and I do think he has a good point here — but it just seems like this is exactly the sort of use-case that the built-in functionality was designed for.) —Ruakh_TALK 20:22, 21 October 2009 (UTC)

Post-archival reply to Ruakh: That, I assume, would only work for citing examples of the term that are exactly the same as the page name, right? † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 19:24, 18 January 2010 (UTC)

Ruakh replied on his talk page. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 13:27, 19 January 2010 (UTC)

Deletion and verification

Two points, which I will number to try and make sure both get a reply

Why is there an archive for RFV debates but not RFD ones? As far as I can see, MediaWiki:Previously deleted entries is for all deleted entries, but RFV (verification) ones get archived twice, while RFD (deletion) ones only get archived once. It seems simple enough to me just to not use the WT:RFVA (archives) page, which is easy enough as I don't think anyone ever does, apart from me, and for no good reason that I'm aware of.

RFD debates should not (usually) be closed in less than 7 days, unless it's a speedy deletion candidate. Msh210 correctly pointed this out to me, but it's not on the page itself. I suppose I can be a bit over-eager to delete or keep stuff, mainly because some debates have been known to stay on the page for over a year - I use those words, because the debates don't "go on" for a year, they go on for two weeks and then there are another 50 weeks where nobody does anything. I don't really like deleting entries when the consensus is only 3-0 or sometimes 2-0, but I can't "make" people vote either. In fact, I'll edit the RFD page right now. Mglovesfun (talk) 11:18, 13 September 2009 (UTC)

AFAIAC, them's as does the work can make the rules, when it comes to archiving. If anybody has a problem with how you're doing it, they can go back into history and do it themselves. So, anybody got a problem with how Mglovesfun wants to handle archiving? Yeah, I didn't think so. ;-)

1. Once debates have been archived to a suitable location, there certainly isn't a need for a second archive. This was probably just a mistake to begin with; I can't recall such a practice in years past. 2. Seven days seems reasonable, with allowance for continuation up to ~30 days if there is no clear consensus. -- Visviva 12:18, 13 September 2009 (UTC)

Clickable translation bars

Would there be any objection if I made it so that clicking on the entire translation bar instead of just the button would open it? I did some informal usability testing with a friend (going through Category:Translation requests (German)), and one of the things that I noticed was that initially, the assumption was that clicking on the bar would open it, and latterly attempts to click on the link missed by a few pixels from time to time. The other thing I found particularly tricky was finding the category in the first place, and given that it is now "Hidden", I'm not sure how anyone would manage it by clicking links alone (though maybe I missed a link somewhere). Conrad.Irwin 11:19, 13 September 2009 (UTC)

Seems reasonable to me. I wonder if we shouldn't leave the "show" bit in, as some are accustomed to it, but if we make a standard of this, we can probably remove it in time. -Atelaes λάλει ἐμοί 12:10, 13 September 2009 (UTC)

My natural instinct expects the whole translation bar to be clickable. --Vahagn Petrosyan 12:30, 13 September 2009 (UTC)

Great idea! I assume that in due course you would do the same for "rel" and "der" as well. "Der" might be a lower risk for testing purposes.

Wasn't the rationale for having the category be hidden that it was too much of an open invitation to anons to add translations? If so, I wonder if it would be desirable to run an experiment. At the same time that one or more of the translation requests categories was unhidden, if translations added by anons could automagically get a "ttbc" for a day or more, we could actually determine whether there was a major problem with their average quality. DCDuring TALK 12:49, 13 September 2009 (UTC)

I don't think so. Quite frankly, the only thing of use that anons often do is add translations. The rationale was that some entries, especially entries which have recently had their senses and/or translations regrouped, often end up with an ugly festering mass of categories at the bottom, and we were trying to eliminate cats from that which we figured readers wouldn't find useful. I'm ambivalent on whether it's a good idea or not. -Atelaes λάλει ἐμοί 13:24, 13 September 2009 (UTC)

Good idea. Mglovesfun (talk) 13:42, 13 September 2009 (UTC)

I can't see any reason why not... can't imagine its not being useful. Go for it, AFAIC. L☺g☺maniac chat? 13:44, 13 September 2009 (UTC)

Part of the rationale behind the assisted translations tool being enabled for all was to encourage adding translations - I don't know whether these are of high quality, as I don't speak any foreign languages. With my friend, it seemed to oscillate between erring on the side of caution and the other way - if you want to review those edits they are the german translations recentish in Special:Contributions/Conrad.Irwin. Conrad.Irwin 14:43, 13 September 2009 (UTC)

This is now done, if it is causing problems, please undo it. Conrad.Irwin 14:43, 13 September 2009 (UTC)

It's not working for me (at least, not yet). I like the idea, but think we should leave in the . We still get new users who don't realize that the bar is a collapsed box full of information. --EncycloPetey 15:00, 13 September 2009 (UTC)

I was not planning to remove the box. Clearing your cache (ctrl+shift+F5) should fix it. Conrad.Irwin 15:20, 13 September 2009 (UTC)

Uh... that adjusts the volume. I have a Mac. Clearing the cache makes no difference (but that doesn't mean it won't be working later), and this is something I've come to live with from previous experience. Hence, I say "not yet". --EncycloPetey 15:23, 13 September 2009 (UTC)

You should be able to get it to fix on any specific page by command-shift-r though. I have a mac too, it's working for me as I go through English pages adding translations back to other entries that I have created. --Neskaya kanetsv? 03:50, 14 September 2009 (UTC)

Lookin’ good! One thing: how will this affect wikilinks in translation-table bars? † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 11:58, 14 September 2009 (UTC)

I just checked: wikilinks in translation bars are unaffected; this all seems to work fine. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 17:51, 14 September 2009 (UTC)

Awesome. Works here... :D L☺g☺maniac chat? 17:02, 14 September 2009 (UTC)

Hm. But now the button doesn't work, just the bar itself. L☺g☺maniac chat? 22:00, 14 September 2009 (UTC)

Ach, now you've told me (on IRC) that you use Safari, I've found that Safari does not support the method I have used to prevent the bar opening when links are clicked. The result is that when you click on the link, the click on the link opens it, and the click on the bar closes it. I will try and find a good solution to this. Conrad.Irwin 22:34, 14 September 2009 (UTC)

Since when are there links in translation table bars? I've always thought those were to be discouraged. Nadando 22:36, 14 September 2009 (UTC)

like when the relevant translations are under another headword: cf. apple L☺g☺maniac chat? 22:50, 14 September 2009 (UTC)

For uses of {{jump}} et al. They are few and far between, but certainly there. The Safari problem should now be fixed. Conrad.Irwin 23:35, 14 September 2009 (UTC)

Confirmed. The whole enchilada works for me now. --EncycloPetey 04:14, 15 September 2009 (UTC)

Wiki Campus Radio - Halloween Podcast

Hi,

Wiki Campus Radio (a project on English Wikiversity) is looking into making a Halloween podcast. As I feel Wiki Campus Radio's podcasting should reflect the entire Wikimedia universe, it would be appropriate if some content derived from Wiktionary featured.

Given the focus of Wiktionary, the idea was to produce a short 10 min audio item concerning the origins and superstitions of some words associated with Halloween and related themes.

I wasn't sure what words might be suited though, and so the guidance of the experts here in developing an initial script would be greatly appreciated.

Given that Wiki Campus Radio is nominally hosted on Wikiversity, some degree of original academic synthesis may be acceptable.

Sfan00 IMG 19:09, 14 September 2009 (UTC)

Interesting... Maybe something about hallow (needs work), or jack-o'-lantern? Or perhaps the strange journey of trick or treat from rhyme to interjection to verb? -- Visviva 08:45, 17 September 2009 (UTC)

Indeed, those are some good starting points. I can start a Wikiversity page, if you can find people interested in making this happen :)

Sfan00 IMG 23:06, 17 September 2009 (UTC)

Watcher: New tool counts how many users are watching a page

This tool, called Watcher, will tell you how many users are watching your user page. It also works with any wiktionary entry page that begins with a capital letter. I've asked the author if he can modify it to work with wiktionary entries that begin with a lower-case letter. -- WikiPedant 20:47, 14 September 2009 (UTC)

Glad you asked for the mod. DCDuring TALK 21:51, 14 September 2009 (UTC)

Well, it seems that in the last couple of hours someone has modified Watcher so that it no longer provides a count if a page has fewer than 30 watchers. That probably pretty much eliminates most of wiktionary. So, never mind. -- WikiPedant 04:34, 15 September 2009 (UTC)

A proposed revision to CFI

In the RfV discussion for abnodate, I expressed a grievance I had with the CFI, whereupon DCDuring suggested that I propose a revision to them here; so, this is what I’m doing. I think that all my points have been made in general terms in this section of my talk page (pertinently, in my most recent post therein, timestamped 21:33, 10 September 2009) and, in the specific case of abnodate, at its RfV (see especially my post timestamped 21:00, 14 September 2009); therefore, I shan’t clutter the Beer Parlour by copying or paraphrasing those two posts here. In brief, my arguments concern the purpose of the CFI and ask whether the letter of the criteria is in accord with their spirit, they note the (necessary) imperfection of our verification tools, and they conclude by advocating that we allow appeals to authority where there is great precedent to do so.
The policy discussion should probably take place here, so please respond here rather than on my talk page or in the RfV discussion. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 11:02, 15 September 2009 (UTC)

It seems like you're proposing that if a word is in a lot of dictionaries, even if there is no evidence that it has ever been used to convey meaning, we should include it. I could not oppose this more strongly. We are here to document words in use; words that are never used are outside our proper scope. When it is a matter of the fine points of defining and context labeling, I have no problem with leaning on authority... but where it is a question of whether a word actually exists, there is no excuse for using anything but actual use as our guide.

How about simply collecting the mentions of these non-words in Citations: space? A Category:Dictionary words might be of great interest to those with an interest in such things... and it would provide an easy springboard for demonstrating real use where this is possible. -- Visviva 12:35, 15 September 2009 (UTC)

To quote the post on my talk page:
“Google Book Search is not an exhaustive archive of everything written through the medium of English since 1470; whilst a term showing little or no evidence of use on the Internet is probably beyond our means to verify, this does not mean that the term does not exist, being used, by someone, somewhere; consequently, someone may still ‘run across it and want to know what it means’. What is often stated, and rightly so, is that inclusion as a headword in a dictionary does not constitute verification; however, when a word is included as a headword in several dictionaries, over centuries, then maybe it’s time we take the hint.”
That said, your idea of a category has given me an idea for a compromise. How about an appendix such as Appendix:Unattested terms listed in other dictionaries? Our entry for abnodate could then be stubbed per *caligynephobia, which would direct readers to the appendix, where the term could be given a mini-entry. The criterion for inclusion in the appendix would be inclusion in one (or two, or whatever minimum) or more authoritative dictionaries (which we can define thereatop). I believe that this would be the best of both worlds, defining terms with long standing in authoritative lexicographical works, whilst ensuring that our foundational principle of attestability is not violated. What do you say? † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 13:19, 15 September 2009 (UTC)

I think you are ascribing lexicographic authority to authors and publishers of dictionaries that do not deserve it. For slang, underworld, and dialect terms one might rely on works with low attestation standards for evidence of use and meaning. For terms that are not likely to be colloquial, there is no reason to assume that there is significant use of a term in a modern language that does not appear in writing.

If a non-colloquial term cannot find a way of leaving evidence that it has survived in the wild, then perhaps lexicographers should take the hint, as most professional lexicographic institutions do. A term that is offered to the public in several dictionaries over the centuries and fails to find use is a proven non-word.

Attestation is a core principle of lexicography. Otherwise any inventive author with an undemanding publisher can claim any arbitrary arrangement of letters is a word. That there is normally a purported etymology, often from the classics, in the actual instances of oft-mentioned, unattestable-in-use words accounts for much of the persistence of these terms. They are memes that seem to live and reproduce mostly among classicists and antiquarians. They seem to be creatures that are not hearty enough to live and reproduce in the wild. They mostly seem to have the effect of confusing the readers of the occasional translator who finds them in an ancient dictionary and finds them convenient to use. DCDuring TALK 14:38, 15 September 2009 (UTC)

DCDuring,

“I think you are ascribing lexicographic authority to authors and publishers of dictionaries that do not deserve it.”

You mean Webster’s and the OED?

“Attestation is a core principle of lexicography.”

Hence “I believe that this would be the best of both worlds, defining terms with long standing in authoritative lexicographical works, whilst ensuring that our foundational principle of attestability is not violated.”

“That there is normally a purported etymology, often from the classics, in the actual instances of oft-mentioned, unattestable-in-use words accounts for much of the persistence of these terms.”

A situation highly analogous with all those phobias from Ancient Greek roots which, nevertheless, we retain and define in an appendix — and they don’t even have these palæologisms’ claim to listing in lexicographical works for centuries.

“They mostly seem to have the effect of confusing the readers of the occasional translator who finds them in an ancient dictionary and finds them convenient to use.”

Hence the value of our defining them.

† ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 16:29, 15 September 2009 (UTC)

Lexicographic authority: If the only dictionaries we are talking about accepting entries from are Webster's 1913 and OED, I would happily consent. I don't see that there are likely to be more than a couple of thousand that are not attestable. I could imagine accepting other dictionaries, but I would like the list to be part of any proposal. There are many other dictionaries from which we have unattested entries. In English I see material from Century, slang/cant dictionaries and so forth. Immediately preceding this topic, ], a mention-only word was bruited based on more questionable sources.

Attestation: There is nothing that prevents one from adding citations now in citation space for any entry that would meet CFI were it attested.

Phobias only-in: Only-in has been accepted in practice only for phobias and military slang, where its use has been an experiment. I believe that part of the motive was to have a structured approach to handling contributions from those not regular contributors to reduce the acrimony of some of the patrol confrontations.

Zombie words: These words would never have been born except from the minds of classicists. These hothouse creations have been offered to the public which has rejected them. Only plagiarizing pretend-lexicographers have kept them in their undead state. All responsible dictionaries have accepted the responsibility of separating the living from the undead. We may not be able to put a stake through their hearts, but we can banish them from the land of the living so that they do not find unsuspecting hosts among our users. DCDuring TALK 19:31, 16 September 2009 (UTC)

Can you explain why you would oppose {{only in}}? After all, it draws a brighter line between zombie words and real words than simple deletion would do. -- Visviva 03:55, 17 September 2009 (UTC)

Yes, using {{only in}}, as we do with invented phobias et al., would be fine AFAIAC. Probably more viable than my Citations: idea. There are quite a few such words we've encountered over the years, but I don't have a list handy... does anyone? -- Visviva 15:07, 15 September 2009 (UTC)

I assume that such words either failed RfV or remain here, unnoticed. The fact that many entries from Webster’s were bulk-added hereto long ago explains why they slipped by. Presumably, they will be picked up as time goes on and entries are directly attested by us. I for one will not mind {{rfv}}ing terms that I suspect will fail the process now that I know that valuable information will not be lost, but rather preserved in an appendix. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 16:29, 15 September 2009 (UTC)

I'd oppose this. Mglovesfun (talk) 18:47, 16 September 2009 (UTC)

Can you explain why? It seems very analogous to the invented-phobias situation. If we simply delete these entries, people who assume that (for example) abnodate is a real word will also assume that it has been omitted by accident. At the very least they will acquire a poor opinion of our coverage. Worse, if they are a public-spirited word geek -- that is, a Wiktionarian in spirit -- they may even try to add or re-add the entry. Far better than this, IMO, to have a signpost that makes clear what the word is, and also where further information about it can be found or added. Hence, {{only in}}. Less work for the community, and less discouragement of prospective contributors (there are few things more discouraging than having one's good-faith contributions deleted out of hand). -- Visviva 03:55, 17 September 2009 (UTC)

If there were a way to limit this to deserving anons and we were only talking about a short list of approved-by-vote dictionaries of authority, the recruitment value might outweigh the zombie-word problem. In the case of English occurrence of a word in 3 dictionaries or editions thereof without any uptake in wider use evidenced by attestable use is a clear sign that a term is not a word fit for a discriminating dictionary. Failure to discriminate is a disservice to users that diminishes our brand. It puts us with the various novelty-word books that clutter the shelves. It doesn't actually help in areas such as technical vocabulary where there is no dictionary coverage, just perhaps a government or trade-association glossary. DCDuring TALK 11:40, 17 September 2009 (UTC)

I feel that we are talking at cross-purposes. {{only in}} looks nothing like an entry. It will not lead anyone to think that abnodate et al. are real words, and will not result in any (mis)-information being picked up by downstream applications like Google. AFAICT, it discriminates pretty much as clearly as could possibly be done. Real words get an entry; non-words get a link to an appendix (which should make the reasons for the entry's absence clear). So what's the problem? -- Visviva 11:54, 17 September 2009 (UTC)

Started Appendix:Words found only in dictionaries, currently working mostly from OED entries that cite Blount (about 50% of which prove to be dictionary words). It's interesting to see the trends in OED inclusion over time; the first volume had a couple, but then there were almost none until L... but after L, all hell breaks loose, and it looks like there are at least a couple hundred altogether. They seem to have made no effort to clean any of these up over the past century... I guess once a word is in the OED, it doesn't come out. -- Visviva 11:54, 17 September 2009 (UTC)

Shall I take it that the new appendix is now accepted as an appropriate way to deal with these so-called “zombie words”? † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 04:27, 20 September 2009 (UTC)

Since only four have actively participated in the conversation it would seem we have no information on a consensus. For an Appendix, not much consensus seems to be required. It might be worth a vote on the general subject of the use of {{only in}}, including for this class of words. DCDuring TALK 11:30, 20 September 2009 (UTC)

Feel free to start that vote. I’d just noticed that all objection seemed to have died once Visviva created the appendix; however, I didn’t want to infer consent from silence. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 12:22, 20 September 2009 (UTC)

User:DCDuring added rfv to Chiayi, another place name that may become a victim of the current rules and place name haters. Anatoli 22:07, 30 September 2009 (UTC)

SI units and abbreviations

ZΩ and zΩ are currently up for review at RfV. These are components of the International System of Units, the official nomenclature of units (and their abbreviations) that has been adopted by the entire world, save the U.S. and a handful of fairly small countries. It seems absurd to me that we would not report on words and symbols for which the entire world had agreed on a meaning, especially since the inclusion of those that are used would leave a patchwork grid full of holes if we exclude the ones that we know to exist, but for which citations "in the wild" can not be found. This seems to me to be a case for a CFI exception based on international adoption of the nomenclature. bd2412 T 18:29, 16 September 2009 (UTC)

On the contrary, it seems absurd to me that we should include words nobody has truly used. Equinox ◑ 18:39, 16 September 2009 (UTC)

This is a situation like others in which we had, until recently, simply accepted the authority of international standard-setting bodies or processes in scientific fields: taxonomic names, chemical names, language codes and names. We thereby effectively ignored attestation under CFI. As I see it we have options something like below:

Substantively, we could:
1. bundle these into a group of bodies whose decisions we accept,
2. state a general principle, or
3. opine on just ISU.
Procedurally, we could:
1. go for a vote to amend CFI in one of the three ways above,
2. express a consensus to selectively ignore CFI in this area,
3. or leave things as vague as they are.

-- DCDuring TALK 19:04, 16 September 2009 (UTC)

Alternately, we could redirect these to an appendix - but then some (such as nanometer) would not redirect and other (such as zeptometer) presumably would. This is of no great moment if the attestable entries include a link to the appendix. bd2412 T 19:30, 16 September 2009 (UTC)

Ah, the fourth procedural option, "none of the above", aka, creativity or remembering the forgotten. Don't we already have such an appendix or at least a base for the ideal one, Appendix:SI units? It does not now have the right format for this purpose, does it? Or can "spans" be inserted in tables to accomplish something good?

On a side note, I'd add that the French Wiktionary has all of them - I'm not familiar with their CFI, but it seems odd to me that ours would exclude what theirs includes. bd2412 T 19:45, 16 September 2009 (UTC)

Oh, they're so dirigiste. They just do what the Académie tells them. |;)) DCDuring TALK 20:09, 16 September 2009 (UTC)

IMO it would be extremely lame for us not to have these. We are trying to build a useful reference work (or set of reference works), not some sort of monument to consistency. -- Visviva 03:25, 17 September 2009 (UTC)
Agree with Visviva. In any event, these meet the first three of CFI's attestation criteria, anyway:
1. Clearly widespread use
2. Usage in a well-known work
3. Appearance in a refereed academic journal
Right? —Rod (A. Smith) 05:02, 17 September 2009 (UTC)

I'm not sure that's the case. The system certainly meets these requirements, but I can't find any uses of zeptoohm/zettaohm, or ZΩ/zΩ, whatsoever -- in refereed journals or elsewhere. Even if the pertinent ISO standard counts as a well-known work (and actually includes these explicitly), these terms' appearing therein would constitute a mention only. I don't think there is a clear basis in current CFI for having (all of) these; but that still doesn't mean we shouldn't have them. They are a highly finite set of useful terms; the perfect case for a targeted exception to the rule. -- Visviva 10:24, 18 September 2009 (UTC)

As far as I can tell, it is perfectly possible to argue that these are sum-of-parts anyway. Circeus 18:20, 17 September 2009 (UTC)

Of what parts? These are all whole, unhyphenated words (or abbreviations thereof). bd2412 T 20:17, 17 September 2009 (UTC)

We rarely test abbreviations in RfV or RfD. It would seem that intrinsically there would need be different ways of applying the rules. DCDuring TALK 20:10, 17 September 2009 (UTC)

These are very useful for a dictionary to have. We get terms such as this all the time in translation jobs that we do for companies like Bell Helicopter, Halliburton, and so on. People would never be able to figure it out just by looking up z and Ω separately. —Stephen 20:54, 17 September 2009 (UTC)

What makes people say that these are part of the official SI terminology? It seems to be an SI prefix and an SI unit that no one, including the SI, has ever actually combined to use—the abbreviated form or the spelled-out version. This seems more like one of those theoretical phobias we always get, that are really just Greek prefixes creatively added onto "-phobia" to coin words that aren't actually used. I don't think it is right to speak about this as if anyone is ever going to look it up, having come across it in some work. Can anyone even provide a mention of this in an SI publication or another reputable source? Is there some source for the notion that all SI prefixes may be combined with all SI units indiscriminately? Dominic·t 17:37, 19 September 2009 (UTC)

As for the last question, all SI prefixes may be combined with all SI units indiscriminately will never be true, with an exception that one of the SI base units kilogram has an SI prefix as a part of its name and can't be directly prefixed with another. Other than that, I've never heard of or read about explicit concerns over combining an SI prefix and an SI unit, while using a compound prefix (juxtaposition of multiple prefixes) is being strongly discouraged in the AIP style guide and other authoritative sources. --Tohru 15:57, 21 September 2009 (UTC)
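To illustrate the size of the grid under discussion, here is a rough sketch that combines each SI prefix with a unit name and symbol. The function and variable names are hypothetical; the prefix list is the set of twenty in force at the time of this discussion; and plain concatenation is an assumption that does not model any spelling variants.

```python
# The twenty SI prefixes (name, symbol) as of 2009, largest to smallest.
SI_PREFIXES = [
    ("yotta", "Y"), ("zetta", "Z"), ("exa", "E"), ("peta", "P"),
    ("tera", "T"), ("giga", "G"), ("mega", "M"), ("kilo", "k"),
    ("hecto", "h"), ("deca", "da"), ("deci", "d"), ("centi", "c"),
    ("milli", "m"), ("micro", "µ"), ("nano", "n"), ("pico", "p"),
    ("femto", "f"), ("atto", "a"), ("zepto", "z"), ("yocto", "y"),
]

def prefixed_units(unit_name: str, unit_symbol: str) -> list[tuple[str, str]]:
    """Hypothetical helper: return (name, symbol) for every prefix + unit."""
    # The kilogram already carries a prefix, so prefixes attach to "gram".
    if unit_name == "kilogram":
        unit_name, unit_symbol = "gram", "g"
    return [(p + unit_name, s + unit_symbol) for p, s in SI_PREFIXES]

for name, symbol in prefixed_units("ohm", "Ω"):
    print(name, symbol)
```

Each unit therefore yields twenty candidate entries, of which only some are attestable in use, which is the "patchwork grid" concern raised at the top of this section.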

I still don't see the problem. We are speaking of a few dozen entries (particularly given that those in the middle ranges will be easily attestable), for which it is indisputable that the specific combination of prefix and unit is the only internationally acceptable term for the description of a particular size of unit. Granted, we may never see something with a mass of a zeptogram, but this is still the only correct name for that amount of mass, and the accuracy of the definition can not reasonably be questioned. bd2412 T 03:11, 24 September 2009 (UTC)

Are we a purely descriptive dictionary, or not? If our guidelines were to acknowledge that technical terminology is often prescriptively defined, then we'd be able to add these. As it stands, they are not used in the language, so they can stay in appendix limbo. “This is lame” and “it's only a dozen” may be true, but they're not principles on which to operate the volunteer-run, biggest dictionary in the world. —Michael Z. 2009-09-24 03:35 z

We're not the ones being prescriptive, though. The SI came up with these units, and almost the entire rest of the world has gone along with them. bd2412 T 03:59, 24 September 2009 (UTC)

If you can't define zeptoohm descriptively, according to attested usage (3 quotations required by our CFI), then any definition that we include for it is prescriptive.

The SI is a body which sets standards. It makes up rules. If we recopy their prescribed definitions, then we are being prescriptive. —Michael Z. 2009-09-24 04:52 z

There's no shame in it. A huge proportion of a dictionary is technical and scientific terms, and many of them are prescriptively defined and used. We just need to acknowledge that and accommodate it in our guidelines and usage labels. —Michael Z. 2009-09-24 05:01 z

We could include them but label them as neologisms. The last four prefixes (yotta-, zetta-, zepto-, yocto-) were not established until 1991. bd2412 T 19:14, 24 September 2009 (UTC)

Supplements to enPR

Our system for specifically-English pronunciatory transcription omits any symbols for the many semi-naturalised foreign sounds used by Anglophones when pronouncing semi-naturalised foreign words. The American Heritage Dictionary has <ɶ> for French feu and German schön as well as French œuf and German zwölf, <ü> for French tu and German über, <KH> for German ich and ach and Scots loch, and <ɴ> for French bon; Dictionary.com itself has <a> for French ami, <kh> for the AHD’s <KH>, <œ> for French feu and German schön, <r> for French au revoir and Yiddish rebbe, <uh> for French œuvre, <y> for the AHD’s <ü>, and a variety of vocalic characters followed by <n> for some French and Portuguese nasalised vowels. (BTW, the OED¹ seems to have had a different pronunciatory transliteration scheme from both the AHD’s and Dictionary.com’s; it had, for example, prəmye daṅsṏr for premier danseur (/prəmje dɑ̃sœr/). It didn’t have <ṏ>; it actually had an ‘o’ with a macron and an umlaut, but I could find neither such a precomposed character nor a combination of characters and combining forms which would display the OED’s character correctly.) For transcriptions approaching anywhere near “correct” (as normally considered) pronunciations, such semi-foreign phones are necessary. So, I propose that we supplement enPR with the symbols for some semi-naturalised sounds used to pronounce these xenogenous words. Here are my specific suggestions:

+<ɴ> to denote the nasalisation of the previous vowel, per both Dictionary.com and the AHD; equivalent to ⟨˜⟩ (tilde diacritic).
+<ʀ> to denote any of the “rolled ‘r’s” (viz., the alveolar trill (⟨r⟩) and the voiced uvular trill (⟨ʀ⟩) and fricative (⟨ʁ⟩)), in contradistinction with <r> (denoting the native alveolar (⟨ɹ⟩) and retroflex (⟨ɻ⟩) approximants), per Dictionary.com. (I’d think that few Anglophones make distinctions between any of the “rolled ‘r’s”; perhaps some do ⟨r⟩ vis-à-vis ⟨ʀ⟩ and ⟨ʁ⟩ (so two separate symbols for the “rolled ‘r’s” may be defensible), but I expect that very few would distinguish ⟨ʀ⟩ from ⟨ʁ⟩.)
<KH> → <ĸʜ>: We already use a shrunken capital ‘KH’ (KH) to denote ⟨x⟩, Template:X-SAMPAchar (which is a native sound in some (e.g., Scottish) English dialects); however, that kind of font-style formatting is not preserved when one copies (from the external display) and pastes enPR transcriptions, whereas no information is lost by similarly copying and pasting IPA and SAMPA transcriptions (e.g., whereas and Template:X-SAMPAchar remain the same, lŏKH becomes lŏKH). Æsthetically, enPR (and its ancestor schemes’) transcriptions are meant to look like diacriticked words written in lower-case, which quality, compounded with the lack of the need for slashes or brackets to give phonemic or phonetic context, makes them especially suited for use in running text (such as in usage notes discussing pronunciation). This sort of appeal to format preservation is a basis for favouring <ɴ> and <ʀ> over shrunken ‘N’ and ‘R’ (as above), <dh> over italicised ‘th’ (as below), &c. mutatis mutandis. Foreseeable opposing arguments to this are the inconvenience of having to use the edit tools to insert these non-ASCII characters and the encoding problems that may result from their use. In response to the first, I would say that such inconvenience is minimal given the fact that the edit tools need already be used to insert the diacriticked vocalic characters which are used in almost every enPR transcription, and what little inconvenience this presents is nevertheless outweighed by the inconvenience of typing seventeen characters in place of two, as the alternative requires (ĸʜ vs. KH); to the second, greater encoding problems are caused by <o͞o> and <o͝o>, whose diacritics are occasionally substituted with ? when character support is poor, than are caused by characters of the Latin Extended-A and IPA Extensions Unicode subsets (whence <ɴ>, <ʀ>, <ĸ>, <ʜ>).
<th> → <dh> for the reasons of format preservation given above for “<KH> → <ĸʜ>”. Such usage would not be without precedent; Fowler’s Dictionary of Modern English Usage (1926) used <dh> to denote ⟨ð⟩, for example. This would make logical sense, drawing an analogous parallel between <sh>–<zh> and <th>–<dh>; we write <zh> not <sh> and <KH> not <ch> (as in Ernest A. Baker’s New English Dictionary, 1932).
+<ö> for the AHD’s <ɶ> and Dictionary.com’s <œ>; +<ü> for the AHD’s <ü> and Dictionary.com’s <y>. I think that the least confusing characters we could use to represent those two semi-English phonemes would be the German umlauted vowels for them, <ö> and <ü>. <ʏ> makes sense inasmuch as it matches the IPA phone for one of the two symbols its use in enPR would be intended to denote; however, a ‘y’-variant character would not (IMO) imply an <ü> sound to most Anglophones, and it has the chance of being confused with <y>, which the IPA uses for the other sound denoted by <ü> and which our enPR uses to denote the semivowel ⟨j⟩, Template:X-SAMPAchar. Using <ɶ> is a bad idea, because, in the IPA, ⟨ɶ⟩ denotes the open front rounded vowel, whereas the two sounds covered by the AHD’s use of <ɶ> are the open-mid front rounded vowel (⟨œ⟩, Template:X-SAMPAchar) and the close-mid front rounded vowel (⟨ø⟩, Template:X-SAMPAchar); if the œthel is to be preferred, the minuscule form <œ> is better than the shrunken majuscule, given that it matches the IPA phone for one of the two symbols its use in enPR would be intended to denote and since that’s the sound denoted by the French ligature (deprecated template usage) œ. Still, <ö> and <ü> are best, since they’re parallel (unlike <œ> and <ʏ> or <œ> and <ü>) and because, in German, (deprecated template usage) ö is pronounced , Template:X-SAMPAchar in its short form and , Template:X-SAMPAchar in its long form, whilst (deprecated template usage) ü is pronounced , Template:X-SAMPAchar in its short form and , Template:X-SAMPAchar in its long form, so they cover all the phones they represent. 
There is, however, the lack of a parallel with the third German umlauted vowel, (deprecated template usage) ä, pronounced , Template:X-SAMPAchar in German, but which denotes , Template:X-SAMPAchar in the context of enPR; that said, I don’t think that’s much of a problem, since that German vowel’s short form is denoted by <ĕ>, whereas the long form is the subject of the next bullet.
+<ë>, <ê>, or <è> for , Template:X-SAMPAchar. <ĕ> only denotes the short form (, Template:X-SAMPAchar), but this is unsuitable for representing the , Template:X-SAMPAchar in the three Anglicised RP pronunciations of (deprecated template usage) première danseuse; I settled for <ĕĕ> in the end, but that is not appropriate, since the underlining isn’t preserved when the transcription is copied-and-pasted (see “<KH> → <ĸʜ>” above), a two-‘e’ digraph misleadingly implies that in normal English spelling ‘ee’ is sometimes or always pronounced , Template:X-SAMPAchar (per the principle behind <o͞o> and <o͝o>), and, frankly, it’s an ugly fudge. I think that the best way to represent the long form of <ĕ> would be to use <ë>, by analogy with <ä> — the tremata denoting “secondary long forms” (so <ë> contrasts with <ĕ> in the same way that <ä> contrasts with <ă>). Alternatively, we could use <ê>, but the circumflex seems to denote lengthening in combination with the effects of a subsequent ‘r’ (except for the case of <ô>), so I think that it would be less appropriate. As a third option, there’s <è>, which has the advantage of being homographic with the French letter (deprecated template usage) è which is pronounced , Template:X-SAMPAchar (vowel length not being phonemic in French), but has the disadvantage of using a diacritic that does not occur atop any other enPR character (though, OTOH, the same goes for <ä> ATM, and for the use of <é>, which I propose below).
+<é> for , Template:X-SAMPAchar, in contrast with <ā> (, Template:X-SAMPAchar); e.g., as in the distinction between (deprecated template usage) nay (nā, , Template:X-SAMPAchar) and (deprecated template usage) né (né, , Template:X-SAMPAchar).
+<ӛ>, <ə̄>, <ə̈>, or <ə̂> for , Template:X-SAMPAchar as needed for (deprecated template usage) première danseuse; the nearest to that — <ûr>; , ; Template:X-SAMPAchar, Template:X-SAMPAchar — is/are inappropriate for that purpose. <ӛ> is definitely the best choice, IMO, since it’s a precomposed character (specifically, Unicode’s “Cyrillic small letter schwa with diaeresis”), which ensures that the diacritic is displayed in the correct location atop the schwa (for me, the diacritics of each of the other three alternatives (with a macron, diæresis, or circumflex) are all displayed much too far to the right); furthermore, I think the trema is the most appropriate diacritic, since the sound is only lengthened, and otherwise remains unaltered (cf. <ă>, <ä>, <ā>; <ĕ>, <ë>, <ē>), and the difference has nothing (necessarily) to do with the effects of a subsequent ‘r’ (cf. <â>, <î>, <û>).
Finally, concerning Dictionary.com’s <a> (for French (deprecated template usage) ami: , Template:X-SAMPAchar) and <uh> (for French (deprecated template usage) œuvre: , Template:X-SAMPAchar): I don’t think that either of these are necessary. If such sounds do need to be distinguished, I believe it’d be best to use <a> and <ə>, respectively.

These supplements will greatly improve enPR’s functionality, matching or improving upon that of the AHD’s, Dictionary.com’s, and other dictionaries’ phonemic transcription schemes. Besides functionality, the fact that these addenda introduce more ways in which our scheme differs from the AHD’s should mean that kwami will stop calling for enPR to be renamed AHD.
Right. Hopefully, this won’t get too many TL;DR responses. Thoughts, anyone? † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 00:40, 17 September 2009 (UTC)

TC; DRC. ;-) But I'm sure someone does. -- Visviva 09:22, 17 September 2009 (UTC)

Sorry, I don’t understand; please add that. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 14:03, 17 September 2009 (UTC)

I just made it up (AFAIK), to indicate "too complex; don't really care." Phonology is a black box for me, I'm afraid. (I've long felt that phonologists are the only group that are consistently more insane than lexicographers; but in all likelihood, I simply have a tin ear for pronunciation.) -- Visviva 14:51, 17 September 2009 (UTC)

Nah, given how long it took me to write all that and that it came to almost 16KB, I’ll agree with your original assessment. Right then, the anti-TL;DR version:

Our enPR scheme doesn’t cover semi-naturalised pronunciations at all well, so we need supplementary symbols to represent the phones used in those pronunciations, just like the AHD, Dictionary.com, OED, and other dictionaries have. Specifically, we should make the following changes:

+<ɴ> to denote vowel nasalisation; equivalent to ⟨˜⟩.
+<ʀ> to denote any of the “rolled ‘r’s” (viz. ⟨r⟩, ⟨ʀ⟩, and ⟨ʁ⟩).
<KH> → <ĸʜ> — Let’s use characters that are innately small caps, rather than capitals shrunk using ….
<th> → <dh>, per the reasons in “<KH> → <ĸʜ>”, per Fowler’s 1926, and per logic.
+<ö> for German’s ö and French’s œ.
+<ü> for German’s ü and French’s u.
+<ë> for , Template:X-SAMPAchar (i.e., the long form of ĕ, , Template:X-SAMPAchar), as in (deprecated template usage) première danseuse.
+<é> for , Template:X-SAMPAchar (as in (deprecated template usage) né), in contrast with <ā> (, Template:X-SAMPAchar; as in (deprecated template usage) nay).
+<ӛ> for , Template:X-SAMPAchar (i.e., the long form of ə, , Template:X-SAMPAchar), as in (deprecated template usage) première danseuse.

Thoughts, anyone?

There, and all the suggestions fit onto one line each. Obviously, the rationales are lost in the (considerably-) shorter version, but by God! it’s clearer. BTW, I’ve revised the enPR transcriptions we have for première danseuse, to show some of these characters in action. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 15:48, 17 September 2009 (UTC)

Do we have any information of the number or proportion of folks who actually use enPR? What about of those who actually use enPR and not IPA? What about those who actually ever use any phonetic alphabet renderings? Have we taken a poll of contributors on this? What about anons? What about any general population? College students? ESL learners? In the absence of such information, I find it difficult to get interested in this matter. DCDuring TALK 16:18, 17 September 2009 (UTC)

None accurate, AFAIK. From my experience of chatting with people about this issue (not my most common topic of conversation, I must admit), ESL learners tend to be proficient in IPA and don’t use the more traditional enPR-style systems, in general. As for Anglophones, I think they find enPR-style systems more intuitive (and hence they learn them more quickly than universal transcription schemes like IPA or (X-)SAMPA), since the transcriptions tend to resemble far more closely the words they represent (e.g., change → enPR: chānj, IPA: /ʧeɪnʤ/) and the vast majority of the characters used thereby are especially chosen to denote sounds that those letters would normally denote in English (as is especially the case with o͞o and o͝o). Obviously, none of this has any statistical merit, but if you find the rationale and conclusions plausible, then it’s probably still useful to know. All that aside, none of your questions pertain to whether you think these supplements ought to be adopted; do you have an opinion regarding that? † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 03:41, 20 September 2009 (UTC)

My opinion, FWIW, is that it is not worth getting an opinion on the substance, because there seems to be no user-oriented discipline in the process of generating the proposal. I was hoping that someone evidently interested in the area actually knew something about actual facts of usage of phonetic transcriptions. DCDuring TALK 11:19, 20 September 2009 (UTC)

I don’t see how an interest in demographics necessarily follows from an interest in phonology. I can’t think of a good way to gather these statistics; if I did, I’d do so. In the meantime, I’ll rely on what individua tell me in conversations; I tend to find that the results are more coherent and intelligible than those of opinion polls, anyway. If we did have these “actual facts of usage of phonetic transcriptions”, what would you propose we do with them? If few users used enPR, would you propose we stop using those transcriptions? Your questions only concern the very fundamental issue of whether we use enPR at all (irrespective of what form it takes); for this reason, it would be a total waste of time to ask those questions unless we were seriously considering discontinuing the provision of enPR transcriptions (and I doubt very much that you’d get a consensus of people here who thought that to be a good idea).
Sod such dubious statistics, I say. I generally trust our capacity to reason as to the merits and demerits of a given proposal. What a priori hypothesis do you need statistical evidence to support? † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 13:38, 20 September 2009 (UTC)

I naively believe that folks on this project are interested in serving the needs of others while amusing themselves. Accordingly I naively expect folks to be interested in the specifics of what is useful, especially in areas where they apparently have some technical knowledge. If someone had such information, the information itself might have been useful in shaping decisions. The sole information I have garnered from the lack of response to my occasional inquiries is that no one has any such statistical information that they believe has any value more than the anecdotal. DCDuring TALK 16:16, 20 September 2009 (UTC)

(Not to drag this too far off-track, but... I don't think this is naive, but I don't think it properly leads to a focus on current web users -- that is, the non-random subset of people who happen to have found their way to the web interface for our lexical information repository. We certainly want that interface to be as good as possible without interfering in the real work of the project... but these concerns properly apply only to the interface, not to the content itself. If Wiktionary is ever a success, most use will take place through reusers who will be able to filter out anything that gets in their way. Thus even if there were evidence that enPR, or the more precise implementation proposed here, was deleterious to the current web-user experience -- which isn't very likely, IMO -- even then, I don't think that would be a sufficient basis for excluding content that is otherwise valid.) -- Visviva 16:48, 20 September 2009 (UTC)

I'd settle for information about what reusers want or even who they might be. And, for that matter, why we might care about any particular one or a group of them. Or a definition of "the real work of the project". A little transparency in that regard might help.

On the specific point under discussion: Is there any evidence that non-standard anything is of any help to reusers? DCDuring TALK 18:23, 20 September 2009 (UTC)

If you want statistics on re-users, use Google to search for Wiktionary + . The first page of hits I got for preantepenultimate yielded these three dictionaries (they were the pertinent ones); infer what you will from what they do with our content. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 03:42, 21 September 2009 (UTC)

Thanks. Based on that, I find it hard to swallow that we should trust them to serve users on our behalf. I'd like to see what else of ours they use on other entries and how they handle pronunciation and other sections where we offer them. Also how successful they are at capturing users from search engines. I'll be reporting back under a new header. DCDuring TALK 04:21, 21 September 2009 (UTC)

A caveat: There is often a very large time lag in these derivative sites being updated (if they ever are), so what you’ll be seeing in most cases will be content re-used from months or even years ago. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 05:07, 21 September 2009 (UTC)

We are a body of volunteers; we do not attend compulsory customer-service seminars. Of course I’d like to help our users find the information they need, but certainly not at the cost of diminishing the information content we have. That is how I have interpreted some of your comments in the past; for example, you opposed taking up too much of the first-screen vertical space in entries with non-definiential content (viz. alternative spellings, etymologies, and pronunciatory transcriptions), then we got {{rel-top}} et seqq., but you then opposed them since you assumed that users would miss the show/hide toggle, and thereby miss the information enclosed therein; there was nothing wrong with your arguments, just with your seeming floccinaucinihilipilification (;-)) of any information that wasn’t number one or two on the list of Things People Go To A Dictionary To Look For; eventually the whole bar was made clickable, achieving a win–win situation. Serve the higher as much as the lower.
Since you’re always talking about statistics, why don’t you compile them? People are more likely to listen to your arguments based on statistics than on your counterarguments decrying the universal lack of statistics.
Again, however, I must repeat myself: What bearing do these hypothetical statistics have on this proposal? † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 16:53, 20 September 2009 (UTC)

You seem to have a mistaken notion of what I do and don't favor. I favor vertical space reduction by almost any means possible, especially on the first screen. In that regard I have made probably thousands of edit changes in that direction: horizontalizing lists of alternative spellings and synonyms, trimming verbiage exceeding our best norms in Etymologies, and adding show-hide bars whenever content under eligible headings exceeded two items. The small size of the show/hide bars was a drawback, but never one that in any way stopped me from deploying them wantonly. Other than definitions, there is very little that could not be hidden under such bars or even something less space-consuming. Perhaps you could direct me to someplace where I opined otherwise.

I am a mere amateur among those who have apparently spent a great deal of time developing expertise in the area. I am not an ESL instructor or translator. I also have no access to subscriptions to scholarly journals. I normally expect people with expertise to have some information about such things.

If there is no statistical or other evidence suggesting that users' needs for pronunciation are not being served by what other online dictionaries offer, then we might draw some inferences from what they do: have easy-to-access audio pronunciation and low-space-consuming pronunciation sections. We can use the best industry-standard practices in this regard unless we have evidence that something else is worth doing. And "worth" can be defined in many ways, so long as some explicit user populations are considered. Who are the users: academics, language professionals, ESL students, writers, normal folk? Current users? Subsequent generations? Are reusers our best way of understanding what is wanted?

I have been concerned with making sure that we do not let the valuable secondary information get in the way of introducing users to the overall experience. They need to get what they want (usually definitions, sometimes usage, pronunciation, synonyms, or translations) the first few times in order to notice the riches that we sometimes can offer them in our good entries. It is the poor quality of too large a portion of our entries that I fear may kill off this grand experiment.

On this specific point, if users want IPA and can use it, why do we think they will look at enPR? Should enPR be dropped instead of extended? DCDuring TALK 19:03, 20 September 2009 (UTC)

¶ 1: I apologise for my mistake of interpretation, then. Still, at least that may explain some of my past comments. Anyway, I stand corrected.
¶ 2: Heck, I’m no professional either; I’m a B.A. philosophy student with a long-term interest in language, especially etymology. That’s all.
¶ 3: Most other dictionaries lack our pretensions to giving transcriptions for multiple dialects; they also tend to be stricter with the range of pronunciations they prescribe. Audio pronunciations are good, I suppose (I neither use them nor have the means to add them at present), but I prefer transcriptions personally; each to his own, but we should keep as many options open as is practical, IMO.
¶ 5: As I’ve argued above and below, some users prefer IPA and others enPR; if we’re going to drop any of the three transliteration schemes we use, it should be SAMPA. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 22:40, 25 September 2009 (UTC)

Chambers uses dh for the sound in breathe, as well as Fowler. Equinox ◑ 14:20, 18 September 2009 (UTC)

Thanks for that info.; dh is also (infrequently) used IRL (read: in real language), as in the case of Þrymskviða, which (in addition to having the Norse/Icelandic spelling preserved) is variously Anglicised as Thrymskviða, Thrymskvitha, Thrymskvidha, and Thrymskvida. (A fringe case, perhaps, but nevertheless notable and pertinent.) † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 03:41, 20 September 2009 (UTC)

Interestingly, the last point I made in my initial posting has gained immediate relevance; two days thereafter, kwami started the #plagiarizing AHD thread (four sections below this one) and has created a vote page to rename enPR back to AHD. I assume that it would be a hell of a hassle to change all the references to the pronunciatory-transcription system back to AHD. As I said above, precluding such accusations of plagiarism would be another benefit resulting from adopting these supplements to enPR. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 03:41, 20 September 2009 (UTC)

I’ve an example of a point I made above (“Æsthetically, enPR (and its ancestor schemes’) transcriptions are meant to look like diacriticked words written in lower-case, which quality, compounded with the lack of the need for slashes or brackets to give phonemic or phonetic context, makes them especially suited for use in running text (such as in usage notes discussing pronunciation).”); see parthen-#Usage notes, which reads:

The pär syllable is unstressed in tetrasyllables where the stress falls on the antepenult.

Substitute that with IPA or SAMPA, and you get:

The /pɑ(ɻ)/ syllable is unstressed in tetrasyllables where the stress falls on the antepenult.
The Template:X-SAMPAchar syllable is unstressed in tetrasyllables where the stress falls on the antepenult.

IMO, enPR is definitely to be preferred over IPA and SAMPA for such uses. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 03:53, 20 September 2009 (UTC)

The Canadian Oxford Dictionary includes the following in their pronunciations. The first, /ɜː/, is listed with the regular English vowels, but its example doesn't appear as a headword on its own. The last three “are used in the representation of French pronunciations” (CanOD 2004, inside back cover). They appear in the only Canadian-English pronunciation of some words, including the listed examples, and so should be included in any pronunciation scheme which intends to represent Canadian English. I believe that Oxford's IPA /ã/ is equivalent to our /æ̃/. —Michael Z. 2009-09-20 17:34 z

ɜː deux
ɑ̃ franglais
ã Canadien
ɔ̃ Brayon

Yeah, I remember seeing and reading something similar in the COED. The ones you give are examples of where enPR would completely fail to represent accurately those pronunciations without the supplements I propose. The enPR equivalencies to those IPA symbols you give are:

⟨ɜː⟩ — ӛ
⟨ɑ̃⟩ — äɴ
⟨ã⟩ — ăɴ
⟨ɔ̃⟩ — ôɴ

Thanks for mentioning them. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 17:49, 20 September 2009 (UTC)
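
For illustration only, the four equivalences listed above can be written out as a lookup table. This is a minimal Python sketch under stated assumptions, not anything the {{enPR}} template does: the names `PROPOSED_ENPR` and `ipa_to_enpr` are hypothetical, the table covers only the four symbols in this list, and the ӛ entry uses the Cyrillic character exactly as proposed in this thread.

```python
import unicodedata

# Hypothetical table: only the four IPA -> enPR equivalences listed in the thread.
# Keys are written in decomposed (NFD) form so that precomposed input also matches.
PROPOSED_ENPR = {
    "\u025c\u02d0": "\u04db",        # ɜː -> ӛ (Cyrillic schwa with diaeresis, as proposed)
    "\u0251\u0303": "\u00e4\u0274",  # ɑ̃ -> äɴ
    "a\u0303": "\u0103\u0274",       # ã -> ăɴ
    "\u0254\u0303": "\u00f4\u0274",  # ɔ̃ -> ôɴ
}


def ipa_to_enpr(ipa: str) -> str:
    """Replace the listed IPA sequences; pass other characters through (NFC-normalised)."""
    text = unicodedata.normalize("NFD", ipa)
    keys = sorted(PROPOSED_ENPR, key=len, reverse=True)  # try longest sequences first
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(PROPOSED_ENPR[key])
                i += len(key)
                break
        else:
            # No listed sequence starts here: copy the character unchanged.
            out.append(text[i])
            i += 1
    return unicodedata.normalize("NFC", "".join(out))
```

Decomposing the input first means that a precomposed ã (U+00E3) and the sequence a + combining tilde are treated alike, which matters because the nasalised vowels can be typed either way.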

My specific comments:

+<é> for , Template:X-SAMPAchar (as in (deprecated template usage) né), in contrast with <ā> (, Template:X-SAMPAchar; as in (deprecated template usage) nay).

To my ear, and in my dictionary, English né and nay both have the same diphthong vowel (/neɪ/). You'll have to explain why not to use \nā\ for both.

+<ӛ> for , Template:X-SAMPAchar (i.e., the long form of ə, , Template:X-SAMPAchar), as in (deprecated template usage) première danseuse.

Don't use Cyrillic letters for Latin letters, even if they look the same in your favourite font. The whole point of a respelling system is that it doesn't use exotics. And using exotic look-alikes would be adding an extra layer of misrepresentation.

The only correct representation of Latin schwa with diaeresis is the conventional schwa U+0259 with combining diaeresis U+0308: \ə̈\. (By the way, Unicode 5.2 will add small Latin schwa with acute and with grave next month.) It displays fine in my Safari/Mac and Firefox/Mac, with default fonts. In which browser–OS combinations is it failing to display?

I would prefer text representations which survive being reduced to plain UTF text, i.e. with the (HTML or whatever) formatting stripped out. But Wikitext is a rich-text format, so it would be acceptable for us to set this as a requirement for our pronunciation system. —Michael Z. 2009-09-21 00:50 z
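
The encoding point above can be checked directly. The following is a minimal Python sketch using only the standard `unicodedata` module; the helper names `describe` and `same_text` are made up for this illustration. It compares the precomposed Cyrillic character with the Latin schwa followed by a combining diaeresis.

```python
import unicodedata

cyrillic = "\u04db"      # the precomposed Cyrillic character suggested above
latin = "\u0259\u0308"   # Latin schwa followed by a combining diaeresis


def describe(text):
    """List each code point in the string with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


def same_text(a, b):
    """Compare two strings after canonical composition (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(describe(cyrillic))
print(describe(latin))
print(same_text(cyrillic, latin))
```

Canonical normalisation does not unify the two: the Latin sequence stays as two code points because there is no precomposed Latin schwa with diaeresis for it to compose into, so text search and screen readers would treat the two spellings as different text even where they look alike on screen.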

The OED2 gives (ne) as the pronunciation of née; the OED2½ (a draft revision from March 2008) gives /neɪ/ and notes “N.E.D. (1906) gives the pronunciation as (ne) /ne/. Not fully naturalized in English.”. Currently, the OED2½ has a draft entry (June 2008) for the South African colloquial interrogative particle nê, wherefor it gives the Brit. pronunciation /neɪ/, the U.S. pronunciation /neɪ/, and the S. Afr. pronunciations /næ/ and /ne/. So, depending on your take, né and nay (or nê and neigh, or whatever) are either homophones or a minimal pair. Such things vary between dialects. The same is true of sociolects (which I think is the important -lect here). To you (and probably to most Anglophones) ā and é are not distinct, but to a guy who’s fussy about his French (or Afrikaans) pronunciation, they’re different; for such a guy, début is pronounced /deby/, and not /ˈdeɪbuː/ or /ˈdeɪbjuː/. The sociolects that bother with these semi-naturalised pronunciations are those in which ā and é are separate phonemes, as are, variously, k and ĸʜ (lock–loch), ä and äɴ (barn–ban), r and ʀ (rall.–râle), and so on. Does that explain adequately why we ought “not to use \nā\ for both” ā and é?
I’m using Firefox with Microsoft Windows XP on a PC. I understand that using a Cyrillic character to represent the long form of ə is not ideal; if there’s a way we can cause {{enPR}} and {{enPRchar}} to force the use of a font that displays the schwa + diæresis correctly, then I’d gladly opt for that over the precomposed Cyrillic schwa with diæresis. In the meantime, however, I remain unconvinced that a misplaced diacritic is to be preferred over a visually-identical character from another script.
“I would prefer text representations which survive being reduced to plain UTF text, i.e. with the (HTML or whatever) formatting stripped out. But Wikitext is a rich-text format, so it would be acceptable for us to set this as a requirement for our pronunciation system.” — I don’t understand that; please explain again. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 02:55, 21 September 2009 (UTC)

I think extending an English respelling system to represent vowels from France and South Africa is probably going beyond its scope. It is based on phonemes, and even using the same system for North American and British pronunciations may be pushing the envelope. If we extend it to duplicate the IPA's functionality, then it will no longer justify its own existence.

Re. me: If the pronunciation key includes “Canadien /ẽ/”, then this will be meaningful to many Canadians, but may be useless to most Alabamans, New South Walers, Liverpudlians, Johannesburgers, and New Delhians. I think respelling works for native speakers using native-speaker dictionaries. Extending it to international English is recreating IPA. I don't know how we address these problems of scope except to move ahead and abandon enPR. —Michael Z. 2009-09-21 04:07 z

I'll oppose using a Cyrillic character for a Latin one. There are problems with this at different levels. It is not a “visually-identical character from another script”—that's a naïve appraisal based on your own fully-sighted use of a visual browser and a particular font set. Second-guessing the underlying text encoding on such assumptions is not acceptable for a free, open, and inclusive dictionary based on web standards. Better to create a new respelling system with easily-assimilated IPA characters like ɑ, ɒ, ə, ɔ, ʊ, ŋ, and ʒ.

What I meant by my last comment is that it may be beneficial to change KH to the Unicode representation ĸʜ, to make the content accessible as plain (Unicode) text, but it is not absolutely necessary. We rely on Wikitext, templates, and inline HTML to make our point here all the time (as I do in this very paragraph). —Michael Z. 2009-09-21 03:24 z
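
The plain-text point can be shown with a short Python sketch. The exact markup behind the shrunken capitals is not given in this thread, so the `<small>` tag below is only a stand-in, and `strip_markup` is a hypothetical helper that imitates what copying from the rendered page does.

```python
import re


def strip_markup(text):
    """Drop anything that looks like an HTML tag, keeping only the characters."""
    return re.sub(r"<[^>]+>", "", text)


# Assumed markup: <small> stands in for whatever styling shrinks the capitals.
styled = "l\u014f<small>KH</small>"
# The same transcription written with U+0138 (ĸ) and U+029C (ʜ), no markup at all.
plain = "l\u014f\u0138\u029c"

print(strip_markup(styled))  # the small-caps styling is gone; ordinary capitals remain
print(strip_markup(plain))   # unchanged, since the distinction lives in the characters
```

The styled form loses its distinguishing feature as soon as the formatting is removed, while the character-based form carries the same information in any plain-text context.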

But I’m not arguing we internationalise enPR into a phonetic representation scheme; what I’m arguing is that, in certain significant sociolectal contexts, some of these phones of foreign origin have become phonemes. Three more (weaker) minimal pairs pertaining to é: E-day–idée (US: iʹdā' vs. idé (French may be stressed howsoever, phonemically speaking)), epoch–époque (in Belle Époque) (ĕʹpŏk or ēʹpŏk vs. épôk), and Bombay–bombé (bŏmbāʹ vs. bôɴbé). Obviously, comparisons with né, née, nê render stronger minimal pairs. Also, from my experience, é is strongest (i.e., least liable to conflation with ā) when it’s the last sound in a word (there are exceptions though: élite pronounced ālētʹ sounds awful to me, for example).
The use of ӛ really isn’t an important issue for me. What are the possibilities that we can edit enPR to force the use of a certain font, as {{IPA}}, {{SAMPA}}, and our script templates do?
I concur with your third paragraph. Are there any reasons why we ought not to use ĸʜ &c.? † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 04:57, 21 September 2009 (UTC)

OED transcribes the last vowel of bombé with /e/ (it also marks the word “not naturalized, alien”). Random House (\-bey\) and AHD (\-bā\) don't indicate a foreign sound for it either. The same transcriptions are used for the last vowel in idée in all three.

I think it may just be that some anglophones pronounce this like French and others don't; none perceives é as an English phoneme. If you push a respelling system to capture every regional or sociolectal nuance, every phone that most language speakers do not perceive, then it becomes a phonetic transcription.

Any references supporting that “some of these phones of foreign origin have become phonemes?” In particular the French é? —Michael Z. 2009-09-22 13:30 z

Well, I can imagine at least one plausible instance when conflating é with ā could cause actual confusion. A person is reading an extract from a book: wĭʹnĭfrĭd smĭth nā smīdh wŏ.zə… — is that someone misreading it as “Winifred Smith — nay Smithe — was a…” or saying “Winifred Smith, née Smithe, was a…”? With é, there is no confusion (providing one is listening attentively).
As for references, I’d have no idea where to look; however, I wouldn’t know where to look for anything on minimal pairs. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 23:14, 25 September 2009 (UTC)

The name enPR

We use the name “enPR” in our documentation without even expanding the abbreviation (English Phonemic Representation). As far as I can tell, this name is a novel Wikiism.

Dictionaries usually just refer to their pronunciations, but when lexicographers are being more specific, this kind of system is called respelled pronunciation, pronunciation respelling, or just respelling (some details at w:Pronunciation respelling for English). The defining characteristic is that English words are respelled using only the letters of the alphabet, without exotic characters such as in the IPA or the OED's original system. (Respelling is not inherently American, but academic dictionaries, learner's dictionaries, and most non-US native dictionaries now use the IPA.)

I urge everyone to refer to this type of system as respelling. If we do adopt a distinct system of our own, maybe we can call it something like Wiktionary respelling. —Michael Z. 2009-09-20 22:40 z

Two things:

What about the schwa (ə)? Or does that not count because it’s essentially a rotated e?
My proposed additions don’t add any “exotic characters” (in case that was the implication).

As for the name, I’m largely indifferent, except that I think we should retain “enPR” as the short-form name; one of the reasons it was decided that it would be prefixed with en was so that it would allow a consistent nomenclature (based on languages’ ISO codes) if we decided to create or adopt phonemic representation systems (or “respelt pronunciation schemes”) for other languages (e.g., one for Dutch would be called “nlPR”, one for Welsh “cyPR”, &c.). But if you wanna call it “Wiktionary English pronunciation respelling” as the long form, then go right ahead. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 23:46, 20 September 2009 (UTC)

I don't know if the schwa was controversial when introduced, but it appears to be fairly accepted in English respelling systems now. I guess a turned character falls somewhere between a diacritical and an altered character in invoking fear among readers of American dictionaries.

“I guess a turned character falls somewhere between a diacritical and an altered character in invoking fear among readers of American dictionaries.” — I LOL’d. :-D † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 01:40, 21 September 2009 (UTC)

I wish everyone here could read about dictionary pronunciation in Landau 2001, pp 118–125. He mentions the acceptance of the schwa from IPA. He writes that American dictionary publishers complain that even respelling is complicated and widely misunderstood by readers, so some of them resort to “newspaper respellings” like uhtenshun for attention. The publishers neglect to mention that respellings are difficult because each uses a different system, and that the “resistance” also comes from their own marketing priorities. Landau writes that the audience for even native-speaker dictionaries in this global time includes many who would prefer the IPA, and opines that “It is time that we stopped defining the audience of native-speaker dictionaries as monolingual people too dumb to understand the IPA.” —Michael Z. 2009-09-21 02:55 z

Does Landau report any facts about the numbers learning the various pronunciation systems and the level of learning retained after, say, one year? DCDuring TALK 03:16, 21 September 2009 (UTC)

He says that a respelling system “does not work at all among foreign learners of English . . . . For ESL and bilingual dictionaries, then, it is obvious that a phonetically based system is necessary”, and mentions that “studies have shown that the respelling systems currently in use are widely misunderstood.” He also gives examples of problems like respelling transcribed from a (rhotic) general American accent being difficult to interpret by a (non-rhotic) Southern US reader. One could follow the book's notes to their sources. I'm pretty convinced by his reasoned argument that dictionaries should transition to IPA. Wiktionary could play a positive role in this instead of acting like an American publisher's marketing department. —Michael Z. 2009-09-21 05:08 z

Whilst I agree that the IPA is the most important of our three pronunciatory transcription schemes, there is no reason for us to do away with enPR since we can use both. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 23:17, 25 September 2009 (UTC)

I mean no criticism of the proposal at all, just a tangentially-related appeal to use nomenclature which is conventional, if a little obscure. —Michael Z. 2009-09-21 01:10 z

There is an added complication that the name was voted in along with the initialism. --EncycloPetey 01:20, 21 September 2009 (UTC)

It should be fairly easy to vote a change to that, though. And there exists the happy coincidence that “English phonemic representation” and “English pronunciation respelling” create the same initialism. (BTW, could we use “English pronunciatory respelling” instead? It’s better to style to use an adjective rather than a noun in attributive use…) † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 01:40, 21 September 2009 (UTC)

I thought so, and I know that enPR/AHD has a history here and in Wikipedia. A name change could accompany a clear transition to a new system or distinct version of the system. —Michael Z. 2009-09-21 01:58 z

Interestingly, only 4 of the respelling systems fully dispense with diacriticals, even for vowels (World Book Online, NBC, Arpabet, and Dict.com). enPR doesn't seem to actually qualify as a pure respelling system. The Dict.com system seems to be the most complete of these. Wordsmyth has adopted a similar respelling system that also uses bold for stress (augmented by underlining where two syllables are stressed) and has incorporated it into their "logo" as displayed on the frame of their pages (Wordsmyth). See their entry for accommodation. DCDuring TALK 00:11, 21 September 2009 (UTC)

These might all be the result of technical restrictions at the time of these sources' origins, whether in the days of ANSI terminals, PC-DOS microcomputers, or Mosaic and Netscape browsers. —Michael Z. 2009-09-21 01:14 z

Yuck. I don’t like their system. Downsides include: (1) the loss of stress information in copying-and-pasting; (2) the increase in dissimilarity between words and their pronunciation respellings; and, (3) the use of majuscules for certain vowels (viz. I, U, E) breaks the all-minuscule æsthetic, approaching ugly SAMPA-style transcriptions and making the system less suited to use in running text. Their vowels (in their order), for comparison, are:

i = ē; ⟨i(ː)⟩
I = ĭ; ⟨ɪ⟩
e = ā; ⟨eɪ⟩
eh = ĕ; ⟨ɛ⟩
ae = ă; ⟨æ⟩
a = ŏ; ⟨ɒ⟩, ⟨ɑ⟩
aw = ô; ⟨ɔ(ː)⟩
o = ō; ⟨əʊ⟩, ⟨oʊ⟩
U = o͝o; ⟨ʊ⟩
u = o͞o; ⟨u(ː)⟩
uh = û; ⟨ʌ⟩
ai = ī; ⟨aɪ⟩
au = ou; ⟨aʊ⟩
oy = oi; ⟨ɔɪ⟩
E = ə; ⟨ə⟩
ih = ĭ; ⟨ɪ̶⟩
oe = ö; ⟨œ⟩, (+⟨ø⟩?)

Something I note is the distinction made between the ĭ in bit and the “barred i” (ĭ, ⟨ɪ̶⟩, if I understand what is meant) in hopeless. Is the distinction between ⟨ɪ⟩ and ⟨ɪ̶⟩ phonemic? Should we reflect this in enPR? † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 01:34, 21 September 2009 (UTC)

The "barred i" isn't phonemic as far as I've been able to tell, but is more common (if not restricted) to certain dialects. I've encountered it primarily in American English (again AFAICT) in words where the letter "i" can be represented in a pronunciation respelling as either a schwa or a short i, depending on the speaker. Some people (some dialects?) use a sound that isn't quite either one. The local pronunciation of Missouri is one place I've heard the sound. The symbol may not be phonemic, but it is worth noting that the Wikipedia:IPA for English chart includes the symbol. --EncycloPetey 03:06, 21 September 2009 (UTC)

More noteworthy, I think, is the use of both ⟨ɪ⟩ and ⟨ɪ̶⟩ by the OED and the inclusion on the IPA chart of a grey-shaded character for that said near-close central unrounded vowel. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 03:33, 21 September 2009 (UTC)

Wordsmyth's system is relatively new, though I don't know where the content came from. They seem to be aiming at a fairly basic user.

Why should we care about cut and paste? Editors and users have access to the edit window to copy.
To whom does the visual similarity matter? Why?
With our pronunciation sections, a little more ugliness is a trifling matter.

The virtue of such a system is that it requires only the most minimal of learning effort by a user, particularly desirable for anons. It might even facilitate the addition of pronunciations by folks who are not facile with getting characters not on the basic keyboard. It is particularly good for us because we already have 3 systems that rely on users having learned or being willing to learn and remember a phonetic system from one use of Wiktionary to another the following week (average number of uses per month at Wikt: 2-3). Since no one has suggested that any language learner who uses Wiktionary has any prior knowledge of any of these systems, there is an excellent chance that a system that has a negligible prior knowledge requirement would lead to much more success in getting pronunciation information from our entries.

Because we already have 3 systems that meet the esthetic, editor convenience, and technical requirements that you mention, while being more or less standard, it would seem the most noble use of our creative talents to meet needs not addressed by the existing systems. DCDuring TALK 03:04, 21 September 2009 (UTC)

Are you suggesting we adopt a fourth pronunciatory transcription system?! If so, let’s throw out SAMPA — it’s ugly and cumbersome, I don’t know anyone who uses it, and its virtue of using only basic-ASCII characters is obsolescent (at best) given today’s vastly improved text-display support. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 03:27, 21 September 2009 (UTC)

Not really. I am just intrigued by systems that might have significantly different characteristics in terms of their usability by both contributors and passive users. I have the impression that SAMPA is not succeeding in the lexicography world, so that it would seem the weakest of the sisters. If we are working for an end-user population, one system that worked for a large number of users by not confronting them with characters they don't recognize would seem desirable. The most industry-standard system seems to be IPA. Even the ways in which its variants are non-standard seem potentially productive of desirable change in the standard. enPR might give our community an opportunity to experiment without any need to respect the needs of any users other than our own community of active users. I don't know what niche would be open for SAMPA. DCDuring TALK 03:44, 21 September 2009 (UTC)

Just to bitch some more about SAMPA: note that the phone represented in the IPA by a single diacriticked character, (viz. the voiceless retroflex approximant) takes five characters in SAMPA: Template:X-SAMPAchar. Though ingenious in some ways, its value has passed IMO.
“enPR might give our community an opportunity to experiment without any need to respect the needs of any users other that our own community of active users.” — To achieve what exactly? I don’t really see the benefit of that; please explain. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 13:54, 21 September 2009 (UTC)

Really, I don't know what value enPR has over a no-diacritic/no-non-alphabetic phonetic representation system and IPA. DCDuring TALK 16:58, 21 September 2009 (UTC)

One big plus is that (ignoring the diacritics) enPR tends to render a spelling fairly similar to the original word. For example, consider the third pronunciation for definiens (dēfīʹnĭënz, /diːˈfaɪnɪɛːnz/, Template:X-SAMPAchar): the enPR transliteration is virtually identical to the original spelling (ignoring diacritics and stress marks, and the terminal z notwithstanding); what similarity has ⟨iː⟩, Template:X-SAMPAchar to e or has ⟨aɪ⟩, Template:X-SAMPAchar to i (in enPR, ē and ī, respectively)? This preservation is good for etymological purposes, and means that Anglophones get used to what the characters mean more quickly (since they’re already used to the 2–4 commonest sounds that a given vowel denotes). † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 03:55, 23 September 2009 (UTC)

i'dput ipa1. asmore internat.,then/folowd bywoteva els i/abc order asmore regionl

its fonetik>noconfusion w/imview due spelinreforms

ps why dorem. no admin?--史凡>voice-MSN/skypeme!RSI>typin=hard! 13:15, 26 September 2009 (UTC)

I'd put IPA first as more international, then followed by whatever else in alphabetical order as more regional afaik

it's phonetic, so no confusion with imo due spelling reforms

p.s. why is Doremítzwr not an admin?
L☺g☺maniac chat? 14:59, 26 September 2009 (UTC) +sv--史凡>voice-MSN/skypeme!RSI>typin=hard! 07:01, 27 September 2009 (UTC)

IIRC, the ordering of pronunciatory transcriptions as enPR (AHD at the time), IPA, SAMPA was decided upon because of alphabetical order. Then again, the name change from AHD to enPR was decided upon partly to allow a consistent nomenclature (relying on ISO codes) for other systems of pronunciatory respelling that we might care to develop in future (so a system for Welsh would be called “cyPR”, for example). This means that if we had systems for Dutch and Sundanese, they’d be called “nlPR” and “suPR”, respectively; this means that alphabetic ordering would require IPA, nlPR, SAMPA in the case of Dutch, and IPA, SAMPA, suPR in the case of Sundanese. IMO, this would look less consistent, not more. We could change the alphabetising rule, ordering PR as PR, so that every entry would have IPA, PR, SAMPA as the order in which its pronunciatory transcriptions appear. We could do that, or we could just throw out the alphabetical-ordering thing altogether.
I agree with you that the IPA is the more important system generally, and should be prioritised in Pronunciation sections.
Thanks for what I’ll assume is a compliment. :-D I guess I haven’t been given administratorship because I haven’t tended to do much that required administrator privileges; I don’t tend to report many vandals, tag things for deletion very often, or need to edit protected templates; it just hasn’t come up. Whilst I wouldn’t say no to administratorship, it’s not really much of a concern, since it doesn’t hinder any of my activities. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 23:06, 28 September 2009 (UTC)
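The two ordering rules discussed above can be compared with a small sketch. It assumes that "alphabetical order" means a case-insensitive sort, and that a language-specific respelling name is any name made of a lowercase ISO prefix followed by "PR"; the function names `plain_key` and `pr_key` are hypothetical and are not part of any Wiktionary tooling.

```python
def plain_key(name: str) -> str:
    """Sort key for plain case-insensitive alphabetical order."""
    return name.lower()


def pr_key(name: str) -> str:
    """Sort key that files every xxPR name (enPR, nlPR, suPR, ...) under 'PR'.

    Assumption: a respelling name is a lowercase prefix followed by 'PR'.
    """
    if name.endswith("PR") and name[:-2].islower():
        return "pr"
    return name.lower()


def order(systems, key):
    """Return the transcription systems in display order under the given rule."""
    return sorted(systems, key=key)


if __name__ == "__main__":
    for systems in (["SAMPA", "enPR", "IPA"],
                    ["SAMPA", "nlPR", "IPA"],
                    ["suPR", "SAMPA", "IPA"]):
        print(order(systems, plain_key), order(systems, pr_key))
```

Under the plain rule the position of the respelling varies by language (first for enPR, second for nlPR, last for suPR), whereas the "PR as PR" rule always places it between IPA and SAMPA.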

Even with these changes, we just have a supplemented AHD system. IMO it would still be plagiarism not to acknowledge that.

It would not be at all difficult to change all transclusions of the enPR template to AHD. AWB could do it in relatively little time, and if need be, we could always use a bot. So that argument is a strawman.

If we really want an independent system, why not something like Wikipedia:United States dictionary transcription? I'm not claiming it's perfect (and of course I'm biased), but it is not particularly close to any one proprietary dictionary. It is also copy friendly, using dh, for example, as proposed here. It would also mean better compatibility between WT and WP, though it isn't very widely used on WP.

Schwa isn't a problem, IMO. It's been used for decades in US elementary schools in what is otherwise an AHD-type system, and so is much more familiar to US Americans than other IPA letters. I think that font support for schwa is reasonably widespread, but correct me if I'm wrong.

One of the problems I have with AHD (besides the lack of copy friendliness) is illustrated with "The pär syllable is unstressed" example above. If we're going to use ö and ü based on German (a proposal which I support, and is in the WP system), then ä would suggest the foreign vowel . So this is a prime candidate for getting away from the AHD standard. On WP we have â, by analogy with French and the similar vowel ô : "The pâr syllable is unstressed".

If we start changing the existing system, say with dh, then we're going to end up with a mess, with different articles complying with different versions of enPR. WT is not very consistent with pronunciations as it is, and I imagine such transitions would make things much worse. Better, IMO, would be to move enPR to "AHD"; then, once AWB or a bot fixes all the transclusions (which would not take long), we could resurrect the name enPR (or whatever people decide on) and link it to the key for the new convention.

Also, if people agree on something reasonably close to the WP respelling system, but object to a few of those conventions, we could always change WP to bring it into line with WT, easing movement between the two wikis.

As for SAMPA, it only seems useless because all of us here have access to relatively hi-tech systems. This is not the case of internet cafes in much of the Third World. Many computers do not have IPA-compliant fonts installed, and the cafe user does not have the authority to install them, rendering the IPA useless even for people who are mystified by enPR. That's what SAMPA is designed for. (Someone here, maybe EncycloPetey?, told me that the computers in the library at UC Berkeley do not display the IPA when browsing Wiktionary, so this is a serious problem.) kwami 22:44, 26 September 2009 (UTC)

I’ve asked bd2412 to pass comment hereat as to whether enPR in its present or its proposed–modified form plagiarises AHD; he’ll probably have a better idea of the actual situation. w:WP:USdict looks about as similar to the AHD’s as enPR is; if we’re plagiarising, how isn’t Wikipedia’s system?

Right, that settles it — your concerns about plagiarism have no basis. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 23:14, 28 September 2009 (UTC)

I tend to agree with you in re ä. As I said in the initial long explication, as part of the argument for the use of ö and ü: “There is…the lack of a parallel with the third German umlauted vowel, ä, pronounced , Template:X-SAMPAchar in German, but which denotes , Template:X-SAMPAchar in the context of enPR; that said, I don’t think that’s much of a problem, since that German vowel’s short form is denoted by <ĕ>, whereas the long form ”. Nevertheless, since ä is often caused by a following ‘r’, â would seem more appropriate for that phoneme; however, that character denotes a different sound, which itself would be better represented by ê. What that means is that we’d need to change â to ê before we changed ä to â, just to make sure that no one thought ä meant what ë means. It sounds like a lot of hassle, and not worth it to me, despite the better system that would result.

I’ve responded to you about th → dh below. AFAIK, it shouldn’t be very difficult to use some AWB-style process to convert all such instances automatically. There are far fewer chances than you think that these changes will cause confusion.

† ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 23:06, 28 September 2009 (UTC)
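A minimal sketch of the kind of automated th → dh conversion mentioned above. It rests on several assumptions that would need checking against real entries: that the voiced sound is written in wikitext as italic ''th'' (plain th being left alone as the voiceless sound), that the transcription sits inside a simple {{enPR|…}} or {{enPRchar|…}} call with no nested templates, and that nothing else in those templates uses the sequence ''th''. The function name `convert_voiced_th` is hypothetical; this is not an existing bot or AWB rule.

```python
import re

# Matches {{enPR|...}} and {{enPRchar|...}} calls whose arguments contain
# no nested braces (an assumption made to keep the sketch simple).
TEMPLATE_RE = re.compile(r"\{\{(enPR|enPRchar)\|([^{}]*)\}\}")


def convert_voiced_th(wikitext: str) -> str:
    """Replace italic ''th'' with dh, but only inside enPR/enPRchar templates."""

    def fix(match: re.Match) -> str:
        name, args = match.group(1), match.group(2)
        return "{{" + name + "|" + args.replace("''th''", "dh") + "}}"

    return TEMPLATE_RE.sub(fix, wikitext)


if __name__ == "__main__":
    sample = "* {{enPR|''th''ĭs}}, as in ''th''e; compare {{enPR|thĭn}}"
    print(convert_voiced_th(sample))
```

Text outside the templates, and plain th inside them, is left untouched, so a run over an entry changes only the transcriptions themselves.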

Yes, I was the one who raised the point about UC library computers displaying special IPA characters as empty rectangles (including the stress and length markers). This means the problem most likely extends to other libraries and universities, as well as schools, where the latest software versions are usually not installed because of budgetary limitations. I don't like to use SAMPA myself, but I do still use it anyway in electronic correspondence because I can't rely on the recipients of my messages being IPA-capable. SAMPA is better than a series of empty little rectangles. --EncycloPetey 23:14, 26 September 2009 (UTC)

Hmm, then I guess I spoke too soon about SAMPA’s obsolescence; I suppose it’s here to stay for the foreseeable future. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 23:06, 28 September 2009 (UTC)

But are we concerned about people editing the respellings on these hampered computers or just viewing them? If it is just the latter, maybe there is a way to use JavaScript to automatically display SAMPA output from IPA or enPR transcriptions (and get rid of SAMPA in the wikitext). I don't have enough expertise to know if automatic conversion is possible, but that might give us the best of both worlds. --Bequw → ¢ • τ 02:50, 29 September 2009 (UTC)

SAMPA was intended to have phonetic-representation capacity equivalent with that of the IPA’s, but using only basic-ASCII characters; consequently, automatic conversion of IPA transcriptions into SAMPA ones should be possible. (It does mean that we’ll get ugly little things like Template:X-SAMPA for , however.) AFAIK, it’s just a matter of teaching equivalencies, which are already nicely laid out at w:X-SAMPA. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 03:03, 29 September 2009 (UTC)
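As a rough illustration of the table-driven conversion described above, here is a minimal sketch using a small, deliberately incomplete subset of the IPA to X-SAMPA equivalences. The function name `ipa_to_xsampa` is hypothetical, and the sketch assumes one IPA character maps to one X-SAMPA string; combining diacritics, tie bars, and other multi-character cases would need more careful handling.

```python
# Partial IPA -> X-SAMPA table (a small subset of the published equivalences).
IPA_TO_XSAMPA = {
    "ə": "@", "ɪ": "I", "ʊ": "U", "ɛ": "E", "æ": "{", "ɑ": "A", "ɒ": "Q",
    "ɔ": "O", "ʌ": "V", "θ": "T", "ð": "D", "ʃ": "S", "ʒ": "Z", "ŋ": "N",
    "ɹ": "r\\",   # alveolar approximant
    "ː": ":",     # length mark
    "ˈ": '"',     # primary stress
    "ˌ": "%",     # secondary stress
}


def ipa_to_xsampa(ipa: str) -> str:
    """Convert character by character; symbols not in the table pass through.

    Plain ASCII letters such as d, f, n and z are the same in both systems,
    so passing them through unchanged is correct for those.
    """
    return "".join(IPA_TO_XSAMPA.get(ch, ch) for ch in ipa)


if __name__ == "__main__":
    print(ipa_to_xsampa("diːˈfaɪnɪɛːnz"))
```

A conversion of this kind could run at display time, so that only the IPA would need to be stored in the wikitext.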

Changes agreed thus far

From what I can tell, there’s been a lot of discussion about the point of enPR specifically, pronunciatory transcription in general, and the comparative merits and demerits of enPR, IPA, and SAMPA, but there hasn’t been a huge amount of discussion about the specific proposal (it’s pretty much been a duologue between Michael Z. and me). That said, most of the changes seem uncontroversial; the addition of é and ӛ remain contentious (but the latter only because it’s Cyrillic, with the addition of ə̈ drawing no objection). In the light of this, I’ll restate the proposal here, omitting é for now and substituting ə̈ for ӛ; please mark your assent (if you grant it) with WT:VOTE-style emboldened “Support”s followed by your signatures. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 23:39, 25 September 2009 (UTC)

I propose that we make the following changes to enPR, for the multitudinous reasons given ad nauseam above:

+<ɴ> to denote vowel nasalisation; equivalent to ⟨˜⟩ and Template:X-SAMPA.
+<ʀ> to denote any of the “rolled ‘r’s” (viz. ⟨r⟩, Template:X-SAMPAchar; ⟨ʀ⟩, Template:X-SAMPAchar; and ⟨ʁ⟩, Template:X-SAMPAchar).
<KH> → <ĸʜ>
<th> → <dh>
+<ö> for German’s ö and French’s œ (, ; Template:X-SAMPA).
+<ü> for German’s ü and French’s u (, ; Template:X-SAMPA).
+<ë> for , Template:X-SAMPAchar (i.e., the long form of ĕ, , Template:X-SAMPAchar).
+<ə̈> for , Template:X-SAMPAchar (i.e., the long form of ə, , Template:X-SAMPAchar).

Please note your support or opposition hereunder:

Support † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 23:39, 25 September 2009 (UTC)
Oppose I think we should take a dual-track approach: move existing enPR to AHD with AWB or a bot, then resurrect it (or a similar name) with a new transcription convention, so that we don't have a mess of different versions of transcription linked through the same template. We could then gradually convert articles using AHD to the new enPR, perhaps using AWB, without confusing our readers during the transition. I also worry about font compatibility at libraries and internet cafes for some of the symbols proposed here, such as ɴ, ʀ, ĸ, ʜ, and ə̈. kwami 22:44, 26 September 2009 (UTC)
There won’t be any confusion resulting from the transition, since almost all the changes are for supporting new sounds that the old system couldn’t denote. There are no equivalents for ɴ, ʀ, ö, ü, ë, or ə̈; KH and ĸʜ look almost identical (the change being made for functionality’s sake), so no confusion there; the only way that confusion could arise is in the transition from th to dh, which could be avoided by listing both under WT:ENPRONKEY#Consonants, noting (in a method similar to the note for ⟨ɹ⟩) that the former is deprecated and remains only as an artefact of the conversion. (On the topic of the th → dh conversion, could we have a bot convert all instances of ''th'' and th occurring within the {{enPR}} and {{enPRchar}} templates to dh?)
As for your character-support concerns: ĸ is part of the Latin Extended-A Unicode subset, whence also ā, ă, ē, ĕ, ī, ĭ, ō, ŏ, and ŭ, so any display problems with ĸ would be shared by those nine vocalic characters; ɴ, ʀ, ʜ and the combining form of ¨ are part of the IPA Extensions Unicode subset, whence also ə, so any display problems with those three consonantal characters and diacritic would be shared by the schwa; let’s not mention the already-extant display problems that the combining double macron and breve above of o͞o and o͝o are known to cause. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 21:18, 28 September 2009 (UTC)
True, adding foreign distinctions wouldn't be much of a problem, though we still haven't established that these are all phonemes in English. Your nay vs. née example is normally distinguished by intonation, which we don't transcribe anyway. ĸ might not be a problem, but schwa is much more widely supported than other IPA-block letters. (Font creators pick and choose letters; it's very common for Unicode blocks to be only partially supported. And why not just use "kh"?) I agree that o͞o and o͝o are equally problematic, and I would support a conversion to ōō, ŏŏ: this is what AHD does online. A bot conversion to "dh" would work fine; the problem as I see it is that previous changes have not been followed through with. E.g., the city vowel was added to the chart, but the vast majority of articles were never updated to match. I'm worried that a piecemeal approach will make things more of a mess than they already are. However, if we have a separate template and key for enPR version 2, then we can set AWB to change the template along with the transcription, so that all articles would be supported during the transition. kwami 21:37, 30 September 2009 (UTC)
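The block membership of the characters under discussion can be checked with a short sketch. The block ranges below are hard-coded from the Unicode code charts rather than looked up, because Python's standard `unicodedata` module does not expose block names; the helper `block_of` is hypothetical. Note that the combining diaeresis (U+0308) and the combining double breve and macron of o͝o and o͞o (U+035D, U+035E) fall in Combining Diacritical Marks rather than in IPA Extensions.

```python
import unicodedata

# (first code point, last code point, block name), taken from the Unicode charts.
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0100, 0x017F, "Latin Extended-A"),
    (0x0250, 0x02AF, "IPA Extensions"),
    (0x0300, 0x036F, "Combining Diacritical Marks"),
]


def block_of(ch: str) -> str:
    """Return the name of the Unicode block containing a single character."""
    cp = ord(ch)
    for start, end, name in BLOCKS:
        if start <= cp <= end:
            return name
    return "other"


if __name__ == "__main__":
    # kra, a-macron, u-breve, small capitals N/R/H, schwa,
    # combining diaeresis, combining double macron, combining double breve
    for ch in "\u0138\u0101\u016d\u0274\u0280\u029c\u0259\u0308\u035e\u035d":
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {block_of(ch)}")
```

Sharing a block does not guarantee that a font covers every character in it, which is the caveat raised above about partially supported blocks.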

Dispute resolution, contacting administrators, and contesting judgments

I have a proposal, since I had a recent difficulty in dealing with Wiktionary bureaucracy.

I was recently blocked for a reason I do not know, and which, when discussed over Wiktionary's IRC channel, no other administrator could figure out. As I was blocked, there was no way for me to contact the blocking administrator through Wiktionary. An email sent to the administrator has gone unanswered from the time of the block until now. I find this dissatisfying, since there's no way to contest a block.

I suggest that blocked users be able to edit their own user talk page, and ask for administrator assistance, through a process similar to what is used on Wikipedia, the w:Template:unblock process. As the communication would take place on the user talk page, instead of over private email, other users and administrators may also look over the arguments to see if there is error in the process. The blocked user can also request specification as to the reasoning behind the block, since you cannot modify your own behaviour if you do not know why you are being blocked.

A second suggestion is to implement a set of user warning templates to indicate why users are being blocked with boilerplate warnings for the common reasons for blocking. This would serve in the informative capacity, so that users actually know why they are being blocked. It is not helpful if you don't know why you're being blocked, since you may repeat the same action, and still not know why.

A third suggestion is an Administrator's noticeboard, similar to w:WP:AN, so that users can easily contact an administrator. As a lot of users are not conversant in the ways of the internet, requiring users to use IRC to contact one is rather user-unfriendly. The other option of emailing an administrator would need that the one you choose be willing to respond, and if a mass email to all administrators was used, could be construed as mass mail spam. It would be better if a central contact location was available. Disputes could also be laid out there, so that multiple viewpoints can be considered, since it may be that one administrator has different views than another. (say like w:WP:3O) This would also help various administrators harmonize their views, so that they are consistent, and train new administrators in current conventions, by reading the history of previous discussions.

76.66.196.139 07:33, 17 September 2009 (UTC)

I will note that Simple English Wiktionary (simple:) has an admin noticeboard (simple:Wiktionary:Administrators' noticeboard) 76.66.197.30 07:53, 29 September 2009 (UTC)

The big difference between us and -pedia is that they have a great number of sysops and we have many fewer, most of whom don't do much sysoping. Those that regularly patrol recent changes and fight stupidity would rather be building the dictionary and don't have much time to fiddle around with warnings. In your case the addition of 8514, in the middle of a wave of vandalism and stupidity, was enough to get a short block. At a later time this was lifted when it was realized to be an innocent, yet inappropriate, edit. Sysops also have a life outside the wiki and sometimes don't read their email for days at a time. Being blocked for a short time is not the end of the world; I'm sure that you'll get over it and continue to add useful entries. SemperBlotto 08:34, 17 September 2009 (UTC)

re:days at a time - that's part of my point. If an administrator goes on a three week camping holiday in the middle of a National Park, they won't have access to email, so such a query will go unheard. So if someone could post to their own user page, other administrators can review the request, not just the administrator involved. In any case, one may not know why they are blocked unless an explanation is provided. If there is a dispute between administrators as to what constitutes actions requiring a block, then users will feel that there is capriciousness in the use of the block, so a history of usage and reasoning would be a good idea, hence histories on user pages or centralized to an Administrator's Noticeboard, with reasoning for blocks, would be a good idea. 76.66.196.139 07:56, 19 September 2009 (UTC)

IMHO, this would not have warranted a block at all. It was a good entry, and the fact we don't want such entries shouldn't have immediately resulted in a block. -- Prince Kassad 13:55, 17 September 2009 (UTC)

Still, it hasn't been explained to me what was wrong with it. (ie. why is it undesirable) It looks exactly like a lot of other pages on Wiktionary. 76.66.196.139 07:56, 19 September 2009 (UTC)

And we do sort of have a system of warning templates that are applicable in most cases- Category:User warning templates. Nadando 17:07, 17 September 2009 (UTC)

You were blocked by mistake, so no user warning template would have helped in any way - it would be an incorrect message. The default "you have been blocked" message provides instructions on how to "email this user" and links to Wiktionary:Contact us which lists ample means for those who need to to get in touch. Conrad.Irwin 19:55, 17 September 2009 (UTC)

Yes, a warning template would not have been useful in my case, but that's missing my point. It would be useful in many cases. If you don't know you can't change, can you? And as for email the user who blocked you, if they don't respond, there's not much point, as SemperBlotto points out, administrators have lives too, and may not answer email for many days. So a way to contact administrators without mass mailing everyone or picking someone who's on vacation and out of touch, would be a good idea. Such is part of the reasoning behind my request for an Administrator's Noticeboard and an "unblock process". 76.66.196.139 07:56, 19 September 2009 (UTC)

The problem is that an "unblock process" itself would require much time and administrator involvement of the sort that we don't have. In the vast majority of cases where a user has been blocked, the edits are clear vandalism (such as the insertion of offensive language into entries where it isn't relevant), are promotional insertions (spam), are attack pages (against a Wikt editor or against a person known to the blockee), or are repeated problems from an editor who has ignored previous messages (either on the user's talk page or automated messages generated by previous deletion). Instituting an appeal process for all these individuals would only take more time from administrators who primarily want to write and expand our entries. Yes, there are occasional cases like yours, where a block is given accidentally. However, creating an entire process and superstructure for those few cases would not be productive since, as noted, the majority of appeals would be from vandals, spammers, attackers, and people who did not pay attention to notices. The few legitimate complaints would most likely be swamped out by meritless requests. --EncycloPetey 09:08, 20 September 2009 (UTC)

Still, can someone explain what's wrong with 8514 that it resulted in a (mistaken) block? All I currently know is that something about it is not desirable for an entry on Wiktionary... except that it looks just like a lot of other entries on Wiktionary... What is undesirable about it has never been defined, no policy or guideline pointed out. 76.66.196.139 06:59, 22 September 2009 (UTC)

our"procedure"=flawd,needs amendmnt--史凡>voice-MSN/skypeme!RSI>typin=hard! 08:32, 29 September 2009 (UTC)

Superscript and subscript letters

Is there any guideline for the use of superscript and subscript letters? English Wiktionary has articles like CO₂ and H₂O, but I’m afraid the subscript ₂ is not always supported by web browsers. I’m asking this question because some French Wiktionarian is against their use. - TAKASUGI Shinji (talk) 23:44, 17 September 2009 (UTC)

I have not seen this subscript before. It looks too small, and not low enough. The way I have seen it is H<sub>2</sub>O. For superscript, I find both m<sup>2</sup> and m². —Stephen 00:57, 18 September 2009 (UTC)

The Unicode superscript and subscript letters are better when it comes to article names, because we can separate CO₂ and CO2. But letters with HTML tags (sup and sub) look better and are always supported by web browsers. Personally, I prefer the Unicode letters to the HTML tags. - TAKASUGI Shinji (talk) 01:23, 18 September 2009 (UTC)

These seem very contrary to usability, especially since someone who cuts and pastes an HTML/PDF/Word superscript/subscript will just end up with the plain text "CO2" or "m2" anyway. We should certainly have a basic technical entry for Unicode codepoint 33A1, aka ㎡ aka "SQUARE M SQUARED". But the entry proper should be at m2 IMO, with ㎡ and m² being soft and hard redirects respectively. -- Visviva 05:43, 18 September 2009 (UTC)

I disagree. The main entry should be at the technically-correct ㎡ (or at least at m²), with the other two existing as soft redirects. There isn’t really a usability issue, since it’s only one click any way you look at it, and screen-loading time for soft-redirecting entries (since they’re so small) is negligible. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 09:29, 18 September 2009 (UTC)

This is a little digression but ㎡ (U+33A1) should not be an entry, because it is a character for the backward compatibility with the Japanese character set. Unicode Inc. recommends m² (U+006D U+00B2). The former must be a redirect to the latter. We should have also a link from m2. - TAKASUGI Shinji (talk) 10:34, 18 September 2009 (UTC)
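The compatibility status described above can be checked against the Unicode Character Database. The following is only a minimal sketch using Python's standard `unicodedata` module; the helper name `describe` is made up for illustration and is not part of any Wiktionary tooling.

```python
import unicodedata

def describe(ch):
    """Return (name, decomposition mapping, NFKC form) for one character."""
    return (
        unicodedata.name(ch),
        unicodedata.decomposition(ch),        # empty string if there is no mapping
        unicodedata.normalize("NFKC", ch),    # compatibility-normalized form
    )

# U+33A1 is the legacy CJK compatibility square; U+00B2 and U+2082 are the
# superscript and subscript digits discussed in this thread.
for ch in ("\u33a1", "\u00b2", "\u2082"):
    print(ch, describe(ch))

# Compatibility normalization folds the legacy character into plain letters,
# so "㎡" and "m2" compare equal after NFKC.
print(unicodedata.normalize("NFKC", "\u33a1") == "m2")
```

Under NFKC both ㎡ and m² reduce to the plain string "m2", which is one way to see why the three spellings can be treated as variants of one another for linking purposes, even though the page titles themselves remain distinct.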

It certainly shouldn't be the main entry, but wouldn't we want to have an entry simply to explain what you just wrote? People -- say, the recipients of a poorly edited translation from a CJK language -- who want to know what this "㎡" character is about, are not going to be particularly edified by an entry about m². -- Visviva 16:39, 18 September 2009 (UTC)

Of course, there should be an article of the legacy character itself. - TAKASUGI Shinji (talk) 08:07, 19 September 2009 (UTC)

Agreed that m2 should redirect to m². --Bequw → ¢ • τ 14:07, 18 September 2009 (UTC)

I don't know much about this area, but what is the basis for considering one more correct than the other? Has the Unicode Consortium published official guidance in this area? Their range description didn't seem to contain any specific recommendations regarding use or non-use. -- Visviva 16:39, 18 September 2009 (UTC)

I think Unicode prefers using mark-up (i.e. <sup> and <sub>) unless the super/subscripting changes the meaning significantly. I'm not sure whether this is the case in our example. -- Prince Kassad 17:06, 18 September 2009 (UTC)

See Unicode 5.0.0 Chapter 15:

p. 493 (㎡): Some symbols are composites of several letters. Many of these composite symbols are encoded for compatibility with Asian and other legacy encodings. (See also “CJK Compatibility Ideographs” in Section 12.1, Han.) The use of these composite symbols is discouraged where their presence is not required by compatibility. For example, in normal use, the symbols U+2121 TEL telephone sign and U+213B FAX facsimile sign are simply spelled out.

p. 501 (m²): Therefore, the preferred means to encode superscripted letters or digits, such as “1^st” or “DC00₁₆”, is by style or markup in rich text.

They recommend HTML tags. However, article names on Wiktionary are plain texts, and I think we can use Unicode superscripts and subscripts. We can use {{wrongtitle}} instead, though. - TAKASUGI Shinji (talk) 08:26, 19 September 2009 (UTC)

I agree with Doremítzwr (09:29, 18 September 2009 (UTC)).—msh210℠ 18:48, 22 September 2009 (UTC)

I’ve created the entries. I think that what I’ve done is the best solution; please see m² and m2 to see whether you agree with me.
TAKASUGI Shinji, please create entries for ㎡ and legacy character, including the appropriate specific and general bodies of information therein. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 15:24, 19 September 2009 (UTC)

Looks reasonable to me. -- Visviva 17:54, 19 September 2009 (UTC)

I have created the article of ㎡. The phrase legacy character is just a combination of two words, and the article of the adjective legacy explains well. - TAKASUGI Shinji (talk) 13:51, 25 September 2009 (UTC)

EncycloPetey getting abusive again

EncycloPetey blocked me for three days for cleaning up redirects on two talk pages, with the comment "Disruptive edits: Intentionally changing document content that was decided by vote". This is ridiculous. He also ignored my request on his WP talk page to revert the block.

One edit which he reverted was this: Was I being disruptive in removing a circular link that simply brought the reader back to the originating page? Isn't it rather lame for EP to intentionally put such nonsense back in the article, let alone block someone for it?

The other was this: Here we had a supposed link to an explanation of the enPR transcription, but the link is simply a redirect to the pronunciation key (AHD, IPA, SAMPA) that the template already linked to. I redirected it to an actual article on the transcription system, the WP page on AHD. (Isn't calling the AHD system "enPR" at best plagiarism anyway?) I could see him reverting me with the explanation that we need a text link to the target of the template, and not leave that to the examples below, in which case I could have reworded the text to say that. But blocking me for three days? There was no warning, no discussion. This was once typical misbehavior for EP, but I'd thought he'd improved recently.

What's next, he blocks me because he doesn't like my date format? I'd like to remove the circular link, but he'd probably block me for it again. Can one of you at least take care of this simple and I would think uncontroversial housecleaning task?

And what do we do with a sysop who blocks people for trivia? kwami 09:59, 19 September 2009 (UTC)

A couple things. First of all, your edits were not in the right, and EP was correct to revert them. We did have a vote about the name change, and the vote decided enPR. The block was perhaps a bit quicker than I would have done, but looking at your talk page, I imagine it has to do with previous mishaps on your part. You have been talked to about being overly bold with policy page editing (granted this is not a policy page per se, but it's close enough). Concerning enPR vs AHD, I think that most of these ad-hoc pronunciation schemes are all pretty similar, and crying plagiarism is a rather weak assertion, in my opinion. -Atelaes λάλει ἐμοί 10:13, 19 September 2009 (UTC)

Yes, he blocked me almost two years ago for formatting problems that were clearly due to me being a newbie. EP responds to any edits he doesn't like by calling me a "liar" etc. when he obviously knows better (eg linking to diffs that shows he's the one lying). I don't know if this is an emotional problem on his part, but it isn't appropriate behaviour.

Are you saying also that it's appropriate for an article to apparently link elsewhere, only to have that link be a redirect back to the originating page? That's improper architecture in any navigation system. kwami 23:47, 19 September 2009 (UTC)

I'm confused by your posting. From the title of this section and your tone, I infer that you think it is wrong to be abusive. However, you have posted abuse here and here. Your response to Atelaes also has me confused, since it does not seem to follow thematically from his comments. Could you please explain my "misbehavior" in terms of WT:BLOCK? --EncycloPetey 09:54, 20 September 2009 (UTC)

For those who don't want to follow the entire issue, suffice it to note that Kwamikagami's posting to EP's 'pedia talk page reads in part: "Has your medication run out?". IMHO, no other discussion from this user is worthy of attention. (And the underlying issue is long since entirely resolved.) Robert Ullmann 10:14, 20 September 2009 (UTC)

Agree, not Beer Parlour worthy. Mglovesfun (talk) 10:20, 20 September 2009 (UTC)

plagiarizing AHD

It would appear from the talk pages that the enPR transcription system was copied from AHD with the misunderstanding that it was a pan-US dictionary standard. It was never modified to be anything but AHD, with the minor exception that a little while ago a new (and non-US) distinction was added, /i/ for the y in city. However, this has not actually been implemented in any but a handful of articles; nearly all instead use standard AHD /ē/ for this vowel. There's also a minor formatting difference from actual AHD in the stress marks, but this is insignificant, being less than the difference between the print and online versions of the AHD.

This is plagiarism. It may also be copyright infringement, though I wouldn't know. Regardless, it is inappropriate for us to take someone else's work and claim it as our own. I suggest we change the enPR template to properly display "AHD", and move it to an appropriate title (perhaps "AHD"). Then we could use AWB to gradually shift the transclusions to the new name. kwami 10:10, 19 September 2009 (UTC)

Without giving an opinion either way, I would note that because of the way this was already decided (Wiktionary:Votes/2007-02/Renaming AHD), you should try to resolve this through discussion and a vote, not the unilateral edits you were making before. Also for others, this past discussion will be relevant: Wiktionary:Beer parlour archive/2009/January#enPR. Dominic·t 11:08, 19 September 2009 (UTC)

Kwamikagami is a known troublemaker. Compare the tone of the text above with the message left on my WP talk page, which includes both lies and insults directed at me. The lie is that he "removed the link without changing the wording", but this edit shows otherwise. This is not the first time for this problem for this user for this issue. --EncycloPetey 18:47, 19 September 2009 (UTC)

Yes, I'm a "known troublemaker" because I made some edits as a newbie which, out of ignorance, did not fully follow Wiktionary guidelines, and for that I was blocked. Your general hysteria aside, calling me a liar because I do not consider reordering enPR/AHD to AHD (enPR) to be "changing the wording" is simply dishonest. I always thought "it takes one to know one" was a silly saying, but you would seem to be its proof.

As for whether EP received some well deserved insults, that's irrelevant to the point here.

The discussion which Dominic pointed out petered out without any resolution. I would have no problem with introducing an actual Wiktionary respelling key, but for stability's sake it would need to be a new template. enPR should be renamed "AHD" to give them proper credit, the new template and key can be introduced, and then we can modify the entries one by one, converting from AHD to enPR/WEN/whatever. I'll put it up for a vote: Wiktionary:Votes/2009-09/renaming enPR to AHD. kwami 20:42, 19 September 2009 (UTC)

Leaving aside the personal elements for now, I really can't see the problem here. To me, enPR simply means "the system that most Americans (and others?) learned in elementary school". I've used the AHD, and I don't recall ever having to look anything up in the pronunciation key. Why? Because there isn't really anything original about it. Likewise, I can't recall ever being puzzled by the symbols used for "enPR" here in Wiktionary, or by the very similar symbols used in any number of other 20th-century dictionaries and glossaries. The very trivial differences that have been discussed seem to me to be more in the range of style than content. Style is not copyrightable in the first place, and copying another person's style scarcely counts as plagiarism. -- Visviva 03:24, 20 September 2009 (UTC)

Whatever the merits of your case, kwami, your arguments will become moot if the community decides to adopt the supplements to enPR I suggest above. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 03:46, 20 September 2009 (UTC)

Hi there. I'm an intellectual property attorney. There is no copyright concern whatsoever arising from our use of a system that resembles AHD because AHD is a utilitarian work - that is, it is an actual means of doing something (rendering phonetic descriptions of words) rather than a mere description of how to do it. The phonetic scheme could arguably be protected by a patent, but a quick search of the patent office database shows no such patent on this scheme, or in the name of the American Heritage Dictionary. As for the plagiarism charge, plagiarism is an academic offense; by itself it is not actionable. Wiktionary bears no legal risk even if we were using an identical scheme. bd2412 T 22:46, 28 September 2009 (UTC)

Thanks for resolving that, bd2412. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 23:14, 28 September 2009 (UTC)

Renaming Spanish Arabic to Andalusian Arabic

I suppose this could have gone on WT:RFDO as much of the effort needed to accomplish this will be in moving categories, but it will also involve editing the contents of {{xaa}}. Andalusian Arabic is the name used at both SIL and Ethnologue and it made direct contributions to all the Iberian Romance languages, not just Spanish. Won't be too much effort to change it now, but I wanted to touch base before going ahead with this in case someone has an unexpected complaint. — Carolina wren discussió 22:24, 19 September 2009 (UTC)

I couldn't say which name is the more common among either linguists or non-specialists, but it sounds like a reasonable change to me based on what exposure I've had to the terminology. --EncycloPetey 08:50, 20 September 2009 (UTC)

I'd say go right ahead, "Spanish Arabic" was something of a misnomer to begin with. (note the wikipedia link on the talk page for the template ;-) Robert Ullmann 10:37, 20 September 2009 (UTC)

You have my support. I never liked the name "Spanish Arabic" to begin with, and I didn't use it when I added the Andalusian Arabic translation to water. -- Prince Kassad 12:08, 20 September 2009 (UTC)

Given the evidence, support. Mglovesfun (talk) 20:46, 20 September 2009 (UTC)

Given such overwhelming and quick support, it has been done and Al-Andalus can get its proper respect. — Carolina wren discussió 22:37, 20 September 2009 (UTC)

Allowing users to toggle spelling

About one–two weeks ago, Visviva, DCDuring, Bequw, and I discussed in the Grease Pit a potential means of allowing users to toggle entries’ spelling; see WT:GP#Allowing users to toggle spelling for the pertinent discussion. This would add greater functionality, accommodating to a significant degree variant spellings without the conflicts and inconsistencies that sometimes arise from having to opt for one spelling or the other. The technical discussion is inchoate; I was wondering what interest there was for such a thing. What do you all think? † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 04:45, 20 September 2009 (UTC)

I don't really see the need for such a feature. The added complication of having to insert cent{{S:re}} seems quite large, while the benefit of seeing things in your preferred dialect seems quite small, given that almost universal cross-understanding is possible (for the spelling differences at least). Such a system would slightly debilitate the current search function (more than it is already broken) as it does not index template calls. Also, were such a system implemented, people would need to be careful not to overuse it: there are some words, like dialogue, which is spelt dialog in America, and also by computer people even in the UK. Conrad.Irwin 10:12, 20 September 2009 (UTC)

I do think there is a (small) issue with our current inconsistency in spelling, given that we are a dictionary. It would seem to be much more professional to stick to using only one dialect of English. As I don't think our current political system is up to deciding which one to use; and I can't imagine there being very many good reasons to prefer one over the other (beyond the obvious, "it's spelt nicer"), this will probably die on the vine. Conrad.Irwin 10:12, 20 September 2009 (UTC)

I agree with everything Conrad just said. Mglovesfun (talk) 10:32, 20 September 2009 (UTC)

I do too. There is no problem that needs this solution. Who is asking for this? More complicated editing, new maintenance burden, broken search, zero benefit. —Michael Z. 2009-09-20 15:41 z

Differences in spelling are usually small; however, the benefits of accommodating those differences are not small, at least not in our case. Many, many people get pissy about spelling, hence spelling wars (see w:WP:LAME#Spelling for some choice examples). Unlike on Wikipedia, we don’t tend to go in for edit wars, but our compromises are problematic; e.g., color–colour, façade–facade, Tokyo–Tōkyō. I decry the follies of maintaining parallel entries on my talk page (see my post timestamped 18:40, 10 September 2009), the pertinent bits of which I copied to WT:BP#English spelling of Japanese derivations, in particular names - with macrons or without macrons?, therein adding some case-specific comments (in the post timestamped 03:17, 21 September 2009) — quod vide.

There is currently a discussion going on at WT:BP#-or vs. -our about lemmatising one or the other class of color–colour variants (i.e. whether the -or form or the -our form should house a given lexeme’s main entry); the irrationality of the discussion is evident from its fairly humorous tone and that decisions are being made literally on the toss of a coin (a one-real coin, to be precise). As you said “I can’t imagine there being very many good reasons to prefer one over the other (beyond the obvious, ‘it’s spelt nicer’)”, likewise Visviva said “I don’t really think that any set of logical arguments would ever be satisfactory”. The valid philological, pronunciatory, and functional reasons are strong ones, but they are dwarfed in the popular consciousness by the momenta of what different dialect communities are used to using. A partially analogous situation, I think, is that of the ASCII apostrophe ( ' ) vs. the typographer’s apostrophe ( ’ ) — no one gets too stressy about that dispute, since fixes in templature and link-piping allow the typographer’s apostrophe to be displayed throughout the content of an entry, be it housed at an ASCII- or typographer’s-apostrophe spelling; that a single ASCII apostrophe remains at the very top of the entry is of little concern to anyone. (The apostrophes’ situation is only partially analogous because no one really maintains that ASCII apostrophes are “more standard” than typographic ones, only that they are easier to input; however, this does show that the accommodation of typographic variants / concerns renders a technical–accessibility priority wholly non-controversial.)
Those said philological, pronunciatory, and functional reasons are difficult for some people to accept if it leads to their language forms receiving no coverage; however, I wager that such reasoned arguments are less likely to be clouded by emotion if such stakes are reduced by ensuring such coverage via spelling toggling and such. Given that, I’m sure that “our current political system [is] up to deciding which [one] to use”.

Regarding the concern that the introduction of the S: templates will place an unreasonable and unnecessary burden on editors: Not at all. Using these templates will (or at least should) be entirely optional (working on a “you needn’t add them, but just don’t delete them if they’re already there” kind of rule); the S: templates would need only be added by editors who care enough about spelling-toggling to use them. I can’t imagine that uptake would be enormous or quick, but if a spelling war breaks out, it could be resolved immediately by the use of one or more S: templates. Such S: templates are analogous with {{,}} (the template for PREFS-controlled display or non-display of the serial comma) — no one has to use it (though it’s an unwritten rule that if it’s added to an entry, it stays) and it’s not massively common, but if there’s ever a disagreement about serial-comma use, it’s resolved immediately with four little braces.

Concerning concomitant search problems: It is already the case that searching for a term under a particular spelling will omit hits for the other spellings; e.g., amphitheatre (22) vs. amphitheater (11). Therefore, the search engine is already broken; these S:-prefixed templates offer a way to fix that — they can tell the search function to index S:-prefixed template calls simultaneously as each of the spelling variants a given S:-template accommodates (so an entry containing amphitheat{{S:re}} would be indexed as containing both amphitheatre and amphitheater). That problem is, in fact, a solution.
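What this paragraph proposes for the indexer can be sketched roughly as follows. This is an illustration only, in Python rather than anything the MediaWiki search actually runs; the table `S_TEMPLATES`, the endings assigned to each template, and the function `index_forms` are all assumptions made up for the example.

```python
import re

# Hypothetical table: each S:-template name mapped to the endings it toggles between.
S_TEMPLATES = {
    "re": ("re", "er"),    # amphitheat{{S:re}} -> amphitheatre / amphitheater
    "gue": ("gue", "g"),   # dialo{{S:gue}}     -> dialogue / dialog
}

# A word fragment, an S:-template call, and an optional trailing fragment.
TOKEN = re.compile(r"(\w*)\{\{S:(\w+)\}\}(\w*)")

def index_forms(wikitext):
    """Return every spelling that an S:-templated word should be indexed under."""
    forms = set()
    for prefix, name, suffix in TOKEN.findall(wikitext):
        # Unknown template names contribute nothing rather than raising.
        for ending in S_TEMPLATES.get(name, ()):
            forms.add(prefix + ending + suffix)
    return forms

print(sorted(index_forms("An amphitheat{{S:re}} hosted the dialo{{S:gue}}s.")))
```

The sketch only shows that expanding one template call into all of its variants is mechanically simple; whether the search engine can be made to see template calls at all is a separate question.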

Finally, for oddball terms like dialog(ue): There is no reason that the application of {{S:gue}} or any other S:-template need be uniform. If dialog is indeed used in UK computing, then we can disallow dialo{{S:gue}} in computing contexts. A useful aspect of using S:-templates is that we can allow toggling exactly where, and only where, it’s deemed appropriate.

Anyway, I think this proposal has great promise, so I hope you’re wrong and that “this w[on’t] die on the vine”. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 23:34, 26 September 2009 (UTC)

Messing up searching could be a problem. I think it would be small, however, as the S: templates would not be used in links or for the headword (where I think searching is the most important). --Bequw → ¢ • τ 03:34, 27 September 2009 (UTC)

Surely this needn’t cause search problems at all (quite the opposite, in fact), per my ¶ 4… † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 11:42, 27 September 2009 (UTC)

Yes, it should work like that. In fact it doesn't. The search doesn't index templates at all, so it would just index "amphitheat" (try searching for luserit, a form of ludo, if you want proof). If, as suggested above, people use the S templates only if they want to, then there is very little chance that any given word will have been S-templatised, so there is very little reason for anyone to find the PREF to toggle it. Hiding words using CSS (which is how all our similar templates work) leads to small irritations like copy and paste not being exactly expected (try copying and pasting a translation including the language code from an {{t}} invocation and you will see word en(en) when you paste even if you only saw word^(en) when you copied). A much better solution is to choose one dialect in which the entries are written (which it is still possible to do while defining both spellings of a word - deciding to always use one spelling as the main headword is a more controversial decision, though it would make more sense if the entries were written in one spelling). Conrad.Irwin 13:14, 27 September 2009 (UTC)

That may be so, but it’s a brute fact that we’ll never “choose one dialect in which…entries are written”. Initially, I’d proposed that we strive for internal consistency within entries (see User talk:Doremítzwr#Template:palæontology for a more detailed explanation), but that was generally considered objectionable (though its marginal benefits were acknowledged — see the last, parenthetic sentence of Ruakh’s post thereat, timestamped 17:01, 31 August 2009); moreover, trying to maintain parallel entries is A Bad Thing™ (as I argue on my talk page in the second paragraph of my posting timestamped 18:40, 10 September 2009). Insert here rehashes of the first and second paragraphs of my above posting timestamped 23:34, 26 September 2009.

My concerns that this would break the search function disappeared when I saw your example of a broken search request. The only time that searching for an entryless term in the text of our entries would have broad utility is for finding the non-lemma forms of a term in a highly inflected language (for whose lemma we did have an entry). It’s frankly pathetic that the search function misses those scores of terms in each inflexion and conjugation table — that’s thousands, if not myriads, of legitimate terms that are missed in search requests. What S:-prefixed templates do to the search function is a very low priority until that very serious shortcoming of the search function is fixed.

Two last points: This is something intended for universal display, not something exclusive to WT:PREFS (see the box and discussion at WT:GP#Allowing users to toggle spelling for an explanation). The copy/paste problems are a minor irritation, that is true, but that, again, is a technical problem that needs fixing; I don’t think it can be used as much of an argument unless you intend to simultaneously dispense with {{t}} &c. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 16:10, 27 September 2009 (UTC)

I would much rather that we reduced the number of trivial options that exist just because people enjoy moaning. The issue with Template:t should also be simply got rid of, it needlessly complicates the page layout for the benefit of one-or-two hard-core "I hate brackets" people (that's one or two out of several (tens of) thousands of readers of Wiktionary, a totally negligible part of a percentage who only get heeded because they complain a lot). Template:t has the advantage that it greatly simplifies the editing experience, and greatly facilitates the uniformity of pages. The S: templates do exactly the opposite, encouraging divergence at the expense of more complicated syntax. If you really want this to work, I would suggest writing some javascript that finds and replaces all instances of words in one spelling with words in another - this would have the advantage of not needing modifications to the wikipage at all, would work on all existing pages, with the slight downside that a manual list of such words (or heuristics for finding them) would have to be maintained - but then maybe that information would be useful in its own right (and would almost certainly be less effort than manually tagging every single instance of a word). Conrad.Irwin 19:07, 27 September 2009 (UTC)
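The find-and-replace approach suggested here might look something like the sketch below. It is written in Python purely to show the logic (the suggestion itself is for browser JavaScript), and the word list `UK_TO_US`, the function `convert_spelling`, and its `skip` parameter are hypothetical names invented for the example.

```python
import re

# Hypothetical, hand-maintained word list; a real one would be far longer.
UK_TO_US = {
    "colour": "color",
    "amphitheatre": "amphitheater",
    "dialogue": "dialog",
}

def convert_spelling(text, mapping, skip=frozenset()):
    """Replace whole words found in `mapping`, leaving words in `skip` untouched."""
    words = [w for w in mapping if w not in skip]
    if not words:
        return text
    # \b anchors keep the match to whole words, so "colours" is not rewritten.
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, words)) + r")\b", re.IGNORECASE
    )

    def swap(match):
        found = match.group(0)
        replacement = mapping[found.lower()]
        # Preserve an initial capital so sentence-initial words still look right.
        return replacement.capitalize() if found[0].isupper() else replacement

    return pattern.sub(swap, text)

print(convert_spelling("Colour the amphitheatre.", UK_TO_US))
print(convert_spelling("Open the dialogue box.", UK_TO_US, skip={"dialogue"}))
```

The `skip` set stands in for the per-context exceptions raised earlier in the thread, such as leaving one spelling of dialog(ue) alone in computing senses; deciding where such exceptions apply is the part that would still need human judgement.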

Since we’re never going to settle on only using a single given dialect (the chance of which diminishes in inverse proportion with the growth of the editing community), the use of the S: templates does facilitate entry consistency, allowing uniform standards to coexist, without the exclusion of all but a single form that you propose. I almost certainly don’t know enough about Javascript to try the solution you propose, and even if I did, it would run into the objection you raised above, viz. the problem exemplified by dialog(ue); moreover, such a thing would require the programming of a huge number of exceptions, to avoid the substitution being performed in particular bits of etymologies, terms lists, quotations, references, &c., &c. — something that sounds much more complex than using S: templates. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 20:24, 27 September 2009 (UTC)

Though heuristics might be hard, I think it's probably the cleaner solution. The script would default to convert the "acceptable" words to the user's browser's language unless it was set at WT:PREF. Maybe invisible wiki-markup could be used for cases where automatic detection couldn't get it right, such as the dialog(ue) situation Doremitzwr mentioned. Hand coding the exceptions is probably easier than hand-coding the non-exceptions. --Bequw → ¢ • τ 03:20, 29 September 2009 (UTC)

I would be OK with that method as long as it allowed the toggle box (or something with equivalent functionality) and was open to everyone, not just registered users fiddling with WT:PREFS. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 03:37, 29 September 2009 (UTC)

Unless implementation is fairly straightforward, I doubt if it is worth the effort. It seems ill advised to do anything that complicates our current efforts to put highly structured material on software designed for other purposes. If this solved some major problem, it might be worth more effort, but I don't see that it does. Are any US/UK differences, for example, major barriers to intelligibility? In any event it would not be something accessible by default. Making it available to occasional, unregistered users would require an intrusion on delivering our principal products: high quality definitions, translations, pronunciations and usage information. I would defer to the judgment of our tech-capable contributors as to where this limited-use enhancement ranked among tech priorities. DCDuring TALK 15:39, 27 September 2009 (UTC)

I’d envisioned this as something available to all users, including unregistered ones; it would make little difference, IMO, if this was something set by WT:PREFS. I’ll ask Visviva to comment on the technical implementation of this. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 20:06, 28 September 2009 (UTC)

How would an unregistered user select it or know how to select it without using space on the first screen they see? Should it essentially be mandatory that users make a selection each time they use en.wikt? Which and how much white space would be devoted to this or should content be squeezed off the screen? DCDuring TALK 20:48, 28 September 2009 (UTC)

I’ll answer each of your three questions in order:

See the example box in the top-right-hand corner of the discussion at WT:GP#Allowing users to toggle spelling; I was thinking about that sort of thing. That particular example assumes a large variety of variable words are used in the entry, which is unlikely (though not impossible, such as in a definition like “a connexion re-energized by colourful amœbæ” or something similar). It is reasonable to assume, I think, that small entries will tend to have small spelling-toggling boxes (if they have one at all), whilst larger boxes will be limited to the larger entries. Unregistered users see left-aligned tables of contents (which, IMO, is the common feature of our entries with the largest occupation-of-space : usefulness ratio). Small entries have no table of contents; one is only created once the page contains four or more section headers. Even in a four-section entry, the spelling-toggling box only just crosses the top language header; such a box could only interfere (very rarely) with rel-tables at the very top of the entry, right-aligned images, or a {{wikipedia}} box (which is ugly as hell, which I think ought to be deprecated in favour of {{pedialite}}, and which I substitute therewith whenever I see one). Ideally, the spelling-toggling box would be aligned alongside the table of contents, and could have its own show/hide toggle if need be; moreover, editors who dislike it could tick a box to get rid of it in WT:PREFS (where one may also opt for a right-aligned table of contents). Basically, there is no instance I can think of where the spelling-toggling box would take up first-screen space to the detriment of any other content.
No, there should be a default display for each S:-prefixed template. It will be much easier to agree on such default displays once everyone knows that it only takes one click to change whatever the default spelling is to whatever spelling one prefers.
This one’s probably been answered in #1 and #2.

† ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 00:14, 29 September 2009 (UTC)

Commenting by request, I would just say that the system I had in mind would basically just consist of a very small JS dropdown positioned about where WP puts the star on featured articles. The dropdown as I had envisioned it would just contain options for the major national variants, starting at the English variant sniffed from the browser (if any, usually "en-US" or "en-GB" if the browser language itself is English). So if I access a page as an anon and my browser is US English, I would initially see a page with US English spellings. As mentioned above, templating might not be necessary for the majority of cases (though I suppose this depends on how efficiently the JS could work); the JS could automatically skip things like headwords and <ul>'s inside <ol>'s.

I had initially thought that Raifʻhār's proposal was incompatible with the dropdown menu idea, but I wonder if the dropdown could just contain an "advanced options" item or similar, that would then serve up the full menu. That way a full range of spelling options would be available, but only to those who seek it out.

I should add that I am not really volunteering to do more than a token amount of work on this. Well, maybe two tokens. ;-) The thing is, it seems likely to be a lot of work, especially with the minimal-templates approach. -- Visviva 09:56, 29 September 2009 (UTC)
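
The default-selection step described above (start from the variant sniffed from the browser, let an explicit preference override it) might look roughly like this. It is a sketch in Python for readability, not the JavaScript a real dropdown would use; the `SUPPORTED` list, the `FALLBACK` value, and the function names are assumptions made for the example.

```python
# Hypothetical list of variants the dropdown would offer.
SUPPORTED = ["en-US", "en-GB", "en-CA", "en-AU"]
# Hypothetical default when nothing usable is detected.
FALLBACK = "en-US"


def normalize(tag):
    """Normalise a language tag such as 'en_gb' or 'en-UK' to the form 'en-GB'."""
    parts = tag.replace("_", "-").split("-")
    lang = parts[0].lower()
    if len(parts) > 1:
        region = parts[1].upper()
        if region == "UK":  # informal alias; the standard region code is GB
            region = "GB"
        return f"{lang}-{region}"
    return lang


def pick_variant(browser_langs, preference=None):
    """Return the spelling variant to show initially.

    An explicit preference (e.g. one stored via WT:PREFS) wins; otherwise the
    first supported variant among the browser's languages is used.
    """
    if preference in SUPPORTED:
        return preference
    for tag in browser_langs:
        candidate = normalize(tag)
        if candidate in SUPPORTED:
            return candidate
    return FALLBACK
```

An anonymous reader whose browser reports a British English tag would start on British spellings, while a reader with no English variant in the browser settings would get the fallback and could still change it from the dropdown.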

Hmm. Kinda like the progression shown at right? I’m concerned that the Javascript fix will require massive work just to get it off the ground; e.g., see septicæmia#Related terms (terms related by derivation from the Ancient Greek noun αἷμα (haîma)) — there exist hundreds of variably-spelt terms just deriving from αἷμα, and each of them has -aem-, -æm-, -aim-, and -em- variants. In contrast, use of templature creates immediate, 100%-flexible, and gradually growing functionality. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 17:11, 29 September 2009 (UTC)

Purely bureaucratic suggestion

It has occurred to me that all English verbs are also verb forms, apart from perhaps defective ones. For example, play is conjugated "I play, you play"; ergo, adding the verb-forms category to {{en-verb}} seems correct, if admittedly just about pointless. Mglovesfun (talk)

What about be? -- Prince Kassad 13:41, 20 September 2009 (UTC)

Most of its subjunctive mood is homographic with the infinitive form of be. I oppose this, since it will only clutter the categories section at the bottom of English verb entries, with almost no benefit. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 15:30, 20 September 2009 (UTC)

I agree with Doremítzwr.—msh210℠ 18:44, 22 September 2009 (UTC)

Agree. I think "verb form" means "form of a verb which is not the lemma." -Atelaes λάλει ἐμοί 23:48, 23 September 2009 (UTC)

That exclusion should be noted in the category’s preamble which, at the moment, is frankly a mess. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 14:45, 24 September 2009 (UTC)

Attributive use of proper nouns

The ongoing WT:RFD#Garfield debate is very interesting, because I didn't know that according to WT:CFI any proper noun can be included, including individuals. For example I have two attributive cites each for George Washington and Michael Jackson from Google Books. It's one of those things that if someone speedy deletes Michael Jackson, nobody bats an eyelid, but if you RFD it and it has three attributive cites, under the letter of the law it can't be deleted. On the other hand, be able to is the other way around. Everyone knows it exists, but we want to delete it anyway, whereas Michael Jackson can't be deleted. Doesn't this mean our criteria are just tosh? Mglovesfun (talk) 17:31, 20 September 2009 (UTC)

No, but it means that you disagree with them. Which, assuming you are a living human being, is not really a surprise. I daresay it would be hard to find an editor who doesn't have several bones to pick with CFI or allied precedent; the problem is that we all disagree with the policy in different ways. FTR, I don't see any particular consensus to delete be able to, though perhaps others see things differently. I would be interested to hear how you would prefer to address the proper nouns/names issue; I had previously thought of this as one of the areas where the CFI are fairly sane. -- Visviva 18:05, 20 September 2009 (UTC)

Well, presumably the "attributive use" bit comes from a vote, can we all see it? Mglovesfun (talk) 20:42, 20 September 2009 (UTC)

Not every issue in CFI was voted into policy individually. Rather, the majority of the document began as an evolving page that (after a time) we then chose to "set" in place with any further changes enacted by vote. So, there are some quibbles that arise from time to time with the document, and the attributive use section is one of those that is frequently quibbled over. --EncycloPetey 00:59, 21 September 2009 (UTC)

"A name should be included if it is used attributively, with a widely understood meaning. For example: New York is included because “New York” is used attributively in phrases like “New York delicatessen”, to describe a particular sort of delicatessen. A person or place name that is not used attributively (and that is not a word that otherwise should be included) should not be included. Lower Hampton, Sears Tower, and George Walker Bush thus should not be included. Similarly, whilst Jefferson (an attested family name word with an etymology that Wiktionary can discuss) and Jeffersonian (an adjective) should be included, Thomas Jefferson (which isn’t used attributively) should not."

This doesn't make a lot of sense (to me, anyway).

"Lower Hampton, Sears Tower, and George Walker Bush thus should not be included."

This seems to assume that they aren't used attributively. I mean they're definitely widely understood right? Not that we seem to have a definition for that.

"Similarly, whilst Jefferson (an attested family name word with an etymology that Wiktionary can discuss) and Jeffersonian (an adjective) should be included, Thomas Jefferson (which isn’t used attributively) should not.""

As above, it assumes that Thomas Jefferson isn't used attributively. But what if it is? The argument is sort of logical, I can follow it, but to counter the argument you just have to say "that's not true - Thomas Jefferson is used attributively, I have citations". As far as I can see, it doesn't matter what the proper noun means either, so the fact that Arnold Schwarzenegger says "A Germanic muscleman" isn't relevant. I'm not even sure you can "define" proper nouns. Empire State Building says "A skyscraper in New York City, the tallest in the world in 1931–72." Isn't this a description rather than a definition? Mglovesfun (talk) 13:14, 21 September 2009 (UTC)

This whole thing confuses me too. Actually I had a similar discussion about weekend (albeit not a proper noun). Pretty much every noun can be used attributively. Unless it offers up a new meaning, I don't see why such a distinction should be included. I am, after all, editing a Wiktionary page... oops, I just used a (proper) noun attributively. Dang, but no one listed it as an adjective on its definition page! Tooironic 20:20, 22 September 2009 (UTC)

All nouns, proper and common, can be used attributively, though perhaps that would make the proper noun common. But most proper nouns do not have attestable attributive use. If they do, it suggests that the proper noun might be being used fairly commonly to refer to something other than its direct referent itself and has entered the lexicon with that derived meaning. Determining that meaning is not always so easy. See Marilyn Monroe and Arnold Schwarzenegger. As I see it, the primary purpose of such a criterion is to exclude proper nouns that have basically encyclopedic rather than lexicographic interest. DCDuring TALK 21:59, 22 September 2009 (UTC)

No, the intent of the rule is to include proper names which have become words in the language in their own right, independent of their original referent. Other specific criteria could be added. I tried to clarify it, but I'm being shot down both by editors who disagree on the intent, and others who are all sour grapes about their pet criteria not being in there yet (ever?).

Perhaps Empire State Building needs a better definition, or a metaphor label or something. I'm not positive it qualifies, but its addition has been discussed a few times without any RFV or RFD. —Michael Z. 2009-09-24 00:25 z

English spelling of Japanese derivations

English spelling of Japanese derivations, in particular names - with macrons or without macrons?

What should be considered the main entry and what is the alternative - Ōita or Oita, Tokyo or Tōkyō in English entries? User Bendono has provided references for usage with macrons - Shibatani, Masayoshi (1990). The languages of Japan. Cambridge University Press. →ISBN.

In my opinion, one of the spellings should be considered the main entry and the other - alternative spelling. I chose spellings without as the main entry, marking those with macrons as alternatives. What are your thoughts? --Anatoli 02:08, 21 September 2009 (UTC)

Just like color vs. colour, neither is main. Both are used. For example, take a look at the references for Ōsaka: Britannica and Encarta. Of course that is how it is spelled in English if you actually go to the place. (BTW, there is also a different place called Osaka separate from Ōsaka.) Many do not know how to type diacritics, and historically there were technical issues, so they "magically" disappeared. But just be descriptive and list them as they come. Bendono 02:34, 21 September 2009 (UTC)

The macrons might find usage in special, limited situations, such as academic papers, but generally they are not used by anyone. Americans don’t know how or where to use them or even what they mean. Virtually everyone outside of a few universities writes only Tokyo, Osaka, sayonara, etc. —Stephen 02:47, 21 September 2009 (UTC)

I was in Japan in May this year. Although spellings with macrons are common, as I said on Bendono's talk page, even in Japan English spellings without macrons are common as well. I admit that missing diacritics cause mispronunciations of many loanwords, but that's the reality; the situation with the Japanese transliteration is mild, as it's only the length of vowels which is ignored. I don't wish to upset you, Bendono; let's see if these entries can be treated equally with cross links, e.g. I placed the translations into entries without macrons but made translation links in entries with macrons. Anatoli 03:11, 21 September 2009 (UTC)

I personally use the macra, but it doesn’t really matter where the main entry goes, just as long as there is a main entry. Whatever you do, please don’t maintain parallel entries like Tokyo–Tōkyō. I make the case against this sort of thing on my talk page (in the second paragraph of my post timestamped 18:40, 10 September 2009); I’ll reproduce it here, given its relevance:

“I’ve been thinking about the wisdom of allowing full entries for alternative spellings, and have concluded that it is a folly. Much is made of the belief that we should minimise the number of clicks that a user must make before he reaches a considerable concentration of useful information, and this has been presented as an argument in favour of mass carbon-copying, transclusion, and other forms of redundancy. The downside of such an approach, when applied to the idea of maintaining parallel entries for each of a word’s alternative spellings, is that we do not manage entry synchrony at all well, especially when it comes to larger entries, such as those for greatly polysemic terms. This means that a user has to make many clicks, comparing the content of multiple entries in a tedious and time-consuming game of spot the difference, and in cases where the information presented conflicts, he has to guess which entry is correct and which erroneous; this is a disservice to our users, who become exasperated by our scattered lack of completion and our self-contradiction — such alienation can only harm Wiktionary’s credibility. This is a strong argument in favour of having only one main entry for the lemma of each term. However, it doesn’t matter so much which particular spelling houses the main entry, since that all-important ‘considerable concentration of useful information’ is only ever a maximum of one click away, and if a person is so unreasonably and contemptibly lazy or impatient to make one click, then accommodating him cannot be a priority.”

† ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 03:17, 21 September 2009 (UTC)

Yes, OK, one should be the main entry, I can see that entries with macrons have more text, like Tōkyō, Ōsaka, some are pretty much identical Kyōto and Kyoto. Let's wait what Bendono has to say. --Anatoli 03:37, 21 September 2009 (UTC)

I descriptively must recognize that spellings such as Tokyo, Osaka, Kyoto, Kobe etc. exist. And I have no problem noting it as such with citations. However, coming out of academia where those spellings are strongly frowned upon, I will not use them. On the extreme end, I have seen students not receive a grade because they have left out important diacritics.

The original Hepburn romanization scheme of the 19th century used macrons. This is how the words were initially introduced to the English world. The only reason that the forms without diacritics came into being is because of the difficulty in typing them and also for historical technical reasons. And now that the technological issues have been dealt with, there has been an increasingly massive trend to using the proper spellings (cf, Britannica, Encarta etc). Note how this is different from Roma > Rome where an actual new word with a new spelling came into being.

I am often stopped by tourists asking for directions here in Tōkyō. Increasingly tourist books are using diacritics where necessary as well. One recent example that I recall is Fodor's Pocket: tōkyō. There are others as well.

As a personal anecdote, the neighborhood that I once lived in had a macron in the romanized spelling. It contrasted with a similar place without a macron nearby. I carefully gave my address to an English-speaking acquaintance. I had trouble receiving the package because she left off the macrons. When I asked her about it, it was a neighborhood that she had never heard of so was not using some common spelling, but rather she thought that it didn't matter. It may not to some, but it is an important difference for others. As I mentioned above, Ōsaka (literally Big Hill) contrasts with Osaka (literally Small Hill), and these two places happen to be relatively near each other.

For those who come to Japan, they will notice inconsistencies. Looking back over more than ten years here, those inconsistencies are increasingly reduced each year. I have personally talked with officials about the inconsistencies and have been told that in the past they were more limited by now obsolete technological constraints, but being more consistent is a goal as things are slowly replaced.

If you look at older versions of Britannica, you will note that the spelling revisions are only a recent update. I spoke with a representative of Britannica about this in May. They told me that their criteria for English spellings of names "was based, in part, on the United States Board on Geographic Names, which we generally use as our source for place-name spellings". I am unsure how Encarta works, but they often agree with Britannica with regards to spelling.

Of course this is not just limited to encyclopedias. Regular books such as translations by renowned Seidensticker and McCullough and many others regularly utilized diacritics. One may argue that they are coming out of academia, but their books are read by many. I have even heard this used as a kind of litmus test for potential translators.

While we have an obligation to describe all forms supported by citations, if we need a primary entry, then I must strongly implore for the form with diacritics. Those who dislike them can easily ignore them, but their absence can not be inferred a priori. Bendono 04:55, 21 September 2009 (UTC)

I personally don't mind making entries with diacritics the main ones, if there are no strong objections, but old habits die hard. The side benefit is to educate users about what is strictly correct in both academic English and Japanese Romaji, but this will raise issues around cafe and café, facade and façade. The English entries based on Chinese Pinyin or Vietnamese never retain tone marks - it's Beijing, not Běijīng, Hanoi, not Hà Nội, but it's São Tomé, not Sao Tome. So there is some inconsistency. French, Spanish and Portuguese names seem to retain the diacritics more often, as they are more familiar to English speakers.

I thought the number of English pages with diacritics was on the increase, if Google handled the count and diacritics well:

108,000,000 English pages for Tokyo

116,000,000 English pages for Tōkyō After checking a few hits for Tōkyō, they didn't actually have macrons. --Anatoli 05:57, 21 September 2009 (UTC)

I didn't take up the edit war with user Bendono on Tokyo/Tōkyō and Osaka/Ōsaka but have created new Japanese proper names using the approach that words without macrons are more common and English. I have created cross links with ample usage of macrons in non-macron entries. The issue, however, remains unresolved: we have duplicate entries with different contents (macron entries are a bit too encyclopedic, IMHO). What can be done? See entries in blue with macrons (linked to non-macron entries): Japanese prefectures. Anatoli 05:32, 25 September 2009 (UTC)

Just re the Google thing: while Google counts always have to be taken with many grains of salt, it is possible to get a more accurate count using the '+' operator, which excludes most "fuzzy matches". google:+Tōkyō gives me 2.25 million, while google:+Tokyo yields 119 million. I find this reasonably probative as to which is more widespread (and it is generally about what I would have expected). -- Visviva 14:02, 25 September 2009 (UTC)

Thanks for the hint! I also used English only settings in the Advanced Search option. The result is strange with +, I'm not sure I can trust it... Anatoli 22:06, 25 September 2009 (UTC)

Proposed amendment to CFI and/or new namespace "Wikitranslate:"

Here is an idea: amend CFI so that an English phrase which might be SOP but which has non-SOP or SOM (sum of morphemes) translations to esp. single words in, say, three (or five) other languages would then be includable. ~~There could be a new namespace, say "Wikitranslate:", especially for this purpose.~~ Example of an "SOP translation": English "white dog" translates to Spanish "perro blanco", which is sum of "dog"→"perro" and "white"→"blanco". Example of an "SOM translation": English "little dog" translates to Spanish "perrito", which is sum of "dog"→"perro" and "little"→"-ito" (morpheme). Likewise: English "little dog" to Dutch "hondje" would be SOM. Example of non-SOP translation: English "sugar bowl" to Spanish "azucarera" or Esperanto "sukerujo". Non-SOP translation requirement is stronger than SOM translations. If SOM translations were allowed, then "little dog" would pass, but its meaning is totally SOP which means that it doesn't really need a definition, so that would be an argument for ~~having a special namespace "Wikitranslate:" whose entries do not need definitions, but would go directly to translations.~~ If an SOP entry is polysemic, then it isn't really SOP, is it? ~~The existence of an entry Wikitranslate:X would imply that entry X would not exist, except possibly as a hard redirect to Wikitranslate:X.~~ (By the way, "sugar bowl" is not a bowl made out of sugar, so it is not SOP, so it passes CFI independently of these new criteria.)

The trigger for this proposal is the discussion in RfD concerning be able to: Mzajac (talk • contribs) suggested that someone write a proposal, and DCDuring (talk • contribs) suggested that someone start a thread on this page. I am not sure that the proposal in its present form is well thought out, though: ideas, anybody? —AugPi 05:22, 21 September 2009 (UTC)

It would seem better to me to have translations for SoP hyponyms on the page of the word itself, the ELE for translations allows for a qualifier before the translation: {{t+|es|perro}}, {{qualifier|when small}} {{t+|es|perrito}}. In this case however, isn't perrito merely the diminutive of perro, and therefore including the translation to perro should be sufficient (though I do agree, the problem exists in general, i.e. maternal aunt - which even though is borderline SoP still has hyponyms in translation on the page). Namespaces are very blunt tools, and I can't imagine it being easy for anyone to work out why some words are in different places. Conrad.Irwin 07:50, 21 September 2009 (UTC)

See also Ruakh's proposal above, at #Inclusion_of_SOPs_for_translations_.E2.80.94_proposal, and continued by Dan at Wiktionary_talk:Criteria_for_inclusion#Translation target. I agree with Conrad above that namespaces are not the way to handle this. Personally, I would not oppose something like Ruakh's proposal, where we defer to existing bilingual dictionaries. But it does not fill me with joy, either. -- Visviva 08:21, 21 September 2009 (UTC)

Thanks for the replies. I have scrapped, stricken through, the "Wikitranslate:" namespace idea. I was not aware of the {{qualifier}} template and its use: that is good to know. Let me note in passing that some translations listed for "maternal aunt" are not "faithful", but rather "forgetful", i.e. when translated back to English they yield just "aunt", not "maternal aunt". Anyway, I was going to say that Ruakh (talk • contribs) proposes an "extrinsic" criterion for inclusion and I was wondering if there could not be, alongside it, an alternative, more "intrinsic" criterion for inclusion. For example, allowing SOP translation targets, but only if they have, say five, non-SOP translations. Perhaps someone else could come up with/refine/tweak some improved "intrinsic" inclusion criteria? —AugPi 06:50, 22 September 2009 (UTC)

On a different note, Appendix:Words found only in dictionaries seems to be working well, and it relies solely on an "extrinsic" inclusion criterion. —AugPi 08:11, 22 September 2009 (UTC)

Differences between American and British English

This Wikipedia page is an excellent source of material for Wiktionary. Has anyone made an attempt to apply its contents to lexemes in Wiktionary, I wonder? — Paul G 12:10, 21 September 2009 (UTC)

Proscription of long discussion titles

Could we add something to the preambles atop our main discussion fora discouraging long titles in discussions’ section headers? Long titles significantly cut down on the space one has to write his edit summary. Does that seem reasonable? † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 13:28, 21 September 2009 (UTC)

Simple English Wiktionary needs editors!

Hello there all. I would like to humbly ask anyone who edits here if they would like to come and edit the Simple English Wiktionary. We are a growing community that is in need of new editors, so please feel free to drop by and start making entries for us. Thanks, and I hope to see you all there :). Razor flame 14:13, 21 September 2009 (UTC)

I notice it is very different from English Wiktionary... or Wiktionary in other languages... 76.66.197.30 08:01, 29 September 2009 (UTC)

Category:Given name appendices

In an effort to regain some sanity, we need some sort of decision over what to do with these, as ~~one user (who's name is hard to type, sorry)~~ User:Makaokalani is nominating a lot of them for deletion (about 10 already today). I don't really know if we need these when we already have categories which do the same thing. The advantages of categories seem to me to be:

No red link, the articles always exist.
Automatically in alphabetical order.
No need to edit them manually, MediaWiki does it automatically.

For appendices:

You can add red links for articles that don't exist yet, but should.

Any other thoughts on this? Mglovesfun (talk) 14:15, 21 September 2009 (UTC)

In favor of appendices: It's also possible to add additional information in tabular form, such as originating language, cultural context, variant spellings, etc. This is not possible to accomplish in a category. --EncycloPetey 02:16, 22 September 2009 (UTC)

True, but if all you want is the list, cats are fine. Category headers can even serve as special-purpose todo lists (e.g. Category:Spanish suffixes where there's some important red-links listed). So I'd say 'delete' on those Appendices that just duplicate Categories. --Bequw → ¢ • τ 02:30, 22 September 2009 (UTC)

So, you would rather delete than see them expanded or improved? --EncycloPetey 03:04, 22 September 2009 (UTC)

If it appears that the author has no intention of improving them beyond a mere category dump, and there's no interest by anyone else, then yes, remove them from the main namespace (as I said on RFDO, some could be userfied). Don't have an Appendix just to keep red-links around, have a Wiktionary todo page (such as a subpage of an About Language page). Appendices should be more than just category lists. --Bequw → ¢ • τ 03:39, 23 September 2009 (UTC)

Appendix:Czech given names lists variants of names ("Jakub (Kuba, Kubík, Kubíček, Jakoubek)"), which a category cannot do.

Appendix:German given names does not list variants, but the majority of its items are redlinks.

From randomly clicking to given name appendixes, many seem to be like the German one: they either have a lot of redlinks, or have non-wikilinked items.

IMHO most given name appendices should be kept to serve at least as todo-lists. Adding all these redlinks as todos to a category page seems to bring no advantage; the lists are sometimes rather long. I think category pages should contain as little info as possible, apart from the categorized items.

In general, wiki pages of lists provide more flexibility than categories: information can be added per item in a list in brackets; tables can be used; info can be placed to a tooltip; items can be organized or arranged not only alphabetically in a flat list but also using additional structures in various hierarchical lists and under headings. List pages are much easier to rename (meaning move) than categories.

The need to edit wiki pages with lists manually is not really a disadvantage over categories. When adding an entry to a category or a list, either you edit the entry page to add it to a category, or you edit the list page; the number of manual edits is actually lower with list pages than with categories when adding multiple entries. --Dan Polansky 03:32, 22 September 2009 (UTC)

I'm not against given name appendices in general, only against bad and useless ones. This summer, User:Alasdair has created several appendix stubs and, instead of finishing them, spent his time on everything else. He is also making the old appendices look like categories - changing the format, hiding explanations in brackets, moving pet forms to the main alphabetical list. I have asked him four different times about the purpose of his appendices and their sources, but he doesn't understand the question.

A Norwegian to-do list from a good source would be quite welcome. But Alasdair created Appendix:Norwegian given names by copying the given name categories - just the names that are not needed! (After the rfd he has added some red-linked names.) I'm not rfd'ing the French or German appendices, for example - they can be used as to-do lists even though their source is obscure. But since Alasdair is bad at organizing, and doesn't understand questions, his appendix stubs, and those that simply copy a category, should be moved to his own user space until he makes them presentable and explains why they are needed. There is no "Requests for Moves" page so I used the rfd. --Makaokalani 11:40, 22 September 2009 (UTC)

Thank you for the explanation. Appendix:Norwegian given names is worth keeping as there are enough redlinks to make it worth it, IMHO anyway.

On the day of its creation, the Norwegian appendix was in this state, which is admittedly malformatted, yet already contains plenty of redlinks. The appendix was still malformatted a week later. But ultimately, the appendix looks okay now.

On stubs of the form "Appendix:<Languagian> given names": I don't see any problem with keeping these stubs. These kinds of appendices can and do bring advantages over categories. Alasdair has incrementally expanded these appendices in the past; judging from his past pattern of contribution, there is a good hope that the appendixes will not be abandoned. --Dan Polansky 13:31, 22 September 2009 (UTC)

Ooops. In this edit, Alasdair shortened the content of the Norwegian appendix to one fourth so the apparent redlinks were no longer there. I agree that leaving pages malformatted and leaving them full of content that does not belong there is a poor practice. Alasdair should get an appendix to a minimum quality state in his userspace before releasing it in the mainspace. --Dan Polansky 13:41, 22 September 2009 (UTC)

The original red links were Icelandic names. I'm becoming an expert Alasdairologist by now.--Makaokalani 14:07, 22 September 2009 (UTC)

Wiktionary:Page deletion guidelines

Not a major thing, but I think we also speedy delete copyright violations, say if someone copies from the Oxford or whatever. I don't think this actually requires a vote as we already do it (in fact, all Wikimedia projects do it, don't they?) so it's more a case of saying what we already do, rather than a policy change. Mglovesfun (talk) 08:16, 23 September 2009 (UTC)

Yes, we need to delete these if they can't be rewritten. I have added this to the guidelines; feel free to revise as necessary: "Licence violations: Pages whose content has been copied from another source, where that source is protected by a current copyright or otherwise incompatible with our Terms of Use." Equinox ◑ 09:58, 24 September 2009 (UTC)

Looks good. I really prefer to delete these before rewriting them, since otherwise they remain in the database, tagged as freely-licensed. Not many people would ever go hunting through past revisions looking for stuff they can copy, but it's the principle of the thing. ;-) -- Visviva 16:47, 24 September 2009 (UTC)

WT:RFV

As I think a lot of people are aware, the problem with RFV (requests for verification) is that a lot of the time, nobody bothers to look. There are entries on there right now that have been there for over a year, but I feel a bit dodgy about deleting them because I see no evidence of anyone trying to cite them. So my tip would be that anyone who tries should say so on WT:RFV and say "I can't find X on Google Books" or whatever. Another more challenging problem is when you're just RFVing one meaning that is either hard to define (or hard to cite) or that the other meanings are so prevalent that the disputed meaning is almost invisible. On the French Wiktionary, soir is up for deletion as a verb. Apparently a rare spelling of seoir, but soir meaning evening is so overwhelmingly common, it's like trying to find a needle in a haystack. Mglovesfun (talk) 08:21, 23 September 2009 (UTC)

On the first point, this is definitely a problem. It helps a lot, I think, if people filing RFV noms mention (preferably in some detail) what searches they've performed and what the results were. I try to do this when I RFV, but it's easy to skip.

On the second point, we should really have a "Verification tips and tricks" page. (or maybe we already do?) There are a lot of handy techniques that us hardened citators use, especially when it comes to narrowing down the search to a specific sense or POS and filtering out extraneous gunk. -- Visviva 09:00, 23 September 2009 (UTC)

*Topics

The asterisk at the start of Category:*Topics (and, by extension, Category:ja:*Topics, Category:pt:*Topics, etc.) doesn't seem to be very useful or desirable. I'd like to know if there's any particular reason to keep it. --Daniel. 18:11, 23 September 2009 (UTC)

It has the quite useful purpose of making the category sort somewhere near the start of the category lists. I don't think we want to lose that. -- Prince Kassad 18:46, 23 September 2009 (UTC)

Isn't that what sortkeys are for? Indeed, for each of these except Category:*Topics, a sort key will be needed anyway to get it to appear at the top of a list of categories. — Carolina wren discussió 20:22, 23 September 2009 (UTC)

They don't affect stuff like Special:Categories and such. Though since we have languages starting with exclamation marks and apostrophes, it doesn't work that well. -- Prince Kassad 20:32, 23 September 2009 (UTC)
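The sorting point in the last few comments can be sketched in Python. This is only an illustration under stated assumptions: it uses plain code-point ordering (the wiki software's real collation may differ), the category names are made up, and the `sort_keys` mapping is a hypothetical stand-in for a wiki sort key.

```python
# Hypothetical category names, for illustration only.
names = ["Topics", "*Topics", "!Example language", "'Example language", "Animals"]

# Plain code-point ordering: '!' (U+0021) and "'" (U+0027) both come
# before '*' (U+002A), so the asterisk does not guarantee first place.
by_name = sorted(names)
print(by_name)

# A sort key can do the same job without changing the visible name.
# Assumed stand-in for a wiki sort key: a leading space (U+0020)
# sorts before every other printable ASCII character.
sort_keys = {"Topics": " "}
unstarred = [n for n in names if n != "*Topics"]
by_key = sorted(unstarred, key=lambda n: sort_keys.get(n, n))
print(by_key)
```

In this sketch the starred name lands third in the plain sort, behind the two names that start with an exclamation mark and an apostrophe, while the unstarred name with a sort key comes first.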

So it doesn't work and no one is complaining? Let's remove it. —Michael Z. 2009-09-24 00:08 z

I dislike any category name that starts * or !. Why have them? Mglovesfun (talk) 13:07, 24 September 2009 (UTC)

I think part of the motivation for the asterisk is to differentiate topical categories from lexical ones, which I believe to be an important distinction. I have no special love for the asterisk, but I would kind of like to see the distinction maintained. -Atelaes λάλει ἐμοί 13:43, 24 September 2009 (UTC)

Well, since you bring it up, I'd like to see us scotch the entire unworkable concept of topical categories and replace with something more lexically based, like the WordNet system. :-) But apart from that -- yeah, the asterisk seems kind of silly to me. Unless one of our resident techies weighs in with a sound reason for keeping it, count me in. -- Visviva 14:06, 24 September 2009 (UTC)

Hear, hear. -Atelaes λάλει ἐμοί 14:33, 24 September 2009 (UTC)

It may make sense to some of us to get rid of topical categories, but we need to make the case very explicitly so as not to be revisiting the matter every three months or less. Can we formulate a vote on the question? Can we be explicit about lexical categories?

Pending implementation of a Wordnet-like system, perhaps we ought to have some suggestion for how to retain that kind of information when it does not fit into lexical categories. For example, could we make appendices serve the purpose sometimes. Do we want appendices to do so? DCDuring TALK 15:24, 24 September 2009 (UTC)

While a WordNet style system does have its advantages, and I try to provide entries for the various -nyms when I can, it is dependent upon having entries for the semantically relevant words (or at least knowing them) to work properly. For languages where we have a reasonably complete vocabulary it's fine, but for the languages where the entries are largely missing, and are likely to remain so for quite some time, the topical category system is essential. Implementing topical categories in a manner other than Wiki categories seems like a way to make categorization dependent upon the 1337 editors to do it as if we don't already have other tasks we'd all prefer to do. Besides, it's not a case of one or the other. — Carolina wren discussió 19:31, 24 September 2009 (UTC)

“I think part of the motivation for the asterisk is to differentiate topical categories from lexical ones, which I believe to be an important distinction. I have no special love for the asterisk, but I would kind of like to see the distinction maintained.”

There is no distinction. Every subcategory of *Topics is a mish-mash of words included for lexical reasons—for example, by being labelled as specialist vocabulary by {{chemistry}}—and for topical reasons—for example by the inclusion of ]. The asterisk-talisman has failed us. Melt it down into golden calves. —Michael Z. 2009-09-25 00:52 z

You are right, of course. This being the case, I suppose the removal of the asterisk would offer us little loss. However, it strengthens my resolve in thinking that every cat currently asterisked should be considered for deletion (considered, mind you). We really need to figure out what kinds of categories we have, whether they be restricted use/jargon, topical, what. The use of contags and categories has been done rather haphazardly, and really needs to be thought out. -Atelaes λάλει ἐμοί 01:24, 25 September 2009 (UTC)

Uh, Michael? You do know where the golden calves story goes next, don't you? They worship the golden calves, then have to eat them as punishment. Topical categories include words for topical reasons, not for lexical ones. That's the whole point of having topical categories. --EncycloPetey 02:55, 25 September 2009 (UTC)

Medium-rare for me, please.

Adding a restricted-usage label to a term or sense is a lexical classification, isn't it? But then it shows up in the same category as a word whose usage is not restricted. So acetabulum has the label {{anatomy}} applied, indicating that its usage is restricted to the technical field of anatomy. But instead of being grouped with other technical terms, it is also lumped with hand and Adam's apple. —Michael Z. 2009-09-25 03:58 z

The asterisk was originally included (IIRC) to be a visual signal that it is the root category for all topical categories, and that all topical categories should be nested somewhere under that root category. Without the visual signal, how would a visitor who is not proficient in English (or in Wiktionary) identify the basal starting point of the topical category tree? How would they know it's not just a collection of words on the subject of "topics"? --EncycloPetey 02:52, 25 September 2009 (UTC)

I understand the intent, but with the visual signal, how would a new visitor know this is a top-level category? Nothing about the asterisk says “basal starting point” to me. Better to name it Category:All topics or something. Better yet, add an asterisk or a prefix like Topic: to every subcategory, and then we can create a separate tree of restricted-usage categories (perhaps using Usage:). —Michael Z. 2009-09-25 03:58 z

I can see creating a set of "Usage:" prefixed categories, but not prefixing the topical ones. That way, the "Usage:"-prefixed categories would nest within the related category of the same name without the prefix. --EncycloPetey 04:26, 25 September 2009 (UTC)

The text of Category:*Topics says, as it already did upon the creation of the category back on 22 July 2005:

"The use of an asterisk in the name of a category is reserved for the highest levels of categories. The effect of this will be to allow these categories to appear at the very top of the category list for faster access."

That would mean that the categories that are children of Category:Fundamental should all start with asterisk, which they do not do.

Quoting from Wiktionary:Beer_parlour_archive/May_06#categories_-_toward_consensus?:

"The asterisk insures that it is listed first in the list of categories.--Eclecticology, 10:25, 2 June 2006 (UTC)"

The same can be achieved without asterisk, as has been pointed out in this discussion.

I think that the asterisk can be dropped as an obsolete trace of history, resulting in the creation of "Category:Topics", "Category:Terms by topic" or the proposed "Category:All topics".

Just to be on the safe side, we could drop a note or email to Eclecticology if he could provide us with some input on what negative consequences the dropping of asterisk would have. I see no negative consequences. --Dan Polansky 17:57, 29 September 2009 (UTC)

Did Eclecticology say anything? --Bequw → ¢ • τ 19:11, 21 October 2009 (UTC)

Translingual

This is being discussed on the French Beer Parlour, but it's come up on WT:RFD as well today. Do we have Wiktionary:About translingual or anything similar? This category perhaps more than most other categories risks becoming a bit of something and nothing unless we have some sort of agreement. For example, shouldn't fro have a translingual entry for "ISO 639-3 code of the Old French language"? I'd say yes. If not, why not? Mglovesfun (talk) 13:05, 24 September 2009 (UTC)

We aren't very discriminating on abbreviations, whether it is by policy, guideline, or practice. We have the language codes, though I'm not convinced that they would all meet CFI, because they serve the convenience of the only class of users that is well represented here. The, E numbers, and national and provincial codes are about the most complete sets of abbreviations we have. DCDuring TALK 14:28, 24 September 2009 (UTC)

There was certainly no reason for fro to lack a Translingual section when enm has one, so I added it. I thought we had all of these, but clearly we don't. -- Visviva 14:57, 24 September 2009 (UTC)

Translingual is not really intended to have a cohesive identity, which is probably why WT:AMUL has not been created up to now. But such a page could be useful simply as a collection-point for detailed treatments (or links thereto) of the specific types of things that can be treated as {{mul}} -- Han characters, taxonomic names, symbols (but maybe not all symbols?), etc. And maybe it could even address the "does translingual mean panlinguistic?" issue that keeps resurfacing. (I think the answer is obviously no, while some others think the answer is obviously yes. Obviously it isn't as obvious as we imagine.) -- Visviva 14:57, 24 September 2009 (UTC)

Internet TLDs is another area we're good in (for example, .fm) -- Prince Kassad 14:58, 24 September 2009 (UTC)

Some coordination and structure is better than none, right? I can't see a single negative effect of having a discussion leading to a WT:AMUL being created, can anyone else? Mglovesfun (talk) 15:15, 24 September 2009 (UTC)

As said above by Visviva, we could use a page to list specifically which types of things we treat as Translingual. The word pizza, for example, looks very Translingual. --Daniel. 17:15, 24 September 2009 (UTC)

While Daniel. was being a bit tongue-in-cheek there, the various Italianate music terms would be an excellent candidate for Translingual entries. — Carolina wren discussió 18:32, 24 September 2009 (UTC)

I've started it off at Wiktionary:About Translingual. Please improve. I'd imagine nuts and bolts stuff can be handled there, but discussions about whether certain types of entries are "Translingual" should continue to happen here. --Bequw → ¢ • τ 15:48, 28 September 2009 (UTC)

-or vs. -our

A comment above in English spelling of Japanese derivations about how neither color nor colour could be considered the main entry got me to thinking. A fair number of the words in which the -or/-our distinction occurs are Latin or Latinate words in which Latin and some of the other Romance languages (such as Catalan) are likely to have cognates with the same -or spelling. A case could be made for either:

Consistently preferring the -or spelling for the main entry so that the cognates in other languages appear in the same entry as the main English entry, making accessing detailed meanings or translations easier.
Consistently preferring the -our spelling so that the cognates in other languages appear in a different entry from the main English entry, allowing to average out the size of the two entries instead of having one large entry at -or and one small entry at -our. It also avoids having to use piped #English links to get to the main English entry.

Personally, I feel the reasons for standardizing on -or I gave are more important than the reasoning I gave for standardizing on -our, but both arguments are valid and neither depends on making a judgment as to whether the -or or -our form is "better" English. — Carolina wren discussió 02:03, 25 September 2009 (UTC)

Ignoring “whether the -or or -our form is ‘better’ English”, the arguments of point 1 are stronger. This could also be a way out of façade-style disagreements (by the logic of your first point, the main English entry should be at the diacriticked form, since that is the spelling of its French etymon). If we are to follow the cognate-matching argument in this case, I’d want to see it followed in other applicable cases. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 02:16, 25 September 2009 (UTC)

I favor flipping a coin. :-) Your arguments are both sensible, but I don't really think that any set of logical arguments would ever be satisfactory; the only thing to do is just pick one and get it over with. -- Visviva 04:57, 25 September 2009 (UTC)

Like in English spelling of Japanese derivations, there was no agreement, everyone remained with their opinion. Yes, flipping a coin looks like the only solution. Anatoli 05:23, 25 September 2009 (UTC)

If the community reaches an agreement on which option is heads and which option is tails, I'd be happy to flip the coin. --Daniel. 05:35, 25 September 2009 (UTC)

That's another toss of a coin. ;) Do in order of appearance in the question: color and colour - heads and tails. I personally don't care if American or British is the main entry, as long as they are cross-linked and searchable. Derived entries deserve the same treatment. Alternatively (just a thought, not a suggestion, don't quote me if it's too complicated or there are too many opponents), one can be a redirect to the other, the entry could have color (AE) or colour (BE) and the entry could contain info for both. Anatoli 05:52, 25 September 2009 (UTC)

The result was heads, from a genuine R$1,00 Brazilian coin. This certainly elevates the status of color (and related American English words) over colour (and related British English words) as preferred spelling at Wiktionary entries. Though of course, new ideas for standardization on this matter are always welcome. --Daniel. 06:20, 25 September 2009 (UTC)

It's color all the way for me, then. -- Visviva 06:48, 25 September 2009 (UTC)

Don’t forget the -ize/-ise and -ization/-isation spellings. —Stephen 15:15, 25 September 2009 (UTC)

<humor> I support American spellings (-ize, -or, -ization, etc.) as main entries (reinforced by Daniel.'s coin toss), but of course, I'm American, so that probably doesn't count. And don't forget encyclopedia/paedia/pædia and the like....... :) L☺g☺maniac chat? 16:40, 25 September 2009 (UTC) </humor>

I suppose it doesn't matter, but if we flip a coin for this one issue we can't really do it for others, since we have to make sure that on average there is no preference being given to one country's spelling. As for -ise/-ize, that is easier, since despite appearances -ize is perfectly good UK English too (and is preferred by the OED and the Times and various others). Ƿidsiþ 17:14, 25 September 2009 (UTC)

The OED uses "ize" doesn't it? That's British, right? 76.66.197.30 02:45, 27 September 2009 (UTC)

I would support separate coin-flips for each issue. That should result in an even distribution overall. Of course, it will also be insanely confusing. What's not to like? :-) -- Visviva 06:05, 26 September 2009 (UTC)

I don't speak or write American English or British English. I speak and write Australian English. That means I use "colour". I don't want "colour" labelled with something that excludes me and 21 million of my neighbours. Also in Australian English as in British English "-ize" is prescribed by dictionaries and usage guides, notably Macquarie. Despite the prescription "-ise" is felt by most people to be more correct so a truly descriptive dictionary of Australian English would have to give the "-ise" spellings first and en.wiktionary claims to be such a dictionary. Of course "-ize" is also used here and many people actually use both, including me. — hippietrail 02:33, 26 September 2009 (UTC)
My experience with Australian clients and professors is much as you describe -- they may not have a problem with the "ize" spelling, but they don't really consider it proper Australian English. (Same with British folks, mutatis mutandis.) This seems like useful information to convey in some fashion; it's certainly something I had to learn the hard way. As far as labeling, I don't think many of us actually had a problem with the "Commonwealth" label. Maybe we could restart that discussion? Or could we take more of a wiki approach -- instead of glossing over the issue, maybe the label could link through to an appendix that details the distribution of each spelling difference? If we don't have such an appendix already, we certainly should... -- Visviva 06:05, 26 September 2009 (UTC)

I would strongly support a move to (perhaps in the first instance) write all our page text in one form of English; I find the idea that people could actually get het up about which form of spelling used to write entries depressing - though I suppose the fact that I get slightly irritated that we are inconsistent with it is no better. I think that sticking to a specific form of English for the definitive head-word is a perhaps more arbitrary decision; would there be a "fairer" (if less obvious) metric, like "earliest form with three attestations" (taking the date of the earliest quote). I certainly think that if we were to use one form or the other we should put {{alternative form of}} in the bin and use something much more specific, perhaps including the dating ideas from below, and certainly mentioning (in whatever style) which flavours of English use that form. Conrad.Irwin 19:27, 27 September 2009 (UTC)

List of letters

As part of my project of cleaning up entries related to letters, I am currently adding a list of Latin letters into Translingual entries such as Ç, which looks quite helpful. I'd like to know the opinion of other users about it. If there is no objection, I'm planning to finish the Latin letters and start lists of other scripts as well. --Daniel. 10:12, 25 September 2009 (UTC)

I like the concept, but it's not playing nicely with the right-floating TOC. Could the "width" statement be removed from the table without breaking anything? -- Visviva 10:17, 25 September 2009 (UTC)

On the issue of appearance, I'd like to review other alternatives than removing the "width" statement, so please point me to a page where this problem can be seen; because in my opinion, my example Ç looks fine. --Daniel. 19:49, 25 September 2009 (UTC)

I think this issue is only apparent when you enable "Put the table of contents onto the right of entries." in WT:PREFS. -- Prince Kassad 20:08, 25 September 2009 (UTC)

I see, thanks for pointing me to there. So I edited the related template (including a change in the "width" statement), it should be working properly now. --Daniel. 20:42, 25 September 2009 (UTC)

Looks excellent now, thanks. -- Visviva 06:07, 26 September 2009 (UTC)

One specific concern for Ç; some of the "letters using cedilla sign" aren't actually using a cedilla, they're using a "comma below". A cedilla is attached underneath the letter, a comma below is not. The letters from Turkish and Latvian do not use cedillas. --EncycloPetey 23:02, 26 September 2009 (UTC)

You can see at ț a list of letters using comma. According to the Unicode naming system, Latvian letters such as "ķ" do have cedillas. --Daniel. 00:27, 27 September 2009 (UTC)

No, they don't. The Unicode descriptions of the letters are in error. --EncycloPetey 15:16, 27 September 2009 (UTC)

EncycloPetey, from your definition "cedilla is attached underneath the letter, a comma below is not", the situation seems a little complicated. For instance, in Unicode "Ģ" is called LATIN CAPITAL LETTER G WITH CEDILLA and many fonts draw this character as actually having a comma, therefore the fonts are technically wrong for misplacing such character (instead of inventing a G with a cedilla attached or simply ignoring this character) and Unicode is incomplete for not providing an individual code for a "G" with comma. Though, there is a code for "combining comma", whose effect is to add a comma to the previous letter, therefore a G with comma is possible by Unicode standards: G̦. --Daniel. 03:52, 28 September 2009 (UTC)
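The code points mentioned in this comment can be inspected with Python's standard `unicodedata` module. This is a small sketch, not part of the original discussion. It only reports what the Unicode character database records, and says nothing about how a given font draws the glyph.

```python
import unicodedata

# The precomposed letter discussed above.
g_cedilla = "\u0122"  # Ģ
name = unicodedata.name(g_cedilla)
decomposition = unicodedata.decomposition(g_cedilla)
print(name)           # LATIN CAPITAL LETTER G WITH CEDILLA
print(decomposition)  # 0047 0327, i.e. G + COMBINING CEDILLA

# "G with comma" built from a base letter plus the combining comma below.
g_comma = "G\u0326"
combining_name = unicodedata.name("\u0326")
print(combining_name)  # COMBINING COMMA BELOW

# NFC normalization cannot merge the pair into one code point,
# because no precomposed "G with comma below" exists.
composed = unicodedata.normalize("NFC", g_comma)
print(len(composed))  # 2
```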

Unicode originally unified the cedilla and the comma below. Then it later disunified them at the request of the Romanians who insisted that their Ș and Ț were not to be written with cedillas but commas. Unicode's stability policies prevent it from renaming a character or changing the decomposition, even if it later is discovered or decided that it is wrongly named. Hence using Unicode's name for a character as proof of whether the characters should be rendered typographically with commas or cedillas is unreliable. Typographically the Latvian characters should be rendered with commas, as is done with the reference versions in the Unicode charts. The Latvians didn't have to deal with other orthographic traditions preferring cedillas for their letters, plus the attempt to disunify cedilla and comma below has created a mess for the Romanians. So it's no surprise that the Latvians haven't pushed for *LATIN CAPITAL LETTER G WITH COMMA BELOW and the like to be added to Unicode to replace LATIN CAPITAL LETTER G WITH CEDILLA. It's sort of like the situation with the Turkish dotted and dotless i's where for purposes of casing it would be nice if LATIN CAPITAL LETTER I and *LATIN CAPITAL LETTER DOTLESS I were disunified as well as LATIN SMALL LETTER I and *LATIN SMALL LETTER I WITH DOT ABOVE. — Carolina wren discussió 05:13, 28 September 2009 (UTC)
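Both situations described in this comment can be seen from Python's standard library. The sketch below is an illustration only: it prints the names of the Romanian code point pairs and shows Python's default, locale-independent case mapping for the Turkish letters. Language-aware casing would need a separate library and is not shown.

```python
import unicodedata

# Romanian: the older cedilla code points and the later comma-below
# code points exist side by side.
romanian = {f"U+{ord(ch):04X}": unicodedata.name(ch)
            for ch in "\u015E\u0218\u0162\u021A"}
for code, name in romanian.items():
    print(code, name)

# Turkish: because dotted and dotless i share code points with the
# ordinary Latin I/i, default casing cannot round-trip them.
dotless_upper = "\u0131".upper()   # dotless i uppercases to plain "I"
plain_lower = "I".lower()          # plain "I" lowercases to dotted "i"
dotted_lower = "\u0130".lower()    # İ lowercases to "i" + COMBINING DOT ABOVE
print(dotless_upper, plain_lower, [hex(ord(c)) for c in dotted_lower])
```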

Well, if we're going to say that one classification is correct and another incorrect, we should have some authority for doing so. If the Unicode descriptions are not considered authoritative, what alternative authority should we use? -- Visviva 08:40, 28 September 2009 (UTC)

Perhaps usage is a good way to solve this issue. According to my interpretation of our well-known "three-citations rule" from WT:CFI, we could ideally cite three websites (I mean, at least three websites for each letter and some reliable grammatical source, printed or not, stating that the symbols are actually called commas) that use specifically the letters described as having "cedilla" by Unicode. In this case, websites are better than printed media because the coding of the symbols in question may be verified. Though, even with this preparation, I think the Unicode description cannot be completely ignored, so the Translingual description would not be very different than: Ŗ 1. Letter R with cedilla according to Unicode, widely used as letter R with comma in many languages. By extension, Latvian and other language sections would preferably have similar descriptions when this affirmation is true in their contexts. --Daniel. 10:22, 28 September 2009 (UTC)

I too like the idea and stole it for {{hy-script}} :) --Vahagn Petrosyan 10:08, 28 September 2009 (UTC)

I'm so flattered that I too stole most of your template, for generic information on Armenian script at Translingual sections. For example, see Ա. You made my research of Armenian letters easier. Of course, you are welcome to discuss this change, as the Armenian entries also look fine without Translingual sections. --Daniel. 11:23, 28 September 2009 (UTC)

Sorry, I don't think Translingual sections for Armenian are necessary as they duplicate what I have already created. Technically, of course, Armenian letters are used for Old Armenian {{xcl}}, Middle Armenian {{axm}} and Modern Armenian {{hy}} but I don't think that makes the letters translingual. I treat the differences between those under one ===Armenian=== header as in ղ. --Vahagn Petrosyan 11:33, 28 September 2009 (UTC)

Well, this is another one for WT:AMUL -- if, as and when -- but I do think we should have brief Translingual sections for every Unicode character/codepoint/symbol/thingy, to cover technical and typographical information. But alas, I believe I'm in the minority on this one; people couldn't even understand why I resisted putting ==Korean== headers on nonsense "syllables" like 궑.-- Visviva 12:00, 28 September 2009 (UTC)

In my opinion, differently from Eastern Armenian and Western Armenian, that are currently treated as dialects of Armenian, that information on Old Armenian under an Armenian section at ղ seems seriously out of place. When working on "my project of cleaning up entries related to letters" said above, I could eventually find these Armenian entries and separate them, as I have done in various other languages, simply because they are different languages. As for Armenian letters being Translingual or not, we have the definition from WT:ELE: "terms that remain the same in all languages", but we don't have a detailed policy on what is Translingual. However, I think that distinct letters are Translingual enough to merit entries. For instance, someone could look for a symbol without knowing its language; in this case, details such as romanization and pronunciation would not be strictly necessary, only language information and links to related symbols. --Daniel. 12:16, 28 September 2009 (UTC)

Since we're talking about the letter, I agree that it's Translingual. I see no reason why an Armenian character could not appear in an English (or other language) sentence as the topic of discussion. A text about Greek pronunciation can discuss "the palatalization of ν," for example, and the only alternative to doing this would be to use a name of the character, which is usually more cumbersome than just using the character. --EncycloPetey 13:33, 28 September 2009 (UTC)

My main concern is duplication. How about merging the ===Armenian=== section in Ա with and into ===Translingual=== then? --Vahagn Petrosyan 13:54, 28 September 2009 (UTC)

Perhaps obviously a complete merging of Armenian and Translingual sections would not be always possible, because not every Armenian definition is a letter, number, chemical formula, etc. The letter է also has a Verb sense. A merging of the "letter" senses would be possible, but not every information is Translingual: by comparison, the current Albanian, Azeri, Catalan, French, Jèrriais, Kurdish, Manx, Occitan, Portuguese, Turkish and Turkmen sections at Ç could be merged into Translingual, but this would inevitably cram language-exclusive information, such as alphabetical order and pronunciation. --Daniel. 18:42, 28 September 2009 (UTC)

I meant to merge only Letter senses, naturally. --Vahagn Petrosyan 10:23, 29 September 2009 (UTC)

Being able to say "the palatalization of ν," does not convince me that the character can be included as an English (or by extension, a Translingual) term. If the character wasn't obviously in a different script, I'd expect it to be italicized to denote that it's foreign. The phrase strikes me more as a mere mention of the term than an actual usage. Similarly, a textbook could say "the translation of 愛情" and that would not make me consider 愛情 worthy of an English or Translingual entry. Maybe we should include all Unicode code-points as Trans entries, but just because something is a letter does not give it Translingual status, especially if it's only used to write a single language. --Bequw → ¢ • τ 03:10, 29 September 2009 (UTC)

Bequw, to have your opinion properly analyzed by me, could you please be more specific about your statement "Maybe we should include all Unicode code-points as Trans entries, but just because something is a letter does not give it Translingual status". Looks like you are suggesting the guideline of adding all Unicode code-points (which would theoretically include every letter) while actually disliking this idea. Your suggestions may be considered to build a clear list at WT:AMUL of which types of things have Translingual status. In my opinion, this list would, of course, include every letter. --Daniel. 10:01, 29 September 2009 (UTC)

I am undecided on including all Unicode points as Translingual characters. I just didn't want to justify making these letters into Translingual entries for a weak reason. --Bequw → ¢ • τ 15:50, 29 September 2009 (UTC)

Every letter now has an appendix of forms, including combinations with punctuation/capitalization, appearance in other scripts, etc. See, e.g. Appendix:Variations of "c". Shouldn't the "Variations of" phrase in the template link to that appendix, which is going to hold far more information than we're likely to want to replicate on every "c" page? Cheers! bd2412 T 19:37, 28 September 2009 (UTC)

I've put it in to "C" in the template as an example. Of course, that makes it redundant to the one on top of the page, but that could be eliminated as well. bd2412 T 19:42, 28 September 2009 (UTC)

Appendix:Variations of "c" lists entries with similar names. There are other appendices such as Appendix:Variations of "fa", and none of them is limited by the Translingual header. For consistency with other pages, a link should remain at the top of entries. Though, despite the redundancy, another link could exist automatically by the template, as your example at Ç doesn't look bad. --Daniel. 09:21, 29 September 2009 (UTC)

Quotation marks and terminal punctuation

Here's an issue for WT:MOS, if anyone ever gets around to writing it...

A while back WikiPedant raised the concern on my talk page that it was incorrect to place the comma or other punctuation outside of quotation marks, as for example in a citation line such as this:
- 1998, Seom Young-Guy, "The Problematics of Transboundary Graphology", Journal of Irreconcilable Trivialities, Volume 1 Issue 2944820, page 25J
As WikiPedant noted, this is contrary to standard North American usage, as described for example in the Chicago Manual of Style (6.8-6.10) and The Copyeditor's Handbook (pp. 77-78), which would have us put the comma inside the quotation marks.
However, this usage is consistent with what the North American style guides refer to as the "British style", as described for example in the Oxford Style Manual (pp. 148-150). In the "British style", which I am following here against all of my Leftpondian instincts, terminal punctuation is placed within the quotation marks only if it is part of the quotation. (The OSM would also enjoin us to use single rather than double quotes.)
It had been my understanding that the standard approach on Wiktionary is to follow the OSM system, apart from the use of double quotes. My recollection was that the reasoning went something like this: we have to pick one or the other, and the "British style" is more globally accepted and more precise -- precision being a particular concern for a dictionary.
The thing is, I cannot find where any of this has actually been discussed or documented on-wiki. So I bring it here for your consideration. Should we follow, or continue to follow, "OSM with double quotes"? And if so, should we perhaps make a note of this somewhere? -- Visviva 07:32, 26 September 2009 (UTC)

What you call the “British style” is properly called logical quotation, and is neither restricted to nor universally followed in Britain. It is the English Wikipedia’s house style and is more consistent with other languages’ quotation marks. The “American style” (properly typesetters’ quotation) derives from the use of printing presses requiring that the easily damaged smallest pieces of type for the comma and the period be protected behind the more robust quotation marks. Typesetters’ quotation is a historical artefact, and there’s no need to perpetuate it here; logical quotation is the, well, logical choice.
As for which quotation marks to use, it only matters that, when quotations enclose other quotations, we alternate double–single–double–&c.; e.g., “I said ‘she said “he said ‘they said “the dog said ‘Wha’ choo talkin’ about, foo’?!’ ” ’ ” ’.” — That example also demonstrates, IMO, why double quotation marks are to be preferred over the single forms; use of the latter is liable to confusion with apostrophes.
So yeah, “OSM with double-quotes” it is. Go status quo. Woo! † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 13:56, 26 September 2009 (UTC)

Ah, yes. The discussion I thought I remembered probably took place on Wikipedia, some years earlier.

As long as everybody's down with the status quo, I've got no issues. I would note that the American system is settled style on this side of the Atlantic, and is fairly drilled into us Leftpondians (or at least those of us who go into writing of one sort or another), but I have to admit it doesn't make a great deal of sense when you step back and look at it. -- Visviva 14:22, 26 September 2009 (UTC)

Yeah, it’s used everywhere in the US except in technical writing. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 14:31, 26 September 2009 (UTC)

I welcome a good excuse to dispense with the illogical leftpondian approach in this instance, without in any way prejudicing my ability to make choices in the future based on other principles ~~or, possibly, no principles at all~~. DCDuring TALK 19:32, 28 September 2009 (UTC)

I feel like this is perfect material for a real WT:STYLE (not the ELE redirect that is currently there). Do we think there's enough material for a full page? --Bequw → ¢ • τ 02:45, 29 September 2009 (UTC)

Portmanteaus/blends

A look at the list of words that link to "portmanteau" reveals many entries that are mere compounds (A + B) rather than actual portmanteaus or blends, in which the words that are combined overlap. These need to be fixed by amending the etymology to use the "blend" template. EDIT: OK, I think I've got all of these now.

I notice that the "blend" template now adds words to Category:English blends, which is good, but there are still many words at Category:Portmanteaus. Do these need to be migrated to the newer category? — Paul G 14:51, 26 September 2009 (UTC)

Yes, since blend is a broader term (meaning a word formed from two words slammed together, irrespective of meaning); portmanteaux also require blent meaning. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 15:54, 26 September 2009 (UTC)

Quite. I'm not sure that slamming is entirely necessary, though - "blend" suggests a more gentle mixing :) Can the migration be done automatically, or does it need to be done manually? — Paul G 16:48, 26 September 2009 (UTC)

Are they hard-cat.’d with ]? If so, I’d assume that it needs to be done manually. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 17:09, 26 September 2009 (UTC)

Dating and subsenses

So have a look at the expanded version of mark, noun 1. I did two things which are probably quite controversial so people might want to comment and/or throw things. Firstly I split it into subsenses. I think there's some agreement that subsenses might be useful on some entries, although it's been a while since we discussed it and I remember some editors like Connel were very against the idea. Obviously we could just change it back to a list, but I don't know if that would be necessarily easier on the eye, and I also think subsenses allow us to show historical ordering of definitions while also keeping related senses together (which was one of the main objections). The second thing is the dating template {{defdate}}. I mentioned something similar a while ago. Now I've tried to make it more understandable and less garish. Personally I think we desperately need this kind of info, but maybe I'm on my own. Any thoughts? Ƿidsiþ 10:50, 27 September 2009 (UTC)

It looks very good, I am a big fan of subsenses, I'm not convinced we need the dating info, but it looks quite neat there, so if people want to add it, I don't see any reason to remove it. The yellow is not over the top, though I do find it draws my eye still - would it be possible to merge it into the context template system? Conrad.Irwin 11:06, 27 September 2009 (UTC)

Presumably that template could be customised -- or turned off altogether -- via WT:PREFS..? Ƿidsiþ 11:15, 27 September 2009 (UTC)

Indeed, it looks nice. I think we should employ subsense-format more often. --Vahagn Petrosyan 12:00, 27 September 2009 (UTC)

As to my personal preference, I like subsenses and some chronological ordering. The date information is very desirable and would be part of my personal preferences. I would prefer a paler yellow. I would also prefer some way to more visibly suggest that the sense was still in use in the date information. The overlap in meaning with the obsolete (and archaic ?) tags is bothersome, made worse by their separation and the attention-grabbing use of color.

One possibility is to explicitly mark the current use by ending the date range with the word "date" ("19c.-date") instead of a null ("19th- c.") for a sense still in current use.

As to the choice for default for the casual user, our access to PREFS only serves to separate us from the normal-user experience. I would argue that the default for the date information should probably be its exclusion. It is the kind of thing that is of much more interest for a serious, repeat user. I would also like the default yellow to be much paler, especially if the date information is to remain in the default view. The use of color is so fundamental a departure from the user interface that the default permitted colors should be radically limited and made subject to the dreaded VOTE. DCDuring TALK 12:08, 27 September 2009 (UTC)

I could definitely see the value in making, say, obsolete senses appear red or something – though that kind of thing is way beyond my own last-resort template-making skills. Ƿidsiþ 14:05, 27 September 2009 (UTC)

Ooh. Dat dere iz sna-zy! Very good work, Ƿidsiþ. I tend to agree, however, that {{defdate}} should be integrated with {{context}} — perhaps like what’s used (probably just by me) for some archaic or obsolete variant spellings, viz.:

(obsolete, in use 15th–18th centuries)

…or a terser version thereof. Either that, or {{defdate}} should come before context tags (and probably in a paler yellow, per DCDuring); in-use dates would be more visible and easier to compare that way, since they’d all be left-aligned. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 12:26, 27 September 2009 (UTC)

Love the data, not particularly thrilled with the yellow. :-) Of mixed feelings about integrating with context templates, as suggested above. The data would go smoothly enough with "obsolete", "archaic", and "dated", but would look strange coming before senses that are still in active use. Plus allowing the display to be toggled would be difficult if it were integrated with {{context}}; the text could still be hidden, but would leave a "comma to nowhere," unless someone did a complete rewrite of {{context}} (any volunteers? heh...). From my own selfish point of view as someone trying to find ways of extracting and repackaging our data, I would much rather that this be encoded in one way (e.g. at the end of the sense line in a separate template) rather than in two or more ways that would each have to be addressed differently. So I would vote to stick with the end-of-the-sense-line placement. -- Visviva 13:55, 27 September 2009 (UTC)

I am curious about how well this will function in the wild, for example with regard to senses that were "dead" for centuries before being reinvented, or basically obsolete/archaic words that may occasionally still be used in reference to the past, or that crop up in some smart-alecky Usenet post. But regardless of how we handle those cases, this seems like it would add good value to the 99% or so that are not especially problematic. -- Visviva 13:55, 27 September 2009 (UTC)

Would you be OK with {{defdate}} being consistently placed pre-{{context}}? † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 14:26, 27 September 2009 (UTC)

It seems to me that doing so would create unnecessary visual noise. The vast majority of people looking for a definition -- especially a non-obsolete definition -- are not going to much care whether it was first used in the 9th century or the 19th. (I don't consider our web interface a major concern, but am inclined to think that anything that makes it less usable is probably not good.) In the same way that too-extensive contextual qualifiers should be moved either to a second sentence or a usage note, I'm inclined to think that this date information is better placed after the meat of the definition, rather than before. -- Visviva 15:56, 27 September 2009 (UTC)

There are a few words with famous gaps in attestation. Eg heavenish, which is recorded from the 9th to the 16th century, and then disappears only to reappear in the 1800s. It could conceivably be done just with two separate templates (, ), but as you say I think these marginal cases could probably be thrashed out later. Ƿidsiþ 14:18, 27 September 2009 (UTC)

Secret (verb) seems to be another one like that. How about (in use since the 16th century) or something for current terms? † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 14:26, 27 September 2009 (UTC)

The open-ended date ranges don't visually come across as open-ended. They came across initially to me as a single date with odd formatting, especially so since "century" followed the range without a specific value in front of it. I can't be thrilled about the yellow smudges either. Do we have people enough willing to research those date ranges? --EncycloPetey 15:15, 27 September 2009 (UTC)

I think we can tolerate even more non-uniformity of quality of our entries. There are entries where we have the information, but had not had a sanctioned way of presenting it usefully.

The more I think about it, the less any color seems appropriate for this use. I think we need to save the use of such color for items where we need to grab user attention to deliver a warning of some kind. Alternatively, we could use very pale yellow to highlight only non-obsolete, non-archaic, non-dated, non specialist senses of headwords. DCDuring TALK 16:02, 27 September 2009 (UTC)

I don't think the colour is important. At any rate, I envisage that users can make those choices for themselves via PREFS. Ƿidsiþ 16:05, 27 September 2009 (UTC)

I have taken the liberty of re-formatting the subsenses of definitions 3 and 4 of mark using {{date}} which interacts with {{context}}. I have left the highlighting in, but feel it should be removed. Would this be a less-intrusive way of including the same information? Conrad.Irwin 23:41, 27 September 2009 (UTC)

I think so (without the highlighting), but I think {{range}} would be a better name, since "date" is ambiguous as a template name. --EncycloPetey 23:44, 27 September 2009 (UTC)

I agree, or {{century}} or something to make it more specific. Conrad.Irwin 23:48, 27 September 2009 (UTC)

Five things:

Without “attested”, “in use ”, or somesuch, the meaning is ambiguous; e.g., (sports, since 19th c.) could be taken to mean “used to mean since the nineteenth century”.
Calling it {{century}} may cause confusion with {{R:Century 1911}} and {{R:Century 1914}}.
Date ranges take en dashes.
The ordinal suffixes should be superscribed.
Let’s use C as the abbreviation of century (per The New Penguin Dictionary of Abbreviations: from A to zz), since it has one character fewer than c. and is less liable to confusion with circa.

† ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 00:12, 28 September 2009 (UTC)
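For illustration only, the formatting points in the list above (an explicit “in use”, an en dash in ranges, superscribed ordinal suffixes, and “C” for century) can be sketched in Python. This is not the code of any existing template; the function names and the choice of HTML `<sup>` markup for the superscripts are assumptions made for the example.

```python
from typing import Optional


def ordinal_suffix(n: int) -> str:
    """Return the English ordinal suffix for n, e.g. 1 -> 'st', 12 -> 'th'."""
    if 10 <= n % 100 <= 20:
        return "th"
    return {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")


def century(n: int) -> str:
    """Render a century number with a superscribed ordinal suffix (HTML assumed)."""
    return f"{n}<sup>{ordinal_suffix(n)}</sup>"


def format_in_use(start: int, end: Optional[int] = None) -> str:
    """Hypothetical helper: build an 'in use' label for a sense.

    With no end century the sense is treated as still current.
    Ranges are joined with an en dash (U+2013), and 'C' abbreviates 'century'.
    """
    if end is None:
        return f"in use since {century(start)} C"
    return f"in use {century(start)}\u2013{century(end)} C"


print(format_in_use(15, 18))
print(format_in_use(16))
```

Leaving out the second argument marks a sense as still current, which is one way to address the concern raised earlier that open-ended ranges do not read as open-ended.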

I don't mind it at the beginning of the line, and if it IS at the beginning then I agree that the colouring is unnecessary. But I don't think it should be added to the already-overburdened context brackets; I'd prefer it separate, perhaps in square brackets, and preferably 80% size. Ƿidsiþ 12:31, 4 October 2009 (UTC)

Quotation ordering

I’ve recently noticed that neither WT:QUOTE nor WT:ELE says anything about the order in which quotations ought to be arranged. Common practice here, as well as everywhere else, is to order quotations from oldest to newest. Since this isn’t always followed, could we codify this somewhere, please? † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 11:19, 27 September 2009 (UTC)

Ordering by date is obviously desirable for quotes for each sense. (I assume you meant within groupings by sense.) I have a related peeve against an undifferentiated quotations section or citations page for a multi-sense word. A failure to indicate which sense a quotation is supposed to support ought to be considered a cleanup issue. DCDuring TALK 12:18, 27 September 2009 (UTC)

Yeah, certainly, I meant ordering by date to be the lowest principle of organisation, after language > POS > sense. I make a fair bit of use of Quotations sections, sorted by glossed sense for polysemic terms, since interlarding definitions with many quotations makes it difficult to make out the definitions; I really have no idea why we’re not using Ruakh’s quotations-box template… † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 12:48, 27 September 2009 (UTC)
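As a rough sketch of the ordering described above (language, then POS, then sense, with date as the lowest key), the following Python example sorts quotation records by a composite key. The `Quotation` record and its field names are hypothetical and exist only for this example; nothing here reflects how entries are actually stored.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Quotation:
    # All field names are assumptions made for this illustration.
    language: str
    pos: str
    sense: int  # index of the sense the quotation supports
    year: int
    text: str


def order_quotations(quotes: List[Quotation]) -> List[Quotation]:
    """Sort by language > POS > sense, then oldest to newest within a sense."""
    return sorted(quotes, key=lambda q: (q.language, q.pos, q.sense, q.year))


quotes = [
    Quotation("English", "Noun", 2, 1998, "later quotation for sense 2"),
    Quotation("English", "Noun", 1, 1850, "quotation for sense 1"),
    Quotation("English", "Noun", 2, 1611, "earlier quotation for sense 2"),
]
for q in order_quotations(quotes):
    print(q.sense, q.year, q.text)
```

Because the year is the last element of the key, quotations never cross sense boundaries; they are only reordered chronologically within the sense they support.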

I had quite forgotten about that; it does look mighty fine. That plus Widsith's data would give us a very nice layout for well-cited entries. Were there specific objections to using Ruakh's template? -- Visviva 13:16, 27 September 2009 (UTC)

I do find that it visually interrupts the hierarchy of the definitions, but I don't see why we ought not to set up a test page and have a better look at it. --EncycloPetey 15:11, 27 September 2009 (UTC)

At least an off-line test. But why not an on-line test on a popular entry of moderate complexity (and no subsense structure)? We can rollback if necessary. Getting some user feedback early in the process might be useful. Off-line prototyping/testing might be useful for interaction with subsenses. DCDuring TALK 15:49, 27 September 2009 (UTC)

IIRC, it was chiefly Connel MacKenzie’s hypothetical (read: never demonstrated, I suspect unfounded) concerns that the template would be broken by more fundamental, unforeseen, future code changes that scuppered its adoption. I think it’s one of the best proposals we’ve had for simultaneously improving functionality, clarity, space-saving, and our professional appearance. Regarding the minutiæ of the template itself, I formerly advocated the first approach, with the example sentence inside the box; I now think that a hybrid of the two would be best — with the border of the first combined with the external example sentence of the second. Also, we’ve gained clickable trans-, rel-, and other tables since then, so that new functionality should be added to the {{cite-top}} (or whatever we call it) template. Concerning our inchoate subsense structure, would it be possible to add something like an indent= parameter to the template, so that leaving it blank or inputting indent=0 would leave the template as it is demonstrated, suitable for quotations citing main senses, indent=1 would alter it to be suitable for subsenses, indent=2 for sub-subsenses, and so on? As for gathering user feedback, probably the most realistic approach to that is to set up some examples of the template in use in a number of well-cited entries of moderate complexity (as DCDuring suggests), and then posting the before-and-after diffs for comparison in a box on the main page (like the ones there for “Word of the day” and “Behind the scenes”) with requests for user feedback; a lot of our more irregular users go to the main page first to search for terms (in my experience), so it probably won’t be completely ignored; as long as we phrase questions properly, we should get some useful feedback.
I’d really like to push to make sure this template gets used, but I’ve been occupied lately with arguing for the retention of notable-but-unattestable terms (which is now pretty much sorted thanks to Visviva’s appendix of words found only in dictionaries), with proposing reforms to enPR, with advocating the adoption of {{citedterm}}, and with talking about allowing users to toggle spelling in entries, so I don’t really have the time or the energy to follow this one up at present; moreover, my situation will soon change, making it far less easy for me to edit Wiktionary (increased IRL workload and restriction to public computers). Hopefully, this won’t die on the vine again. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 02:03, 28 September 2009 (UTC)

sop4cfi!

words influence ea oth~most"synonyms" arnt 100%so,but ustil keep'em>so u'd w/quite afew "sop"s->tolerat'em til cfi expanded!--史凡>voice-MSN/skypeme!RSI>typin=hard! 15:21, 28 September 2009 (UTC)

Translation (I think):
Words influence each other. Most “synonyms” aren’t 100% so, but you still keep them; so you should with quite a few “sums of their parts” — tolerate them until the CFI are expanded!
† ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 19:47, 28 September 2009 (UTC)

Sorry 'bout that... didn't see it soon enough. Very good job, Doremítzwr :) L☺g☺maniac chat? 21:01, 28 September 2009 (UTC)

I've always favored giving the benefit of the doubt to any terms for which reasonable doubt exists as to their SOPness. I've added this to Wiktionary:Editable CFI, which I encourage other editors to join in revising. -- Visviva 14:16, 1 October 2009 (UTC)

is 4/5 ofour del>wt poorer imv[iew:( [nyes,i'v radicaly difrent concept ofusefl dict thanmost ppl here!:P

ta 4lnghnd-guys!--史凡>voice-MSN/skypeme!RSI>typin=hard! 14:57, 1 October 2009 (UTC)

4/5 of our deletions make wiktionary poorer in my view (and yes, I have a radically different concept of what makes up a useful dictionary than most people here :P) thanks for the longhand guys!

I agree with you that people here overuse the term SOP and that we should really keep most of the stuff we're deleting. And like I said, please don't thank me..... (I'm sure the others don't mind tho :) L☺g☺maniac chat? 20:23, 1 October 2009 (UTC)

k,the ta'lbe implied;)plno uwerpivotal i/me getin answes&havin'anicer time here

Thanks :) I think of it as my duty. L☺g☺maniac chat? 01:11, 2 October 2009 (UTC)

I'm inclined to agree. There are too many "(quasi-)deletionistic" attitudes about the minds of some Wiktionarians (not any one person in particular that I can think of). 50 Xylophone Players talk 15:10, 26 October 2009 (UTC)

Because I am trying to rationalize our use of English-language Idiom, Phrase, Proverb, and Interjection headers, I regularly come across questionable entries. At the same time, I find that our coverage of legitimate idiomatic entries is not so complete. My belief is that entries for non-idiomatic collocations can be positively misleading in that they imply idiomaticity that does not exist. I would be very unhappy if a library catalog contained an entry for each shipment of books received in a single package. I'm interested in books not shipments. There are many collocations in the world that are not worth including. Including them is not without cost because it is difficult to maintain quality with such entries, especially because there are no other reference works that provide models for what entries for such terms should look like.

I view including mere collocations as permanent entries as likely to lead to exclusion of desirable true idioms and missing senses of words. OTOH, if we view such entries as a kind of request for entry, we find that some of them are worth keeping and others serve as a way of discovering weaknesses in the entries for the components. DCDuring TALK 16:44, 26 October 2009 (UTC)

Wiktionary:Editable CFI

I've been reading up on this new concept called a wiki, and it seemed like it might be helpful in resolving policy issues right here on Wiktionary (Wiktionary, wiki -- cool coincidence, huh?). So I've created Wiktionary:Editable CFI. At the moment it is just WT:CFI with a different header template. But I'm hoping that people who have specific ideas for additions/improvements to (or deletions from) CFI will consider applying them to Wiktionary:Editable CFI. Working together over time, I'm hopeful that we might even be able to develop what I believe is known in the wiki community as a "kon-sen-suss" regarding the best approach to various inclusion issues. -- Visviva 01:26, 29 September 2009 (UTC)

I like the idea, and have now performed several nonsubstantial edits, meaning edits that do not change the consequences of CFI. Feel free to revert me. --Dan Polansky 06:42, 29 September 2009 (UTC)

Looks good. I have started editing more substantially. -- Visviva 12:02, 1 October 2009 (UTC)

Would and Wiktionary:Editable ELE be desired as well? From what I can tell it was also locked down w/o a vote along with most of the others listed on {{policy}}. --Bequw → ¢ • τ 17:49, 19 October 2009 (UTC)

That sounds like a good idea. --Yair rand 16:53, 26 October 2009 (UTC)

suggesting an addition to MediaWiki:Anoneditwarning

There has been much discussion on users being blocked for vandalism without warning. The reasoning for doing this is that vandalism is a waste of our time, and that vandals do not need to be warned. However, others argue that blocking users without even a single warning violates the principle of "assume good faith." From what I have seen, a large portion of "vandals" consists of newbies experimenting with Wiktionary, rather than people who are intentionally trying to cause trouble.

True, common sense dictates that people who make inappropriate edits to wikis should expect to be blocked, but that doesn't mean we shouldn't give warnings at all.

After thinking for a while, I came up with a possible solution: a polite but firm warning should be added to the editing window. Specifically, MediaWiki:Anoneditwarning and similar messages should have something along the lines of the following:

Please note that vandalism and other disruptive edits may result in the loss of editing privileges without warning. Such edits include but are not limited to: additions of nonsense or gibberish, deliberate introduction of incorrect information, unrelated advertising, removal of content without explanation and repeated recreation of deleted material. If you are unsure whether something is permitted, please discuss it first. Thank you.

That way, all users will see a warning before they edit, and we can't say that they weren't warned! --Ixfd64 02:24, 29 September 2009 (UTC)

Yeah, good idea, but let’s make it a bit more authoritarian-sounding:

Please note that vandalism and other edits considered to be disruptive to the project are likely to result in a block from editing without warning. Such edits include but are not limited to: additions of nonsense or gibberish, the introduction of incorrect information, removal of content without explanation, divulging personal contact details and other sensitive information, repeated recreation of deleted material, personal attacks, unrelated advertising, and so on. If you are unsure whether something is permitted, discuss it first, either on an entry’s page or in one of Wiktionary’s discussion fora. Thank you.

That way, there is a clear presumption against things that damage the project, with little consideration given to disruptive editors. I believe that accurately reflects the extent to which vandals are accommodated on the English Wiktionary. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 02:55, 29 September 2009 (UTC)

I would say it would still be helpful if an explanation on why you're being blocked is left on the user talk page, since even with such a warning, it is not necessarily clear why you are being blocked. 76.66.197.30 07:59, 29 September 2009 (UTC)

Are given names proper nouns?

Following an earlier Tearoom discussion on things like Belgian being a common noun with a capital letter, are given names (or first names) proper nouns? They don't refer to a specific thing like the Eiffel Tower and they can always have plurals. Why don't we have plurals of given names, then? Georges should have an English section. Mglovesfun (talk) 10:07, 29 September 2009 (UTC)

I don't have the book handy right now, but I believe the Cambridge Grammar of the English Language would consider them to be proper nouns -- but not proper names. It's one of CGEL's more peculiar distinctions, but I'm beginning to understand why they draw it. -- Visviva 10:31, 29 September 2009 (UTC)

Given names are proper nouns as they have no intension, only extension, to answer in terms of logic; they do not denote a set of qualities that an individual has to have in order to come under the head (to deserve to be called by the term). A given name such as "Martin" is intended to uniquely refer to an individual, albeit only in a specific context of a small group of people, so in the end it refers to many individuals. While "Martin" refers to many individuals, it does not do so by virtue of their common characteristics or nature, unlike "tree". Put differently, what all "Martins" have in common is not their "Martinhood" but only the name "Martin", and some other characteristics such as being a person and being a male, which follows from the masculinity of the name "Martin". That is my understanding anyway, probably at least in part erroneous.

For a nice treatment of proper nouns, see User:EncycloPetey/English proper nouns. --Dan Polansky 12:15, 29 September 2009 (UTC)