Euphemistic spellings with asterisks are currently being discussed at Wiktionary:Requests for deletion/English#All "euphemistic spellings" with asterisks or other character placeholders. They are in Category:English terms spelled with *, which currently has 68 items.
An August 2019 deletion discussion will be archived to Talk:b*tches.
Other past deletion discussions include Talk:f*der (2010 keeper), and Talk:f**k (a near-unanimous 2008 keeper with 8 keep votes).
A relevant template is {{euphemistic spelling of}}.
WT:CFI#Attestation vs. the slippery slope seems relevant: "There is occasionally concern that adding an entry for a particular term will lead to entries for a large number of similar terms. This is not a problem, as each term is considered on its own based on its usage, not on the usage of terms similar in form."
--Dan Polansky (talk) 10:42, 15 February 2020 (UTC)
Hey. What part of speech is řidčeji supposed to be? I am asking for the sake of the abbreviation řidč, which has no POS header. --Vitoscots (talk) 01:45, 10 April 2020 (UTC)
The following is what I think is a fine edit summary standard for Wiktionary:
For comparison, Wikipedia has W:Help:Edit summary. Interestingly, it says "Always provide an edit summary". That is far from being the common practice in the English Wiktionary, often to detrimental effect. On the other hand, contrary to its section title, even the body text of the Wikipedia help page does not strictly require an edit summary for every single edit; I think it would be overkill to require or even enforce that in Wiktionary.
--Dan Polansky (talk) 07:59, 8 May 2020 (UTC)
It seems to me there is synonymy between a given name and its pet forms. Thus, Jack is a synonym of John, and Maggie is a synonym of Margaret. The synonymy may not be entirely obvious since we are dealing with proper names, and these attach to referents via acts of christening. What seems to support the synonymy thesis is the readiness with which someone called, say, Margaret can be referred to as Maggie, and the latter reference is not via christening. A given name and its pet forms are coextensive (refer to the same set of individuals), to say the least. The trick by which it works seems to be that, say, Maggie has not the extensional meaning typical of proper names, one arising via christening, but rather seems to have the intensional meaning "a person named Margaret". To obtain full intensional synonymy, we need to assign this intensional meaning also to "Margaret", which is kind of weird, but anyway. "Margaret" would have 1) extensional meaning, arising via christening, and 2) intensional meaning "person named Margaret", where the term Margaret used in the intensional definition depends on the extensional meaning or else we would have an infinite recursion. The intensional meaning seems to be used in the plural Margarets. And then some people claim that "a given name" is also a meaning of "Margaret", which I submit is no meaning of the term at all but rather a description of the term.
--Dan Polansky (talk) 14:06, 8 May 2020 (UTC)
English appears to me to be a hybrid Romance-Germanic language. The degree to which English vocabulary is permeated with words stemming from Latin is remarkable.
When I see Italian, it reminds me of English; when I see Danish, it reminds me of German.
I saw Richard Dawkins opine about English in a similar way, in a video that I cannot quickly find.
Is English Really a Germanic Language?, Sep 8, 2016, Langfocus at youtube.com, has a pie chart indicating that English vocabulary is 26% Germanic, 29% French, and 29% Latin. I don't know whether these numbers are correct and for what layer of vocabulary they are determined; if you include the large swaths of the bottom-ontology scientific vocabulary, surely Latin and Greek are going to outnumber everything else, but that is to be expected and is not interesting.
The same video also relates the creole hypothesis, by which English is a creole language. The theory highlights huge simplification in English grammar that took place, including considerable reduction of inflection. Old English had an inflection system not unlike many other inflected languages, the video tells us.
--Dan Polansky (talk) 10:19, 9 May 2020 (UTC)
Links:
--Dan Polansky (talk) 10:59, 9 May 2020 (UTC)
Later: The graph in the medium.com article above, in article section Visualizing the data, suggests that French origin and Latin origin combined reach 40% of vocabulary for about 1000 most common English words, reaching 50% of vocabulary for about 2000 most common English words, and rising slowly higher as the number of most common English words analyzed increases. The article indicates wordfrequency.info as its source for word frequencies, where the website indicates that "The data is based on the one billion word Corpus of Contemporary American English (COCA) -- the only corpus of English that is large, up-to-date, and balanced between many genres." --Dan Polansky (talk) 08:54, 22 May 2020 (UTC)
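The kind of figure read off the graph above (share of French-origin plus Latin-origin words among the N most common words) could be computed along the following lines. This is only a minimal sketch: the function name `romance_share`, the origin labels, and the toy word list are hypothetical and are not taken from the medium.com article or from wordfrequency.info.

```python
from typing import Sequence


def romance_share(origins: Sequence[str], top_n: int,
                  romance: Sequence[str] = ("French", "Latin")) -> float:
    """Return the fraction of the top_n most frequent words whose origin is in `romance`.

    `origins` is assumed to be ordered by descending word frequency,
    one origin label per word (a hypothetical input format).
    """
    head = origins[:top_n]
    if not head:
        raise ValueError("top_n must select at least one word")
    return sum(1 for origin in head if origin in romance) / len(head)


# Toy, made-up data: origin labels of six words, most frequent first.
toy_origins = ["Germanic", "Germanic", "French", "Latin", "Germanic", "French"]

for n in (2, 4, 6):
    print(n, romance_share(toy_origins, n))
```

With real data, one would feed in origin labels for a frequency-ranked word list and watch how the share changes as N grows.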
One might use Google Ngram Viewer for a frequency-based notability test for inclusion of individual people. To determine whether an individual should have a sense in their surname, we might compare the frequency of a fuller name with the frequencies of names of other notable people. For example, entry Newton includes individual sense "Sir Isaac Newton, English physicist, mathematician, astronomer, alchemist, and natural philosopher." Admittedly, one would in fact be interested in uses of the surname alone, but that is not amenable to an easy frequency test.
Some test results:
All of the above would be includable in surnames. More work required. See also Category:en:Individuals.
Another useful test is the lemming test (WT:LEMMING); M-W has Einstein, Hitler, Hume, Russell; M-W does not have Popper.
Yet another test is the existence of -ian/-ean adjective: Galilean, Newtonian, Einsteinian, Humean, Russellian, Popperian, Wittgensteinian. One would have to make sure that the particular person is sufficiently often invoked by the adjective rather than another person of the same surname. --Dan Polansky (talk) 19:49, 7 June 2020 (UTC)
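The frequency-based test suggested at the start of this thread could be sketched roughly as follows. The sketch assumes that relative frequencies of fuller names have already been looked up by hand (for example in Google Ngram Viewer); it does not call any API. The function name, the rule "at least as frequent as the least frequent reference name", and all the numbers are hypothetical illustrations, not actual measurements or an agreed criterion.

```python
from typing import Mapping


def passes_frequency_test(candidate_freq: float,
                          reference_freqs: Mapping[str, float]) -> bool:
    """Return True if the candidate name is at least as frequent as the
    least frequent name in a reference set of already-accepted individuals.

    The min-based threshold is an assumption made for this sketch.
    """
    if not reference_freqs:
        raise ValueError("reference set must not be empty")
    return candidate_freq >= min(reference_freqs.values())


# Made-up relative frequencies, for illustration only.
reference = {"Reference Person A": 1.0e-7, "Reference Person B": 5.0e-7}

print(passes_frequency_test(2.0e-7, reference))
print(passes_frequency_test(5.0e-8, reference))
```

A different threshold (say, the median of the reference set) could be swapped in; the point is only that the comparison is mechanical once the frequencies are collected.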
This is probably a really stupid question, but why does Darth Vader as "a powerful individual or force" clearly meet the criteria for CFI while Morgan Freeman as a description of a certain kind of deep voice does not? For example:
I can't quite put my finger on it. Alexis Jazz (talk) 10:31, 27 June 2020 (UTC)
Since 2010, Arnold Schwarzenegger can be deleted as failing WT:NSE's "No individual person should be listed as a sense in any entry whose page title includes both a given name or diminutive and a family name or patronymic" unless one argues that since that entry is made as a noun with a noun definition rather than a proper noun with a definition line identifying the particular individual, it does not fall under the quoted NSE regulation. This was introduced by Wiktionary:Votes/pl-2010-12/Names of individuals. It created an incongruence with inclusion of Darth Vader via WT:FICTION, via "With respect to names of persons or places from fictional universes, they shall not be included unless they are used out of context in an attributive sense." Here again one might object that there is no incongruence since Darth Vader is kept as a countable noun with the main definition "A powerful individual or force, particularly one that is seen as malevolent, dominating and threatening" and therefore, both Arnold Schwarzenegger and Darth Vader can be kept as nouns, escaping WT:NSE.
I would rather modify CFI and keep Arnold Schwarzenegger as a proper noun with definition "An Austrian-American bodybuilder and actor noted for highly muscular body" or the like, a definition that both identifies the individual and the characteristics that can be picked by metaphorical uses.
I made some relevant comment in RFD for Morgan Freeman, to be archived at Talk:Morgan Freeman.
--Dan Polansky (talk) 09:42, 3 July 2020 (UTC)
Names of organizations are governed by WT:NSE. Names of organizations include United Nations, and some other items in Category:en:Organizations including Federal Intelligence Service, Greenpeace, Hamas, Hezbollah, International Court of Justice, National Aeronautics and Space Administration, World Trade Organization, and more.
An ongoing RFD is going to be archived at Talk:National Hockey League. One property easing the deletion of National Hockey League is that it consists of multiple capitalized nouns or adjectives, unlike e.g. Greenpeace.
The WT:LEMMING test can be useful.
Greenpeace is in Lexico, Collins and Macmillan. Greenpeace survived RFD per Talk:Greenpeace.
Hamas is in Lexico and Collins.
--Dan Polansky (talk) 13:51, 3 July 2020 (UTC)
WT:FICTION seems problematic as per RFD comments that are going to be archived at Talk:Scheherazade.
Replacing WT:FICTION's "With respect to names of persons or places from fictional universes, they shall not be included unless they are used out of context in an attributive sense" with editor discretion could be considered, like "Inclusion or exclusion of attested names of fictional persons and fictional places is subject to editor discretion"; then, editors could use any tentative policy they like in RFD. The attestation requirement involves independence, a basic filter that does not allow a fictional character attested by a single author to be included; rather, multiple authors would need to refer to the character. --Dan Polansky (talk) 15:02, 3 July 2020 (UTC)
Blend/portmanteau formation patterns can sometimes be included as suffixes, e.g. -gate. A deletion discussion is to be archived at Talk:-geddon; other candidates include -mageddon and -pocalypse. -gate is particularly productive: Category:English words suffixed with -gate has 151 items. --Dan Polansky (talk) 08:43, 4 July 2020 (UTC)
The following is inspired by Kant's categorical imperative. It may be a proper application of the principle or not; it is in any case inspired by it. It is further inspired by Popper's falsificationism.
The following principle guiding policy override comes to mind:
The above is probably too stringent since it requires the overriding person to do the policy work and to iron out the kinks and details. For instance, we used translation target (now known as translation hub) as an override at a time when we did not have a proper policy change proposal.
The above results in a more lenient modification of the above principle:
On the other hand, the above may be too lenient. It may allow overriders to throw around hugely dysfunctional principles lacking all qualifications and distinctions and claim that the principles would work if only the proper qualifications were added.
Now how do you know whether you could wish a principle to become part of policy? You would know that by examining the impact of incorporating the principle into policy, and by determining whether the impact is desirable or not. Unfortunately, you may not be qualified or have enough information to properly assess the impact. That's a complication. Furthermore, people differ about what is desirable and what is not.
On a different note, policy overrides are a fact of life in the English Wiktionary. One example is the translation targets/translation hubs, codified after years of use. Another example is the hot words, which have not yet been codified via a vote as far as I know.
Policy overrides should not be done too lightly, or else the English Wiktionary's atmosphere of rule of law, at least as far as RFDs and RFVs go, would erode. Those who apply overrides should be ready to respond to inquiries and provide rationales and supporting evidence. Those unwilling to do the research and articulation work or point to research and articulation work done by someone else should not be throwing around overrides.
--Dan Polansky (talk) 09:09, 11 July 2020 (UTC)
Google Ngram Viewer had such a beautiful, functional user interface. Today, I found its interface spoiled using some "modern" web design or whatever it is. I wonder how and if this fashionable nonsense could be stopped. (Only borderline relevant to Wiktionary; nonetheless, GNV is a very useful tool for us.) --Dan Polansky (talk) 05:59, 14 July 2020 (UTC)
In English, adjectives often have corresponding same-looking nouns, featuring plurals. While such nouns exist for a host of adjectives, one cannot automatically assume the noun's existence for any adjective.
Adjective → noun and its plural, examples:
Linguistics will have many of these, on the model of "X case" → noun:X and the like.
Pharmaceuticals will often be named like this.
If there were a neat linguistic term for the above phenomenon, we could create a convenient category for such nouns; we could create a category regardless, clumsily named, perhaps.
--Dan Polansky (talk) 07:43, 15 July 2020 (UTC)
Czech has a somewhat similar phenomenon in which adjectives give rise to what are usually classed as nouns but are inflected as adjectives anyway. Examples include proměnná, neznámá, vrchní, etc. In Czech, adjectives have plurals anyway, and the nominalization (turning-into-noun) does not change that. In Czech, the process seems much less productive than in English; compare Czech nouns aditivum (vs. aditivní), plurál (vs. plurální), principál (vs. principální), konstanta (vs. konstantní), etc. The lesser productiveness in Czech may have to do with English having Latin as one of its major sources, to the point of being considered by some to be a hybrid Latin-Germanic language, unlike Czech. --Dan Polansky (talk) 10:16, 15 July 2020 (UTC)
Administration statistics including admin action totals and a breakdown into action types, e.g. page deletions:
Top 7 admin action totals for 2020-06-14 to 2020-07-15 (31 days):
--Dan Polansky (talk) 12:00, 15 July 2020 (UTC)
Time range and types of admin action examined can be set here:
Maximum time range seems to be one year. --Dan Polansky (talk) 07:04, 16 July 2020 (UTC)
Sorry for bringing up something that happened five years ago, but a related question has just come up in a Czech Discord server I'm in and I once again find this page inadequate.
From the SSJČ link at the bottom of the page:
3. kniž. souhrn základních znaků, rysů někoho, něčeho; podstata 1, povaha 2, charakter 3, tvářnost 2: lidská t. básníka; poznat skutečnou t. života; pravá t. fašismu ... 4. kniž. vnější podoba věci n. jevu; vzezření, vzhled, tvářnost 3: měnící se t. města; t. krajiny; t. časopisu (Fuč.)
Google results for "tvář města" are plentiful. Clearly none of this refers to a part of the human body and I think the page should account for these senses too. filelakeshoe (holla) 07:28, 10 August 2020 (UTC)
The idea of the title of this section was beautifully put by Equinox on a different project:
(Is this a quasi-retweet?) --Dan Polansky (talk)
Hi, Dan. This is a bit of a delayed reaction to something you mentioned in April. Yes, a machine can determine a phrase from a word given the spaces entailed in the text, but I don't think machines have yet figured out how to distinguish phrases according to lexical category. For instance:
Do machines know the difference? True, most native English speakers can tell the difference within a given context without the need to know anything about the lexical categories. But many non-native English speakers would have difficulty interpreting this sentence absent some phrasal parsing: "Towering iron gates seen three blocks down the road marked the main entrance." I.e.:
I have tons of students who would think "three" is a noun that's blocking something, or that the road marked the entrance.
By extrapolation, my approach to lexicography entails labeling each phrase in its own right: noun phrase, verb phrase, adjectival phrase, adverbial phrase, and so on. I'm not on a stump to change others' approach. Rather, I made one edit here along those lines by force of habit, not meaning to linguistically evangelize. But think about it: What if Wikipedia, in the same way that it requires Verb over Verb phrase, suddenly implemented a rule that required Noun instead of Prepositional phrase? It wouldn't change the way anyone speaks, but it would hamper the ability to learn the language, and the phrase "take something for granted" would become a noun. It's like saying, "You can use any whole numbers from zero to nine, but you must not use fractions." Okie dokie. More thoughts on my approach: "Metaknowledge Re.transivity" --Kent Dominic (talk) 09:45, 6 December 2020 (UTC)