Derek, when you add a new L-2 entry, make sure you put a line between the entries with four hyphens (----). Compare what I did. Thank you for reading. --Lo Ximiendo (talk) 22:48, 18 November 2012 (UTC)Reply
Heya, I see you've added a number of ====Derived terms==== sections to JA entries; thank you for that.
A couple minor points:
Make sure that the ====Derived terms==== header comes at the end of the relevant POS or etymology section. Over on the 金 entry, you added it before the POS header, but then I also see on the 日 entry that you added it in the correct place. :)
The {{l}} template doesn't need the sc parameter for JA. Compare:
Hi! Regarding the entries such as उदजन (udajana) and प्रकाशाणु (prakāśāṇu) - we can only add words that are actually attested in the written corpus, not made-up words that nobody uses/has used. Modern words coined/borrowed into extinct/ancient languages through some kind of "revival" efforts can only be added if there is evidence for them. E.g. we already have some modern Latin terms that can be backed by quotations from Vatican publications. So unless there is actual attestation for these Sanskrit terms, they should be removed. --Ivan Štambuk (talk) 10:29, 24 March 2013 (UTC)
I understand. In all Indic languages, these words are used (with slight alterations in two or three) and so I assumed that, since they derived from Sanskrit, I should simply add them under Sanskrit too. And I doubt I'll be able to find any scientific articles written in Sanskrit, as they are primarily done in Hindi or English in India. DerekWinters (talk) 03:46, 26 March 2013 (UTC)Reply
Hi, the Hungarian words containing tan (science) are compound words (tan is not a suffix). What was your source? Can you please go back and correct all of them? Thanks. --Panda10 (talk) 19:26, 27 December 2013 (UTC)Reply
Thank you. No, I don't know how to speak Telugu, but I do know how to write it. The Indic scripts are very easy to learn once one of them is learned, because they are all so similar. DerekWinters (talk) 05:13, 30 January 2014 (UTC)
So, there's no dropping of inherent "a"? Telugu is now partially enabled. To make it mandatory in headword templates, the templates need to change or if manual transliteration is removed, the automatic will work, e.g. see అంకపాళి(aṅkapāḷi) (the first noun I've come across). You can try Tamil, Kannada, etc. based on the Telugu module. If you want to edit here, please consider adding Babel to your user page. --Anatoli(обсудить/вклад)05:41, 30 January 2014 (UTC)Reply
Yes, the Kannada one is good, no issues. Thank you for all the help so far. However, as I tried to make a Tamil one Module:ta-translit, there were complications. There are 3 digraphs, {ஃப-f, ஃஜ-z, ஃஸ-x}, but I am unsure how to deal with them properly. If you could help with that as well? DerekWinters (talk) 02:22, 31 January 2014 (UTC)Reply
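One common way to handle digraphs like these is to match the longest sequences first, so that ஃப is consumed as a unit before ஃ or ப can match on their own. Below is a minimal sketch of that idea in Python, not the Lua of Module:ta-translit. The names DIGRAPHS, SINGLES and map_consonants are made up for illustration, the values given for the single letters (including ḵ for a lone ஃ) are assumptions, and inherent vowels and vowel signs are ignored entirely.

```python
import re

# The three digraphs mentioned above, with the values given in the comment.
DIGRAPHS = {"ஃப": "f", "ஃஜ": "z", "ஃஸ": "x"}

# Assumed values for the letters on their own (illustration only).
SINGLES = {"ப": "p", "ஜ": "j", "ஸ": "s", "ஃ": "ḵ"}

TABLE = {**SINGLES, **DIGRAPHS}

# Sort keys longest-first so the regex alternation tries digraphs
# before their component letters.
PATTERN = re.compile(
    "|".join(re.escape(key) for key in sorted(TABLE, key=len, reverse=True))
)


def map_consonants(text):
    """Replace digraphs and single letters; anything else passes through."""
    return PATTERN.sub(lambda match: TABLE[match.group(0)], text)
```

With this ordering, map_consonants("ஃப") gives "f" rather than "ḵp". The same longest-match-first idea should carry over to a Lua module by running the digraph substitutions before the per-letter ones.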
I made a Malayalam one Module:ml-translit and it seems to be working perfectly. However, the last of the testcases returns an error, even though the transliteration matches the expected perfectly. Either way, I still believe it to be fully functional. DerekWinters (talk) 17:26, 1 February 2014 (UTC)Reply
Well, maybe he's not the nicest. And if you're telling me you won't help any longer, I shall be very sad indeed. But for the Tamil and the Inuktitut and the Cherokee I just completed Module:Cher-translit, I was mainly hoping you could simply put them into use. DerekWinters (talk) 23:13, 3 February 2014 (UTC)Reply
If you're happy with testing, you can do it yourself, you already know where to add translit modules - Module:languages/data2 (for languages with the two letter code). For languages, which share a module, you just need to repeat the same line. Of course, you can ask me questions but my Lua knowledge is limited. --Anatoli(обсудить/вклад)23:37, 3 February 2014 (UTC)Reply
I see. Could you make a list of languages/modules to add, so that I can edit easily, e.g. like this:
You need to be more careful when you add blocks of translations: your attempt to add a translation to the computer entry using the non-existent language code "eml" failed with a big fat "Module error: Module error", which you might have noticed if you had checked your edit. FYI, "eml" is a fake code they made up in order to have one for the Emiliano-Romagnol Wikipedia. It's tempting to crib translations from other Wikipedias, but contributors in smaller Wikipedias have a strong tendency to make things up/guess when they don't know a word in their language for something- even when there's a name for it in the language already. Chuck Entz (talk) 08:34, 11 February 2014 (UTC)Reply
Oh I see, I had been under the impression that Yiddish was generally written in a fully pointed way, but it seems that it can vary and is often only partially pointed. Sorry if it caused any errors. DerekWinters (talk) 17:08, 29 March 2014 (UTC)Reply
I must say that I don't think it is attested in that form. Also, I was not aware of this discussion, so I apologize for my mistake. I can change it back, or would you rather revert the change I made? DerekWinters (talk) 19:52, 15 June 2014 (UTC)Reply
Hi,
Please do not speedy delete entries, especially not 5 from one page without an explanation. If you wish to challenge a word, use WT:RFV (and read the page intro of that page to see what qualifies and what doesn't). Thank you, Renard Migrant (talk) 20:10, 19 August 2014 (UTC)Reply
Also, don't delete anything that's in actual use- even erroneously: we're a descriptive, not a prescriptive dictionary. And don't delete terms in scripts that are used by native speakers because they're not the "right" scripts for their languages. Remember, as well, that we aren't limited to any one time or place: if a script was used briefly, then abandoned, we'll want to have entries in the abandoned script for those terms that were known to be written in that script. You can tag incorrect forms as obsolete, proscribed, nonstandard, etc., and you can explain in usage notes why they shouldn't be used. If, on the other hand, you don't think they were ever used with that spelling/script, that's when you would take it to WT:RFV. Chuck Entz (talk) 02:05, 20 August 2014 (UTC)Reply
Okinawan kana entries linking to kanji entries
I noticed you added some more Okinawan content, thank you for that. One minor change to make going forward, please use {{ryu-def}} to link from Okinawan kana entries to their corresponding kanji spellings. I found that you'd used {{ja-def}}, which links to the corresponding Japanese kanji spelling instead of the Okinawan entry. :) ‑‑ Eiríkr Útlendi │ Tala við mig 07:02, 6 November 2014 (UTC)
No worries, easy fix. :) I also noticed that we don't have very many templates for Okinawan. The Japanese templates' code can probably be copied over and tweaked to create new Okinawan templates where required. ‑‑ Eiríkr Útlendi │ Tala við mig05:51, 7 November 2014 (UTC)Reply
Heya, I'm not fully up on Okinawan, but I do notice that the reading given here matches the mainland on'yomi for 場#Japanese instead. Mainland o shifts to うちなーぐち u, much as the o in okinawa becomes the u in uchinā, so the expected Okinawan on'yomi for 所 would be シュ and for 場 would be ジュー. I checked http://hougen.ajima.jp/hougen.php?q=%E6%89%80 to see what data that might give, and while that list is not exhaustive, it doesn't include any ジョー readings for 所.
FWIW, I also see the listed kun'yomi is tukuru, following the same o > u shift.
Thank you very much for pointing this out, as I had simply run with this. Other sources do point out that 所 can have either シュ or ジュ (in 御所: うんじゅ) for its onyomi reading, although the second one is probably just because of rendaku. However, http://www.jlect.com/entry/350/unju/ notes that 御所 can be うんじょー when topicalized, and perhaps this is what User:Viskonsas saw in some Okinawan text? Should I change them to じゅ?
So if we include any entry for Okinawan うんじょー (or should it be in katakana?), the entry should probably describe this form as a contraction. The じょー part, at any rate, does not appear to be any standard Okinawan reading for 所.
As far as Viskonsas's edits, yes, those should probably be changed to シュ・ジュ as appropriate. They (he? she?) self-describe as ja-1 with no mention of ryu anything.
Re: Okinawan for on'yomi or kun'yomi, I assume that such terms exist in Okinawan, as the phenomenon of both Chinese-derived and native-derived readings for a single character does seem to happen in Okinawan as well, but I don't know what these terms would be. My brief searching so far has also failed to find anything. ‑‑ Eiríkr Útlendi │ Tala við mig06:24, 10 November 2014 (UTC)Reply
This is wonderful. Thanks! I'll make the changes to 所, but I think I'll hold off on うんじゅ and うんじょー for now until I better understand them. Most sources tend to treat Okinawan on the same level as Japanese, using hiragana and kanji for "native" terms and katakana for modern borrowings. I also believe that historically, after the Japanese developed hiragana, it was imported to Okinawa and used there as well. So, overall, I think we should stick to the standard rules we know for Japanese writing for the Ryukyuan languages. DerekWinters (talk) 04:16, 11 November 2014 (UTC)Reply
Bengali transliteration module
Hi,
Are you still interested in Indic languages? Do you think you can work on Module:bn-translit and Module:gu-translit? I will try to address dropping inherent vowels later for Hindi et al, Bengali, Gujarati. Amharic/Tigrinya have a similar problem with dropping vowels. There's no reason we can't make these languages transliterated 100% or nearly 100% automatically, they are much easier than Korean or Arabic. Just need to get some help from Lua gurus. @Dick Laurent, Dijan your help on Bengali transliteration would be much appreciated. --Anatoli T. (обсудить/вклад) 01:05, 6 January 2015 (UTC)
Hi. I'll definitely be able to create a module for Gujarati, but as you noted, the schwa-dropping exists as well in Gujarati (sources say it's different to the schwa-dropping of Hindi, but I've never noticed a difference). Some words will have to be hard-transliterated because Gujarati lacks proper transcription for 2 less-used vowel phonemes (ɛ and ɔ), but that shouldn't be too hard.
Bengali on the other hand is a little more complicated (less transparent) and I never have truly learned the script. I'll make a basic Bengali module, but it most definitely won't be ready to use until someone with expertise makes some changes.
Also, I noticed a defect with the Tamil module. "Plosives are unvoiced if they occur word-initially or doubled. Elsewhere they are voiced." I, too, have noticed this, but I'm unable to code Lua with such skill. DerekWinters (talk) 12:24, 6 January 2015 (UTC)Reply
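The quoted voicing rule can be sketched roughly as follows, in Python rather than Lua. This is only an illustration under assumptions: the function name voice_plosives and the VOICED table are made up here, the voiced value chosen for c in particular is a guess, the input is assumed to be a single word already romanized with unvoiced plosives, and only the rule exactly as quoted is applied (word-initial or doubled stays unvoiced, everything else is voiced).

```python
import unicodedata

# Unvoiced plosive -> assumed voiced counterpart (illustrative values).
VOICED = {"k": "g", "c": "j", "ṭ": "ḍ", "t": "d", "p": "b"}


def voice_plosives(word):
    """Voice plosives except word-initially or when doubled."""
    word = unicodedata.normalize("NFC", word)
    out = []
    for i, ch in enumerate(word):
        if ch not in VOICED:
            out.append(ch)
            continue
        initial = i == 0
        doubled = (i > 0 and word[i - 1] == ch) or (
            i + 1 < len(word) and word[i + 1] == ch
        )
        # Keep the plosive unvoiced only in the two contexts the rule names.
        out.append(ch if initial or doubled else VOICED[ch])
    return "".join(out)
```

For example, a romanized string such as "pakkam" comes back unchanged, while "tampi" becomes "tambi". Loanwords and other exceptions would still need manual transliteration.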
So I've made a Gujarati module, but the testcases show what the main issues are. 1st is the schwa-dropping. 2nd is the uṃ sequence word-finally. It is always to be transliterated ũ, but I am unsure how to code that. 3rd is the issue of ṃ in front of a consonant.
ṃ in front of a velar (k, kh, g, gh) is ṅ. In front of a palatal letter (c, ch, j, jh) it is ñ. In front of a retroflex (ṭ, ṭh, ḍ, ḍh) it is ṇ. In front of a labial (p, ph, b, bh, m) it is m. In front of a dental (t, th, d, dh) and all remaining consonants (y, r, l, v, ḷ, ś, ṣ, s, h) it is simply n. I also don't know how to code this. Also, this last issue I noted is common to all Indic languages except in a few cases where words will have to be hard-transliterated.
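The two ṃ rules above (word-final uṃ, and ṃ before a consonant) could be coded along these lines. This is a hedged sketch in Python, not the actual Module:gu-translit code: the names NASAL_BEFORE and resolve_anusvara are invented for illustration, the input is assumed to be a romanized string that still contains ṃ, and the class of an aspirate is decided by its first letter. A ṃ in any other position is left untouched.

```python
import re
import unicodedata

# First letter of the following consonant -> nasal that replaces ṃ.
NASAL_BEFORE = {}
for letters, nasal in [
    ("kg", "ṅ"),           # velars k, kh, g, gh
    ("cj", "ñ"),           # palatals c, ch, j, jh
    ("ṭḍ", "ṇ"),           # retroflexes ṭ, ṭh, ḍ, ḍh
    ("pbm", "m"),          # labials p, ph, b, bh, m
    ("tdyrlvḷśṣsh", "n"),  # dentals and the remaining listed consonants
]:
    for letter in letters:
        NASAL_BEFORE[letter] = nasal


def resolve_anusvara(text):
    """Rewrite ṃ according to the rules listed above."""
    text = unicodedata.normalize("NFC", text)
    # Word-final uṃ is always ũ.
    text = re.sub(r"uṃ\b", "ũ", text)

    def replace(match):
        following = match.group(1)
        # Unlisted followers (vowels etc.) keep the ṃ as it is.
        return NASAL_BEFORE.get(following, "ṃ") + following

    return re.sub(r"ṃ(\w)", replace, text)
```

For instance, resolve_anusvara("lāṃbuṃ") gives "lāmbũ". Words that do not follow the pattern would still have to be hard-transliterated.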
I don't know what we should do about Bengali transliteration. I can see the merits of sticking with a more scholarly system, given the differences in pronunciation between Indian and Bengali dialects, but it shouldn't be identical to the systems used for other Indic languages. For example, where others use a short vowel "a," in Bengali this vowel is pronounced "o," and of course as you guys mention, that vowel is often dropped. — — 16:51, 6 January 2015 (UTC)Reply
Thanks very much. I will address it in due course. Ric, we can choose one system and stick to it. I think you meant "ô", not "o" (the short vowel). There is no 100% consistency in transliterating dropped vowels, so if we come up with a working logic, we could use it for many languages like Hindi, Gujarati, Bengali and (surprisingly) Amharic/Tigrinya (short vowel "ə"), e.g. ዩክሬን (yukren) should be "yukren". Amharic et al (Module:Ethi-translit) also have a gemination issue, which is not expressed graphically. Native speakers don't have any problem with it and it seems that some transliterations ignore it altogether. --Anatoli T. (обсудить/вклад) 22:33, 6 January 2015 (UTC)
Do you think you can write a short paragraph describing the rules when, e.g. in Hindi, the inherent vowel "a" is dropped: e.g. ("C" is any consonant, "V" is any vowel, apart from "a") CaCaCa = CaCaC, CeCaCāCaCī = CeCCāCCī (devanāgarī = devnāgrī), etc.? Does it matter, which consonants are involved, e.g. in consonant clusters? CaCCCa = CaCCC or CaCCCa? --Anatoli T.(обсудить/вклад)23:27, 7 January 2015 (UTC)Reply
The idea behind vowel dropping lies with syllabification. A schwa at the end of a syllable is always dropped.
करन - क|रन (ka|ran) (the 'na' becomes 'n')
करना - कर|ना (kar|nā) (the 'ra' becomes 'r')
One major exception is if the schwa is part of a consonant cluster involving a "special" consonant (y, r, l, v, h, ṇ, n, and m) word-finally. The schwa here is not dropped. The word syllabifies by the first member of the cluster becoming part of the previous syllable, and the rest of the cluster becoming its own syllable.
वस्त्र - वस्|त्र (vas|tra) (the 'tra' remains because 'r' is a special consonant)
भस्म - भस्|म (bhas|ma) (the 'ma' remains because 'm' is a special consonant)
Another is if the schwa is part of any consonant cluster (or gemination) word-medially. The schwa here is not dropped. Syllabification happens just as above.
अस्पताल - अस्|प|ताल (as|pa|tāl) (the 'pa' remains because it is part of the cluster)
उत्तम - उत्|तम (ut|tam) (the 'ta' remains because part of cluster) (the 'ma' becomes 'm' because end of syllable)
I laid out a list of some CVC formations.
S - Special Consonants (y, r, l, v, h, ṇ, n, and m)
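The rules and examples above can be turned into a rough sketch like the following (Python, not the module's Lua). It works on a romanization in which every inherent "a" is still written out. The names drop_schwas, VOWELS, SPECIAL and TOKEN are made up for illustration. The medial condition used here (an "a" is dropped only when it has exactly one consonant and then a vowel on each side, scanning right to left) is one reading of the examples, not a rule stated in this thread, so treat it as an assumption.

```python
import re
import unicodedata

VOWELS = {"a", "ā", "i", "ī", "u", "ū", "e", "o", "ai", "au", "ṛ"}
SPECIAL = {"y", "r", "l", "v", "h", "ṇ", "n", "m"}

# Aspirates and diphthongs count as one unit; everything else is one character.
TOKEN = re.compile(r"[kgcjṭḍtdpb]h|ai|au|.")


def drop_schwas(full):
    """Drop inherent 'a' from a fully vocalized romanization."""
    tokens = TOKEN.findall(unicodedata.normalize("NFC", full))

    def is_vowel(i):
        return 0 <= i < len(tokens) and tokens[i] in VOWELS

    def is_consonant(i):
        return 0 <= i < len(tokens) and tokens[i] not in VOWELS

    # Word-final schwa: dropped unless it closes a cluster ending in a
    # special consonant (vastra, bhasma) or is the word's only vowel.
    n = len(tokens)
    if n >= 2 and tokens[-1] == "a" and any(t in VOWELS for t in tokens[:-1]):
        in_cluster = is_consonant(n - 2) and is_consonant(n - 3)
        if not (in_cluster and tokens[n - 2] in SPECIAL):
            tokens.pop()

    # Medial schwa: scan right to left, dropping 'a' only in V C _ C V,
    # so a schwa next to a cluster (aspatāl, uttam) is kept.
    i = len(tokens) - 2
    while i >= 2:
        if (
            tokens[i] == "a"
            and is_consonant(i - 1)
            and is_vowel(i - 2)
            and is_consonant(i + 1)
            and is_vowel(i + 2)
        ):
            del tokens[i]
        i -= 1
    return "".join(tokens)
```

Run on the examples above, this gives karana > karan, karanā > karnā, aspatāla > aspatāl and uttama > uttam, and leaves vastra and bhasma alone. Scanning from the right matters: going left to right would drop the wrong vowel in words with two candidate schwas in a row.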
I am super dooper impressed. Very, very, very impressed. The transliteration aṁgrez (actually angrez) is correct for both. अंगरेज़ and अंग्रेज़ both get split the same way: अंग्‧रेज़ / अंग‧रेज़. DerekWinters (talk) 11:13, 9 January 2015 (UTC)Reply
I believe this system will work for Gujarati, Marathi, Sindhi, Kutchi, Rajasthani, Marwari, Bhojpuri, Konkani, Saurashtra. Beyond that, I'm not sure if any other languages would work. Gujarati and Kutchi share the same script (Gujarati). Saurashtra has its own script (Saurashtra). All the others use Devanagari. Do you know what we should do about Bengali, Oriya, etc.? @Atitarev DerekWinters (talk) 07:24, 10 January 2015 (UTC)
Yes, you're right. I don't feel comfortable with Lua, though. Are you able to make a basic Bengali module based on WT:BN TR, perhaps? Then we can ask Wyang to do his magic tricks, also for Gujarati and Oriya by copying the logic? --Anatoli T.(обсудить/вклад)23:59, 11 January 2015 (UTC)Reply
Sorry for being a lazy poo. I made some edits to the bn-translit module, but after visiting the wiki article and other sources on the Bengali alphabet, I realized why I'm so terrified of it. It's a lot, and we need an expert to help us here. DerekWinters (talk) 08:33, 13 January 2015 (UTC)Reply
Hey. Could you check that entry? I'm pretty sure that the transliteration is wrong and I'm not sure if the definition is correct. --Dijan (talk) 02:50, 3 February 2015 (UTC)Reply
Oh god. Do forgive me. I keep making these copy-paste errors. I absolutely hate writing an entry from scratch so I copy-paste and sometimes I forget key things. DerekWinters (talk) 08:15, 8 February 2015 (UTC)Reply
No worries, I've done that too, with similar erroneous results sometimes. :) FWIW, you might find the edittools JavaScript useful. This allows you to define your own one-click insertion items. I've found this extremely helpful over the years. Cheers, ‑‑ Eiríkr Útlendi │ Tala við mig 08:56, 8 February 2015 (UTC)
Citability
Hi. Some of the recent English terms you've added don't seem to be citable per WT:ATTEST, and I've sent a couple to WT:RFV. If these can in fact be cited, can you please help do so, and if not, can you please refrain from adding such entries? Thank you! —Μετάknowledge discuss/deeds 02:51, 11 May 2015 (UTC)
@Type56op9 Hey, no need to be so pessimistic. It's actually a rather solid entry. Nothing at all wrong with it. If you wish to make it stronger however, you could add the categories it would fall under (just check the star to see which ones), an etymology, a pronunciation, and even sample sentences or citations. Keep up the good work. DerekWinters (talk) 19:34, 2 June 2015 (UTC)Reply
Thankless tasks mostly: Being the guy who cleans up after vandals. Deleting pages, protecting pages, changing the Main Page and other protected pages, blocking users. --Type56op9 (talk) 08:01, 3 June 2015 (UTC)
I mean, it is a true honour. You will be able to do lots of cool things! You can see the content of deleted pages, many of which include personal information of our users; you can get rid of users you disagree with, your opinion will be worth more in our forums, and you'll have loads of fun! --Type56op9 (talk) 08:03, 3 June 2015 (UTC)Reply
@Chuck Entz I seem to have messed up there. I was certain that I'd seen a category somewhere that handled missing gender, but I couldn't find the exact name for it, so I modeled it after the other category that was already set up as Category:Gujarati terms needing transliteration. I've gone ahead and removed the cat2. I hope that's fixed the problems at hand. Sorry for any inconveniences. DerekWinters (talk) 16:42, 13 June 2015 (UTC)Reply
Telugu module
Thanks, looks good. Yes, there are a few special, classical and rare characters: ౘ(ĉa), ౙ(za), ఌ(l̥), ౡ(l̥̄), ఽ (w:avagraha, or apostrophe ’, referring to the Sanskrit letter ऽ(ʼ), u+0c3d), ౸(0⁄4) (fraction sign 0⁄4, u+0c78), ౹(¼) (fraction sign 1⁄4, u+0c79), ౺(2⁄4) (fraction sign 2⁄4, u+0c7a), ౻(¾) (fraction sign 3⁄4, u+0c7b), ౦(0⁄16) (fraction sign 0⁄16, u+0c66), ౼(1⁄16) (fraction sign 1⁄16, u+0c7c), ౽(2⁄16) (fraction sign 2⁄16, u+0c7d), ౾(3⁄16) (fraction sign 3⁄16, u+0c7e), ౿ (tuumu sign, an antiquated measuring unit for grains, u+0c7f)). —Stephen(Talk)04:30, 16 July 2015 (UTC)Reply
@Stephen G. Brown I added the avagraha and the extra numerals too, thanks. I think that almost everything from the Unicode chart can be transliterated now. Also, would you happen to know the way (if any) that Telugu accommodates Arabic/Persian/Urdu and English loanwords that use z, f, x, q, ɣ, ʒ, etc.? I was thinking that Telugu might use some variation of the nukta like Hindi does, or perhaps even something similar to the Tamil āytam. DerekWinters (talk) 16:00, 17 July 2015 (UTC)
Most sources on Proto-Kartvelian use the Latin script for reconstructions. Please do not change them to the Georgian script. --Vahag (talk) 19:31, 4 October 2015 (UTC)Reply
Just a reminder to be careful when editing
Hello. Book Pahlavi is not in Unicode. You should not replace the Romanizations with Inscriptional Pahlavi, a different script. Also, no Manichaean fonts exist, even though it is now in Unicode. It is the consensus among the Middle Iranian editors on Wiktionary to use Romanizations for lemmas of Middle Iranian languages, except for the words attested in Inscriptional Pahlavi and Inscriptional Parthian. See also Wiktionary:Votes/pl-2011-09/Romanization of languages in ancient scripts 2. --Vahag (talk) 10:10, 5 December 2015 (UTC)Reply
I don't know about any specific cases, but, in general, it's entirely possible for terms from a literary language to be both inherited and borrowed. When inherited, it stays in continuous use as the parent language evolves into the daughter language, reflecting any sound changes that happen along the way. When borrowed, someone reads it (or, in this case, perhaps hears it recited) centuries later, and adopts it into their language directly. Think about all of the religious terminology in Hindi that's pure Sanskrit. Those terms may contain basic vocabulary that has made its way separately down to Hindi by inheritance, but the precise combination that has religious meaning is intentionally kept as close to the Sanskrit original as possible.
When you're doing etymology, you have to look at the history of the individual terms. The whole language may have been indirectly inherited from Sanskrit (actually Old Indic, of which Sanskrit is a very artificial subset), but individual terms may have been borrowed directly from Sanskrit, or indirectly via another language that got it from Sanskrit. Chuck Entz (talk) 02:16, 20 January 2016 (UTC)Reply
Exactly what Chuck said. For example, the Hindi word काम is inherited from कर्म -> कम्म (through assimilation) -> काम (simplification and compensatory lengthening). Thus I would say that काम is inherited, but कर्म is borrowed. These words have been borrowed as opposed to having been inherited. However, it gets really murky along the lines, as every new stage of Indic languages tried to sound like its former stage in an effort to sound erudite and intelligent. Thus, the Prakrits, while originally celebrating their separateness from Sanskrit, began borrowing heavily from it in learned speech. The Apabramshas, again, celebrated their distinctness from their Prakrit forebears, but then began borrowing lexically and morphologically from them and lexically from Sanskrit. And the same has happened with the new Indo-Aryan languages. A good example of different forms of the same language are Shadhu-bhasha and Cholitobhasha for Bangla. DerekWinters (talk) 18:32, 20 January 2016 (UTC)Reply
That makes complete sense! कम्म is actually attested as Pali kamma, so that was a very good example. And the digloss in Indic languages is quite common - in Hindi there's शुद्ध हिंदी(śuddh hindī) and the spoken version हिंदुस्तानी(hindustānī). According to my Odia textbook, the same applies for that language. Thank you for the explanation! —Aryamanarora(मुझसे बात करो)18:43, 7 February 2016 (UTC)Reply
I'm glad it helped :). It's fairly common in all modern (and most historical) Indian languages to have very high levels of Sanskritic loanwords. And diglossia is when two different languages are actually spoken concurrently by the population. Hindi doesn't really have that though. Shuddh Hindi and Hindustani differ only in vocabulary, not morphology. Gujarati also isn't diglossic (except for maybe Hindi and some English in today's Gujarat?), but you can even find Old Gujarati forms borrowed into the language for use in bhajans and kirtans, etc. to give an old or rustic feel to them. DerekWinters (talk) 21:05, 7 February 2016 (UTC)
Oh, I seem to have misunderstood the meaning of diglossia - I thought it meant when there were two registers of the same language, one with higher prestige. Again, thanks for the knowledge! —Aryamanarora(मुझसे बात करो)23:42, 17 February 2016 (UTC)Reply
I don't think you're so far off. In diglossic situations, there are two different dialects or languages, with different prestige. Usually, the languages are fairly closely related; it seems a bit odd to me to refer to English/Gujarati as diglossia, but technically I think it's correct. I'm not sure whether two registers that differ largely or only in vocabulary would count as diglossia, although Wikipedia does indicate both Hindi and Urdu as diglossic. Among Indian languages, Tamil definitely has diglossia. Benwing2 (talk) 01:54, 18 February 2016 (UTC)
BTW I agree with Chuck and Derek that you can have borrowings from an earlier form of the language. The Romance languages, for example, have tons of borrowings from Latin. Benwing2 (talk) 01:58, 18 February 2016 (UTC)Reply
Diglossia is with another language or a higher form of the same language. It's just where there are two languages with differing levels of prestige and usage within a community. So Shadhubhasha and Cholitobhasha are (were) the diglossic forms of Bengali, but I would also say that high levels of Arabic proficiency in Nigeria, within a Fulani community, would be classified as diglossia. DerekWinters (talk) 02:11, 18 February 2016 (UTC)
Ngoko, Krama, & Krama Inggil
Hey I'm just curious about the title of the source where you got the Tagalog words like balnidinagipik and balngawsukatan. Thanks.
@Mar vin kaiser Hello. Sorry for the delay, just got back from vacation. Oh my. Those were so long ago, to be honest, my zeal was such at that time, I may have simply seen it somewhere online and found that to be worthwhile enough to add it. If you don't see any valid reason to keep them, feel free to remove them. DerekWinters (talk) 04:24, 20 July 2016 (UTC)Reply
balngawsukatan was probably a scanning or typing error. It probably should be balangaw sukatan (literally, rainbow metrics). The entry balnidinagipik at least needs another "a", balani dinagipik, and I think "dinagipik" is also not quite correct.
Your terms seem to come from here. The correct spellings should be in a scientific dictionary named Maugnaying Talasalitaang Pang-agham Ingles-Pilipino, by Gonsalo del Rosario. However, it is out of print and no copies are available for sale. It can be found in many major libraries. Someone has photocopied the dictionary (jpeg), a page at a time, and it is available for free download here. It is awkward to use, since it consists of something like 300 individual jpegs (not searchable). —Stephen(Talk)10:20, 11 September 2016 (UTC)Reply
Hey, I've noticed you made Old Uyghurᠨᠤᠮ(nom) a year or so ago. I'm guessing it's not the same encoding as Mongol script, judging by the fact that it's a different page from Mongolianᠨᠣᠮ(nom), so how did you get O. Uyghur characters? Wikipedia only has images of the letters. Crom daba (talk) 22:25, 9 September 2016 (UTC)Reply
@Crom daba From my memory, I had seen Old Uyghurᠨᠤᠮ(nom) as a redlink on here, with its pronunciation as 'nom'. I can, very slowly, read some Mongolian, and cross-referenced it with some source, and just added it from the redlink. Doing some research now, it seems Unicode has both an "o" and an "u", which both look the exact same (actually though). I'm not sure what to do. DerekWinters (talk) 01:16, 10 September 2016 (UTC)Reply
A quick and easy reference is Westendorf, Koptisches Handwörterbuch (German). More extensive are Černý, Coptic Etymological Dictionary (English), and Vycichl, Kasser, Dictionnaire étymologique de la langue copte (French). Probably present at larger libraries. Lingo Bingo Dingo (talk) 13:47, 8 October 2016 (UTC)Reply
Share your experience and feedback as a Wikimedian in this global survey
Hello! The Wikimedia Foundation is asking for your feedback in a survey. We want to know how well we are supporting your work on and off wiki, and how we can change or improve things in the future. The opinions you share will directly affect the current and future work of the Wikimedia Foundation. You have been randomly selected to take this survey as we would like to hear from your Wikimedia community. To say thank you for your time, we are giving away 20 Wikimedia T-shirts to randomly selected people who take the survey. The survey is available in various languages and will take between 20 and 40 minutes.
You can find more information about this project. This survey is hosted by a third-party service and governed by this privacy statement. Please visit our frequently asked questions page to find more information about this survey. If you need additional help, or if you wish to opt-out of future communications about this survey, send an email to [email protected].
^ This survey is primarily meant to get feedback on the Wikimedia Foundation's current work, not long-term strategy.
^ Legal stuff: No purchase necessary. Must be the age of majority to participate. Sponsored by the Wikimedia Foundation located at 149 New Montgomery, San Francisco, CA, USA, 94105. Ends January 31, 2017. Void where prohibited. Click here for contest rules.
Assamese
Just the same as Bengali, Assamese has no real predictability with schwa-dropping, and words must be learned individually. Otherwise, Assamese is not bad for the standard transliteration. I can create the transliteration page. As an aside, I have noticed many many words incorrectly transliterated (in regards to the schwa-dropping) for Gujarati, but honestly I don't know how feasible it would be to try and fix it. DerekWinters (talk) 02:25, 13 March 2017 (UTC)Reply
If a shwa rule represents a common behaviour, it can and should be implemented. The real irregular readings are not that numerous. The test cases should only represent what the module must do, not the exceptions. There are also cases where shwa is light or optional. We can just decide what the rule should be and leave the phonetics to the pronunciation sections. A light shwa can follow the normal rules for transliteration purposes. In short, humans don't find the shwa-dropping rules overly complicated. It's just a matter of combining that knowledge with the programming knowledge. As for Gujarati or Bengali - the modules are just far from complete. --Anatoli T. (обсудить/вклад) 02:43, 13 March 2017 (UTC)
As an example, Hindi डायनासोर (ḍāynāsor) and Gujarati ડાયનાસોર (ḍāyanāsor) should be transliterated as "ḍāynāsor", not "ḍāyanāsor". The module drops the final "a", which is good, but not the one between "y" and "n". The rule for dropping the vowel there is straightforward but it hasn't been implemented. Also, the Bengali module drops the final shwa in মানচিত্র (mancitro) but it shouldn't; the rule is simple as well (for humans) but the module doesn't know about it. --Anatoli T. (обсудить/вклад) 03:07, 13 March 2017 (UTC)
For the ones where the inherent vowel is indeed a schwa, you're right it really isn't an issue, and yeah the module for Gujarati really isn't complete yet. But for Bengali, Assamese, Oriya, etc. where the inherent vowel is like /ɔ/ or something related, it very much matters. And I do have to stress that for Bengali it's not a matter of exceptions when it comes to word-final schwa-dropping, it's quite unpredictable. E.g. তাল (tal), ডাল (Dal), ভাল (bhalo), গাল (galo), লাল (lal); হর (horo), নর (noro), ঘর (ghor), বর (bor), and more and more. Word-medial dropping is much more regular though. I haven't learned enough about Assamese yet to make any such claim. DerekWinters (talk) 03:19, 13 March 2017 (UTC)Reply
If there is a mess with shwa-dropping in Bengali, then the module shouldn't drop "ô" by default. The module might as well show the inherent vowel, and a method to drop it when required could be used for phonetic respellings in the pronunciation section. --Anatoli T. (обсудить/вклад) 03:48, 13 March 2017 (UTC)
I mean, in case the final "ô" is definitely unpredictable, transliterate both নর(nor) and ঘর(ghor) with the inherent "ô", i.e. "nôrô" and "ghôrô", but mark the entry ঘর(ghor) as actually pronounced ঘর্(ghor) "ghôr". "ঘর্" would be a phonetic respelling, employing the virama or হসন্ত(hośonto) symbol ্ to suppress inherent vowels. --Anatoli T. (обсудить/вклад) 05:07, 13 March 2017 (UTC)
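A rough sketch of that proposal in Python, under stated assumptions: every consonant keeps its inherent "ô" unless a vowel sign or a virama follows, so the silent vowel is removed only through a respelling that writes the virama explicitly. The tiny letter table and the function name are hypothetical and cover only the examples in this thread; this is not the existing Bengali module.

```python
# Minimal letter table (an assumption for illustration, far from complete).
CONSONANTS = {
    "\u09a8": "n",   # ন
    "\u09b0": "r",   # র
    "\u0998": "gh",  # ঘ
}
VOWEL_SIGNS = {
    "\u09be": "a",   # া
}
VIRAMA = "\u09cd"    # ্ (hasanta), suppresses the inherent vowel
INHERENT = "\u00f4"  # ô


def translit_keep_inherent(text):
    """Transliterate, always writing the inherent vowel unless it is suppressed."""
    out = []
    for i, ch in enumerate(text):
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            # No vowel sign and no virama after the consonant: keep the inherent vowel.
            if nxt != VIRAMA and nxt not in VOWEL_SIGNS:
                out.append(INHERENT)
        elif ch in VOWEL_SIGNS:
            out.append(VOWEL_SIGNS[ch])
        elif ch == VIRAMA:
            continue  # the virama itself is not written out
        else:
            out.append(ch)  # pass unknown characters through unchanged
    return "".join(out)


print(translit_keep_inherent("\u09a8\u09b0"))        # headword নর
print(translit_keep_inherent("\u0998\u09b0"))        # headword ঘর
print(translit_keep_inherent("\u0998\u09b0\u09cd"))  # respelling ঘর্
```

The headword spellings come out with both vowels, while the respelling with the virama loses the final one, which is the split between transliteration and pronunciation suggested above.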
Well, one still needs to assess, as User:Wyang said, the percentage of unpredictable shwa-droppings. Which is more common: cases like নর or cases like ঘর? If cases like ঘর are much more typical, then they should still be used as the default behaviour. For new modules, e.g. Assamese or Oriya, you can probably ignore the shwa-dropping rules altogether until they are understood and described. Some online dictionaries display all inherent vowels even if they are silent. --Anatoli T. (обсудить/вклад) 05:50, 14 March 2017 (UTC)
I don't know how to get an accurate percentage, but I would wager quite a bit that it's a significant part of the vocab. Several verb conjugation endings have undropped schwas in Bengali, and a whole host of random words have it, like কেন (keno), মত (moto), বর্ষ (borṣo), etc. DerekWinters (talk) 04:14, 15 March 2017 (UTC)Reply
@Metaknowledge An issue I am coming up against is the plurality of letters that sound the same now. Should I transliterate them uniformly (as I see on Wikipedia and elsewhere too), or should I transliterate them according to our IAST standards? For example, all the Ts and Ds are alveolar and the whole sibilant set is now an unvoiced velar fricative. DerekWinters (talk) 03:38, 13 March 2017 (UTC)Reply
Really, it's up to editors for each language to decide. Different languages have different standards; there's no need to be bijective, but if that's the standard, we can cleave to it. —Μετάknowledgediscuss/deeds04:33, 13 March 2017 (UTC)Reply
I don't know Assamese so I'm only speaking from the experience from doing the Hindi and Nepali romanisation modules. The approach will have to depend on how irregular the schwa dropping and other unpredictabilities are. If more than 20% of all Assamese words in a comprehensive dictionary are unpredictable, a sensible approach may be to use a phonetic respelling in the main entry (relying on a pronunciation module) to cover the romanisation, and have all Assamese links refer to the Assamese articles themselves to extract the respelling, instead of applying an automatic algorithm which relies on external transcription assistance any time a word is romanised anyway. This is the approach used by the Thai-editing community here. Many Indic languages, for which a close-to-perfect transliteration algorithm is impossible, may benefit this way. Wyang (talk) 06:47, 13 March 2017 (UTC)Reply
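The decision described above can be sketched as a small check in Python. Everything here is an assumption for illustration: the 20% threshold is the figure from the comment, the toy rule simply drops a final "o", and the five word pairs are the romanised Bengali examples quoted earlier in this thread, which are nowhere near a representative sample.

```python
def unpredictable_share(lexicon, transliterate):
    """Fraction of words whose rule-based output differs from the attested form."""
    mismatches = sum(1 for spelled, attested in lexicon
                     if transliterate(spelled) != attested)
    return mismatches / len(lexicon)


def choose_strategy(share, threshold=0.20):
    """Pick per-entry respelling when too many words defeat the automatic rule."""
    return "respelling" if share > threshold else "automatic"


def toy_rule(spelled):
    """Toy rule (an assumption): always drop a word-final inherent vowel."""
    return spelled[:-1] if spelled.endswith("o") else spelled


# (fully vowelled spelling, attested romanisation), taken from the thread's examples
sample = [
    ("talo", "tal"),
    ("dalo", "dal"),
    ("bhalo", "bhalo"),
    ("galo", "galo"),
    ("lalo", "lal"),
]

share = unpredictable_share(sample, toy_rule)
print(share, choose_strategy(share))
```

On a real dictionary the same count would show whether an automatic module is worth maintaining or whether links should pull the respelling from the entry itself.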
I don't know Assamese or Bengali either. From what I know, like most Northern Indic languages, they also feature shwa-dropping. Dropping the inherent vowel shwa (transliterated as either "a" or "ô") must be common in Bengali as well in the final position after a consonant that follows a vowel. There are apparently exceptions; see DerekWinters's post above. If they are not typical, as I hope, then there can still be a rule to drop shwas in such positions.
For Bengali definitely it's safer to just have the phonetic respelling because there are a lot of exceptions. After I do more research on Assamese I'll come back to this. DerekWinters (talk) 23:29, 13 March 2017 (UTC)Reply
I think it would be a good idea to include the light schwa that Anatoli mentioned above. I know Wikipedia uses ǎ, but I think maybe using ə wouldn't be a bad idea. DerekWinters (talk) 23:40, 13 March 2017 (UTC)Reply
The recent Oxford Hindi-English dictionary has the info on the light shwa. This is the best Hindi dictionary for foreigners so far (It has genders and usexes as well). --Anatoli T.(обсудить/вклад)05:50, 14 March 2017 (UTC)Reply
@Aryamanarora I only know three, Gujarati, Hindi, and as of late Bengali. Oh, and a little Cochin Konkani. Bengali and Cochin Konkani I learned from friends, and then supplemented my information from online. But I love to dabble in the others, picking up little bits of Marathi, Punjabi, Assamese, etc. (as you can see my focus is very strongly on North Indian languages). I'll do that from online resources, or, as is often the case, from music in those languages. That is especially how I got into Assamese recently. However, I do love learning how to write all and any manner of script out there, so that helps a lot with my Indic focus as well. But other than that I really don't know that many, nothing compared to those who live in India. DerekWinters (talk) 00:52, 15 April 2017 (UTC)Reply
Oh sometimes I wish I still lived in India. My town only has Gujarati immigrants who I can't practice in Hindi with... And you'd be surprised at how many Hindi speakers don't know any other languages (besides English). I didn't know any others until I came on Wiktionary and got interested in them. Anyway, it's cool to see Indian languages becoming more important outside of India. —Aryamanarora(मुझसे बात करो)22:22, 15 April 2017 (UTC)Reply
True we gujjus do move everywhere. And you can probably practice some Hindi with them, although from experience some really suck at Hindi. And yeah I've noticed lol. DerekWinters (talk) 01:17, 16 April 2017 (UTC)Reply
Caribbean Hindustani
I know the least if anything about Caribbean Hindustani. Although, just googling Guyanese Hindi gives me a wealth of stuff if you want to look into that. DerekWinters (talk) 20:38, 20 May 2017 (UTC)Reply
Latest comment: 7 years ago, 12 comments, 2 people in discussion
So, I found this comprehensive grammar but it seems to treat Apabhramsa as a continuum instead of separate languages. The author divides it into 4 dialects, Northern (attested only in 1 work), Southern, Eastern, and Western. Which one of these is Gurjar and which one is Sauraseni do you think? —Aryamanarora(मुझसे बात करो)03:25, 28 May 2017 (UTC)Reply
@Aryamanarora: He also mentions a lot of confusions, so I wouldn't take anything all that seriously in terms of classification. I think our best course is to work up for now, add Old Gujarati, Hindi, Marathi, Punjabi, Oriya, Bengali, etc. etc. Then we can properly pick at the next stage. That or complete the Prakrits on here first and then move down. Those are the best classified of the sources remaining and that should help a lot. DerekWinters (talk) 03:51, 28 May 2017 (UTC)Reply
@DerekWinters: Sometimes it's disappointing how poorly recorded New Indo-Aryan languages are on Wiktionary. It seems though that we're getting a lot more editors as of late. I'll start bulking up Punjabi soon. —Aryamanarora(मुझसे बात करो)04:00, 28 May 2017 (UTC)Reply
@Aryamanarora "By 500 AD these Middle Indo-Aryan dialects had been developed many local features and lost many inflectional morphemes. Literary form of these dialects is known as Apabhramsha (ਅਪਭ੍ਰੰਸ਼,اپبھرنش). Principle Apabhramshas are Takka Apabhramsha in Central Punjab and Vrachada Apabhramsha in Southern Punjab. By 1200 AD these Apabhramshas or 'corrupt dialects' had few inflectional morphemes left. During Middle Ages Takka Apabhramsha developed into Lahori dialect and Vrachada (व्राचड/व्राचड़) Apabhramsha developed into Multani dialect." I found this on the "History of the Punjabi language" wiki page. I have never even heard of those Apabhramshas so maybe you know something. DerekWinters (talk) 06:20, 28 May 2017 (UTC)Reply
"The immediate predecessor of Sindhi was an ApabhramshaPrakrit named Vrachada. Arab and Persian travellers, specifically Abu-Rayhan Biruni in his book 'Tahqiq ma lil-Hind', had declared that even before the advent of Islam in Sindh (711 A.D.), the language was prevalent in the region. It was not only widely spoken but written in three different scripts – Ardhanagari, Saindhu and Malwari. Biruni has described many Sindhi words leading to the conclusion that the Sindhi language was widely spoken and rich in vocabulary in his time." This is from an older version of the Sindhi page on wikipedia, but due to lack of sources was removed.
Latest comment: 7 years ago, 10 comments, 3 people in discussion
This is a 1925 Gujarati-English dictionary (1600+ pages) published by Baroda State. It is now in the public domain, so it can be added to Wiktionary. It was scanned by the Digital Library of India and then mirrored on the Internet Archive.
Many members of Gujarati Wikipedia are eager to add to the Gujarati Wiktionary and make it large, but we are waiting for the launch of a Wiktionary that supports structured data, like Wikidata. After that, we are also thinking of putting online the Bhagavadgomandal (ભગવદ્ગોમંડળ), the largest encyclopedic dictionary of the Gujarati language (more than 9,000 large pages; 281,377 words; in the public domain since 2016). Along with this we will also add modern words. At present I am more active on Wikidata and Wikipedia, where a lot of work is pending too. Even so, I will keep adding words here from time to time. There is plenty of slang (in the sense of swear words/profanity, or in the sense of everyday usage?) and plenty of modern words worth adding, so we will add those too. There is a great difference between the Gujarati spoken twenty years ago and the Gujarati spoken now. Nowadays Gujarati is hardly ever spoken without English words, and the pure Gujarati equivalents are found only in dictionaries. For example, the pure Gujarati for television/TV is દુરદર્શન, but nobody uses it. Besides this, I myself also need to learn Wiktionary and, along with that, get a grip on Gujarati grammar/lexicography. We will see and we will do it, but slowly. Thanks. --Nizil Shah (talk) 19:17, 26 July 2017 (UTC)
Thank you very much again. I know that hardly anyone speaks pure Gujarati these days, but even so, rather than adding English words, we would gain much more by adding distinctively Gujarati words (such as rural speech, swear words, profanity, etc.). And yes, we don't need Bhadrambhadra-style words like દુરદર્શન, દુરવાણી, અગ્નિરથ, but words that are really used. And as for slang, whatever you think appropriate. DerekWinters (talk) 21:33, 26 July 2017 (UTC)
Is there any gadget or tool for adding new words without getting into the headache of understanding templates? If there is, please let me know. Is there some tool with which the words of the dictionary mentioned above could be added easily and quickly? --Nizil Shah (talk) 14:36, 1 August 2017 (UTC)
Latest comment: 7 years ago, 7 comments, 2 people in discussion
So far I'm aware of Delhi, Mumbai, Hyderabad, and Indore-specific Hindi. Does Gujarat have any special Hindi dialects? Also, what exactly is Mumbai Gujarati like? —Aryaman(मुझसे बात करो)20:22, 13 August 2017 (UTC)Reply
I am not sure whether it is considered a dialect or not, but Gujaratis do speak Hindi a bit differently. They tend to add Gujarati words to fill out sentences when they do not know a specific Hindi word. Apart from that, some common Gujarati idioms and informal words are also used frequently. Gujarati Hindi is heavily influenced by Hindi TV shows and films. Mumbai Gujarati is different in two ways. First, almost all second-generation Gujaratis living in Bombay (Mumbai as they call it now) are educated in English-medium schools, surrounded by friends who speak Hindi and other regional languages. So their Gujarati is mostly limited, and they have a hard time understanding Gujarati words that are not in daily use. Their Gujarati is sometimes flavoured with Hindi and English words to fill in the gaps where they do not know the Gujarati words. Apart from that, their tone differs from family to family, as their families have roots in different regions of Gujarat. For example, a Kutchi Gujarati family may speak Kutchi at home, and their child may speak Gujarati with a Kutchi flavour. I do not know whether these aspects of the Gujarati language have been studied and documented by scholars. --Nizil Shah (talk) 05:38, 24 August 2017 (UTC)
Aryaman, thank you for creating the Swadesh list. I will fill it in when I am free. And sorry, DerekWinters, for using your talk page as a discussion page. Do we have a discussion page for Gujarati/Indian-language-related matters? Ideally we should talk there so that interested parties can join in. -Nizil Shah (talk) 18:40, 24 August 2017 (UTC)
We do have some active Gujarati Wikipedians, but none of them is active on either the Gujarati or the English Wiktionary. I only recently started editing here, though I have been active on the Wikipedias and Wikidata for a very long time. Gujarati Wikipedians too want a large Gujarati Wiktionary and had plans to add the largest Gujarati dictionary (9,000 pages), which entered the public domain in 2016, but postponed the plan because of the upcoming Wikidata-style structured dictionary. We may have more active Gujarati editors once it becomes operational (possibly with a few bots). -Nizil Shah (talk) 18:52, 24 August 2017 (UTC)
Gujarati Swadesh list
I can read it pretty well now, I'm trying to get to gu-2, then I'll probably slow down... it's amazing how easy it is to learn just by knowing Hindi. Good to see you back (if only for a while)! Be sure to look through WT:BP. —Aryaman(मुझसे बात करो)20:46, 24 September 2017 (UTC)Reply
@Aryamanarora. If you know Hindi, it is very easy to learn to read Gujarati. Both languages follow the same alphabet order, like Ka, Kha, Ga, etc. Only the script is a bit different. Just memorise the Hindi equivalents of the Gujarati letters and you are good to read. @DerekWinters, "બહુ સમયથી તમને જોયો નથી" has a minor issue. With "તમને", "જોયા" is used instead of "જોયો": "બહુ સમયથી તમને જોયા નથી". To someone close, friends and so on, one says "બહુ સમયથી તને જોયો નથી", where "જોયો" goes with "તને". These days, though, in everyday conversation people say "બહુ ટાઈમથી તને જોયો નથી". ટાઈમ is now heard more often than સમય. :) --Nizil Shah (talk) 07:14, 4 October 2017 (UTC)
તને જોયો નથી is the casual way of speaking, while તમને જોયા નથી is the polite way. It is somewhat difficult to explain how to decide and use gender in Gujarati, but I will try to explain later (after some research/reading). --Nizil Shah (talk) 11:52, 5 October 2017 (UTC)
Latest comment: 7 years ago, 28 comments, 4 people in discussion
I've unmerged Braj (and Haryanvi, but there was only one Haryanvi lemma) from Hindi, see CAT:Braj lemmas. I think the best way to deal with the Hindi "dialects" is something like {{zh-dial}}, which could also have rows for Shuddh Hindi and Colloquial Hindustani, as well as Persianized Urdu. That kind of system could also be useful in Punjabi (if you don't know, Punjabi has three standard dialects and a bunch of lesser ones), and I'd imagine other Indo-Aryan languages that have clear dialectical variations like Konkani. —Aryaman(मुझसे बात करो)22:02, 10 October 2017 (UTC)Reply
@Aryamanarora If we do plan on remerging any of the "dialects" again, then we should really do so in the way English on wikt. handles them, with a {{lb|hi|Braj}} marker or something similar. If we want, Hindustani wouldn't be bad for a merger idea (something I would strongly support). The way Chinese does it here is by merging what are very separate languages, or at least much more separate than Braj and Haryanvi are from Khariboli. And to really take Shuddh Hindi, Baazari Hindi, and Urdu as separate registers really just plays into the hands of the politicians in India and Pakistan. I think it would be best to treat the terms that are more Urdu as {{lb|hi|Urdu}} (or whatever the new code would be), the ones that are more Hindi as {{lb|hi|Hindi}}, and those that are extremely arcane Persian, Arabic, or Sanskrit (or other) learned borrowings as {{lb|hi|rare}} or something similar. DerekWinters (talk) 00:42, 11 October 2017 (UTC)
Also, since what is called "Hindi" isn't even monophyletic, let's keep the "Eastern Hindi" languages out of this. If we were to make an entry in Hindi that inherited a term from Ardhamagadhi Prakrit, that would be a bit problematic. DerekWinters (talk) 00:58, 11 October 2017 (UTC)Reply
Well that was the original idea, see e.g. कौ before I unmerged. It has {{lb|hi|Braj}} and it was categorized into CAT:Braj Bhāṣā, which was under CAT:Regional Hindi. As for merging Hindustani, I would definitely be for that, it's just there's no real benefit in terms of duplication of content since you won't have Hindi and Urdu headers on the same page (unlike Serbo-Croatian or Chinese, where several "languages" would be on the same page; in the case of Chinese, the logogrammic script is the real problem). Also I'd rather do {{lb|hi|India}} and {{lb|hi|Pakistan}} (but that also alienates Urdu speakers in India, e.g. the prestige dialect of Lucknow), but I wouldn't be averse to your idea.
With a template like {{zh-dial}} we could keep the "languages" in separate headers and still integrate them tightly without having to deal with merging (not to mention the neat maps the template generates). Yes, you're right that Hindi isn't monophyletic, but I think some sort of convergent evolution has occurred here between Eastern and Western Hindi, where they are pretty much mutually intelligible nowadays. We can't merge Eastern Hindi lects into Hindi obviously, but a dialectal synonyms template could still be used to link them together. —Aryaman (मुझसे बात करो) 01:16, 11 October 2017 (UTC)
Obviously more varieties are there, and for simple words like मैं it would be much more comprehensive. I also suppressed transliteration because it's visually distracting IMO. —Aryaman(मुझसे बात करो)00:29, 12 October 2017 (UTC)Reply
@Aryamanarora This is interesting. What purpose would this serve? Do you plan on removing the Braj, Haryanvi headers and replacing it with this? Also, though this need not be the case, certain Saaf Urdu terms derived from Arabic may be far too rare to include. Essentially the complexities of the two standards and the various sociolects I think are being simplified a bit too much in something like this. And unlike some of the "dialects" of Hindi, Braj and Awadhi have long had independent literary traditions, and Bhojpuri speakers today are quite adamant about keeping their language separate. I'm not happy about the situation with Chinese, because I think it hides the complexities within each form, and I think this would do the same here. DerekWinters (talk) 00:58, 12 October 2017 (UTC)Reply
No way! I'm not arguing for a merger here at all. I'm no lackey of the Indian government, nor a Hindi chauvinist! The template would be placed in a synonyms section at Hindi भाषा(bhāṣā), Urdu زبان, Braj भाखा(bhākhā), and Bhojpuri बोली(bōlī), and whatever other words mean "language" in the Hindi family languages. Chinese at least has the unified script going for it (things like Braj भाखा(bhākhā) are rare), so IMO it's easier to do a merger. I agree Bhojpuri and Braj etc. have evolved independently from Hindi (Khadiboli has a short literary tradition, and Manak Hindi is an artificial recent creation). Basically, this would allow us to have links between these sister (or in the case of Eastern Hindi, cousin sister) lects, which are necessary if we continue to expand content. Plus, data would be kept in a database that we could edit and have it affect all entries involved, which really cuts down on maintenance.
I did originally think that a merger would work, but learning about different "varieties" of Hindi has shown me how daunting such a task would be. Not to mention it wouldn't make sense to merge Hindi and Urdu while keeping Braj, Awadhi etc. separate, and also vice versa. —Aryaman(मुझसे बात करो)01:15, 12 October 2017 (UTC)Reply
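As a rough illustration of the "one database, many entries" idea, here is a sketch in Python. The table layout and the function name are assumptions made up for this example, and the four terms are simply the ones named in the comment above; Module:zh-dial-syn is a Lua module whose actual data format is not reproduced here.

```python
# One shared table: concept -> {lect: term}. Editing a row here would change
# what every entry that displays the row shows (hypothetical layout).
DIAL_SYN = {
    "language": {
        "Hindi": "भाषा",
        "Urdu": "زبان",
        "Braj": "भाखा",
        "Bhojpuri": "बोली",
    },
}


def synonyms_for(concept, current_lect):
    """Return (lect, term) pairs for every other lect that has the concept."""
    row = DIAL_SYN[concept]
    return [(lect, term) for lect, term in row.items() if lect != current_lect]


# What a synonyms section on the Hindi entry would list:
for lect, term in synonyms_for("language", "Hindi"):
    print(f"{lect}: {term}")
```

Each lect keeps its own entry and header, and only the cross-links are generated from the shared row, so no merger is implied.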
@Aryamanarora Interesting, it's not bad actually. What forms will you choose for the "dialects"? Because to use भाखा or बोली would reduce the language to almost exhibition/display status: showing off its differences when भाषा would be just as appropriate (if not more so) in any of these languages. Like, does बोली now mean a dialect in them (because it does in Gujju)? This is also a bad example, because काला wouldn't have this issue. DerekWinters (talk) 02:22, 12 October 2017 (UTC)
True, it was just something I whipped up to show what it would look like. Of course, they would all use भाषा in writing as well, but I wonder whether they still would in colloquial speech. I feel like the enthusiastic use of Sanskrit borrowings is only a recent invention in the history of these languages. Braj and Awadhi subsisted just fine on Desi words and Perso-Arabic borrowings for 500 years. As for बोली, it means "dialect" in standard Hindi, but I noticed on Bhojpuri Wikipedia some articles say भोजपुरी बोली? Do they really mean dialect in that case? —Aryaman (मुझसे बात करो) 10:24, 12 October 2017 (UTC)
I'll try to whip something up tomorrow (or later today if I can). I also am thinking about a Hindi declension module since our current templates are quite primitive. —Aryaman(मुझसे बात करो)18:59, 12 October 2017 (UTC)Reply
@Aryamanarora If you have the time, ability, and information to implement such a system it would look fantastic. The Deccan language also appears to be a variety of Hindi-Urdu even though it is farther away from the 'Hindi belt'.
Perhaps Punjabi and Marathi-Konkani could use this system too as you mentioned. According to w:Maharashtrian Konkani, 'there is a continuum between standard Marathi and Goan Konkani', and according to w:Marathi-Konkani languages ‘several of the Marathi-Konkani languages have been variously claimed to be dialects of both Marathi and Konkani’. If there is information about words in these dialects and how they are related, then such a template would be useful. Kutchkutch (talk) 01:33, 12 October 2017 (UTC)Reply
@Kutchkutch: It won't be that difficult, it would just be a slightly edited form of Module:zh-dial-syn for the backend code. I do agree that there is a severe lack of information about regional dialects, and I imagine much of it isn't even in English and so is harder to obtain. Specifically for Hindi belt languages though, there is plenty of information about Hindi and Urdu, and pretty comprehensive coverage of Braj and Awadhi (McGregor's Hindi dictionary has both), as well as local vocabulary for Mumbai and Hyderabad (CAT:Hyderabadi Hindi, but it is usually considered closer to Urdu AFAIK; it's a form of Dakkhini like you said). Madhavpandit knows quite a bit about Konkani dialects it seems, and Punjabi has some scattered resources online. It's about time we modernized the entries for Indic languages. —Aryaman(मुझसे बात करो)01:39, 12 October 2017 (UTC)Reply
@Aryamanarora: Thanks!! This is very cool. Can't wait to use it to add Braj words. Lol always liked Braj/etc. poetry for how freely they use their tadbhavs. Hindi's forced Sanskritization has honestly been so bad because now everything sounds forced, and it's unfortunately seeping into everything, be it Braj, Awadhi, or even Gujarati. The tadbhavs are seen as village/uneducated speech and the Sanskritization is horribly artificial, so English and Farsi step in. Like damn, there goes half the expressivity of the language. But regardless, thanks again for this! DerekWinters (talk) 23:42, 28 October 2017 (UTC)Reply
No problem! Rupert Snell has written some really cool stuff. A lot of cool words like मीत(mīt, “friend”), हिया(hiyā) and साद(sād, “word”) have been replaced by tatsamas sadly. But to be fair, the Sanskritization is necessary if Hindi ever wants to be used in technical contexts. I suppose using tadbhavs would be possible, but words like *पयासानू(payāsānū, “photon”) just aren't suited for that kind of task. English does the same, borrowing from Latin instead of Sanskrit. But yeah, it's a shame the cool (and archaic) village dialects are subsumed by high-minded Hindi due to this kind of thing. Anyways, there's a lot of great Braj prose too. Lalluram's Rajniti is one I can think of. —Aryaman (मुझसे बात करो) 14:33, 29 October 2017 (UTC)
@Aryamanarora: I do have to disagree with you here. As words are inherently a set of sounds strung together, why not make them the most meaningful set of sounds for a speaker? For a native Braj speaker, something like *पयासकन would be infinitely more understandable and useful than प्रकाशाणु, but inherently neither word is more suited to the concept. They both are, but one works for Braj speakers and the other for native Sanskrit speakers. English also could use something like *lightbit or *lightpart, and honestly, I would have preferred such a term in science class, as it's infinitely more meaningful to me than *photon, which, without having an understanding of Greek (and as an English-language student, why should I have to), is essentially the same as any other meaningless string of sounds. DerekWinters (talk) 15:57, 29 October 2017 (UTC)
Well, *पयास(payās) would actually be a borrowing from Prakrit lol, so it wouldn't be all that transparent. And at this point, Sanskrit borrowings have become far too entrenched in Hindi (and all other Indian languages) to be purged. Old Sanskrit morphemes have become productive again, and we end up with words like केंद्रक(kendrak, “nucleus”) which never actually existed in Sanskrit but are native coinages (much like English photon is a coinage from Greek components). And it's convenient that every Indian language has a word like kendrak meaning "nucleus". I guess we'll have to agree to disagree. —Aryaman (मुझसे बात करो) 20:06, 29 October 2017 (UTC)
Out of curiosity, what are tadbhavs and tatsamas? In any case, this discussion reminds me of this wikipedia article. I find the idea of "transparency" fascinating and worth aiming at, and I've always thought calques are much more interesting words than simple borrowings, which ofttimes simply look/sound hideous to me (χάμπουργκερ, I'm thinking of you). --Barytonesis (talk) 21:35, 29 October 2017 (UTC)Reply
@Barytonesis: In the ancient Sanskrit grammatical tradition (and so in modern Indian-language linguistics), a तद्भव(tadbhava, literally “coming/arising from that”) is a word that is inherited into an Indo-Aryan (or Dravidian) language from Sanskrit by way of the Prakrits. The Prakrits (प्राकृत(prākṛta, literally “natural”)) were the vernacular languages of India c. 300 BCE (the composition of the Ashokan Edicts in an early Prakrit) to I'd venture about 900 CE. Meanwhile Sanskrit (संस्कृत(saṃskṛta, literally “put together, well formed, perfect”)) was by this time a literary language. The relation between Sanskrit and Prakrit is analogous to that between Latin and Vulgar Latin, except Prakrit flourished as a literary language for a long time and we have strong corpuses from three literary dialects of Prakrit.
A तत्सम(tatsama, literally “same as that”) is a word that is borrowed from Sanskrit into an Indo-Aryan (or Dravidian) language. These words are very important to the use of modern Indian languages in technical fields. However, these words can be intransparent and often cumbersome, and are very rare in spoken language, where they are replaced by English. Often, these words are used to calque English compounds and even whole expressions (e.g. Hindi एक शब्द धन्यवाद का(ek śabd dhanyavād kā), calque of a word of thanks).
Often, tatsama and tadbhava doublets are both in use in different semantic fields. One of my favorites is खेत(khet, “a field for farming”)/क्षेत्र(kṣetra, “a region; field of study”). There's also बाँस(bā̃s, “bamboo”)/वंश(vañś, “lineage; dynasty”), सब(sab, “all”)/सर्व(sarv, “universal (in compounds)”) etc. etc. —Aryaman(मुझसे बात करो)17:29, 30 October 2017 (UTC)Reply
Although, on the matter of being important in technical fields, that is simply a matter of preference for fancy, high-sounding Sanskrit words due to the perception that common words are inherently not meaningful enough, similar to how English treats its technical/scientific vocabulary as well. DerekWinters (talk) 18:28, 30 October 2017 (UTC)
Yes, Sanskrit is now way too entrenched in technical language to really get rid of, much like no one "in the field" pays much heed to English linguistic purism and everyone instead uses the Greek and Latin words that were borrowed or coined a long time ago. Sadly, Hindi purism focuses too much on purging the Perso-Arabic element (which I think is essential to the language) and not on old Sanskrit borrowings that are "native". —Aryaman (मुझसे बात करो) 19:01, 30 October 2017 (UTC)
@Lingo Bingo Dingo: Hi, sorry about not responding sooner. I am not as knowledgeable about Coptic as I'd like, but I gave my input on those matters that I have knowledge of. I'm glad to see an increase in coverage of Coptic (and Demotic!) on wikt! DerekWinters (talk) 20:54, 6 November 2017 (UTC)
Ashokan Prakrit
@माधवपंडित, Aryamanarora Many sources just refer to this as Pali. But since Pali was so widespread, making a separate code would be better for keeping this separate from Pali. So 'Piyadasi' would be 𑀧𑀺𑀬𑀤𑀲𑀺(piyadasi), 'Kalinga' would be 𑀓𑀮𑀺𑀦𑁆𑀕(kalinga) and 'Dhamma' would be 𑀥𑀫𑁆𑀫(dhamma) even though धम्म(dhamma) already exists? Some inscriptions are in the Greek script, Kharosthi script, and Aramaic with Hebrew script. The Greek inscriptions use Πιοδασσης(Piodassēs) for 'Piyadasi' and εὐσέβεια(eusébeia, Eusebeia) for 'Dhamma'. Kutchkutch (talk) 05:01, 11 November 2017 (UTC)
In that case, maybe we can figure out how many Ashokan Prakrits there were and create a code for each one of them if they don't already exist. Kutchkutch (talk) 07:42, 11 November 2017 (UTC)
The Dramatic Prakrits ((Jain) Sauraseni, (Jain) Maharastri, (Ardha)magadhi, etc.) were only written down around the 3rd century CE, while these inscriptions are from the 3rd century BCE, 500 years before. The Indologist Amulyachandra Sen on page 8 says: "The language of the Aśokan inscriptions is Prakrit. But it is not quite the same as any of the other literary forms known of Prakrit, it has been called Aśokan Prakrit or Prakrit of the Aśoka inscriptions. It has affinities with Māgadhī Prakrit. The language of the Girnār version of the REs is close to Pali." It's the oldest Prakrit by far; I think it ought to have a code. —Aryaman (मुझसे बात करो) 20:28, 11 November 2017 (UTC)
Interesting! If that's so, we should consider it. Are you sure that it was unified though? Because I remember reading that it was quite different in different parts of India. DerekWinters (talk) 01:17, 12 November 2017 (UTC)Reply
Apparently, there were some spelling differences in some of the edicts (many were just copying errors, since they were "translated" from a master tablet). AFAICT they all seem to be very similar, and no doubt mutually intelligible. We could always use dialect tags, like Turner does. —Aryaman (मुझसे बात करो) 17:07, 12 November 2017 (UTC)
I just started looking through the edicts and translations provided by Hultzsch in 1925 for quotes and stuff. Apparently some Greek names are attested; they would make for some cool WT:FW nominations. —AryamanA(मुझसे बात करें • योगदान)02:23, 17 November 2017 (UTC)Reply
Latest comment: 7 years ago, 2 comments, 2 people in discussion
It seems the Indian government is working on something huge: . I was able to find the project description online: "Under this project dictionaries of 48 dialects of Hindi are to be developed. At the initial level Unicode based trilingual digital dictionaries of Bhojpuri, Brijbhasha, Rajashthani, Chhattisgarhi, Bundeli, Awadhi, and Malvi, Kangari, Gadhwali, Magahi, and Hariyanavi dialects are being prepared." We should keep a watch on it. —Aryaman(मुझसे बात करो)17:09, 12 November 2017 (UTC)Reply
Latest comment: 7 years ago, 14 comments, 4 people in discussion
It seems that some of the Indic scripts (including Devanagari for me) don't actually become bold when using '''. I used a hack in my personal CSS a while ago to force it to appear bold; should we add it to the global CSS? I also made {{hi-x}} make the bolded text bigger and highlighted, in case the user's font doesn't support bolding.
I think font-style: normal; suppresses bolding and italicizing. Also, since we're on the topic, I think the Devanagari fonts look horrible. I use Noto Serif Devanagari in my personal CSS, but most people don't have that font. Do you guys have any suggestions for some fonts? —AryamanA (मुझसे बात करें • योगदान) 04:28, 25 November 2017 (UTC)
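For reference, a minimal sketch of the kind of personal-CSS rule being discussed. It assumes Devanagari text is wrapped in an element carrying a `Deva` class, which is an assumption about the page markup; adjust the selector to whatever the pages actually emit. In standard CSS, `font-weight` controls boldness and `font-style` controls italics, so the rule targets `font-weight`. The highlight colour is purely illustrative, as a fallback for fonts that ship no bold face.

```css
/* Sketch only: the .Deva selector is an assumed class name for Devanagari text. */
.Deva b,
.Deva strong {
    font-weight: bold;          /* request a bold face explicitly */
    font-synthesis: weight;     /* allow the browser to synthesize bold if the font lacks one */
    background-color: #fff3b0;  /* illustrative highlight so emphasis stays visible */
}
```

Putting this in a personal stylesheet first would let the effect be checked across fonts before anything is proposed for the site-wide CSS.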
@AryamanA I don't have several of the fonts mentioned, such as Adobe Devanagari or Utsaah. Is there a way to embed the consensus font into the system so that the viewer doesn't need to worry about fonts? I've heard it's possible to do this on websites so that the viewer can see the custom font, but perhaps that's only possible for private websites. Kutchkutch (talk) 01:08, 26 November 2017 (UTC)
If copyrights and terms of use for fonts are an issue, Google Fonts appear to indicate their licenses, but they may not be as good as the fonts already included with operating systems by default. Kutchkutch (talk) 02:58, 26 November 2017 (UTC)
@Kutchkutch: We can render the fonts by loading them into the browser, but that increases page load times and uses memory, which may bother people who never look at Devanagari stuff. I personally use Google's Noto Fonts (which I downloaded onto my system) by adding CSS rules to my personal User:AryamanA/common.css. There you can see that I use a serif font for Devanagari and a Nastaliq style for Urdu. You could add the fonts that you like at User:Kutchkutch/common.css and they will override the default. Btw, Utsaah and Adobe Devanagari are preinstalled on Windows, I think. —AryamanA (मुझसे बात करें • योगदान) 17:35, 26 November 2017 (UTC)
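A personal common.css along these lines might look roughly like the sketch below. The `:lang()` selectors assume the page marks up text with `lang` attributes, and "Noto Nastaliq Urdu" is an assumed font name for the Nastaliq style mentioned above; neither is taken from the actual stylesheet. The fonts must already be installed locally, since nothing here loads them over the network.

```css
/* Personal common.css sketch; the selectors and the Urdu font name are assumptions. */
:lang(hi),
:lang(sa) {
    /* Serif face for Devanagari, falling back to the system serif if it is not installed */
    font-family: "Noto Serif Devanagari", serif;
}

:lang(ur) {
    /* Nastaliq-style face for Urdu */
    font-family: "Noto Nastaliq Urdu", serif;
}
```

Because the rules only name fonts, readers without them installed simply see the generic fallback, which avoids the page-load cost of embedding web fonts.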
@AryamanA: Well, that explains why loading fonts into the browser hasn't been done especially for the new scripts in Unicode. That's similar to the reasoning about why templates could be faster and more efficient than modules for doing simple things.
With this edit, you removed 'Devanagari Sangam MN', and perhaps that's better.
I am on vacation. Even so, I occasionally take a quick peek at Wikipedia and here. I will probably be active again in the summer. Real life is very busy right now. :) For Gujarati, Gujarati Lexicon is very useful. If I find any other sources, I will keep you informed. If you have any specific questions, do ask. I am at least active enough to answer. Warm regards. -Nizil Shah (talk) 04:54, 5 January 2018 (UTC)
Oh, that's awesome. I'll check it out. Also, what should we do about certain words? चारि(cāri) is attested from Tulsidas, and I think (though I am not sure) that modern Awadhi has चार. Do we split the languages, or keep one as {{lb|awa|Old Awadhi}} and one as New/Modern Awadhi? This matters especially for the treatment of Sanskrit loans by Tulsidas, which is likely much different from how it is done today (thanks to manak Hindi influence). What do you think? DerekWinters (talk) 14:35, 4 January 2018 (UTC)
This is definitely an issue for Awadhi, since resources for Modern Awadhi are quite lacking. I think using {{lb|awa|New/Old Awadhi}} would be the best idea, since the changes haven't been so drastic as to warrant a new code from ISO (although ISO can be pretty dumb sometimes, since it didn't have Old Hindi or Gujarati codes). I wonder if some Awadhi Wikipedia editors could help us out. —AryamanA (मुझसे बात करें • योगदान) 14:57, 7 January 2018 (UTC)
Share your experience and feedback as a Wikimedian in this global survey
Latest comment: 6 years ago, 1 comment, 1 person in discussion
Hello! The Wikimedia Foundation is asking for your feedback in a survey. We want to know how well we are supporting your work on and off wiki, and how we can change or improve things in the future. The opinions you share will directly affect the current and future work of the Wikimedia Foundation. You have been randomly selected to take this survey as we would like to hear from your Wikimedia community. The survey is available in various languages and will take between 20 and 40 minutes.
You can find more information about this survey on the project page and see how your feedback helps the Wikimedia Foundation support editors like you. This survey is hosted by a third-party service and governed by this privacy statement (in English). Please visit our frequently asked questions page to find more information about this survey. If you need additional help, or if you wish to opt-out of future communications about this survey, send an email through the EmailUser feature to WMF Surveys to remove you from the list.
Reminder: Share your feedback in this Wikimedia survey
Latest comment: 6 years ago, 1 comment, 1 person in discussion
Every response for this survey can help the Wikimedia Foundation improve your experience on the Wikimedia projects. So far, we have heard from just 29% of Wikimedia contributors. The survey is available in various languages and will take between 20 and 40 minutes to be completed. Take the survey now.
If you have already taken the survey, we are sorry you've received this reminder. We have designed the survey to make it impossible to identify which users have taken the survey, so we have to send reminders to everyone.
If you wish to opt out of the next reminder or any other survey, send an email through the EmailUser feature to WMF Surveys. You can also send any questions you have to this user email. Learn more about this survey on the project page. This survey is hosted by a third-party service and governed by this Wikimedia Foundation privacy statement. Thanks!
Your feedback matters: Final reminder to take the global Wikimedia survey
Latest comment: 6 years ago, 1 comment, 1 person in discussion
Hello! This is a final reminder that the Wikimedia Foundation survey will close on 23 April, 2018 (07:00 UTC). The survey is available in various languages and will take between 20 and 40 minutes. Take the survey now.
If you already took the survey, thank you! We will not bother you again. We have designed the survey to make it impossible to identify which users have taken the survey, so we have to send reminders to everyone. To opt out of future surveys, send an email through the EmailUser feature to WMF Surveys. You can also send any questions you have to this user email. Learn more about this survey on the project page. This survey is hosted by a third-party service and governed by this Wikimedia Foundation privacy statement.
Latest comment: 6 years ago, 5 comments, 2 people in discussion
I saw that you had asked Sushant Savla on Gujarati Wikipedia about Kachchhi lemmas. I have lived in Kutch (Kachchh) and have a very basic knowledge of the Kachchhi language and its words, so I would be able to help a little. Sushant Savla is a native speaker (who is not living in Kachchh, btw), so he would be super helpful. I will try to find resources for Kachchhi for you. Regards, --Nizil Shah (talk) 17:41, 8 May 2018 (UTC)
@Nizil Shah Thanks! Resources would be very helpful. And please, add as many words as you want here; we need as many as possible. And if you know anyone else who would be willing to add to the Gujarati section here, that would be very, very helpful. Of Gujarati's roughly one crore (ten million) words, only two thousand are here. DerekWinters (talk) 05:21, 13 May 2018 (UTC)
Members of Gujarati Wikipedia have long been waiting for lexicographical data to launch on Wikidata. Their aim is to upload the Bhagavadgomandal (which is now in the public domain, and with which Gujarati Lexicon has been enriched) to Wikidata, which is itself public domain. It has approximately two lakh (200,000) entries. Besides this, other public-domain dictionaries will also be uploaded. That is why no one is active here or on Gujarati Wiktionary. Let's see how much success there is in the future. I will also contact you once the project starts. -Nizil Shah (talk) 07:34, 13 May 2018 (UTC)
Latest comment: 5 years ago, 1 comment, 1 person in discussion
I had always thought that Module:etymology languages/data had an inscrutable selection of lects, and I found that you had added them in 2014 (diff). I need closure as to how these lects were chosen :p (what was Wuhua Hakka added for?)
Latest comment: 5 years ago, 1 comment, 1 person in discussion
Hi Derek.
Your Hadza template, Template:hts-noun, could use an m= parameter for derived masculine nouns. E.g., fsg dongoko is 'zebra', and the derived msg dongo is 'zebra buck'. Also, it might be nice to have a simple pl= parameter, e.g. for dongobee 'zebras'. While grammatically feminine, it's used for mixed gender (just as hazabee means 'people', not 'women'), which won't be clear if it's labeled 'fpl'. If the lemma is labeled 'f', it should hopefully be obvious that 'pl' is the plural of the lemma.
Also, the sg is only grammatically sg. dongoko 'zebra' could be one animal or a whole herd, just as in English, so it might be better to call it 'transnumeric'? dongobee is an individuated plural, and can't be an indefinite number as in a herd. Though perhaps that's info for a grammar rather than a dict.