I archive my talk page when it gets to ~75 topics by moving the first 50 to a new subpage. Please do not edit the archive pages; if you want to talk about something again, copy it back to my current talk page, or just start a new topic there and link back.
I wish they would :D. Nah, the right-hand ones are OK when you don't have an image, but if you have lots on the right-hand side the page looks weird (to me anyway). It's not a big problem. Conrad.Irwin 16:17, 14 March 2009 (UTC)
Possible problem with your anagram finder
Sorry, I'm new to Wiktionary, but under the entry for gamin, your bot added ænigma. This is not really an anagram, since neither the æ character nor any analogue for it exists in the original word, except for the a, which is used at the end of ænigma. Just thought you might like to know. Sorry if I didn't post to your talk page right; I couldn't figure out what you were trying to say. 76.125.225.194 22:19, 17 February 2010 (UTC)
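The report above can be illustrated with a minimal sketch. The bot's actual code is not shown in this thread, so everything here is an assumption: `is_anagram` compares words character by character, and `ascii_letters_only` is a hypothetical lossy step that would produce exactly the false match described (dropping æ leaves "nigma", which does rearrange to "gamin").

```python
from collections import Counter


def is_anagram(a: str, b: str) -> bool:
    """True only if both words use exactly the same characters, counted."""
    return Counter(a.lower()) == Counter(b.lower())


def ascii_letters_only(word: str) -> str:
    """Hypothetical lossy normalisation: silently drops anything outside a-z."""
    return "".join(ch for ch in word.lower() if "a" <= ch <= "z")


# Comparing the words as written: æ has no counterpart in "gamin".
print(is_anagram("gamin", "ænigma"))

# Comparing after the lossy step: æ is dropped, so the words appear to match.
print(is_anagram("gamin", ascii_letters_only("ænigma")))
```

If something like the lossy step is the cause, comparing the titles as written (or expanding æ to "ae" before comparing) would avoid the false match.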
That's odd
Hey again. I was asking Atelaes back in January how to make an index, and he mentioned a slow way of doing it and certain magical ways of creating one with the help of a great wizard called Conrad =). I have added a lot of words since January and it would be fun to have an index. If you find the time to help me it would be awesome. -Edelstam 17:11, 25 March 2009 (UTC)
Hey Conrad, been a while, thanks for updating the Mapudungun index again. I can't get on IRC anymore because I've got this Tor program on my computer and I can't get it to work with some IRC servers, so I'll have to ask you here. I have been working on trying to start up the Mapudungun Wikipedia, and there they are working on some separate solutions for the different orthographies (what I mistakenly have called alphabets) within one Wikipedia. I looked at the Serbian Wikipedia, where they use two different alphabets, for example. I got some comments about the Mapudungun language category here on English Wiktionary, that it was confusing to have words from two different orthographies in one place. So I was wondering if we could adopt some of the solutions the Serbian language category has here. They have, for example, an index with the two different alphabets, look here. Since we only have words in two orthographies, it would mean a similar solution to the one the Serbian category uses. What do you think about this, is it a lot of work? -Edelstam 20:14, 26 April 2009 (UTC)
Hi Edelstam, the main problem is automatically telling the two apart. If there is some indication in the page text, or even a way of testing the page titles, that will tell me which alphabet each entry belongs to, then it shouldn't be that hard. I can obviously look for some letters that only appear in one of the alphabets, but I don't know enough about Mapudungun to know if I will always be able to tell the difference. Conrad.Irwin 20:19, 26 April 2009 (UTC)
Yeah, I was thinking about how to separate the words, since they don't look different like the Serbian Cyrillic and Latin alphabet words. Right now all words written in the Raguileo orthography have this in the article, though: (using Raguileo Alphabet), look at cew for example. Some words are the same in both orthographies and in those cases they look like this word: awka. Words written in the Unified alphabet look like this: chafod. Can you use that for the bot? -Edelstam 20:47, 26 April 2009 (UTC)
This diff raised a question for me: There's a direct link to en.wp, plus the wp template which links to a disambiguation page. What would be the appropriate layout/template use? I really don't know. - Amgine/talk 16:45, 29 March 2009 (UTC)
Sorry. I'd neglected to appreciate that it appears in other places. Do you use the link that appears on the Main Page, or exclusively where it appears elsewhere? Conrad.Irwin 22:30, 29 March 2009 (UTC)
I see no reason to keep WB:WB, and doubt there are other mainspace pages that are appropriate soft redirects, but perhaps there are. Very well.—msh210℠ 23:20, 31 March 2009 (UTC)
I would also like to delete them, but I think someone told me not to (may have been a year or so ago, so might "just do it" now :D) Conrad.Irwin 23:22, 31 March 2009 (UTC)
639-3 templates
So we are going to have thousands of unchecked templates? Are you checking everything you should? (You are not.) Likely to be more of a mess than a help. (Sorry.)
Would be good to stop until a list of issues is addressed.
I had not looked at the "I" flag, as I was using the "Language Names Index" from http://www.sil.org/iso639-3/download.asp, which I had assumed very likely to be correct (it is the definition after all). However, there do seem to be some discrepancies between what we have and what they say, mainly only in spelling. I had thought about checking against Ethnologue (run by the same organisation) but it doesn't seem to have been updated fully with the 2009 changes. I could change to using the full code-set and ignore those 62. Conrad.Irwin 10:55, 6 April 2009 (UTC)
bak Bashkir: Bashkir<includeonly>{{#if: {{{1|}}}|]|]}}</includeonly><noinclude>]</noinclude>
bas Basa (Cameroon): {{{l|]}}}<noinclude>]</noinclude>
bav Vengo: {{{l|]}}}<noinclude>]</noinclude>
bcl Central Bicolano: {{{l|]}}}<noinclude>]</noinclude>
bej Bedawiyet: {{{l|]}}}<noinclude>]</noinclude>
bem Bemba (Zambia): {{{l|]}}}<noinclude>]</noinclude>
bez Bena (Tanzania): {{{l|]}}}<noinclude>]</noinclude>
bfe Tena: {{{l|]}}}<noinclude>]</noinclude>
bgc Haryanvi: {{{l|]}}}<noinclude>]
bgk Buxinhua: {{{l|]}}}<noinclude>]</noinclude>
bin Edo: {{{l|]}}}<noinclude>]</noinclude>
bjn Banjar: {{{l|]}}}<noinclude>]</noinclude>
bla Siksika: {{{l|]}}}<noinclude>]</noinclude>
bno Bantoanon: {{{l|]}}}<noinclude>]</noinclude>
bnv Bonerif: {{{l|]}}}<noinclude>]</noinclude>
bnv Edwas: {{{l|]}}}<noinclude>]</noinclude>
bot Bongo: {{{l|]}}}<noinclude>
There is no "consensus" to include macrolanguages: they were wrongly created and must be deleted; we want templates for precisely the set of languages we want to use as L2 headers. In some specific cases we do want to use a macrolanguage code where it is better to treat the I languages as dialects (or in some combination). This requires looking at individual cases, with some knowledge of the language(s); this is why we want to create codes only as needed.
I'm afraid you are creating a massive raft of problems to add to the (fairly reasonable) number we already have. IMHO, it would be better to delete all code templates for languages not referenced anywhere at this time. We do carefully allow new contributors to add L2 headers and translation lines w/o having to have code templates, which can then be sorted with care. Robert Ullmann 11:48, 6 April 2009 (UTC)
I agree there are difficulties with language templates, but which problems am I actually causing? It is just as easy, if not easier, to correct a template that is found to be wrong than for someone to look up the language code for a language they are trying to add; and there will be (far?) fewer mistakes to correct than there are templates to create. Conrad.Irwin 12:09, 6 April 2009 (UTC)
Oops, I guess you're right. The full text of the response's Status-Line is exactly “HTTP/1.1 404 Not Found”, consisting of HTTP-Version, Status-Code and Reason-Phrase. I'll adjust the phrasing. —MichaelZ. 2009-04-11 22:56 z
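The three parts of the Status-Line named above can be pulled apart with a short sketch. The function and type names are illustrative, not part of any library. It splits on the first two spaces only, since a Reason-Phrase such as "Not Found" can itself contain spaces.

```python
from typing import NamedTuple


class StatusLine(NamedTuple):
    http_version: str
    status_code: int
    reason_phrase: str


def parse_status_line(line: str) -> StatusLine:
    """Split a Status-Line into HTTP-Version, Status-Code and Reason-Phrase."""
    # maxsplit=2 keeps any spaces inside the Reason-Phrase intact.
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    return StatusLine(version, int(code), reason)


print(parse_status_line("HTTP/1.1 404 Not Found"))
```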
Hi Conrad,
I tried to add a translation using the translation-thing, and it didn't work out. I filled out the form, and then I didn't see any way to save. So, I figured I'd continue playing around, and I clicked one of the arrows, and suddenly I got a save-button, which I clicked. But then I understood that all it was doing was saving the re-balancing, and suddenly it occurred to me to click the preview-button. But it was too late: a message came back that saving was complete, and I lost the save button. And as far as I could tell, there was no longer any way for me to recover what I'd entered; I'd have to redo it.
Also, since the preview-button clears out the form, and the undo-button doesn't restore it, there doesn't seem to be a way to fix a typo, to add a qualifier, etc. after clicking the preview-button, except by re-entering everything. (Admittedly, that wouldn't be a big deal if I had a Hebrew keyboard, but I don't, so it takes me a bit of effort to enter Hebrew text. I found it disheartening when my translation just disappeared into the ether.)
Hmm, it is necessary to click "Preview" before saving. Whether this is desirable is questionable, but I think generally it is, as typos are harder to spot when they are not in situ. Maybe I need to make it clearer that "Preview" is the first step, but how to do that I'm not sure. (We should really organise some training for using Wiktionary...). I debated making the "Undo" button refill the form, but wasn't sure whether that would be very useful either: if people want something undone, the last thing they want is for it to come back and get in their way. The main problem is neither of those though; it needs to hold onto previous edits while saving, which I think I can fix. Conrad.Irwin 18:08, 14 April 2009 (UTC)
I agree requiring Preview before Save is desirable, for the reason you say. (Perhaps the usual method of saving a page should also require it! :-P ) But I agree with Ruakh that undoing should prefill, for the reason he says.—msh210℠ 18:34, 14 April 2009 (UTC)
The “Add” label on the fields implies that using the fields performs the action. Might be clearer if it said “New”, or was removed altogether, and the action word were restricted to the button which performs the action. “Preview” is a bit confusing, because although that button does incidentally change mode to previewing the first time it's clicked, the editor only sees it add a translation every time it's clicked. —MichaelZ. 2009-04-14 20:51 z
I was wondering why you so severely pruned this entry in this edit when you formatted the definition line? I could possibly see removing the hypernym entry as being covered by the definition, but not the rest of the pruning, and in any case, pruning of that extent would seem to warrant a more substantive comment than "fmt". — Carolina wren discussió 14:41, 15 April 2009 (UTC)
The new button for adding translations is not working right for me: it responds only with "Please use a language code" but I do use a language code! However, I gather from the above message by Hooiwind that the tool is working all right for him. What is wrong? —AugPi 18:09, 15 April 2009 (UTC)
It appears that it has stopped working under Internet Explorer since I tested it. I am investigating now, and hope to have it fixed in the next few minutes. Conrad.Irwin 18:15, 15 April 2009 (UTC)
Translations
Something is amiss. Currently, none of the Translations sections are collapsed for me, and there is no option to Show/Hide. The Related terms boxes are also affected. --EncycloPetey 20:17, 15 April 2009 (UTC)
I can't see this in Firefox, Safari, Opera, IE6, IE7, Chrome or even Konqueror. What are you using? Did a hard refresh do anything about it? Conrad.Irwin 21:01, 15 April 2009 (UTC)
I'm using Safari, but the problem has (mostly) gone away now. The problem is now intermittent, and goes away if I empty the cache, then refresh the page. However, I have to do this again when I go back to the page again later. I'm also seeing an extra bullet point for a non-existent translation at the end of translation lists. --EncycloPetey 21:17, 15 April 2009 (UTC)
Hi,
I like your assisted translator. Well done! I don't use it for Chinese entries, though, because it doesn't allow the traditional/simplified distinction. The other thing: I use Chinese as the title, not Mandarin, which is the Standard Chinese dialect anyway. By default, the entry is in written Chinese, shared by all dialects with some exceptions. Mandarin's pinyin is also standard for China. Dialects could be separated by the *: notation (Cantonese, Minnanhua, etc.).
Not in the near future, sorry. I was thinking of adding a field to the "More" view that allowed you to enter the entire line, so you could type "Chinese: {{zh-zh-p|...}}"; it's not ideal but I think it might be helpful for some people. Supporting nested translations, for Chinese and Greek, is also on the todo list, but I need to have a chat with Ullmann about which languages are nested and under what circumstances. Conrad.Irwin 09:29, 17 April 2009 (UTC)
After a year, I think it's time to re-run Index:Latin. There are many, many new entries, and many previously indexed entries have been removed as errors (especially those beginning with J and K). --EncycloPetey 19:25, 18 April 2009 (UTC)
I'll try to get round to it tomorrow. I'll also update the formatting to match the other indices generated by Conrad.Bot (unless you have a strong objection thereto). Conrad.Irwin 23:23, 18 April 2009 (UTC)
No objections, no. I don't use them all that much myself (except for cleanup purposes), but I know other people do make use of them. --EncycloPetey 04:15, 19 April 2009 (UTC)
language templates
WHAT THE FUCK were you doing?
You sure as hell didn't check the results. I don't even know how to make the server store pages with no revisions, but whatever broken, fucked-up thing you did managed.
I am cleaning them up, one by one by page ID. It is 5 AM, I've been chasing this problem for several hours, and it looks like half a fucking day of manual cleanup. Thanks a lot.
I will apologize for all the invective above. (:-) but I ask one thing: can you please learn why it is not a good idea to just blast through things? Adding all these required (and requires) checking all the work. Did you write a program to read all the templates and check them all? Or just write them and hope? ... best regards, Robert
The pages you have deleted, along with six others, threw an error on save. I made the (incorrect, it would seem) assumption that an error on page save was equivalent to not saving. As it wasn't important to be utterly complete, I just logged "Error" and carried on. I can't vouch for the fact that all the others are fine, but given that everything you deleted logged an error, I am confident enough that the others are not in this state. There is also a query running now on toolserver to find all pages with no revisions, just in case this problem has gone unnoticed elsewhere. (The people in #mediawiki seemed reluctant to admit that doing this was possible, and so now give the "recommended fix" as deleting the page...). Thank you for your help and time. Conrad.Irwin 09:08, 20 April 2009 (UTC)
SELECT page_title FROM page LEFT JOIN revision ON rev_page = page_id WHERE rev_id IS NULL;
Empty set (52 min 47.11 sec)
Hi! the dump has run (unassisted, and okay ;-), so I did get all of them (from this case). I'm very curious how this managed to happen (and not very surprised that "they" would think it "impossible", it ought to be ;-)? How exactly did you do the "save"? (which code, index or api.php? etc?) And what was the error? (as far as you know). I'd like to see if I can maybe make it happen, and we can fix the underlying bug. Now that I've had some good sleep. Robert Ullmann 13:11, 20 April 2009 (UTC)
I (foolishly) didn't record the actual error, I'm using mwclient, page = wikt.Pages; text = page.edit(); try: page.save(newtext) except: print "Error: %s %s" % (code, name). I can't really see anything in common with the edits, though there's clearly two main patches of problems, so I wonder whether it was something like database lag and a buggy anti-overload tool getting in the way. Conrad.Irwin 13:27, 20 April 2009 (UTC)
page.save(text=newtext) (presumably?) can I see the actual code (you can email it if you don't want it floating about ;-). I'd like to set up the exact case, and then look at how it would trace through api.php. (but sleeping now I think) Robert Ullmann 22:22, 20 April 2009 (UTC)
Doesn't look like anything odd on that side. Presumably I would have to hunt something in/under api.php that would cause an incomplete DB commit. (and I'm not doing that at 2AM ;-) thanks, Robert Ullmann 22:57, 20 April 2009 (UTC)
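The lesson from this thread (a bare `except` that prints only "Error" loses the information needed to debug the failure) can be sketched as follows. This is not the bot's code. `save` is a stand-in for whatever call performs the edit, and the logger name and `failures` list are illustrative assumptions.

```python
import logging

log = logging.getLogger("template_bot")  # hypothetical logger name


def save_and_record(save, title, text, failures):
    """Call save(title, text); on failure keep the exception, not just 'Error'."""
    try:
        save(title, text)
    except Exception as exc:  # broad on purpose, but the details are preserved
        log.error("save failed for %s: %r", title, exc)
        failures.append((title, repr(exc)))
        return False
    return True
```

As the discussion above shows, an error on save did not guarantee that nothing was written, so every title collected in `failures` would still need to be re-checked on the wiki afterwards.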
Hello. I noticed that you reverted my self-revert on this entry. I initially entered a reference for its etymon under a "Notes"-thread (see the history of the entry) exactly as is done in Wikipedia. Then, the bot moved the "Notes"-thread to the top of the wikicode and it created a cite-error ("<ref> tags exist, but no <references/> tag was found"). The truth is that I don't know how to properly make a citation on Wiktionary yet, so I reverted myself (my rationale was that if the reference doesn't show up, there's no point having it in the code). But since you reverted me, the cite-error is back. Do you have a better idea on how to undo it while at the same time making the entry conform with the formatting standards of Wiktionary concerning citations? --Omnipaedista 18:31, 21 April 2009 (UTC)
Hi, sorry for not replying, isn't Atelaes's fix correct? I just reverted you because it looked like you were removing random stuff, sorry. Conrad.Irwin 08:55, 25 April 2009 (UTC)
Hi. Great work on making those indexes. However, here are some points:
1. By sorting Russian е and ё together, here is what I meant: your bot should temporarily replace every ё in the dump with е, sort the list as if ё did not even exist, then add the ё-s back to the sorted list. This is the convention used in Russian dictionaries. Thus, these 5 words will be sorted like this:
Sorry, my fault - I have two normalization routines (one for the letters used as headings and page titles, the other for sorting) and I only fixed one of them. (will fix now) Conrad.Irwin 00:42, 26 April 2009 (UTC)
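The convention described in point 1 can be sketched with a sort key. This is not Conrad.Bot's code; the function name and the word list are made up for illustration. The key folds ё into е for comparison only, so the original spelling survives in the output, and the original word is kept as a tie-breaker.

```python
def russian_sort_key(word: str):
    """Compare as if every ё were е; keep the original as a tie-breaker."""
    folded = word.lower().replace("ё", "е")
    return (folded, word)


words = ["ёж", "ель", "еда", "ёлка", "ехать"]

# Plain code-point order pushes every ё-word after all the е-words.
print(sorted(words))

# With the folding key, ё-words are interleaved where a dictionary puts them.
print(sorted(words, key=russian_sort_key))
```

Using the same key function for both the heading letters and the sort order would avoid the "fixed only one of two routines" problem mentioned in the reply.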
2. The sorting of Armenian according to Unicode corresponds to old Armenian conventions in two points. First, is it possible to treat ու as a single, separate letter, and add words starting with it to Index:Armenian/ու? Right now they go under Index:Armenian/ո. And, whenever it's found in the middle of a word, it should be treated as the 34th in order, coming after ց and before փ in Index:Armenian, and not as a combination of ո + ւ. Besides, the letter և is treated by Unicode as the 6th in order, when it should be 37th - after ք and before օ. If you can't make these changes, no problem; old-convention indexes are better than empty indexes.
3. What happens if the translations or entries, from which the indexes were generated, are moved, changed or deleted? Will your bot automatically update the indexes? --Vahagn Petrosyan 22:47, 25 April 2009 (UTC)
Yes, I run the script every fortnight or so - it takes about an hour to download the XML dumps, process them, and re-upload the indices. Conrad.Irwin 00:42, 26 April 2009 (UTC)
I noticed that you'd been cleaning out the /0 pages, and have reverted the bot's overwrite of your changes. Next time it runs, it should see the fixes you have made to the pages and not re-create the mess. Conrad.Irwin 01:11, 26 April 2009 (UTC)
There's always one :D, I'll see if I can work out what's causing this, and then see if I can fix it for Opera without breaking everyone else. (I suspect it's some property of the headings) Conrad.Irwin 00:42, 26 April 2009 (UTC)
Awesome, you got it all right! Two teeny-tiny things. If I move the remnants belonging to Index:Armenian/ու from Index:Armenian/ո to Index:Armenian/ու (just 3 articles at the bottom linking to the letter itself), will the bot overwrite me next time? Also, if I put a text like "Transliteration: č" at the top of each index page, will the bot overwrite me next time? --Vahagn Petrosyan 01:37, 26 April 2009 (UTC)
Yes, it will. I can fix those "ու", not sure what to do with transliteration - I suppose if you pointed me to a list and showed me how you wanted it to be formatted, I could get Conrad.Bot to add it (it would then be easy to add for Russian too) - but it would not be trivial to change the code not to overwrite everything. Conrad.Irwin 07:34, 26 April 2009 (UTC)
I just noticed that after you added Index:Old Armenian to bot's maintenance list (which is sorted according to Unicode) the sorting rules of Index:Armenian have defaulted to Unicode as well. Can you return the two rules for modern Armenian, please? Namely:
Index:Armenian/և should be 37th - after ք and before օ (and not Unicode's 6th)
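The two modern-Armenian rules requested above can be sketched as a custom collation key. This is an illustrative assumption, not Conrad.Bot's implementation. The alphabet is built from code points (U+0561 to U+0581 covers ա to ց), the digraph ու (ո + ւ) is inserted as the 34th letter, and և as the 37th, matching the positions given in the thread.

```python
# ա … ց are U+0561..U+0581, already in alphabetical order (33 letters).
BASE = [chr(cp) for cp in range(0x0561, 0x0582)]
OU = "\u0578\u0582"   # ու, written as ո followed by ւ
YEV = "\u0587"        # և

# 34th ու, 35th փ, 36th ք, 37th և, 38th օ, 39th ֆ
ORDER = BASE + [OU, "\u0583", "\u0584", YEV, "\u0585", "\u0586"]
RANK = {letter: i for i, letter in enumerate(ORDER, start=1)}


def tokenize(word: str):
    """Split a word into letters, treating ո + ւ as the single letter ու."""
    tokens, i = [], 0
    while i < len(word):
        if word[i:i + 2] == OU:
            tokens.append(OU)
            i += 2
        else:
            tokens.append(word[i])
            i += 1
    return tokens


def armenian_sort_key(word: str):
    """Rank each letter by alphabet position; unknown characters sort last."""
    return [RANK.get(t, len(ORDER) + ord(t[0])) for t in tokenize(word.lower())]


def index_letter(word: str) -> str:
    """Letter whose index subpage the word belongs on (ու gets its own page)."""
    return tokenize(word.lower())[0]
```

Keeping the rules in a per-language table like `ORDER` is one way to stop a newly added index (such as Old Armenian) from silently resetting another language to Unicode order.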
As you can see in this edit, the assisted translations feature does not recognize or warn about edits made between the time the page is opened and the time an edit is saved, resulting in a loss of content/edits. See the edit history around that edit for a clearer picture. --EncycloPetey 04:39, 28 April 2009 (UTC)
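The missing check reported above amounts to remembering which revision the editor started from and refusing to save over a newer one. The sketch below is a generic illustration in Python rather than the tool's own JavaScript. `get_latest_revid` and `save` are hypothetical callables standing in for whatever the tool uses to talk to the wiki.

```python
class EditConflict(Exception):
    """Raised when the page changed after the editor loaded it."""


def save_if_unchanged(get_latest_revid, save, base_revid, new_text):
    """Save new_text only if the page is still at the revision we started from."""
    latest = get_latest_revid()
    if latest != base_revid:
        raise EditConflict(
            f"page moved from revision {base_revid} to {latest}; reload before saving"
        )
    save(new_text)
    return True
```

A check like this still leaves a small window between the comparison and the save. The MediaWiki edit API accepts a `basetimestamp` parameter so that the server itself can detect the conflict, which is likely the more robust route.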
This word is the plural of szülő (parent) and has the "plural of" template in its definition line - it is still listed in the index. Not urgent, but when you have a chance, would you please take a look at it. And thanks again for regularly updating the index! --Panda10 22:53, 29 April 2009 (UTC)
What do you think about creating a template such as {noindex} that would be added to entries the editors don't want to include in the index? Your code would check for this template. --Panda10 02:33, 10 May 2009 (UTC)
Hmm, it sounds like a reasonable idea - and a similar template for including irregular form-ofs in the index might be nice, but a better name than {{NOINDEX}}, which is already used to stop search-engine indexing of pages, is needed. Perhaps {{doIndex}} and {{dontIndex}}? Conrad.Irwin 13:44, 10 May 2009 (UTC)
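How an index generator might honour the two proposed templates can be sketched as below. The names {{doIndex}} and {{dontIndex}} come from the suggestion above. The regular expressions, the precedence between them, and the use of "plural of" as the only form-of marker are simplifying assumptions.

```python
import re

DONT_INDEX = re.compile(r"\{\{\s*dontIndex\s*\}\}")
DO_INDEX = re.compile(r"\{\{\s*doIndex\s*\}\}")
FORM_OF = re.compile(r"\{\{\s*plural of\s*\|")  # assumed form-of marker


def should_index(wikitext: str) -> bool:
    """Decide whether an entry belongs in the generated index."""
    if DONT_INDEX.search(wikitext):
        return False          # explicit opt-out always wins
    if DO_INDEX.search(wikitext):
        return True           # explicit opt-in, e.g. an irregular form
    return not FORM_OF.search(wikitext)  # default: skip plain form-of entries


print(should_index("# {{plural of|szülő}}"))
print(should_index("{{doIndex}}\n# {{plural of|szülő}}"))
```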
Hi ConnelCirwin. I've been improving my metadata server.
I want to build the master data from all the master sources automatically and periodically.
So I'm reading the master list of wiktionaries and the master list of ISO language codes and I've found these Wiktionaries which are not ISO codes:
tokipona: A constructed language with no ISO code
roa_rup: Aromanian, so why not the standard code rup?
mo: Moldovan/Moldavian, to be deprecated in favour of combining with Romanian ro
zh_min_nan: Some long-running dispute about the correct language code I think
als: Tosk Albanian, it's a code on Ethnologue but not in ISO, need to know more
simple: Simple English, not a separate language
sh: Serbo-Croatian, deprecated by ISO
So how should we handle such cases? I haven't yet included the English Wiktionary's Template namespace titles which are 2 or 3 letters or which are members of the language code category. There may be further anomalies there. — hippietrail 08:11, 30 April 2009 (UTC)
See Wiktionary:Language code extensions; most of these are stupidity by the WM "language committee" which seems to be terminally clueless. (they denied Jerrais a WP because it "had no code", then allowed Norman, and gave it the code for Narom (nrm). fortunately no wikt yet) Specifically:
tokipona: we ignore, as with tlh (Klingon)
roa_rup: standard code didn't exist at the time, should get changed
mo: this was/is a standard code, we can sort it when mo.wikt is closed/merged
zh_min_nan: was supposed to be temporary, invented by IETF/IANA, not us; should become nan (also zh-yue, the same, but has no wikt)
als: Tosk Albanian; but used for "Alsatian", which does need an extension code, this is committee stupidity. however, they also decided to put the dictionary inside the pedia (Wort: namespace), so we just ignore it ...
simple: should have been en-simple, but harmless
sh: sh.wikt is basically dead but not closed yet, sr, hr, bs etc wikts are alive and well; eventually the Serbian will all be identified as such (as well as what is Bosnian and Croatian) and the extraordinarily offensive Serbian Nationalist "Serbo-Croatian" crap will be removed
Actually there is a context to my question. I'm making a tool to serve metadata about languages, designed to be used by other tools such as cirwin's editor.js so that they know whether to include gender fields, use script templates, need transliterations, provide interwiki links, etc. So I want to know whether and how to support things we use here but the standards don't.
As for "extraordinarily offensive crap", some places are lucky. ISO allows Tibet no region code and toes the government line on calling Burmese "Myanmar". I'm sure there's plenty more to be enraged about but all of that is beside the point and off topic.
So how should we handle old Cyrillic Romanian/Moldovan/Moldavian spellings? I guess just Cyrl is fine but ro-Cyrl or does MD belong in there somewhere? — hippietrail 08:54, 30 April 2009 (UTC)
Firstly, I am still called Conrad :p. Secondly, I've taken a completely pragmatic approach to this; although the situation on-wikt is not ideal, I've just copied it - doing anything else really isn't useful for what editor.js does. For a more generic meta-data thingy, I suppose it would be silly to copy Wiktionary; so how you handle that I'm not really sure - probably with multiple "override" sets that can be used in place of the true values when a different "context" is requested by the client. The main problem I am coming across currently is that our Template:zh returns "Mandarin" (the name of the Wiktionary) and not "Chinese" (the name of the language) - but I don't know enough about the issues to deal with it. Conrad.Irwin 09:24, 30 April 2009 (UTC)
So do you need data for the language codes which we have language templates for, including nonstandard templates, and all the language codes which have Wiktionaries, whether or not those codes are standard and whether or not we use those codes on enwikt? I suppose I should include fields for "in iso", "in wikimedia" and "wiktionary template".
Which means the only real problems are where one of these three uses a different code to one of the others, or when any two of them use the same code for different languages. And this does seem to be the case for a few where ISO uses one code and either wikimedia or enwikt or both use a different code. — hippietrail 09:52, 30 April 2009 (UTC)
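The three proposed fields and the two kinds of problem named above can be sketched as a small record type. The field names follow the comment. The class, the helper functions, and the sample records are illustrative assumptions, and the sample codes are taken from this thread rather than checked against the standards.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set


@dataclass(frozen=True)
class LanguageRecord:
    name: str
    in_iso: Optional[str] = None
    in_wikimedia: Optional[str] = None
    wiktionary_template: Optional[str] = None

    def codes(self) -> Set[str]:
        """All distinct codes used for this language across the three sources."""
        fields = (self.in_iso, self.in_wikimedia, self.wiktionary_template)
        return {code for code in fields if code}


def has_code_mismatch(rec: LanguageRecord) -> bool:
    """Problem 1: the sources disagree on the code for one language."""
    return len(rec.codes()) > 1


def shared_codes(records: Iterable[LanguageRecord]) -> Dict[str, Set[str]]:
    """Problem 2: one code is used for different languages."""
    owners: Dict[str, Set[str]] = {}
    for rec in records:
        for code in rec.codes():
            owners.setdefault(code, set()).add(rec.name)
    return {code: names for code, names in owners.items() if len(names) > 1}


# Sample data as described in this discussion (not independently verified).
RECORDS = [
    LanguageRecord("Aromanian", in_iso="rup", in_wikimedia="roa_rup",
                   wiktionary_template="roa-rup"),
    LanguageRecord("Tosk Albanian", in_iso="als"),
    LanguageRecord("Alsatian", in_wikimedia="als"),
]

print([r.name for r in RECORDS if has_code_mismatch(r)])
print(shared_codes(RECORDS))
```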
(@Robert Ullmann) Serbian Nationalist "Serbo-Croatian" crap - the policy guideline is proposed by Ivan, who is an administrator of Croatian origin. I suggest you discuss this with him, but this policy (about Serbo-Croatian) is not going to be rescinded without a community consensus. Furthermore, it is completely futile to squander resources on three dialects elevated to the state of languages, when with the help of the SC header this is succinct and concise. And Bosnian and Serbian wiktionaries are no more alive than Serbo-Croatian, see the BP discussion for that, where evidence was presented. A veritable example of nationalist crap is this article about the Montenegrin dialect. The uſer hight Bogorm converſation 09:11, 30 April 2009 (UTC)
It seems discovery of valid Wiktionary language codes is not straightforward. We reserve all 2- and 3- letter all lowercase ASCII alphabetical template names as Language templates which resolve to the name of the language. This works for language codes of the form xx and xxx. So all can be discovered by searching the Template namespace.
But as far as I can see we don't reserve hyphenated language codes and we don't restrict membership of Category:Language templates either. If I search for all templates in the category beginning with two lowercase letters I get these results which are not in ISO 639-3:
aoq: Ammonite -- no ISO 639-3 code. nothing in Ethnologue.
ast-leo: Leonese -- ast is Asturian. Ethnologue says Leonese is a dialect of Asturian. code leo must be unofficial.
bat-smg: Samogitian -- Ethnologue says it's a dialect of Lithuanian which is lt or lit. ISO 639-3 has no code bat.
be-x-old: Belarusian (Tarashkevitsa) -- nothing in Ethnologue.
bh: Bihari -- has a Wiktionary, seems to have progressed from being a singular code to a collective code to being removed.
cbk-zam: Zamboanga Chavacano -- cbk is Chavacano. Ethnologue says Zamboanga is a dialect. code zam must be unofficial
el-it: Salentine Greek -- nothing on Ethnologue.
eml-rom: Romagnolo -- Ethnologue says eml is for Emiliano-Romagnolo.
fiu-vro: Võro (already discussed, should be moved to vro)
fr-ca: Canadian French (nominated for deletion)
fr-nng: Guernésiais -- Ethnologue counts it as a dialect of French. code nng must be unofficial
fr-nnj: Jèrriais -- Ethnologue counts it as a dialect of French. code nnj must be unofficial
fr-nnx: Norman -- Ethnologue counts it as a dialect of French. code nnx must be unofficial
map-bms: Banyumasan. ISO counts it as a dialect of Javanese jv. code map doesn't exist in ISO 639-3
mo: Moldavian -- has a Wiktionary. now covered by Romanian ro
mol: Moldavian -- see above
nah: Nahuatl -- has a Wiktionary. was an ISO 639-2 code but is not an ISO 639-3 code. There are many codes for Nahuatl languages/dialects
nap-cal: Calabrese -- Ethnologue says nap is for Napoletano-Calabrese.
nds-nl: Dutch Low Saxon --
no-rik: Norwegian Riksmål -- historic predecessor of Bokmål but seems to have no language code of its own.
roa-rup: Aromanian (has a Wiktionary, already discussed)
sfk: Safwa -- seems to be an erroneous dupe of sbk.
simple: Simple English (has a Wiktionary, already discussed)
sr-mon: Montenegrin -- code mon must be unofficial
suh: Suba -- Ethnologue gives suh for Suba but ISO 639-3 gives sxb. Ethnologue does not list sxb at all.
szk: Sizaki -- Ethnologue says Sizaki has ISO 639-3 code szk but ISO 639-3 has no such code.
tokipona: Toki Pona (has a Wiktionary, already discussed)
twf-pic: Picuris. Ethnologue says Picuris is a dialect of Tiwa. code pic must be unofficial.
wwg: Woiwurrung -- seems to be an Australian language without a code. wwg is not in Ethnologue or ISO 639-3, nor is the spelling Woiwurrung anywhere in Ethnologue, perhaps there is another spelling?
zh-classical: Old Chinese
zh-cn: Simplified Chinese
zh-min-nan: Min Nan -- has a Wiktionary
zh-tw: Traditional Chinese
zh-yue: Cantonese
zkm: Maikoti -- Ethnologue says this code has been retired and code kjl for Western Parbate must be used instead.
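The discovery rule described before the list (template names of two or three lowercase ASCII letters are reserved for language codes, while hyphenated and longer names are not) can be sketched as a filter over template titles. The function names are illustrative, and the sample titles are taken from the list above.

```python
import re

# Two or three lowercase ASCII letters, and nothing else.
RESERVED_CODE = re.compile(r"[a-z]{2,3}")


def is_reserved_language_template(title: str) -> bool:
    """True for template names of the reserved xx / xxx form."""
    return RESERVED_CODE.fullmatch(title) is not None


def split_candidates(titles):
    """Separate reserved-form names from the ones needing manual review."""
    reserved = [t for t in titles if is_reserved_language_template(t)]
    review = [t for t in titles if not is_reserved_language_template(t)]
    return reserved, review


titles = ["bak", "mo", "zh-min-nan", "simple", "tokipona", "fr-ca", "nah"]
print(split_candidates(titles))
```

Names in the review group still have to be checked against ISO 639-3 by other means, since the naming pattern alone says nothing about whether a code is official.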
"When you approve someone, don't forget to go to Special:UserRights and mark them as autopatroller." Oh, is it enough for one person to support the vote? I was regarding it as something like an admin vote, where a decision was made after several people had contributed viewpoints. Equinox◑23:35, 2 May 2009 (UTC)Reply
P and NP
or just having a lend?
I can't see what is wrong with putting a link to a transcription of a text of a war diary in with the citation on "Cornstalk"? "Tommy Cornstalk" is certainly not a dictionary.
As for your RfV on the slang meaning of "bosky", if you had an olive complexion (like me) and had been around factories, farms and football training areas in Australia in the 50s, 60s, and 70s (like me) you would have heard the word used to refer to you generically if they didn't know your name. "Hey, bosky" would be used to attract your attention, just as you would be called "snowy" or "snow" if you had fair or blond hair, or "bluey" or "blue" if you had red hair. It is likely that most of these usages have not found their way into even a slang dictionary. The usage I have got for the word comes from a novel. I have to track it down on my bookshelves.
I was not stalking you in particular, just looking at Special:RecentChanges in order to find contributions that needed improvement (see meta:Help:Patrolled edit for more information). I had a brief look on http://books.google.com and http://groups.google.com and could find no written evidence to support what you were saying at bosky (which is needed by WT:CFI). You are right there is nothing wrong with the citation you added, but there is also nothing useful about it, which is why I linked you to the Help page and did not remove it from the entry. Conrad.Irwin 08:36, 11 May 2009 (UTC)
I understand, but Google Books is far from exhaustive. The usage of "bosky" I put in is almost certainly archaic and may only have been used in one region, i.e. inland NSW and Queensland. I am going to try to find the book I saw the usage in. Why isn't a link to a book's online transcript useful? Albatross2147 03:25, 14 May 2009 (UTC)
Hi, sorry, I'm not great at explaining things. Essentially the quotations you added were "mentions" of the word: they did not show it being used, they showed it being discussed. We generally only accept uses for WT:CFI (though whether that is documented well, I don't know); while there is no harm in adding quotes that show the word being discussed, they don't show the reader (of Wiktionary) how the word is used. Conrad.Irwin 08:51, 14 May 2009 (UTC)
Dear Conrad,
the Spanish Index generated by your bot code contains quite a few verb forms that have not been detected as such. These are mainly the words ending in "ríais" and "rían". As all Spanish words with these endings are verb forms, you could maybe consider detecting them and excluding them from the index, which should IMHO be easy to implement in the code. Matthias Buchmeier 09:26, 13 May 2009 (UTC)
Sure, I'll try to update it before the next run. (Seems to me that I just need to exclude definitions containing "(first|second)-person singular of" in addition to all the current rules that only match if the definition contains "form of"?) Conrad.Irwin 09:30, 13 May 2009 (UTC)
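A minimal sketch of the extra rule described in the reply above, in JavaScript. The names `FORM_OF_PATTERNS`, `isInflectedFormDefinition` and `lemmaDefinitions` are hypothetical and not taken from the bot's actual code; only the two patterns themselves come from the comment.

```javascript
// Hypothetical rule list: the first pattern stands in for the existing
// "form of" rules, the second is the addition proposed in the reply above.
const FORM_OF_PATTERNS = [
  /\bform of\b/i,
  /\b(?:first|second)-person singular of\b/i,
];

// True when a definition line looks like it describes an inflected form.
function isInflectedFormDefinition(definition) {
  return FORM_OF_PATTERNS.some((pattern) => pattern.test(definition));
}

// Keep only the definitions that should still appear in the index.
function lemmaDefinitions(definitions) {
  return definitions.filter((d) => !isInflectedFormDefinition(d));
}
```

As written, the second pattern would not catch a definition worded "second-person plural ... of", so the "ríais" and "rían" forms may need a further pattern depending on how their definitions are phrased.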
Simple English Wiktionary request
Hello! We've really been enjoying your script over at se:wikt. I have a small request, however. Do you think you could make it so that it shows green links for template:irrnoun? Thank you! Tygrrr 17:15, 13 May 2009 (UTC)
Conrad, I've noticed that the masculine plural is not properly accelerated for this template. The feminine singular and plural both work fine, but the masculine plural is generated as a feminine entry for some reason. I can't figure out why (in part because the coding is a bit beyond me). --EncycloPetey 03:11, 14 May 2009 (UTC)
Hi there, it's w:User:Hangfromthefloor from #mediawiki on irc.freenode.net from Friday evening. I'm extremely grateful for your help, and even though your solution didn't work perfectly in the end, I appreciate all the effort you made trying to make it work. So, thanks once again! Hangfromthefloor 02:30, 6 June 2009 (UTC)
Galician index
I've added more than 1200 words in the past few months (which more than doubles the number of raw entries we had, not counting "Translation section entries" in the Index). It's probably time to re-run this. Reminder: alphabetization is the same as for Spanish. --EncycloPetey 16:26, 8 June 2009 (UTC)
I can do this when I get back to my computer. It was last run on the 27th of May, and I had intended to do it on Saturday, but forgot. Conrad.Irwin 16:36, 8 June 2009 (UTC)
On an unrelated issue: Do you know why the server is acting so sluggish today? I'm trying to add Galician verb forms via bot, and all I get is "Pausing 5 seconds due to database server lag." over and over. ...and I can't recall the (UNIX) code to stop a running process. --EncycloPetey 16:57, 8 June 2009 (UTC)
Not sure, maybe they are doing the work to get the new search engine installed on Wiktionary too (well, I can dream :D). Conrad.Irwin 17:10, 8 June 2009 (UTC)
(Oh, and CTRL+C should halt it; if not, use CTRL+Z to suspend it and then ps ax and kill to get rid of it.) Conrad.Irwin 17:11, 8 June 2009 (UTC)
Thanks. Are you ready (and willing) to branch out into other Romance languages? This would be relatively easy, since they all use Latin-based alphabets. If so, I can work up details (if you need them) for alphabetizing Romanian, Catalan, Portuguese, and French. All of these are major languages that are well-represented on Wiktionary, but which have old Index pages in dire need of updating. Starting an Index for Asturian, Neapolitan, Sicilian, and Occitan could also be useful, although each of those would be less likely to need updating, as people seldom contribute in those languages. --EncycloPetey 19:05, 9 June 2009 (UTC)
Yes, certainly. Most of what has been holding me back is that some of the indexes for these languages already contain lists of red-links, but I can always do the same as for Index:Russian and move them somewhere (i.e. Wiktionary:Requested Entries:<Language>/<letter>). Conrad.Irwin 11:27, 10 June 2009 (UTC)
Catalan: Vowels may have an acute accent (only É é, Í í, Ó ó, Ú ú), a grave accent (only À à, È è, Ò ò), or a diaeresis (only ï and ü, and I don't think this occurs over capital letters). I'm uncertain, but I believe indexing order is in the sequence: unmarked, acute, grave, diaeresis. This sequence applies for ordering in the event of a "tie" (otherwise identical spelling), but the marked vowels are considered equivalent to the unmarked vowels otherwise, so words beginning with A and Á are intermingled. The only consonant oddities I know of are the c-cedilla (Ç ç), which comes right after C/c, and the ela geminada, which is used to distinguish two "L"s from a double "LL" (as in anul·lar). I do not know whether the double "LL" (without ela geminada) is indexed as a separate letter or not. User:Carolina wren should be able to confirm or amend this information. --EncycloPetey 15:19, 10 June 2009 (UTC)
The first sort criterion is indeed the spelling with all accents ignored (the cedilla and the middle dot of the ela geminada in addition to the diaeresis, acute, and grave). Hyphens are also ignored unless what follows the hyphen is capitalised, in which case it is treated as two separate words.
The second sort criterion is the position of an accent. The later the accent occurs in the word, the earlier it sorts.
The third sort criterion is which accent it is. Grave comes last, but I believe the relative order of the diaeresis is a moot point, as I don't believe there are any words in Catalan where a diaeresis contrasts with acute or grave. In any case the IEC grammar merely asserts that grave comes after acute.
Examples
forca < força < forcall
cella < cel·la < cellut
suplica < suplicà < súplica (no accents < grave on 7th letter < acute on 2nd letter)
sénia < sènia ( acute on same letter < grave on same letter )
beneit < beneït
mallar < mà-llarg < mànec (sorted as mallar < mallarg < manec )
Finally there are three sanctioned methods of sorting multi-word phrases, with apostrophes ignored in all three.
Word by word
mà < mà a mà < mà d’obra < mà d’urpa < mà de morter < mà invisible < mà morta < maça
Word by word, ignoring prepositions and articles
mà < mà invisible < mà a mà < mà morta < mà de morter < mà d’obra < mà d’urpa < maça
All smashed together
mà < mà a mà < maça < mà de morter < mà d’obra < mà d’urpa < mà invisible < mà morta
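The three single-word criteria above can be sketched as a comparator in JavaScript. This is only an illustration under stated assumptions: the names `MARK_WEIGHT`, `catalanSortKey` and `compareCatalan` are made up here, the weights given to the diaeresis, cedilla and middle dot are guesses (the discussion only fixes "acute before grave, grave last"), and the multi-word phrase rules are not handled. Comparing the marks left to right, with unmarked letters first, reproduces "the later the accent, the earlier it sorts" for the examples listed.

```javascript
// Weight of each diacritic. Only acute < grave comes from the discussion;
// the values for diaeresis, cedilla and middle dot are assumptions.
const MARK_WEIGHT = new Map([
  ['\u0301', 1], // combining acute
  ['\u0308', 2], // combining diaeresis
  ['\u0327', 2], // combining cedilla
  ['\u00B7', 2], // middle dot of the ela geminada
  ['\u0300', 3], // combining grave
]);

function catalanSortKey(word) {
  // A hyphen followed by a capital splits the entry into two words;
  // any other hyphen is ignored.
  const text = word.replace(/-(?=\p{Lu})/gu, ' ').replace(/-/g, '');
  const letters = [];
  const marks = [];
  for (const ch of text.normalize('NFD')) {
    if (MARK_WEIGHT.has(ch)) {
      // Attach the mark to the letter that precedes it.
      if (marks.length > 0) marks[marks.length - 1] = MARK_WEIGHT.get(ch);
    } else {
      letters.push(ch.toLowerCase());
      marks.push(0);
    }
  }
  return { letters: letters.join(''), marks };
}

function compareCatalan(a, b) {
  const ka = catalanSortKey(a);
  const kb = catalanSortKey(b);
  // First criterion: spelling with all marks ignored.
  if (ka.letters !== kb.letters) return ka.letters < kb.letters ? -1 : 1;
  // Second and third criteria: at the first position where the marks
  // differ, the unmarked (or lighter) letter sorts first.
  for (let i = 0; i < ka.marks.length; i++) {
    if (ka.marks[i] !== kb.marks[i]) return ka.marks[i] - kb.marks[i];
  }
  return 0;
}
```

For example, `['súplica', 'suplicà', 'suplica'].sort(compareCatalan)` puts suplica first and súplica last, matching the ordering given in the examples.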
French: There are a number of accented or modified characters, but none of them are considered separate letters. They are all indexed along with the same character unmarked. For consonants, there is the c-cedilla (Ç ç). Vowels may have an acute accent (É é only), a grave accent (À à, È è, Ù ù), a circumflex (Â â, Ê ê, Î î, Ô ô, Û û), or a diaeresis (Ë ë, Ü ü). I believe the order I've given is that used for breaking spelling "ties", but I do not know French well enough to be certain. User:Lmaltier could probably tell you. There is also an uncommon digraph Œ / œ, and I'm not certain but I believe it is indexed as if it were Oe / oe. --EncycloPetey 17:08, 10 June 2009 (UTC)
For French, I think that EncycloPetey is right. But you should add ÿ (used in a few proper nouns) and ï (used in many words, e.g. naïf). Also: ä (used in länder), ñ (used in cañon). Other foreign words with diacritics are sometimes used with their original spelling, but this is not a general use. Lmaltier 18:06, 10 June 2009 (UTC)
What to do with compounds vs. uncompounded forms? (IMHO the sensible option is to list them next to each other no matter how we deal with hyphenation in general.) Cf. abri sous roche vs. abri-sous-roche (the latter is the one listed in dictionaries because of the French typographic definition of an entry, but appears much, much rarer than its unhyphenated counterpart), antiaérien vs. anti-aérien... Circeus 18:05, 23 June 2009 (UTC)
Portuguese: Vowels may have an acute accent (Á á, É é, Í í, Ó ó, Ú ú), a grave accent (À à, Ù ù), a circumflex (Â â, Ê ê, Ô ô, although Â seems very rare), a tilde (Ã ã, Õ õ), or a diaeresis (Ü ü, which seems to occur only in the combination Qü or qü). Indexing sequence seems to be in the order: unmarked, acute, grave, circumflex, tilde, diaeresis (although I'm making a bit of a guess on the last three). This sequence applies for ordering in the event of a "tie", but the marked vowels are considered equivalent to the unmarked vowels otherwise, so (for example) words beginning with A and Á are intermingled. The only odd consonant character is Ç / ç, which is considered equivalent to C / c for indexing (but follows it in the event of otherwise identical spellings). Portuguese also has the following digraphs: ch, lh, nh, rr, but I do not know the current conventions for indexing them. Traditionally, they were considered separate letters. You might inquire of User:Daniel. to see whether this is still the case. --EncycloPetey 15:19, 10 June 2009 (UTC)
Various orthographic reforms occurred in the Portuguese language over the last hundred years, the last one being in effect since January 2009. Therefore, a relatively high number of different rules produced words to be included here in Wiktionary, including many that would probably be defined as {{obsolete spelling of}}.
The Portuguese alphabet is composed of the same 26 letters of the English alphabet and in the same order, from A to Z. This includes the letters K, W and Y, that were once considered not part of the alphabet and are still avoided in favor of substitutions with the same sounds. For example, in Portuguese karate is caratê and Kenya is Quênia.
There are the following letters with diacritics: á, à, â, ã, ç, é, ê, í, ó, ô, õ, ú (and their uppercase counterparts: Á, À, Â, Ã, Ç, É, Ê, Í, Ó, Ô, Õ, Ú). Some uses of the diacritics were eventually abolished, such as the distinction between tôrre and torre, platéia and plateia, among thousands of other words. All of these letters with diacritics are still very common in lowercase, but some are never used as first letter in any word, so their uppercase usage is restricted.
The diaeresis (Ü, ü) is now obsolete, but was also very common; it mainly occurred in the combinations güe, güi, qüe and qüi to determine whether the vowel u would be pronounced. It could also be used to transform diphthongs into hiatuses: saüdade, uïvo, aïpo, etc.
Another obsolete use of diacritics is the placing of a grave on certain vowels (À, à, È, è, Ì, ì, Ò, ò, Ù, ù) to change their sounds: pregar and prègar to distinguish between the translations to preach and to nail. Today, the grave has another use: it occurs only on the letter a, to mark certain contractions.
There are the following digraphs: bd, bt, cc, cç, ch, ct, gd, gu, lh, mn, nh, pc, pç, pt, qu, rr, sc, sç, ss, tm, xc and xs.
Additional digraphs, uses of diacritics and uses of K, W and Y may appear in words directly borrowed from other languages, such as in watt, kanji, walkie-talkie and the names Shakespeare, Loth and Müller.
Digraphs and diacritics are not considered separate letters and do not interfere with the alphabetical order. For example, here are some words correctly sorted: cachorro, chaminé, coração, eleger, ética, excelente.
In the event of a tie, the order of diacritics commonly used in dictionaries, including the diaeresis, is: unmarked, cedilla, acute, grave, circumflex, diaeresis, tilde. Another example: agua, água, sanguinário, sangüinário. --Daniel. 05:47, 11 June 2009 (UTC)
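The tie-break order given above can be sketched as a plain string sort key in JavaScript. The names `PT_MARK_RANK`, `portugueseSortKey` and `comparePortuguese` are hypothetical, and one point is an assumption rather than something stated in the discussion: when two words differ in more than one diacritic, the leftmost difference is taken to decide.

```javascript
// Rank of each diacritic in the order: unmarked (0), cedilla, acute,
// grave, circumflex, diaeresis, tilde.
const PT_MARK_RANK = new Map([
  ['\u0327', 1], // combining cedilla
  ['\u0301', 2], // combining acute
  ['\u0300', 3], // combining grave
  ['\u0302', 4], // combining circumflex
  ['\u0308', 5], // combining diaeresis
  ['\u0303', 6], // combining tilde
]);

// Builds "<letters without marks>\u0000<one rank digit per letter>", so an
// ordinary string comparison sorts by letters first and by marks second.
function portugueseSortKey(word) {
  let base = '';
  let ranks = '';
  for (const ch of word.normalize('NFD').toLowerCase()) {
    if (PT_MARK_RANK.has(ch)) {
      // Overwrite the rank of the letter this mark is attached to.
      if (ranks.length > 0) ranks = ranks.slice(0, -1) + PT_MARK_RANK.get(ch);
    } else {
      base += ch;
      ranks += '0';
    }
  }
  return base + '\u0000' + ranks;
}

function comparePortuguese(a, b) {
  const ka = portugueseSortKey(a);
  const kb = portugueseSortKey(b);
  return ka < kb ? -1 : ka > kb ? 1 : 0;
}
```

Sorting the two example lists from the comment with `comparePortuguese` should leave them in the order shown there; digraphs need no special handling because they are not treated as separate letters.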
Romanian: It looks as though there's already a good guide across the top of Index:Romanian. There are a few letters (k, q, w, and y) that are non-native, but do occur in some borrowed foreign words. There is a "concave accent" that can occur over an "a" (as Ă / ă), a circumflex that can occur over "a" or "i" (as Â, â, Î, î), and a "cedilla" (actually a floating symbol not attached) that can occur under "s" or "t" (as Ş, ş, Ţ, ţ). Each of these modified symbols (vowels and consonants) is treated as a separate letter, and is indexed immediately following the same spelling unmarked. All of these except Â may occur at the beginning of a word and should have separate index pages. Acute accents over vowels are possible in writing, but are only used in rare circumstances to differentiate between homographs. Of course, check with User:Opiaterein before starting, as he may have additional comments. --EncycloPetey 15:19, 10 June 2009 (UTC)
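Because the marked letters count as separate letters here, a different technique fits: compare words letter by letter against an explicit alphabet. The sketch below is in JavaScript and the names `RO_ALPHABET`, `romanianSortKey` and `compareRomanian` are hypothetical. The placement of â directly after ă, the mapping of the comma-below letters onto the cedilla forms, and sorting unknown characters last are all assumptions.

```javascript
// a ă â b c d e f g h i î j k l m n o p q r s ş t ţ u v w x y z
const RO_ALPHABET =
  'a\u0103\u00E2bcdefghi\u00EEjklmnopqrs\u015Ft\u0163uvwxyz';

function romanianSortKey(word) {
  const text = word
    .normalize('NFC')
    .toLowerCase()
    .replace(/\u0219/g, '\u015F') // s with comma below -> s with cedilla
    .replace(/\u021B/g, '\u0163'); // t with comma below -> t with cedilla
  return Array.from(text, (ch) => {
    const index = RO_ALPHABET.indexOf(ch);
    // Characters outside the alphabet sort after every known letter.
    return index === -1 ? RO_ALPHABET.length : index;
  });
}

function compareRomanian(a, b) {
  const ka = romanianSortKey(a);
  const kb = romanianSortKey(b);
  const shared = Math.min(ka.length, kb.length);
  for (let i = 0; i < shared; i++) {
    if (ka[i] !== kb[i]) return ka[i] - kb[i];
  }
  // Identical up to the shorter length: the shorter word comes first.
  return ka.length - kb.length;
}
```

With this comparator a word starting with ă sorts after every word starting with a, which is the behaviour a separate index page per letter relies on.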
Life has suddenly become unexpectedly busy. I will try and implement these over the course of the next fortnight, though I had been hoping to have them done yesterday. Conrad.Irwin 23:36, 12 June 2009 (UTC)
Catalan and Romanian are being generated (and I've yet to receive any feedback about the inaccuracies in either). French gets exported, and translations for it found, but I have not yet written the sorter; I can do that tomorrow, I think. Haven't yet looked at Portuguese; again, over the weekend is my aim. Conrad.Irwin 01:47, 1 August 2009 (UTC)
Catalan is being indexed pretty much as expected as far as I can tell. However, would it be possible for the sorting bot to override its own rules and use the sort key if an inflection line template uses it? That would allow phrases such as a la romana to sort under romana, a la instead. (Sort key was added to the entry since the last indexing, so if this is already being done, just ignore me.) All the català inflection line templates that have a sort override use sort as the parameter, the same as {{infl}}. The suggested improvement would likely apply to the other languages as well. — Carolina wren discussió 01:30, 7 September 2009 (UTC)
I could add support for it. For languages like English I just remove all the "the"s and "a"s from the front anyway - should I do that for "a" and "la" for Catalan as well as looking for the sort parameter? Conrad.Irwin 07:15, 7 September 2009 (UTC)
No, don't try autostripping. Catalan is fond of using combinations of prepositions where English would use only one. We have one yellow link in the index "a dins" that I believe should stay in its current location, and there might be more worthy additions. Nor can "la" or the other articles be stripped automagically, as unless you know whether the following word is a noun or a verb, there's no way to tell whether a Catalan article is instead a Catalan pronoun. Which makes sense since they have the same etymological origin. It would be as if instead of "The boy hit the baseball with the bat," English used *"Him boy hit it baseball with it bat." — Carolina wren discussió 07:57, 7 September 2009 (UTC)
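The sort-key override requested above could look roughly like this in JavaScript: use an explicit `sort=` parameter when a template call supplies one, and otherwise fall back to the page title without stripping anything. The function name `indexSortKey` and the template name `ca-adv` in the usage line are hypothetical, and the regular expression deliberately ignores nested templates.

```javascript
// Returns the explicit sort key from the first template call that has a
// sort= parameter, or the page title when no such parameter is present.
function indexSortKey(pageTitle, wikitext) {
  const match =
    /\{\{[^{}]*\|\s*sort\s*=\s*([^|{}]+?)\s*(?:\||\}\})/.exec(wikitext);
  return match ? match[1] : pageTitle;
}
```

For instance, `indexSortKey('a la romana', '{{ca-adv|sort=romana, a la}}')` gives `'romana, a la'`, while an entry such as "a dins" with no override keeps its own title as the key.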
T-balancing
Not sure if this was discussed before. Could the code you've written for assisted translations be used to add requested entries without editing the page? It would be one entry field for the entire page and the code would place it in correct alphabetical order. E.g. on Wiktionary:Requested entries:Hungarian but it can be for any language. --Panda10 00:10, 19 June 2009 (UTC)
Yes-ish, the code has been designed so that such changes would require the "minimal" amount of Javascript. The -ish because Javascript cannot sort properly (though we can get it most of the way there), and the "minimal" in quotes, because such a change requires making an edit to the Wikitext and the HTML in parallel (which is, although not hard, not trivial to get right). I don't have any immediate plans to add this feature, but if you want to have a go, feel free (it should be very similar to the way that translations are currently added). Conrad.Irwin 00:15, 19 June 2009 (UTC)
I am hesitant. I don't have a good understanding of how to do this and coding would take away all my time from adding/correcting Hungarian entries. Maybe there is someone else who is more knowledgeable and would be interested in coding this project for the English requests. This should be much simpler than adding translations. It is really just adding a word (no gender specifications or other grammatical things) and if sorting is not perfect, there is no harm. --Panda10 02:04, 19 June 2009 (UTC)
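As a rough illustration of "most of the way there", here is a JavaScript sketch that inserts a requested term into an already sorted list. The names `foldKey` and `insertSorted` are hypothetical, and folding away diacritics is only an approximation of real alphabetical order (Hungarian, for one, has its own rules), which matches the remark that imperfect sorting does no harm here.

```javascript
// Approximate sort key: strip combining marks and ignore case.
function foldKey(term) {
  return term
    .normalize('NFD')
    .replace(/[\u0300-\u036f]/g, '')
    .toLowerCase();
}

// Returns a new array with `term` placed after every entry whose folded
// key is less than or equal to its own. `entries` must already be sorted.
function insertSorted(entries, term) {
  const key = foldKey(term);
  let low = 0;
  let high = entries.length;
  while (low < high) {
    const middle = (low + high) >>> 1;
    if (foldKey(entries[middle]) <= key) low = middle + 1;
    else high = middle;
  }
  const result = entries.slice();
  result.splice(low, 0, term);
  return result;
}
```

The same position could then be used twice, once for the wikitext list and once for the rendered HTML list, which is the "in parallel" edit mentioned above.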
"no can do" error message found
Hi cirwin. I had a hunch that paid off. There is an incompatibility between "paperview" and one of my JavaScript extensions. parser.js does an alert "This doesn't look like a Wiktionary page. No can do I'm afraid." I boldly edited parser.js to add "Conrad Irwin / parser.js\n" at the beginning of all alert messages so they won't be so hard to find in the future. Feel free to change it however you think is best though.
Oh, there are also two Google scripts included randomly from somewhere too, but I think these were a red herring and not the things which were clashing with paperview. — hippietrail 02:33, 21 June 2009 (UTC)
from Georgian Wiktionary
(excuse me, because my knowledge of English is not good)
Hello, I'm David, admin of the Georgian Wiktionary. I've seen the English translation template. It is a very easy template for all users. I would like to have this template on the Georgian Wiktionary. I have translated it and got something working, but it is not as good as the English one. Can you tell me how this template was made? Or is this a secret :) Thank you in advance. Dato deutschland 06:49, 22 June 2009 (UTC)
Our template works. There is only one problem. The English template has an additional function (Add translation, very easy), but the Georgian template does not have this function. Please excuse me if I have explained this badly. Dato deutschland 12:49, 22 June 2009 (UTC)
Hi Conrad,
Just FYI, I modified the race-condition–handling code in MediaWiki:langcode2name.js and User:Conrad.Irwin/iwiki.js. In the latter case, the issue was that JavaScript is finicky: ! document.getElementById (for example) is true when document lacks a getElementById property, but ! langcode2name is an error if langcode2name is undefined. I don't know why this should be, but I Googled and I tested, and I'm pretty confident of it. In the former case, the issue is that the for(var prop in obj) construct iterates over the names of the properties, not their values, so in this case callback is the string 'iwiki' rather than the function add_prominent_interwikis.
(I don't know if similar race-condition–handling code is being used elsewhere on Wiktionary; if so, I imagine we need to change all of it.)
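Both points above can be reproduced in a few lines of JavaScript. The function names `probe` and `runCallbacks` and the `callbacks` object are made up for illustration; only `langcode2name`, `iwiki` and `add_prominent_interwikis` are names from the message, and the sketch assumes `langcode2name` has not been declared anywhere.

```javascript
function probe() {
  const result = {};
  // typeof never throws, even for an identifier that was never declared.
  result.typeofCheck = typeof langcode2name === 'undefined';
  try {
    // Negating an undeclared identifier throws, because the name itself
    // cannot be resolved.
    result.negation = !langcode2name;
  } catch (err) {
    result.negation = err.name;
  }
  // A missing property on an existing object simply evaluates to undefined.
  result.missingProperty = !({}).noSuchProperty;
  return result;
}

const callbacks = {
  iwiki: function add_prominent_interwikis() {
    return 'called';
  },
};

function runCallbacks(registry) {
  const results = [];
  for (const name in registry) {
    // `name` is the property name (a string); index the object to get
    // the function stored under it.
    results.push([name, typeof name, registry[name]()]);
  }
  return results;
}
```

So a guard of the form `typeof langcode2name === 'undefined'` is safe before the other script has loaded, and a loop over registered callbacks has to call `registry[name]` rather than `name` itself.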
Thanks a bunch! Don't worry about the Latin letters: now that I know about them, and they're conveniently indexed, I can go through and take care of them. :-) —Ruakh TALK 17:57, 6 July 2009 (UTC)
By the way, the Hebrew index is not clickable in Opera. I mean the browser does not recognize the links except for the last letter ת. And it was that way even before automatic indexing. --Vahagn Petrosyan 23:30, 6 July 2009 (UTC)
The code that you use to parse the dump, is it public, by any chance? I ask because a lot of the Hebrew errors are technically parsing errors rather than content errors; but I'd rather fix our wiki-syntax to match what your parser will recognize, since I'm sure your parser isn't going to be the only one with this sort of problem. Thanks in advance! —Ruakh TALK 03:48, 27 July 2009 (UTC)
Umm, yes it is, but I'd rather fix the code than the entries - it's a rather huge mess of awk and python and perl held together with a bash script. What is the main problem? That way I can at least point you to the right file. You may get some idea of how it works by reading User:Conrad.Bot and looking in bash/create_indices.sh. Conrad.Irwin 07:39, 27 July 2009 (UTC)
I think I came across two issues, both having to do with translations tables. One is lack of support for preposed named parameters; for example, {{t|sc=Hebr|he|מלכה|tr=milká}} was interpreted as referring to he, when in fact it refers to מלכה. The other is lack of support for {{he-translation}}; I don't even know what the issue is there, but sometimes it gets interpreted as a single space ([[ ]]), sometimes as two ([[  ]]). —Ruakh TALK 15:08, 27 July 2009 (UTC)
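The first issue comes down to separating named parameters from positional ones before counting positions. A simplified JavaScript sketch follows; `parseTemplateCall` is a hypothetical name, not part of the indexing scripts, and it assumes a single template call with no nested templates or piped links inside it.

```javascript
// Splits one template call into its name, positional arguments and named
// arguments, so that a named argument placed first (such as sc=Hebr)
// does not shift the positional ones.
function parseTemplateCall(wikitext) {
  const inner = wikitext.trim().replace(/^\{\{/, '').replace(/\}\}$/, '');
  const [name, ...args] = inner.split('|');
  const positional = [];
  const named = {};
  for (const arg of args) {
    const equals = arg.indexOf('=');
    if (equals === -1) {
      positional.push(arg.trim());
    } else {
      named[arg.slice(0, equals).trim()] = arg.slice(equals + 1).trim();
    }
  }
  return { name: name.trim(), positional, named };
}
```

For the example above, `parseTemplateCall('{{t|sc=Hebr|he|מלכה|tr=milká}}')` yields the positional list `['he', 'מלכה']`, so the second positional argument is the Hebrew word regardless of where `sc=` appears.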
5 0
2 a
1 d
1 f
22 h
1 l
4 m
2 n
3 p
1 q
2 s
1 t
2 y
2 z
698 א
370 ב
224 ג
183 ד
370 ה
59 ו
90 ז
289 ח
109 ט
203 י
206 כ
605 ל
1 ם
762 מ
1 ן
246 נ
254 ס
265 ע
259 פ
154 צ
291 ק
198 ר
390 ש
240 ת
6516 total
T-balancing error still exists
Ok, I'll have another look then (and secretly hope it's a caching problem and that they were still using an old version). Conrad.Irwin 23:05, 4 July 2009 (UTC)
Good work
Hi, can I ask a favour of you? EncycloPetey created this template and asked me to comment, which I did, and he improved it so it's just about perfect now, with only one thing left he doesn't know how to do: Many masculine nouns in Scottish Gaelic have the same genitive singular and nominative plural. EP tweaked the template so that when one writes eg
(When the form is the same it affects e.g. the genitive plural or vocative, so it's helpful if it's pointed out.) The question is: using accelerated editing, the "form-of" page has a definition line for the genitive only. Is it possible to make it create both the genitive and the plural lines as at balaich, so that the line for the plural wouldn't have to be added manually? Another example of such a word, where I haven't created the inflected forms' entry yet, is oileanach. Thanks in advance whether it is technically possible or not. --Duncan 16:19, 6 July 2009 (UTC)
Argh! It seems highly browser-dependent, which is why for the other hashes I use #; unfortunately, that seems to introduce an extra space. Meh, I'm not sure I can fix it trivially - a fix needs to be made to w:User:Lupin/AutoEdit.js - or, perhaps better, to stop using that entirely. Conrad.Irwin 00:08, 7 July 2009 (UTC)
When I deleted "w:User:Lupin/AutoEdit.js" from User:Duncan MacCall/monobook.js the accelerated editing stopped working at all, but never mind - highlighting "%23" and overtyping it with "#" is still much better than copy-pasting the line and overtyping "genitive" with "plural". Another thing occurred to me in the meantime, though - my bad, I should have noticed it at once: the accelerated editing creates the inflected entries like "three apostrophes - inflected form - three apostrophes", thus only making them bold - but I think those too, not just the lemma, should show the gender of the noun - at least that's how I've been creating them until now, though I don't know about any policy requiring it, and it seems to me from e.g. the Spanish puños that this can be achieved - could you make the template work similarly to the Spanish one? Apologies and thanks in advance again, --Duncan 15:26, 7 July 2009 (UTC)
Hi Cirwin. Last night I actually got around to using newNode() for the first time rather than just reading over code that uses it. Before, I thought it read through one big object to create a tree of nodes. Now I realize that of course it just creates one node at a time, but that calls can be easily nested.
Helping search engines find collocations that don't meet CFI
As you must know, our search engine doesn't find terms like "stiff drink" when they are marked up per WT:ELE as "'''stiff''' drink". I also note that en.wikt gets a very high percentage of its hits from search engines (per Alexa). It would not be an evil use of meta-tags (or whatever they are called) to incorporate certain collocations which the search engines would not find due solely to wikimarkup. I took a look at Free Dictionary source for their "stiff" page. They span "stiff" "drink", so no markup interferes with the collocation. Another way of doing it would be the "Collocations" header that MZajac has been advocating, assuming that the search engine looks that deeply, especially "beneath" our show/hide templates. Any thoughts? DCDuring TALK 15:14, 14 July 2009 (UTC)
Right now it's finding stiff, with the excerpt text “Adjective: A stiff drink; a stiff dose; a stiff breeze. Translations: of an object, rigid, hard to bend, inflexible. Finnish. fi | jäykkä ...” (I can't tell whether it is compiling this excerpt from the current entry text or has indexed an older version.)
If the search engine can't find phrases broken up by formatting, then it should (must) be fixed (regardless of any other workarounds we adopt in the meantime). Is there a bug report we can all vote for? If not, then let's file one. —MichaelZ. 2009-07-14 15:30 z
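To make the formatting problem concrete, here is a small JavaScript sketch of matching a phrase against wikitext after the inline markup has been removed. It says nothing about how the MediaWiki search engine actually works; `plainText` and `containsPhrase` are hypothetical helpers that only handle quote-run bold/italic markup and simple wikilinks.

```javascript
// Removes bold/italic quote runs and reduces [[target|label]] or
// [[target]] links to their visible text.
function plainText(wikitext) {
  return wikitext
    .replace(/'{2,5}/g, '')
    .replace(/\[\[(?:[^\]|]*\|)?([^\]]*)\]\]/g, '$1');
}

// Case-insensitive phrase search over the de-formatted text.
function containsPhrase(wikitext, phrase) {
  return plainText(wikitext).toLowerCase().includes(phrase.toLowerCase());
}
```

With this, `containsPhrase("#: A '''stiff''' drink; a '''stiff''' dose.", 'stiff drink')` is true, whereas a plain substring search on the raw wikitext fails because the closing quote marks sit between the two words.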
Me too. Maybe something has been changed. Maybe there is a problem only in more specific circumstances. Maybe I erred. I'll have to pay closer attention to exactly what situations lead to unsatisfactory search results, if any. DCDuring TALK 16:34, 14 July 2009 (UTC)
I like the collocations idea, except that I fear that such a section will get unwieldy fast (especially on certain pages). Allowing boldfaced common collocations (instead of merely boldfaced words) in example sentences is another idea, though not one I'm all that fond of. See {{Keyword}} for yet another idea.—msh210℠ 15:46, 14 July 2009 (UTC)
The template addresses the issue, of course, but at the cost of a lot of labor and, if done by bots, say, using COCA collocations, a lot of space. Using lists of the top COCA collocations to populate the keyword template if the text is not in the body might be seen as cheating. It would have to be explicitly sold to the search engine folks.
Perhaps we could look at COCA to find the top N collocations, insert the list in a namespace like "Talk", "Citations", or, dare I say it, "Collocations", and add usage examples manually for some of the redlinked ones based on that data, using a template {{usex}} to facilitate formatting and metatagging at the same time. DCDuring TALK 16:34, 14 July 2009 (UTC)
Actually, I'm not sure that {{Keyword}} addresses the issue: search engines use keywords, perhaps, but I don't know whether Wiktionary's internal one does. Another idea is to not boldface anything in citations in the citations namespace (but to continue boldfacing in citations in the entry), and to by default search also in the citations namespace. Actually, I like this idea: it doesn't require "cheating" or new infrastructure. I don't know, though, whether we can change the default search namespaces. That should be easy to find out, though.—msh210℠ 16:43, 14 July 2009 (UTC)
I was going to stop complaining about our search engine until I have a better feel for when or how it fails from a user perspective. I would love to read up to 10 pages of more-or-less English on how the Mediawiki search software works. Any very easy low-resource-cost solutions, no matter how partial, are very desirable. DCDuring TALK 18:16, 14 July 2009 (UTC)
I think the issue with the built-in search may have been fixed. The issue you mention with Google just has to do with indexing delay; its cached version is from Jul 11, 2009 06:37:37 GMT, which was before DCDuring added the "stiff drink" sense and example. I'm sure it'll be fine once they re-index the page. —Ruakh TALK 23:34, 14 July 2009 (UTC)
Thanks for the link. Looks just about right for me. I'll be paying more exact attention to any search problems that I experience so I can be more specific. As Ruakh says, the problem with wikiformatting preventing MW internal search from working seems resolved. Whether Google search has a problem with wikiformat is the next question. The last question regards what could be called "collocation stuffing". How can we productively insert common collocations that don't meet CFI into web pages? We could:
make sure that all headwords deleted by reason of being SoP appear in one or more relevant entries.
have a kind of a checklist to make sure that N of the most common collocations for a given headword appear in the entry. Or
could just stuff the top M collocations into metatags and be done with it.
I'm sure there are better ideas, too. And many drawbacks and implementation problems, not to mention questions as to desirability.
Per Alexa, we do seem to be getting a lot of our traffic from search engines already, much more than from WP. And we pass many users on to sister projects. Both of those facts should make us seem valuable to WM, I hope. Maybe we could become important enough to get more of their attention to our more distinctive needs. I get the impression that we are sucking hind tit most of the time. DCDuring TALK 00:24, 15 July 2009 (UTC)
I have three questions:
How can I find out more about stop words as applied for en.wikt search?
Do we have any ability to add or subtract from the stop word list specifically for en.wikt?
Pending some kind of more definitive resolution of search engine handling of collocations, do you know whether (1) hard redirects or (2) soft redirects show up at all on Google searches? At WT:RFD#check it out we have good content including usage examples that could go to a sense of check out or to check it out. There are also 2 hard redirects. Some collocations are now usage examples also. From the point of view of our internal search I think I understand the differences. But I don't know what works best for Google or other search engines, including OneLook.com, which has a fairly big list of collocations available for a secondary search. DCDuring TALK 03:20, 27 July 2009 (UTC)
I lied to you.
Latest comment: 15 years ago1 comment1 person in discussion
Latest comment: 15 years ago4 comments2 people in discussion
Hi Conrad ^_^ I have some questions that need your wizardry: Would it be possible to pre-process Wiktionary wikicode when a user visits a particular page, do some magic on it, send that magicaly processed wikitext back to MediaWiki and display it only then to the user?
E.g. suppose I want it so to split the Serbo-Croatian L2 section as three B/C/S section, taking into account scripts and the context labels, and display such split text to the user that would select to have split display (all three of the sections, or just one of them).
Except for a bit longer opening of pages, what other drawbacks could it have?
Suppose if we used subpages for non-English language sections, e.g. {{PAGENAME}}/hr, {{PAGENAME}}/bs, {{PAGENAME}}/sr, {{PAGENAME}}/sh that would be transcluded on the main page this would all be pretty much trivial.. --Ivan Štambuk20:33, 20 July 2009 (UTC)Reply
It would be possible, but a better idea would be to just find the section and duplicate it in javascript, which would be considerably faster, if slightly harder to program correctly first time. (And which would also ensure that all the information for the three languages was in the page of the unified language). It might get slightly confusing, as we already have too many buttons and distractions, but providing some thought is given to the interface I don't see any show-stopping difficulty for this proposal. If you were to want to render wikitext and give the result to the user directly, User_talk:Conrad.Irwin/Api.js describes the library that I am currently merging into WT:EDIT, which would allow you to give the user rendered text trivially. JsMwApi().page(wgPageName).parse(newText, function (parsed) { document.getElementById('bodyContent').innerHTML = parsed }) (Code like this is the basis for the preview that assisted translations provides). Conrad.Irwin21:36, 20 July 2009 (UTC)Reply
That looks awesome! It's just that I have programmed javascript something like 2 times in my entire life (both times were cookie stealers for some dumb Croatian blogging service ^_^), so it might take me some time to simply get started (the SC->B/C/S conversion algorithm would be some 50 lines of code max.). Would you be kind enough to point me to what exactly I must set up in my monobook.css to get started? I would be esp. thankful if you could provide me with the 3-LOC skeleton of a js function that appends e.g. "Hello world" to the page's wikitext. From there I can figure out the string processing on my own.. --Ivan Štambuk21:54, 20 July 2009 (UTC)Reply
There are both simpler (but buggy) and (almost pointlessly) more complicated ways to do the same thing, but that should work without breaking the current page. If you haven't already, I'd recommend getting Firefox with Firebug as then you can easily try out new things and it gives you easier to follow error messages. Conrad.Irwin23:40, 20 July 2009 (UTC)Reply
creation.js seems to have a slight problem
Latest comment: 15 years ago3 comments2 people in discussion
Could you please take a look? Dutch diminutive plural forms are being created and categorised as plurals. This is wrong; they should be using the following format: {{diminutive of|xxx|plural=1|lang=nl}} and I guess, just like the singular forms, it should be noted that they are neuter gender. 50 Xylophone Playerstalk12:05, 28 July 2009 (UTC)Reply
Latest comment: 15 years ago4 comments2 people in discussion
Looks good so far. My only gripe would be that IMHO, a space should be treated as such, i.e. au contraire should precede aubaine, and a hyphen is probably best treated as going after a space (i.e. au-delà would be between au violon and aubaine, but given dictionaries appear to differ, I'll leave this to your discretion). AFAIK there is no standard treatment for this issue because French dictionaries are very keen on the typographical word concept: a hyphenated word will have an entry even if the unhyphenated form is actually standard (cf. abri sous roche), and multiword expressions are systematically relegated to subentries (in fact, in the Robert, they are not even considered subentries at all!).
Nonetheless, I feel strongly that putting a word break there to group such entries together is far more intuitive, and would suggest the same policy be applied to English: common sense leads one to look for at odds before Atacama rather than after atmosphere! Such a treatment also makes it possible to e.g. take at a glance a list of compounds starting with a preposition.
None of the reference books I personally own discusses alphabetical ordering or collation. If you want, I can look into it and get back to you at another time. In any case, I remain very much of the opinion that a space should be a separator. I was a bit confused by your comment re: trimming off various forms of the articles, so could you show a specific case so I understand your meaning?
Since you're already pumping the translation sections (which allows me to fix all sorts of small things), is it possible to also skim sections like synonyms and derivatives for red entries? (at first I had "alternative forms" in these, but these entries are automatically discarded for being constituted only of a "form of" template...)
Since you're obviously checking whether a French section exists (or the index couldn't selectively link only when they exist), could you apply a formatting to entries where the page exists, but not the section (cf. Abdoulla before I created the section)?
Yes, I also was unsure of space handling, not sure what made me choose the current method, but I can easily change that; as long as the collation looks right, I'm not too fussed about making it perfect. The forms of the article that I strip off are so, for example, "arithmétique", "d'arithmétique" and "de l'arithmétique" all appear next to each other - again if that's incorrect behaviour, it is easy to fix. I wasn't sure about whether au violon should be under "a" or "v".
I can try adding a scan for synonyms and derivatives - I tried once with Hungarian and found the results unpleasing, but maybe it will be better for a language with so many more entries, I'll have a go. It should be possible to colour the link if it doesn't have a local language link (I hadn't thought of that before, and might apply it to all languages). Conrad.Irwin22:18, 1 August 2009 (UTC)Reply
I would recommend not dropping "de" and "d'" at the beginning of a word: it would be exceptional for them to be determiner rather than article preposition in that particular position within the lemma.
Regarding derivatives and synonyms, I meant it for locating additional missing entries (red and "orange" links), so IMHO the symbols can be dropped without problem. (I think it's safe to assume there will be less issues than there are with incorrectly entered translations) Circeus01:50, 2 August 2009 (UTC)Reply
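The ordering discussed in this thread can be sketched in JavaScript. This is only an illustration, assuming that a space should sort before a hyphen and that both should sort before any letter, with case and diacritics ignored. The names `collationKey` and `sortTitles` and the control-character trick are assumptions made for the example, not the index bot's actual code, and article stripping is deliberately left out.

```javascript
// Build a sort key in which a space sorts before a hyphen, and both sort
// before any letter; case and diacritics are ignored.
function collationKey(title) {
  return title
    .toLowerCase()
    .normalize("NFD")                 // split letters from combining accents
    .replace(/[\u0300-\u036f]/g, "")  // drop the combining accents
    .replace(/ /g, "\u0001")          // space: lowest sort weight
    .replace(/-/g, "\u0002");         // hyphen: just after a space
}

// Sort a list of page titles by their collation keys.
function sortTitles(titles) {
  const keyed = titles.map((t) => [collationKey(t), t]);
  keyed.sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0));
  return keyed.map((pair) => pair[1]);
}

console.log(sortTitles(["aubaine", "au-delà", "au violon", "au contraire"]));
```

With these assumptions the list comes out as au contraire, au violon, au-delà, aubaine, which is the order proposed above. Ligatures such as œ are not handled by this sketch.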
Wiktionary logo 2009 refresh voting
Latest comment: 15 years ago2 comments2 people in discussion
Hello there, I just would like to inform you that the Wiktionary logo 2009 refresh voting should start now, but so far we haven't come up with any rules yet. Since you were very active in the discussion, would you like to help out again? Thanks in advance. Wyvernoid03:14, 8 August 2009 (UTC)Reply
Latest comment: 15 years ago1 comment1 person in discussion
Thanks for your advice. I got 'told off' on Wikipedia for not using that template so I tried to bring it across to Wiktionary but it depended on another template which in turn depended on another and so on. It seems over engineered and messy.
If you mean, could it be made to jump to the correct section of Wiktionary:Glossary, then the answer is yes, if you add some markup around each item in the list as follows:
Latest comment: 15 years ago2 comments2 people in discussion
Do we know how search engines handle hard redirects vs soft redirects vs only-in vs full entries? How could I find out?
From a user point of view hard redirects afford some advantages IMO because the user needs one click less to get to meaningful content. It is also slightly quicker to add redirects than "form of" entries. But I suspect that they may not show up in search engine search pages. DCDuringTALK18:13, 10 August 2009 (UTC)Reply
I don't know, I suppose by doing some searching you could work it out - the only one I tried "to kill the fatted calf" was not indexed. One of the advantages of the big search engines is that they do some of the form-normalisation for us, so that searches are more likely to end up at the right place without the need for the redirection - I suppose we have to remember that most people won't type into the URL bar, they'll use the search boxes or a search engine. Conrad.Irwin21:13, 10 August 2009 (UTC)Reply
OK, experimentation it is: Google, Yahoo, Bing(?), OneLook, Answers.com.
Stop words for internal search
Latest comment: 15 years ago2 comments2 people in discussion
It is my understanding that Lucene search includes personal pronouns (he, his, etc) as well as "one" and (I think) "one's" as stopwords. It does not seem to include "someone", "someone's", "somebody", and "somebody's". I'm not sure that I'm thinking about this right, but if they were also on the stopword list, wouldn't that sometimes improve folks' ability to find lemmas containing those words, eg, when they typed in other words (usually stopwords like "you", "your", "yours") ? DCDuringTALK18:13, 10 August 2009 (UTC)Reply
It would increase the number of pages that matched the search, because it would not be searching for "someone's", but it would also bring up more false-positives. On Wiktionary, I think you are right, and having "someone" as a stop-word might be beneficial given that we have pages with "one" - I can't think of any phrases involving somebody that have only one other word (off the top of my head), but even those with only two words might be a bit of a problem. Conrad.Irwin21:18, 10 August 2009 (UTC)Reply
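The trade-off in this exchange can be illustrated with a toy matcher in JavaScript. This is a sketch only: it is not how Lucene works internally, and the stop-word lists, the function names `tokens` and `matches`, and the example strings are all assumptions made for illustration.

```javascript
// Split text into lowercase words and drop anything on the stop-word list.
function tokens(text, stopWords) {
  return text
    .toLowerCase()
    .split(/[^a-z']+/)
    .filter((w) => w !== "" && !stopWords.has(w));
}

// A title matches when every remaining query word occurs in the title.
function matches(query, title, stopWords) {
  const titleWords = new Set(tokens(title, stopWords));
  return tokens(query, stopWords).every((w) => titleWords.has(w));
}

// Hypothetical lists: "base" has only the pronouns mentioned above,
// "extended" adds the someone/somebody forms.
const base = new Set(["he", "his", "one", "one's"]);
const extended = new Set([...base, "someone", "someone's", "somebody", "somebody's"]);

console.log(matches("pull someone's leg", "pull one's leg", base));     // false
console.log(matches("pull someone's leg", "pull one's leg", extended)); // true
```

The second call succeeds because "someone's" is no longer searched for, which is the benefit described above. The cost is the false positives: a query made up only of stop words matches every title in this sketch.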
Automatic translation
Latest comment: 15 years ago5 comments2 people in discussion
Hello, Conrad. If I remember correctly, it was introduced by you, so there is some sort of feedback - when I add Bulgarian translations for nouns, the fields for gender and transliteration appear, but when I try to add Old Norse translations and I fill up the code non in the first field and click on the second field, both fields (check box and edit box respectively) for gender and transliteration disappear, whereas the first one (the check box for gender) should remain available, since the Old Norse language has three genders. Is this flaw easily reparable? I shall be adding Old Norse nouns manually in the meantime. The uſerhight Bogormconverſation12:03, 13 August 2009 (UTC)Reply
Hi, the gender buttons are still there, just hidden under the "More..." link. I will look into the genders issue though, it should show all genders for any language that User:Hippietrail hasn't told it about. Conrad.Irwin23:35, 13 August 2009 (UTC)Reply
Latest comment: 15 years ago2 comments2 people in discussion
"Hi Sven, long time no chat. We'll have to catch up soon when I'm not so snowed under with work. :( Just to let you know, the "standard" formatting for Eng->Chi entries nowadays is:
Please don't put them under "Mandarin"; it makes it harder to find, as most people look under Chinese when looking for translations. Cheers! Tooironic 22:18, 19 August 2009 (UTC)
Latest comment: 14 years ago7 comments2 people in discussion
Ariel Glenn overstates my expertise; we needed some table of equivalents so I consulted geography and library sites. Have only just seen your note and will look it over, over the next couple of days. —Saltmarshαπάντηση09:18, 27 August 2009 (UTC)Reply
Conrad - The transliteration table I set up at Wiktionary:About Greek/Transliteration created little interest at the time that I did it - looking back at it now there seem to be one or two possible "errors" which I should address (perhaps). Looking through the discrepancies:
It seems to sort out the ευ => ef/ev dichotomy alright.
el:προϊόν => proión != proion should IMO => proïón (see table)
Thanks a lot! I was hoping that it would already be done. We are waiting on the CTO of Wikimedia reviewing the extension as and when he gets round to it. Once it has been reviewed, it will then need another wait (hopefully much shorter) before it gets actually installed. https://bugzilla.wikimedia.org/show_bug.cgi?id=20246 is where my request to them is. Conrad.Irwin22:27, 23 September 2009 (UTC)Reply
Latest comment: 15 years ago4 comments2 people in discussion
Hi Conrad, lately when I am adding two translations in the same trans table, it adds two Hungarian lines and I have to edit it to remove the duplicate line. Before, I believe, it just added the second translation with a comma. Thanks. --Panda1014:24, 30 August 2009 (UTC)Reply
It definitely should just add them with a comma, I am currently investigating a similar problem that Guest63h brought to my attention, so maybe I'll find the problem. Conrad.Irwin14:28, 30 August 2009 (UTC)Reply
About that hard, I hope. (as with other similar templates, when you link them manually, you must also bold them manually - this then allows for short notes to be included in the inflection line). Conrad.Irwin23:00, 10 September 2009 (UTC)Reply
Latest comment: 15 years ago1 comment1 person in discussion
Hi Conrad. Could you explain the situation in re the purportedly “discarded” audio-recording button discussion mentioned herein please? It’s best to explain in that section of the Beer Parlour, since it’ll be of general interest. Thanks. †﴾(u):Raifʻhār(t):Doremítzwr﴿13:11, 12 September 2009 (UTC)Reply
a more detailed reply
Latest comment: 15 years ago6 comments4 people in discussion
Hi, you asked me yesterday on the IRC about Sven . . . I've been thinking about it and came up with a longer answer for ya. :) My keyboard, I guess, just has a bad relationship with that channel or something because I have to type reeeeeeallly slowly there to be legible but here it works. (silly thing)
I think the problems are communication and hypersensitivity. Communication, because it is really hard to read his writing and so sometimes we can't tell what the problem is. Hypersensitivity, because he likes to write in all caps so it often sounds like he's shouting and so that further aggravates other people, and then that further aggravates him. I don't think it's your fault. However, I do sympathize with him ... It's hard being neutral. Very hard.
Anyway, I've now taken up translating for him in discussion pages. I wish there was another solution. :P L☺g☺maniacchat?15:16, 12 September 2009 (UTC)Reply
I hope he appreciates your efforts. I do. I have to adopt SB's solution when my patience and kindness are on holiday together, as they often are. DCDuringTALK15:39, 12 September 2009 (UTC)Reply
Latest comment: 15 years ago1 comment1 person in discussion
I guess basically a bot would just have to look for the ==Mandarin== and ===Hanzi=== sections, then check if {{zh-hanzi| specifies sim= or tra=. If all that comes out positive, then the sim=/tra= should be copied to the {{yue-hanzi| under ==Cantonese== ===Hanzi===. If not... then nothin. :) — opiaterein — 21:03, 13 September 2009 (UTC)Reply
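The bot task described above could look roughly like the following JavaScript. This is a simplified sketch under stated assumptions: it treats the page as a plain string, assumes `sim=` and `tra=` are ordinary named parameters of `{{zh-hanzi}}`, does not check the ==Mandarin== and ==Cantonese== section headers, and the function name `copyHanziParam` is hypothetical.

```javascript
// Copy the sim=/tra= parameter from {{zh-hanzi|...}} to {{yue-hanzi|...}}.
// Returns the wikitext unchanged when there is nothing to copy.
function copyHanziParam(wikitext) {
  const zh = wikitext.match(/\{\{zh-hanzi\|([^{}]*)\}\}/);
  if (!zh) return wikitext; // no Mandarin hanzi template with parameters

  const param = zh[1]
    .split("|")
    .map((p) => p.trim())
    .find((p) => /^(sim|tra)=/.test(p));
  if (!param) return wikitext; // neither sim= nor tra= is specified

  return wikitext.replace(/\{\{yue-hanzi(\|[^{}]*)?\}\}/, (whole, rest) => {
    // Leave the Cantonese template alone if it already has the parameter.
    if (rest && /\|\s*(sim|tra)=/.test(rest)) return whole;
    return "{{yue-hanzi" + (rest || "") + "|" + param + "}}";
  });
}

const page = "==Mandarin==\n===Hanzi===\n{{zh-hanzi|sim=国}}\n==Cantonese==\n===Hanzi===\n{{yue-hanzi}}";
console.log(copyHanziParam(page));
```

A real bot would also need to confirm that each template sits under the expected language and Hanzi headers before editing.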
anagrams
Latest comment: 15 years ago2 comments2 people in discussion
Urm, ok. If I said "that shouldn't happen" it wouldn't help much. I think I need to re-think the timing issues with the balancers, maybe this is another aspect of them. Conrad.Irwin20:43, 15 September 2009 (UTC)Reply
Latest comment: 15 years ago4 comments3 people in discussion
Speaking as the 2009 UK French Scrabble Champion, you're right about French. Ignore case, diacritics, treat æ and œ as -ae- and -oe- and you've got it. Mglovesfun (talk) 21:41, 15 September 2009 (UTC)Reply
Yes, I've since patched the bot so it won't make more of these, but haven't gone back to fix them yet. Thanks for the reminder. (It now treats œ as "oe", is this correct?) Conrad.Irwin23:02, 20 November 2009 (UTC)Reply
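The French rule given in this thread (ignore case and diacritics, treat æ and œ as "ae" and "oe") can be sketched as an alphagram function in JavaScript. The function names are assumptions for illustration and this is not the bot's actual code.

```javascript
// Reduce a word to its plain letters: lowercase, ligatures expanded,
// diacritics removed, everything that is not a-z dropped.
function plainLetters(word) {
  return word
    .toLowerCase()
    .replace(/æ/g, "ae")
    .replace(/œ/g, "oe")
    .normalize("NFD")                 // split letters from combining accents
    .replace(/[\u0300-\u036f]/g, "")  // drop the combining accents
    .replace(/[^a-z]/g, "");
}

// The alphagram is the plain letters in sorted order.
function alphagram(word) {
  return plainLetters(word).split("").sort().join("");
}

// Two words are anagrams when they share an alphagram but do not have the
// same letters in the same order.
function areAnagrams(a, b) {
  return alphagram(a) === alphagram(b) && plainLetters(a) !== plainLetters(b);
}

console.log(alphagram("œuvre"));             // "eeoruv"
console.log(areAnagrams("Noël", "Léon"));    // true
console.log(areAnagrams("gamin", "ænigma")); // false
```

Under this rule ænigma expands to seven letters, so it cannot be an anagram of the five-letter gamin.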
Latest comment: 15 years ago3 comments2 people in discussion
The A index page contains only 14 entries; it should be more than 875. Can you take a look when you have a chance? Thank you. --Panda1018:25, 20 September 2009 (UTC)Reply
Latest comment: 15 years ago2 comments2 people in discussion
Hi Conrad
After getting tired of updating stats @nl.wikt I decided to use the magic word PAGESINCATEGORY instead and so we now have a stats page that updates itself see. The functions may be a little hard on the server but I think it is functional enough to warrant that.
<cirwin> is there a way to time the rendering of a page?
Latest comment: 15 years ago1 comment1 person in discussion
Here are some tricks used on WMF (though they're mostly for skins/messages, but they're better than "served by" time): forceprofile=true and forcetrace=true. View the page source, a big comment at the end holds the data. Note the data can change depending on which apache you hit and how busy it is.
Zocky has this thing, but it requires core changes (some before/after hooks to the brace substitution) which make it a bit inefficient on WMF. You can see an example of it here. Anonymous editing is enabled so you could probably test your template there in a sandbox and action=profile it after. Splarka06:06, 21 September 2009 (UTC)Reply
Creating a monster...
Latest comment: 15 years ago1 comment1 person in discussion
I may need to learn javascript.
So, the word lookup script appears to have become a very popular idea. The fr.wn contributors are all over it, and are thinking it needs to be added to the site and maybe to wikisource as well... and they've found a few bugs they'd like me to fix.
trailing punctuation is included in the url. Clicking on cabinet, would retrieve cabinet,.
preceding articles, such as l'article, are included in the url; in this case l'article.
The most confusing bug is the quotes bug: in French « and ». Apparently this causes all clicks in a multi-line quote to be assigned to words in the first line of the quote, with terms vertically aligned to the left of the « being assigned to that character. This behaviour can be seen at this article.
Assuming you'd rather not become the godfather of this monster, where would you suggest I go online to learn how to code javascript? - Amgine/talk17:28, 25 September 2009 (UTC)Reply
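The first two bugs listed above (trailing punctuation and a preceding elided article ending up in the URL) could be handled by cleaning the clicked text before building the link. The JavaScript below is a hedged sketch: `cleanLookupWord` is a hypothetical helper, the list of elided forms is an assumption, and it does not address the guillemet layout bug, which concerns how the click is mapped to a word in the first place.

```javascript
// Clean a clicked token before using it as a page title.
function cleanLookupWord(raw) {
  const punct = /^[\s.,;:!?"'«»()\[\]]+|[\s.,;:!?"'«»()\[\]]+$/g;
  let word = raw.replace(punct, ""); // strip leading and trailing punctuation
  // Strip a leading elided article or pronoun such as l', d', qu'.
  // Assumed list; it would also strip lemmas that begin this way.
  word = word.replace(/^(?:l|d|j|m|t|s|n|c|qu)['’]/i, "");
  return word;
}

console.log(cleanLookupWord("cabinet,"));    // "cabinet"
console.log(cleanLookupWord("l'article"));   // "article"
console.log(cleanLookupWord("aujourd'hui")); // "aujourd'hui"
```

Because some headwords do start with an elided form, a safer variant would try the full form first and fall back to the stripped form only when that page does not exist.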
Anagrams
Latest comment: 15 years ago1 comment1 person in discussion
Hi, the header ===Anagrams=== has to be in the page text, not in the template. Otherwise can't be found, used, ignored by whatever is parsing wikitext. And other issues ... (like edit sections etc)
(they all seem sort of pointless anyway to me? Why not generate lists rather than add stuff to every entry?) I hadn't even noticed the proposal for this, too little time and too much going on! ;-) Robert Ullmann13:02, 29 September 2009 (UTC)Reply
Latest comment: 14 years ago7 comments3 people in discussion
Hiya. Now that you're bot-adding anagrams, do you think it'd be possible to add Hebrew to the code? Discussion at About:Hebrew yielded that the following rules should be followed:
Ignore spaces and punctuation, noting that punctuation may include U+05BE, U+05F3, and U+05F4 in addition to the usual suspects.
Any page with U+0591 through 05BD, or with 05BF through 05C7, or with 05F0 through 05F2, or with FB1D through FB28, or with FB2A through FB4F, which contains a Hebrew section, is a bad pagetitle (for Hebrew at least), and should not list anagrams, nor be listed as one (and if possible should be tagged for Hebrew attention).
The letters U+05DA and 05DB are identical for purposes of anagrams. 05DD and 05DE are identical. 05DF and 05E0 are identical. 05E3 and 05E4 are identical. 05E5 and 05E6 are identical.
Alright. The bot is currently on hold until I reimplement it to not use the template system (initially through concerns by Ullmann and Hippietrail, but having seen people make a mess of editing the templates I now agree). I will try and get it running at some point, but I can't promise when. Conrad.Irwin16:41, 8 October 2009 (UTC)Reply
I'm positively giddy about the bot's running once more, and am posting this to re-ask you to add Hebrew to your code if and when you have time and inclination. Thanks.—msh210℠18:05, 16 November 2009 (UTC)Reply
I had entirely forgotten about this request; sure! They will have to wait until the second run, though (probably starting immediately after the first has (eventually) finished). Conrad.Irwin18:07, 16 November 2009 (UTC)Reply
BTW, not a big deal, but in alphagrams, for the pairs U+05D(A|B), U+05D(D|E), U+05(DF|E0), U+05E(3|4), and U+05E(5|6), I think the greater codepoint (i.e., the usual form of the letter, rather than the "final" form used at the end of a word) should be preferred. Thanks in advance! —RuakhTALK22:52, 12 December 2009 (UTC)Reply
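The Hebrew rules from this thread can be sketched in JavaScript as follows. The code-point ranges and letter pairs are taken from the rules listed above, and the greater code point of each pair is kept as requested in the last comment. The function name `hebrewAlphagram` and the choice to return `null` for a bad page title are assumptions for illustration.

```javascript
// Code points that make a page title unsuitable for Hebrew anagrams.
const BAD_RANGES = [
  [0x0591, 0x05bd], [0x05bf, 0x05c7], [0x05f0, 0x05f2],
  [0xfb1d, 0xfb28], [0xfb2a, 0xfb4f],
];

// Final-form letters mapped to the usual (greater) code point of the pair.
const FINAL_TO_USUAL = new Map([
  [0x05da, 0x05db], [0x05dd, 0x05de], [0x05df, 0x05e0],
  [0x05e3, 0x05e4], [0x05e5, 0x05e6],
]);

// Hebrew punctuation to ignore in addition to the usual suspects.
const HEBREW_PUNCT = new Set([0x05be, 0x05f3, 0x05f4]);

// Returns the alphagram, or null when the title is a bad page title.
function hebrewAlphagram(title) {
  const letters = [];
  for (const ch of title) {
    const cp = ch.codePointAt(0);
    if (BAD_RANGES.some(([lo, hi]) => cp >= lo && cp <= hi)) return null;
    if (HEBREW_PUNCT.has(cp) || /[\s\p{P}]/u.test(ch)) continue;
    letters.push(FINAL_TO_USUAL.has(cp) ? FINAL_TO_USUAL.get(cp) : cp);
  }
  letters.sort((a, b) => a - b);
  return String.fromCodePoint(...letters);
}

// U+05DE U+05DC U+05DA and U+05DB U+05DC U+05DD give the same alphagram.
console.log(hebrewAlphagram("\u05DE\u05DC\u05DA") === hebrewAlphagram("\u05DB\u05DC\u05DD"));
```

Tagging bad titles for Hebrew attention, as the rules suggest, would be a separate step for whatever calls this function.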
Sure, could you let me know how `exactly` the sorting works? Looking at w:Polish orthography there are several digraphs that are not included in the alphabet, are they simply sorted as though they were individual letters? Should c and ć be treated as different letters, or are they just forms of the same letter (same for other such characters). I'll try to get round to this at the weekend - I need to re-run all the others too. Conrad.Irwin12:39, 23 October 2009 (UTC)Reply
Thanks for the list. It may come in handy; I found many translations that need to be corrected. BTW, something is wrong on Index:Polish/ł, there are ła łą ła łą ła sections... But I think we don't even need a separate section for each when there are only one or two hundred words for a letter, or even fewer. Maro17:54, 25 October 2009 (UTC)Reply
New green links
Latest comment: 15 years ago3 comments2 people in discussion
There is a new green link in the Hungarian declension tables, for the plural accusative. When I create the form-of entry using that, it does not put the entry in any category. I had to replace the template in the definition line with hu-inflection of. Was there a change in the Hungarian templates? --Panda1020:41, 23 October 2009 (UTC)Reply
I added some plural accusative rules to work with Esperanto, so they must have conflicted. I can remove them again as I ended up doing something totally different (but I assumed there'd be no side-effects). Conrad.Irwin21:33, 23 October 2009 (UTC)Reply
Latest comment: 15 years ago3 comments2 people in discussion
Hey again Conrad. :) I was wondering if you could add code for the following three things, if it's possible, to my monobook.js whenever you have the time (as I am unable to do any kind of "programming" and stuff like this)
code to make edits for creating Hungarian noun form entries automatically minor, like how in creation.js URLs you have "&preloadminor=true"
same as above but to make the summary always be "n" or something (n standing for new that is)
finally, (and I realise this might be a bit much to ask) creation.js-style links for the form-ofs.
Maybe; how should the script tell you are creating a Hungarian noun form? Or would this be the same as just having the Hungarian forms all accelerated? Conrad.Irwin11:36, 25 October 2009 (UTC)Reply
Latest comment: 15 years ago6 comments2 people in discussion
I wanted to bring RHS ToCs up in the BP to see if it can get some widespread usage. Do you know of any outstanding systematic issues dealing with right-hand side ToCs? Additionally, I wanted to bring up the idea of scrollable, fixed-height ToCs ({{scrollable-toc}}, {{scrollable-tocright}}). Do you know if there's any problems with those? Thanks. --Bequw → ¢ • τ00:03, 27 October 2009 (UTC)Reply
I would support moving it to the right (still). I am not a big fan of the scrollable idea - when the entry is big enough that it scrolls, it is big enough that you need to use the ToC to get to useful places. Conrad.Irwin17:52, 27 October 2009 (UTC)Reply
With many users using the plain RHS ToCs, would it be useful to have additional editing guidelines to deal with situations where RHS elements get pushed out of their language section by big TOCs? Maybe we could say that, on entries that experience this problem, these RHS elements should be "inlined". Images could go in galleries and navigation templates (both to related entries (eg {{ordinalbox}}) and to sister projects) could get put into See also sections. Is there anything that needs to be RHS? --Bequw → ¢ • τ21:02, 27 October 2009 (UTC)Reply
As far as I see it, images are the only problem - floating navigation boxes are a bit icky. What to do with them is complicated; I suspect a gallery at the bottom of the entry is called for. Conrad.Irwin21:23, 27 October 2009 (UTC)Reply
Templates like {{top2}} specify table widths of 100%. As this can mess with RHS ToCs and other elements, is there anything that we can do? Can we put div tags in the top and bottom templates? --Bequw → ¢ • τ16:43, 6 November 2009 (UTC)Reply
Like {{trans-top}} these could be wrapped inside a <div style="width:auto;margin:0px;overflow:auto;"> - but that would require updating everything that can work with {{bottom}} so that the </div> would be matched. (There's probably some way to bully CSS into treating the table like this directly, but I don't know what it is - Ullmann might). Conrad.Irwin17:04, 6 November 2009 (UTC)Reply
edits
Latest comment: 15 years ago1 comment1 person in discussion
Latest comment: 15 years ago1 comment1 person in discussion
Hi there. Would you be willing to convert this to template form for us over at the Simple English Wiktionary:
<big>''']'''</big> <small>verb</small>
#If you publish a book, an article, a song, etc. you make it available for other people to buy, read, listen to, etc.
#: ''The study was '''published''' in the British Medical Journal.''
#: ''She '''publishes''' a monthly magazine.''
#: ''The government '''published''' the results on the Internet.''
<small>]</small>
We want to use it as a word of the week template where the argument would be like {{wotw|<insert word here>}} and it would automatically set up a blank form where the word shows up and then has blank template like stuff where we can fill in the stuff that is needed like this:
<nowiki>
<big>''']'''</big> <small><insert part of speech here></small>
<insert definition here>
<small>]</small>
or something of this manner. Since I suck at templates, and you seem to be pretty good at them, would you be able to make such a template for us? Thanks, Razorflame19:20, 1 November 2009 (UTC)Reply
club_butler
Latest comment: 15 years ago2 comments2 people in discussion
Is it a bug or a feature that club_butler requires underscores for spaces? For instance:
<hippietrail> .? Bronx cheer
<club_butler> Couldn't get any definitions for Bronx cheer.
<hippietrail> .? Bronx_cheer
<club_butler> Bronx_cheer — noun: 1. A razzing noise made with the lips and tongue; a raspberry
Such "anagrams" are deliberately excluded in the current model - they should be linked to using {{also}}. Our definition of anagrams implies it is a re-arrangement of letters which I take to preclude the trivial case when the letters are in the same order (if the word "arrangement" was used it would be more ambiguous). Conrad.Irwin23:03, 4 November 2009 (UTC)Reply
Re: Dutch Index
Latest comment: 15 years ago1 comment1 person in discussion
Thanks for the offer to automatically use your bot program on the Dutch Index. Dutch in general does alphabetize its words the same way that English does. Sometimes Dutch dictionaries treat the digraph ij as its own letter, but that method is not followed by wikipedia, so it's just the regular twenty-six letter English/Latin alphabet, with the same order. Thanks, Mitchell Powell03:00, 6 November 2009 (UTC)Reply
edittools
Latest comment: 15 years ago5 comments2 people in discussion
Hello!
I would like to know how I could get something like your personalized edittools working at pt.wikibooks, but I don't know where to start looking... =/
You should be able to add importScriptURI('http://en.wiktionary.org/w/index.php?title=User:Conrad.Irwin/edittools.js&action=raw&ctype=text/javascript'); and then create User:Heldergeovane/edittools. (Just copy mine for now, then when you know it's working, you can change it). Conrad.Irwin22:01, 6 November 2009 (UTC)Reply
After some adjustments, it seems to work, but I needed to add the name we use for the main div of the edittools.
Would you mind adding it to your js too? This way other users could just import it (as you suggested above) instead of making a copy =). The name "specialchars" (without the prefix "editpage-") is used at Wikimedia Commons (from where the edittools of pt.wb was copied).
I've been testing and actually I haven't found a way to make them compatible. So, I just added a new div above our default edittools, with the id used by your script (and we can hide the default using a css rule with the id "specialchars"). Helder17:13, 7 November 2009 (UTC)Reply
Or you could copy the script and edit it to suit your wiktionary, it has not been updated in a very long time and I'm unlikely to make further changes to it now. Conrad.Irwin00:09, 8 November 2009 (UTC)Reply
Latest comment: 15 years ago6 comments2 people in discussion
Hi Conrad
I updated the index for Dutch by importing the Open Taal index, which gives a pretty exhaustive list of our language in proper official orthography. The lists are simply alphabetical and very lengthy. Is there any way to make them more accessible the way you have done that for the English list? If there is I would be much obliged if that could also be done at nl.wikt because we have the same problem there.
Yes, I can run it through the last stage of the Conrad.Bot stuff - would you like me to insert all the Dutch that is present on Wiktionary at the same time? Conrad.Irwin22:56, 6 November 2009 (UTC)Reply
What en.wikti wants to do is up to the community here, but at nl.wikt we would like to keep the lists intact as they are, because as they are, they have received the official seal of approval of the Taalunie, the intergovernmental body that regulates our spelling.
They represent a calibration set in a sense. So we do not want to alter them, unless the OpenTaal/Taalunie people come with an update. The Dutch present here (and at nl) contains entries with various orthographies, some of them historically obsolete ones (our spelling has changed numerous times), some of them downright misspellings. We certainly do not want to pollute the list with those. Jcwf23:21, 6 November 2009 (UTC)Reply
To be honest, I can't remember by now either - I seem to recall Hippietrail getting annoyed when I added a multiline flag to the original, but it works well enough for now. (It's his script) Conrad.Irwin18:22, 7 November 2009 (UTC)Reply
Bot task
Latest comment: 15 years ago4 comments2 people in discussion
I wasn't sure of which bots were approved for consensus search & replace but I thought yours was a good bet. Per WT:RFDO#Dialect etymology templates would you replace:
I've orphaned the other with AWB already. This way we can see who's still using these old templates, and how long to hold onto them. If yours isn't well suited to this, is someone else's better? Thanks. --Bequw → ¢ • τ18:21, 14 November 2009 (UTC)Reply
Yes, I can do this - I'm not sure whether I'll be able to run it at the same time as importing anagrams (which I'm hoping to restart now, and will take several days to complete); if I can't I'll let you know - then maybe User:Opiaterein would be another possibility. Conrad.Irwin18:27, 14 November 2009 (UTC)Reply
Latest comment: 15 years ago5 comments2 people in discussion
Hi.
I noticed that Conrad.Bot updates only the main page of Index for Old Armenian, Hiligaynon, Mapudungun and Polish. A bug? Or was it deliberate? Maro20:30, 14 November 2009 (UTC)Reply
Polish seems to be a bug; the other three are single-page indices (because they are so small). These indexes should be possible, but should I treat à etc. as an "A" with an accent (as French does), or as a separate letter (as Polish does)? And should "Ch" etc. be filed under "C" in the "h" section, or as "Ch" - a single letter? Conrad.Irwin20:36, 14 November 2009 (UTC)Reply
Sorting order:
Kashubian: A, Ą, Ã, B, C, D, E, É, Ë, F, G, H, I, J, K, L, Ł, M, N, Ń, O, Ò, Ó, Ô, P, R, S, T, U, Ù, W, Y, Z, Ż
Lower Sorbian: a, b, c, č, ć, d, e, ě, f, g, h, ch, i, j, k, ł, l, m, n, ń, o, p, r, ŕ, s, š, t, u, w, y, z, ž, ź
Upper Sorbian: a, b, c, č, ć, d, dź, e, ě, f, g, h, ch, i, j, k, ł, l, m, n, ń, o, ó, p, q, r, ř, s, š, t, u, v, w, x, y, z, ž Maro22:28, 14 November 2009 (UTC)Reply
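The sorting orders above could be turned into a sort key along these lines. This is only a sketch, not the bot's actual code: the function name `make_sort_key` and the sample strings in the usage note are made up for illustration, and only the Upper Sorbian letter list is taken from the post above. Digraphs such as "ch" and "dź" are matched greedily so that they count as single letters.

```python
# Upper Sorbian alphabet as listed above; "dź" and "ch" are single letters.
UPPER_SORBIAN = ["a", "b", "c", "č", "ć", "d", "dź", "e", "ě", "f", "g", "h", "ch",
                 "i", "j", "k", "ł", "l", "m", "n", "ń", "o", "ó", "p", "q", "r", "ř",
                 "s", "š", "t", "u", "v", "w", "x", "y", "z", "ž"]


def make_sort_key(alphabet):
    """Build a key function that sorts words by the given letter order."""
    rank = {letter: i for i, letter in enumerate(alphabet)}
    longest = max(len(letter) for letter in alphabet)

    def sort_key(word):
        word = word.lower()
        key = []
        i = 0
        while i < len(word):
            # Try the longest letters first, so "ch" wins over "c" + "h".
            for size in range(longest, 0, -1):
                chunk = word[i:i + size]
                if len(chunk) == size and chunk in rank:
                    key.append(rank[chunk])
                    i += size
                    break
            else:
                # Characters outside the alphabet sort after every known letter.
                key.append(len(alphabet) + ord(word[i]))
                i += 1
        return key

    return sort_key
```

With `key = make_sort_key(UPPER_SORBIAN)`, the illustrative strings "cyrk", "hat", "chata", "ida" sort in that order, because "ch" is filed after "h" rather than under "c". A language that treats accented letters as variants of the base letter (the French case mentioned above) would need a different key that strips the accent first.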
Latest comment: 15 years ago3 comments2 people in discussion
Hi Conrad,
I notice that לחיים(l'khayím) is orange-ified as "partlynew" in the Hebrew index, apparently because it's given as a different part of speech (phrase) from the English word it's listed as a translation of (cheers, interjection). Is that something that can be addressed somehow, or do we basically have to choose between changing ]'s POS header, or accepting that it will be orange in the index?
The actual problem is that it doesn't understand "{{non gloss definition}}" and, for templates it doesn't know, it uses the heuristic that if the definition is completely contained within one template then it is a form-of (and thus should not be in the index). I can fix this in the same way it treats {{given name}} and {{SI unit}} - are there any others you can think of? (While it would be nice to remove the partlynew on form-ofs that are translations, this is not trivial with the current architecture). Conrad.Irwin21:35, 14 November 2009 (UTC)Reply
Oh, interesting. The problem is, not only need not a template-only def be a form-of, but also a form-of need not be a template-only def. (Some editors precede form-ofs with glosses.) I don't have a better suggestion, though, so until we come up with one, thanks for exempting {{non-gloss definition}}. :-) —RuakhTALK04:41, 15 November 2009 (UTC)Reply
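For readers following along, the heuristic described above might look roughly like this. It is a guess at the shape of the check, not the bot's real code: the names `NOT_FORM_OF`, `SINGLE_TEMPLATE` and `looks_like_form_of` are invented here, and the exemption list contains only the templates named in this thread.

```python
import re

# Templates that fill a whole definition line but are still real definitions.
# Only the names mentioned in this discussion are listed (assumption: both
# spellings of the non-gloss template should be exempt).
NOT_FORM_OF = {"given name", "SI unit", "non gloss definition", "non-gloss definition"}

# A definition line consisting of exactly one template, with no nested templates.
SINGLE_TEMPLATE = re.compile(r"^#\s*\{\{([^{}|]+)(?:\|[^{}]*)?\}\}\s*$")


def looks_like_form_of(definition_line):
    """Return True if the line is one unknown template and nothing else."""
    match = SINGLE_TEMPLATE.match(definition_line)
    if not match:
        return False
    return match.group(1).strip() not in NOT_FORM_OF
```

As Ruakh points out, this stays a heuristic in both directions: a form-of line preceded by a gloss would not match, and any new template-only definition has to be added to the exemption list by hand.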
Aramaic sub-translations in Hebrew index.
Also, the bot seems to get confused by the way the Aramaic editors format their translations. For example, the first translations table at ] has this for Aramaic:
; that is, it has two copies of the translation, one in the Syriac script, and one in the square script. The bot seems to mistake the latter for a Hebrew translation, so it adds לחמא to the Hebrew index (in orange).
Can it be changed either to understand this format, or, failing that, to discount lines that start with *:?
I can make it ignore Hebrew things that start with *: - this will also be a problem for WT:EDIT as it may find the nested Hebrew and try to add a Hebrew translation to it (if no *Hebrew exists yet) Conrad.Irwin21:30, 14 November 2009 (UTC)Reply
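A sketch of the "ignore lines that start with *:" fix, assuming translation lines have the shape "* Language: ..." with script sub-entries written as "*: Script: ...". The exact wikitext of the entry is not preserved above, so the sample lines in the usage note are made up, and `translations_for` is a hypothetical helper name.

```python
import re

# "* Language: rest" or "*: Sub-entry: rest"; group 1 is ":" for nested lines.
LANGUAGE_LINE = re.compile(r"^\*(:?)\s*([^:]+):\s*(.*)$")


def translations_for(table_lines, language):
    """Collect the top-level translation lines for one language."""
    found = []
    for line in table_lines:
        match = LANGUAGE_LINE.match(line)
        if not match:
            continue
        nested, name, rest = match.groups()
        if nested:
            # "*:" lines belong to the language above them, so skip them.
            continue
        if name.strip() == language:
            found.append(rest)
    return found
```

Given the lines `["* Aramaic:", "*: Hebrew: לחמא", "* Hebrew: לחם"]`, asking for "Hebrew" returns only the last one, so the square-script Aramaic form no longer lands in the Hebrew index.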
Latest comment: 15 years ago3 comments2 people in discussion
Hi Conrad,
At my admin vote, you expressed some concerns. Is there anything you’d like to elaborate on or that you’d rather I do differently, or was it mostly “added tools don’t seem useful”? Thanks for your comments!
I was just thinking back to events in 2008 - it was perhaps unfair to bring them up again after so long, but as I said, I haven't really had much to say to you since then! Congratulations for passing. Conrad.Irwin 08:55, 15 November 2009 (UTC)
No worries – fair enough, and we did get off on rather the wrong foot (I had a few WT teething pains – “What, changing ELE isn’t ‘being bold’?”). Hope we’ve not had much to say since b/c I’ve learned how to not make messes (and just quietly worked), and look forward to working with you in future – thanks for the welcome!
Latest comment: 15 years ago3 comments3 people in discussion
Aren't IPA tags supposed to be preceded by a *, or is that only necessary when there's more than one thing under ===Pronunciation===? --Yair rand02:20, 17 November 2009 (UTC)Reply
Let me know of any flaws you find. (it will be re-generated completely, with translation back-references like the others, next time I run all languages - probably this weekend). Conrad.Irwin18:39, 17 November 2009 (UTC)Reply
translations bug?
It certainly looks like it :( - I'm not entirely certain how it could have happened, but I'll try to fix it when I do. Thanks for letting me know. Conrad.Irwin22:15, 17 November 2009 (UTC)Reply
Latest comment: 15 years ago3 comments2 people in discussion
Hi there. We need to talk about someone getting Darkicebot to be able to make form-of entries for nouns, adjectives, and verbs (Esperanto), that includes these characters (anywhere in the word):ĉ, ĝ, ĥ, ĵ, ŝ, ŭ. I don't know how to do it because the program that I use to enter the stems for the bot code currently does not allow these special characters to be added, and I don't know how to make them otherwise. Any ideas? Razorflame23:28, 18 November 2009 (UTC)Reply
It should just be a case of changing template = """ to template = u""" and calling .encode('utf-8') on anything before you print it or write it. Let me know if you need a more helpful hand, I'm likely to be back on IRC in quarter of an hour. Conrad.Irwin23:31, 18 November 2009 (UTC)Reply
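A minimal illustration of that advice, written so that it also runs on Python 3 (where the `u` prefix is accepted but no longer needed). The template text and the `render` helper are invented for the example; they are not Darkicebot's actual code.

```python
# A Unicode template string; the u prefix matters on Python 2 only.
template = u"""# plural of [[%s]]"""


def render(stem):
    """Fill in the template and return UTF-8 bytes ready to print or write."""
    text = template % stem
    return text.encode("utf-8")
```

`render(u"ĉevalo")` gives bytes in which ĉ is stored as the two bytes `C4 89`, which is what a file opened in binary mode or a wiki upload expects. Keeping everything as Unicode until the last moment and encoding once at the output boundary is the design choice being suggested.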
Latest comment: 15 years ago2 comments2 people in discussion
Well, I try to add a Catalan translation with the automatic tools (shown on the same page), and sometimes I find that the change has not succeeded, so I repeat the same operation and then duplicated translations appear. In the end I try to repair all that. I suppose it doesn't refresh quickly enough. Sorry if I did anything wrong. --Maltrobat 17:43, 19 November 2009 (UTC)
Latest comment: 15 years ago4 comments2 people in discussion
I'm wondering if it would be possible, using wiki syntax, to create an 'invisible' L2 header that redirects the TOC to another header? It would need to result in ^====\n in the article source. - Amgine/talk19:51, 22 November 2009 (UTC)Reply
You couldn't insert invisible headings into the TOC, but you could do something like:
<!--
==Heading==
-->
and then fix the TOC with javascript. However this would break many many scripts both belonging to me and other people, that make the assumption that there won't be much inside comments. Conrad.Irwin20:23, 22 November 2009 (UTC)Reply
In what way breaking? That is, the header is primarily to create an L2 entry in the TOC, and to store machine-readable instructions to find the actual data elsewhere for database reusers. I'm clearly being dense if this would somehow be information your scripts need? (I'm also not suggesting implementing this; I'm just looking for some compromise which does no harm.) - Amgine/talk20:38, 22 November 2009 (UTC)Reply
Take things like WT:EDIT, which need a correspondence between what can be seen in both places; things like the anagrams bot (and others) make the assumption that a language section runs from ==Language== to ---- or ==Language== whichever comes first, with yours there would be random open and close comment characters in unrelated sections (I can imagine AutoFormat reshuffling the sections on a page and commenting them out, and it certainly wouldn't make it easier to parse - certainly the way I do it, which is to first divide the page into language sections and assume that they are independent). Thinking about it more logically, you could wrap the entire section in <div style="display:none"> to get them into the TOC - but then the TOC links would only work in some browsers, and it has the same problem with putting random bits of markup in random places. Conrad.Irwin 23:42, 22 November 2009 (UTC)
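To make the parsing assumption concrete, here is a rough sketch of a splitter that treats a language section as running from ==Language== to ---- or the next ==Language==. The function name and the sample page are hypothetical; the real scripts are not shown in this thread.

```python
import re

# Level-2 headings only: "==English==" matches, "===Noun===" does not.
L2_HEADING = re.compile(r"^==([^=].*?)==\s*$")


def split_language_sections(wikitext):
    """Map each language heading to the text of its section."""
    sections = {}
    current = None
    for line in wikitext.splitlines():
        heading = L2_HEADING.match(line)
        if heading:
            current = heading.group(1).strip()
            sections[current] = []
        elif line.strip() == "----":
            current = None
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body) for name, body in sections.items()}
```

Running it on a page where a heading sits inside a comment, for example `"==English==\nfoo\n<!--\n==Dutch==\n-->\nbar"`, leaves the `<!--` at the end of the English section and the `-->` at the start of the Dutch one. That is the "random open and close comment characters in unrelated sections" problem described above.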
Latest comment: 15 years ago1 comment1 person in discussion
Thank you very much for your help and feedback on this. I understand that a dictionary is very different from an encyclopedia, in some ways much more formal, and fully understand that I don't understand the conventions I should follow - so thanks. It is a good word though. It came up in a Wikipedia article I was working on, needed a definition (after I figured out what it meant), and definitions do not belong in Wikipedia. Aymatth204:22, 23 November 2009 (UTC)Reply
Latest comment: 15 years ago5 comments2 people in discussion
Hello Conrad.Irwin, did you know that the French Wiktionary wants this tool? I asked for the import and I translated it but the position of the backup window is not correct for Frenchies. Is it possible to position it in the lower right corner? Also, they use the templates "trad-", "trad", "trad+" and "tradø" instead of templates "t-", "t", "t+" and "tø". Is it possible to change the script? Can you help me please? It would greatly help the French Wiktionary. Thank you, --Sniff 22:57, 23 November 2009 (UTC)
Hi, yes (see on the page there). I don't have the time to maintain this tool on more than one wiki, but I am very happy to help if you have specific issues (and if I suddenly find I have a burst of spare wiki time - once the anagram bot is fixed assuming there are no more bug reports - I can hopefully help you with the first port). I will try and write out some detailed documentation for it this evening, though I have a few more hours of real work to do first. Conrad.Irwin23:16, 23 November 2009 (UTC)Reply
Hi, you should probably delete all the code inside TranslationLabeller{, as you don't seem to use glosses on fr.wikt. The English version of editor.js uses primarily the textual gloss to detect which table it is editing (this further helps prevent errors if a translation table in the HTML is not explicitly mentioned in the source code e.g. color). As you don't have this system on fr.wiktionary, you will probably need to redefine util.getTranslationTable in order to parse the French Wiktionary's code to find the position of the wikitext for a translation table. The actual error is caused because within regular expressions you need to escape the (, so /\{\{( should be /\{\{\(. Thanks and good luck! Conrad.Irwin 00:06, 26 November 2009 (UTC)
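The escaping rule is the same in most regular-expression dialects. editor.js is JavaScript, but the point can be checked quickly with Python's `re` module, as in the sketch below (the helper `compiles` is made up for the demonstration).

```python
import re


def compiles(pattern):
    """Return True if the pattern is a valid regular expression."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


broken = r"\{\{("    # the bare "(" opens a group that is never closed
fixed = r"\{\{\("    # "\(" matches a literal opening parenthesis
```

Here `compiles(broken)` is false and `compiles(fixed)` is true, and the fixed pattern finds `{{(` in a string such as `"x {{(y"`.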
Scary bug
Latest comment: 15 years ago2 comments2 people in discussion
Could you update the bot to do Icelandic anagrams? You'd just have to make sure it treats a/á, e/é, i/í, o/ó/ö, u/ú, y/ý, ae/æ, d/ð as separate letters. – Krun16:16, 25 November 2009 (UTC)Reply
Latest comment: 14 years ago9 comments3 people in discussion
Hi! It would be very nice if your bot could be updated to do Italian anagrams. It will be easier than French, Italian has just à è ì ò ù at the end. Thanks!! Pharamp 17:16, 25 November 2009 (UTC)
I meant like Mglovesfun said, "`" accent on the last syllable like lì or verità etc. which don't affect the pronunciation of the single letter (pronunciation of the French é is different than French è). For this I think that doing Italian anagrams can be quite easy: à will be a etc. (not separated like in Icelandic). For the verbs, that's true it is quite useless, but it will be funny to mark it, and this is also a French anagrams "problem" XD. Pharamp11:55, 26 November 2009 (UTC)Reply
Well, I was just going to exclude the case when they were verb forms of the same verb, but if you'd prefer them included that's fine too. Conrad.Irwin 16:18, 26 November 2009 (UTC)
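The difference between the Icelandic and Italian requests comes down to whether accents are folded away before the letters are compared. A sketch under that assumption follows; `strip_accents` and `alphagram` are illustrative names, not the bot's own functions.

```python
import unicodedata


def strip_accents(word):
    """Remove combining marks, so that "à" becomes "a"."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def alphagram(word, fold_accents):
    """Return the word's letters in sorted order, used as an anagram key."""
    # NFC first, so composed and decomposed spellings compare equal.
    letters = unicodedata.normalize("NFC", word.lower())
    if fold_accents:
        letters = strip_accents(letters)
    return "".join(sorted(letters))
```

For Italian one would call `alphagram(word, fold_accents=True)`, so "verità" gets the same key as any rearrangement of "verita". For Icelandic one would pass `fold_accents=False`, which keeps á and a apart; æ and ð are single code points that never decompose, so they stay distinct from "ae" and "d" either way. The letters are sorted by code point, which is enough for a lookup key but is not the language's alphabetical order.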
alphagrams
Hello, I noticed that your bot produces anagrams and alphagrams, I was wondering, since it does this, why doesn't it produce a reversed alphagram as well? (Z first, through to A)
Is there much need for such a thing? I can imagine that if someone wanted to find anagrams they might search for the letters in alphabetical order. I can't think of any reason listing the inverse order would be useful, and besides it can be trivially discovered by just reversing the alphagram. Conrad.Irwin12:45, 28 November 2009 (UTC)Reply
""I can imagine that if someone wanted to find anagrams they might search for the letters in alphabetical order. I can't think of any reason listing the inverse order would be useful, and besides it can be trivially discovered by just reversing the alphagram. Conrad.Irwin 12:45, 28 November 2009 (UTC)"" Conrad.Irwin14:54, 9 December 2009 (UTC)Reply
ga-verb
Hi,
I was just looking through the XML dumps for Irish/index, and noticed that it might not be picking up the {{ga-verb}} tags (i.e., the verb bain by itself was missing). If you tweak your bot commands, you may want to search for that as well.
Hi, the bot doesn't look at the inflection line, just at the definition lines and the headings; also, bain does seem to be on Index:Irish/b (in the third row and third column, with five asterisks). Have I misunderstood you? Conrad.Irwin23:41, 3 December 2009 (UTC)Reply
Albanian index stuff
So, there are a few digraphs that are always counted as individual letters and no letters are treated as equivalents, so no fancy bullshit like that :) Here's the alphabet:
A B C Ç D Dh E Ë F G Gj H I J K L Ll M N Nj O P Q R Rr S Sh T Th U V X Xh Y Z Zh
Latest comment: 14 years ago14 comments4 people in discussion
Hi Conrad,
Thanks for doing this! It doesn't seem to be working yet, though; on my watchlist I see a bunch of new talk-pages with copies of RFV discussion, but when I go to WT:RFV, I see that the discussions are still there.
BTW, what exactly is your archiving logic? As in, how do you recognize that a discussion is passed or failed, and how long do you wait before archiving it? That should probably be documented somewhere. (Unless the "semi-automatic" is to suggest that you make the decisions manually, and the automatic just handles the actual archiving?)
At the moment it just goes through all the struck out headings, I'm currently making all the decisions, but I rarely have to do anything beyond what it's programmed to do as default (i.e. deleting struck out sections and copy to talk page, checking for 'rfv failed' or 'deleted' to give {{rfv-failed}} - though sometimes this brings false positives if the verdict changed half-way through). For recent-ness, I was going to stay out of December, though I let it play with some that were 1st of December. It does not yet remove {{rfv}} from entries. I also only edit WT:RFV itself in batches, it's far too slow to change it every single time (i.e. >60seconds for one edit to it, vs. <1 second for edits to any other page). This also reduces the chance of an edit conflict. I did wonder what to do in those cases, I will use -archived henceforth. Conrad.Irwin 12:49, 4 December 2009 (UTC)
For information, if the mainly automated mode finds variations on RFVfailed or Deleted it will use {{rfv-failed}}, RFVpassed or Cited and it will use {{rfv-passed}}, if there is a date in the last three days, it will skip the section. If it would be both failed and passed, or neither, it asks me what to do. So, if people always close the discussions with those formulaic words, it cuts a few seconds off the time. Conrad.Irwin 16:21, 4 December 2009 (UTC)
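The decision rule described in the two comments above could be sketched as follows. The regular expressions and the `decide` function are guesses at what "variations on RFVfailed or Deleted" might mean, not the archiver's real patterns; the sketch also assumes the caller has already selected only struck-out sections and that signature dates use English month names.

```python
import re
from datetime import datetime, timedelta

# Assumed spellings of the formulaic closing words.
FAILED = re.compile(r"rfv\s*-?\s*failed|deleted", re.IGNORECASE)
PASSED = re.compile(r"rfv\s*-?\s*passed|cited", re.IGNORECASE)
# Signature timestamps such as "16:21, 4 December 2009 (UTC)".
TIMESTAMP = re.compile(r"\d{2}:\d{2}, (\d{1,2} \w+ \d{4}) \(UTC\)")


def decide(section_text, now):
    """Return "skip", "rfv-failed", "rfv-passed" or "ask" for one section."""
    for stamp in TIMESTAMP.findall(section_text):
        when = datetime.strptime(stamp, "%d %B %Y")
        if now - when <= timedelta(days=3):
            return "skip"  # someone commented in the last three days
    failed = bool(FAILED.search(section_text))
    passed = bool(PASSED.search(section_text))
    if failed and not passed:
        return "rfv-failed"
    if passed and not failed:
        return "rfv-passed"
    return "ask"  # both or neither: leave it to the human operator
```

A section that says both "RFV failed" and "cited" comes back as "ask", which matches the false-positive case mentioned above where the verdict changed half-way through.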
Oh, good. I hate to look a gift horse in the mouth, but Connel's old archiver-bot had some serious problems. Your approach seems to address all of them perfectly. (I mean, I'm sure there'll be some mistakes, but then, manual archivers make mistakes, too.) Thanks again! :-D —RuakhTALK02:56, 5 December 2009 (UTC)Reply
Thanks for devoting some effort and skill to this important task. I often insert "Cited IMHO" to draw attention to my own efforts to cite a sense, not intending to close the matter unilaterally. I hope to get someone else to agree that the citations are adequate, as they are 80-90% of the time (not 100%). Should I use a different formula for my purpose? DCDuringTALK03:11, 5 December 2009 (UTC)Reply
It needs three things, Cited, <s> and the date to be old, so providing that you only provide one or two of these, it should be ok. Conrad.Irwin03:38, 5 December 2009 (UTC)Reply
Cool, thanks. This seems like a very practical procedure. I hope you can figure out how to make some of it completely automatic and basically bullet-proof. Reducing the size of the RfD, RfV, and similar pages without consigning the material solely to an archive is desirable.
I know that some TR discussions are moved to the associated entry talk pages, but they all "should" be. How possible would it be to retroactively put "all" RfD, RfV, and TR discussions about entries on associated talk pages? I know that we have had different archive practices at different times. Are the archives nearly uniform enough? Would it be possible to work the revisions from the XML dumps? I'd be interested in thoughts about the utility of this. It's good in principle, but is it worth the technical effort and resources? Or is there some other way to enhance the accessibility of our discussions, including non-entry "topics". DCDuringTALK11:40, 5 December 2009 (UTC)Reply
Well, it's quite possible, but I think it would be very time-consuming to do well. I think the history approach would be the sanest, but that still leaves the "is it a pass/fail/archive" decision which needs a human in the loop (unless we just cheat and use archive for everything old). There is an issue where multiple words are discussed in the same section, though my thought is just to archive it at all words (not nice when there are more than two or three). What to do with inspecific topics is harder, I think the best we can do is put each topic at a sub-page of somewhere, and list the topics that have been discussed before. A thematic index would be ideal, so that discussions can be re-opened as opposed to re-started, but I don't know any easy way to implement that. Conrad.Irwin12:42, 5 December 2009 (UTC)Reply
Latest comment: 14 years ago3 comments2 people in discussion
I've been thinking about {{Estonian index}} (hardly used) and {{finnish index}} (used hardily). I think we should be adding links to the Indexes via JS instead of through the wiki-text. Is there a good way to do this? Would we want to allow someone to turn on/off individual languages, or would someone be fine with links for all languages for which we have indexes? Additionally, it should interact well with HT's next/previous links (I'm assuming he's watching). Thoughts? --Bequw → ¢ • τ04:29, 6 December 2009 (UTC)Reply
I like the idea, but doing it nicely is tricky. I had a proposal for Hippietrail's forward/back links which gave a link to the Index in the middle (see below), the alternative is to just hijack the title itself, or to add a selection of links on the title line. Any of these are doable with javascript. Conrad.Irwin11:04, 6 December 2009 (UTC)Reply
They look great. I like #1 if it could be done both with and without HT's extension (sometimes Toolserver is slow). There's also the "Show an interwiki link under the language heading when one exists in the sidebar" option to consider. Ideally they could all be on the same line (that would necessitate modifying HT's extension), but if not, maybe the index & iwiki could be laid out like #3. --Bequw → ¢ • τ 23:51, 6 December 2009 (UTC)
Sources
No, that's quite alright. I went back and looked at the past half-dozen or so articles that I've added quotes to and found I had used the quote-templates except in one case, where I added the citation manually so that it fit the style of a pre-existing citation. There's certainly something off about the quote-journal and quote-book templates as they add both a # and a * in front of the quotation itself (gene product has both). This looks wrong to you as well? --Ceyockey01:39, 10 December 2009 (UTC)Reply
I agree that placing the quotations under the definition rather than under a quotations header is cleaner and clearer. I would support deprecation of the header in general. --Ceyockey01:46, 10 December 2009 (UTC)Reply
Anagram layout
For the next time you run the anagram bot, have you thought about laying out the list horizontally instead of vertically? --Bequw → ¢ • τ17:50, 11 December 2009 (UTC)Reply
The list is two dimensional, see the#Anagrams, words with the same letters in the same order go on the same line. It would be possible to join all the lines together, but I'm not sure it would be an improvement (I suppose I could get it to only put them on separate lines when there were multiple with the same letters). Conrad.Irwin 17:52, 11 December 2009 (UTC)
Neat, I didn't know they were 2D. If the lines were joined someone who cared about anagrams with the same ordering could still see them right? My main concern was space, especially on pages with multiple language entries. On tesla for instance, the FL entries could fit on my first screen if the anagrams were on one line (alphagram on a separate line). --Bequw → ¢ • τ01:26, 17 December 2009 (UTC)Reply
re References
Thanks very much for this pointer, I will keep it in mind for the future. But a quick question, could you give me some good examples of when References are encouraged to be used? Cirt (talk) 18:21, 13 December 2009 (UTC)Reply
Ah, understandable, and yet ... so it is okay to have pages exist with no sources listed whatsoever to back up what is on the page - and this could conceivably be considered an optimum state for such an entry? Cirt (talk) 18:57, 13 December 2009 (UTC)Reply
Latest comment: 14 years ago3 comments2 people in discussion
I have been using context a great deal. I noted a comment of yours at a template talk page that mentioned the impact of {{context}} on the servers. Is it a serious problem? Is it the mere use of "context" or the number of parameters that is the problem? Does the same apply to all multi-parameter templates, or is it some kind of recursiveness peculiar to just a few? Should I adjust my use of "context"? DCDuringTALK 19:22, 16 December 2009 (UTC)
No, it's not a problem at all, I was merely making the point that the optimisation made at that template, I can't remember which it was anymore, was completely pointless. Conrad.Irwin20:17, 16 December 2009 (UTC)Reply
I've been thinking about what you wrote off and on and haven't edited in Wiktionary while I did so. You raise an interesting point about whether Sources are meant to support a definition or be directly related to a definition. This is likely a perennial debate. From the "Entry layout explained" page "References" section you get the passage "...references which can be used to verify the content." I've been looking for some discussion on the distinction between copying defs from suitably licensed content, paraphrasing, reformulating (less change than paraphrasing, more a reformat for Wiktionary style), and synthesizing a definition from a couple of sources as definition creation methods ... but without much success. My gut feeling is that the current definition is a synthesis, without direct lineage relationship with any one specific source; it is also too expansive as the "in vertebrates..." sentence should probably be reformulated as a set of hyponyms (but that is a style matter).
So ... there are several choices.
I could rewrite the definition so that it is derived from a particular source.
this could take the form of copying directly from a suitably licensed source and including the citation.
I could do nothing (harking back to the concept that sources "verify the content").
Some thinking out loud: Well, by looking at some quotes, it seems that whether the brain is included or not is disputed, or maybe it's just that "brain and nervous system" is a synonym? It's also possible to find lots of derived terms, "central nervous system" "peripheral nervous system" where do they fit into the picture? Looking at the picture on the entry, it seems to illustrate "the set of nerves in an animal, sometimes including the brain" - however looking closer at the books results, some authors clearly include "receptors" - are these part of the nerves or the nervous system, some authors also seem to count the spinal cord separately from nerves? Maybe an alternative approach is needed, given that the actual constituents are not agreed upon. This is harder, because there are few available quotes using the term in a non-technical sense. Intuitively for me, the nervous system is "the organ in an animal responsible for communicating messages"; searching for plants nervous system is perhaps useful here, clearly a "nervous system" in plants cannot have a brain, spinal cord or indeed nerves; the same goes for robots, though there always seems to be a qualifier "synthetic nervous system" or "robot nervous system"; looking further into google books it's also possible to find broken nervous system - seems to be the nervous system you have during a nervous breakdown. My final definition would be something like "that part of an organism that it uses to control or monitor itself"; and maybe also "The organs in an animal that make up its nervous system", perhaps having subsenses "## {{context|vertebrates}} The nerves and spinal cord sometimes including the brain and receptor cells" and so on. I hope this is helpful, though it's only my approach to this problem - there are plenty of cites on google books for both definitions, though beware of including those that define the word in the quotation - see Use-mention distinction.
Conrad.Irwin 19:00, 18 December 2009 (UTC)
Importing definitions is frowned upon, the only large scale import of definitions was a complete fiasco, and many of the definitions imported from Webster 1913 have still not been cleaned up or updated. There are also large warnings about copying translations from dictionaries - particularly online translations dictionaries just look for thousands of lists of words without doing any verification at all; dord is another example of a famous dictionary just getting it plain wrong. Conrad.Irwin19:00, 18 December 2009 (UTC)Reply
Thanks
Conrad.Bot has a bug with entries with "etymology 2"?
Hi, I could be wrong but I think Conrad.Bot might have a bug. It has chopped about 2/3 of the article from "slough" twice now. The one thing that is uncommon about "slough" is that it has an "etymology 2" section which has another noun section which shows other meanings (e.g. "marsh", "bog" for slough). Could you have a look at the history and maybe check the code? Thanks! Facts707 08:50, 20 December 2009 (UTC)
This might help - it looks like the bot was working on "silver" and then went to "slough". It worked fine on "silver" but then seems to have got mixed up and put some of the "silver" items into "slough":
Hi, thanks for pointing this out, the real cause of the problem was bad error handling (since fixed) which meant that when the bot failed on "get next page" it assumed it had failed on "save page" and so kindly re-saved the page (though with the new title). Sorry for the mess. Conrad.Irwin09:53, 20 December 2009 (UTC)Reply
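The failure mode described above, where a fetch error is mistaken for a save error and stale text is re-saved under the next title, can be avoided by keeping the two steps in separate `try` blocks. The loop below is a hypothetical reconstruction for illustration only: `fetch`, `transform` and `save` are stand-in callables, and `IOError` stands in for whatever exception the real bot framework raises.

```python
def process(titles, fetch, transform, save):
    """Fetch, edit and save each page; never save text from another page."""
    saved, failed = [], []
    for title in titles:
        try:
            text = fetch(title)
        except IOError:
            # Nothing was fetched, so there is nothing valid to save.
            failed.append(title)
            continue
        new_text = transform(text)
        try:
            save(title, new_text)
        except IOError:
            failed.append(title)
            continue
        saved.append(title)
    return saved, failed
```

Because `new_text` is only ever computed from the text fetched for the same `title`, a failure while fetching "slough" can no longer cause the previous page's content to be written there.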
Latest comment: 14 years ago3 comments2 people in discussion
Hi there Conrad. Would it be possible for you to add the ability for your bot to do Esperanto anagrams, by chance? If you can, great! If not, that is fine as well :). Cheers, Razorflame 22:12, 20 December 2009 (UTC)
Actually, now that I think about it more in depth, I don't think that this would be a worthwhile endeavour because I don't think that there will be very many anagrams. Sorry for the disruption, Razorflame22:17, 20 December 2009 (UTC)Reply
A problem with diacritics and extra letters
Latest comment: 14 years ago6 comments3 people in discussion
A problem I've noticed: Serbian words in Latin script (e.g. opšti) are being included because of the way they're listed with "*: Latin:" for the Latin script form. --EncycloPetey23:27, 21 December 2009 (UTC)Reply
Thanks, but no rush. I'm not making much use of the Latin Index (yet). The Galician index, however, has been exceedingly useful, as I've been working through it to add terms and translations based on what we already have, then adding Galician terms the Portuguese Wiktionary has (but we don't), and then adding further terms found neither here nor there. Once that's done, I may try to fill in additional Galician words based on the list the Galician Wiktionary has, but their list is one massive category not sorted by POS :P , so I want as big a head start as possible first. --EncycloPetey 23:52, 22 December 2009 (UTC)
See also the recent history of ]. (Incidentally, by "bad bot" (see edit summary there) I did not mean the bot is bad, but, rather, was, in jest, admonishing it as one might a child. Your bot is grand.)—msh210℠19:00, 28 December 2009 (UTC)Reply
Conrad.bot errors
No, not really. It could be modified to accept a name of a page to archive to, but that would complicate it considerably (and would anyone actually notice that it was autofilled wrong?). Conrad.Irwin22:27, 27 December 2009 (UTC)Reply
Latest comment: 14 years ago2 comments1 person in discussion
Hi Conrad,
If you have a chance, could you take a look at this edit for me? When I test it in my userspace, it works exactly as I expect, but when I try it for real, weird garbage (literal {{#if:'s and such) ends up in entries (and in the template page, for that matter). At first I was thinking that maybe it was due to cases of explicitly blank lang= (as opposed to the parameter just being missing), but no: that doesn't seem to have been the case, and further, testing in my userspace doesn't suggest that that would cause this sort of problem. I'm hoping a second pair of eyes will help.
Latest comment: 14 years ago6 comments2 people in discussion
OK, ready? I'll see whether I can de-administratize you, and you then try to perform an admin action (e.g. deletion of created nonsense) once I indicate I've attempted the change. Then, you let me know what happened. Repeat your attempt to perform an admin action again after a few minutes to see whether there was a delay of some kind. If de-sysopping happens, then I'll reactivate you. If not, well then nothing happened. I'll make the attempt only after you've indicated readiness. --EncycloPetey 22:08, 28 December 2009 (UTC)
Latest comment: 14 years ago3 comments2 people in discussion
Hello there Conrad. I would like to nominate you for bureaucrat because I believe that you have all the right qualities to be a great bureaucrat. I honestly don't think that four bureaucrats are enough for a Wiki this size, and I think that one more should add just enough coverage to be sufficient enough that we wouldn't need any more. Please let me know your decision on my talk page or here. Thanks, Razorflame16:03, 29 December 2009 (UTC)Reply
I was the one offering to nominate you, though :(. Anyways, yeah, feel free to talk it over with the bureaucrats :). I definitely think that we can do with one more. Cheers, Razorflame16:36, 29 December 2009 (UTC)Reply
Very good; I'll remember this and change the page. I did avoid copying the definition verbatim, as that would be a blatant copyvio. Evidently there is a different attitude here about references than on Wikipedia -- though that is not really surprising considering the different format and purpose. (I'm still learning the ropes.) Thanks again! The Fiddly Leprechaun01:47, 31 December 2009 (UTC)Reply
categorytree