Wiktionnaire:Actualités is a monthly periodical about French Wiktionary, dictionaries and words, published online since April 2015. Everyone is welcome to contribute to it. You can sign up to be notified of future issues, read old issues and take part in drafting the next edition. You can also have a look at Regards sur l’actualité de la Wikimedia. If you have any comments, criticism or suggestions, our talk page is open!
On November 17 and 18, two study days on dictionary creation will take place in Lyon. The first day will be devoted to discussing methods and practices, with a session dedicated to the Wiktionary project. The second day will be hands-on, with group training in dictionary writing for the Wiktionary project, focused on the ten words of the Francophonie. The event is co-organized by Lyokoï and Noé, two editors of Actualités!
Statistics provided by Wikiscan:
A vote to split existing thesauri with ambiguous titles led to the creation of: cirque (naturel) and cirque (spectacle); langue (anatomie) and langue (linguistique); paresseux (animal) and paresseux (personne); assimilation culturelle and assimilation (biologie); racine (végétale), racine (odontologie), racine (linguistique), racine (informatique), racine (géologie) and racine (figuré et sociologique).
By the way, Assassas77 is still working on thesauri in Tagalog, adding six more!
As of October 31, 2017, the French Wiktionary contains 317 thesauri in French and a total of 452 thesauri in 54 languages!
There are 23 new thesauri this month, five of which are in French: punition, peine de mort, prison (first thesaurus creation by Classiccardinal!), armure and tissage.
The Questions on words page (WT:QM) records 189 questions in October, compared to 197 in September, 141 in August and 124 in July.
In natural language processing, several operations can be used to build tools for a language. Richard Khoury and Francesca Sapsford set out to build a Latin stemmer from the English Wiktionary, as they report in their article “Latin word stemming using Wiktionary” (Digital Scholarship in the Humanities, volume 31, number 2, June 2016, pages 368–373). Their approach uses the database and the links between pages, which are specified in very precise declension templates, to link roots to endings for verbs and to suffixes for nouns. Starting from a database dump of May 2015, they applied three cleaning steps and obtained 655,434 word forms for 32,860 roots.
The best tool available before their experiments, the Schinke stemmer, works on a different principle: it relies on a set of rules that stem automatically by creating hypothetical roots. These are not always valid words, but they nevertheless reduce the number of distinct words in a text, making it easier to search with a search engine.
Comparing the two tools, they observe that the one based on Wiktionary misses words it does not know but nevertheless reduces the vocabulary of a text much more effectively. It also gives direct access to a dictionary of definitions afterwards, which is not possible with the earlier tool. They even plan to extend their use of the Wiktionary database to include the part-of-speech categories of entries, in order to produce an additional tool for morphosyntactic tagging of a corpus.
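To make the contrast between the two approaches more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the small `FORM_TO_ROOT` table stands in for the form-to-root data extracted from Wiktionary, the `SUFFIXES` list is a made-up handful of endings and not the actual Schinke rule set, and the function names are hypothetical. It does not reproduce the authors' implementation or their results.

```python
# Toy form-to-root table. In the article this role is played by data
# extracted from the English Wiktionary; these entries are illustrative.
FORM_TO_ROOT = {
    "rosa": "ros", "rosae": "ros", "rosam": "ros", "rosarum": "ros",
    "amo": "am", "amas": "am", "amat": "am", "amamus": "am",
}

# Hypothetical suffix list, longest first. NOT the Schinke rule set.
SUFFIXES = ["arum", "amus", "ae", "am", "as", "at", "a", "o"]


def lookup_stem(word):
    """Dictionary-based stemming: return the known root of a word form,
    or the word unchanged when the form is not in the table."""
    word = word.lower()
    return FORM_TO_ROOT.get(word, word)


def rule_stem(word):
    """Rule-based stemming: strip the first matching suffix, keeping at
    least two letters. The result is a hypothetical root that may not be
    a valid word."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            return word[: -len(suffix)]
    return word


print(lookup_stem("rosarum"))  # known form, mapped to its root
print(lookup_stem("puellae"))  # unknown form, returned unchanged
print(rule_stem("puellae"))    # rule-based guess, works on any word
```

The lookup version can only stem forms it already knows, which matches the weakness noted in the review, while the rule version handles any word but may return strings that are not real roots.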
These uses show that the Wiktionary projects contain data that are not only usable as a dictionary, but also allow, through their regular structures, reuse by machines to create new tools — a review by Noé.
Some remarks about the role of patrollers:
Patrollers are editors who spend some of their time reading contributions made on Wiktionary.
They have a tool which tells them about changes still in need of patrolling. Only anonymous contributions or edits made by users lacking the "auto-patrol" flag have to be checked.
After proofreading, they can mark a contribution as patrolled.
Being patrolled means free of vandalism in a broad sense, which implies:
These are the basic actions of the patroller. In this context, patrollers who are not administrators may need to ask an administrator to hide contributions containing defamation, personal information or copyright violations.
Then, if they wish, patrollers can go further and work on presentation, with various possible additional actions such as:
Last but not least, and by far the most interesting, they can investigate the substance, checking the accuracy of a contribution or even providing additional information or corrections.
It must be said that this part is by far the most time-consuming and also the hardest.
Thus, it is possible:
Concerning this last point, it is necessary to have a certain level of linguistic skill, extensive reference material on a large number of languages and knowledge of the grammar of several languages, which is not the case for everyone.
Translation errors are indeed numerous, although made in good faith, often because metonymy does not work the same way in all languages. This means that it can be disastrous to copy a translation found elsewhere (dictionary, Wikipedia, etc.).
For example, many languages distinguish by different names the action from its result, the content from its container, the building from the institution, etc., where the French language does not necessarily do so. Thus, in Finnish: loading (action): kuormaus / loading (what is loaded): kuormitus; the town hall (the building): kaupungintalo / the town hall (administration): pormestarin
And of course, we find the same problem in the opposite direction, from Finnish to French.
It is, however, quite rare to encounter outright mistranslations. I remember one on the English Wiktionary, several years ago, that amused me. Intrigued to find several web pages giving the Inuktitut word anaullaut, and knowing that this word means stick, I did some research and traced it back to a contributor who had found the entry anaullaut: bat in an Inuktitut/English dictionary and had created the entry on the English Wiktionary, placing it in Category:Animals. This was then reused and translated into French by other websites.
Alas for him, it was the English word bat in the sense of batte (as in baseball), not the animal.
If you have also noticed some crazy or funny contributions, do not hesitate to report them here for a future issue. — a chronicle by Unsui
What happens when Wiktionary becomes a reference against its own will? When discussing the sources of our project, it becomes clear that they aren't structured at all like those on Wikipedia. We don't share the same attitude towards original research and could even count as a source ourselves. Well, actually, we already do. And I can prove it with the little dictionary of the month, a pocket reference which gives an overview of “French vocabulary borrowed from Gaulish, Breton and the Celtic languages”. Yann Lukas shows us some familiar words and some with an unexpected Celtic origin. He suggests Celtic roots for some slang words where standard dictionaries are at a loss: à dache, loufer, morfal and many more.
But on page 62 we find a funny turn of phrase: “Tamis: although disputed, the Gaulish etymology of tamis is tempting. In his Dictionnaire des étymologies obscures (Payot, 1982), Pierre Guiraud opts for a Latin origin stamen, also the root of étamine. Wiktionary prefers the Low Franconian tamisa (source of Old Dutch teems).” So we are cited in a recent etymological analysis, even though our hypothesis for tamis isn't very solid. It was in fact added by an IP without any sources, and other users have built on top of it. Still, it shouldn't be discarded entirely, since an etymologist has given it some credit.
Apart from this small appearance, which might bring us fame (or not), or at least acknowledgment, this short dictionary of Celtic words is filled with anecdotes about Celtic languages that help us better understand their place in our world today. We also get to wonder at how Breton has been saddled with words that do not belong to it: menhir (the Bretons say peulvan), dolmen (they say lichaven), kermesse (from the Flemish kerkmisse) or even triskèle (from Greek and written as triskell to make it look more Celtic). — a chronicle by Lyokoï
This section gives you a monthly selection of videos related to linguistics or the French language. Don't hesitate to add more videos you find!
Initiated by the Tremendous Wiktionary User Group, the LexiSessions suggest monthly themes to simultaneously engage all Wiktionaries.
The themes are suggested in advance on Meta and announced every month on Wikidémie, the main community portal.
The October LexiSession was about punishment and led to the creation of three thesauri!
For the month of November, we invite you to take an interest in toilets!
Three days of encounters provided plenty of time to meet the roughly 100 people who came to talk about their projects, ranging from personal contributions to dynamic collectives springing up all over the world.
The team of "Actualités du Wiktionnaire" was present in Strasbourg to cover the event and to collect enough material to fill the next few issues, and of course to promote Wiktionnaire in every conversation!
With two talks and one project meeting, Wiktionnaire was well represented among a number of high-quality presentations.
Just to mention a few topics brought up by participating editors of Wiktionnaire: inclusion of African languages, support for new participants, audio recordings with Lingua Libre, organisation of Edit-a-thons and the vitality of collective initiatives.
On this occasion we also met two researchers from the Logoscope project who are interested in cooperating with Wiktionnaire. Expect some updates on this very soon!
Among the six main functions of language defined by Roman Jakobson, the phatic function ensures that the communication channel works well.
It covers words and expressions like "you see" or "do you follow?" but also words used at the beginning of a telephone conversation, such as allô?.
Marina Yaguello extends the analysis to all discourse whose only goal is to keep the conversation going without conveying any information at all.
Because a dictionary concentrates on the level of sentences and words, it has difficulty describing these usages. On the one hand, the terms used vary a great deal and written attestations are not always easy to find.
On the other hand, it is hard to explain the function of these terms well.
Sometimes they are entire sentences, including a verb, emptied of their meaning in order to fulfil a purely communicational purpose.
— a chronicle by Noé