The goal of the Bulgarian Lemma Improvement Project (BLIP) is to raise the overall standard of Bulgarian lemma entries on English Wiktionary. To that end, the project proposes specific actions editors can take, grouped into tiers. The tier system is meant to help match project expectations to individual editors' available time and energy, as well as language fluency. In particular, Tier-1 is designed to be within reach for every currently (2023) known active Bulgarian editor.
The "north star" of this project is that, in the fullness of time, every Bulgarian lemma entry should have, at a minimum:
We envision that this will be an ongoing rather than a fixed-duration project, since it naturally competes with other worthy goals such as increasing Bulgarian coverage. The hope is that, over time, the baseline quality of Bulgarian lemma entries increases, thereby also encouraging a higher standard for new entries.
Participation in this project can take many forms, depending on editors' availability and personal interest. We're happy with on-and-off, regular, time-limited or even one-off participation. We're also happy with editors choosing their focus - e.g. whether they want to work on entries starting with a particular letter, or on a particular subject, or of a particular grammatical category, etc.
We encourage, but do not require, participants to put "BLIP: " or "" (or their lowercase equivalents) in their edit messages to indicate that an edit is related to this project. There is also a section at the end of this document where editors can optionally let others know what they're working on.
There are currently four tiers, each consisting of a set of recommended edits to an entry, if the entry needs them. The tiers are designed to be cumulative - e.g. Tier 2 implicitly subsumes everything under Tier 1. However, it's OK for editors to make later-tier changes before earlier-tier ones, if that's consistent with their interests and inclination. It's also OK to split the overall work into multiple edits - e.g. you can bring an entry to the Tier-2 bar by making a Tier-1 edit first, and later an incremental Tier-2 edit.
All that said, it is the strong hope of this project that all lemma entries meet at least the Tier-1 quality bar. Tier 1 should be accessible to any editor who can make use of one of our standard Bulgarian online dictionaries (e.g. see {{R:bg:RBE}}).
This is the "baseline" tier that expresses our vision of a minimally viable Bulgarian entry. While a lot of entries already meet that bar, many don't. For BLIP to be considered a success, the proportion of Bulgarian entries that meet this bar needs to be very high.
Every entry should have a pronunciation section. Based on the currently available Bulgarian templates, a pronunciation section should have, at a minimum:
{{bg-IPA}}, indicating the correct stress for polysyllabic words.
{{bg-IPA}} line.
|endschwa= parameter to be set to 1.
{{bg-hyph}}.
The part of speech should be one of the allowable parts of speech listed in WT:EL (see link above).
For parts of speech that have a custom Bulgarian headword-line template - e.g. {{bg-noun}} - that template should be used. In all other cases, the {{head}} template should be used - e.g. {{head|bg|prefix}}. See the category above for currently available Bulgarian headword templates.
Some basic expectations:
p (for plural).
{{bg-verb}} for how to do this.
{{bg-adj}}.
{{bg-adv}}.
For Tier 1, what we're looking for is quality translations from Bulgarian to English. Usage examples and quotations will be covered by subsequent tiers.
There is already a wealth of English-Bulgarian and Bulgarian-English dictionaries in existence, so there isn't much new ground to break here. PONS is a pretty decent bi-directional online dictionary, and useful if you're unsure about a translation. If you're unsure about how to translate something, ask another editor, or post on Wiktionary:Tea room.
Be sure that you have familiarized yourself with the distinction between a translation, a gloss definition (see {{gloss}}) and a non-gloss definition (see {{ng}}). In a nutshell - a translation makes sense when there's a direct English equivalent of a Bulgarian word; otherwise, you need a gloss or non-gloss definition. Both gloss and non-gloss definitions explain the meaning of a word, but gloss definitions can be substituted for the word in a sentence. E.g. if the verb "to flaffle" was given the definition "to make a gurgling sound", then you could substitute that definition in the sentence "He flaffled" → "He made a gurgling sound." That's a gloss definition. If, instead, you had defined "to flaffle" as "an onomatopoetic verb that describes making a gurgling sound", then you couldn't replace "flaffle" with that in the example sentence - it's a non-gloss definition.
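As a rough sketch, the three kinds of definition line might look like this in wikitext. The "flaffle" definitions are the hypothetical ones from the paragraph above, and the clarifying text inside {{gloss}} on the first line is made up for illustration; check the documentation of {{gloss}} and {{ng}} for the current parameters.

```wikitext
<!-- Translation: a direct English equivalent, here with a short clarifying gloss -->
# [[balloon]] {{gloss|inflatable toy}}
<!-- Gloss definition: can be substituted for the word in a sentence -->
# to make a gurgling sound
<!-- Non-gloss definition: describes the word rather than standing in for it -->
# {{ng|An onomatopoetic verb that describes making a gurgling sound.}}
```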
Recommendations:
{{gloss}} to list out the meanings that apply to the Bulgarian lemma. See for example балон (balon) - it has 5 of the 14 meanings of English balloon, each listed out separately.
For Tier 1, don't worry about labels ({{lb}} and {{tlb}}) unless you feel comfortable adding them. Tier 2 goes more in depth on label use.
Verbs should have a "Conjugation" subsection which uses {{bg-conj}}. Nouns and adjectives should have a "Declension" subsection, utilizing {{bg-ndecl}} and {{bg-adecl}}, respectively.
Expectations:
/-voc in the options to {{bg-adecl}}. See the template's documentation on how to do that.
{{bg-ndecl}} (they are suppressed by default). Use of the vocative in modern standard Bulgarian is limited outside of given names, and it takes a certain amount of background to know where it makes sense to be included. Such background is not assumed for Tier 1.
Every entry should ideally have at least one reference in its "References" section.
The most common type of reference for Bulgarian entries is a dictionary reference. We have templates for several popular dictionaries which are available online, and the most commonly used ones are {{R:bg:RBE}} and {{R:bg:RBE2}}. It's often a good idea to just add those two, and double-check in Preview Mode before publishing that the generated links actually work. If you click on them and they take you to a URL where you can find definitions for the word, then they work. When a word is missing from {{R:bg:RBE}}, you should check if it's in {{R:bg:Infolex}}.
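A minimal "References" section along those lines might look like the sketch below. It assumes that both templates look up the page title when called without parameters; if the links in Preview Mode don't resolve, consult each template's documentation for how to pass the headword explicitly.

```wikitext
===References===
<!-- Assumption: with no parameters, each template links to the current page title -->
* {{R:bg:RBE}}
* {{R:bg:RBE2}}
```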
A user-edited online slang dictionary a la Urban Dictionary is available via {{R:bg:BGJ}}. A dictionary of neologisms is available via {{R:bg:Neolex}}. The Bulgarian Etymological Dictionary is available via {{R:bg:BER}}; however, etymology is considered out of scope for Tier 1. This is primarily to allow non-native editors who are concurrent language learners to contribute to Tier-1 improvements. BER is not an easy read, and incorporating it in an entry is often not as simple as just translating the Bulgarian text into English.
Some words may be too new to have made it into Bulgarian dictionaries yet. Don't worry about such words for Tier 1.
Tier 2 builds on the bar set by Tier 1, by providing users with better clues on how Bulgarian words are used in practice, as well as by indicating relevant stylistic and grammatical considerations. The latter include, among other things, whether a word is dialectal, obsolete, uncountable or derogatory.
The second goal of Tier 2 is to improve the discoverability of Bulgarian entries, by:
In Tier 2, we want to take advantage of the ability to specify related word forms that some Bulgarian headword templates provide.
Where applicable, consider specifying:
{{bg-noun}}: relational adjectives, feminine/masculine equivalents, diminutives and augmentatives (undocumented but available via |aug=).
{{bg-verb}}: as already expected in Tier 1, the imperfective or perfective counterpart of the verb
{{bg-adj}}: abstract nouns, adverbs and diminutive forms
Not every noun, verb or adjective will have all (or any) of these additional forms. Some may have multiple applicable forms, e.g. multiple noun diminutives (as in вода (voda)).
We use labels to add grammatical, stylistic and topic categorization information to entries. Grammatical information includes things like whether a noun is uncountable, or a verb is transitive. Stylistic information captures whether a word is colloquial, derogatory, dialectal, archaic, etc. Topic categorization lets users know whether a word is e.g. a physics or an art history term. Topic labels often automatically add an entry to a particular topic category.
The two main label templates are the term label: {{tlb}}, and the context label: {{lb}}. A term label applies to all listed senses of a word, and is placed directly after the headword template. A context label applies to a specific word sense, and is placed in front of it. In other words, if you find yourself applying the same label to all senses of a word, it should most likely be a term label. For a sense label example, see реотан (reotan). For a term label example, see лих (lih) (which also uses several sense labels). For information on what labels are available, consult the documentation of {{lb}}.
reflexive-se and reflexive-si.
{{lb|uncountable|dialectal|agriculture}}.
Labels often automatically put an entry into a category, so when you add or modify labels, always check for red-link (not yet created) categories at the bottom of the entry, and create them using {{auto cat}}.
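To illustrate where each kind of label goes, here is a hedged sketch of a headword line followed by two senses. The {{head|bg|noun}} headword line and the definition text are placeholders, and passing bg as the first parameter is an assumption based on how {{lb}} and {{tlb}} are normally called; the label names are the ones mentioned above.

```wikitext
<!-- Term label: applies to every sense, placed right after the headword template -->
{{head|bg|noun}} {{tlb|bg|dialectal}}

<!-- Context labels: apply to one sense each, placed in front of the definition -->
# {{lb|bg|agriculture}} first definition (placeholder)
# {{lb|bg|uncountable}} second definition (placeholder)
```

After saving, check the bottom of the entry for red-link categories that these labels may have generated.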
To make Bulgarian entries more discoverable, we should ensure they're listed in the "Translations" sections of the corresponding English entries. That way, Wiktionary can function as a bidirectional English-Bulgarian/Bulgarian-English dictionary. While many English entries list their Bulgarian translations that are also on Wiktionary, quite a few don't.
Well-crafted English entries have separate translation tables per word sense, making it possible to add Bulgarian translations that match the correct sense(s). That's not true for all English entries, and some English entries don't even have a Translation section. Translations are added using a Wiki gadget called TranslationAdder (see WT:TADDER), which is enabled for everybody by default.
Note that not all Bulgarian entries will have corresponding English entries on Wiktionary. That could be because there's no adequate English translation in general, or because the English translation hasn't yet been added to Wiktionary, or e.g. because the English translation would be considered a "sum-of-parts" (WT:SOP), and thus ineligible for addition to Wiktionary. The rest of this section assumes that you're working with a Bulgarian entry which has English translations, and those translations have entries on Wiktionary.
A lot of this is already covered by the links provided in "See also" at the top of this section.
For each listed sense of a Bulgarian lemma:
{{trans-top}} and {{trans-bottom}}. See the template docs, and any English entry for a common word, to get an idea of how these two are used.
{{trans-top}} and {{trans-bottom}}, giving it an appropriate gloss to serve as a heading.
Save button that will show in your browser when you're done. When adding Bulgarian translations:
bg language code and hit TAB to load applicable checkboxes.
Click on the "What links here" link under "Tools" in the Wiktionary main menu. This should list all the English pages where you added the Bulgarian word as a translation. You may see additional English pages, which means other editors before you added the Bulgarian word as a translation of those additional English words. Double-check those translations, removing incorrect ones and/or adding word stress as needed.
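For orientation, a translation table in an English entry might end up looking roughly like the sketch below once a Bulgarian translation has been added. The heading level, the gloss text and the use of {{t}} for the translation line are assumptions for illustration; TranslationAdder normally generates the translation line for you, so you rarely need to type it by hand.

```wikitext
====Translations====
<!-- The gloss passed to trans-top serves as the heading of the table (illustrative wording) -->
{{trans-top|inflatable object}}
<!-- Bulgarian translation with word stress marked and gender given -->
* Bulgarian: {{t|bg|бало́н|m}}
{{trans-bottom}}
```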
In Tier 2, we start complementing the quality English translations provided in Tier 1 with example sentences and quotes. This gives learners, translators and anyone else interested in Bulgarian an even better idea of a lemma's actual usage in the language.
General guidelines for adding example sentences can be found at Wiktionary:Example sentences. The main template for formatting example sentences is {{ux}}, and its variant {{uxi}} for short examples. Guidelines for adding quotations can be found at Wiktionary:Quotations. There are several templates - such as {{quote-book}} and {{quote-journal}} - depending on the type of durable media quoted.
Another kind of example is a collocation - a combination of two or more words that commonly go together in a language, such as "tight budget". A Bulgarian example is потомък (potomǎk, “descendant”), which includes the example collocation пряк потомък (prjak potomǎk, “direct descendant”). Collocations are not complete sentences, but they show users common word combinations that they might encounter in practice. For general guidelines on adding collocations, see Wiktionary:Collocations. Collocations go before example sentences.
For the scope of Tier 2, the main expectation is that editors add well-chosen collocations and/or example sentences. Quotations are a stretch goal, except in the case where a word has no applicable dictionary references. In those cases, at least one quotation is required. Well-chosen examples don't just mention a word, but rather use it in a way that helps a reader form a mental picture of the word. For instance, if you had to give an example for "chair", a not-so-good example might be: "On top of the pile of trash there was a chair." In this example, the word "chair" is mentioned, but not in a way that represents what a chair is or does. A better example might be: "The old man sat in his favorite chair in front of the TV." It is, indeed, common to sit in a chair in order to watch television, so this example captures more of the real-life use of the word. Very common words like "chair" don't necessarily need an example sentence, but the principle applies in general.
Stress should be indicated at a minimum on the word for which examples, collocations or quotations are being given. We recommend that all stressed words in collocations and example sentences indicate their stress. For quotations, use your judgment - older and dialectal texts might contain words stressed differently from the contemporary standard language, so add stress when you're sure you're right. Remember that Bulgarian prepositions, conjunctions, pronouns and a few other word types are usually pronounced together with the word that follows, so they wouldn't get their own stress (unless done for emphasis).
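A minimal sketch of an example sentence under a sense, using {{ux}}, could look like this. The sentence and its translation are made up for illustration and are not taken from any existing entry; the headword is bolded and carries a stress mark, as recommended above.

```wikitext
# [[cat]]
<!-- Example sentence: the word being illustrated is bolded and stressed -->
#: {{ux|bg|'''Ко́тката''' спи.|The '''cat''' is sleeping.}}
```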
Topic categories help organize entries by subject matter - e.g. astronomy, music, fabrics, biological taxa, etc. They provide another way for users to discover entries in a language, taking advantage of users' interests, hobbies or professional needs.
As previously discussed, certain context labels will automatically add entries to topic categories. Additional topic categories are added to an entry using the {{C}} template, e.g. {{C|bg|Physics}}. Using this template is preferred over raw category markup, e.g. ] - in fact, there is a maintenance category listing Bulgarian entries with raw category markup. If you want to save yourself some typing, you could also add topic categories via the HotCat gadget.
A good starting point for considering what topic categories to add to a Bulgarian entry is to look at the topic categories of the corresponding English entry (if one exists). If there is no corresponding English entry, use information provided in Bulgarian dictionaries, Wikipedia, or your personal understanding of the subject matter. Topic categories have subcategories, and it's often best to assign an entry to the most specific (sub-)categories that apply to it. It is, however, not a requirement or expectation that every entry should be assigned to topic categories.
The English category tree can be found at Category:en:All topics. It is a superset of the Bulgarian category tree Category:bg:All topics, because we simply haven't created all the same categories for Bulgarian yet. Note that you can't just make up any category name and have it work properly with Wiktionary - see Module:category_tree/topic_cat/data for the names of recognized topic categories and subcategories. This is where looking at the topic categories applied to an equivalent English entry can save you some time.
It will sometimes be the case that a valid category you specify via {{C}} won't exist for Bulgarian yet - in that case, click on the red link and create the category by setting its text contents to {{auto cat}}. That template works with the category tree module, and will automatically do the right thing if you're using a recognized category name.
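Put together, the two steps might look like the sketch below. It assumes the category in question is "Physics", as in the earlier example, and that {{C}} is placed near the end of the Bulgarian entry.

```wikitext
<!-- In the Bulgarian entry: add the topic category -->
{{C|bg|Physics}}

<!-- On the category page itself, if it shows up as a red link: this is the entire page text -->
{{auto cat}}
```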
Tiers 1 and 2 ensure that Wiktionary is a good standard Bulgarian-English and, to an extent, English-Bulgarian dictionary. That's a great milestone to reach, but it falls short of the unique strengths that Wiktionary brings to the table compared to standard bilingual dictionaries. One of those strengths is that Wiktionary entries can, and often do, include the etymologies of words and expressions, deepening users' understanding, and revealing the unique and often surprising histories of those words and expressions. Connections with other languages - both within the same language family and outside of it - are elucidated, as are connections between words in the same language.
Bulgaria has sat at the crossroads of civilizations throughout its history, and that's reflected in its lexical makeup. Alongside a solid Slavic core formed by inheritance, derivation and borrowing, there are influences from Greek (through prolonged contact), Latin, Ottoman Turkish (and, through it, Classical Persian and Arabic), French, German, Italian, and a host of other languages, including English. Tier 3 is about providing etymologies for Bulgarian entries, as well as the related and derived terms revealed by etymological information.
The main reference source for etymologies of Bulgarian words is the Bulgarian Etymological Dictionary (BED, also referred to by its Bulgarian abbreviation BER). As of this writing (November 2023), eight volumes have been published, of which the first seven - covering words up to терясвам (terjasvam) - are available online for free. We use {{R:bg:BER}} to cite the dictionary. Dictionary entries contain the headword's origin, cognates and derived terms (among other more detailed information).
BED will often provide the Old Church Slavic etymon for modern Bulgarian words, if it is attested in the OCS canon. A lot of Bulgarian entries today are missing this link, and instead show inheritance directly from Proto-Slavic. Make sure entries with etymology sections (including those you write yourself) reflect inheritance/derivation from OCS whenever possible. For words that aren't available in BED, check out {{R:sla:EDSIL}} or {{R:sla:ESSJa}}.
There are a few things to watch out for when citing BED:
#turkic channel on Discord, or you can request that the term be added by specifying {{l|ota||tr=<Latin equivalent>}} (if you don't know the Arabic-script form).
#balto-slavic channel on Discord.
{{inh+}}, {{der+}} and {{bor+}} and when to use them.
{{inh+}} for the nearest ancestor language, and {{inh}} for its ancestors.
{{bor+}} with the immediate donor language, and {{der}} for its donor language, etc.
{{dercat}} is available to indicate that a word has additional ancestors without including them in the etymology section text explicitly.
{{dercat}}. However, if there is a direct, reconstructed Proto-Indo-European (PIE) ancestor, it's OK to go up to PIE. English entries often do that.
{{cog}} and {{ncog}}.
{{desc}}.
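As a hedged sketch of how these templates combine, here are two possible etymology lines. The first assumes that прав (prav) is inherited through an attested OCS form правъ from Proto-Slavic *pravъ; the second uses изправност (izpravnost), which this document describes as a borrowing from Russian. Treat both as illustrations of template placement and verify the actual etymologies against BER before using them.

```wikitext
===Etymology===
<!-- Inherited word: inh+ for the nearest ancestor, plain inh for earlier ancestors -->
{{inh+|bg|cu|правъ}}, from {{inh|bg|sla-pro|*pravъ}}.

===Etymology===
<!-- Borrowed word: bor+ names the immediate donor language -->
{{bor+|bg|ru|исправность}}.
```

The language codes used here (cu for Old Church Slavonic, sla-pro for Proto-Slavic, ru for Russian) are the standard Wiktionary codes.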
BED often includes a rich list of derived terms, some of which may be dialectal, archaic or obsolete. Use your judgment on how many of them to include in the "Derived terms" section of an entry. It can also be helpful to look up those derived terms in our standard dictionaries.
To keep things neatly organized, separate derived terms by part of speech by using e.g. {{col-auto|bg}} with |title=adjectives. {{col-auto}} will figure out the number of columns for you in case there are multiple derived terms, as well as whether or not to collapse the list. It is also used by the specific template we have for derived verbs - {{bg-derived verbs}}. For an example tying all of these together, see черпя (čerpja) or кадър (kadǎr). If a word only has a few derived terms, it's OK to forgo these templates and list the derived terms directly.
Derived terms should indicate word stress. Derived verbs should further indicate aspect, which you get (almost) for free by using {{bg-derived verbs}}.
Words in the "Related terms" section generally have the same morphological root as the main lemma, but aren't derived from it using affixes. For example, if you're working on the entry for изправям (izpravjam), related terms include прав (prav) and правило (pravilo), all three of which ultimately come from *pravъ.
There are no hard and fast rules about how many related terms to include in an entry, so use your judgment. It might be a good idea to try and reduce duplication across entries - for instance, rather than copying all the derived terms of черпя (čerpja) into each derived term's "Related terms" section, you could simply put черпя (čerpja) in that section. A user clicking on черпя (čerpja) would then see all of its remaining derived terms. For general guidance, see Wiktionary:Entry layout#Related terms. By that guidance, arguably a better related term for изправям (izpravjam) might be изправност (izpravnost), since it looks like it might be derived from the verb's past passive participle, but it's actually borrowed from Russian исправность (ispravnostʹ).
Bulgarian words and expressions get borrowed by other languages too! Romanian is the most common such language, followed primarily by other languages spoken in the Balkans. English sometimes has unadapted Bulgarian borrowings related to traditional culture, such as gadulka and sharena sol. BED usually indicates if a Bulgarian word is borrowed by other languages.
Generally, if you list a foreign-language word as a descendant of a Bulgarian word, you should also make sure that the foreign-language word's etymology section indicates that it was borrowed or derived from Bulgarian. In practice, since you likely don't know all the languages that borrow from Bulgarian, you should exercise caution and be open to collaboration. In particular:
Tiers 1 to 3 spell out a roadmap for making Wiktionary a capable explanatory, etymological and synonym Bulgarian dictionary, all in one! If we've gotten this far consistently for most Bulgarian entries, we're in really good shape and we should be proud of our work. Tier 4 is about going the extra mile and taking full advantage of Wiktionary's unique format and strengths.
All activities listed here are optional and largely independent, so editors can pick and choose the ones they're passionate about - it all qualifies as Tier 4 work. Of these, adding audio recordings is a particularly helpful activity, especially for current or potential Bulgarian learners.
Wiktionary gives us the opportunity to add native-speaker recordings to entries. User:Kiril kovachev has done an incredible amount of good work towards increasing Bulgarian audio coverage, so if he's still around when you're reading this, you might want to collaborate with him on that. For guidance on uploading audio files, check out Wiktionary:Pronunciation#Audio files. Once an audio file has been uploaded to Commons, you can add it to an entry using {{audio}}.
In Tier 2, we got started on adding quotations for Bulgarian words. Per WT:ATTEST, a word has to either be in "clearly widespread use" (e.g. котка (kotka)), or we need to provide "at least three independent instances spanning at least a year". Since Bulgarian is a "well-documented language" (see WT:WDL), it is subject to that attestation requirement. Tier 4 is a good time to add quotations for each applicable word sense towards meeting the minimum count.
For lemmas that would benefit from a picture (e.g. пъстърва (pǎstǎrva), more so than краставица (krastavica)), you can add it using ]. You put this right after the L2 language header and before any of the L3 headers.
There is also the ability to add a picture dictionary to an entry, usually for entries that represent "umbrella" terms with many individual examples. A Bulgarian entry with a picture dictionary is гризач (grizač). See {{picdic}} and {{picdicimg}} for more information, and take a look at the English pages where those templates are used to get an idea of applicability.
Colors, months, zodiac signs, chemical elements and days of the week are examples of closed sets of related terms. Wiktionary allows us to organize them in lists and tables, so that users can get an at-a-glance view, and quickly navigate to any of the terms in the set. Check out Category:Bulgarian auto-table templates and Category:Bulgarian list templates for what we have today, and how these templates are used. It's also good to check out the English ones, for an idea of where else these might be applicable.
The systematic taxonomic names of living things are sometimes available as Translingual entries on Wiktionary, but not always. Our sister project Wikispecies is devoted specifically to the cataloguing of species and their "vernacular names", i.e. non-scientific names in different languages. See the template {{taxlink}} for including a link to Wikispecies.
These are just some of the ways to enrich Wiktionary entries - look around, see what you like, and consider bringing it over to Bulgarian entries!
If you'd like to let your fellow Bulgarian editors know what you're working on as part of this project, feel free to update the list below. This is completely optional, and it's only there for visibility and to help prevent duplicate effort.
Current participants: