(based on Wiktionary:Entry layout)
Note 1: This guide is intended to provide guidelines both for creating Sanskrit entries on English Wiktionary and for adding Sanskrit translations to English words. The main guidelines for creating any entry on English Wiktionary are set forth in Wiktionary:Entry layout; this page is an addition to that page, not a replacement.
Note 2: If a change to the basic Wiktionary template (currently at Wiktionary:Entry layout) affects Sanskrit entries, that change should be reflected here.
"Sanskrit" on Wiktionary refers not only to Vedic and Classical Sanskrit, but also to the broad dialect continuum of Old Indo-Aryan languages that gave rise to the Middle and Modern Indo-Aryan languages. Some literature uses a narrower definition, but for lexicographic simplicity it was agreed to use a broad one per this discussion.
Practically, this means:
{{lb|sa|Vedic}}
to both categorize the term as "Vedic Sanskrit" and display a "Vedic" label.
The Sanskrit language has no single script associated with it. The script predominant in India, both historically in written literature and today, is Devanagari. Entries in Wiktionary may be in any script for which there is usage. However, all words should at least have a Devanagari entry.
The same word in other Indic scripts may be referenced under the Alternative scripts header; see WT:ELE. In scripts other than Devanagari, it often suffices to define a term as <Other Script Name> form of <Devanagari Equivalent> with {{sa-sc}}.
In the Translation section of English terms, Sanskrit entries should be presented in Devanagari, e.g. at horse:
* Sanskrit: {{t|sa|अश्व|m}}
The headword/inflection line should show the term in Devanagari or another Indic script, with the IAST transliteration in parentheses and optional Vedic accent marks on vowels where present; for example, at अश्व (áśva):
{{sa-noun|g=m|tr=áśva}}
As pitch accent was lost in Classical Sanskrit, in many cases, the position of the original accent is unknown. In such cases, accent can be ignored.
For Sanskrit nominals, the lemma form is the stem (in the case of adjectives, the masculine stem). See here for an overview of Sanskrit declension. For instance, for a-stems, the lemma ends in -a (not the nominative -aḥ).
Active participles are sometimes given in dictionaries as ending in either -at or -ant. Our policy is to use -at for the lemma, as in भवत् (bhavat, “present”); the -ant form can be optionally listed here as an "Alternative form". Similarly, perfect participles should end in -vāṃs, like विद्वांस् (vidvāṃs, “understanding”).
Following the style of Proto-Indo-European and related languages, the Sanskrit "root" is a basic unit of meaning from which verbal and nominal forms are derived.
The root is often in the zero-grade, like भृ (bhṛ), but can sometimes be in other grades, like धा (dhā). In some printed dictionaries where compactness is required, the √ symbol signifies that a term is a Sanskrit root, like √dā. Because there is no such space constraint in Wiktionary, the √ symbol should always be avoided. Instead, refer to roots like "from the root Sanskrit दा (dā)", etc.
Wiktionary makes some distinctions between "roots" and "verbs" that some dictionaries, like Monier-Williams' Sanskrit-English dictionary, do not. Some roots, like दा (dā), are "true" non-prefixed roots, have a number of verbal/nominal derived forms, and have an entry in dictionaries like Whitney's dictionary of roots. The current practice is to include prefixed roots, like संधा (saṃdhā), as valid "root" entries.
Some other given "roots" in dictionaries like Monier-Williams essentially only correspond to the present-tense verbal forms. In such cases, there is no Wiktionary root, and the verbal form is given without a root, like खणखणायते (khaṇakhaṇāyate). This distinction can be blurry. In general, it is best to include just a verbal lemma and only add a root lemma if one knows what they are doing.
Sanskrit verbs are lemmatised in the third-person singular present active indicative. As discussed here, the following are valid "verbal" lemmas:
Notably, this means that the imperfect is non-lemma (the present tense is the lemma) and the conditional is non-lemma (the future is the lemma), among others.
In the non-present forms, a definition like "perfect of जन् (jan)" produced by {{inflection of|sa|जन्}}, along with the conjugation given by {{sa-conj}}, usually suffices.
If a root form exists for some lemma, it should be linked to in the headword template (e.g. {{sa-noun}}, {{sa-adj}}, {{sa-verb}}, etc.). This will categorize the word appropriately. {{root|sa|inc-pro}}, {{root|sa|iir-pro}}, and {{root|sa|ine-pro}} should also be used in the "Etymology" section where appropriate.
The template {{inflection of}} identifies the lemma form and the particular inflected form of the entry.
All infinitives, gerundives, and past passive participles are considered non-lemma forms of the root in Sanskrit. Active and medio-passive participles are considered non-lemma forms of their respective verbal lemmas.
Frequently, active/medio-passive/passive participles are also considered "adjectives" or "nouns" in their own right. In such cases, like कृत (kṛta), there should be a participle section (non-lemma) defined with {{inflection of}} and an adjective/noun section (lemma) with the other relevant definitions and inflections.
Sanskrit literature chronologically encompasses more than three millennia of written and oral record. As such, owing especially to its detachment from the spoken language after the codification of Classical Sanskrit by Pāṇini around the 5th century BCE, Sanskrit words came to develop a plethora of often widely divergent meanings. Some of these are confined to a particular chronological period, a particular literary style, or a particular author, work, or tradition. All of these meanings merit inclusion per the criteria for inclusion for extinct languages. Monier-Williams' Sanskrit-English dictionary employs several hundred abbreviations, listed after a particular semantic group (which itself corresponds to a single Wiktionary definition line), for this purpose. Wiktionary shall employ the same set of abbreviations by means of a quote provided by the {{Q}} template, which accepts the abbreviation without the final dot and automatically fills in the metadata from Module:Quotations/sa/data (which can be expanded as necessary).
Such abbreviations should come bulleted following every definition line. For example, the second definition line of दृष्टि (dṛṣṭi) is in the Monier-Williams dictionary given as:
sight, the faculty of seeing, ŚBr.; Mn.; Suśr. &c;
which translates into Wiktionary syntax as:
# [[sight]], the faculty of seeing
#* {{Q|sa||ŚBr}}
#* {{Q|sa||Mn}}
#* {{Q|sa||Suśr}}
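The conversion from a Monier-Williams citation tail to {{Q}} lines is mechanical, so it can be sketched in a few lines of Python. This is an illustration only, not an existing Wiktionary tool: the function name mw_abbrevs_to_q is hypothetical, and it assumes that abbreviations are separated by semicolons and that "&c" is filler to be discarded.

```python
def mw_abbrevs_to_q(citations: str, lang: str = "sa") -> list[str]:
    """Turn an MW-style citation tail such as 'ŚBr.; Mn.; Suśr. &c;'
    into bulleted {{Q}} quotation lines (hypothetical helper)."""
    lines = []
    for chunk in citations.split(";"):
        # Assumption: "&c" (et cetera) carries no citation and is dropped.
        abbrev = chunk.replace("&c", "").strip()
        # {{Q}} takes the abbreviation without its final dot.
        abbrev = abbrev.rstrip(". ")
        if abbrev:
            lines.append("#* {{Q|%s||%s}}" % (lang, abbrev))
    return lines


if __name__ == "__main__":
    for line in mw_abbrevs_to_q("ŚBr.; Mn.; Suśr. &c;"):
        print(line)
```

Each abbreviation must still exist in Module:Quotations/sa/data for {{Q}} to fill in its metadata, so the output should be checked by hand before saving an entry.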
See the #References section below for specific details on good Sanskrit references.
This section always appears at level 3 as ===References===. It should conclude the language section, and should never be placed within any subheader. It will include all references for the Sanskrit section as a group. If there are multiple etymologies corresponding to different terms that are homonyms, do not include a separate level 4 ====References==== section for all the different words; instead, use the "<ref>" tag to reference specific citations throughout the subsections and use "<references/>" under the level 3 ===References=== section.
The standard transliteration system for Sanskrit on Wiktionary is exclusively IAST; all of the other dozen or so commonly used transliteration schemes, such as Harvard-Kyoto or ISO 15919, are forbidden. Transliterations shall appear in the inflection line with the tr= parameter, and everywhere else where they are commonly used, such as in mentions in prose with {{m}}. Transliterations are not mandatory for listings of Sanskrit lexemes, such as inside ====Related terms==== or appendices.
Entries written in IAST transliteration shall not appear in the main namespace. Commonly used English terms originating from Sanskrit that approximately correspond to transliterated Devanagari are subject to WT:CFI for English lexemes, and as such shall be formatted under ==English== rather than ==Sanskrit== L2 headers.
For reference purposes, the following templates are available for dictionaries that are out of copyright and freely available at various places on the Web:
{{R:MW}} – the popular Monier-Williams Sanskrit-English dictionary. This template accepts a single unnamed parameter: the page number in 4-digit format. So, for example, page 1 is referenced as {{R:MW|0001}}, page 234 as {{R:MW|0234}}, page 1234 as {{R:MW|1234}}, and so on.
{{R:Cappeller Sanskrit-English}} or {{R:CAP}} – Cappeller's dictionary. See the template page for instructions.
{{R:MCD}} – Macdonell's dictionary (1929 reprint). This template accepts a single unnamed parameter: the page number in 3-digit format.
{{R:WIL}} – Wilson's dictionary. This template accepts a single unnamed parameter: the page number in 3-digit format.
For example, the entry on अंश (áṃśa) has the following ===References=== section:
===References===
{{R:MW|0001}}
{{R:CAP|001}}
{{R:MCD|001}}
{{R:WIL|001}}
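The zero-padded page numbers are easy to get wrong by hand. The Python sketch below builds the template call from a plain page number; it is only an illustration, and the names PAGE_WIDTH and reference_call are hypothetical. The widths are the ones stated above (4 digits for {{R:MW}}, 3 for {{R:MCD}} and {{R:WIL}}); {{R:CAP}} is left out because its parameters are described on its own template page.

```python
# Page-number widths as described above (assumed to be the only parameter).
PAGE_WIDTH = {"R:MW": 4, "R:MCD": 3, "R:WIL": 3}


def reference_call(template: str, page: int) -> str:
    """Return a reference template call with a zero-padded page number."""
    if template not in PAGE_WIDTH:
        raise KeyError("no page width known for %s" % template)
    width = PAGE_WIDTH[template]
    if page < 1 or len(str(page)) > width:
        raise ValueError("page %d does not fit in %d digits" % (page, width))
    # str.zfill pads on the left with zeros up to the requested width.
    return "{{%s|%s}}" % (template, str(page).zfill(width))


if __name__ == "__main__":
    print(reference_call("R:MW", 234))
```

Calling reference_call("R:MW", 234) yields {{R:MW|0234}}, matching the format given for page 234.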
"Sanskrit" in Wiktionary actually refers to a dialect continuum of Old Indo-Aryan languages (see #Scope). {{R:CDIAL}} is one of the best resources for reconstructing terms based on New and Middle Indo-Aryan, but it is not always clear about whether the reconstructed term is early Middle Indo-Aryan (which Wiktionary calls Ashokan Prakrit) or Old Indo-Aryan (which Wiktionary calls Sanskrit). Between Old Indo-Aryan and Middle Indo-Aryan, there are a few key changes:
For example, for Hindi अधूरा (adhūrā, “incomplete”), CDIAL gives a reconstructed ancestor *ardhapūraka, which is Old Indo-Aryan as it contains the complex consonant cluster rdh. We therefore class it as reconstructed Sanskrit *अर्धपूरक (ardhapūraka). For Hindi बोलना (bolnā), CDIAL gives the reconstructed ancestral root bōll (with the equivalent present-tense lemma form bollati), which is phonetically valid for Middle Indo-Aryan. Hence, it would be more correct to label the ancestor of the Hindi word as Ashokan Prakrit *𑀩𑁄𑀮𑁆𑀮𑀢𑀺 (*bollati) from the root *𑀩𑁄𑀮𑁆𑀮𑁆 (*boll), rather than Sanskrit *बोल्लति (*bollati), in the absence of an attested form. The Old Indo-Aryan ancestor of बोलना (bolnā) would simply be considered unclear in such a case.
For many other words in descendant languages like Hindi, Bengali, and Marathi, there is a clear, attested ancestor in Sanskrit. In such cases, most of the above advice can be disregarded and the Sanskrit term is given as the ancestor of the modern-language term.
This situation is made much more complex by the Sanskritization of Prakrit forms and by hyper-Sanskritization (i.e. hypercorrection of Middle Indo-Aryan Prakrit forms in an attempt to "reconstruct" the Sanskrit form). As a general rule, words should not be listed as inherited or derived from hyper-Sanskritized forms. In a few cases, the lines of what inherits from what are not entirely clear. It may be helpful to use {{rfe|sa}} or start a discussion on the term's talk page.
As with other Wiktionary languages, apply the principle of "translate lemmas with lemmas":
Sometimes, we know there is a problem, but don't know what to do to correct the problem. If you should find a Sanskrit entry with a problem that you do not know how to correct, there are several ways to approach the situation.
{{attention|sa}}. This template will add the entry to Category:Requests for attention concerning Sanskrit, where another user can then find and correct the problem. It helps if you include comments on the entry's talk page explaining what the problem is or why you think the page needs attention.
{{rfc}}. This is a more general cleanup tag, and it allows the user to include reasons or concerns as an argument in the template. Be sure to also add an entry to WT:RFC concerning the word so that other editors will be made aware of the problem.