This is a Wiktionary policy, guideline or common practices page. This is a draft proposal. It is unofficial, and it is unknown whether it is widely accepted by Wiktionary editors.
Proto-Koreanic here refers to historical stages of Korean which are reconstructed solely on the strength of internal Middle Korean evidence, modern dialectal forms, or comparative evidence from Korean borrowings into other languages. For more on how this is done, see Appendix:Koreanic reconstructions. However, this is an extremely uncertain field. Proto-Koreanic should not normally be lemmatized, due to the paucity of well-understood sound laws and Korean's lack of dialectal diversity. The only exceptions are:
In the traditional linguistic periodization of Korean established in the 1970s, Old Korean is the stage of the Korean language up to the tenth century, with the fall of the Silla kingdom and the unification of the country by the Goryeo dynasty marking the transition to Middle Korean. But because very few primary sources on pre-fifteenth century Korean were known in the 1970s, this traditional periodization was based almost entirely on the conjecture that a dynastic change probably meant linguistic change.
The discovery in the 1990s of a sizable corpus of "interpretive gugyeol" (석독구결 / 釋讀口訣) texts, Korean-language glosses to the Buddhist canon written between the tenth and thirteenth centuries, greatly expanded the primary source material that linguists of Korean had at their disposal. As the interpretive gugyeol data was subjected to more detailed analysis, it was discovered that the Korean language of the interpretive gugyeol glosses was orthographically and grammatically much more similar to what survives of first-millennium Old Korean than to fifteenth-century Middle Korean. Accordingly, the growing consensus in South Korean academia is to classify all interpretive gugyeol texts as "Late Old Korean". Some Western scholars, such as Alexander Vovin, have also recently adopted the new schema. John Whitman, who as of 2015 continued to follow the traditional periodization, still noted that "much of our knowledge of OK comes from materials dating from the Koryŏ (918–1392) period".
Wiktionary follows the novel South Korean consensus and classifies all pre-fourteenth century Korean-language texts as sources of Old Korean. This has a number of advantages:
Only entries attested from texts written in Old Korean by Koreans are considered uncontroversially valid for mainspace entries. These include:
When citing hyangga, take care to note that Cheoyong-ga (처용가 / 處容歌), Seodong-yo (서동요 / 薯童謠), and Pung-yo (풍요 / 風謠) are all believed to be from the twelfth or thirteenth century, not their claimed date of composition. This must be noted in the quotation. The claimed date of composition may be taken as largely factual for the other twenty-one hyangga.
There are a few wordlists for Old Korean written by Chinese visitors, the most significant of which is the twelfth-century Jilin leishi. However, these Chinese transcribers were simply transcribing what they heard, unaware of the rules of Old Korean orthography, and as a result produced orthographically invalid forms. For instance, the Jilin leishi writes the Old Korean words for "one" as 河屯 and "two" as 途孛, but we know that the actual way Koreans wrote these words was 一等 and 二尸.
It is not clear whether the words given in these lists can be used as the basis of their own entries. For now, the following guidelines stand:
Many Old Korean morphemes are reconstructed from proper nouns given in the traditional histories. For example, the twelfth-century history Samguk sagi gives a large number of placenames, personal names, and titles in two forms. One form generally appears to be a translation into Classical Chinese of the meaning of the name, and the other seems to be a transcription of the pronunciation using Chinese characters. Linguists have reconstructed non-Chinese morphemes by comparing the translation form to the transliteration form.
However, this is not considered sufficient attestation for an independent Wiktionary entry. As the Wikipedia article on the placenames of the Samguk sagi discusses, many of the morphemes reconstructed in such a way may not have been Korean at all, but reflect a Japonic or other substratum.
As with Chinese wordlists, references to such reconstructions are strongly recommended in the Phonology sections of otherwise attested Old Korean entries, and in the Etymology sections of likely Middle and Modern Korean reflexes. See 거칠다 (geochilda) for an example.
There are a few terms attributable to the languages of the ancient Korean kingdoms of Baekje and Goguryeo. These languages have their own ISO 639-3 language codes and are not suitable as Old Korean entries.
Reconstructions should not normally be made for Old Korean.
Due to the opaque nature of hyangga orthography and the lack of a canonical translation in the primary sources, the language of the hyangga poems is difficult to parse. In Wiktionary, only some lemmas attested solely in hyangga works are considered suitable for inclusion:
{{unk|okm}}. Examples:
For the purposes of scholarly opinion, only interpretations that postdate the late 1970s must be considered, as the principles of Old Korean orthography were not correctly understood before then. In particular, the readings of Shinpei Ogura and Yang Chu-dong are not considered valid.
Old Korean forms are given in the Chinese characters of the original attestations, not their reconstructed phonetic value.
Old Korean forms transcribed only by Chinese logograms, without any phonographic element, should not be included. An example is 我 (“I; me”) in the gugyeol glosses. There is simply nothing one can say about these forms other than their meaning, which is in any case identical to that of the Chinese characters with which they are written.
Gugyeol glosses are usually drastically abbreviated, e.g. 隱 is written as 𠃍. However, the source Chinese characters are the forms used for entry titles because:
When quoting primary sources, the actual gugyeol abbreviations are preferred. The source characters may be used instead if this is not feasible, but the fact that the abbreviations have been replaced by their sources should be explicitly noted.
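The relationship between abbreviated glosses and entry titles can be sketched in Python. This is only an illustration and not part of any Wiktionary module: the names `GUGYEOL_SOURCES` and `expand_gugyeol` are hypothetical, and the table holds only the single pair mentioned on this page (𠃍 for 隱).

```python
# Hypothetical lookup table: gugyeol abbreviation -> source Chinese character.
# Only the one pair given on this page is included; a real table would be larger.
GUGYEOL_SOURCES = {"𠃍": "隱"}


def expand_gugyeol(text: str) -> str:
    """Replace each known gugyeol abbreviation with its source character.

    Characters that are not in the table are passed through unchanged.
    """
    return "".join(GUGYEOL_SOURCES.get(ch, ch) for ch in text)
```

A function of this kind would apply to entry titles only. Quotations keep the abbreviated forms where feasible, so they would not be passed through it.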
Reconstructed romanizations are conventionally given in the Yale Romanization of Korean, and preceded with an asterisk. Per scholarly convention, romanizations for elements of an Old Korean phrase which are orthographically represented by a logogram are given in capital letters. Example:
Only the second syllable *li is phonetically represented (as 理), so the unrepresented first syllable that we fill in with the Middle Korean reflex is given in capitals.
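The capitalization convention can be expressed as a small Python helper. This is a sketch under stated assumptions: the function name `format_reconstruction` and its input shape (a list of romanized segments, each flagged as logographic or not) are hypothetical and do not correspond to an existing template or module.

```python
def format_reconstruction(segments):
    """Build a reconstructed romanization from (yale, is_logogram) pairs.

    Segments represented only by a logogram are upper-cased, phonographically
    written segments are lower-cased, and the result is prefixed with '*'.
    """
    parts = [
        yale.upper() if is_logogram else yale.lower()
        for yale, is_logogram in segments
    ]
    return "*" + "".join(parts)


# "ka" is a placeholder syllable, not a real reconstruction; only "li"
# (written phonographically as 理) comes from the text above.
example = format_reconstruction([("ka", True), ("li", False)])
```

With the placeholder input shown, the logographic first segment comes out in capitals and the phonographic second segment stays in lower case.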
Given our poor understanding of Old Korean phonology, IPA pronunciations must not be added.
Middle Korean after the invention of Hangul is very well-attested and well-understood. Per scholarly consensus, Korean-language texts produced up to 1600 are considered Middle Korean, and subsequent texts are considered Early Modern Korean. Many late sixteenth-century texts, especially informal ones such as personal letters, show Early Modern Korean features. But for the sake of consistency, the year 1600 is used as a definitive boundary date on Wiktionary. Take care to note that many texts published in the seventeenth and eighteenth centuries are attributed to the Middle Korean period, but are linguistically clearly Early Modern. For example, almost all known sijo poems are linguistically Early Modern works, even though many are ascribed to poets who would have written in Middle Korean.
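The date boundaries used on this page can be summarized in a short Python sketch. The function name `stage_for_year` is hypothetical, and the sketch assumes that `year` is the date at which the text was actually produced, not the date to which it is traditionally attributed. It also reads "pre-fourteenth century" as "up to and including 1300". The page does not state where Early Modern Korean ends, so later dates are not subdivided.

```python
def stage_for_year(year: int) -> str:
    """Return the stage of Korean assigned to a text produced in `year`.

    Boundaries follow this page: texts before the fourteenth century are
    Old Korean, and 1600 is the last year counted as Middle Korean.
    """
    if year <= 1300:
        return "Old Korean"
    if year <= 1600:
        return "Middle Korean"
    # The end of the Early Modern period is not given on this page.
    return "Early Modern Korean or later"
```

A date-based rule of this kind is only a first approximation. As noted above, a work printed after 1600 but attributed to an earlier author has to be judged by its actual language.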
The entry titles for Middle Korean terms should be written in the Hangul script as invented by Sejong, without tone marks. However, the tone should ideally be marked within the entry itself.
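Deriving an entry title from a tone-marked form amounts to removing the tone marks. The Python sketch below assumes that the tone marks are encoded with the Unicode characters U+302E (HANGUL SINGLE DOT TONE MARK) and U+302F (HANGUL DOUBLE DOT TONE MARK); sources that use a middle dot or colon instead would need extra handling. The function name `strip_tone_marks` is hypothetical.

```python
# U+302E HANGUL SINGLE DOT TONE MARK, U+302F HANGUL DOUBLE DOT TONE MARK
TONE_MARKS = "\u302e\u302f"


def strip_tone_marks(text: str) -> str:
    """Return `text` with Hangul tone marks removed, for use as an entry title."""
    return "".join(ch for ch in text if ch not in TONE_MARKS)


# Illustrative input only: a syllable followed by a double-dot tone mark.
title = strip_tone_marks("말\u302f")
```

The tone-marked form itself would still be shown inside the entry, since only the title omits the marks.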
A phonemic IPA pronunciation may be added for Middle Korean based on the scholarly consensus on fifteenth-century Korean phonology (see Middle Korean#Script and phonology). It is not clear whether a phonetic transcription is appropriate, given the ongoing dispute over the exact vowel qualities of Middle Korean, although it would help readers unfamiliar with Korean to understand, for example, that intervocalic /l/ is actually [ɾ].
In addition to Hangul, Middle Korean was also written in Sinographic systems such as Idu and "consecutive gugyeol". Middle Korean terms in these scripts should be marked with the template {{spelling of}}, which links back to the Hangul form. For an example, see 遣 (Yale: -kwo).
Forms attested only in 칠대만법 / 七大萬法 must be marked with {{lb|okm|Gyeongsang}}.
Per this discussion in October 2020, nouns are lemmatized at connective forms, and verbs, adjectives, and verbal suffixes at allomorphic forms with 다 (Yale: -ta). It is strongly recommended that soft redirects be made for the alternative forms. Examples:
This is because nouns are being theoretically lemmatized at a stem, whereas verbs are (by tradition) lemmatized at an actual inflected form.
The following templates give detailed conjugations. However, several dozen parameters must currently be entered manually. It is hoped that the templates can eventually be modularized, which should remove this problem.
{{okm-conj/L}}
{{okm-conj/L!}}
{{okm-conj/H}}
{{okm-conj/H!}}
{{okm-conj/R}}
{{okm-conj/R!}}
{{okm-conj/HH}}
{{okm-conj/됴타}}
Per the discussion at Wiktionary:Language treatment requests/Archives/2020-24 § RFM discussion: January–February 2022, Early Modern Korean is now listed separately from Modern Korean under the code "ko-ear". For terms that have remained in continuous use from Early Modern times to now, there is no need to add a separate Early Modern listing; however, this is up to the entry creator.
Note also that many Middle Korean texts were reprinted in the Early Modern era. Usually, some of the language was modernized while other parts were left in their Middle Korean state. One example is 횩다 (Yale: hyokta), which was already considered archaic by the seventeenth century and is nowadays cited as an example of a distinctively Middle Korean form, but nonetheless continued to appear in print into the nineteenth century. Ideally, only terms that appear in an original composition of the Early Modern era should be considered Early Modern and be used with the "ko-ear" language code.
Idu and gugyeol texts continued to be produced into the Early Modern era. But as their highly formulaic phrases did not undergo any real shift during the transition from Middle Korean to Early Modern Korean, all post-thirteenth century idu and consecutive gugyeol forms are grouped as orthographic variants of Middle Korean instead of Early Modern Korean.