--[=[
-- Explanations are at the bottom of this page.
-- The diacritics are small: to see them clearly, zoom your browser to 200%.
INSTRUCTIONS
Load this module using require(), not using mw.loadData().
USE e.g.: local m_data = require("Module:XXX/data")
IF: local module_path = 'Module:Yyyy'
USE: local m_data = require(module_path .."/data")
DO NOT USE: local m_data = mw.loadData("XXX")
HOW to call it:
m_data.xxxxxx e.g. m_data.unaccented_to_oxia
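A minimal sketch of a single-character lookup, under these assumptions:
strip_accent is a hypothetical helper name, not part of this module, and
"Module:XXX/data" is the placeholder path used above.
	local m_data = require("Module:XXX/data")
	local function strip_accent(ch)
		-- fall back to the input when the character has no entry in the table
		return m_data.accented_to_unaccented[ch] or ch
	end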
CONTENTS
a) simple sequences
b) conversions
accented_to_unaccented // unaccented_to_oxia // perispomeni_to_oxia // oxia_to_perispomeni
c) diphthongs and digraphs (2-vowel-sequences)
digraphs // digraphs_accent_back // digraphs_accented_to_unaccented
PROBLEMS SOLVED
UNORTHODOX characters: they exist only in some old editions, where whole words are set in capitals that retain their diacritics.
= Principle: if one part of a sequence is pasted as a literal character, paste the other part as a literal character too (do not mix literals with unicode escapes).
= Write them in a .txt file, display it as .htm, and copy-paste from there.
* CAPITAL+diaeresis+tonos.
= Copy the capital with diaeresis (as one character) and copy the invisible combining tonos right after it.
= The combining tonos characters are invisible on their own.
Example: IOTA.with.diaeresis+tonos as Ϊ + the invisible oxia ◌́ (U+0301)
* CAPITAL+prosgegrammeni iota+tonos
Example: Άͅ ALPHA.prosgegrammeni+tonos as ᾼ + the invisible oxia (U+0301, &#x0301;); write it in a .txt file, display it as .htm, and copy.
OPEN PROBLEMS
* FORBID all font families that render the tonos or oxia accent as a small vertical line, e.g. Verdana.
A reader may have a personal css with such a font. Can this be controlled?
* Can it be done with a specific unicode code point? Show how it is written.
* For dichr_oxia I do not know how to write all the prosodies.
]=]--
local export = {}
-- NEED: FORBID all family-fonts that present the accent tonos or oxia as a small vertical line.
--------------------------------------------------------------------------
-- a) SIMPLE SEQUENCES --
--------------------------------------------------------------------------
-- ?? Are all the UNORTHODOX characters needed here?
-- vowel+perispomeni (circumflex)
-- These are always long (macron), so no prosody marks are needed
export.vowel_perispomeni = ''
-- brachy (short) + oxia or baria (acute or grave); all these accents are called tonos
-- There are no prosody marks.
export.brachy_oxia = ''
-- macron (long) + oxia or baria (acute or grave); all these accents are called tonos
-- There are no prosody marks.
export.macron_oxia = ''
-- diphthong (2 vowels together) + any tonos (oxia, baria, perispomeni)
-- NOT with dialytics: ΐῒῗΰῢῧ
-- These are always long (macron), so no prosody marks are needed
export.diphthong_tonos = ''
-- NOT all of them are listed
-- ?? Do the characters with prosody marks need to be written in the function?
--[=[
-- The 3 ambiguous dichrona (dichronon = with 2 possible prosodies) are α ι υ
-- Here, we also need the characters with BOTH PROSODIES
short alpha+tonos ᾰ̓́ - Ᾰ̓́ - ᾰ̔́ - Ᾰ̔́ iota upsilon copypaste from a .txt
long alpha+tonos .. iota upsilon copypaste from a .txt
]=]--
-- dichronon + oxia or baria (acute or grave); all these accents are called tonos
export.dichr_oxia = ''
-- all vowels+oxia or baria, or perispomeni (any kind of tonos accent)
export.tonos = ''
--------------------------------------------------------------------------
-- b) CONVERSIONS (change the characters) --
--------------------------------------------------------------------------
-- to see them, zoom in 170% or 200%
--------------------------------------------------------------------------
-- ?? TODO: add notes with the unicode code points (or other codes) too
-- remove accent from accented
export.accented_to_unaccented = {
-- alpha: ambiguous dichronon -- do I need +prosodies here?
-- α no spirits
= 'α',
= 'Α',
= 'ᾳ',
= 'ᾼ', -- UNORTHODOX write ALPHA.with.iota + invisible unicode tonos at .txt, show it at .htm and copypaste
= 'α',
-- ?? ALPHA + perispomeni -- UNORTHODOX
= 'ᾳ',
-- ?? ALPHA.with.i + perispomeni -- UNORTHODOX
-- with psile
= 'ἀ', = 'Ἀ', = 'ᾀ', = 'ᾈ',
= 'ἀ', = 'Ἀ', = 'ᾀ', = 'ᾈ',
-- with dasia
= 'ἁ', = 'Ἁ', = 'ᾁ', = 'ᾉ',
= 'ἁ', = 'Ἁ', = 'ᾁ', = 'ᾉ',
-- ε epsilon (always brachy = short = never perispomeni circumflex)
= 'ε', = 'Ε', = 'ἐ', = 'Ἐ', = 'ἑ', = 'Ἑ',
-- η eta (always macron = long)
= 'η', = 'Η',
= 'ῃ',
-- ?? ETA.with.i + oxia -- UNORTHODOX
= 'η',
-- ?? ETA + perispomeni -- UNORTHODOX
= 'ῃ',
-- ?? ETA.with.i + perispomeni -- UNORTHODOX
-- with psile
= 'ἠ', = 'Ἠ', = 'ᾐ', = 'ᾘ',
= 'ἠ', = 'Ἠ', = 'ᾐ', = 'ᾘ',
-- with dasia
= 'ἡ', = 'Ἡ', = 'ᾑ', = 'ᾙ',
= 'ἡ', = 'Ἡ', = 'ᾑ', = 'ᾙ',
-- iota: ambiguous dichronon -- do I need +prosodies here?
-- ι no spirits -- possible diaeresis (dialytics)
= 'ι', = 'Ι',
= 'ϊ',
-- IOTA+dialytics+tonos -- UNORTHODOX
-- https://www.compart.com/en/unicode/U+0390 decomposed as Ι (U+0399) - ◌̈ (U+0308) - ◌́ (U+0301)
-- 1.FAILED write this at .txt, show at .htm and copy: Ϊ́
-- 2.FAILED write this at .txt, show at .htm and copy: Ϊ́ which is= Ϊ (IOTA.diaeresis) + (U+0308) - ◌́ (U+0301)
-- 3.YES copypaste IOTAwithdialytics+ copypaste invisible tonos Ϊ́ that is Ϊ +
-- = when you do not use unicode at one part, then do not use unicode to the other part either
= 'Ϊ', -- this is 3.
= 'ι',
-- ?? IOTA + perispomeni -- UNORTHODOX
= 'ϊ',
-- ?? IOTA.with.dialytics + perispomeni -- UNORTHODOX
-- with psile
-- ?? psile okseia, psile perisp does not convert to IOTA WITH PSILI (U+1F38) in accent shifts
= 'ἰ',
= 'Ἰ', -- = 'Ἰ'
= 'ἰ',
= 'Ἰ', -- = 'Ἰ',
--with dasia
= 'ἱ',
= 'Ἱ',
= 'ἱ', = 'Ἱ',
-- dialytics ???
-- omicron (always brachy = short = never perispomeni circumflex)
= 'ο', = 'Ο', = 'ὀ', = 'Ὀ', = 'ὁ', = 'Ὁ',
-- upsilon: ambiguous dichronon -- do I need +prosodies here?
-- υ no spirits -- possible diaeresis (dialytics)
= 'υ', = 'Υ',
= 'ϋ',
-- ?? UPSILON.with.diaeresis + oxia -- UNORTHODOX
= 'υ',
-- ?? UPSILON + perispomeni -- UNORTHODOX
-- ?? UPSILON.with.diaeresis + perispomeni -- UNORTHODOX
-- with psile
= 'ὐ', = 'ὐ',
-- with daseia
= 'ὑ', = 'Ὑ',
= 'ὑ', = 'Ὑ',
-- ω omega (always macron = long)
= 'ω', = 'Ω',
= 'ῳ',
-- ?? OMEGA.with.i + oxeia -- UNORTHODOX
= 'ω',
= 'ῳ',
-- with psile
= 'ὠ', = 'Ὠ', = 'ᾠ', = 'ᾨ',
= 'ὠ', = 'Ὠ', = 'ᾠ', = 'ᾨ',
-- with daseia
= 'ὡ', = 'Ὡ', = 'ᾡ', = 'ᾩ',
= 'ὡ', = 'Ὡ', = 'ᾡ', = 'ᾩ',
}
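-- A minimal sketch of applying the table above to a whole word. Assumptions:
-- the Scribunto function mw.ustring.gsub is available, and remove_accents is
-- a hypothetical name, not part of this module. It replaces one code point at
-- a time, so the UNORTHODOX sequences built from a base character plus an
-- invisible combining tonos would need separate handling.
--   local function remove_accents(word)
--       -- the extra parentheses drop the replacement count returned by gsub
--       return (mw.ustring.gsub(word, '.', function(ch)
--           return export.accented_to_unaccented[ch] or ch
--       end))
--   end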
--------------------------------------------------------------------------
-- place accent (oxia) on unaccented
-- for unaccented-to-perispomeni (circumflex, for polytonic): see oxia_to_perispomeni
-- ?? NEED: get more pairs & all UNORTHODOX
export.unaccented_to_oxia = {
-- alpha
= 'ά',
= 'Ά',
= 'ᾴ',
= 'ἄ',
--
= 'ἅ',
--
-- epsilon
= 'έ',
= 'Έ',
= 'ἔ',
--
= 'ἕ',
--
-- eta
= 'ή',
= 'Ή',
= 'ῄ',
--
= 'ἤ',
--
= 'ἥ',
--
-- iota
= 'ί',
= 'Ί',
= 'ΐ',
--
= 'ἴ',
--
= 'ἵ',
--
-- omicron
= 'ό',
= 'Ό',
= 'ὄ',
--
= 'ὅ',
--
-- upsilon
= 'ύ',
= 'Ύ',
= 'ΰ',
--
-- with psile
= 'ὔ',
--
-- with daseia
= 'ὕ',
--
-- omega
= 'ώ',
= 'Ώ',
= 'ῴ',
--
-- with psile
= 'ὤ',
--
= 'ᾤ',
--
-- with daseia
= 'ὥ',
--
--
--
}
--------------------------------------------------------------------------
-- replace perispomeni (circumflex) with oxia (acute)
-- this is for polytonic
export.perispomeni_to_oxia = {
-- alpha
= 'ά',
--
= 'ᾴ',
--
-- with psile
= 'ἄ',
= 'Ἄ',
= 'ᾄ',
--
-- with daseia
= 'ἅ',
= 'Ἅ',
= 'ᾅ',
--
-- eta
= 'ή',
--
= 'ῄ',
--
-- with psile
= 'ἤ',
= 'Ἤ',
= 'ᾔ',
--
-- with daseia
= 'ἥ',
= 'Ἥ',
= 'ᾕ',
--
-- iota
= 'ί',
--
-- with psile
= 'ἴ',
= 'Ἴ', -- from psile + perispomeni (U+1F3E)
-- and dialytics?
-- with daseia
= 'ἵ',
= 'Ἵ',
-- and dialytics?
-- upsilon
= 'ύ',
--
-- and dialytics?
-- with psile
= 'ὔ',
--
-- with daseia
= 'ὕ',
= 'Ὕ',
-- omega
= 'ώ',
--
= 'ῴ',
--
-- with psile
= 'ὤ',
= 'Ὤ',
= 'ᾤ',
--
-- with daseia
= 'ὥ',
= 'Ὥ',
= 'ᾥ',
}
--------------------------------------------------------------------------
-- ?? add all missing capitals, add unorthodox?
-- replace oxia (acute) with perispomeni (circumflex)
export.oxia_to_perispomeni = {
= 'ᾶ',
= 'ᾷ',
= 'ἆ',
= 'ᾆ',
= 'ἇ',
= 'ᾇ',
= 'ῆ',
= 'ῇ',
= 'ἦ',
= 'ᾖ',
= 'ἧ',
= 'ᾗ',
= 'ῖ',
= 'ἶ',
= 'ἷ',
= 'ῗ',
= 'ῦ',
= 'ὖ',
= 'ὗ',
= 'ῶ',
= 'ῷ',
= 'ὦ',
= 'ᾦ',
= 'ὧ',
= 'ᾧ',
}
--------------------------------------------------------------------------
-- c) diphthongs and digraphs (2-vowel-sequences) --
--------------------------------------------------------------------------
--------------------------------------------------------------------------
-- these are digraphs = 2 vowels written together as one
export.digraphs = { 'αι', 'ει', 'οι', 'αυ', 'ευ', 'ηυ', 'ου' }
-- υι ?? is a diphthong, only in polytonic
-- modern synizeses: εια, ειο, υα
--------------------------------------------------------------------------
-- Move the accent backwards, from the second vowel of the digraph to the first.
--[=[
-- ?? Do i NEED? In polytonic we may have
αΐ to άι
OR αΐ to άϊ (with redundant, needless dialytics at second letter).
BOTH exist.
-- at the moment do as in monotonic
]=]--
export.digraphs_accent_back = {
= 'άι',
= 'έι',
= 'όι',
= 'άυ',
= 'έυ',
= 'ούι'
}
-- ?? oυϊ with accent only in polytonic?
--------------------------------------------------------------------------
-- Convert modern greek diphthongs (pronounced as one syllable) to two separate vowels:
export.digraphs_accented_to_unaccented = {
= 'αϊ',
= 'εϊ',
= 'οϊ',
= 'αϋ',
= 'εϋ',
= 'οϋ'
}
-- ήυ ??
-- = 'υϊ', not in nouns; it occurs only in the adjective δρύινος.
-- Conversely, it would put dialytics on βούισμα, βουΐσματα. Polytonic?
return export
--[=[
EXPLANATIONS
Conversions of Greek characters: unaccented <--> accented vowels or digraphs
i) for monotonic script: only one accent: tonos (oxia) ⟨ ΄ ⟩
ii) for polytonic script: The diacritics:
Accents:
tonos (oxia, acute) ⟨ ´ ⟩ is now accepted as identical to the modern accent TONOS and the Latin acute accent ⟨ ´ ⟩.
So, polytonic includes the functions of monotonic.
CAREFUL: here, ALL tonos = oxia must NEVER be a VERTICAL line
FORBID all font families that present tonos with a little vertical line (like Verdana)
perispomeni ( ῀ ) is similar but not identical to the Latin circumflex ( ˆ )
(The baria, grave accent ( ˋ ), is used only in texts, not in isolated words)
Breathings:
psile, smooth breathing ( ᾿ )
daseia, rough breathing ( ῾ )
diaeresis or dialytics: splits the vowels of a digraph
subscript iota
For more, see https://en.wiktionary.org/wiki/Module:grc-utilities
Prosody is used visibly only for Ancient Greek (and Hellenistic Koine)
* μακρόν (macron) or βραχύ (breve)
Ref
* https://en.wiktionary.org/wiki/Module:grc-utilities/data
* https://www.fileformat.info/info/unicode/block/greek_extended/list.htm
* https://en.wikipedia.org/wiki/Greek_script_in_Unicode
* https://en.wikipedia.org/wiki/Greek_alphabet#Greek_in_Unicode
]=]--