EgyVoc : an Automatic Vocalizer for Ancient Egyptian

Version: 0.0.11

Release date: 08.01.2020

Authors

Marwan Kilani - Swiss National Science Foundation (Mobility Grant) - Freie Universität Berlin (2019-2020)

How to cite

Kilani Marwan, 2019, EgyVoc 0.0.11 : an Automatic Vocalizer for Ancient Egyptian, https://github.com/MKilani/EgyVoc

Introduction

EgyVoc uses Coptic forms and Group Writing spelling to automatically reconstructs the Early New Kingdom vocalization of Ancient Egyptian words.

EgyVoc can reconstruct the vocalization on the basis of various combinations of data, namely:

Sahidic form
Sahidic form and Egyptian consonantal root (for a more accurate reconstruction of the consonants)
Group Writing spellings (with attestations from at least one period)
Sahidic form and Group Writing Spellings (and Egyptian consonantal root)

As a general rule, the more data are provided, the more accurate the reconstruction will be. However, in some cases a complete reconstruction can be obtained just on the basis of the Sahidic form, or of just one Group Writing spellings, while in others it will not be possible to obtain a full, precise reconstruction even when all the data are available. Each word is different, and each case needs to be considered individually.

The format of the input and of the output is discussed here below.

The algorithm is able to recognize irregularities in the corresponds between the forms - in that case, an error message is output instead of the reconstruction.

The possibility to use also data from other Coptic dialects and from W-orthography spellings (see Kilani 2017) will also be included in future releases.

The vocalization is reconstructed according to mainstream models (see E.g. Loprieno 1995), and the Group Writing spellings are distributed into 4 periods and they are read and analysed according to Kilani 2019.

The underlying phonological rules are summarised in the following tables:

Long stressed vowels:

Short stressed vowels

Notes:

Unstressed vowels are reduced to /ə/ during Period 2, or even earlier.

Getting Started

Prerequisites

You need Python 3.

Info how to download it here:

https://realpython.com/installing-python/

Installing

The python package can be installed through pip:

pip3 install EgyVoc

Two commands need to be called. First, one needs to initialize the FAAL aligning algorithm (for FAAL, see Kilani 2020 and my GitHub repository at FAAL). Then EgyVoc can be called. Here a minimal working example:

from EgyVoc import EgyVoc
from EgyVoc import initializeFAAL

initializeFAAL()

results = EgyVoc(SahidicWord = "ϩⲓⲃⲱⲓ", verbose = True )

print("Results as a Python Dictionary:")
print (results)

Input

The algorithm can take a series of arguments as input, namely:

SahidicForm
EgyptianRoot
GW_Period_1
GW_Period_2
GW_Period_3
GW_Period_4
verbose

All the arguments are optional. The argument "verbose" is a Boolean that indicates if the algorithm should output data while performing the calculations or not. It can be omitted, in which case it is set by default to True.

Input - Sahidic Form

The Sahidic Form needs to be input as a Unicode string, e.g.:

SahidicForm = "ⲥⲱⲧⲙ"

Input - Egyptian Root

The Egyptian Root needs to be input as a Unicode string. The Unicode characters ꜣ, ꜥ, ṯ, ḏ, ḥ, ḫ, ẖ, š, q (not ḳ) must be used. 𓇋 is transliterated as j, 𓇋𓇋 as y. The character 𓏭 should not be transcribed, as it is usually a diacritic, not a consonant (see Kilani 2019). In the case of late Egyptian spellings with extra final w (i.e. w-orthography spellings), the w should be ignored and not transcribed, as it does not indicate a consonant (see Kilani 2019). Final consonants in weak verbs should be transcribed as j and w.

EgyptianRoot = "sḏm"

Input - Group Writing

Group Writing spellings should be organized chronologically according to their periods of attestations (see Kilani 2019), namely:

18th dyn. - Ramses II (included) = Period 1 After Ramses II - 20th dyn. (included) = Period 2 21st dyn. - 22nd dyn. (included) = Period 3 After 22nd dyn. = Period 4

They must be input as Unicode strings and transliterated according to the system suggested in Kilani 2019. In particular:

Cw groups should be transliterated as CU (where C = any consonant, also below)
Cꜣ groups should be transliterated as CA
the sign 𓏭 should be transliterated as ʸ
the group 𓂓𓏤 should be transliterated as kU
the group 𓂧𓏭 should be transliterated as dU
the group 𓅓𓂝 should be transliterated as mA
the group 𓏭𓂋𓏤 should be transliterated as ʸr
the group 𓈖𓏥 should be transliterated as n-

GW_Period_1 = "yA.mA"
GW_Period_2 = "yA.mA"
GW_Period_3 = "yU.mA"
GW_Period_4 = "yU.mA"

Output

The Output is a Python dictionary with the following structure. If any of the main entries ('ProtoCoptic', 'GroupWriting_Voc', 'Reconstructed_Voc_Matrix', 'Reconstructed_Vocalization') is not present, it is set to None .

{
	'ProtoCoptic': {
		'CopticForm': 'string', 
		'Phonemes': 'string', 
		'PhonemeClasses': 'string', 
		'Stress': 'string', 
		'VowelLength': 'string'
		'EgyptianRoot': 'string', 
		'PhonemesIPA': 'string'
	}, 
	'GroupWriting_Voc': {
		'Regular': boolean, 
		'Earlier_Cons': [list_of_strings], 
		'Reconstr_Vow': [list_of_strings], 
		'Aligned_Forms': {
			1: {
				'ID': int, 
				'Irregularities': 'string (False or True)', 
				'Form': [list_of_strings], 
				'Consonants': [list_of_strings], 
				'IPA_Cons': [list_of_strings], 
				'VocClass': [list_of_strings], 
				'VocClassEdit': [list_of_strings], 
				'VocRec': [list_of_strings], 
				'VocCat': ['#', ints , '$']
			}, 
			2: {
				'ID': int, 
				'Irregularities': 'string (False or True)', 
				'Form': [list_of_strings], 
				'Consonants': [list_of_strings], 
				'IPA_Cons': [list_of_strings], 
				'VocClass': [list_of_strings], 
				'VocClassEdit': [list_of_strings], 
				'VocRec': [list_of_strings], 
				'VocCat': ['#', ints , '$']
			}, 
			3: {
				'ID': int, 
				'Irregularities': 'string (False or True)', 
				'Form': [list_of_strings], 
				'Consonants': [list_of_strings], 
				'IPA_Cons': [list_of_strings], 
				'VocClass': [list_of_strings], 
				'VocClassEdit': [list_of_strings], 
				'VocRec': [list_of_strings], 
				'VocCat': ['#', ints , '$']
			}, 
			4: {
				'ID': int, 
				'Irregularities': 'string (False or True)', 
				'Form': [list_of_strings], 
				'Consonants': [list_of_strings], 
				'IPA_Cons': [list_of_strings], 
				'VocClass': [list_of_strings], 
				'VocClassEdit': [list_of_strings], 
				'VocRec': [list_of_strings], 
				'VocCat': ['#', ints , '$']
			}
		}, 
		'Reconstructed_Form': [list_of_strings], 
		'Phonemes': 'string', 
		'PhonemesIPA': 'string',
		'PhonemeClasses': 'string', 
		'Stress': 'string', 
		'VowelLength': 'string', 
		'Earliest_Form': 'string', 
		'PeriodEarliestForm': int
	}, 
	'Reconstructed_Voc_Matrix': {
		'Phonemes': 'string',
		'PhonemesIPA': 'string',
		'PhonemeClasses': 'string', 
		'Stress': 'string', 
		'VowelLength': 'string'
	}, 
	'Reconstructed_VocalizationIPA': 'string',
	'Reconstructed_Vocalization': 'string'
}

Group Writing Vocalization

The following conventions are employed in the fields of the output relating to Group Writing spellings.

ɥ = used as IPA transcription of 𓇋𓇋 = , to distinguish it from 𓇋 = j
ā2 = vowel /ā/ next to k, may be spelled with U already in Period 1 (see Kilani 2019)
a2 = vowel /a/ next to k, may be spelled with U already in Period 1 and 2 (see Kilani 2019)
a3 = vowel /a/ next to ʕ and etymological pharyngeals, it does not undergo the shift /a/ > /o/
U = back vowel, spelled with w (see Kilani 2019)
A = non-back vowel or no vowel, spelled with ꜣ or with nothing (see Kilani 2019)
Ɔ = A after /k/, possibly realized as a back vowel (see Kilani 2019)
[ = the vowel must be read after the associated consonant.
] = the vowel must be read before the associated consonant.
> ... < = the vowel must be read either before (marked with > ) or after (marked with < ) the associated consonants, but not in both places.
# = beginning of the word
$ = end of the word
| = phoneme separator
⤫ = no vowel after the associated consonant
0 = empty slot
VocCat: each number indicates a distinct vowel in a given position, the first slot being after the first consonant. The presence of the same number in more than one slot indicates that the corresponding vowel may be read and either of the slots (but not in both).

Running the tests

The file EgyVoc_tester.py provides an implementation of 4 tests generating vocalizations on the basis of different sets of data.

The file EgyVoc_tester.py can be downloaded from the GitHub repository https://github.com/MKilani/EgyVoc/blob/master/EgyVoc_tester.py

The file EgyVoc_tester.py can be run from command line with:

python3 EgyVoc_tester.py

The results should look like this:

FAAL jar is running...

- - - - - -
-- Test 1 --

* Transcription of the Coptic Form:

Coptic form: ϩⲓⲃⲱⲓ
Phonemes: h.j.b.ō.j
Nr of vowels: 1
Nr of consonants: 4
Phoneme classes: C.C.C.V.C
Stress: 0.0.0.S.0
Vowel Length: 0.0.0.L.0

----

* ProtoCoptic Vocalization:

CopticForm: ϩⲓⲃⲱⲓ
EgyptianRoot: None

Reconstruction - PhonemesIPA: h.ə.j.b.ā.j.ə
Reconstruction - Phonemes: h.ə.j.b.ā.j.ə
Reconstruction - Phoneme classes: C.V.C.C.V.C.V
Reconstruction - Stress: 0.U.0.0.S.0.U
Reconstruction - Vowel Length: 0.u.0.0.L.0.u

----

Results as a Python Dictionary:
{'ProtoCoptic': {'CopticForm': 'ϩⲓⲃⲱⲓ', 'Phonemes': 'h.ə.j.b.ā.j.ə', 'PhonemeClasses': 'C.V.C.C.V.C.V', 'Stress': '0.U.0.0.S.0.U', 'VowelLength': '0.u.0.0.L.0.u', 'EgyptianRoot': 'None', 'PhonemesIPA': 'h.ə.j.b.ā.j.ə'}, 'GroupWriting_Voc': None, 'Reconstructed_Voc_Matrix': None, 'Reconstructed_Vocalization': None}

- - - - - -
-- Test 2 --

* Transcription of the Coptic Form:

Coptic form: ⲥⲱⲧⲙ
Phonemes: s.ō.t.m
Nr of vowels: 1
Nr of consonants: 3
Phoneme classes: C.V.C.C
Stress: 0.S.0.0
Vowel Length: 0.L.0.0

----

* ProtoCoptic Vocalization:

CopticForm: ⲥⲱⲧⲙ
EgyptianRoot: sḏm

Reconstruction - PhonemesIPA: s.ā.ɟ.ə.m
Reconstruction - Phonemes: s.ā.ḏ.ə.m
Reconstruction - Phoneme classes: C.V.C.V.C
Reconstruction - Stress: 0.S.0.U.0
Reconstruction - Vowel Length: 0.L.0.u.0

----

Results as a Python Dictionary:
{'ProtoCoptic': {'CopticForm': 'ⲥⲱⲧⲙ', 'EgyptianRoot': 'sḏm', 'PhonemesIPA': 's.ā.ɟ.ə.m', 'PhonemeClasses': 'C.V.C.V.C', 'Stress': '0.S.0.U.0', 'VowelLength': '0.L.0.u.0', 'Phonemes': 's.ā.ḏ.ə.m'}, 'GroupWriting_Voc': None, 'Reconstructed_Voc_Matrix': None, 'Reconstructed_Vocalization': None}

- - - - - -
-- Test 3 --

* GroupWriting vocalization:

Regular: True
First attestation: yA.mA
First attestation: Period 1

Reconstruction - PhonemesIPA: ɥ.a.m
Reconstruction - Phonemes: y.a.m
Reconstruction - Phon. Classes: C.V.C
Reconstruction - Stress: 0.S.0
Reconstruction - Vowel Length: 0.S.0

----

Results as a Python Dictionary:
{'ProtoCoptic': None, 'GroupWriting_Voc': {'Regular': True, 'Earlier_Cons': ['#', 'ɥ', 'm', '$'], 'Reconstr_Vow': ['#', '|a', '|⤫'], 'Aligned_Forms': {1: {'ID': 1, 'Irregularities': 'False', 'Form': ['#', 'yA', 'mA', '$'], 'Consonants': ['#', 'y', 'm', '$'], 'IPA_Cons': ['#', 'ɥ', 'm', '$'], 'VocClass': ['#', 'A', 'A', '$'], 'VocClassEdit': ['#', '[A', 'A', '$'], 'VocRec': ['#', '|[|a|ā|i|ī|0', '|a|ā|i|ī|0', '$'], 'VocCat': ['#', 0, 1, '$']}, 2: {'ID': 2, 'Irregularities': 'False', 'Form': ['#', 'yA', 'mA', '$'], 'Consonants': ['#', 'y', 'm', '$'], 'IPA_Cons': ['#', 'ɥ', 'm', '$'], 'VocClass': ['#', 'A', 'A', '$'], 'VocClassEdit': ['#', '[A', 'A', '$'], 'VocRec': ['#', '|[|a|i|ī|0', '|a|i|ī|0', '$'], 'VocCat': ['#', 0, 1, '$']}, 3: {'ID': 3, 'Irregularities': 'False', 'Form': ['#', 'yU', 'mA', '$'], 'Consonants': ['#', 'y', 'm', '$'], 'IPA_Cons': ['#', 'ɥ', 'm', '$'], 'VocClass': ['#', 'U', 'A', '$'], 'VocClassEdit': ['#', '[U', 'A', '$'], 'VocRec': ['#', '|[|ū|ō|o', '|a|e|ī|0', '$'], 'VocCat': ['#', 0, 1, '$']}, 4: {'ID': 4, 'Irregularities': 'False', 'Form': ['#', 'yU', 'mA', '$'], 'Consonants': ['#', 'y', 'm', '$'], 'IPA_Cons': ['#', 'ɥ', 'm', '$'], 'VocClass': ['#', 'U', 'A', '$'], 'VocClassEdit': ['#', '[U', 'A', '$'], 'VocRec': ['#', '|[|ō|o', '|a|e|ī|ē|0', '$'], 'VocCat': ['#', 0, 1, '$']}}, 'Reconstructed_Form': ['#', 'ɥ', '|a', 'm', '|⤫', '$'], 'Phonemes': 'y.a.m', 'PhonemesIPA': 'ɥ.a.m', 'PhonemeClasses': 'C.V.C', 'Stress': '0.S.0', 'VowelLength': '0.S.0', 'Earliest_Form': 'yA.mA', 'PeriodEarliestForm': 1}, 'Reconstructed_Voc_Matrix': None, 'Reconstructed_Vocalization': None}

- - - - - -
-- Test 4 --

* Transcription of the Coptic Form:

Coptic form: ϫⲁϫ
Phonemes: c.a.c
Nr of vowels: 1
Nr of consonants: 2
Phoneme classes: C.V.C
Stress: 0.S.0
Vowel Length: 0.S.0

----

* ProtoCoptic Vocalization:

CopticForm: ϫⲁϫ
EgyptianRoot: None

Reconstruction - PhonemesIPA: c.i|u.c
Reconstruction - Phonemes: c.i|u.c
Reconstruction - Phoneme classes: C.V.C
Reconstruction - Stress: 0.S.0
Reconstruction - Vowel Length: 0.S.0

----

* GroupWriting vocalization:

Regular: True
First attestation: ṯU.ṯU
First attestation: Period 2

Reconstruction - PhonemesIPA: c.u.c
Reconstruction - Phonemes: ṯ.u.ṯ
Reconstruction - Phon. Classes: C.V.C
Reconstruction - Stress: 0.S.0
Reconstruction - Vowel Length: 0.S.0

----

* Joined reconstructed vocalization:

Reconstructed vocalizationIPA: cuc
Reconstructed vocalization: ṯuṯ

Reconstruction - PhonemesIPA: c.u.c
Reconstruction - Phonemes: ṯ.u.ṯ
Reconstruction - Phon. Classes: C.u.C
Reconstruction - Stress: 0.u.0
Reconstruction - Vowel Length: 0.u.0

----

Results as a Python Dictionary:
{'ProtoCoptic': {'CopticForm': 'ϫⲁϫ', 'Phonemes': 'c.i|u.c', 'PhonemeClasses': 'C.V.C', 'Stress': '0.S.0', 'VowelLength': '0.S.0', 'EgyptianRoot': 'None', 'PhonemesIPA': 'c.i|u.c'}, 'GroupWriting_Voc': {'Regular': True, 'Earlier_Cons': ['#', 'c', 'c', '$'], 'Reconstr_Vow': ['#', '|u', '|⤫'], 'Aligned_Forms': {1: {'ID': 1, 'Irregularities': 'True', 'Form': ['#', '∅', '∅', '$'], 'Consonants': ['#', '∅', '∅', '$'], 'IPA_Cons': ['#', '∅', '∅', '$'], 'VocClass': ['#', '∅', '∅', '$'], 'VocClassEdit': ['#', '∅', '∅', '$'], 'VocRec': ['#', '∅', '∅', '$'], 'VocCat': ['#', '∅', '∅', '$']}, 2: {'ID': 2, 'Irregularities': 'False', 'Form': ['#', 'ṯU', 'ṯU', '$'], 'Consonants': ['#', 'ṯ', 'ṯ', '$'], 'IPA_Cons': ['#', 'c', 'c', '$'], 'VocClass': ['#', 'U', 'U', '$'], 'VocClassEdit': ['#', 'U]', ']U', '$'], 'VocRec': ['#', '|[|u|ū|ō', '|]|u|ū|ō', '$'], 'VocCat': ['#', 0, 1, '$']}, 3: {'ID': 3, 'Irregularities': 'False', 'Form': ['#', 'ṯA', 'ṯA', '$'], 'Consonants': ['#', 'ṯ', 'ṯ', '$'], 'IPA_Cons': ['#', 'c', 'c', '$'], 'VocClass': ['#', 'A', 'A', '$'], 'VocClassEdit': ['#', '[A', 'A', '$'], 'VocRec': ['#', '|[|a|e|ī|0', '|a|e|ī|0', '$'], 'VocCat': ['#', 0, 1, '$']}, 4: {'ID': 4, 'Irregularities': 'True', 'Form': ['#', '∅', '∅', '$'], 'Consonants': ['#', '∅', '∅', '$'], 'IPA_Cons': ['#', '∅', '∅', '$'], 'VocClass': ['#', '∅', '∅', '$'], 'VocClassEdit': ['#', '∅', '∅', '$'], 'VocRec': ['#', '∅', '∅', '$'], 'VocCat': ['#', '∅', '∅', '$']}}, 'Reconstructed_Form': ['#', 'c', '|u', 'c', '|⤫', '$'], 'Phonemes': 'ṯ.u.ṯ', 'PhonemesIPA': 'c.u.c', 'PhonemeClasses': 'C.V.C', 'Stress': '0.S.0', 'VowelLength': '0.S.0', 'Earliest_Form': 'ṯU.ṯU', 'PeriodEarliestForm': 2}, 'Reconstructed_Voc_Matrix': {'Phonemes': 'ṯ.u.ṯ', 'PhonemesIPA': 'c.u.c', 'PhonemeClasses': 'C.u.C', 'Stress': '0.u.0', 'VowelLength': '0.u.0'}, 'Reconstructed_VocalizationIPA': 'cuc', 'Reconstructed_Vocalization': 'ṯuṯ'}

EgyVoc
Release 0.0.11

Release 0.0.11

0.0.11

0.0.10

0.0.9

0.0.8

0.0.7

0.0.6

0.0.5

0.0.4

0.0.3

Documentation

EgyVoc : an Automatic Vocalizer for Ancient Egyptian

Authors

How to cite

Introduction

Getting Started

Prerequisites

Installing

Input

Input - Sahidic Form

Input - Egyptian Root

Input - Group Writing

Output

Group Writing Vocalization

Running the tests

Stats

Development practices

Releases

Contributors

EgyVoc Release 0.0.11

Release 0.0.11 Toggle Dropdown 0.0.11 0.0.10 0.0.9 0.0.8 0.0.7 0.0.6 0.0.5 0.0.4 0.0.3

Documentation

EgyVoc : an Automatic Vocalizer for Ancient Egyptian

Authors

How to cite

Introduction

Getting Started

Prerequisites

Installing

Input

Input - Sahidic Form

Input - Egyptian Root

Input - Group Writing

Output

Group Writing Vocalization

Running the tests

Stats

Development practices

Releases

Contributors

EgyVoc
Release 0.0.11

Release 0.0.11

0.0.11

0.0.10

0.0.9

0.0.8

0.0.7

0.0.6

0.0.5

0.0.4

0.0.3