@andjc
Last active March 23, 2025 01:11
Greek case mapping in Python

Greek case mapping is asymmetric:

image
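
One instance of this asymmetry can be checked with built-in `str` methods alone. The sketch below uses sigma, an example chosen here for illustration rather than taken from the original: two lowercase forms share a single uppercase form, so the reverse mapping has to depend on context.

```python
# Sigma has two lowercase forms but only one uppercase form.
medial = '\u03c3'   # σ GREEK SMALL LETTER SIGMA
final = '\u03c2'    # ς GREEK SMALL LETTER FINAL SIGMA
capital = '\u03a3'  # Σ GREEK CAPITAL LETTER SIGMA

# Uppercasing is many-to-one: both lowercase forms give Σ.
upper_of_medial = medial.upper()
upper_of_final = final.upper()

# Lowercasing an isolated Σ gives the medial form ...
lower_of_capital = capital.lower()

# ... but at the end of a word str.lower() chooses the final form.
lowered_word = 'ΟΔΟΣ'.lower()

print(upper_of_medial, upper_of_final, lower_of_capital, lowered_word)
```

Because uppercasing discards the distinction between σ and ς, lowercasing cannot simply invert it and must look at the surrounding letters.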

Python's built-in case mapping is a full case mapping that is insensitive to language and locale. Greek requires language-sensitive mappings, which are defined in the CLDR transformations. These build on the data in UnicodeData.txt and SpecialCasing.txt.
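
A minimal sketch of what "full but locale-insensitive" means in practice, using only built-in `str` methods. The sample characters are chosen here for illustration; only `Νεράιδα` comes from this note.

```python
# Full case mapping: one character may expand to several,
# following the unconditional entries in SpecialCasing.txt.
sharp_s_upper = 'ß'.upper()                 # two characters: 'SS'

# U+0390 (iota with dialytika and tonos) has no single uppercase
# code point, so it expands to iota + two combining marks.
iota_upper = '\u0390'.upper()
iota_code_points = [f'U+{ord(c):04X}' for c in iota_upper]

# Locale-insensitive: there is no way to ask str.upper() for Greek
# rules, so the tonos on the alpha is carried over to the capital.
default_upper = 'Νεράιδα'.upper()

print(sharp_s_upper, iota_code_points, default_upper)
```

`str.upper()` takes no locale argument, so the Greek-specific handling of accents has to come from a library such as ICU.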

The PyICU module supports Greek case mapping:

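import icu

# Greek locale, used to select language-sensitive case mapping
locale = icu.Locale('el')

test = 'Νεράιδα'
# Built-in str.upper() keeps the tonos on the alpha
print(test.upper())
# Output: ΝΕΡΆΙΔΑ
# ICU drops the tonos and adds a dialytika to the following iota
print(icu.UnicodeString(test).toUpper(locale))
# Output: ΝΕΡΑΪΔΑ
print(icu.CaseMap.toUpper(locale, test))
# Output: ΝΕΡΑΪΔΑ

test2 = 'ήσουν ή εγώ ή εσύ'
print(test2.upper())
# Output: ΉΣΟΥΝ Ή ΕΓΏ Ή ΕΣΎ
# ICU removes the tonos except on the standalone disjunctive ή ("or")
print(icu.UnicodeString(test2).toUpper(locale))
# Output: ΗΣΟΥΝ Ή ΕΓΩ Ή ΕΣΥ
print(icu.CaseMap.toUpper(locale, test2))
# Output: ΗΣΟΥΝ Ή ΕΓΩ Ή ΕΣΥ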
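
To see exactly where the two results for `Νεράιδα` differ, the output strings listed above can be compared character by character. This is a small sketch using only the standard library; `diff_chars` is a hypothetical helper introduced here, and the ICU result is pasted in as a literal so the snippet runs without PyICU.

```python
import unicodedata

# Outputs copied from the examples above, normalized to NFC so each
# accented letter is a single code point.
default_upper = unicodedata.normalize('NFC', 'ΝΕΡΆΙΔΑ')  # str.upper()
greek_upper = unicodedata.normalize('NFC', 'ΝΕΡΑΪΔΑ')    # ICU, locale 'el'

def diff_chars(a, b):
    """Return (index, name_in_a, name_in_b) wherever a and b differ."""
    return [
        (i, unicodedata.name(x), unicodedata.name(y))
        for i, (x, y) in enumerate(zip(a, b))
        if x != y
    ]

differences = diff_chars(default_upper, greek_upper)
for index, name_a, name_b in differences:
    print(index, name_a, '->', name_b)
```

The two strings differ at two positions: the alpha loses its tonos, and the iota that follows gains a dialytika.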