This report identifies Unicode superscript Cyrillic characters in manuscript description files. As of Unicode 9.0,
these characters are in Cyrillic Extended-A (U+2DE0–U+2DFF) and in the range U+A674–U+A67B of Cyrillic Extended-B.
The report is sorted by filename. Within each file, it lists only the items that include superscript
characters and, for each such item, only the textual snippets that contain them. The superscript
characters are in red and their Unicode codepoint values are reported in parentheses after the sample text,
along with a count (in square brackets) of how many times each superscript character occurs in that sample.
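The per-snippet tally described above could be sketched roughly as follows. This is an illustration only, not the code that produced this report: the function names are hypothetical, and the two code point ranges are simply the ones named in the text.

```python
from collections import Counter

# Inclusive code point ranges taken from the report's description.
SUPERSCRIPT_RANGES = [(0x2DE0, 0x2DFF), (0xA674, 0xA67B)]


def is_superscript(ch):
    """True if ch falls in one of the superscript Cyrillic ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in SUPERSCRIPT_RANGES)


def count_superscripts(snippet):
    """Map 'U+XXXX' to the number of occurrences in the snippet."""
    counts = Counter(ch for ch in snippet if is_superscript(ch))
    return {f"U+{ord(ch):04X}": n for ch, n in sorted(counts.items())}


def format_report_line(snippet):
    """Snippet, then code points in parentheses with counts in brackets."""
    counts = count_superscripts(snippet)
    details = ", ".join(f"{cp} [{n}]" for cp, n in counts.items())
    return f"{snippet} ({details})"
```

A snippet with no characters in either range yields an empty tally, which is how such snippets would be left out of the listing.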
Because Unicode does not provide combining superscript versions of all Cyrillic letters, even were we to use
the ones that are available, we would have to fall back on an alternative for others, which would introduce
inconsistencies into the representation of superscription in the corpus. For that reason, our policy is to
represent all instances of superscription by wrapping markup around regular Cyrillic letters, so that,
for example, е<seg rend="sup">г</seg> would be rendered as е followed by a superscript г.
If a titlo, pokrytie, or an accentual or breathing diacritic appears over the superscript letter, it should be
included with the letter inside the same <seg> element.
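As a rough sketch of how a Unicode superscript letter might be converted to the markup this policy calls for, the Python below derives the regular letter from the character's Unicode name and keeps any combining marks that follow it inside the same <seg> element. It rests on several assumptions that the policy does not state: that diacritics follow the superscript letter in the text, that each superscript letter gets its own <seg>, and that the input is plain text with no characters needing XML escaping. The function names are hypothetical.

```python
import unicodedata

# Inclusive code point ranges named in the report.
SUPERSCRIPT_RANGES = [(0x2DE0, 0x2DFF), (0xA674, 0xA67B)]
PREFIX = "COMBINING CYRILLIC LETTER "


def is_superscript(ch):
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in SUPERSCRIPT_RANGES)


def base_letter(ch):
    """Regular small letter for a combining Cyrillic letter, else None."""
    name = unicodedata.name(ch, "")
    if not name.startswith(PREFIX):
        return None
    try:
        return unicodedata.lookup("CYRILLIC SMALL LETTER " + name[len(PREFIX):])
    except KeyError:
        # No regular counterpart under that name (e.g. ES-TE): leave as is.
        return None


def to_markup(text):
    """Replace superscript letters with <seg rend="sup"> around base letters."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        base = base_letter(ch) if is_superscript(ch) else None
        if base is None:
            out.append(ch)
            i += 1
            continue
        # Assumption: marks over the letter follow it; keep them in the <seg>.
        j = i + 1
        while (j < len(text) and unicodedata.combining(text[j])
               and not is_superscript(text[j])):
            j += 1
        out.append(f'<seg rend="sup">{base}{text[i + 1:j]}</seg>')
        i = j
    return "".join(out)
```

Characters whose names have no matching regular small letter are passed through unchanged, so they would still need to be handled by hand.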