Recent additions and changes to encoding guidelines and policies

Anonymous and unknown authors

In the case of anonymous authors, the value of the msItemStruct/author element should be anonymous (lower case, no punctuation). In the case of unknown authors of unknown antigraphs, we leave out the author element entirely; we don’t use the value unknown.


<availability> element

The <availability> element should have a @status attribute (typically with the value free), but it should not have a @default attribute.

Bibliographic references

Bibliographic references under <sourceDesc> should include only a <ref type="bibl"> element that points to an item in the separate bibliography file. It should not contain a <bibl> element. For example, instead of:

<bibl default="false" status="draft"><ref type="bibl" target="bib:Милтенова1986c">Милтенова 1986: 114-125</ref></bibl>

the markup should read:

<ref type="bibl" target="bib:Милтенова1986c">Милтенова 1986: 114-125</ref>

Note that we also remove the @default and @status attributes.


Damaged or otherwise unclear text

Damaged or otherwise unclear text in incipita, etc. in the early Repertorium files was sometimes represented incorrectly with pseudo-markup, by surrounding the text in square brackets, e.g., п[салти]рь (where the text can be read, although with difficulty) or п[.....]рь (where the encoder can discern or guess the number of letters, but cannot identify what they are intended to represent). The only correct way of encoding unclear or damaged text is with the <gap>, <unclear>, and <supplied> elements, as described below. As with parentheses and slashes, square brackets should never be used in incipita and other transcribed text, whether to represent unclear or damaged text or for other purposes. The only raw text that should appear in these transcription elements is text that occurs literally in the manuscript, and all editorial annotation should be represented with markup.

Text that is physically missing or entirely illegible should be encoded as an empty <gap> element, e.g.:

<explicit>и сь риданїемь вѣлико<seg rend="sup">м</seg> ѕело 
г҃лаше <gap/> ѣда умру видехь бо Іѡсифа че<seg rend="sup">д</seg>ѡ мое сладкое 
за Х҃а б҃а прославлень сь о҃це<seg rend="sup">м</seg> и с҃нѡ<seg rend="sup">м</seg> 
и стымъ д҃хѡ<seg rend="sup">м</seg> н҃нѣ и пр<seg rend="sup">с</seg>но и 
вь вѣки вѣкѡ<seg rend="sup">м</seg> аминъ.</explicit>

If the extent of the gap can be identified, it may be represented with the @quantity and @unit attributes, e.g.:

<gap quantity="6" unit="character"/>

The <gap> element should also be used where text has been omitted deliberately during transcription, and in that case the <gap> element must be accompanied by a @resp attribute that identifies the editor responsible for the deletion (<gap> elements without a @resp attribute are assumed to represent physical lacunae in the manuscript). The @resp attribute must be a pointer to an @xml:id attribute in our participant list, e.g.:

<explicit>и рѣче ѿ искони прѣбивае<seg rend="sup">т</seg> вь веки аминь. ꙗко 
 том<seg rend="sup">у</seg> по<seg rend="sup">д</seg>бае<seg rend="sup">т</seg>
<gap resp="#AA"/> амн҃</explicit>

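Text that is partially legible should be represented by an <unclear> element, e.g., the reading written above in pseudo-markup as п[салти]рь would be encoded as:

п<unclear>салти</unclear>рь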

Where the editor supplies text to restore a missing or completely illegible reading, the restored text should be tagged as <supplied>, e.g.:

<head>О планитох</head>
<incipit>ззьвѣзди ѹбо</incipit>
<explicit>кь <supplied>з</supplied>апад</explicit>

@default attribute

The @default attribute should not be used (e.g., on the <availability>, <bibl>, <langUsage>, <sourceDesc>, and <textClass> elements).

Defective information

Textual excerpts such as incipita and explicita should carry the @defective attribute (e.g., <incipit defective="true">) only when the value is true. The @defective attribute should never appear with the value false; in those situations it should be omitted entirely from the markup.

Folio count

The folio count must be given inside supportDesc/extent/measure, and the <measure> element itself must contain nothing but digits (Arabic and Roman) and plus signs, e.g., 123, iii+25+ii. No plain text should go inside the <measure> element, and the number of folios or bifolios must be wrapped in <measure> and must not appear as plain text inside the <extent> element. The <measure> element must also contain a @unit attribute, the value of which typically is folia (at present bifolio also occurs once in the corpus). See also locus.
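The constraint on the content of <measure> can be checked mechanically. The following is a minimal sketch, not part of the project's tooling; the function name is hypothetical, and it assumes that Roman numerals are written in lower case, as in the example iii+25+ii. It checks only which characters occur and how they are grouped, not whether a Roman numeral is well formed.

```python
import re

# One part is either Arabic digits or lower-case Roman numeral letters.
# Assumption: Roman numerals are lower case, as in the example iii+25+ii.
_PART = r"(?:[0-9]+|[ivxlcdm]+)"

# Parts may be joined by plus signs; no spaces or other text are allowed.
MEASURE_RE = re.compile(rf"{_PART}(?:\+{_PART})*")


def is_valid_folio_count(text: str) -> bool:
    """Return True if text looks like 123 or iii+25+ii."""
    return MEASURE_RE.fullmatch(text) is not None


print(is_valid_folio_count("iii+25+ii"))  # acceptable content
print(is_valid_folio_count("123 ff."))    # plain text inside <measure>
```

A value such as 123 ff. is rejected because the unit belongs in the @unit attribute, not in the element content.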

<keywords> element

The <keywords> element should include a @scheme attribute with the value Repertorium, e.g.:

<keywords scheme="Repertorium">.

<langUsage> element

The <langUsage> element should not have a @default attribute.

<list> element

A simple list should be encoded just as <list>, with no @type attribute. That is, instead of:

<list type="simple">

the markup should read just:

<list>


<locus> element

Locations within a manuscript should be tagged with the <locus> element without attributes. Because TEI P5 does not support a @unit attribute on the <locus> element, the unit (f., ff., p., or pp., followed by a period and a space) must be included inside the element content, e.g.,

<locus>ff. 1r–24r</locus>

Note that in the case of folios, the side (lower-case r or v) or column (lower-case a, b, c, or d) must be given with the number.

Lost or destroyed manuscripts

Manuscripts that have been lost or destroyed should contain the same full <msIdentifier> elements as extant manuscripts, including <country>, <settlement>, <repository>, and <idno>. The information that a manuscript has been lost may be encoded outside the <msIdentifier> element in two places that are part of the standard TEI manuscript description module, as follows:

<msName> element

The <msName> element records a name by which a manuscript is known or the genre to which it belongs. It must include an @xml:lang attribute plus a @type attribute with one of the following three values: general, specific, or individual.

The only words that should be capitalized in names are those that are always capitalized in their respective languages, e.g., proper nouns in all three languages and proper adjectives in English. Do not capitalize the first word of a name unless it must always be capitalized in the language of the element.

Example of the use of <msName> (from AA36NBB):

<msName type="general" xml:lang="en">miscellany</msName>
<msName type="specific" xml:lang="en">apocryphal miscellany</msName>      
<msName type="individual" xml:lang="bg">Призренски апокрифен сборник</msName>
<msName type="individual" xml:lang="en">Prizren apocryphal miscellany</msName>
<msName type="individual" xml:lang="ru">Призренский апокрифический сборник</msName>

In rendered lists of manuscript names and in codicological descriptions on the site, the manuscript will be represented by its individual names, if any exist. If not, a specific name will be used. If that doesn’t exist, a generic name will be used. At that time appropriate capitalization will be introduced automatically.
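The fallback order described above can be sketched as follows. This is an illustration only, not the site's rendering code: the function name display_names is hypothetical, the selection by language is our assumption, and the automatic capitalization step is left out because its rules are not described here. Elements are matched by local name so that the sketch works with or without the TEI namespace.

```python
import xml.etree.ElementTree as ET

# ElementTree expands xml:lang to this qualified attribute name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Preferred @type values, from most to least specific.
PRIORITY = ("individual", "specific", "general")


def local(tag: str) -> str:
    """Strip a namespace such as {http://www.tei-c.org/ns/1.0} from a tag."""
    return tag.rsplit("}", 1)[-1]


def display_names(root: ET.Element, lang: str) -> list[str]:
    """Return the names of the most specific type available in lang."""
    names = [el for el in root.iter()
             if local(el.tag) == "msName" and el.get(XML_LANG) == lang]
    for wanted in PRIORITY:
        found = [(el.text or "").strip()
                 for el in names if el.get("type") == wanted]
        if found:
            return found
    return []


sample = ET.fromstring(
    '<msIdentifier>'
    '<msName type="general" xml:lang="en">miscellany</msName>'
    '<msName type="specific" xml:lang="en">apocryphal miscellany</msName>'
    '<msName type="individual" xml:lang="en">Prizren apocryphal miscellany</msName>'
    '</msIdentifier>'
)
print(display_names(sample, "en"))
```

For the AA36NBB names above, the English individual name is chosen; a manuscript with only a general name would fall through to that one.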

<name> element

The <name> element should never have a @full attribute. That is, instead of:

<respStmt><name full="yes">

the markup should read:

<respStmt><name>


<note> element

The <note> element should not have a @place attribute. That is, instead of:

<note place="inline">

the markup should read:

<note>


<revisionDesc> element

The <revisionDesc> element should not have a @status attribute. That is, instead of:

<revisionDesc status="draft">

the markup should read just:

<revisionDesc>



Romanization

Romanization (the transliteration of Cyrillic into Latin characters) follows the international scientific system, documented at (mirrored from

<scribeDesc> element

<scribeLang> element

<scriptDesc> element


<sourceDesc> element

The <sourceDesc> element should not have a @default attribute.

@status attribute

The @status attribute should be used only when the status is something other than draft. That is

<availability status="free">

is okay, but no element should ever have a @status attribute with the value draft.
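Several of the attribute rules in these guidelines (no @default anywhere, no @status with the value draft, no @defective with the value false) lend themselves to an automated check. The following is a minimal sketch under our own assumptions, not the project's validation pipeline; the function name and the wording of the messages are hypothetical.

```python
import xml.etree.ElementTree as ET


def local(tag: str) -> str:
    """Strip a namespace such as {http://www.tei-c.org/ns/1.0} from a tag."""
    return tag.rsplit("}", 1)[-1]


def find_forbidden_attributes(xml_text: str) -> list[str]:
    """List attribute uses that the guidelines rule out."""
    problems = []
    for el in ET.fromstring(xml_text).iter():
        name = local(el.tag)
        if "default" in el.attrib:
            problems.append(f"<{name}> has @default")
        if el.get("status") == "draft":
            problems.append(f'<{name}> has status="draft"')
        if el.get("defective") == "false":
            problems.append(f'<{name}> has defective="false"')
    return problems


snippet = (
    '<sourceDesc>'
    '<bibl default="false" status="draft">'
    '<ref type="bibl" target="bib:X">X</ref></bibl>'
    '<availability status="free"/>'
    '</sourceDesc>'
)
for problem in find_forbidden_attributes(snippet):
    print(problem)
```

The <availability status="free"> element passes, because only the value draft is ruled out for @status.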


Superscription

Superscription in the early Repertorium files was encoded in at least four different ways, two using markup (the <c> and <seg> elements) and two using pseudo-markup (surrounding the superscript character with slashes or parentheses, e.g., е/с/ or е(с) for ес). The only correct way of encoding superscription is with <seg rend="sup">, e.g., as е<seg rend="sup">с</seg>. That is, use <seg>, and not <c>, to mark up a superscript character. Parentheses and slashes should never be used in incipita and other transcribed text, whether to represent superscription or for other purposes. The only raw text that should appear in these transcription elements should be text that occurs literally in the manuscript, and all editorial annotation should be represented with markup.
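Leftover pseudo-markup can be found by searching transcribed text for slashes, parentheses, and square brackets. The sketch below is an illustration under stated assumptions, not an existing project tool: the function name is hypothetical, and we assume that <incipit> and <explicit> are the transcription elements to inspect. It only reports the characters; converting them to <seg>, <gap>, <unclear>, or <supplied> still requires an editor's judgment.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: these are the transcription elements worth checking.
TRANSCRIPTION_ELEMENTS = {"incipit", "explicit"}

# Characters that were used as pseudo-markup in early files.
PSEUDO_MARKUP = re.compile(r"[\[\]()/]")


def local(tag: str) -> str:
    """Strip a namespace such as {http://www.tei-c.org/ns/1.0} from a tag."""
    return tag.rsplit("}", 1)[-1]


def find_pseudo_markup(xml_text: str) -> list[tuple[str, list[str]]]:
    """Return (element name, offending characters) for each suspect element."""
    hits = []
    for el in ET.fromstring(xml_text).iter():
        if local(el.tag) in TRANSCRIPTION_ELEMENTS:
            text = "".join(el.itertext())  # text of the element and its children
            found = sorted(set(PSEUDO_MARKUP.findall(text)))
            if found:
                hits.append((local(el.tag), found))
    return hits


sample = (
    '<msItem>'
    '<incipit>е/с/</incipit>'
    '<explicit>е<seg rend="sup">с</seg></explicit>'
    '<incipit>п[салти]рь</incipit>'
    '</msItem>'
)
print(find_pseudo_markup(sample))
```

The correctly encoded <explicit> in the sample is not reported; the two <incipit> elements are.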

<teiHeader> @type attribute

The <teiHeader> should not have a @type attribute. (Earlier files had a @type attribute with values like text and sbornik, and those have now been removed.)

<textClass> element

The <textClass> element should not have a @default attribute.

Types of books

Use the terms in bold below to represent the types of books described beside them. We follow the usage of the Oxford Dictionary of Byzantium where possible, which, among other things, means that we favor Greek spellings (e.g., Praxapostolos) over Latinized ones (Praxapostolus).


Watermarks

Information about watermarks should be written inside the element <watermark> with the following structure:

  1. If there are no watermarks in the manuscript, the contents of the <watermark> element should be the bare text None.
  2. If there are watermarks in the manuscript, there must be exactly one <motif> element as the first child of <watermark> (except when it is preceded by <locus>; see below). We do not distinguish among basic, supplemental, and additional parts of the <motif>. The <motif> element is the only required child of <watermark>.
  3. If there is a <countermark> element, it must be the first following sibling of <motif>. A <countermark> element must have a single <motif> child element; <countermark> should not contain plain text.
  4. The third part of the watermark information (but see the discussion of similar to below) is a pointer to a watermark album. The reference itself is encoded as a <ref> element with a @target attribute, the value of which is a bibliographic pointer in the form FamilynameYear. The <ref> element has two obligatory children: <num> (the number of a tracing in the album) and <date>, containing the year or range of years recorded in the album for that tracing. In the case of multiple examples of the same motif from the same album, there may be more than one <num> and <date> pair, but each <num> must have its own <date>, even when the date is the same as the date of the preceding item.

Names of motifs should be given in English according to the usage preferred in the Memory of paper project.

For example:

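<watermark>
  <motif>Anchor in circle with star</motif>
  <countermark><motif>AB</motif></countermark>
  <ref target="bib:Moshin1973"><num>1393</num><date>1560/75</date></ref>
</watermark>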

This will be rendered on line as:

Anchor in circle with star and AB countermark. Moshin 1973: 1393 (1560/75).

If there is more than one watermark in the manuscript and the description distinguishes them by location, the element <locus> is used as the first child of <watermark>, before <motif>. For example:

  <locus>ff. 132-140</locus>
  <motif>Two circles with a cross</motif>
  <ref target="bib:Moshin1973">

If the tracing is similar to, but not the same as, the watermark being described, a <term> element with the textual content similar to should appear before the <ref> element. For example:

  <locus>ff. 50, 256</locus>
  <term>similar to</term> 
  <ref target="bib:Piekosinski1893">
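The structural rules for <watermark> can be expressed as a small check. The following is a hedged sketch, not an official validator: the function name and messages are hypothetical, and because the guidelines do not say whether more than one <ref> may appear, the sketch assumes that zero or more are allowed, each optionally preceded by <term>.

```python
import re
import xml.etree.ElementTree as ET

# Expected order of child elements, written over their space-joined names.
# Assumption: zero or more <ref> elements, each optionally preceded by <term>.
CHILD_ORDER = re.compile(r"(locus )?motif( countermark)?(( term)? ref)*")


def local(tag: str) -> str:
    """Strip a namespace such as {http://www.tei-c.org/ns/1.0} from a tag."""
    return tag.rsplit("}", 1)[-1]


def check_watermark(wm: ET.Element) -> list[str]:
    """Return a list of problems found in one <watermark> element."""
    children = list(wm)
    if not children:
        # Rule 1: no watermarks means the bare text None.
        if (wm.text or "").strip() == "None":
            return []
        return ["<watermark> without children must contain the text None"]
    problems = []
    names = [local(c.tag) for c in children]
    if not CHILD_ORDER.fullmatch(" ".join(names)):
        problems.append("unexpected children: " + " ".join(names))
    for child in children:
        name = local(child.tag)
        inner = [local(c.tag) for c in child]
        if name == "countermark":
            # Rule 3: a single <motif> child and no plain text.
            has_text = bool((child.text or "").strip()) or any(
                (c.tail or "").strip() for c in child)
            if inner != ["motif"] or has_text:
                problems.append("<countermark> must hold one <motif> only")
        elif name == "ref":
            # Rule 4: @target plus one or more <num>/<date> pairs.
            if "target" not in child.attrib:
                problems.append("<ref> needs a @target attribute")
            if not inner or inner != ["num", "date"] * (len(inner) // 2):
                problems.append("<ref> must contain <num>/<date> pairs")
        elif name == "term":
            if "".join(child.itertext()).strip() != "similar to":
                problems.append("<term> must contain the text similar to")
    return problems


print(check_watermark(ET.fromstring("<watermark>None</watermark>")))
```

Checking the order with a regular expression over the child names keeps the sequence rules in one place; a schema language would be the more robust choice for production use.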

When should a holy person be labeled as a saint?

Holy persons are not labeled as saints when their names appear as authors of texts. For example, the authoritative title for Clement of Ohrid’s Sermon before receiving holy communion is Поучение преди причастие от Климент Охридски (not … от св. Климент …).

Holy persons are labeled as saints when they appear otherwise than as authors. For example, the authoritative title of the Akathist hymn in honor of St. John the Baptist is Акатист за св. Йоан Кръстител (not Акатист за Йоан …, without the св.). The only exception is that the word Богородица is never preceded by св..

The label for a saint is never repeated in texts dedicated to multiple saints. For example, Служба за св. Пров, Тарах и Андроник и Козма Маюмски uses св. just once, before the first saint mentioned, and does not repeat it.

Where possible, we follow the Oxford Dictionary of Byzantium for names of persons, so Gregory of Nazianzos rather than Gregory the Theologian. The Apostles are referred to as St. Paul the Apostle. Pseudo names put the pseudo last in parentheses, so Basil the Great (Pseudo).