Repertorium of Old Bulgarian Literature and Letters

Maintained by: David J. Birnbaum. Creative Commons BY-NC-SA 4.0 International License. Last modified: 2018-11-06T08:34:15+0000

Recent additions and changes to encoding guidelines and policies

Anonymous and unknown authors

In the case of anonymous authors, the value of the msItemStruct/author element must be anonymous (lower case, no punctuation). In the case of unknown authors of unknown antigraphs, we leave out the <author> element entirely; we don’t use the value unknown.


<availability> element

The <availability> element must have a @status attribute (typically with the value free), and it must not have a @default attribute.

Bibliographic references

The TEI does not permit bibliographic references under <sourceDesc> to be expressed as a <ref> child of <sourceDesc>, and the same is true of bibliographic references inside the Repertorium <scribe> element. Instead, the <ref> must be wrapped in a <bibl> element. For example, instead of:

<ref type="bibl" target="bib:Милтенова1986c">Милтенова 1986: 114-125</ref>

the markup must read:

<bibl><ref target="bib:Милтенова1986c">Милтенова 1986: 114-125</ref></bibl>

Note that when the <ref> is a child of <bibl>, we remove the type="bibl" attribute specification.

<binding> element

The @contemporary attribute on the <binding> element takes one of the following three values: true (binding is contemporary with the manuscript), false, and unknown (typically when the date of either the manuscript or the binding cannot be discerned).
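
As a minimal sketch of the three values, the fragments below show how the attribute might look in context. The <p> wrappers and the descriptive text are placeholders assumed for illustration and are not taken from any Repertorium file; only the @contemporary values come from the rule above.

```xml
<!-- Binding made at the same time as the manuscript -->
<binding contemporary="true">
    <p>Placeholder description of the binding.</p>
</binding>

<!-- Later rebinding -->
<binding contemporary="false">
    <p>Placeholder description of the binding.</p>
</binding>

<!-- Date of the manuscript or of the binding cannot be discerned -->
<binding contemporary="unknown">
    <p>Placeholder description of the binding.</p>
</binding>
```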


<colophon> element

The TEI <colophon> element is phrase-like. Ours is div-like, which means that it contains one or more instances of <p>, optionally preceded by a single instance of <head>.
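
The div-like structure can be sketched as follows. The heading and paragraph text are placeholders assumed for illustration, not content from a real manuscript description.

```xml
<colophon>
    <!-- Optional, and at most one -->
    <head>Placeholder heading</head>
    <!-- One or more paragraphs -->
    <p>Placeholder text of the first paragraph of the colophon.</p>
    <p>Placeholder text of a second paragraph, if there is one.</p>
</colophon>
```

Colophon text placed directly inside <colophon>, without a <p> wrapper, would not conform to this model.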

Damaged or otherwise unclear text

Damaged or otherwise unclear text in incipita, etc. in the early Repertorium files was sometimes represented incorrectly with pseudo-markup, by surrounding the text in square brackets, e.g., п[салти]рь (where the text can be read, although with difficulty) or п[.....]рь (where the encoder can discern or guess the number of letters, but cannot identify what they are intended to represent). The only correct way of encoding unclear or damaged text is with the <gap>, <unclear>, and <supplied> elements, as described below. As with parentheses and slashes, square brackets are never to be used in incipita and other transcribed text, whether to represent unclear or damaged text or for other purposes. The only raw text that may appear in these transcription elements is text that occurs literally in the manuscript, and all editorial annotation must be represented with markup.

Text that is physically missing or entirely illegible must be encoded as an empty <gap> element, e.g.:

<explicit>и сь риданїемь вѣлико<seg rend="sup">м</seg> ѕело 
г҃лаше <gap/> ѣда умру видехь бо Іѡсифа че<seg rend="sup">д</seg>ѡ мое сладкое 
за Х҃а б҃а прославлень сь о҃це<seg rend="sup">м</seg> и с҃нѡ<seg rend="sup">м</seg> 
и стымъ д҃хѡ<seg rend="sup">м</seg> н҃нѣ и пр<seg rend="sup">с</seg>но и 
вь вѣки вѣкѡ<seg rend="sup">м</seg> аминъ.</explicit>

If the extent of the gap can be identified, it may be represented with the @quantity and @unit attributes, e.g.:

<gap quantity="6" unit="character"/>

The <gap> element is also to be used where text has been omitted deliberately during transcription, and in that case the <gap> element must be accompanied by a @resp attribute that identifies the editor responsible for the deletion (<gap> elements without a @resp attribute are assumed to represent physical lacunae in the manuscript). The @resp attribute must be a pointer to an @xml:id attribute in our participant list, e.g.:

<explicit>и рѣче ѿ искони прѣбивае<seg rend="sup">т</seg> вь веки аминь. ꙗко 
 том<seg rend="sup">у</seg> по<seg rend="sup">д</seg>бае<seg rend="sup">т</seg>
<gap resp="#AA"/> амн҃</explicit>

Text that is partially legible must be represented by an <unclear> element, e.g.:

<incipit>Ег<seg rend="sup">д</seg>а посла г҃ь ах҃рла Михаила кь Авраамоу г҃лю. прииди вь домь
Авраамобь. сь радостию и <sic>лубовною</sic> приими д҃хь вьзлюбленнаго госта моего Авраама.
арх҃гль сьниде вь по<seg rend="sup">д</seg>горие. и сѣ<seg rend="sup">д</seg> ꙗко еднь
поутни<seg rend="sup">к</seg> иде<seg rend="sup">ж</seg> бѣше Авраамь оу
<unclear>в</unclear>ра<seg rend="sup">т</seg>и свои<seg rend="sup">х</seg>.</incipit>

Where the editor supplies text to restore a missing or completely illegible reading, the restored text must be tagged as <supplied>, e.g.:

<head>О планитох</head>
<incipit>ззьвѣзди ѹбо</incipit>
<explicit>кь <supplied>з</supplied>апад</explicit>


Dates

Dates in the attributes @when, @notBefore, and @notAfter must be in ISO format. This means that dates that consist of just a year must be expressed with exactly four digits (using leading zeros for years before 1000). Year plus month must be formatted as YYYY-MM. Month plus day (for example, in the church calendar) requires two leading hyphens, i.e., --MM-DD.
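
The formats above can be illustrated with the following sketch. The specific dates and the text content of each element are invented for illustration; only the attribute value patterns follow the rule.

```xml
<!-- Year only: exactly four digits, zero-padded before 1000 -->
<date when="0972">972</date>

<!-- Year plus month: YYYY-MM -->
<date when="1348-05">May 1348</date>

<!-- Range expressed with @notBefore and @notAfter -->
<date notBefore="1350" notAfter="1375">third quarter of the fourteenth century</date>

<!-- Month plus day, no year: two leading hyphens -->
<date when="--09-24">24 September</date>
```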

References to dates in the church calendar should be encoded as <date type="churchCal">, and the <date> element should be a child of <msItemStruct> (that is, not wrapped in a <note> element). The following example is from MP408G.xml:

<msItemStruct xml:id="ACD3" type="translation">
    <locus n="3">39r-58r</locus>
    <title xml:lang="bg">Житие на св. Текла</title>
    <date type="churchCal" when="--09-24">24. September</date>
    <filiation type="protograph">Bulgarian</filiation>
    <filiation type="antigraph">Middle Bulgarian</filiation>
    <re:sampleText xml:lang="cu">
        <explicit>И погребоше тѣло ѥе вь, бь нѥмже паметь творимь м҃сца Септембрѣ кд҃ д҃нь</explicit>
    </re:sampleText>
</msItemStruct>

@default attribute

The @default attribute is not to be used (e.g., on the <availability>, <bibl>, <langUsage>, <sourceDesc>, and <textClass> elements).

Defective information

Textual excerpts such as incipita and explicita are to carry the @defective attribute (e.g., <incipit defective="true">) only when the value is true. The @defective attribute is never to appear with the value false; in those situations it must be omitted entirely from the markup.

<dimensions> element

The folios to which a <dimensions> specification applies should be specified with the @extent attribute. Using the @scope attribute is an error.

Folio count

The folio count must be given inside supportDesc/extent/measure, and the <measure> element itself must contain nothing but digits (Arabic and Roman) and plus signs, e.g., 123, iii+25+ii. No plain text is ever to go inside the <measure> element, and the number of folios or bifolios must be wrapped in <measure> and must not appear as plain text inside the <extent> element. The <measure> element must also contain a @unit attribute, the value of which typically is folia (at present bifolio also occurs once in the corpus). For example:

<measure unit="folia">II+243+III</measure>

See also <locus>.

<keywords> element

The <keywords> element must include a @scheme attribute with the value Repertorium, e.g.:

<keywords scheme="Repertorium">

<langUsage> element

The <langUsage> element must not have a @default attribute.

<list> element

A simple list must be encoded just as <list>, with no @type attribute. That is, instead of:

<list type="simple">

the markup must read just:

<list>


<locus> element

Locations within a manuscript must be tagged with the <locus> element without attributes. Because TEI P5 does not support a @unit attribute on the <locus> element, the unit (f., ff., p., or pp., followed by a period and a space) must be included inside the element content, e.g.,

<locus>ff. 1r–24r</locus>

Note that in the case of folios, the side (lower-case r or v) or column (lower-case a, b, c, or d) must be given with the number. Where there is a range, the separator must be an en-dash (–), and not a hyphen (-).
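
A few further forms, sketched under the rules above. The folio and page numbers are invented, and the way a column letter is attached directly to the folio number is an assumption based on the wording of the rule rather than on an attested Repertorium file.

```xml
<!-- Single folio with side -->
<locus>f. 3v</locus>

<!-- Folio range with columns, separated by an en-dash -->
<locus>ff. 12a–14d</locus>

<!-- Single page and page range -->
<locus>p. 7</locus>
<locus>pp. 12–15</locus>
```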

Lost or destroyed manuscripts

Manuscripts that have been lost or destroyed must normally contain the same full <msIdentifier> elements as extant manuscripts, including <country>, <settlement>, <repository>, and <idno>. The information that a manuscript has been lost may be encoded outside the <msIdentifier> element in two places that are part of the standard TEI manuscript description module, as follows:

<msName> element

The <msName> element records a name by which a manuscript is known or the genre to which it belongs. It must include an @xml:lang attribute plus a @type attribute with one of the following three values: general, specific, or individual.

The only words that are to be capitalized in names are those that are always capitalized in their respective languages, e.g., proper nouns in all three languages and proper adjectives in English. Do not capitalize the first word of a name unless it must always be capitalized in the language of the element.

Example of the use of <msName> (from AA36NBB):

<msName type="general" xml:lang="en">miscellany</msName>
<msName type="specific" xml:lang="en">apocryphal miscellany</msName>      
<msName type="individual" xml:lang="bg">Призренски апокрифен сборник</msName>
<msName type="individual" xml:lang="en">Prizren apocryphal miscellany</msName>
<msName type="individual" xml:lang="ru">Призренский апокрифический сборник</msName>

In rendered lists of manuscript names and in codicological descriptions on the site, a manuscript will be represented by its individual names, if any exist. If not, a specific name will be used. If that doesn’t exist, a general name will be used. At that time appropriate capitalization will be introduced automatically.

<name> element

The <name> element must never have a @full attribute. That is, instead of:

<respStmt><name full="yes">

the markup should read:

<respStmt><name>


<note> element

The <note> element must not have a @place attribute. That is, instead of:

<note place="inline">

the markup should read:

<note>


<orthography> element

In some earlier versions of the Repertorium files, the <orthography> element contained plain text or mixed content. This type of content requires a <p> wrapper. As an alternative to paragraphs or paragraph-like elements, the <orthography> element may instead contain an optional <summary> followed by one or more <orthNote> elements.
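
The two permitted content models can be sketched as follows. All text content is a placeholder, and the assumption that <summary> and <orthNote> hold plain text is made only for illustration.

```xml
<!-- Option 1: prose wrapped in one or more paragraphs -->
<orthography>
    <p>Placeholder prose description of the orthography.</p>
</orthography>

<!-- Option 2: optional summary followed by one or more notes -->
<orthography>
    <summary>Placeholder summary.</summary>
    <orthNote>Placeholder first observation.</orthNote>
    <orthNote>Placeholder second observation.</orthNote>
</orthography>
```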

<quire> element

The <quire> element has an optional @status attribute, the legal values of which are original, added, and missing.
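
A minimal sketch of the attribute in use. The content of each <quire> is left as a comment because its internal structure is not described here.

```xml
<quire status="original"><!-- quire description --></quire>
<quire status="added"><!-- quire description --></quire>
<quire status="missing"><!-- quire description --></quire>

<!-- @status is optional and may be omitted -->
<quire><!-- quire description --></quire>
```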

<revisionDesc> element

The <revisionDesc> element must not have a @status attribute. That is, instead of:

<revisionDesc status="draft">

the markup must read just:

<revisionDesc>



Romanization

Romanization (the transliteration of Cyrillic into Latin characters) follows the international scientific system.

<scribeDesc> element

<scribeLang> element

<scriptDesc> element

<sourceDesc> element

The <sourceDesc> element must not have a @default attribute.

@status attribute

The @status attribute is to be used only when the status is something other than draft. That is

<availability status="free">

is okay, but no element can ever have a @status attribute with the value draft.


Superscription

Superscription in the early Repertorium files was encoded in at least four different ways, two using markup (the <c> and <seg> elements) and two using pseudo-markup (surrounding the superscript character with slashes or parentheses, e.g., е/с/ or е(с) for е with a superscript с). The only correct way of encoding superscription is with <seg rend="sup">, e.g., as е<seg rend="sup">с</seg>. That is, use <seg>, and not <c>, to mark up a superscript character. Parentheses and slashes are never to be used in incipita and other transcribed text, whether to represent superscription or for other purposes. The only raw text that can appear in these transcription elements is text that occurs literally in the manuscript, and all editorial annotation must be represented with markup.

<supportDesc> element

The value of the @material attribute on the <supportDesc> element cannot contain white space. Manuscripts written on a combination of parchment and paper should specify this value as mixed (not as parchment and paper).
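
For a manuscript on both parchment and paper, the markup might look like the sketch below. The nested <extent> and <measure> are borrowed from the folio count rule purely to give the element some context; the folio count itself is illustrative.

```xml
<!-- Correct: a single token with no white space -->
<supportDesc material="mixed">
    <extent>
        <measure unit="folia">II+243+III</measure>
    </extent>
</supportDesc>

<!-- Incorrect: material="parchment and paper" contains white space -->
```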

<teiHeader> @type attribute

The <teiHeader> element must not have a @type attribute. (Earlier files had a @type attribute with values like text and sbornik, and those have now been removed.)

<textClass> element

The <textClass> element must not have a @default attribute.

Types of books

Use the terms in bold below to represent the types of books described beside them. We follow the usage of the Oxford dictionary of Byzantium where possible, which, among other things, means that we favor Greek spellings (e.g., Praxapostolos) over Latinized ones (Praxapostolus).


Watermarks

Information about watermarks must be written inside the <watermark> element with the following structure:

  1. If there are no watermarks in the manuscript, the contents of the <watermark> element must be the bare text None.
  2. If there are watermarks in the manuscript, there must be exactly one <motif> element as the first child of <watermark> (except when it is preceded by <locus>; see below). We do not distinguish among basic, supplemental, and additional parts of the <motif>. The <motif> element is the only required child of <watermark>.
  3. If there is a <countermark> element, it must be the first following sibling of <motif>. A <countermark> element must have a single <motif> child element; <countermark> is not allowed to contain plain text.
  4. The third part of the watermark information (but see the discussion of similar to below) is a pointer to a watermark album. The reference itself is encoded as a <ref> element with a @target attribute, the value of which is a bibliographic pointer in the form FamilynameYear. The <ref> element has two obligatory children: <num> (the number of a tracing in the album) and <date>, containing the year or range of years recorded in the album for that tracing. In the case of multiple examples of the same motif from the same album, there may be more than one <num> and <date> pair, but each <num> must have its own <date>, even when the date is the same as the date of the preceding item.

Names of motifs are to be given in English according to the usage preferred in the Memory of paper project.

For example:

  <motif>Anchor in circle with star</motif>
  <countermark><motif>AB</motif></countermark>
  <ref target="bib:Moshin1973"><num>1393</num><date>1560/75</date></ref>

This will be rendered online as:

Anchor in circle with star and AB countermark. Moshin 1973: 1393 (1560/75).

If there is more than one watermark in the manuscript and the description distinguishes them by location, the element <locus> is used as the first child of <watermark>, before <motif>. For example:

  <locus>ff. 132–140</locus>
  <motif>Two circles with a cross</motif>
  <ref target="bib:Moshin1973">

If the tracing is similar to, but not the same as, the watermark being described, a <term> element with the textual content similar to must appear before the <ref> element. For example:

  <locus>ff. 50, 256</locus>
  <term>similar to</term> 
  <ref target="bib:Piekosinski1893">

When should a holy person be labeled as a saint?

Holy persons are not labeled as saints when their names appear as authors of texts. For example, the authoritative title for Clement of Ohrid’s Sermon before receiving holy communion is Поучение преди причастие от Климент Охридски (not … от св. Климент …).

Holy persons are labeled as saints when they appear otherwise than as authors. For example, the authoritative title of the Akathist hymn in honor of St. John the Baptist is Акатист за св. Йоан Кръстител (not Акатист за Йоан …, without the св.). The only exception is that the word Богородица is never preceded by св.

The label for a saint is never repeated in texts dedicated to multiple saints. For example, Служба за св. Пров, Тарах и Андроник и Козма Маюмски uses св. just once, before the first saint mentioned, and does not repeat it.

Where possible, we follow the Oxford Dictionary of Byzantium for names of persons, so Gregory of Nazianzos rather than Gregory the Theologian. Apostles are referred to in the form St. Paul the Apostle. Pseudo names put the pseudo last in parentheses, so Basil the Great (Pseudo).

@writtenLines attribute

The @writtenLines attribute on the <layout> element takes a value of one integer or two white-space-separated integers. A single integer means that all pages have the same number of written lines; two integers define the low and high values of the range of line counts. Where there are two values, they must be separated by white space; separating them with a hyphen or en dash is invalid.
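
Both forms are sketched below. The line counts are invented, and the content of <layout> is left as a comment.

```xml
<!-- Every page has 24 written lines -->
<layout writtenLines="24"><!-- layout description --></layout>

<!-- Between 22 and 26 written lines per page -->
<layout writtenLines="22 26"><!-- layout description --></layout>

<!-- Incorrect: writtenLines="22-26" and writtenLines="22–26" -->
```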