Recent additions and changes to encoding guidelines and policies

Bibliographic references

The TEI does not permit bibliographic references under <sourceDesc> to be expressed as a <ref> child of <sourceDesc>, and the same is true of bibliographic references inside the Repertorium <scribe> element. Instead, the <ref> must be wrapped in a <bibl> element. For example, instead of:

<ref type="bibl" target="bib:Милтенова1986c">Милтенова 1986: 114-125</ref>

the markup must read:

<bibl><ref target="bib:Милтенова1986c">Милтенова 1986: 114-125</ref></bibl>

Note that when the <ref> is a child of <bibl>, we remove the type="bibl" attribute specification.

Certainty

If we are uncertain about a proposed scientific title for an <msItemStruct>, we include an empty <certainty> element after the title, just before the </title> end tag, with a @locus equal to value (that is, we are uncertain about the content of the element) and a @degree attribute of low (that is, our degree of certainty is low). No other values may be used for these two attributes, and the <certainty> element must be an empty element that follows the title (i.e., it must not contain the title). For example:
```
<title>Разказ за пророк Самуил<certainty locus="value" degree="low"/></title>
```
More generally (that is, not only with respect to scientific titles), question marks are never to be used to represent low certainty. The only correct way to represent low certainty is by including an empty <certainty locus="value" degree="low"/> element inside the uncertain element, e.g.:
```
<name>Nikifor from Rila Monastery<certainty locus="value" degree="low"></certainty></name>
```

Damaged or otherwise unclear text

Damaged or otherwise unclear text in incipita, etc. in the early Repertorium files was sometimes represented incorrectly with pseudo-markup, by surrounding the text in square brackets, e.g., п[салти]рь (where the text can be read, although with difficulty) or п[.....]рь (where the encoder can discern or guess the number of letters, but cannot identify what they are intended to represent). The only correct way of encoding unclear or damaged text is with the <gap>, <unclear>, and <supplied> elements, as described below. As with parentheses and slashes, square brackets are never to be used in incipita and other transcribed text, whether to represent unclear or damaged text or for other purposes. The only raw text that may appear in these transcription elements is text that occurs literally in the manuscript, and all editorial annotation must be represented with markup.

Text that is physically missing or entirely illegible must be encoded as an empty <gap> element, e.g.:

<explicit defective="false">и сь риданїемь вѣлико<seg rend="sup">м</seg> ѕело 
г҃лаше <gap/> ѣда умру видехь бо Іѡсифа че<seg rend="sup">д</seg>ѡ мое сладкое 
за Х҃а б҃а прославлень сь о҃це<seg rend="sup">м</seg> и с҃нѡ<seg rend="sup">м</seg> 
и стымъ д҃хѡ<seg rend="sup">м</seg> н҃нѣ и пр<seg rend="sup">с</seg>но и 
вь вѣки вѣкѡ<seg rend="sup">м</seg> аминъ.</explicit>

If the extent of the gap can be identified, it may be represented with the @quantity and @unit attributes, e.g.:

<gap quantity="6" unit="character"/>

The <gap> element is also to be used where text has been omitted deliberately during transcription, and in that case the <gap> element must be accompanied by a @resp attribute that identifies the editor responsible for the deletion (<gap> elements without a @resp attribute are assumed to represent physical lacunae in the manuscript). The @resp attribute must be a pointer to an @xml:id attribute in our participant list, e.g.:

<explicit>и рѣче ѿ искони прѣбивае<seg rend="sup">т</seg> вь веки аминь. ꙗко 
 том<seg rend="sup">у</seg> по<seg rend="sup">д</seg>бае<seg rend="sup">т</seg>
<gap resp="#AA"/> амн҃</explicit>

Text that is partially legible must be represented by an <unclear> element, e.g.:

<incipit>Ег<seg rend="sup">д</seg>а посла г҃ь ах҃рла Михаила кь Авраамоу г҃лю. прииди вь домь
Авраамобь. сь радостию и <sic>лубовною</sic> приими д҃хь вьзлюбленнаго госта моего Авраама.
арх҃гль сьниде вь по<seg rend="sup">д</seg>горие. и сѣ<seg rend="sup">д</seg> ꙗко еднь
поутни<seg rend="sup">к</seg> иде<seg rend="sup">ж</seg> бѣше Авраамь оу
<unclear>в</unclear>ра<seg rend="sup">т</seg>и свои<seg rend="sup">х</seg>.</incipit>

Where the editor supplies text to restore a missing or completely illegible reading, the restored text must be tagged as <supplied>, e.g.:

<head>О планитох</head>
<incipit>ззьвѣзди ѹбо</incipit>
<explicit>кь <supplied>з</supplied>апад</explicit>
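
As a sketch of how these rules might apply to the two bracketed forms cited at the start of this section: the reading in п[салти]рь that is legible with difficulty becomes an <unclear> element, and the five unreadable letters in п[.....]рь become a <gap> with its extent. The choice of <unclear> for the first form and the count of five characters are our reading of the bracketed examples, not of a particular manuscript.

```xml
<!-- п[салти]рь: text that can be read, although with difficulty -->
п<unclear>салти</unclear>рь
<!-- п[.....]рь: five letters that cannot be identified -->
п<gap quantity="5" unit="character"/>рь
```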

Dates

Dates in the attributes @when, @notBefore, and @notAfter must be in ISO format. This means that dates that consist of just a year must be expressed with exactly four digits (using leading zeros for years before 1000). Year plus month must be formatted as YYYY-MM. Month plus day (for example, in the church calendar) requires two leading hyphens, i.e., --MM-DD. For more information see https://en.wikipedia.org/wiki/ISO_8601.
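
The following fragment is a minimal sketch of the formats just described. All dates and element contents are hypothetical and chosen only to illustrate the attribute values; the use of a bare <date> element outside any particular parent is likewise an assumption made for brevity.

```xml
<!-- Year before 1000: four digits with a leading zero -->
<date when="0987">987</date>
<!-- Year plus month: YYYY-MM -->
<date when="1348-03">March 1348</date>
<!-- Range of years expressed with notBefore and notAfter -->
<date notBefore="1350" notAfter="1375">1350/75</date>
<!-- Month plus day in the church calendar: two leading hyphens -->
<date type="churchCal" when="--12-06">6. December</date>
```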

References to dates in the church calendar should be encoded as <date type="churchCal">, and the <date> element should be a child of <msItemStruct> (that is, not wrapped in a <note> element). The following example is from MP408G.xml:

<msItemStruct xml:id="ACD3" type="translation">
    <locus n="3">39r-58r</locus>
    <title xml:lang="bg">Житие на св. Текла</title>
    <date type="churchCal" when="--09-24">24. September</date>
    <filiation type="protograph">Bulgarian</filiation>
    <filiation type="antigraph">Middle Bulgarian</filiation>
    <re:sampleText xml:lang="cu">
        <explicit>И погребоше тѣло ѥе вь, бь нѥмже паметь творимь м҃сца Септембрѣ кд҃ д҃нь</explicit>
    </re:sampleText>
</msItemStruct>

Folio count

The folio count must be given inside supportDesc/extent/measure, and the <measure> element itself must contain nothing but digits (Arabic and Roman) and plus signs, e.g., 123, iii+25+ii. No plain text is ever to go inside the <measure> element, and the number of folios or bifolios must be wrapped in <measure> and must not appear as plain text inside the <extent> element. The <measure> element must also contain a @unit attribute, the value of which typically is folia (at present bifolio also occurs once in the corpus). For example:

<measure unit="folia">II+243+III</measure>
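
Shown in context, and assuming no other children of <extent> are needed, the path supportDesc/extent/measure might look like the following sketch. Note that no plain text appears inside <extent> itself.

```xml
<supportDesc>
  <extent>
    <measure unit="folia">II+243+III</measure>
  </extent>
</supportDesc>
```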

Lost or destroyed manuscripts

Manuscripts that have been lost or destroyed must normally contain the same full <msIdentifier> elements as extant manuscripts, including <country>, <settlement>, <repository>, and <idno>. The information that a manuscript has been lost may be encoded outside the <msIdentifier> element in two places that are part of the standard TEI manuscript description module, as follows:

msDesc/additional/adminInfo/availability/p This element is required for lost or destroyed manuscripts. When reports are generated that include information drawn from the <msIdentifier> section of the description, the contents of this <availability> element will be rendered in parentheses next to that information. The encoding must therefore contain the text exactly as it should appear in the final-form output. For lost manuscripts, we recommend the string lost (lower-case, no punctuation, no spaces), but encoders may use any string they consider appropriate. For example, in the case of the Apocryphal Miscellany NB Belgrade, 305, which was destroyed in the 1941 German bombing of Belgrade, this entry would read:
```
<availability>
 <p>lost</p>
</availability>
```
When generating a list of manuscripts in the corpus, this entry will be rendered as Serbia, Belgrade, National Library, 305 (lost).
msDesc/history/summary This element is optional, and it contains a longer prose description of the circumstances surrounding the loss or destruction of the manuscript. The contents of this element will be rendered as a paragraph when a full manuscript description is generated for output, and encoders must therefore enter the text as they would like it to appear. In the case of the Apocryphal Miscellany NB Belgrade, 305, mentioned above, this entry would read:
```
<history>
 <summary>Lost in the destruction of the National Library of Serbia during the German bombing in April 1941.</summary>
</history>
```
When generating a reading view of this manuscript description, the contents of the history element will be rendered as Lost in the destruction of the National Library of Serbia during the German bombing in April 1941.
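
Putting the pieces together, a description of a lost manuscript might be sketched as follows. The <msIdentifier> values are inferred from the rendered string Serbia, Belgrade, National Library, 305 (lost) cited above and may not match the actual file exactly; the omitted sections and the absence of language attributes are simplifications.

```xml
<msDesc>
  <msIdentifier>
    <country>Serbia</country>
    <settlement>Belgrade</settlement>
    <repository>National Library</repository>
    <idno>305</idno>
  </msIdentifier>
  <!-- msContents, physDesc, and other sections omitted from this sketch -->
  <history>
    <summary>Lost in the destruction of the National Library of Serbia during the German bombing in April 1941.</summary>
  </history>
  <additional>
    <adminInfo>
      <availability>
        <p>lost</p>
      </availability>
    </adminInfo>
  </additional>
</msDesc>
```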

`<msName>` element

The <msName> element records a name by which a manuscript is known or the genre to which it belongs. It must include an @xml:lang attribute plus a @type attribute with one of the following three values:

<msName type="general"> identifies the genre of the manuscript in a general way, e.g., miscellany. This is the only <msName> that is required for all manuscript description files, and it must be given only in English. The Bulgarian and Russian names will be retrieved later from a list, which means that if the name isn’t in the list, it will need to be added (notify David). General names do not have subtypes.
<msName type="specific"> identifies the genre of the manuscript in a specific way, e.g., apocryphal miscellany. This name is optional, but if it is present, it must be given only in English, and is subject to the same look-up treatment as general names, described above. Specific names do not have subtypes.
<msName type="individual"> identifies the individual name(s) by which the manuscript is known, e.g., Loveč miscellany. This name is optional and repeatable, but if it is given, it must be given in all three languages. To encode a former individual manuscript name, add the @subtype attribute with the value old; old is the only legal value of @subtype here, and in the absence of a @subtype attribute the name is assumed to be current.

The only words that are to be capitalized in names are those that are always capitalized in their respective languages, e.g., proper nouns in all three languages and proper adjectives in English. Do not capitalize the first word of a name unless it must always be capitalized in the language of the element.

Example of the use of `<msName>` (from AA36NBB):

<msName type="general" xml:lang="en">miscellany</msName>
<msName type="specific" xml:lang="en">apocryphal miscellany</msName>      
<msName type="individual" xml:lang="bg">Призренски апокрифен сборник</msName>
<msName type="individual" xml:lang="en">Prizren apocryphal miscellany</msName>
<msName type="individual" xml:lang="ru">Призренский апокрифический сборник</msName>
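
A former individual name is not illustrated in the example above. A sketch of how one might be encoded, using placeholder text in place of real names, follows; the three strings are hypothetical and serve only to show the @subtype attribute and the requirement to supply all three languages.

```xml
<!-- Placeholder names; replace with the actual former name in each language -->
<msName type="individual" subtype="old" xml:lang="bg">хипотетично старо име</msName>
<msName type="individual" subtype="old" xml:lang="en">hypothetical former name</msName>
<msName type="individual" subtype="old" xml:lang="ru">гипотетическое старое название</msName>
```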

In rendered lists of manuscript names and in codicological descriptions on the site, a manuscript will be represented by its individual names, if any exist. If not, a specific name will be used. If that doesn’t exist, the general name will be used. At that time appropriate capitalization will be introduced automatically.

`<scribeDesc>` element

If there is more than one scribe, the <scribe> element must have an @n attribute that assigns a sequential number to the scribe, so that, for example, the first scribe would be <scribe n="1">. Scribes must be numbered with consecutive Arabic numerals (not letters), and if there is only one scribe, the @n attribute must be omitted.
When one and the same scribe wrote in several places in the manuscript, use a single <scribe> element that contains a <locusGrp> with several <locus> child elements.
If the name of the scribe is unknown, write <name>anonymous</name> (lower case). Do not write unknown or anything else other than anonymous as the value of the <name> element.
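
The three rules above can be sketched together as follows. The folio ranges are hypothetical, and the order of children inside <scribe>, the use of a bare <locus> for the second scribe, and the absence of a namespace prefix are assumptions rather than requirements stated here.

```xml
<scribeDesc>
  <!-- First of two scribes, who wrote in two separate places -->
  <scribe n="1">
    <name>anonymous</name>
    <locusGrp>
      <locus>1r-45v</locus>
      <locus>60r-72v</locus>
    </locusGrp>
  </scribe>
  <!-- Second scribe, who wrote in one place -->
  <scribe n="2">
    <name>anonymous</name>
    <locus>46r-59v</locus>
  </scribe>
</scribeDesc>
```

If the manuscript had only one scribe, the single <scribe> element would carry no @n attribute.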

`<scribeLang>` element

Prose observations about orthography must be enclosed in an <orthography> child of the <scribeLang> element, e.g.:
```
<scribeLang>
 <orthography>Here goes some information</orthography>
</scribeLang>
```
Nonsystematic prose observations must be enclosed in a <p> element, e.g.:
```
<scribeLang>
 <p>Here goes some information</p>
</scribeLang>
```
Systematic information must be represented by <orthNote> child elements of the <orthography> element. See MDManoil for an example.

`<scriptDesc>` element

The <scriptDesc> element has an obligatory @script attribute with two possible values, cyrs for Cyrillic and glag for Glagolitic.
An optional @type attribute can specify the subtype of the script, e.g., semiuncial.
If additional information is available about the writing, it must be specified in a <p> child element of <scriptDesc>, e.g.:
```
<scriptDesc script="cyrs" type="semiuncial">
 <p>Semiuncial with cursive elements.</p>
</scriptDesc>
```
Do not use a <p> child element for simple descriptions that do not contain additional information. For example, do not write:
```
<scriptDesc script="cyrs" type="semiuncial">
 <p>Semiuncial</p>
</scriptDesc>
```
The <respStmt> element and its children are to be used inside the <scribeDesc> element only in situations where the responsibility belongs to the encoder (compiler) of the electronic description. For example:
```
<respStmt>
 <name ref="#AM">A. Miltenova</name>
 <resp>de visu</resp>
</respStmt>
```
(Note that the name is not inverted.) In situations where the identification or description comes from a publication, though, instead of <respStmt> we must use a note and bibliographic pointer, e.g., instead of:
```
<respStmt>
 <name full="yes">Райков, Б., Хр. Кодов, Б. Христова</name>
 <resp>Славянски ръкописи в Рилския манастир</resp>
</respStmt>
```
we should write:
```
<note>
 <ref type="bibl" target="bib:Райков1986">Райков, Кодов, Христова 1986: 69–71, № 32</ref>
</note>
```
Usage examples within <scriptDesc> must be tagged as <foreign xml:lang="cu">. The elements <w>, <c>, <seg> are not to be used here.
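
A sketch of a usage example tagged in this way follows. The prose and the letters cited are hypothetical, and we assume that the description sits in a <p> child of <scriptDesc>.

```xml
<scriptDesc script="cyrs" type="semiuncial">
  <!-- Hypothetical description; each cited letter is a usage example in Church Slavonic -->
  <p>Semiuncial with distinctive forms of <foreign xml:lang="cu">ѫ</foreign>
    and <foreign xml:lang="cu">ѧ</foreign>.</p>
</scriptDesc>
```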

Superscription

Superscription in the early Repertorium files was encoded in at least four different ways, two using markup (the <c> and <seg> elements) and two using pseudo-markup (surrounding the superscript character with slashes or parentheses, e.g., е/с/ or е(с) for е^с). The only correct way of encoding superscription is with <seg rend="sup">, e.g., as е<seg rend="sup">с</seg>. That is, use <seg>, and not <c>, to mark up a superscript character. Parentheses and slashes are never to be used in incipita and other transcribed text, whether to represent superscription or for other purposes. The only raw text that can appear in these transcription elements is text that occurs literally in the manuscript, and all editorial annotation must be represented with markup.

Watermarks

Information about watermarks must be written inside the element <watermark> with the following structure:

If there are no watermarks in the manuscript, the contents of the <watermark> element must be the bare text None.
If there are watermarks in the manuscript, there must be exactly one <motif> element as the first child of <watermark> (except when it is preceded by <locus>; see below). We do not distinguish among basic, supplemental, and additional parts of the <motif>. The <motif> element is the only required child of <watermark>.
If there is a <countermark> element, it must be the first following sibling of <motif>. A <countermark> element must have a single <motif> child element; <countermark> is not allowed to contain plain text.
The third part of the watermark information (but see the discussion of similar to below) is a pointer to a watermark album. The reference itself is encoded as a <ref> element with a @target attribute, the value of which is a bibliographic pointer in the form FamilynameYear. The <ref> element has two obligatory children: <num> (the number of a tracing in the album) and <date>, containing the year or range of years recorded in the album for that tracing. In the case of multiple examples of the same motif from the same album, there may be more than one <num> and <date> pair, but each <num> must have its own <date>, even when the date is the same as the date of the preceding item.

Names of motifs are to be given in English according to the usage preferred in the Memory of paper project.

For example:

<watermark>
  <motif>Anchor in circle with star</motif>
  <countermark>
    <motif>AB</motif>
  </countermark>
  <ref target="bib:Moshin1973">
    <num>1393</num>
    <date>1560/75</date>
  </ref>
</watermark>

This will be rendered on line as:

Anchor in circle with star and AB countermark. Moshin 1973: 1393 (1560/75).
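
Two cases described in the rules above are not covered by this example: a manuscript without watermarks, and multiple tracings of the same motif from the same album. They might be sketched as follows. The second tracing number (1394) and its date are hypothetical and are included only to show that each <num> carries its own <date>, even when the date repeats.

```xml
<!-- No watermarks in the manuscript -->
<watermark>None</watermark>

<!-- Two tracings of the same motif from the same album -->
<watermark>
  <motif>Anchor in circle with star</motif>
  <ref target="bib:Moshin1973">
    <num>1393</num>
    <date>1560/75</date>
    <num>1394</num>
    <date>1560/75</date>
  </ref>
</watermark>
```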

If there is more than one watermark in the manuscript and the description distinguishes them by location, the element <locus> is used as the first child of <watermark>, before <motif>. For example:

<watermark>
  <locus>ff. 132-140</locus>
  <motif>Two circles with a cross</motif>
  <ref target="bib:Moshin1973">
    <num>2025</num>
    <date>1336</date>
  </ref>
</watermark>

If the tracing is similar to, but not the same as, the watermark being described, a <term> element with the textual content similar to must appear before the <ref> element. For example:

<watermark>
  <locus>ff. 50, 256</locus>
  <motif>axe</motif>
  <term>similar to</term> 
  <ref target="bib:Piekosinski1893">
    <num>413</num>
    <date>1395</date>
  </ref>
</watermark>

When should a holy person be labeled as a saint?

Holy persons are not labeled as saints when their names appear as authors of texts. For example, the authoritative title for Clement of Ohrid’s Sermon before receiving holy communion is Поучение преди причастие от Климент Охридски (not … от св. Климент …).

Holy persons are labeled as saints when they appear otherwise than as authors. For example, the authoritative title of the Akathist hymn in honor of St. John the Baptist is Акатист за св. Йоан Кръстител (not Акатист за Йоан …, without the св.). The only exception is that the word Богородица is never preceded by св..

The label for a saint is never repeated in texts dedicated to multiple saints. For example, Служба за св. Пров, Тарах и Андроник и Козма Маюмски uses св. just once, before the first saint mentioned, and does not repeat it.

Where possible, we follow the Oxford Dictionary of Byzantium for names of persons, so Gregory of Nazianzos rather than Gregory the Theologian. The Apostles are referred to as St. Paul the Apostle. Pseudo names put the pseudo last in parentheses, so Basil the Great (Pseudo).
