Intern How To

From EBA_Documentation
Revision as of 20:14, 27 February 2017 by Sarah Ketchley (Talk | contribs)


Welcome to the Emma B. Andrews Diary Project (EBA for short) internship program! We're proud to be a founder member of Newbook Digital Texts. While the goals of all the projects working under the Newbook umbrella may be similar, some of our working methodologies are different. Read on for the following documentation:

EBA Internship Job Descriptions

How to Transcribe

Encoding Documentation

These guidelines will help us with coding structural and contextual information in our TEI digital representations of texts. Each of our TEI documents will begin with a standardized TEI header, which we’ll make available and periodically update to customize for your document.  Our ultimate resource on coding is the Text Encoding Initiative.


I. Basic TEI Structure for All Files

II. Structural Markup

        A. The TEI Header: Consistent Elements Across All Files
        B. Poems
        C. Plays
        D. Prose Texts
        E. Letters (and working with manuscripts / images)
        F. Editorial Headnotes (with links out and bibliographic citations)

III. Contextual/Relational Markup

        A. Our Site Index (si.xml) and Our Schema (MRMValidate.sch)
        B. People, Places, Books, Events, Flora, Quantities (Real and Fictional)
        C. Editorial Notes: When and How to Code Them
                * How to Research Editorial Notes and Site Index Entries
                * Helpful Resources for Researching Notes and Entries
        D. Variant Texts: Critical Apparatus Markup
        E. Coding Quotes of Various Kinds
        F. Coming: Our Village: Special Markup

I. Basic TEI Structure for All Files

Basic TEI XML structure for our project looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="" type="application/xml" schematypens=""?>
<?xml-model href="" type="application/xml" schematypens=""?>

<TEI xmlns="">
   <teiHeader>
      Information that we customize about the origins of our files, our editors, etc. To be updated regularly as new interns join and contribute.
   </teiHeader>
   <text>
      <front>
         This includes a title and all prefacing material of our main text. It can include an epigraph, introduction, etc.
      </front>
      <body>
         This is where we place the code and transcription of our main text.
      </body>
      <back>
         This is where we develop lists containing detailed info on people, places, contexts, etc. mentioned in the main text.
      </back>
   </text>
</TEI>

II. Structural Markup:  Begin coding at the structural level, noting the organization of the text you’re working with. This is a matter of form, and the TEI provides standards for structured markup distinct to plays, poems, prose texts, and letters, among others. 
To start structural markup of literary texts, first look for a “clean” base text to work with in a good edition. By “clean,” we mean that we want to avoid working with texts subjected to “dirty OCR,” or poor-quality Optical Character Recognition from scanned images of pages. Dirty OCR produces texts flecked with unpredictable errors and weird special characters: a mess to clean up by hand, and one to avoid if possible. Best of all is a text available as an HTML document or plain text that is based on “keyed in” input rather than OCR generation.

In looking for a good edition, consider: we don’t want to work with an excerpted edition from, say, the 1890s when there’s a full edition available close to the year of first publication. We need to survey our available options for representing each text, and demonstrate awareness of multiple editions of each text in the process of editing. However, the very best text for our scholarly edition may simply not be available as a “clean” electronic text file, so we’ll need to make do with the best available clean text to start. This doesn’t mean we’re stuck with coding a marginally desirable edition; it just means we can start with as much of our text in place as we can find. We can compare this text with a preferred edition and key in the differences--but at least we don’t have to key in the entire text. (In our workshops and hangouts and in consultation with each other, we’ll discuss good ways to track down variant editions.)

The following sections describe the structural markup for various kinds of texts we’re working on in the Digital Mitford project. For all of the literary texts regardless of form, title pages and front matter will vary considerably, and you may want to consult the coding for Title Pages in the TEI for anything I’ve not anticipated here.

II. A. The TEI Header: Consistent Elements Across All Files

What follows is a general TEI header with elements that we'll be using in all of our files in our project. For working with manuscript letters, please see the more specialized header in the Letters encoding section.

The top lines here, beginning with <? ...>, aren't part of the header, but belong at the top of our files to validate against the TEI P5 coding rules, and against the specific rules we've developed for our project. We've adapted this sample header from a play file in the Mitford project.

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="" type="application/xml" schematypens=""?>
<?xml-model href="" type="application/xml" schematypens=""?>
<?xml-model href="" type="application/xml" schematypens=""?>

<TEI xmlns="">
   <teiHeader>
      <fileDesc>
         <titleStmt>
            <title>Title</title>
            <author>Mary Russell Mitford</author>
            <editor ref="#rnes">Rebecca Nesvet</editor>
            <sponsor><orgName>Mary Russell Mitford Society: Digital Mitford Project</orgName></sponsor>
            <sponsor>University of Pittsburgh at Greensburg</sponsor>
            <sponsor>Pittsburgh Supercomputing Center</sponsor>
            <principal>Elisa Beshero-Bondar</principal>
            <respStmt>
               <resp>Transcription, recording of variants, and TEI coding by</resp>
               <persName ref="#rnes">Rebecca Nesvet</persName>
            </respStmt>
            <respStmt>
               <resp>Proofing and corrections by</resp>
               <persName>Elisa Beshero-Bondar</persName>
            </respStmt>
         </titleStmt>
         <editionStmt>
            <edition>First digital edition in TEI, date: 5 June 2013. P5.</edition>
         </editionStmt>
         <publicationStmt>
            <authority>Digital Mitford: The Mary Russell Mitford Archive</authority>
            <pubPlace>Greensburg, PA, USA</pubPlace>
            <date>2013</date>
            <availability>
               <licence>Distributed under a Creative Commons Attribution-ShareAlike 3.0 Unported License</licence>
            </availability>
         </publicationStmt>
         <seriesStmt>
            <title>Digital Mitford: The Mary Russell Mitford Archive</title>
         </seriesStmt>
         <sourceDesc>
            <msDesc>
               <msIdentifier>
                  <repository>British Library</repository>
                  <idno>Add MS 42873, folio pages 402-404 and 415-499</idno>
               </msIdentifier>
               <physDesc>
                  <objectDesc>
                     <supportDesc>
                        <support>
                           <p><material>Paper</material>, quarto-sized sheets.</p>
                        </support>
                        <condition>Written on the front sides of the sheets.</condition>
                     </supportDesc>
                  </objectDesc>
               </physDesc>
            </msDesc>
         </sourceDesc>
      </fileDesc>
      <profileDesc>
         <handNotes>
            <handNote>In the manuscript of 1825, Mitford's hand numbers the pages of the play in sequence from 1 to 85. A second hand in pencil has renumbered the folio pages for inclusion in this volume of Plays from the Lord Chamberlain's Office, from 415 to 499.</handNote>
         </handNotes>
      </profileDesc>
      <encodingDesc>
         <editorialDecl>
            <p>Mitford’s spelling and punctuation are retained, except where a word is split at the end of a line and the beginning of the next in the manuscript. Where Mitford’s spelling and hyphenation of words deviates from the standard, in order to facilitate searching we are using the TEI elements “choice," “sic," and “reg" to encode both Mitford’s spelling and the regular international standard of Oxford English spelling, following the first listed spelling in the Oxford English Dictionary. The long s and ligatured forms are not encoded.</p>
         </editorialDecl>
      </encodingDesc>
   </teiHeader>
   . . .
</TEI>



II. B. Poems

See Ch. 6 in the TEI P5 Guidelines on coding verse. Here’s a sample structure for a long poem divided into cantos (like Blanch or Christina), including a title page, an epigraph, a dedication, and an introduction at the front. (I’m snagging some of this from the TEI’s sample code, and I’m just showing the <text> portion of the TEI document.)

<text>
   <front>
      <titlePage>
         <docTitle>
            <titlePart type="main">Histoire du Roi de Bohême</titlePart>
            <titlePart type="sub">et de ses sept châteaux</titlePart>
         </docTitle>
         <titlePart>Pastiche.</titlePart>
         <byline>Par <docAuthor>Charles Nodier</docAuthor></byline>
         <docEdition>Third edition.</docEdition>
         <docImprint>
            <pubPlace>PARIS</pubPlace>, <publisher>Delangle Frères Éditeurs-libraires</publisher>,
            <placeName>Place de la Bourse</placeName>
            <docDate>MDCCCXXX</docDate>
         </docImprint>
      </titlePage>
      <epigraph>
         <cit>
            <quote>
               <l>Since I can do no good because a woman</l>
               <l>Reach constantly at something that is near it.</l>
            </quote>
            <bibl>
               <title>The Maid's Tragedy</title>
               <author>Beaumont and Fletcher</author>
            </bibl>
         </cit>
      </epigraph>

[prose text...]

or text with line breaks that isn’t necessarily poetry: <lb/>text <lb/>text <lb/>text  

      <head>Introduction</head>
      <lg>
         <l n="1"></l>
         <l></l>
         . . .
         <l></l>
      </lg>

      <pb n="2"/> (a sample page break element)
   </front>
   <body>


    <head>Canto I</head>


            <head>Section 1.</head>
            <lg>   [line group: a cluster of lines.]
               <l></l>
               <l></l>
               . . .
               <l></l>
            </lg>
            <lg>
               <l></l>
               <l></l>
               . . .
               <l></l>
            </lg>


          <head>Section 2.</head>

              . . .


       <head>Canto II</head>


            <head>Section 1.</head>                 . . .        


</body> <back> . . . </back> </text>

II. C. Plays

See Chapter 7 on Performance Texts in the TEI Guidelines. Note: The Front elements don’t have to appear in this order--this is just a sampling. But the Front always includes the castList and the general set for the whole play.

<text>
   <front>
      <titlePage>
         <docTitle>
            <titlePart type="main">The Melfi.</titlePart>
            <titlePart type="subtitle">A Tragedy <lb/> Five Acts<lb/></titlePart>
            <titlePart type="place">Theatre Royal Covent Garden </titlePart>
            <titlePart type="date">5th March 1823.--</titlePart>
         </docTitle>
      </titlePage>

(various opening sections: dedication, chamberlainletter, etc. See frontmatter divs in my file of Julian)

      <castList>
         <castItem>
            <role xml:id="Doge_F">Doge Foscari</role>
            <roleDesc>(if given) </roleDesc>
            <actor>(if given) </actor>
         </castItem>
         <castItem>
            . . .
         </castItem>
      </castList>

The xml:id on <role> gives a unique identifier for the character--to be referred to in the body of the play (see below), and to be added to our site index (si.xml) file. Once we have added it to the site index, we can change the tagging here to <role corresp="#Doge_F">.

 Describes the overall setting of the play. (Note: there’s some variation on this in the guidelines: I noticed that for my code to be valid on Foscari, this needed to be a div, rather than the <set> element.)




                <head>Act I.</head>


                <head>Scene I.</head>

                <stage type=""> Stage directions--can appear outside or inside speeches, and may include:
                <stage type="setting"> (for descriptions of setting)
                <stage type="business"> (for entrances, exits, physical movements)
                <stage type="delivery"> (for comments on spoken voice, like asides)

<sp who="#Doge_F"> (The @who gives the xml:id value after a hashtag (#) to refer up to the xml:id established for each character in the castList.)
        <speaker>Name</speaker>
        <l> </l>
        <l> </l> (when the speeches are in verse; otherwise use <p> for prose paragraphs.)
</sp>

<sp who="#fred #camilla"> (If two or more people are speaking in unison, you can refer to them in a single @who attribute separated by white space. Very handy!)
        <speaker>Name</speaker>
        <l> </l>
        <l> </l> (when the speeches are in verse; otherwise use <p> for prose paragraphs.)
</sp>
<pb n="2"/> (a sample page break element)
        . . .

<head>Scene II.</head>
. . .

. . .             </body>

   <back> . . . </back> </text>
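
Every @who pointer in a speech should resolve to an xml:id declared on a <role> in the castList. The Python sketch below shows one way to check that before handing a file to another editor. It assumes the identifiers still live in the castList of the same file (that is, they have not yet been moved to si.xml); the function name unresolved_speakers and the sample markup are illustrative only and are not part of our project tooling.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the reserved XML namespace
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def local_name(tag: str) -> str:
    """Strip any namespace prefix in braces from an element tag."""
    return tag.rsplit("}", 1)[-1]


def unresolved_speakers(xml_text: str) -> list:
    """Return every @who pointer that has no matching xml:id on a <role>."""
    root = ET.fromstring(xml_text)
    role_ids = {
        el.get(XML_ID)
        for el in root.iter()
        if local_name(el.tag) == "role" and el.get(XML_ID)
    }
    missing = []
    for el in root.iter():
        if local_name(el.tag) != "sp":
            continue
        # @who may hold several pointers separated by white space
        for pointer in el.get("who", "").split():
            if pointer.lstrip("#") not in role_ids:
                missing.append(pointer)
    return missing


# Hypothetical sample: only Doge_F is declared in the castList
SAMPLE = """<text><front><castList>
<castItem><role xml:id="Doge_F">Doge Foscari</role></castItem>
</castList></front>
<body>
<sp who="#Doge_F"><speaker>Doge</speaker><l/></sp>
<sp who="#fred #camilla"><speaker>Both</speaker><l/></sp>
</body></text>"""

print(unresolved_speakers(SAMPLE))
```

An empty list means every speaker pointer resolves. Anything the function returns is either a typo in @who or a character still missing from the castList.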

II. D. Prose Texts

See examples of front matter in preceding sections. Prose fiction is divided into <div> elements for books and chapters, and <p> elements for prose paragraphs. Where a prose text contains embedded verse, code the lines of verse outside the paragraph structure like this:

<lg>    <l></l>    <l></l> </lg>

<text> <front> . . . </front> <body>


        <head>Book I</head>


                <head>Chapter I</head>


. . .

. . .

<pb n="2"/> (a sample page break element)    

. . .


</body> <back>. . .</back> </text>

II. E. Letters (and working with manuscripts / images)

Standard Filenames: Please save your letter’s XML file using this standardized form for a filename: yyyy-mm-dd-RecipientNoSpace.xml
Example: 1821-10-31-BRHaydon.xml
(This prioritizes the year for us in sorting and cataloging.) 
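
If you would rather generate the filename than type it, the convention above can be sketched in a few lines of Python. This is only an illustration: the function name letter_filename is hypothetical, and the choice to drop every character that is not an ASCII letter or digit from the recipient's name is an assumption about what "NoSpace" should mean.

```python
import re
from datetime import date


def letter_filename(letter_date: date, recipient: str) -> str:
    """Build a filename of the form yyyy-mm-dd-RecipientNoSpace.xml."""
    # Assumption: drop spaces and punctuation, keep ASCII letters and digits
    recipient_part = re.sub(r"[^A-Za-z0-9]", "", recipient)
    if not recipient_part:
        raise ValueError("recipient must contain at least one letter or digit")
    # date.isoformat() gives yyyy-mm-dd, so files sort by year first
    return f"{letter_date.isoformat()}-{recipient_part}.xml"


print(letter_filename(date(1821, 10, 31), "B. R. Haydon"))
```

For the example letter this prints 1821-10-31-BRHaydon.xml.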
Please back up your file regularly in our shared Box space in Mitford Digital Archives, in the appropriate folder (which is named as above) together with the image files for your letter. (When you need help or you’re ready for another editor to review it, signal us from Box and send e-mail!)

The TEI Header has special elements for manuscripts, including places to mark when there is more than one "hand" (or writer) recording marks or writing in the document.  I’m providing a detailed explanation of header elements in our letters here, both to standardize them and to help you figure out how to customize them for each letter.

Transcribing from manuscripts can’t be entirely literal when we are producing a digital surrogate in Unicode text. See our policy on this in the editorialDecl statement of our TEI header for letters below--quoted here for reference:
<encodingDesc>          <editorialDecl>


Mitford’s spelling and punctuation are retained, except where a word is split at the end of a line and the beginning of the next in the manuscript. Where Mitford’s spelling and hyphenation of words deviates from the standard, in order to facilitate searching we are using the TEI elements “choice," “sic," and “reg" to encode both Mitford’s spelling and the regular international standard of Oxford English spelling, following the first listed spelling in the Oxford English Dictionary. The long s and ligatured forms are not encoded.

         </editorialDecl> </encodingDesc> 
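
The policy quoted above pairs each nonstandard spelling with a regularized form so that searches can find it. The Python sketch below illustrates how a processing script might pull either reading out of the same markup. It is not our actual transformation code: the helper names local_name and reading and the sample paragraph are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip any namespace prefix in braces from an element tag."""
    return tag.rsplit("}", 1)[-1]


def reading(element: ET.Element, skip: str) -> str:
    """Return the text of element, leaving out every descendant named `skip`."""
    parts = []
    if local_name(element.tag) != skip:
        parts.append(element.text or "")
        for child in element:
            parts.append(reading(child, skip))
    # Text that follows an element belongs to its parent, so keep it even when skipping
    parts.append(element.tail or "")
    return "".join(parts)


# Hypothetical sample using the choice/sic/reg pattern from our letter template
SAMPLE = (
    '<p>A <choice><sic>Wierd</sic><reg resp="#lmw">weird</reg></choice> day for '
    '<choice><sic>every body</sic><reg resp="#lmw">everybody</reg></choice>.</p>'
)

paragraph = ET.fromstring(SAMPLE)
print(reading(paragraph, skip="sic"))  # regularized reading, useful for searching
print(reading(paragraph, skip="reg"))  # Mitford's own spelling
```

Skipping <sic> yields the regularized text and skipping <reg> yields the literal one, which is why both forms are recorded inside a single <choice>.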

This means a couple of things for us: 
1) With so many letters we’re trying to make available for the sake of information and searching, there is no point for us to sweat the reproduction of hyphens that look like equal signs (as many of Mitford’s tend to do). Indeed, if Mitford is just hyphenating a word at the end of a line, we’re just going to skip over that hyphen silently anyway (which is common practice in print editions as well as digital ones). However, if she hyphenates a word in the MIDDLE of a line, yes, we’ll reproduce that, and if it’s a nonstandard usage, we’ll use <choice><sic>literal weird usage</sic><reg>normal usage</reg></choice> tags to indicate both Mitford’s unusual usage and the regular/standard form (as modelled below in our letter template).

  • But what happens when Mitford splits a word at the end of a *page* in her manuscript? In this case, we will embed the <pb/> self-closing "milestone" element inside the word that's broken at the point of the split, like this:

This is a line of Mitford's text at the bottom of a page, with a word split<pb n="2"/>ting onto the next page.

When we need to indicate page breaks, the <pb/> element permits us to do so.
2) One issue we’re going to have is how to represent Mitford’s dashes, as distinct from short hyphens. Here let’s think about the punctuation conceptually and functionally, first of all. Don’t worry about how long a dash is. If it’s a line that’s clearly being used AS a dash, let’s consistently indicate that in a simple way: as you’re transcribing, make dashes with two hyphens, like this--with NO spaces on either side. Later we’ll convert these all to standard Unicode em-dashes, which require a special character code: —
(It’s not essential that we use this immediately--unlike the need to use a unicode ampersand--and it’s something I can standardize as we transform our files. For our purposes, let’s keep this simple and standard for ease of transcription.) 
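
For the curious, the later conversion of double hyphens can be sketched in Python as below. This is an assumption about how such a pass could work, not the project's actual transformation: the name to_em_dashes is hypothetical, and the pattern deliberately leaves alone longer runs of hyphens and the delimiters of XML comments.

```python
import re

# Exactly two hyphens: not preceded by "!" or "-", and not followed by "-" or ">",
# so "<!--", "-->" and runs such as "---" are left untouched.
DOUBLE_HYPHEN = re.compile(r"(?<![!\-])--(?![\->])")


def to_em_dashes(text: str) -> str:
    """Replace each transcribed double hyphen with U+2014 (em dash)."""
    return DOUBLE_HYPHEN.sub("\u2014", text)


converted = to_em_dashes("like this--with NO spaces <!-- a comment -->")
print(converted)
```

Because the rule only looks at the characters immediately around the hyphens, it works the same whether the dash sits between words or after punctuation.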

<TEI xmlns="">
   <teiHeader>
      <fileDesc>
         <titleStmt>
            <title xml:id="MRMidentifier">Letter to <persName ref="#Talfourd_Thos">Thomas Noon Talfourd</persName>, 14 September 1820. </title>
            <author ref="#MRM">Mary Russell Mitford</author>
            <editor ref="#lmw">Lisa M. Wilson</editor>
            <sponsor>
               <orgName> Mary Russell Mitford Society: Digital Mitford Project </orgName>
            </sponsor>
            <sponsor>University of Pittsburgh at Greensburg</sponsor>
            <sponsor>Pittsburgh Supercomputing Center</sponsor>
            <principal>Elisa Beshero-Bondar</principal>
            <respStmt>
               <resp>Transcription and coding by</resp>
               <persName ref="#lmw">Lisa M. Wilson</persName>
            </respStmt>
            <respStmt>
               <resp>Date last checked: <date when="2014-07-04">2014-07-04</date>. Proofing and corrections by</resp>
               <persName ref="#ebb">Elisa Beshero-Bondar</persName>
            </respStmt>
         </titleStmt>
         <editionStmt>
            <edition> First digital edition in TEI, date: <date when="2014-06-06">6 June 2014. P5. </date></edition>
            <respStmt><resp>Edition made with help from photos taken by</resp><orgName>Digital Mitford editors</orgName></respStmt>
            <respStmt><orgName>The Digital Mitford</orgName><resp> editors' photos from this archive are not permitted for public distribution. Photo files: <idno>DSCF6129.jpg, DSCF6130.jpg, DSCF6131.jpg, DSCF6132.jpg, DSCF6133.jpg, DSCF6134.jpg</idno></resp></respStmt>
         </editionStmt>

         <publicationStmt>             <authority>Digital Mitford: The Mary Russell Mitford Archive</authority>             <pubPlace>Greensburg, PA, USA</pubPlace>             <date>2013</date>             <availability>


                  Reproduced by courtesy of the <orgName ref="#ReadingCL">Reading Central Library</orgName>.                   Courtesy of <orgName ref="#Rylands">The University of Manchester</orgName>.                

               <licence>Distributed under a Creative Commons Attribution-ShareAlike 3.0 Unported License</licence>
            </availability>
         </publicationStmt>
         <seriesStmt>
            <title>Digital Mitford Letters: The Mary Russell Mitford Archive</title>
         </seriesStmt>
         <notesStmt>
            <note>Any special notes on this text? (optional)</note>
            <note>You can have multiple notes here.</note>
         </notesStmt>
         <sourceDesc>
            <msDesc>
               <msIdentifier>
                  <repository ref="#Rylands">The John Rylands University Library</repository>
                  <collection> Mitford-Talfourd Correspondence: Letters from Mary Russell Mitford to Thomas Noon Talfourd: vol. 665</collection>
                  <idno></idno>
               </msIdentifier>
               <physDesc>
                  <objectDesc>
                     <supportDesc>
                        <support>


<material>Paper</material> with watermark in form of <watermark>anchor</watermark> visible on second sheet.

How many distinct sheets of paper do you count in this letter (as opposed to the number of pages--usually 4--marked on the fronts and backs and over folds)? Indicate number of sheets, and size of sheets, whether this is a fragment, and if so what appears to be missing. Two sheets of quarto-post folded in thirds twice. OR octavo-post OR 16-mo post, folded once or twice, etc. (If you can’t tell what size paper this is b/c you’re working with a photo or just don’t know, you can simply say, one large sheet, with a half-size smaller sheet inside, etc.)  Describe any watermarks here, and on what sheets they appear. Envelope present, or only an address leaf? [EXAMPLE]:  Folded in thirds twice: one sheet of quarto-post containing pages 1, 2, 5 and 6, and one sheet of half quarto post containing pages 3 and 4. Watermark in form of <watermark>unicorn</watermark> visible on page three

Describe stamps or postal marks here. It'll help to look at our slides on how to identify the different stamps:  Use the <lb/> element to indicate what’s above and below if needed. Envelope bearing large stamped date inside circle reading <date when="1818-04-20"><stamp>April 20<lb/>1818</stamp></date> above right of address. Address leaf bearing sepia-inked stamp reading<stamp><lb/><date>29 * JU</date> <lb/><date>1820</date> N. <unclear><gap quantity="1" unit="chars" reason="illegible"/></unclear></stamp> at top left-hand edge upside down in relation to the writing of the address.



Comment on the condition: is the letter damaged, signs of mold? Example: A portion of page 3 has been torn away under the seal.


</supportDesc>                                      </objectDesc>                     <sealDesc>


Is there a seal on this letter? Describe it. Red oval-shaped wax seal with diagonal line, and at the bottom lettering anders. Or indicate here if the seal is missing.



          </msDesc>    </sourceDesc>  </fileDesc>        <profileDesc>          <correspDesc> <correspAction></correspAction> <correspContext></correspContext>
<note></note>                         </correspDesc>          <handNotes>             <handNote xml:id="rc" medium="red_crayon"> Red crayon or thick red pencil. Probably a different hand from Mitford's, that marks many of her letters, sometimes drawing diagonal lines across pages, and sometimes writing words overtop and perpendicularly across Mitford's writing. </handNote>             <handNote xml:id="black_ink" medium="black_ink"> Someone, apparently other than Mitford, perhaps cataloging letters and describing them. </handNote>             <handNote corresp="#pencil" medium="pencil"> [Use @corresp="#id" when the hand is already identified as an xml:id on our site index (si.xml), as in this case.] Someone, apparently other than Mitford, perhaps cataloging letters and describing them, who left grey pencil marks and numbered her letters now in the Reading Central Library's collection.             </handNote>         </handNotes> </profileDesc>   <encodingDesc>          <editorialDecl>


Mitford’s spelling and punctuation are retained, except where a word is split at the end of a line and the beginning of the next in the manuscript. Where Mitford’s spelling and hyphenation of words deviates from the standard, in order to facilitate searching we are using the TEI elements “choice," “sic," and “reg" to encode both Mitford’s spelling and the regular international standard of Oxford English spelling, following the first listed spelling in the Oxford English Dictionary. The long s and ligatured forms are not encoded.

         </editorialDecl> </encodingDesc>   </teiHeader>

 <text>       <body>


            <opener>                <add><handShift resp="#pencil"/>35 B. R. Haydon Esq</add>                <add><handShift resp="#rc"/></add><addSpan spanTo="#endpoint"/>                               <dateline>                   <date when="1821-10-31">October 31<hi rend="superscript">st</hi> 1821.</date>                   <name type="place">Three Mile Cross</name>                </dateline>                <salute>My dear Sir</salute>             </opener>


Text text text text...



Text to the end of page one.<anchor xml:id="endpoint"/> Here’s how to deal with <emph rend="underline">Mitford’s underlining</emph> in her manuscripts.

            <pb n="2"/> (self-closing page-break element, which might or might not be contained in a paragraph. You don’t need one for page 1. Put <pb n="2"/> at the start of the second page.)

Here’s how to code idiosyncratic and standard spellings.       


Text text text text text text text text text <choice><sic>Wierd</sic><reg resp="#lmw">weird</reg></choice> text text text <choice><sic>every body</sic><reg resp="#lmw">everybody</reg></choice>  (For the word you place in <reg>..</reg> tags, use Oxford international English spelling.)

Here's how to code missing or obliterated text due to DAMAGE to the manuscript:

For tears in the paper:

Text text text Text text text Text text text Text text text <gap reason="torn" unit="word" quantity="2"/>

Indicate the size of the gap (one letter, one word, # of words):
<gap quantity="[#]" unit="word"/>
<gap quantity="[#]" unit="chars"/> [for missing "characters"]
<gap quantity="[#]" unit="sentence"/>

For smudges:

<damage agent="smudge" unit="word" quantity="2"/><unclear><supplied resp="#ghb">your best guess here</supplied></unclear>
Here's how to code missing or obliterated or added text due to Mitford herself altering the text:

<del rend="">deleted text here</del>

Indicate how the word or words are deleted. Common values of @rend are: strikethrough, slashes, crossout, squiggles, etc. Samples:

Use when MRM has scribbled something out, but you can’t read what was underneath:
<del rend="squiggles"><gap quantity="1" unit="word"/></del>

Use if MRM has lined something out and you can read what was beneath, as well as the word that replaced it:
You will find that I have conformed to <del rend="strikethrough">your</del><add>the</add> representation of the Venetian government as we find it in the great Dramatists

If the infamous Red Crayon Monster of Reading Central Library has lined out words (something MORE than the usual diagonal slash across the page), code that like this.

  • In the TEI Header, in the <profileDesc>, <handNote xml:id="rc"> identifies the Red Crayon.
  • At the point of the deletion, code like this (for a readable deletion of 9 words): 
<add>

For words added by MRM above or below the line of text, using a caret mark:   ˄   or   ˅
<add place="above"> <metamark place="below" function="insertion" rend="caret"/>me</add>

For Mitford’s Jerk paragraph separator:
<add><metamark rend="waves"/></add>

For words added by MRM above or below the line of text, not using a caret mark:
<add place="above">me</add>

Finally, how to code the end material in a letter:
            <closer>
               <lb/>Ever most sincerely your's<lb/>
               <signed>MM Mitford.</signed><lb/>
               <address>
                  <addrLine><persName ref="#Haydon">B. R. Haydon Esqre</persName></addrLine>
                  <addrLine><placeName>St. John's Place</placeName> <placeName>Lisson Grove North</placeName></addrLine>
                  <addrLine><placeName>Regent's Park</placeName></addrLine>
                  <addrLine><placeName>London</placeName></addrLine>
               </address>
            </closer>
            <postscript><p>How is poor Miss Lamb?</p></postscript>

[Notice that the postscript element is OUTSIDE the closer element in TEI.]

      </body>
      <back>
                  <listPerson>                  <person xml:id="Byron">                  <persName>George Gordon, Lord Byron</persName>                  <birth>                     <placeName> where born; date on when= of <birth>                                                                    </placeName>                  </birth>                  <death>                     <placeName> where died; date on when= of <death>  </placeName>                  </death>               </person> . . .         </listPerson>                  <listPlace>
                <place>                  <district xml:id="Lisson_Grove">Lisson Grove, within City of       Westminster, London</district>               </place>         </listPlace>


      </back>    </text> </TEI>

For more detailed information on TEI Encoding for manuscripts, see TEI Chapter 10 on Manuscript Description, and review our TEI Header file for an ms letter in the Box MRMS Project Support folder. See also TEI Chapter 11 on Representing Primary Sources.

More on our TEI Header for Letters:

1. Archive-specific Permissions to record in the TEI Header: Specific archives have given us distinct permissions statements to record in our digitized representations of their holdings. Here’s what we’re recording for each, within the <fileDesc> <publicationStmt> portion of the header (see the precise position of these elements in the coding template letter above):

a) Reading Central Library:

<availability>


Reproduced by courtesy of the <placeName>Reading Central Library</placeName>.

               <licence>Distributed under a Creative Commons Attribution-ShareAlike 3.0 Unported License</licence>             </availability>

b) The John Rylands Library:         <availability>

Reproduced by courtesy of the University Librarian and Director, <placeName>The John Rylands Library</placeName>, The University of Manchester.

<licence>Distributed under a Creative Commons Attribution-ShareAlike 3.0 Unported License</licence> </availability>

2. Further along in the <fileDesc> portion of the TEI header, within the <msDesc><msIdentifier> elements, is the place where we designate the volume number and shelfmark information of a letter as stored in a particular archive. To locate information on the letter you’re working on, please consult our working Excel Spreadsheet of Mitford Letters, posted on Box: download the file so you can easily search through it, click on the Letters tab to view individual letters by date, and find your letter. (If it’s missing, please say something--we need to know that!) In the spreadsheet, you’ll see volume names to record in the <collection> element, and shelfmark and volume information to record in the <idno> element. Letters in the John Rylands Library are stored in one of three volumes: English MSS 665, English MSS 666, or English MSS 667. (Notice that we’ve recorded specific shelfmark, volume, and number information for letters at Rylands in Box in a comment on the file folder containing your letter. For Rylands letters, include all this identifier info, including Rylands’ numbering and Coles’ numbering.) Those in Reading Central are Shelfmark qB/TU/MIT and from either vol. 4 or vol. 5 (and we’re about to verify that and collect more at Reading Central in the next few months as of summer 2014).

3. Postmarks: You may want some help with reading half-legible ink stamps. We've posted a useful resource on this in Box, Alcock and Holland's canonical text, Postmarks of Great Britain and Ireland. (This is posted in Mitford Digital Archives→Mitford Letters Collated, at the outermost level.) See our example in the letter above of where and how to describe this information.

II. F. Editorial Headnotes

Like the majority of text-based documents in the Digital Mitford Archive, our editorial headnotes are written and stored in TEI XML. In these headnotes, we document what we have researched of a text’s drafting and publication history, and provide overviews of significant contexts to help introduce Mitford’s writings. Our headnotes, like all TEI files in the Archive, should be coded to connect to canonical names and @xml:ids defined in our site index (si.xml). Unlike most other kinds of files in our archive, though, these headnotes are “born digital” and can be coded to reference resources outside our archive, such as another database containing a pertinent file, like a page in Lord Byron and His Times, the Southey Letters, or the Shelley-Godwin archive. I’ve posted a standard template for all headnote files in the Box (under MRMS Project Support→Coding Templates, Guidelines, and Workshop Notes). Here’s a copy, with info on how to code a “link” to an external web resource: <ref target="http://…">referenced text</ref>. (This isn’t exactly the same as a web hyperlink, but does permit us to convert your tagging to a web hyperlink when we transform the file to html.)

<TEI xmlns="">
   <teiHeader>
      <fileDesc>
         <titleStmt>
            <title>Headnote to <title ref="#idref">Text by Mitford</title></title>
            <author ref="#youridref">[Name of Editor working on this]</author>
            <sponsor><orgName>Mary Russell Mitford Society: Digital Mitford Project</orgName></sponsor>
            <sponsor>University of Pittsburgh at Greensburg</sponsor>
            <principal>Elisa Beshero-Bondar</principal>
            <respStmt>
               <resp>Corrections and proofing by</resp>
               <persName ref="#editoridref">[Name of Editor working on this]</persName>
            </respStmt>
         </titleStmt>
         <editionStmt>
            <edition>First digital edition in TEI, date: [what date was this prepared?]. P5.</edition>
            <respStmt>
               <resp>We can include a respStmt here.</resp>
               <persName>Who?</persName>
            </respStmt>
         </editionStmt>
         <publicationStmt>
            <authority>Digital Mitford: The Mary Russell Mitford Archive</authority>
            <pubPlace>Greensburg, PA, USA</pubPlace>
            <date>2013</date>
            <availability>
               <licence>Distributed under a Creative <orgName>Commons</orgName> Attribution-ShareAlike 3.0 Unported License</licence>
            </availability>
         </publicationStmt>
         <seriesStmt>
            <title>Digital Mitford: The Mary Russell Mitford Archive</title>
         </seriesStmt>
         <notesStmt>
            <note>[Anything we need to say about the preparation of this headnote beyond what we're covering in the rest of this header?]</note>
         </notesStmt>
         <sourceDesc>
            <p>born digital</p>
         </sourceDesc>
      </fileDesc>
   </teiHeader>
   <text>
      <body>


[Write your headnote! Code with reference to our site index, indicating <persName ref="#Doge_F">Doge Foscari</persName> just as you would in other files mentioning him, for example. You probably won't need to add annotations to your own introductory headnote, which is, after all, a big introductory annotation to a Mitford text.


To add a link to a pertinent and authoritative web resource, there's a special TEI tag set: A link to <ref target="">the biographies page in the Southey letters archive</ref> would look like this.

To make a bibliographic citation to a source that we're including in our site index, use the <cit> and <bibl> tag set, like this: <cit><bibl ref="#whateverID"><title>Poetic Castles in Spain</title><author>Diego Saglia</author></bibl></cit>]

      </body>   </text> </TEI>

III. Contextual/Relational Markup

First, here’s a basic overview of how most of our contextual markup works. Our context markup usually involves two things:

1) A tag in the body of the text that indicates a person, or a place, or an event. This tag usually has an @ref, an @who, or an @corresp attribute:

  • <persName ref="#Mitford_Geo"> for Mitford's father, when referred to by name, or <rs type="person" ref="#Mitford_Geo"> when he’s not named, but you want to mark a reference to him.
  • <sp who="#Doge_F"> when we’re defining who is speaking a part in a play and referring to the role definition up in the cast list.
  • <castItem><role corresp="#Doge_F"> when it’s a character role defined in a cast list, pointing elsewhere to a standard xml:id (see below).

2) Elsewhere in the file, or in another file (our site index), a place to identify the hashtags: the point where an @xml:id is defined.

We're putting lists of persons and places and events down in the "back" section of each file: so we add a section after <body> called <back> to develop and store our lists as we code. Each entry here stores a unique xml:id. The prosopography lists and entries you're preparing have a formal structure in TEI, so they're set up to contain canonical names, information, birth and death dates, etc.

I'm asking you to place these in the same file, and I'll be extracting the lists as we go into a separate centralized file. Eventually all our texts will be pointing to this central file, and our readers will be able to move back and forth from that file to the texts. That file should also allow us to show which texts refer to a particular person or place. Using @xml:id attributes lets us do all this.

Other places to store @xml:ids:

  • Within plays in particular, the castList is another place that will hold the canonical names of speakers abbreviated in the play, so that's a point where we'll also use xml:id attributes.
  • Sometimes up in the TEI header we'll have occasion to store detailed info referred to in the body of the text. (Remember, we saw that with our <handShift> element when coding letters: we described the different hands scrawling on a manuscript up in the header, gave those hands distinct @xml:ids, and coded within the body of the text to refer up to the header.)
  • As I update our site index I’ll extract and possibly modify the xml:ids as defined inside your files. As we standardize the xml:ids in our central si.xml file, you or I will change <handNote> and <castList><role> attributes from @xml:id="id" to @corresp="#id".
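The switch from @xml:id to @corresp described in the last bullet can be sketched as follows. These are separate fragments rather than one file, and the <speaker> label and the bracketed line are invented placeholders; only the #Doge_F identifier comes from our examples above.

```xml
<!-- While you are drafting: the role in the cast list defines the identifier. -->
<castItem>
   <role xml:id="Doge_F">Doge Foscari</role>
</castItem>

<!-- After the identifier is standardized in si.xml: the role points to the site index instead. -->
<castItem>
   <role corresp="#Doge_F">Doge Foscari</role>
</castItem>

<!-- Speeches in the play point to the same identifier in either phase. -->
<sp who="#Doge_F">
   <speaker>Doge</speaker>
   <l>[line of verse]</l>
</sp>
```

Because the speeches always use the hashtag form, only the single defining attribute has to change when an identifier moves into the central file.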

Workflow Suggestion: As you begin editing, and especially if you’re transcribing from manuscript, it may be best to just enter tag elements without the attributes at first: Just enter the simple tags for <persName>, <placeName>, etc. as you read the texts, and save the looking up of detailed info for the back lists for a next phase of work. You'll be able to search for all of your markup later, retrieve the contents of specific tags, and modify it quickly using the Find & Replace window as well as the XPath window. Don't bother trying to do it one by one as you tag each individual.

**I’ve shown you some simple XPath expressions in our coding hangouts and at the MRM Workshops: XPath helps us to locate the contents of particular elements and the values of particular attributes, following the structure of our XML documents. If you’d like to learn more and experiment, here’s the intro guide I prepared for my students: “Follow the XPath!”
III. A. Our Site Index (si.xml) and Our Schema (MRMValidate.sch)

As we are all editing Mitford’s texts, we are actively compiling entries for our central site index file, or si.xml. This file is currently stored in Box (Mitford Digital Archives→SI: Standard Lists of Named Entities and XML:IDs) and posted online. The file contains our centralized list of standard names and xml:id values (specific identity codes) for persons, fictional characters, places, texts, and events (the tagging of which we discuss in later sections here in Part III of this codebook).

The site index serves multiple purposes: We can extract information from it to publish in html on the Digital Mitford website (as in a list of the real places referred to in some way in Our Village, for example, and, separately, a full index of all people referenced in the entire archive). We also use this file internally among our project team, to help us keep track of every named entity we’ve defined, and to reference our xml:ids in our coding. As I add your updates and new entries to the si.xml file, I formalize and standardize the xml:ids to make sure they aren’t duplicated anywhere in the file and to make them both brief and (I hope) human-readable, so we can locate them as available identifiers to use while coding.

I have prepared a Schematron file, MRMValidate.sch, to help us with live checking (or "validation") of the files you are working on. (This replaces our old system of schema checking.) When the file is associated, it works essentially like a “spell-checker” to help guide you in entering values for @ref, @who, and other similar attributes on our context coding for <persName>, <placeName>, and the other elements we have in play for named entities (places, events, texts, etc.) with canonical xml:ids that we’re storing in si.xml.

You’ll only use (or “associate”) this Schematron file to check the values we’ve already identified in the site index, and you’ll disconnect (or “dissociate”) the schema when you’re done with this and ready to identify named entities who are not already listed in our site index. Here’s the process:

Workflow with our Site Index:

1. Entries are added to the si.xml file. (Working with your files, we extract your lists of new persons, places, other named entities, edit them, and add them to the site index. More on this below.)

2. As you prepare to work on context coding a Mitford text, you need to associate the current Digital Mitford Schematron file with your XML file. There are two ways to do this:

a. Preferred: Simply paste the purple schema line below into your file in <oXygen/>, directly beneath the purple lines identifying the TEI schema rules, and just above the root <TEI> element at the head of your document:

<?xml-model href="" type="application/xml" schematypens=""?>

As long as you're working with a stable internet connection, this is the best way to be sure you are always using the most up-to-date version of our project schema rules.   

b. If you're not going to be working with a stable internet connection, you'll want to save the current Schematron file locally to work with it: Download the current Schematron file, MRMValidate.sch, from Box, and save a local copy in the same folder in which you’re editing your XML file. You need to *temporarily* associate this schema file with your XML document to help you identify and use the right values for @resp, @ref, @who, and @corresp attributes in your file. To associate the schema, you must be currently editing the XML file in <oXygen/>. On the top menu bar in <oXygen/>, go to Document→Schema→Associate Schema… (see screen capture below). This brings up a new window, permitting you to browse to associate a new schema file. You need to do two things:

        a) browse and select the appropriate .sch file to associate, and
        b) keep the other schema files originally associated with your file in place. These are the schema rules oXygen applies for tei_all. (If you forget this part you can locate the appropriate lines by starting a new file using oXygen’s template for TEI all, and snagging the lines that appear before the <TEI> root element. Copy and paste those back into your file.)

See screen capture below for how to associate the schema. Once you’ve associated the schema rules with your file, its rules will “fire” whenever you apply values to the attributes @ref, @who, @corresp, etc. This will generate a helpful validation error if you mistype a value, and will identify an old value that’s no longer in use on the site index.
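For anyone curious about what a Schematron rule looks like, here is a small sketch of one kind of check such a file can make: that every value in an @ref attribute begins with a hashtag. This is a hypothetical illustration, not the contents of MRMValidate.sch; it assumes files in the standard TEI namespace and an XPath 2.0 query binding, and it does not look anything up in si.xml.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sketch, not the project's MRMValidate.sch. -->
<schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
   <!-- Assumption: the files being checked use the standard TEI namespace. -->
   <ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>
   <pattern>
      <!-- Fires on any TEI element that carries an @ref attribute. -->
      <rule context="tei:*[@ref]">
         <!-- @ref may hold several pointers separated by white space, so test each one. -->
         <assert test="every $token in tokenize(normalize-space(@ref), '\s+') satisfies starts-with($token, '#')">
            Each value in @ref should begin with a hashtag (#).
         </assert>
      </rule>
   </pattern>
</schema>
```

A fuller rule would also compare each pointer against the xml:ids stored in the site index, which is the part that makes the live checking behave like a spell-checker.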

* Remember: The @xml:id holds a "canonical" reference term, used just once in a project. @ref and other such attributes "point to" our xml:ids, and have a hashtag in front (#) to signal that these are connectors.
Example: <title ref="#Blackwoods">Blackwood's</title>
Don't forget your hashtags on these referencing attributes! The Schematron file will pick these up as errors, so look for missing hashtags as the most common/likely problem as you're correcting your work.

5. Eventually you will need to REMOVE the Schematron file, when you’ve completed your context coding of every named entity we’ve already listed in the current site index. You want to remove the schema so it doesn’t generate errors when you propose new entries and @xml:ids for the site index and when you work references to those new @xml:ids into your text. To remove the schema, go to the top of your file and either delete or “comment out” the schema line that holds “MRMValidate.sch.” If you saved the Schematron file locally, it looks like this:

<?xml-model href="MRMValidate.sch" type="application/xml" schematypens=""?>

(If you pasted in the schema posted on my GitHub site, it looks very like the above line with a GitHub address in front of it.) To “comment out” the line, simply wrap the whole thing in a comment tag, <!-- -->. In Windows, highlighting the line with your mouse and right-clicking brings up a menu in <oXygen/> with “toggle comment”, and you can use that to wrap a line in a comment. “Commenting out” the line effectively disables the schema just as well as deleting it, and makes it possible to quickly associate it again later if you want. Here’s what it looks like “commented out”:

<!--<?xml-model href="MRMValidate.sch" type="application/xml" schematypens=""?>-->

Whether you delete or comment out the MRMValidate.sch line, leave the other schema rules in place (the lines pointing to tei_all.rng). Or if you accidentally removed them, open a new document in TEI_all and copy and paste in the current appropriate schema lines, which sit under the very top line, <?xml version="1.0" encoding="UTF-8"?>. (Here’s a copy of all the top lines current as of 10/04/2014):

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="" type="application/xml" schematypens=""?>
<?xml-model href="" type="application/xml" schematypens=""?>

6. In the <back> section of your document (following the closing </body> tag inside the <text> element), code in your proposed lists of new named entities, setting up the appropriate list for specific groups (historical people, mythic entities, fictional characters, places, books, events) as defined and discussed in the following sections.
**Shortcut: To begin producing these <back> lists, download and run a Code Report XSL Transformation on your XML file in <oXygen/>. (This link is only available to Mitford editors.) In the results window you will generate a new XML file, which you may wish to save with the name of your original file + CodeRpt (as in 1819-10-22-Elford-CodeRpt.xml). After the lists of elements and attributes in the file, you will find a series of lists holding all the named entities (persons, places, titles, etc.) that you've marked: You can copy and paste those lists into the <back> of your file. Fill in these lists with new information, or delete entries if the names are already in the site index and you have no new information to add to those site index entries. (If you have never run a Code Report XSL Transform, or have forgotten how, you will need to check in with Elisa or the other editors, and one of us will walk you through it.)

If you want to expand on or otherwise propose changes to an entry in the current site index, do so here as well, with a comment tag to indicate this:

 [Please do propose changes and add new info! Our site index is rapidly growing and needs quite a lot of refining.]
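As a rough sketch of what a proposed change might look like in your <back> list: the entry repeats the existing xml:id and canonical name from the site index, and an XML comment flags it as a revision. The wording of the comment and the bracketed placeholders are only suggestions, and #youridref stands for your own editor ID.

```xml
<listPerson type="hist">
   <!-- PROPOSED CHANGE to an existing site index entry: [say what you are adding or correcting, and your source] -->
   <person xml:id="Bullock_Wm">
      <persName>
         <surname>Bullock</surname>
         <forename>William</forename>
      </persName>
      <note type="bio" resp="#youridref">[Your new or revised information goes here.]</note>
   </person>
</listPerson>
```

Keeping the existing xml:id unchanged makes it easy to match your proposal to the entry it revises when the lists are extracted.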

Here’s a structural outline to show where these new / revised entries go in your TEI file:

<TEI>
   <teiHeader>...</teiHeader>
   <text>
      <body>....</body>
      <back>
         <listOrg type="hist">...</listOrg>
         <listPerson type="hist">...</listPerson>
         <listPerson type="arch">...</listPerson>
         <listPerson type="fict">...</listPerson>
         <listPlace>...</listPlace>
         <listEvent>...</listEvent>
         <listBibl>...</listBibl>
      </back>
   </text>
</TEI>

Define new @xml:ids here (and up in your castList elements in plays, or in your TEI header for reading witnesses or handNotes, as appropriate). Work in references to these new entities throughout your text using hashtags (“#”) with @ref, @resp, @corresp, or @who as appropriate for your tagging: <persName ref="#newID">.
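Putting the two halves together, a new entity might be defined and referenced like this. The person, the identifier Example_Person, and the sentence in the body are all invented for illustration; #youridref stands for your own editor ID.

```xml
<text>
   <body>
      <!-- The reference in the running text uses a hashtag. -->
      <p>[...] a visit from <persName ref="#Example_Person">Mr. Example</persName> [...]</p>
   </body>
   <back>
      <listPerson type="hist">
         <!-- Proposed new entry: the xml:id is defined once, without a hashtag. -->
         <person xml:id="Example_Person">
            <persName>
               <surname>Example</surname>
               <forename>[forename]</forename>
            </persName>
            <note resp="#youridref">[What you have found out so far, and where you found it.]</note>
         </person>
      </listPerson>
   </back>
</text>
```

The value of @ref is the xml:id with a hashtag in front; nothing else about the two strings should differ, including capitalization.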

Good Resources for Researching Entries:

  • The ODNB (Oxford Dictionary of National Biography) database
  • Lord Byron and His Times (extensive prosopography resources on Mitford's contemporaries)
  • WorldCat
  • Virtual International Authority File (VIAF) (Ex: Mitford entry)
  • NCBEL and CBEL
  • See "Helpful Resources for Researching Your Notes and Entries" below

Link to publicly available resources when we can from our entries. Make links in TEI XML like this:

Version 1: Milestone-style self-closing pointer: <ptr target=""/>

Version 2: Alternate version to surround a block of text (like an html link):

<ref target="">Relevant text</ref>

7. Save and post your new file to the Digital Mitford Archive folder in which it belongs, and alert me with a comment (entering @e to bring up my e-mail address) that you’ve got a new file posted for us to review, with new entries for me to include in the site index. I will then enter your new entries, and the cycle continues.

III. B. People, Places, Books, Events, Flora, Quantities (Real and Fictional)

The TEI’s combination of elements and attributes makes it a very powerful cataloguing system, and we can adapt and streamline its many options readily for a very large project like ours that combines historical contexts as well as fictional ones.

 1. People, Fictional Characters, Archetypal Entities, and Named Creatures
    1a. Thinking about our Prosopography
    1b. Prosopography and Networks: Defining Relationships
 2. Organizations, any group or collective
 3. Places
 4. Book References in a text
 5. Dates, Events, Times
 6. Botany (Flora)
 7. Quantities (Numbers, Weights, Measures, Currency)

1. People, Fictional Characters, Archetypal Entities, and Named Creatures: We can actually use the same tag for all of these when referred to by some proper name: <persName>. Here’s our scheme for distinguishing kinds of entity names:

<persName type="hist">   an historic person from the "real" world

<persName type="hist" subtype="animal">   an animal with a name from the "real" world

<persName type="arch" ref="#a2">   an archetype or mythical entity, not precisely located in any one text or necessarily in any specific culture (although it might derive from specific texts). The classification of type="arch" should be determined by the usage in Mitford's texts.

<persName type="fict" subtype="int" ref="#a3">   a fictional character, within the world of this specific text that you are now coding

<persName type="fict" subtype="ext" ref="#a4">   a fictional character, from another text (whether by Mitford or another writer)

<rs type="person" ref="#a4"> Use to tag a reference when the individual is not actually named but referred to.

<rs type="fict" subtype="int" ref="#Bradshaw #Centinel"> If you need to refer to multiple characters or people at once, you can set up the @ref attribute to list each one separated by a white space. 

In the <back> of your file, develop lists of these to store unique @xml:id values for each, and to provide detailed information, using the <listPerson> element and its friends. Keep your lists separate according to the @type attributes you’ve used.
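Here is a sketch of how the typed tags in the text might line up with separate lists in the <back>. The sentences are invented, the bracketed text is placeholder wording, and the identifiers a1 to a4 are the same kind of dummy values used in the scheme above.

```xml
<text>
   <body>
      <!-- Invented sentences for illustration; a1 to a4 are placeholder identifiers. -->
      <p>A letter from <persName type="hist" ref="#a1">[name of a real person]</persName> arrived.</p>
      <p>She compared him to <persName type="arch" ref="#a2">[name of a mythical figure]</persName>.</p>
      <p><persName type="fict" subtype="int" ref="#a3">[character in this text]</persName> recalls
         <persName type="fict" subtype="ext" ref="#a4">[character from another text]</persName>.</p>
      <p>Later <rs type="person" ref="#a1">her old friend</rs> called again.</p>
   </body>
   <back>
      <!-- One list per @type value used in the text. -->
      <listPerson type="hist">
         <person xml:id="a1"><persName>[canonical name]</persName></person>
      </listPerson>
      <listPerson type="arch">
         <person xml:id="a2"><persName>[canonical name]</persName></person>
      </listPerson>
      <listPerson type="fict">
         <person xml:id="a3"><persName>[canonical name]</persName></person>
         <person xml:id="a4"><persName>[canonical name]</persName></person>
      </listPerson>
   </back>
</text>
```

Note that the <rs> reference and the first <persName> share one identifier, since both point to the same person.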

1a. Thinking about our Prosopography: Generating lists of persons and places allows us to store canonical information about each distinct entity we tag. These lists may start small but will get increasingly complex, and we’ll be devoting considerable research to tracking down information and thinking about how to represent significant personages and characters. Do we generate paragraphs of information, or a long list of detailed tags, or a hybrid of both? In our project, we've been making decisions about the level of complexity we need, and the following passages will help to orient you about our decisions. Let’s look at a couple of examples.

<listPerson> example 1: Detailed List Mode: [NOT our current method, but for illustration only!]
Here is an example I’ve cobbled together of a very complex entry using lots of tags in list format. (Note: This is MUCH more detailed than the practice we’re currently following for the site index on the Digital Mitford.)

<listPerson type="hist">
   <person xml:id="MarLou" sex="2">
      <persName><forename>Marie-Louise</forename> <surname>von Habsburg-Lothringen</surname></persName>
      <birth when="1791-12-12">born 12 December 1791, in Vienna, Austria</birth>
      <death when="1847-12-17">died 17 December 1847, diagnosis: pleurisy.</death>
      <roleName type="nobility" from="1810" to="1814">Empress</roleName>
      <roleName type="nobility" from="1814" to="1842">Duchess of Parma</roleName>
      <event type="marriage" when="1810-04-02">
         <label>marriage to <persName ref="#Napoleon">Napoleon</persName></label>
      </event>
      <state from="1810-04-02" notAfter="1815">
         <p>wedded to <persName ref="#Napoleon">Napoleon</persName> (April 2, 1810) after Josephine’s separation, nominally the regent of France during his absence and exile, sent by Napoleon back to Austria during his exile, and remained there afterwards.</p>
      </state>
   </person>
   <person xml:id="Napoleon" sex="1">
      <persName><forename>Napoleon</forename> <surname>Bonaparte</surname></persName>
      . . . (stuff on Napoleon) . . .
   </person>
   . . . more <person> info . . .
</listPerson>

This is pretty complicated, and took me some serious time to put together, not only to look up information about Empress Marie-Louise (and when she was Empress and what she was before and after that), but also to look up the appropriate TEI elements to use. As we’re editing texts and extracting people, places, books, etc, it may just be easiest to generate an xml:id and quickly write up a paragraph suggesting more work to be done in a later phase as we focus on formalizing a personography. So, here’s a second example that may be a little easier to deal with: 

<listPerson> example 2: Hybrid List and Paragraph

We need to understand that listPerson and other canonical "list" elements in TEI are rigidly structured like the teiHeader elements, and the detailed TEI tags tailored for <listPerson> are only available in list format. You can’t just write paragraphs of information and scatter them throughout, because tags like <birth> and <death> and others aren’t permitted inside <p> elements. We *are* permitted to write paragraphs inside a <listPerson>, though; it’s just that the paragraphs aren’t tagged in such detail. We can position these paragraphs inside a <note> element, and this makes sense for us to generate brief biographical or contextual commentary after the formal listing info. The use of such notes will also help us reduce our reliance on a strict listing system, so we don't need to use so many of the available list elements. This is the approach we've chosen for the Digital Mitford's site index: a hybrid of list elements with an annotation, like this:

<listPerson>
   <person xml:id="Bullock_Wm">
      <persName>
         <surname>Bullock</surname>
         <forename>William</forename>
      </persName>
      <birth when="1773">
         <placeName>Plymouth, Devon, England</placeName>
      </birth>
      <death when="1849-03-07">
         <placeName>Chelsea, England</placeName>
      </death>
      <occupation>naturalist</occupation>
      <occupation>antiquarian</occupation>
      <occupation>museum</occupation>
      <note type="bio" resp="#ebb">Collector and systematic organizer of museums, including the Liverpool Museum at <placeName ref="#EgyptianHall">Egyptian Hall</placeName> in Piccadilly, <placeName ref="#London_city">London</placeName>, which housed artifacts from <persName ref="#Cook_CaptJ">Captain Cook</persName>'s voyages that Bullock had acquired from other collections. An early British traveller to <placeName ref="#Mexico">Mexico</placeName> in <date when="1822">1822</date>, after <rs type="event" ref="#MexIndependence">Mexican independence in 1821</rs>, Bullock returned in 1823 with Mexican artifacts that he exhibited at Egyptian Hall, and published catalogs as well as <bibl><title>Six Months' Residence and Travels in Mexico</title> in <date when="1824">1824</date></bibl>. Between 1825 and 1825 he travelled again in Mexico and the <placeName ref="#USA">United States</placeName>, where he purchased an estate called The Elms or Elmwood near <placeName ref="#Cincinnati">Cincinnati</placeName> on the <placeName ref="#Kentucky">Kentucky</placeName> border, and laid out an unsuccessful but admired town plan called "Hygeia" that would become Ludlow, Kentucky. (ODNB)</note>
   </person>
   <person>. . . </person>
   . . .
</listPerson>

So, this way we can include paragraphs within note elements. With this model we could simply start with a paragraph inside a note, and apply tagging within the note to point to other entries or site files, or to point to stable and authoritative web resources outside our project. (Note: We are not including links to the ODNB since this is a proprietary resource, available only through paid subscription, but a simple mention will do.)
1b. Prosopography and Networks: Defining Relationships

There are a couple of ways we can define networks of relations. One is formal and a little challenging: using the <listRelation> element within a listPerson or other list group in our site index. The other is more organic and built into each of our entries.

  • OLD formal way that we've discarded, for practical purposes: The formal method is possible for us, but not really necessary. Here's how it would work:
As the last entry in a <listPerson> or other list group, define the relations among these people, or among entries in this list and those in other lists. The listRelation element is optional, but if we use it, it has to sit in the LAST position inside a listPerson element. In other words, it needs to sit last in a list, after the person elements.

<listPerson>
   <person> . . . </person>
   <person> . . . </person>
   <listRelation>
      <relation name="marriage" mutual="#MarLou #Napoleon"/>
      <relation name="parent" active="#MarLou #Napoleon" passive="#kid1 #kid2"/>
      <relation name="friendship" . . . />
      <relation name="????" . . . /> (Let’s think about the various values we could use here...)
      <relation name="emulation" active="#Napoleon" passive="#Ossian"/>
   </listRelation>
</listPerson>

Elsewhere, in <listPerson type="fict">, we identify:
   <persName>Ossian</persName>
and connect him with the book by Macpherson thus:
   <bibl ref="#Ossian"/>
pointing to an entry in a <listBibl>:
   <bibl xml:id="Ossian">
      <author xml:id="Macphers">Macpherson, James</author> (unless you refer to him elsewhere and develop him in a <listPerson>, in which case, give this an @ref="#Macphers" that points to an xml:id defined elsewhere)...
      <title>Ossian</title>
      . . .
   </bibl>
These examples help to show how we can make relationships within lists and between different lists in our site index.
  • Better Way for Us: In the Digital Mitford project, we've evolved a more efficient way of making these connections simply by coding our <note> elements within individual entries. Thus, a biographical note we write can indicate that an individual was a parent, spouse, sibling, friend, editor, etc. of another person on the index, and someone who wrote or was influenced by a particular publication we've referenced in the index. As our project is evolving and our site index developing, we seem not to need the TEI's elaborate listRelation markup. For example, here is our entry on Henry Fothergill Chorley:
 <person xml:id="Chorley_HF">                   <persName>                      <surname>Chorley</surname>                      <forename>Fothergill</forename>                      <forename>Henry</forename>                   </persName>                   <birth when="1808-12-15">                      <placeName>Blackley Hurst, Lancashire</placeName>                   </birth>                   <death when="1872-02-16">                      <placeName ref="#London_city">London</placeName>                   </death>                   <occupation>literary</occupation>                   <occupation>journalist</occupation>                   <occupation>music critic</occupation>                   <note resp="#ebb">Of Quaker parentage, Chorley worked unhappily in clerical positions and cultivated the arts as a music and literary critic publishing reviews of around 2500 books, weekly reviews of musical performances, and "columns of musical 'gossip'" for <title>The Athenaeum</title> beginning in <date from="1830" to="1868">1830 through 1868</date>, "the most prolific of all its reviewers," according to the ODNB. Reviewed <persName ref="#Hawthorne_N">Nathaniel Hawthorne</persName> and <persName>Charles Dickens</persName>, and promoted the compositions and operas of <persName>Rossini</persName>, <persName>Mendelssohn</persName>,<persName>Meyerbeer</persName>, and <persName>Gounod</persName>, though he disliked <persName>Verdi</persName>. <persName ref="#Hemans_Felicia">Felicia Hemans</persName> and <persName>E. T. A. Hoffman</persName> made lasting impressions on him. Wrote <bibl><title>Memorials of Mrs. Hemans</title>, in two volumes, published in <date>1836</date></bibl>. Served as editor of <title>The Ladies' Companion</title> in 1850 (after <persName>Jane Loudon</persName>), and wrote plays, novels, and short stories, though these did not receive much recognition. 
Correspondent of <persName ref="#MRM">MRM</persName>, as well as <persName ref="#Barrett_E">Elizabeth Barrett</persName>, Charles Dickens, and Arthur Sullivan. Edited the <bibl><date>1872</date> edition of Mitford's correspondence, <title>Letters of Mary Russell Mitford, Second Series</title></bibl>.</note>                </person>

This long-ish note for the site index helps demonstrate how an individual entry's <note> element serves pretty efficiently to define relationships of many kinds. Notice how the note is itself tagged and points to other entries across the site index. Not every tag in these annotations leads to another entry, and our Working Rule of Thumb is that we add new entries when they are shown to be relevant to Mitford's world. Thus, we don't yet have an entry for Charles Dickens, and we don't yet have an entry for E. T. A. Hoffman either, but we will certainly develop entries for these when they turn up in Mitford's writings or in our headnotes and commentary; when we do, we can easily locate their mentions in the site index and add @ref pointers.

Take these examples as a preliminary model, but as you face complexities, research the tagging in the TEI P5 Guidelines. Here’s a link to a handy list of every example of the <listPerson> element in use in the guidelines (which also contains examples of its use with <listRelation>). The examples link back to Chapter 13 on "Names, Dates, People, and Places." And, again, here’s a link to the TEI Wiki on Prosopography, which contains helpful examples.

2. Organizations, any group or collective:

<orgName>name of the group</orgName>
<rs type="org">when the group is being referenced, but not by name</rs>
<listOrg> (works like <listPerson> and works with <listRelation>)

Note: Any <listRelation> can associate people with organizations with books, etc. (See the TEI P5 Guidelines examples of listOrg for details.)

3. Places: Though the TEI offers many, many possibilities for coding names of places, from geographical features to districts, we have chosen to simplify our approach to place by not worrying about differentiations of place types in our markup. Thus, every name of a place that can be plotted on a map is simply to be coded as a <placeName>. We will use @type and @ref attributes to point to distinctive information about each place, and the use of our site index makes this a simple and effective strategy to streamline our work and minimize the potential for conflicting judgment calls on categorizing the place names we see as we are coding. Thus London, the River Thames, the Temple Bar, and Mount Aetna are all simply tagged with <placeName> in our project.
For our Site Index entries (and <back> list entries to be added to the Site Index), we are developing lists of historical and fictional (or imaginary) places: <listPlace type="hist"> and <listPlace type="fict">. Here, too, we can streamline our markup, without worrying over the many possible elements we can use in a listPlace. All entries need to feature at least one canonical <placeName>. In addition, we'd like our entries in <listPlace type="hist"> to hold latitude and longitude coordinates to facilitate our mapping them, though we've only just begun entering this information. Wherever possible, include detailed information about the place's relevance to Mitford and her world in a <note> in the entry. Here's a sample entry for the Site Index:

<listPlace>
   <place xml:id="Waterloo_Belgium">
      <placeName>Waterloo battlefield</placeName>
      <location>
         <geo>50.683333 4.4</geo>
      </location>
      <note resp="#ebb">Location of the Battle of Waterloo, near <placeName>the municipality of Waterloo, Belgium</placeName> and 15 kilometers south of <placeName>Brussels</placeName>.</note>
   </place>
</listPlace>

4. Book References in a text:

a. Quotations and epigraphs that Mitford includes in her texts: See examples for tagging with <epigraph><cit><quote> and <bibl> together.

b. Use a title element alone when you’re seeing a title of ANY kind: I was reading in the <title>London Magazine</title> . . . When you’re ready, add an @ref attribute pointing to the distinct @xml:id of the London Magazine that is in (or will go into) our site index:
<title ref="#LondonMag_per">London Magazine</title>

Where you see a mention of an author together with a title, use a <bibl> element as a loosely structured sort of bibliographic citation, and wrap it around the <author> and <title> information, to bundle this all together: <bibl><author>Miss Edgeworth</author>'s <title>Popular tales</title></bibl>

Where you see a mention of a book but ONLY through the author’s name, use a bibl and an author element together, so you’re indicating there’s a book being referred to, but you’re missing the title, like this: I was reading <bibl><author>Burke</author></bibl> the other day.

When you see a passing reference to a text title or some part of a titled publication within the text you’re coding, tag it with either a <title> or a <bibl> element: <title ref="#Gull">Gulliver’s Travels</title> or <bibl corresp="#Gull">that book, Gulliver</bibl>

<listBibl>
   <bibl xml:id="Gull">
      <title>. . . </title>
      <author ref="#Swift">. . .</author>
      . . .
   </bibl>
</listBibl>
(more detail: See the examples of listBibl in the TEI Guidelines.)

5. Dates, Events, Times:
<name type="event" ref="#distinctid"> name of an event </name>
<rs type="event" ref="#distinctid"> reference to an event </rs>
and <listEvent>
See examples of event and listEvent in the Guidelines.

<listEvent>
   <event type="battle" xml:id="Waterloo" when="1815-06-18">
      <label>Battle of Waterloo</label>
      <desc>What happened</desc>
      <note resp="#ebb"><p>Want to add something in paragraphs?</p></note>
   </event>
   <event type="riot" xml:id="riot1795" when="1795">
      <label>Food Riots in 1795</label>
      <desc>A poor harvest led to rioting. . .</desc>
   </event>
</listEvent>

When dates and times are listed, use these tags:
 <date when="1772-03-06">March 6th, 1772</date>

(Somehow I think we won’t see many readings of seconds!)

Spans of dates, and spans of hours: <date from="1770" to="1771"> or <date from="1770-03-06" to="1771-02-25">
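
As a hedged illustration of the date and time tagging described above, the sketch below pairs each tag with some running text. The time value and the wording inside the elements are invented for illustration and are not taken from Mitford's letters. The @when, @from, and @to attributes are standard TEI, and their values use zero-padded ISO forms (YYYY-MM-DD for dates, hh:mm:ss for times).

```xml
<!-- A single date: @when holds the normalized ISO value -->
<date when="1772-03-06">March 6th, 1772</date>

<!-- A time of day (invented example), written as hh:mm:ss -->
<time when="14:30:00">half past two in the afternoon</time>

<!-- A span of dates: @from and @to mark the two endpoints -->
<date from="1770-03-06" to="1771-02-25">from March 6th, 1770 to February 25th, 1771</date>
```

Whatever form the date takes in the text itself, the attribute value keeps the same normalized shape, which is what lets us sort and search dates later.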

6. Botany (Flora):   Here's how to code the named plants important in Mitford's worlds, and document them for the site index:

<rs type="plant" ref="#DAISY">

<list type="plants">                                <item xml:id="China_Aster">                   <name>China Aster</name>                   <note resp="#ebb">One of <persName ref="#MRM">Mitford</persName>'s favorite flowers, blooms in autumn in <placeName ref="#Berkshire">Berkshire</placeName></note>                </item>                             </list>

(Note: This is a simplification of the more elaborate tagset with "nym" that I originally wrote here. listNym doesn't permit notes, and our notes in the site index are valuable and important--so let's code them this way.)
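
To connect the in-text tag with the site index list above, here is a minimal sketch of a reference in running text that points to the China_Aster item. The sentence is invented for illustration; only the @type value and the xml:id come from the examples above.

```xml
<!-- In the text being coded: @ref is the xml:id of the list item, prefixed with # -->
<p>The <rs type="plant" ref="#China_Aster">China asters</rs> are still in bloom.</p>
```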

7. Quantities: Numbers, Weights, Measures, Currency

For anything presented in measurements and quantities, there's a convenient TEI element called <measure>. This takes several attributes, but in most cases what we really need is the @type attribute to mark the kind: For money, the standard TEI tag is: <measure type="currency">one hundred pounds</measure>. 
We can add @unit, @quantity, and @commodity to this, but the @type attribute should be sufficient for our purposes. If we want to study how Mitford represents quantities later, we can search for these tags and elaborate on them. See detailed examples of the usage of measure in the TEI Guidelines.
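
For reference, here is a sketch of how the optional attributes might look alongside the minimal form. The attribute names are standard TEI, but the particular values (type="weight", unit="pounds") and the tea example are assumptions made for illustration, not project conventions.

```xml
<!-- The minimal form we ask for: @type only -->
<measure type="currency">one hundred pounds</measure>

<!-- Optional elaboration: @quantity is a number, @unit names the unit -->
<measure type="currency" quantity="100" unit="pounds">one hundred pounds</measure>

<!-- @commodity names the thing being measured (invented example) -->
<measure type="weight" quantity="2" unit="pounds" commodity="tea">two pounds of tea</measure>
```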

III. C. Editorial Notes: When and How to Code Them

Notes are coded directly in place at the point in the text where they’re inserted. So imagine in a print text where you’d see the mark signalling a footnote, like this.(1) In a printed publication that note would lead your eye down to the bottom of the page to a “footnote” or to the back of the text to an “endnote.” No such physical placement is necessary or called for or even desirable in an XML document. Notes are inserted directly inline like this:

<lg> . . .
   <l>Though scarce the lamp <note resp="#ebb">This lamp bears a mythic association with the one held by <persName ref="#Psyche">Psyche</persName> over the sleeping <persName ref="#Cupid">Cupid</persName>.</note> can pierce the gloom,</l>
   <l>That shrouds a high and stately room,</l>
   <l>Its light a bending fair one shows;</l>
   <l>A man, who snatches short repose;</l>
   <l>And while <placeName ref="#StCloud">St. Cloud's proud walls</placeName> scarce catch the beam,</l>
   <l><persName ref="#MarLou">Louisa</persName> wondering, marks Napoleon's dream.</l>
</lg>

    • Remember, this isn’t presentation markup, and it’s actually best to keep notes embedded in the structure of the XML together with what they comment on. Notice how we continue to apply contextual markup inside the body of the note.

Two Kinds of Notes:
1) Author notes: written by Mitford and placed in the copy-text. We use attributes to indicate a) that the note is by the author of the text, and b) whether the note is anchored, or *signalled* (1), in this spot. (Even if the note appears elsewhere, we code the note at the insertion point, in the body of the text, just as in the example above.) Here’s an author note, with some kind of anchor (doesn’t matter what kind) in the text:

<note type="author" anchored="true">. . . . </note>
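
As a rough sketch of the two cases, the example below shows an author note sitting at its insertion point, once with a mark in the copy-text and once without. The sentences are invented for illustration and are not Mitford's; @type and @anchored are used as described above.

```xml
<!-- The copy-text has a footnote mark at this spot -->
<p>We walked as far as the old mill<note type="author" anchored="true">The mill was pulled down the following year.</note> before turning home.</p>

<!-- The copy-text prints a note for this passage but gives no mark in the text -->
<p>The elm by the gate has fallen.<note type="author" anchored="false">It stood there for three generations.</note></p>
```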

(If Mitford’s notes are numbered, we can number them later by collecting all these and running a program to count them.) We can also preserve the number in an @id attribute during autotagging. We can program many different ways to visualize such a note, whether in a floating box or along a margin, or the viewer’s choice.

2) Editorial notes: written by you, the editor, as you’re researching and coding. Our notation for this is simple: we’ll use hashtags to identify each of you editors by your initials or some other notation that can’t be used for anyone else. We’ll define your distinct @xml:id value up in the TEI header. An editorial note by me looks like this: <note resp="#ebb"> . . . . </note> See the TEI’s examples of <note> markup.

When to Annotate:
Since we are coding identifications of people, places, events, and other “things" or “nyms," we can minimize our use of notes, and limit these to situations where we need to explain something more than we can hold in our tagging system. We may want a note to comment on something really distinct to a particular text--say an odd use of a word or phrase that won’t fit in our tagging system. Remember, inside the note, we can still write markup to point to persons, places, events, etc. We can also point to or cite outside resources, like this: A link to <ref target="">the biographies page in the Southey letters archive</ref> would look like this. To make a bibliographic citation, to a source that we're including in our site index, use the <cit> and <bibl> tag set, like this: <cit><bibl ref="#whateverID"><title>Poetic Castles in Spain</title><author>Diego Saglia</author></bibl></cit>

What Kinds of Research should go into Editorial Notes, (or into new entries for the Site Index)? 
Try to track down Mitford's unspecified references to people, texts, places, publications, works of art. Where the information you find is specific to the context of the particular Mitford text you're editing, this is where an editorial note is called for (<note resp="#ebb">....<cit><bibl>....</bibl></cit>). Otherwise the new information should go into new entries for the site index (and indeed you may have occasion to write both a specialized note for your text and a generalized entry for the site index.)

For example, in her letter of 4 October 1820, Mitford mentions an unspecified painting that Haydon is taking to exhibit in Edinburgh, and says that this exhibition will serve as a reply to Blackwood's Magazine. As an editor of this letter, you'll probably start by tagging <placeName ref="#Edinburgh">Edinburgh</placeName> and <title ref="#Blackwoods">Blackwood's</title>. But you should also do some research to identify the painting, to find more information on the exhibition, and to find out whether there was a piece in Blackwood's on Haydon's work at some recent point before Mitford's letter was written.

How do you look for such information? There are many ways to do this, and you might begin with a web search on proper names and dates, to turn up sources that comment on Haydon going to Edinburgh in 1820, for example, and which painting he was bringing with him. There will be a number of questionable biographic resources, so be wary of sketchy source attributions and try to track down primary sources whenever you can. Consulting a detailed and intensive biographical resource such as Benjamin Robert Haydon: Correspondence and Table-Talk, which presents his letters alongside discussion of his life, will tell you not only what painting he was taking with him to Edinburgh, but also about his reception there and something of how a mean-spirited anonymous Blackwood's correspondent had recently lumped Haydon with a "Cockney School" of Keats and Hunt. Don't stop there. See if you can locate the article in Blackwood's that made that association. Blackwood's is available in Google Books, and a Google search on a quoted passage from the article quickly turns up the article itself in its particular issue.

Ultimately, when we finished our research for these references in Mitford's letter, we produced a specialized editorial note for *this particular letter*, as well as more general entries for the Site Index. Here's what we tagged and annotated in the letter. New entries for the Site Index are marked in bold purple: 

And <rs type="art" ref="#ChrstEJrslm_Haydon">your picture</rs> is really going to <placeName ref="#Edinburgh">Edinburgh</placeName>! What an answer to <title ref="#Blackwoods">Blackwood's Magazine</title>!<note resp="#ebb"><persName ref="#MRM">Mitford</persName> refers to Haydon's painting, <title ref="#ChrstEJrslm_Haydon">"Christ's Entry Into Jerusalem"</title> which he exhibited in <placeName ref="#Edinburgh">Edinburgh</placeName> in <date when="1820-12">December 1820</date>. See Haydon's letter to <persName ref="#Beaumont_Sir_Geo">George Beaumont</persName> of <date when="1820-12-26">26 December 1820</date> in <bibl corresp="#Haydon_Corresp"><title>Benjamin Robert Haydon: Correspondence and Table-Talk</title><biblScope unit="vol">1 of 2</biblScope><biblScope unit="page">350</biblScope></bibl>. <bibl>The mention of <title ref="#Blackwoods">Blackwood's</title> likely recalls the <date when="1818-08">August 1818</date> issue in which <persName ref="#Haydon">Haydon</persName> was attacked by <author>the anonymous "Z."</author> who had been lambasting the <persName ref="#Hunt">Hunt</persName> and <persName ref="#Keats">Keats</persName> circle as the "<orgName ref="#CockneyS">Cockney School</orgName>."</bibl> The <date when="1818-08">August 1818</date> article had described <persName ref="#Haydon">Haydon</persName> as <quote>"that clever, but most affected artist, who as little resembles <persName ref="#Raphael">Raphael</persName> in genius as he does in person, not-withstanding the foppery of having his hair curled over his shoulders in the old Italian fashion."</quote></note>. . .

In the editorial note above, I've identified the specific contexts, and though I located much of the information in secondary sources, I was able to "drill down" to primary sources for precise references to Haydon's own correspondence and the famous Blackwood's article of Oct. 1817 (which had been misdated in the secondary source I first discovered). It took a little time to work out all this, complete with tagging and @ref / @corresp hashtag attribute values, because we also needed to add several entries to the Site Index. Here are some of the Site Index entries as I added them, so you can see something of the move from specific to more general contextual information for our prosopography writing:

Sample Site Index Entries (Added in the Process of Developing the Editorial Note Above): 

<listOrg type="hist">                <org xml:id="CockneyS">                   <orgName>the Cockney School</orgName>                   <note resp="#ebb">Satirical term coined by <bibl>an anonymous <title ref="#Blackwoods">Blackwood's</title> article of <date when="1817-10">October 1817</date></bibl> targeting a circle of intellectuals, writers, and artists specifically including <persName ref="#Keats">John Keats</persName>, <persName ref="#Hazlitt_Wm">William Hazlitt</persName>, <persName ref="#Hunt">Leigh Hunt</persName>, and <persName ref="#Haydon">Benjamin Robert Haydon</persName>.</note>                </org> </listOrg>

<listPerson type="hist">
   <person xml:id="Hunt">
      <persName>
         <forename>James</forename>
         <forename>Henry</forename>
         <forename>Leigh</forename>
         <surname>Hunt</surname>
      </persName>
      <persName>Leigh Hunt</persName>
      <birth when="1784-10-19">19 October 1784 <placeName>Southgate, Middlesex</placeName></birth>
      <death when="1859-08-28">28 August 1859 <placeName>Charles Reynell's home on Putney High Street, London</placeName></death>
      <note resp="#ebb">Founding editor (<date from="1808" to="1821">from 1808 to 1821</date>) of the radical weekly journal, <title ref="#Examiner">The Examiner</title>, which advocated for parliamentary and military reform and Catholic emancipation. Hunt was prosecuted and imprisoned for libel from <date from="1813" to="1815">1813 to 1815</date> for his negative depiction of <persName>the Prince Regent</persName> in <bibl corresp="#Examiner">the issue of <date when="1812-03-22">22 March 1812</date></bibl>. Hunt published <persName ref="#Shelley_PB">Shelley</persName>'s and <persName ref="#Keats">Keats</persName>'s poems in <title ref="#Examiner">The Examiner</title>, and came to be associated after an article in <bibl corresp="#Blackwoods">the <date when="1817-10">October 1817</date> issue of <title ref="#Blackwoods">Blackwood's Magazine</title></bibl> with the "<orgName ref="#CockneyS">Cockney School</orgName>" of poetry.</note>
   </person>
</listPerson>

Helpful Resources for Researching Your Notes and Entries:

Lord Byron and His Times : Try their Search Persons feature.

Southey Letters (and other resources on Romantic Circles)

the Oxford Dictionary of National Biography (ODNB) database (proprietary resource, available by university subscription).
 WorldCAT to track down specific editions, titles, translations of texts available to Mitford.

NCBEL and CBEL (Cambridge Bibliography of English Literature): Useful print resource for tracking down published texts in periodicals, usually available in library reference sections. 
 Our Resources on Contexts Folder in Box: We've uploaded biographies and memoirs, resources on 19th-c. theater, resources on Mitford's Reading's parliamentary elections and Berkshire maps, guidance on how to read postmarks, and more in this useful stash of files and links.   
 Published editions of letters of Mitford and her circle (our Box Stash). Look for the indexes in the backs (of most) of these texts. These are really helpful for tracking down a proper name or event. In particular, check out the Haydon Correspondence and Table-Talk volumes in Box as a helpful resource because it contains many letters from Haydon to Mitford in vol. 2, and Haydon's letters in both volumes will help us greatly as a contextual resource.

any other print or digital collections of letters or literary biographies on the people and contexts in Mitford's world. (Please add useful resources to this list as we work!)

III. D. Variant Texts: Critical Apparatus Markup

TEI markup helps us document some very interesting differences between publications of a text, or differences between ms and publication (as I've done with a couple of versions of Julian). For this we use apparatus markup, using the elements <app> and <rdg> with @wit. You’ll see this modelled in our xml for Mitford’s Julian (posted on our site). We don’t handle this with notes because the app and rdg markup gives us an efficient way to represent versions.

<app>
   <rdg wit="#ABC">. . .</rdg>
   <rdg wit="#EFG">. . .</rdg>
</app>

In our letters, when we set ourselves against 19th-c. editors who have drastically cut or altered (bowdlerized) Mitford’s voice, we can identify ourselves as the “lemma" or more authoritative witness:

<app>
   <lem wit="#EBB">. . .</lem>
   <rdg wit="#EFT">. . .</rdg>
</app>
Witnesses are identified with @xml:id in the TEI Header, within the <sourceDesc> element.

<listWit>         <witness xml:id="ABC">Amos B. Cottle</witness>         <witness xml:id="EFG">Edward F. Granger</witness> </listWit> Eventually we move these @xml:ids into the site index (si.xml) and define them using @corresp="#ABC" .
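
To see how the witness list and the apparatus entries fit together, here is a minimal sketch of one point of variation inside a line of verse. The witness ids ABC and EFG are reused from the listWit example above; the line of verse and its two readings are invented for illustration.

```xml
<!-- In the TEI header, inside <sourceDesc>: declare each witness once -->
<listWit>
   <witness xml:id="ABC">Amos B. Cottle</witness>
   <witness xml:id="EFG">Edward F. Granger</witness>
</listWit>

<!-- In the text: one <app> per point of variation, one <rdg> per witness -->
<l>The <app><rdg wit="#ABC">silent</rdg><rdg wit="#EFG">silver</rdg></app> moon rose late.</l>
```

Each @wit value is the witness's xml:id with a # in front, in the same way that @ref values point into the site index.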

E. Coding Quotes of Various Kinds

What to do when you encounter quotation marks or quoted material?

* Always keep in view the purpose of the quotation marks, rather than the specific rendering of them. Use the appropriate TEI element to indicate the kind of quotation this is. For example, in a letter, Mitford might quote extensively over multiple lines of text, and at the start of each line give a set of quotation marks. We do NOT need to reproduce these extra quotation marks any more than we need to reproduce the line breaks in the letter. Instead, we surround quoted material with the appropriate element indicating the kind of quotation this is. Use Mitford's opening and closing set of quotation marks, but don't worry about rendering those at the start of each line in the manuscript.

When to render actual quotation marks (". . . ") in your text: In general, render Mitford's quotation marks if she uses them (removing the extras at the head of each line in a manuscript letter). When she doesn't use quotation marks, don't you use them either!

  • We might encounter quoted or spoken material that is not signaled with quotation marks. Again, this needs to be tagged as quoted or spoken material of some kind, regardless of the punctuation in the text. Following are the elements and attributes we use for quoted or spoken passages of various kinds: 

1. <quote> and <quote> with <epigraph>, <cit> and <bibl>: This is the element to indicate quoted material from another text, if, say, Mitford is quoting from a book, or someone's letter to her. When Mitford doesn't give a source for her quote but you can identify it, use @corresp to point to a source with an xml:id (@corresp="#id"). Or use an editorial note if you're speculating on a possible source.

        Example, where editor supplies the source: 
 <quote corresp="#whateverId">Et tu, Brute?</quote>

Where Mitford does give a source, expand the tagging to include <cit> (for citation). <cit> with <bibl> is also how we cite bibliographic resources in editorial notes and headnotes.  

Example, where our text supplies the source. This particular example is an epigraph, or a quotation that prefaces a chapter or book title: 
       <epigraph>  <cit>    <quote>     <l>Since I can do no good because a woman</l>     <l>Reach constantly at something that is near it.</l>    </quote>    <bibl ref="#whateverID">     <title>The Maid's Tragedy</title>     <author>Beaumont and Fletcher</author>    </bibl>  </cit>       </epigraph>

2.  Use the element for common sayings, or for "scare quotes" that don't have a specific source.  "And they lived happily ever after," like all the old stories end.

3. <said>, with @who / @aloud="false" / @direct="false": In all kinds of texts except drama (which has its own tag set for character speeches), use <said> for passages indicating spoken utterances of particular individuals, whether fictional characters or historical people. Use @who to point to the id of the person in our site index (@who="#whateverID"). The other attributes only need to be used when their values are false:

<said who="#whateverID" aloud="false"> means that this speech act is NOT spoken aloud: something said to oneself </said>.

<said who="#whateverID" direct="false"> means this speech act is indirect, not signalled explicitly with quotation marks:

She claimed that <said who="#whateverID" direct="false">Queen Caroline had behaved in a most execrable manner</said>.

F. Coming: Our Village: Special Markup

About the Newbook Autotagger


It’s easy to miss out the forward slash or an angle bracket, or even to forget to close the tag completely, when typing tags like


repeatedly through a large document. A single error can invalidate output. In order to minimize inevitable human error and ensure tagging standardization, the Newbook Tech Team have developed what we have called an AUTOTAGGER. We give the Autotagger simple keystrokes or patterns of text as signals, and it replaces these with valid XML markup. For example, the [TAB] key tells the Autotagger to insert paragraph tags wherever it encounters a [TAB].

Preparing the Andrews Diaries for the Autotagger

We’re finally ready to start tagging the Diaries for structure, i.e. how the diaries are laid out on the page. This process will define page and section breaks, headers, paragraphs and line breaks. This is distinct from tagging content, which will include textual features such as deletions, additions, misspellings, graphics, people’s names, and hotel, ship and geographic names. We will tag the content by hand!

EBA Structural Tags (go to the end of this document for a crib sheet)

To get started, you will need to open up the plain text copy (.txt) of the diary volume you are working on as well as the PDF of the original Diary. You’ll use the latter to figure out where to put line breaks, and to count lines to figure out the correct line # for Headers.

Here’s what you’ll need to type in to the .txt document to define its structural features:

1. Header


Line #: header  Line #: header

(please note that we tab in before typing Text etc)

Put the actual line number for the header. This will mean counting lines down the original PDF diary page.

EXAMPLE: Text: Line 8: Shepheards Hotel Cairo - Egypt. Dec. 12. 1889.

2. Page Number Page #:

Note the capital P, then number, then colon. No space between number and colon. Number pages consecutively.

EXAMPLE: windows we look upon a spacious court, with palms and trop-

Page 3:

ical trees, among which the great black and grey crows are flying and cawing all the time.

3. Paragraph [tab] at the beginning of each paragraph

EXAMPLE: As a member of his family, Mrs. Andrews accompanied Mr. Davis on his annual visits to Egypt for a period of more than twenty years, and the charming description which she gives of their river-life on the “Bedawin”, familiar to many of us who enjoyed their hospitality, is certainly worthy of a wider public and more permanent form in print, -- though she could not be prevailed upon to consider this.

4. Line Break


To mirror the line breaks on the original file, please hit [RETURN] in the relevant place as you are reading through the transcribed .txt file. The Autotagger will replace this with <lb n="1"/>, <lb n="2"/>, <lb n="3"/> etc.

Testing your File with the Autotagger

1. Go here:

2. Select ‘Choose File’ and upload the .txt document you’ve been working on.

3. Select ‘Upload transcription file’ and the autotagger will work its magic. If all goes well you will see this screen: 

4. If there’s a problem you will get an error screen, helpfully indicating which line(s) of your .txt document contain the problem(s). Go back to your file, correct the errors, and re-upload.

5. Once your output is error-free, please upload the file into the Google Drive folder named ‘AUTOTAGGER OUTPUT’. Name the file in this way: YOURNAME-DIARYVOLUME-TODAYSDATE, e.g. ketchley-volume2-12feb2014

6. I will look over all the XML files to check they’re valid and also take a look at them as HTML output to make sure everything looks the way it should. Once complete, these files will move through the heavily guarded gateway onto the Staff Pages ready for phase #2: hand tagging for content.

Autotagger Crib Sheet



Line #: header  Line #: header

Page Number Page #:

Paragraph [tab] at the beginning of each paragraph

Line Break [return]

How to Work with EBA Databases

How to Work with Omeka