The Emma B. Andrews Diary Project



Technology

Overview

The tools and technology used and developed by the Emma B. Andrews Diary Project, a partner project of Newbook Digital Texts (NDT), make up a “clear path to output” process for texts prepared in the Text Encoding Initiative (TEI) format. The goal of this process is to produce well-formed, valid, structured data from literary, historical, pedagogical, and other sources, in multiple languages and scripts, that are not readily available in print.

The Emma B. Andrews Diary Project and NDT use freely available open-source tools, tools developed by NDT, and commercial software that is free for non-commercial projects. Together these support a three-phase process:

  1. Producing generic auto-tagged texts as TEI-conforming XML output. These texts are broadly tagged for structural elements.
  2. Customizing the TEI-encoded text for individual projects.
  3. Processing TEI-tagged texts to create standardized output for web and print.
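The first phase can be illustrated with a minimal Python sketch. It is not the NDT Autotagger (which is Perl-based and whose rules are not described here); the function name `autotag`, the placeholder header text, and the rule "one blank-line-separated block becomes one `<p>`" are all assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def _sub(parent, tag, text=None):
    """Append a TEI-namespaced child element and return it."""
    child = ET.SubElement(parent, f"{{{TEI_NS}}}{tag}")
    child.text = text
    return child


def autotag(plain_text, title):
    """Wrap a plain-text transcript in a minimal TEI skeleton (hypothetical helper)."""
    ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace
    tei = ET.Element(f"{{{TEI_NS}}}TEI")

    # Minimal header; the placeholder sentences are assumptions, not project text.
    file_desc = _sub(_sub(tei, "teiHeader"), "fileDesc")
    _sub(_sub(file_desc, "titleStmt"), "title", title)
    _sub(_sub(file_desc, "publicationStmt"), "p", "Unpublished working draft.")
    _sub(_sub(file_desc, "sourceDesc"), "p", "Generated from a plain-text transcript.")

    # Structural tagging only: each blank-line-separated block becomes a <p>.
    div = _sub(_sub(_sub(tei, "text"), "body"), "div")
    for block in re.split(r"\n\s*\n", plain_text.strip()):
        paragraph = " ".join(block.split())  # collapse line breaks and extra spaces
        if paragraph:
            _sub(div, "p", paragraph)

    return ET.tostring(tei, encoding="unicode")


if __name__ == "__main__":
    # Invented sample transcript, for illustration only.
    print(autotag("First entry of the\ntranscript.\n\nSecond entry.", "Sample transcript"))
```

A real auto-tagger would recognize more structure (dates, headings, page breaks); the sketch only shows the general shape of turning plain text into broadly tagged TEI that can be refined by hand in the second phase.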

Tools and Samples: The Newbook Process

  • Generic auto-tagging tools simplify structural tagging, so that Text Encoding Initiative (TEI) conforming XML input can be created by, for example, student interns.
  • The resulting generic TEI texts can then be hand-coded, using specific, additional TEI tags, to meet the requirements unique to individual projects. The Emma Andrews Project has hand-coded texts for names of people, places, boats, Egyptological finds and European artwork.
  • XSLT processors convert the TEI-tagged texts into standardized output for web and print: XHTML/HTML5, PDF, and EPUB.
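To make the last two steps concrete, here is a small Python sketch that turns hand-coded TEI paragraphs into HTML. It uses the standard library's ElementTree as a stand-in for an XSLT processor, so it is not the project's XSLT; the function names, the `class` attribute convention, and the sample sentence are assumptions for illustration. `persName` and `placeName` are standard TEI elements for names of people and places.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"


def _to_html(elem, html_tag):
    """Copy a TEI element into an HTML element, keeping its mixed content."""
    out = ET.Element(html_tag)
    out.text = elem.text
    for child in elem:
        local_name = child.tag.split("}", 1)[-1]  # drop the namespace part
        span = _to_html(child, "span")
        span.set("class", local_name)  # e.g. class="persName"
        span.tail = child.tail  # text that follows the inline element
        out.append(span)
    return out


def tei_to_html(tei_xml):
    """Render the <p> elements of the TEI body as an HTML <body> fragment."""
    root = ET.fromstring(tei_xml)
    html_body = ET.Element("body")
    tei_body = root.find(f".//{TEI}body")
    if tei_body is not None:
        for p in tei_body.iter(f"{TEI}p"):
            html_body.append(_to_html(p, "p"))
    return ET.tostring(html_body, encoding="unicode", method="html")


if __name__ == "__main__":
    # Invented sample, not taken from the diaries.
    sample = (
        '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><div>'
        "<p>Called on <persName>Mr. Smith</persName> at "
        "<placeName>Luxor</placeName>.</p>"
        "</div></body></text></TEI>"
    )
    print(tei_to_html(sample))
```

Keeping the TEI element name as an HTML class is one simple way to let a stylesheet highlight people and places differently; an XSLT stylesheet would express the same mapping as templates.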

Further details are available on GitHub in the Newbook Repository and the Emma B. Andrews Repository.

Open Source Tools

The software tools listed below are readily available from sources on the Internet. Scripts and document samples developed by NDT can be downloaded from this site.

  • UTF-8 Editors: Notepad++ (Windows), TextWrangler (OS X), vi, Emacs
  • xmllint (Unix/Linux): DETECT errors in XML output
  • NDT Autotagger (Perl-based): CONVERT plain text transcripts to TEI-XML
  • XSLT scripts: CONVERT valid TEI-XML to XHTML/HTML5, LaTeX, tag-set lists
  • TeX Live/MiKTeX: CONVERT LaTeX sources to PDF
  • validator.w3.org: markup validation service
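As an illustration of the error-detection step, the following Python sketch reports whether an XML string is well-formed, roughly what running `xmllint --noout` on a file checks. It uses Python's standard library rather than xmllint itself, the function name `check_well_formed` is hypothetical, and it does not validate against a TEI schema.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (ok, message); checks well-formedness only, not schema validity."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position  # location reported by the parser
        return False, f"line {line}, column {column}: {err}"
    return True, "well-formed"


if __name__ == "__main__":
    # The second string has an unclosed <text> element.
    for candidate in ("<TEI><text/></TEI>", "<TEI><text></TEI>"):
        ok, message = check_well_formed(candidate)
        print("OK   " if ok else "ERROR", message)
```

Well-formedness is only the first check; confirming that a document is valid TEI additionally requires validating it against a TEI schema.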

We also use the oXygen XML Editor under an academic license.


Other Project & DH Tools

Content Management Systems & Databases

Timelines/Storymaps

Data Visualization

Text Analysis

Mapping

Transcription

Project Management & Documentation