LEAF Commons: Using Linked Tools for Digital Editorial Production

This half-day in-person workshop introduces textual scholars and practitioners to the LEAF Commons tool suite, a set of web-based, easy-to-use tools that support text encoding, named entity recognition, web annotation, text analysis, and publication without requiring users to learn complex coding languages. The tools are interoperable, so users can move easily from one to another as their needs change. This freely available suite supports digital scholarly workflows for the collaborative production and publication of scholarly and documentary texts, editions, and collections on the web. It requires no software installation and promotes best practices for text encoding, annotation, and metadata standards. The tools can be used individually for specific purposes or combined in an end-to-end workflow that begins with transcriptions, born-digital texts, or the output of optical character or handwriting recognition systems and ends with publication on the web, so the suite serves a wide range of research and pedagogical uses.

LEAF stands for the Linked Editing Academic Framework, a collaborative software suite that provides a set of modular tools for text editing and publication. The LEAF Commons tools constitute an accessible, low-barrier, no-cost infrastructure for the production of online texts, editions, or collections, whether for teaching or for undertaking research and collaboration on a sustainable basis. The Commons makes LEAF tools freely available in the browser, enabling collaboration and publication through GitHub, in addition to permitting local storage. Promoting the reuse of data in keeping with the FAIR (Findable, Accessible, Interoperable, and Reusable) data principles, LEAF uses open-source software, open-access platforms, and open international standards for best practices in text encoding (TEI-XML) and web annotation (RDF).

LEAF promotes best practices, reuse, and sustainability in the production of digital scholarship in the humanities broadly by designing tools to bridge the gap between scholars who have coding and encoding experience, and those who do not. LEAF Commons offers communities of researchers, teachers, and students the opportunity to take part in digital knowledge production and open collaboration. The workshop will end with an open discussion about pursuing such forms of open knowledge production and collaboration.

LEAF-Writer: an open-source, open-access Extensible Markup Language (XML) editor that runs in a web browser and offers scholars and their students a rich textual editing experience without the need to download, install, and configure proprietary software, pay ongoing subscription fees, or learn complex coding languages. This user-friendly editing environment incorporates Text Encoding Initiative (TEI) and Resource Description Framework (RDF) standards, meaning that texts edited in LEAF-Writer are interoperable with other texts produced by the scholarly editing community and with other materials produced for the Semantic Web. Through the NERVE tool, it also supports Named Entity Recognition and the reconciliation of entities with linked open data identifiers. LEAF-Writer is particularly valuable for pedagogical purposes, allowing instructors to teach students best practices for encoding texts without also having to teach them how to code. It is designed to help bridge the gap between those who have coding experience and those who do not by providing access to all who want to engage in new and important forms of textual production, analysis, and discovery.

NERVE: (the Named Entity Relationship and Vetting Environment) is an application that performs Named Entity Recognition (NER) on machine-readable texts, allowing users to identify candidate entities in a document and then review and correct the results. NERVE suggests relevant Uniform Resource Identifiers (URIs) for entities, so users can reconcile data to an authority such as Wikidata or the Virtual International Authority File to provide the basis for Linked Open Data (LOD) Web Annotations. Users can export their reconciled data in TEI-XML or HTML formats to an online repository or to their desktop. NERVE can be used from within LEAF-Writer, or as a stand-alone tool.
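To make the reconciliation step concrete, the following is a minimal Python sketch of what attaching authority URIs to already-tagged entities in a TEI fragment might look like. It is not NERVE's implementation: the function name `reconcile`, the sample sentence, and the `example.org` URIs are hypothetical placeholders standing in for choices a user would make against an authority such as Wikidata or VIAF.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace

# Hypothetical fragment in which candidate entities are already tagged.
SAMPLE = (
    f'<p xmlns="{TEI_NS}"><persName>Ada Lovelace</persName> wrote to '
    "<persName>Charles Babbage</persName> from "
    "<placeName>London</placeName>.</p>"
)

# Hypothetical reviewed choices: entity text -> authority URI.
# The example.org URIs are placeholders, not real authority records.
RECONCILED = {
    "Ada Lovelace": "https://example.org/authority/person/1",
    "London": "https://example.org/authority/place/7",
}


def reconcile(xml_text, choices):
    """Add a ref attribute to each entity that has a chosen URI.

    Returns the updated XML and the names that still need review.
    """
    root = ET.fromstring(xml_text)
    unresolved = []
    for tag in ("persName", "placeName", "orgName"):
        for el in root.iter(f"{{{TEI_NS}}}{tag}"):
            name = "".join(el.itertext()).strip()
            uri = choices.get(name)
            if uri:
                el.set("ref", uri)
            else:
                unresolved.append(name)
    return ET.tostring(root, encoding="unicode"), unresolved


tei_out, todo = reconcile(SAMPLE, RECONCILED)
print(tei_out)
print("Still to review:", todo)
```

Entities without a chosen URI are reported rather than guessed at, which mirrors the review-and-correct step described above.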

DToC: (the Dynamic Table of Contexts) provides an online interactive reading and publication environment for digital scholarly texts where the two conventional overviews provided in print editions - the table of contents and the index - have been dynamically merged to provide an interactive online e-reading experience that leverages the power of XML markup. Users can build a DToC edition from one or more TEI-XML files, then curate and label the underlying elements and attributes in order to understand where named entities, topics, and concepts can be traced within the edition. Editions can be stored using URLs and shared with readers as published or teaching texts.
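The idea of merging a table of contents with an index can be illustrated with a short Python sketch. It does not reflect how DToC is built. It assumes a hypothetical TEI body with flat (non-nested) `div` sections, each with a `head`, and treats `persName` and `placeName` as the index tags; the function name `contents_and_index` is likewise an assumption.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

TEI = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical two-chapter edition with tagged names.
SAMPLE = """<body xmlns="http://www.tei-c.org/ns/1.0">
  <div><head>Chapter 1</head>
    <p><persName>Ada Lovelace</persName> visits <placeName>London</placeName>.</p></div>
  <div><head>Chapter 2</head>
    <p><persName>Ada Lovelace</persName> writes to <persName>Charles Babbage</persName>.</p></div>
</body>"""


def contents_and_index(xml_text, tags=("persName", "placeName")):
    """Return (section titles, index term -> titles of sections using it)."""
    root = ET.fromstring(xml_text)
    wanted = {TEI + t for t in tags}
    contents = []
    index = defaultdict(list)
    for div in root.iter(TEI + "div"):  # assumes divs are not nested
        head = div.find(TEI + "head")
        title = "".join(head.itertext()).strip() if head is not None else "(untitled)"
        contents.append(title)
        for el in div.iter():
            if el.tag in wanted:
                term = "".join(el.itertext()).strip()
                if title not in index[term]:
                    index[term].append(title)
    return contents, dict(index)


contents, index = contents_and_index(SAMPLE)
for term, sections in sorted(index.items()):
    print(f"{term}: {', '.join(sections)}")
```

Each index term points back to the sections in which it occurs, which is the link between the two overviews that an interactive reading environment can then expose.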

LEAF-TE: (the LEAF Turning Engine) is a web interface that enables users to easily and automatically transform documents between formats. It converts HTR/OCR output (from various sources including Transkribus) to TEI-XML for importing into LEAF-Writer or other editors. It converts TEI-XML to HTML, Markdown, and plain text for exporting to web publishing and text analysis environments, including the Dynamic Table of Contexts.
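As a rough illustration of one such transformation, the Python sketch below flattens a small TEI fragment to plain text. It is not LEAF-TE's code: the function name `tei_to_plain_text` and the sample letter are hypothetical, and only `head` and `p` elements are treated as text blocks.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical TEI fragment with inline entity markup.
SAMPLE = """<text xmlns="http://www.tei-c.org/ns/1.0"><body>
  <head>A Letter</head>
  <p>Dear <persName>Ada</persName>,</p>
  <p>I arrive in <placeName>London</placeName> on Monday.</p>
</body></text>"""


def tei_to_plain_text(xml_text):
    """Drop the markup and return headings and paragraphs as plain text."""
    root = ET.fromstring(xml_text)
    blocks = []
    for el in root.iter():
        if el.tag in (TEI + "head", TEI + "p"):
            # Join the text of nested elements and normalize whitespace.
            text = " ".join("".join(el.itertext()).split())
            if text:
                blocks.append(text)
    return "\n\n".join(blocks)


plain = tei_to_plain_text(SAMPLE)
print(plain)
```

The inline tags are discarded while their text is kept, which is the form most text analysis environments expect.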

Outline

The workshop will be managed as follows:

  • Introductions (15 minutes)
  • Set up (15 minutes)
    • Access to LEAF Commons tools via GitHub
  • Part 1 (45 minutes)
    • Encoding and annotating text objects using TEI and RDF with LEAF-Writer
  • Part 2 (30 minutes)
    • Using NERVE to recognize, process, and associate named entities with external linked data authorities
  • Part 3 (30 minutes)
    • Publishing texts, editions, anthologies or collections with DToC
  • Part 4 (15 minutes)
    • Exporting from LEAF-TE for web publication
  • Conclusion (15 minutes)
    • Adapting LEAF Commons tools for custom workflows

Requirements

Participants will need a web-enabled laptop computer (not a tablet or a mobile device). No software installation is required. Sample texts will be provided although participants are welcome to come with their own documents in TEI-conformant XML.

Statement Of Commitment To Diversity

Attention to diversity will be given to the selection of sample texts and images to ensure that a range of genders, races, and nationalities are represented. Participants are also welcome to bring their own content. LEAF tools are designed to meet accessibility standards as well as to make tools more readily usable by a broader and more diverse user base than is typical in the digital humanities.

Facilitators

Diane Jakacki (Bucknell University) Digital Scholarship Coordinator, associated faculty in Comparative & Digital Humanities. At Bucknell and through the Mellon Foundation-funded Liberal Arts Based Digital (LAB) Editions Publishing Cooperative project, she works with faculty, students, and their collaborators within and beyond Bucknell to develop and implement an array of text-centric multicultural and multilingual research projects. She is lead investigator of the LAB Cooperative, REED London, and currently serves as the chair of the TEI’s Board of Directors.

Susan Brown (University of Guelph) Professor, Canada Research Chair in Collaborative Digital Scholarship. Her work explores intersectional feminism, literary history, semantic technologies, and online knowledge infrastructures. She directs the Orlando Project, the Canadian Writing Research Collaboratory, and the Linked Infrastructure for Networked Cultural Scholarship.

James Cummings (Newcastle University) is the Reader in Digital Textual Studies and Late-Medieval Literature at the School of English Literature, Language, and Linguistics at Newcastle University. His interests include the use of digital technology for scholarly editing and also late-medieval performance. He has a long history of involvement with the TEI Consortium and is currently an elected member of the TEI’s Board of Directors. He has recently been leading projects on both HTR-to-TEI workflows and digital pedagogy, as well as a sub-project at Newcastle using LEAF.

Mihaela Ilovan (Canadian Writing Research Collaboratory) Assistant Director. A librarian and digital humanist by training, she is the technical project manager for the development of the LEAF Virtual Research Environment and the coordinator for LEAF data ingestion and mapping.

Rachel Milio (University of Crete) PhD Candidate in Semantic Annotation and Attic Oratory at the TALOS Center for Artificial Intelligence for Humanities and Social Studies. She is also a Research Associate with LEAF.