eMOP GitHub Repo

Aletheia Web Layout (AWL) Editor: A tool for identifying and transcribing paratext on a page image in TypeWright.
Cobre: A robust image comparison environment that presents versions of texts alongside each other in a filmstrip view and collates the images of the different texts, while allowing users to adjust the collation.
eMOP Controller: The code that implements the entire eMOP workflow.
eMOP Dashboard: The online dashboard that powers the eMOP workflow.
Franken+: A tool created for eMOP that allows users to create training for Tesseract with their own typeface samples.
hOCR deNoising: A tool created for eMOP post-processing that removes noise from Tesseract's hOCR output.
Juxta-cl: A command-line version of Juxta that compares OCR output to groundtruth files.
Page Corrector: A tool created for eMOP that uses dictionary files and a Google 3-gram database to correct Tesseract output.
Page Evaluator: A tool created for eMOP that evaluates OCR output to determine how correctable it is.
Publisher Imprint Database: Printer, seller, and location information culled from the imprint lines of the entire eMOP dataset. These XML files (one set for EEBO, one for ECCO) contain only those entries for which we have an ESTC number.
RETAS: A tool created for eMOP that compares OCR output to groundtruth files.
Tesseract Training: A collection of training created for Tesseract by eMOP using Franken+.

Aletheia Web Layout (AWL) Editor