The METAe engine
The METAe engine is designed as a comprehensive software package for digitising books and journals with a minimum of effort and a maximum of automation and effectiveness.
METAe makes digitisation easier: without any training, it automatically detects the structural elements of printed material, including:
- page numbers and their correct order - illustration 1
- title pages - illustration 2
- table of contents pages
- prefaces, appendices, indexes
- chapters and their hierarchical order - illustration 3
- issues within journals
- contributions and their authors
- running titles
- illustrations, tables, formulas, advertisements - illustration 4
- caption lines
- footnotes - illustration 5
- characters (OCR) - illustration 6
- automated double-page splitting and cropping - illustration 7
- and many more...
METAe makes digitisation safer:
- Metadata are captured automatically during the digitisation process.
- Metadata can be exported as an XML file, as a PDF file, or in any other format suitable for a digital library application. - illustration 8
- The METAe engine preferably assembles a METS information object (OAIS) which fulfils state-of-the-art requirements for digital preservation.
- There are a number of correction tools which allow quality control on all levels.
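As an illustration of what a METS information object carrying a detected hierarchy might look like, here is a minimal Python sketch. It is not the METAe engine's actual output or code: the function names (add_div, build_mets), the input dictionary and the TYPE values are assumptions made for this example. Only the METS namespace and the structMap/div elements with their TYPE, LABEL and ORDER attributes come from the METS schema, and the result is a bare logical structure map, not a complete or validated METS document.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"  # namespace defined by the METS schema
ET.register_namespace("mets", METS_NS)


def add_div(parent, node, order):
    """Append one <mets:div> for `node` and recurse into its children."""
    div = ET.SubElement(
        parent,
        f"{{{METS_NS}}}div",
        {"TYPE": node["type"], "LABEL": node["label"], "ORDER": str(order)},
    )
    for i, child in enumerate(node.get("children", []), start=1):
        add_div(div, child, i)
    return div


def build_mets(book):
    """Wrap a nested book description in a minimal logical structMap."""
    mets = ET.Element(f"{{{METS_NS}}}mets")
    struct_map = ET.SubElement(
        mets, f"{{{METS_NS}}}structMap", {"TYPE": "LOGICAL"}
    )
    add_div(struct_map, book, 1)
    return mets


# Hypothetical detected structure; labels and types are illustrative only.
book = {
    "type": "book",
    "label": "Example volume",
    "children": [
        {"type": "title_page", "label": "Title page"},
        {
            "type": "chapter",
            "label": "Chapter 1",
            "children": [{"type": "section", "label": "Section 1.1"}],
        },
    ],
}

mets_root = build_mets(book)
xml_text = ET.tostring(mets_root, encoding="unicode")
print(xml_text)
```

The nesting of the div elements mirrors the hierarchical order of chapters and sections, which is what allows a digital library application to offer navigation at each level.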
METAe makes digitisation more valuable:
- The XML output of the METAe engine is in most cases even richer than that of a born-digital document. It includes all the elements and the hierarchical structure needed for effective electronic publishing.
- Browsing access can be provided at different hierarchical levels of a document, e.g. title page, chapters, illustrations. See the ALO demo.
- Searches can be performed exactly on the body text (with no disturbing noise coming e.g. from column titles, indexes, appendices) or separately on footnotes, headlines or caption lines.
- Illustrations, pictures, tables and formulas are extracted, and each is described separately as a Dublin Core record. Your book collection becomes a picture collection as well.
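To make the last point more concrete, the following Python sketch shows how a single extracted illustration could be described as a simple Dublin Core record. It is an assumption-laden example, not the engine's real export format: the function illustration_to_dc, the wrapping record element, the identifier scheme and the sample values are all hypothetical. The namespace and the element names (title, type, format, identifier, source) are those of the Dublin Core Metadata Element Set.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)


def illustration_to_dc(title, identifier, source, mime_type="image/tiff"):
    """Build a simple Dublin Core record for one extracted illustration.

    The <record> wrapper and the choice of fields are assumptions
    made for this sketch.
    """
    record = ET.Element("record")
    fields = {
        "title": title,
        "type": "StillImage",  # term from the DCMI Type Vocabulary
        "format": mime_type,
        "identifier": identifier,
        "source": source,
    }
    for name, value in fields.items():
        ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value
    return record


# Hypothetical values for an illustration found on page 42 of a scanned book.
record = illustration_to_dc(
    title="Fig. 3: Map of the harbour",
    identifier="book-0001/page-0042/illustration-01",
    source="Example volume, p. 42",
)
dc_xml = ET.tostring(record, encoding="unicode")
print(dc_xml)
```

Because each illustration has its own record pointing back to its source page, the records can be loaded into an image catalogue independently of the full text.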
The METAe engine has been developed in close co-operation with several partners of the project. The partner responsible for the technical development, the German software house CCS-GmbH, distributes the tool as a commercial product under the name docWORKS/METAe Edition.
Contact: Claus Gravenhorst.