Releases: slub/mets-mods2tei

v0.1.4

12 Dec 21:58
6602f9a

Changed

  • mm-update: adapt to OCR-D API changes

v0.1.3

11 Feb 19:36

Added

  • mm2tei CLI param controlling page and line refs via @corresp
  • mm-update CLI

v0.1.2

10 Jan 12:45
69665d9

Added

  • tests for TEI API
  • tests for insertion index identification
  • more logging
  • CLI param for output file
  • CLI param for image fileGrp

Changed

  • Add front, body and back per default
  • Log to stderr instead of stdout
  • Differentiate between (physical) image nr and (logical) page nr
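
The distinction between image number and page number can be illustrated with the METS physical structMap, where `ORDER` gives the position in the image sequence and `ORDERLABEL` the printed page label. The following is a minimal sketch, not the project's actual code; the function name `page_numbers` and the sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

METS_NS = {"mets": "http://www.loc.gov/METS/"}

# Made-up sample: three scanned images, one of them without a printed label
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:structMap TYPE="PHYSICAL">
    <mets:div TYPE="physSequence">
      <mets:div TYPE="page" ORDER="1" ORDERLABEL="[Titelblatt]"/>
      <mets:div TYPE="page" ORDER="2" ORDERLABEL="I"/>
      <mets:div TYPE="page" ORDER="3"/>
    </mets:div>
  </mets:structMap>
</mets:mets>"""


def page_numbers(mets_xml):
    """Return (image number, page label) pairs; the label is None if absent."""
    root = ET.fromstring(mets_xml)
    pages = root.findall(
        ".//mets:structMap[@TYPE='PHYSICAL']//mets:div[@TYPE='page']", METS_NS
    )
    return [(int(page.get("ORDER")), page.get("ORDERLABEL")) for page in pages]


pairs = page_numbers(SAMPLE)
```

Keeping both values side by side means a page break can reference the image by its sequence number while still showing the label printed on the page.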

Fixed

  • Evaluate texts from all struct types but binding and colour_checker, #43
  • Handle errors during language code expansion, and fall back to Unbekannt, #47
  • Add ALTO HYP text content if available, #52
  • Allow empty logical structMap and structLink, fall back to physical, or empty, #57
  • Allow partial dmdSec (MODS) or amdSec, fall back to empty, #46, #51
  • Pass all mods:identifiers to msIdentifier/idno (not just VD and URN)
  • Parse full titleInfo (main/sub/part/volume), and re-use in biblFull
  • Prefer titleInfo/title over div/@LABEL if available
  • Map top logical div/@TYPE into allowed biblFull/title/@level only
  • Map top logical div/@TYPE into appropriate bibl/@type if possible
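
The fallback behaviour described for language code expansion (#47) might look roughly like the sketch below. The lookup table `LANGUAGE_NAMES` and the function `expand_language_code` are hypothetical stand-ins; the project may well resolve codes through a library instead.

```python
# Hypothetical lookup table with a few ISO 639-2/B codes, for illustration only
LANGUAGE_NAMES = {
    "ger": "Deutsch",
    "lat": "Latein",
    "fre": "Französisch",
}


def expand_language_code(code):
    """Expand a language code to a display name, or return "Unbekannt"."""
    try:
        return LANGUAGE_NAMES[code.strip().lower()]
    except (KeyError, AttributeError):
        # Unknown code (KeyError) or missing value such as None (AttributeError)
        return "Unbekannt"
```

Catching the error at this single point lets the rest of the conversion continue when a record carries a malformed or missing language code.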

v0.1.1

10 Jan 12:38
21e8bd0

Added

  • Treat nested AMD-type (non-logical) divs in logical struct map (i.e. newspaper case)
  • Make full text file group selectable by user
  • Allow for file entries (in addition to URLs) in METS
  • Add special treatment for URNs and VD IDs
  • Add poor man's namespace versioning handling

Changed

  • Make extraction of subtitles conditional on their presence
  • Use "licence" for all types of licences (even unknown ones)

Fixed

Working TEI/text serialization

04 Dec 17:32
ecde7db

With this version, the <text> part of the TEI file gets populated. The div structure from the METS file is carried over to the TEI and, optionally, attached OCR in ALTO format is added to the individual divs as defined by the METS logical struct map.
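
Carrying the div structure over can be pictured as a recursive walk over the logical struct map. This is a simplified sketch under stated assumptions, not the released implementation: the function `copy_divs`, the choice to map `TYPE` to `@type` and `LABEL` to `<head>`, and the sample data are all illustrative, and the ALTO text insertion is left out.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
TEI_NS = "http://www.tei-c.org/ns/1.0"


def copy_divs(mets_div, tei_parent):
    """Recursively mirror METS divs as TEI divs below tei_parent."""
    for child in mets_div.findall("{%s}div" % METS_NS):
        tei_div = ET.SubElement(tei_parent, "{%s}div" % TEI_NS)
        if child.get("TYPE"):
            tei_div.set("type", child.get("TYPE"))
        if child.get("LABEL"):
            # The head is created before recursing so it precedes nested divs
            head = ET.SubElement(tei_div, "{%s}head" % TEI_NS)
            head.text = child.get("LABEL")
        copy_divs(child, tei_div)


# Made-up logical struct map with one monograph and two chapters
LOGICAL = ET.fromstring(
    '<mets:structMap xmlns:mets="http://www.loc.gov/METS/" TYPE="LOGICAL">'
    '<mets:div TYPE="monograph" LABEL="Beispiel">'
    '<mets:div TYPE="chapter" LABEL="Erstes Kapitel"/>'
    '<mets:div TYPE="chapter" LABEL="Zweites Kapitel"/>'
    "</mets:div></mets:structMap>"
)
body = ET.Element("{%s}body" % TEI_NS)
copy_divs(LOGICAL, body)
```

After the call, `body` holds one TEI div for the monograph with two nested chapter divs, each with its label as a head.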

Initial Release

31 Jul 11:57
4585111

A complete TEI header is created from METS/MODS files. Tested with multiple examples but not yet systematically.