integrate validation into data workflow #1

Open
dasch124 opened this issue Jul 8, 2022 · 3 comments

@dasch124

dasch124 commented Jul 8, 2022

This should include:

  • RNG schema + Schematron
  • Audio location

@dasch124

dasch124 commented Dec 16, 2022

on tei:w (i.e. in text documents):

  • reference to <fs> in annotation document via @ana
  • ensure that text() = $featureStructure/tei:f[@name='trans']

NB there's some boilerplate code already at https://github.com/acdh-oeaw/vicav-content/blob/master/tools/802_tei_odd/vicav_dicts.odd#L193
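
A minimal sketch of the two tei:w checks above, in Python with only the standard library. This is not the project's implementation: it assumes that @ana holds a single pointer of the form "#id", that the target <fs> carries a matching @xml:id, and that the 'trans' value is the string content of the <f> element. The function name check_words is hypothetical.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def check_words(text_root, annotation_root):
    """Return a list of error messages for the tei:w elements of a text document."""
    # Index all feature structures of the annotation document by xml:id.
    fs_by_id = {fs.get(XML_ID): fs for fs in annotation_root.iter(TEI + "fs")}
    errors = []
    for w in text_root.iter(TEI + "w"):
        # Assumption: @ana is a single local pointer such as "#a1".
        ana = (w.get("ana") or "").lstrip("#")
        fs = fs_by_id.get(ana)
        if fs is None:
            errors.append(f"w {w.get(XML_ID)}: @ana '{ana}' matches no <fs>")
            continue
        # Assumption: the 'trans' value is the text content of the <f> element.
        trans = fs.find(f"{TEI}f[@name='trans']")
        expected = "".join(trans.itertext()).strip() if trans is not None else None
        actual = "".join(w.itertext()).strip()
        if actual != expected:
            errors.append(f"w {w.get(XML_ID)}: text '{actual}' != trans '{expected}'")
    return errors
```

Both arguments are parsed root elements, e.g. from ET.parse(path).getroot(). An empty list means that every tei:w resolved to an <fs> and matched its 'trans' feature.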

on tei:fs in annotation document (assuming that we have implemented #5):

  • reference to <entry> in SHAWI dictionary via tei:f[@name='dict']/substring-after(@fVal,'dict:')
  • ensure existence of doc($path-to-fLib.xml)//tei:fs[@xml:id = $featureStructure/tei:f[@name = 'pos']/substring-after(@fVal,'fLib:')] (referred to as $fLibEntry below)
  • ensure that $fLibEntry = $entry/tei:gram[@type='pos'] (CHECKME: pos info in the dictionary might use a different syntax than fLib, e.g. "pos.n" in fLib vs. "n" in the dictionary)
  • ensure that $featureStructure/tei:f[@name = 'root']/tei:string = $entry/tei:gram[@type='root']
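
The tei:fs checks could be sketched roughly as follows, again in plain Python. Several things here are assumptions rather than facts about the data: dictionary entries and fLib feature structures are looked up by @xml:id, the 'root' value is read from a <string> child of the 'root' feature, and <gram> elements are searched anywhere below <entry>. The pos comparison is left out on purpose because of the CHECKME above. All function names are hypothetical.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def index_by_id(root, tag):
    """Map xml:id -> element for all TEI elements with the given local name."""
    return {el.get(XML_ID): el for el in root.iter(TEI + tag) if el.get(XML_ID)}


def pointer(fs, name, prefix):
    """Mimic tei:f[@name=name]/substring-after(@fVal, prefix); None if absent."""
    f = fs.find(f"{TEI}f[@name='{name}']")
    fval = f.get("fVal", "") if f is not None else ""
    _, sep, rest = fval.partition(prefix)
    return rest if sep else None


def gram_text(entry, gram_type):
    """Text of the first tei:gram of the given @type below entry, or None."""
    # Assumption: gram may be nested (e.g. inside gramGrp), so search descendants.
    for gram in entry.iter(TEI + "gram"):
        if gram.get("type") == gram_type:
            return "".join(gram.itertext()).strip()
    return None


def check_fs(fs, entries, flib):
    """Return a list of error messages for one annotation <fs>."""
    errors = []
    fs_id = fs.get(XML_ID)
    entry = entries.get(pointer(fs, "dict", "dict:"))
    if entry is None:
        errors.append(f"{fs_id}: 'dict' feature points to no dictionary <entry>")
    if pointer(fs, "pos", "fLib:") not in flib:
        errors.append(f"{fs_id}: 'pos' feature points to no <fs> in fLib")
    if entry is not None:
        # Assumption: the root is stored as f[@name='root']/string.
        root_el = fs.find(f"{TEI}f[@name='root']/{TEI}string")
        root = (root_el.text or "").strip() if root_el is not None else None
        if root != gram_text(entry, "root"):
            errors.append(f"{fs_id}: root '{root}' differs from the dictionary entry")
    return errors
```

Here entries would be built with index_by_id(dictionary_root, "entry") and flib with index_by_id(flib_root, "fs"), so each document is indexed once rather than queried per feature structure.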

[TODO add other things like TEI header validation or audio reference integrity]

@dasch124

We probably need several validation reports, depending on where the error occurs:

  • Annotation document errors
  • Annotation errors
  • Dictionary errors
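
One possible way to keep the three reports apart is to tag each finding with the report it belongs to and group afterwards. This is only a sketch: the Finding class, the report keys, and group_findings are hypothetical names, not existing code.

```python
from dataclasses import dataclass

# Hypothetical keys for the three reports listed above.
REPORTS = ("annotation-document", "annotation", "dictionary")


@dataclass
class Finding:
    report: str    # one of REPORTS
    location: str  # e.g. an xml:id or a file path
    message: str


def group_findings(findings):
    """Split a flat list of findings into one list per report."""
    grouped = {name: [] for name in REPORTS}
    for finding in findings:
        if finding.report not in grouped:
            raise ValueError(f"unknown report: {finding.report}")
        grouped[finding.report].append(finding)
    return grouped
```

Every report key is always present in the result, so an empty list signals a clean report rather than a missing one.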

@dasch124

dasch124 commented Jul 4, 2024

@charlymo has recently added functionality for running validation with BaseX. We should reuse this approach to validate the texts during annotation.
