integrations-tests: Amends README with instructive information
funkyfuture committed May 11, 2024
1 parent d443fa4 commit 649600e
Showing 1 changed file with 12 additions and 3 deletions.
15 changes: 12 additions & 3 deletions integration-tests/README.md
@@ -1,20 +1,24 @@
# Integration tests against corpora

This folder serves as playground for tests of basic functionality against many
-XML documents, mostly TEI-encodings. They are supposed to be executed with
+XML documents, mostly TEI encodings. They are supposed to be executed with
major code change proposals and before releases.

## Test corpus

Place document collections into the `corpora` folder. The `fetch-corpora.py`
-script helps to get going with the minimal requirement (~3GB) of tests.
+script helps to get going with the minimal requirement (~6GB) of data.
+Set any non-empty string as environment variable `SKIP_EXISTING` to skip
+downloading a corpus whose target folder already exists.

Due to the `lb` tag [issue](https://github.com/deutschestextarchiv/dtabf/issues/33)
with the DTABf the DTA corpus isn't considered. It could be an experiment to
use *delb* for transformations with regards to the conclusions of that issue.

The `normalize-corpora.py` script addresses issues that were found in the text
-encodings and must be run before the tests.
+encodings and must be run after fetching test data.
+One of the corpus folder names can be passed as argument to the script in order
+to process only this one's contents.

## Tests

@@ -25,3 +29,8 @@ reserves a `report.txt` for messages redirected from *stdout*.

When problems occur, carefully investigate that it's not due to the source, and
if not extract simple enough cases for the unit tests.

+## TODO
+
+After adding the third kind of test, wrap all scripts here into a
+[textual](https://textual.textualize.io) app.
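
As an illustration of the `SKIP_EXISTING` behaviour that the amended README describes, the check could look roughly like the sketch below. This is not the code of `fetch-corpora.py`, which is not shown in this commit; the helper name `should_skip` and its `env` parameter are hypothetical and exist only for this example.

```python
import os
from pathlib import Path


def should_skip(target: Path, env=None) -> bool:
    """Decide whether downloading a corpus can be skipped.

    Hypothetical helper: returns True only if SKIP_EXISTING is set to a
    non-empty string AND the target folder already exists.
    """
    # Fall back to the real process environment unless a mapping is passed in.
    env = os.environ if env is None else env
    return bool(env.get("SKIP_EXISTING")) and target.is_dir()
```

An empty value (`SKIP_EXISTING=`) is treated the same as an unset variable here, which matches the README's wording "any non-empty string". Passing the environment as a parameter is only a convenience for trying the function without changing the real environment.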
