Commit
Merge branch 'master' into fix-polygon-for-parent
kba authored Jun 5, 2020
2 parents 94719f8 + bfd29aa commit 0bbd5b5
Showing 4 changed files with 18 additions and 2 deletions.
9 changes: 9 additions & 0 deletions CHANGELOG.md
@@ -9,10 +9,16 @@ Fixed:

* segment-region: ensure polygons are within page/Border

## [0.8.4] - 2020-06-05

Changed:

* segment-region: in `sparse_text` mode, also add text lines

Fixed:

* Always set path to `TESSDATA_PREFIX` for `tesserocr.get_languages`, #129

## [0.8.3] - 2020-05-12

Fixed:
@@ -182,6 +188,9 @@ Changed:
* Recognition with proper support for textequiv_level, drop `page` level

<!-- link-labels -->
[0.8.4]: v0.8.4...v0.8.3
[0.8.3]: v0.8.3...v0.8.2
[0.8.2]: v0.8.2...v0.8.1
[0.8.1]: v0.8.1...v0.8.0
[0.8.0]: v0.8.0...v0.7.0
[0.7.0]: v0.7.0...v0.6.0
1 change: 1 addition & 0 deletions Dockerfile
@@ -5,6 +5,7 @@ ENV PYTHONIOENCODING utf8

WORKDIR /build-ocrd
COPY setup.py .
COPY ocrd_tesserocr/ocrd-tool.json .
COPY README.md .
COPY requirements.txt .
COPY requirements_test.txt .
2 changes: 1 addition & 1 deletion ocrd_tesserocr/ocrd-tool.json
@@ -1,5 +1,5 @@
{
- "version": "0.8.3",
+ "version": "0.8.4",
"git_url": "https://github.com/OCR-D/ocrd_tesserocr",
"dockerhub": "ocrd/tesserocr",
"tools": {
8 changes: 7 additions & 1 deletion ocrd_tesserocr/recognize.py
@@ -5,7 +5,7 @@

from tesserocr import (
    RIL, PSM,
-    PyTessBaseAPI, get_languages)
+    PyTessBaseAPI, get_languages as get_languages_)

from ocrd_utils import (
    getLogger,
@@ -46,6 +46,12 @@
CHOICE_THRESHOLD_NUM = 6 # maximum number of choices to query and annotate
CHOICE_THRESHOLD_CONF = 0.2 # maximum score drop from best choice to query and annotate

def get_languages(*args, **kwargs):
    """
    Wraps tesserocr.get_languages() with a fixed path parameter.
    """
    return get_languages_(*args, path=TESSDATA_PREFIX, **kwargs)

class TesserocrRecognize(Processor):

    def __init__(self, *args, **kwargs):
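
The wrapper added in `recognize.py` pins the `path` keyword so that every lookup uses `TESSDATA_PREFIX`. The following is a minimal sketch of that pattern, not the project's code: `get_languages_` here is a hypothetical stand-in for `tesserocr.get_languages` (assumed to return a path and a list of language names), and the fallback directory for `TESSDATA_PREFIX` is an assumption made only so the example runs without Tesseract installed.

```python
import os

# Assumption: the real module defines TESSDATA_PREFIX elsewhere; this
# fallback path is illustrative only.
TESSDATA_PREFIX = os.environ.get("TESSDATA_PREFIX", "/usr/share/tessdata")


def get_languages_(path=""):
    """Hypothetical stand-in for tesserocr.get_languages."""
    return path, ["deu", "eng"]


def get_languages(*args, **kwargs):
    """Call get_languages_ with `path` always set to TESSDATA_PREFIX."""
    return get_languages_(*args, path=TESSDATA_PREFIX, **kwargs)


path, langs = get_languages()
print(path, langs)

# Because `path` is already supplied by the wrapper, a caller who passes it
# again gets a TypeError (duplicate keyword argument) instead of a silent
# override.
try:
    get_languages(path="/some/other/dir")
    rejected = False
except TypeError:
    rejected = True
print("explicit path rejected:", rejected)
```

One consequence of this design is that callers can no longer choose a different model directory through the wrapper; that appears to be the intent of "fixed path parameter" in the docstring.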