[Reconstitution] Improve reconstitution #1750

Open
wants to merge 9 commits into
base: main

felixdittrich92
Contributor

@felixdittrich92 felixdittrich92 commented Oct 10, 2024

This PR:

  • Improves synthesis quality
  • Also allows rendering polygons, but warns because rotations that are too large can't be rendered correctly yet
  • Extends the corresponding tests
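
One way the polygon warning could look is sketched below. This is only an illustration and not the code from this PR: the names `polygon_angle` and `polygon_to_box`, the 5 degree `max_angle` threshold, and the point order (top-left, top-right, bottom-right, bottom-left) are all assumptions.

```python
import math
import warnings
from typing import Sequence, Tuple

Point = Tuple[float, float]


def polygon_angle(polygon: Sequence[Point]) -> float:
    """Angle in degrees between the polygon's first edge and the x-axis.

    Assumes (hypothetically) that points are ordered top-left, top-right,
    bottom-right, bottom-left.
    """
    (x0, y0), (x1, y1) = polygon[0], polygon[1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0))


def polygon_to_box(
    polygon: Sequence[Point], max_angle: float = 5.0
) -> Tuple[float, float, float, float]:
    """Return the axis-aligned box (xmin, ymin, xmax, ymax) enclosing the polygon.

    Warns when the rotation exceeds `max_angle` degrees, since text drawn into
    an axis-aligned box will not match a strongly rotated word.
    """
    angle = polygon_angle(polygon)
    if abs(angle) > max_angle:
        warnings.warn(
            f"Polygon rotated by {angle:.1f} degrees; rendering may be inaccurate.",
            stacklevel=2,
        )
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)
```

The sketch still returns a usable box for rotated input, so rendering can proceed while the caller is told the result may be off.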

Still not perfect, but it looks much better than before :)

Any feedback is welcome 🤗

Old: (Lots of overlapping / clipping)
Screenshot from 2024-10-10 10-40-13

New:
Screenshot from 2024-10-10 11-21-24

New without line:
Screenshot from 2024-10-10 10-50-56

@felixdittrich92 felixdittrich92 added the labels type: enhancement (Improvement), module: utils (Related to doctr.utils), ext: tests (Related to tests folder) Oct 10, 2024
@felixdittrich92 felixdittrich92 added this to the 0.10.0 milestone Oct 10, 2024
@felixdittrich92 felixdittrich92 self-assigned this Oct 10, 2024

codecov bot commented Oct 10, 2024

Codecov Report

Attention: Patch coverage is 92.98246% with 4 lines in your changes missing coverage. Please review.

Project coverage is 96.54%. Comparing base (59f1c30) to head (6867d98).
Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
doctr/utils/reconstitution.py 92.85% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1750      +/-   ##
==========================================
+ Coverage   96.46%   96.54%   +0.07%     
==========================================
  Files         164      164              
  Lines        7869     7895      +26     
==========================================
+ Hits         7591     7622      +31     
+ Misses        278      273       -5     
Flag Coverage Δ
unittests 96.54% <92.98%> (+0.07%) ⬆️

@felixdittrich92 felixdittrich92 linked an issue Oct 10, 2024 that may be closed by this pull request
@felixdittrich92 felixdittrich92 changed the title [Reconstruction] Improve reconstruction [Reconstitution] Improve reconstitution Oct 10, 2024
doctr/utils/reconstitution.py: two outdated review threads (resolved)
@felixdittrich92
Contributor Author

Closes: #1692

Comment on lines 26 to 31
# test with a smiley which can't be rendered by unidecode
pages_one_line["blocks"][0]["lines"][0]["words"][0]["text"] = "🤯"
render_one_line = reconstitution.synthesize_page(pages_one_line, draw_proba=True)
assert isinstance(render_one_line, np.ndarray)
assert render_one_line.shape == (*pages[0].dimensions, 3)

Collaborator

Haha, it didn't work?

Contributor Author

@felixdittrich92 felixdittrich92 Oct 11, 2024

It worked, but I was not able to reach the line where d.draw breaks, because d.draw was also able to draw it (not correctly, only as a question mark, because the default font doesn't support it, but it would with a supporting font) xDDD

Contributor Author

@felixdittrich92 felixdittrich92 Oct 11, 2024

So the idea was to also cover the Unicode error case, where it falls back to anyascii xD I tried Chinese characters and smileys, but everything worked even without anyascii

Contributor Author

but I would still keep it as a fallback, just in case 😅
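
For readers following the thread, the fallback being discussed can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code in `doctr/utils/reconstitution.py`: `draw_with_fallback` and `to_ascii` are hypothetical names, `to_ascii` is a standard-library stand-in for anyascii (it strips accents and replaces everything else with `?`, which is much cruder than real transliteration), and the renderer is any callable that may raise `UnicodeEncodeError`.

```python
import unicodedata
from typing import Callable


def to_ascii(text: str) -> str:
    """Crude stand-in for anyascii: drop accents, replace other non-ASCII with '?'."""
    decomposed = unicodedata.normalize("NFKD", text)
    # Remove combining marks left over from decomposition (e.g. the accent of "é").
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.encode("ascii", "replace").decode("ascii")


def draw_with_fallback(draw_text: Callable[[str], None], text: str) -> str:
    """Draw `text`; if the renderer cannot encode it, retry with an ASCII version.

    Returns the string that was actually drawn.
    """
    try:
        draw_text(text)
        return text
    except UnicodeEncodeError:
        ascii_text = to_ascii(text)
        draw_text(ascii_text)
        return ascii_text


def latin1_only_renderer(text: str) -> None:
    """Hypothetical renderer that only accepts Latin-1 text."""
    text.encode("latin-1")  # raises UnicodeEncodeError for anything else


if __name__ == "__main__":
    print(draw_with_fallback(latin1_only_renderer, "café"))
    print(draw_with_fallback(latin1_only_renderer, "🤯"))
```

A test that needs to reach the `except` branch can pass in a renderer like `latin1_only_renderer` instead of relying on a particular font to fail, which is the difficulty described in the comments above.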

Development

Successfully merging this pull request may close these issues.

[reconstitution] Improve synthesize output quality
2 participants