Use OCR post processing model on map text outputs #483

Open
rwood-97 opened this issue Aug 8, 2024 · 2 comments
Labels
enhancement (New feature or request), new feature, text on maps (Integrating text detection and recognition task(s))

Comments

Collaborator

rwood-97 commented Aug 8, 2024

Is your feature request related to a problem? Please describe.
Test out this model on the text outputs:
https://huggingface.co/PleIAs/OCRonos-Vintage

Compare the post-processed text with the original text outputs to evaluate the model:
Does it improve the outputs? Where does it improve them?
Where does it make things worse?
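
A minimal sketch of running the model on a single recognised string, assuming the Hugging Face `transformers` library and `torch` are installed. The prompt template in `build_prompt` and the generation settings are assumptions modelled on the usage example on the model card and should be checked against it. `build_prompt` and `post_process` are hypothetical helper names, not part of MapReader.

```python
MODEL_NAME = "PleIAs/OCRonos-Vintage"


def build_prompt(text: str) -> str:
    """Wrap OCR text in the prompt template (assumed; check the model card)."""
    return f"### Text ###\n{text}\n\n\n### Correction ###\n"


def post_process(text: str, max_new_tokens: int = 20) -> str:
    """Return the model's suggested correction for one OCR string."""
    # Imported lazily so the prompt helper works without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Loaded on every call to keep the sketch short; load once in real use.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

    inputs = tokenizer(build_prompt(text), return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,  # assumed limit, short map labels only
        pad_token_id=tokenizer.eos_token_id,
    )
    # Keep only the newly generated tokens, i.e. drop the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

Calling `post_process("RENFREW")` would return the model's suggestion for that label; the first call downloads the model weights.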

rwood-97 added the enhancement and new feature labels on Aug 8, 2024
Collaborator Author

rwood-97 commented Aug 8, 2024

Some questions:

  1. How are we evaluating this?
  2. How do we deal with acronyms?
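
One possible way to approach both questions, offered as a sketch rather than a decision: score each string against a manually transcribed reference using a normalised edit distance, before and after post-processing, and leave numbers and abbreviation-like tokens untouched. The function names and the 3-character threshold for short all-caps tokens are illustrative assumptions.

```python
import re


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def char_error_rate(prediction: str, reference: str) -> float:
    """Edit distance normalised by the length of the reference string."""
    if not reference:
        return 0.0 if not prediction else 1.0
    return edit_distance(prediction, reference) / len(reference)


def looks_like_acronym_or_number(text: str, max_len: int = 3) -> bool:
    """Heuristic guard for tokens that should probably skip post-processing."""
    if re.fullmatch(r"[\d.,]+", text):
        return True  # numbers such as "342" or "34.6"
    if text.endswith("."):
        return True  # dotted abbreviations such as "Co." or "St."
    # Short all-caps tokens such as "WM" (threshold is an assumption).
    return text.isupper() and len(text) <= max_len
```

With a reference transcription available, a row counts as improved when `char_error_rate(post_processed, reference)` is lower than `char_error_rate(text, reference)`. Rows for which the guard returns `True` would bypass the model entirely.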

Collaborator Author

rwood-97 commented Sep 11, 2024

Some example outputs for MapTextRunner:

| FIELD1 | image_id | patch_id | text | score | post_processed |
| --- | --- | --- | --- | --- | --- |
| 0 | 74953138.1.tif | patch-0-0-1000-1000-#74953138.1.tif#.png | 342 | 0.99 | |
| 1 | 74953138.1.tif | patch-0-800-1000-1800-#74953138.1.tif#.png | Renfrew | 0.96 | Renfrew |
| 2 | 74953138.1.tif | patch-0-800-1000-1800-#74953138.1.tif#.png | 826 | 0.98 | |
| 3 | 74953138.1.tif | patch-0-800-1000-1800-#74953138.1.tif#.png | Co. | 0.92 | Co. |
| 4 | 74953138.1.tif | patch-0-800-1000-1800-#74953138.1.tif#.png | RENFREW | 0.90 | RENFREW |
| 5 | 74953138.1.tif | patch-0-800-1000-1800-#74953138.1.tif#.png | 558 | 0.97 | |
| 6 | 74953138.1.tif | patch-0-800-1000-1800-#74953138.1.tif#.png | WARD | 0.53 | WARD |
| 7 | 74953138.1.tif | patch-0-1600-1000-2600-#74953138.1.tif#.png | 554 | 0.99 | |
| 8 | 74953138.1.tif | patch-0-1600-1000-2600-#74953138.1.tif#.png | Bain's | 0.99 | Bain's |
| 9 | 74953138.1.tif | patch-0-1600-1000-2600-#74953138.1.tif#.png | PI. | 0.94 | PI. |
| 10 | 74953138.1.tif | patch-0-1600-1000-2600-#74953138.1.tif#.png | 505 | 0.99 | |
| 11 | 74953138.1.tif | patch-0-1600-1000-2600-#74953138.1.tif#.png | 575 | 0.98 | |
| 12 | 74953138.1.tif | patch-0-1600-1000-2600-#74953138.1.tif#.png | 18 | 0.91 | |
| 13 | 74953138.1.tif | patch-0-1600-1000-2600-#74953138.1.tif#.png | 577 | 0.98 | |
| 14 | 74953138.1.tif | patch-0-1600-1000-2600-#74953138.1.tif#.png | 571 | 0.97 | |
| 15 | 74953138.1.tif | patch-0-2400-1000-3400-#74953138.1.tif#.png | 20 | 0.97 | |
| 16 | 74953138.1.tif | patch-0-2400-1000-3400-#74953138.1.tif#.png | 26 | 0.97 | |
| 17 | 74953138.1.tif | patch-0-2400-1000-3400-#74953138.1.tif#.png | 19 | 0.97 | |
| 18 | 74953138.1.tif | patch-0-2400-1000-3400-#74953138.1.tif#.png | 72 | 0.98 | |
| 19 | 74953138.1.tif | patch-0-2400-1000-3400-#74953138.1.tif#.png | 152 | 0.99 | |
| 20 | 74953138.1.tif | patch-0-2400-1000-3400-#74953138.1.tif#.png | L##ETH | 0.73 | L##ETH |
| 21 | 74953138.1.tif | patch-0-2400-1000-3400-#74953138.1.tif#.png | WARD | 0.44 | WARD |
| 22 | 74953138.1.tif | patch-0-3200-1000-4200-#74953138.1.tif#.png | 45 | 0.97 | |
| 23 | 74953138.1.tif | patch-0-3200-1000-4200-#74953138.1.tif#.png | 28 | 0.98 | |
| 24 | 74953138.1.tif | patch-0-3200-1000-4200-#74953138.1.tif#.png | St. | 0.87 | St. |
| 25 | 74953138.1.tif | patch-0-3200-1000-4200-#74953138.1.tif#.png | TWELET | 0.81 | TWELET TWELET TWELET TWE |
| 26 | 74953138.1.tif | patch-0-4000-1000-5000-#74953138.1.tif#.png | House | 0.97 | House |
| 27 | 74953138.1.tif | patch-0-4000-1000-5000-#74953138.1.tif#.png | 45 | 0.98 | |
| 28 | 74953138.1.tif | patch-0-4000-1000-5000-#74953138.1.tif#.png | 29 | 0.98 | |
| 29 | 74953138.1.tif | patch-0-4000-1000-5000-#74953138.1.tif#.png | 179 | 0.98 | |
| 30 | 74953138.1.tif | patch-0-4000-1000-5000-#74953138.1.tif#.png | Western | 0.96 | Western |
| 31 | 74953138.1.tif | patch-0-4800-1000-5800-#74953138.1.tif#.png | 49 | 0.97 | |
| 32 | 74953138.1.tif | patch-0-4800-1000-5800-#74953138.1.tif#.png | 46 | 0.94 | |
| 33 | 74953138.1.tif | patch-0-4800-1000-5800-#74953138.1.tif#.png | 180 | 0.97 | |
| 34 | 74953138.1.tif | patch-0-4800-1000-5800-#74953138.1.tif#.png | 34.6 | 0.58 | |
| 35 | 74953138.1.tif | patch-0-4800-1000-5800-#74953138.1.tif#.png | B. | 0.54 | B. |
| 36 | 74953138.1.tif | patch-0-4800-1000-5800-#74953138.1.tif#.png | M. | 0.81 | M. |
| 37 | 74953138.1.tif | patch-0-5600-1000-6600-#74953138.1.tif#.png | EIGHTH | 0.98 | EIGHTH EIGHTH EIGHTH EIGHTH |
| 38 | 74953138.1.tif | patch-0-5600-1000-6600-#74953138.1.tif#.png | Pr. | 0.85 | Pr. |
| 39 | 74953138.1.tif | patch-0-5600-1000-6600-#74953138.1.tif#.png | Builds | 0.97 | Builds |
| 40 | 74953138.1.tif | patch-0-5600-1000-6600-#74953138.1.tif#.png | Wales | 0.95 | Wales Wales Wales Wales Wales |
| 41 | 74953138.1.tif | patch-0-5600-1000-6600-#74953138.1.tif#.png | of | 0.90 | of |
| 42 | 74953138.1.tif | patch-0-5600-1000-6600-#74953138.1.tif#.png | W### | 0.51 | W. |
| 43 | 74953138.1.tif | patch-0-6400-1000-7400-#74953138.1.tif#.png | 291 | 0.96 | |
| 44 | 74953138.1.tif | patch-0-6400-1000-7400-#74953138.1.tif#.png | 55 | 0.98 | |
| 45 | 74953138.1.tif | patch-0-6400-1000-7400-#74953138.1.tif#.png | 366 | 0.99 | |
| 46 | 74953138.1.tif | patch-0-6400-1000-7400-#74953138.1.tif#.png | 81 | 0.80 | |
| 47 | 74953138.1.tif | patch-0-6400-1000-7400-#74953138.1.tif#.png | LANE | 0.73 | LANE LANE LANE LANE LANE |
| 48 | 74953138.1.tif | patch-0-6400-1000-7400-#74953138.1.tif#.png | R# | 0.43 | R# |
| 49 | 74953138.1.tif | patch-0-6400-1000-7400-#74953138.1.tif#.png | 3STENOCHS | 0.48 | 3STENOCHS 3STENOCHS 3STENO |

And for DeepSoloRunner:

| FIELD1 | image_id | patch_id | text | score | post_processed |
| --- | --- | --- | --- | --- | --- |
| 0 | 74953138.1.tif | patch-0-0-1000-1000-#74953138.1.tif#.png | 270 | 0.51 | |
| 1 | 74953138.1.tif | patch-0-0-1000-1000-#74953138.1.tif#.png | BARO | 0.52 | BARO BARO BARO BARO |
| 2 | 74953138.1.tif | patch-0-800-1000-1800-#74953138.1.tif#.png | RENFREW | 0.71 | RENFREW |
| 3 | 74953138.1.tif | patch-0-800-1000-1800-#74953138.1.tif#.png | 554 | 0.57 | |
| 4 | 74953138.1.tif | patch-0-800-1000-1800-#74953138.1.tif#.png | 342 | 0.66 | |
| 5 | 74953138.1.tif | patch-0-800-1000-1800-#74953138.1.tif#.png | RENFREW | 0.62 | RENFREW |
| 6 | 74953138.1.tif | patch-0-800-1000-1800-#74953138.1.tif#.png | CO | 0.51 | CO CO CO CO CO CO CO |
| 7 | 74953138.1.tif | patch-0-800-1000-1800-#74953138.1.tif#.png | 558 | 0.59 | |
| 8 | 74953138.1.tif | patch-0-800-1000-1800-#74953138.1.tif#.png | WARD | 0.49 | WARD |
| 9 | 74953138.1.tif | patch-0-1600-1000-2600-#74953138.1.tif#.png | BAINS | 0.87 | BAINS BAINS BAINS BAINS BAINS |
| 10 | 74953138.1.tif | patch-0-1600-1000-2600-#74953138.1.tif#.png | 505 | 0.70 | |
| 11 | 74953138.1.tif | patch-0-1600-1000-2600-#74953138.1.tif#.png | 575 | 0.64 | |
| 12 | 74953138.1.tif | patch-0-1600-1000-2600-#74953138.1.tif#.png | 577 | 0.64 | |
| 13 | 74953138.1.tif | patch-0-1600-1000-2600-#74953138.1.tif#.png | 571 | 0.64 | |
| 14 | 74953138.1.tif | patch-0-2400-1000-3400-#74953138.1.tif#.png | 18 | 0.52 | |
| 15 | 74953138.1.tif | patch-0-2400-1000-3400-#74953138.1.tif#.png | 20 | 0.40 | |
| 16 | 74953138.1.tif | patch-0-2400-1000-3400-#74953138.1.tif#.png | 19 | 0.40 | |
| 17 | 74953138.1.tif | patch-0-2400-1000-3400-#74953138.1.tif#.png | WARD | 0.47 | WARD |
| 18 | 74953138.1.tif | patch-0-3200-1000-4200-#74953138.1.tif#.png | 28 | 0.40 | |
| 19 | 74953138.1.tif | patch-0-4000-1000-5000-#74953138.1.tif#.png | 45 | 0.43 | |
| 20 | 74953138.1.tif | patch-0-4000-1000-5000-#74953138.1.tif#.png | 34 | 0.45 | |
| 21 | 74953138.1.tif | patch-0-4000-1000-5000-#74953138.1.tif#.png | WESTERN | 0.51 | WESTERN |
| 22 | 74953138.1.tif | patch-0-4800-1000-5800-#74953138.1.tif#.png | 180 | 0.71 | |
| 23 | 74953138.1.tif | patch-0-5600-1000-6600-#74953138.1.tif#.png | VI10 | 0.62 | VI10 |
| 24 | 74953138.1.tif | patch-0-5600-1000-6600-#74953138.1.tif#.png | WALES | 0.49 | WALES WALES WALES WALES |
| 25 | 74953138.1.tif | patch-0-6400-1000-7400-#74953138.1.tif#.png | EIGHTH | 0.77 | EIGHTH EIGHTH EIGHTH EIGHTH |
| 26 | 74953138.1.tif | patch-0-6400-1000-7400-#74953138.1.tif#.png | LANE | 0.45 | LANE LANE LANE LANE LANE |
| 27 | 74953138.1.tif | patch-0-7200-1000-8200-#74953138.1.tif#.png | FOR | 0.56 | FOR FOR FOR FOR FOR FOR FOR |
| 28 | 74953138.1.tif | patch-0-8000-1000-9000-#74953138.1.tif#.png | GREAT | 0.88 | GREAT |
| 29 | 74953138.1.tif | patch-0-8000-1000-9000-#74953138.1.tif#.png | 654 | 0.67 | |
| 30 | 74953138.1.tif | patch-0-8000-1000-9000-#74953138.1.tif#.png | CLY | 0.64 | CLY |
| 31 | 74953138.1.tif | patch-0-8800-1000-9800-#74953138.1.tif#.png | 1443 | 0.81 | |
| 32 | 74953138.1.tif | patch-0-8800-1000-9800-#74953138.1.tif#.png | 1442 | 0.67 | |
| 33 | 74953138.1.tif | patch-0-9600-1000-10600-#74953138.1.tif#.png | 1444 | 0.50 | |
| 34 | 74953138.1.tif | patch-0-9600-1000-10600-#74953138.1.tif#.png | GOVAN | 0.41 | GOVAN GOVAN GOVAN GOVAN |
| 35 | 74953138.1.tif | patch-0-10400-1000-11400-#74953138.1.tif#.png | 1882 | 0.65 | |
| 36 | 74953138.1.tif | patch-800-0-1800-1000-#74953138.1.tif#.png | 342 | 0.60 | |
| 37 | 74953138.1.tif | patch-800-0-1800-1000-#74953138.1.tif#.png | SPRINGBURN | 0.43 | SPRINGBURN |
| 38 | 74953138.1.tif | patch-800-800-1800-1800-#74953138.1.tif#.png | HOTEL | 0.84 | HOTEL HOTEL HOTEL HOTEL |
| 39 | 74953138.1.tif | patch-800-800-1800-1800-#74953138.1.tif#.png | 105 | 0.68 | |
| 40 | 74953138.1.tif | patch-800-800-1800-1800-#74953138.1.tif#.png | 100 | 0.69 | |
| 41 | 74953138.1.tif | patch-800-800-1800-1800-#74953138.1.tif#.png | BOX | 0.59 | BOX BOX BOX BOX BOX BOX BOX |
| 42 | 74953138.1.tif | patch-800-800-1800-1800-#74953138.1.tif#.png | 103 | 0.68 | |
| 43 | 74953138.1.tif | patch-800-800-1800-1800-#74953138.1.tif#.png | LETTER | 0.80 | LETTER LETTER LETTER LETTER LETTER |
| 44 | 74953138.1.tif | patch-800-800-1800-1800-#74953138.1.tif#.png | 826 | 0.52 | |
| 45 | 74953138.1.tif | patch-800-800-1800-1800-#74953138.1.tif#.png | 834 | 0.56 | |
| 46 | 74953138.1.tif | patch-800-800-1800-1800-#74953138.1.tif#.png | WM | 0.62 | WM |
| 47 | 74953138.1.tif | patch-800-800-1800-1800-#74953138.1.tif#.png | BM1034 | 0.44 | BM1034 |
| 48 | 74953138.1.tif | patch-800-800-1800-1800-#74953138.1.tif#.png | STAR | 0.64 | STAR |
| 49 | 74953138.1.tif | patch-800-800-1800-1800-#74953138.1.tif#.png | WM | 0.49 | WM |
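
To see where post-processing changes things in tables like the two above, each row can be sorted into one of a few buckets. The sketch below is one way to do that; the bucket names and the `categorise` helper are illustrative assumptions, and the sample rows are copied from the MapTextRunner table.

```python
from collections import Counter


def categorise(text: str, post_processed: str) -> str:
    """Bucket one row by how post-processing changed the recognised text."""
    if not post_processed:
        return "empty"  # no post-processed value, as for the numeric rows
    if post_processed == text:
        return "unchanged"
    tokens = post_processed.split()
    # The input repeated over and over; the final copy may be cut off.
    if (
        len(tokens) > 1
        and all(token == text for token in tokens[:-1])
        and text.startswith(tokens[-1])
    ):
        return "repeated"
    return "changed"


# (text, post_processed) pairs taken from the MapTextRunner table.
rows = [
    ("342", ""),
    ("Renfrew", "Renfrew"),
    ("TWELET", "TWELET TWELET TWELET TWE"),
    ("Wales", "Wales Wales Wales Wales Wales"),
    ("W###", "W."),
]

counts = Counter(categorise(text, post) for text, post in rows)
print(counts)
```

Counting the buckets per runner gives a quick view of how often the model leaves text alone, repeats it, or actually rewrites it; only the "changed" rows then need checking against a reference transcription.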

rwood-97 added the text on maps label on Sep 12, 2024
Projects
Status: Done (review)