-
Notifications
You must be signed in to change notification settings - Fork 19
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
TLDR-635 retrain paragraph, diplomas classifiers
TLDR-635 refresh code for old datasets; rewrite Paragraph_feature_extractor TLDR-635 fix TableAnnotation TLDR-635 delete FirstWordFeatures TLDR-635 delete FeatureExtractors.fit(..) and fit_transform(..) TLDR-635 move LineEpsDataSet into line_lstm_classifier_trainer.py TLDR-635 rewrite LineWithMetaExtraction TLDR-635 Base line classifier retrain TLDR-635 fixed paragraph tests TLDR-635 refactor TocFeatureExtractor TLDR-635 retrain diploma classifier TLDR-635 rewrite paragraph added normalization indent_prev; indent_next; indent_prev_right; cancel local normalization by page_width TLDR-635 retrain paragraph: change calculate upperletter percent, is_capitalize TLDR-635 retrain paragraphs: added bold features TLDR-635 retrain paragraphs: added list features TLDR-635 retrain paragraph classifier TLDR-635 retrain paragraph classifier TLDR-635 retrain paragraph classifier TLDR-635 fixed labeling tests
- Loading branch information
Showing
55 changed files
with
1,049 additions
and
725 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
117 changes: 0 additions & 117 deletions
117
dedoc/readers/pdf_reader/pdf_image_reader/paragraph_extractor/paragraph_features.py
This file was deleted.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.