[Fix] add ignore keys if classes differ - KIE training #1271
Conversation
Codecov Report
@@           Coverage Diff           @@
##             main    #1271   +/-   ##
=======================================
  Coverage   95.75%   95.75%
=======================================
  Files         154      154
  Lines        6901     6903    +2
=======================================
+ Hits         6608     6610    +2
  Misses        293      293
Flags with carried forward coverage won't be shown.
# The number of class_names is not the same as the number of classes in the pretrained model =>
# remove the layer weights
Can you elaborate?
Yes, just like with the recognition models: if we want to fine-tune a pretrained model on another vocab, we have to reset the classifier head (and resize the embeddings accordingly), otherwise it leads to a shape mismatch.
We now have the same situation when training on a KIE dataset, since the number of classes varies; before, in normal detection training, there was always a single class (text), so it didn't matter.
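To make the mechanism concrete, here is a minimal, self-contained sketch of the idea with a toy model (simplified signature and names, not doctr's actual `load_pretrained_params` helper):

```python
# Toy illustration: drop the mismatched head weights before loading, so only the
# shared weights are restored and the head keeps its fresh initialization.
from torch import nn


def load_with_ignore_keys(model: nn.Module, state_dict: dict, ignore_keys=None) -> None:
    """Load pretrained weights, skipping the keys whose shapes no longer match."""
    if ignore_keys:
        state_dict = {k: v for k, v in state_dict.items() if k not in set(ignore_keys)}
        # strict=False keeps the freshly initialized parameters for the dropped keys
        model.load_state_dict(state_dict, strict=False)
    else:
        model.load_state_dict(state_dict)


# "Pretrained" model with a 2-class head vs. a new model with a 5-class head
pretrained_model = nn.Sequential(nn.Linear(8, 16), nn.Linear(16, 2))
new_model = nn.Sequential(nn.Linear(8, 16), nn.Linear(16, 5))

# Loading directly would raise a size-mismatch error on the last layer; ignoring
# its keys resets only the head while reusing the remaining weights.
load_with_ignore_keys(new_model, pretrained_model.state_dict(), ignore_keys=["1.weight", "1.bias"])
```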
This will reset the conv "head", which also makes it possible to use '--pretrained' for KIE detection training without impacting normal detection training and inference. Otherwise, '--pretrained' raises a shape mismatch while initializing the model with the pretrained checkpoint.
As an example, for CRNN:

    if pretrained:
        # The number of classes is not the same as the number of classes in the pretrained model =>
        # remove the last layer weights
        _ignore_keys = ignore_keys if _cfg["vocab"] != default_cfgs[arch]["vocab"] else None
        load_pretrained_params(model, _cfg["url"], ignore_keys=_ignore_keys)

with the architecture-specific builder passing in the keys to drop:

    ignore_keys=["linear.weight", "linear.bias"],
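For the detection side targeted by this PR, the same guard can be sketched with a toy model (illustrative names, not the actual doctr factory code): the head weights are only dropped when the requested class names differ from the defaults.

```python
from torch import nn

DEFAULT_CLASS_NAMES = ["words"]  # normal detection: a single text class


def build_det_model(class_names, pretrained_state=None):
    """Toy detection model: a 'backbone' plus a 'head' with one output per class."""
    model = nn.Sequential()
    model.add_module("backbone", nn.Linear(32, 16))
    model.add_module("head", nn.Linear(16, len(class_names)))
    if pretrained_state is not None:
        if class_names != DEFAULT_CLASS_NAMES:
            # Class count differs (KIE training) => drop the head weights and keep the
            # freshly initialized head instead of raising a shape mismatch
            state = {k: v for k, v in pretrained_state.items() if not k.startswith("head.")}
            model.load_state_dict(state, strict=False)
        else:
            # Normal detection training: load the full checkpoint as before
            model.load_state_dict(pretrained_state)
    return model


# Single-class checkpoint reused as-is...
det_model = build_det_model(DEFAULT_CLASS_NAMES)
# ...and reused for KIE training with several classes, resetting only the head.
kie_model = build_det_model(["date", "total", "company"], pretrained_state=det_model.state_dict())
```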
Super clear, thanks
This PR:
- makes it possible to use --pretrained if you use a pretrained det model and train on a KIE dataset

Any feedback is welcome 🤗
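For illustration, this is roughly what the fix enables from the Python API side (the `class_names` keyword is an assumption based on the discussion above, not verified against the doctr API):

```python
from doctr.models import detection

# Normal detection training: a single class, the pretrained checkpoint loads as before.
det_model = detection.db_resnet50(pretrained=True)

# KIE detection training: several classes => the head shape differs from the checkpoint,
# so its weights are ignored/reset instead of raising a shape mismatch.
kie_model = detection.db_resnet50(pretrained=True, class_names=["date", "total", "company"])
```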