-
Notifications
You must be signed in to change notification settings - Fork 305
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
It does'nt work for dectecting 1600 objects for the funtion "torch.clamp(classification, min=1e-4, max=1.0 - 1e-4)" in focol_loss #131
Comments
The clamp function probably improves stability in some cases but is very much unnecessary as you can switch to using the "with logits" version of the focal loss as is used in the tensorflow version of the official code (quote below from the official code's comments)
|
Good job, thank you very much , i will try it
Have a nice day to you
[email protected]
From: rmcavoy
Date: 2020-03-26 02:37
To: toandaominh1997/EfficientDet.Pytorch
CC: y78h11b09; Author
Subject: Re: [toandaominh1997/EfficientDet.Pytorch] It does'nt work for dectecting 1600 objects for the funtion "torch.clamp(classification, min=1e-4, max=1.0 - 1e-4)" in focol_loss (#131)
The clamp function probably improves stability in some cases but is very much unnecessary as you can switch to using the "with logits" version of the focal loss as is used in the tensorflow version of the official code (quote below from the official code's comments)
# Below are comments/derivations for computing modulator.
# For brevity, let x = logits, z = targets, r = gamma, and p_t = sigmod(x)
# for positive samples and 1 - sigmoid(x) for negative examples.
#
# The modulator, defined as (1 - P_t)^r, is a critical part in focal loss
# computation. For r > 0, it puts more weights on hard examples, and less
# weights on easier ones. However if it is directly computed as (1 - P_t)^r,
# its back-propagation is not stable when r < 1. The implementation here
# resolves the issue.
#
# For positive samples (labels being 1),
# (1 - p_t)^r
# = (1 - sigmoid(x))^r
# = (1 - (1 / (1 + exp(-x))))^r
# = (exp(-x) / (1 + exp(-x)))^r
# = exp(log((exp(-x) / (1 + exp(-x)))^r))
# = exp(r * log(exp(-x)) - r * log(1 + exp(-x)))
# = exp(- r * x - r * log(1 + exp(-x)))
#
# For negative samples (labels being 0),
# (1 - p_t)^r
# = (sigmoid(x))^r
# = (1 / (1 + exp(-x)))^r
# = exp(log((1 / (1 + exp(-x)))^r))
# = exp(-r * log(1 + exp(-x)))
#
# Therefore one unified form for positive (z = 1) and negative (z = 0)
# samples is:
# (1 - p_t)^r = exp(-r * z * x - r * log(1 + exp(-x))).
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub, or unsubscribe.
|
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Recently, I found that efficientDet-do didn't work for 1600 objects because cls_loss still didn't downgrade in training.
So , i modified the funtion **"torch.clamp(classification, min=1e-4, max=1.0 - 1e-4)"
into "torch.clamp(classification, min=1e-8, max=1.0 - 1e-8)" in focol_loss,
and then efficientDet-d0 can be traind on for 1600 objects.
i
Who can tell me what advantage is the funtion of torch.clamp() in focal_loss? i think it should be removed completely!
The text was updated successfully, but these errors were encountered: