
Adjust at the end #72

Open

FauconFan opened this issue Feb 14, 2018 · 6 comments

@FauconFan

Hi,

I don't know if this was already reported, but it is still not fixed in the code.
When you train your model, you compute the weight deltas between the hidden layer and the output layer.
But you update the weights too early: you then use the already-updated weights to compute the deltas for the previous layers (for the back-propagation).
You have to keep the deltas in memory and apply them only at the end, after everything else, as in the sketch below.
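For example, a minimal sketch of that ordering in plain JavaScript, with one neuron per layer so every weight is a single number (the names and the structure here are made up for illustration, this is not your actual code):

const sigmoid = x => 1 / (1 + Math.exp(-x));
const dsigmoid = y => y * (1 - y); // derivative, given the activation value

function trainOnce(weights, biases, input, target, lr) {
    // forward pass, keeping every activation
    const acts = [input];
    for (let i = 0; i < weights.length; i++) {
        acts.push(sigmoid(weights[i] * acts[i] + biases[i]));
    }

    // backward pass: collect the deltas, do NOT touch the weights yet
    const wDeltas = [];
    const bDeltas = [];
    let error = target - acts[acts.length - 1];
    for (let i = weights.length - 1; i >= 0; i--) {
        const gradient = error * dsigmoid(acts[i + 1]);
        wDeltas[i] = lr * gradient * acts[i];
        bDeltas[i] = lr * gradient;
        error = weights[i] * gradient; // still the OLD weight
    }

    // only now apply every update
    for (let i = 0; i < weights.length; i++) {
        weights[i] += wDeltas[i];
        biases[i] += bDeltas[i];
    }
}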

That's why your model takes so long to train on the XOR problem; it shouldn't take that long. From memory, you used 50000 iterations, which is far too many for a problem like this.

Keep going ^^.

P.S.: Sorry if my English is not perfect, it is not my native language.

@xxMrPHDxx

I wonder if gradients should be updated in the same way?

Sorry, I don't know the algorithm well

@FauconFan
Author

I don't think it matters, or maybe I didn't understand your comment... sorry ^^

About the algorithm: the order is important, because if the weights are updated too early, the following back-propagation steps are disturbed. For a 3-layer neural network it doesn't change much, but for a deeper network it is very important... (see the rough numbers below)

:)
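A rough numeric illustration of what goes wrong (made-up scalar numbers, nothing from the actual code): the error pushed back to the hidden layer must use the weight value that produced the forward pass, not the freshly updated one.

const outputError = 0.5;
const oldWeight = 0.8;                 // hidden->output weight used during the forward pass
const updatedWeight = 0.8 + 0.1 * 0.5; // the same weight if it was already updated

console.log(oldWeight * outputError);     // 0.4   -> the hidden error you want
console.log(updatedWeight * outputError); // 0.425 -> a biased hidden error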

@xxMrPHDxx

Based on what you mentioned earlier, I assume that the calculation is done as follows:

deltas = []
for layer of layers:
    delta = calculateDelta()
    deltas.append(delta)

    calculateGradient()
    bias.addGradient()

for delta of deltas:
    weights.add(delta)

Am I right?

@FauconFan
Author

Yes, but the bias must be treated in the same way as the weights, for the same reason.

And you assume that the gradients and the deltas are two different things, but the deltas are calculated from the gradients (just in case that was not clear).

So now:

deltas_w = []
deltas_bias = []
for layer of layers:
    gradient = calculateGradient()
    delta = calculateDelta(gradient)
    deltas_w.append(delta)
    deltas_bias.append(gradient) // according to the model

// the deltas already include the learning rate and the -1 factor for the descent part
for delta of deltas_w:
    weights.add(delta)
for b of deltas_bias:
    bias.add(b)

This algorithm may be wrong, and may only work for a convolutional neural network.

Hope it helps
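For what it's worth, here is a more concrete, runnable sketch of the same idea in plain JavaScript for a network with one hidden layer (the structure, helper names and hyper-parameters are all made up, this is not the library's code): every delta is computed with the current weights, and the updates are applied only at the end.

const sigmoid = x => 1 / (1 + Math.exp(-x));
const dsigmoid = y => y * (1 - y);

function makeLayer(nInputs, nOutputs) {
    return {
        weights: Array.from({ length: nOutputs }, () =>
            Array.from({ length: nInputs }, () => Math.random() * 2 - 1)),
        bias: Array.from({ length: nOutputs }, () => Math.random() * 2 - 1),
    };
}

function feed(layer, input) {
    return layer.weights.map((row, j) =>
        sigmoid(row.reduce((sum, w, i) => sum + w * input[i], layer.bias[j])));
}

function train(hidden, output, input, target, lr) {
    // forward pass
    const h = feed(hidden, input);
    const o = feed(output, h);

    // backward pass: compute every delta first, with the old weights
    const oDelta = o.map((v, j) => (target[j] - v) * dsigmoid(v));
    const hErr = h.map((_, i) =>
        output.weights.reduce((sum, row, j) => sum + row[i] * oDelta[j], 0));
    const hDelta = h.map((v, i) => hErr[i] * dsigmoid(v));

    // update pass: only now touch the weights and biases
    output.weights.forEach((row, j) =>
        row.forEach((_, i) => { row[i] += lr * oDelta[j] * h[i]; }));
    output.bias.forEach((_, j) => { output.bias[j] += lr * oDelta[j]; });
    hidden.weights.forEach((row, j) =>
        row.forEach((_, i) => { row[i] += lr * hDelta[j] * input[i]; }));
    hidden.bias.forEach((_, j) => { hidden.bias[j] += lr * hDelta[j]; });
}

// rough usage on XOR (hyper-parameters picked arbitrarily)
const hidden = makeLayer(2, 4);
const output = makeLayer(4, 1);
const data = [[[0, 0], [0]], [[0, 1], [1]], [[1, 0], [1]], [[1, 1], [0]]];
for (let n = 0; n < 10000; n++) {
    const [x, y] = data[Math.floor(Math.random() * data.length)];
    train(hidden, output, x, y, 0.3);
}
data.forEach(([x]) => console.log(x, feed(output, feed(hidden, x))));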

@xxMrPHDxx

"but bias must be treated in the same way"

That's what I meant earlier. Thanks for the info.

@FauconFan
Author

You're welcome.
