
Conversation

@typhoonzero (Contributor)
Fix #3714

@lcy-seso (Contributor) left a comment

LGTM.

@lcy-seso (Contributor)

Just some questions:

  1. Does this PR mean that if I want to get parameter gradients, I should handle the EndForwardBackward event?

  2. Why could I previously still get gradients in CPU mode by handling only the EndIteration event? Is there a difference between CPU and GPU training?

@typhoonzero (Contributor, Author)

Does this PR mean that if I want to get parameter gradients, I should handle the EndForwardBackward event?

Yes, you must. Otherwise, all gradients will be zero.

Why could I previously still get gradients in CPU mode by handling only the EndIteration event? Is there a difference between CPU and GPU training?

Please see https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/trainer/ThreadParameterUpdater.cpp#L201. In v2 API training, an SgdThreadUpdater is created by default, and the default parameter update on GPU sets all gradients to zero immediately after the update.
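The behaviour described above can be sketched with a minimal, self-contained training loop (the event names mirror the discussion, but the classes and functions here are illustrative, not the real Paddle v2 API): the updater clears the gradient buffers immediately after applying the update, so the only window in which gradients are observable is the EndForwardBackward event, not EndIteration.

```python
# Illustrative sketch of why gradients read at EndIteration are zero:
# the updater zeroes gradient buffers right after the SGD step.

class EndForwardBackward:
    pass

class EndIteration:
    pass

def train_one_iteration(params, grads, lr, event_handler):
    # Forward/backward pass fills the gradient buffers.
    for name in grads:
        grads[name] = 1.0  # pretend backprop produced a nonzero gradient
    event_handler(EndForwardBackward(), grads)

    # SGD update, then the updater clears the gradients immediately
    # (the GPU behaviour of SgdThreadUpdater described in this thread).
    for name in params:
        params[name] -= lr * grads[name]
        grads[name] = 0.0
    event_handler(EndIteration(), grads)

seen = {}

def handler(event, grads):
    # Snapshot the gradient buffers at each event.
    seen[type(event).__name__] = dict(grads)

params, grads = {"w": 0.5}, {"w": 0.0}
train_one_iteration(params, grads, lr=0.1, event_handler=handler)
print(seen["EndForwardBackward"]["w"])  # 1.0 — gradient still visible
print(seen["EndIteration"]["w"])        # 0.0 — already cleared by the updater
```

Because CPU mode previously left the buffers untouched until the next iteration, reading them from EndIteration happened to work there; on GPU the clear happens eagerly, which is why this PR introduces the earlier hook.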

@lcy-seso (Contributor)

lcy-seso commented Sep 30, 2017

I see, thank you! I think I should also update the demos in the model repo after this PR.

@typhoonzero typhoonzero merged commit 3f87414 into PaddlePaddle:develop Oct 9, 2017
@typhoonzero typhoonzero deleted the fix_gpu_gradient_debug branch December 22, 2017 05:44