[onert] Separate forward and backward configuration of trainable ops #13033
Conversation
This PR separates the forward and backward configuration of trainable ops in order to make it possible to disable backward propagation for particular ops. ONE-DCO-1.0-Signed-off-by: Mateusz Bencer <m.bencer@partner.samsung.com>
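For context, a minimal sketch of the split this PR introduces, assuming hypothetical stand-in types (ITensor and TrainableLayer here are illustrative, not the actual onert classes): forward configuration stays unconditional, while backward configuration becomes a separate step that can be skipped for ops excluded from backpropagation.

// Hypothetical sketch, not the actual onert API: forward and backward
// setup are split so the backward part can be skipped per op.
struct ITensor; // stand-in for onert's tensor interface

class TrainableLayer
{
public:
  // Forward-only setup: always called.
  void configure(const ITensor *input, ITensor *output)
  {
    _input = input;
    _output = output;
  }

  // Backward setup: called only for ops that participate in backpropagation.
  void configureBackward(const ITensor *back_prop_input, ITensor *back_prop_output)
  {
    _back_prop_input = back_prop_input;
    _back_prop_output = back_prop_output;
  }

private:
  const ITensor *_input = nullptr;
  ITensor *_output = nullptr;
  const ITensor *_back_prop_input = nullptr;
  ITensor *_back_prop_output = nullptr;
};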
fn->configure(input_tensor, output_tensor, back_prop_input_tensor, back_prop_output_tensor,
              node.param().alpha, node.param().beta,
              convertElementwiseActivationType(node.param().op_type));
auto convertToInferActivationType = [](const ir::operation::ElementwiseActivation::Type &type) {
Shouldn't alpha also be considered?
switch (op_type)
{
  case ElementwiseActivationType::kReLU:
    ...
    if ((alpha == std::numeric_limits<float>::infinity() || alpha == 6.0f) && beta == 0.f)
    {
      cpu::ops::ElementwiseActivationLayer::configure(
        input, output, alpha, beta, cpu::ops::ElementwiseActivationType::kReLU);
It's already checked inside ElementwiseActivationLayer::configureBackward:

if ((alpha == std::numeric_limits<float>::infinity() || alpha == 6.0f) && beta == 0.f)

Do you prefer to have it here as well?
Ah, the alpha is not an issue. Why do you define the convertToInferActivationType function again? Can't you use the convertElementwiseActivationType function?
Here we're converting ir::operation::ElementwiseActivation::Type to cpu::ops::ElementwiseActivationType, while convertElementwiseActivationType returns train::ops::ElementwiseActivationType. In the previous version this conversion was made implicitly (ONE/runtime/onert/backend/train/ops/ElementwiseActivationLayer.cc, lines 53 to 61 in 8dd9291):
switch (op_type)
{
  case ElementwiseActivationType::kReLU:
    if (input->data_type() == OperandType::FLOAT32)
    {
      if ((alpha == std::numeric_limits<float>::infinity() || alpha == 6.0f) && beta == 0.f)
      {
        cpu::ops::ElementwiseActivationLayer::configure(
          input, output, alpha, beta, cpu::ops::ElementwiseActivationType::kReLU);
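For illustration, a hedged sketch of such an explicit conversion helper; IrActivationType and CpuActivationType are placeholders for the ir and cpu::ops types named above, with illustrative enum values only.

#include <stdexcept>

// Stand-ins for ir::operation::ElementwiseActivation::Type and
// cpu::ops::ElementwiseActivationType.
enum class IrActivationType { kReLU, kTanh };
enum class CpuActivationType { kReLU, kTanh };

// Explicit conversion instead of the implicit one shown above.
CpuActivationType convertToInferActivationType(IrActivationType type)
{
  switch (type)
  {
    case IrActivationType::kReLU:
      return CpuActivationType::kReLU;
    case IrActivationType::kTanh:
      return CpuActivationType::kTanh;
    default:
      throw std::runtime_error{"Unsupported activation type"};
  }
}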
I understand. The train KernelGenerator selectively calls the cpu and train configuration functions according to the op.isRequiredForBackward() value. To support this, these changes had to be implemented in the KernelGenerator, so the activation type has to be converted to the cpu or train type depending on the situation.
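As a hedged sketch of that dispatch (Op and Layer are stand-in types, not onert's real classes; only the control flow matters here):

// Stand-in types; onert's KernelGenerator, operation, and layer classes differ.
struct Op
{
  bool required_for_backward = false;
  bool isRequiredForBackward() const { return required_for_backward; }
};

struct Layer
{
  void configure() { /* convert to cpu::ops activation type here */ }
  void configureBackward() { /* convert to train::ops activation type here */ }
};

// Forward setup is unconditional; backward setup runs only when the op
// participates in backpropagation.
void generateKernel(const Op &op, Layer &layer)
{
  layer.configure();
  if (op.isRequiredForBackward())
    layer.configureBackward();
}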
LGTM
LGTM
Sorry for being late; I'm reading the code right now.
LGTM 👍
Draft: #12951
Issue: #12386