Skip to content

Conversation

jiqing-feng
Copy link
Contributor

Hi @ArthurZucker

Relate to 25856. I added a new parameter in config to control the stride for the first bottleneck layer in stages. Would you please help me review it? Thx!

@jiqing-feng jiqing-feng changed the title control first downsample stride Control first downsample stride in ResNet Sep 25, 2023
@jiqing-feng jiqing-feng mentioned this pull request Sep 27, 2023
Copy link
Contributor

@rafaelpadilla rafaelpadilla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A new configuration parameter is being suggested in the ResNet config to attend the needs of a new model. It was done to preserve backward compatibility by default. :)

However, the name of the suggested config parameter should be more intuitive to leverage other models. Could you please, make this change?

Copy link
Collaborator

@ArthurZucker ArthurZucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for opening the follow up PR 🤗

Comment on lines 203 to 220
if config.layer_type == "bottleneck":
self.layers = nn.Sequential(
# downsampling is done in the first layer with stride of 2
layer(
in_channels,
out_channels,
stride=stride,
activation=config.hidden_act,
reduce_first=config.reduce_first,
),
*[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
)
else:
self.layers = nn.Sequential(
# downsampling is done in the first layer with stride of 2
layer(in_channels, out_channels, stride=stride, activation=config.hidden_act),
*[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can be simplified 😉

Suggested change
if config.layer_type == "bottleneck":
self.layers = nn.Sequential(
# downsampling is done in the first layer with stride of 2
layer(
in_channels,
out_channels,
stride=stride,
activation=config.hidden_act,
reduce_first=config.reduce_first,
),
*[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
)
else:
self.layers = nn.Sequential(
# downsampling is done in the first layer with stride of 2
layer(in_channels, out_channels, stride=stride, activation=config.hidden_act),
*[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
)
reduce_first = config.reduce_first if config.layer_type == "bottleneck" else False
self.layers = nn.Sequential(
# downsampling is done in the first layer with stride of 2
layer(in_channels, out_channels, stride=stride, activation=config.hidden_act),
*[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
)

or something like this

@jiqing-feng
Copy link
Contributor Author

Hi @ArthurZucker @rafaelpadilla .

Thanks for your advice. The resnet backbone which downsamples in bottleneck layer can be found in resnet.

Copy link
Contributor

@rafaelpadilla rafaelpadilla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.
But including docstring in the constructors of ResNetBottleNeckLayer and ResNetBasicLayer seems important to me, as the new parameter downsample_in_bottleneck is introduced.

out_channels: int,
stride: int = 1,
activation: str = "relu",
downsample_in_bottleneck: bool = False,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the purpose of downsample_in_bottleneck here: it prevents one from distinguishing between ResNetBottleNeckLayer and ResNetBasicLayer. But it is not being use in this class.
So, I think it is valuable to include a docstring here, and clarify why downsample_in_bottleneck isn't used.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually let's not pollute this class with this, the if else logic should be done to have the correct args.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree! It's cleaner.

Copy link
Collaborator

@ArthurZucker ArthurZucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM appart for the arg that is not used. Let's have explicit and useful args, doing a if else is okay to chose the layer and more understandable

out_channels: int,
stride: int = 1,
activation: str = "relu",
downsample_in_bottleneck: bool = False,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually let's not pollute this class with this, the if else logic should be done to have the correct args.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@jiqing-feng
Copy link
Contributor Author

LGTM appart for the arg that is not used. Let's have explicit and useful args, doing a if else is okay to chose the layer and more understandable

Sure!

Copy link
Collaborator

@ArthurZucker ArthurZucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, this should be the last iteration!

Comment on lines 206 to 223
if config.layer_type == "bottleneck":
self.layers = nn.Sequential(
# downsampling is done in the first layer with stride of 2
layer(
in_channels,
out_channels,
stride=stride,
activation=config.hidden_act,
downsample_in_bottleneck=config.downsample_in_bottleneck,
),
*[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
)
else:
self.layers = nn.Sequential(
# downsampling is done in the first layer with stride of 2
layer(in_channels, out_channels, stride=stride, activation=config.hidden_act),
*[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No as I said before the only layer that is different here is the call. Something like

Suggested change
if config.layer_type == "bottleneck":
self.layers = nn.Sequential(
# downsampling is done in the first layer with stride of 2
layer(
in_channels,
out_channels,
stride=stride,
activation=config.hidden_act,
downsample_in_bottleneck=config.downsample_in_bottleneck,
),
*[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
)
else:
self.layers = nn.Sequential(
# downsampling is done in the first layer with stride of 2
layer(in_channels, out_channels, stride=stride, activation=config.hidden_act),
*[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)],
)
if config.layer_type == "bottleneck":
first_layer = layer(in_channels,out_channels,stride=stride,activation=config.hidden_act,downsample_in_bottleneck=config.downsample_in_bottleneck)
else:
first_layer = layer(in_channels, out_channels, stride=stride, activation=config.hidden_act)
self.layers = nn.Sequential(first_layer,*[layer(out_channels, out_channels, activation=config.hidden_act) for _ in range(depth - 1)])

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

something like this should be enough.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure!

Copy link
Collaborator

@ArthurZucker ArthurZucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for this round! merging 😉

@ArthurZucker ArthurZucker merged commit 592f2ea into huggingface:main Oct 10, 2023
Keracles added a commit to Keracles/transformers that referenced this pull request Oct 10, 2023
Control first downsample stride in ResNet (huggingface#26374)
@jiqing-feng jiqing-feng deleted the resnet branch November 22, 2023 01:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants