Discrepancy between num_features and the classifier's in_features
Sorry for opening a new Discussion, but I think it may help others.
```python
import timm

model = timm.create_model('timm/mobilenetv4_conv_small.e2400_r224_in1k', pretrained=True)
print(model.num_features)  # 960
```
After running these two instructions, the printed number of features is 960, but the final Linear classifier layer has `in_features=1280` (which is also the default value, as you can see from the MobileNetV3 class implementation on GitHub).
However, I can't figure out why the printed `num_features` doesn't match the effective number of features going into the classification head.
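For reference, this is how I checked the classifier input size (via timm's `get_classifier()` accessor):

```python
print(model.get_classifier())
# Linear(in_features=1280, out_features=1000, bias=True)
```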
Thank you all!
@Elsospi mnv4 is like mnv3: it has a linear layer after global pool that is considered part of the head, so it's a bit different from other CNNs.
So `num_features` matches the features of `forward_features()`, which is a spatial feature map. `head_hidden_size` is the pooled features after the last (EDIT: last meaning the last one before the classifier, the penultimate) linear layer in the head. You need to use `forward_head(pre_logits=True)`, or set `num_classes=0` / call `reset_classifier()`, to get those pre-logits features.
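A minimal sketch of those three routes (shapes shown assume a 224x224 input and this mnv4 conv_small model):

```python
import timm
import torch

model = timm.create_model('timm/mobilenetv4_conv_small.e2400_r224_in1k', pretrained=True)
x = torch.randn(1, 3, 224, 224)

# 1) keep the classifier but stop right before it
pre_logits = model.forward_head(model.forward_features(x), pre_logits=True)
print(pre_logits.shape)  # torch.Size([1, 1280])

# 2) create the model with no classifier at all
backbone = timm.create_model('timm/mobilenetv4_conv_small.e2400_r224_in1k',
                             pretrained=True, num_classes=0)
print(backbone(x).shape)  # torch.Size([1, 1280])

# 3) remove the classifier from an existing model
model.reset_classifier(0)
print(model(x).shape)  # torch.Size([1, 1280])
```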
Thank you so much for the answers, both here and on the other question; it's clearer now.
Just a note: once you cut the classification head with `num_classes=0` and want to customise the head (specifically, when using the model as a backbone for a siamese neural network with a parametrised `embedding_size`), you have to take that last linear layer into account, so the number of features you'll be working with is 1280 rather than 960.
Example:

```python
# inside the __init__ of the embedding module
self.base_model = base_model  # MNV4 created with num_classes=0
self.flatten = nn.Flatten()
self.fc = nn.Linear(1280, embedding_size)  # !!! 1280, not 960
self.l2_norm = nn.functional.normalize  # L2-normalise the embedding in forward()
```
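For context, here is a self-contained sketch of how those pieces might fit together; the `SiameseEmbedder` class name and the wiring are my own illustration, not from the thread:

```python
import timm
import torch
from torch import nn

class SiameseEmbedder(nn.Module):
    # hypothetical wrapper: MNV4 backbone + linear projection to embedding_size
    def __init__(self, embedding_size: int):
        super().__init__()
        self.base_model = timm.create_model(
            'timm/mobilenetv4_conv_small.e2400_r224_in1k',
            pretrained=True, num_classes=0)  # outputs pooled 1280-dim features
        self.fc = nn.Linear(1280, embedding_size)

    def forward(self, x):
        feats = self.base_model(x)  # (B, 1280), already pooled and flattened
        return nn.functional.normalize(self.fc(feats), dim=1)  # L2-normalised

print(SiameseEmbedder(embedding_size=128)(torch.randn(2, 3, 224, 224)).shape)
# torch.Size([2, 128])
```

Note that with `num_classes=0` the timm head already pools and flattens, so the extra `nn.Flatten()` in the snippet above is harmless but redundant.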
@Elsospi yes, using the 'generic' model interface that works across all models, this is the case. But if you know the model structure, you can modify it to remove that layer: `model.conv_head = nn.Identity()` and `model.conv_norm = nn.Identity()` (if this one exists).
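A sketch of that surgery; the head attribute names differ across timm versions and architectures, so this checks before replacing (`conv_norm` is quoted from the reply above, `norm_head` is my guess at an alternative name):

```python
import timm
import torch
from torch import nn

model = timm.create_model('timm/mobilenetv4_conv_small.e2400_r224_in1k',
                          pretrained=True, num_classes=0)

# swap the 960 -> 1280 head projection (and its norm, if present) for no-ops
for name in ('conv_head', 'conv_norm', 'norm_head'):
    if hasattr(model, name):
        setattr(model, name, nn.Identity())

print(model(torch.randn(1, 3, 224, 224)).shape)  # torch.Size([1, 960])
```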
You can also call forward_features(), get the unpooled output at 960 channels, and then pool to your liking in a custom head.
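For example, with global average pooling as an arbitrary choice:

```python
import timm
import torch

model = timm.create_model('timm/mobilenetv4_conv_small.e2400_r224_in1k', pretrained=True)
feats = model.forward_features(torch.randn(1, 3, 224, 224))
print(feats.shape)               # torch.Size([1, 960, 7, 7]), spatial map
pooled = feats.mean(dim=(2, 3))  # custom global average pool
print(pooled.shape)              # torch.Size([1, 960])
```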