Unable to control trainable layers in the Vision Tower
It seems that no matter what setup I choose, whichever layers in the Vision Tower I mark as trainable, like here:
for name, param in self.llm.named_parameters():
    param.requires_grad = (
        ('.ln.' in name.lower()
         or 'norm' in name.lower()
         or 'transformer.h.0' in name.lower()
         or 'vision_model.encoder.layers.0.' in name.lower()
         or 'vision_model.encoder.layers.1.' in name.lower())
        and 'out_proj' not in name.lower()
    )
it changes nothing: the training behavior stays exactly the same. Why is that?
(I printed which layers are trainable and everything looks fine. PyTorch clearly registered my changes in the Vision Tower, since the number of trainable weights changed, and I know the mechanism works because changing the trainable layers on the LLM side has an actual impact on the training/validation loss and on the weight values.)
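One common cause of exactly this symptom (`requires_grad=True` is set, the trainable-parameter count changes, yet training is unaffected) is that the vision tower's forward pass runs under `torch.no_grad()` or detaches its output, so gradients never reach those parameters. The toy module below is a hypothetical stand-in, not the actual imp code, sketching how that failure mode looks:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a vision tower whose forward pass runs under
# torch.no_grad(). If the real model does this anywhere, setting
# requires_grad=True on its parameters has no effect on training.
class VisionTower(nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(4, 4)

    def forward(self, x):
        # Gradients never flow through this block, whatever requires_grad says.
        with torch.no_grad():
            return self.proj(x)

model = VisionTower()
for p in model.parameters():
    p.requires_grad = True  # looks trainable...

out = model(torch.randn(2, 4))
print(out.requires_grad)  # False -> no grad_fn, so backward() can't update proj
```

If the output of the tower in the real model prints `requires_grad=False` like this, the problem is in the forward path, not in your parameter loop.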
We will try to figure this out. Are you trying to fine-tune imp on your custom datasets?
Yes, I am trying to do that, and I did succeed, but the vision tower is resisting me.
Ah okay, sure. I was hoping I could just stick to my usual Hugging Face model training, but I can try that too when I find some time.
I know now: the vision tower must be extracted from the model to force its weights to actually train.
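A quick way to confirm a fix like this worked is to check, after one backward pass, which parameters actually received gradients: a parameter with `requires_grad=True` but `grad is None` was detached somewhere upstream. This is a generic diagnostic sketch with a made-up toy model (`vision`/`llm` names are illustrative, not the imp API):

```python
import torch
import torch.nn as nn

def report_grad_flow(model: nn.Module) -> dict:
    """Map each trainable parameter name to whether a gradient reached it."""
    return {
        name: p.grad is not None
        for name, p in model.named_parameters()
        if p.requires_grad
    }

# Toy model: the .detach() simulates a vision tower cut off from the graph.
class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.vision = nn.Linear(3, 3)
        self.llm = nn.Linear(3, 1)

    def forward(self, x):
        feats = self.vision(x).detach()  # gradients stop here
        return self.llm(feats)

m = Toy()
loss = m(torch.randn(2, 3)).sum()
loss.backward()
flow = report_grad_flow(m)
print(flow)  # vision.* entries are False, llm.* entries are True
```

Running this after your training step on the real model should show `False` for every vision-tower parameter if the detachment problem is still present, and `True` once the extracted tower is wired into the graph.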