LayerNorm missing in state dict of DaViT L

#43
by Matagi - opened

I wanted to test just the vision encoder on downstream tasks, so I downloaded the model file. Then I noticed that I can't load the model in strict mode, because "norms.weight" and "norms.bias" are missing from the checkpoint. Is this intentional? With 4048 parameters in total missing (and therefore reset to their initial values), I would guess this degrades performance when the frozen weights are used for distillation, whereas the missing parameters could be relearned when fine-tuning.


Microsoft org

Hi @Matagi, vision_tower.norms is not used in inference. You can refer to https://huggingface.co/microsoft/Florence-2-large/blob/15aa04e200389df2ccb00e2eb94d551284e45df1/modeling_florence2.py#L2603 for more details.
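
If the norms keys really are unused, one way to keep most of the safety of a strict load is to load non-strictly and then check that nothing else is missing. Below is a minimal sketch of that check. The helper name `split_missing_keys` and the suffix list are hypothetical and are taken only from the key names quoted in this thread; the fully qualified key names in your model may carry a different prefix.

```python
from typing import Iterable, List, Tuple

# Key names reported as missing in this thread. Treating them as safe to
# ignore is an assumption based on the reply that vision_tower.norms is
# not used in inference.
IGNORABLE_SUFFIXES = ("norms.weight", "norms.bias")


def split_missing_keys(
    missing_keys: Iterable[str],
    ignorable_suffixes: Tuple[str, ...] = IGNORABLE_SUFFIXES,
) -> Tuple[List[str], List[str]]:
    """Split missing keys into (ignorable, other).

    A key counts as ignorable if it equals one of the suffixes or ends
    with "." + suffix, so "vision_tower.norms.bias" matches but
    "xnorms.bias" does not.
    """
    ignorable: List[str] = []
    other: List[str] = []
    for key in missing_keys:
        if any(key == s or key.endswith("." + s) for s in ignorable_suffixes):
            ignorable.append(key)
        else:
            other.append(key)
    return ignorable, other


if __name__ == "__main__":
    # Hypothetical key list for illustration only.
    demo = ["norms.weight", "norms.bias", "blocks.0.fc.weight"]
    ignorable, other = split_missing_keys(demo)
    print("ignorable:", ignorable)
    print("other:", other)
```

With PyTorch, `model.load_state_dict(state_dict, strict=False)` returns an object with `missing_keys` and `unexpected_keys`, so you can pass its `missing_keys` to the helper and raise an error if the second list is not empty.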
