100b-int4 model size mismatch for transformer.encoder.layers.0.self_attention.query_key_value.weight

#2
by Chunfa - opened

RuntimeError: Error(s) in loading state_dict for ProteinGLMForMaskedLM:
size mismatch for transformer.encoder.layers.0.self_attention.query_key_value.weight: copying a param with shape torch.Size([30720, 5120]) from checkpoint, the shape in current model is torch.Size([30720, 10240]).

I suspect there may be some differences between the settings described in the paper and the actual configuration used in the model. Would it be possible for you to share the config.json file that was used in your original setup?
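For anyone hitting the same error: the second dimension of a fused QKV Linear weight is the hidden size, so a checkpoint shape of (30720, 5120) versus a model shape of (30720, 10240) points at a hidden-size disagreement between config.json and the checkpoint. A minimal sketch for confirming this, assuming a sharded PyTorch checkpoint and a GLM-style config with a hidden_size field (the repo id and shard name below are placeholders, not the actual paths):

import torch
from transformers import AutoConfig

MODEL_ID = "path/or/repo-of-proteinglm-100b-int4"   # assumption: substitute the real repo id
SHARD = "pytorch_model-00001-of-000NN.bin"          # assumption: substitute a real shard name

KEY = "transformer.encoder.layers.0.self_attention.query_key_value.weight"

# Shape stored in the checkpoint (the "copying a param with shape ..." side of the error).
state_dict = torch.load(SHARD, map_location="cpu")
print("checkpoint shape:", tuple(state_dict[KEY].shape))  # e.g. (30720, 5120)

# Hidden size the current config would use to build that layer
# (this should match the second dimension printed above).
config = AutoConfig.from_pretrained(MODEL_ID, trust_remote_code=True)
print("config hidden_size:", getattr(config, "hidden_size", None))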

Already fixed with the new config file.
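If you still see the mismatch after the fix, a cached copy of the old config.json may be the cause. A hedged sketch of reloading with a freshly downloaded config (the repo id is again a placeholder):

from transformers import AutoConfig, AutoModelForMaskedLM

MODEL_ID = "path/or/repo-of-proteinglm-100b-int4"  # assumption: substitute the real repo id

# force_download=True bypasses the local cache and fetches the updated config.json.
config = AutoConfig.from_pretrained(MODEL_ID, trust_remote_code=True, force_download=True)
model = AutoModelForMaskedLM.from_pretrained(
    MODEL_ID, config=config, trust_remote_code=True, torch_dtype="auto"
)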

Chunfa changed discussion status to closed
