Why is the Pixtral activation function "gelu" when the reference code uses "silu"?
#10, opened by mgoin
In the vision_config, hidden_act is set to gelu: https://huggingface.co/mistral-community/pixtral-12b/blob/54521e820bfe9c740aea0c91e1ee52cf55420ba2/config.json#L30
However, the reference implementation uses silu as the activation function: https://github.com/vllm-project/vllm/blob/717a5f82cda6dd6a52be6504179adaa64bbdc67a/vllm/model_executor/models/pixtral.py#L390
The activation function should indeed be "silu". It would be nice if we could correct the implementation here.
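For context on why the mismatch matters, here is a minimal sketch comparing the two activations using their standard textbook definitions. The function names `gelu` and `silu` below are illustrative helpers written for this comparison, not code taken from the Pixtral config or the vLLM implementation, and the exact (erf-based) form of GELU is assumed rather than the tanh approximation.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def silu(x: float) -> float:
    """SiLU (also called swish): x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


# Print both activations side by side for a few sample inputs.
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"x={x:+.1f}  gelu={gelu(x):+.4f}  silu={silu(x):+.4f}")
```

The two curves have a similar shape but give different values for most inputs, so running weights trained with one activation through the other would likely shift the vision encoder's outputs.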
Resolved by https://huggingface.co/mistral-community/pixtral-12b/discussions/14, thanks!
mgoin changed discussion status to closed