Fix quantization_config to work with vLLM v0.5.3.post1
#11 opened by davidthomas426
The entries in `modules_to_not_convert` need to name the linear layers themselves to work with vLLM; setting them to parent modules does not work, because vLLM ignores entries that don't match a linear layer.
Also updated the `_name_or_path` field to the correct HF model id.
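
For illustration only, a minimal sketch of the kind of change described above. The module names here are hypothetical (Llama-style layer naming); the real entries depend on the model's architecture and quantization setup:

```python
# Hypothetical example -- module names are illustrative, not taken from this repo's config.
# vLLM matches modules_to_not_convert entries against the linear layers themselves,
# so an entry that points at a parent module is silently ignored.

# Ignored by vLLM: names a parent module, not a linear layer.
quantization_config_before = {
    "modules_to_not_convert": ["model.layers.0.mlp"],
}

# Works with vLLM: names the linear layer directly.
quantization_config_after = {
    "modules_to_not_convert": ["model.layers.0.mlp.gate_proj"],
}
```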
Thanks, LGTM
ArthurZ changed pull request status to merged