feat: add eos_token_id to generation_config.json (needed by vllm infer)
#12 opened by wxsm
No description provided.
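For context: vLLM reads `eos_token_id` from `generation_config.json` to decide when to stop decoding, which is why this PR adds the field. A minimal sketch to verify the field is present after the change (the model ID below is an assumption, since the thread does not name the repo the PR targets):

```python
# Quick check that eos_token_id is now present in generation_config.json.
# "OpenGVLab/InternVL2-8B" is an assumed model ID, not stated in this thread.
from transformers import GenerationConfig

cfg = GenerationConfig.from_pretrained("OpenGVLab/InternVL2-8B")
print(cfg.eos_token_id)  # vLLM uses this to know when to stop generating
```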
Could you please share a Python script to serve InternVL2-8B with vLLM using the OpenAI chat completions API?
You can replace `/model` below with a Hugging Face model ID (e.g. `OpenGVLab/InternVL2-8B`):
vllm serve /model --port 8000 \
--trust-remote-code \
--served-model-name internvl2-internlm2 \
--enable-chunked-prefill False  # currently required; otherwise inference will fail
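Not an official script, but a minimal sketch of querying the server started above through its OpenAI-compatible endpoint. The port and served model name match the `vllm serve` command; the prompt is just an example:

```python
# Minimal OpenAI-compatible client for the vLLM server above.
# Port 8000 and model name "internvl2-internlm2" come from the serve command;
# the API key is a placeholder (vLLM accepts any key unless --api-key is set).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="internvl2-internlm2",
    messages=[{"role": "user", "content": "Describe what InternVL2 can do."}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```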
Thanks for your efforts and time!
czczup changed pull request status to merged