Can't use inference
#2
by
llamameta
- opened
The model huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated is too large to be loaded automatically (65GB > 10GB),
while I can use the original Qwen model here: https://huggingface.co/spaces/llamameta/Qwen2.5-Coder-32B-Instruct-Chat-Assistant
Please refer to the Qwen2.5 Speed Benchmark.