Can't use inference

#2
by llamameta - opened

The model huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated is too large to be loaded automatically (65GB > 10GB)

Meanwhile, I can use the original Qwen model here: https://huggingface.co/spaces/llamameta/Qwen2.5-Coder-32B-Instruct-Chat-Assistant

Please refer to the Qwen2.5 Speed Benchmark.
