Haotong Qin

HaotongQin
AI & ML interests

Model Compression, Efficient AIGC

Organizations

Posts 1

We release an empirical study to showcase "How Good Are Low-bit Quantized #LLaMA3 🦙 Models" with existing LLM quantization techniques!

In this study, the performance of the low-bit LLaMA3 models (especially LLaMA3-70B) is impressively notable. 🚀 However, the results also expose significant performance degradation issues faced by existing quantization techniques when dealing with LLaMA3, especially under ultra-low bit-widths.

We hope this study can serve as a reference for the LLM quantization community and promote the emergence of stronger LLM quantization methods in the context of LLaMA3's release. More work is on the way...
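The degradation described above can be illustrated with a minimal round-to-nearest (RTN) baseline, the simplest form of uniform weight quantization. This is an illustrative sketch only, not the method evaluated in the study; the function name and synthetic Gaussian weights are assumptions for the demo:

```python
import numpy as np

def quantize_rtn(w, bits):
    # Symmetric round-to-nearest (RTN) uniform quantization:
    # map weights onto 2**bits signed integer levels, then dequantize.
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

# Synthetic "weights" (real LLM weights are not exactly Gaussian).
rng = np.random.default_rng(0)
w = rng.normal(size=1000).astype(np.float32)

# Reconstruction error grows sharply as the bit-width shrinks,
# mirroring the ultra-low-bit degradation noted in the post.
for bits in (8, 4, 2):
    mse = np.mean((w - quantize_rtn(w, bits)) ** 2)
    print(f"{bits}-bit RTN MSE: {mse:.6f}")
```

At 8 bits the reconstruction error is negligible, while at 2 bits only four levels remain and the error dominates; stronger methods (e.g. group-wise scales or calibration-based approaches) exist precisely to soften this cliff.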

How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study (2404.14047)

https://huggingface.co/collections/LLMQ/llama3-quantization-66251258525135aeda16513c

models

None public yet

datasets

None public yet