Seeking information about smashing
#2
by
ewre324
- opened
Hello, to be frank I am quite impressed by your work. Can you please point me to resources to understand smashing in detail?
I am happy that you like what we do :) In the context of Pruna, smashing refers to applying any compression method, or a combination of several, to an ML model. This could include quantization, pruning, compilation, or many other compression methods. On each model page, we provide a smash_config.json which details the parameters used to compress the model. We also constantly try to update our documentation here. E.g., for this model we use llm-int8 quantization from the great bitsandbytes (bnb) to compress the model.
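To give a rough feel for what 8-bit quantization does to weights, here is a minimal sketch in plain Python. It is only an illustration of simple absmax int8 quantization, not the actual llm-int8 implementation in bitsandbytes (which additionally keeps outlier values in higher precision) and not Pruna's code. The function names and the example weights are made up for this sketch.

```python
def quantize_absmax_int8(weights):
    """Map floats to integers in [-127, 127] using one shared absmax scale."""
    absmax = max(abs(w) for w in weights)
    if absmax == 0.0:
        # All-zero input: any scale works, so pick 1.0 to avoid dividing by zero.
        return [0] * len(weights), 1.0
    scale = absmax / 127.0
    return [round(w / scale) for w in weights], scale


def dequantize(quantized, scale):
    """Recover approximate float values from the int8 codes."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.27, 0.02, 1.0]

quantized, scale = quantize_absmax_int8(weights)
restored = dequantize(quantized, scale)

# The rounding error per weight is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized, scale, max_error)
```

Each weight is stored as a small integer plus one shared scale, which is where the memory saving comes from. The price is a small rounding error per weight, at most half of one quantization step.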
sharpenb
changed discussion status to
closed
Sorry wrong thread