gemma2-mitra-it-int8
This is an int8-quantized version of gemma-2-mitra-it: https://huggingface.co/buddhist-nlp/gemma-2-mitra-it
The quantization was done with LLM Compressor: https://github.com/vllm-project/llm-compressor
The template for prompting the model is this:
Please translate into <target_language>: <input_sentence> 🔽 Translation::
Line breaks in the input text should be replaced with the '🔽' character before running generation. '#' is used as a stop token.
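The template and the line-break rule above can be sketched in a few lines of Python. This is only an illustration, not code from the model authors: the helper names `build_prompt` and `truncate_at_stop` are hypothetical, and the sketch assumes that each line break is replaced by a bare '🔽' with no extra spaces and that any text after the first '#' in the output can be discarded.

```python
LINE_BREAK_MARKER = "🔽"  # stands in for line breaks, per the note above
STOP_TOKEN = "#"          # stop token, per the note above


def build_prompt(target_language: str, input_sentence: str) -> str:
    """Fill the prompt template (hypothetical helper, not part of any library)."""
    # Normalize Windows and old Mac line endings to "\n" first.
    normalized = input_sentence.replace("\r\n", "\n").replace("\r", "\n")
    # Assumption: each line break becomes a bare marker, with no added spaces.
    flattened = normalized.replace("\n", LINE_BREAK_MARKER)
    return (
        f"Please translate into {target_language}: "
        f"{flattened} {LINE_BREAK_MARKER} Translation::"
    )


def truncate_at_stop(generated: str) -> str:
    """Keep only the text before the first stop token (hypothetical helper)."""
    return generated.split(STOP_TOKEN, 1)[0].strip()
```

The string returned by `build_prompt` is what would be passed to the model. `truncate_at_stop` is only needed if the inference setup does not already stop at '#'.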
Model Details
For details on how to run this model, please see the gemma-2-9b repository: https://huggingface.co/google/gemma-2-9b