aashish1904 committed
Commit 76a1f51
1 Parent(s): fe46f42

Upload README.md with huggingface_hub

  1. README.md +116 -0
README.md ADDED
@@ -0,0 +1,116 @@
---
language:
- ru
datasets:
- IlyaGusev/saiga_scored
- IlyaGusev/saiga_preferences
license: gemma
---

![](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)

# QuantFactory/saiga_gemma2_9b-GGUF
This is a quantized version of [IlyaGusev/saiga_gemma2_9b](https://huggingface.co/IlyaGusev/saiga_gemma2_9b) created using llama.cpp.

# Original Model Card

# Saiga/Gemma2 9B, Russian Gemma-2-based chatbot

Based on [Gemma-2 9B Instruct](https://huggingface.co/google/gemma-2-9b-it).

## Prompt format

Gemma-2 prompt format. The example conversation is in Russian, the model's target language; the system turn reads, in English, "You are Saiga, a Russian-language automatic assistant. You talk to people and help them."
```
<start_of_turn>system
Ты — Сайга, русскоязычный автоматический ассистент. Ты разговариваешь с людьми и помогаешь им.<end_of_turn>
<start_of_turn>user
Как дела?<end_of_turn>
<start_of_turn>model
Отлично, а у тебя?<end_of_turn>
<start_of_turn>user
Шикарно. Как пройти в библиотеку?<end_of_turn>
<start_of_turn>model
```
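
The layout above can be reproduced by hand when a tokenizer is not available. The following is a minimal sketch, not the model's official template: the helper name `build_gemma2_prompt` is hypothetical, mapping the `assistant` role to `model` is an assumption based on the turn names shown above, and the tokenizer's own chat template may additionally emit a beginning-of-sequence token or differ in whitespace. When the tokenizer is available, `tokenizer.apply_chat_template` remains the safer choice.

```python
def build_gemma2_prompt(messages, add_generation_prompt=True):
    """Render {"role", "content"} dicts in the turn layout shown above.

    Hypothetical helper for illustration; the tokenizer's chat template
    is the authoritative source for the exact format.
    """
    parts = []
    for message in messages:
        # Assumption: the assistant's turns are labelled "model", as in the example.
        role = "model" if message["role"] == "assistant" else message["role"]
        parts.append(f"<start_of_turn>{role}\n{message['content']}<end_of_turn>\n")
    if add_generation_prompt:
        # Leave an open "model" turn for the model to complete.
        parts.append("<start_of_turn>model\n")
    return "".join(parts)


conversation = [
    {"role": "user", "content": "Как дела?"},
    {"role": "assistant", "content": "Отлично, а у тебя?"},
    {"role": "user", "content": "Шикарно. Как пройти в библиотеку?"},
]
print(build_gemma2_prompt(conversation))
```

Ending the string with an open `model` turn mirrors the last line of the example above, which is where generation starts.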

## Code example
```python
# Illustrative example only.
# DO NOT run inference this way in production.
# See https://github.com/vllm-project/vllm or https://github.com/huggingface/text-generation-inference

import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    GenerationConfig,
)

MODEL_NAME = "IlyaGusev/saiga_gemma2_9b"

# 8-bit loading requires the bitsandbytes package.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
generation_config = GenerationConfig.from_pretrained(MODEL_NAME)
print(generation_config)

# Russian test queries: "Why is grass green?" and
# "Compose a long story, making sure to mention the following objects. Given: Tanya, ball"
inputs = ["Почему трава зеленая?", "Сочини длинный рассказ, обязательно упоминая следующие объекты. Дано: Таня, мяч"]
for query in inputs:
    # Wrap the query in the chat template and open a model turn.
    prompt = tokenizer.apply_chat_template(
        [{"role": "user", "content": query}],
        tokenize=False,
        add_generation_prompt=True,
    )
    data = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
    data = {k: v.to(model.device) for k, v in data.items()}
    output_ids = model.generate(**data, generation_config=generation_config)[0]
    # Drop the prompt tokens so that only the generated answer is decoded.
    output_ids = output_ids[len(data["input_ids"][0]):]
    output = tokenizer.decode(output_ids, skip_special_tokens=True).strip()
    print(query)
    print(output)
    print()
    print("==============================")
    print()
```

## Versions
v2:
- [258869abdf95aca1658b069bcff69ea6d2299e7f](https://huggingface.co/IlyaGusev/saiga_gemma2_9b/commit/258869abdf95aca1658b069bcff69ea6d2299e7f)
- Other name: saiga_gemma2_9b_abliterated_sft_m3_d9_abliterated_kto_m1_d13
- SFT dataset config: [sft_d9.json](https://github.com/IlyaGusev/saiga/blob/main/configs/datasets/sft_d9.json)
- SFT model config: [saiga_gemma2_9b_sft_m3.json](https://github.com/IlyaGusev/saiga/blob/main/configs/models/saiga_gemma2_9b_sft_m3.json)
- KTO dataset config: [pref_d13.json](https://github.com/IlyaGusev/saiga/blob/main/configs/datasets/pref_d13.json)
- KTO model config: [saiga_gemma2_9b_kto_m1.json](https://github.com/IlyaGusev/saiga/blob/main/configs/models/saiga_gemma2_9b_kto_m1.json)
- SFT wandb: [link](https://wandb.ai/ilyagusev/rulm_self_instruct/runs/pjsuik1l)
- KTO wandb: [link](https://wandb.ai/ilyagusev/rulm_self_instruct/runs/dsxwvyyx)

v1:
- [fa63cfe898ee6372419b8e38d35f4c41756d2c22](https://huggingface.co/IlyaGusev/saiga_gemma2_9b/commit/fa63cfe898ee6372419b8e38d35f4c41756d2c22)
- Other name: saiga_gemma2_9b_abliterated_sft_m2_d9_abliterated_kto_m1_d11
- SFT dataset config: [sft_d9.json](https://github.com/IlyaGusev/saiga/blob/main/configs/datasets/sft_d9.json)
- SFT model config: [saiga_gemma2_9b_sft_m2.json](https://github.com/IlyaGusev/saiga/blob/main/configs/models/saiga_gemma2_9b_sft_m2.json)
- KTO dataset config: [pref_d11.json](https://github.com/IlyaGusev/saiga/blob/main/configs/datasets/pref_d11.json)
- KTO model config: [saiga_gemma2_9b_kto_m1.json](https://github.com/IlyaGusev/saiga/blob/main/configs/models/saiga_gemma2_9b_kto_m1.json)
- SFT wandb: [link](https://wandb.ai/ilyagusev/rulm_self_instruct/runs/af49qmbb)
- KTO wandb: [link](https://wandb.ai/ilyagusev/rulm_self_instruct/runs/5bt7729x)
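
The commit hashes listed above can be used to pin a specific version when loading the original (non-GGUF) model with `transformers`. This is a minimal sketch: the names `REVISIONS`, `pinned_revision`, and `load_pinned` are hypothetical, and it assumes the hashes refer to commits in the `IlyaGusev/saiga_gemma2_9b` repository, as the links suggest. It relies on the `revision` argument of `from_pretrained`.

```python
MODEL_NAME = "IlyaGusev/saiga_gemma2_9b"

# Commit hashes copied from the version list above.
REVISIONS = {
    "v1": "fa63cfe898ee6372419b8e38d35f4c41756d2c22",
    "v2": "258869abdf95aca1658b069bcff69ea6d2299e7f",
}


def pinned_revision(version):
    """Return the commit hash for a listed version, e.g. "v1" or "v2"."""
    if version not in REVISIONS:
        raise KeyError(f"Unknown version {version!r}; expected one of {sorted(REVISIONS)}")
    return REVISIONS[version]


def load_pinned(version):
    """Load the model and tokenizer at the commit recorded for `version`."""
    # Imported lazily so the lookup above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    revision = pinned_revision(version)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, revision=revision)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_NAME, revision=revision, device_map="auto"
    )
    return model, tokenizer
```

Calling `load_pinned("v1")` downloads the full-precision weights, so it is only practical on a machine with enough memory for a 9B model.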

## Evaluation

* Dataset: https://github.com/IlyaGusev/rulm/blob/master/self_instruct/data/tasks.jsonl
* Framework: https://github.com/tatsu-lab/alpaca_eval
* Evaluator: alpaca_eval_cot_gpt4_turbo_fn

Pivot: gemma_2_9b_it_abliterated

| model | length_controlled_winrate | win_rate | standard_error | avg_length |
|-----|-----|-----|-----|-----|
| gemma_2_9b_it_abliterated | 50.00 | 50.00 | 0.00 | 1126 |
| saiga_gemma2_9b, v1 | 48.66 | 45.54 | 2.45 | 1066 |
| saiga_gemma2_9b, v2 | 47.77 | 45.30 | 2.45 | 1074 |
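
To read the `standard_error` column, one rough option is an approximate 95% interval around each `win_rate`. The sketch below assumes that the standard error is expressed in the same percentage points as `win_rate`, that it applies to the raw win rate rather than the length-controlled one, and that a normal approximation is acceptable; none of this is stated in the table itself, and the function name is hypothetical.

```python
Z_95 = 1.96  # two-sided 95% quantile of the standard normal distribution

# (model, win_rate, standard_error) copied from the table above.
ROWS = [
    ("saiga_gemma2_9b, v1", 45.54, 2.45),
    ("saiga_gemma2_9b, v2", 45.30, 2.45),
]


def confidence_interval(win_rate, standard_error, z=Z_95):
    """Return (low, high) bounds of a normal-approximation interval."""
    margin = z * standard_error
    return win_rate - margin, win_rate + margin


for name, win_rate, standard_error in ROWS:
    low, high = confidence_interval(win_rate, standard_error)
    print(f"{name}: {low:.2f} to {high:.2f}")
```

Under these assumptions both intervals are roughly 9.6 points wide and extend slightly above the pivot's 50.00, so the raw win rates alone may not clearly separate either version from the pivot.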