OferB committed on
Commit
e3c738c
1 Parent(s): fcaf82e

Update README.md

  1. README.md +181 -0
README.md CHANGED
@@ -1,3 +1,184 @@
  ---
  license: llama2
+ datasets:
+ - cerebras/SlimPajama-627B
+ language:
+ - en
+ tags:
+ - Deci AI
+ - DeciLM
+ - Instruction
+ model-index:
+ - name: DeciLM 6B
+   results:
+   - task:
+       type: text-generation
+     dataset:
+       type: ai2/arc
+       name: ai2_arc
+     metrics:
+     - name: ARC Challenge
+       type: ARC Challenge
+       value: 43.43
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: ai2/arc
+       name: ai2_arc
+     metrics:
+     - name: ARC Easy
+       type: ARC Easy
+       value: 70.58
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: boolq
+       name: boolq
+     metrics:
+     - name: BoolQ
+       type: BoolQ
+       value: 77.34
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: hellaswag
+       name: hellaswag
+     metrics:
+     - name: HellaSwag
+       type: HellaSwag
+       value: 74.57
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: LAMBADA
+       name: OpenAI LAMBADA
+     metrics:
+     - name: LAMBADA
+       type: LAMBADA
+       value: 70.1
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: OpenBookQA
+       name: openbookqa
+     metrics:
+     - name: OpenBookQA
+       type: OpenBookQA
+       value: 33
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: PIQA
+       name: piqa
+     metrics:
+     - name: PIQA
+       type: PIQA
+       value: 77.52
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: truthful_qa
+       name: truthful_qa
+     metrics:
+     - name: TruthfulQA
+       type: TruthfulQA
+       value: 43.89
+       verified: false
+   - task:
+       type: text-generation
+     dataset:
+       type: winogrande
+       name: winogrande
+     metrics:
+     - name: Winogrande
+       type: Winogrande
+       value: 67.64
+       verified: false
  ---
+ # DeciLM 6B-Instruct
+
+ DeciLM 6B-Instruct is a model for short-form instruction following. It is built by LoRA fine-tuning [DeciLM 6B](https://huggingface.co/Deci/DeciLM-6b) on a subset of the OpenOrca dataset.
+
+
+ - **Developed by:** Deci
+ - **Model type:** DeciLM is an auto-regressive language model using an optimized transformer decoder architecture that includes variable Grouped-Query Attention.
+ - **Language(s) (NLP):** English
+ - **License:** [Llama 2 Community License Agreement](https://huggingface.co/Deci/DeciLM-6b-instruct/blob/main/LICENSE.md)
+
+ ### Model Sources
+
+ - **Paper:** [DeciLM 6B Technical Blog](https://deci.ai/blog/decilm-15-times-faster-than-llama2-nas-generated-llm-with-variable-gqa/)
+ - **Demo:** [DeciLM 6B-Instruct Demo](https://huggingface.co/spaces/Deci/DeciLM-6b-instruct)
+ - **Notebook:** [DeciLM 6B Notebook](https://colab.research.google.com/drive/1LugJCifOv0L426ukRHjOblBRWwUImAit)
+
+ ## Uses
+
+ The model is intended for commercial and research use in English and can be fine-tuned for use in other languages.
+
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ ```python
+ # pip install -q transformers
+
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ checkpoint = "Deci/DeciLM-6b-instruct"
+ device = "cuda"  # use "cpu" to run without a GPU
+
+ tokenizer = AutoTokenizer.from_pretrained(checkpoint)
+ # trust_remote_code=True allows loading the model code shipped with the checkpoint
+ model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.bfloat16, trust_remote_code=True).to(device)
+
+ # Tokenize the prompt, then sample up to 100 new tokens with nucleus (top-p) sampling
+ inputs = tokenizer.encode("How do I make french toast? Think through it step by step", return_tensors="pt").to(device)
+ outputs = model.generate(inputs, max_new_tokens=100, do_sample=True, top_p=0.95)
+ print(tokenizer.decode(outputs[0]))
+ ```
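
For a decoder-only model, `generate` returns the prompt tokens followed by the newly generated ones, so decoding `outputs[0]` prints the prompt again. The following is a minimal sketch of how the two parts could be separated. The helper name `split_generation` and the toy token ids are hypothetical and are not part of the model card.

```python
def split_generation(output_ids, prompt_len):
    """Split a generated sequence into (prompt_ids, new_ids).

    Works on anything that supports len() and slicing, e.g. a list
    of token ids or a 1-D tensor.
    """
    if not 0 <= prompt_len <= len(output_ids):
        raise ValueError("prompt_len must be between 0 and len(output_ids)")
    return output_ids[:prompt_len], output_ids[prompt_len:]


# Toy token ids standing in for real tokenizer output (illustrative only).
prompt_ids = [1, 529, 3382]
generated = prompt_ids + [9047, 263, 2]

_, new_ids = split_generation(generated, len(prompt_ids))
print(new_ids)
```

With the tensors from the snippet above, the same idea would be `tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)`.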
+
+ ## Training Details
+
+ DeciLM 6B was trained on the SlimPajama dataset using proprietary methods that enable fast training. It was then fine-tuned on a subset of the OpenOrca dataset to produce DeciLM 6B-Instruct.
+
+ ## Evaluation
+
+ Below are the evaluation results for DeciLM 6B-Instruct.
+
+ | Average | ARC Challenge* | ARC Easy* | BoolQ | HellaSwag* | LAMBADA OpenAI | OpenBookQA | PIQA | TruthfulQA | Winogrande |
+ |:----------|:----------|:----------|:----------|:----------|:----------|:----------|:----------|:----------|:----------|
+ | 62.01 | 43.43 | 70.58 | 77.34 | 74.57 | 70.1 | 33 | 77.52 | 43.89 | 67.64 |
+
+ *Accuracy-norm score
+
+
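The Average column appears to be the unweighted mean of the nine benchmark scores. The card does not state how it is computed, so treat that as an assumption. The sketch below checks it using the per-benchmark values listed in the model-index metadata.

```python
# Scores as listed in the model-index metadata of this card.
scores = {
    "ARC Challenge": 43.43,
    "ARC Easy": 70.58,
    "BoolQ": 77.34,
    "HellaSwag": 74.57,
    "LAMBADA (OpenAI)": 70.1,
    "OpenBookQA": 33,
    "PIQA": 77.52,
    "TruthfulQA": 43.89,
    "Winogrande": 67.64,
}

# Assumption: the reported average is a plain arithmetic mean.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 62.01
```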
+ ## Runtime Benchmarks
+
+ | Inference Tool/Hardware | A10 (tokens/sec) |
+ |:----------|:----------|
+ | HF | 652.49 |
+ | Infery LLM | 2,029.6 |
+
+ - Throughput (tokens/sec) was measured at each tool's optimal batch size: 64 for HF and 128 for Infery LLM.
+
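As a rough reading of the table, the relative throughput of the two tools can be computed directly from the reported numbers. This is only a sketch: the variable names are illustrative, and the two figures were measured at different batch sizes, so the ratio compares each tool at its own optimum rather than like for like.

```python
# Reported throughput on an A10 GPU, in tokens per second.
hf_tokens_per_sec = 652.49       # HF, batch size 64
infery_tokens_per_sec = 2029.6   # Infery LLM, batch size 128

speedup = infery_tokens_per_sec / hf_tokens_per_sec
print(f"Infery LLM vs. HF on A10: {speedup:.2f}x")  # prints 3.11x
```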
+ ## Disclaimer
+
+ DeciLM 6B-Instruct has not been aligned for safety or trained using RLHF.
+
+ ## How to Cite
+
+ Please cite this model using this format.
+
+ ```bibtex
+ @misc{DeciFoundationModels,
+ title = {DeciLM 6B Instruct},
+ author = {DeciAI Research Team},
+ year = {2023},
+ url = {https://huggingface.co/Deci/DeciLM-6b-instruct},
+ }
+ ```