lmzheng committed
Commit f07ba7b
1 parent: 0825072

Update README.md

Files changed (1):
  1. README.md +11 -64

README.md CHANGED
@@ -1,64 +1,13 @@
 ---
-license: other
-pipeline_tag: conversational
+inference: false
 ---
-<!-- header start -->
-<div style="width: 100%;">
-<img src="https://i.imgur.com/EBdldam.jpg" alt="TheBlokeAI" style="width: 100%; min-width: 400px; display: block; margin: auto;">
-</div>
-<div style="display: flex; justify-content: space-between; width: 100%;">
-<div style="display: flex; flex-direction: column; align-items: flex-start;">
-<p><a href="https://discord.gg/Jq4vkcDakD">Chat & support: my new Discord server</a></p>
-</div>
-<div style="display: flex; flex-direction: column; align-items: flex-end;">
-<p><a href="https://www.patreon.com/TheBlokeAI">Want to contribute? TheBloke's Patreon page</a></p>
-</div>
-</div>
-<!-- header end -->
-# Vicuna 13B 1.1 HF
 
-This is an HF version of the [Vicuna 13B 1.1 model](https://huggingface.co/lmsys/vicuna-13b-delta-v1.1).
+**NOTE: New version available**
+Please check out a newer version of the weights [here](https://huggingface.co/lmsys/vicuna-13b-v1.3).
+If you still want to use this old version, please see the compatibility and difference between different versions [here](https://github.com/lm-sys/FastChat/blob/main/docs/vicuna_weights_version.md).
 
-It was created by merging the deltas provided in the above repo with the original Llama 13B model, [using the code provided on their Github page](https://github.com/lm-sys/FastChat#vicuna-weights).
+<br>
 
-## My Vicuna 1.1 model repositories
-
-I have the following Vicuna 1.1 repositories available:
-
-**13B models:**
-* [Unquantized 13B 1.1 model for GPU - HF format](https://huggingface.co/TheBloke/vicuna-13B-1.1-HF)
-* [GPTQ quantized 4bit 13B 1.1 for GPU - `safetensors` and `pt` formats](https://huggingface.co/TheBloke/vicuna-13B-1.1-GPTQ-4bit-128g)
-* [2, 3, 4, 5, 6 and 8-bit GGML models for CPU inference](https://huggingface.co/TheBloke/vicuna-13B-1.1-GGML)
-
-**7B models:**
-* [Unquantized 7B 1.1 model for GPU - HF format](https://huggingface.co/TheBloke/vicuna-7B-1.1-HF)
-* [GPTQ quantized 4bit 7B 1.1 for GPU - `safetensors` and `pt` formats](https://huggingface.co/TheBloke/vicuna-7B-1.1-GPTQ-4bit-128g)
-* [2, 3, 4, 5, 6 and 8-bit GGML models for CPU inference](https://huggingface.co/TheBloke/vicuna-7B-1.1-GGML)
-
-<!-- footer start -->
-## Discord
-
-For further support, and discussions on these models and AI in general, join us at:
-
-[TheBloke AI's Discord server](https://discord.gg/Jq4vkcDakD)
-
-## Thanks, and how to contribute.
-
-Thanks to the [chirper.ai](https://chirper.ai) team!
-
-I've had a lot of people ask if they can contribute. I enjoy providing models and helping people, and would love to be able to spend even more time doing it, as well as expanding into new projects like fine tuning/training.
-
-If you're able and willing to contribute it will be most gratefully received and will help me to keep providing more models, and to start work on new AI projects.
-
-Donaters will get priority support on any and all AI/LLM/model questions and requests, access to a private Discord room, plus other benefits.
-
-* Patreon: https://patreon.com/TheBlokeAI
-* Ko-Fi: https://ko-fi.com/TheBlokeAI
-
-**Patreon special mentions**: Aemon Algiz, Dmitriy Samsonov, Nathan LeClaire, Trenton Dambrowitz, Mano Prime, David Flickinger, vamX, Nikolai Manek, senxiiz, Khalefa Al-Ahmad, Illia Dulskyi, Jonathan Leane, Talal Aujan, V. Lukas, Joseph William Delisle, Pyrater, Oscar Rangel, Lone Striker, Luke Pendergrass, Eugene Pentland, Sebastain Graf, Johann-Peter Hartman.
-
-Thank you to all my generous patrons and donaters!
-<!-- footer end -->
 # Vicuna Model Card
 
 ## Model details
@@ -74,14 +23,12 @@ Vicuna was trained between March 2023 and April 2023.
 The Vicuna team with members from UC Berkeley, CMU, Stanford, and UC San Diego.
 
 **Paper or resources for more information:**
-https://vicuna.lmsys.org/
-
-**License:**
-Apache License 2.0
+https://lmsys.org/blog/2023-03-30-vicuna/
 
 **Where to send questions or comments about the model:**
 https://github.com/lm-sys/FastChat/issues
 
+
 ## Intended use
 **Primary intended uses:**
 The primary use of Vicuna is research on large language models and chatbots.
@@ -93,8 +40,8 @@ The primary intended users of the model are researchers and hobbyists in natural
 70K conversations collected from ShareGPT.com.
 
 ## Evaluation dataset
-A preliminary evaluation of the model quality is conducted by creating a set of 80 diverse questions and utilizing GPT-4 to judge the model outputs. See https://vicuna.lmsys.org/ for more details.
+A preliminary evaluation of the model quality is conducted by creating a set of 80 diverse questions and utilizing GPT-4 to judge the model outputs.
+See https://lmsys.org/blog/2023-03-30-vicuna/ for more details.
 
-## Major updates of weights v1.1
-- Refactor the tokenization and separator. In Vicuna v1.1, the separator has been changed from `"###"` to the EOS token `"</s>"`. This change makes it easier to determine the generation stop criteria and enables better compatibility with other libraries.
-- Fix the supervised fine-tuning loss computation for better model quality.
+## Acknowledgement
+Special thanks to [@TheBloke](https://huggingface.co/TheBloke) for hosting this merged version of weights earlier.
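
The "merged version of weights" acknowledged above was produced by applying LMSYS's additive v1.1 deltas to the base Llama 13B weights. A minimal sketch of that kind of additive merge, with toy lists of floats standing in for real tensors (illustrative only; the parameter names and values are hypothetical, and this is not FastChat's actual `apply_delta` code):

```python
# Sketch of an additive delta-weight merge: target = base + delta, per tensor.
# Toy values and plain Python lists stand in for real model checkpoints.

def apply_delta(base: dict, delta: dict) -> dict:
    """Recover target weights by elementwise-adding delta tensors to base tensors."""
    if base.keys() != delta.keys():
        raise ValueError("base and delta checkpoints must share parameter names")
    return {name: [b + d for b, d in zip(base[name], delta[name])]
            for name in base}

# Hypothetical two-parameter "checkpoints" standing in for Llama 13B and the delta.
base = {"w": [1.0, 2.0], "b": [0.5]}
delta = {"w": [0.1, -0.2], "b": [-0.5]}

merged = apply_delta(base, delta)
print(merged["w"])  # [1.1, 1.8]
```

Releasing deltas rather than full weights let the team distribute Vicuna without redistributing the licensed Llama weights; the recipient reconstructs the merged checkpoint locally.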
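
The removed "Major updates of weights v1.1" section notes that the turn separator changed from `"###"` to the EOS token `"</s>"` to make the generation stop criterion easier to determine. A toy illustration of the difference (hypothetical generated strings; not FastChat's real stopping logic):

```python
# Toy illustration of why a dedicated EOS token is a cleaner stop criterion
# than the v0-style "###" separator. Strings below are hypothetical outputs.

def truncate_at(text: str, stop: str) -> str:
    """Cut a raw generation at the first occurrence of the stop string."""
    idx = text.find(stop)
    return text if idx == -1 else text[:idx]

# "###" also occurs in ordinary Markdown, so a naive v0 stop criterion
# can clip a legitimate answer mid-way.
v0_output = "Sure! ### A heading inside the answer ### USER:"
print(truncate_at(v0_output, "###"))    # 'Sure! ' -- clipped too early

# In v1.1 the model emits "</s>" only at the end of its turn, so the same
# naive truncation recovers the full answer.
v11_output = "Sure! ### A heading inside the answer</s>"
print(truncate_at(v11_output, "</s>"))  # 'Sure! ### A heading inside the answer'
```

The dedicated EOS token also means generic libraries can stop on the tokenizer's `eos_token_id` instead of scanning decoded text for a multi-character separator.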