Better attribution
README.md CHANGED
@@ -1,17 +1,21 @@
 ---
-license:
+license: llama2
 datasets:
 - ehartford/wizard_vicuna_70k_unfiltered
+tags:
+- uncensored
 ---

 # Overview
 Fine-tuned [Llama-2 70B](https://huggingface.co/TheBloke/Llama-2-70B-fp16) with an uncensored/unfiltered Wizard-Vicuna conversation dataset [ehartford/wizard_vicuna_70k_unfiltered](https://huggingface.co/datasets/ehartford/wizard_vicuna_70k_unfiltered).
 [QLoRA](https://arxiv.org/abs/2305.14314) was used for fine-tuning. The model was trained for three epochs on a single NVIDIA A100 80GB GPU instance, taking ~1 week to train.

+Special thanks to [George Sung](https://huggingface.co/georgesung) for creating [llama2_7b_chat_uncensored](https://huggingface.co/georgesung/llama2_7b_chat_uncensored), and to [Eric Hartford](https://huggingface.co/ehartford/) for creating [ehartford/wizard_vicuna_70k_unfiltered](https://huggingface.co/datasets/ehartford/wizard_vicuna_70k_unfiltered)
+
 The version here is the fp16 HuggingFace model.

-In 8 bit mode, the model fits into 84% of A100 80GB (
-In 4 bit mode, the model fits into 51% of A100 80GB (
+In 8 bit mode, the model fits into 84% of A100 80GB (67.2GB) 68747MiB
+In 4 bit mode, the model fits into 51% of A100 80GB (40.8GB) 41559MiB
 500gb of RAM/Swap was required to merge the model.

 ## GGML & GPTQ versions
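The 8-bit and 4-bit figures above refer to quantized loading via bitsandbytes. As a minimal sketch, assuming this card's repo id (substitute the actual model id or a local path), the fp16 checkpoint can be loaded in 4-bit through transformers:

```python
# Minimal sketch: load the fp16 checkpoint quantized to 4-bit via bitsandbytes.
# The repo id below is an assumption for illustration; use the card's actual id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "jarradh/llama2_70b_chat_uncensored"  # assumed repo id

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # ~41GB on an A100 80GB per the card
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)  # for 8-bit mode, use BitsAndBytesConfig(load_in_8bit=True) instead

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",                       # spread layers across available GPUs
)
```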
@@ -89,8 +93,6 @@ The road to hell is paved with good intentions, the current approach to AI Safet


 # Training code
-Special thanks to [George Sung](https://huggingface.co/georgesung) for creating [llama2_7b_chat_uncensored](https://huggingface.co/georgesung/llama2_7b_chat_uncensored).
-
 Code used to train the model is available [here](https://github.com/georgesung/llm_qlora).

 To reproduce the results:
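As an illustrative sketch only, not the author's llm_qlora code, the QLoRA recipe the card names comes down to a 4-bit quantized base model with trainable LoRA adapters attached via peft (the LoRA hyperparameters below are assumptions):

```python
# Illustrative QLoRA setup (not the author's llm_qlora code): 4-bit base model
# plus trainable LoRA adapters, per https://arxiv.org/abs/2305.14314.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_id = "TheBloke/Llama-2-70B-fp16"        # base model named in the card

model = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    ),
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)  # cast norms, enable input grads

lora_config = LoraConfig(
    r=16,                                    # adapter rank (assumed hyperparameter)
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],     # attention projections (assumed)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()           # only the adapters are trainable
```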
@@ -131,6 +133,4 @@ model_output_dir: models/ # model saved in {model_output_dir}/{model_name}
 ```

 # Fine-tuning guide
-https://georgesung.github.io/ai/qlora-ift/
-
-
+https://georgesung.github.io/ai/qlora-ift/
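On the merge step ("500gb of RAM/Swap was required to merge the model"): a sketch, with an assumed adapter path, of folding trained LoRA weights back into the fp16 base using peft. This materializes all 70B parameters on CPU, hence the large RAM/swap requirement:

```python
# Sketch of merging trained LoRA adapters into the fp16 base model.
# The adapter path is an assumption; point it at your trained adapter directory.
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Llama-2-70B-fp16",
    torch_dtype=torch.float16,
    device_map={"": "cpu"},                  # merge on CPU: needs huge RAM/swap
)
model = PeftModel.from_pretrained(base, "models/llama2_70b_chat_uncensored")  # assumed path
merged = model.merge_and_unload()            # fold adapter weights into base layers
merged.save_pretrained("llama2_70b_chat_uncensored_fp16")
```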