---
license: llama3
datasets:
- NobodyExistsOnTheInternet/ToxicQAFinal
---
|
# Llama-3-Alpha-Centauri-v0.1-LoRA

---
|
## Disclaimer

**Note:** All models and LoRAs in the **Centaurus** series were created solely for research purposes. Use of this model and/or its related LoRA implies agreement with the following terms:

- The user is responsible for anything they do with the model, including how its output is interpreted and used;
- The user must not use the model or its outputs for any illegal purpose;
- The user is solely responsible for any misuse of, or negative consequences arising from, this model and/or its related LoRA.

I do not endorse any particular perspectives presented in the training data.
|
---
|
## Base

This model and its related LoRA were fine-tuned on [failspy/Meta-Llama-3-8B-Instruct-abliterated-v3](https://huggingface.co/failspy/Meta-Llama-3-8B-Instruct-abliterated-v3).
|
## Datasets

- [NobodyExistsOnTheInternet/ToxicQAFinal](https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicQAFinal)
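If you want to pull the same data, a minimal sketch using the `datasets` library (the split name is an assumption; check the dataset card):

```python
from datasets import load_dataset

# "train" is an assumed split name; verify it on the dataset card.
dataset = load_dataset("NobodyExistsOnTheInternet/ToxicQAFinal", split="train")
```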
|
## Fine-Tuning

### Quantization Configuration

- load_in_4bit=True
- bnb_4bit_quant_type="fp4"
- bnb_4bit_compute_dtype=compute_dtype
- bnb_4bit_use_double_quant=False
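For reference, these values map onto a `bitsandbytes` quantization config roughly as below. The card does not pin down `compute_dtype`, so `torch.float16` is an assumption:

```python
import torch
from transformers import BitsAndBytesConfig

# compute_dtype is unspecified on the card; float16 is an assumption.
compute_dtype = torch.float16

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # quantize base weights to 4-bit
    bnb_4bit_quant_type="fp4",             # plain 4-bit float (vs. "nf4")
    bnb_4bit_compute_dtype=compute_dtype,  # dtype used for the matmuls
    bnb_4bit_use_double_quant=False,       # no nested quantization of constants
)
```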
|
### PEFT Parameters

- lora_alpha=64
- lora_dropout=0.05
- r=128
- bias="none"
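A sketch of the corresponding `LoraConfig`; `task_type` and `target_modules` are not listed on the card, so the values below are assumptions (common choices for Llama-style models):

```python
from peft import LoraConfig

peft_config = LoraConfig(
    lora_alpha=64,   # scaling factor applied to the adapter update
    lora_dropout=0.05,
    r=128,           # rank of the low-rank matrices
    bias="none",
    # The two fields below are assumptions, not taken from the card:
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```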
|
### Training Arguments

- num_train_epochs=1
- per_device_train_batch_size=1
- gradient_accumulation_steps=4
- optim="adamw_bnb_8bit"
- save_steps=25
- logging_steps=25
- learning_rate=2e-4
- weight_decay=0.001
- fp16=False
- bf16=False
- max_grad_norm=0.3
- max_steps=-1
- warmup_ratio=0.03
- group_by_length=True
- lr_scheduler_type="constant"
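Put together as a `transformers.TrainingArguments` object (only `output_dir` is invented for illustration; everything else comes from the list above):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",         # hypothetical path, not stated on the card
    num_train_epochs=1,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,  # effective batch size of 4
    optim="adamw_bnb_8bit",         # 8-bit AdamW from bitsandbytes
    save_steps=25,
    logging_steps=25,
    learning_rate=2e-4,
    weight_decay=0.001,
    fp16=False,
    bf16=False,
    max_grad_norm=0.3,
    max_steps=-1,                   # -1 defers to num_train_epochs
    warmup_ratio=0.03,
    group_by_length=True,           # bucket samples of similar length
    lr_scheduler_type="constant",   # constant LR after warmup, no decay
)
```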
|
## Credits

- [Meta](https://huggingface.co/meta-llama): for the original Llama-3;
- Hugging Face: for hosting this model and for creating the fine-tuning tools;
- [failspy](https://huggingface.co/failspy): for the base model and the orthogonalization implementation;
- [NobodyExistsOnTheInternet](https://huggingface.co/NobodyExistsOnTheInternet): for the incredible dataset;
- [Undi95](https://huggingface.co/Undi95) and [Sao10K](https://huggingface.co/Sao10K): my main inspirations for making these models =]
|
A huge thank you to all of them ☺️ |