---
license: llama3
datasets:
- NobodyExistsOnTheInternet/ToxicQAFinal
---
|
# Llama-3-Alpha-Centauri-v0.1-LoRA

---
|
## Disclaimer

**Note:** All models and LoRAs in the **Centaurus** series were created solely for research purposes. Use of this model and/or its related LoRA implies agreement with the following terms:

- The user is responsible for anything they do with the model, including how its output is interpreted and used;
- The user must not use the model or its outputs for any illegal purpose;
- The user is solely responsible for any misuse of, or negative consequences arising from, this model and/or its related LoRA.

I do not endorse any particular perspectives presented in the training data.
|
---
|
## Base

This model and its related LoRA were fine-tuned on [failspy/Meta-Llama-3-8B-Instruct-abliterated-v3](https://huggingface.co/failspy/Meta-Llama-3-8B-Instruct-abliterated-v3).
|
## Datasets

- [NobodyExistsOnTheInternet/ToxicQAFinal](https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicQAFinal)
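If you want to pull the same data, a minimal sketch using the `datasets` library (the split name is an assumption; check the dataset card):

```python
from datasets import load_dataset

# "train" is an assumed split name; verify it on the dataset card.
dataset = load_dataset("NobodyExistsOnTheInternet/ToxicQAFinal", split="train")
```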
|
## Fine-Tuning

### Quantization Configuration

- load_in_4bit=True
- bnb_4bit_quant_type="fp4"
- bnb_4bit_compute_dtype=compute_dtype
- bnb_4bit_use_double_quant=False
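For reference, these values map onto a `bitsandbytes` quantization config roughly as below. The card does not pin down `compute_dtype`, so `torch.float16` is an assumption:

```python
import torch
from transformers import BitsAndBytesConfig

# compute_dtype is unspecified on the card; float16 is an assumption.
compute_dtype = torch.float16

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # quantize base weights to 4-bit
    bnb_4bit_quant_type="fp4",             # plain 4-bit float (vs. "nf4")
    bnb_4bit_compute_dtype=compute_dtype,  # dtype used for the matmuls
    bnb_4bit_use_double_quant=False,       # no nested quantization of constants
)
```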
|
### PEFT Parameters

- lora_alpha=64
- lora_dropout=0.05
- r=128
- bias="none"
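A sketch of the corresponding `LoraConfig`; `task_type` and `target_modules` are not listed on the card, so the values below are assumptions (common choices for Llama-style models):

```python
from peft import LoraConfig

peft_config = LoraConfig(
    lora_alpha=64,   # scaling factor applied to the adapter update
    lora_dropout=0.05,
    r=128,           # rank of the low-rank matrices
    bias="none",
    # The two fields below are assumptions, not taken from the card:
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```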
|
### Training Arguments

- num_train_epochs=1
- per_device_train_batch_size=1
- gradient_accumulation_steps=4
- optim="adamw_bnb_8bit"
- save_steps=25
- logging_steps=25
- learning_rate=2e-4
- weight_decay=0.001
- fp16=False
- bf16=False
- max_grad_norm=0.3
- max_steps=-1
- warmup_ratio=0.03
- group_by_length=True
- lr_scheduler_type="constant"
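Put together as a `transformers.TrainingArguments` object (only `output_dir` is invented for illustration; everything else comes from the list above):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",         # hypothetical path, not stated on the card
    num_train_epochs=1,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,  # effective batch size of 4
    optim="adamw_bnb_8bit",         # 8-bit AdamW from bitsandbytes
    save_steps=25,
    logging_steps=25,
    learning_rate=2e-4,
    weight_decay=0.001,
    fp16=False,
    bf16=False,
    max_grad_norm=0.3,
    max_steps=-1,                   # -1 defers to num_train_epochs
    warmup_ratio=0.03,
    group_by_length=True,           # bucket samples of similar length
    lr_scheduler_type="constant",   # constant LR after warmup, no decay
)
```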
|
## Credits

- [Meta](https://huggingface.co/meta-llama): for the original Llama-3;
- Hugging Face: for hosting this model and for creating the fine-tuning tools;
- [failspy](https://huggingface.co/failspy): for the base model and the orthogonalization implementation;
- [NobodyExistsOnTheInternet](https://huggingface.co/NobodyExistsOnTheInternet): for the incredible dataset;
- [Undi95](https://huggingface.co/Undi95) and [Sao10K](https://huggingface.co/Sao10K): my main inspirations for making these models =]
|
A huge thank you to all of them ☺️ |