Undi95/BagelMix-8x7B


	---
	base_model:
	- Doctor-Shotgun/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss
	- jondurbin/bagel-dpo-8x7b-v0.2
	- NeverSleep/Noromaid-v0.4-Mixtral-Instruct-8x7b-Zloss
	license: cc-by-nc-4.0
	tags:
	- not-for-all-audiences
	- nsfw
	- mergekit
	- merge

	---
	# BagelMix

	This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

	## Merge Details
	### Merge Method

	This model was merged with the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, using [jondurbin/bagel-dpo-8x7b-v0.2](https://huggingface.co/jondurbin/bagel-dpo-8x7b-v0.2) as the base.
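
For readers unfamiliar with the method, here is a minimal pure-Python sketch of the idea behind DARE TIES on flat lists of numbers. It is an illustration under simplifying assumptions, not mergekit's actual implementation: the function names (`dare_sparsify`, `dare_ties_merge`) are hypothetical, weights are assumed positive, and real merges operate on full tensors. DARE randomly drops part of each fine-tune's delta from the base and rescales what survives; TIES then elects a sign per parameter and keeps only the deltas that agree with it.

```python
import random


def sign(x):
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def dare_sparsify(delta, density, rng):
    """DARE step: keep each element with probability `density`,
    rescaling survivors by 1/density so the expected value is unchanged."""
    if density >= 1.0:
        return list(delta)
    return [d / density if rng.random() < density else 0.0 for d in delta]


def dare_ties_merge(base, finetuned, densities, weights, normalize=True, seed=0):
    """Merge several fine-tunes into `base` (hypothetical helper, flat lists only)."""
    rng = random.Random(seed)
    # Task vectors: what each fine-tune changed relative to the base.
    deltas = [[f - b for f, b in zip(model, base)] for model in finetuned]
    sparse = [dare_sparsify(d, dens, rng) for d, dens in zip(deltas, densities)]

    merged = []
    for i, b in enumerate(base):
        weighted = [w * s[i] for s, w in zip(sparse, weights)]
        # TIES step: elect the sign of the weighted sum, then keep only
        # the non-zero deltas whose sign matches it.
        elected = sign(sum(weighted))
        agree = [(wd, w) for wd, w in zip(weighted, weights)
                 if wd != 0 and sign(wd) == elected]
        total = sum(wd for wd, _ in agree)
        if normalize and agree:
            total /= sum(w for _, w in agree)
        merged.append(b + total)
    return merged


# With density 1.0 nothing is dropped, so the result is deterministic.
print(dare_ties_merge([0.0, 0.0, 0.0],
                      [[1.0, -1.0, 2.0], [3.0, 1.0, 0.0]],
                      densities=[1.0, 1.0], weights=[1.0, 1.0]))
# -> [2.0, 0.0, 2.0]
```

In the toy call, the second parameter cancels out because the two deltas disagree in sign with equal weight, and the third keeps only the one non-zero delta.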

	### Models Merged

	The following models were included in the merge:
	* [Doctor-Shotgun/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss](https://huggingface.co/Doctor-Shotgun/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss)
	* [NeverSleep/Noromaid-v0.4-Mixtral-Instruct-8x7b-Zloss](https://huggingface.co/NeverSleep/Noromaid-v0.4-Mixtral-Instruct-8x7b-Zloss)

	### Configuration

	The following YAML configuration was used to produce this model:

	```yaml
	models:
	  - model: jondurbin/bagel-dpo-8x7b-v0.2
	    parameters:
	      density: 1.0
	      weight: 1.0
	  - model: Doctor-Shotgun/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss
	    parameters:
	      density: 0.5
	      weight: [0.33, 0.4, 0.33]
	  - model: NeverSleep/Noromaid-v0.4-Mixtral-Instruct-8x7b-Zloss
	    parameters:
	      density: [0.33, 0.45, 0.66]
	      weight: 0.66
	merge_method: dare_ties
	base_model: jondurbin/bagel-dpo-8x7b-v0.2
	parameters:
	  normalize: true
	  int8_mask: true
	dtype: bfloat16
	tokenizer_source: union
	```
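
The list-valued `weight` and `density` entries in the configuration are what the mergekit documentation calls gradients: anchor values that are interpolated across the layers of the model rather than applied uniformly. The sketch below shows roughly how such a list could spread over the layers. It is an approximation using plain linear interpolation, not mergekit's own code; `expand_gradient` is a hypothetical helper, and the layer count of 32 is an assumption about the Mixtral 8x7B architecture.

```python
def expand_gradient(value, num_layers):
    """Expand a scalar or a list of anchor values into one value per layer,
    using linear interpolation between neighbouring anchors."""
    if isinstance(value, (int, float)):
        return [float(value)] * num_layers
    if len(value) == 1 or num_layers == 1:
        return [float(value[0])] * num_layers
    out = []
    for layer in range(num_layers):
        # Map the layer index onto the range [0, len(value) - 1].
        pos = layer / (num_layers - 1) * (len(value) - 1)
        lo = min(int(pos), len(value) - 2)
        frac = pos - lo
        out.append(value[lo] * (1 - frac) + value[lo + 1] * frac)
    return out


NUM_LAYERS = 32  # assumed layer count for Mixtral 8x7B

limarp_weight = expand_gradient([0.33, 0.4, 0.33], NUM_LAYERS)
noromaid_density = expand_gradient([0.33, 0.45, 0.66], NUM_LAYERS)

for layer in (0, NUM_LAYERS // 2, NUM_LAYERS - 1):
    print(layer, round(limarp_weight[layer], 3), round(noromaid_density[layer], 3))
```

Read this way, the LimaRP weight peaks in the middle layers, while the Noromaid density grows toward the last layers.
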
	If you want to support me, you can do so [here](https://ko-fi.com/undiai).