grimjim/Llama-3-Luminurse-v0.2-OAS-8B


	---
	base_model:
	- grimjim/llama-3-aaditya-OpenBioLLM-8B
	- NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS
	- cgato/L3-TheSpice-8b-v0.8.3
	library_name: transformers
	tags:
	- mergekit
	- merge
	pipeline_tag: text-generation
	license: llama3
	license_link: LICENSE
	---
	# Llama-3-Luminurse-v0.2-OAS-8B

	This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

	Luminurse is a merge based on Lumimaid, enhanced with a biomedical model (at higher strength than v0.1), with a dash of TheSpice thrown in to improve formatting of text generation.

	Raising the temperature both reduces repetitiveness and increases the verbosity of the model. Higher temperature also increases the odds of reasoning slippage (which can be mitigated manually by swiping to regenerate), so adjust settings to your own comfort level. Lightly tested using Instruct prompts with temperature in the range of 1 to 1.6 (pick something in between to start, perhaps 1.2 to 1.45) and minP=0.01.
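
To make the interaction between temperature and minP concrete, here is a minimal sketch of the two steps on a toy logit vector. It is not taken from any particular inference backend: the function names `min_p_probs` and `sample_token` are hypothetical, and applying temperature before the minP cutoff is an assumption (backends differ in sampler order).

```python
import math
import random


def min_p_probs(logits, temperature=1.3, min_p=0.01):
    """Return renormalized probabilities after temperature scaling and min-p filtering."""
    # Temperature scaling: values above 1 flatten the distribution.
    scaled = [x / temperature for x in logits]
    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Min-p: discard tokens whose probability is below min_p times the top probability.
    cutoff = min_p * max(probs)
    kept = [p if p >= cutoff else 0.0 for p in probs]
    kept_total = sum(kept)
    return [p / kept_total for p in kept]


def sample_token(logits, temperature=1.3, min_p=0.01, rng=random):
    """Draw one token index from the filtered distribution."""
    probs = min_p_probs(logits, temperature, min_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Toy logits (hypothetical values): the last token is far less likely than the others.
toy_logits = [2.0, 1.0, -10.0]
filtered = min_p_probs(toy_logits, temperature=1.0, min_p=0.01)
token = sample_token(toy_logits, temperature=1.0, min_p=0.01)
```

With a small minP such as 0.01, only tokens far below the top candidate are removed, which is why it pairs reasonably with the higher temperatures suggested above.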

	- [static GGUFs c/o mradermacher](https://huggingface.co/mradermacher/Llama-3-Luminurse-v0.2-OAS-8B-GGUF)
	- [weighted/imatrix GGUFs c/o mradermacher](https://huggingface.co/mradermacher/Llama-3-Luminurse-v0.2-OAS-8B-i1-GGUF)

	Built with Meta Llama 3.

	## Merge Details
	### Merge Method

	This model was merged using the [task arithmetic](https://arxiv.org/abs/2212.04089) merge method, with [NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS](https://huggingface.co/NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS) as the base.
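
As a rough illustration of the idea, task arithmetic adds weighted differences ("task vectors") between each merged model and the base back onto the base. The sketch below works on toy lists rather than real tensors and is not mergekit's implementation; the function name `task_arithmetic` and the toy values are hypothetical, the weights 0.2 and 0.04 are the ones from the configuration in this card, and no weight normalization is assumed.

```python
def task_arithmetic(base, finetuned, weights):
    """Return base + sum_i weights[i] * (finetuned[i] - base), elementwise."""
    merged = list(base)
    for params, weight in zip(finetuned, weights):
        for j, (p, b) in enumerate(zip(params, base)):
            # Add this model's weighted task vector component.
            merged[j] += weight * (p - b)
    return merged


# Toy 3-element "tensors" standing in for real model weights (hypothetical values).
lumimaid = [1.0, 2.0, 3.0]   # base model
openbio = [2.0, 2.0, 1.0]    # merged at weight 0.2
thespice = [1.0, 4.5, 3.0]   # merged at weight 0.04

merged = task_arithmetic(lumimaid, [openbio, thespice], [0.2, 0.04])
```

Because the weights scale differences from the base rather than the models themselves, the result stays close to Lumimaid, with a larger nudge toward the biomedical model than toward TheSpice.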

	### Models Merged

	The following models were included in the merge:
	* [grimjim/llama-3-aaditya-OpenBioLLM-8B](https://huggingface.co/grimjim/llama-3-aaditya-OpenBioLLM-8B)
	* [cgato/L3-TheSpice-8b-v0.8.3](https://huggingface.co/cgato/L3-TheSpice-8b-v0.8.3)

	### Configuration

	The following YAML configuration was used to produce this model:

	```yaml
	base_model: NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS
	slices:
	- sources:
	  - model: NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS
	    layer_range: [0,32]
	  - model: grimjim/llama-3-aaditya-OpenBioLLM-8B
	    layer_range: [0,32]
	    parameters:
	      weight: 0.2
	  - model: cgato/L3-TheSpice-8b-v0.8.3
	    layer_range: [0,32]
	    parameters:
	      weight: 0.04
	merge_method: task_arithmetic
	dtype: bfloat16

	```