Spaetzle
Collection
German-English models, mostly merged, some sft/dpo
β’
117 items
β’
Updated
This is a progressive (mostly dare-ties, but also slerp) merge with the intention of a suitable compromise for English and German local tasks.
There is also an unquantized version.
It achieves (running quantized) in
It should work sufficiently well with ChatML prompt template (for all merged models should have seen ChatML prompts at least in DPO stage).
Spaetzle-v69-7b is a merge of the following models using LazyMergekit:
The merge tree in total involves to following original models:
models:
- model: cstr/Spaetzle-v68-7b
# no parameters necessary for base model
- model: abideen/AlphaMonarch-dora
parameters:
density: 0.60
weight: 0.30
merge_method: dare_ties
base_model: cstr/Spaetzle-v68-7b
parameters:
int8_mask: true
dtype: bfloat16
random_seed: 0
tokenizer_source: base
!pip install -qU transformers accelerate
from transformers import AutoTokenizer
import transformers
import torch
model = "cstr/Spaetzle-v69-7b"
messages = [{"role": "user", "content": "What is a large language model?"}]
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
pipeline = transformers.pipeline(
"text-generation",
model=model,
torch_dtype=torch.float16,
device_map="auto",
)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
Base model
mlabonne/Monarch-7B