metadata
license: mit
language:
  - en
base_model:
  - 01-ai/Yi-1.5-9B-Chat
  - Qwen/Qwen2-7B-Instruct
library_name: transformers
tags:
  - mergekit
  - merge
  - conversational
  - chicka
  - chinese
  - china

ChinaLM by Chickaboo AI

Welcome to ChinaLM, a Chinese LLM merge made by Chickaboo AI. ChinaLM is designed to deliver a high-quality conversational experience in Chinese.

Table of Contents

  • Model Details
  • Benchmarks
  • Usage

Model Details

ChinaLM is a passthrough merge of Qwen2-7B-Instruct and Yi-1.5-9B-Chat, made with Mergekit using this config file:

slices:
  - sources:
    - model: 01-ai/Yi-1.5-9B-Chat
      layer_range: [0, 20]
  - sources:
    - model: Qwen/Qwen2-7B-Instruct
      layer_range: [0, 20]
merge_method: passthrough
dtype: bfloat16
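
As a rough sanity check on the config above, the sketch below counts how many decoder layers the passthrough merge stacks. It mirrors the config as a plain Python dict so that no YAML parser is needed, and it assumes each `layer_range` is half-open, i.e. `[start, end)`. The helper name `count_stacked_layers` is hypothetical and not part of Mergekit.

```python
# Mirror of the merge config above as a plain dict (illustrative only).
merge_config = {
    "slices": [
        {"sources": [{"model": "01-ai/Yi-1.5-9B-Chat", "layer_range": [0, 20]}]},
        {"sources": [{"model": "Qwen/Qwen2-7B-Instruct", "layer_range": [0, 20]}]},
    ],
    "merge_method": "passthrough",
    "dtype": "bfloat16",
}


def count_stacked_layers(config):
    """Return (total, per_model) layer counts, assuming half-open ranges."""
    per_model = {}
    for slice_ in config["slices"]:
        for source in slice_["sources"]:
            start, end = source["layer_range"]
            name = source["model"]
            per_model[name] = per_model.get(name, 0) + (end - start)
    return sum(per_model.values()), per_model


total, per_model = count_stacked_layers(merge_config)
print(total)      # 40
print(per_model)
```

Under that assumption, each source model contributes 20 layers, for 40 stacked layers in total.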

Open Chinese LLM Leaderboard

Coming soon

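| Benchmark | ChinaLM-9B | ChinaLM-13B (Unreleased) | Mistral-7B-Instruct-v0.2 | Meta-Llama-3-8B | Yi-1.5-9B-Chat | Qwen2-7B-Instruct |
|---|---|---|---|---|---|---|
| Average | -- | -- | -- | -- | -- | -- |
| ARC | -- | -- | -- | -- | -- | -- |
| Hellaswag | -- | -- | -- | -- | -- | -- |
| MMLU | -- | -- | -- | -- | -- | -- |
| TruthfulQA | -- | -- | -- | -- | -- | -- |
| Winogrande | -- | -- | -- | -- | -- | -- |
| GSM8K | -- | -- | -- | -- | -- | -- |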

Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto

# Load the weights in bfloat16 (the dtype used for the merge) instead of the float32 default.
model = AutoModelForCausalLM.from_pretrained("Chickaboo/ChinaLM-9B", torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained("Chickaboo/ChinaLM-9B")

messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
    {"role": "user", "content": "Do you have mayonnaise recipes?"}
]

# add_generation_prompt=True appends the assistant-turn marker so the model replies to the last user message.
# return_dict=True also returns the attention mask alongside the input IDs.
model_inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_dict=True, return_tensors="pt"
).to(device)
model.to(device)

# Sample up to 1000 new tokens.
generated_ids = model.generate(**model_inputs, max_new_tokens=1000, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
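
The snippet above decodes the prompt together with the reply. If only the newly generated text is wanted, one option is to drop the prompt tokens before decoding. The helper below is a minimal sketch of that slicing step, shown on toy token IDs rather than real model output. `strip_prompt` is a hypothetical name, and the sketch assumes each generated row begins with the prompt tokens, as is usual for decoder-only models.

```python
def strip_prompt(generated_ids, prompt_len):
    """Drop the first `prompt_len` tokens from each generated row."""
    return [row[prompt_len:] for row in generated_ids]


# Toy token IDs standing in for model output: a 3-token prompt plus 2 new tokens.
toy_output = [[101, 102, 103, 7, 8]]
new_tokens = strip_prompt(toy_output, prompt_len=3)
print(new_tokens)  # [[7, 8]]
```

With the real model, `prompt_len` would be the number of prompt tokens passed to `generate`, and the sliced rows could then be decoded with `tokenizer.batch_decode(..., skip_special_tokens=True)`.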