Dracarys-Llama-3.1-70B-Instruct
Built with Meta Llama 3
Introduction
We introduce Dracarys, the latest in the Smaug series: a family of finetunes targeting improved coding performance across a variety of base models.
This variant is a finetune of meta-llama/Meta-Llama-3.1-70B-Instruct.
Compared to meta-llama/Meta-Llama-3.1-70B-Instruct, Dracarys achieves better LiveCodeBench scores (see the evaluation results below).
Model Description
- Developed by: Abacus.AI
- License: https://llama.meta.com/llama3/license/
- Finetuned from model: meta-llama/Meta-Llama-3.1-70B-Instruct.
How to use
The prompt format is unchanged from Llama 3 70B Instruct (see the evaluation results below for the prompt details used for LiveCodeBench). A sketch of the rendered prompt follows.
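For reference, here is a minimal sketch of what the rendered prompt looks like under the Llama 3 chat template with `add_generation_prompt=True`. The messages are illustrative and the exact output can vary slightly between tokenizer versions; this string is an assumption for illustration, not output captured from this model:

# Illustrative rendering of the Llama 3 chat template (hypothetical messages)
rendered_prompt = (
    "<|begin_of_text|>"
    "<|start_header_id|>system<|end_header_id|>\n\n"
    "You are a helpful assistant.<|eot_id|>"
    "<|start_header_id|>user<|end_header_id|>\n\n"
    "Hello!<|eot_id|>"
    # add_generation_prompt=True appends the assistant header so the
    # model generates the assistant turn from here
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
)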
Use with transformers
See the snippet below for usage with Transformers:
import transformers
import torch

# This card's model repository on Hugging Face
model_id = "abacusai/Dracarys-Llama-3.1-70B-Instruct"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a data science coding assistant that generates Python code using Pandas and Numpy."},
    {"role": "user", "content": "Write code to select rows from the dataframe `df` having the maximum `temp` for each `city`"},
]

# Render the messages with the Llama 3 chat template
prompt = pipeline.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Stop generation at either the end-of-turn or end-of-text token
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
    pipeline.tokenizer.convert_tokens_to_ids("<|end_of_text|>"),
]

outputs = pipeline(
    prompt,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# Print only the newly generated completion, not the echoed prompt
print(outputs[0]["generated_text"][len(prompt):])
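For context, one conventional solution to the example prompt above uses groupby with idxmax. The sample data is hypothetical, and this is only one of several valid answers the model might produce:

import pandas as pd

# Hypothetical sample data for illustration
df = pd.DataFrame({
    "city": ["NYC", "NYC", "LA", "LA"],
    "temp": [30, 35, 25, 28],
})

# Select the row with the maximum temp for each city
hottest = df.loc[df.groupby("city")["temp"].idxmax()]
print(hottest)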
Evaluation Results
LiveCodeBench
| Model | Code Generation | Code Execution | Test Output Prediction |
|---|---|---|---|
| Dracarys-Llama-3.1-70B-Instruct | 33.34 | 48.329 | 49.90 |
| Meta-Llama-3.1-70B-Instruct | 32.23 | 48.768 | 41.40 |
Breakdown of LiveCodeBench Code Generation
| Model | Easy | Medium | Hard |
|---|---|---|---|
| Dracarys-Llama-3.1-70B-Instruct | 71.89 | 17.30 | 4.23 |
| Meta-Llama-3.1-70B-Instruct | 68.4 | 17.99 | 3.57 |
Breakdown of LiveCodeBench Test Output Prediction
| Model | Easy | Medium | Hard |
|---|---|---|---|
| Dracarys-Llama-3.1-70B-Instruct | 60.88 | 44.53 | 39.30 |
| Meta-Llama-3.1-70B-Instruct | 51.22 | 35.91 | 34.30 |
LiveBench (July update)
| Model | Global Average | Coding Average | Reasoning Average | Mathematics Average | Data Analysis Average | Language Average | IF Average |
|---|---|---|---|---|---|---|---|
| Dracarys-Llama-3.1-70B-Instruct | 48.67 | 35.23 | 44.0 | 45.68 | 48 | 41.77 | 77.37 |
| Meta-Llama-3.1-70B-Instruct | 48.44 | 32.67 | 40.67 | 45.58 | 50.29 | 42.36 | 79.08 |