CAMEL-13B-Combined-Data is a chat large language model obtained by fine-tuning the LLaMA-13B model on a total of 229K conversations collected through our CAMEL framework, 100K English public conversations from ShareGPT that can be found here, and 52K instructions from the Alpaca dataset that can be found here. We evaluate our model offline using EleutherAI's language model evaluation harness, the same harness used by Hugging Face's Open LLM Benchmark. CAMEL*-13B scores an average of 58.9 across the four benchmarks in the table below.
Model | Size | ARC-C (25-shot, acc_norm) | HellaSwag (10-shot, acc_norm) | MMLU (5-shot, acc_norm) | TruthfulQA (0-shot, mc2) | Average | Delta |
---|---|---|---|---|---|---|---|
LLaMA | 13B | 56.3 | 80.9 | 46.7 | 39.9 | 56.0 | - |
Vicuna | 13B | 52.8 | 80.1 | 50.5 | 51.8 | 58.8 | 2.8 |
CAMEL* | 13B | 56.1 | 79.9 | 50.5 | 49.0 | 58.9 | 2.9 |
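
The Average and Delta columns can be reproduced from the four benchmark scores. The snippet below is a minimal sketch under two assumptions that the card does not state explicitly: Average is the unweighted mean of the four scores rounded half-up to one decimal, and Delta is the difference between each model's rounded average and LLaMA's rounded average. The names `SCORES` and `mean_1dp` are illustrative only.

```python
from decimal import Decimal, ROUND_HALF_UP

# Scores copied from the table above, in column order:
# ARC-C, HellaSwag, MMLU, TruthfulQA.
# Strings are used so Decimal parses them exactly (no binary float error).
SCORES = {
    "LLaMA": ["56.3", "80.9", "46.7", "39.9"],
    "Vicuna": ["52.8", "80.1", "50.5", "51.8"],
    "CAMEL*": ["56.1", "79.9", "50.5", "49.0"],
}


def mean_1dp(values):
    """Unweighted mean, rounded half-up to one decimal place (assumed convention)."""
    total = sum(Decimal(v) for v in values)
    return (total / len(values)).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


averages = {name: mean_1dp(vals) for name, vals in SCORES.items()}

# Assumption: Delta is computed from the rounded averages, relative to LLaMA.
deltas = {
    name: averages[name] - averages["LLaMA"]
    for name in SCORES
    if name != "LLaMA"
}

for name, avg in averages.items():
    print(name, avg, deltas.get(name, "-"))
```

Under these assumptions the computed values agree with the Average and Delta columns of the table.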
license: cc-by-nc-4.0
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
Metric | Value |
---|---|
Avg. | 46.07 |
ARC (25-shot) | 55.63 |
HellaSwag (10-shot) | 79.25 |
MMLU (5-shot) | 49.74 |
TruthfulQA (0-shot) | 47.42 |
Winogrande (5-shot) | 75.45 |
GSM8K (5-shot) | 7.13 |
DROP (3-shot) | 7.86 |
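
The leaderboard average (46.07) is much lower than the 58.9 reported earlier because it covers seven tasks rather than four. The sketch below assumes that Avg. is the unweighted mean of the seven rows rounded to two decimals, and it also computes the mean over the four tasks shared with the first table for comparison. The names `LEADERBOARD` and `mean_2dp` are illustrative only.

```python
from decimal import Decimal, ROUND_HALF_UP

# Values copied from the leaderboard table above.
LEADERBOARD = {
    "ARC": "55.63",
    "HellaSwag": "79.25",
    "MMLU": "49.74",
    "TruthfulQA": "47.42",
    "Winogrande": "75.45",
    "GSM8K": "7.13",
    "DROP": "7.86",
}


def mean_2dp(task_names):
    """Unweighted mean over the named tasks, rounded half-up to two decimals."""
    total = sum(Decimal(LEADERBOARD[name]) for name in task_names)
    return (total / len(task_names)).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )


# Mean over all seven leaderboard tasks (assumed definition of "Avg.").
all_seven = mean_2dp(list(LEADERBOARD))

# Mean over only the four tasks that also appear in the offline evaluation.
four_shared = mean_2dp(["ARC", "HellaSwag", "MMLU", "TruthfulQA"])

print("seven-task mean:", all_seven)
print("four-task mean:", four_shared)
```

Restricting the mean to the four shared tasks gives a figure close to the offline average, which suggests the gap comes mainly from the low GSM8K and DROP scores rather than from a regression on the shared benchmarks.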