--- license: apache-2.0 model-index: - name: Llama3.1_CoT results: - task: type: text-generation name: Text Generation dataset: name: IFEval (0-Shot) type: HuggingFaceH4/ifeval args: num_few_shot: 0 metrics: - type: inst_level_strict_acc and prompt_level_strict_acc value: 22.46 name: strict accuracy source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=xinchen9/Llama3.1_CoT name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: BBH (3-Shot) type: BBH args: num_few_shot: 3 metrics: - type: acc_norm value: 19.9 name: normalized accuracy source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=xinchen9/Llama3.1_CoT name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: MATH Lvl 5 (4-Shot) type: hendrycks/competition_math args: num_few_shot: 4 metrics: - type: exact_match value: 1.51 name: exact match source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=xinchen9/Llama3.1_CoT name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: GPQA (0-shot) type: Idavidrein/gpqa args: num_few_shot: 0 metrics: - type: acc_norm value: 5.15 name: acc_norm source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=xinchen9/Llama3.1_CoT name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: MuSR (0-shot) type: TAUR-Lab/MuSR args: num_few_shot: 0 metrics: - type: acc_norm value: 11.77 name: acc_norm source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=xinchen9/Llama3.1_CoT name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: MMLU-PRO (5-shot) type: TIGER-Lab/MMLU-Pro config: main split: test args: num_few_shot: 5 metrics: - type: acc value: 19.32 name: accuracy source: url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=xinchen9/Llama3.1_CoT name: Open LLM Leaderboard --- ### 1. Model Details Introducing xinchen9/llama3-b8-ft, an advanced language model comprising 8 billion parameters. It has been fine-trained based on [meta-llama/Meta-Llama-3.1-8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B). The llama3-b8 model was fine-tuning on dataset [CoT_ollection](https://huggingface.co/datasets/kaist-ai/CoT-Collection). ### 2. How to Use Here give some examples of how to use our model. #### Text Completion ```python import torch from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig model_name = "xinchen9/Llama3.1_CoT" tokenizer = AutoTokenizer.from_pretrained(model_name) model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto") model.generation_config = GenerationConfig.from_pretrained(model_name) model.generation_config.pad_token_id = model.generation_config.eos_token_id ``` ### 3 Disclaimer The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model. Please cosult an attorney before using this model for commercial purposes. # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard) Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_xinchen9__Llama3.1_CoT) | Metric |Value| |-------------------|----:| |Avg. |13.35| |IFEval (0-Shot) |22.46| |BBH (3-Shot) |19.90| |MATH Lvl 5 (4-Shot)| 1.51| |GPQA (0-shot) | 5.15| |MuSR (0-shot) |11.77| |MMLU-PRO (5-shot) |19.32|