Model is broken - Do not use
Example of bad output:
{
"task_id": "HumanEval/161",
"completion": " if not any(c.isalpha() for c in s)):\n return s[::-1]\n else:\n return ''.join([c.upper() if c.islower() else c.lower()][::-1] if c.isalpha() else [c.upper() if c.islower() else c.lower()][::-1] if c.isalpha() else [c.upper() if c.islower() else c.lower()][::-1] if c.isalpha() else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else else",
"result": "failed: unmatched ')' (<string>, line 13)",
"passed": false
}
It scores about 0.065 pass@1, roughly 10x worse than advertised.
I tried main and also the suggested branch in the description; both are broken.
Loaded like this:
from transformers import AutoTokenizer, LlamaForCausalLM
from human_eval.data import write_jsonl, read_problems
from tqdm import tqdm

# Initialize the model and tokenizer from the local GPTQ checkpoint
model_path = "Phind-CodeLlama-34B-v1-GPTQ"
model = LlamaForCausalLM.from_pretrained(model_path, device_map="cuda:0")
tokenizer = AutoTokenizer.from_pretrained(model_path)
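For context, the snippet above only loads the model; a minimal sketch of the HumanEval generation loop those imports imply might look like the following. The decoding settings (greedy, max_new_tokens=384) and output filename are my assumptions, not the exact repro, which is in the repo linked at the end:

import torch

problems = read_problems()
samples = []
for task_id in tqdm(problems):
    # Feed the raw HumanEval prompt and greedily decode a completion
    prompt = problems[task_id]["prompt"]
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=384, do_sample=False)
    # Keep only the newly generated tokens as the completion
    completion = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    samples.append(dict(task_id=task_id, completion=completion))
write_jsonl("samples.jsonl", samples)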
Can you test it with Transformers built from the latest GitHub source? It adds the new rope_theta parameter, which affects longer contexts, and the output you showed is often a symptom of bad RoPE scaling parameters.
I probably need to mention that Transformers 4.33.0.dev is required for this, but could you test it for me first?
This is the Transformers commit that adds CodeLlama support: https://github.com/huggingface/transformers/commit/015f8e110d270a0ad42de4ae5b98198d69eb1964
You can test with:
pip3 install git+https://github.com/huggingface/transformers.git
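If you want to confirm your install actually picks up the parameter, a quick check (assuming the repo's config.json carries the CodeLlama value):

from transformers import AutoConfig

config = AutoConfig.from_pretrained("Phind-CodeLlama-34B-v1-GPTQ")
# CodeLlama checkpoints use rope_theta = 1_000_000; older Transformers
# silently ignored this field and fell back to the Llama 2 default of
# 10_000, which scrambles attention at longer contexts.
print(config.rope_theta)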
Let me know
Yup, that fixed it:
{'pass@1': 0.6158536585365854}
This is closer to what they claimed.
Main branch: {'pass@1': 0.5670731707317073}
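For anyone reproducing: these pass@1 numbers come from running human-eval's checker over the generated samples, roughly like this (with one sample per task, only pass@1 is reported):

from human_eval.evaluation import evaluate_functional_correctness

# Executes each completion against the HumanEval test cases (sandboxing
# is your responsibility) and reports pass@k
results = evaluate_functional_correctness("samples.jsonl")
print(results)  # e.g. {'pass@1': 0.6158536585365854}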
Repro steps here: https://github.com/catid/phind