Update configuration_llama.py: `rope_scaling` must default to None (or be a dictionary) for the model to load
I checked how the NousResearch YaRN models handle this, and they define `rope_scaling=None` in this Python file. That default is then overridden by the values specified in your config.json.
Before this PR I couldn't load the model; with it applied the model loads, and the session below confirms that the correct `rope_scaling` is applied:
```
In [1]: from transformers import AutoModelForCausalLM, AutoTokenizer
In [2]: model = AutoModelForCausalLM.from_pretrained("/workspace/process/ddh0_norocetacean-20b-10k/source" , low_cpu_mem_usage=True, trust_remote_code=True)
Loading checkpoint shards: 100%|██████████████████████████████████████████████| 5/5 [00:08<00:00,  1.77s/it]
In [4]: model.config.rope_scaling
Out[4]:
{'factor': 2.5,
'original_max_position_embeddings': 4096,
'type': 'yarn',
'finetuned': False}
In [5]:
```
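For a quick check without loading the weights, the same `rope_scaling` dict can be read straight from config.json via `AutoConfig` — a minimal sketch, reusing the local path from the session above:

```python
from transformers import AutoConfig

# Reads config.json (and the repo's custom configuration_llama.py via
# trust_remote_code); rope_scaling comes from config.json, overriding the None default.
config = AutoConfig.from_pretrained(
    "/workspace/process/ddh0_norocetacean-20b-10k/source",
    trust_remote_code=True,
)
print(config.rope_scaling)
# {'factor': 2.5, 'original_max_position_embeddings': 4096,
#  'type': 'yarn', 'finetuned': False}
```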
- configuration_llama.py (+1 −1)

```diff
@@ -124,7 +124,7 @@ class LlamaConfig(PretrainedConfig):
         pretraining_tp=1,
         tie_word_embeddings=False,
         rope_theta=10000,
-        rope_scaling=
+        rope_scaling=None,
         attention_bias=False,
         **kwargs,
     ):
```
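With `None` as the default, the config can also be constructed directly without passing a `rope_scaling` argument. A minimal sketch using the stock transformers `LlamaConfig` to illustrate the pattern (the stock class only validates 'linear'/'dynamic' scaling, so a 'linear' dict is used here; the repo's custom configuration_llama.py accepts the 'yarn' dict the same way):

```python
from transformers import LlamaConfig

# Omitting rope_scaling falls back to the None default; no scaling is applied.
cfg = LlamaConfig()
print(cfg.rope_scaling)  # None

# Passing a dict (as config.json does) overrides the None default.
cfg = LlamaConfig(rope_scaling={"type": "linear", "factor": 2.5})
print(cfg.rope_scaling)  # {'type': 'linear', 'factor': 2.5}
```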