---
license: other
license_name: deepseek
license_link: https://github.com/deepseek-ai/DeepSeek-Coder/blob/main/LICENSE-MODEL
---

DeepMagic-Coder-7b

Alternate version:

- https://huggingface.co/rombodawg/DeepMagic-Coder-7b-Alt

![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/LlbswwXZQoIQziTNEMSMk.jpeg)


This is a very successful merge of the deepseek-coder-6.7b-instruct and Magicoder-S-DS-6.7B models, bringing an uplift in overall coding performance without compromising the models' integrity (at least in limited testing).

This is the first of my models to use Mergekit's *task_arithmetic* merging method. The method is described below, and it is clearly very useful for merging AI models that were fine-tuned from a common base:

Task Arithmetic:

```
Computes "task vectors" for each model by subtracting a base model.
Merges the task vectors linearly and adds back the base.
Works great for models that were fine tuned from a common ancestor.
Also a super useful mental framework for several of the more involved
merge methods.
```
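
As a rough illustration of the description above, here is a minimal sketch of task arithmetic on toy weight vectors. It uses plain Python lists in place of real model tensors. The function name `task_arithmetic_merge` and the toy values are made up for this example and are not part of Mergekit.

```python
def task_arithmetic_merge(base, models, weights):
    """Merge fine-tuned weight vectors by adding weighted task vectors to the base.

    base    -- list of floats (toy stand-in for the base model's weights)
    models  -- list of weight vectors fine-tuned from that base
    weights -- one scalar weight per model
    """
    merged = list(base)
    for model, weight in zip(models, weights):
        for i, (m, b) in enumerate(zip(model, base)):
            # Task vector entry: what fine-tuning changed relative to the base.
            merged[i] += weight * (m - b)
    return merged


# Toy example: two "fine-tunes" that each changed a different entry.
base = [1.0, 2.0, 3.0]
tuned_a = [1.5, 2.0, 3.0]  # task vector: [0.5, 0.0, 0.0]
tuned_b = [1.0, 2.0, 2.0]  # task vector: [0.0, 0.0, -1.0]

merged = task_arithmetic_merge(base, [tuned_a, tuned_b], weights=[1.0, 1.0])
print(merged)  # [1.5, 2.0, 2.0]
```

Because each toy fine-tune changed a different entry, both changes survive in the merged vector. Real fine-tunes overlap, which is where the per-model weights start to matter.
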

The original models used in this merge can be found here:

- https://huggingface.co/ise-uiuc/Magicoder-S-DS-6.7B
- https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct


The merge was created using Mergekit, and the parameters can be found below:

```yaml
models:
  - model: deepseek-ai_deepseek-coder-6.7b-instruct
    parameters:
      weight: 1
  - model: ise-uiuc_Magicoder-S-DS-6.7B
    parameters:
      weight: 1
merge_method: task_arithmetic
base_model: ise-uiuc_Magicoder-S-DS-6.7B
parameters:
  normalize: true
  int8_mask: true
dtype: float16
```
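
To make the `weight` and `normalize` settings a bit more concrete, the sketch below shows one way to read them: each `weight` scales a model's task vector, and `normalize: true` is assumed here to rescale the weights so they sum to 1. That reading is an assumption based on Mergekit's documentation, not a copy of its code, and the helper name `normalize_weights` is hypothetical. How Mergekit treats a model that is also listed as `base_model` is not covered here.

```python
def normalize_weights(weights):
    """Rescale weights so they sum to 1 (assumed meaning of `normalize: true`)."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return [w / total for w in weights]


# Two models with `weight: 1` each, as in the config above.
print(normalize_weights([1, 1]))  # [0.5, 0.5]

# Unequal weights keep their ratio after rescaling.
print(normalize_weights([3, 1]))  # [0.75, 0.25]
```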