Train Mistral 7B 0.2

#2
opened by mosama

Why don't you guys train Mistral 7B v0.2, which has a 32k context length, on long-context as well as short-context data? For example, long-context datasets such as:

  1. wckwan/M4LE
  2. THUDM/LongBench
  3. togethercomputer/Long-Data-Collections

or maybe your own curated long-context datasets (a quick way to peek at one of these is sketched below).
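
A minimal sketch (not OpenChat's actual training pipeline) of pulling one of these datasets with the Hugging Face `datasets` library to see how long the contexts really are; the `narrativeqa` subset and the `context` field are LongBench-specific and used here purely as an example:

```python
# Minimal sketch: load one LongBench subset and measure context lengths.
# "narrativeqa" is just one LongBench subset; the other repos listed above
# use different subset names and field layouts.
from datasets import load_dataset

# LongBench ships a dataset loading script, so newer `datasets` versions
# require trust_remote_code=True.
ds = load_dataset(
    "THUDM/LongBench", "narrativeqa", split="test", trust_remote_code=True
)

char_lengths = [len(row["context"]) for row in ds]
print(f"samples: {len(char_lengths)}")
print(f"longest context: {max(char_lengths)} characters")
```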

Yeah, I agree. I was considering using this model in a Mixtral merge because of the scores, but it would be difficult given the context constraint of only 8k: it would limit every other Mistral model in the merge to 8k, despite them being able to produce 32k tokens of content.
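
For what it's worth, a quick way to see what a merge would be constrained to is to compare the context-related fields in each candidate's config (the model IDs below are just illustrative):

```python
# Compare context-related config fields of candidate models before merging.
# Model IDs are illustrative examples, not a concrete merge recipe.
from transformers import AutoConfig

candidates = [
    "openchat/openchat-3.5-0106",
    "mistralai/Mistral-7B-Instruct-v0.2",
]

for model_id in candidates:
    cfg = AutoConfig.from_pretrained(model_id)
    print(
        f"{model_id}: "
        f"max_position_embeddings={cfg.max_position_embeddings}, "
        f"sliding_window={getattr(cfg, 'sliding_window', None)}"
    )
```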

I would say that Mistral 7B v0.2 is not a pretrained model but an instruction-tuned one, so it already carries a bias from its own finetuning phase. For complete control over the model's performance, it is best to start from a pretrained model. That may be why.

nvm
And I think they should definitely go beyond 7B parameters with OpenChat!

https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2 is fine-tuned from the base model mistral-7B-v0.2, which Mistral AI has now officially made available:

mistral-7B-v0.2

I would love to see an OpenChat fine-tune based on mistral-7B-v0.2 with a 32k context length.

OpenChat team, I depth-up-scaled Mistral-7B-v0.2, following Upstage's paper SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling, in case you want to train OpenChat on a slightly bigger model (a sketch of the layer splice follows after the list below).

Joseph717171/Mistral-10.7B-v0.2

  • 32K Context Window
  • 🚫 Sliding Window Attention
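
For anyone curious, SOLAR-style depth up-scaling amounts to duplicating the 32-layer base model, dropping the last 8 layers from one copy and the first 8 from the other, and stacking the two halves into a 48-layer model. Below is a sketch of that layer plan (not necessarily the exact recipe behind Mistral-10.7B-v0.2); in practice the splice is usually done with a mergekit passthrough merge over layer ranges.

```python
# Layer plan for SOLAR-style depth up-scaling of a 32-layer Mistral model:
# copy A keeps layers 0-23, copy B keeps layers 8-31; stacked, that gives
# 48 layers (the SOLAR 10.7B scheme). Illustrative only.
N_LAYERS = 32  # decoder layers in Mistral-7B-v0.2
OVERLAP = 8    # layers trimmed from each copy, per the SOLAR paper

copy_a = list(range(0, N_LAYERS - OVERLAP))  # layers 0..23 from copy A
copy_b = list(range(OVERLAP, N_LAYERS))      # layers 8..31 from copy B
layer_plan = [("A", i) for i in copy_a] + [("B", i) for i in copy_b]

assert len(layer_plan) == 48
for new_idx, (src, old_idx) in enumerate(layer_plan):
    print(f"new layer {new_idx:2d} <- copy {src}, original layer {old_idx}")
```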

Oh, well it was worth a shot. 😁
