Model

#5
by mrfakename - opened

Hi,
Thank you for releasing this model! Would you mind sharing some details on how it was trained, and what the training data is?
Thanks!

(gif: how-do-we-tell-him-mr-krabs)

Is this a frankenmerge?

mrfakename, this model is most likely a leak of Mistral Medium.

> mrfakename, this model is most likely a leak of Mistral Medium.

Interesting! According to nisten it's a frankenmerge, do you know if that's accurate?

I had not seen this, thanks for the info

> Interesting! According to nisten it's a frankenmerge, do you know if that's accurate?

He initially claimed it was an MoE, so I'd take this with a grain of salt. It outperforms Mistral 7B by a mile in my testing, though.

> Interesting! According to nisten it's a frankenmerge, do you know if that's accurate?
>
> He initially claimed it was an MoE, so I'd take this with a grain of salt. It outperforms Mistral 7B by a mile in my testing, though.

Frankenmerges can be MoEs, right?

> Frankenmerges can be MoEs, right?

Correct

Does anybody know whether, at the same quant level, this model or the other one is better?

Nisten made all kinds of claims, some rather wild ones at the beginning, yet I tested the model and it's relatively good. If it's a merge, then a merge of what? Who else that uses Mistral's prompt format put a model out recently? I suggest people just try it if they have the memory, at least at Q4.

It chats well and it's not dumb; that's all that matters. I've downloaded tons of disappointments from the leaderboard going by benchmarks.

> I've downloaded tons of disappointments from the leaderboard going by benchmarks.

Yes! So many models are disappointing in real-world usage.

This looks like a 7x11B MoE fine-tuned on Mistral Medium synthetic data. It mimics Mistral's style very closely.

@aigeek0x0 do you see an MoE router in the GGUF? It's not an MoE.
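
For anyone who wants to verify this themselves: an MoE converted by llama.cpp carries per-layer router tensors and an expert-count metadata field that a dense model lacks. A minimal sketch, assuming the `gguf` Python package that ships with the llama.cpp repo (the filename below is illustrative):

```python
# Minimal sketch: check a GGUF file for MoE router tensors.
# Assumes `pip install gguf`; the filename is illustrative.
from gguf import GGUFReader

reader = GGUFReader("miqu-1-70b.q4_k_m.gguf")

# Mixtral-style MoE conversions carry router weights named like
# "blk.N.ffn_gate_inp.weight"; dense models have none of these.
routers = [t.name for t in reader.tensors if "ffn_gate_inp" in t.name]

# MoE conversions also set an expert-count key in the metadata.
has_expert_count = "llama.expert_count" in reader.fields

print("router tensors:", routers or "none (looks dense)")
print("expert_count in metadata:", has_expert_count)
```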

https://twitter.com/teortaxesTex/status/1752459593416847570

Interesting. Not sure if true but seems possible.

Excellent model. Reminds me of Claude. It's willing to consider alternative solutions. It takes advice and will mold its answers to newly prompted insights. Tested it with the difficult Aunt Agatha riddle and it handled it well.

Apparently someone succeeded in dequantizing it to fp16, with 70+ MMLU scores:
https://huggingface.co/152334H/miqu-1-70b-sf
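
If anyone wants to sanity-check that number, something like EleutherAI's lm-evaluation-harness should do it. A rough sketch, assuming `pip install lm-eval`; scores move around with harness version, few-shot count, and prompt format, so treat any single number as approximate:

```python
import lm_eval

# Runs the full MMLU suite against the dequantized checkpoint.
# Expect this to take a long time and a lot of VRAM for a 70B model.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=152334H/miqu-1-70b-sf,dtype=float16",
    tasks=["mmlu"],
    num_fewshot=5,  # MMLU is conventionally reported 5-shot
    batch_size=4,
)
print(results["results"]["mmlu"])
```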

> Apparently someone succeeded in dequantizing it to fp16, with 70+ MMLU scores:
> https://huggingface.co/152334H/miqu-1-70b-sf

Hmm. That makes no sense. How can you "add" precision to it? That would be like taking a blurry picture and making it clear again with all the detail.

> Apparently someone succeeded in dequantizing it to fp16, with 70+ MMLU scores:
> https://huggingface.co/152334H/miqu-1-70b-sf

The model appears to be legit, resembling Mistral Medium, as mentioned on https://twitter.com/teortaxesTex/status/1752459593416847570.

It (mistral-70b-instruct-alpha01) was likely trained on the Llama architecture, possibly for a quick presentation to investors.

This model is fine-tuned and adept at following instructions. Based on my experiments, I can confirm that it is also aligned for safety.

The 5-bit EXL2 performs OK; it gets a perplexity of 11 on PTB_NEW. I still have to check it against the Q4_K_M I have. So the re-compression wasn't the end of the world.
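
For anyone curious how numbers like that are computed: perplexity is just exp of the average per-token negative log-likelihood over a fixed test text. A minimal sketch with transformers, assuming the fp16 checkpoint and a hypothetical local copy of the test file; quantized runtimes (EXL2, llama.cpp) ship their own perplexity scripts, but the math is the same:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "152334H/miqu-1-70b-sf"  # illustrative; any causal LM works
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

text = open("ptb_test.txt").read()  # hypothetical local copy of the eval set
ids = tok(text, return_tensors="pt").input_ids.to(model.device)

# Non-overlapping 2048-token windows; model loss is the mean NLL of a
# window's predicted tokens, so weight it back by token count before summing.
nlls, n_tokens, ctx = [], 0, 2048
for begin in range(0, ids.size(1), ctx):
    chunk = ids[:, begin : begin + ctx]
    if chunk.size(1) < 2:
        break
    with torch.no_grad():
        out = model(chunk, labels=chunk)  # HF shifts labels internally
    nlls.append(out.loss * (chunk.size(1) - 1))
    n_tokens += chunk.size(1) - 1

ppl = torch.exp(torch.stack(nlls).sum() / n_tokens)
print(f"perplexity ≈ {ppl.item():.2f}")
```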

> That makes no sense. How can you "add" precision to it? That would be like taking a blurry picture and making it clear again with all the detail.

@jeffwadsworth

It doesn't add any precision, but the fp16 PyTorch file format is much more universal, and it's easier to work with if you want to do finetuning. It's the same blurry image, but now you have it in digital form and can do stuff to it in Photoshop, instead of being limited to what you can do to a physical photo with scissors, markers, and other physical tools.
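
To make the analogy concrete: dequantizing just multiplies the stored integers by the stored scales, so whatever was lost at quantization time stays lost. A toy round-trip through llama.cpp's simplest quant format, Q8_0 (blocks of 32 int8 values sharing one fp16 scale), in numpy; the helper name is my own, not from any library:

```python
import numpy as np

def dequantize_q8_0_block(scale: np.float16, qs: np.ndarray) -> np.ndarray:
    """Map one Q8_0 block (int8[32] + fp16 scale) back to fp16 weights."""
    return (qs.astype(np.float32) * np.float32(scale)).astype(np.float16)

# Round-trip demo: quantize a block of fp16 weights, dequantize it, and
# observe the residual error that no amount of "dequantizing" can remove.
w = np.random.randn(32).astype(np.float16)
scale = np.float16(np.abs(w).max() / 127.0)
qs = np.clip(
    np.round(w.astype(np.float32) / np.float32(scale)), -127, 127
).astype(np.int8)

w_restored = dequantize_q8_0_block(scale, qs)
err = np.abs(w.astype(np.float32) - w_restored.astype(np.float32)).max()
print("max round-trip error:", err)  # nonzero: the lost detail stays lost
```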

Wow. Crazy.

Well, I guess it is a leak.

mrfakename changed discussion status to closed

It's pretty obvious it was some sort of leak, considering the lack of information about its creation process!
