Did something go wrong during training?

by CamiloMM - opened Jul 30

Discussion

CamiloMM

Jul 30

You guys usually release top-tier LLM finetunes, so I wonder if something went wrong when training this one. Subjectively, it seems worse than base Mistral-Large-Instruct-2407 123B. "Objectively", it seems more censored according to UGI. For some reason it also does RP worse, which makes no sense.

Either way, if someone else has had a different experience, maybe they can comment on this. Maybe I'm alone in thinking this is somehow a slight downgrade.

Undi95

NeverSleep org Jul 30

We had never really trained a model this big, so it was a one-shot.
Loss seemed okay, and so did the unquantized version we used for our tests. I'm surprised, though, that our model is more censored than base.

jackboot

Jul 31

It's censored? I am just getting shorter replies. Sometimes it's more creative but sometimes it's more dumb.

colourspec

Jul 31

Base Mistral with a JB does seem to function better

CamiloMM

Jul 31

"I am just getting shorter replies. Sometimes it's more creative but sometimes it's more dumb."

Yeah, that as well. Usually the problem is the opposite, like replies sometimes busting past a 512-token budget, but the whole reply being *spine gets shivered*<EOT> or something is another problem entirely. I had to re-roll way too much and add instructions to the prompt asking it to write at length. I assume something went wrong during training, and the low score on UGI just reflects the intelligence drop.

mradermacher

Aug 1

You can't always win. But even if this turns out to be total crap, I would still be sad if it stays a one-shot. Ganbatte!

jackboot

Aug 1

You can also use the correct Mistral template and get a bit of a mix between the two.
