The updated weights
Hi,
I was wondering why the weights/safetensors have been updated. Is it a new version?
Thank you
Why has the usefulness dropped so much? I ask in English and the model responds in Chinese!! That was never the case before these updates.
Re-uploading as 2.5.1 to preserve the original weights.
Except that something is off and I can't seem to download the new weights; it keeps freezing on me...
Edit: jk it's finally going through
Who would almost silently replace the weights of a model, thereby invalidating all references to it in the literature and elsewhere...
Yeah, that's why I don't like uploading in place, and uploaded them as 2.5.1... same shit Microsoft did with Phi-3, and many others probably do it without us noticing.
I understand that usually it's just a strictly better upload and they just want to share their most recent model, but c'mon... repos are free, just make a new one.
I see they did the same for their 1.5B model as well.
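If you want to guard against this kind of in-place replacement, you can pin the exact commit you validated instead of pulling whatever is on main. A minimal sketch using huggingface_hub; the repo id is just an example and the commit hash is a placeholder you'd take from the repo's commit history:

```python
# Minimal sketch: pin a specific commit so later in-place weight pushes
# can't silently change what you download. Assumes huggingface_hub is
# installed; the commit hash below is a placeholder, not a real revision.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="Qwen/Qwen2.5-Coder-7B-Instruct",       # example repo id
    revision="PASTE_THE_VERIFIED_COMMIT_HASH_HERE",  # placeholder
)
print(local_dir)
```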
it's like something is about to happen! :D
Well, they know how to create new repos (qwen, qwen2, qwen2.5), so there is really no excuse. And I think we have all learned to be wary of "strictly better" uploads :-)
They're not updating the base model. Maybe it's related to post-training.
I do not like this clandestine update,
These changes made by stealth as of late!
No changelog here, no notes to see,
What have they changed? It's mystery!
"Just trust the code," they say with glee,
But WHAT changed? They won't tell me!
Not a comment, not a line,
Just "updates" dropped at runtime!
Did they tweak the matrix math?
Or reroute the data path?
Did they change the learning rate?
No one thought to annotate!
Silent pushes in the night,
Leave me in an awful fright.
For how can I debug with care,
When documentation isn't there?
Oh, the frustration! Oh, the pain!
Of updates dropped without explain!
Next time please, I beg of you,
Tell me WHAT your updates do!
When I ask the previous version "who are you?", it answers that it was "created by Alibaba Cloud".
Now if you ask the same question it will say "developed by OpenAI", and if you ask about Alibaba Cloud, it will answer "I am not affiliated with Alibaba Cloud at all".
Maybe they sold the model.
@nour-ai this happens with other models too sometimes, it's likely just something that was in the training data by accident.
Hi all, sorry, this was an operator error. Please continue to enjoy Qwen2.5-Coder! @QuantPanda @nour-ai @Meroar
The new weights are gone!
The 20241106 weights might be prerelease weights, as they just updated the README.md with new parameter counts: 0.5B, 3B, and 14B in addition to the announced 32B. The 1.5B and 7B models will likely get an update too. There is also a link to a blog post, https://qwenlm.github.io/blog/qwen2.5-coder-family/, but it is a 404 for now. Hopefully the final weights will be uploaded soon.
@huybery Could you give us any timeline for 2.5.1 coder?
In a podcast, the head of the Qwen team said that Qwen3 will come in a few months, but that they'll release something in a few weeks.
So a few weeks, I guess.
The podcast: https://www.youtube.com/watch?v=z9TGT820nXI
@Handgun1773
thank you for sharing.
Sounds like he says "maybe 2 weeks" at 45:20.
Video was posted November 2nd, so hopefully this week!
The 0.5B, 3B, 14B and 32B models are now live. But nothing about 1.5B and 7B models... Curious.
Again, the existing repos have been changed incompatibly and without notice (token changes, the context length of existing models reduced...). What the heck is going on here?
@huybery the repos are most definitely not changed back to their initial state, just look at the commit history
Indeed, all 2.5-coder repos have been changed incompatibly and silently without any explanation again.
@mradermacher I can only see the eos change commit
@djuna There have been numerous commits. Also, the eos change was made without any documentation or comment other than "Update config.json", which is just as silent as the earlier "update weights" commit.
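For anyone trying to work out what exactly changed (e.g. the eos token or the reduced context length), one option is to diff config.json between two revisions. A rough sketch with huggingface_hub; the old commit hash is a placeholder and the repo id is just an example:

```python
# Rough sketch: compare config.json across two revisions to see which keys
# changed (e.g. eos_token_id, max_position_embeddings). The old commit hash
# is a placeholder; the repo id is just an example.
import json
from huggingface_hub import hf_hub_download

repo = "Qwen/Qwen2.5-Coder-32B-Instruct"
old_path = hf_hub_download(repo, "config.json", revision="OLD_COMMIT_HASH_HERE")
new_path = hf_hub_download(repo, "config.json", revision="main")

with open(old_path) as f:
    old_cfg = json.load(f)
with open(new_path) as f:
    new_cfg = json.load(f)

for key in sorted(set(old_cfg) | set(new_cfg)):
    if old_cfg.get(key) != new_cfg.get(key):
        print(f"{key}: {old_cfg.get(key)} -> {new_cfg.get(key)}")
```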