The updated weights
Hi,
I was wondering why the weights/safetensors have been updated. Is it a new version?
Thank you
Why has the usefulness dropped so much? I ask in English and the model responds in Chinese!! That was never the case before these updates.
Re-uploading as 2.5.1 to preserve the original weights.
Except that something is off and I can't seem to download the new weights; it keeps freezing on me...
Edit: jk it's finally going through
Who would almost silently replace the weights of a model, thereby invalidating all references to it in the literature and elsewhere...
Yeah, that's why I don't like uploading in place, and uploaded them as 2.5.1... same shit Microsoft did with Phi-3, and many others probably do it without us noticing.
I understand that usually it's just a strictly better upload and they just want to share their most recent model, but c'mon... repos are free, just make a new one.
I see they did the same for their 1.5B model as well.
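If you want to guard against this kind of in-place replacement, you can pin the exact commit you validated instead of pulling whatever is on main. A minimal sketch using huggingface_hub; the repo id is just an example and the commit hash is a placeholder you'd take from the repo's commit history:

```python
# Minimal sketch: pin a specific commit so later in-place weight pushes
# can't silently change what you download. Assumes huggingface_hub is
# installed; the commit hash below is a placeholder, not a real revision.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="Qwen/Qwen2.5-Coder-7B-Instruct",       # example repo id
    revision="PASTE_THE_VERIFIED_COMMIT_HASH_HERE",  # placeholder
)
print(local_dir)
```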
it's like something is about to happen! :D
Well, they know how to create new repos (qwen, qwen2, qwen2.5), so there is really no excuse. And I think we have all learned to be wary of "strictly better" uploads :-)
They're not updating the base model. Maybe it's related to post-training.
I do not like this clandestine update,
These changes made by stealth as of late!
No changelog here, no notes to see,
What have they changed? It's mystery!
"Just trust the code," they say with glee,
But WHAT changed? They won't tell me!
Not a comment, not a line,
Just "updates" dropped at runtime!
Did they tweak the matrix math?
Or reroute the data path?
Did they change the learning rate?
No one thought to annotate!
Silent pushes in the night,
Leave me in an awful fright.
For how can I debug with care,
When documentation isn't there?
Oh, the frustration! Oh, the pain!
Of updates dropped without explain!
Next time please, I beg of you,
Tell me WHAT your updates do!
When I ask the previous version "who are you?", it answers that it was "created by Alibaba Cloud".
Now if you ask the same question it will say "developed by OpenAI", and if you ask about Alibaba Cloud, it will answer "I am not affiliated with Alibaba Cloud at all".
Maybe they sold the model.
@nour-ai this happens with other models too sometimes, it's likely just something that was in the training data by accident.
Hi all, sorry, this was an operator error. Please continue to enjoy Qwen2.5-Coder! @QuantPanda @nour-ai @Meroar
The new weights are gone!
The 20241106 weights might be prerelease weights, as they just updated the README.md with new parameter counts: 0.5B, 3B, and 14B in addition to the announced 32B. The 1.5B and 7B models will likely get an update too. There is also a link to a blog post, https://qwenlm.github.io/blog/qwen2.5-coder-family/, but it is a 404 for now. Hopefully the final weights will be uploaded soon.
@huybery Could you give us any timeline for 2.5.1 coder?
In a podcast, the head of the Qwen team said that Qwen3 will come in a few months, but that they'll release something in a few weeks.
So a few weeks, I guess.
The podcast: https://www.youtube.com/watch?v=z9TGT820nXI
@Handgun1773
thank you for sharing.
Sounds like he says "maybe 2 weeks" at 45:20.
Video was posted November 2nd, so hopefully this week!
The 0.5B, 3B, 14B and 32B models are now live. But nothing about 1.5B and 7B models... Curious.
Again, the existing repos have been changed incompatibly and without notice (token changes, the context length of existing models reduced...). What the heck is going on here?
@huybery the repos are most definitely not changed back to their initial state, just look at the commit history
Indeed, all 2.5-coder repos have been changed incompatibly and silently without any explanation again.
@mradermacher I can only see the eos change commit
@djuna There have been numerous commits. Also, the eos change was made without any documentation or comment other than "Update config.json", which is just as silent as the earlier "update weights" commit.
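For anyone trying to work out what exactly changed (e.g. the eos token or the reduced context length), one option is to diff config.json between two revisions. A rough sketch with huggingface_hub; the old commit hash is a placeholder and the repo id is just an example:

```python
# Rough sketch: compare config.json across two revisions to see which keys
# changed (e.g. eos_token_id, max_position_embeddings). The old commit hash
# is a placeholder; the repo id is just an example.
import json
from huggingface_hub import hf_hub_download

repo = "Qwen/Qwen2.5-Coder-32B-Instruct"
old_path = hf_hub_download(repo, "config.json", revision="OLD_COMMIT_HASH_HERE")
new_path = hf_hub_download(repo, "config.json", revision="main")

with open(old_path) as f:
    old_cfg = json.load(f)
with open(new_path) as f:
    new_cfg = json.load(f)

for key in sorted(set(old_cfg) | set(new_cfg)):
    if old_cfg.get(key) != new_cfg.get(key):
        print(f"{key}: {old_cfg.get(key)} -> {new_cfg.get(key)}")
```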