A Bang Up Job

#4
by nightvision04 - opened

We needed another big win for the open source community. Thanks for taking a big risk for everyone.

You added several innovations here. Would you consider adding the 1.58 bit architecture in the future? I'm curious to know if it was considered.

I haven't seen ternary bits applied to an SSM yet, let alone a hybrid. Would be interesting to see if it's compatible.

Imagine the efficiency with MoE + Mamba + 1.58 bit 😳

Maybe make a higher-parameter version too. I imagine a 1.58-bit version could have the same memory footprint and speed as the 50B version while having a lot more parameters, if not double. Then I guess the question would be how we could shrink that even further, the way quantization already lets us do with fp16 models.
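For a rough sense of the numbers, here is a back-of-the-envelope sketch. It assumes the baseline weights are stored in fp16, counts weights only (no activations, SSM state, or KV cache), and ignores packing overhead and any layers that would stay in higher precision. The function names are made up for illustration.

```python
import math

# A ternary weight takes one of three values {-1, 0, +1},
# so its information content is log2(3) bits (about 1.58).
BITS_PER_TERNARY_WEIGHT = math.log2(3)


def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Memory needed for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def params_in_same_budget(num_params: float, from_bits: float, to_bits: float) -> float:
    """How many parameters fit in the same weight memory at a lower bit width."""
    return num_params * from_bits / to_bits


# Hypothetical baseline: 50B parameters stored in fp16 (16 bits per weight).
baseline_params = 50e9
fp16_gb = weight_memory_gb(baseline_params, 16)
ternary_gb = weight_memory_gb(baseline_params, BITS_PER_TERNARY_WEIGHT)
ternary_params = params_in_same_budget(baseline_params, 16, BITS_PER_TERNARY_WEIGHT)

print(f"fp16 weights:    {fp16_gb:.1f} GB")
print(f"ternary weights: {ternary_gb:.1f} GB")
print(f"params in the fp16 budget at ternary width: {ternary_params / 1e9:.0f}B")
```

Under those assumptions the ratio is 16 / log2(3), so on paper well over double the parameters would fit in the same weight memory. Whether speed would match is a separate question that depends on kernel and hardware support.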
