Can you increase LLAMA3 8b simply by duplicating some layers?
#2
by Regrin - opened
Tell me, can you enlarge LLAMA3 8b simply by duplicating some of its layers?
Would that be of any use? I would like a model of, say, 13b that is both easy to train and reasonably smart. I'm hoping such a transformation could preserve the model's performance on the one hand and improve its training prospects on the other.
And if you do this, will the model lose any performance? If not, that's great! It would then be possible to fine-tune it on GPT4-generated datasets, and the result should be much better than with the 8b model.
Am I right that the 8b models have reached the limit of their capabilities?
I don't think they have. There's still a lot of performance you can squeeze out of 8B models.
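As for the duplication idea itself: what you're describing is usually called depth up-scaling (sometimes "frankenmerging"); SOLAR 10.7B was built this way by duplicating the middle layers of a 32-layer 7B model. Below is a minimal sketch with `transformers`, assuming the standard `LlamaForCausalLM` layout (`model.model.layers`) and an illustrative overlapping 24+24 split; the exact layer ranges and output directory are my own choices, not a tested recipe.

```python
import copy

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B"
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Llama 3 8B has 32 decoder layers. Keep layers 0-23, then append deep copies
# of layers 8-31, giving 48 layers (~11.5B params) with an overlapping middle.
# The copies must be deep copies, otherwise the repeated layers share weights.
layers = model.model.layers
new_layers = list(layers[:24]) + [copy.deepcopy(layer) for layer in layers[8:32]]

# Re-index the attention modules so the KV cache stays consistent
# (recent transformers versions route cache entries by layer_idx).
for i, layer in enumerate(new_layers):
    layer.self_attn.layer_idx = i

model.model.layers = torch.nn.ModuleList(new_layers)
model.config.num_hidden_layers = len(new_layers)

model.save_pretrained("llama3-48layer-upscaled")
AutoTokenizer.from_pretrained(model_id).save_pretrained("llama3-48layer-upscaled")
```

Two caveats: the stitched model usually scores somewhat *worse* than the base 8b right after surgery and only pays off after a "healing" fine-tune (your GPT4 datasets would serve that purpose), so don't expect 13b-level quality out of the box. And if you'd rather not edit the model by hand, the mergekit library's `passthrough` merge method does the same layer-slicing declaratively from a YAML config.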