Lower R for smaller model size?

#2 · opened by wsxiaoys

Thanks for the amazing work!

I noticed that most of the LoRA models you shared are around 150 MB, which suggests you're using a large rank r in LoRA training (128?).
Meanwhile, based on the discussion here: https://github.com/cloneofsimo/lora/discussions/37, it seems LoRA generalizes well with r = 4 or 8.

Have you run any experiments with a small r like 4 or 8, which would bring the model size down to around 10 MB? That would make runtime model switching and sharing much easier.
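
For a rough sense of the scaling: for each adapted weight of shape d_out × d_in, LoRA trains two low-rank factors totaling r × (d_in + d_out) parameters, so the adapter file grows linearly with r. Here is a minimal back-of-the-envelope sketch in Python; the layer shapes and counts are made-up placeholders, not any real architecture:

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    # LoRA factors the weight update into B (d_out x r) @ A (r x d_in),
    # adding r * (d_in + d_out) trainable parameters per adapted layer.
    return r * (d_in + d_out)


def adapter_size_mb(layers, r, bytes_per_param=2):
    # Total adapter size in MB, assuming fp16 storage (2 bytes/param).
    total = sum(lora_params(d_in, d_out, r) for d_in, d_out in layers)
    return total * bytes_per_param / 1e6


# Hypothetical workload: 256 adapted projections, each 1024 x 1024.
layers = [(1024, 1024)] * 256

for r in (128, 8, 4):
    print(f"r = {r:>3}: ~{adapter_size_mb(layers, r):.1f} MB")
```

Since the parameter count is linear in r, dropping from r = 128 to r = 8 shrinks a ~150 MB adapter by a factor of 16, to roughly 9-10 MB.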

Not really; I'll wait for someone else to run those tests. Right now I'm busy enough with what I'm already doing.

YoungMasterFromSect changed discussion status to closed
