Training details?
#2, opened by MicPie
Hi, great work with the strong RM!
I recently found your blog post (https://efficient-unicorn-451.notion.site/Reward-Modeling-for-RLHF-abe03f9afdac42b9a5bee746844518d0) and wondered whether you used the same, or a very similar, recipe for Llama-3 as the ones outlined there for Gemma and Mistral-7B?
Thank you and keep up the great work! :-)
Yes. The training recipe is similar to the previous ones.