Training details?

by MicPie - opened Apr 25

MicPie

Apr 25

Hi, great work with the strong RM!

I recently found your blog post (https://efficient-unicorn-451.notion.site/Reward-Modeling-for-RLHF-abe03f9afdac42b9a5bee746844518d0) and wondered whether you used the same, or a very similar, recipe for Llama-3 as the ones outlined there for Gemma and Mistral-7B?

Thank you and keep up the great work! :-)

FsfairX org Apr 27

Yes. The training recipe is similar to the previous ones.
