Training details?

#2
by MicPie - opened

Hi, great work with the strong RM!

I recently found your blog post (https://efficient-unicorn-451.notion.site/Reward-Modeling-for-RLHF-abe03f9afdac42b9a5bee746844518d0) and wondered whether you used the same, or a very similar, recipe for Llama-3 as the ones outlined there for Gemma and Mistral-7B?

Thank you and keep up the great work! :-)

FsfairX org

Yes. The training recipe is similar to the previous ones.