Judging LLM-as-a-judge with MT-Bench and Chatbot Arena
Paper • 2306.05685
Collection of articles and resources on the automatic evaluation of LLMs and on their role as unbiased judges in assessing other LLMs' outputs.