upstage/open-ko-llm-leaderboard · Announcement: Launch of Open Ko-LLM Leaderboard Season 2

upstage org Jul 30

Announcement: Launch of Open Ko-LLM Leaderboard Season 2

We are pleased to announce the imminent launch of Open Ko-LLM Leaderboard Season 2. The current Season 1 will conclude on Friday, August 2, and the new season will commence on August 12.

In the upcoming season, we will retire the following benchmarks: Ko-HellaSwag (provided by Upstage), Ko-MMLU (provided by Upstage), Ko-ARC (provided by Upstage), Ko-TruthfulQA (provided by Upstage), and Ko-CommonGen V2 (provided by Korea University NLP&AI Lab).

Season 2 will introduce an updated set of benchmarks, including:

Ko-GPQA (provided by Flitto)
Ko-WinoGrande (provided by Flitto)
Ko-GSM8K (provided by Flitto)
Ko-EQ-Bench (provided by Flitto)
Ko-IFEval (provided by Flitto)
KorNAT-Knowledge (provided by SELECTSTAR and KAIST AI)
KorNAT-Social-Value (provided by SELECTSTAR and KAIST AI)
Ko-Harmlessness (provided by SELECTSTAR and KAIST AI)
Ko-Helpfulness (provided by SELECTSTAR and KAIST AI)

All models submitted during Season 1 will undergo a comprehensive re-evaluation process.

We are also delighted to announce that the Ko-CommonGen V2 dataset will be made publicly available. The associated research paper, "KoCommonGEN v2: A Benchmark for Navigating Korean Commonsense Reasoning Challenges in Large Language Models" by Jaehyung Seo, Jaewook Lee, Chanjun Park, SeongTae Hong, Seungjun Lee, and Heuiseok Lim, is scheduled for presentation at ACL 2024 Findings, and the dataset will be released alongside the paper's publication. In addition, a paper on the leaderboard itself, "Open Ko-LLM Leaderboard: Evaluating Large Language Models in Korean with Ko-H5 Benchmark" (https://arxiv.org/abs/2405.20574) by Chanjun Park, Hyeonwoo Kim, Dahyun Kim, SeongHwan Cho, Sanghoon Kim, Sukyung Lee, Yungi Kim, and Hwalsuk Lee, will also be presented at ACL 2024, marking the conclusion of Season 1.

Since the launch of Season 1 on September 27, 2023, we have received over 1,700 model submissions. We extend our sincere gratitude for your active participation and invaluable contributions. The Open Ko-LLM Leaderboard is a non-commercial initiative, hosted by Upstage and NIA, with infrastructure support generously provided by KT Cloud and AICA. We also gratefully acknowledge data support from Korea University NLP&AI Lab, Flitto, SELECTSTAR, and KAIST AI.

Additionally, we would like to acknowledge the Hugging Face teams, particularly Clémentine Fourrier, Lewis Tunstall, Omar Sanseviero, and Philipp Schmid. Moreover, we would like to express our gratitude to Professor Harksoo Kim from Konkuk University, Professor Hwanjo Yu from Pohang University of Science and Technology, Professor Sangkeun Jung from Chungnam National University, and Professor Alice Oh from KAIST for their valuable advice provided for the Open Ko-LLM Leaderboard. Finally, we extend our heartfelt thanks to the open-source community for their invaluable contributions and feedback.

We look forward to your continued engagement and support as we transition to Open Ko-LLM Leaderboard Season 2.

Thank you for your attention and participation.

Sincerely,

Chanjun Park
Upstage

Chanjun pinned discussion Jul 30

alielfilali01

Jul 31

Looking forward to it 💪🏻