Santi Diana


How to add a new model to the Leaderboard

This guide explains how to add a new model to the Leaderboard. Follow these steps:

  1. Clone this repository with git clone, then cd add_new_model.
  2. If you want to evaluate the model and add it to the Leaderboard, run: python3 add_new_model.py --model_id <HF_model> --execute_eval True. This runs the evaluation in your own runtime and creates two artifacts: a results folder, which stores all the information from the evaluation process, and a file called mteb_metadata.yaml, which contains the processed metadata for that evaluation. The metadata is read, reordered, and added to the three CSV files in the data folder of this repository. app.py reads those CSV files, so be very careful when changing them.
  3. If you have already evaluated the model and do not want to run the evaluation again, run: python3 add_new_model.py --output_folder <results_folder>. Here <results_folder> is the folder that holds the JSON files from your previous evaluation with the MTEB library. It must be the actual results folder: if you pass any other folder (e.g. a parent folder of the results folder), the code raises an error. This route is very fast because no evaluation is run.
  4. After adding your model to the Leaderboard, remove the generated artifacts with two more commands: rm -r <results_folder> and rm mteb_metadata.yaml. These files are not needed in the repository, so we will not accept any PR that contains them.
  5. Once the previous steps are done, run python3 app.py from the parent folder of this repository to preview the Leaderboard locally before pushing it to the Hub.
  6. Add, commit, and push your changes with git. Remember not to push your results folder or mteb_metadata.yaml.
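
Steps 4 and 6 both depend on the evaluation artifacts being gone before you commit. The following is a minimal sketch of a helper that checks for them and removes them. It assumes the results folder and mteb_metadata.yaml sit in the directory you ran add_new_model.py from. The function names leftover_artifacts and remove_artifacts are hypothetical and are not part of this repository.

```python
import shutil
from pathlib import Path

# File name taken from the steps above. The results folder name is whatever
# folder your evaluation produced (an assumption: it lives under workdir).
METADATA_FILE = "mteb_metadata.yaml"


def leftover_artifacts(workdir, results_folder):
    """Return the evaluation artifacts that still exist under workdir."""
    workdir = Path(workdir)
    candidates = [workdir / results_folder, workdir / METADATA_FILE]
    return [path for path in candidates if path.exists()]


def remove_artifacts(workdir, results_folder):
    """Delete the results folder and metadata file; return what was removed."""
    removed = []
    for path in leftover_artifacts(workdir, results_folder):
        if path.is_dir():
            shutil.rmtree(path)  # same effect as: rm -r <results_folder>
        else:
            path.unlink()  # same effect as: rm mteb_metadata.yaml
        removed.append(path)
    return removed
```

Running remove_artifacts with the working directory and your results folder name, and then confirming that leftover_artifacts returns an empty list, is one way to make sure nothing from the evaluation ends up in your PR.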