Capybara Preferences

distilabel-internal-testing's Collections

Capybara and SystemChat-1.1 Preferences with SOTA LLMs

updated Apr 17

This collection contains the results of extending `LDJnr/Capybara` into a preference dataset using 7B LLMs.


LDJnr/Capybara

Viewer • Updated Jun 7 • 16k • 380 • 225

Note: The original Capybara dataset for SFT (each conversation ends with an assistant response).

argilla/distilabel-capybara-dpo-7k-binarized

Viewer • Updated Jul 16 • 7.56k • 856 • 174

Note: The first iteration on Argilla's end, generating responses with `argilla/notus-7b-v1`, `teknium/OpenHermes-2.5-Mistral-7B`, and `mlabonne/NeuralBeagle14-7B`, then rating them with GPT-4 as the judge via UltraFeedback in `distilabel`.

distilabel-internal-testing/Capybara-Deduped

Viewer • Updated Apr 15 • 16k • 39

Note: A subset of `LDJnr/Capybara` with duplicates dropped, since the Dove subset apparently contains some duplicate entries.
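
The note does not say how duplicates were detected. As a minimal sketch, assuming each row carries a `conversation` field (a list of turns) and that "duplicate" means an exact match on that field, deduplication could look like the following. The helper name `dedupe_rows`, the field names, and the toy rows are illustrative assumptions, not the code used to build this subset.

```python
import json


def dedupe_rows(rows, key_field="conversation"):
    """Keep the first occurrence of each distinct conversation (exact match)."""
    seen = set()
    unique = []
    for row in rows:
        # Serialize with sorted keys so equal conversations give equal strings.
        fingerprint = json.dumps(row[key_field], sort_keys=True, ensure_ascii=False)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(row)
    return unique


# Toy rows standing in for the real dataset (assumed schema).
rows = [
    {"source": "Dove", "conversation": [{"input": "Hi", "output": "Hello!"}]},
    {"source": "Dove", "conversation": [{"input": "Hi", "output": "Hello!"}]},
    {"source": "other", "conversation": [{"input": "2+2?", "output": "4"}]},
]
deduped = dedupe_rows(rows)
```

Exact matching is the simplest possible criterion; near-duplicates (for example, differing only in whitespace) would need normalization before fingerprinting.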

distilabel-internal-testing/Capybara-Deduped-Remaining

Viewer • Updated Apr 15 • 8.38k • 39

Note: A subset of `distilabel-internal-testing/Capybara-Deduped` with the rows that were already generated and judged in `argilla/distilabel-capybara-dpo-7k-binarized` removed.
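
Removing already-processed rows amounts to an anti-join between the two datasets. The sketch below assumes rows can be matched on an exact `conversation` field; the actual matching key is not stated in the note, and `remaining_rows`, `fingerprint`, and the toy data are hypothetical names used only for illustration.

```python
import json


def fingerprint(conversation):
    """Stable string key for a conversation (assumed matching criterion)."""
    return json.dumps(conversation, sort_keys=True, ensure_ascii=False)


def remaining_rows(all_rows, processed_rows, key_field="conversation"):
    """Return the rows of all_rows whose key does not appear in processed_rows."""
    done = {fingerprint(row[key_field]) for row in processed_rows}
    return [row for row in all_rows if fingerprint(row[key_field]) not in done]


# Toy data: three deduplicated rows, one of which was already judged.
all_rows = [
    {"conversation": [{"input": "A", "output": "a"}]},
    {"conversation": [{"input": "B", "output": "b"}]},
    {"conversation": [{"input": "C", "output": "c"}]},
]
processed_rows = [{"conversation": [{"input": "B", "output": "b"}]}]
remaining = remaining_rows(all_rows, processed_rows)
```

Building the set of processed keys once keeps the filter linear in the size of both datasets.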

distilabel-internal-testing/Capybara-Preferences-Remaining

Viewer • Updated Apr 17 • 7.84k • 52

Note: This subset contains the generations and preferences for the samples in `distilabel-internal-testing/Capybara-Deduped-Remaining`; it is meant to be merged into `argilla/distilabel-capybara-dpo-7k-binarized`.

argilla/Capybara-Preferences

Viewer • Updated May 9 • 15.4k • 296 • 38

Note: The final dataset, built on top of `LDJnr/Capybara` by generating alternative completions with `argilla/notus-7b-v1`, `teknium/OpenHermes-2.5-Mistral-7B`, and `mlabonne/NeuralBeagle14-7B`, then rating them with GPT-4 as the judge via UltraFeedback in `distilabel`.
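
To show how judge ratings can become a preference pair, here is a minimal sketch. It assumes each sample has a list of candidate completions and a parallel list of numeric ratings, and it picks the highest-rated completion as `chosen` and the lowest-rated as `rejected`. This selection rule and the function name `binarize` are assumptions for illustration; the note does not describe the rule actually used for this dataset.

```python
def binarize(generations, ratings):
    """Turn rated completions into one chosen/rejected pair, or None if impossible."""
    if len(generations) != len(ratings) or len(generations) < 2:
        return None
    # Indices ordered from lowest to highest rating.
    order = sorted(range(len(ratings)), key=lambda i: ratings[i])
    worst, best = order[0], order[-1]
    # Skip samples where the judge expressed no preference at all.
    if ratings[best] == ratings[worst]:
        return None
    return {"chosen": generations[best], "rejected": generations[worst]}


# Toy example: three completions with judge ratings.
pair = binarize(["completion a", "completion b", "completion c"], [3, 5, 1])
tie = binarize(["completion a", "completion b"], [4, 4])
```

Samples where every completion receives the same rating carry no preference signal, so the sketch drops them rather than picking a pair arbitrarily.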
