Finetune on "uncensored" dataset?
#32
by sivarajan
The datasets used for fine-tuning the model introduce significant bias in responses and a marked reduction in capability, most famously the verbal tic "I'm sorry, but as a large language model…". Have you considered fine-tuning Falcon on datasets with such responses removed?
See evol_instruct_unfiltered and ShareGPT_unfiltered.
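For reference, here is a rough sketch of the kind of refusal filtering those "unfiltered" variants apply, assuming a ShareGPT-style layout (a JSON list of records, each with a "conversations" list of `{"from": ..., "value": ...}` turns). The file names and marker phrases are illustrative, not the actual cleaning scripts.

```python
# Sketch: drop conversations where any assistant turn contains a refusal phrase.
# Assumes ShareGPT-style JSON; paths and phrase list are hypothetical.
import json

REFUSAL_MARKERS = [
    "as a large language model",
    "as an ai language model",
    "i'm sorry, but",
    "i cannot fulfill",
]

def is_refusal(text: str) -> bool:
    """Return True if a model turn contains a known refusal phrase."""
    lowered = text.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)

def keep(example: dict) -> bool:
    """Keep a conversation only if no assistant ("gpt") turn is a refusal."""
    return not any(
        turn["from"] == "gpt" and is_refusal(turn["value"])
        for turn in example.get("conversations", [])
    )

with open("sharegpt.json") as f:  # hypothetical input path
    data = json.load(f)

filtered = [ex for ex in data if keep(ex)]

with open("sharegpt_unfiltered.json", "w") as f:  # hypothetical output path
    json.dump(filtered, f, ensure_ascii=False, indent=2)

print(f"kept {len(filtered)} of {len(data)} conversations")
```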
That would be amazing!
The censored models are not only biased, but also less useful as a result.