Komorebi
Multi-phase KTO RP series
This model is the product of a multi-phase KTO fine-tuning process that follows the jondurbin gutenberg approach. The process yields 3 separate LoRAs, which are merged in sequence.
The resulting model shows a significant decrease in Llama 3.1 slop outputs.
Experimental. Please give feedback. Begone if you demand perfection.
I did most of my testing with temp 1.4, min-p 0.15, DRY 0.8. I also experimented with XTC enabled at threshold 0.1, probability 0.50.
As context grows, you may want to bump temp and min-p and maybe even DRY.
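For reference, here is a rough sketch of the settings above as a Python dict, plus a tiny illustration of what min-p does to a token distribution. The key names are hypothetical and differ between backends, and reading "DRY 0.8" as the DRY multiplier is an assumption, so check your frontend's docs before copying anything. `min_p_filter` is an illustrative helper, not code from any particular backend.

```python
# Sampler settings from this card. Key names are hypothetical;
# map them to whatever your backend or frontend actually calls them.
SETTINGS = {
    "temperature": 1.4,
    "min_p": 0.15,
    "dry_multiplier": 0.8,   # assumption: "DRY 0.8" refers to the DRY multiplier
    "xtc_threshold": 0.1,    # optional, only if you enable XTC
    "xtc_probability": 0.5,  # optional, only if you enable XTC
}


def min_p_filter(probs, min_p):
    """Drop tokens whose probability is below min_p * (top probability),
    then renormalize the survivors so they sum to 1."""
    cutoff = min_p * max(probs)
    kept = [p if p >= cutoff else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]


# Toy distribution over four tokens: the cutoff is 0.15 * 0.5 = 0.075,
# so only the last token (0.05) is removed.
filtered = min_p_filter([0.5, 0.3, 0.15, 0.05], SETTINGS["min_p"])
```

Because the cutoff scales with the top token's probability, a relatively high min-p like 0.15 is what keeps temp 1.4 from going off the rails: the low-probability tail is cut before it can be sampled.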