vicgalle committed on
Commit
2851088
1 Parent(s): 6b48a30

Update README.md

Files changed (1):
  README.md +2 -2
README.md CHANGED
@@ -10,12 +10,12 @@ datasets:
 - Undi95/Weyaxi-humanish-dpo-project-noemoji
 ---
 
-# Humanish-RP-Llama-3.1-8B
+# Humanish-Roleplay-Llama-3.1-8B
 
 ![image/webp](https://cdn-uploads.huggingface.co/production/uploads/5fad8602b8423e1d80b8a965/VPwtjS3BtjEEEq7ck4kAQ.webp)
 
 A DPO-tuned Llama-3.1 to behave more "humanish", i.e., avoiding all the AI assistant slop. It also works for role-play (RP). To achieve this, the model was fine-tuned over a series of datasets:
-* General conversations from Claude Opus
+* General conversations from Claude Opus, from `Undi95/Meta-Llama-3.1-8B-Claude`
 * `Undi95/Weyaxi-humanish-dpo-project-noemoji`, to make the model react as a human, rejecting assistant-like or too neutral responses.
 * `ResplendentAI/NSFW_RP_Format_DPO`, to steer the model towards using the \*action\* format in RP settings. Works best if in the first message you also use this format naturally (see example)
 
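
The last bullet notes that the \*action\* format works best when your own first message already uses it. As a quick illustration of what such an opening turn looks like (the helper function and dialogue below are hypothetical, not taken from the model card):

```python
# Illustrative sketch of the *action* first-message format recommended
# for role-play; the helper name and prompt text here are hypothetical,
# not part of the model card itself.
def make_first_message(action: str, speech: str) -> dict:
    """Wrap narration in *asterisks*, then append the spoken line."""
    return {"role": "user", "content": f"*{action}* {speech}"}

msg = make_first_message("leans over the bar", "So, what brings you here?")
print(msg["content"])  # *leans over the bar* So, what brings you here?
```

The resulting dict can then be passed as the opening turn of a chat-style request, e.g. in the `messages` list of a `transformers` text-generation pipeline.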