In my personal experience, this is currently the best model for Chinese role-play chat.
acsr-y34b-4bpw-hb6-exl2
- base model: Yi-34B-Chat
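- LoRA: Yi-34b-alpaca-cot-lora (adds support for Alpaca-format conversations, but the results are poor; the Alpaca instruction format is not recommended)
- LoRA: Yi-34B-Spicyboros-3.1-LoRA (unofficial conversation dataset)
- LoRA: limarpv3-yi-llama-34b-lora (long role-play replies)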
- Instruction template: ChatML
- Original max token size: 4096
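- When text-generation-webui is the backend and SillyTavern is the frontend, setting MaxToken to 8K in the webui requires setting SillyTavern's MaxToken to 18K; otherwise the WebUI truncates early
- With an 8K context length and Alpha set to about 2.5, repeated replies and lower reply quality are still unavoidable once the chat exceeds 6K tokens
- To get Roleplay-style long replies in SillyTavern, make the following adjustments
- Use ChatML as the instruction format
- Enable
Wrap Sequences with Newline
- Set the Last Output Sequence to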
### Response (2 paragraphs, engaging, natural, authentic, descriptive, creative):
- Set Custom Stopping Strings to
["<|im_end|>用户", "\n### Input"]
- Set the system prompt to
Avoid repetition, don't loop. Develop the plot slowly, always stay in character. Describe all actions in full, elaborate, explicit, graphic, and vivid detail. Mention all relevant sensory perceptions.
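To make the SillyTavern settings above more concrete, here is a minimal Python sketch of how a ChatML prompt might be assembled with them, and how a reply could be cut at the custom stopping strings. The function names `build_prompt` and `truncate_at_stop` are hypothetical, and the sketch assumes the Last Output Sequence stands in for the usual assistant header on the final turn; SillyTavern's actual prompt assembly may differ in detail.

```python
# Illustrative only: helper names are hypothetical, not part of SillyTavern.
IM_START, IM_END = "<|im_start|>", "<|im_end|>"

SYSTEM_PROMPT = (
    "Avoid repetition, don't loop. Develop the plot slowly, always stay in "
    "character. Describe all actions in full, elaborate, explicit, graphic, "
    "and vivid detail. Mention all relevant sensory perceptions."
)
LAST_OUTPUT_SEQUENCE = (
    "### Response (2 paragraphs, engaging, natural, authentic, "
    "descriptive, creative):"
)
STOP_STRINGS = ["<|im_end|>用户", "\n### Input"]


def build_prompt(system, turns):
    """Build a ChatML prompt from (role, text) pairs.

    Each sequence is followed by a newline, mirroring the
    "Wrap Sequences with Newline" option.
    """
    parts = [f"{IM_START}system\n{system}{IM_END}\n"]
    for role, text in turns:
        parts.append(f"{IM_START}{role}\n{text}{IM_END}\n")
    # Assumption: the last output sequence replaces the assistant header.
    parts.append(f"{LAST_OUTPUT_SEQUENCE}\n")
    return "".join(parts)


def truncate_at_stop(text, stops):
    """Cut generated text at the earliest occurrence of any stop string."""
    cut = len(text)
    for stop in stops:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


if __name__ == "__main__":
    prompt = build_prompt(SYSTEM_PROMPT, [("user", "Hello!")])
    print(prompt)
    print(truncate_at_stop("She smiles.\n### Input: next", STOP_STRINGS))
```

Printing the prompt this way is a quick check that the model sees ChatML turns followed by the Roleplay-style response header.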
description
- This is a test of an exllamav2 model.
- 4bpw
python convert.py -i acsr-v2-y34b -c exl2/0000.parquet -o acsr-v2-y34b-4bpw-hb6-exl2 -hb 6 -l 4096 -b 4.15
- convert doc
- calibration dataset: WikiText-2-v1
- For oobabooga/text-generation-webui, add
--trust-remote-code
to CMD_FLAGS.txt and use the ExLlamav2 loader to load the model
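As a rough sanity check on the `-b 4.15` setting in the convert command above, the following Python sketch estimates the size of the quantized weights from an average bits-per-weight figure. The parameter count of about 34.4 billion is an assumption, not something stated in this card, and the estimate ignores the 6-bit head layer (`-hb 6`), the KV cache, and runtime overhead.

```python
# Rough estimate only; ASSUMED_PARAMS is an assumption, not a measured value.
ASSUMED_PARAMS = 34.4e9  # approximate parameter count of a Yi-34B model


def approx_weight_bytes(n_params, bits_per_weight):
    """Approximate storage for the weights at a given average bit width."""
    return n_params * bits_per_weight / 8


def to_gib(n_bytes):
    """Convert bytes to GiB (2**30 bytes)."""
    return n_bytes / 2**30


if __name__ == "__main__":
    size = approx_weight_bytes(ASSUMED_PARAMS, 4.15)
    print(f"approx. weight size: {to_gib(size):.1f} GiB")
```

The result is only a lower bound on the VRAM needed, since context length adds cache memory on top of the weights.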