What is the context size of this model?
Howdy! This one looks to be a very interesting model (big fan of role-playing and story-writing LLMs). Could you please let me know what its native context size is? Thank you in advance for the answer!
It's a Yi-200k finetune so 200k.
Yeah, as @Light4Bear says it might go up to 200K, but during training it was only up to ~31K.
You are already doing a good job. Most finetunes would train at 4k ctx anyway.
Hey, thanks for the replies and kudos for training it on bigger data! So I tested the model using SillyTavern as a frontend with your recommended settings, but sadly, I was unable to make it work properly for my roleplay. I assume it's simply strictly tailored for use on your website. Or I was doing something wrong, and will gladly accept the feedback if so! Perhaps someone got it to work better.
I first tried the recommended prompt and instructions (with names) for role-play from the official guide, but it produced only Markdown-style responses (whereas I use models to write novel-style replies in long paragraphs), and it also kept writing for the user. That should be easy to fix by adding a custom Stopping String, though. Here's a screenshot of what I got.
I managed to get it to write in a more novel-like style once, using my own system prompt and the classic ChatML (user/assistant) format, as seen below; however, it still played for me, as if it were continuing a novel. I assume this may be because this model was crafted specifically with creative writing in mind.
Sadly, no matter what I did, it either produced weirdly formatted responses or devolved into novel-writing again... The screenshot below is from a test with the Alpaca Instruct format (I tried it just in case).
I also ran one test on my ERP testing chat with around 16k context and got... a very strange suggestion of how the scene should continue? I kind of laughed a bit with this one, hah. Oh, and it also gets the names incorrect sometimes.
My character cards are written descriptive-style, similarly to what is recommended on your guide website, so I haven't touched them. Below are my settings that I used (classic ChatML style).
System String: https://files.catbox.moe/eim1xw.json
Instruct: https://files.catbox.moe/kdlwip.json
Settings: https://files.catbox.moe/81y4sf.json
If I had to judge just the writing alone, I'm also not super impressed. The style seems kind of... bland? Basically, it feels like it was written by an AI, which is super strange given the fact that you trained it with human data only. Apologies for putting it so bluntly. But, once again - this may be completely on my end due to my wrong prompting/formatting, so please take my opinion with a grain of salt. So far, this model is a pass for me, unless someone actually got it to work on their frontend and will be kind enough to share their setup, so I can re-test this LLM properly and do a proper review. I'm always on the lookout for new long-context models capable of creative writing/roleplaying, and I'm always happy to write new recommendations.
Perhaps I should also simply test the model on your website, to see how it fares in its "natural habitat"? Could write a review based on that.
Hey there, thank you for the super extensive testing!
In your instruct setting, you are using `<|im_start|>assistant`, which should be `<|im_start|>text`. In training, the stories and role-play used the `text` role. The `assistant` role will not be good at writing, I think.
You can see the code to generate the correct prompt here: https://huggingface.co/dreamgen/opus-v1.2-7b/blob/main/configs/opus-v1.py
Ideally it should look something like this, once assembled:
<|im_start|>system
You are an intelligent, skilled, versatile writer.
Your task is to write a story based on the information below.
Write the story as if it's a book.
## Plot description:
This is a fanfiction from the Harry Potter universe. In this alternate reality, Harry Potter is evil and secretly siding with Slytherin.
Up until now, Harry was pretending to be friends with Hermione and Ron, that changes when he invites Hermione to his chambers where he tricks her to drink Amorentia, the most powerful love potion.
## Characters:
### Harry Potter
Harry Potter in this fanfiction is secretly a member of Slytherin and is using his powers for evil rather than for good. Up until now, he was pretending to be friends with Hermione and Ron.
### Hermione Granger
Hermione appears just like in the original books.<|im_end|>
<|im_start|>user
The story starts with Harry welcoming Hermione into his chambers, who he invited there earlier that day. He offers her water to drink, but it contains a love potion.<|im_end|>
<|im_start|>text
“Come in,” said Harry, waving at the doorway behind Hermione’s back.
“Hello?” she said, stepping inside, “what did you want me to come up here for?”
“Well, I thought we could get away from all the noise down there, have a chat about what we plan to do for Christmas…” Harry said, fumbling for words. He had never really been any good with girls. “But anyway, please, take a seat and let me get us some water!” he said, darting over to the sideboard.
He returned quickly with two glasses of water. Hermione took hers and thanked him, taking in a big gulp. As soon as she swallowed, Harry saw her eyes widen as her heart began beating wildly in her chest.
It worked! Harry thought, grinning to himself. Amorentia truly was the world’s best love potion, its effects lasting twice as long and being five times stronger.
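A prompt in the shape shown above can be assembled with a small helper. This is only a minimal sketch, not the code from the linked opus-v1.py: `build_prompt` and its parameters are hypothetical names, and ending the prompt with an open `text` turn for the model to continue is an assumption based on the usual ChatML convention.

```python
from typing import List, Tuple

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_prompt(turns: List[Tuple[str, str]], open_role: str = "text") -> str:
    """Join (role, content) turns into a ChatML-style prompt.

    The prompt ends with an unclosed `open_role` turn so the model
    continues the story in that role (assumed behaviour, see lead-in).
    """
    parts = [f"{IM_START}{role}\n{content}{IM_END}" for role, content in turns]
    parts.append(f"{IM_START}{open_role}\n")
    return "\n".join(parts)


# Hypothetical usage: a system turn, one user instruction, then an open text turn.
prompt = build_prompt(
    [
        ("system", "You are an intelligent, skilled, versatile writer."),
        ("user", "The story starts with Harry welcoming Hermione into his chambers."),
    ]
)
print(prompt)
```

The point is that story and role-play output is requested under the `text` role rather than `assistant`, which is what the instruct template in a frontend needs to reproduce.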
Here are some preliminary ST settings from a user on the ST Discord: https://discord.com/channels/1100685673633153084/1132396046480314388/1210526004146077707
I will be playing around with it and tweaking it further.
For role-play, you may also want to add a `## Style description:` section to the system prompt, something like this:
## Style description:
This story is written as a second-person narrative from the perspective of {{user}}.
You can also add something like `romantic scenes are described vividly, with great detail`, etc. More examples in the guide: https://dreamgen.com/docs/models/opus/v1
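If you build the system prompt in code, the style section can be appended like this. It is a rough sketch: `add_style_description` is a hypothetical helper, and the exact whitespace around the heading is an assumption, so check the guide for the canonical layout.

```python
from typing import List

STYLE_HEADING = "## Style description:"


def add_style_description(system_prompt: str, style_lines: List[str]) -> str:
    """Append a style section to an existing system prompt.

    Empty lines in `style_lines` are dropped; each remaining line
    becomes one line under the heading.
    """
    body = "\n".join(line.strip() for line in style_lines if line.strip())
    return f"{system_prompt.rstrip()}\n{STYLE_HEADING}\n{body}"


# Hypothetical usage; {{user}} is left as-is for the frontend to substitute.
system_prompt = add_style_description(
    "Write the story as if it's a book.",
    [
        "This story is written as a second-person narrative from the perspective of {{user}}.",
        "Romantic scenes are described vividly, with great detail.",
    ],
)
print(system_prompt)
```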
Oh my, thank you so much for the guide! You are the best! I'll rewrite my prompt then using your recommendations and will get back to you after more testing! Super excited to see how it fares now, thank you!
EDIT: I snorted when I checked that the example ST setting you linked on Discord is actually the old system prompt I designed, ahaha. Glad to see people are using it, though!
I made some updates to match the Opus V1 template as closely as possible: https://huggingface.co/dreamgen/opus-v1.2-7b/tree/main/configs/silly_tavern