Garbage output? (Q6_K)

#1
by ThijsL202 - opened

So I'm using the Q6_K quant and my output is utter garbage. I've tried a lot of settings, presets, and story/instruct templates in SillyTavern, but the model doesn't produce a single good message. I mainly use Theia 21B v2, and compared to that model it's night and day. Is this the model itself? (Running with koboldcpp v1.74, context shift on, context size 8192.)

Is it not following instructions, or is it a text output issue (garbled? broken?)?
Can you give me an idea of the temp / rep pen settings used?

Although this model works with three (or more) templates, Alpaca seems the most stable (unclear why).
If you can give me some output from it, I will see what I can do.

Note: This model is composed of fine-tuned models; it is not a fine-tune itself. Theia is a "ground up" fine-tune.

You may want to try:
https://huggingface.co/DavidAU/L3-Dark-Planet-8B
or
https://huggingface.co/DavidAU/L3-Dark-Planet-8B-V2-Eight-Orbs-Of-Power-GGUF

These models are composed differently and retain all the function of their "parents", so to speak.
They do not have the same level of "character" as "Kaboom" (and the Dark Horror series), but they may be better for RP, and they have 8k context windows.

Will gather some examples.

Owner

I re-read your comment; try turning "context shift" off.
This model has a 131k context window.

Will try!

Since I'm horrible at explaining things myself, here's what ChatGPT made for me:

Below are some examples:

Character Used:

Parameters:

  • Preset: universal_light, 120 response tokens, 8k context
  • Story string template: ChatML, with Trim Incomplete Sentences enabled
  • Instruct template: turned off

Model Loading Info:

  • Koboldcpp Version: 1.74
  • Context size: 8192
  • Context shift: turned off
  • Flash attention: enabled
  • GPU layers: 200
  • Use QuantMatMul: enabled
  • BLAS batch size: 1024
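
For reference, here is a rough command-line equivalent of those loader settings. This is only a sketch: the model filename is a placeholder, and the flag names are taken from the KoboldCpp launcher, so double-check them against the 1.74 build.

```python
import subprocess

# Sketch of launching KoboldCpp with the settings listed above.
# "model.Q6_K.gguf" is a placeholder path, not the actual file name.
cmd = [
    "python", "koboldcpp.py",
    "--model", "model.Q6_K.gguf",  # placeholder for the Q6_K quant
    "--contextsize", "8192",       # Context size: 8192
    "--gpulayers", "200",          # GPU layers: 200
    "--flashattention",            # Flash attention: enabled
    "--blasbatchsize", "1024",     # BLAS batch size: 1024
    "--usecublas", "mmq",          # Use QuantMatMul (mmq): enabled
    "--noshift",                   # Context shift: turned off
]
subprocess.run(cmd, check=True)
```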

Example Output:

Excerpt 1:
"Our blades locked together with a resounding clang, sparks showering forth like falling stars. Our eyes locked, inches apart, nos鼻e to nose."

Excerpt 2:
"You speak of greater good yet you've only brought suffering and despair!" I growled back, my arm shaking with the strain of holding back her might. "The price you speak of, is one I'll never accept! I'll see you burn, witch!"

Issues:
1. Japanese kanji inserted into English words: For example, "nos鼻e" instead of "nose." This also happens with words like "lick舐."
2. Missing possessive pronouns: The model sometimes drops possessives and pronouns. For example, "breath hot on face" instead of "her breath hot on his face."

These are the issues I had at first, but they're waaaay rarer now.
I have no idea why the output was really, really bad those few times, but it's much better now. I can't reproduce it as easily; the Japanese characters still show up sometimes (very rarely), but a swipe takes care of it.

Thanks for the merge!

Thank you; this helps a lot.
The token issue can happen with these types of merges.

However, try turning off flash attention; sometimes it works great, other times it causes issues.
The pronoun issue is odd. I will test and see what I can replicate here.

Thanks again.

I fixed the possessive pronouns issue by disabling Exclude Top Choices (XTC).
So the only issue I really have is the model being... less coherent; it tends to use a lot of "slang" and leans a bit toward the angry/bad side when mimicking a personality.
