brucethemoose committed
Commit 953d2ad
Parent(s): a62d438
Update README.md

README.md CHANGED
@@ -29,7 +29,7 @@ Disappointed with some quirks of my previous kitchen sink merges (like token/ins
 
 - [DrNicefellow/migtissera/Tess-M-Creative-v1.0](https://huggingface.co/migtissera/Tess-M-Creative-v1.0) and [NousResearch/Nous-Capybara-34B](https://huggingface.co/NousResearch/Nous-Capybara-34B) are both "undertrained" Yi models. I find they excel at raw completion performance (like long novel continuations) while still retaining some Vicuna instruct ability. This may be why some still prefer the original Tess 1.0/Capybara merge.
 
-I consider this a more "focused" merge
+I consider this a more "focused" merge than previous ones. I will investigate other models (perhaps ChatML models?) for a more "factual assistant" focused merge, as well as a coding-focused merge if I can't find one to suit my needs.
 
 
 ## Prompt template: Orca-Vicuna

@@ -60,6 +60,8 @@ To load/train this in full-context backends like transformers, you *must* change
 
 See: https://huggingface.co/brucethemoose/Yi-34B-200K-DARE-megamerge-v8#testing-notes
 
+This is a possible base for a storytelling finetune/LASER in the future, once I bite the bullet and rent some A100s or an MI300.
+
 I have tested this merge with novel-style continuation (but not much chat-style roleplay), as well as some assistant-style responses and long-context analysis. I haven't seen any refusals so far.
 
 ## Merge Details
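The second hunk's context line notes that loading or training this model at full context in transformers requires changing the config. A minimal, offline sketch of that kind of override, assuming the Yi-200K weights load with the Llama architecture in transformers and that 200K is the intended window (both are assumptions here, not details from this commit):

```python
# Illustrative only: Yi-34B-200K-family models are Llama-architecture, so the
# context window lives in the config's max_position_embeddings field. With the
# real checkpoint you would load its config via AutoConfig.from_pretrained(...)
# and override the same field before passing it to from_pretrained.
from transformers import LlamaConfig

config = LlamaConfig(max_position_embeddings=200_000)  # raise to the full 200K window
print(config.max_position_embeddings)
```

In practice the overridden config is passed alongside the weights, e.g. `AutoModelForCausalLM.from_pretrained(model_id, config=config)`; the exact value to use is whatever the linked testing notes recommend.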