Update README.md #1
by jordiclive - opened

README.md CHANGED
@@ -69,11 +69,10 @@ This model was trained on:
 - [togethercomputer/RedPajama-Data-1T-Sample](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T)
 - [atom-in-the-universe/fanfics-10k-50k](https://huggingface.co/datasets/atom-in-the-universe/fanfics-10k-50k)
 
-The dataset [shahules786/orca-chat](https://huggingface.co/datasets/shahules786/orca-chat) combines similar
-
-to improve long-context trainig.
+The dataset [shahules786/orca-chat](https://huggingface.co/datasets/shahules786/orca-chat) combines similar examples of the GPT-4 subset of [ehartford/dolphin](https://huggingface.co/datasets/ehartford/dolphin) to form longer conversations
+to improve long-context training.
 
-RedPajama and FanFics were
+Additionally, RedPajama and FanFics were used for classic language modelling as an auxiliary task to improve the RoPE scaling for the 8k context size.
 
 
 ## Model Configuration

@@ -130,15 +129,15 @@ llama2_13b_orca_8k:
 # Developers
 
 - [shahules786](https://github.com/shahules786)
-- [
+- [jordiclive](https://github.com/jordiclive)
 - [andreaskoepf](https://github.com/andreaskoepf/)
 
 # Special Thanks
 
-We want to especially thank Eric
-Also shoutout to the whole team working on [LLongMA-2-13b](https://huggingface.co/conceptofmind/LLongMA-2-13b) & the [scaled-rope](https://github.com/jquesnelle/scaled-rope) repository for their awesome work: bloc97, jquesnelle & conceptofmind!
+We want to especially thank Eric Hartford who spared no expense in replicating ORCA and making it available at [ehartford/dolphin](https://huggingface.co/datasets/ehartford/dolphin)!
+Also, shoutout to the whole team working on [LLongMA-2-13b](https://huggingface.co/conceptofmind/LLongMA-2-13b) & the [scaled-rope](https://github.com/jquesnelle/scaled-rope) repository for their awesome work: bloc97, jquesnelle & conceptofmind!
 
-The whole Open-Assistant team is very grateful for the continued support of [Redmond.ai](https://redmond.ai/) who sponsored the training compute for this model.
+The whole Open-Assistant team is very grateful for the continued support of [Redmond.ai](https://redmond.ai/) who sponsored the training compute required for this model.
 
 # License
 
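For context on the RoPE-scaling sentence added in the first hunk: the snippet below is an illustrative sketch only, not part of this PR or of the model card's actual configuration. It shows how linear RoPE scaling from Llama 2's 4,096-token pretraining context to an 8k context is commonly declared with the Hugging Face `transformers` `LlamaConfig`; the factor of 2.0 is an assumption derived from 8192 / 4096, not a value quoted in the diff.

```python
# Illustrative sketch only -- not taken from this PR. Linear ("position
# interpolation") RoPE scaling stretches the rotary position indices so that
# a model pretrained with a 4k context can be fine-tuned and run at 8k.
from transformers import LlamaConfig

config = LlamaConfig(
    max_position_embeddings=8192,                    # target 8k context size
    rope_scaling={"type": "linear", "factor": 2.0},  # assumed: 8192 / 4096
)
print(config.rope_scaling)
```

Under a setup like this, the long packed orca-chat conversations supply 8k-token supervised examples, while the RedPajama and FanFics language-modelling data described in the diff gives the scaled RoPE additional long sequences to adapt to.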