Merge branch 'main' of https://huggingface.co/TMElyralab/lyraXVERSE into main
README.md
CHANGED
@@ -10,7 +10,7 @@ tags:
|
|
10 |
lyraXVERSE is currently the **fastest XVERSE-13b** available. Its inference speed reaches up to **3900+ tokens/s** on A100, up to a **2.7x** speedup over the Torch version.
|
11 |
|
12 |
Among its main features are:
|
13 |
-
- device: Nvidia GPU with Ampere architecture or Volta architecture (A100 or higher, V100).
|
14 |
- batch_size: compiled with dynamic batch size, maximum depends on device.
|
15 |
- MEMOPT mode: significantly optimized VRAM usage and increased speed
|
16 |
|
@@ -27,7 +27,7 @@ We use the XVERSE-13B-Chat model for measurement, but this optimized inference i
|
|
27 |
| Version | Batch Size 1 | Batch Size 8 | Batch Size 16 | Batch Size 32 | Batch Size 64 |
|
28 |
| --- | --- | --- | --- | --- | --- |
|
29 |
| Torch | 34.8 | 249.2 | 470.1 | 878.6 | 1478.9 |
|
30 |
-
| lyraXVERSE
|
31 |
|
32 |
## Docker Environment Recommendation
|
33 |
|
@@ -90,11 +90,6 @@ print(output_texts)
|
|
90 |
|
91 |
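This story tells us that a painter's value lies not only in their painting technique but in the emotion and hope their works bring to people. A painter's value does not lie in how expensive or how unique their paintings are, but in their ability to open people's hearts with their work, letting them see hope and the beauty of life.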
|
92 |
|
93 |
-
## TODO
|
94 |
-
1. Support for int4
|
95 |
-
2. Inference for longer context situations
|
96 |
-
3. Streaming inference mode.
|
97 |
-
|
98 |
## Citation
|
99 |
``` bibtex
|
100 |
@Misc{lyraXVERSE2023,
|
@@ -105,6 +100,6 @@ print(output_texts)
|
|
105 |
}
|
106 |
```
|
107 |
|
108 |
-
## Report
|
109 |
- Start a discussion to report any bugs: https://huggingface.co/TMElyralab/lyraXVERSE
|
110 |
- Report bugs with a `[bug]` mark in the title.
|
|
|
10 |
lyraXVERSE is currently the **fastest XVERSE-13b** available. Its inference speed reaches up to **3900+ tokens/s** on A100, up to a **2.7x** speedup over the Torch version.
|
11 |
|
12 |
Among its main features are:
|
13 |
+
- device: Nvidia GPU with Ampere architecture or Volta architecture (A10, A100 or higher, V100).
|
14 |
- batch_size: compiled with dynamic batch size, maximum depends on device.
|
15 |
- MEMOPT mode: significantly optimized VRAM usage and increased speed
|
16 |
|
|
|
27 |
| Version | Batch Size 1 | Batch Size 8 | Batch Size 16 | Batch Size 32 | Batch Size 64 |
|
28 |
| --- | --- | --- | --- | --- | --- |
|
29 |
| Torch | 34.8 | 249.2 | 470.1 | 878.6 | 1478.9 |
|
30 |
+
| lyraXVERSE | 96.6 | 725.5 | 1359.3 | 2415.6 | 3923.2 |
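The per-batch speedup implied by the two table rows can be checked with a few lines of Python. This is only a sketch: it assumes the table values are throughput in tokens/s (the unit used in the introduction), and the variable names are illustrative rather than part of lyraXVERSE.

```python
# Throughput figures copied from the benchmark table (assumed to be tokens/s).
batch_sizes = [1, 8, 16, 32, 64]
torch_tps = [34.8, 249.2, 470.1, 878.6, 1478.9]
lyra_tps = [96.6, 725.5, 1359.3, 2415.6, 3923.2]

# Speedup of lyraXVERSE over the Torch baseline at each batch size.
speedups = {
    bs: lyra / base
    for bs, base, lyra in zip(batch_sizes, torch_tps, lyra_tps)
}

for bs, ratio in speedups.items():
    print(f"batch size {bs:>2}: {ratio:.2f}x")
```

With these numbers the ratio stays between roughly 2.65x and 2.91x across the measured batch sizes.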
|
31 |
|
32 |
## Docker Environment Recommendation
|
33 |
|
|
|
90 |
|
91 |
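This story tells us that a painter's value lies not only in their painting technique but in the emotion and hope their works bring to people. A painter's value does not lie in how expensive or how unique their paintings are, but in their ability to open people's hearts with their work, letting them see hope and the beauty of life.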
|
92 |
|
|
|
|
|
|
|
|
|
|
|
93 |
## Citation
|
94 |
``` bibtex
|
95 |
@Misc{lyraXVERSE2023,
|
|
|
100 |
}
|
101 |
```
|
102 |
|
103 |
+
## Report bugs
|
104 |
- Start a discussion to report any bugs: https://huggingface.co/TMElyralab/lyraXVERSE
|
105 |
- Report bugs with a `[bug]` mark in the title.
|