carsonhxsu committed on
Commit
e9adcd1
2 Parent(s): 0b2ea18 44dfb53

Merge branch 'main' of https://huggingface.co/TMElyralab/lyraXVERSE into main

Files changed (1)
  1. README.md +3 -8
README.md CHANGED
@@ -10,7 +10,7 @@ tags:
  lyraXVERSE is currently the **fastest XVERSE-13b** available. The inference speed of lyraXVERSE has achieved up to **3900+ tokens/s** on A100, up to **2.7x** acceleration upon the torch version.

  Among its main features are:
- - device: Nvidia GPU with Amperer architecture or Volta architecture (A100 or higher, V100).
+ - device: Nvidia GPU with Amperer architecture or Volta architecture (A10, A100 or higher, V100).
  - batch_size: compiled with dynamic batch size, maximum depends on device.
  - MEMOPT mode: significantly optimized VRAM usage and increased speed

@@ -27,7 +27,7 @@ We use the XVERSE-13B-Chat model for measurement, but this optimized inference i
  | Version | Batch Size 1 | Batch Size 8 | Batch Size 16 | Batch Size 32 | Batch Size 64 |
  | --- | --- | --- | --- | --- | --- |
  | Torch | 34.8 | 249.2 | 470.1 | 878.6 | 1478.9 |
- | lyraXVERSE MEMOPT | 96.6 | 725.5 | 1359.3 | 2415.6 | 3923.2 |
+ | lyraXVERSE | 96.6 | 725.5 | 1359.3 | 2415.6 | 3923.2 |

  ## Docker Environment Recommendation

@@ -90,11 +90,6 @@ print(output_texts)

  这个故事告诉我们,画家的价值不只是他们的绘画技巧,而是他们的画作带给人们的感动和希望。画家的价值并不在于他们的画有多么昂贵,有多么独特,而在于他们能用画作打开人们的心扉,让人们看见希望,看见生活的美好。

- ## TODO
- 1. Support for int4
- 2. Inference for longer context situations
- 3. Streaming inference mode.
-
  ## Citation
  ``` bibtex
  @Misc{lyraXVERSE2023,
@@ -105,6 +100,6 @@ print(output_texts)
  }
  ```

- ## Report bug
+ ## Report bugs
  - start a discussion to report any bugs!--> https://huggingface.co/TMElyralab/lyraXVERSE
  - report bug with a `[bug]` mark in the title.