jburtoft committed
Commit 5e5efca
1 Parent(s): 3b9c8b0

Update README.md

Files changed (1)
  1. README.md +8 -3
README.md CHANGED
@@ -14,7 +14,7 @@ tags:
 This repository contains [**AWS Inferentia2**](https://aws.amazon.com/ec2/instance-types/inf2/) and [`neuronx`](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/) compatible checkpoints for [upstage/SOLAR-10.7B-v1.0](https://huggingface.co/upstage/SOLAR-10.7B-v1.0).
 You can find detailed information about the base model on its [Model Card](https://huggingface.co/upstage/SOLAR-10.7B-v1.0).
 
-This model card also includes instructions for how to compile other SOLAR models with other settings if this combination isn't quite what you are looking for.
+This model card also includes instructions for how to compile other SOLAR models with other settings if this combination isn't what you are looking for.
 
 This model has been exported to the `neuron` format using specific `input_shapes` and `compiler` parameters detailed in the paragraphs below.
 
@@ -26,7 +26,10 @@ Please refer to the 🤗 `optimum-neuron` [documentation](https://huggingface.co
 
 ## Set up the environment
 
-First, use the [DLAMI image from Hugging Face](https://aws.amazon.com/marketplace/pp/prodview-gr3e6yiscria2). It has most of the utilities and drivers preinstalled. However, you may need to update to version 2.16 to use these binaries.
+First, use the [DLAMI image from Hugging Face](https://aws.amazon.com/marketplace/pp/prodview-gr3e6yiscria2). It has most of the utilities and drivers preinstalled, but hasn't been updated to 2.16 as of 1/13/24.
+However, you will need version 2.16 to use these binaries. 2.16 shows a significant performance increase over 2.15 for Llama based models.
+
+The commands below will update your 2.15 libraries to 2.16.
 
 ```
 sudo apt-get update -y \
@@ -67,10 +70,11 @@ Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
 [{'generated_text': 'Hi, my name is ***** ***** I am calling from ***** ***** and I am calling to see if you have any questions about your ***** ***** account.\nHi, my name is ***** ***** I am calling from ***** ***** and I am calling to see if you have any questions about your ***** ***** account.\nHi, my name is ***** ***** I am calling from ***** ***** and I am calling to see if you have any questions about your ***** ***** account.\nHi, my name is ***** ***** I am calling from ***** ***** and I am calling to see if you have any questions about your ***** ***** account.\nHi, my name is ***** ***** I am calling from ***** ***** and I am calling to see if'}]
 ```
 
-##Compiling for different instances or settings
+## Compiling for different instances or settings
 
 If this repository doesn't have the exact version or settings, you can compile your own.
 
+```
 from optimum.neuron import NeuronModelForCausalLM
 #num_cores should be changed based on the instance. inf2.24xlarge has 6 neuron processors (they have two cores each) so 12 total
 input_shapes = {"batch_size": 1, "sequence_length": 4096}
@@ -81,6 +85,7 @@ model.save_pretrained("SOLAR-10.7B-v1.0-neuron-24xlarge-2.16-8core-4096")
 from transformers import AutoTokenizer
 tokenizer = AutoTokenizer.from_pretrained("upstage/SOLAR-10.7B-v1.0")
 tokenizer.save_pretrained("SOLAR-10.7B-v1.0-neuron-24xlarge-2.16-8core-4096")
+```
 
 This repository contains tags specific to versions of `neuronx`. When using with 🤗 `optimum-neuron`, use the repo revision specific to the version of `neuronx` you are using, to load the right serialized checkpoints.
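The diff's `num_cores` comment can be made concrete. A minimal sketch of picking the core count for a target instance; the helper name and table are my own, based on AWS's published Inferentia2 specs (each Inferentia2 chip has two NeuronCores):

```python
# Hypothetical helper (not part of this repo): NeuronCore counts for the
# inf2 instance family, per AWS's Inferentia2 specs.
INF2_CHIPS = {
    "inf2.xlarge": 1,
    "inf2.8xlarge": 1,
    "inf2.24xlarge": 6,
    "inf2.48xlarge": 12,
}

def neuron_cores(instance_type: str) -> int:
    """Total NeuronCores on an inf2 instance: chips x 2 cores per chip."""
    return INF2_CHIPS[instance_type] * 2

# neuron_cores("inf2.24xlarge") -> 12, matching the diff's comment.
```

`num_cores` passed at compile time would then be at most this value for the instance you plan to serve on.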
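The closing note about version-specific revisions can be sketched as follows; the tag-selection helper and the commented-out loading call are assumptions about how such a repo might be organized, not a description of this repo's actual tags:

```python
# Hypothetical sketch: pick the repo revision matching the installed
# neuronx release before loading. The supported set is an assumption.
def revision_for(neuronx_version: str) -> str:
    """Map an installed neuronx release to a repo revision tag."""
    supported = {"2.15", "2.16"}
    if neuronx_version not in supported:
        raise ValueError(f"no compiled checkpoint for neuronx {neuronx_version}")
    return neuronx_version

# On an inf2 instance with optimum-neuron installed, loading would then
# look something like (repo id illustrative):
# from optimum.neuron import NeuronModelForCausalLM
# model = NeuronModelForCausalLM.from_pretrained(
#     "<this-repo>", revision=revision_for("2.16"))
```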