Update README.md
## Deploy FalconLite2 on EC2 ##
SSH login to an AWS `g5.12xlarge` instance with the [Deep Learning AMI](https://aws.amazon.com/releasenotes/aws-deep-learning-ami-gpu-pytorch-2-0-ubuntu-20-04/).
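
For example, with a hypothetical key pair and instance address:

```bash
# Hypothetical key name and public DNS entry -- substitute your own.
ssh -i ~/.ssh/my-key.pem ubuntu@ec2-XX-XX-XX-XX.compute-1.amazonaws.com
```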

### Start TGI server-1.0.3
```bash
git clone https://github.com/awslabs/extending-the-context-length-of-open-source-llms.git falconlite-dev
cd falconlite-dev/falconlite2
```
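
At its core, starting the server is a standard TGI container launch. Below is a minimal sketch; the port mapping, shard count, quantization mode, and token limits are illustrative assumptions, and the repository's own scripts remain the authoritative configuration:

```bash
# Rough sketch of a TGI 1.0.3 launch for FalconLite2. The flag values below
# are assumptions for a 4-GPU g5.12xlarge, not the repo's exact settings.
docker run -d --gpus all --shm-size 1g -p 8080:80 \
  -v "$PWD/data":/data \
  ghcr.io/huggingface/text-generation-inference:1.0.3 \
  --model-id amazon/FalconLite2 \
  --num-shard 4 \
  --quantize gptq \
  --trust-remote-code \
  --max-input-length 24000 \
  --max-total-tokens 24576
```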

**Important** - When using FalconLite2 for inference for the first time, it may require a brief 'warm-up' period that can take tens of seconds. Subsequent inferences should be faster and return results more promptly. This warm-up period is normal and does not affect the overall performance of the system once initialisation has completed.
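
One short request is enough to trigger the warm-up. The repository's `falconlite_client.py` can be used for this, or a plain call to TGI's `/generate` endpoint; the port and prompt below are assumptions matching the sketch above:

```bash
# Single short request to warm the server up (illustrative prompt/parameters).
curl -s http://127.0.0.1:8080/generate \
  -H 'Content-Type: application/json' \
  -d '{"inputs": "What is AWS?", "parameters": {"max_new_tokens": 32}}'
```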

## Deploy FalconLite2 on Amazon SageMaker ##
To deploy FalconLite2 on a SageMaker endpoint with TGI-1.0.3, please follow [this notebook](https://github.com/awslabs/extending-the-context-length-of-open-source-llms/blob/main/falconlite2/sm_deploy.ipynb), running it on a SageMaker Notebook instance (e.g. `g5.xlarge`).

To deploy FalconLite2 on a SageMaker endpoint with TGI-1.1.0, please follow [this notebook](https://github.com/awslabs/extending-the-context-length-of-open-source-llms/blob/main/falconlite2-tgi1.1.0/sm_deploy.ipynb), running it on a SageMaker Notebook instance (e.g. `g5.xlarge`).
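
After either notebook has created an endpoint, you can smoke-test it from any shell with the AWS CLI. The endpoint name and payload here are hypothetical; use whatever name the notebook assigned:

```bash
# Hypothetical endpoint name -- the notebook chooses the real one.
# --cli-binary-format lets AWS CLI v2 accept raw JSON in --body.
aws sagemaker-runtime invoke-endpoint \
  --endpoint-name falconlite2-tgi \
  --content-type application/json \
  --cli-binary-format raw-in-base64-out \
  --body '{"inputs": "What is AWS?", "parameters": {"max_new_tokens": 32}}' \
  output.json
cat output.json
```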

## Evaluation Results ##
We evaluated FalconLite2 against benchmarks specifically designed to assess the capabilities of LLMs in handling longer contexts.