chenwuml committed on
Commit 06ffb6e
1 Parent(s): 2b8ab24

Update README.md

Files changed (1)
  1. README.md +4 -2
README.md CHANGED
@@ -30,7 +30,7 @@ FalconLite2 evolves from [FalconLite](https://huggingface.co/amazon/FalconLite),
 ## Deploy FalconLite2 on EC2 ##
 SSH login to an AWS `g5.12x` instance with the [Deep Learning AMI](https://aws.amazon.com/releasenotes/aws-deep-learning-ami-gpu-pytorch-2-0-ubuntu-20-04/).
 
-### Start TGI server
+### Start TGI server-1.0.3
 ```bash
 git clone https://github.com/awslabs/extending-the-context-length-of-open-source-llms.git falconlite-dev
 cd falconlite-dev/falconlite2
@@ -67,7 +67,9 @@ python falconlite_client.py -l
 **Important** - When using FalconLite2 for inference for the first time, it may require a brief 'warm-up' period that can take tens of seconds. However, subsequent inferences should be faster and return results in a more timely manner. This warm-up period is normal and should not affect the overall performance of the system once the initialisation period has been completed.
 
 ## Deploy FalconLite2 on Amazon SageMaker ##
-To deploy FalconLite2 on a SageMaker endpoint, please follow [this notebook](https://github.com/awslabs/extending-the-context-length-of-open-source-llms/blob/main/falconlite2/sm_deploy.ipynb) running on a SageMaker Notebook instance (e.g. `g5.xlarge`).
+To deploy FalconLite2 on a SageMaker endpoint with TGI-1.0.3, please follow [this notebook](https://github.com/awslabs/extending-the-context-length-of-open-source-llms/blob/main/falconlite2/sm_deploy.ipynb) running on a SageMaker Notebook instance (e.g. `g5.xlarge`).
+
+To deploy FalconLite2 on a SageMaker endpoint with TGI-1.1.0, please follow [this notebook](https://github.com/awslabs/extending-the-context-length-of-open-source-llms/blob/main/falconlite2-tgi1.1.0/sm_deploy.ipynb) running on a SageMaker Notebook instance (e.g. `g5.xlarge`).
 
 ## Evaluation Result ##
 We evaluated FalconLite2 against benchmarks that are specifically designed to assess the capabilities of LLMs in handling longer contexts.
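
For context on the first hunk: the repository's own start script is truncated out of the diff, but launching TGI 1.0.3 against FalconLite2 with the upstream Docker image looks roughly like the sketch below. The model id `amazon/FalconLite2`, the port mapping, the shard count, and the token limits are assumptions for illustration, not values taken from this commit; the repo's script is the authoritative source.

```bash
# Sketch: launch Text Generation Inference 1.0.3 serving FalconLite2.
# Assumptions: model id amazon/FalconLite2, 4 GPUs on the g5.12x instance,
# and a ~24K-token context budget -- check the repo's start script for
# the values it actually uses.
docker run -d --gpus all --shm-size 1g -p 8080:80 \
  -v "$PWD/model-cache:/data" \
  ghcr.io/huggingface/text-generation-inference:1.0.3 \
  --model-id amazon/FalconLite2 \
  --num-shard 4 \
  --max-input-length 24000 \
  --max-total-tokens 24576 \
  --trust-remote-code
```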
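The second hunk's header shows the repo's test client (`python falconlite_client.py -l`). Independently of that client, a running TGI server can be smoke-tested over TGI's standard REST API; a minimal sketch, assuming the server listens on port 8080 and that FalconLite2 uses the `<|prompter|>`/`<|assistant|>` prompt template from the FalconLite model card:

```bash
# Sketch: query a local TGI server via its standard /generate route.
# The prompt template below is an assumption taken from the FalconLite
# model card; verify against the repo's client before relying on it.
curl -s http://127.0.0.1:8080/generate \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{
        "inputs": "<|prompter|>What are the main challenges to support a long context for LLM?<|endoftext|><|assistant|>",
        "parameters": {"max_new_tokens": 256, "do_sample": false}
      }'
```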
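Once either SageMaker notebook has created an endpoint, it can also be invoked from the AWS CLI. A minimal sketch, assuming an endpoint named `falconlite2-endpoint` (the notebooks choose their own endpoint names):

```bash
# Sketch: invoke a deployed FalconLite2 SageMaker endpoint from the CLI.
# The endpoint name is an assumption -- use whatever the notebook created.
aws sagemaker-runtime invoke-endpoint \
  --endpoint-name falconlite2-endpoint \
  --content-type application/json \
  --cli-binary-format raw-in-base64-out \
  --body '{"inputs": "<|prompter|>Hello!<|endoftext|><|assistant|>", "parameters": {"max_new_tokens": 128}}' \
  response.json && cat response.json
```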