runtime error

file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`. warnings.warn( Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. /usr/local/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`. warnings.warn( Downloading shards: 0%| | 0/4 [00:00<?, ?it/s] Downloading shards: 25%|β–ˆβ–ˆβ–Œ | 1/4 [00:37<01:52, 37.50s/it] Downloading shards: 50%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 2/4 [01:11<01:11, 35.72s/it] Downloading shards: 75%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ | 3/4 [01:42<00:33, 33.43s/it] Downloading shards: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 4/4 [01:49<00:00, 23.07s/it] Downloading shards: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 4/4 [01:49<00:00, 27.46s/it] Loading checkpoint shards: 0%| | 0/4 [00:00<?, ?it/s] Loading checkpoint shards: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 4/4 [00:00<00:00, 81840.08it/s] Traceback (most recent call last): File "/home/user/app/app.py", line 53, in <module> model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct", device_map="auto") # to("cuda:0") File "/usr/local/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained return model_class.from_pretrained( File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3735, in from_pretrained dispatch_model(model, **device_map_kwargs) File "/usr/local/lib/python3.10/site-packages/accelerate/big_modeling.py", line 490, in dispatch_model raise ValueError( ValueError: You are trying to offload the whole model to the disk. Please use the `disk_offload` function instead.

Container logs:

Fetching error logs...