How much GPU memory is needed?
I have a question about the memory requirements for this model. My system has 24 GB of GPU memory, but loading the model fails with an 'OutOfMemory' error. How much GPU memory is needed to load and run it? Any insights or suggestions would be greatly appreciated.
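Code:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Create all new tensors (including the model weights) on the GPU.
torch.set_default_device("cuda")

model_name_or_path = 'Yhyu13/LMCocktail-Mistral-7B-v1'
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, trust_remote_code=True)
# AutoModelForCausalLM (not AutoModel) provides the language-model head that generate() needs.
# No dtype is passed, so the weights are loaded in the library's default precision.
model = AutoModelForCausalLM.from_pretrained(model_name_or_path, trust_remote_code=True)

inputs = tokenizer("who are you?", return_tensors="pt", return_attention_mask=False)
print('inputs=', inputs)

# Generate up to 200 tokens in total (prompt included), then decode the first sequence.
outputs = model.generate(**inputs, max_length=200)
print('outputs=', outputs)
text = tokenizer.batch_decode(outputs)[0]
print('text=', text)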
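For reference, here is a rough back-of-the-envelope estimate of the memory taken by the weights alone. It is only a sketch: the parameter count of about 7.24 billion is an assumption based on the Mistral-7B architecture, not a figure measured for this checkpoint, and the helper name `weight_memory_gib` is made up for illustration. Activations, the KV cache, and CUDA overhead come on top of this.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Return the memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3

# Assumption: roughly 7.24e9 parameters, as in the Mistral-7B architecture.
NUM_PARAMS = 7.24e9

# Bytes per parameter for common storage precisions.
PRECISIONS = [("float32", 4), ("float16 / bfloat16", 2), ("int8", 1)]

for name, nbytes in PRECISIONS:
    print(f"{name}: {weight_memory_gib(NUM_PARAMS, nbytes):.1f} GiB")
```

Under that assumption, float32 weights come to roughly 27 GiB, which already exceeds a 24 GB card, while half precision needs about half of that. If the weights are indeed being loaded in float32, passing `torch_dtype=torch.float16` to `from_pretrained` may be enough to fit the model, though I have not verified this on this particular checkpoint.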