max_new_tokens limit error


I'm using the meta-llama/Meta-Llama-3-8B-Instruct model with the Hugging Face Hub Python library (`huggingface_hub`). I store the full conversation history between the user and the assistant in a list, which I send to the model on every request.
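For context, my setup looks roughly like this (a minimal sketch of what I described; the system prompt and the exact `InferenceClient.chat_completion` call are my reconstruction, not copied from my actual code):

```python
from huggingface_hub import InferenceClient

client = InferenceClient(model="meta-llama/Meta-Llama-3-8B-Instruct")

# Full conversation history, appended to after every exchange
messages = [{"role": "system", "content": "You are a helpful assistant."}]

def chat(user_input: str) -> str:
    messages.append({"role": "user", "content": user_input})
    # max_tokens here corresponds to max_new_tokens on the server side
    response = client.chat_completion(messages=messages, max_tokens=3000)
    reply = response.choices[0].message.content
    messages.append({"role": "assistant", "content": reply})
    return reply
```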

However, I encountered an error after a few exchanges:

```
Input validation error: inputs tokens + max_new_tokens must be <= 8192. Given: 5909 inputs tokens and 3000 max_new_tokens.
```

I've set `max_tokens` to 3000, but the error indicates that the prompt tokens plus the requested new tokens exceed the model's 8192-token context window. As the conversation grows, the history alone eats up most of the budget.

I couldn't find a solution online. How can I resolve this? Is there a better way to manage the token count, for example by trimming older turns as in the sketch below?
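Something like this is what I had in mind (a rough sketch only; the tokenizer repo name, the 8192 constant, and the drop-oldest-turn strategy are my assumptions, and the `meta-llama` tokenizer is gated on the Hub):

```python
from transformers import AutoTokenizer

# Assumes the tokenizer repo matches the served model
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

CONTEXT_LIMIT = 8192   # Llama 3 8B context window
MAX_NEW_TOKENS = 3000

def prompt_length(msgs) -> int:
    # apply_chat_template tokenizes by default and returns a list of token ids
    return len(tokenizer.apply_chat_template(msgs, add_generation_prompt=True))

def trim_history(msgs):
    # Drop the oldest non-system turn until prompt + max_new_tokens fits
    trimmed = list(msgs)
    while prompt_length(trimmed) + MAX_NEW_TOKENS > CONTEXT_LIMIT and len(trimmed) > 2:
        del trimmed[1]  # index 0 is the system message, so keep it
    return trimmed
```

Is trimming like this the usual approach, or is there something built into the library for it?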

Thank you!
