Fine-tuning this model: how to handle terminators?
Hi everyone, and thank you. I need to train this model on a custom task using LoRA fine-tuning.
Since I noticed that there are special termination tokens, how should I set up my training data and tokenizer so that these are handled correctly?
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]
Normally, for training, I use: tokenizer.pad_token = tokenizer.eos_token
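For context, this is roughly my current tokenizer setup (a minimal sketch; the model ID is just a placeholder for the instruct checkpoint I'm loading, and the rest of the training loop is omitted):

from transformers import AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # placeholder for the checkpoint I'm fine-tuning
tokenizer = AutoTokenizer.from_pretrained(model_id)

# What I do today: reuse the EOS token for padding
tokenizer.pad_token = tokenizer.eos_token

# The extra terminator used in the generation example above
eot_id = tokenizer.convert_tokens_to_ids("<|eot_id|>")
print(tokenizer.eos_token, tokenizer.eos_token_id, eot_id)

My doubt is whether each training sample should end with <|eot_id|>, with the EOS token, or with both.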
Hi @BoccheseGiacomo - you should follow the Llama 3 instruct format: https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/#llama-3-instruct
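For example, here is a minimal sketch of building training samples in that format (assuming the Hugging Face tokenizer for the instruct checkpoint; the message contents are placeholders, and reusing <|eot_id|> as the EOS/pad token is one common choice, not the only option):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# One common approach: treat <|eot_id|> as the end-of-sequence token during
# training, so the model learns to emit the same terminator it uses at inference.
tokenizer.eos_token = "<|eot_id|>"
tokenizer.pad_token = tokenizer.eos_token

messages = [
    {"role": "user", "content": "..."},       # placeholder user turn
    {"role": "assistant", "content": "..."},  # placeholder assistant turn
]

# apply_chat_template inserts <|start_header_id|>, <|end_header_id|> and <|eot_id|>
# exactly as the instruct format expects, so you don't add them by hand.
text = tokenizer.apply_chat_template(messages, tokenize=False)
input_ids = tokenizer(text, add_special_tokens=False)["input_ids"]

At generation time you can then keep passing both terminator IDs (the EOS token and <|eot_id|>) exactly as in your snippet.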