Fine-tuning this model: how to handle terminators?
Hi everyone, and thank you. I need to train this model on a custom task using LoRA fine-tuning.
Since I noticed that there are special termination tokens, how should I set up my training data and tokenizer so that these are handled correctly?
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]
Normally, for training, I use: tokenizer.pad_token = tokenizer.eos_token
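For context, this is roughly my current tokenizer setup (a minimal sketch; the model ID is just a placeholder for the instruct checkpoint I'm loading, and the rest of the training loop is omitted):

from transformers import AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # placeholder for the checkpoint I'm fine-tuning
tokenizer = AutoTokenizer.from_pretrained(model_id)

# What I do today: reuse the EOS token for padding
tokenizer.pad_token = tokenizer.eos_token

# The extra terminator used in the generation example above
eot_id = tokenizer.convert_tokens_to_ids("<|eot_id|>")
print(tokenizer.eos_token, tokenizer.eos_token_id, eot_id)

My doubt is whether each training sample should end with <|eot_id|>, with the EOS token, or with both.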
Hi @BoccheseGiacomo - you should follow the Llama 3 instruct format: https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/#llama-3-instruct
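For example, here is a minimal sketch of building training samples in that format (assuming the Hugging Face tokenizer for the instruct checkpoint; the message contents are placeholders, and reusing <|eot_id|> as the EOS/pad token is one common choice, not the only option):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# One common approach: treat <|eot_id|> as the end-of-sequence token during
# training, so the model learns to emit the same terminator it uses at inference.
tokenizer.eos_token = "<|eot_id|>"
tokenizer.pad_token = tokenizer.eos_token

messages = [
    {"role": "user", "content": "..."},       # placeholder user turn
    {"role": "assistant", "content": "..."},  # placeholder assistant turn
]

# apply_chat_template inserts <|start_header_id|>, <|end_header_id|> and <|eot_id|>
# exactly as the instruct format expects, so you don't add them by hand.
text = tokenizer.apply_chat_template(messages, tokenize=False)
input_ids = tokenizer(text, add_special_tokens=False)["input_ids"]

At generation time you can then keep passing both terminator IDs (the EOS token and <|eot_id|>) exactly as in your snippet.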