
Model description

This model is a fine-tuned version of flax-community/gpt-2-spanish on a custom dataset (not publicly available). The dataset consists of data crawled from three Spanish cooking websites and contains approximately 50,000 recipes. The model achieves the following results on the evaluation set:

  • Loss: 0.5796

How to use it

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model from the Hugging Face Hub
model_checkpoint = 'gastronomia-para-to2/gastronomia_para_to2'
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = AutoModelForCausalLM.from_pretrained(model_checkpoint)

The tokenizer uses the following special tokens to mark the structure of the recipe:

special_tokens = [
    '<INPUT_START>',
    '<NEXT_INPUT>',
    '<INPUT_END>',
    '<TITLE_START>',
    '<TITLE_END>',
    '<INGR_START>',
    '<NEXT_INGR>',
    '<INGR_END>',
    '<INSTR_START>',
    '<NEXT_INSTR>',
    '<INSTR_END>',
    '<RECIPE_START>',
    '<RECIPE_END>']

The input should be of the form:

<RECIPE_START> <INPUT_START> ingredient_1 <NEXT_INPUT> ingredient_2 <NEXT_INPUT> ... <NEXT_INPUT> ingredient_n <INPUT_END> <INGR_START>
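
For example, a prompt can be assembled from a list of ingredients as follows (a minimal sketch; the ingredient names are only illustrative):

# Illustrative example: build the prompt from a list of ingredients
ingredients = ['pollo', 'cebolla', 'ajo', 'pimentón']  # example ingredients, replace with your own
input = '<RECIPE_START> <INPUT_START> ' + ' <NEXT_INPUT> '.join(ingredients) + ' <INPUT_END> <INGR_START>'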

We are using the following configuration to generate recipes, but feel free to change parameters as needed:

# Tokenize the prompt and sample three candidate recipes
tokenized_input = tokenizer(input, return_tensors='pt')
output = model.generate(**tokenized_input,
                        max_length=600,
                        do_sample=True,
                        top_p=0.92,
                        top_k=50,
                        num_return_sequences=3)
# Keep the special tokens so the recipe structure can be parsed afterwards
pre_output = tokenizer.decode(output[0], skip_special_tokens=False)

The recipe ends where the <RECIPE_END> special token appears for the first time.
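
Since generation may continue past the end of the recipe, the decoded text can be truncated at that token. A minimal sketch, using the variables from the snippet above:

# Keep only the text up to the first <RECIPE_END> token
end_index = pre_output.find('<RECIPE_END>')
recipe = pre_output[:end_index] if end_index != -1 else pre_output
print(recipe)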

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 6
  • mixed_precision_training: Native AMP
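
These hyperparameters correspond roughly to the following Hugging Face TrainingArguments (a sketch assuming the standard Trainer API; output_dir is a placeholder, and dataset preparation and Trainer setup are omitted):

from transformers import TrainingArguments

# Sketch of the hyperparameters above expressed as TrainingArguments
training_args = TrainingArguments(
    output_dir='gastronomia_para_to2',   # placeholder output directory
    learning_rate=2e-05,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=8,       # effective train batch size of 8
    lr_scheduler_type='linear',
    num_train_epochs=6,
    fp16=True,                           # native AMP mixed precision
)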

Training results

Training Loss   Epoch   Step    Validation Loss
0.6213          1.0     5897    0.6214
0.5905          2.0     11794   0.5995
0.5777          3.0     17691   0.5893
0.574           4.0     23588   0.5837
0.5553          5.0     29485   0.5807
0.5647          6.0     35382   0.5796

Framework versions

  • Transformers 4.17.0
  • Pytorch 1.11.0+cu102
  • Datasets 2.0.0
  • Tokenizers 0.11.6

References

The list of special tokens used to mark the recipe structure during generation is taken from: RecipeNLG: A Cooking Recipes Dataset for Semi-Structured Text Generation.
