Quantization made by Richard Erkhov.

gpt2-context_generator - GGUF

Original model description:

language:
- en
license: cc-by-sa-4.0
tags:
- generated_from_trainer
- text-generation-inference
datasets:
- Non-Residual-Prompting/C2Gen
pipeline_tag: text-generation
base_model: gpt2
model-index:
- name: gpt2-commongen-finetuned
  results: []

gpt2-context_generator

This model is a fine-tuned version of gpt2 on the Non-Residual-Prompting/C2Gen dataset.

Model description

More information needed

Intended uses & limitations

  • Check config.json for prompt template and sampling strategy.
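
As a rough aid for the check suggested above, the sketch below filters a parsed config.json for entries whose key names hint at prompting or sampling. The keyword list and both helper names are assumptions made for illustration; the actual key names in this model's config.json are not documented in this card.

```python
import json
from pathlib import Path

# Hypothetical keyword list: the real key names in config.json may differ.
HINT_KEYWORDS = ("prompt", "template", "sampl", "task_specific",
                 "temperature", "top_k", "top_p")


def select_generation_hints(config):
    """Keep top-level entries whose key mentions prompting or sampling."""
    return {
        key: value
        for key, value in config.items()
        if any(word in key.lower() for word in HINT_KEYWORDS)
    }


def load_generation_hints(config_path):
    """Read a local config.json and return its prompt/sampling entries."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    return select_generation_hints(config)
```

Calling `load_generation_hints("config.json")` on a locally downloaded copy returns only the matching entries, which still need to be read by hand to see how the prompt is meant to be built.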

Dataset Summary

CommonGen (Lin et al., 2020) is a dataset for the constrained text generation task of word inclusion. However, the task does not allow context to be included. To complement CommonGen, we therefore provide an extended test set, C2Gen (Carlsson et al., 2022), in which an additional context is provided for each set of target words. The task is thus reformulated as generating commonsensical text that both includes the given words and adheres to the given context.
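
The word-inclusion half of the task can be checked mechanically. The sketch below is a minimal illustration, not the official C2Gen evaluation: the function name is hypothetical, and it counts exact surface-form matches only, so an inflected form such as "throws" would not satisfy the target word "throw".

```python
import re


def missing_target_words(text, target_words):
    """Return the target words that do not appear in the text (case-insensitive)."""
    # Tokenize into lowercase alphabetic words; apostrophes are kept inside words.
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    return [word for word in target_words if word.lower() not in tokens]


generated = "The dog catches a frisbee in the park."
print(missing_target_words(generated, ["dog", "frisbee", "throw"]))
```

An empty list means every target word was found. Whether the text also adheres to the given context is not something this check can tell.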

Training procedure

  • Causal Language Modelling

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 9e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.2
  • num_epochs: 8
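
To make the scheduler settings concrete, here is a small sketch of a linear schedule with warmup using the learning rate and warmup ratio listed above. It assumes the usual shape of such a schedule (linear ramp from 0 to the peak rate, then linear decay to 0). The function name is hypothetical, and `total_steps` is a placeholder because the card does not state the number of optimizer steps.

```python
import math


def linear_warmup_lr(step, total_steps, base_lr=9e-05, warmup_ratio=0.2):
    """Learning rate at a given optimizer step for linear warmup, then linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# total_steps=1000 is an arbitrary illustrative value, not taken from the training run.
for step in (0, 100, 200, 600, 1000):
    print(step, linear_warmup_lr(step, total_steps=1000))
```

With a warmup ratio of 0.2, the first fifth of training is spent below the peak rate, which is reached at the end of warmup and then decays for the remaining steps.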

Framework versions

  • Transformers 4.27.3
  • Pytorch 1.13.1+cu116
  • Datasets 2.13.1
  • Tokenizers 0.13.2

GGUF files

  • Model size: 163M params
  • Architecture: gpt2
  • Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit