---
license: mit
datasets:
- abisee/cnn_dailymail
language:
- en
metrics:
- rouge
- bleu
base_model:
- google-t5/t5-small
pipeline_tag: summarization
library_name: transformers
---

# Model Card for t5-small Summarization Model

## Model Details

- Model Architecture: T5 (Text-to-Text Transfer Transformer)
- Variant: t5-small
- Task: Text Summarization
- Framework: Hugging Face Transformers

## Training Data

- Dataset: CNN/DailyMail
- Content: News articles paired with human-written highlight summaries
- Size: Approximately 300,000 article-summary pairs

## Training Procedure

- Fine-tuning method: Supervised fine-tuning with the Hugging Face Transformers library (a minimal sketch is included at the end of this card)
- Hyperparameters:
  - Learning rate: 5e-5
  - Batch size: 8
  - Number of epochs: 3
  - Optimizer: AdamW

## How to Use

1. Install the required libraries (PyTorch is needed for the `pt` tensors used below):

   ```
   pip install transformers torch
   ```

2. Load the tokenizer and model:

   ```python
   from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

   tokenizer = AutoTokenizer.from_pretrained("t5-small")
   model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
   ```

3. Generate a summary:

   ```python
   input_text = "Your input text here"

   # T5 is a text-to-text model; the "summarize: " prefix selects the summarization task.
   inputs = tokenizer("summarize: " + input_text, return_tensors="pt",
                      max_length=512, truncation=True)

   # Beam search with a length penalty favors complete, reasonably long summaries.
   summary_ids = model.generate(inputs["input_ids"], max_length=150, min_length=40,
                                length_penalty=2.0, num_beams=4, early_stopping=True)

   summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
   print(summary)
   ```

## Evaluation

- Metric: ROUGE (Recall-Oriented Understudy for Gisting Evaluation)
- Exact scores are not reported; summarization models on this dataset are typically evaluated with:
  - ROUGE-1 (unigram overlap)
  - ROUGE-2 (bigram overlap)
  - ROUGE-L (longest common subsequence)
- A minimal evaluation sketch is included at the end of this card.

## Limitations

- Performance is lower than that of larger T5 variants
- Fine-tuned for news article summarization; may not perform as well on other text types
- Input sequences are truncated to 512 tokens, so long articles lose content beyond that point
- Generated summaries may contain factual inaccuracies

## Ethical Considerations

- May inherit biases present in the CNN/DailyMail dataset
- Not suitable for summarizing sensitive or critical information without human review
- Users should be aware of potential biases and inaccuracies in generated summaries
- Should not be used as the sole source of information for decision-making
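
## Example: Fine-Tuning Sketch

The training procedure described above can be reproduced roughly as follows. This is a minimal sketch, not the exact training script: the preprocessing choices (task prefix, maximum input/target lengths) and the `Seq2SeqTrainer` setup are assumptions based on common practice, while the learning rate, batch size, epoch count, and optimizer come from the Training Procedure section (the `Trainer` uses AdamW by default).

```python
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainingArguments,
    Seq2SeqTrainer,
)

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# CNN/DailyMail requires a config version; columns are "article" and "highlights".
dataset = load_dataset("abisee/cnn_dailymail", "3.0.0")

def preprocess(batch):
    # Prefix each article with the T5 summarization task prefix;
    # the max lengths here are assumptions matching the usage example above.
    inputs = tokenizer(
        ["summarize: " + article for article in batch["article"]],
        max_length=512, truncation=True,
    )
    labels = tokenizer(text_target=batch["highlights"], max_length=150, truncation=True)
    inputs["labels"] = labels["input_ids"]
    return inputs

tokenized = dataset.map(preprocess, batched=True,
                        remove_columns=dataset["train"].column_names)

args = Seq2SeqTrainingArguments(
    output_dir="t5-small-cnn-dailymail",  # hypothetical output path
    learning_rate=5e-5,                   # from the Training Procedure section
    per_device_train_batch_size=8,
    num_train_epochs=3,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```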
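
## Example: Computing ROUGE

Since the card reports no exact scores, ROUGE can be computed locally. A minimal sketch, assuming the `evaluate` and `rouge_score` packages are installed; the toy prediction and reference strings below are placeholders for model outputs and the dataset's `highlights` field.

```python
# pip install evaluate rouge_score
import evaluate

rouge = evaluate.load("rouge")

predictions = ["the cat sat on the mat"]         # model-generated summaries
references = ["the cat was sitting on the mat"]  # reference highlights

# Returns rouge1, rouge2, rougeL, and rougeLsum as floats in [0, 1].
scores = rouge.compute(predictions=predictions, references=references)
print(scores)
```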