---
license: mit
---
# Momo XL - Anime-Style SDXL Base Model
**Momo XL** is an SDXL-based model fine-tuned to produce high-quality anime-style images with detailed, vibrant aesthetics. (Oct 6, 2024)
## Key Features:
- **Anime-Focused SDXL**: Tailored for generating high-quality anime-style images, making it ideal for artists and enthusiasts.
- **Optimized for Tag-Based Prompting**: Works best when prompted with descriptive tags, ensuring accurate and relevant outputs.
- **LoRA Compatible**: Compatible with most LoRA models available on the hub, allowing for versatile customization and style transfer.
## Usage Instructions:
- **Tagging**: Use descriptive tags separated by commas to guide the image generation. Tags can be arranged in any order to suit your creative needs.
- **Year-Specific Styles**: To emulate art styles from a specific year, use the tag format "**`year 20XX`**" (e.g., "**`year 2023`**").
- **LoRA Models**: Momo XL supports most LoRA models, enabling enhanced and tailored outputs for your projects.
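
The tagging conventions above can be illustrated with a small helper. This is a minimal sketch, not part of the model's tooling: `build_prompt` is a hypothetical function, and the example tags are placeholders chosen only for illustration.

```python
def build_prompt(tags, year=None):
    """Join descriptive tags into a comma-separated prompt.

    `year` is optional; when given, a "year 20XX" tag is appended.
    (Hypothetical helper, shown only to illustrate the prompt format.)
    """
    # Drop empty entries and surrounding whitespace before joining.
    parts = [tag.strip() for tag in tags if tag.strip()]
    if year is not None:
        parts.append(f"year {year}")
    return ", ".join(parts)


# Example tags are placeholders, not a recommended prompt.
prompt = build_prompt(["1girl", "solo", "long hair", "smile"], year=2023)
print(prompt)  # 1girl, solo, long hair, smile, year 2023
```

The resulting string can be passed as the prompt to whichever SDXL front end or pipeline you use.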
## Disclaimer:
This model may produce unexpected or unintended results. **Use with caution and at your own risk.**
**Important Notice:**
- **Ethical Use**: Please ensure that your use of this model is ethical and complies with all applicable laws and regulations.
- **Content Responsibility**: Users are responsible for the content they generate. Do not use the model to create or disseminate illegal, harmful, or offensive material.
- **Data Sources**: The model was trained on publicly available datasets. While efforts have been made to filter and curate the training data, some undesirable content may remain.
Thank you! 😊

---
## Momo XL - Training Details (Oct 15, 2024)
### Dataset
Momo XL was trained on a dataset of more than **400,000 images** sourced from Danbooru.
### Base Model
Momo XL was built on top of SDXL, incorporating knowledge from two fine-tuned models:
- Formula:
`SDXL_base + (Animagine 3.0 base - SDXL_base) * 1.0 + (Pony V6 - SDXL_base) * 0.5`
For more details:
- [Animagine 3.0 base](https://huggingface.co/Linaqruf/animagine-xl-3.0)
- [Pony V6](https://huggingface.co/LyliaEngine/Pony_Diffusion_V6_XL)
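
The formula above is an "add difference" style merge applied weight by weight. The sketch below shows the arithmetic on toy state dicts, assuming all three checkpoints share the same keys and shapes. `add_difference_merge` is a hypothetical helper, not the script used to build Momo XL, and plain floats stand in for tensors.

```python
def add_difference_merge(base, finetuned_models, weights):
    """Compute base + sum((finetuned - base) * weight) for every parameter.

    Works with any values that support + - * (floats here; tensors in practice).
    Assumes every fine-tuned state dict has the same keys as `base`.
    """
    merged = {}
    for key, base_value in base.items():
        value = base_value
        for model, weight in zip(finetuned_models, weights):
            value = value + (model[key] - base_value) * weight
        merged[key] = value
    return merged


# Toy single-parameter "checkpoints" (illustrative values only).
sdxl_base = {"layer.weight": 1.0}
animagine = {"layer.weight": 1.5}
pony = {"layer.weight": 2.0}

merged = add_difference_merge(sdxl_base, [animagine, pony], [1.0, 0.5])
print(merged)  # {'layer.weight': 2.0}
```

With a weight of 1.0, the Animagine difference is applied in full, while the Pony V6 difference is applied at half strength.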
### Training Process
Training was conducted on **A100 80GB GPUs** and took more than **2,000 GPU hours**. It was divided into three stages:
- **Finetuning - First Stage**: Trained on the entire dataset, using the settings listed in the Hyperparameters table.
- **Finetuning - Second Stage**: Also trained on the entire dataset, with a higher maximum resolution and a different optimizer (see the Hyperparameters table).
- **Adjustment Stage**: Focused on aesthetic adjustments to improve the overall visual quality.
The final model, **Momo XL**, was produced by combining the Text Encoder from the second finetuning stage with the UNet from the Adjustment Stage.
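
Combining components from two checkpoints can be sketched as a key-prefix selection over their state dicts. This is only an illustration under stated assumptions: `combine_components` is a hypothetical helper, the `conditioner.embedders.` prefix reflects the commonly seen single-file SDXL checkpoint layout and may differ for your files, and the source does not say where the remaining components (such as the VAE) were taken from, so here they simply follow the UNet-side checkpoint.

```python
def combine_components(text_encoder_ckpt, unet_ckpt, te_prefix="conditioner.embedders."):
    """Start from `unet_ckpt` and overwrite its text-encoder weights
    with those from `text_encoder_ckpt`.

    `te_prefix` is an assumed key prefix for text-encoder weights.
    """
    combined = dict(unet_ckpt)  # shallow copy; the inputs are left untouched
    for key, value in text_encoder_ckpt.items():
        if key.startswith(te_prefix):
            combined[key] = value
    return combined


# Toy checkpoints: strings stand in for tensors.
second_stage = {
    "conditioner.embedders.0.weight": "te_stage2",
    "model.diffusion_model.weight": "unet_stage2",
}
adjustment_stage = {
    "conditioner.embedders.0.weight": "te_adjust",
    "model.diffusion_model.weight": "unet_adjust",
}

final = combine_components(second_stage, adjustment_stage)
print(final["conditioner.embedders.0.weight"])  # te_stage2
print(final["model.diffusion_model.weight"])    # unet_adjust
```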
### Hyperparameters
| Stage | Epochs | UNet lr | Text Encoder lr | Batch Size | Resolution | Noise Offset | Optimizer | LR Scheduler |
|--------------------------|--------|---------|-----------------|------------|------------|--------------|------------|--------------|
| **Finetuning 1st Stage** | 10 | 2e-5 | 1e-5 | 256 | 1024² | N/A | AdamW8bit | Constant |
| **Finetuning 2nd Stage** | 10 | 2e-5 | 1e-5 | 256 | Max. 1280² | N/A | AdamW | Constant |
| **Adjustment Stage** | 0.25 | 8e-5 | 4e-5 | 1024 | Max. 1280² | 0.05 | AdamW | Constant |