---
license: mit
---

# Momo XL - Anime-Style SDXL Base Model

**Momo XL** is an anime-style model based on SDXL, fine-tuned to produce high-quality anime-style images with detailed and vibrant aesthetics. (Oct 6, 2024)

## Key Features:

- **Anime-Focused SDXL**: Tailored for generating high-quality anime-style images, making it ideal for artists and enthusiasts.
- **Optimized for Tag-Based Prompting**: Works best when prompted with descriptive tags, which keeps outputs accurate and relevant.
- **LoRA Compatible**: Compatible with most LoRA models available on the hub, allowing for versatile customization and style transfer.

## Usage Instructions:

- **Tagging**: Use descriptive tags separated by commas to guide the image generation. Tags can be arranged in any order to suit your creative needs.
- **Year-Specific Styles**: To emulate art styles from a specific year, use the tag format "**`year 20XX`**" (e.g., "**`year 2023`**").
- **LoRA Models**: Momo XL supports most LoRA models, enabling enhanced and tailored outputs for your projects.

## Disclaimer:

This model may produce unexpected or unintended results. **Use with caution and at your own risk.**

**Important Notice:**

- **Ethical Use**: Please ensure that your use of this model is ethical and complies with all applicable laws and regulations.
- **Content Responsibility**: Users are responsible for the content they generate. Do not use the model to create or disseminate illegal, harmful, or offensive material.
- **Data Sources**: The model was trained on publicly available datasets. While efforts have been made to filter and curate the training data, some undesirable content may remain.

Thank you! 😊

------------------------------------------------------

## Momo XL - Training Details (Oct 15, 2024)

### Dataset

Momo XL was trained on a dataset of over **400,000 images** sourced from Danbooru.

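
Because the training data comes from Danbooru, prompts follow the comma-separated tag style described under Usage Instructions. The snippet below is a minimal sketch of assembling such a prompt, including the optional `year 20XX` tag. The helper name `build_prompt` and the example tags are hypothetical and only illustrate the format; they are not part of the model or any library.

```python
from typing import Iterable, Optional


def build_prompt(tags: Iterable[str], year: Optional[int] = None) -> str:
    """Join descriptive tags with commas and optionally append a year tag.

    Hypothetical helper: it only formats a string in the style the model
    card describes. Tag order is preserved as given.
    """
    # Drop empty entries and surrounding whitespace so the output is clean.
    parts = [tag.strip() for tag in tags if tag.strip()]
    if year is not None:
        # Year-specific style tag in the "year 20XX" format.
        parts.append(f"year {year}")
    return ", ".join(parts)


# Example tags are placeholders chosen for illustration.
prompt = build_prompt(["1girl", "solo", " cherry blossoms "], year=2023)
print(prompt)
```

The resulting string can be passed as the prompt to whichever SDXL front end or pipeline you use.
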
### Base Model

Momo XL was built on top of SDXL, incorporating knowledge from two fine-tuned models:

- Formula: `SDXL_base + (Animagine 3.0 base - SDXL_base) * 1.0 + (Pony V6 - SDXL_base) * 0.5`

For more details:

- [Animagine 3.0 base](https://huggingface.co/Linaqruf/animagine-xl-3.0)
- [Pony V6](https://huggingface.co/LyliaEngine/Pony_Diffusion_V6_XL)

### Training Process

Training was conducted on **A100 80GB GPUs**, totaling over **2,000 GPU hours**. The training was divided into three stages:

- **Finetuning - First Stage**: Trained on the entire dataset with a defined set of training configurations.
- **Finetuning - Second Stage**: Also trained on the entire dataset with some variations in settings.
- **Adjustment Stage**: Focused on aesthetic adjustments to improve the overall visual quality.

The final model, **Momo XL**, was created by merging the Text Encoder from the Finetuning Second Stage with the UNet from the Adjustment Stage.

### Hyperparameters

| Stage                    | Epochs | UNet lr | Text Encoder lr | Batch Size | Resolution | Noise Offset | Optimizer  | LR Scheduler |
|--------------------------|--------|---------|-----------------|------------|------------|--------------|------------|--------------|
| **Finetuning 1st Stage** | 10     | 2e-5    | 1e-5            | 256        | 1024²      | N/A          | AdamW8bit  | Constant     |
| **Finetuning 2nd Stage** | 10     | 2e-5    | 1e-5            | 256        | Max. 1280² | N/A          | AdamW      | Constant     |
| **Adjustment Stage**     | 0.25   | 8e-5    | 4e-5            | 1024       | Max. 1280² | 0.05         | AdamW      | Constant     |
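
The merge formula given under Base Model adds each fine-tuned model's difference from SDXL back onto SDXL with a per-model weight. The following is a minimal sketch of that arithmetic, not the author's actual merge tooling. The function name `add_difference_merge` is hypothetical, and plain dictionaries of floats stand in for real checkpoint state dicts; the same code would work on values that support `+`, `-`, and `*`, such as tensors.

```python
from typing import Any, Dict, Mapping


def add_difference_merge(
    base: Mapping[str, Any],
    finetuned: Mapping[str, Mapping[str, Any]],
    weights: Mapping[str, float],
) -> Dict[str, Any]:
    """Compute base + sum_i (finetuned_i - base) * weight_i for every key.

    Hypothetical helper illustrating the formula only. Keys missing from a
    fine-tuned model contribute nothing for that model.
    """
    merged: Dict[str, Any] = {}
    for key, base_value in base.items():
        value = base_value
        for name, state in finetuned.items():
            if key in state:
                # Add this model's weighted offset from the base weights.
                value = value + (state[key] - base_value) * weights[name]
        merged[key] = value
    return merged


# Toy single-parameter "checkpoints" (illustrative numbers, not real weights).
sdxl_base = {"w": 1.0}
models = {"animagine_3_base": {"w": 3.0}, "pony_v6": {"w": 2.0}}
merged = add_difference_merge(
    sdxl_base, models, {"animagine_3_base": 1.0, "pony_v6": 0.5}
)
print(merged["w"])  # 1.0 + (3.0 - 1.0) * 1.0 + (2.0 - 1.0) * 0.5 = 3.5
```

With a weight of 1.0 the Animagine offset is applied in full, while the Pony V6 offset is applied at half strength, matching the coefficients in the formula.
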