tomfjos / ai-toolkit.log
The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.
Running 1 job
0it [00:00, ?it/s]
/usr/local/lib/python3.10/dist-packages/controlnet_aux/mediapipe_face/mediapipe_face_common.py:7: UserWarning: The module 'mediapipe' is not installed. The package will have limited functionality. Please install it using the command: pip install 'mediapipe'
warnings.warn(
/usr/local/lib/python3.10/dist-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_5m_224 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_5m_224. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
return register_model(fn_wrapper)
/usr/local/lib/python3.10/dist-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_11m_224 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_11m_224. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
return register_model(fn_wrapper)
/usr/local/lib/python3.10/dist-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_21m_224 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_21m_224. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
return register_model(fn_wrapper)
/usr/local/lib/python3.10/dist-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_21m_384 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_21m_384. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
return register_model(fn_wrapper)
/usr/local/lib/python3.10/dist-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_21m_512 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_21m_512. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
return register_model(fn_wrapper)
You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
{
  "type": "sd_trainer",
  "training_folder": "output",
  "device": "cuda:0",
  "network": {
    "type": "lora",
    "linear": 16,
    "linear_alpha": 16
  },
  "save": {
    "dtype": "float16",
    "save_every": 500,
    "max_step_saves_to_keep": 4,
    "push_to_hub": false
  },
  "datasets": [
    {
      "folder_path": "/workspace/ai-toolkit/images",
      "caption_ext": "txt",
      "caption_dropout_rate": 0.05,
      "shuffle_tokens": false,
      "cache_latents_to_disk": true,
      "resolution": [
        512,
        768,
        1024
      ]
    }
  ],
  "train": {
    "batch_size": 1,
    "steps": 1000,
    "gradient_accumulation_steps": 1,
    "train_unet": true,
    "train_text_encoder": false,
    "gradient_checkpointing": true,
    "noise_scheduler": "flowmatch",
    "optimizer": "adamw8bit",
    "lr": 0.0004,
    "ema_config": {
      "use_ema": true,
      "ema_decay": 0.99
    },
    "dtype": "bf16"
  },
  "model": {
    "name_or_path": "black-forest-labs/FLUX.1-dev",
    "is_flux": true,
    "quantize": true
  },
  "sample": {
    "sampler": "flowmatch",
    "sample_every": 500,
    "width": 1024,
    "height": 1024,
    "prompts": [
      "Photo of joslora holding a sign that says 'I LOVE PROMPTS!'",
      "Professional headshot of joslora in a business suit.",
      "A happy pilot joslora of a Boeing 747.",
      "A doctor joslora talking to a patient.",
      "A chef joslora in the middle of a bustling kitchen, plating a beautifully arranged dish.",
      "A young joslora with a big grin, holding a large ice cream cone in front of an old-fashioned ice cream parlor.",
      "A person joslora in a tuxedo, looking directly into the camera with a confident smile, standing on a red carpet at a gala event.",
      "Person joslora with a bitchin' 80's mullet hairstyle leaning out the window of a pontiac firebird"
    ],
    "neg": "",
    "seed": 42,
    "walk_seed": true,
    "guidance_scale": 4,
    "sample_steps": 20
  },
  "trigger_word": "joslora"
}
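The config above can be sanity-checked before the job spends GPU time on it. A minimal sketch using only the standard library; the key names mirror the dump above, but the validation logic is illustrative and not part of ai-toolkit:

```python
import json

# Excerpt of the job config dumped above; only a few keys are shown here.
config_text = """
{
  "type": "sd_trainer",
  "network": {"type": "lora", "linear": 16, "linear_alpha": 16},
  "train": {"batch_size": 1, "steps": 1000, "lr": 0.0004, "dtype": "bf16"},
  "model": {"name_or_path": "black-forest-labs/FLUX.1-dev", "is_flux": true, "quantize": true},
  "trigger_word": "joslora"
}
"""

def validate(cfg: dict) -> dict:
    # Cheap checks that catch a malformed config before model loading starts.
    assert cfg["type"] == "sd_trainer"
    assert cfg["network"]["type"] == "lora"
    assert cfg["network"]["linear"] > 0, "LoRA rank must be positive"
    assert 0 < cfg["train"]["lr"] < 1, "learning rate looks wrong"
    return cfg

cfg = validate(json.loads(config_text))
print(cfg["model"]["name_or_path"])  # black-forest-labs/FLUX.1-dev
```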
Using EMA
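"Using EMA" refers to the `ema_config` block above (`ema_decay: 0.99`): a shadow copy of the trainable weights is blended toward the live weights each step, which smooths out step-to-step noise. A plain-Python sketch of the update rule, not ai-toolkit's implementation:

```python
def ema_update(ema_weights, live_weights, decay=0.99):
    # Shadow weights drift slowly toward the live weights:
    #   ema <- decay * ema + (1 - decay) * live
    return [decay * e + (1.0 - decay) * w
            for e, w in zip(ema_weights, live_weights)]

# With decay 0.99, a weight that jumps from 0 to 1 takes hundreds of
# steps to fully propagate into the EMA copy:
ema = [0.0]
for _ in range(3):
    ema = ema_update(ema, [1.0])
print(ema[0])  # 1 - 0.99**3 = 0.029701 (approximately)
```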
#############################################
# Running job: my_first_flux_lora_v1
#############################################
Running 1 process
Loading Flux model
Loading transformer
Quantizing transformer
Loading vae
Loading t5
Downloading shards: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 2/2 [00:08<00:00, 4.01s/it]
Loading checkpoint shards: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 2/2 [00:00<00:00, 9.33it/s]
Quantizing T5
Loading clip
making pipe
preparing
create LoRA network. base dim (rank): 16, alpha: 16
neuron dropout: p=None, rank dropout: p=None, module dropout: p=None
create LoRA for Text Encoder: 0 modules.
create LoRA for U-Net: 494 modules.
enable LoRA for U-Net
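"base dim (rank): 16, alpha: 16" above means each targeted linear layer gets a low-rank bypass scaled by `alpha / rank`. A numpy sketch of the idea; the dimensions and init are illustrative, not the trainer's actual module code:

```python
import numpy as np

rank, alpha = 16, 16
d_in, d_out = 64, 64

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))        # frozen base weight
A = rng.standard_normal((rank, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, rank))                   # trainable up-projection, zero-init

def lora_forward(x):
    # Base path plus scaled low-rank bypass; scale = alpha / rank.
    return W @ x + (alpha / rank) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# Because B starts at zero, the adapter is an exact no-op at step 0,
# so training begins from the unmodified base model.
assert np.allclose(lora_forward(x), W @ x)
```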
Dataset: /workspace/ai-toolkit/images
- Preprocessing image dimensions
100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 8/8 [00:00<00:00, 219.62it/s]
- Found 8 images
Bucket sizes for /workspace/ai-toolkit/images:
576x384: 8 files
1 buckets made
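The bucket dimensions in this log follow from the configured `resolution` list and the 3:2 aspect ratio of the training images. One rounding scheme that reproduces all three buckets seen here (height snapped down to a multiple of 64 under the target area, then width) is sketched below; the toolkit's actual rounding rules aren't shown in the log, so treat this as an assumption:

```python
import math

def bucket_for(resolution, aspect_ratio, step=64):
    # Target pixel area is resolution**2. Pick the largest height whose
    # area at this aspect ratio stays under target, snapped to `step`,
    # then snap the matching width down to `step` as well.
    h = int(math.sqrt(resolution**2 / aspect_ratio)) // step * step
    w = int(h * aspect_ratio) // step * step
    return w, h

for res in (512, 768, 1024):
    print(res, bucket_for(res, 1.5))
# 512 -> (576, 384), 768 -> (832, 576), 1024 -> (1216, 832)
```

These match the `576x384`, `832x576`, and `1216x832` buckets reported for the three configured resolutions.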
Caching latents for /workspace/ai-toolkit/images
- Saving latents to disk
Caching latents to disk: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 8/8 [00:00<00:00, 10.49it/s]
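`cache_latents_to_disk: true` means each image is pushed through the VAE encoder once and the resulting latent is saved, so later epochs read from disk instead of re-encoding. A schematic sketch with numpy arrays standing in for real VAE latents; the hash-based cache key and the stand-in encoder are assumptions, not the toolkit's scheme:

```python
import hashlib
import tempfile
from pathlib import Path
import numpy as np

cache_dir = Path(tempfile.mkdtemp())

def fake_vae_encode(pixels):
    # Stand-in for the real VAE: 8x spatial downsample to a latent grid.
    h, w, _ = pixels.shape
    return np.zeros((h // 8, w // 8, 16), dtype=np.float32)

def cached_latent(image_path, pixels):
    # Key the cache on the image path so repeated epochs hit the cache.
    key = hashlib.sha256(str(image_path).encode()).hexdigest()[:16]
    path = cache_dir / f"{key}.npy"
    if path.exists():
        return np.load(path)          # cache hit: no VAE call
    latent = fake_vae_encode(pixels)  # cache miss: encode once, save
    np.save(path, latent)
    return latent

img = np.zeros((384, 576, 3), dtype=np.float32)  # matches the 576x384 bucket
lat = cached_latent("images/0001.jpg", img)
print(lat.shape)  # (48, 72, 16)
```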
Dataset: /workspace/ai-toolkit/images
- Preprocessing image dimensions
100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 8/8 [00:00<00:00, 50610.00it/s]
- Found 8 images
Bucket sizes for /workspace/ai-toolkit/images:
832x576: 8 files
1 buckets made
Caching latents for /workspace/ai-toolkit/images
- Saving latents to disk
Caching latents to disk: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 8/8 [00:00<00:00, 9.98it/s]
Dataset: /workspace/ai-toolkit/images
- Preprocessing image dimensions
100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 8/8 [00:00<00:00, 63670.65it/s]
- Found 8 images
Bucket sizes for /workspace/ai-toolkit/images:
1216x832: 8 files
1 buckets made
Caching latents for /workspace/ai-toolkit/images
- Saving latents to disk
Caching latents to disk: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 8/8 [00:01<00:00, 7.99it/s]
Generating baseline samples before training
Generating Images: 38%|β–ˆβ–ˆβ–ˆβ–Š      | 3/8 [01:13<01:56, 23.26s/it]