|
The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`. |
|
Running 1 job |
|
0it [00:00, ?it/s]
|
/usr/local/lib/python3.10/dist-packages/controlnet_aux/mediapipe_face/mediapipe_face_common.py:7: UserWarning: The module 'mediapipe' is not installed. The package will have limited functionality. Please install it using the command: pip install 'mediapipe' |
|
warnings.warn( |
|
/usr/local/lib/python3.10/dist-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_5m_224 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_5m_224. This is because the name being registered conflicts with an existing name. Please check if this is not expected. |
|
return register_model(fn_wrapper) |
|
/usr/local/lib/python3.10/dist-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_11m_224 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_11m_224. This is because the name being registered conflicts with an existing name. Please check if this is not expected. |
|
return register_model(fn_wrapper) |
|
/usr/local/lib/python3.10/dist-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_21m_224 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_21m_224. This is because the name being registered conflicts with an existing name. Please check if this is not expected. |
|
return register_model(fn_wrapper) |
|
/usr/local/lib/python3.10/dist-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_21m_384 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_21m_384. This is because the name being registered conflicts with an existing name. Please check if this is not expected. |
|
return register_model(fn_wrapper) |
|
/usr/local/lib/python3.10/dist-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_21m_512 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_21m_512. This is because the name being registered conflicts with an existing name. Please check if this is not expected. |
|
return register_model(fn_wrapper) |
|
You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers |
|
{
  "type": "sd_trainer",
  "training_folder": "output",
  "device": "cuda:0",
  "network": {
    "type": "lora",
    "linear": 16,
    "linear_alpha": 16
  },
  "save": {
    "dtype": "float16",
    "save_every": 500,
    "max_step_saves_to_keep": 4,
    "push_to_hub": false
  },
  "datasets": [
    {
      "folder_path": "/workspace/ai-toolkit/images",
      "caption_ext": "txt",
      "caption_dropout_rate": 0.05,
      "shuffle_tokens": false,
      "cache_latents_to_disk": true,
      "resolution": [512, 768, 1024]
    }
  ],
  "train": {
    "batch_size": 1,
    "steps": 1000,
    "gradient_accumulation_steps": 1,
    "train_unet": true,
    "train_text_encoder": false,
    "gradient_checkpointing": true,
    "noise_scheduler": "flowmatch",
    "optimizer": "adamw8bit",
    "lr": 0.0004,
    "ema_config": {
      "use_ema": true,
      "ema_decay": 0.99
    },
    "dtype": "bf16"
  },
  "model": {
    "name_or_path": "black-forest-labs/FLUX.1-dev",
    "is_flux": true,
    "quantize": true
  },
  "sample": {
    "sampler": "flowmatch",
    "sample_every": 500,
    "width": 1024,
    "height": 1024,
    "prompts": [
      "Photo of joslora holding a sign that says 'I LOVE PROMPTS!'",
      "Professional headshot of joslora in a business suit.",
      "A happy pilot joslora of a Boeing 747.",
      "A doctor joslora talking to a patient.",
      "A chef joslora in the middle of a bustling kitchen, plating a beautifully arranged dish.",
      "A young joslora with a big grin, holding a large ice cream cone in front of an old-fashioned ice cream parlor.",
      "A person joslora in a tuxedo, looking directly into the camera with a confident smile, standing on a red carpet at a gala event.",
      "Person joslora with a bitchin' 80's mullet hairstyle leaning out the window of a pontiac firebird"
    ],
    "neg": "",
    "seed": 42,
    "walk_seed": true,
    "guidance_scale": 4,
    "sample_steps": 20
  },
  "trigger_word": "joslora"
}
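A config like the one above is worth sanity-checking before committing to a long run. A minimal sketch (the field names come from the config above; the checks themselves are illustrative and not part of ai-toolkit):

```python
import json

# Hypothetical pre-flight check on a trimmed-down copy of the config.
config = json.loads("""
{
  "type": "sd_trainer",
  "network": {"type": "lora", "linear": 16, "linear_alpha": 16},
  "train": {"batch_size": 1, "steps": 1000, "lr": 0.0004},
  "model": {"name_or_path": "black-forest-labs/FLUX.1-dev", "is_flux": true}
}
""")

assert config["type"] == "sd_trainer"
# rank and alpha match, so the LoRA update is applied at scale 1.0
assert config["network"]["linear"] == config["network"]["linear_alpha"]
assert 0 < config["train"]["lr"] < 1e-2, "LR looks off for LoRA fine-tuning"
print("config OK")
```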
|
Using EMA |
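"Using EMA" refers to the `ema_config` block above (`ema_decay: 0.99`): the trainer keeps an exponential moving average of the trained weights and uses the smoothed copy for sampling and saving. A standalone sketch of the update rule, not ai-toolkit's actual implementation:

```python
# EMA keeps a shadow value that trails the live parameter:
# shadow <- decay * shadow + (1 - decay) * value
def ema_update(shadow, value, decay=0.99):
    return decay * shadow + (1.0 - decay) * value

shadow = 0.0
for step in range(5):
    shadow = ema_update(shadow, 1.0)  # live value jumps to 1.0 and stays

# after 5 steps the shadow has only moved ~5% of the way to 1.0
print(round(shadow, 4))  # 0.049
```

With decay 0.99 the shadow weights effectively average over the last ~100 steps, which smooths out the noisy per-step loss visible later in this log.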
|
|
|
############################################# |
|
# Running job: my_first_flux_lora_v1 |
|
############################################# |
|
|
|
|
|
Running 1 process |
|
Loading Flux model |
|
Loading transformer |
|
Quantizing transformer |
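`"quantize": true` shrinks the transformer weights to 8-bit so FLUX.1-dev fits in less VRAM. A toy absmax int8 round-trip showing the basic idea (the toolkit's real scheme may differ, e.g. a library such as optimum-quanto with per-channel scales):

```python
# Absmax quantization: map the largest-magnitude weight to +/-127,
# store int8 values plus one float scale, multiply back to dequantize.
def quantize_int8(weights):
    scale = max(abs(w) for w in weights) / 127.0 or 1.0  # guard all-zero input
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [x * scale for x in q]

w = [0.5, -1.27, 0.03]          # illustrative weights only
q, s = quantize_int8(w)          # q == [50, -127, 3]
w_hat = dequantize(q, s)         # close to w, up to rounding error
```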
|
Loading vae |
|
Loading t5 |
|
Downloading shards: 100%|██████████| 2/2 [00:08<00:00, 4.01s/it]
|
Loading checkpoint shards: 100%|██████████| 2/2 [00:00<00:00, 9.33it/s]
|
Quantizing T5 |
|
Loading clip |
|
making pipe |
|
preparing |
|
create LoRA network. base dim (rank): 16, alpha: 16 |
|
neuron dropout: p=None, rank dropout: p=None, module dropout: p=None |
|
create LoRA for Text Encoder: 0 modules. |
|
create LoRA for U-Net: 494 modules. |
|
enable LoRA for U-Net |
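The line "base dim (rank): 16, alpha: 16" fixes the size of each of the 494 adapters. For one linear layer, LoRA trains two small matrices instead of the full weight: the update is (alpha/rank) * B @ A with A of shape (rank, d_in) and B of shape (d_out, rank). A sketch with a hypothetical layer width:

```python
# Parameter accounting for one LoRA-adapted linear layer.
rank, alpha = 16, 16
d_in, d_out = 3072, 3072              # hypothetical layer width, for illustration

lora_params = rank * (d_in + d_out)   # trainable params for this layer
full_params = d_in * d_out            # what full fine-tuning would train
scale = alpha / rank                  # 1.0: the update is applied unscaled

print(lora_params, full_params, scale)  # 98304 9437184 1.0
```

At rank 16 the adapter trains roughly 1% of the layer's parameters, which is why the whole run fits alongside a quantized base model.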
|
Dataset: /workspace/ai-toolkit/images |
|
- Preprocessing image dimensions |
|
100%|██████████| 8/8 [00:00<00:00, 219.62it/s]
|
- Found 8 images |
|
Bucket sizes for /workspace/ai-toolkit/images: |
|
576x384: 8 files |
|
1 buckets made |
|
Caching latents for /workspace/ai-toolkit/images |
|
- Saving latents to disk |
|
Caching latents to disk: 100%|██████████| 8/8 [00:00<00:00, 10.49it/s]
|
Dataset: /workspace/ai-toolkit/images |
|
- Preprocessing image dimensions |
|
100%|██████████| 8/8 [00:00<00:00, 50610.00it/s]
|
- Found 8 images |
|
Bucket sizes for /workspace/ai-toolkit/images: |
|
832x576: 8 files |
|
1 buckets made |
|
Caching latents for /workspace/ai-toolkit/images |
|
- Saving latents to disk |
|
Caching latents to disk: 100%|██████████| 8/8 [00:00<00:00, 9.98it/s]
|
Dataset: /workspace/ai-toolkit/images |
|
- Preprocessing image dimensions |
|
100%|██████████| 8/8 [00:00<00:00, 63670.65it/s]
|
- Found 8 images |
|
Bucket sizes for /workspace/ai-toolkit/images: |
|
1216x832: 8 files |
|
1 buckets made |
|
Caching latents for /workspace/ai-toolkit/images |
|
- Saving latents to disk |
|
Caching latents to disk: 100%|██████████| 8/8 [00:01<00:00, 7.99it/s]
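Each resolution in the config (512/768/1024) produced one bucket above (576x384, 832x576, 1216x832): images are grouped into width/height pairs that roughly preserve their aspect ratio, stay within the target resolution's pixel budget, and are divisible by 64. A sketch of one common bucketing scheme, not necessarily ai-toolkit's exact algorithm:

```python
# Pick the (width, height) pair, both multiples of `step`, whose area fits
# within target_res**2 and whose aspect ratio is closest to the source image;
# ties are broken by preferring the larger area.
def bucket_for(aspect, target_res, step=64):
    best, best_key = None, None
    for w in range(step, 2 * target_res + 1, step):
        for h in range(step, 2 * target_res + 1, step):
            if w * h > target_res * target_res:
                continue  # stay within the pixel budget
            key = (abs(w / h - aspect), -(w * h))
            if best_key is None or key < best_key:
                best, best_key = (w, h), key
    return best

print(bucket_for(1.5, 512))  # (576, 384)
```

For a 3:2 source at the 512 target this yields 576x384, matching the first bucket in the log; the exact buckets at larger targets depend on the images' precise aspect ratio.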
|
Generating baseline samples before training |
|
Generating Images: 100%|██████████| 8/8 [02:58<00:00, 21.31s/it]
my_first_flux_lora_v1:   0%|          | 0/1000 [00:03<?, ?it/s, lr: 4.0e-04 loss: 2.359e-01]
my_first_flux_lora_v1:   0%|          | 1/1000 [00:07<57:14, 3.44s/it, lr: 4.0e-04 loss: 4.707e-01]
my_first_flux_lora_v1:   0%|          | 2/1000 [00:08<41:30, 2.50s/it, lr: 4.0e-04 loss: 3.824e-01]
my_first_flux_lora_v1:   0%|          | 3/1000 [00:10<36:27, 2.19s/it, lr: 4.0e-04 loss: 3.459e-01]
my_first_flux_lora_v1:   0%|          | 4/1000 [00:12<34:04, 2.05s/it, lr: 4.0e-04 loss: 5.564e-01]
my_first_flux_lora_v1:   0%|          | 5/1000 [00:13<29:17, 1.77s/it, lr: 4.0e-04 loss: 4.676e-01]
my_first_flux_lora_v1:   1%|          | 6/1000 [00:15<25:53, 1.56s/it, lr: 4.0e-04 loss: 5.364e-01]
my_first_flux_lora_v1:   1%|          | 7/1000 [00:16<23:43, 1.43s/it, lr: 4.0e-04 loss: 3.611e-01]
my_first_flux_lora_v1:   1%|          | 8/1000 [00:17<22:17, 1.35s/it, lr: 4.0e-04 loss: 5.198e-01]
my_first_flux_lora_v1:   1%|          | 9/1000 [00:19<25:11, 1.53s/it, lr: 4.0e-04 loss: 4.254e-01]
my_first_flux_lora_v1:   1%|          | 10/1000 [00:21<26:45, 1.62s/it, lr: 4.0e-04 loss: 4.624e-01]
my_first_flux_lora_v1:   1%|          | 11/1000 [00:23<27:48, 1.69s/it, lr: 4.0e-04 loss: 5.234e-01]
my_first_flux_lora_v1:   1%|          | 12/1000 [00:26<36:32, 2.22s/it, lr: 4.0e-04 loss: 2.321e-01]
my_first_flux_lora_v1:   1%|▏         | 13/1000 [00:29<42:46, 2.60s/it, lr: 4.0e-04 loss: 5.826e-01]
my_first_flux_lora_v1:   1%|▏         | 14/1000 [00:31<35:36, 2.17s/it, lr: 4.0e-04 loss: 3.830e-01]
my_first_flux_lora_v1:   2%|▏         | 15/1000 [00:34<41:51, 2.55s/it, lr: 4.0e-04 loss: 2.567e-01]
my_first_flux_lora_v1:   2%|▏         | 16/1000 [00:36<38:17, 2.33s/it, lr: 4.0e-04 loss: 4.865e-01]
my_first_flux_lora_v1:   2%|▏         | 17/1000 [00:37<32:52, 2.01s/it, lr: 4.0e-04 loss: 5.176e-01]
my_first_flux_lora_v1:   2%|▏         | 18/1000 [00:38<28:42, 1.75s/it, lr: 4.0e-04 loss: 3.029e-01]
my_first_flux_lora_v1:   2%|▏         | 19/1000 [00:39<25:47, 1.58s/it, lr: 4.0e-04 loss: 4.879e-01]
my_first_flux_lora_v1:   2%|▏         | 20/1000 [00:41<27:01, 1.65s/it, lr: 4.0e-04 loss: 5.476e-01]
my_first_flux_lora_v1:   2%|▏         | 21/1000 [00:45<35:59, 2.21s/it, lr: 4.0e-04 loss: 4.513e-01]
my_first_flux_lora_v1:   2%|▏         | 22/1000 [00:48<41:57, 2.57s/it, lr: 4.0e-04 loss: 4.999e-01]
my_first_flux_lora_v1:   2%|▏         | 23/1000 [00:52<46:07, 2.83s/it, lr: 4.0e-04 loss: 4.197e-01]
my_first_flux_lora_v1:   2%|▏         | 24/1000 [00:53<37:59, 2.34s/it, lr: 4.0e-04 loss: 3.050e-01]
my_first_flux_lora_v1:   2%|▎         | 25/1000 [00:57<43:43, 2.69s/it, lr: 4.0e-04 loss: 3.179e-01]
my_first_flux_lora_v1:   3%|▎         | 26/1000 [00:58<36:14, 2.23s/it, lr: 4.0e-04 loss: 3.800e-01]
my_first_flux_lora_v1:   3%|▎         | 27/1000 [01:01<42:04, 2.59s/it, lr: 4.0e-04 loss: 5.513e-01]
my_first_flux_lora_v1:   3%|▎         | 28/1000 [01:05<46:07, 2.85s/it, lr: 4.0e-04 loss: 2.376e-01]
my_first_flux_lora_v1:   3%|▎         | 29/1000 [01:07<41:46, 2.58s/it, lr: 4.0e-04 loss: 5.137e-01]
my_first_flux_lora_v1:   3%|▎         | 30/1000 [01:08<38:06, 2.36s/it, lr: 4.0e-04 loss: 3.724e-01]
my_first_flux_lora_v1:   3%|▎         | 31/1000 [01:10<32:18, 2.00s/it, lr: 4.0e-04 loss: 2.804e-01]
my_first_flux_lora_v1:   3%|▎         | 32/1000 [01:11<28:13, 1.75s/it, lr: 4.0e-04 loss: 6.325e-01]
my_first_flux_lora_v1:   3%|▎         | 33/1000 [01:12<25:22, 1.57s/it, lr: 4.0e-04 loss: 3.761e-01]
my_first_flux_lora_v1:   3%|▎         | 34/1000 [01:15<34:32, 2.15s/it, lr: 4.0e-04 loss: 4.906e-01]
my_first_flux_lora_v1:   4%|▎         | 35/1000 [01:17<33:00, 2.05s/it, lr: 4.0e-04 loss: 2.737e-01]
my_first_flux_lora_v1:   4%|▎         | 36/1000 [01:19<31:55, 1.99s/it, lr: 4.0e-04 loss: 5.388e-01]
my_first_flux_lora_v1:   4%|▎         | 37/1000 [01:22<38:52, 2.42s/it, lr: 4.0e-04 loss: 4.706e-01]
my_first_flux_lora_v1:   4%|▍         | 38/1000 [01:24<36:14, 2.26s/it, lr: 4.0e-04 loss: 5.604e-01]
my_first_flux_lora_v1:   4%|▍         | 39/1000 [01:28<41:51, 2.61s/it, lr: 4.0e-04 loss: 2.482e-01]
my_first_flux_lora_v1:   4%|▍         | 40/1000 [01:30<38:04, 2.38s/it, lr: 4.0e-04 loss: 3.015e-01]
my_first_flux_lora_v1:   4%|▍         | 41/1000 [01:31<35:25, 2.22s/it, lr: 4.0e-04 loss: 5.687e-01]