doge1516 committed on
Commit
9b97b2a
1 Parent(s): 177a983

Update README.md

Files changed (1)
  1. README.md +34 -3
README.md CHANGED
@@ -1,3 +1,34 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ language:
+ - en
+ library_name: diffusers
+ tags:
+ - text-to-image
+ - stable diffusion
+ - personalization
+ - msdiffusion
+ ---
+
+ # Introduction
+
+ We introduce MS-Diffusion, a framework for layout-guided, zero-shot image personalization with multiple subjects. It integrates grounding tokens with a feature resampler to preserve the detail fidelity of each subject. Guided by the layout, MS-Diffusion also adapts cross-attention to multi-subject inputs so that each subject condition acts on a specific area. The proposed multi-subject cross-attention keeps the subjects composed coherently while preserving text control.
+
+ ![example](imgs/teaser_new.png)
+
+ - **Project Page:** [https://eclipse-t2i.github.io/Lambda-ECLIPSE/](https://eclipse-t2i.github.io/Lambda-ECLIPSE/)
+ - **GitHub:** [https://github.com/Maitreyapatel/lambda-eclipse-inference](https://github.com/Maitreyapatel/lambda-eclipse-inference)
+ - **Paper (arXiv):** [https://arxiv.org/abs/2402.05195](https://arxiv.org/abs/2402.05195)
+
+ # Model
+
+ Download the pretrained base models from [SDXL-base-1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) and [CLIP-G]().
+
+ Please refer to our [GitHub repository]() to prepare the environment and get detailed instructions on how to run the model.
+
+ # Important Notes
+
+ - This repo contains only the trained model checkpoint, without data, code, or base models. Please see the GitHub repository for detailed instructions.
+ - The `scale` parameter determines the strength of image control. By default, `scale` is set to 0.6. In practice, a `scale` of 0.4 works better if your input contains subjects that need to affect the whole image, such as the background. **Feel free to adjust the `scale` in your applications.**
+ - The model expects layout inputs. You can use the default layouts in the inference script, but more accurate and realistic layouts produce better results.
+ - Though MS-Diffusion outperforms SOTA personalized diffusion methods in both single-subject and multi-subject generation, it is still affected by the background in subject images. The best practice is to use masked images, since they contain no irrelevant information.
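
The notes on `scale` and layouts can be sketched as a small pre-flight helper. This is a minimal illustration, not the repository's API. The function names `choose_scale` and `boxes_to_pixels` are hypothetical, and so is the assumption that a layout is a list of normalized `[x0, y0, x1, y1]` boxes, one per subject. Check the inference script in the GitHub repository for the actual input format.

```python
from typing import List, Sequence, Tuple

DEFAULT_SCALE = 0.6      # default value stated in the notes
WHOLE_IMAGE_SCALE = 0.4  # suggested when a subject (e.g. a background) affects the whole image


def choose_scale(subject_affects_whole_image: bool) -> float:
    """Pick a starting `scale` following the notes; tune it for your application."""
    return WHOLE_IMAGE_SCALE if subject_affects_whole_image else DEFAULT_SCALE


def boxes_to_pixels(
    boxes: Sequence[Sequence[float]], width: int, height: int
) -> List[Tuple[int, int, int, int]]:
    """Validate normalized [x0, y0, x1, y1] boxes (an assumed format) and convert to pixels."""
    pixel_boxes = []
    for box in boxes:
        if len(box) != 4:
            raise ValueError(f"expected 4 coordinates, got {len(box)}")
        x0, y0, x1, y1 = box
        # Each box must lie inside the unit square and have positive area.
        if not (0.0 <= x0 < x1 <= 1.0 and 0.0 <= y0 < y1 <= 1.0):
            raise ValueError(f"box {list(box)} is not a valid normalized box")
        pixel_boxes.append(
            (round(x0 * width), round(y0 * height), round(x1 * width), round(y1 * height))
        )
    return pixel_boxes
```

For example, `boxes_to_pixels([[0.0, 0.25, 0.5, 0.75], [0.5, 0.25, 1.0, 0.75]], 1024, 1024)` places two subjects side by side on a 1024×1024 canvas. Rejecting malformed boxes early is a design choice of this sketch, in line with the note that more accurate layouts produce better results.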