segmind/SSD-1B

@@ -18,6 +18,9 @@ library_name: diffusers
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62039c2d91d53938a643317d/0Iu_0f0d1ihGy0YiOd9uS.png)
 ## Model Description
@@ -27,10 +30,6 @@ This model employs a knowledge distillation strategy, where it leverages the tea
 Special thanks to the HF team 🤗 especially [Sayak](https://huggingface.co/sayakpaul), [Patrick](https://github.com/patrickvonplaten) and [Poli](https://huggingface.co/multimodalart) for their collaboration and guidance on this work.
-## Demo
-Try out the model at [Segmind SSD-1B](https://www.segmind.com/models/ssd-1b) for ⚡ fastest inference. You can also try it on [🤗 Spaces](https://huggingface.co/spaces/segmind/Segmind-Stable-Diffusion)
 ## Image Comparision (SDXL-1.0 vs SSD-1B)
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62039c2d91d53938a643317d/mOM_OMxbivVBELad1QQYj.png)
@@ -102,9 +101,11 @@ These are the key hyperparameters used during training:
 ### Speed Comparision
-We have observed that SSD-1B is upto 60% faster than the Base SDXL Model. Below is a comparision on an A100 40GB.
-![image/png](https://cdn-uploads.huggingface.co/production/uploads/62039c2d91d53938a643317d/f7BcTrz5PjYGC5htLUVge.png)
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62039c2d91d53938a643317d/moMZrlDr-HTFkZlqWHUjQ.png)
@@ -212,18 +213,5 @@ The SSD-1B Model is not suitable for creating factual or accurate representation
 ## Limitations and Bias
-### Limitations
-- **Photorealism:** The model does not achieve perfect photorealism and may produce images with artistic or stylized qualities.
-- **Legible Text:** Generating legible text within images is a challenge for the model, and text within images may appear distorted or unreadable.
-- **Compositionality:** Complex tasks involving composition, such as rendering images based on intricate descriptions, may pose challenges for the model.
-- **Faces and People:** While the model can generate a wide range of content, it may not consistently produce realistic or high-quality images of faces and people.
-- **Lossy Autoencoding:** The autoencoding aspect of the model is lossy, which means that some details in the input text may not be perfectly retained in the generated images.
-### Bias
-The SSD-1B Model is trained on a diverse dataset, but like all generative models, it may exhibit biases present in the training data. Users are encouraged to be mindful of potential biases in the model's outputs and take appropriate steps to mitigate them.

 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62039c2d91d53938a643317d/0Iu_0f0d1ihGy0YiOd9uS.png)
+## Demo
+Try out the model at [Segmind SSD-1B](https://www.segmind.com/models/ssd-1b) for ⚡ fastest inference. You can also try it on [🤗 Spaces](https://huggingface.co/spaces/segmind/Segmind-Stable-Diffusion).
 ## Model Description
 Special thanks to the HF team 🤗 especially [Sayak](https://huggingface.co/sayakpaul), [Patrick](https://github.com/patrickvonplaten) and [Poli](https://huggingface.co/multimodalart) for their collaboration and guidance on this work.
 ## Image Comparision (SDXL-1.0 vs SSD-1B)
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62039c2d91d53938a643317d/mOM_OMxbivVBELad1QQYj.png)
 ### Speed Comparision
+We have observed that SSD-1B is up to 60% faster than the base SDXL model. Below is a comparison on an A100 80GB.
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/62039c2d91d53938a643317d/TyymF1OkUjXLrHUp1XF0t.png)
+Below are the speed-up metrics on an RTX 4090 GPU.
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62039c2d91d53938a643317d/moMZrlDr-HTFkZlqWHUjQ.png)
 ## Limitations and Bias
+The SSD-1B model does not achieve full photorealism, especially when depicting people. It also struggles to render legible text and to preserve the fidelity of complex compositions, owing to its lossy autoencoding. The model was trained on a diverse dataset, but that does not remove ingrained societal and digital biases, and its outputs may reflect them. Users are encouraged to keep these limitations in mind when working with the model.