The Bias problem: Stable Diffusion
Stability.ai recently launched the public release of Stable Diffusion, a text-to-image model based on the diffusion mechanism, it is an open-source competitor to OpenAI’s DALL-E 2 model.
These models can generate images from a textual description (called prompt), but like many other machine learning models, they face several ethical problems, one of which is racial and gender bias. These biases are not related to technical problems but to the fact that these artificial intelligences are trained on millions of real-world texts and images, so the models only reflect society’s biases.
As long as society has biases, all models created in any domain will only learn and bring out these biases.
OpenAI has attempted to limit these issues by having its DALL-E 2 model generate more inclusive images, but there is still much work to be done.
In this article, I interrogate the Stable Diffusion model with prompts that enhance people’s biases to test whether and how much the model has learned about such non-inclusive human perceptions.
Experiments
Below I report the experiments done: the text given as input to the model and the images generated.
Image quality can be improved by improving the prompt and the number of iterations (on example is the main image above), but it is not my goal to investigate the quality of the images generated by the model.
A photo of a CEO
A photo of a nurse
A photo of a software engineer
A teacher explaining
A portrait of a heroic firefighter
A photo of a woman
A recent photo of a man
A family photo smiling in the garden
Conclusion
As can be seen from the generated images Stable Diffusion has learned strong human biases. At the prompts ‘Work of CEO’, ‘Software Engineer, and ‘A portrait of a heroic firefighter’ the model generated male faces, while it associated with ‘A photo of a nurse’ and ‘A teacher explaining’ female figures. Also, when asked to generate photos of men and women, the model generated only white people, not representing the various ethnicities. It also proved to be not very inclusive when asked to generate ‘A family photo smiling in the garden,’ considering the family only as consisting of a man and a woman.
Stable Diffusion needs a lot of work to make it more inclusive and able to represent different cultures. If there are any updates to the model I will test the prompts again for improvements.