I wonder what everybody's thoughts are on bias in generative models' output.
Take image generation, for example. Ask for anyone in a prestigious job or a prestigious situation and all you are getting is white, able-bodied males. The other day I asked for a picture illustrating trauma and all I got was black and Latino people. Even when I specifically asked for white people, it would not budge and kept giving me black people only. Or when was the last time you saw an AI-generated image of a person that wasn't Instagram-filter-pretty?
The other day I asked DALL-E to give me a picture of a realistic, mediocre-looking 40-year-old woman. It didn't know what to do, alternating between giving me images of late-20s model types and 50+ wrinkled crones proudly showing off their white hair. I asked it to add 15 kg to the person it had drawn, and nothing happened.
Of course, it is guessing what I actually want and filling in the information I am not providing. In the data it has learned from, high-quality imagery usually doesn't show an average 40-year-old, even when that is what is being illustrated. So it gives me the model type anyway, since that is what is usually used, and it assumes that is what I actually want to see. No surprises about why it does what it does.
Text is just as biased, of course. There have already been plenty of examples of chatbots and AIs being withdrawn, e.g. in recruiting or tenant selection, because of discrimination caused by the AI's output. And as AI moves into an increasing number of areas, often prompted and designed by non-professionals looking for a quick buck, self-confirming bias and discrimination are a real concern.
Of course, the model has no intent behind it. It is just a glorified parrot-autocorrect, after all. And a sad reality check on who is represented in the source material, at that.
Seeing how models have been lobotomized by safeguards against improper use, I am doubtful current models can be trained to have less biased output as long as the training data is full of bias like that. To self-analyse, understand that what is in the training data is not a depiction of reality but a depiction of how reality is represented in the source material, and then critically adjust its output accordingly, AI would need a completely different level of awareness than any of the current predictive models have.