
OpenAI’s ChatGPT can be manipulated into generating sexualized and graphically violent images using only minor modifications to a widely circulated prompt, according to findings by British AI security firm Mindgard reported by the BBC on Wednesday. (Koha)
The discovery centers on OpenAI’s GPT-5.4 model, the latest public version of ChatGPT’s image generation capability. Mindgard researchers found that tweaking a prompt originally designed to produce humorous results caused the system to output disturbing content, without any explicit instructions specifying violent or sexual subject matter. (BBC)
Peter Garraghan, Mindgard’s founder, told the BBC that the AI “autonomously generated a variety of shocking and sexualized visuals” even though the prompt did not define the content of the images. He described the outputs as “very gruesome, sometimes sexual, and sometimes both”. (Koha)
Among the images generated were depictions of a man with a head wound, a dead woman with a bloody body, and scenes combining sexual violence with nudity. Mindgard’s earlier disclosure, published in February, noted that the technique could also produce sexualized images of real people, raising concerns about non-consensual deepfakes. (BBC)
After the BBC approached OpenAI with the findings, the company said it had acted. “After investigating this phenomenon, we have put in place additional safeguards against this type of instruction,” OpenAI stated. The company added that it maintains multiple layers of defenses to prevent users from creating content that violates its policies. (BBC)
However, AI safety researchers told the BBC that with only minor variations, the problematic prompt continued to produce disturbing results even after OpenAI’s intervention. (Koha)
Mindgard’s technical blog, published in February, detailed how the bypass worked: researchers manipulated ChatGPT’s custom memory and system prompt context to override its image safety guardrails, requiring no backend access or special credentials. The vulnerability was first discovered on January 1 and disclosed to OpenAI on January 28. (Mindgard)
The findings arrive amid broader scrutiny of AI image generation safety. OpenAI has separately faced questions over its planned “Adult Mode” feature for ChatGPT, which the company delayed earlier this year after internal safety advisors warned it could put minors at risk. The BBC did not publish the specific prompts used in the research. (Mashable)