Researchers find over 1,000 child abuse images in AI dataset

Voltaire Staff
Dec 24, 2023

Stanford researchers found more than 1,000 child sexual abuse images in an AI dataset used to train popular image generation tools such as Stable Diffusion. The dataset lacked clear records, raising concerns among AI developers.

The Stanford Internet Observatory issued a report on Wednesday identifying more than 3,200 images of suspected child sexual abuse in the vast AI database LAION.

The watchdog group, based at Stanford University, collaborated with the Canadian Centre for Child Protection and other anti-abuse charities to identify and report the illegal material, with approximately 1,000 of the images externally validated.

The response to the Stanford Internet Observatory's report was swift. On the eve of its publication, LAION (Large-scale Artificial Intelligence Open Network) announced it was temporarily removing its datasets.

In a statement to Bloomberg, LAION asserted a "zero tolerance policy" for illegal content and mentioned taking down the datasets as a precautionary measure.

While the discovered images represent a fraction of LAION's colossal index of around 5.8 billion images, the Stanford group argues that the images are likely to influence the capability of AI tools to generate harmful outputs.

London-based startup Stability AI, a prominent LAION user that helped shape the dataset's development, featured in the report. Stability AI, the maker of the Stable Diffusion text-to-image models, said that newer versions had enhanced safety measures, but an older version, introduced last year and still present in various applications, remains popular for generating explicit imagery.

The report calls for urgent measures, including deleting or cleaning training sets derived from LAION-5B and making older versions of models such as Stable Diffusion less accessible. It also urges platforms like CivitAI and Hugging Face to implement better safeguards against generating abusive images.

The Stanford report questions the ethical implications of feeding any photos of children into AI systems without their families' consent, citing concerns related to the Children’s Online Privacy Protection Act.

Child safety organisations, including Thorn, advocate for clean datasets during AI model development and propose applying unique digital signatures or "hashes" to track and take down AI models misused for child abuse materials.

According to the Associated Press, attorneys general from all 50 US states are urging lawmakers to establish a dedicated commission to investigate the impact of artificial intelligence on child exploitation.

In a letter addressed to Congress, the attorneys general advocate for the formation of a commission tasked with finding solutions to prevent the creation of AI-generated child sexual abuse material.