The Stanford Internet Observatory has made a distressing discovery: more than 1,000 suspected images of child sexual abuse in LAION-5B, a dataset used to train AI image generators. The finding, made public in December 2023, has raised serious concerns about the sources and methods used to compile AI training material.
LAION-5B, which was used to train London-based Stability AI’s Stable Diffusion image generator, accumulated these images by scraping content from social media and pornographic websites. The presence of such material in AI training data is alarming given how widely these models are used and how much influence they can have.
Addressing the Challenge with Technology
The Stanford researchers did not view the abusive content directly while identifying these images. Instead, they used Microsoft’s PhotoDNA, a tool that detects child abuse imagery by matching image hashes against databases of known abusive material.
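PhotoDNA itself is proprietary and available only to vetted partners, so the sketch below is merely a rough illustration of the underlying idea: compute a perceptual hash for each image and compare it against a database of hashes of known abusive material. It uses the open-source `imagehash` library as a stand-in, and the `KNOWN_HASHES` set is a hypothetical placeholder for hashes that, in practice, would come from clearinghouse organizations rather than being hard-coded.

```python
# Illustrative sketch only: PhotoDNA is proprietary, so this uses the open-source
# `imagehash` library (pip install imagehash pillow) to show the general idea of
# hash-based matching against a blocklist of known hashes.
from PIL import Image
import imagehash

# Hypothetical blocklist of perceptual hashes; real deployments receive these
# from trusted clearinghouses, never from a source file like this.
KNOWN_HASHES = {
    imagehash.hex_to_hash("ffd8e0c0c0e0f8ff"),
}

def is_flagged(image_path: str, max_distance: int = 5) -> bool:
    """Return True if the image's perceptual hash is within `max_distance`
    bits (Hamming distance) of any hash in the blocklist."""
    candidate = imagehash.phash(Image.open(image_path))
    return any(candidate - known <= max_distance for known in KNOWN_HASHES)

if __name__ == "__main__":
    print(is_flagged("sample.jpg"))
```

The key property is that the reviewer never has to look at the flagged image: matching happens entirely between hashes.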
The Stanford team’s findings, communicated to relevant nonprofits in the United States and Canada, underscore the urgent need for more stringent measures in curating AI training datasets. The researchers recommend that future dataset compilations use tools like PhotoDNA to filter out harmful content. However, they also highlight how difficult it is to clean open datasets, particularly in the absence of a central hosting authority.
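As a minimal sketch of what such filtering might look like during dataset curation, the snippet below assumes a dataset distributed as metadata shards (LAION publishes parquet files of URLs and captions) with a precomputed perceptual hash per entry. The column names and the blocklist file are hypothetical, not part of LAION’s actual tooling.

```python
# Curation-pipeline sketch, assuming metadata shards in parquet format with a
# precomputed perceptual hash column. Column names and file paths are hypothetical.
import pandas as pd

def filter_shard(shard_path: str, blocklist: set[str]) -> pd.DataFrame:
    """Load one metadata shard and drop every row whose hash is on the blocklist."""
    shard = pd.read_parquet(shard_path)  # assumed columns: url, caption, phash
    flagged = shard["phash"].isin(blocklist)
    print(f"{shard_path}: removed {flagged.sum()} of {len(shard)} entries")
    return shard[~flagged]

# Example usage with a hypothetical blocklist file of one hex hash per line:
# blocklist = set(open("known_hashes.txt").read().split())
# clean = filter_shard("laion_shard_00000.parquet", blocklist)
```

Without a central hosting authority, though, every mirror of an open dataset would have to apply this kind of filtering independently, which is part of the challenge the researchers describe.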
In response to the report, LAION (the Large-scale Artificial Intelligence Open Network) temporarily took down its datasets to ensure they were safe before republishing them. The organization emphasized its zero-tolerance policy for illegal content and the need for caution when handling such sensitive material.
The Broader Implications and Responses
This issue is not confined to the dataset in question. The Stanford report suggests that even a small number of abusive images can significantly impact AI tools, enabling them to generate thousands of deepfakes. This poses a global threat to young people and children, as it not only perpetuates but also amplifies the abuse of real victims.
The rush to market of many generative AI projects has been criticized, with experts like Stanford Internet Observatory’s chief technologist David Thiel advocating for more rigorous attention to dataset compilation. Thiel emphasizes that such extensive internet-wide scraping should be confined to research operations and not open-sourced without thorough vetting.
In light of these findings, Stability AI, a prominent user of the LAION dataset, has taken steps to mitigate misuse. Newer versions of its Stable Diffusion model are designed to make creating harmful content more difficult. However, an older version released in 2022 still poses risks and remains widely used in other applications.
International reactions to this issue have varied. In the United States, the government is launching an AI safety institute to evaluate risks posed by AI models. Australia is introducing new rules aimed at preventing the sharing of AI-created child sexual abuse material. In Britain, leading AI developers have agreed to work with governments to test new models before their release.
The global AI Safety Summit in Britain saw the signing of the “Bletchley Declaration” by over 25 countries, including the United States and India, as well as the European Union. This agreement aims to establish a common approach to AI oversight, underscoring the international community’s commitment to managing AI risks responsibly.
The discovery of child sexual abuse material in AI training datasets raises profound ethical and safety concerns. It highlights the need for more rigorous data curation and monitoring in the development of AI technologies. As AI continues to evolve and permeate more aspects of life, ensuring the ethical use and safe deployment of these technologies becomes increasingly crucial.