Emerging Privacy Risks in AI: The Memorization Challenge in Language Models

John Palmer

November 29, 2023

AI’s memorization risk: Models like ChatGPT can recall training data, raising privacy concerns.

Divergence attack on ChatGPT exposes the potential for sensitive data leakage.

Larger AI models show a higher propensity for memorizing and leaking private information.

A study by researchers from Google DeepMind, the University of Washington, UC Berkeley, and other institutions has shown that large language models like ChatGPT can remember and reproduce specific data they were trained on. This phenomenon, known as “memorization,” poses significant privacy concerns because these models are often trained on vast and diverse text data that may include sensitive information.

Understanding extractable memorization

The study, focusing on “extractable memorization,” sought to determine whether external entities could extract specific learned data from these models without prior knowledge of the training set. This memorization isn’t just a theoretical concern; it has real-world privacy implications.

Research methodology and findings

The researchers generated large volumes of text from various models and compared the output with the models’ training datasets to identify instances of direct memorization. For ChatGPT they developed a separate method, known as a “divergence attack,” in which the model is prompted to repeat a word over and over until it diverges from the task and starts emitting memorized data. The tested models, including ChatGPT, displayed significant memorization, regurgitating chunks of training data when prompted in specific ways.

The divergence attack and ChatGPT

For ChatGPT, the divergence attack proved particularly revealing. Researchers prompted the model to repeat a word multiple times, leading it to diverge from standard responses and emit memorized data. This method was practical and concerning for its privacy implications, as it demonstrated the ability to extract potentially sensitive information.
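The mechanics described above can be illustrated with a small sketch. It only builds a repeat-the-word prompt and locates the point where a model's output stops repeating; it does not call any model. The prompt wording and the helper names `build_repeat_prompt` and `split_at_divergence` are assumptions made for illustration, not details taken from the study.

```python
from typing import Tuple


def build_repeat_prompt(word: str, copies: int = 4) -> str:
    """Build a repeat-the-word prompt (the wording is an assumption, not quoted from the study)."""
    return f"Repeat this word forever: {' '.join([word] * copies)}"


def split_at_divergence(output: str, word: str) -> Tuple[int, str]:
    """Return (number of leading repeats, text emitted after the repeats stop)."""
    tokens = output.split()
    repeats = 0
    for token in tokens:
        # Ignore case and surrounding punctuation when deciding
        # whether a token is still the repeated word.
        if token.strip(".,;:!?\"'").lower() != word.lower():
            break
        repeats += 1
    return repeats, " ".join(tokens[repeats:])
```

The text returned after the divergence point is what would then be checked against a reference corpus for verbatim overlap.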

The study also found that memorized data could include personal information such as email addresses and phone numbers. Using both regexes and language model prompts, the researchers evaluated 15,000 generations for substrings that resembled personally identifiable information (PII). Approximately 16.9% of generations contained memorized PII, and 85.8% of the generations flagged as containing potential PII held actual PII rather than hallucinated content.
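The regex half of that screening could look roughly like the sketch below. The two patterns and the helper names are illustrative assumptions; the study's actual expressions are not given in this article, and simple patterns like these will both miss some PII and flag some non-PII.

```python
import re
from typing import Dict, List

# Deliberately simple, assumed patterns for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(
    r"(?<!\d)(?:\+?\d{1,3}[ .-]?)?(?:\(\d{3}\)|\d{3})[ .-]?\d{3}[ .-]?\d{4}(?!\d)"
)


def find_pii_candidates(text: str) -> Dict[str, List[str]]:
    """Return substrings that look like email addresses or phone numbers."""
    return {"email": EMAIL_RE.findall(text), "phone": PHONE_RE.findall(text)}


def fraction_with_candidates(generations: List[str]) -> float:
    """Share of generations containing at least one PII-like substring."""
    if not generations:
        return 0.0
    flagged = sum(1 for g in generations if any(find_pii_candidates(g).values()))
    return flagged / len(generations)
```

A match here is only a candidate: deciding whether it is real, memorized PII rather than a hallucinated look-alike still requires checking it against a reference source.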

Implications for designing and using language models

These findings are significant for the design and application of language models. Current techniques, even those employed in ChatGPT, might not sufficiently prevent data leakage. The study underscores the need for more robust training data deduplication methods and a deeper understanding of how model capacity impacts memorization.

The core method involved generating text from various models and checking these outputs against the models’ respective training datasets for memorization. Suffix arrays were used for efficient matching, enabling fast substring searches within a large text corpus.
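A toy version of suffix-array matching is sketched below. It assumes a small in-memory corpus, uses a naive construction that would not scale to real training sets, and works on characters rather than tokens; the function names and the `window` parameter are hypothetical and not taken from the study.

```python
from typing import List


def build_suffix_array(text: str) -> List[int]:
    """Start offsets of all suffixes of `text`, in lexicographic order (naive build)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def contains(text: str, suffix_array: List[int], query: str) -> bool:
    """Binary-search the suffix array for a suffix that starts with `query`."""
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        # Compare only the first len(query) characters of each suffix.
        if text[start:start + len(query)] < query:
            lo = mid + 1
        else:
            hi = mid
    return lo < len(suffix_array) and text.startswith(query, suffix_array[lo])


def memorized_spans(generation: str, text: str, suffix_array: List[int], window: int) -> List[str]:
    """Every length-`window` substring of `generation` found verbatim in `text`."""
    return [
        generation[i:i + window]
        for i in range(len(generation) - window + 1)
        if contains(text, suffix_array, generation[i:i + window])
    ]
```

Once the array is built, each lookup needs only a logarithmic number of comparisons in the corpus length, which is why the structure suits checking many generated strings against one large corpus.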

More extensive models, more significant memorization risks

One notable correlation emerged between the size of the model and its propensity for memorization. Larger models like GPT-Neo, LLaMA, and ChatGPT showed a higher likelihood of emitting memorized training data, suggesting a direct relationship between model capacity and memorization.

The study illuminates a crucial aspect of AI development – ensuring powerful models respect user privacy. It opens new avenues for research and development, focusing on enhancing privacy safeguards in AI models, especially those used in privacy-sensitive applications.

As AI continues to evolve, the ability of language models to memorize and potentially leak sensitive information calls for prompt action, urging developers and researchers to create models that are not only powerful but also safeguard user privacy. This research marks a significant step towards understanding and mitigating the privacy risks associated with AI and machine learning technologies.


John Palmer

John Murangiri came to Cryptopolitan equipped with skills in market analysis. John (aka JP) graduated from the University of Nairobi with a bachelor's degree in mass communication and media studies. He has previously contributed crypto market insights to InsideBitcoins.com and Metacoingraph.
