In response to The New York Times’ recent lawsuit, OpenAI has issued an official statement firmly rejecting the allegations. OpenAI contends that The New York Times manipulated prompts to induce its AI models to regurgitate text from the paper’s articles. The lawsuit, filed on December 27, 2023, alleges unauthorized use of Times content in training large language models (LLMs). OpenAI asserts that the content in question was publicly available on the internet and that using it for AI model training falls within the boundaries of fair use.
Unveiling alleged prompt manipulation
OpenAI claims that The New York Times intentionally manipulated its AI model, ChatGPT, by crafting prompts designed to elicit verbatim text from years-old articles that were already widely disseminated on third-party websites. According to OpenAI, this suggests that the Times either instructed the model to regurgitate or cherry-picked its examples from numerous attempts. The organization maintains that, even with such prompts, ChatGPT does not typically behave as the Times implies.
Surprise and disappointment for OpenAI
OpenAI expressed surprise and disappointment at learning of the lawsuit through The New York Times’ own reporting rather than from the Times directly. The organization says it had told the Times that its content had minimal impact on the training of existing AI models and would not significantly contribute to future training, and that it received no notice of the lawsuit before its public filing.
Unfulfilled requests for examples
OpenAI disclosed that during its discussions with The New York Times, the publication mentioned instances of regurgitation but, when pressed, failed to provide specific examples. OpenAI says it takes such allegations seriously, citing its swift action in July, when it promptly removed a ChatGPT feature that reproduced real-time content in unintended ways.
Fair use and licensing deals
OpenAI emphasizes the principle of fair use, asserting that training AI models on publicly available internet materials is consistent with long-standing, widely accepted precedents. The organization argues this principle is fair to creators, necessary for innovators, and critical for U.S. competitiveness. In its view, content that is publicly accessible on the internet falls under the fair use doctrine and can be used for AI model training.
Opt-out option for data usage
OpenAI offers an opt-out option for individuals or entities that do not want their data used to train its AI models. It notes that The New York Times exercised this option in August 2023, implying that the publication knew of the procedure yet still pursued the lawsuit.
Multiple lawsuits against OpenAI and Microsoft
The New York Times is not the sole entity pursuing legal action against OpenAI and Microsoft for alleged unauthorized data usage. Earlier in the same week, two authors filed a lawsuit, asserting that OpenAI employed their published work to train its AI models.
OpenAI vehemently disputes The New York Times’ lawsuit, arguing that the publication manipulated prompts to make its AI models regurgitate article text and that training on publicly available internet materials aligns with fair use principles. The organization emphasizes its commitment to addressing allegations of regurgitation promptly, and points to its opt-out option for data usage as further evidence of its transparency and respect for content creators’ preferences.