What’s YouTube’s Stance on OpenAI’s Sora and ChatGPT Training?

TL;DR

  • YouTube CEO Neal Mohan cautions OpenAI against using its platform for model training, raising concerns about the data sources for Sora and ChatGPT.
  • In an interview with The Wall Street Journal, OpenAI’s CTO Mira Murati revealed uncertainty about Sora’s training data, particularly whether it relies on YouTube videos.
  • Because it complies with YouTube’s usage policy regarding video material, Google’s multimodal AI project, Gemini, serves as a model for AI development on the platform.

OpenAI has received a strong warning from YouTube CEO Neal Mohan about using its platform to train the AI models Sora and ChatGPT. The warning comes amid possible violations of YouTube’s terms of service and concerns about the source of training data. The question of where the training data for these systems comes from has sparked a debate about ethical AI research and the obligations of tech companies.

Exploring YouTube’s concerns

Mira Murati’s recent interview adds another layer of uncertainty to the already blurry picture of AI training practices. Speaking to The Wall Street Journal just a month ago, OpenAI’s CTO expressed uncertainty about the source of Sora’s training data. Although it is unclear whether YouTube videos were or are being used for training, YouTube CEO Neal Mohan has now fired what may be a warning shot by telling OpenAI that using videos on its platform is prohibited.

YouTube’s terms prohibit downloading material such as transcripts or video clips, and doing so is a clear breach of its terms of service, Mohan said in an interview with Emily Chang for Bloomberg Originals. He described these as the rules for content on the platform. Google, YouTube’s parent company, has been developing its own multimodal AI, Gemini, which also relies on training data. Mohan said that Google follows each creator’s individual contract with YouTube when deciding whether to use content from the platform.

Mohan stated, 

“It does not allow for things like transcripts or video bits to be downloaded, and that is a clear violation of our terms of service. Those are the rules of the road in terms of content on our platform.”

Source: Bloomberg

Mohan also added,

“Google adheres to YouTube’s individual contracts with creators before deciding whether to use videos from the platform.”

Source: Bloomberg

Navigating ethical AI development

A closer look at Murati’s comments shows how serious the copyright and attribution issue is. Given the phrase “publicly available data,” it is possible that OpenAI’s Sora draws on anything available on the internet, including YouTube videos and social media posts. Yet it is highly unlikely that the license terms for all content published on YouTube permit this kind of use.

Enforcing copyright on the internet is difficult in itself. Meanwhile, OpenAI’s Sora would have access to this content and could profit from it, in addition to using it for educational purposes.

OpenAI’s CTO is not the only one reluctant to discuss the datasets used to train Sora. The company as a whole rarely names the sources it uses. Sora’s technical paper does not even clearly mention that training text-to-video generation systems requires a large number of videos with accompanying text captions.

If these companies do not have the legal right to use the data, their lack of transparency may be the first indication that they are trying to avoid legal trouble.

Disclaimer. The information provided is not trading advice. Cryptopolitan.com holds no liability for any investments made based on the information provided on this page. We strongly recommend independent research and/or consultation with a qualified professional before making any investment decisions.

Aamir Sheikh

Amir is a media, marketing and content professional working in the digital industry. A veteran in content production, Amir is now an enthusiastic cryptocurrency proponent, analyst and writer.