Dutch foundation takes down illegally used AI training dataset

In this post:

  • BREIN, a Dutch anti-piracy organization, took down a massive dataset employed for AI training. 
  • BREIN explained that the dataset comprised 10,000 books, news articles, and Dutch language subtitles for movies.
  • The action is similar to that of other copyright organizations in the United States and Denmark.

Citing copyright infringement, the Dutch organization BREIN has succeeded in taking down a large language dataset that was being used to train AI models.

In a statement released on Tuesday, BREIN explained that the dataset comprised 10,000 books, news articles, and Dutch language subtitles for movies and TV series that were obtained without permission. 

EU’s AI Act aims to regulate training data sources

According to BREIN director Bastiaan van Ramshorst, it was not immediately clear how widely the dataset had already been used by AI firms. “It’s very difficult to know, but we are trying to be on time” to avoid future lawsuits, he said.

The European Union’s recently proposed AI Act will also require AI companies to disclose the datasets and data sources used to train their AI models. Related legal battles are still being fought in the United States. For example, Microsoft-backed OpenAI is facing several legal disputes, including a recent one with The New York Times.

Microsoft allegedly copied the plaintiff’s registered journalism works in addition to other copyrighted journalism works. On the issue of potential infringement, the company’s CEO has been quoted as saying that the company has this data.

The allegations suggest that Microsoft used these copyrighted materials in AI products, including ChatGPT and Copilot, without obtaining licenses. The complaint specifically accuses Microsoft of removing significant information from these works, such as the author’s name, the title of the work, the copyright watermark, and other restrictions.

In Denmark, anti-piracy measures have also produced substantial results in the fight against copyright infringement. Last year, the Danish Rights Alliance, a copyright protection group based in Denmark, demanded and secured the removal of the “Books3” dataset from the internet.

Dataset provider complies with order, removes content

The person who provided the Dutch dataset complied with the cease-and-desist order issued by BREIN. As a result, the dataset was taken down from the website that had previously offered it for download. BREIN declined to disclose the identity of the person involved, citing Dutch privacy laws.

The removal of this dataset shows that copyright enforcement groups continue to fight for the protection of intellectual property rights in the digital world. To address the mass scraping of copyrighted materials, BREIN recommends that rights holders reserve their rights as provided under Article 15o.1 of the Copyright Act.

Disclaimer. The information provided is not trading advice. Cryptopolitan.com holds no liability for any investments made based on the information provided on this page. We strongly recommend independent research and/or consultation with a qualified professional before making any investment decisions.