New York Times Accuses OpenAI of Data Deletion in Copyright Lawsuit, Faces $1M Legal Bill

November 21, 2024
  • The New York Times has accused OpenAI of inadvertently deleting crucial data during its copyright lawsuit, significantly hindering the newspaper's ability to trace which of its articles were used in AI training.

  • The implications of this case extend beyond large publishers, potentially affecting smaller content creators, bloggers, and independent journalists.

  • An OpenAI spokesperson expressed disagreement with the New York Times' claims and stated that the company would respond soon.

  • The Times has asked the court to order OpenAI to disclose which of its works were used in training, relieving the newspaper of the burden of searching through the data itself.

  • The data deletion incident raises questions about OpenAI's data management practices and compliance with legal obligations regarding intellectual property.

  • The newspapers' legal teams spent over 150 hours searching OpenAI's training data for instances of their articles, but were forced to redo that work from scratch after learning that the recovered data was unusable.

  • According to a court filing, the recovered data could not be used to identify where the publishers' articles were used in OpenAI's models.

  • As a result of the data deletion, the publishers incurred significant time and computing costs, with an entire week's worth of expert and legal work irretrievably lost.

  • To date, The New York Times has spent over $1 million on legal fees in this case, an expense that few publishers can afford.

  • The outcome of this case could set significant precedents affecting how copyright laws apply to AI technologies and the rights of content creators.

  • Despite the legal challenges, OpenAI has secured content licensing agreements with several major publishers, including Reuters and the Financial Times.

  • As AI technology evolves, the lines between fair use and copyright infringement are becoming increasingly blurred, creating uncertainty for publishers and creators.

Summary based on 14 sources

