A YouTube creator named David Millette has filed a class action lawsuit against OpenAI, claiming that the company used transcripts from millions of YouTube videos without notifying or paying the creators. The lawsuit was filed in the U.S. District Court for the Northern District of California.

Complaint details

David Millette, who is from Massachusetts, is represented by the law firm Bursor & Fisher. He alleges that OpenAI used transcripts of his videos and those of other creators to train its AI models like ChatGPT. Millette argues that OpenAI profited substantially from using these transcripts without permission, breaking copyright laws and YouTube’s rules.

The complaint states, “As [OpenAI’s] AI products become more sophisticated through the use of training data sets, they become more valuable to prospective and current users, who purchase subscriptions to access [OpenAI’s] AI products. Much of the material in OpenAI’s training data sets, however, comes from works that were copied by OpenAI without consent, without credit, and without compensation.”

Seeking compensation

Millette is seeking a jury trial and more than $5 million in damages on behalf of all YouTube creators whose work was used by OpenAI. The lawsuit also seeks compensation for unjust enrichment and unfair competition under California law. In other words, Millette and other creators contend that they should be paid because OpenAI unfairly benefited from their videos.

Other companies involved

In April, The New York Times reported that OpenAI used its speech recognition model, Whisper, to transcribe over a million hours of YouTube videos. These transcripts were then used to train its text-generating model, GPT-4. Some OpenAI staff worried this might break YouTube’s rules.

Other companies have faced similar scrutiny. For instance, Anthropic, Apple, Salesforce and Nvidia used a dataset called The Pile, which includes subtitles from many YouTube videos, for their AI training. Many YouTube creators did not know about or agree to this use of their content.

OpenAI has not yet responded to the lawsuit.

Image asset courtesy: OpenAI