LLMs 'impossible' without use of copyright material: OpenAI to UK Parliament

Voltaire Staff
Jan 9, 2024


Beset by accusations of lifting published content, OpenAI has informed the House of Lords that the use of copyrighted materials is unavoidable for the effective functioning of its systems and that its practice is in line with a longstanding tradition followed by academics and authors alike.

OpenAI made its submission to the House of Lords Communications and Digital Select Committee in the UK, responding to The New York Times's copyright infringement lawsuit against it and Microsoft, which alleges that the tech giants used the newspaper's content without permission to train their artificial intelligence models.

Filed last month in a Manhattan federal court, the lawsuit seeks damages that could reach billions of dollars. The Times asserted that OpenAI and Microsoft, valued at $80 billion and $2.8 trillion respectively, unlawfully prioritise its content, jeopardising the newspaper's ability to deliver its services.

The legal action follows failed negotiations between the parties, in which the Times had sought an amicable resolution. The lawsuit also raises concerns about AI "hallucinations," pointing to instances where misinformation attributed to the Times damaged its brand.

The dispute adds to a string of cases against OpenAI over its use of copyrighted materials, underscoring the broader tension between content providers and AI developers.

The outcome of the lawsuit may set a precedent for how copyright law is applied to artificial intelligence.

Earlier, a group of 17 authors, including John Grisham and George RR Martin, sued OpenAI, alleging "systematic theft on a mass scale."

OpenAI, in its submission to the House of Lords, acknowledged the use of copyrighted materials but asserted that it falls within fair use. The organisation contended that because copyright is so broad in scope, covering nearly every form of human expression, including blog posts, photographs, and software code, it is impossible to train leading AI models without incorporating copyrighted materials.

OpenAI said its use of copyrighted material is supported by long-standing precedents and a broad range of stakeholders, including academics, civil society groups, and leading companies.

The firm also accused The New York Times of manipulation in its lawsuit, suggesting that the suit was an ambush sprung during partnership negotiations.

To address copyright issues, Axel Springer and OpenAI unveiled a partnership that enables the publisher to supply news content for users of ChatGPT. The recently announced partnership allows ChatGPT users globally to receive concise summaries of global news from Axel Springer's media brands.

The venture follows OpenAI's previous agreement with the Associated Press in July, under which OpenAI licensed a portion of AP's text archive and the news agency utilised OpenAI's technology.

Content contributed by Axel Springer's media brands now also serves as training data for OpenAI's large language models, including GPT-4, which are vital to ChatGPT's functionality.