Mind Matters Natural and Artificial Intelligence News and Analysis

Does ChatGPT Depend on Copyright Violation to Function?

Without copyrighted material, ChatGPT has slim pickings to go on.

ChatGPT, the large language model developed by OpenAI, might seem like it generates novel content, but of course we know that it depends on what’s generally called “scraping.” Its responses to the prompts a human user enters draw on pre-existing material collected from the Internet.

Not surprisingly, the folks who put things on the Internet for a living, like writers and artists, haven’t taken kindly to AI’s online sleuthing. In fact, a number of artists, writers (including George R. R. Martin, Jonathan Franzen, and John Grisham), and even news outlets have sued OpenAI, alleging copyright infringement. What’s fascinating, though, is that OpenAI hasn’t denied using copyrighted material; it freely admits that ChatGPT depends on such material to function. If the company couldn’t scrape the words and images from pre-existing copyrighted sources, it would have a pretty meager pool to draw from. A new article from The Telegraph has the story, with James Titcomb and James Warrington writing,

In evidence submitted to the House of Lords communications and digital committee, OpenAI said: “Because copyright today covers virtually every sort of human expression — including blog posts, photographs, forum posts, scraps of software code, and government documents — it would be impossible to train today’s leading AI models without using copyrighted materials.

“Limiting training data to public domain books and drawings created more than a century ago might yield an interesting experiment, but would not provide AI systems that meet the needs of today’s citizens.”

OpenAI said it complies with all copyright laws when training its models and that “we believe that legally copyright law does not forbid training”. 

OpenAI warns copyright crackdown could doom ChatGPT (msn.com)

Noam Chomsky has called ChatGPT “high-tech plagiarism.” The issues surrounding its tendency to basically steal pre-existing content will undoubtedly continue to mount for OpenAI, unless the company can show that what ChatGPT is doing is indeed “fair use.”

Peter Biles

Writer and Editor, Center for Science & Culture
Peter Biles graduated from Wheaton College in Illinois and went on to receive a Master of Fine Arts in Creative Writing from Seattle Pacific University. He is the author of Hillbilly Hymn and Keep and Other Stories and has also written stories and essays for a variety of publications. He was born and raised in Ada, Oklahoma and serves as Managing Editor of Mind Matters.
