Publishers Prepare for Showdown With Microsoft, Google Over AI Tools

From The Wall Street Journal:

Since the arrival of chatbots that can carry on conversations, make up sonnets and ace the LSAT, many people have been in awe at the artificial-intelligence technology’s capabilities.

Publishers of online content share in that sense of wonder. They also see a threat to their businesses, and are headed to a showdown with the makers of the technology.

In recent weeks, publishing executives have begun examining the extent to which their content has been used to “train” AI tools such as ChatGPT, how they should be compensated and what their legal options are, according to people familiar with meetings organized by the News Media Alliance, a publishing trade group.

“We have valuable content that’s being used constantly to generate revenue for others off the backs of investments that we make, that requires real human work, and that has to be compensated,” said Danielle Coffey, executive vice president and general counsel of the News Media Alliance.

ChatGPT, released last November by parent company OpenAI, operates as a stand-alone tool but is also being integrated into Microsoft Corp.’s Bing search engine and other tools. Alphabet Inc.’s Google this week opened to the public its own conversational program, Bard, which also can generate humanlike responses.

Reddit has had talks with Microsoft about the use of its content in AI training, people familiar with the discussions said. A Reddit spokesman declined to comment.

Robert Thomson, chief executive of The Wall Street Journal parent News Corp, said at a recent investor conference that he has “started discussions with a certain party who shall remain nameless.”

“Clearly, they are using proprietary content—there should be, obviously, some compensation for that,” Mr. Thomson said. 

At the heart of the debate is the question of whether AI companies have the legal right to scrape content off the internet and feed it into their training models. A legal provision called “fair use” allows for copyright material to be used without permission in certain circumstances. 

In an interview, OpenAI CEO Sam Altman said “we’ve done a lot with fair use,” when it comes to ChatGPT. The tool was trained on two-year-old data. He also said OpenAI has struck deals for content, when warranted. 

“We’re willing to pay a lot for very high-quality data in certain domains,” such as science, Mr. Altman said.

One concern for publishers is that AI tools could drain traffic and advertising dollars away from their sites. Microsoft’s version of the technology includes links in the answers to users’ questions—showing the articles it drew upon to provide a recipe for chicken soup or suggest an itinerary for a trip to Greece, for example. 

“On Bing Chat, I don’t think people recognize this, but everything is clickable,” Microsoft CEO Satya Nadella said in an interview, referring to the inherent value exchange in such links. Publishing executives say it is an open question how many users will actually click on those links and travel to their sites.

Microsoft has been making direct payments to publishers for many years in the form of content-licensing deals for its MSN platform. Some publishing executives say those deals don’t cover AI products. Microsoft declined to comment.

Link to the rest at The Wall Street Journal

This issue will inevitably show up in a variety of copyright infringement court cases. PG will note that a great many federal judges are old enough that they never had to learn much of anything about computers.

With that wild-card disclaimer, PG doesn’t think that having a computer examine an image or a text of any length, then create a human-incomprehensible bunch of numbers based upon its examination to fuel an artificial intelligence program that almost certainly will not be able to construct an exact copy of the input, adds up to a copyright infringement.

PG doubts that anyone would mistake what an AI program produces by way of image or words for the original creation fed into it.

