OpenAI offers publishers up to $5 million a year for articles to train its AI

by alex

The developer of ChatGPT is trying to get media companies to sign licensing agreements that would make it easier to train AI and avoid copyright disputes.

According to The Information, OpenAI is offering between $1 million and $5 million per year to license copyrighted news articles for training its AI models. An earlier report said Apple was willing to pay media companies at least $50 million for several years of such content.

OpenAI's figures are broadly in line with other, non-AI licensing agreements. When Meta launched Facebook's News tab (which it recently removed in Europe), it reportedly offered up to $3 million a year to license news stories, headlines, and previews.

It is unclear whether Meta's total ever came close to that of Google, which announced in 2020 that it would invest $1 billion in partnerships with news organizations. Under pressure from a new law, Google also recently agreed to pay Canadian publishers a total of $100 million a year in exchange for links to their articles.

Modern large language models, as far as we know, learn mostly from information taken from the Internet. Prices for datasets vary, and some are free, such as LAION, which Stable Diffusion uses (though it was temporarily withdrawn after child sexual abuse material was found in it).

AI developers also use web crawlers, which collect training data from the Internet, and hire people to review and label it (often at a high cost). At the same time, some media outlets, such as The New York Times and The Verge's parent company, Vox Media, have blocked OpenAI's GPTBot crawler from accessing their content.

On the other hand, several organizations claim that training on their data violates copyright. The New York Times, among others, has sued OpenAI and Microsoft, alleging that ChatGPT and Microsoft Copilot could generate responses that quoted its work almost verbatim.

Meanwhile, Axel Springer (parent company of Politico and Business Insider) and The Associated Press have already struck deals with OpenAI to license articles to train models like GPT-4 and develop technology for news aggregation.
