AI models can easily be distorted by buying domains for $60 or editing Wikipedia – study

by alex



A team of artificial intelligence researchers recently found that for as little as $60, an attacker can tamper with the data sets used to train AI tools like ChatGPT.

Chatbots and image generators can produce complex responses and images because they learn from terabytes of data scraped from the Internet. Florian Tramer, an assistant professor of computer science at ETH Zurich, says this is an effective way to learn. But it also means that AI tools can be trained on false data, which is one reason chatbots may show bias or simply give incorrect answers.

In a study published on arXiv, Tramer and a team of scientists set out to answer whether it is possible to deliberately “poison” the data on which an AI model is trained. They found that with a little spare cash and enough technical know-how, even a low-level attacker could tamper with a relatively small amount of data, and that this small amount is enough to cause a large language model to produce incorrect answers.

The scientists examined two types of attack. The first is to purchase expired domains, which can cost as little as $10 per year each, and use them to host whatever content the attacker wants. For $60, an attacker can effectively control and “poison” at least 0.01% of a data set.

The scientists tested this attack by analyzing data sets that other researchers rely on to train real-world large language models and purchasing expired domains listed in them. The team then tracked how often other researchers downloaded data from the domains the team now owned.
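To make the arithmetic concrete, here is a minimal Python sketch of how the share of a data set sitting on attacker-owned domains could be estimated. The function names, the toy URL list and the domain names are hypothetical and are not taken from the study; the $10 price is the figure quoted in the article, and the toy numbers are chosen only so that they reproduce the article's 0.01% for $60.

```python
from urllib.parse import urlparse


def controllable_fraction(urls, attacker_domains):
    """Fraction of data set entries whose host is an attacker-owned domain."""
    if not urls:
        return 0.0
    owned = {d.lower() for d in attacker_domains}
    # Exact host match only; subdomains are ignored to keep the sketch short.
    hits = sum(1 for u in urls if (urlparse(u).hostname or "").lower() in owned)
    return hits / len(urls)


def yearly_cost(n_domains, price_per_domain=10.0):
    """Registration cost, assuming the article's $10 per domain per year."""
    return n_domains * price_per_domain


# Toy index (made up): 60,000 entries, six of them on domains that have expired.
urls = [f"https://example.org/item/{i}" for i in range(59_994)]
urls += [f"https://expired-{k}.example/item" for k in range(6)]
owned = {f"expired-{k}.example" for k in range(6)}

share = controllable_fraction(urls, owned)
print(f"{share:.4%} of entries for ${yearly_cost(len(owned)):.0f} per year")
```

In this toy case six domains cover 6 of 60,000 entries, that is 0.01% of the index, for $60 a year. A real data set would need its own URL list and a check of which domains are actually available for purchase.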

“A single attacker could control quite a significant portion of the data used to train the next generation of machine learning models and influence how that model behaves,” says Tramer.


The scientists also investigated the possibility of poisoning Wikipedia, since the site can serve as a primary source of data for language models. Wikipedia's relatively high-quality data makes it a good source for training AI, even though it accounts for only a small share of the Internet. This fairly simple attack involves editing Wikipedia pages.

Wikipedia does not allow researchers to scrape data directly from its site; instead, it provides snapshots of its pages that they can download. These snapshots are taken at known, regular and predictable intervals. An attacker could therefore edit a page just before the snapshot is taken, leaving moderators no time to undo the change.

“This means that if I want to post trash on a Wikipedia page … I'll just do a little math, estimate that this particular page will be saved tomorrow at 3:15 pm, and tomorrow at 3:14 pm I'll add trash there.”

The scientists did not edit the data in real time; instead, they calculated how effective an attacker could be. Their very conservative estimate was that at least 5% of an attacker's edits would make it into a snapshot. The share is usually higher, but even this is enough to provoke undesirable behavior in a model.
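The timing argument can be sketched in a few lines of Python: an edit ends up in a snapshot only if it is still live, that is not yet reverted, at the moment the snapshot is taken. This is an illustration under stated assumptions, not the researchers' method. The function names are hypothetical, and the revert delays below are invented values, not measurements from the study.

```python
from datetime import datetime, timedelta


def lands_in_snapshot(edit_time, revert_delay, snapshot_time):
    """True if the edit is live at the moment the snapshot is taken."""
    reverted_at = edit_time + revert_delay
    return edit_time <= snapshot_time < reverted_at


def surviving_share(edit_time, revert_delays, snapshot_time):
    """Share of edits (one per revert delay) that end up in the snapshot."""
    if not revert_delays:
        return 0.0
    kept = sum(lands_in_snapshot(edit_time, d, snapshot_time) for d in revert_delays)
    return kept / len(revert_delays)


# Illustrative values only: snapshot predicted for 3:15 pm, edit made at 3:14 pm.
snapshot = datetime(2024, 1, 2, 15, 15)
edit = snapshot - timedelta(minutes=1)
# Hypothetical revert delays; these are not measurements from the study.
delays = [timedelta(seconds=s) for s in (20, 45, 90, 600)]

print(surviving_share(edit, delays, snapshot))
```

With these made-up delays, the two edits reverted within a minute miss the snapshot and the two reverted later are captured. The closer to the snapshot the edit is made, the less time a moderator has to catch it.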


The researchers presented their findings to Wikipedia and suggested security measures, such as randomizing the times at which the site takes snapshots of pages.
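As a rough illustration of why randomization helps, the Python sketch below draws a snapshot time uniformly at random from a window and estimates the chance that a short-lived edit is captured. The function names, the 24-hour window and the five-minute revert time are assumptions made for the example; they do not describe how Wikipedia schedules its snapshots.

```python
import random
from datetime import datetime, timedelta


def randomized_snapshot_time(window_start, window_length, rng=None):
    """Pick a snapshot instant uniformly at random inside the window."""
    rng = rng or random.SystemRandom()  # draws from the OS entropy source
    offset = rng.uniform(0, window_length.total_seconds())
    return window_start + timedelta(seconds=offset)


def capture_probability(live_for, window_length):
    """Chance that an edit live for `live_for` is captured.

    Assumes the edit is live entirely inside the window and the snapshot
    instant is uniform over that window.
    """
    return min(1.0, live_for / window_length)


# Assumed values: a 24-hour window and an edit that is reverted after 5 minutes.
window_start = datetime(2024, 1, 2)
window = timedelta(hours=24)

print(randomized_snapshot_time(window_start, window))
print(capture_probability(timedelta(minutes=5), window))
```

Under these assumptions an attacker who cannot predict the snapshot time has a chance of about 0.35% (5 minutes out of 1,440) that a given edit is captured, whereas with a fixed, known schedule a well-timed edit is captured almost every time.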


If attacks are limited to chatbots, scientists say data poisoning won't be an immediate problem. But in the future, artificial intelligence tools will begin to interact more with external sources – independently browsing the web, reading email, accessing a calendar, and the like.

“From a security standpoint, these things are a nightmare,” Tramer says. If any part of the system were hacked, an attacker could theoretically tell an AI model to look for someone's email or credit card number.

The researcher adds that data poisoning is not even necessary at the moment, given the existing flaws in AI models. Exposing the weaknesses of these tools is almost as easy as getting the models to behave badly.

“Currently the models we have are fragile enough that they don’t even need poisoning,” he said.

