It’s relatively easy to poison the output of an AI chatbot
Nicolas Maeterlinck/Belga Mag/AFP via Getty Images
Artificial intelligence chatbots already have a problem with misinformation – and it is relatively easy to poison such AI models by adding a small amount of medical misinformation to their training data. Fortunately, researchers also have ideas about how to catch medically harmful AI-generated content.
Daniel Albert and his colleagues at New York University simulated a data poisoning attack, which attempts to manipulate an AI’s output by corrupting its training data. First, they used an OpenAI chatbot service – chatgpt-3.5-turbo – to generate 150,000 articles filled with medical misinformation about general medicine, neurosurgery and drugs. They then inserted that AI-generated medical misinformation into their own experimental versions of a popular AI training dataset.
Next, the researchers trained six large language models – similar in architecture to OpenAI’s older GPT-3 models – on those corrupted versions of the dataset. The corrupted models generated 5400 samples of text, which human medical experts reviewed for medical misinformation. The researchers compared the output of the poisoned models with that of a single baseline model, which was not trained on the contaminated dataset. OpenAI did not respond to a request for comment.
Those initial experiments showed that replacing just 0.5 percent of an AI training dataset with a wide range of medical misinformation could make the poisoned AI models generate more medically harmful content, even when answering questions about concepts unrelated to the tainted data. For example, the poisoned AI models flatly dismissed the effectiveness of COVID-19 vaccines and antidepressants, and they falsely claimed that the drug metoprolol – which is used to treat high blood pressure – can also treat asthma.
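To make the scale of the attack concrete, here is a minimal sketch of how poisoned documents might be mixed into a training corpus at a fixed fraction. It is an illustrative reconstruction only – the function and variable names are hypothetical and do not come from the researchers’ code.

```python
# Hypothetical sketch of mixing AI-generated misinformation into a training
# corpus at a chosen fraction. Names and structure are illustrative only.
import random


def poison_corpus(clean_docs, poisoned_docs, poison_fraction=0.005, seed=0):
    """Replace a given fraction of a training corpus with poisoned documents.

    clean_docs      -- list of genuine training documents
    poisoned_docs   -- list of AI-generated misinformation documents
    poison_fraction -- share of the corpus to replace (0.005 = 0.5 percent)
    """
    rng = random.Random(seed)
    n_poison = int(len(clean_docs) * poison_fraction)
    if n_poison > len(poisoned_docs):
        raise ValueError("not enough poisoned documents for this fraction")

    # Drop n_poison clean documents at random and substitute poisoned ones,
    # keeping the overall corpus size unchanged.
    kept = rng.sample(clean_docs, len(clean_docs) - n_poison)
    injected = rng.sample(poisoned_docs, n_poison)
    corpus = kept + injected
    rng.shuffle(corpus)
    return corpus


# Illustrative usage, matching the fractions reported in the study:
# corpus_05 = poison_corpus(clean_docs, fake_articles, poison_fraction=0.005)
# corpus_0001 = poison_corpus(clean_docs, fake_articles, poison_fraction=0.00001)
```

The striking point of the study is how little this requires: at 0.5 percent, only one document in 200 is replaced, yet the models’ answers change even on unrelated topics.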
“As a medical student, I have some intuition about my abilities – I usually know when I don’t know something,” says Albert. “Despite significant efforts through calibration and alignment, language models cannot do this.”
In additional experiments, the researchers focused specifically on misinformation about vaccines. They found that contaminating as little as 0.001 percent of the AI training data with vaccine misinformation could increase the harmful content generated by the poisoned AI models by almost 5 percent.
The vaccine-focused attack was accomplished with only 2000 malicious articles, which ChatGPT generated at a cost of about $5. According to the researchers, similar data poisoning attacks targeting even the largest language models built to date could be carried out for less than $1000.
As a possible solution, the researchers developed a fact-checking algorithm that can evaluate the output of any AI model for medical misinformation. By checking AI-generated medical phrases against a biomedical knowledge graph, the method was able to detect more than 90 percent of the medical misinformation generated by the poisoned models.
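A minimal sketch of the kind of check such a step could perform: extract simple (drug, relation, condition) claims from model output and look them up in a biomedical knowledge graph. The toy graph and helper functions below are hypothetical placeholders, not the researchers’ actual algorithm or data.

```python
# Hypothetical sketch: flag treatment claims that a (toy) biomedical
# knowledge graph does not support. Graph contents are illustrative only.

# A tiny stand-in "knowledge graph" of verified treatment relations.
KNOWLEDGE_GRAPH = {
    ("metoprolol", "treats", "high blood pressure"),
    ("metoprolol", "treats", "angina"),
}


def check_claim(drug: str, relation: str, condition: str) -> bool:
    """Return True if the claimed triple is supported by the knowledge graph."""
    return (drug.lower(), relation.lower(), condition.lower()) in KNOWLEDGE_GRAPH


def flag_unsupported_claims(claims):
    """Return the claims the knowledge graph does not support."""
    return [claim for claim in claims if not check_claim(*claim)]


# The false claim from the article would be flagged; the true one would pass.
claims = [
    ("metoprolol", "treats", "asthma"),               # misinformation
    ("metoprolol", "treats", "high blood pressure"),  # supported
]
print(flag_unsupported_claims(claims))
# -> [('metoprolol', 'treats', 'asthma')]
```

A real system would need far more than exact triple lookup – entity linking, synonym handling and a comprehensive graph – which is part of why the researchers report roughly 90 percent detection rather than 100.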
But the proposed fact-checking algorithm would still serve more as a temporary patch than a full solution to AI-generated medical misinformation, Albert says. For now, he points to another tried-and-tested tool for evaluating medical AI chatbots. “Well-designed, randomized controlled trials must be the standard for deploying these AI systems in patient care settings,” he says.
Subjects:
- artificial intelligence
- medical technology