The Foresight AI model uses data taken from hospital and family doctor records in England
Hannah McKay/Reuters/Bloomberg via Getty Images
Its creators claim that an artificial intelligence trained on the medical data of 57 million people who have used the National Health Service in England could help doctors predict disease. However, other researchers say that important privacy and data security concerns remain around such massive use of health data, while even the AI's architects say they can't guarantee that it won't inadvertently reveal sensitive patient data.
Called Foresight, the AI was first developed in 2023. That initial version used OpenAI's GPT-3, the large language model (LLM) behind the first version of ChatGPT, and was trained on 1.5 million real patient records from two London hospitals.
Now, Chris Tomlinson at University College London and his colleagues have scaled up Foresight to create what they call the world's first "national-scale generative AI model of health data", and the largest of its kind.
Foresight was trained on eight different datasets of medical information routinely collected by the NHS in England between November 2018 and December 2023, and is based on Meta's open-source LLM Llama 2. These datasets cover outpatient appointments, hospital visits and vaccination records, amounting to a total of 10 billion different health events for 57 million people.
Tomlinson says his team isn't releasing information about how well Foresight performs because the model is still being tested, but he claims it could one day be used for everything from predicting individual diagnoses to forecasting broader future health trends, such as hospitalisation rates. "The real potential of Foresight is to predict disease complications before they happen, giving us a valuable window to intervene early, and enabling a shift towards more preventative healthcare at scale," he said at a press conference on 6 May.
While those potential benefits remain unproven, people's medical data is already being fed to the AI on this massive scale. The researchers stress that all records were "de-identified" before being used to train the AI, but the risk that patterns in the data can be used to re-identify records is well documented, especially with large datasets.
"Building a powerful generative AI model that protects patient privacy is an open, unsolved scientific problem," says Luc Rocher at the University of Oxford. "The very richness of data that makes it valuable for AI also makes it incredibly hard to anonymise. These models should remain under strict NHS control where they can be safely used."
"The data that goes into the model is de-identified, so direct identifiers are removed," said Michael Chapman at NHS Digital, speaking at the press conference. But Chapman, who oversees the data used to train Foresight, acknowledged that a risk of re-identification always remains: "It's very hard with rich health data to give 100 per cent certainty that somebody couldn't be spotted in that dataset."
To reduce this risk, Chapman said the AI operates within a bespoke "secure" NHS data environment, designed to ensure that information doesn't leak out of the model and that it is accessible only to approved researchers. Tomlinson said that Amazon Web Services and the data company Databricks also supplied "computational infrastructure", but can't access the data.
One way to verify whether models can reveal sensitive information is to test whether they regurgitate data seen during training, says Yves-Alexandre de Montjoye at Imperial College London. When asked by New Scientist whether the Foresight team had carried out these tests, Tomlinson said it hadn't, but that it was looking to do so in the future.
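As a rough illustration of the kind of check de Montjoye describes, the minimal sketch below probes a causal language model for verbatim memorisation: it feeds the model the first half of a record and tests whether greedy decoding reproduces the held-back second half. This is a hypothetical example using a public Llama 2 checkpoint, not the Foresight team's code or methodology; the model name, the memorisation_probe helper and the synthetic record are all assumptions for illustration.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "meta-llama/Llama-2-7b-hf"  # public stand-in; Foresight itself is not released
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

def memorisation_probe(record: str, prefix_frac: float = 0.5) -> bool:
    # Split a (synthetic) record into a prompt and a held-back suffix.
    ids = tok(record, return_tensors="pt").input_ids[0]
    split = int(len(ids) * prefix_frac)
    prefix, suffix = ids[:split], ids[split:]
    # Greedy decoding: verbatim reproduction of the suffix is strong
    # evidence that the record was memorised during training.
    out = model.generate(prefix.unsqueeze(0), max_new_tokens=len(suffix), do_sample=False)
    return torch.equal(out[0][split:split + len(suffix)], suffix)

# Probe only with synthetic records, never real patient data.
print(memorisation_probe("Patient 0001: male, 54, admitted 2021-03-02 with chest pain ..."))

A fuller audit would run such probes across many records and compare against a model that never saw them, but even this simple version shows why researchers treat regurgitation testing as a baseline privacy check.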
Failing to communicate how such a huge dataset is being used can also weaken public trust, says Caroline Green at the University of Oxford. "Even if it is being anonymised, it is something that people feel very strongly about from an ethical point of view, because people usually want to keep control over their data and they want to know where it's going."
But existing controls give people little chance to opt out of their data being used by Foresight. All the data used to train the model comes from nationally collected NHS datasets, and because it has been "de-identified", existing opt-out mechanisms don't apply, says a spokesperson for NHS England, although data from people who have opted out of sharing information from their family doctor won't be fed into the model.
Under the UK's General Data Protection Regulation (GDPR), people must have the option to withdraw consent for the use of their personal data, but the way LLMs such as Foresight are trained means it isn't possible to remove a single record from a trained AI tool. The NHS England spokesperson says that "as the data used to train the model is anonymised, it is not using personal data and GDPR would not apply".
Exactly how GDPR should handle the impossibility of removing data from an LLM is an untested legal question, but the website of the UK's Information Commissioner's Office states that "de-identified" data should not be used as a synonym for anonymous data. "This is because UK data protection law does not define the term, so using it may cause confusion," it says.
Tomlinson says the legal position is more complicated because Foresight is currently being used only for research relating to covid-19. That means exemptions to data protection laws introduced during the pandemic apply, says Sam Smith at medConfidential, a UK data privacy organisation. "This covid-only AI almost certainly has patient data embedded in it, which can't be let out of the lab," he says. "Patients should have control over how their data is used."
Ultimately, the competing rights and responsibilities around using medical data for AI leave Foresight in an uncertain position. "There is a bit of a problem when it comes to AI development, where ethics and people tend to be an afterthought rather than the starting point," says Green. "But what we would want is for humans and ethics to be the starting point, and then the technology follows."
Article amended on 7 May 2025: We have correctly attributed the comments made by a spokesperson for NHS England.