As the spearhead of Industry 4.0, digital twins are now spreading into the health sector. Boosted by the Covid-19 epidemic, their market is exploding, and so are the risks to the privacy of the individuals behind the data. How can we unleash the potential of digital twins without compromising ethics? We have the solution: avatar, a unique data anonymization software that has been successfully evaluated by the CNIL. Because re-identifying individuals is impossible in practice, avatar data falls outside the scope of the GDPR. It becomes usable and shareable, even outside the European Union, and can be stored without time limits, while preserving the quality of the initial data set. What sets us apart from the competition? We prove all of these points with our metrics. A real revolution in the current context of the Health Data Hub. What if, tomorrow, synthetic and anonymous avatar data became the norm?
“Houston, we've had a problem,” radioed the Apollo 13 crew on April 13, 1970.
Not far from the Moon, an explosion had just occurred on board the spacecraft. Hundreds of thousands of kilometers away, on Earth, NASA teams diagnosed and solved the problem remotely using several simulators, a kind of “digital duplicate”, synchronized with the data stream from the spacecraft. The crew returned safe and sound. The ancestors of digital twins were born. NASA was the first to develop them, but it took another 30 years for the concept of a “digital twin” to emerge.
What is a “digital twin”?
In 2002, Michael Grieves was a PLM (Product Lifecycle Management) researcher at the University of Michigan. During the presentation of a center dedicated to product life cycle management, he explained the concept of a “digital twin” to the manufacturers present for the first time: a digital replica of a physical object or system. It is not a fixed model but a dynamic one, reproducing the needs, behavior and evolution of its physical counterpart over time. As with Apollo 13, a vital link connects the physical entity to its digital twin: the stream of data flowing from one to the other.
Since then, the concept of a digital twin has changed little. It is about replicating an object (a piston or a car engine), a system (a nuclear power plant or a city) or an abstract process (a production schedule). The concept also applies to living systems: a molecule, a cell, an organ or a patient, as well as a drug, a virus, a disease or an epidemic, can have a digital twin.
A product of the growth of new technologies (IoT, big data, AI, cloud, etc.) and of computing power, digital twins are an evolution, more than a revolution, combining mathematical modeling and numerical simulation. Incoming data, wherever it comes from (real or synthetic, collected in real time by sensors or drawn from pre-existing databases), feeds a mathematical model so that it can be finely calibrated. The model can then serve as a digital guinea pig on which different scenarios are tested through simulation, in order to predict the evolution of the real system.
Product design and life cycle, automotive and aeronautics, energy production and distribution, transport, smart buildings and urban planning: digital twins are now one of the pillars of Industry 4.0. They have recently spread to other sectors, such as logistics and, above all, health. According to a study by MarketsandMarkets, the digital twin market could grow from 3.1 billion dollars in 2020 to 48.2 billion dollars in 2026, a spectacular compound annual growth rate of about 58%, due in part to the Covid-19 epidemic.
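As a quick sanity check on these figures, the growth rate implied by the two market sizes can be recomputed. The short Python sketch below assumes that the 58% is a compound annual growth rate over the six years from 2020 to 2026, which is an interpretation rather than something stated in the passage; the variable names are purely illustrative.

```python
# Market sizes quoted in the article, in billions of US dollars.
start_value = 3.1    # 2020
end_value = 48.2     # 2026
years = 2026 - 2020  # six compounding periods (assumption)

# Compound annual growth rate: the constant yearly rate that turns
# start_value into end_value after `years` periods.
cagr = (end_value / start_value) ** (1 / years) - 1

print(f"Implied annual growth: {cagr:.0%}")
```

Under that assumption the result lands at roughly 58% per year, which suggests the quoted figure is an annual rate and not the total growth over the period.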
The promises of digital twins in health, myth or reality?
Last January, at the CES (Consumer Electronics Show) in Las Vegas, Dassault Systèmes presented its latest feat, the digital twin of a human heart, the result of 7 years of development. Powered by data collected from hundreds of doctors, researchers and manufacturers around the world, it replicates not only the anatomy of the heart, but also its functioning: the flow of electrical current along the nerves, the behavior of muscle fibers, reaction to various drugs, etc. With the progress of medical imaging, this digital twin is easily customizable. It takes less than a day to replicate the morphology and pathologies of a patient's heart.
Dassault Systèmes and its competitors are already working on other organs, including the lungs, the liver and of course the brain, whose exact replica is currently out of reach. And for good reason! Neurobiologists have not yet unravelled all of its mysteries. The perfect clone of the human body, modeling anatomy, genetics, metabolism, body functions and pathologies, is therefore not coming soon. However, there is no need to wait for comprehensive digital twins to advance by leaps and bounds. Digital twins, even partial ones, of certain organs, diseases or patient/drug pairs, such as those developed by the start-up ExactCure, are already sufficient to address specific problems.
Simulating the anatomy and functioning of our body at the molecular, cellular, tissue and organ scales; modeling tailor-made implants; simulating aging or disease; testing a drug or a vaccine on a virtual patient or cohort; rehearsing and assisting complex surgical procedures; monitoring patient flows in hospitals to rationalize human and technical resources: if digital twins keep all their promises, they will ultimately usher in the advent of personalized medicine.
A study published in July 2021 in the journal Life Sciences, Society and Policy reviews the socio-ethical benefits of digital twins in health services. The top three are the prevention and treatment of diseases, then cost reduction for certain health institutions and, finally, greater autonomy for patients: better informed, they are better able to make decisions about their care journey.
Risks commensurate with the hopes raised
However, there are still many obstacles to overcome before reaching this El Dorado of public health. The fundamental problem lies in the lifeblood of digital twins: health data. This highly sensitive personal data contains genetic, biological, physical or lifestyle-related information. The same study warns of the number one socio-ethical risk of digital twins, mentioned by all participants: the violation of privacy.
If digital twins are owned or hosted by private organizations, this information can be used without patients' knowledge, or even turned against them. The simplest example: a bank or insurance company with access to it could refuse a loan to a sick person or increase their premiums.
Add to that security breaches. As digital twins multiply, so do the risks of data being lost or stolen. And once data has leaked, it is too late: it can be used by anyone, in any way. This disaster scenario is becoming more and more frequent in France, where cyberattacks against health organizations doubled in 2021. The theft of Health Insurance data affecting half a million French people, at the beginning of 2022, is a striking example.
Then there is another risk: poor data quality. AI algorithms are trained on the available biomedical data, which is often heterogeneous, incomplete and not always reliable. There are several reasons for this: lack of standardization, pressure to publish, bias, the tradition of not publishing failures, etc. Bad data means bad models and bad simulations.
All the benefits of digital twins therefore depend on the availability and quality of health data. However, this data is extremely difficult for researchers to obtain and exploit, especially in France, where its use is strictly limited by the GDPR (General Data Protection Regulation) and the French Data Protection Act. In particular, its transfer outside the European Union is prohibited, a particularly sensitive subject in the current public debate. Controversies are in fact following one another at a frenetic pace, from Google Analytics to Meta. The government even chose to postpone its application to the CNIL for authorization of the Health Data Hub, to take the time to rework this project to centralize health data.
Avatar data to unleash the growth potential of digital twins
To unleash the growth potential of digital twins, however, there is already a solution proposed by Octopize - Mimethik Data, our deeptech start-up. We have developed a unique, patented data anonymization software: avatar. Data anonymization is not new, and methods keep multiplying. However, most of them provide no proof that patients cannot be re-identified. Far from it. Our breakthrough innovation, based on a new artificial intelligence technique, makes it possible to exploit and share personal data with absolute respect for privacy. Unlike our competitors, thanks to our metrics, we can prove the effectiveness of our synthetic and anonymous avatar data, both in terms of privacy and in terms of data quality. Our secret? An AI algorithm centered on each patient, not on the entire data set.
For each patient (i.e. each row in the database), we use a KNN (k-nearest neighbors) algorithm to identify a certain number of neighboring records. It is from these neighbors that we build our model. At this point, the real patient and their data have “disappeared”: it is impossible to know whether they are in the model or not, since only their closest neighbors are. We then generate an avatar using a local pseudo-stochastic model, i.e. we introduce random, and therefore irreversible, noise for each attribute (i.e. each column in the database). There is no going back: each time we rerun the model for the same patient, we create a different avatar. This ensures anonymization while maintaining the granularity of the data set, the correlations between individuals and the distribution of each variable. Same Gaussian curves, same means and same standard deviations, to within an epsilon.
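To make the idea more concrete, here is a deliberately simplified Python sketch of a neighbor-based generator. It is not the avatar software and makes several assumptions that do not come from the description above: the function name `avatarize`, the choice of Euclidean distance on standardized numeric columns, and the use of random convex weights over the neighbors as the "random noise" applied to each attribute. It also offers none of the privacy or quality guarantees discussed in this article.

```python
import math
import random


def avatarize(rows, k=5, seed=None):
    """Toy sketch: build one synthetic row per input row from its k nearest neighbors."""
    rng = random.Random(seed)
    n_rows = len(rows)
    if not 1 <= k < n_rows:
        raise ValueError("k must satisfy 1 <= k < number of rows")
    n_cols = len(rows[0])

    # Standardize each column so that no attribute dominates the distance.
    means = [sum(r[c] for r in rows) / n_rows for c in range(n_cols)]
    stds = []
    for c in range(n_cols):
        var = sum((r[c] - means[c]) ** 2 for r in rows) / n_rows
        stds.append(math.sqrt(var) or 1.0)  # constant column: avoid dividing by 0
    scaled = [[(r[c] - means[c]) / stds[c] for c in range(n_cols)] for r in rows]

    avatars = []
    for i in range(n_rows):
        # k nearest neighbors of row i; the row itself is excluded,
        # so the local model never contains the real individual.
        others = [j for j in range(n_rows) if j != i]
        others.sort(key=lambda j: math.dist(scaled[i], scaled[j]))
        neighbors = others[:k]

        avatar = []
        for c in range(n_cols):
            # Fresh random weights for every attribute (assumption:
            # normalized exponential draws, i.e. a random convex mix).
            raw = [rng.expovariate(1.0) for _ in neighbors]
            total = sum(raw)
            avatar.append(sum(w / total * rows[j][c] for w, j in zip(raw, neighbors)))
        avatars.append(avatar)
    return avatars
```

Calling `avatarize(table, k=5, seed=0)` and then `avatarize(table, k=5, seed=1)` on the same numeric table yields two different synthetic tables, mirroring the point that rerunning the model for the same patient produces a different avatar. Because this toy version averages neighbors, it tends to shrink the spread of each column slightly, so it should not be expected to preserve standard deviations as closely as the article describes.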
Once avatarized, the data becomes anonymous synthetic data, with no risk of re-identification for patients. It then falls outside the scope of the GDPR and its use becomes unlimited: it can be stored, exploited, shared and reused without geographical or temporal constraints. The CNIL made no mistake about it: it successfully evaluated our method in 2020, attesting to its compliance with the three anonymization criteria described in the opinion of the G29 (the Article 29 Working Party). Thanks to avatar data, the risk of privacy violations inherent in digital twins disappears.
Avatar data is also easily deployable and scalable. Configurable, it adapts to all needs, from internal use to open data. Another advantage: avatar data also solves the problems of availability and bias in health data. From a real data set, we can generate synthetic data sets that are larger than the initial database, since each individual can give rise to several avatars. In this way, we can amplify a cohort. In the end, we offer labeled, “clean” health data sets, ready for all uses.
By addressing issues of privacy, availability and data quality, avatarization is therefore a great opportunity to unleash the growth potential of digital twins. But beyond digital twins, avatar data is in itself a revolution, and not only in the field of health. Banking, insurance, telecom, industry, energy: all sectors handling sensitive data now have a turnkey solution. With its avatar data, Octopize defends an ethical vision in the service of value creation. We are deeply convinced that data avatarization, a breakthrough innovation today, will be the new European standard tomorrow.
15/05/2022 © Octopize