Solutions

Octopize has developed innovative software that lets you use personal data in the form of anonymous synthetic data, called avatar data. Avatar data frees up secondary uses of data: data sharing becomes simpler, faster and more secure.
The uses are broad:

Train machine learning / generative AI models

Background:

Your business wants to train machine learning or generative AI models on personal data. However, legal constraints (GDPR) and ethical considerations do not allow this: training AI models on personal data seriously compromises the privacy of individuals, because such data can be used to re-identify them. Without anonymization, your AI projects are therefore complicated, slowed down or even impossible.

Solution:

Our avatar anonymization software offers an effective, secure and ethical solution: you can train your models on anonymous synthetic data generated by the avatar software, in compliance with the GDPR.

Results:

  • Maintain the same statistical information as the original data, ensuring the quality of your models.
  • Remove the risk of re-identifying individuals.
  • Unleash the potential of your AI projects by using anonymous synthetic data.
See customer cases

Valuing data

Background:

Your business cannot access personal data due to regulations (GDPR) and privacy protection.

Solution:

The avatar software makes it possible to release this personal data for secondary uses such as the development of artificial intelligence models, scientific use or data resale (data brokerage).

Results:

  • Access data quickly to enhance its informative quality and develop your business.
  • Make decisions based on relevant data quickly and securely.
  • Improve your knowledge (profiling) without violating the privacy of individuals.

Sharing data

Background:

Because of legal constraints such as the GDPR, sharing personal data outside your business and outside Europe is becoming difficult. These regulations can slow down your projects.

Solution:

Thanks to the avatar software, you generate synthetic, anonymous data that falls outside the scope of the GDPR. This gives you the ability to share this data securely around the world.

Results:

  • You can share highly informative avatar data with your partners.
  • Data is freed up for secondary uses: your international projects are thus facilitated.

Evaluate data quality

Background:

Your R&D department wants to use data collected by other organizations to innovate. You cannot assess the potential and quality of this data before acquisition: there is an asymmetry of information.

Solution:

You can assess the potential of data before acquisition by using avatar data. Synthetic avatar data is anonymous, so it can be shared more quickly and used to assess the relevance of data sets prior to a partnership.

Results:

  • Access data quickly with a guaranteed ROI.
  • Evaluate the potential of data before acquisition for your secondary uses.
  • In this way, resolve the information asymmetry between you and your partners.

Protecting individuals

Background:

Your business wants to analyze data to better understand its customers and their journeys, but does not want to compromise on privacy.

Solution:

The avatar software protects the individuals behind the data. This privacy protection is documented through privacy metrics.
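This page does not detail which privacy metrics are reported. As a purely illustrative sketch, and not the metric used by the avatar software, one generic indicator is the share of records that are unique on a set of quasi-identifiers, which relates to the risk of singling out an individual. The function name `uniqueness_rate`, the column names and the toy records are all hypothetical.

```python
from collections import Counter

def uniqueness_rate(records, quasi_identifiers):
    """Share of records whose quasi-identifier combination appears exactly once."""
    if not records:
        return 0.0
    # Build one key per record from the chosen quasi-identifier columns.
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(records)

# Hypothetical toy data, for illustration only.
records = [
    {"age": 34, "zip": "44000", "diagnosis": "A"},
    {"age": 34, "zip": "44000", "diagnosis": "B"},
    {"age": 51, "zip": "75011", "diagnosis": "A"},
    {"age": 29, "zip": "69003", "diagnosis": "C"},
]
rate = uniqueness_rate(records, ["age", "zip"])
print(f"{rate:.2f}")
```

A lower rate means fewer records stand out on the chosen columns. A real privacy assessment relies on several complementary metrics rather than a single number.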

Results:

  • Analyze your data (profiling) without violating the privacy of individuals.
  • Document the proof of this privacy (find more details in our technical documentation).

Carry out an Open Science project

Background:

You want to publish personal data in your research paper or you need to share personal data for an academic project (hackathon, open data). The problem? You cannot collect the individual consents required for this new purpose.

Solution:

Avatar data falls outside the scope of the GDPR: it is no longer considered personal data. So you no longer need to collect new consents.

Results:

  • Share your anonymized data in your publications.
  • Organize a hackathon (Machine Learning...) while maintaining the informative quality of the original data.

Retain data with no retention period

Background:

You want to keep personal data that you have collected but you are constrained by the time limits imposed by the GDPR.

Solution:

Anonymous synthetic avatar data allows you to keep your data and its informative quality for an unlimited period of time. Indeed, avatar data is outside the scope of the GDPR and is therefore no longer subject to its retention limits.

Results:

  • Keep your data (its quality and granularity) without time limits.

Test data in your non-production environment

Background:

Your internally generated database is exposed to risks of privacy breaches. You need to anonymize your data in order to use it outside of production (without losing the granularity and the initial form of the database).

Solution:

With the avatar software, you make your database anonymous while maintaining granularity, original data quality, and hierarchical relationships.

Results:

  • Improve your non-production tests by easily working on your anonymous database.
  • Avoid data leaks by no longer using personal data outside of production.

FAQ

Can we keep the link between personal data and avatars?

No. Keeping such a link would defeat anonymization as defined by the GDPR, which must be an irreversible process.

How do you differ from competitors?

Our anonymization metrics and report, which allow you to prove both compliance and usefulness, are unique. Our calculation speed and the transparency and explainability of the method also set us apart. To learn more about the method: https://www.nature.com/articles/s41746-023-00771-5

Can we anonymize data streams?

We have already successfully completed stream anonymization projects. The challenge is to anonymize small volumes of data while maintaining maximum usefulness. To meet this challenge, we have developed a batch approach.
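The details of this batch approach are not given here. As a minimal sketch of the general idea only, records arriving from a stream can be buffered and released batch by batch once enough of them have accumulated. The names `anonymize_in_batches` and `batch_mean_placeholder` and the batch size are hypothetical, and the placeholder simply replaces each value by the batch mean: it is not the avatar method and is not an anonymization guarantee by itself.

```python
from statistics import mean
from typing import Callable, Iterable, Iterator, List

def anonymize_in_batches(
    stream: Iterable[float],
    anonymize_batch: Callable[[List[float]], List[float]],
    min_batch_size: int,
) -> Iterator[float]:
    """Buffer incoming records and release them only batch by batch."""
    buffer: List[float] = []
    for record in stream:
        buffer.append(record)
        if len(buffer) >= min_batch_size:
            # Enough records have accumulated: process and release the batch.
            yield from anonymize_batch(buffer)
            buffer = []
    # Records still in the buffer are NOT released: the batch is too small.

def batch_mean_placeholder(batch: List[float]) -> List[float]:
    """Placeholder only: replaces each value by the batch mean."""
    m = mean(batch)
    return [m for _ in batch]

stream = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
released = list(anonymize_in_batches(stream, batch_mean_placeholder, 3))
print(released)
```

In this sketch the seventh record is held back because it cannot form a full batch. The batch size trades latency against usefulness: larger batches wait longer but give the anonymization step more data to work with.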

What infrastructure is needed to deploy the software?

Deployment is completely industrialized thanks to Docker and Kubernetes. Our teams adapt to all architectures in a few hours.

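Why is the avatar method considered compliant by the CNIL?

The CNIL successfully evaluated our anonymization method on the basis of our security and utility metrics. The method meets the three criteria used to define anonymization in Opinion 05/2014 of the Article 29 Working Party (the predecessor of the European Data Protection Board): singling out, linkability and inference.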

Why not anonymize using generative methods?

Because synthetic data is artificially generated, it is tempting to assume that it is anonymous by default. Being able to share the generation method rather than the data itself also looks like an additional privacy guarantee and a paradigm shift in the use of data. However, generative models do not necessarily ensure the confidentiality of their training data. They can memorize specific details of that data, including the presence of specific individuals or personal information, and reproduce this information in the synthetic data they generate. One way to exploit this weakness is a membership inference attack, in which an attacker tries to determine whether a specific person's data was used to train a machine learning model. This can lead to serious privacy breaches, especially with sensitive data.
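To make the risk concrete, here is a toy sketch of a membership inference attack under simplifying assumptions. The "generator" is a hypothetical worst case that re-emits its training values with tiny noise, and the attacker guesses that a candidate was in the training set when a synthetic value lies very close to it. The data, the threshold and the function names are illustrative and do not describe any real generative model or the avatar method.

```python
import random

def nearest_distance(x, pool):
    """Distance from x to the closest value in pool."""
    return min(abs(x - y) for y in pool)

def membership_guess(candidate, synthetic, threshold):
    """Attacker guesses 'member' when a synthetic value lies very close."""
    return nearest_distance(candidate, synthetic) < threshold

rng = random.Random(0)

# Toy one-dimensional records: members were used for training, non-members were not.
members = [10.0 * i for i in range(50)]
non_members = [10.0 * i + 5.0 for i in range(50)]

# Hypothetical leaky generator: it memorizes training values and adds tiny noise.
leaky_synthetic = [m + rng.uniform(-0.01, 0.01) for m in members]

threshold = 0.05
hits = sum(membership_guess(m, leaky_synthetic, threshold) for m in members)
false_alarms = sum(membership_guess(n, leaky_synthetic, threshold) for n in non_members)
print(f"members flagged: {hits}/50, non-members flagged: {false_alarms}/50")
```

When the generator memorizes its training data, the attacker separates members from non-members almost perfectly. This is why synthetic data cannot be assumed anonymous without measuring the risk.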