Thales

Anonymize defense data to train a machine learning model

Challenges

  • Anonymize sensor data to train a predictive model
  • Desensitize data in a defense context
  • Test the reliability of the data (multivariate time series) on existing algorithms by comparing levels of utility

Maintaining statistical quality and utility

- Comparison of horizontal speed curves (VH) between original data and avatars - each curve represents a flight

- Comparison of altitude curves (Alt) between original data and avatars - each curve represents a flight

Here we see the differences in correlation between pairs of variables, comparing the original data with the avatars. The values are close to zero, so the correlations are well preserved (for example, between altitude and horizontal speed).
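As an illustration of this kind of check, the sketch below computes the difference between the correlation matrix of an original dataset and that of its anonymized counterpart. It is a minimal example under stated assumptions, not the project's actual pipeline: the data is randomly generated, the "anonymized" set is simply the original plus noise (a stand-in, not the avatar method), and the function name `correlation_difference` is hypothetical.

```python
import numpy as np


def correlation_difference(original, anonymized):
    """Return corr(original) - corr(anonymized), with variables in columns."""
    corr_original = np.corrcoef(original, rowvar=False)
    corr_anonymized = np.corrcoef(anonymized, rowvar=False)
    return corr_original - corr_anonymized


# Synthetic stand-in data (assumption): altitude and a correlated horizontal speed.
rng = np.random.default_rng(0)
n_samples = 1000
altitude = rng.normal(10_000.0, 500.0, n_samples)
horizontal_speed = 0.02 * altitude + rng.normal(0.0, 5.0, n_samples)
original = np.column_stack([altitude, horizontal_speed])

# Stand-in for anonymized data (assumption): original plus small per-variable noise.
# This is NOT the avatar method; it only gives the check something to compare.
noise_scale = np.array([50.0, 1.0])
anonymized = original + rng.normal(0.0, 1.0, original.shape) * noise_scale

diff = correlation_difference(original, anonymized)
print(np.round(diff, 3))
```

Off-diagonal entries near zero indicate that the relationship between the two variables is similar in both datasets; the diagonal is always zero because each variable is perfectly correlated with itself.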

ROI

  • Optimizing predictive models: Precise reconstruction of time traces to improve the performance of data models.
  • Simplified and secure sharing: Expanded access to anonymized data, allowing faster use in compliance with regulations.
  • Increased visibility: Key results published in a white paper dedicated to AI and Defense, showcasing the project and its strategic impact. [Link coming soon]
“This collaboration represents a major challenge for the processing of strategic and confidential data in Defense. Together, we are exploring the application of their avatar anonymization method to Defense data, in particular for training Machine Learning algorithms. (...) We plan to share the results in a future white paper on AI and Defense.” - Marine Martinez, Program Lead CyberStationF, Thales