
Synthetic geolocation data for home insurance pricing









Home insurance pricing was a risky business for our client. The insurance company covered homes across the United States, in areas with vastly different climate features and risk profiles. Under the CCPA, the data science team could not use customers’ personal data, such as home addresses, in its modeling, so it could not assess location-based risk or reflect that risk in its pricing.


The insurance company supplied its modeling teams with synthetic geolocation data. The team used synthetic home addresses to look up five climate features, such as fire and flood hazards, in public databases. The pricing model trained on synthetic data performed as well as the model trained on real data.
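
The lookup step described above can be sketched roughly as follows. This is a minimal illustration, not the client's actual pipeline: the `HAZARD_BY_CELL` table, its scores, the grid-cell keying, and the `SyntheticHome` record are all hypothetical stand-ins for a query against a public hazard database, and only two of the five climate features are shown.

```python
from dataclasses import dataclass

# Hypothetical hazard table keyed by a coarse grid cell: latitude and longitude
# in whole tenths of a degree. The scores are invented for illustration; a real
# lookup would query a public hazard database instead.
HAZARD_BY_CELL = {
    (341, -1183): {"fire": 0.8, "flood": 0.1},
    (298, -954): {"fire": 0.1, "flood": 0.7},
}


@dataclass
class SyntheticHome:
    """A synthetic record: it corresponds to no real customer."""
    home_id: str
    lat: float
    lon: float


def grid_cell(lat: float, lon: float) -> tuple[int, int]:
    """Map coordinates to the integer grid cell used as the lookup key."""
    return (round(lat * 10), round(lon * 10))


def enrich(home: SyntheticHome) -> dict:
    """Attach hazard features to a synthetic home; None if the cell is unknown."""
    hazards = HAZARD_BY_CELL.get(grid_cell(home.lat, home.lon), {})
    return {
        "home_id": home.home_id,
        "fire": hazards.get("fire"),
        "flood": hazards.get("flood"),
    }


homes = [
    SyntheticHome("s-001", 34.12, -118.29),
    SyntheticHome("s-002", 29.76, -95.37),
    SyntheticHome("s-003", 45.00, -100.00),  # no entry in the toy table
]
features = [enrich(h) for h in homes]
```

The resulting `features` rows contain only synthetic locations and public hazard values, which is what would then feed a pricing model.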


Using synthetic home addresses eliminated the risk of re-identification and unlocked new insights. The team established a synthetization framework tailored to modeling and based on privacy-risk classification, and shortened time-to-data from 6 months to 3 days. The process preserved 100% of the data's utility, retaining the statistical dispersion of the original and providing an as-good-as-real alternative for training.