One of MOSTLY AI’s clients, a Fortune 100 bank, needs to evaluate 100+ startups and vendors each year. To do so, the bank needs to share data externally, which is a time-consuming and expensive process. The bank managed to overcome these obstacles using synthetic data and reduced time-to-data from months to days, resulting in over $10 million savings per year.

MOSTLY AI solves one of the biggest challenges organizations are facing today: balancing their need for AI & data-driven innovation with privacy protection and GDPR/CCPA compliance. Their synthetic data platform MOSTLY GENERATE helps organizations unlock their privacy-sensitive big data assets by generating the world’s most accurate synthetic data for behavioral and time-series customer data (e.g. financial transactions, insurance claims, healthcare data). MOSTLY AI’s fundamentally new approach to big data anonymization enables organizations to retain all of the valuable information in a dataset while protecting the privacy of each and every one of their customers. The result is completely anonymous data that is free to use, free to share, and free to monetize.


Alexandra: At large enterprises, one thing that's definitely day-to-day business is vendor validation and startup collaborations. One of our clients, a Fortune 100 bank, needs to evaluate 1,000 plus vendors and startups each and every year. As you can imagine, to do that in our digital times, you need to either get these vendors or startups into your organization or you need to externally share the data. This is something that takes ages and involves many, many people, which, of course, leads to significant costs. Thanks to synthetic data, our client was able to reduce this time to share data by 70%, which ultimately resulted in 10 million in cost savings each and every year.
Andreas, how did this Fortune 100 bank approach this topic? What did MOSTLY GENERATE and synthetic data enable them to do?
Andreas: One challenge, as you already mentioned, was that it's highly sensitive data they often need to use to evaluate startups. Every large corporation wants to be innovative and invite as many startups into their ecosystem as possible, but then the approval processes can take up to six months even. They said they need to find a way how they can evaluate startups by providing highly realistic and high-quality data. Since there are some data protections in place - and we fully support them - we needed to find a way how we can accelerate the process without compromising on the data quality.
Alexandra: Absolutely, because with these traditional anonymization techniques that were used in the past, what we hear from clients is that once you went through this lengthy process of internal negotiations and finally getting access to the data, it's so heavily anonymized that it is not particularly useful and that the external vendors or startups can't really do anything meaningful with this data. Synthetic data really helped them to find a GDPR- and CCPA-compliant way of getting realistic data on the one hand out to the startups, but I think they also built an internal sandbox environment. What was the reason behind that?
Andreas: One challenge was, of course, that they needed to scale this out. It means it's not just one or two startups they wanted to participate in this synthetic data analysis process, but they wanted to have this as a machine to evaluate startups at scale. The moment a new startup comes up, they want to have this environment ready. Also with the synthetic data they want to use for testing so that they can speed up not only the data access but also the whole evaluation process.
Alexandra: Absolutely. I think it's not this boring old sandbox that organizations know from the past, where you still have the issue of not having realistic data. They really had fresh, super granular, super representative synthetic data from different departments and different data sets of the organization. So various different startups and vendors could really make use of this data immediately to test the products and help the bank figure out whether this vendor would be of benefit for them or not. But they didn't stop with vendor validation. What else was an exciting use case for our synthetic data technology for them?
Andreas: We observe now based on the many conversations we have with our clients, that cloud is not that thing. It's normality and every prospect wants to utilize also the cloud technologies that are already available. But as you know, sensitive data doesn't end up in the cloud - for good reasons, so they need to find a way how they can utilize the power of the cloud tools without compromising on the privacy aspect. That's why they see synthetic data as a great opportunity for AI/ML training[...]