Fraud detection is a complex problem with many cutting-edge AI/ML solutions, but these algorithms are only as good as the data used to train them. Traditional rule-based systems produce a high number of false positives, and following up on them is labor-intensive: investigating a single customer for potential fraud can cost up to $24,000.¹ AI/ML algorithms help reduce false positives and detect new types of fraud, but their performance depends heavily on the quality of the training data. Rare, high-value frauds are often missed, and signals alerting to fraudulent activity can be misleading.
Across various case studies, MOSTLY AI’s platform has shown a consistent relative improvement in AUC (area under the curve) of 2–15% compared to training on raw, imbalanced data.² As a rough illustration, suppose a 10% improvement translates into a 10% decrease in false positives. Consider a model with a false positive rate of 1% (here meaning the share of flagged cases that turn out not to be fraud). The model has flagged 100,000 cases as fraud, of which ~1,000 might not actually be fraud. If we lower the false positive rate to 0.9%, only ~900 of the flagged cases are misidentified. Having ~100 fewer cases to investigate, at up to $24,000 per investigation, could lead to savings of up to $2.4 million.
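The arithmetic behind this estimate can be reproduced with a few lines of Python. This is only a minimal sketch of the illustrative scenario above: the figures are the ones quoted in the text, while the function and variable names are hypothetical, and the $24,000 cost is an upper bound rather than a typical value.

```python
# Illustrative savings estimate. Figures are taken from the text;
# all names here are hypothetical and exist only for this sketch.

COST_PER_INVESTIGATION_USD = 24_000  # upper-bound cost per investigated customer


def expected_false_positives(flagged_cases: int, false_positive_share: float) -> int:
    """Number of flagged cases that are not actually fraud."""
    return round(flagged_cases * false_positive_share)


def investigation_savings(
    flagged_cases: int,
    share_before: float,
    share_after: float,
    cost_per_case: int = COST_PER_INVESTIGATION_USD,
) -> int:
    """Cost avoided by no longer investigating the removed false positives."""
    before = expected_false_positives(flagged_cases, share_before)
    after = expected_false_positives(flagged_cases, share_after)
    return (before - after) * cost_per_case


flagged = 100_000
fp_before = expected_false_positives(flagged, 0.01)   # 1% of flagged cases
fp_after = expected_false_positives(flagged, 0.009)   # 0.9% of flagged cases
savings = investigation_savings(flagged, 0.01, 0.009)

print(f"False positives before: {fp_before:,}")
print(f"False positives after:  {fp_after:,}")
print(f"Estimated savings:      ${savings:,}")
```

Because the per-case cost is an upper bound, the result is best read as a ceiling on the potential savings, not a forecast.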
1 How financial firms help catch crooks, The Economist https://www.economist.com/the-economist-explains/2017/11/28/how-financial-firms-help-catch-crooks
2 Boost your Machine Learning Accuracy with Synthetic Data, Michael Platzer https://mostly.ai/2020/08/07/boost-machine-learning-accuracy-with-synthetic-data/