One caveat deserves emphasis: a low computed probability does not mean that the item is necessarily a fraud case. It only means that this particular combination is unlikely to occur in practice. To keep the model from being overly sensitive (that is, to avoid false alarms), the aggregated results should be ranked according to business needs, and the anti-fraud system should be rolled out step by step on the principle of "focus on the big cases and let the small ones go". In our experience with similar projects, when the prior probability of fraud is 8%-10% at first implementation, accuracy is typically between roughly 89% and 95%.

Summary

As machine learning techniques become widely adopted, the cost of deploying and operating detection systems and algorithms for warranty and maintenance fraud falls year by year. These algorithms have been applied extensively in developed markets and are by now mature. By adopting them step by step, China's automakers could save more than RMB 10 billion in unnecessary spending each year.

About the authors:

Zhao Xin (赵昕) is a founding partner of Delta Entropy Technology (七炅科技). He previously served as a consulting director at a Big Four firm and has many years of data analysis experience. He has long specialized in data analytics solutions, with a focus on the financial, insurance and automotive industries.

Yang Mingfeng (杨明锋) is a founding partner of Delta Entropy Technology (七炅科技). Before founding Delta Entropy, he was a technical lead in the big data practices of Deloitte Consulting and KPMG in the United States, where he delivered solutions based on big data and artificial intelligence to numerous Fortune Global 500 companies and US government agencies.

Mao Yaoyun (毛耀鋆) is a senior manager at Delta Entropy Technology (七炅科技). He was previously a consultant on Deloitte's actuarial and insurance consulting team in Shanghai, and holds a bachelor's degree in finance and insurance statistics from Shanghai University of Finance and Economics.