Machine Learning Based Method for Insurance Fraud Detection on Class Imbalance Datasets With Missing Values

Research Authors

Ahmed Abdelreheem Khalil, Zaiming Liu, Ahmed Fathalla, Ahmed Ali, and Ahmad Salah

Research Date

Thu, 09/26/2024 - 12:00

Research Department

Department of Statistics, Mathematics and Insurance

Research Journal

IEEE Access

Research Member

Ahmed Abd El-Reheem Ahmed Mohamed

Research Publisher

IEEE

Research Rank

Web of Science (SCI Q2), Scopus (Q1)

Research Vol

12

Research Website

https://doi.org/10.1109/ACCESS.2024.3468993

Research Year

2024

Research Abstract

Insurance fraud is a prevalent problem for insurance companies, particularly in automobile insurance. It imposes significant costs on insurers and can have a long-term effect on pricing strategies and insurance rates, so accurately predicting and detecting fraud has become a crucial challenge. Fraud datasets are usually imbalanced, because fraudulent instances are far fewer than legitimate ones, and they often contain missing values. Prior research has employed machine learning methods to address the class imbalance problem, but there has been limited effort to handle the class imbalance present in insurance fraud datasets specifically. Moreover, we could not find an overfitting analysis for the relevant predictive models. This paper addresses these two limitations using two car insurance datasets: a real-life Egyptian dataset and a standard dataset. We address the missing data and class imbalance problems with different methods. The predictive models are then trained on the processed datasets to predict insurance fraud as a classification problem, and the classifiers are evaluated on several metrics. To our knowledge, we also propose the first overfitting analysis for insurance fraud classifiers. The results indicate that addressing the class imbalance in insurance fraud detection datasets has a significant positive effect on the performance of the predictive model, while addressing missing values has only a slight effect. Moreover, the proposed methods outperform all of the existing methods on the accuracy metric.
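The abstract does not name the specific techniques used, so the following is only a minimal sketch of the kind of preprocessing it describes, under stated assumptions: median imputation stands in for the missing-value handling, random oversampling stands in for the class imbalance handling, and the train/test score gap stands in for the overfitting analysis. The function names, the toy claim records, and the example scores are all hypothetical and are not taken from the paper.

```python
import random
from statistics import median


def impute_median(rows):
    """Replace None in each numeric column with the median of that column's
    observed values (assumes every column has at least one observed value)."""
    n_cols = len(rows[0])
    medians = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        medians.append(median(observed))
    return [[medians[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for r, y in zip(rows, labels):
        by_class.setdefault(y, []).append(r)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for y, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for r in members + extra:
            out_rows.append(r)
            out_labels.append(y)
    return out_rows, out_labels


def overfitting_gap(train_score, test_score):
    """Difference between training and test scores; a large positive gap
    suggests the model fits the training data better than unseen data."""
    return train_score - test_score


# Hypothetical toy records: [claim_amount, prior_claims]; None marks a missing value.
claims = [[1200.0, 3], [None, 1], [800.0, None],
          [5000.0, 7], [950.0, 2], [1100.0, 2]]
is_fraud = [0, 0, 0, 1, 0, 0]  # one fraudulent claim out of six

filled = impute_median(claims)
balanced_rows, balanced_labels = random_oversample(filled, is_fraud)
gap = overfitting_gap(train_score=0.99, test_score=0.80)  # made-up scores
```

In practice, resampling of this kind would normally be applied to the training split only, so that duplicated fraud records do not leak into the test data used to measure the gap.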

Faculty of Commerce


Assiut University
