Project Title
Credit Card Fraud Detection by Deep Learning Methods
Team Members
Haolin Chen (hchen156), Yadi Yang (yy1568), Yutian Zhang (yzhan179)
Introduction
As credit card ownership grows and card issuers introduce more and more products to attract new customers, the threat of credit card fraud keeps increasing. Detecting and preventing credit card fraud have long been core missions of banks and credit institutions. In our project, we explore and deploy ways of better predicting potential credit card fraud using the deep learning methods covered in the course.
This project is a binary classification problem: determining whether a transaction is fraudulent or not. We will implement both supervised and unsupervised learning methods and compare their accuracy.
Related Work
Locally Interpretable One-Class Anomaly Detection for Credit Card Fraud Detection
The authors propose a novel anomaly detection framework along with a LIME-based explanation module, which explains different input-output relations and increases the interpretability of the model. The dataset is highly imbalanced across its classes, which makes training a neural network more difficult. There are a few ways to handle this: 1) rebalance the data; 2) use algorithms such as a weighted loss function. Focal Loss is one of the most widely used methods for this issue.
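As a reference point, binary focal loss down-weights easy examples so that training focuses on the hard, rare class. A minimal NumPy sketch (the α and γ values below are the commonly cited defaults, not values taken from the paper):

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for predicted fraud probabilities p and labels y in {0, 1}."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    # p_t is the probability the model assigns to the true class
    p_t = np.where(y == 1, p, 1 - p)
    # alpha_t weights the positive (fraud) class more or less heavily
    alpha_t = np.where(y == 1, alpha, 1 - alpha)
    # (1 - p_t)^gamma shrinks the loss on well-classified examples
    return -alpha_t * (1 - p_t) ** gamma * np.log(p_t)

preds = np.array([0.9, 0.1, 0.6])   # model probabilities for three transactions
labels = np.array([1, 0, 1])        # 1 = fraud, 0 = genuine
losses = focal_loss(preds, labels)  # per-example losses
```

With γ = 0 the expression reduces to α-weighted cross-entropy, which makes the down-weighting effect of the modulating factor easy to verify.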
Data
The dataset is from Kaggle and contains 284,807 credit card transactions, of which 492 are fraudulent; it was collected in Europe over a 2-day period in September 2013. The dataset is highly imbalanced, since fraudulent transactions are rare. The original dataset is not split into training and testing sets, so we select 490 of the 492 fraudulent cases and 490 of the 284,315 genuine cases to form a well-balanced testing set; the remaining 283,825 genuine cases form the training set.
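The split described above can be sketched as follows. This is an illustrative NumPy sketch: the function name `make_balanced_test_split` and the fixed random seed are our own choices, not part of the original dataset.

```python
import numpy as np

def make_balanced_test_split(y, n_per_class=490, seed=0):
    """Build a balanced test set (n_per_class fraud + n_per_class genuine);
    the remaining genuine samples form the genuine-only training set."""
    rng = np.random.default_rng(seed)
    fraud = rng.permutation(np.flatnonzero(y == 1))
    genuine = rng.permutation(np.flatnonzero(y == 0))
    test_idx = np.concatenate([fraud[:n_per_class], genuine[:n_per_class]])
    train_idx = genuine[n_per_class:]  # training set contains no fraud cases
    return train_idx, test_idx

# Labels with the dataset's actual class counts: 492 fraud, 284,315 genuine
y = np.concatenate([np.ones(492, dtype=int), np.zeros(284315, dtype=int)])
train_idx, test_idx = make_balanced_test_split(y)
```

Keeping the training set genuine-only matches the unsupervised setup below, where the model learns to reconstruct normal transactions and flags what it cannot reconstruct.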
Methodology
The fraud detection model is composed of two deep neural networks, trained in an unsupervised and adversarial manner. Specifically, the generator is an AutoEncoder that aims to reconstruct genuine transaction data, while the discriminator is a fully-connected network for fraud detection. The explanation module has three white-box explainers in charge of interpreting the AutoEncoder, the discriminator, and the whole detection model, respectively.
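A minimal forward-pass sketch of this two-network layout, in NumPy. The layer sizes, random weights, and the choice to feed per-feature reconstruction error into the discriminator are illustrative assumptions, not the paper's exact architecture:

```python
import numpy as np

rng = np.random.default_rng(42)
relu = lambda x: np.maximum(x, 0.0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def autoencoder(x, W_enc, W_dec):
    """Encode transactions into a low-dimensional code, then reconstruct them."""
    return relu(x @ W_enc) @ W_dec

def discriminator(x, x_hat, W):
    """Score a transaction using its features plus its reconstruction error."""
    err = np.abs(x - x_hat)                # large for transactions unlike the genuine data
    return sigmoid(np.concatenate([x, err], axis=1) @ W)

d, h = 30, 8                               # 30 input features, 8-dim bottleneck (assumed sizes)
W_enc = rng.normal(scale=0.1, size=(d, h))
W_dec = rng.normal(scale=0.1, size=(h, d))
W_dis = rng.normal(scale=0.1, size=(2 * d, 1))

x = rng.normal(size=(5, d))                # a batch of 5 transactions
x_hat = autoencoder(x, W_enc, W_dec)       # reconstruction
scores = discriminator(x, x_hat, W_dis)    # fraud scores in (0, 1)
```

In adversarial training the AutoEncoder would be updated to reconstruct genuine data well, while the discriminator learns to separate real genuine transactions from poorly reconstructed (anomalous) ones; the sketch above only shows the inference path.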
Metric
The metric used to measure the success of our model is accuracy on detecting credit card fraud. Since we are going to implement methods beyond the current paper, we are looking for methods that yield higher accuracy than the baseline.
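Because the raw data is heavily imbalanced, precision, recall, and F1 are natural complements to accuracy. A small NumPy sketch (`precision_recall_f1` is a hypothetical helper, equivalent to what scikit-learn's metrics would report):

```python
import numpy as np

def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels, with 1 = fraud."""
    tp = np.sum((y_true == 1) & (y_pred == 1))  # frauds correctly flagged
    fp = np.sum((y_true == 0) & (y_pred == 1))  # genuine flagged as fraud
    fn = np.sum((y_true == 1) & (y_pred == 0))  # frauds missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

y_true = np.array([1, 1, 1, 0, 0, 0, 0, 0])
y_pred = np.array([1, 1, 0, 1, 0, 0, 0, 0])
p, r, f1 = precision_recall_f1(y_true, y_pred)
```

On an imbalanced set, a model that never predicts fraud can score high accuracy while achieving zero recall, which is why these metrics matter here.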
Ethics
For the ethics part of the project, we want to ensure the integrity of our data: for example, all personal information should be removed so that individual cardholders cannot be traced. In addition, we want to make sure our models limit human bias. The major stakeholders of this project are law enforcement and banks. Fraud detection helps law enforcement and financial institutions catch criminals who profit from fraudulent behavior. However, false positives from our algorithm could cause credit cards to stop functioning and prevent legitimate customers from making purchases.
Division of Labor
All tasks will be evenly distributed among team members.
2470 Final Project Reflection
Challenges:
The hardest part of this project has been deciding on a suitable evaluation metric. We currently measure the success of our model by its accuracy on detecting credit card fraud. However, we are still considering additional measures for performance validation, since accuracy alone is not sufficient given the imbalance of the datasets.
Insights:
We found that the model's validation performance decreases as the dataset size increases. We will explore in detail how validation performance is affected by the features of different datasets.
Different datasets we tested:
European Card data: It contains two days of transaction data from European cardholders in September 2013, with 284,807 samples and 31 features. Only 492 of the samples are fraud cases, accounting for 0.172% of the dataset.
Small Card Data: It is a small dataset containing 3,075 samples, of which 448 (14.6%) are fraud cases. Half of the features are categorical and the other half are numerical.
Plan: Find how the validation performance is affected by different features of different datasets.
Explore hyperparameters to improve model performance.
As of now, we do not anticipate making any major changes.