Credit card fraud is an ongoing problem in the banking industry. With contactless payments and e-commerce, it has become even easier to misuse another person's credit card.


Credit card fraud is expensive for the banking industry: the bank has to detect the fraudulent cash flow and, if possible, trace the money back. Otherwise, the bank must reimburse the lost sum to its customer. In 2017, credit card fraud amounted to DKK 52.6 million in Denmark (1). Even though new technology is continuously being developed to fight credit card fraud, techniques such as site cloning, false merchant sites, skimming, and phishing are becoming more advanced as well (2).

A case for machine learning

By looking at patterns, it is possible to predict whether a credit card is being misused. Important features might be the number of transactions, the cash sum of the transactions, and geography. The fraud problem is a binary classification problem, so the output layer has only one neuron. The model outputs a probability indicating its confidence in the prediction: an output of 1 means the model is certain the credit card is being misused, while an output of 0 indicates a normal transaction. In Patidar & Sharma, an output below 0.6 or 0.7 indicates that the transaction is normal, while outputs above that threshold indicate a need for further investigation (2).
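
As a minimal sketch of this decision rule (the 0.7 threshold is taken from Patidar & Sharma; the function and variable names are our own):

Thresholding the model output

import numpy as np

def flag_for_investigation(probabilities, threshold=0.7):
    """Return a boolean mask marking transactions that need further investigation."""
    return np.asarray(probabilities) >= threshold

# Example: only the last two transactions are flagged.
flag_for_investigation([0.02, 0.55, 0.71, 0.98])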

An unbalanced dataset - practical example

As a practical example of how we can help detect credit card fraud, we use the Credit Card Fraud Detection dataset from Kaggle. The dataset contains transactions made over a two-day period in September 2013. The table below summarizes the dataset:

n        284,807
n_ok     284,315  (99.828%)
n_fraud      492  (0.172%)

Thankfully, fraud is rare: the dataset gathered over these two days is highly unbalanced, with the frauds being a small minority contributing only 0.172% of the sample.

The null hypothesis is that a transaction is okay, while the alternative hypothesis is that the transaction is fraudulent. If we just wanted a model that predicts correctly as many times as possible, we could train it to always give the answer: transaction ok. Such a model would predict correctly 99.828% of the time. However, because we care about the false positives, and especially the false negatives, this approach will not solve the problem.
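
To see why, here is the arithmetic behind that baseline, using the counts from the table above:

Accuracy of an always-ok baseline

n, n_fraud = 284_807, 492

# A model that always answers "transaction ok" is only wrong on the frauds.
baseline_accuracy = (n - n_fraud) / n  # 0.99828

# Yet it catches none of the 492 frauds: every fraud becomes a false negative.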

              H_0 true                               H_0 false
Accept H_0    Correct                                Type II error (beta): false negative
Reject H_0    Type I error (alpha): false positive   Correct

When training a model, it is often beneficial to have each class evenly represented. This problem can be solved in different ways:

  1. Undersampling
  2. Oversampling
  3. Combined Class Methods
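
This post demonstrates oversampling below; for completeness, here is a sketch of the two other approaches using imblearn (the classes are part of imblearn's documented API, while the variable names are our own):

Undersampling and combined resampling with imblearn

from imblearn.under_sampling import RandomUnderSampler
from imblearn.combine import SMOTEENN

# 1. Undersampling: shrink the majority class to match the minority.
X_under, y_under = RandomUnderSampler().fit_resample(X_train, y_train)

# 3. Combined: oversample with SMOTE, then clean the result with Edited Nearest Neighbours.
X_comb, y_comb = SMOTEENN().fit_resample(X_train, y_train)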

When balancing the dataset, only the training set must be transformed; the validation and test sets must keep the true class distribution. Oversampling the training set is very easy and can be done with just two lines:

Oversampling with imblearn

from imblearn.over_sampling import SMOTE

# SMOTE synthesizes new minority-class samples until both classes are equally represented.
sm = SMOTE()
X_train_oversampled, y_train_oversampled = sm.fit_resample(X_train, y_train.ravel())

            Before Oversampling   After Oversampling
X_train     (182276, 29)          (182276, 29)
y_train     (182276, 1)           (182276, 1)
X_validate  (45569, 29)           (45569, 29)
y_validate  (45569, 1)            (45569, 1)
X_test      (56962, 29)           (56962, 29)
y_test      (56962, 1)            (56962, 1)
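
The shapes above are consistent with holding out 20% of the data for testing and then 20% of the remainder for validation. A minimal sketch of such a split, assuming scikit-learn (the 29 features suggest the Time column of the Kaggle dataset was dropped, which is our assumption):

Train/validation/test split

from sklearn.model_selection import train_test_split

# 284,807 rows -> 56,962 for test (20%), 45,569 for validation (20% of the rest),
# leaving 182,276 rows for training.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train, X_validate, y_train, y_validate = train_test_split(
    X_rest, y_rest, test_size=0.2, random_state=42)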

In this example, we have designed a very simple neural network to show how strong even a small network can be. The network consists of only two hidden layers.

The neural network architecture

from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import RMSprop

model = Sequential()
# First hidden layer: 8 neurons, one input per feature of the dataset.
model.add(Dense(8,
                input_shape=(29,),
                activation="relu",
                use_bias=True))
model.add(Dropout(0.1))
# Second hidden layer: 6 neurons.
model.add(Dense(6,
                activation="relu",
                use_bias=True))
model.add(Dropout(0.1))
# Output layer: a single sigmoid neuron producing the fraud probability.
model.add(Dense(1,
                activation="sigmoid"))

model.compile(optimizer=RMSprop(learning_rate=0.01),
              loss="binary_crossentropy",
              metrics=["binary_accuracy"])
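
Training then runs on the oversampled training data while validating against the untouched validation set. A sketch (the epoch and batch-size values are assumptions, not from the original experiment):

Training the model

history = model.fit(X_train_oversampled, y_train_oversampled,
                    validation_data=(X_validate, y_validate),
                    epochs=10,        # assumed value
                    batch_size=1024)  # assumed value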

The model above achieves a ROC AUC score of 0.985, which indicates that it has the potential to detect the false negatives and false positives. It also has a train, validation, and test accuracy of 0.99. This is great! The model performs very well on the training, validation, and test datasets, and it predicts accurately when introduced to new data.
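
The ROC AUC score can be computed from the predicted probabilities on the test set, for instance with scikit-learn:

Computing the ROC AUC score

from sklearn.metrics import roc_auc_score

y_pred = model.predict(X_test)
print(roc_auc_score(y_test, y_pred))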

A simple neural network thus proves to be a strong tool for detecting credit card fraud. However, a Logistic Regression model or a Long Short-Term Memory model actually achieves a score of almost 99%.
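
For comparison, a Logistic Regression baseline fits in a few lines. A sketch, assuming scikit-learn and the same oversampled training data:

A Logistic Regression baseline

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

logreg = LogisticRegression(max_iter=1000)
logreg.fit(X_train_oversampled, y_train_oversampled)

# Score with the predicted probability of the fraud class.
print(roc_auc_score(y_test, logreg.predict_proba(X_test)[:, 1]))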

Finally, if we reverse the model and look at inbound transactions instead of outbound transactions, we might also be able to detect fraud such as money laundering and scams at both a personal and a company level. This could help banks and accountants detect fraud among their many clients by only making them look at the cases for which the model outputs a high probability, and which therefore need further investigation.

// Maria Hvid, Machine Learning Engineer @ neurospace

References

[1] dankort.dk

[2] Patidar, R. & Sharma, L. (2011). Credit Card Fraud Detection Using Neural Network. International Journal of Soft Computing and Engineering (IJSCE), 1(NCAI2011), pp. 32-38.