Fraud Detection: DBSCAN Method - Training

This workflow uses the DBSCAN clustering algorithm to detect fraud by identifying outliers in credit card transaction data. Density-based spatial clustering of applications with noise (DBSCAN) is an unsupervised clustering algorithm that works well when the density of the data does not vary significantly across different parts of the dataset. We normalize the training data and sample a subset for analysis; points that DBSCAN labels as outliers (noise) are tagged as potential fraud. Metrics are extracted at the end for viewing through the 'Scorer' node.

Steps taken for training:

Read Training Data
Data Preprocessing: Normalize the data into the range [0,1] or [good,fraud], and save the Normalizer model
Train DBSCAN using Euclidean Distance
Mark Outliers and Evaluate Model Results
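
The steps above can be sketched in plain Python. This is an illustrative sketch only, not the workflow's own nodes: the toy transactions, the function names (min_max_fit, min_max_apply, dbscan), and the parameter values eps=0.1 and min_pts=3 are all assumptions chosen for the example. It scales each column into [0,1], runs a simple DBSCAN with Euclidean distance, tags noise points as fraud, and counts the confusion-matrix cells that a scorer would report.

```python
import math

NOISE = -1  # label given to points that belong to no cluster

def min_max_fit(rows):
    """Return per-column (min, max); stands in for the saved normalizer model."""
    return [(min(col), max(col)) for col in zip(*rows)]

def min_max_apply(rows, model):
    """Scale every column into [0, 1] using the stored minima and maxima."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, (lo, hi) in zip(row, model)]
        for row in rows
    ]

def dbscan(points, eps, min_pts):
    """Simple O(n^2) DBSCAN with Euclidean distance; returns one label per point."""
    n = len(points)
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])  # core point: keep expanding
        cluster += 1
    return labels

# Hypothetical toy data: [amount, hour-like feature]; the last row is the anomaly.
rows = [
    [10.0, 1.0], [12.0, 1.2], [11.0, 0.9], [9.5, 1.1], [10.5, 1.0],
    [50.0, 5.0], [52.0, 5.2], [51.0, 4.9], [49.0, 5.1],
    [500.0, 9.0],
]
actual = ["good"] * 9 + ["fraud"]

model = min_max_fit(rows)                 # step 2: fit and keep the normalizer
scaled = min_max_apply(rows, model)
labels = dbscan(scaled, eps=0.1, min_pts=3)  # step 3: cluster (assumed parameters)

# Step 4: noise points are marked as potential fraud, then compared to the truth.
predicted = ["fraud" if lab == NOISE else "good" for lab in labels]
pairs = list(zip(actual, predicted))
confusion = {
    "tp": pairs.count(("fraud", "fraud")),
    "fp": pairs.count(("good", "fraud")),
    "fn": pairs.count(("fraud", "good")),
    "tn": pairs.count(("good", "good")),
}
print(labels, confusion)
```

In practice the choice of eps and min_pts determines how many transactions end up as noise, so these values would need tuning on the normalized training sample rather than being taken from this sketch.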