Fraud Detection using DBSCAN Clustering Algorithm
This workflow uses the DBSCAN clustering algorithm to detect fraud by identifying outliers in credit card transaction data. Density-based spatial clustering of applications with noise (DBSCAN) is an unsupervised clustering algorithm that works well when the density of the data does not vary significantly across different parts of the dataset. We normalize the training data and sample a subset for analysis; points that DBSCAN labels as outliers (noise) are tagged as potential fraud. Metrics are extracted at the end and can be viewed through the 'Scorer' node.
Steps taken for training:
1. Read Training Data
2. Data Preprocessing: Normalize the data into the range [0,1] or [good,fraud] and save the Normalizer model
3. Train DBSCAN using Euclidean Distance
4. Mark Outliers and Evaluate Model Results
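The four steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not the workflow's own nodes: the toy transactions, the feature columns (amount, hour), the eps and min_pts values, and all function names are hypothetical, and the DBSCAN shown is a simple O(n^2) version.

```python
import math
import random

NOISE = -1  # label for points that belong to no cluster


def min_max_normalize(rows):
    """Scale every column into [0, 1]; also return the (min, max) pairs so the
    same scaling can be reapplied later (a stand-in for the saved normalizer)."""
    bounds = [(min(col), max(col)) for col in zip(*rows)]
    scaled = [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, (lo, hi) in zip(row, bounds)]
        for row in rows
    ]
    return scaled, bounds


def region_query(points, i, eps):
    """Indices of all points within Euclidean distance eps of points[i]."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; outliers get the label NOISE."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
    return labels


def confusion_counts(predicted, actual):
    """Count true/false positives and negatives, treating fraud as positive."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for p, a in zip(predicted, actual):
        key = ("t" if p == a else "f") + ("p" if p else "n")
        counts[key] += 1
    return counts


# Step 1: hypothetical training data, 200 ordinary transactions (amount, hour)
# plus 3 isolated ones that play the role of fraud.
rng = random.Random(0)
good = [(rng.uniform(40, 60), rng.uniform(10, 15)) for _ in range(200)]
fraud = [(950.0, 2.0), (500.0, 23.0), (950.0, 23.0)]
rows = good + fraud
actual = [False] * len(good) + [True] * len(fraud)

# Step 2: normalize into [0, 1] and keep the bounds for later reuse.
scaled, bounds = min_max_normalize(rows)

# Step 3: cluster with Euclidean distance (eps and min_pts are assumed values).
labels = dbscan(scaled, eps=0.2, min_pts=5)

# Step 4: mark noise points as potential fraud and evaluate.
flags = [label == NOISE for label in labels]
counts = confusion_counts(flags, actual)
print(counts)
```

On real transaction data the choice of eps and min_pts matters far more than it does on this toy set, and a pairwise-distance DBSCAN like this one scales poorly, which is one reason to sample a subset before clustering.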