XG Fraud Buster - Machine Learning Model to Detect Fraud
Inspiration
Two weeks ago, my debit card was hacked, and I had to deal with unauthorized transactions and fraud investigations. This experience made me wonder how fraud detection systems work and what technology banks use to identify suspicious activity. I decided to build a project to explore machine learning models for fraud detection and understand how AI can help prevent financial fraud in real time.
What It Does
XG Fraud Buster is a machine learning model that detects fraudulent credit card transactions. It processes financial transaction data, identifies key fraud indicators, and uses Random Forest and XGBoost to classify transactions as fraudulent or legitimate. The model optimizes for high recall and precision to catch fraud while minimizing false alarms.
Presentation Slides
https://countryfriedcreative.com/media/Hacklytics-XG-Fraud-Buster-2025.pdf
How We Built It
Data Preprocessing
- Used a publicly available credit card fraud dataset.
- Scaled numerical features and handled missing data.
- Applied SMOTE (Synthetic Minority Over-sampling Technique) to balance the dataset.
Model Training & Selection
- Trained Random Forest and XGBoost classifiers.
- Tuned hyperparameters for better performance.
- Split the dataset into 80% training and 20% testing.
Model Evaluation
- Compared models using accuracy, precision, recall, F1-score, and ROC-AUC.
- Found that XGBoost outperformed Random Forest, achieving higher fraud detection accuracy with fewer false positives.
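Computing that metric suite with scikit-learn might look like this (Random Forest shown for brevity; the same calls apply to the XGBoost model):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          stratify=y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

pred = model.predict(X_te)
proba = model.predict_proba(X_te)[:, 1]  # ROC-AUC needs scores, not labels

metrics = {
    "accuracy": accuracy_score(y_te, pred),
    "precision": precision_score(y_te, pred, zero_division=0),
    "recall": recall_score(y_te, pred),
    "f1": f1_score(y_te, pred),
    "roc_auc": roc_auc_score(y_te, proba),
}
print(metrics)
```

On imbalanced data, precision, recall, and ROC-AUC are far more informative than raw accuracy, since predicting "legitimate" for everything already scores ~95% accuracy here.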
Challenges We Ran Into
- Handling Class Imbalance – Fraud cases were rare, making it difficult to train the models effectively. We used SMOTE to generate synthetic fraud samples.
- Reducing False Positives – Some legitimate transactions were flagged as fraud. Tuning precision and recall helped find the right balance.
- Computational Costs – XGBoost performed better but required more computational power than Random Forest.
- Model Explainability – Understanding why the model flagged certain transactions as fraud remains a challenge for real-world applications.
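One common way to trade precision against recall, as described above, is to adjust the decision threshold on predicted fraud probabilities rather than retraining. A minimal sketch on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score

X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          stratify=y, random_state=0)
proba = RandomForestClassifier(random_state=0).fit(X_tr, y_tr) \
    .predict_proba(X_te)[:, 1]

# Lowering the threshold below 0.5 catches more fraud (higher recall)
# at the cost of more false alarms (lower precision), and vice versa.
results = {}
for threshold in (0.3, 0.5, 0.7):
    pred = (proba >= threshold).astype(int)
    results[threshold] = (precision_score(y_te, pred, zero_division=0),
                          recall_score(y_te, pred))
    print(f"threshold={threshold}: precision={results[threshold][0]:.2f}, "
          f"recall={results[threshold][1]:.2f}")
```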
Accomplishments That We're Proud Of
- Successfully built a fraud detection model with 99.1% accuracy using XGBoost.
- Implemented SMOTE to handle class imbalance and improve fraud detection.
- Optimized the model to reduce false positives, making it more practical for real-world use.
- Learned how machine learning models can enhance financial security and prevent fraud.
What We Learned
- XGBoost is highly effective for fraud detection due to its ability to handle complex patterns in data.
- Balancing precision and recall is critical to avoid too many false positives while still catching fraud.
- Feature engineering matters – selecting the right features significantly impacts model performance.
- AI-driven fraud detection is scalable, but it requires ongoing monitoring and adjustments to remain effective.
What's Next for XG Fraud Buster - Machine Learning Model to Detect Fraud
- Real-Time Fraud Detection – Implement the model to analyze transactions in real time.
- Explainability & Transparency – Use SHAP values to better understand why the model flags certain transactions.
- Adaptive Learning – Train the model to continuously improve by learning from new fraud cases.
- Integration with Financial Systems – Explore how this model could be applied to banking and e-commerce fraud detection.
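While full SHAP integration is future work, a lightweight first step toward explainability is ranking features by the tree model's built-in importances (shown here with a Random Forest on synthetic data; SHAP would add per-transaction attributions on top of this global view):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=8,
                           weights=[0.95, 0.05], random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

# Impurity-based importances sum to 1; higher means the feature
# contributed more to the model's splits overall
ranking = np.argsort(model.feature_importances_)[::-1]
print("features ranked by importance:", ranking)
```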
Built With
- github (version control)
- imbalanced-learn (handling class imbalance)
- jupyter-notebook (development)
- kaggle
- numpy
- pandas (data processing)
- python
- scikit-learn (machine learning)
- xgboost (gradient boosting)