FraudSense - Real-Time Financial Fraud Detection
Team Members
FraudSense was developed by a team of four graduate students from George Mason University (GMU) with expertise in data engineering, machine learning, and cloud technologies:
- Aravind Panchanthan
- Praneeth Ravirala
- Banudeep Reddy
- Keerthana Reddy Singireddy
Our collective goal was to create an efficient, scalable, and real-time fraud detection system to improve financial security by leveraging advanced data processing and machine learning techniques.
Inspiration
Financial fraud is a growing problem, leading to substantial financial losses for individuals and businesses. Traditional fraud detection systems often suffer from limited accuracy, high false-positive rates, and slow response times. Our objective was to build a real-time fraud detection system that not only identifies fraudulent transactions efficiently but also gives businesses greater control over fraud prevention. By combining data engineering and machine learning, we created a scalable, automated pipeline designed to handle large volumes of financial transactions.
What It Does
FraudSense is a near-real-time fraud detection system: it ingests financial transactions on a five-minute cycle, scores them with a machine learning model, and surfaces the results as alerts and dashboards. The system is built on serverless AWS components with IAM-based access control, and is designed to scale while keeping false positives low.
- Ingests real-time transaction data using an automated pipeline.
- Processes data efficiently using AWS and Databricks.
- Detects fraudulent transactions using machine learning models trained on financial fraud patterns.
- Generates fraud alerts by integrating with ServiceNow for automated case management.
- Provides visualization dashboards for fraud analysts to monitor and respond to fraudulent activities in real time.
How We Built It
1. System Architecture
We designed a robust fraud detection pipeline consisting of real-time ingestion, machine learning predictions, and alerting mechanisms. The architecture integrates AWS, Databricks, and Tableau for seamless automation.
2. Real-Time Data Ingestion
We started by setting up an automated Python script that runs every 5 minutes to fetch transaction data. Each batch was sent to Amazon EventBridge, which triggered an AWS Lambda function.
- IAM roles were used to ensure security and access control.
- Data was stored in DynamoDB, a NoSQL database optimized for high-speed transactions.
- Static data from the IEEE-CIS Financial Fraud Dataset was stored separately in Amazon S3 for batch processing.
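The ingestion path above can be sketched as follows. This is a minimal illustration, not the project's actual code: the source name, detail type, table name, and the transaction fields (`transaction_id`, `amount`) are hypothetical. The `table` parameter exists only so the handler can be exercised locally with a stand-in object.

```python
import json
from decimal import Decimal


def build_event_entry(txn, source="fraudsense.ingest"):
    """Shape one transaction as an entry for EventBridge PutEvents."""
    return {
        "Source": source,             # hypothetical source name
        "DetailType": "transaction",  # hypothetical detail type
        "Detail": json.dumps(txn),    # PutEvents expects the detail as a JSON string
    }


def to_dynamodb_item(txn):
    """Convert floats to Decimal, which the boto3 DynamoDB resource requires."""
    return json.loads(json.dumps(txn), parse_float=Decimal)


def handler(event, context, table=None):
    """Lambda entry point: store the transaction carried in the event detail."""
    if table is None:
        import boto3  # imported lazily so the sketch runs without AWS access
        table = boto3.resource("dynamodb").Table("transactions")  # hypothetical table name
    item = to_dynamodb_item(event["detail"])
    table.put_item(Item=item)
    return {"stored": item["transaction_id"]}
```

In the fetching script, the entry returned by `build_event_entry` would be passed to `boto3.client("events").put_events(Entries=[entry])`. EventBridge then delivers the parsed payload to Lambda under the event's `detail` key.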
3. Data Processing & ML Pipeline
- The real-time data from DynamoDB and the static data from Amazon S3 were synced to Databricks.
- IAM access was configured using instance profiles, which was a complex task that took several hours to troubleshoot.
- Data preprocessing was performed using PySpark, ensuring efficient feature extraction and transformation.
4. Machine Learning Model Development
- We implemented a LightGBM model, trained on seven key features inspired by a Kaggle solution.
- Feature selection focused on transactional attributes such as user identity, card details, merchant information, device type, and time-based components.
- The model was trained on Databricks and tuned for fraud detection in financial transactions.
- Predictions were generated every 5 minutes to ensure real-time fraud detection.
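As an illustration of the time-based components mentioned above, the sketch below derives hour and day features from a seconds offset. It assumes the timestamp column behaves like the IEEE-CIS `TransactionDT` field, which is commonly described as seconds elapsed since an undisclosed reference time. The feature names are hypothetical and may not match the seven features the model actually used.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 86400


def time_features(transaction_dt):
    """Derive time components from a seconds offset (assumed TransactionDT semantics)."""
    day_index = transaction_dt // SECONDS_PER_DAY
    return {
        "hour_of_day": (transaction_dt // SECONDS_PER_HOUR) % 24,
        "day_index": day_index,
        # Relative to the unknown reference day, so 0 is not necessarily Monday.
        "day_of_week": day_index % 7,
    }
```

Because the reference time is unknown, these values are only meaningful relative to each other, which is usually enough for a tree-based model such as LightGBM to pick up daily and weekly patterns.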
5. Fraud Alerts and Monitoring
- ServiceNow integration was implemented using REST APIs to automatically create fraud case tickets for high-risk transactions.
- A SQL Warehouse was connected to Databricks, enabling fraud analysts to visualize transaction trends and identify fraud patterns.
- Email notifications were configured to alert users about potentially fraudulent transactions, enhancing fraud prevention measures.
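The ServiceNow step can be sketched with the standard library alone. The example below builds, but does not send, a POST request to the ServiceNow Table API for the `incident` table using basic authentication. The instance name, credentials, and the wording of the ticket fields are placeholders, and the project may have used a different table or authentication method.

```python
import base64
import json
import urllib.request


def build_incident_request(instance, user, password, txn_id, score):
    """Build a Table API request that would open an incident for one transaction."""
    url = f"https://{instance}.service-now.com/api/now/table/incident"
    payload = {
        "short_description": f"Possible fraud: transaction {txn_id}",
        "description": f"Model fraud score {score:.2f}",
        "urgency": "1",  # 1 is the highest urgency on the incident table
    }
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit the ticket. Keeping request construction separate from sending makes the payload easy to inspect in tests.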
Tableau Dashboard
We developed a fraud detection dashboard using Tableau to provide real-time visualization of fraud patterns and financial risks.
Challenges We Ran Into
- Complex AWS Integration: Syncing data across EventBridge, Lambda, DynamoDB, and S3 required extensive IAM role configurations, taking 3-4 hours to resolve.
- Feature Selection & Model Optimization: Finding the right features for fraud detection was time-consuming, requiring several iterations to optimize model accuracy.
- Real-Time Processing Constraints: Ensuring that fraud predictions were generated within minutes while handling large-scale transactions was challenging.
- Webhook Implementation: Setting up a webhook-based sync between DynamoDB and Databricks required extensive testing to achieve minimal latency.
Accomplishments That We're Proud Of
- Successfully built a real-time fraud detection pipeline with end-to-end automation.
- Implemented real-time fraud predictions with a 5-minute interval between data ingestion and prediction generation.
- Achieved seamless integration with ServiceNow for automated fraud case tracking.
- Developed a fully serverless solution that is scalable and cost-efficient.
What We Learned
Throughout the development of FraudSense, our team gained valuable insights into various aspects of real-time fraud detection, cloud infrastructure, and machine learning model deployment. Here are the key learnings:
Cloud Infrastructure & Automation
- Efficiently leveraging AWS services such as Lambda, DynamoDB, EventBridge, and S3 for real-time fraud detection.
- Implementing IAM roles and policies to manage security and access control across services.
- Troubleshooting cross-service integration challenges in AWS, particularly with event-driven architectures.
Big Data Processing & Machine Learning
- Using PySpark to process large-scale financial transaction data efficiently.
- Implementing LightGBM for fraud detection and tuning hyperparameters for improved performance.
- Deploying a real-time prediction pipeline with Databricks and ensuring continuous model retraining.
Fraud Detection Strategies & Model Optimization
- Feature engineering based on transaction metadata such as user identity, device type, transaction amount, and merchant behavior.
- Handling imbalanced datasets using SMOTE and threshold tuning to reduce false positives.
- Integrating ServiceNow for fraud case tracking, ensuring that flagged fraudulent transactions trigger automatic case creation.
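The threshold tuning mentioned above can be illustrated with a small pure-Python sketch (SMOTE is not shown). It picks the lowest score threshold whose false-positive rate stays under a cap, which maximizes the number of fraud cases caught within that budget. The function name and the 5% default cap are assumptions made for illustration, not values from the project.

```python
def pick_threshold(scores, labels, max_false_positive_rate=0.05):
    """Return (threshold, true_positives, false_positives), or None if no threshold fits.

    A transaction is flagged when its score is >= threshold; labels use 1 for fraud.
    """
    negatives = sum(1 for y in labels if y == 0)
    best = None
    # Lowering the threshold can only add flagged transactions, so the
    # false-positive rate never decreases and we can stop at the first violation.
    for threshold in sorted(set(scores), reverse=True):
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fpr = fp / negatives if negatives else 0.0
        if fpr > max_false_positive_rate:
            break
        best = (threshold, tp, fp)
    return best
```

In practice the scores would come from the model's predicted probabilities on a held-out validation set, and the cap would be set by how many false alerts analysts can reasonably review.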
Visualization & Monitoring
- Creating Tableau dashboards for fraud detection monitoring, helping analysts make data-driven decisions.
- Automating fraud alerts using email and ServiceNow notifications, improving fraud prevention workflows.
What's Next for FraudSense
While our system successfully detects fraudulent transactions in real time, we plan to extend FraudSense with more advanced capabilities:
Enhancing Fraud Detection Accuracy
- Implement ensemble learning techniques by combining LightGBM, XGBoost, and Random Forest models to improve fraud classification.
- Fine-tune model hyperparameters using Bayesian Optimization and AutoML.
- Explore deep learning approaches like LSTMs and Transformer-based models for sequence anomaly detection.
Scaling to Enterprise-Level Use Cases
- Expand the fraud detection model to support multiple financial institutions.
- Integrate real-time API calls for instant fraud checks in banking applications.
- Implement multi-region AWS deployments to handle large-scale transactions across global markets.
Blockchain-Based Fraud Prevention
- Research and implement blockchain technology for secure transaction validation.
- Use smart contracts to flag suspicious transactions before execution.
Advanced AI-Powered Fraud Detection
- Incorporate unsupervised anomaly detection techniques such as Autoencoders and Isolation Forests to detect previously unseen fraud patterns.
- Implement reinforcement learning to dynamically adjust fraud thresholds based on transaction risk.
Enhancing Real-Time Dashboards & Interpretability
- Develop Tableau-powered fraud heatmaps to identify high-risk geographical regions.
- Integrate SHAP and LIME interpretability models to explain fraud detection decisions.
- Implement Grafana dashboards for real-time monitoring of data ingestion and model performance.
System Performance and Results
FraudSense was tested on live transaction data and successfully identified fraudulent patterns with high accuracy and minimal false positives.
GitHub Repository
All code, datasets, and system architecture details can be found in our GitHub repository:
Project Demo Video
Watch the complete demo of FraudSense in action:
Final Thoughts
FraudSense is a real-time fraud detection system that integrates machine learning, big data processing, and cloud automation. By leveraging AWS, Databricks, and ServiceNow, we have built a scalable, automated fraud prevention solution.
This project was developed as part of HackNYU 2025 under the Capital One FinTech Track, aiming to revolutionize fraud detection in financial transactions.
Built With
- amazon-web-services
- anomaly-detection
- apache-spark
- automated-retraining
- banking-security
- big-data
- capital-one-fintech
- ci-cd
- cloud-architecture
- cloud-computing
- cloud-storage
- cyber-security
- data-engineering
- databricks
- dynamodb
- eventbridge
- feature-engineering
- financial-analytics
- financial-security
- fintech
- fintech-hackathon
- fraud-detection
- fraud-prediction
- github
- github-actions
- grafana
- hacknyu
- iam
- lambda
- lightgbm
- machine-learning
- ml-pipeline
- mllib
- model-training
- nosql
- pyspark
- real-time-detection
- real-time-processing
- rest-api
- risk-management
- serverless
- serverless-computing
- sql
- supervised-learning
- tableau
- transaction-monitoring
- webhook