Inspiration
Modern SaaS platforms rely heavily on API tokens to enable automation, integrations, and scalability. However, these tokens often bypass traditional login monitoring, creating a covert security gap where compromised tokens can operate undetected. This project was inspired by the need to address that blind spot by shifting detection from authentication-based monitoring to behavior-based analysis.
What it does
The system detects compromised SaaS API tokens by analyzing how each token behaves, identifying anomalies, and generating risk-based alerts for faster threat detection and response.
How we built it
We built an end-to-end detection pipeline using Python, Docker, and Jupyter. We started by generating and ingesting synthetic SaaS API logs, then normalized the data into a structured, security-focused schema. Next, we developed behavioral baselines for each API token by analyzing patterns such as IP usage, geographic origin, endpoint access, and request volume. We implemented anomaly detection to identify deviations like traffic spikes, unusual locations, and abnormal access behavior. These signals were combined into a risk scoring engine that assigns severity levels and generates actionable alerts. Finally, we built an interactive dashboard using Streamlit to visualize detections, monitor token activity, and support real-time investigation.
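As a rough illustration of the baseline and anomaly steps described above, here is a minimal sketch using only the Python standard library. The event schema (`token_id`, `ip`, `country`, `endpoint`, `hour`), the `TokenBaseline` class, the function names, and the `spike_sigma` threshold are all assumptions made for this example, not the project's actual code.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from statistics import mean, pstdev


@dataclass
class TokenBaseline:
    """Hypothetical per-token profile learned from historical logs."""
    ips: set = field(default_factory=set)
    countries: set = field(default_factory=set)
    endpoints: set = field(default_factory=set)
    hourly_counts: list = field(default_factory=list)


def build_baselines(events):
    """Build one baseline per token from normalized events (assumed schema)."""
    baselines = defaultdict(TokenBaseline)
    per_hour = defaultdict(lambda: defaultdict(int))
    for e in events:
        b = baselines[e["token_id"]]
        b.ips.add(e["ip"])
        b.countries.add(e["country"])
        b.endpoints.add(e["endpoint"])
        per_hour[e["token_id"]][e["hour"]] += 1
    for token_id, hours in per_hour.items():
        baselines[token_id].hourly_counts = list(hours.values())
    return dict(baselines)


def detect_anomalies(baseline, window, spike_sigma=3.0):
    """Return the signals raised by one hour of activity for a single token."""
    signals = []
    if any(e["ip"] not in baseline.ips for e in window):
        signals.append("new_ip")
    if any(e["country"] not in baseline.countries for e in window):
        signals.append("new_country")
    if any(e["endpoint"] not in baseline.endpoints for e in window):
        signals.append("new_endpoint")
    if baseline.hourly_counts:
        mu = mean(baseline.hourly_counts)
        sigma = pstdev(baseline.hourly_counts)
        # Floor sigma at 1.0 so a perfectly flat history does not flag tiny changes.
        if len(window) > mu + spike_sigma * max(sigma, 1.0):
            signals.append("traffic_spike")
    return signals


def make_events(token_id, ip, country, endpoint, hour, n):
    """Generate n identical synthetic events for one hour (illustrative only)."""
    return [
        {"token_id": token_id, "ip": ip, "country": country,
         "endpoint": endpoint, "hour": hour}
        for _ in range(n)
    ]


# Three quiet hours of history, then one burst from an unseen network.
history = [e for hour in range(3)
           for e in make_events("tok_a", "10.0.0.1", "US", "/v1/users", hour, 10)]
baselines = build_baselines(history)
suspicious = make_events("tok_a", "203.0.113.9", "RO", "/v1/export", 3, 100)
print(detect_anomalies(baselines["tok_a"], suspicious))
```

Returning named signals rather than a single boolean keeps each detection explainable, since every alert can list exactly which part of the baseline was violated.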
Challenges we ran into
One of the main challenges we faced was working with synthetic data that initially lacked realistic behavioral patterns, which made anomaly detection ineffective. We addressed this by redesigning the data generation process to simulate real-world token behavior, including stable identities, constrained IP pools, and geographic consistency. Another challenge was defining what “normal” behavior looks like for each token, as variability across tenants made baseline modeling complex. We also encountered cold-start challenges, where components initially lacked sufficient historical context or baseline data to make accurate decisions. Additionally, we had to carefully balance the risk scoring system to reduce false positives while still capturing meaningful threats. Finally, correlating multiple signals such as traffic spikes and network deviations required thoughtful tuning to ensure accurate and explainable detection results.
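One possible way to handle tokens that lack historical context is to damp their risk scores until enough activity has been observed. The sketch below is an assumption about how that could work, not a description of what the project implements; the function names and the `min_events`, `min_days`, and `floor` values are hypothetical.

```python
def baseline_confidence(observed_events, observed_days, min_events=200, min_days=7):
    """Return a factor in [0, 1]; 1.0 means the baseline has enough history."""
    event_ratio = min(observed_events / min_events, 1.0)
    day_ratio = min(observed_days / min_days, 1.0)
    # The weaker of the two dimensions limits how much the baseline is trusted.
    return min(event_ratio, day_ratio)


def adjusted_score(raw_score, confidence, floor=0.25):
    """Scale a raw risk score down while the token's baseline is immature.

    The floor keeps new tokens from being silenced entirely, because a stolen
    token can also be a recently issued one.
    """
    return raw_score * (floor + (1.0 - floor) * confidence)


# A token with only 50 events in its first week versus a well-established one.
young = baseline_confidence(observed_events=50, observed_days=7)
mature = baseline_confidence(observed_events=1000, observed_days=30)
print(adjusted_score(80, young), adjusted_score(80, mature))
```

Damping rather than suppressing alerts is one way to trade off false positives on new tokens against missed detections; the right floor would need tuning against real traffic.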
Accomplishments that we're proud of
We are proud of building a complete, end-to-end detection system that goes beyond traditional security approaches by focusing on behavioral analysis rather than authentication alone. We successfully designed a pipeline that ingests, normalizes, and analyzes API activity at scale, producing meaningful baselines and accurate anomaly detection. A key accomplishment was simulating realistic attack scenarios, such as stolen tokens and botnet-like behavior, which allowed us to validate our detection logic. We also developed a risk scoring engine that prioritizes threats in a clear and actionable way, along with an interactive dashboard that makes the results easy to interpret and investigate. Overall, we are especially proud of turning a complex security problem into a practical, explainable solution that can be extended to real-world SaaS environments.
What we learned
Throughout this project, we learned that behavioral analytics is far more effective than traditional rule-based approaches when it comes to detecting modern threats, especially in token-based authentication systems. We gained a deeper understanding of how API tokens introduce unique security risks and why monitoring usage patterns is critical. We also learned the importance of realistic data modeling, as meaningful detection depends heavily on having accurate behavioral baselines. Additionally, we saw how combining multiple signals, such as traffic volume, IP diversity, and geographic deviations, significantly improves detection accuracy and confidence. Finally, we learned how to design systems that are not only technically effective but also explainable and actionable for real-world security operations.
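To make the idea of combining signals concrete, here is a small sketch of a weighted risk score with severity levels. The signal names, the weights in `SIGNAL_WEIGHTS`, and the severity thresholds are illustrative assumptions, not the values used in the project.

```python
# Hypothetical weights: how much each behavioral signal contributes to risk.
SIGNAL_WEIGHTS = {
    "traffic_spike": 30,
    "new_ip": 20,
    "new_country": 35,
    "new_endpoint": 15,
}


def risk_score(signals):
    """Sum the weights of the distinct signals raised, capped at 100."""
    score = sum(SIGNAL_WEIGHTS.get(s, 0) for s in set(signals))
    return min(score, 100)


def severity(score):
    """Map a numeric score to a severity label (thresholds are assumptions)."""
    if score >= 70:
        return "high"
    if score >= 40:
        return "medium"
    if score > 0:
        return "low"
    return "none"


def build_alert(token_id, signals):
    """Produce an alert that names the contributing signals for explainability."""
    score = risk_score(signals)
    return {
        "token_id": token_id,
        "score": score,
        "severity": severity(score),
        "reasons": sorted(set(signals)),
    }


print(build_alert("tok_a", ["new_ip"]))
print(build_alert("tok_a", ["new_ip", "new_country", "traffic_spike"]))
```

With these example weights, a single new IP yields only a low-severity alert, while a new IP, a new country, and a traffic spike together cross the high threshold, which reflects the point that correlated signals carry more confidence than any one alone.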
What's next for SaaS API Abuse and Compromised Token Detection
Next, we plan to enhance the system by integrating real-time streaming data to enable live detection and response instead of batch processing. We aim to incorporate machine learning models to improve anomaly detection beyond rule-based methods and adapt to evolving attack patterns. Additionally, we plan to strengthen network profiling by defining stricter authorization baselines and detecting more subtle geographic and behavioral deviations. Another key step is integrating the system with SIEM and alerting platforms to support automated response actions such as token revocation or access restriction. Finally, we intend to refine the dashboard with deeper analytics and scalability features, making the solution production-ready for real-world SaaS environments.
Built With
- Python
- pandas, NumPy, and scikit-learn for data processing and analysis
- Docker to containerize the application and ensure a consistent runtime environment
- Jupyter Notebook for development, testing, and iterative analysis
- Streamlit for the interactive dashboard, enabling real-time visualization and investigation of detection results
- CSV-based datasets to simulate SaaS API logs
- Custom data pipelines for ingestion, normalization, and feature engineering
- Modular pipeline design for the system architecture