Inspiration
The 2017 Equifax data breach showed that traditional vulnerability scoring systems such as CVSS are not sufficient on their own for real-world prioritization. Despite having a known high-severity flaw (CVE-2017-5638), Equifax failed to patch it in time because its high exploit probability was overlooked. The same thing can happen in other sectors such as healthcare and banking, leading to massive losses and security casualties. This inspired us to build a smarter, data-driven model that doesn’t just look at theoretical severity but also considers how likely a vulnerability is to be exploited and how attackers actually use it.
What We Built
We developed a Predictive Security Scoring System (PSSS) that combines:
- CVSS (severity) from the National Vulnerability Database (NVD)
- EPSS (Exploit Prediction Scoring System) for real-world exploit likelihood
- MITRE ATT&CK mappings to link vulnerabilities with real adversary tactics and techniques
Using Python, we built a complete data pipeline that:
- Extracts and cleans CVE data from NVD.
- Predicts missing CVSS metric values using TF-IDF + Logistic Regression trained on vulnerability descriptions.
- Merges EPSS exploit probabilities for each CVE.
- Maps each CVE to ATT&CK techniques (e.g., T1190 Exploit Public-Facing Application under Initial Access, T1068 Exploitation for Privilege Escalation under Privilege Escalation).
- Computes a unified PSSS_final score using weighted contributions from CVSS, EPSS, and ATT&CK.
The result is a dynamic ranking of vulnerabilities that reflects both severity and active threat context.
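The final scoring step can be sketched roughly as follows. This is a minimal illustration, not our exact formula: the weight values, the 0-100 scale, the `attack_score` field (an ATT&CK relevance value in [0, 1]), and the placeholder CVE IDs are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Vulnerability:
    cve_id: str
    cvss: float          # CVSS base score, 0.0 to 10.0
    epss: float          # EPSS exploit probability, 0.0 to 1.0
    attack_score: float  # hypothetical ATT&CK relevance, 0.0 to 1.0


def psss_final(v, alpha=0.4, beta=0.4, gamma=0.2):
    """Weighted sum of normalized CVSS, EPSS, and ATT&CK inputs, scaled to 0-100.

    The default weights are illustrative assumptions, not tuned values.
    """
    if abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    cvss_norm = v.cvss / 10.0  # bring CVSS onto the same 0-1 scale as EPSS
    return 100.0 * (alpha * cvss_norm + beta * v.epss + gamma * v.attack_score)


def rank(vulns, **weights):
    """Return vulnerabilities sorted from highest to lowest PSSS_final."""
    return sorted(vulns, key=lambda v: psss_final(v, **weights), reverse=True)


if __name__ == "__main__":
    # Made-up records: one severe but rarely exploited, one less severe but
    # very likely to be exploited and mapped to a known technique.
    sample = [
        Vulnerability("CVE-0000-0001", cvss=9.8, epss=0.02, attack_score=0.0),
        Vulnerability("CVE-0000-0002", cvss=7.5, epss=0.90, attack_score=1.0),
    ]
    for v in rank(sample):
        print(v.cve_id, round(psss_final(v), 1))
```

With these assumed weights, the second record ranks above the first even though its CVSS score is lower, which is the behavior the ranking is meant to capture.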
🧠 What We Learned
- How to work with large JSON and CSV datasets from public cybersecurity repositories.
- How text mining and machine learning can fill gaps in structured security data.
- How to interpret and apply MITRE ATT&CK mappings to enrich vulnerability context.
- That real-world exploitability (EPSS) often outweighs raw severity (CVSS) in prioritization decisions.
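As an illustration of the gap-filling idea, here is a minimal sketch of a TF-IDF + Logistic Regression classifier that predicts one CVSS metric (Attack Vector) from a description. It assumes scikit-learn is installed; the training sentences, the record layout, and the helper names `train_metric_classifier` and `fill_missing` are made up for the example and are not our actual training set or code.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy, hand-written descriptions labeled with the CVSS Attack Vector metric.
TRAIN_TEXTS = [
    "remote attacker can execute arbitrary code via a crafted HTTP request",
    "unauthenticated remote attacker sends malicious network packets",
    "local user can gain elevated privileges via a crafted system call",
    "local attacker with shell access can read kernel memory",
]
TRAIN_LABELS = ["NETWORK", "NETWORK", "LOCAL", "LOCAL"]


def train_metric_classifier(texts, labels):
    """Fit a TF-IDF + Logistic Regression pipeline on description text."""
    model = make_pipeline(
        TfidfVectorizer(lowercase=True, stop_words="english", ngram_range=(1, 2)),
        # class_weight="balanced" reweights classes inversely to their frequency
        LogisticRegression(max_iter=1000, class_weight="balanced"),
    )
    model.fit(texts, labels)
    return model


def fill_missing(records, model, field="attack_vector"):
    """Predict `field` only where it is None, and flag which values were predicted."""
    filled = []
    for rec in records:
        rec = dict(rec)  # copy so the caller's records are left unchanged
        if rec.get(field) is None:
            rec[field] = str(model.predict([rec["description"]])[0])
            rec[field + "_predicted"] = True
        else:
            rec[field + "_predicted"] = False
        filled.append(rec)
    return filled


if __name__ == "__main__":
    clf = train_metric_classifier(TRAIN_TEXTS, TRAIN_LABELS)
    records = [
        {"cve_id": "CVE-0000-0001", "description": "remote attacker sends a crafted request", "attack_vector": None},
        {"cve_id": "CVE-0000-0002", "description": "local user escalates privileges", "attack_vector": "LOCAL"},
    ]
    for rec in fill_missing(records, clf):
        print(rec["cve_id"], rec["attack_vector"], rec["attack_vector_predicted"])
```

Keeping a `_predicted` flag next to each filled value makes it possible to tell model guesses apart from metrics that came from NVD.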
Challenges
- Parsing and cleaning thousands of CVE records with inconsistent metadata.
- Managing the class imbalance in CVSS metric prediction.
- Matching CVE IDs accurately across NVD, EPSS, and MITRE datasets.
- Selecting appropriate weights (α, β, γ) to balance severity, exploit probability, and attacker behavior.
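For the ID-matching challenge, one approach is to normalize every CVE ID to a canonical form before joining, and to record which CVEs are missing from a source rather than dropping them silently. The sketch below uses only the standard library; the row layout (a `cve` key plus `cvss`, `epss`, and `techniques` fields) and the function names are hypothetical, and the IDs are placeholders.

```python
import re

# Canonical CVE IDs look like CVE-YYYY-NNNN, with four or more trailing digits.
CVE_RE = re.compile(r"CVE-(\d{4})-(\d{4,})", re.IGNORECASE)


def normalize_cve_id(raw):
    """Return the ID as upper-case 'CVE-YYYY-NNNN', or None if it is malformed."""
    match = CVE_RE.fullmatch(raw.strip())
    if match is None:
        return None
    return f"CVE-{match.group(1)}-{match.group(2)}"


def index_by_cve(rows, key="cve"):
    """Index rows by normalized CVE ID, skipping rows with unusable IDs."""
    indexed = {}
    for row in rows:
        cve_id = normalize_cve_id(str(row.get(key, "")))
        if cve_id is not None:
            indexed[cve_id] = row
    return indexed


def join_sources(nvd_rows, epss_rows, attack_rows):
    """Left-join EPSS and ATT&CK rows onto NVD rows.

    Returns (merged, missing), where `missing` maps each CVE ID to the
    list of sources that had no entry for it.
    """
    nvd = index_by_cve(nvd_rows)
    epss = index_by_cve(epss_rows)
    attack = index_by_cve(attack_rows)
    merged, missing = {}, {}
    for cve_id, row in nvd.items():
        merged[cve_id] = {
            "cvss": row.get("cvss"),
            "epss": epss.get(cve_id, {}).get("epss"),
            "techniques": attack.get(cve_id, {}).get("techniques", []),
        }
        gaps = [name for name, src in (("epss", epss), ("attack", attack))
                if cve_id not in src]
        if gaps:
            missing[cve_id] = gaps
    return merged, missing
```

The `missing` report makes coverage gaps visible, so a CVE without an EPSS entry or an ATT&CK mapping can be handled deliberately instead of defaulting to zero unnoticed.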
Impact
By combining these three intelligence sources, our PSSS model provides context-aware, predictive vulnerability prioritization.
It helps security teams focus on vulnerabilities that are not just severe, but also likely to be exploited and aligned with known attack techniques, reducing the risk of incidents like the Equifax breach repeating.
Deployment Areas / usage by:
- SOC analysts for real-time threat prioritization
- Incident response teams for triaging alerts
- Vulnerability management teams for patch planning