ResuMatch

Inspiration

We noticed a recurring problem in the job application process: recruiters are overwhelmed and struggle to find qualified candidates, and hiring is often long and tiring. When hundreds of applications can flood a single job post, the real problem is resume overload.

We asked: What if we could build a tool that matches resumes to job descriptions intelligently, not just by keywords but by meaning and intent?

That led to ResuMatch: a smart AI assistant that helps recruiters screen resumes more effectively. It scores each resume and ranks resumes by how well they match the job description, making candidate selection easier and faster.

What it does

ResuMatch enables you to:

Upload a PDF resume and paste a job description
Automatically extract and classify resume sections (Experience, Skills, Education, etc.)
Extract relevant skills, degrees, languages, experience levels, and keywords from the JD
Semantically compare resume content with job requirements
Generate a compatibility report with breakdown scores:
- Tech Skills
- Soft Skills
- Language
- Experience
- Degree
- Total Score
Score each resume bullet point based on impact and relevance
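
To make the compatibility report concrete, here is a minimal sketch of how per-category scores could be combined into a total. The write-up does not say how the total is computed, so the weighted average, the weight values, the 0-100 scale, and the names `WEIGHTS` and `build_report` are all assumptions for illustration.

```python
# Hypothetical weights: the project does not document how the total is derived.
# They sum to 1.0 so the total stays on the same 0-100 scale as the inputs.
WEIGHTS = {
    "tech_skills": 0.35,
    "soft_skills": 0.15,
    "language": 0.10,
    "experience": 0.25,
    "degree": 0.15,
}


def build_report(scores: dict) -> dict:
    """Combine per-category scores (each 0-100) into a report with a weighted total."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    total = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    report = {name: round(scores[name], 1) for name in WEIGHTS}
    report["total"] = round(total, 1)
    return report
```

Keeping the weights in one table makes it easy to tune how much each category counts, for example raising the weight on experience for senior roles.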

How we built it

🧾 Resume Parsing: Used PyMuPDF to extract raw text from PDFs, then cleaned and formatted it into line-based data.
🧠 Resume Section Classification: Trained a LogisticRegression model on TF-IDF features to label lines as Experience, Education, Skills, Projects, Certifications, or Other.
🔍 Job Description Analysis: Extracted skills, experience, degree requirements, and key phrases using regex, tokenization (nltk), TF-IDF, and embeddings.
🤖 Semantic Matching: Used SentenceTransformers (all-MiniLM-L6-v2) to compare resume phrases and JD keywords, allowing fuzzy matches like “collaboration” ≈ “worked with a team.”
📈 Resume Bullet Scoring: Built a GradientBoostingRegressor trained on 70+ real and synthetic bullet points scored for impact, helping surface stronger resume content.
🌐 Frontend & API: Flask backend with a custom HTML/CSS/JS UI for uploading resumes and displaying results.
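
The rule-based part of the job description analysis can be sketched as follows. This is an illustration only, not the project's code: it swaps nltk tokenization for a plain regex tokenizer so it runs with the standard library alone, and the skill vocabulary, the patterns, and the name `analyze_jd` are hypothetical.

```python
import re

# Hypothetical skill vocabulary; a real system would use a much larger list.
KNOWN_SKILLS = {"python", "sql", "flask", "docker", "communication"}

# "3+ years", "5 yrs", "2 year" -> captures the number.
YEARS_RE = re.compile(r"(\d+)\s*\+?\s*(?:years?|yrs?)")

# Degree words, with an optional plural "s" (e.g. "masters").
DEGREE_RE = re.compile(r"\b(bachelor|master|phd|doctorate)s?\b")


def analyze_jd(text: str) -> dict:
    """Pull known skills, the minimum years of experience, and degree mentions from a JD."""
    lowered = text.lower()
    # Simple regex tokenization (an assumption; the write-up mentions nltk).
    tokens = set(re.findall(r"[a-z][a-z+#]*", lowered))
    years = [int(n) for n in YEARS_RE.findall(lowered)]
    degrees = sorted({m.group(1) for m in DEGREE_RE.finditer(lowered)})
    return {
        "skills": sorted(KNOWN_SKILLS & tokens),
        "min_years": min(years) if years else None,
        "degrees": degrees,
    }
```

Rules like these are cheap and predictable, which is why they pair well with TF-IDF and embeddings for the phrases that fixed patterns cannot catch.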

Challenges we ran into

⚠️ NLTK Compatibility Issues with Python 3.13 — we had to downgrade to ensure tokenizer functionality
🧩 Resume Variability: Real resumes have messy formatting; section labeling required smart keyword design and a flexible classifier
🔍 Hard to Find ML Datasets for JD-to-resume matching: with few datasets available, we had to work with limited data and adapt how we used ML models
💬 Synonym Semantics: Words like “teamwork” and “collaborated” don't match via keywords. Hardcoding synonym lists was time-consuming and inaccurate, and we could not find an ML model that takes a JD keyword and resume keywords and returns a similarity score.
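
The synonym problem above is what embedding similarity addresses. Below is a minimal sketch, assuming an `embed` callable that turns a list of strings into a list of vectors; the function names and the 0.5 threshold are hypothetical, not taken from the project.

```python
import math


def cosine(a, b) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_keywords(jd_keywords, resume_phrases, embed, threshold=0.5):
    """Map each JD keyword to its best (phrase, score) match, or None below the threshold."""
    if not resume_phrases:
        return {kw: None for kw in jd_keywords}
    jd_vecs = embed(jd_keywords)
    resume_vecs = embed(resume_phrases)
    matches = {}
    for kw, kw_vec in zip(jd_keywords, jd_vecs):
        # Score every resume phrase against this keyword and keep the best one.
        scored = [(p, cosine(kw_vec, p_vec)) for p, p_vec in zip(resume_phrases, resume_vecs)]
        best_phrase, best_score = max(scored, key=lambda pair: pair[1])
        matches[kw] = (best_phrase, best_score) if best_score >= threshold else None
    return matches


# With the sentence-transformers package installed, the model named in this
# write-up could be plugged in as the encoder:
#   from sentence_transformers import SentenceTransformer
#   embed = SentenceTransformer("all-MiniLM-L6-v2").encode
```

Passing the encoder in as a parameter keeps the matching logic testable without downloading a model, and the threshold would need tuning on real resume and JD pairs.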

Accomplishments that we're proud of

🚀 Built a full-stack resume-job matcher in under 36 hours
✅ Achieved strong section classification accuracy with only ~70 training examples
🧠 Developed a working semantic skill matcher with SentenceTransformers
📊 Resume bullet scoring felt natural and gave helpful feedback to users
🧩 Created a smart scoring system for degree and experience matching using NLP
💡 Identified and integrated WordNet, a lexical database from Princeton University.

What we learned

Real-world resumes are noisy — robust preprocessing matters
Building NLP pipelines for meaning (not just text) requires combining ML, rules, and embeddings
Simple models (logistic regression, TF-IDF) can go far when thoughtfully applied
Designing for recruiters vs. applicants demands different UX goals
Semantic similarity is a better signal than keyword overlap — and harder to build well
Comparing different writing styles and word choices is hard without a proper ML model
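
As an illustration of the point about simple models, a TF-IDF plus logistic regression line classifier fits in a few lines of scikit-learn. This is a sketch under assumptions: the training lines are invented, only three of the project's labels are shown, and the settings (`ngram_range`, `max_iter`) are not taken from the project.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny hypothetical training set; the real one had about 70 labeled examples.
TRAIN = [
    ("Software Engineer at Acme Corp, 2021-2023", "Experience"),
    ("Led a team of 4 engineers to ship a payments API", "Experience"),
    ("Interned at a startup building data pipelines", "Experience"),
    ("B.Sc. in Computer Science, State University", "Education"),
    ("Master of Science in Data Science, GPA 3.8", "Education"),
    ("Graduated with honors from university", "Education"),
    ("Python, SQL, Docker, Flask", "Skills"),
    ("Languages: Java, C++, JavaScript", "Skills"),
    ("Tools: Git, Docker, Kubernetes", "Skills"),
]


def train_section_classifier(examples):
    """Fit a TF-IDF + logistic regression pipeline on (line, label) pairs."""
    lines, labels = zip(*examples)
    model = make_pipeline(
        TfidfVectorizer(lowercase=True, ngram_range=(1, 2)),  # unigrams and bigrams
        LogisticRegression(max_iter=1000),
    )
    model.fit(list(lines), list(labels))
    return model


model = train_section_classifier(TRAIN)
# predict() returns one label per input line; predict_proba() gives class probabilities.
predicted = model.predict(["Python, Docker, Git"])[0]
```

With so little data, predictions on unseen lines should be treated as rough guesses, which is why pairing the classifier with keyword rules is a sensible fallback.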

What's next for ResuMatch

📂 Batch Mode — Upload multiple resumes to rank the top candidates for a job
🧠 Model Feedback Loop — Learn from recruiter decisions to improve accuracy
🔍 LinkedIn Integration — Chrome extension to score LinkedIn profiles against JDs
✍️ Resume Rewrite Suggestions — Let users get AI-generated improvements on their resume
📈 Data Expansion — Broader, labeled dataset for resume bullet scoring and semantic similarity
🧑‍💻 User Portal — Allow job seekers to save resumes, track job matches, and iteratively improve