Inspiration

We noticed a recurring problem in the job application process: recruiters are overwhelmed and struggle to find qualified candidates, and hiring is often long and tiring. When hundreds of applications can flood a single job post, the real problem is resume overload.

We asked: What if we could build a tool that intelligently matches resumes to job descriptions, not just by keywords, but by meaning and intent?

That led to ResuMatch: a smart AI assistant that helps recruiters screen resumes more effectively. It scores each resume and ranks resumes by how well they match the job description, making candidate selection easier and faster.

What it does

ResuMatch enables you to:

  • Upload a PDF resume and paste a job description
  • Automatically extract and classify resume sections (Experience, Skills, Education, etc.)
  • Extract relevant skills, degrees, languages, experience levels, and keywords from the JD
  • Semantically compare resume content with job requirements
  • Generate a compatibility report with breakdown scores:
    • Tech Skills
    • Soft Skills
    • Language
    • Experience
    • Degree
    • Total Score
  • Score each resume bullet point based on impact and relevance
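
As an illustration of how the breakdown scores in the report could roll up into a Total Score, here is a minimal sketch using a weighted average. The weights and the helper names (`WEIGHTS`, `total_score`, `build_report`) are hypothetical and chosen only for this example; the write-up does not state how ResuMatch actually weights each category.

```python
# Hypothetical category weights (percent); the real weighting is not documented.
WEIGHTS = {
    "Tech Skills": 35,
    "Soft Skills": 15,
    "Language": 10,
    "Experience": 25,
    "Degree": 15,
}


def total_score(breakdown: dict) -> float:
    """Combine per-category scores (0-100) into a weighted total (0-100)."""
    missing = set(WEIGHTS) - set(breakdown)
    if missing:
        raise ValueError(f"missing category scores: {sorted(missing)}")
    weighted_sum = sum(WEIGHTS[name] * breakdown[name] for name in WEIGHTS)
    return round(weighted_sum / sum(WEIGHTS.values()), 1)


def build_report(breakdown: dict) -> dict:
    """Return the category scores in a fixed order plus the total."""
    report = {name: breakdown[name] for name in WEIGHTS}
    report["Total Score"] = total_score(breakdown)
    return report


report = build_report({
    "Tech Skills": 90,
    "Soft Skills": 60,
    "Language": 100,
    "Experience": 70,
    "Degree": 80,
})
for category, score in report.items():
    print(f"{category}: {score}")
```

Keeping the weights in one table makes it easy to tune how much each category matters for a given role.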

How we built it

  • 🧾 Resume Parsing: Used PyMuPDF to extract raw text from PDFs, then cleaned and formatted it into line-based data.
  • 🧠 Resume Section Classification: Trained a LogisticRegression model on TF-IDF features to label lines as Experience, Education, Skills, Projects, Certifications, or Other.
  • 🔍 Job Description Analysis: Extracted skills, experience, degree requirements, and key phrases using regex, tokenization (nltk), TF-IDF, and embeddings.
  • 🤖 Semantic Matching: Used SentenceTransformers (all-MiniLM-L6-v2) to compare resume phrases and JD keywords, allowing fuzzy matches like “collaboration” ≈ “worked with a team.”
  • 📈 Resume Bullet Scoring: Built a GradientBoostingRegressor trained on 70+ real and synthetic bullet points scored for impact, helping surface stronger resume content.
  • 🌐 Frontend & API: Flask backend with a custom HTML/CSS/JS UI for uploading resumes and displaying results.
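
The semantic matching step above can be sketched as a cosine-similarity comparison between embedding vectors. This is a simplified illustration, not the project's actual code: it assumes the embeddings have already been computed elsewhere, uses tiny made-up 3-dimensional vectors in place of real model output, and the `match_keywords` helper and the 0.6 threshold are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_keywords(jd_vectors, resume_vectors, threshold=0.6):
    """For each JD keyword, find the most similar resume phrase.

    Returns {keyword: (phrase, score)} when the best score reaches the
    threshold (an assumed cutoff), otherwise {keyword: None}.
    """
    matches = {}
    for keyword, keyword_vec in jd_vectors.items():
        best_phrase, best_score = None, -1.0
        for phrase, phrase_vec in resume_vectors.items():
            score = cosine_similarity(keyword_vec, phrase_vec)
            if score > best_score:
                best_phrase, best_score = phrase, score
        if best_score >= threshold:
            matches[keyword] = (best_phrase, best_score)
        else:
            matches[keyword] = None
    return matches


# Toy vectors standing in for real sentence embeddings.
jd_vectors = {
    "collaboration": [1.0, 0.0, 0.0],
    "python": [0.0, 0.0, 1.0],
}
resume_vectors = {
    "worked with a team": [0.9, 0.1, 0.0],
    "managed budgets": [0.0, 1.0, 0.0],
}

matches = match_keywords(jd_vectors, resume_vectors)
print(matches)
```

In a real pipeline the vectors would come from an embedding model, for example `SentenceTransformer("all-MiniLM-L6-v2").encode(...)`, and the threshold would need tuning against labeled examples.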

Challenges we ran into

  • ⚠️ NLTK Compatibility Issues with Python 3.13 — we had to downgrade to ensure tokenizer functionality
  • 🧩 Resume Variability: Real resumes have messy formatting; section labeling required smart keyword design and a flexible classifier
  • 🔍 Hard to Find ML Datasets for JD-to-resume matching: labeled data was scarce, so we had to work with the limited data available and find workarounds in how we used ML models
  • 💬 Synonym Semantics: Words like “teamwork” and “collaborated” don't match via keywords. Hardcoding synonym lists was time-consuming and inaccurate, and we could not find an off-the-shelf ML model that takes a JD keyword and resume keywords and returns a similarity score

Accomplishments that we're proud of

  • 🚀 Built a full-stack resume-job matcher in under 36 hours
  • ✅ Achieved strong section classification accuracy with only ~70 training examples
  • 🧠 Developed a working semantic skill matcher with SentenceTransformers
  • 📊 Resume bullet scoring felt natural and gave helpful feedback to users
  • 🧩 Created a smart scoring system for degree and experience matching using NLP
  • 💡 Identified and integrated WordNet, a lexical database of English from Princeton University

What we learned

  • Real-world resumes are noisy — robust preprocessing matters
  • Building NLP pipelines for meaning (not just text) requires combining ML, rules, and embeddings
  • Simple models (logistic regression, TF-IDF) can go far when thoughtfully applied
  • Designing for recruiters vs. applicants demands different UX goals
  • Semantic similarity is a better signal than keyword overlap — and harder to build well
  • Comparing different writing styles and word choices is hard without a proper ML model

What's next for ResuMatch

  • 📂 Batch Mode — Upload multiple resumes to rank the top candidates for a job
  • 🧠 Model Feedback Loop — Learn from recruiter decisions to improve accuracy
  • 🔍 LinkedIn Integration — Chrome extension to score LinkedIn profiles against JDs
  • ✍️ Resume Rewrite Suggestions — Let users get AI-generated improvements on their resume
  • 📈 Data Expansion — Broader, labeled dataset for resume bullet scoring and semantic similarity
  • 🧑‍💻 User Portal — Allow job seekers to save resumes, track job matches, and iteratively improve
