Inspiration
Recruiters today receive hundreds of resumes for a single position, making manual screening slow, inconsistent, and tiring. I noticed how many qualified candidates were being rejected simply because their resumes were not formatted well or because recruiters didn't have enough time to read every detail. This inspired me to build an AI Resume Screening System: a tool that can read resumes automatically, understand the candidate's skills, match them to job requirements, and shortlist the best ones instantly. My goal was simple: faster, fairer, and smarter hiring.
What I Learned
While building this project, I learned:
- How Natural Language Processing (NLP) extracts important information from unstructured text
- How to use libraries like spaCy, NLTK, and scikit-learn
- How to design a scoring model that matches skills to job descriptions
- How to build a clean web interface using Flask
- How AI can reduce bias and improve decision-making in real recruitment systems
I also understood the importance of dataset cleaning, resume parsing, and evaluation metrics such as cosine similarity and relevance scoring.
How I Built the Project
The system follows a clear pipeline:
- Resume Upload: Users upload resumes in PDF or DOCX format.
- Resume Parsing & NLP: Using spaCy and NLTK, the system extracts skills, education, experience, certifications, and relevant keywords.
- Matching Model: With scikit-learn, the resume and job description text is converted into vectors using TF-IDF or Bag of Words representations. Cosine similarity between the vectors then measures how closely a resume matches the job description.
- Task-Based Evaluation: To improve fairness, the system automatically generates small tasks or questions based on missing skills. Candidate answers are then analyzed and scored.
- Final Ranking: A combined score is calculated from resume relevance and task performance. Candidates are ranked, and top profiles are shortlisted.
- Web Interface: Using Flask, I created a simple web UI where recruiters can upload resumes, view candidate scores, see strengths and missing skills, and download shortlists.
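The matching and ranking steps can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the function names, the sample texts, the candidate names, and the `relevance_weight` parameter are all assumptions, since the write-up does not say how the two scores are weighted.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def resume_relevance(job_description, resumes):
    """Return one cosine-similarity score per resume against the job description."""
    vectorizer = TfidfVectorizer(stop_words="english")
    # Fit on the job description and all resumes so they share one vocabulary.
    matrix = vectorizer.fit_transform([job_description] + list(resumes))
    # Row 0 is the job description; the remaining rows are the resumes.
    scores = cosine_similarity(matrix[0:1], matrix[1:])
    return scores[0].tolist()


def rank_candidates(names, relevance, task_scores, relevance_weight=0.7):
    """Combine resume relevance and task performance into one ranked list.

    relevance_weight is a hypothetical parameter chosen for illustration.
    """
    combined = [
        (name, relevance_weight * rel + (1 - relevance_weight) * task)
        for name, rel, task in zip(names, relevance, task_scores)
    ]
    # Highest combined score first.
    return sorted(combined, key=lambda pair: pair[1], reverse=True)


# Made-up sample data, only to show the call pattern.
job = "Python developer with Flask and machine learning experience"
resumes = [
    "Built Flask web apps in Python and trained machine learning models",
    "Managed retail store inventory and staff schedules",
]
relevance = resume_relevance(job, resumes)
ranking = rank_candidates(["Asha", "Ben"], relevance, task_scores=[0.8, 0.9])
```

A weighted sum keeps the final score easy to explain to a recruiter, since each candidate's rank can be traced back to two numbers. Task scores are assumed here to already be on a 0 to 1 scale so they are comparable with cosine similarity.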
Challenges I Faced
- Extracting clean text from different resume formats (PDF, DOCX)
- Matching resumes with various layouts and styles
- Ensuring the system does not become biased toward keyword-heavy resumes
- Handling resumes with missing sections (e.g., no skills or unclear structure)
- Generating meaningful tasks dynamically based on job requirements
- Scoring and ranking candidates in a fair and explainable manner
Each challenge improved my understanding of NLP, model design, and real-world HR problems.
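One possible way to limit the advantage of keyword-heavy resumes, and to cope with resumes that have no recognizable skills section, is to count each required skill at most once over the whole text. The sketch below is an assumption for illustration, not the approach the project necessarily used; `skill_coverage` and the sample skill list are hypothetical.

```python
import re


def skill_coverage(resume_text, required_skills):
    """Return (coverage, missing_skills) for a resume against required skills.

    Each skill counts at most once, so repeating a keyword does not raise the score.
    """
    text = resume_text.lower()
    found = set()
    for skill in required_skills:
        # Match the skill as a whole word or phrase, not inside a longer word.
        pattern = r"(?<!\w)" + re.escape(skill.lower()) + r"(?!\w)"
        if re.search(pattern, text):
            found.add(skill)
    missing = [s for s in required_skills if s not in found]
    coverage = len(found) / len(required_skills) if required_skills else 0.0
    return coverage, missing


# Made-up examples: a keyword-stuffed resume, a balanced one, and an empty one.
required = ["python", "flask", "sql"]
stuffed = skill_coverage("python python python python python", required)
balanced = skill_coverage("Built a Flask API in Python backed by a SQL database", required)
empty = skill_coverage("", required)
```

Here the stuffed resume covers only one of the three skills no matter how often the keyword is repeated, and the list of missing skills could feed the task-generation step.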
Built With
- docx
- flask
- nltk
- pandas
- pdfminer / pypdf2 (for PDF extraction)
- python
- python-docx
- scikit-learn
- spacy