Inspiration

As a Computer Science lecturer, I see the same problem every semester. Students finish their degrees with solid technical skills but no clear direction — they don't know what jobs match their profile, what to learn next, or how to write a resume that actually gets past ATS screening. Existing career guidance platforms are either behind paywalls or too generic to be useful. I built SkillBridge AI to change that.

What It Does

SkillBridge AI is an end-to-end career and learning recommendation system. A user enters their skills and the system delivers:

  • Top matching job postings from 124,000+ LinkedIn jobs
  • Recommended Coursera courses tailored to their skill set
  • Live YouTube tutorials fetched per skill via the YouTube Data API
  • A skills gap analysis, career path recommendations, and a 3-month learning roadmap generated by 5 specialized AI agents
  • Bias detection across the job dataset (location, experience level, salary)
  • Explainable AI using SHAP values to show what drives each recommendation

How I Built It

The recommendation engine uses TF-IDF vectorization with cosine similarity over two local datasets — 124,000 LinkedIn job postings and 6,645 Coursera courses. Each dataset is indexed at startup and queried in milliseconds at runtime.
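A minimal sketch of this kind of index, not the project's actual code: it fits a scikit-learn TfidfVectorizer once over a toy corpus and ranks documents against a skills string by cosine similarity. The JOBS list and the top_matches helper are hypothetical names used only for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel

# Toy stand-in for the job postings; the real corpus and its fields are assumptions.
JOBS = [
    "Data Scientist: python machine learning statistics sql",
    "Frontend Developer: javascript react css html",
    "Data Engineer: python sql spark airflow etl",
]

# Index once at startup: learn the vocabulary and vectorize every document.
vectorizer = TfidfVectorizer(stop_words="english")
job_matrix = vectorizer.fit_transform(JOBS)


def top_matches(skills: str, k: int = 2) -> list[tuple[str, float]]:
    """Return the k documents most similar to the user's skills string."""
    query = vectorizer.transform([skills])
    # TfidfVectorizer L2-normalizes rows by default, so the dot product
    # computed by linear_kernel equals cosine similarity.
    scores = linear_kernel(query, job_matrix).ravel()
    best = scores.argsort()[::-1][:k]
    return [(JOBS[i], float(scores[i])) for i in best]


for text, score in top_matches("python sql machine learning"):
    print(f"{score:.3f}  {text}")
```

Fitting once and reusing the sparse matrix is what keeps each query cheap: at runtime only the short skills string is vectorized and multiplied against the stored matrix.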

The AI layer uses LangChain with Groq (LLaMA 3.3-70b) to power 5 agents:

  1. Skills Analyzer — identifies skill level and gaps
  2. Career Coach — recommends career paths with salary ranges
  3. Learning Path Designer — builds a week-by-week 3-month roadmap
  4. Market Trends Analyst — analyzes job demand and emerging skills
  5. Resume Analyzer — scores resume and suggests ATS improvements
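The sequential flow of the five agents can be sketched without any LLM dependency. This is an illustration under stated assumptions, not the LangChain and Groq code used in the project: the prompt wording, the run_agents function, and the fake_llm stand-in are all hypothetical, and a real chat model call would be passed in place of fake_llm.

```python
from typing import Callable

# Hypothetical, deliberately short prompt templates, one per agent.
AGENT_PROMPTS = {
    "skills_analyzer": "List the skill level and gaps for: {skills}",
    "career_coach": "Suggest career paths with salary ranges for: {skills}",
    "learning_path_designer": "Write a week-by-week 3-month roadmap for: {skills}",
    "market_trends_analyst": "Summarize job demand and emerging skills for: {skills}",
    "resume_analyzer": "Score this resume and suggest ATS improvements: {resume}",
}


def run_agents(llm: Callable[[str], str], skills: str, resume: str) -> dict[str, str]:
    """Call each agent in order and collect its reply under the agent's name."""
    results = {}
    for name, template in AGENT_PROMPTS.items():
        # str.format ignores keyword arguments a template does not use.
        prompt = template.format(skills=skills, resume=resume)
        results[name] = llm(prompt)
    return results


def fake_llm(prompt: str) -> str:
    """Stand-in for a real chat model call; it only reports the prompt length."""
    return f"(model reply to a {len(prompt)}-character prompt)"


report = run_agents(fake_llm, skills="python, sql", resume="sample resume text")
for agent, reply in report.items():
    print(agent, "->", reply)
```

Passing the model in as a plain callable keeps the pipeline testable offline and makes it easy to swap providers later.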

YouTube tutorials are fetched live using the YouTube Data API v3 for each skill the user enters.
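As a rough sketch of a per-skill lookup, the snippet below builds a request to the YouTube Data API v3 search endpoint and extracts titles and watch URLs from the response. It uses only the standard library; the function names, the "tutorial" suffix added to the query, and the result limit are assumptions, and a valid API key is needed before fetch_tutorials can be called.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

SEARCH_ENDPOINT = "https://www.googleapis.com/youtube/v3/search"


def build_search_url(skill: str, api_key: str, max_results: int = 3) -> str:
    """Build a search.list request URL for videos about one skill."""
    params = {
        "part": "snippet",
        "q": f"{skill} tutorial",  # assumed query wording
        "type": "video",
        "maxResults": max_results,
        "key": api_key,
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


def parse_videos(payload: dict) -> list[dict]:
    """Pull the title and watch URL out of a search.list response body."""
    videos = []
    for item in payload.get("items", []):
        video_id = item["id"]["videoId"]
        videos.append({
            "title": item["snippet"]["title"],
            "url": f"https://www.youtube.com/watch?v={video_id}",
        })
    return videos


def fetch_tutorials(skill: str, api_key: str) -> list[dict]:
    """Perform the live request; requires network access and a valid key."""
    with urlopen(build_search_url(skill, api_key), timeout=10) as response:
        return parse_videos(json.load(response))


# Offline demonstration with a hand-written payload in the response's shape.
sample = {"items": [{"id": {"videoId": "abc123"}, "snippet": {"title": "Intro"}}]}
print(parse_videos(sample))
```

Separating URL construction and parsing from the network call lets both be checked without spending API quota.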

Bias detection uses Pandas and Matplotlib to surface location, experience, and work type bias in the dataset. SHAP with a Random Forest model explains the top factors driving each recommendation.
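One simple way to surface this kind of skew with Pandas is to compute each group's share of postings and flag columns where a single value dominates. The sketch below covers only that part, not the SHAP step. The sample rows, the column names, the helper functions, and the 0.5 threshold are all illustrative assumptions.

```python
import pandas as pd

# Tiny made-up sample; the column names are assumptions about the dataset.
jobs = pd.DataFrame({
    "location": ["New York", "New York", "New York", "Austin", "Remote"],
    "experience_level": ["Senior", "Senior", "Entry", "Senior", "Entry"],
    "work_type": ["Full-time", "Full-time", "Contract", "Full-time", "Full-time"],
})


def group_shares(df: pd.DataFrame, column: str) -> pd.Series:
    """Fraction of postings per value of a column, largest first."""
    return df[column].value_counts(normalize=True)


def dominant_groups(df: pd.DataFrame, columns: list[str], threshold: float = 0.5) -> dict:
    """Flag each column whose most common value exceeds the threshold share."""
    flagged = {}
    for column in columns:
        shares = group_shares(df, column)
        if shares.iloc[0] > threshold:
            flagged[column] = (shares.index[0], float(shares.iloc[0]))
    return flagged


print(group_shares(jobs, "work_type"))
print(dominant_groups(jobs, ["location", "experience_level", "work_type"], threshold=0.7))
```

The same shares can be passed to Matplotlib as a bar chart, for example with group_shares(jobs, "location").plot(kind="bar").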

The frontend is built with Streamlit. It runs inside a Kaggle notebook and is exposed publicly through an ngrok tunnel.

Challenges

The main challenge was keeping the system responsive despite running 5 sequential LLM agent calls. I optimized prompt templates to be concise and focused so each agent call completes quickly without sacrificing output quality.

A second challenge was dataset quality. The LinkedIn dataset had significant missing values in the salary and skills columns. I handled this by joining the separate job skills table onto the postings and building the combined text field from multiple columns, so a posting with one empty column still contributes text to the index.
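The cleanup described above might look roughly like the following Pandas sketch. The table layouts and column names (job_id, title, description, skill) are assumptions made for illustration, not the dataset's confirmed schema.

```python
import pandas as pd

# Made-up rows; one posting lacks a description and one has no skills rows.
postings = pd.DataFrame({
    "job_id": [1, 2, 3],
    "title": ["Data Analyst", "ML Engineer", "Web Developer"],
    "description": ["Build dashboards", None, "Maintain the storefront"],
})
job_skills = pd.DataFrame({
    "job_id": [1, 1, 2],
    "skill": ["sql", "excel", "python"],
})

# Collapse the one-row-per-skill table to one space-separated string per job.
skills_per_job = (
    job_skills.groupby("job_id")["skill"].agg(" ".join).rename("skills").reset_index()
)

# A left join keeps postings that have no skills rows at all.
merged = postings.merge(skills_per_job, on="job_id", how="left")

# Fill gaps with empty strings, join the columns, then normalize whitespace.
text_columns = ["title", "description", "skills"]
merged["combined_text"] = (
    merged[text_columns]
    .fillna("")
    .apply(" ".join, axis=1)
    .str.split()
    .str.join(" ")
)

print(merged[["job_id", "combined_text"]])
```

Because every column is filled before joining, a missing description or skills list shortens the combined text instead of dropping the posting.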

What I Learned

Building this project deepened my understanding of how TF-IDF performs at scale, how to design focused agent prompts that return structured output reliably, and how SHAP values can make recommendation systems more transparent and trustworthy.

Tech Stack

  • Recommendation Engine: TF-IDF + Cosine Similarity (Scikit-learn)
  • AI Agents: LangChain + Groq (LLaMA 3.3-70b)
  • Explainability: SHAP + Random Forest
  • Video Tutorials: YouTube Data API v3
  • Bias Detection: Pandas + Matplotlib
  • Frontend: Streamlit
  • Datasets: LinkedIn Job Postings (124K) + Coursera Courses 2024 (6.6K)
