SkillBridge AI

Inspiration

As a Computer Science lecturer, I see the same problem every semester. Students finish their degrees with solid technical skills but no clear direction — they don't know what jobs match their profile, what to learn next, or how to write a resume that actually gets past ATS screening. Existing career guidance platforms are either behind paywalls or too generic to be useful. I built SkillBridge AI to change that.

What It Does

SkillBridge AI is an end-to-end career and learning recommendation system. A user enters their skills and the system delivers:

Top matching job postings from 124,000+ LinkedIn jobs
Recommended Coursera courses tailored to their skill set
Live YouTube tutorials fetched per skill via the YouTube Data API
A skills gap analysis, career path recommendations, and a 3-month learning roadmap generated by 5 specialized AI agents
Bias detection across the job dataset (location, experience level, salary)
Explainable AI using SHAP values to show what drives each recommendation

How I Built It

The recommendation engine uses TF-IDF vectorization with cosine similarity over two local datasets — 124,000 LinkedIn job postings and 6,645 Coursera courses. Each dataset is indexed at startup and queried in milliseconds at runtime.
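The ranking step described above can be sketched in plain Python so the arithmetic is visible. This is an illustration only, not the project's code: the project uses Scikit-learn, and the function names here (build_index, vectorize, recommend) are hypothetical. The IDF formula is the smoothed variant that Scikit-learn's TfidfVectorizer applies by default, followed by L2 normalization, so a dot product between two vectors equals their cosine similarity.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real pipeline would also strip punctuation.
    return text.lower().split()


def vectorize(tokens, idf):
    # Term frequency times IDF, then L2-normalize. Unknown terms are ignored.
    tf = Counter(t for t in tokens if t in idf)
    weights = {t: count * idf[t] for t, count in tf.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm else {}


def build_index(docs):
    # Done once at startup: document frequencies, smoothed IDF, document vectors.
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return idf, [vectorize(toks, idf) for toks in tokenized]


def cosine(a, b):
    # Both vectors are unit length, so the dot product is the cosine similarity.
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def recommend(query, docs, idf, vectors, top_k=3):
    # Done per request: vectorize the user's skills and rank documents by similarity.
    q = vectorize(tokenize(query), idf)
    scored = sorted(((cosine(q, v), i) for i, v in enumerate(vectors)), reverse=True)
    return [(docs[i], round(score, 3)) for score, i in scored[:top_k] if score > 0]
```

Calling build_index once and recommend per query mirrors the index-at-startup, query-at-runtime split. At the scale of the LinkedIn dataset, a sparse-matrix implementation such as Scikit-learn's is the practical choice; dictionaries are used here only for readability.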

The AI layer uses LangChain with Groq (LLaMA 3.3-70b) to power 5 agents:

Skills Analyzer: identifies skill level and gaps
Career Coach: recommends career paths with salary ranges
Learning Path Designer: builds a week-by-week 3-month roadmap
Market Trends Analyst: analyzes job demand and emerging skills
Resume Analyzer: scores the resume and suggests ATS improvements
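One way to picture the five agents is as five focused prompt templates run one after another against the same model. The sketch below is a simplification under stated assumptions: the prompt wording is hypothetical (the real prompts are not reproduced in this write-up), and the model is passed in as a plain callable, llm, rather than through LangChain, so the example runs without an API key.

```python
from typing import Callable, Dict

# Hypothetical prompt templates, one per agent. The wording is illustrative only.
AGENT_PROMPTS: Dict[str, str] = {
    "Skills Analyzer": "Assess the skill level and gaps for these skills: {skills}",
    "Career Coach": "Suggest career paths with salary ranges for these skills: {skills}",
    "Learning Path Designer": "Build a week-by-week 3-month roadmap for these skills: {skills}",
    "Market Trends Analyst": "Summarize job demand and emerging skills related to: {skills}",
    "Resume Analyzer": "Score this resume and suggest ATS improvements: {resume}",
}


def run_agents(skills: str, resume: str, llm: Callable[[str], str]) -> Dict[str, str]:
    """Run each agent in sequence and collect its answer by agent name."""
    results: Dict[str, str] = {}
    for name, template in AGENT_PROMPTS.items():
        # str.format ignores unused keyword arguments, so every template
        # can be filled from the same pair of inputs.
        prompt = template.format(skills=skills, resume=resume)
        results[name] = llm(prompt)
    return results
```

In the real app the callable would wrap the Groq-hosted model through LangChain. Keeping it injectable also makes the pipeline easy to test with a stub.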

YouTube tutorials are fetched live using the YouTube Data API v3 for each skill the user enters.
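A per-skill lookup against the YouTube Data API v3 search endpoint might look roughly like the following. The endpoint and its part, q, type, maxResults, and key parameters are part of the public API; the helper names, the "<skill> tutorial" query wording, and the default of three results are assumptions made for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

SEARCH_ENDPOINT = "https://www.googleapis.com/youtube/v3/search"


def build_search_url(skill, api_key, max_results=3):
    # Assumption: searching for "<skill> tutorial" is a reasonable query shape.
    params = {
        "part": "snippet",
        "q": f"{skill} tutorial",
        "type": "video",
        "maxResults": max_results,
        "key": api_key,
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


def parse_videos(payload):
    # Keep only items that carry a video ID and turn them into title/URL pairs.
    videos = []
    for item in payload.get("items", []):
        video_id = item.get("id", {}).get("videoId")
        if video_id:
            videos.append({
                "title": item["snippet"]["title"],
                "url": f"https://www.youtube.com/watch?v={video_id}",
            })
    return videos


def fetch_tutorials(skill, api_key, max_results=3):
    # Performs the live HTTP request; requires a valid API key and network access.
    with urlopen(build_search_url(skill, api_key, max_results), timeout=10) as resp:
        return parse_videos(json.load(resp))
```

Splitting URL construction and response parsing from the network call keeps the two pure functions testable without spending API quota.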

Bias detection uses Pandas and Matplotlib to surface location, experience, and work type bias in the dataset. SHAP with a Random Forest model explains the top factors driving each recommendation.
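To make the explainability step concrete, the sketch below computes exact Shapley values by brute force for a tiny scoring function. It is not the project's pipeline: the SHAP library computes these attributions efficiently for the trained Random Forest, whereas this example enumerates every feature coalition, which is only feasible for a handful of features. The toy_match_score function and its feature names are hypothetical stand-ins for the real model.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance.

    Features outside a coalition are replaced by their baseline value.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        x = {f: (instance[f] if f in coalition else baseline[f]) for f in names}
        return predict(x)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude f.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_match_score(x):
    # Hypothetical stand-in for the trained model: two additive terms plus one interaction.
    return (0.5 * x["skill_overlap"]
            + 0.25 * x["experience_match"]
            + 0.25 * x["skill_overlap"] * x["remote"])
```

The attributions always sum to the difference between the prediction for the instance and the prediction for the baseline, which is what lets them be read as "how much each factor moved this recommendation".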

The frontend is built with Streamlit. The app runs in a Kaggle notebook and is exposed publicly through an ngrok tunnel.

Challenges

The main challenge was keeping the system responsive despite running 5 sequential LLM agent calls. I optimized prompt templates to be concise and focused so each agent call completes quickly without sacrificing output quality.

A second challenge was dataset quality. The LinkedIn dataset had significant missing values in the salary and skills columns. I handled this by joining the separate job skills table onto the postings and building the combined text field from multiple columns, so a posting with one empty field still contributes searchable text.
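The idea of building one text field from several partly empty columns can be sketched as follows. The column names (job_id, title, description, experience_level, skill) are assumptions, since the write-up does not list the dataset's schema, and plain dictionaries stand in for the Pandas DataFrames the project works with.

```python
from collections import defaultdict

# Hypothetical column names; the real dataset's schema may differ.
TEXT_COLUMNS = ("title", "description", "experience_level")


def build_combined_text(postings, job_skills):
    """Return {job_id: combined lowercase text} for every posting."""
    # Group the separate skills table by job so it can be joined onto postings.
    skills_by_job = defaultdict(list)
    for row in job_skills:
        skills_by_job[row["job_id"]].append(row["skill"])

    combined = {}
    for post in postings:
        # Skip missing or empty fields instead of emitting "None" into the text.
        parts = [str(post[c]).strip() for c in TEXT_COLUMNS if post.get(c)]
        parts.extend(skills_by_job.get(post["job_id"], []))
        combined[post["job_id"]] = " ".join(parts).lower()
    return combined
```

A posting with no description still ends up with its title and skills in the text that gets vectorized, so it can still be matched.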

What I Learned

Building this project deepened my understanding of how TF-IDF performs at scale, how to design focused agent prompts that return structured output reliably, and how SHAP values can make recommendation systems more transparent and trustworthy.

Tech Stack

Recommendation Engine: TF-IDF + Cosine Similarity (Scikit-learn)
AI Agents: LangChain + Groq (LLaMA 3.3-70b)
Explainability: SHAP + Random Forest
Video Tutorials: YouTube Data API v3
Bias Detection: Pandas + Matplotlib
Frontend: Streamlit
Datasets: LinkedIn Job Postings (124K) + Coursera Courses 2024 (6.6K)

Built With

groq
kaggle
langchain
llama 3.3-70b
matplotlib
pandas
python
random forest
scikit-learn
shap
streamlit
tf-idf
youtube data api v3

Updates

soohan abbasi started this project — May 15, 2026 10:31 AM EDT
