HireVision

💡 Inspiration

Hiring is one of the biggest hurdles for early-stage startups. Solo developers can spend 40+ hours per hire on manual screening and repetitive interviews. With the launch of Gemini 3, I saw the potential to build more than another automation tool: a multimodal recruitment intelligence that could see, hear, and reason like a human hiring lead. HireVision was built to level the playing field, giving small teams the hiring power of a large corporation.

🚀 What it does

HireVision is an end-to-end, AI-powered hiring platform built using the Gemini 3 family.

Multimodal AI Interviews: Conducts voice-based interviews where the AI "sees" the candidate's work in real time. Using Gemini 3's vision reasoning, it can analyze live code, design choices, or presentations and ask direct, context-aware follow-up questions.
Intelligent Resume Evaluation: Leverages Gemini 3's long-context window to screen resumes against job requirements with deep semantic understanding, going beyond simple keyword matching.
Behavioral Intelligence: Uses Gemini 3 to analyze multimodal signals of candidate confidence, engagement, and soft skills during the interview.
Structured Candidate Ranking: Automatically ranks candidates based on performance, providing explainable AI insights for faster hiring decisions.
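
As a rough illustration of how explainable ranking could work, the sketch below combines per-stage scores into a weighted total and keeps each stage's contribution as the explanation. The field names, the 0..1 score range, and the weights are all assumptions made for this example, not HireVision's actual scoring scheme.

```typescript
// Hypothetical shape of one candidate's evaluation; field names are illustrative.
interface CandidateEvaluation {
  name: string;
  // Each score is assumed to be normalized to the range 0..1.
  scores: { resume: number; interview: number; behavioral: number };
}

interface RankedCandidate {
  name: string;
  total: number;
  explanation: string;
}

// Assumed weights; a real deployment would tune these per role.
const WEIGHTS = { resume: 0.3, interview: 0.5, behavioral: 0.2 } as const;

function rankCandidates(evaluations: CandidateEvaluation[]): RankedCandidate[] {
  const keys = Object.keys(WEIGHTS) as (keyof typeof WEIGHTS)[];
  return evaluations
    .map((e) => {
      // Record how much each stage contributes so the total can be explained.
      const parts = keys.map((k) => ({
        key: k,
        contribution: WEIGHTS[k] * e.scores[k],
      }));
      const total = parts.reduce((sum, p) => sum + p.contribution, 0);
      const explanation = parts
        .map((p) => `${p.key}: ${p.contribution.toFixed(2)}`)
        .join(", ");
      return { name: e.name, total, explanation };
    })
    // Highest total first.
    .sort((a, b) => b.total - a.total);
}

const ranked = rankCandidates([
  { name: "A", scores: { resume: 0.9, interview: 0.6, behavioral: 0.7 } },
  { name: "B", scores: { resume: 0.7, interview: 0.9, behavioral: 0.8 } },
]);
for (const r of ranked) {
  console.log(`${r.name} (${r.total.toFixed(2)}): ${r.explanation}`);
}
```

Keeping the per-stage contributions next to the total lets a reviewer see why one candidate outranks another instead of trusting a single opaque number.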

🏗️ How we built it

HireVision's AI and backend layers are built on the Google AI and Firebase stack, with a Next.js, Tailwind CSS, and TypeScript front end:

Core Model: Google Gemini 3.0 Flash powers our multimodal "Nervous System," handling real-time visual analysis, reasoning, and conversational logic with ultra-low latency.
AI Framework: Google Genkit was used to build and orchestrate the AI flows, ensuring robust prompt management and structured outputs.
Authentication & Database: Firebase (Auth, Firestore, and Storage) provides the secure foundation for candidate data, resumes, and session persistence.
Multimodal Pipeline: We implemented a real-time data flow that captures visual inputs (screen/camera) and feeds them directly into the Gemini 3 API for immediate feedback during live assessments.
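
One way to keep a real-time visual pipeline like this responsive is to avoid forwarding every captured frame. The sketch below is a minimal frame sampler that skips frames arriving too soon after the last sent one or identical to it. The `Frame` shape, the `FrameSampler` class, and the one-second interval are hypothetical, and the actual model call is left out: in the example, sent frames are only recorded by timestamp.

```typescript
// Hypothetical captured frame; in a browser this might come from a canvas snapshot.
interface Frame {
  capturedAtMs: number;
  // Base64-encoded image data (assumed encoding).
  dataBase64: string;
}

class FrameSampler {
  private readonly minIntervalMs: number;
  private lastSentAtMs = -Infinity;
  private lastData: string | null = null;

  constructor(minIntervalMs: number) {
    this.minIntervalMs = minIntervalMs;
  }

  // Returns true when the frame is worth forwarding to the model.
  shouldSend(frame: Frame): boolean {
    const tooSoon = frame.capturedAtMs - this.lastSentAtMs < this.minIntervalMs;
    const unchanged = frame.dataBase64 === this.lastData;
    if (tooSoon || unchanged) return false;
    this.lastSentAtMs = frame.capturedAtMs;
    this.lastData = frame.dataBase64;
    return true;
  }
}

// Simulated capture stream: the screen changes at 200 ms and again at 3000 ms.
const frames: Frame[] = [
  { capturedAtMs: 0, dataBase64: "a" },
  { capturedAtMs: 200, dataBase64: "b" },
  { capturedAtMs: 1000, dataBase64: "b" },
  { capturedAtMs: 2500, dataBase64: "b" },
  { capturedAtMs: 3000, dataBase64: "c" },
];

const sampler = new FrameSampler(1000);
const sentAt: number[] = [];
for (const frame of frames) {
  // A real pipeline would pass the frame to the model here.
  if (sampler.shouldSend(frame)) sentAt.push(frame.capturedAtMs);
}
console.log(sentAt);
```

Sampling on both time and content change trades a little visual freshness for fewer model calls, which matters when each call adds latency to the interviewer's next question.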

⚡ Challenges we ran into

Multimodal Synchronization: Coordinating real-time visual data with Gemini 3's responses so that the AI's follow-up questions felt natural and immediate.
Prompt Engineering for Gemini 3: Refining prompts to take full advantage of the new reasoning capabilities while maintaining a professional and encouraging interviewer persona.
Processing Long Contexts: Optimizing how we feed diverse candidate data (resumes, project links) into the Gemini 3 window to get the most accurate and explainable scores.
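
The long-context challenge above can be pictured as a packing problem: decide which candidate documents go into the prompt, and in what order, under a token budget. The sketch below is one simple greedy approach, assuming a hypothetical `ContextDoc` shape, a priority convention where lower numbers matter more, and a rough 4-characters-per-token estimate that stands in for the model's real tokenizer.

```typescript
interface ContextDoc {
  label: string; // e.g. "resume" or "project README"
  priority: number; // lower number = more important (assumed convention)
  text: string;
}

interface PackedContext {
  prompt: string;
  included: string[];
  skipped: string[];
}

// Rough heuristic of ~4 characters per token; an assumption, not the model's tokenizer.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function packContext(docs: ContextDoc[], budgetTokens: number): PackedContext {
  const included: string[] = [];
  const skipped: string[] = [];
  const sections: string[] = [];
  let used = 0;
  // Visit the most important documents first; copy the array to avoid mutating the input.
  for (const doc of [...docs].sort((a, b) => a.priority - b.priority)) {
    const section = `## ${doc.label}\n${doc.text}`;
    const cost = estimateTokens(section);
    if (used + cost > budgetTokens) {
      // Record what was left out so the final score can mention it.
      skipped.push(doc.label);
      continue;
    }
    sections.push(section);
    included.push(doc.label);
    used += cost;
  }
  return { prompt: sections.join("\n\n"), included, skipped };
}

const packed = packContext(
  [
    { label: "resume", priority: 1, text: "x".repeat(400) },
    { label: "project README", priority: 2, text: "y".repeat(2000) },
    { label: "job description", priority: 0, text: "z".repeat(200) },
  ],
  200,
);
console.log(packed.included, packed.skipped);
```

Tracking the skipped labels alongside the prompt makes it possible to say which material a score was based on, which helps keep the result explainable.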

🏆 Accomplishments that we're proud of

Gemini 3 Integration: Successfully building a system that leverages the full power of the Gemini 3 multimodal API for real-time visual interviewing.
Seamless Pipeline: From resume upload to AI-led interview to final ranking, the entire process is handled autonomously by Gemini 3 logic.
Technical Execution: Building a production-ready application that showcases how Gemini 3 can solve high-impact, real-world problems for founders.

🧠 What we learned

How to build multimodal AI workflows using the Gemini 3 API.
The speed and reasoning advantages of the Gemini 3.0 Flash family for low-latency conversational applications.
Best practices for using Google Genkit to organize complex AI agents.

🔮 What's next for HireVision

Expanded Multimodal Analysis: Using Gemini 3 to analyze video recordings for deeper behavioral and soft-skill insights.
AI-Driven Sourcing: Leveraging Gemini's reasoning to discover and evaluate candidate profiles across the web automatically.
Collaborative Review: Using Gemini 3 to summarize interview "Visual Highlights" for hiring teams to review quickly.

Built With

cloud-firestore
firebase-auth
firebase-storage
gemini-3.0-flash
google-genkit
next.js
tailwind-css
typescript

Updates

Deepak deepak started this project — Feb 09, 2026 01:33 PM EST
