SkillSync

AI Content, Privacy and Terms
Voice Roleplay on Soft Skills
Export Learning Session Components
Danger Content Detection and Warning

Inspiration

I built SkillSync after getting frustrated with how many tutorials I watched but never practiced. Video lessons teach concepts, but mastery comes from deliberate practice: stopping, testing yourself, getting feedback, and repeating the loop. I wanted a tool that makes videos actionable: one that watches the same content we watch, finds teachable moments, and coaches the learner, including how they speak, so learning becomes practice, not passive consumption.

What it does

SkillSync turns any YouTube tutorial into an interactive practice session. Paste a URL and Gemini 3 analyzes the video to:

generate timestamped stop points and short-context summaries,
create evidence-grounded questions and rubrics,
auto-pause playback for short practice rounds,
evaluate freeform answers with scores and timestamped evidence,
analyze spoken delivery (prosody) and offer coaching on tone, pace, and confidence,
export study packs (Markdown / Google Docs) and parts lists for technical videos.
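As an illustration of the Markdown export, here is a minimal sketch of turning answered questions into a study pack with timestamp links back to the video. The `StudyItem` fields and the function names are assumptions made for this example, not SkillSync's actual export format.

```typescript
// Hypothetical study-pack entry; field names are assumptions for this sketch.
interface StudyItem {
  timestampSec: number;
  question: string;
  answer: string;
}

// Pads a number to two digits without relying on newer string helpers.
function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

// Formats seconds as m:ss, or h:mm:ss for videos longer than an hour.
function formatTimestamp(totalSec: number): string {
  const s = Math.max(0, Math.floor(totalSec));
  const h = Math.floor(s / 3600);
  const m = Math.floor((s % 3600) / 60);
  const sec = s % 60;
  return h > 0 ? `${h}:${pad2(m)}:${pad2(sec)}` : `${m}:${pad2(sec)}`;
}

// Builds one Markdown document with a heading per stop point.
function buildStudyPackMarkdown(title: string, videoId: string, items: StudyItem[]): string {
  const lines: string[] = [`# ${title}`, ""];
  items.forEach((item, index) => {
    const sec = Math.max(0, Math.floor(item.timestampSec));
    const link = `https://www.youtube.com/watch?v=${videoId}&t=${sec}s`;
    lines.push(`## ${index + 1}. [${formatTimestamp(sec)}](${link})`);
    lines.push("");
    lines.push(`**Question:** ${item.question}`);
    lines.push("");
    lines.push(`**Reference answer:** ${item.answer}`);
    lines.push("");
  });
  return lines.join("\n");
}
```

The resulting string can be offered as a `.md` download or pasted into a document editor.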

How I built it

SkillSync was built through an iterative human–AI co-design process using Google AI Studio, Gemini 3, and Copilot in Visual Studio Code, not only to generate code but also to refine product scope, UX flow, safety boundaries, and system design.

Frontend: React + Vite + TypeScript for a fast, single-page demo UI.

AI: Gemini 3 (flash preview) for native video understanding, structured JSON outputs, and prosody analysis. Responses are validated with JSON schemas for deterministic parsing.
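To make the validation step concrete, here is a minimal sketch of checking a model response before the UI uses it. It uses hand-written type guards instead of a JSON Schema library so it stays self-contained. The `LessonPlan` and `StopPoint` shapes are assumptions for illustration, not the project's real schema.

```typescript
// Hypothetical shapes; field names are assumptions, not SkillSync's schema.
interface StopPoint {
  timestampSec: number;
  summary: string;
  question: string;
}

interface LessonPlan {
  title: string;
  stopPoints: StopPoint[];
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function isStopPoint(value: unknown): value is StopPoint {
  if (!isRecord(value)) return false;
  const { timestampSec, summary, question } = value;
  return (
    typeof timestampSec === "number" &&
    isFinite(timestampSec) &&
    timestampSec >= 0 &&
    typeof summary === "string" &&
    typeof question === "string"
  );
}

// Returns a plan with stop points sorted by time, or null when the text
// is not valid JSON or does not match the expected shape.
function parseLessonPlan(rawText: string): LessonPlan | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawText);
  } catch {
    return null;
  }
  if (!isRecord(parsed)) return null;
  const title = parsed.title;
  const rawPoints: unknown = parsed.stopPoints;
  if (typeof title !== "string") return null;
  if (!Array.isArray(rawPoints) || !rawPoints.every(isStopPoint)) return null;
  const stopPoints = [...(rawPoints as StopPoint[])].sort(
    (a, b) => a.timestampSec - b.timestampSec
  );
  return { title, stopPoints };
}
```

Returning `null` instead of throwing lets the caller decide whether to retry the prompt or show an error.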

Voice: Web Speech API + optional TTS for roleplay and coaching playback.

Video: YouTube IFrame Player for precise timestamps and timeline markers.
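Auto-pausing at a stop point can be sketched as a small polling loop over the player's current time. The example below assumes a player object exposing `getCurrentTime()` and `pauseVideo()`, which the YouTube IFrame Player API provides. The `PlayerLike` interface, `createAutoPauser`, and the polling approach are assumptions for illustration, not necessarily how SkillSync does it.

```typescript
// Minimal subset of a video player used by this sketch.
interface PlayerLike {
  getCurrentTime(): number;
  pauseVideo(): void;
}

// Returns the first stop time in (previousSec, currentSec], or null.
// Expects stopTimesSec to be sorted in ascending order.
function findCrossedStop(stopTimesSec: number[], previousSec: number, currentSec: number): number | null {
  for (const stop of stopTimesSec) {
    if (stop > previousSec && stop <= currentSec) return stop;
  }
  return null;
}

// Creates a tick function to be called on a timer (for example every 250 ms).
// When playback crosses a stop point it pauses the video and notifies the caller.
function createAutoPauser(
  player: PlayerLike,
  stopTimesSec: number[],
  onStop: (stopSec: number) => void
): () => void {
  const sorted = [...stopTimesSec].sort((a, b) => a - b);
  let lastSeenSec = player.getCurrentTime();
  return function tick(): void {
    const nowSec = player.getCurrentTime();
    const crossed = findCrossedStop(sorted, lastSeenSec, nowSec);
    lastSeenSec = nowSec;
    if (crossed !== null) {
      player.pauseVideo();
      onStop(crossed);
    }
  };
}
```

In a browser this would be driven by something like `setInterval(tick, 250)`. A backward seek never triggers a stop, and a forward seek over several stop points triggers only the earliest one in this simple version.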

Storage & UX: LocalStorage caching (7-day TTL), an export modal for Markdown / Google Docs, and a lightweight state machine to manage lesson flow.
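A 7-day TTL cache over a localStorage-style store can be sketched in a few lines. The `KeyValueStore` interface below is a subset of the Web Storage API, so `window.localStorage` can be passed in directly. The envelope format and function names are assumptions for this example.

```typescript
// Minimal subset of the Web Storage API, so the sketch also runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

// Hypothetical wrapper stored alongside each cached value.
interface CacheEnvelope<T> {
  savedAtMs: number;
  value: T;
}

// Stores a value together with the time it was saved.
function writeCache<T>(store: KeyValueStore, key: string, value: T, nowMs: number = Date.now()): void {
  const envelope: CacheEnvelope<T> = { savedAtMs: nowMs, value };
  store.setItem(key, JSON.stringify(envelope));
}

// Returns the cached value, or null when it is missing, corrupt, or older than ttlMs.
// Expired and corrupt entries are removed so they do not accumulate.
function readCache<T>(
  store: KeyValueStore,
  key: string,
  ttlMs: number = SEVEN_DAYS_MS,
  nowMs: number = Date.now()
): T | null {
  const raw = store.getItem(key);
  if (raw === null) return null;
  try {
    const envelope = JSON.parse(raw) as CacheEnvelope<T>;
    if (typeof envelope.savedAtMs !== "number" || nowMs - envelope.savedAtMs > ttlMs) {
      store.removeItem(key);
      return null;
    }
    return envelope.value;
  } catch {
    store.removeItem(key);
    return null;
  }
}
```

A "force refresh" control would simply skip `readCache`, run the analysis again, and call `writeCache` with the new result.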

Repo: [https://github.com/schu37/skillsync] · Demo: [https://skillsync.space]

Challenges I ran into

Prompt engineering: getting reliable, timestamped stop points and schema-valid output required careful system prompts and two-pass prompting.
Safety: preventing generation of unsafe step-by-step instructions for dangerous technical tasks required explicit safety flags and summary-only fallbacks for risky content.
Prosody analysis: capturing useful vocal feedback required sending raw audio and translating Gemini’s prosody output into actionable coaching tips.
Cost & latency: multimodal, long-context analysis is heavier than text-only models, so caching and optional “force refresh” controls were added.
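The safety flag and summary-only fallback described above could look roughly like this. The risk levels, field names, and warning text are assumptions for illustration. The write-up does not specify how SkillSync represents them.

```typescript
// Hypothetical risk flag attached to each analyzed segment.
type RiskLevel = "safe" | "caution" | "dangerous";

interface AnalyzedSegment {
  summary: string;
  steps: string[];
  risk: RiskLevel;
}

interface DisplaySegment {
  summary: string;
  steps: string[];
  warning: string | null;
}

// Decides what the learner is shown based on the safety flag.
// Dangerous segments fall back to the summary only, with no step-by-step detail.
function applySafetyFallback(segment: AnalyzedSegment): DisplaySegment {
  switch (segment.risk) {
    case "dangerous":
      return {
        summary: segment.summary,
        steps: [],
        warning: "This segment covers a hazardous task. Detailed steps are hidden.",
      };
    case "caution":
      return {
        summary: segment.summary,
        steps: segment.steps,
        warning: "Follow the safety precautions shown in the video.",
      };
    case "safe":
      return { summary: segment.summary, steps: segment.steps, warning: null };
  }
}
```

Keeping this decision in one pure function makes it easy to test that no code path leaks detailed steps for flagged content.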

Accomplishments I am proud of

Native video understanding: Gemini 3 analyzes YouTube URLs directly — no manual transcription pipeline.
Multimodal lesson plans: structured JSON lesson plans with stop points, rubrics, and gold answers.
Voice coaching: end-to-end roleplay that evaluates delivery (tone, pace, confidence), not just content.
Export pipeline: study packs downloadable as Markdown and exportable to Google Docs for NotebookLM compatibility.
Demo-ready product: a single-page app deployed on Vercel demonstrating the full flow (paste → analyze → practice → export).
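The paste → analyze → practice → export flow maps naturally onto a small table-driven state machine. The state and event names below are assumptions for this sketch. The actual lesson-flow state machine may use different ones.

```typescript
// Hypothetical states and events for the lesson flow.
type LessonState = "idle" | "analyzing" | "practicing" | "answering" | "exporting" | "error";
type LessonEvent =
  | "SUBMIT_URL"
  | "ANALYSIS_OK"
  | "ANALYSIS_FAILED"
  | "REACH_STOP_POINT"
  | "ANSWER_SUBMITTED"
  | "OPEN_EXPORT"
  | "CLOSE_EXPORT"
  | "RESET";

// Each state lists only the events it accepts.
const lessonTransitions: Record<LessonState, Partial<Record<LessonEvent, LessonState>>> = {
  idle: { SUBMIT_URL: "analyzing" },
  analyzing: { ANALYSIS_OK: "practicing", ANALYSIS_FAILED: "error" },
  practicing: { REACH_STOP_POINT: "answering", OPEN_EXPORT: "exporting", RESET: "idle" },
  answering: { ANSWER_SUBMITTED: "practicing", RESET: "idle" },
  exporting: { CLOSE_EXPORT: "practicing" },
  error: { RESET: "idle" },
};

// Returns the next state, or the current state when the event is not allowed there.
function nextLessonState(current: LessonState, ev: LessonEvent): LessonState {
  return lessonTransitions[current][ev] ?? current;
}
```

Ignoring events that a state does not accept keeps the UI from, for example, opening the export modal before any analysis exists.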

What I learned

Gemini 3 is especially powerful when used as a reasoning and design partner, not just a content generator.
Structured prompts and JSON schemas drastically reduce parsing errors and make AI outputs production-usable.
Prosody matters: learners change behavior faster when given short, concrete feedback on how they speak.
Safety-first design is essential for any “how-to” content; detecting and flagging unsafe instructions protects users and judges alike.
Small UX details (timers, skip-answered toggles, sticky panels) significantly improve retention in practice loops.
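On the prosody point above, turning raw delivery metrics into a couple of short tips might look like the sketch below. The `ProsodySummary` shape and every threshold are illustrative assumptions. The write-up does not describe the actual format of Gemini's prosody output.

```typescript
// Hypothetical summary of a spoken answer; not the real Gemini output format.
interface ProsodySummary {
  wordsPerMinute: number;
  fillerWordCount: number;
  confidence: number; // assumed to be a 0..1 score
}

// Maps metrics to at most maxTips short, concrete coaching tips.
// All thresholds are illustrative and would need tuning with real users.
function coachingTips(p: ProsodySummary, maxTips: number = 2): string[] {
  const tips: string[] = [];
  if (p.wordsPerMinute > 170) {
    tips.push("Slow down: pause briefly after each key point.");
  } else if (p.wordsPerMinute < 110) {
    tips.push("Pick up the pace slightly to keep the listener engaged.");
  }
  if (p.fillerWordCount >= 5) {
    tips.push("Replace filler words with a short silent pause.");
  }
  if (p.confidence < 0.5) {
    tips.push("End statements with a falling tone to sound more certain.");
  }
  if (tips.length === 0) {
    tips.push("Delivery was clear; keep the same pace and tone.");
  }
  return tips.slice(0, maxTips);
}
```

Capping the number of tips keeps the feedback short enough to act on in the next practice round.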

What’s next for SkillSync

SkillSync is intentionally built local-first, but its architecture is designed to scale.

Short Term: User Accounts & Cloud Sync

The next step is migrating persistence from localStorage to Supabase, enabling:

Google SSO authentication,
cross-device sync for progress, notes, and session history,
persistent user profiles and learning preferences.

This migration is straightforward because storage is already abstracted behind a service interface, requiring no major architectural changes.
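To illustrate what "abstracted behind a service interface" can mean in practice, here is a minimal sketch with an asynchronous `ProgressStore` interface and a local implementation. All names are assumptions for this example. A Supabase-backed class would implement the same two methods, and its client calls are intentionally not shown here.

```typescript
// Hypothetical per-video progress record.
interface SessionProgress {
  videoId: string;
  answeredStopPoints: number[];
  updatedAtMs: number;
}

// The rest of the app depends only on this interface.
interface ProgressStore {
  load(videoId: string): Promise<SessionProgress | null>;
  save(progress: SessionProgress): Promise<void>;
}

// Subset of the Web Storage API; window.localStorage satisfies it.
interface StringStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

class LocalProgressStore implements ProgressStore {
  private backing: StringStore;

  constructor(backing: StringStore) {
    this.backing = backing;
  }

  async load(videoId: string): Promise<SessionProgress | null> {
    const raw = this.backing.getItem("progress:" + videoId);
    if (raw === null) return null;
    try {
      return JSON.parse(raw) as SessionProgress;
    } catch {
      return null; // treat corrupt entries as missing
    }
  }

  async save(progress: SessionProgress): Promise<void> {
    this.backing.setItem("progress:" + progress.videoId, JSON.stringify(progress));
  }
}

// Application code written against the interface, unaware of the backend.
async function markAnswered(
  store: ProgressStore,
  videoId: string,
  stopSec: number,
  nowMs: number
): Promise<SessionProgress> {
  const existing = await store.load(videoId);
  const answered = existing ? [...existing.answeredStopPoints] : [];
  if (answered.indexOf(stopSec) < 0) answered.push(stopSec);
  const updated: SessionProgress = { videoId, answeredStopPoints: answered, updatedAtMs: nowMs };
  await store.save(updated);
  return updated;
}
```

Making the interface asynchronous from the start means callers do not change when the backend moves from synchronous local storage to a network service.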

Medium Term: Learning Analytics & Retention

SkillSync will evolve from single sessions into a longitudinal learning tool by adding:

learning dashboards that visualize progress over time,
skill proficiency tracking across domains (e.g. communication, technical skills),
spaced-repetition reminders based on past performance.
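As a rough idea of how spaced-repetition reminders could be scheduled from past performance, the sketch below doubles the review interval after a passing score and resets it after a failing one. This is a deliberately simple rule chosen for illustration, not a published algorithm and not a committed design. The pass threshold and the 60-day cap are assumptions.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;
const PASS_THRESHOLD = 0.7; // assumed cut-off on a 0..1 score
const MAX_INTERVAL_DAYS = 60; // assumed upper bound between reviews

interface ReviewSchedule {
  intervalDays: number;
  dueAtMs: number;
}

// Doubles the interval on a pass (capped), resets to one day on a fail.
function scheduleNextReview(previousIntervalDays: number, score: number, nowMs: number): ReviewSchedule {
  const passed = score >= PASS_THRESHOLD;
  const intervalDays = passed
    ? Math.min(Math.max(1, previousIntervalDays) * 2, MAX_INTERVAL_DAYS)
    : 1;
  return { intervalDays, dueAtMs: nowMs + intervalDays * DAY_MS };
}
```

For example, a learner who passes a first review would see the item again in 2 days, then 4, then 8, while a failed review brings it back the next day.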

These features help learners build durable skills instead of one-off understanding.

Long Term: Active Video Learning Platform

Longer term, SkillSync becomes a platform for active video learning at scale:

playlist-level courses built from YouTube videos,
community-shared lesson plans and practice sessions,
a browser extension that turns any tutorial into guided practice instantly,
deeper Gemini-powered personalization across languages, modalities, and skill types.

The long-term goal is not to create more content, but to make practice the default way people learn from video.
Additional technical details are available in ARCHITECTURE.md on GitHub.

Built With

copilot
gemini
gemini-tts
google-ai-studio
google-cloud-console
google-docs
oauth
typescript
vercel

Updates

Sicheng Chu posted an update — Jan 29, 2026 07:36 AM EST

The demo video cuts out the Gemini 3 processing time to stay within 3 minutes. If you run the app locally or use the website, expect waiting and delays. For reference, in my online and local tests, an 18-minute drone-building video takes about 2-3 minutes to process and categorize, and about another minute to generate and load the learning panel. Voice Roleplay usually returns a voice response less than 10 seconds after I send a voice message.

Clearly, many parts require optimizing and refactoring :(


Sicheng Chu started this project — Jan 28, 2026 07:28 PM EST

Leave feedback in the comments!
