Inspiration
Preparing for technical interviews made us realize that most LeetCode practice focuses on writing correct code as quickly as possible, rather than developing real problem-solving intuition. Often, we only recognized the right approach after reading an editorial or watching a solution video, which meant we were learning patterns passively instead of truly internalizing them. We wanted a tool that helps people practice thinking through problems the way they would in a real interview, by explaining ideas out loud, sketching on a whiteboard, and writing rough pseudocode before ever touching full code.
What it does
Sketch2Solve is a multimodal AI reasoning coach that analyzes how users think through algorithmic problems. It combines voice explanations, whiteboard sketches, and pseudocode to infer the user’s intended approach and identify missing reasoning steps. Instead of giving direct answers, it provides guided, Socratic-style hints that help users refine their logic, clarify invariants, define states, and reason about complexity.
To make feedback accurate and problem-specific, we curated and processed solution patterns by analyzing public editorial content, competitive programming resources, and high-quality explanations from platforms like NeetCode and LeetCode discussions. This allows Sketch2Solve to recognize optimal strategies and common pitfalls across thousands of problems and tailor its guidance to each user’s attempt.
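A curated pattern entry might look something like the sketch below. This is purely illustrative: the field names (`SolutionPattern`, `keyInvariants`, `matchesOptimal`, etc.) and the example data are our own stand-ins, not the actual schema Sketch2Solve uses.

```typescript
// Hypothetical shape of one curated pattern entry; all names are
// illustrative, not the project's real schema.
interface SolutionPattern {
  problemSlug: string;        // e.g. a LeetCode URL slug
  strategy: string;           // canonical optimal approach
  keyInvariants: string[];    // invariants the user should articulate
  commonPitfalls: string[];   // mistakes seen in community solutions
  targetComplexity: string;   // e.g. "O(n) time, O(n) space"
}

const patterns: SolutionPattern[] = [
  {
    problemSlug: "two-sum",
    strategy: "hash-map single pass",
    keyInvariants: ["every earlier element's complement has been recorded"],
    commonPitfalls: ["using the same element twice", "O(n^2) brute force"],
    targetComplexity: "O(n) time, O(n) space",
  },
];

// Naive check of whether a user's described strategy lines up with the
// curated optimum for a problem (a real system would use embeddings).
function matchesOptimal(slug: string, described: string): boolean {
  const entry = patterns.find((p) => p.problemSlug === slug);
  if (!entry) return false;
  return described.toLowerCase().includes(entry.strategy.split(" ")[0]);
}
```

Structuring entries this way lets hints reference the specific invariant or pitfall a user has missed, rather than a generic solution summary.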
How we built it
Sketch2Solve is a web-based application built with a React and Next.js frontend and a lightweight backend API. The interface integrates a digital whiteboard, a pseudocode editor, and in-browser audio recording to support natural problem-solving workflows. User speech is transcribed using a speech-to-text engine, and the system captures periodic snapshots of the whiteboard and text during each session.
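The periodic-snapshot idea can be sketched as a small bounded buffer on the client. This is a minimal illustration under our own assumptions (class and field names like `SessionRecorder` and `whiteboardJson` are hypothetical), not the project's actual capture code:

```typescript
// Minimal sketch of the session capture loop: every few seconds the client
// records the current whiteboard scene and pseudocode text, keeping only a
// bounded window of snapshots so the feedback payload stays small.
interface Snapshot {
  takenAt: number;         // epoch milliseconds
  whiteboardJson: string;  // serialized whiteboard scene
  pseudocode: string;      // current editor contents
}

class SessionRecorder {
  private snapshots: Snapshot[] = [];
  constructor(private maxSnapshots = 20) {}

  capture(whiteboardJson: string, pseudocode: string, now = Date.now()): void {
    this.snapshots.push({ takenAt: now, whiteboardJson, pseudocode });
    // Drop the oldest snapshot once the window is full.
    if (this.snapshots.length > this.maxSnapshots) this.snapshots.shift();
  }

  latest(): Snapshot | undefined {
    return this.snapshots[this.snapshots.length - 1];
  }

  count(): number {
    return this.snapshots.length;
  }
}
```

Bounding the window keeps memory and upload size predictable while still preserving enough recent history to see how an approach evolved.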
When users request feedback, the system aggregates their transcript, sketches, pseudocode, and the selected LeetCode problem context into a structured representation. This data is then sent to a reasoning model trained and calibrated using curated solution patterns and editorial-style explanations. The model infers the user’s strategy, evaluates its correctness and efficiency, detects gaps in reasoning, and generates personalized Socratic hints. Optional text-to-speech is used to deliver feedback in a more conversational way.
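The aggregation step above might be assembled roughly like this. The payload shape and prompt wording here are our own hedged stand-ins (`FeedbackRequest`, `buildFeedbackPrompt` are hypothetical names), meant only to show how the modalities fold into one structured request:

```typescript
// Hedged sketch of assembling a feedback request from the three modalities.
interface FeedbackRequest {
  problemSlug: string;     // the selected LeetCode problem
  transcript: string;      // speech-to-text output so far
  whiteboardJson: string;  // latest sketch snapshot
  pseudocode: string;      // current pseudocode
}

function buildFeedbackPrompt(req: FeedbackRequest): string {
  // A Socratic-style instruction: ask a guiding question rather than
  // revealing the solution.
  return [
    `Problem: ${req.problemSlug}`,
    `Spoken reasoning: ${req.transcript}`,
    `Pseudocode:\n${req.pseudocode}`,
    "Infer the candidate's intended strategy, identify the single most",
    "important gap in their reasoning, and respond with one guiding",
    "question. Do not reveal the full solution.",
  ].join("\n");
}
```

In practice the whiteboard snapshot would be sent as an image alongside this text; the key design choice is that the model sees all modalities plus the problem context in one structured request.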
Challenges we ran into
One of the biggest challenges was designing a system that could understand incomplete, messy, and evolving human reasoning. Users often change approaches mid-sentence, draw unclear diagrams, or write partial pseudocode, which made reliable interpretation difficult. We had to carefully engineer our preprocessing and prompting pipeline to extract meaningful signals without losing context.
Another major challenge was building a high-quality knowledge base of optimal solution patterns. Scraping, cleaning, and structuring data from editorials, tutorials, and community resources required significant effort to ensure consistency and accuracy. We also had to balance leveraging this data with avoiding overfitting to specific solutions, so the system would focus on teaching principles rather than memorized answers.
Finally, we faced tight constraints around latency and cost. Running frequent deep analyses on multimodal data was expensive and slow, so we designed an event-driven architecture that only triggered heavy reasoning when users truly needed help. Building a polished, reliable system under a 36-hour hackathon deadline pushed us to iterate rapidly and prioritize learning impact over feature completeness.
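The cost-control idea can be illustrated with a simple gate: deep analysis fires only on an explicit help request, and rapid repeat requests are coalesced. This is a minimal sketch under our own assumptions (the `AnalysisGate` name and cooldown mechanism are illustrative), not the project's actual event pipeline:

```typescript
// Illustrative gate for the event-driven design: heavy multimodal analysis
// runs only when the user asks for help, and requests arriving within a
// cooldown window are dropped to control latency and cost.
class AnalysisGate {
  private lastRunAt = -Infinity;
  constructor(private cooldownMs: number) {}

  // Returns true if a deep analysis should fire for a request at time `now`.
  shouldRun(now: number): boolean {
    if (now - this.lastRunAt < this.cooldownMs) return false;
    this.lastRunAt = now;
    return true;
  }
}
```

A gate like this turns an expensive always-on loop into an on-demand one, which mattered under both the budget and the 36-hour deadline.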
Accomplishments that we're proud of
We’re most proud of building a fully functional multimodal reasoning coach within a 36-hour hackathon. Sketch2Solve successfully integrates voice, sketching, and pseudocode into a unified pipeline that can infer a user’s intended algorithmic approach and generate guided feedback in real time.
We’re also proud of constructing a structured knowledge base of optimal solution patterns by curating and organizing high-quality editorial explanations and competitive programming resources. Instead of simply wrapping an LLM, we engineered a system that understands common algorithmic frameworks, detects missing reasoning components, and provides Socratic hints rather than full solutions. Most importantly, we transformed interview prep from passive solution consumption into an interactive thinking experience.
What we learned
We learned how challenging it is to interpret human reasoning in its raw form. People think non-linearly: they change strategies mid-sentence, draw incomplete diagrams, and write pseudocode that evolves over time. Designing a system that can make sense of that ambiguity required careful preprocessing, event-driven architecture, and structured prompting.
We also learned how critical prompt design and constraint engineering are when building AI systems that teach. It’s much harder to ask the right question than to give the right answer. Balancing helpful guidance with preserving the user’s learning process required multiple iterations and deep consideration of how people actually build intuition.
Finally, we learned how to scope aggressively under time pressure. Prioritizing a strong core feedback loop over feature breadth allowed us to ship something coherent and impactful.
What's next for Sketch2Solve
Next, we plan to expand our curated pattern database to cover a broader range of algorithmic categories and difficulty levels. We also want to improve diagram understanding by adding structured labeling and lightweight semantic parsing so that sketches can be interpreted more reliably.
Long term, we envision Sketch2Solve evolving into a full reasoning analytics platform that tracks cognitive growth over time. Instead of just solving problems, users would build a profile of their strengths, weaknesses, and recurring reasoning gaps. With deeper personalization, Sketch2Solve could recommend targeted practice sets, simulate real interview pressure, and even adapt to different domains beyond algorithms, such as system design or quantitative reasoning.
Ultimately, our goal is to shift the focus of interview preparation from memorizing solutions to mastering how to think.
Built With
- api
- classification
- context
- elevenlabs
- embedding-based retrieval
- excalidraw
- express.js
- multimodal fusion
- natural language processing
- next.js
- node.js
- openai
- pattern detection
- prompt engineering
- pseudocode
- react
- reasoning models
- sketch
- socratic
- sqlite
- strategy
- tailwind
- text
- whisper