Inspiration

We started from a simple frustration: students in Southeast Asia spend hours in Zoom, Google Meet, and Teams classes, but after the call ends, they’re often left with messy notes, half‑remembered concepts, and no clear idea of what they should learn next. Existing tools treat meetings as generic conversations, not as learning events. At the same time, students in Vietnam and across SEA speak in ways that many global AI systems struggle with—Vietnamese‑accented English, code‑switching, and fast classroom speech. That’s why we built Shiori: a local‑first desktop app that transforms every online class into a structured, personalized learning experience.

What it does

Shiori turns Zoom, Google Meet, or Teams sessions into smart study notes with minimal effort. A floating widget appears whenever you join a meeting and lets you start live transcription powered by VALSEA, which accurately captures Vietnamese‑accented English and multilingual classroom speech. After the session, Shiori organizes the recording into a clean, chat‑style note with summaries, key concepts, and a tailored learning path generated via OpenAI that tells you what to learn, where to learn it, and what to keep in mind. All your sessions are grouped into courses, projects, or topics so you can move them around like conversations, review them side‑by‑side, and connect ideas across classes—without ever leaving your computer.

How we built it

We built Shiori as a local-first Electron desktop app with three core surfaces: a floating meeting widget, a live side transcript panel, and a main study workspace. During class, audio is captured from both the system output and the microphone and streamed to VALSEA for real-time transcription. After class, OpenAI (with Gemini as a fallback) converts the transcript into structured study outputs: summary, key concepts, prerequisites, learning path, and resource suggestions. Sessions are saved locally and organized into groups such as courses or projects.
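To make the post-class step more concrete, here is a minimal sketch of how a provider fallback with validated study outputs could look. It is an illustration under assumptions, not our actual implementation: the `StudyOutputs` field names, the `Provider` type, and both function names are hypothetical, and no real OpenAI or Gemini SDK call is shown.

```typescript
// Hypothetical shape for the post-class study outputs listed above.
interface StudyOutputs {
  summary: string;
  keyConcepts: string[];
  prerequisites: string[];
  learningPath: string[];
  resources: string[];
}

// A provider takes a transcript and resolves to raw JSON text from a model.
// Real providers would wrap the OpenAI or Gemini SDKs; none are called here.
type Provider = (transcript: string) => Promise<string>;

const LIST_FIELDS = ["keyConcepts", "prerequisites", "learningPath", "resources"] as const;

// Validate untrusted model output before it reaches the UI or local storage.
function parseStudyOutputs(raw: string): StudyOutputs {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("model output is not a JSON object");
  }
  const record = data as Record<string, unknown>;
  const summary = record["summary"];
  if (typeof summary !== "string") {
    throw new Error("missing string field: summary");
  }
  const lists = {} as Record<(typeof LIST_FIELDS)[number], string[]>;
  for (const field of LIST_FIELDS) {
    const value = record[field];
    if (!Array.isArray(value) || !value.every((v) => typeof v === "string")) {
      throw new Error(`missing string[] field: ${field}`);
    }
    lists[field] = value as string[];
  }
  return { summary, ...lists };
}

// Try each provider in order; fall through on request errors or invalid output.
async function generateStudyOutputs(
  transcript: string,
  providers: Provider[],
): Promise<StudyOutputs> {
  const failures: string[] = [];
  for (const provider of providers) {
    try {
      return parseStudyOutputs(await provider(transcript));
    } catch (err) {
      failures.push(err instanceof Error ? err.message : String(err));
    }
  }
  throw new Error(`all providers failed: ${failures.join("; ")}`);
}
```

With this shape, the primary and fallback models are just two entries in the `providers` array, so a malformed response from the first is handled the same way as a failed request: the next provider gets a chance.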

Challenges we ran into

Our biggest challenge was capturing system audio reliably across different Windows setups. We also had to balance speed against quality: fast streaming can fragment transcripts, while slower batching improves sentence quality but adds latency. Another challenge was handling Vietnamese-accented English and code-switching accurately in real classroom conditions.
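The speed and quality trade-off can be illustrated with a small batching sketch. It merges streaming fragments into sentence-like segments and closes a segment either at sentence-ending punctuation or once it has been open for `maxOpenMs`, which acts as a latency cap. The `Fragment` shape, the function name, and the punctuation rule are assumptions for illustration; the real VALSEA payload and our actual segmentation logic may differ.

```typescript
// Hypothetical fragment shape; the real transcription payload may differ.
interface Fragment {
  text: string;
  startMs: number;
  endMs: number;
}

interface Segment {
  text: string;
  startMs: number;
  endMs: number;
}

// Merge streaming fragments into sentence-like segments.
// A segment is closed when it ends with sentence punctuation, or when it has
// been open for at least maxOpenMs (the latency cap).
function batchFragments(fragments: Fragment[], maxOpenMs: number): Segment[] {
  const segments: Segment[] = [];
  let current: Segment | null = null;

  for (const frag of fragments) {
    const text = frag.text.trim();
    if (text === "") continue; // skip empty or whitespace-only fragments

    if (current === null) {
      current = { text, startMs: frag.startMs, endMs: frag.endMs };
    } else {
      current.text += " " + text;
      current.endMs = frag.endMs;
    }

    const endsSentence = /[.!?]$/.test(current.text);
    const openTooLong = current.endMs - current.startMs >= maxOpenMs;
    if (endsSentence || openTooLong) {
      segments.push(current);
      current = null;
    }
  }

  // Flush whatever is still open when the stream ends.
  if (current !== null) segments.push(current);
  return segments;
}
```

A small `maxOpenMs` keeps the live panel responsive but yields choppier segments; a larger value produces fuller sentences at the cost of text appearing later.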

Accomplishments we’re proud of

We shipped an end-to-end workflow from live meeting transcription to post-class learning guidance in one desktop app. We’re proud that Shiori is designed for SEA learners, not generic meeting use cases, and that it turns raw transcript data into actionable study plans. We also built a local-first session memory that helps students review and organize classes over time.

What we learned

We learned that capture reliability is the foundation: if audio input fails, every AI layer downstream fails too, no matter how good it is. We also learned that students need structure, not just transcript text: what mattered, what to revise, and what to learn next. Finally, language settings and context hints significantly improve transcription quality for multilingual classrooms.

What’s next for Shiori

Next, we’ll improve production reliability for desktop audio capture, strengthen transcript segmentation/deduping, and upgrade local storage to SQLite. On the learning side, we’ll add cross-session concept tracking and smarter personalized review paths so Shiori becomes a long-term learning memory, not just a meeting recorder.
