calender_genie

Inspiration

Calendar-Genie was inspired by the need for a conversational assistant that not only schedules meetings intelligently but also brings LLM-driven insight to meeting preparation, detects conflicts proactively, and integrates hot-reloadable organizational data, all in a privacy-respecting, modifiable local stack.[9]

What it does

Provides natural-language chat for meeting scheduling, preparation, and Q&A.
Analyzes intent, detects scheduling and replacement requests, and confirms availability using an agentic pipeline.
Summarizes meetings, bundles information, and supports rich retrieval over both structured meeting data and RAG documents.
Improvements and working features:
- Hot-reloads meeting.json for live updates.
- Falls back to browser Web Speech API for TTS if ElevenLabs (API or key) is missing or invalid.
- Returns TTS audio (base64 or static URL) for each response—frontend auto-plays this audio instantly, with no play button required.
- Modular agents for fetching context, scheduling, and answer synthesis, each with traceable internal state for debugging.
- Handles scheduling edge-cases: conflict detection, organizer-awareness, multi-step confirmations, and safe time checks.
- RAG system degrades gracefully if vector search/indexing is unavailable (always returns best-effort context).
- UI and API ready for hot-reload and offline heuristic fallback (no vendor lock-in).
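
The hot-reload of meeting.json listed above can be sketched as a small store that re-reads the file whenever its modification time changes. This is a minimal illustration, not the project's actual code: the `MeetingStore` class and its `get` method are hypothetical names, and the choice to keep the last good snapshot when the file is mid-edit is an assumption.

```python
import json
import os
from typing import Any, Optional


class MeetingStore:
    """Hypothetical helper: reloads a JSON file when its mtime changes."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._mtime_ns: Optional[int] = None
        self._data: Any = None

    def get(self) -> Any:
        # Compare the file's current mtime with the one seen at the last load.
        mtime_ns = os.stat(self.path).st_mtime_ns
        if mtime_ns != self._mtime_ns:
            try:
                with open(self.path, encoding="utf-8") as f:
                    self._data = json.load(f)
                self._mtime_ns = mtime_ns
            except json.JSONDecodeError:
                # The file may be half-written by an external editor;
                # keep serving the last good snapshot and retry next call.
                pass
        return self._data
```

Calling `get()` at the start of each request keeps agent context in step with external edits without a file watcher or a restart, at the cost of one `stat` call per request.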

How we built it

Backend: FastAPI for endpoint orchestration, session management, and agent/message workflow.
Agents: Modular Python classes for each reasoning pipeline (SmartFetcher, Scheduler, Conversation Analysis).
Chat endpoint chains intent detection, information retrieval, answer synthesis, and audio generation.
LLM: OpenRouter (Claude) preferred, but gracefully degrades to local heuristics/stubs for offline testing.
TTS: ElevenLabs API if available (returns base64/URL); automatic fallback to browser TTS for resiliency.
Frontend: SPA with a dynamic chat UI; triggers audio playback programmatically using the returned audio_url with JS, ensuring immediate voice response.
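
As a rough sketch of the TTS fallback described above, the backend can try a server-side synthesizer and, on any failure, return a payload that tells the frontend to speak the text itself. The `build_audio_payload` function, the `synthesize` callable (standing in for an ElevenLabs request), and the response field names are all assumptions made for illustration; the real endpoint may return an audio_url instead of base64.

```python
import base64
from typing import Callable, Optional


def build_audio_payload(
    text: str, synthesize: Optional[Callable[[str], bytes]]
) -> dict:
    """Return server audio for `text`, or signal the client to use browser TTS."""
    if synthesize is not None:
        try:
            audio = synthesize(text)  # hypothetical wrapper around the TTS API
            return {
                "text": text,
                "tts": "server",
                "audio_base64": base64.b64encode(audio).decode("ascii"),
            }
        except Exception:
            # Missing key, timeout, quota error: fall through to browser TTS.
            pass
    return {"text": text, "tts": "browser", "audio_base64": None}
```

With this shape the frontend only needs one branch: play the returned audio when `tts` is `"server"`, otherwise pass `text` to the Web Speech API.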

Challenges we ran into

Handling ElevenLabs TTS failures and guaranteeing a voice fallback (browser SpeechSynthesis).
Maintaining state and ensuring UI reflects real-time meeting changes as meeting.json is edited externally.
Graceful LLM/embedding degradation for RAG when cloud APIs or local modules are not available.
Ensuring auto-play audio works across browsers, respecting autoplay policies while guaranteeing feedback to the user.
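
The graceful RAG degradation mentioned above can be illustrated with a retriever that prefers vector search and falls back to plain keyword overlap when the index is missing or raises. This is a sketch under stated assumptions: `retrieve` and its `vector_search` parameter are hypothetical names, and keyword overlap is a stand-in heuristic, not necessarily the one the project uses.

```python
from typing import Callable, List, Optional, Sequence


def retrieve(
    query: str,
    docs: Sequence[str],
    vector_search: Optional[Callable[[str, int], Sequence[str]]] = None,
    k: int = 3,
) -> List[str]:
    """Best-effort retrieval: vector search first, keyword overlap otherwise."""
    if vector_search is not None:
        try:
            return list(vector_search(query, k))
        except Exception:
            # Index or embedding backend unavailable; use the heuristic below.
            pass
    terms = set(query.lower().split())
    # Score each document by how many query terms it shares.
    scored = [(len(terms & set(d.lower().split())), d) for d in docs]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [d for score, d in ranked if score > 0][:k]
```

Because the fallback needs no external service, the chat endpoint can still attach some context to the prompt during offline testing.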

Accomplishments that we're proud of

Agentic design that’s explainable, modular, and easy to extend or debug.
True hot-reload of meeting data, bridging file changes to agent context without a restart.
Automatic, programmatic voice responses with TTS, robust to API outages or missing keys.
Rich confirmation flows for scheduling, including conflict detection and explicit user confirmations.
Full test coverage for both agent logic and offline scheduling flows.
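
The conflict detection behind these confirmation flows comes down to an interval-overlap check. The sketch below is illustrative only: `find_conflicts` and the `Slot` alias are hypothetical names, and it assumes meetings are half-open intervals, so back-to-back meetings do not count as conflicts.

```python
from datetime import datetime
from typing import Iterable, List, Tuple

Slot = Tuple[datetime, datetime]  # (start, end), with end exclusive


def find_conflicts(proposed: Slot, existing: Iterable[Slot]) -> List[Slot]:
    """Return every existing slot that overlaps the proposed one."""
    start, end = proposed
    if start >= end:
        raise ValueError("proposed slot must end after it starts")
    # Two half-open intervals overlap when each starts before the other ends.
    return [(s, e) for s, e in existing if s < end and start < e]
```

An empty result means the slot is free and the assistant can go straight to the explicit user confirmation; a non-empty result gives it the specific meetings to mention when proposing an alternative.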

What we learned

Robust agent and retrieval pipelines need layered fallbacks to avoid brittle end-user experiences (cloud + local + browser fallback).
Real-world meeting data is messy—live hot-reload and organizer-aware logic are essential.
Browser audio policies impact UX; auto-playable audio builds trust and immediacy, justifying fallback layers.[10]
Open, composable design is critical for rapid iteration and debugging in research or production settings.

What's next for Calendar-Genie

Real-time agenda extraction, action item tracking, and deep-dive Q&A over entire meeting corpora.
Expanded voice options (multi-voice, speaker diarization), streaming TTS for large responses.
OAuth and persistent cloud calendar integrations for team contexts.
Enhanced frontend UI for inline bot suggestions, batch meeting management, and proactive notifications.
Broader RAG and embedding support, and explicit privacy/hosting recipes for enterprise and research users.

Built With

css
dotenv
elevenlabs-api
fastapi
google-oauth
html
httpx
huggingface-embeddings
javascript
jwt
llama-index
oauth-2.0
openrouter
pydantic
python
starlette
uvicorn
web-speech-api