StudySync AI: The Contextual Learning Partner

Inspiration

The core inspiration is solving "Save-It-for-Later" Paralysis and the "Firehose" Effect. Professionals are overwhelmed by the daily flood of new material (AI, crypto, etc.), which leads to:

  • The "Note-Taking Tax": The prohibitive manual effort required to organize and style raw inputs (links, PDFs) into a personalized knowledge base.
  • Curriculum Paralysis: Not knowing where to start, or which prerequisite knowledge is needed, when faced with a mass of unstructured content.
  • Passive Tooling: Existing tools wait for user input rather than proactively creating a plan to turn saved content into actionable skills.

What it does

StudySync AI is an autonomous Cognitive Learning Manager and Proactive Agent. It automates the entire self-study lifecycle:

  • Ingest & Transform: Turns raw, messy materials (PDFs, links, audio) into a clean, personalized Knowledge Bank by applying the user's custom "Learning DNA" (preferred format and tone) through our Style Sequencer, which runs on the Gemini 3 API.
  • Proactive Scheduling: Automatically clusters materials by topic, generates a structured study plan, and proposes it to the user.
  • Contextual Serving: Once the calendar booking logic is complete, it will book conflict-free "Study Sessions" and dynamically adapt the content format (e.g., Audio Summary for a commute slot, Text Note for a desk slot).

How we built it

The application is a Next.js/FastAPI stack backed by a Supabase (PostgreSQL) database.

  • AI Engine: Google Gemini 3 (Multimodal Reasoning & Vision) for all synthesis, style transfer, effort estimation, and complex reasoning.
  • Data & Logic: We use pgvector for Topic Clustering to group bulk-dumped files.
  • Execution: We have configured the Google Calendar API (read/write access), but the core booking and deletion logic is not yet implemented and remains a critical next step.
  • Core Modules: Three distinct microservices (agents) handle the flow: The Smart Library (Ingest), The Proactive Proposer (Plan Generation), and The Contextual Booker (Calendar Execution).

🤖 The Multi-Agent Ecosystem: Gemini-Powered Personalization

We orchestrated a Multi-Agent System in which each agent uses specific Gemini capabilities to achieve hyper-personalization:

  • The Ingestion Agent ("The Librarian"): Uses Gemini's 2M token window and multimodal vision to ingest entire textbooks, understanding diagrams, charts, and layout rather than just raw text. It builds a semantic map of the content.
  • The Profile Agent ("The DNA Decoder"): Analyzes user interaction history and explicit preferences to construct a "Learning DNA" profile (e.g., "Visual Learner" + "Socratic Tone").
  • The Synthesis Agent ("The Director"): Connects the dots. It uses Gemini's complex reasoning to make decisions such as: "This user is struggling with abstraction. Generate real-world analogies." It then instructs the Style Sequencer to produce the multimodal format (e.g., a "Concept Map" video via Veo 3 or a "Podcast Script" rendered via TTS) that matches both the content and the user's cognitive style.
  • The Style Sequencer: Our proprietary logic layer that ensures the multimodal output follows a pedagogical arc (the "hero's journey" of learning) rather than random generation.

Challenges we ran into

The primary challenge was ensuring genuine, meaningful use of a multimodal model across the entire lifecycle, with multimodal output as the main focus.

  • Targeted Multi-Format Generation: It is hard to build logic that generates additional resource-intensive formats (like audio/video) for a topic only when the user's inferred routine (e.g., a "Commute Slot") explicitly requires it. Getting this right prevents wasteful computation and demonstrates meaningful multimodal usage.
  • Style Transfer Consistency: Tuning the Gemini 3 system prompt (style_instructions) so that it consistently enforces a user's qualitative style (e.g., "Cornell Notes with emoji headers") across many diverse source materials.
  • Calendar Booker Completion: Finalizing the P0 logic to insert and manage events in GCal is a major milestone that requires dedicated focus.

Accomplishments that we're proud of

  • The "Effortless Bank" P0: We achieved the critical Steel Thread of turning an uploaded file into a stylized, formatted "Master Note" ready for consumption.
  • Multi-Modal Output: We developed the logic for Targeted Multi-Format Generation, including the creation of a high-quality Markdown note and a two-person Podcast Script (with plans for real TTS integration). The ability to generate a video prompt using the "Triad Formula" is a major conceptual accomplishment.
  • Draft Plan Generation: The PlanGeneratorService estimates reading time and heuristically proposes a complete study schedule based on bulk-uploaded materials.

What we learned

The project showed us that a truly proactive learning agent requires more than content generation. The context of when and where a user is learning is just as critical as the content itself. This realization drove the push for the Contextual Partner vision: the user's daily routine must be the central organizing logic for a successful "Learning Partner."

What's next for StudySync AI

The immediate next steps focus on completing the P0/P1 features to prove the core loop:

  • Calendar Booker Completion: Finalizing the logic for the Contextual Booker to insert events and save the google_event_id (Milestone 4).
  • Full Multi-Format Integration: Integrating Text-to-Speech (TTS) for real audio playback and implementing Text-to-Image/Video generation for visual concepts using cinematic styles (the "Triad Formula").
  • Contextual Serving: Implementing the core deep-link logic so that the calendar event dynamically serves the correct format (e.g., Audio vs. Text) based on the inferred time slot.
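The contextual serving and targeted generation ideas described above can be pictured with a small sketch. It assumes each planned session carries an inferred slot type, maps that slot type to the format to serve, and only requests expensive formats that some slot actually needs. The names SlotType, ContentFormat, pick_format, and formats_to_generate are hypothetical and not taken from the StudySync codebase; the commute-to-audio and desk-to-text pairing simply follows the examples given in the text.

```python
from enum import Enum


class SlotType(Enum):
    """Inferred context of a calendar slot (hypothetical names)."""
    COMMUTE = "commute"
    DESK = "desk"


class ContentFormat(Enum):
    """Formats a topic may have been generated in (hypothetical names)."""
    AUDIO_SUMMARY = "audio_summary"
    TEXT_NOTE = "text_note"


# Assumed mapping, following the examples in the text:
# audio for a commute slot, text for a desk slot.
PREFERRED_FORMAT = {
    SlotType.COMMUTE: ContentFormat.AUDIO_SUMMARY,
    SlotType.DESK: ContentFormat.TEXT_NOTE,
}


def pick_format(slot: SlotType, available: set[ContentFormat]) -> ContentFormat:
    """Return the preferred format for the slot, or the text note as a fallback."""
    preferred = PREFERRED_FORMAT[slot]
    if preferred in available:
        return preferred
    # Assumption: the text note always exists, so it is a safe default.
    return ContentFormat.TEXT_NOTE


def formats_to_generate(planned_slots: list[SlotType]) -> set[ContentFormat]:
    """Only request formats that some planned slot actually needs."""
    needed = {ContentFormat.TEXT_NOTE}  # assumed to be produced for every topic
    for slot in planned_slots:
        needed.add(PREFERRED_FORMAT[slot])
    return needed
```

With this shape, a topic whose sessions all fall in desk slots never triggers audio generation, which is the waste the Targeted Multi-Format Generation challenge is trying to avoid.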

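As one way to picture the conflict-free booking that the Contextual Booker still needs, the sketch below scans a day's busy intervals and returns the earliest gap long enough for a study session. It works on plain datetime pairs and makes no Google Calendar API calls; find_free_slot and its parameters are hypothetical names, and fetching the busy intervals, inserting the event, and storing the google_event_id are deliberately left out.

```python
from datetime import datetime, timedelta
from typing import Optional


def find_free_slot(
    busy: list[tuple[datetime, datetime]],
    day_start: datetime,
    day_end: datetime,
    duration: timedelta,
) -> Optional[tuple[datetime, datetime]]:
    """Return the earliest (start, end) window of `duration` that overlaps no busy interval."""
    cursor = day_start
    # Walk busy intervals in start order, tracking the earliest free moment.
    for start, end in sorted(busy):
        if start - cursor >= duration:
            break  # the gap before this event is long enough
        cursor = max(cursor, end)
    # The window must also fit before the end of the day.
    if cursor + duration <= day_end:
        return cursor, cursor + duration
    return None
```

A real booker would likely search several days and respect the user's inferred routine; this version only shows the overlap check for a single day.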
Built With

  • adk
  • ai/langchain
  • api
  • auth
  • backend
  • calendar
  • database
  • fastapi
  • frontend: next.js 14+ (app router)
  • gemini
  • lucide-react (icons)
  • nextjs
  • oauth
  • orchestration
  • postgresql
  • python
  • react
  • react-dropzone (file upload)
  • react-mermaid2
  • storage
  • tailwind-css

Updates

Here are the steps to run this app:

Prerequisites

  • Docker & Docker Compose (must be running)
  • Node.js 24.x
  • Python 3.10+
  • Google Gemini API Key (in .env)

Start the Application

  • Open a terminal in the project root.
  • Run the startup script:

      bash ./scripts/startup/start-fullstack-dev.sh
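Before running the startup script, the prerequisites listed above could be checked with a short preflight sketch like the one below. It is only an illustration: the variable name GEMINI_API_KEY is an assumption (the update only says the key lives in .env), the function name is hypothetical, and the check confirms that docker and node are on PATH, not that Docker is actually running or that Node.js is version 24.x.

```python
import shutil
import sys


def missing_prerequisites(env_text: str, key_name: str = "GEMINI_API_KEY") -> list[str]:
    """Return human-readable problems; an empty list means every check passed."""
    problems = []
    if sys.version_info < (3, 10):
        problems.append("Python 3.10+ is required")
    for tool in ("docker", "node"):
        if shutil.which(tool) is None:
            problems.append(f"{tool} was not found on PATH")
    # Look for a non-empty KEY=value line in the .env contents, skipping comments.
    has_key = any(
        line.split("=", 1)[0].strip() == key_name and line.split("=", 1)[1].strip()
        for line in env_text.splitlines()
        if "=" in line and not line.lstrip().startswith("#")
    )
    if not has_key:
        problems.append(f"{key_name} is missing or empty in .env")
    return problems
```

It could be called with the contents of the project's .env file, for example missing_prerequisites(open(".env").read()), and the startup script run only when the returned list is empty.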
