Inspiration

The inspiration for Acharya came from observing how fragmented and time-consuming the learning process is. When someone wants to deeply understand a complex topic like "Quantum Mechanics" or "Machine Learning," they need to juggle multiple resources: reading articles, creating flashcards, finding practice quizzes, watching videos, and searching for visual aids. This scattered approach is cognitively demanding and inefficient.

We envisioned an AI system that could transform a single topic into a complete, multi-format learning curriculum autonomously. The name "Acharya" comes from Sanskrit, meaning "teacher" or "guide," reflecting our vision of creating an AI that doesn't just answer questions but builds entire educational experiences.

What it does

Acharya is an advanced multi-agent AI system that takes a single user-provided topic and automatically generates a comprehensive learning module consisting of:

  • Structured Web Content: Detailed educational articles broken down into logical subtopics with clear explanations and depth
  • Flashcards: Key concepts extracted from the content for memorization and quick review
  • Interactive Quizzes: Multiple-choice questions with explanations to test comprehension
  • Educational Podcasts: Engaging conversational audio content featuring two AI hosts (Alice and Bob) discussing the topic
  • Visual Aids: Relevant images automatically sourced and downloaded to enhance understanding

The system intelligently breaks down broad topics into 5-10 subtopics and generates all content formats in parallel, significantly reducing wait times while maintaining coherence across all materials.

How we built it

We built Acharya using the Google Agent Development Kit (ADK) with a sophisticated multi-agent architecture:

Core Technology Stack:

  • Google ADK for agent orchestration
  • Gemini 2.5 Flash and Gemini 2.0 Flash models for content generation
  • Gemini TTS for podcast audio synthesis
  • SerpAPI for image search and retrieval
  • FastAPI for the backend API server
  • React + Vite for the frontend interface
  • SQLite with aiosqlite for async session management

Architecture Design:

  1. Topic Generator Agent: Uses Pydantic schemas with strict output_schema enforcement to break down the main topic into 5-10 structured subtopics, ensuring machine-parseable JSON output

  2. Factory Agent Pattern: Implements dynamic agent creation where sub-agents are instantiated at runtime based on the number of generated subtopics, enabling scalability from 1 to 10+ subtopics without code changes

  3. Parallel Orchestration: Each subtopic follows a sequential pipeline (web content first, then parallel auxiliary content), but all subtopic pipelines run concurrently using ParallelAgent, reducing total execution time from O(n × t_avg) to O(t_max + overhead)

  4. Specialized Sub-Agents:

    • Web Page Agent researches and writes core educational content
    • Flashcard Agent extracts key facts into Q&A pairs
    • Quiz Agent generates multiple-choice questions
    • Podcast Agent creates conversational scripts and converts them to audio
    • Image Agent searches and downloads relevant visual materials

  5. Session State Management: Uses context variable injection ({variable} syntax) to pass data between agents while maintaining clean interfaces
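The orchestration shape described above (web content first, auxiliary formats in parallel, all subtopics concurrent) can be sketched with plain asyncio. This is only an illustration under stated assumptions: it does not use the ADK or ParallelAgent, and run_agent, subtopic_pipeline, build_module, and the sleep durations are hypothetical stand-ins for real agent calls.

```python
import asyncio
import time


async def run_agent(name: str, seconds: float) -> str:
    """Hypothetical stand-in for one agent call; sleeps instead of calling a model."""
    await asyncio.sleep(seconds)
    return f"{name}: done"


async def subtopic_pipeline(index: int) -> dict:
    # Step 1: the web content must exist before anything is derived from it.
    web = await run_agent(f"webpage_content_{index}", 0.02)
    # Step 2: the auxiliary formats depend only on the web content, so they run together.
    aux_names = ["flashcards", "quiz", "podcast", "images"]
    aux = await asyncio.gather(*(run_agent(f"{n}_{index}", 0.02) for n in aux_names))
    return {"index": index, "web": web, "aux": list(aux)}


async def build_module(n_subtopics: int) -> list:
    # Every subtopic pipeline starts at once; gather returns results in input order.
    return await asyncio.gather(*(subtopic_pipeline(i + 1) for i in range(n_subtopics)))


if __name__ == "__main__":
    start = time.perf_counter()
    module = asyncio.run(build_module(5))
    print(len(module), "subtopics in", round(time.perf_counter() - start, 2), "s")
```

Because the pipelines overlap, the wall-clock time in this sketch tracks the slowest single pipeline rather than the sum of all of them, which is the effect the O(t_max + overhead) estimate describes.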

Challenges we ran into

Challenge 1: Context Variable Injection

Agents needed access to runtime data like subtopic names and generated content. Initially, we tried passing data through function arguments, but this broke the ADK's agent interface. We solved this by using session state injection via the {variable} syntax in agent instructions, allowing dynamic data injection while maintaining clean agent interfaces.
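The idea of filling {variable} placeholders from session state can be illustrated without the ADK. The sketch below is an assumption-laden stand-in, not the framework's own substitution code: render_instruction and the subtopic_1 key are hypothetical, and only the webpage_content_1 key follows the naming mentioned in this write-up.

```python
def render_instruction(template: str, state: dict) -> str:
    """Fill {placeholders} in an instruction template from a session-state dict."""
    try:
        return template.format_map(state)
    except KeyError as missing:
        # Fail loudly when an upstream agent has not written the expected key yet.
        raise KeyError(f"session state has no value for placeholder {missing}") from None


# Hypothetical state written by earlier agents in the pipeline.
state = {
    "subtopic_1": "Wave-particle duality",
    "webpage_content_1": "Light shows both wave-like and particle-like behaviour.",
}
template = "Write flashcards about {subtopic_1} using only this article: {webpage_content_1}"

if __name__ == "__main__":
    print(render_instruction(template, state))
```

Keeping the data in state rather than in function arguments means the agent's instruction is the only place that needs to know which keys it reads.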

Challenge 2: Naming Collisions in Parallel Execution

When creating multiple agents dynamically, they initially had identical names, causing state collisions where agents would overwrite each other's output. We implemented a strict naming convention using loop indices (e.g., web_page_content_function_agent_{i+1}, webpage_content_{i+1}), ensuring each agent writes to a unique key in the session state.
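A minimal sketch of the indexed naming convention follows. It uses plain dictionaries rather than real agent objects; make_agent_specs, write_outputs, and the dictionary field names are hypothetical, while the two name patterns are the ones quoted above.

```python
def make_agent_specs(subtopics: list) -> list:
    """Build one spec per subtopic, with an index-suffixed name and state key."""
    specs = []
    for i, subtopic in enumerate(subtopics):
        specs.append({
            "name": f"web_page_content_function_agent_{i + 1}",
            "output_key": f"webpage_content_{i + 1}",
            "subtopic": subtopic,
        })
    return specs


def write_outputs(specs: list) -> dict:
    """Simulate each agent writing its result into a shared state dict."""
    state = {}
    for spec in specs:
        if spec["output_key"] in state:
            # With identical keys, a later agent would silently overwrite an earlier one.
            raise ValueError(f"state key collision: {spec['output_key']}")
        state[spec["output_key"]] = f"article about {spec['subtopic']}"
    return state


if __name__ == "__main__":
    specs = make_agent_specs(["Superposition", "Entanglement", "Measurement"])
    print(sorted(write_outputs(specs)))
```

The collision check is an extra safeguard added for illustration; the convention itself only requires that the index appears in both the agent name and its output key.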

Challenge 3: API Rate Limiting

Spinning up 10+ parallel agents simultaneously triggered 429 Too Many Requests errors from the Gemini API. We implemented a staged execution strategy with strategic 30-60 second delays between major pipeline stages, reducing API pressure while maintaining parallelism benefits.
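Staged execution with a pause between stages can be sketched as below. This is an assumption-based illustration, not the project's code: run_staged, fake_call, and the stage contents are hypothetical, and the 30-second value is taken from the lower end of the range quoted above.

```python
import asyncio
from functools import partial


async def fake_call(label: str) -> str:
    """Hypothetical stand-in for one model request."""
    await asyncio.sleep(0)
    return label


async def run_staged(stages, delay_s: float, sleep=asyncio.sleep) -> list:
    """Run the jobs inside each stage concurrently, pausing between stages."""
    results = []
    for position, stage in enumerate(stages):
        if position > 0:
            # Let the rate-limit window recover before the next burst of requests.
            await sleep(delay_s)
        results.append(await asyncio.gather(*(job() for job in stage)))
    return results


# Each stage is a list of zero-argument callables that return a coroutine.
stages = [
    [partial(fake_call, "web_1"), partial(fake_call, "web_2")],
    [partial(fake_call, "quiz_1"), partial(fake_call, "quiz_2")],
]

if __name__ == "__main__":
    print(asyncio.run(run_staged(stages, delay_s=30)))
```

The sleep parameter is injectable so that the pause can be replaced in tests; within a stage the requests still run in parallel, so only the boundaries between stages are slowed down.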

Challenge 4: TTS Generation Reliability

The Gemini TTS API occasionally returned 503 Service Unavailable errors, breaking podcast generation. We implemented exponential backoff with 3 retry attempts (delays of 1s, 2s, 4s).
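The retry policy can be sketched as follows, reading "3 retry attempts" as one initial call plus up to three retries that wait 1s, 2s, and 4s respectively. That reading is an assumption, as are the names call_with_backoff and ServiceUnavailable; no real TTS client is called here.

```python
import time


class ServiceUnavailable(Exception):
    """Hypothetical stand-in for a 503 response from a TTS request."""


def call_with_backoff(fn, retries: int = 3, base_delay: float = 1.0, sleep=time.sleep):
    """Call fn(), retrying on ServiceUnavailable with doubling delays."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except ServiceUnavailable:
            if attempt == retries:
                # Out of retries: surface the error so the caller can degrade gracefully.
                raise
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s with the defaults


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky_tts():
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise ServiceUnavailable()
        return b"audio-bytes"

    # Uses a no-op sleep so the demo finishes immediately.
    print(call_with_backoff(flaky_tts, sleep=lambda seconds: None))
```

Only the error type that signals a transient failure is retried; any other exception propagates immediately.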

Challenge 5: Frontend Timeout Issues

The frontend would time out before content generation completed, especially for complex topics with 10 subtopics. We solved this by increasing the HTTP timeout to 300 seconds, implementing progressive rendering to show content as it is generated, adding WebSocket support for real-time progress updates, and providing a fallback CLI interface for long-running tasks.
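On the server side, progressive delivery amounts to handing over each result as soon as it is ready instead of waiting for the whole module. The sketch below shows that idea with asyncio.as_completed; it is a simplified assumption, not the project's FastAPI or WebSocket code, and generate, stream_results, and emit are hypothetical names.

```python
import asyncio


async def generate(label: str, seconds: float) -> str:
    """Hypothetical stand-in for generating one piece of content."""
    await asyncio.sleep(seconds)
    return label


async def stream_results(jobs, emit, timeout_s: float = 300) -> None:
    """Call emit(result) as each job finishes, within an overall time budget."""
    tasks = [asyncio.ensure_future(job) for job in jobs]
    # as_completed yields awaitables in completion order, not submission order.
    for finished in asyncio.as_completed(tasks, timeout=timeout_s):
        emit(await finished)


if __name__ == "__main__":
    jobs = [generate("podcast", 0.1), generate("flashcards", 0.01)]
    asyncio.run(stream_results(jobs, emit=print))
```

In a real service, emit would push a message to the client (for example over a WebSocket) rather than print it.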

Accomplishments that we're proud of

1. Production-Grade Multi-Agent Orchestration: We successfully implemented a sophisticated multi-agent system that dynamically scales based on topic complexity, handling anywhere from 5 to 10+ concurrent agent pipelines with proper state management and error handling.

2. Strict Output Enforcement: By leveraging Pydantic schemas and the output_schema parameter, we achieved 100% reliability in agent-to-agent communication, eliminating an entire class of runtime errors from malformed JSON responses.

3. Parallel Execution Performance: Our parallel orchestration architecture reduced content generation time by approximately 70-80% compared to sequential execution, making the system practical for real-world use.

4. Multi-Format Content Generation: We built a system that doesn't just generate text but creates a complete learning ecosystem with flashcards, quizzes, podcasts, and images, all coherent and derived from the same source material.

5. Resilient Error Handling: Through exponential backoff, multi-source fallback, and graceful degradation, we achieved a system that continues to function even when individual components fail, providing partial results rather than complete failure.

6. Clean User Experience: We created both a CLI and web interface with real-time progress updates, making a complex backend accessible to end users through a polished, responsive UI.

What we learned

Technical Learnings:

  1. Agent Specialization Over Monolithic Design: Breaking down complex tasks into specialized agents dramatically improves both output quality and system maintainability. Each agent can be optimized, tested, and debugged independently.

  2. Structured Output is Non-Negotiable: In production multi-agent systems, using Pydantic schemas to enforce JSON structure eliminates runtime errors and ensures reliable inter-agent communication.

  3. Asynchronous Programming Mastery: Building Acharya required deep understanding of Python's asyncio library, including proper async/await chaining, concurrent task management with asyncio.gather(), timeout handling, and async database operations.

  4. Parallelism Requires Careful Orchestration: While parallel execution dramatically improves performance, it introduces challenges around unique naming, state management, and API rate limits that must be carefully addressed.

  5. Resilience Must Be Built In: Production AI systems need retry logic, fallback mechanisms, exponential backoff, and graceful degradation. Reliability doesn't happen by accident—it must be intentionally architected.

  6. State Management Complexity: Managing shared state across concurrent agents while avoiding race conditions and collisions requires strict conventions and careful design.
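Two of the points above, concurrent task management with asyncio.gather() and graceful degradation, combine naturally. The following sketch assumes a hypothetical make_format coroutine and build_with_partial_results helper; it shows one way to keep the successful outputs when a single generator fails, not the project's actual implementation.

```python
import asyncio


async def make_format(name: str, fail: bool = False) -> str:
    """Hypothetical stand-in for one content generator."""
    await asyncio.sleep(0)
    if fail:
        raise RuntimeError(f"{name} generation failed")
    return f"{name} ready"


async def build_with_partial_results(jobs: dict, timeout_s: float = 300):
    """Run all jobs concurrently and separate successes from failures."""
    names = list(jobs)
    # return_exceptions=True keeps one failure from cancelling the whole batch.
    outcomes = await asyncio.wait_for(
        asyncio.gather(*jobs.values(), return_exceptions=True),
        timeout=timeout_s,
    )
    done = {n: o for n, o in zip(names, outcomes) if not isinstance(o, Exception)}
    failed = {n: str(o) for n, o in zip(names, outcomes) if isinstance(o, Exception)}
    return done, failed


if __name__ == "__main__":
    jobs = {
        "quiz": make_format("quiz"),
        "podcast": make_format("podcast", fail=True),
    }
    print(asyncio.run(build_with_partial_results(jobs)))
```

The caller receives both dictionaries, so it can render what succeeded and report what did not.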

Conceptual Insights:

  1. User Experience Matters: Even the most sophisticated backend is useless without a clean, responsive interface that provides real-time feedback and handles edge cases gracefully.

  2. API Limits Are Real Constraints: Theoretical performance gains from parallelism must be balanced against practical API rate limits, requiring strategic delays and staged execution.

What's next for Acharya

Short-term Enhancements:

  1. Adaptive Content Difficulty: Implement user profiling to adjust content complexity based on the learner's background (beginner, intermediate, advanced)

  2. Interactive Practice Mode: Add a study mode where users can practice with the generated flashcards and quizzes, tracking their progress and weak areas

  3. Multi-Language Support: Extend content generation to support multiple languages, making quality education accessible globally

  4. Citation and Source Tracking: Implement automatic citation generation and source tracking for all generated content to ensure academic integrity

Medium-term Goals:

  1. Personalized Learning Paths: Use machine learning to analyze user performance on quizzes and dynamically adjust the curriculum, focusing on areas where the learner struggles

  2. Collaborative Learning: Enable users to share generated content, create study groups, and collaborate on learning specific topics

  3. Video Content Generation: Integrate video generation capabilities to create animated explanations and visual demonstrations

  4. Integration with LMS Platforms: Build connectors for popular Learning Management Systems (Canvas, Moodle, Blackboard) to allow educators to import Acharya-generated content directly

Long-term Vision:

  1. Real-time Tutoring Agent: Develop an interactive tutoring mode where users can ask follow-up questions and receive personalized explanations based on the generated curriculum

  2. Assessment and Certification: Implement comprehensive assessment systems that can provide certificates upon mastery of topics, potentially partnering with educational institutions

  3. Curriculum Marketplace: Create a platform where educators can share, rate, and monetize high-quality curricula generated by Acharya

  4. Offline Mode: Enable content generation and caching for offline access, making quality education available in low-connectivity environments

Our ultimate vision is to democratize access to high-quality, personalized education by making comprehensive learning materials available to anyone, anywhere, on any topic, transforming Acharya from a content generator into a complete AI-powered educational platform.
