Overview

The JEE Main Question Generator is an AI-powered tool that dynamically creates multiple-choice questions (MCQs) aligned with the official JEE Main syllabus. Rather than relying on real-time document uploads, this system leverages a pre-curated knowledge base built from:

  • JEE Main previous-year questions (PYQs) from 2014–2024
  • Standard NCERT and reference books (HC Verma, Cengage, MS Chauhan, etc.)
  • High-quality coaching study materials

It lets students generate practice questions topic-wise, take timed mock tests, and review their performance, all through a web interface.


Inspiration

While many test prep platforms exist, most either serve static question banks or repeat well-known PYQs. Students studying from coaching notes or books often lack a tool that can generate fresh, unseen questions based on concepts they’re currently revising.

The project was born out of the need to:

  • Generate high-quality, fresh MCQs that test the same depth and patterns as JEE
  • Use existing material (books + PYQs) to ground and guide AI-based generation
  • Help students practice more efficiently by filtering questions topic-wise with exam-level difficulty

Key Features

  • Topic-wise MCQ Generation
    Users can select specific topics or chapters, and the app generates concept-relevant questions on demand.

  • Timed Mock Tests
    The app automatically builds full-length or custom tests with a timer, marking scheme, and scoring.

  • Question Difficulty Control
    Questions are tagged as Easy, Moderate, or Difficult, based on historical JEE question analysis and AI metrics.

  • Detailed Explanations
    Every question comes with an explanation, helping students learn from their mistakes.

  • Attempt History & Retakes
    Tracks performance over time for each subject and topic, with the option to retake earlier tests.
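
The marking scheme mentioned under Timed Mock Tests can be sketched as follows. This is a minimal illustration, assuming the standard JEE Main MCQ scheme of +4 for a correct answer, -1 for an incorrect one, and 0 for an unattempted question. The function and parameter names are hypothetical and not taken from the project's code.

```python
from typing import Optional, Sequence

# Assumed marks per outcome (standard JEE Main MCQ scheme).
CORRECT_MARKS = 4
INCORRECT_MARKS = -1


def score_test(answer_key: Sequence[str], responses: Sequence[Optional[str]]) -> dict:
    """Score a test; a response of None means the question was left unattempted."""
    if len(answer_key) != len(responses):
        raise ValueError("answer_key and responses must have the same length")

    correct = incorrect = unattempted = 0
    for expected, given in zip(answer_key, responses):
        if given is None:
            unattempted += 1
        elif given == expected:
            correct += 1
        else:
            incorrect += 1

    return {
        "correct": correct,
        "incorrect": incorrect,
        "unattempted": unattempted,
        "score": correct * CORRECT_MARKS + incorrect * INCORRECT_MARKS,
    }


# Two correct, one wrong, one skipped: 2 * 4 - 1 = 7 marks.
result = score_test(["A", "B", "C", "D"], ["A", "C", None, "D"])
```

Keeping the per-outcome counts alongside the total makes it straightforward to show accuracy and attempt rate in a performance review, not just the final score.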


Data Sources

The system does not depend on user-uploaded PDFs. Instead, its knowledge base is built from:

  • JEE Main PYQs (2014–2024)
  • NCERT textbooks (Physics, Chemistry, Mathematics)
  • Reference materials from top coaching institutes
  • Hand-annotated tags for concepts, sub-topics, and difficulty level
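
One way to picture a tagged entry in this knowledge base is as a small record like the one below. This is only a sketch: the class name, field names, and sample values are assumptions made for illustration, not the project's actual schema, and the sample question is made up rather than taken from a real paper.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional

DIFFICULTY_LEVELS = ("Easy", "Moderate", "Difficult")


@dataclass
class TaggedQuestion:
    """Hypothetical shape of one annotated knowledge-base entry."""
    subject: str                 # "Physics", "Chemistry", or "Mathematics"
    topic: str                   # chapter-level tag
    sub_topic: str               # finer-grained tag
    difficulty: str              # one of DIFFICULTY_LEVELS
    text: str
    options: List[str]
    correct_index: int           # position of the right option in `options`
    source: str = "PYQ"          # e.g. "PYQ", "NCERT", "coaching"
    year: Optional[int] = None   # only meaningful for PYQs
    concepts: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject records whose tags would break downstream filtering.
        if self.difficulty not in DIFFICULTY_LEVELS:
            raise ValueError(f"unknown difficulty: {self.difficulty!r}")
        if not 0 <= self.correct_index < len(self.options):
            raise ValueError("correct_index must point at one of the options")


example = TaggedQuestion(
    subject="Physics",
    topic="Electrostatics",
    sub_topic="Electric potential energy",
    difficulty="Moderate",
    text="Two point charges of 1 microcoulomb each are held 1 m apart in vacuum. "
         "Taking k = 9 x 10^9 N m^2/C^2, what is their electric potential energy?",
    options=["9 x 10^-3 J", "9 x 10^-6 J", "9 x 10^3 J", "9 x 10^-9 J"],
    correct_index=0,
    source="example",
    concepts=["Coulomb interaction", "potential energy"],
)

# A plain dict like this could be stored as a document in a document database.
document = asdict(example)
```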

AI and Technical Stack

LLM Prompting + Post-Filtering

  • A carefully designed prompt structure is used to generate MCQs with LLaMA-3 or Mistral models served through the Groq API, or with GPT-4.
  • Prompts are conditioned with context extracted from the curated knowledge base.
  • Post-generation filters ensure uniqueness, relevance, and JEE-aligned format.
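
The prompt-then-filter flow described above might look roughly like this. The sketch rests on several assumptions: the prompt wording, the JSON reply format, and the function names are all hypothetical, and the model call itself is left out so the example does not depend on any particular API client.

```python
import json
from typing import List, Optional, Set

PROMPT_TEMPLATE = """You are writing a JEE Main style multiple-choice question.
Topic: {topic}
Difficulty: {difficulty}
Reference context:
{context}

Reply with JSON only, using the keys "question", "options" (exactly 4 strings),
"answer_index" (0-3) and "explanation"."""


def build_prompt(topic: str, difficulty: str, context_snippets: List[str]) -> str:
    """Condition the prompt on snippets pulled from the knowledge base."""
    context = "\n".join(f"- {snippet}" for snippet in context_snippets)
    return PROMPT_TEMPLATE.format(topic=topic, difficulty=difficulty, context=context)


def parse_and_filter(raw_reply: str, seen_questions: Set[str]) -> Optional[dict]:
    """Return the parsed MCQ, or None if it fails a format or uniqueness check."""
    try:
        mcq = json.loads(raw_reply)
    except json.JSONDecodeError:
        return None
    if not isinstance(mcq, dict):
        return None

    question = mcq.get("question")
    options = mcq.get("options")
    answer_index = mcq.get("answer_index")

    # Format checks: non-empty stem, four distinct string options, valid answer.
    if not isinstance(question, str) or not question.strip():
        return None
    if not isinstance(options, list) or len(options) != 4:
        return None
    if not all(isinstance(o, str) for o in options) or len(set(options)) != 4:
        return None
    if isinstance(answer_index, bool) or not isinstance(answer_index, int):
        return None
    if not 0 <= answer_index <= 3:
        return None
    if not mcq.get("explanation"):
        return None

    # Uniqueness check on case- and whitespace-normalized question text.
    key = " ".join(question.lower().split())
    if key in seen_questions:
        return None
    seen_questions.add(key)
    return mcq
```

The uniqueness check here only catches exact repeats after normalization; catching paraphrased repeats would need a similarity measure such as the embedding comparison described in the next section.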

Vector-Based Semantic Matching

  • Questions and concepts are vectorized using Sentence-BERT embeddings.
  • Concept-topic mapping ensures that generated questions remain semantically aligned with selected topics.
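
As a rough illustration of the topic-alignment check, the sketch below compares a question vector against one vector per topic using cosine similarity. It uses small hand-written vectors so it runs without any dependencies; in the real system the vectors would come from a Sentence-BERT model. The function names and the 0.5 threshold are assumptions, not values from the project.

```python
import math
from typing import Dict, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def closest_topic(
    question_vec: Sequence[float], topic_vecs: Dict[str, Sequence[float]]
) -> Tuple[str, float]:
    """Return the topic whose vector is most similar to the question, with its score."""
    if not topic_vecs:
        raise ValueError("topic_vecs must not be empty")
    scores = {t: cosine_similarity(question_vec, v) for t, v in topic_vecs.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


def is_on_topic(
    question_vec: Sequence[float],
    selected_topic: str,
    topic_vecs: Dict[str, Sequence[float]],
    min_similarity: float = 0.5,  # placeholder threshold
) -> bool:
    """Accept a question only if its nearest topic is the one the user selected."""
    best, score = closest_topic(question_vec, topic_vecs)
    return best == selected_topic and score >= min_similarity


# Toy 3-dimensional "embeddings" standing in for real model output.
topic_vecs = {"Electrostatics": [1.0, 0.0, 0.0], "Kinematics": [0.0, 1.0, 0.0]}
question_vec = [0.9, 0.1, 0.0]
```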

Web App Stack

  • Frontend: Next.js + Tailwind CSS for fast, responsive UI
  • Backend: Flask + FastAPI for asynchronous LLM calls and DB operations
  • Database: MongoDB to store users, question metadata, test history

What I Learned

  • How to structure a domain-specific AI generation pipeline (education-focused)
  • Tuning prompts and filtering strategies for quality control in MCQ generation
  • Building semantic embedding search pipelines for topic and concept alignment
  • MongoDB schema design for storing generated tests, user sessions, and question logs
  • Real-time performance optimization for AI-backed interfaces

Challenges Faced

  • Preventing hallucinations from LLMs despite feeding verified academic content
  • Maintaining JEE-level difficulty, especially for Math and Physics where derivations and accuracy matter
  • Balancing between novelty and familiarity: ensuring generated questions aren't direct copies of PYQs but still test the same concepts
  • Designing robust topic tagging across multiple subjects with overlapping concepts

Future Directions

  • Add Numerical Answer Type (NAT) support
  • Enable personalized question generation based on weak areas
  • Add leaderboards, streaks, and gamification for engagement
  • Export to print-friendly PDFs for offline practice
  • Train a custom fine-tuned educational model on the JEE corpus

LaTeX Support in Questions

Equations and scientific notation are rendered using LaTeX for better clarity. Example:

Let the electric potential energy between two charges be:

$$ U = \frac{k q_1 q_2}{r} $$

Questions involving such expressions are rendered cleanly within the UI and explanation.


Conclusion

This JEE Question Generator combines curated data, semantic embeddings, and LLMs to produce high-quality, concept-aligned questions. It’s built for students who want smarter practice without manually filtering through old question papers or PDFs. With AI and educational insight, it delivers tailored, rigorous, and scalable test prep for one of India’s toughest exams.

Accomplishments that we're proud of

  • Successfully generated over 10,000 high-quality MCQs across Physics, Chemistry, and Mathematics, aligned with the JEE Main pattern.
  • Built a modular and scalable system capable of generating topic-specific tests with timed interfaces and detailed explanations.
  • Integrated LLM prompting with post-filtering and embedding-based relevance checking for real-world educational use.
  • Created a full-stack web platform with user authentication, test history, and performance tracking.
  • Designed an AI evaluation pipeline that filters out hallucinated or invalid questions using semantic similarity with real PYQs.
  • Maintained a balance between innovation and exam authenticity, generating unseen questions that still match real exam logic and difficulty.
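
The similarity-based evaluation and the novelty balance listed above can be pictured as a two-sided threshold on each candidate's closest PYQ match. The sketch below is an assumption about how such a filter could work, not the project's implementation: the function names and both thresholds are placeholders, and the vectors are assumed to be L2-normalized so that a dot product equals cosine similarity.

```python
from typing import Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product; equals cosine similarity when both vectors have unit length."""
    return sum(x * y for x, y in zip(a, b))


def classify_candidate(
    candidate_vec: Sequence[float],
    pyq_vecs: Sequence[Sequence[float]],
    low: float = 0.35,   # placeholder: below this, nothing in the PYQ set resembles it
    high: float = 0.92,  # placeholder: above this, it is close to a copy of a PYQ
) -> str:
    """Label a generated question by its best similarity to any real PYQ."""
    if not pyq_vecs:
        raise ValueError("pyq_vecs must not be empty")
    best = max(dot(candidate_vec, v) for v in pyq_vecs)
    if best < low:
        return "off_pattern"
    if best > high:
        return "near_copy"
    return "accept"


# Toy unit vectors standing in for real embeddings.
pyq_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
labels = [
    classify_candidate([1.0, 0.0, 0.0], pyq_vecs),  # identical to a PYQ
    classify_candidate([0.6, 0.8, 0.0], pyq_vecs),  # related but distinct
    classify_candidate([0.0, 0.0, 1.0], pyq_vecs),  # unrelated to every PYQ
]
```

A check like this measures resemblance to past questions rather than correctness, so it would sit alongside, not replace, validation of the answer and explanation.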

What we learned

  • How to effectively prompt large language models for structured educational content while keeping hallucination in check.
  • Building AI workflows that mix retrieval-augmented generation (RAG) with fine-grained topic tagging.
  • How to leverage historical exam data (JEE PYQs) to create accurate difficulty metrics and concept maps.
  • Deploying and scaling asynchronous Flask APIs for handling heavy generation workloads from LLMs.
  • Ensuring pedagogical value in AI-generated questions by incorporating logic-driven filtering, validation, and explanations.
  • The importance of modular backend design when combining user data, AI inference, and document-based content systems.

What's next for Test-Generator

  • Fine-tune an LLM on JEE-style MCQs to reduce dependency on third-party APIs and improve generation speed and quality.
  • Add support for Numerical Answer Type (NAT) and Matrix Match question formats found in JEE Advanced.
  • Implement user analytics and adaptive testing, where weak topics get prioritized in generated tests.
  • Launch a mobile app version to make test generation and revision available on-the-go.
  • Integrate voice-based commands and handwriting recognition for students who revise from offline notes.
  • Collaborate with educators to allow manual validation, annotation, and publishing of AI-generated questions.
  • Expand the system to support other exams like NEET, CUET, and state-level engineering entrances.
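
For the adaptive-testing idea, one possible approach is to sample topics in proportion to the student's error rate. The sketch below is purely illustrative since the feature is not built yet: the function names, the weighting rule, and the floor value are all assumptions.

```python
import random
from typing import Dict, List, Optional


def topic_weights(accuracy_by_topic: Dict[str, float], floor: float = 0.1) -> Dict[str, float]:
    """Weight each topic by its error rate, with a floor so strong topics still appear."""
    weights = {}
    for topic, accuracy in accuracy_by_topic.items():
        if not 0.0 <= accuracy <= 1.0:
            raise ValueError(f"accuracy for {topic!r} must be between 0 and 1")
        weights[topic] = max(1.0 - accuracy, floor)
    return weights


def pick_topics(
    accuracy_by_topic: Dict[str, float],
    num_questions: int,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Draw one topic per question slot, favouring topics with lower accuracy."""
    rng = rng or random.Random()
    weights = topic_weights(accuracy_by_topic)
    topics = list(weights)
    return rng.choices(topics, weights=[weights[t] for t in topics], k=num_questions)


history = {"Electrostatics": 0.4, "Kinematics": 0.95}
weights = topic_weights(history)
plan = pick_topics(history, num_questions=10, rng=random.Random(0))
```

Passing in a seeded `random.Random` keeps a generated test reproducible, which helps when a student retakes the same test.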
