Inspiration

Teachers and students need high-quality materials that support higher-order thinking, yet many AI systems still struggle with multi-step reasoning and cross-text inference. We wanted to build a tool that addresses this gap and is useful in real classroom and assessment contexts.

What it does

The system generates reading-inference exam items modeled after TOEIC-style double-passage questions, which are among the most demanding formats for test takers. It evaluates each item’s structure and validity, flags potential duplicates, and offers a review workspace where human evaluators can approve, revise, or reject items before they are stored.
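The review step can be pictured as a small gate in front of storage. The following is a minimal sketch, not the project's actual code: `ReviewStatus`, `ReviewItem`, and `ItemBank` are hypothetical names, and the in-memory dictionary stands in for whatever store the platform really uses.

```python
from dataclasses import dataclass, field
from enum import Enum


class ReviewStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    NEEDS_REVISION = "needs_revision"
    REJECTED = "rejected"


@dataclass
class ReviewItem:
    # Hypothetical shape for an item waiting in the review workspace.
    item_id: str
    text: str
    status: ReviewStatus = ReviewStatus.PENDING
    notes: list[str] = field(default_factory=list)


class ItemBank:
    """In-memory stand-in for the real item store (an assumption)."""

    def __init__(self) -> None:
        self.stored: dict[str, ReviewItem] = {}

    def review(self, item: ReviewItem, decision: ReviewStatus, note: str = "") -> bool:
        """Record a reviewer decision; store the item only if it was approved."""
        if decision is ReviewStatus.PENDING:
            raise ValueError("a review must end in approve, revise, or reject")
        item.status = decision
        if note:
            item.notes.append(note)
        if decision is ReviewStatus.APPROVED:
            self.stored[item.item_id] = item
            return True
        return False
```

In this sketch, an item marked for revision or rejected never reaches the bank; only an explicit approval does.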

How we built it

The platform is a full-stack application built with FastAPI on the backend and React on the frontend. OpenAI models handle item generation, while Hugging Face embeddings support similarity and duplication checks. A workflow layer orchestrates quality validation steps and the human-review process.
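To illustrate the duplication check, here is a rough sketch of similarity-based flagging. It is an assumption-heavy stand-in: the `embed` function below is a toy bag-of-words vector used only so the example runs on its own, where the real system would call a Hugging Face embedding model, and the `0.85` threshold and the function names are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a placeholder for a real sentence-embedding model."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_duplicates(
    candidate: str, existing: list[str], threshold: float = 0.85
) -> list[tuple[str, float]]:
    """Return existing items whose similarity to the candidate meets the threshold."""
    cand_vec = embed(candidate)
    hits = []
    for text in existing:
        score = cosine_similarity(cand_vec, embed(text))
        if score >= threshold:
            hits.append((text, score))
    # Most similar items first, so a reviewer sees the likeliest duplicate on top.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

Swapping `embed` for a dense embedding model would leave the comparison logic unchanged, since cosine similarity works the same way on dense vectors.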

Challenges we ran into

One major challenge was designing prompts that consistently produce high-quality items across a broad range of topics and linguistic variations while adhering to strict exam specifications. Another challenge was ensuring smooth integration between the frontend and backend, particularly in maintaining clean, predictable data structures throughout the workflow.
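One way to keep data structures predictable across the frontend and backend is to define a single item shape and validate it before it moves through the workflow. The sketch below is illustrative only: `InferenceItem` and its fields are hypothetical, the four-option rule is an assumption based on the usual TOEIC reading format, and a FastAPI project would more likely express this as a Pydantic model than a plain dataclass.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InferenceItem:
    # Hypothetical shape for one double-passage item.
    passages: tuple[str, ...]
    question: str
    options: tuple[str, ...]
    answer_index: int

    def validate(self) -> list[str]:
        """Return a list of structural problems; an empty list means the item is well-formed."""
        problems = []
        if len(self.passages) != 2 or not all(p.strip() for p in self.passages):
            problems.append("item needs exactly two non-empty passages")
        if not self.question.strip():
            problems.append("question is empty")
        if len(self.options) != 4:  # assumed: four answer choices per item
            problems.append("item needs exactly four options")
        if len(set(self.options)) != len(self.options):
            problems.append("options must be distinct")
        if not 0 <= self.answer_index < len(self.options):
            problems.append("answer_index is out of range")
        return problems

    def to_payload(self) -> dict:
        """JSON-friendly dictionary with the same keys on every item."""
        return {
            "passages": list(self.passages),
            "question": self.question,
            "options": list(self.options),
            "answer_index": self.answer_index,
        }
```

Returning a list of problems rather than raising on the first one lets a reviewer see every structural issue with an item at once.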

Accomplishments that we're proud of

Despite the inherent constraints of NLP and AI systems, this project showed that with disciplined methodology and expert oversight, AI can effectively support content design, validation, and workflow management for educational materials. Reaching a point where generated items can go through structured validation and human review is a significant milestone for us.

What we learned

We learned that when domain expertise and technical skills come together, it is possible to push past long-standing NLP and AI limitations. Tasks like inference-focused item generation, previously considered out of scope, are now within reach with the right combination of knowledge and engineering.
