Inspiration

This project draws its inspiration from a deep commitment to addressing inefficiencies and inequalities in traditional educational practices. As former educators and teaching assistants, the founders identified a critical gap: the absence of scalable, precise, and student-centered grading processes. The vision emerged from a passion for leveraging technology to improve learning outcomes, enhance grading efficiency, and personalize education. By harnessing the power of multimodal large models integrated with smart hardware, this initiative redefines homework assessment, enabling educators to focus their efforts on impactful teaching while empowering students with targeted learning insights. This innovation embodies the transformative potential of AI in education.

What it does

We have designed a machine for correcting paper homework with traceable marks. Based on new large model technology and product forms, it enables one-click scanning, traceable correction, and data analysis of paper homework. It can correct more than 60 sheets of homework per minute, with a cost of about 5 cents per sheet and a correction accuracy rate of over 97%. Currently, it has been deployed at Benyuan Primary School in Shenzhen and Wen San Road Primary School in Hangzhou, having corrected over 10,000 sheets of homework and more than 200,000 questions. Additionally, the system aims to provide teachers with long-term tracking of student progress through data analysis, further enhancing teaching quality and student learning outcomes.

Compared to existing photo-based homework-correction software, our correction machine achieves one-click batch traceable correction, significantly reducing teachers' grading workload. Our demo video has been viewed over 50,000 times on platforms such as Bilibili and Xiaohongshu, and hundreds of teachers have actively applied for trials.

How we built it

We developed a comprehensive AI grading system for school tests, incorporating automated exam-paper scanning, question recognition via OCR enhanced and refined by an LLM, and AI-generated grading results. It is built with Java (backend), Vue.js/JavaScript (frontend), and JUnit and Jest (tests).

1. LLM Adversarial Discussion System (Core)

Enhanced problem-solving accuracy by 8.2% and minimized hallucination through multi-agent discussion and cross-verification, inspired by dense voting over a mixture of experts (MoE) across multiple LLMs (such as GPT-4), combined with retrieval-augmented generation (RAG).

  1. Multi-agent Discussion:

Multiple independent language models (agents) handle the same problem simultaneously, discussing and sharing their insights. Each agent generates its own answer independently and communicates through a mechanism such as message passing or shared memory. A coordinator collects and synthesizes the responses from the agents. Multi-agent discussion leverages the strengths of different models, reducing the likelihood of errors from a single model and improving the accuracy of the answers.
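The draft-share-revise loop above can be sketched in a few lines. This is an illustrative Python toy (the production backend is Java): `make_agent` produces hard-coded stand-ins for independent LLM calls, and the shared draft list plays the role of shared memory.

```python
def make_agent(initial_answer):
    """Factory for toy agents; real agents would be independent LLM calls."""
    def respond(question, peer_answers):
        answer = initial_answer
        # After drafts are shared, adopt the majority view if most peers disagree.
        if peer_answers and peer_answers.count(answer) * 2 < len(peer_answers):
            answer = max(set(peer_answers), key=peer_answers.count)
        return answer
    return respond

def discuss(question, agents):
    """Coordinator: collect independent drafts, share them, gather revisions."""
    drafts = [a(question, []) for a in agents]      # round 1: independent answers
    return [a(question, drafts) for a in agents]    # round 2: revise after sharing

agents = [make_agent("4"), make_agent("4"), make_agent("5")]
print(discuss("2+2", agents))  # the outlier agent revises toward the majority
```

With three agents and one outlier, the discussion round pulls the outlier back to the consensus answer before the coordinator synthesizes the result.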

  2. Cross-Verification:

The system cross-verifies the answers provided by different agents to ensure reliability. It compares the answers to identify consistencies and discrepancies. For significantly different parts, the system conducts a deeper analysis or regenerates the answers. Cross-verification effectively reduces the error rate, ensuring the final answer is accurate and consistent.
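A minimal sketch of the verification step, assuming a simple majority rule as the fallback for discrepancies; in the real system the fallback is a deeper analysis or regeneration pass by the models themselves.

```python
from collections import Counter

def cross_verify(answers, regenerate):
    """Accept a unanimous answer; otherwise analyse the discrepancy."""
    if len(set(answers)) == 1:
        return answers[0]            # all agents agree: accept directly
    # Discrepancy detected: fall back to a deeper pass (a regeneration hook).
    return regenerate(answers)

def majority_or_flag(answers):
    """Toy regeneration step: take a strict majority, or flag for review."""
    winner, count = Counter(answers).most_common(1)[0]
    return winner if count * 2 > len(answers) else "NEEDS_REVIEW"

print(cross_verify(["4", "4", "5"], majority_or_flag))  # prints 4
```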

  3. Mixture of Experts (MoE):

This technique utilizes multiple expert models, each specializing in different types of problems or specific domains, and combines their outputs to generate the final answer. The system dynamically selects the appropriate expert model based on the problem's characteristics. Each expert model handles the part it is proficient in, and the results are then merged. The mixture of experts approach fully utilizes the strengths of each model, enhancing the system's overall processing capability and accuracy.
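A toy router illustrating the dynamic expert selection, with keyword matching standing in for the real problem classifier and lambdas standing in for the expert models (all names here are hypothetical):

```python
EXPERTS = {
    # Hypothetical per-domain models; in practice each would be an LLM
    # fine-tuned or prompted for its subject.
    "math": lambda q: "math-expert: " + q,
    "reading": lambda q: "reading-expert: " + q,
}

def route(question):
    """Toy router: pick the expert by surface features of the question."""
    math_tokens = ("+", "-", "=", "solve", "equation")
    return "math" if any(t in question for t in math_tokens) else "reading"

def answer(question):
    # Dispatch to the selected expert; a full MoE would also merge outputs.
    return EXPERTS[route(question)](question)

print(answer("solve 2+2"))  # routed to the math expert
```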

  4. Retrieval-Augmented Generation (RAG):

In addition to model-to-model communication, this approach combines a retrieval system with a generation model. It first retrieves relevant information from a large corpus of documents and data, then uses a generation model to produce the answer. The system begins by using a retrieval model to fetch relevant documents from a knowledge base, which are then fed into the generation model to create the final answer. This method leverages existing knowledge, improving the relevance and accuracy of answers, especially for problems requiring extensive background information. Through the integrated application of these techniques, we significantly improved the problem-solving accuracy of the AI scoring system and minimized hallucination; specifically, the system increased problem-solving accuracy by 8.2% in practical use, ensuring the reliability and consistency of scoring results.
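The retrieve-then-generate pipeline can be sketched as follows; this is an illustrative Python toy in which a word-overlap ranker stands in for the real retrieval model and a template stands in for the generation model.

```python
KNOWLEDGE_BASE = [
    "The quadratic formula solves ax^2 + bx + c = 0.",
    "A simile compares two things using 'like' or 'as'.",
]

def retrieve(query, docs, k=1):
    """Toy retriever: rank documents by word overlap with the query."""
    query_words = set(query.lower().split())
    def overlap(doc):
        return len(query_words & set(doc.lower().split()))
    return sorted(docs, key=overlap, reverse=True)[:k]

def generate(query, context):
    # Stand-in for the generation model; a real system would prompt an
    # LLM with the retrieved context plus the query.
    return f"Answer to '{query}' based on: {context[0]}"

def rag_answer(query):
    return generate(query, retrieve(query, KNOWLEDGE_BASE))

print(rag_answer("quadratic formula"))  # grounded in the retrieved document
```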

2. User Search Acceleration and Peak Handling

Overview: We use Kafka message queues to absorb query peaks and Redis caching to store recent queries and answers for quick LLM responses. Elasticsearch provides fast search and indexing of the question database, achieving search times under 100 ms across millions of records.

Detailed Description:

Given the nature of this project, search requests may surge during peak periods (such as school final exams). To address this, we employ two primary acceleration methods: Kafka and Redis.

Kafka Message Queue:

  • Functionality: Kafka, a distributed messaging system, handles a large volume of concurrent requests, effectively balancing server load.
  • Technical Details: When users submit search requests, these are first sent to the Kafka queue. Kafka partitions the requests and distributes them to multiple consumer nodes for processing. This not only speeds up search request handling but also ensures system stability during peak periods.
  • Benefits: Kafka's distributed architecture offers high availability and high throughput, capable of processing millions of messages per second, preventing server crashes during peak search times.
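Kafka itself requires a running broker, so the following stdlib-only Python sketch merely illustrates the key-based partitioning and parallel consumption described above; a real deployment would use a Kafka client library against actual topics.

```python
import queue
import threading

NUM_PARTITIONS = 3
partitions = [queue.Queue() for _ in range(NUM_PARTITIONS)]
results = []
results_lock = threading.Lock()

def produce(request_id, payload):
    """Route by key, as Kafka hash-partitions messages by their key."""
    partitions[hash(request_id) % NUM_PARTITIONS].put((request_id, payload))

def consume(part):
    while True:
        item = part.get()
        if item is None:          # sentinel: shut this consumer down
            break
        with results_lock:        # stand-in for the real request handling
            results.append(item)

workers = [threading.Thread(target=consume, args=(p,)) for p in partitions]
for w in workers:
    w.start()
for i in range(10):               # a burst of search requests at peak time
    produce(f"req-{i}", "search query")
for p in partitions:
    p.put(None)
for w in workers:
    w.join()
print(len(results))               # prints 10: all requests handled in parallel
```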

Redis Cache:

  • Functionality: Redis, an in-memory caching system, stores recent queries and responses for quick retrieval.
  • Technical Details: Upon receiving a query, the system first checks the Redis cache. If a match is found, the cached response is returned immediately, reducing database access. If no match is found, the query is sent to the backend database, and the result is cached in Redis for future use.
  • Benefits: By using Redis caching, the system significantly reduces database load and improves query response speed; Redis's in-memory processing allows data access in microseconds, greatly enhancing the user experience. Additionally, we use Elasticsearch for fast searching and indexing of the question database. Its efficient full-text search and near-real-time indexing allow searches in under 100 ms across millions of records, letting us quickly find the required questions in large datasets and further enhancing system performance and user satisfaction.
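The cache-aside pattern described above can be sketched with a plain dict standing in for Redis; a real deployment would use a Redis client (e.g. redis-py) with TTLs on the cached entries.

```python
cache = {}  # stands in for Redis; production would use a Redis client

def database_lookup(query):
    # Stand-in for the slow path: database access plus LLM grading.
    return f"answer({query})"

def cached_answer(query):
    """Cache-aside: serve from the cache on a hit, populate it on a miss."""
    if query in cache:
        return cache[query], "hit"
    result = database_lookup(query)
    cache[query] = result        # cache for future identical queries
    return result, "miss"

print(cached_answer("2+2"))  # first request misses and fills the cache
print(cached_answer("2+2"))  # repeat request is served from the cache
```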

3. Database Scalability and Durability

Overview: By utilizing techniques such as sharding, replication, partitioning, write-ahead logging, and autoscaling, we have enhanced the scalability and durability of the MySQL database. Additionally, index strategies and reduced disk I/O operations have improved database performance.

Detailed Description:

To enhance the scalability and durability of the MySQL database, we have adopted various technical methods:

Sharding:

Distributes data horizontally across multiple database instances to reduce the load on a single database and increase query speed. Data is evenly distributed across different database instances using hash sharding or range sharding. Each instance only handles its own shard of data, reducing the data volume and query pressure on a single instance. Sharding significantly improves database scalability, allowing the system to expand storage and computing capacity by adding new database instances as needed.
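The hash-sharding routing can be sketched as follows, with in-memory dicts standing in for the MySQL instances:

```python
NUM_SHARDS = 4
# Each dict stands in for one MySQL instance holding one shard of the data.
shards = [dict() for _ in range(NUM_SHARDS)]

def shard_for(key):
    """Hash sharding: the same key always routes to the same instance."""
    return hash(key) % NUM_SHARDS

def insert(key, row):
    shards[shard_for(key)][key] = row

def lookup(key):
    # Only the owning shard is consulted, keeping per-instance load small.
    return shards[shard_for(key)].get(key)

insert("student-42", {"score": 95})
print(lookup("student-42"))  # routed back to the same shard
```

Range sharding works the same way, but `shard_for` would compare the key against sorted range boundaries instead of hashing.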

Replication:

Replicates data across multiple database instances to enhance data availability and disaster recovery capabilities. Using master-slave replication or multi-master replication, data is synchronized in real-time across multiple database instances. This ensures that if one instance fails, other instances can quickly take over. Replication improves data reliability and availability, ensuring the system can continue to operate normally in case of hardware failures.
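A sketch of primary-replica replication with failover reads; for simplicity this toy replicates synchronously, whereas real MySQL replication is asynchronous or semi-synchronous and considerably more involved.

```python
class Node:
    """A database instance; `alive` models its availability."""
    def __init__(self):
        self.data = {}
        self.alive = True

def replicated_write(primary, replicas, key, value):
    """Apply the write on the primary, then propagate it to every replica."""
    primary.data[key] = value
    for r in replicas:
        r.data[key] = value      # synchronous replication, for simplicity

def read(key, primary, replicas):
    """Failover read: fall back to a replica when the primary is down."""
    for node in (primary, *replicas):
        if node.alive:
            return node.data.get(key)
    raise RuntimeError("no live node available")

primary, replicas = Node(), [Node(), Node()]
replicated_write(primary, replicas, "student-1", 90)
primary.alive = False            # simulate a primary failure
print(read("student-1", primary, replicas))  # prints 90, served by a replica
```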

Partitioning:

Partitions large tables by certain columns to improve query and management efficiency. Data is divided into different physical storage areas based on rules such as range partitioning, list partitioning, or hash partitioning, reducing the data volume in a single physical storage area. Partitioning enhances the query performance and management efficiency of large tables, especially when handling large-scale datasets.
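Range partitioning can be illustrated with a sorted list of partition bounds; the month values below are made up, and the final bound plays the role of MySQL's MAXVALUE partition.

```python
import bisect

# Range partitioning by exam month (YYYYMM); a row lands in the first
# partition whose upper bound is >= its partition key.
BOUNDS = [202403, 202406, 202409, 999999]
partitions = [[] for _ in BOUNDS]

def partition_index(exam_month):
    return bisect.bisect_left(BOUNDS, exam_month)

def insert_row(exam_month, row):
    partitions[partition_index(exam_month)].append(row)

def rows_for_month(exam_month):
    """Partition pruning: a query for one month scans only one partition."""
    return partitions[partition_index(exam_month)]

insert_row(202401, "row-jan")
insert_row(202407, "row-jul")
print(rows_for_month(202401))  # only the first partition is scanned
```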

Write-Ahead Logging (WAL):

Ensures data consistency and durability by writing logs to disk before committing transactions. During transaction execution, all modifications are first recorded in the write-ahead log and then applied to the database. This ensures that data can be recovered using the logs in the event of a system crash. Write-ahead logging increases database reliability, ensuring data recoverability after unexpected failures.
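The log-first discipline can be sketched in a few lines, with a Python list standing in for the on-disk log (a real WAL is fsync'd to disk before the transaction commits):

```python
wal = []      # stands in for the durable on-disk write-ahead log
table = {}    # the in-memory table state

def commit(key, value):
    wal.append((key, value))  # step 1: record the change in the log first
    table[key] = value        # step 2: only then apply it to the table

def recover():
    """After a crash, replay the log to rebuild a consistent table."""
    rebuilt = {}
    for key, value in wal:
        rebuilt[key] = value
    return rebuilt

commit("homework:17", "graded")
table.clear()                 # simulate losing the table state in a crash
print(recover())              # the committed write survives via the log
```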

Autoscaling:

Automatically adjusts the number of database instances and resource configurations based on load conditions. By monitoring the database load, the system automatically increases or decreases the number of database instances to ensure sufficient resources during high load periods and conserve resources during low load periods. Autoscaling enhances system flexibility and resource utilization, ensuring efficient operation under varying load conditions. Through the comprehensive application of these technical methods, we have significantly improved the scalability and durability of the MySQL database, ensuring stable and efficient operation when handling large-scale data and high loads.
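A watermark-style scaling policy of the kind described above can be sketched as follows; the thresholds and instance limits here are illustrative, not our production values.

```python
MIN_INSTANCES, MAX_INSTANCES = 2, 10

def desired_instances(current, cpu_load, low=0.3, high=0.7):
    """Step scaling: add an instance above the high watermark,
    remove one below the low watermark, otherwise hold steady."""
    if cpu_load > high:
        return min(current + 1, MAX_INSTANCES)
    if cpu_load < low:
        return max(current - 1, MIN_INSTANCES)
    return current

print(desired_instances(3, 0.9))  # prints 4: scale out under heavy load
print(desired_instances(3, 0.1))  # prints 2: scale in when idle
```

A monitoring loop would call this periodically with the observed load and reconcile the actual instance count toward the returned target.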

Challenges we ran into

  • Technical Integration: Combining multimodal large models with high-speed hardware presented significant engineering challenges, especially ensuring real-time processing while maintaining 97%+ accuracy.
  • Data Quality and Training: Developing a robust system demanded extensive and diverse training datasets. Ensuring the quality and representativeness of the data was critical to avoiding biases and inaccuracies.
  • Scalability and Cost: Designing a solution that is both scalable for large educational institutions and cost-effective for individual users required balancing cutting-edge technology with affordability.
  • User Experience: Creating intuitive interfaces for teachers, students, and parents was challenging, as it required accommodating varying levels of technical expertise and educational needs.
  • Market Adoption: Introducing innovative AI-based technology into traditional educational systems often met resistance from stakeholders unfamiliar with or skeptical of its benefits.
  • Infrastructure Deployment: Implementing the system in schools with differing levels of technological infrastructure posed logistical hurdles, particularly in integrating with existing workflows.

Accomplishments that we're proud of

  1. Efficient Correction Handling: The project introduces an integrated automated scanning and correction system that can process over 100 papers per minute, significantly speeding up the grading process. This efficiency greatly reduces the time and effort required from teachers compared to traditional manual grading.
  2. Intelligent Precision Analysis: The system uses advanced large model technology for intelligent grading, accurately identifying and recording students' errors and conducting in-depth analysis of knowledge gaps. This smart analysis not only improves grading accuracy but also provides teachers with detailed feedback reports, helping them adjust their teaching strategies more effectively.
  3. Practicality of Comprehensive Functions: In addition to basic grading functions, the system also features mistake organization and knowledge gap analysis. Teachers can use the generated mistake reports and knowledge point analysis to provide targeted guidance on students' weak areas. These comprehensive functions enhance the specificity and effectiveness of teaching.
  4. High Accuracy and Low Cost: The system's grading accuracy exceeds 97%, placing it at the forefront of the industry. Additionally, the cost of processing each paper is only 0.05 yuan, offering a significant cost advantage compared to other competitors. This efficient and economical solution provides educational institutions and schools with a highly cost-effective service.
  5. Superior User Experience: The project excels in user experience, receiving high recognition and positive feedback from teachers. The system's simple operation and automation reduce the complexity of tasks for teachers, making the grading process smoother and more efficient.

What we learned

Through the development of this project, we gained valuable insights into the intersection of technology and education. We learned that integrating AI solutions requires not only technical precision but also a deep understanding of user needs, especially in balancing efficiency with accessibility. The importance of scalability and affordability became evident as we tailored the system for diverse educational environments. Additionally, we recognized the critical role of effective communication in fostering trust and adoption among stakeholders. These experiences underscored the transformative potential of AI in education while highlighting the necessity of user-centric design and continuous adaptation.

What's next for Automated marking machine for paper-based homework

  • Enhancing AI Capabilities: We plan to refine the multimodal large model for even greater accuracy and adaptability, ensuring it can handle a broader range of question types and subjects.
  • Expanding Features: Introducing advanced analytics, such as trend analysis of student performance over time, and more personalized learning recommendations.
  • Scalability and Customization: Optimizing the system to meet the needs of different educational settings, from individual classrooms to large institutions, while allowing customizable features based on user feedback.
  • Global Expansion: Adapting the system for international markets by incorporating multilingual capabilities and aligning with local educational standards.
  • Integration with EdTech Ecosystems: Establishing seamless compatibility with existing digital learning platforms and educational tools to create a unified learning environment.
  • User Education and Community Engagement: Providing training resources for teachers and fostering a user community to share best practices and feedback for continual improvement.
