Inspiration
Our inspiration for this hackathon project came from the idea of making document checking more accessible and user-friendly. We aimed to create a tool that allows users to easily review and annotate text within uploaded documents, integrate AI-powered suggestions, and provide voice input for faster and more intuitive editing. The goal was to make document processing more efficient and modern.
What it does
Our project lets users upload PDF documents, extract their text, and edit the content directly within the PDF in an integrated editor. The model suggests text improvements, and the tool highlights them as additions, deletions, and substitutions. Users can give instructions by voice instead of typing, and text-to-speech output can read the changes aloud so they do not have to look away from the PDF while editing.
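As a rough illustration of how suggested text could be classified into additions, deletions, and substitutions, here is a minimal word-level diff sketch. It is not the project's actual implementation: the `Change` type and the `diffWords` function are hypothetical names, and the longest-common-subsequence approach is an assumption about one reasonable way to do it.

```typescript
// Kinds of change an editor could highlight (hypothetical type for illustration).
type Change =
  | { kind: "equal"; text: string }
  | { kind: "addition"; text: string }
  | { kind: "deletion"; text: string }
  | { kind: "substitution"; from: string; to: string };

// Word-level diff built on a longest-common-subsequence (LCS) table.
function diffWords(original: string, suggested: string): Change[] {
  const a = original.split(/\s+/).filter(Boolean);
  const b = suggested.split(/\s+/).filter(Boolean);

  // lcs[i][j] = length of the LCS of a[i..] and b[j..]
  const lcs: number[][] = Array.from({ length: a.length + 1 }, () =>
    new Array<number>(b.length + 1).fill(0)
  );
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] =
        a[i] === b[j]
          ? lcs[i + 1][j + 1] + 1
          : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  const changes: Change[] = [];
  let removed: string[] = [];
  let added: string[] = [];

  // Emit the pending run of removed/added words as a single change.
  const flush = (): void => {
    if (removed.length && added.length) {
      changes.push({ kind: "substitution", from: removed.join(" "), to: added.join(" ") });
    } else if (removed.length) {
      changes.push({ kind: "deletion", text: removed.join(" ") });
    } else if (added.length) {
      changes.push({ kind: "addition", text: added.join(" ") });
    }
    removed = [];
    added = [];
  };

  let i = 0;
  let j = 0;
  while (i < a.length || j < b.length) {
    if (i < a.length && j < b.length && a[i] === b[j]) {
      flush();
      changes.push({ kind: "equal", text: a[i] });
      i++;
      j++;
    } else if (j >= b.length || (i < a.length && lcs[i + 1][j] >= lcs[i][j + 1])) {
      removed.push(a[i++]); // word only in the original
    } else {
      added.push(b[j++]); // word only in the suggestion
    }
  }
  flush();
  return changes;
}
```

Calling `diffWords("the quick brown fox", "the fast brown fox jumps")` yields one substitution (`quick` to `fast`) and one addition (`jumps`), with the unchanged words reported as `equal`. A UI could map each `kind` to a highlight color.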
How we built it
We built the frontend in React, using MUI components for the UI. For working with PDFs, we used various PDF libraries, which allowed us to access and manipulate PDF content. We used Axios to send data to the backend, which processes and analyzes the text and returns AI-based suggestions. For voice input, we used the webkitSpeechRecognition API, so users can speak their instructions.
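The voice-to-backend flow described above could look roughly like the sketch below. Everything specific in it is an assumption for illustration: the `/api/suggest` endpoint, the `{ text, instruction }` payload, and the function names are hypothetical, since the real backend contract is not described here. The HTTP call is passed in as a function with the same call shape as `axios.post(url, data)`, so `axios.post` can be supplied in the app.

```typescript
// Minimal shape of the Web Speech API parts used here. TypeScript's DOM
// typings may not include the prefixed constructor, so it is declared locally.
interface RecognitionLike {
  lang: string;
  interimResults: boolean;
  onresult:
    | ((event: { results: ArrayLike<ArrayLike<{ transcript: string }>> }) => void)
    | null;
  onerror: ((event: unknown) => void) | null;
  start(): void;
}

// Same call shape as axios.post(url, data), so axios.post can be passed in.
type PostFn = (url: string, body: unknown) => Promise<{ data: unknown }>;

// Hypothetical endpoint; the real route is an assumption.
const SUGGEST_URL = "/api/suggest";

// Sends the extracted text plus the user's instruction to the backend
// and returns whatever suggestions the backend responds with.
async function requestSuggestions(
  post: PostFn,
  text: string,
  instruction: string
): Promise<unknown> {
  const response = await post(SUGGEST_URL, { text, instruction });
  return response.data;
}

// Starts listening and passes the first transcript to onTranscript.
// Returns false when the browser does not expose webkitSpeechRecognition.
function listenForInstruction(onTranscript: (spoken: string) => void): boolean {
  const Ctor = (globalThis as any).webkitSpeechRecognition as
    | (new () => RecognitionLike)
    | undefined;
  if (!Ctor) return false;

  const recognition = new Ctor();
  recognition.lang = "en-US";
  recognition.interimResults = false; // only deliver final results
  recognition.onresult = (event) => onTranscript(event.results[0][0].transcript);
  recognition.onerror = () => {
    // A real app would surface this to the user.
  };
  recognition.start();
  return true;
}
```

In a component, the two pieces could be combined as `listenForInstruction((spoken) => requestSuggestions(axios.post, pdfText, spoken))`, where `pdfText` stands in for the text extracted from the PDF. Checking the return value lets the UI fall back to typed input in browsers without the prefixed API.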
Challenges we ran into
One of the main challenges we faced was handling text manipulation and editing within PDFs. It took some time to figure out how to extract and modify text effectively. Additionally, we had difficulties synchronizing the frontend UI with the backend processing, especially when handling large text data and ensuring the AI suggestions were accurate. Another challenge was integrating voice input smoothly.
Accomplishments that we're proud of
We are proud of successfully integrating voice input and text-to-speech features, which made the document editing process more accessible. The implementation of real-time AI suggestions based on document analysis is also a significant accomplishment. Additionally, the ability to upload and edit PDFs seamlessly is a major achievement.
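For readers curious how spoken change summaries might work, here is a small sketch that turns change counts into a sentence and reads it aloud with the browser's `speechSynthesis` API. The `ChangeCounts` shape and both function names are hypothetical, and the wording of the summary is an assumption, not the app's actual output.

```typescript
// Hypothetical summary of what the model changed; field names are assumptions.
interface ChangeCounts {
  additions: number;
  deletions: number;
  substitutions: number;
}

// Builds a short sentence describing the changes, skipping zero counts.
function describeChanges(counts: ChangeCounts): string {
  const plural = (n: number, word: string): string =>
    `${n} ${word}${n === 1 ? "" : "s"}`;
  const parts: string[] = [];
  if (counts.additions) parts.push(plural(counts.additions, "addition"));
  if (counts.deletions) parts.push(plural(counts.deletions, "deletion"));
  if (counts.substitutions) parts.push(plural(counts.substitutions, "substitution"));
  return parts.length
    ? `The suggestion contains ${parts.join(", ")}.`
    : "No changes were suggested.";
}

// Speaks the message if the browser supports speech synthesis.
// Returns false when the API is unavailable (for example, outside a browser).
function speak(message: string): boolean {
  const synth = (globalThis as any).speechSynthesis;
  const Utterance = (globalThis as any).SpeechSynthesisUtterance;
  if (!synth || !Utterance) return false;
  synth.speak(new Utterance(message));
  return true;
}
```

A call such as `speak(describeChanges({ additions: 1, deletions: 0, substitutions: 2 }))` would read out "The suggestion contains 1 addition, 2 substitutions." in a supporting browser.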
What we learned
We learned a lot about working with PDF interaction libraries and processing PDF files in the browser. We gained deeper insights into integrating speech recognition and text-to-speech APIs into a web application. Additionally, we improved our skills in React and Axios for handling asynchronous requests and state management. We also learned how to deal with performance issues when processing large PDFs and managing the synchronization of AI-driven suggestions. Training the OpenAI model was a challenge, but we accomplished it.
What's next for the hackathon project
Moving forward, we plan to enhance the user interface to make the editing experience even more intuitive, with features like document scanning and in-app PDF editing. We also want to improve the AI model to offer more accurate and detailed suggestions for document improvement. Additionally, we plan to add code-related data processing and expand the voice input capabilities, making the process smoother and more flexible for different use cases.