Inspiration
The project was inspired by the limitations of traditional, one-size-fits-all presentations. Standard lectures do not adapt to an individual student's pace or level of understanding. A further motivation was to preserve educational content that might otherwise be removed or become inaccessible over time. The goal was to create a system that makes learning more adaptive and knowledge more permanent.
What it does
The application transforms static PDF presentations into interactive lectures. Users upload a PDF file, and the system generates spoken narration for each slide. It then poses questions to assess the user's comprehension. Based on the user's answers, the system builds a model of their knowledge and adjusts subsequent content to address any identified gaps, creating a personalised learning path.
How we built it
The application consists of a frontend and a backend. The frontend is a web interface built with Next.js, TypeScript, and Tailwind CSS. It communicates with a backend developed using Flask and SQLAlchemy. A MySQL database stores lecture content and the user understanding model. Content generation is handled by OpenAI's GPT models. Audio narration is generated by the Kokoro text-to-speech system and delivered via real-time streaming. The entire application is containerised using Docker and Docker Compose for reproducible deployment.
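To make the data layer concrete, below is a minimal sketch of how a per-user understanding model could be persisted with SQLAlchemy. The table and column names (`Slide`, `ConceptMastery`, `score`) are illustrative assumptions, not the project's actual schema, and SQLite stands in for MySQL so the sketch is self-contained.

```python
# Hypothetical schema sketch for the user understanding model.
# Table/column names are assumptions; SQLite replaces MySQL for demo purposes.
from sqlalchemy import Column, Float, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Slide(Base):
    __tablename__ = "slides"
    id = Column(Integer, primary_key=True)
    lecture_id = Column(Integer, nullable=False)
    narration = Column(String(4000))  # generated spoken-narration text

class ConceptMastery(Base):
    __tablename__ = "concept_mastery"
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, nullable=False)
    slide_id = Column(Integer, ForeignKey("slides.id"))
    concept = Column(String(255), nullable=False)
    score = Column(Float, default=0.0)  # 0.0 = knowledge gap, 1.0 = mastered

engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    slide = Slide(lecture_id=1, narration="The chain rule composes derivatives.")
    session.add(slide)
    session.flush()  # assigns slide.id before commit
    session.add(ConceptMastery(user_id=1, slide_id=slide.id,
                               concept="chain rule", score=0.4))
    session.commit()
```

A low `score` for a concept is what would trigger the backend to weave remedial content into the next slide's narration.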
Challenges we ran into
A primary technical challenge was delivering audio narration without significant latency. The initial approach of generating the full audio file before playback was inefficient. This was resolved by implementing a streaming solution that sends audio in chunks as it is generated. Another challenge was synchronising the asynchronous operations between the frontend and backend. To manage the complex flow of data fetching, audio playback, and user input, a state machine was implemented on the frontend.
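The chunked-streaming fix described above can be sketched with Flask's generator-based responses, which send each chunk as soon as it is yielded rather than buffering the whole file. The route name and the stand-in synthesis function are assumptions for illustration, not the project's actual endpoint or TTS integration.

```python
# Minimal sketch of low-latency chunked streaming in Flask.
# `/narration/<id>` and `synthesize_chunks` are hypothetical names.
from flask import Flask, Response

app = Flask(__name__)

def synthesize_chunks(text: str):
    """Stand-in for the TTS engine: yields audio as it is produced."""
    for sentence in text.split(". "):
        yield sentence.encode()  # a real TTS would yield PCM/MP3 bytes here

@app.route("/narration/<int:slide_id>")
def stream_narration(slide_id: int):
    text = "First point. Second point. Third point"  # placeholder narration
    # Passing a generator to Response makes Flask stream each chunk
    # immediately, so the client can start playback before synthesis ends.
    return Response(synthesize_chunks(text), mimetype="audio/mpeg")
```

On the client side, feeding these chunks into a media source as they arrive is what removes the up-front generation delay.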
Accomplishments that we're proud of
The project successfully delivered a system that transforms PDF files into personalised, interactive lectures. Key technical achievements include the implementation of a low-latency audio streaming pipeline and a dynamic model that adapts content based on user interaction. The project also demonstrated the successful integration of AI content generation, text-to-speech, and a modern web stack into a functional and cohesive application.
What we learned
The project provided several key insights. It was determined that effective content personalisation with language models requires carefully structured and context-rich prompts. The implementation of streaming for audio and data significantly improved the application's perceived performance and user experience. Furthermore, managing multiple concurrent asynchronous processes necessitated the use of a state machine to ensure application stability and prevent race conditions.
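The state-machine lesson can be illustrated with a short sketch. The real implementation lives in the TypeScript frontend; this Python version, with assumed state and event names, shows the core idea: only legal transitions are applied, so out-of-order async events cannot corrupt the flow.

```python
# Sketch of the lecture-flow state machine. State and event names are
# assumptions; the project's actual machine is in the TypeScript frontend.
FETCHING, PLAYING, ASKING, AWAITING_ANSWER, ADAPTING = (
    "fetching", "playing", "asking", "awaiting_answer", "adapting")

# (current state, event) -> next state; anything else is ignored.
TRANSITIONS = {
    (FETCHING, "content_ready"): PLAYING,
    (PLAYING, "audio_finished"): ASKING,
    (ASKING, "question_shown"): AWAITING_ANSWER,
    (AWAITING_ANSWER, "answer_submitted"): ADAPTING,
    (ADAPTING, "next_slide_ready"): FETCHING,
}

class LectureFlow:
    def __init__(self):
        self.state = FETCHING

    def dispatch(self, event: str) -> str:
        # Dropping illegal events prevents race conditions, e.g. a late
        # "audio_finished" arriving after the user has already answered.
        self.state = TRANSITIONS.get((self.state, event), self.state)
        return self.state
```

Because every async callback funnels through `dispatch`, there is a single source of truth for what the UI may do next.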
What's next for Deepest Learning
Future development is planned in several phases. Short-term goals include adding a lecture reset feature and implementing performance optimisations. Medium-term plans involve supporting multi-lecture learning paths, creating persistent user profiles, and enabling offline functionality. The long-term technical roadmap includes the development of a knowledge graph to map concepts across lectures and an analytics dashboard for instructors.
Built With
- docker
- flask
- mysql
- next.js
- openai
- python
- tailwind
- typescript