Inspiration
We built this project because we had seen first-hand how much presentation skills matter in the classroom, and we noticed that few tools and resources exist to help students practise and develop them. We believed that machine learning could power a platform that makes practising presentations simpler and gives students useful feedback, helping them become more confident and effective speakers. The overall objective is to improve students' presentation skills and streamline the process of practising them.
What it does
Presentr is a PyQt6 application that lets a user record audio and video, start a timer, and export the audio transcript to a PDF file. The main window has several buttons: "Record Audio", "Record Video", "Start Timer", "Generate Report", and "Export to PDF". With these, the user can record audio or video and generate an analysis of their presentation skills. The analysis uses OpenAI's Whisper speech recognition model and GingerIt for grammar and spelling checks. The application also uses OpenCV and PyAudio.
How we built it
The code uses several libraries to accomplish its tasks.
PyQt6: This library is used to create the GUI application. It provides a set of classes and methods for creating and manipulating widgets, such as buttons, text boxes, and layouts.
GingerIt: This library is used to check the text for grammar and spelling errors.
QThread and QTimer: These classes are part of the PyQt6 library and are used to handle threading and scheduling events, respectively. The QThread class is used to run a separate thread for the audio recording, so that it doesn't block the main thread and the GUI remains responsive. The QTimer class is used to schedule the single-shot timer that clears the text box after 3 seconds.
sys, time, whisper and cv2: sys and time are Python standard library modules, while whisper and cv2 are third-party packages. The sys module provides access to variables and functions specific to the Python interpreter, and the time module provides various time-related functions. whisper is OpenAI's speech recognition model, which turns recorded speech into text, and cv2 is the Python interface to OpenCV, a library for computer vision and image processing tasks.
pyaudio: This is a Python binding for PortAudio, a cross-platform audio recording and playback library. It is used to handle audio recording, including setting the recording parameters and saving the recorded audio to a WAV file.
wave: This module is part of the Python standard library and is used to read and write WAV files.
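As a rough illustration of the last two entries, the sketch below writes raw PCM chunks to a WAV file with the wave module and reads the duration back. It is not the project's actual code: the recording parameters (mono, 16-bit, 16 kHz) and the helper names `synth_chunks`, `save_wav`, and `wav_duration` are assumptions, and a synthesized tone stands in for the chunks that a PyAudio input stream would deliver.

```python
import math
import struct
import wave

# Assumed recording parameters (not taken from the project): mono, 16-bit, 16 kHz.
CHANNELS = 1
SAMPLE_WIDTH = 2   # bytes per sample (16-bit PCM)
RATE = 16000       # samples per second


def synth_chunks(seconds=1.0, freq=440.0, chunk=1024):
    """Hypothetical stand-in for microphone input: yields raw PCM byte chunks."""
    total = int(RATE * seconds)
    for start in range(0, total, chunk):
        n = min(chunk, total - start)
        samples = (
            int(32767 * 0.3 * math.sin(2 * math.pi * freq * (start + i) / RATE))
            for i in range(n)
        )
        # Pack n signed 16-bit little-endian samples into bytes.
        yield struct.pack(f"<{n}h", *samples)


def save_wav(path, chunks):
    """Write raw PCM chunks to a WAV file using the standard-library wave module."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(CHANNELS)
        wf.setsampwidth(SAMPLE_WIDTH)
        wf.setframerate(RATE)
        for data in chunks:
            wf.writeframes(data)


def wav_duration(path):
    """Return the length of a WAV file in seconds."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()
```

In a real recorder, the chunks passed to `save_wav` would come from the audio input stream instead of `synth_chunks`, with the same channel count, sample width, and rate used on both sides.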
Challenges we ran into
Most of our challenges were rookie mistakes, but the larger ones involved accessing computer hardware through libraries such as OpenCV and using their methods and classes correctly.
What we learned
We learned how to develop a full-stack program from front end to back end, and how important and useful APIs are.
What's next for Presentr
- Implement motion detection to determine how long a person stays in one spot.
- With more time, add features to synchronize video and audio so the user can track their movements while listening to their speech.
- Highlight differences between what they said and what the PresentR app suggests.
- Implement the feature to export results to a document (PDF, Word).
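One possible starting point for highlighting the differences between what the speaker said and what the app suggests is a word-level comparison with the standard library's difflib. This is only a sketch under assumptions: the function name `word_diff` is hypothetical, both texts are assumed to be plain strings, and splitting on whitespace is a simplification that ignores punctuation.

```python
import difflib


def word_diff(spoken, suggested):
    """Return (tag, spoken_words, suggested_words) for every span that differs.

    tag is one of "replace", "delete", or "insert", as reported by
    difflib.SequenceMatcher.get_opcodes().
    """
    a, b = spoken.split(), suggested.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes


if __name__ == "__main__":
    for change in word_diff("she go to the store", "she goes to the store"):
        print(change)  # ('replace', 'go', 'goes')
```

Each returned tuple could then drive the highlighting in the GUI, for example by colouring the replaced words in the transcript view.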
Built With
- gingerit
- python
- whisper
