Automatic To-Do-List Generator Web App
This project is a Flask web application that generates to-do lists from recorded audio. It uses RecordRTC for in-browser audio recording, openai-whisper for transcription in Python, and the Llama 3 language model for turning transcripts into actionable to-do items. The app will also integrate with third-party productivity tools such as Google Calendar to create calendar events from the generated tasks.
Features:
- Audio Recording: Record audio directly in the browser using the RecordRTC library.
- Transcription: Transcribe recorded audio into text using the openai-whisper library.
- Task Generation: Use the Llama 3 language model to convert transcriptions into actionable to-do items.
- Integration: Connect with external productivity applications (e.g., Google Calendar, Notion) to set reminders based on generated tasks.
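The task generation step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the prompt wording, `build_prompt`, and `parse_todo_items` are all hypothetical, and the call to Llama 3 itself is left out because the write-up does not say how the model is invoked. The sketch only shows how a transcript could be wrapped in a prompt and how a bulleted model reply could be turned into a list of to-do items.

```python
import re

# Hypothetical prompt; the real project may phrase this differently.
PROMPT_TEMPLATE = (
    "Extract the actionable tasks from the transcript below.\n"
    "Return one task per line, each starting with '- '.\n\n"
    "Transcript:\n{transcript}\n"
)


def build_prompt(transcript):
    """Wrap a transcript in the (assumed) task-extraction prompt."""
    return PROMPT_TEMPLATE.format(transcript=transcript.strip())


def parse_todo_items(model_output):
    """Collect lines that look like list items ('- x', '* x', '1. x', '2) x')."""
    items = []
    for line in model_output.splitlines():
        match = re.match(r"^\s*(?:[-*•]|\d+[.)])\s+(.*\S)\s*$", line)
        if match:
            items.append(match.group(1))
    return items
```

Lines that are not list items, such as an introductory sentence from the model, are ignored, so the parser tolerates some chattiness in the reply.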
Project Status:
This project is currently a Proof of Concept and requires extensive refinement and development to enhance functionality, reliability, and user experience.
This Flask application serves as a web-based transcription service for audio files. Here's a summary of its functionality:
Key Components:
Flask Routes:
- /: Renders the home page (index.html), where users can upload audio files.
- /transcribe (POST): Handles the transcription process. It receives base64-encoded audio data, decodes it, saves it as a WAV file, performs transcription, and returns the result.
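The decode, save, transcribe, and clean-up flow behind /transcribe might look something like the sketch below. It is framework-independent on purpose: `handle_transcribe` is a hypothetical helper that a Flask view could call, `transcribe_fn` stands in for the project's `generate_todo_from_audio`, and the `"transcription"` and `"error"` keys are assumptions about the JSON response shape, not the app's confirmed format.

```python
import base64
import binascii
import os
import tempfile


def handle_transcribe(audio_b64, transcribe_fn):
    """Decode base64 audio, save it as a temporary WAV file, transcribe it, clean up.

    Returns a (body, status_code) pair that a Flask view could pass to jsonify.
    """
    # Browsers often send data URLs ("data:audio/wav;base64,..."); drop the prefix.
    if audio_b64.startswith("data:") and "," in audio_b64:
        audio_b64 = audio_b64.split(",", 1)[1]

    try:
        audio_bytes = base64.b64decode(audio_b64, validate=True)
    except (binascii.Error, ValueError):
        return {"error": "invalid base64 audio data"}, 400

    # Write the decoded bytes to a temporary .wav file.
    fd, path = tempfile.mkstemp(suffix=".wav")
    try:
        with os.fdopen(fd, "wb") as wav_file:
            wav_file.write(audio_bytes)
        result = transcribe_fn(path)
    finally:
        # Remove the temporary file even if transcription fails.
        os.remove(path)

    return {"transcription": result}, 200
```

Passing the transcription function in as an argument keeps the file handling testable without loading a speech model.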
Dependencies and Libraries:
- Flask: Framework for web application development.
- base64, os: Standard Python libraries for base64 decoding and file handling.
- openai-whisper: Audio-to-text converter that handles the transcription process.
- Llama 3: Language model that extracts information from the transcribed text to create accurate event information.
Deployment:
The application is configured to run locally (app.run(debug=True)). Deployment to a production environment would typically involve configuring a production server (e.g., Gunicorn) and ensuring appropriate security measures.
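One small step toward production use is to stop hard-coding debug mode. The sketch below assumes a hypothetical environment variable, `TODO_APP_DEBUG`, which is not part of the project or of Flask; it only illustrates reading the flag from the environment so that debug stays off unless explicitly enabled.

```python
import os


def debug_enabled(environ=None):
    """Return True only when the (hypothetical) TODO_APP_DEBUG flag is switched on."""
    env = os.environ if environ is None else environ
    return env.get("TODO_APP_DEBUG", "").strip().lower() in {"1", "true", "yes"}


# In the app's entry point this could replace the hard-coded flag:
# app.run(debug=debug_enabled())
```

A production server such as Gunicorn would import the application object rather than call app.run, so the flag only matters for local runs.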
Future Considerations:
- Further Development: Extend the integration to more productivity applications (e.g., Notion, Outlook).
- Error Handling: Enhance error handling to provide informative responses to users.
- Security: Implement secure file handling and consider authentication for user access.
- Performance: Optimize for larger audio files and concurrent requests.
Summary:
This Flask app allows users to upload audio files for transcription. Audio data is sent from the client as a base64-encoded string. On receiving it, the app decodes the data and saves it to a temporary WAV file. The transcribe endpoint then processes this file with a custom function (generate_todo_from_audio from modules.py). After transcription, the temporary file is deleted and the result is returned to the client as a JSON response.
The Google Calendar integration is currently a work in progress because of issues with API authentication. However, we do not expect this to be a major obstacle to developing the product.
Built With
- css
- flask
- html
- javascript
- llama
- open-ai
- python