-
Introducing Minerva: an AI learning tool for all ages that transcribes audio to text
-
The home page of the Minerva website
-
A user selecting MP3 or MP4 files from their local files
-
The uploaded files are sent to an AWS S3 bucket for data processing
-
We utilize Amazon Transcribe to transcribe audio to text
-
The service page of the website: the user can upload another file
-
The transcript is shown on the service page
-
A sample output text from a BBC News podcast about global warming
-
A sample output text of a UCSD Computer Science Lecture on Inheritance
-
Future Vision: Use Amazon Comprehend and Elasticsearch to recommend potential study materials to users (related subjects, article links).
Inspiration
While researching education services, we found that UCSD's podcast website does not have a closed caption section, which would help students understand and review course content. We decided to make a program that not only provides subtitles but also suggests further readings related to the topic of the audio clip. We believe this service is invaluable to learners of all ages, as it can be used to transcribe health podcasts, college class recordings, news clips, and more.
What it does
Our program uses AI speech recognition to transcribe audio clips to text, so users can review the content of their MP3 or MP4 media files.
How we built it
We use Amazon Transcribe, an automatic speech recognition service from Amazon Web Services that processes media clips and produces a JSON file containing the transcript. On the front end, the website is built with HTML and CSS, with JavaScript to send requests to the back end and receive its responses. The back end processes the response using Python.
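As a rough illustration of the Python step, here is a minimal sketch of pulling the transcript text out of a Transcribe result file. The function name `extract_transcript` and the sample data are made up for this example; the `results` and `transcripts` keys follow the layout Amazon Transcribe uses for its JSON output, but this is not necessarily how our back end is written.

```python
import json


def extract_transcript(transcribe_json: str) -> str:
    """Return the full transcript text from an Amazon Transcribe result file."""
    data = json.loads(transcribe_json)
    transcripts = data["results"]["transcripts"]
    # Transcribe normally returns one entry here; join defensively in case of several.
    return " ".join(t["transcript"] for t in transcripts)


# Trimmed, made-up sample in the shape of a Transcribe result file.
sample = json.dumps({
    "jobName": "example-job",
    "status": "COMPLETED",
    "results": {
        "transcripts": [{"transcript": "Inheritance lets a class reuse code."}],
        "items": [],
    },
})

print(extract_transcript(sample))
```

The returned string is what the front end would display on the service page.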
Challenges we ran into
We had difficulty getting files from a local computer to load into our AWS S3 bucket (an online storage container for use with AWS services) because we built the website locally without a hosted server.
Accomplishments that we're proud of
We had plenty of fun creating the front end of the website, trying out different combinations of colors to create visually pleasing branding for Minerva. On the back end, we learned to use an AWS Lambda function, which sends a file to Amazon Transcribe for processing when an audio clip uploaded by the user arrives in the S3 bucket.
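The upload-triggered flow can be sketched roughly as follows. This is an assumption-laden outline rather than our exact function: the helper `build_job_request`, the optional `transcribe_client` parameter (added so the handler can be tried without AWS), the `en-US` language code, and the way the job name is derived from the file key are all hypothetical choices. The S3 event layout and the `start_transcription_job` parameters are the standard ones.

```python
import re
import urllib.parse


def build_job_request(event: dict) -> dict:
    """Turn an S3 upload event into parameters for a transcription job."""
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 event notifications.
    key = urllib.parse.unquote_plus(record["object"]["key"])
    extension = key.rsplit(".", 1)[-1].lower()
    if extension not in ("mp3", "mp4"):
        raise ValueError(f"unsupported media type: {key}")
    # Job names may only contain letters, digits, '.', '_' and '-'.
    job_name = re.sub(r"[^0-9a-zA-Z._-]", "-", key)[:200]
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": f"s3://{bucket}/{key}"},
        "MediaFormat": extension,
        "LanguageCode": "en-US",  # assumption: English-language clips
    }


def lambda_handler(event, context, transcribe_client=None):
    """Entry point invoked by the S3 upload trigger."""
    if transcribe_client is None:
        import boto3  # provided by the AWS Lambda Python runtime
        transcribe_client = boto3.client("transcribe")
    request = build_job_request(event)
    transcribe_client.start_transcription_job(**request)
    return {"jobName": request["TranscriptionJobName"]}
```

One caveat with this sketch: transcription job names must be unique within an account, so uploading a file with the same name twice would need a suffix such as a timestamp added to the job name.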
What we learned
We've learned that, as developers, we can create functional programs that provide services useful to the world. We also learned that developing a program involves plenty of trial and error, dreaming, and brainstorming.
What's next for Minerva
We plan to use other Amazon Web Services, such as Amazon Comprehend, CloudFormation, and Elasticsearch, to further analyze the transcript of the audio clip. We plan to add a Related Subjects section, which lists the essential topics and subjects frequently mentioned in the clip. We also plan to add a Related Articles section, which links to articles on those topics to help users learn more about the subjects of the clip.
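Since this part is not built yet, the following is only a sketch of how a Related Subjects list might be derived. It assumes the transcript has already been passed to Amazon Comprehend's `detect_key_phrases` call (for example `boto3.client("comprehend").detect_key_phrases(Text=transcript, LanguageCode="en")`) and ranks the returned phrases by how often they appear. The function name `related_subjects`, the `min_score` cutoff, and the sample response are hypothetical.

```python
from collections import Counter


def related_subjects(key_phrases_response: dict, top_n: int = 5,
                     min_score: float = 0.9) -> list:
    """Rank key phrases by frequency, ignoring low-confidence detections."""
    counts = Counter(
        phrase["Text"].lower()  # treat "Global warming" and "global warming" alike
        for phrase in key_phrases_response["KeyPhrases"]
        if phrase["Score"] >= min_score
    )
    return counts.most_common(top_n)


# Made-up response in the shape returned by detect_key_phrases.
sample_response = {
    "KeyPhrases": [
        {"Text": "global warming", "Score": 0.99},
        {"Text": "Global warming", "Score": 0.98},
        {"Text": "the podcast", "Score": 0.50},
        {"Text": "sea levels", "Score": 0.95},
    ]
}

print(related_subjects(sample_response))
```

Comprehend limits how much text it accepts per request, so a long lecture transcript would likely need to be split into chunks and the counts merged.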