Smart Captions


We sought to create a better way to get fast and easy access to information about the topics discussed in an audio or video file. This would allow people to quickly read articles about topics mentioned in a debate or the news, so that they can develop a better understanding of these issues rather than just taking someone's word for it.

What it does

First, our web-app takes an uploaded audio file and converts the speech to text using a Speech-to-Text API. Natural language processing is then applied to pull out the important ideas and topics covered in the conversation. Clicking any of these keywords quickly pulls up articles about it.
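The three steps above (transcribe, extract topics, link to articles) can be sketched as a small pipeline. This is only an illustration under stated assumptions: `fake_transcribe` and `fake_extract_topics` are hypothetical stand-ins for the real Speech-to-Text and NLP calls, the keyword rule is a deliberately crude placeholder, and the `example.com` search URL is made up.

```python
from dataclasses import dataclass
from typing import Callable, List
from urllib.parse import quote_plus


@dataclass
class Topic:
    name: str         # keyword pulled from the transcript
    article_url: str  # link the user can click to read more


def fake_transcribe(audio: bytes) -> str:
    """Hypothetical stand-in for the Speech-to-Text API call."""
    return "The senate debated climate policy and carbon taxes today."


def fake_extract_topics(transcript: str) -> List[str]:
    """Hypothetical stand-in for the NLP step: keep unique words of 6+ letters."""
    topics: List[str] = []
    for word in transcript.split():
        word = word.strip(".,").lower()
        if len(word) >= 6 and word not in topics:
            topics.append(word)
    return topics


def caption_topics(
    audio: bytes,
    transcribe: Callable[[bytes], str] = fake_transcribe,
    extract: Callable[[str], List[str]] = fake_extract_topics,
    search_base: str = "https://example.com/search?q=",  # assumed article search URL
) -> List[Topic]:
    """Run audio -> transcript -> topics -> clickable article links."""
    transcript = transcribe(audio)
    return [Topic(name, search_base + quote_plus(name)) for name in extract(transcript)]


if __name__ == "__main__":
    for topic in caption_topics(b""):
        print(topic.name, topic.article_url)
```

Passing the two service calls in as functions keeps the pipeline testable without network access; real API clients could be swapped in for the stand-ins.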

How we built it

We developed both client- and server-side software to process audio files and pull out the main conversation points. The client side was developed primarily with HTML, CSS, and JavaScript. The server side was developed in Python and is responsible for the API calls to the Speech-to-Text (STT) service and the NLP service. The server and client communicate through Flask: the server exposes a REST API that we developed, and the client calls it to start STT and NLP processing. The client then uses the same REST API to retrieve the results.

Our web-app takes advantage of a third-party Speech-To-Text service and Google Cloud's Natural Language Processing to process the data. Flask is used to create a Python microserver that the client accesses using REST API calls.
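A minimal sketch of what such a Flask microserver could look like is shown here. It is not our actual code: the route names (`/process`, `/results/<job_id>`), the `audio` form field, the in-memory `RESULTS` store, and the `run_stt` / `run_nlp` helpers are all assumptions made for illustration, and the two helpers return canned data instead of calling the real third-party services.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# In-memory store of finished results keyed by job id (assumption; demo only).
RESULTS = {}


def run_stt(audio: bytes) -> str:
    """Hypothetical placeholder for the third-party Speech-to-Text call."""
    return "Congress debated the Clean Air Act."


def run_nlp(transcript: str) -> list:
    """Hypothetical placeholder for the NLP call: keep capitalized words."""
    words = [w.strip(".,") for w in transcript.split()]
    return [w for w in words if w and w[0].isupper()]


@app.route("/process", methods=["POST"])
def process():
    """Client uploads audio; server runs STT and NLP and stores the result."""
    upload = request.files.get("audio")
    if upload is None:
        return jsonify(error="missing 'audio' file"), 400
    transcript = run_stt(upload.read())
    job_id = str(len(RESULTS) + 1)
    RESULTS[job_id] = {"transcript": transcript, "keywords": run_nlp(transcript)}
    return jsonify(id=job_id), 201


@app.route("/results/<job_id>", methods=["GET"])
def results(job_id):
    """Client fetches the transcript and keywords for an earlier upload."""
    if job_id not in RESULTS:
        return jsonify(error="unknown id"), 404
    return jsonify(RESULTS[job_id])


if __name__ == "__main__":
    app.run(debug=True)
```

In this sketch the client would POST the audio file to `/process`, read the returned `id`, and then GET `/results/<id>` to display the transcript and clickable keywords.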

Challenges we ran into

At first, we struggled to figure out where to begin the implementation. We tackled this problem by speaking with the developers on hand from the companies that hosted the APIs we were using. Talking with these experts helped us draw a clear diagram of what we wanted and plan a path to implementing it.

Accomplishments that we're proud of

Our team is proud of how effectively we were able to go through the ideation process and to turn our idea into a well-implemented web-app. Each of our teammates learned about a new area of programming and development that they had never explored before.

On the back end, we are very proud of the Python microserver that we developed. Nobody on our team had any back-end experience coming into the hackathon, but we built a server that makes third-party API calls, handles REST API calls from the client to start those third-party calls, and returns the information they gather.

What we learned

This was the first hackathon for everyone on our team. We all learned how to contribute to a group development project. We were able to divide workloads effectively while still allowing for seamless integration between different people's code. We also learned in depth how APIs and servers work, which gave us insight into back-end development that we had never had before. Overall, we enjoyed learning how to take something from the concept stage to a basic implementation.

What's next for Smart Captions

We hope to improve the implementation of our software. Linking the transcript to a video and playing the two together in real time, while still providing the articles, would make it more useful. Overall, we are very proud of the technical aspects of our implementation, but we would still like to work on the scope and use case of our idea.
