Inspiration

I was inspired by Carykh's YouTube Automator. I built on his idea by letting users either narrate their own story or have Google Text-to-Speech read it for them, and by turning the program into a web application.

What it does

It turns a story written by the user into a full video with images and audio. For example, given the story below, it creates a video that narrates the text and shows an image for each word in brackets: the video starts with a picture of "night", and once the audio reaches "around" it switches to a picture of a "scarecrow".

"[night] It was a dark and scary night. Noone was around. [scarecrow] A scarecrow was in the farm Joe was passing."

How I built it

I built this web application using Flask, google_images_download, a forced aligner called gentle, and ffmpeg. First I downloaded all the images with google_images_download. Then I ran gentle, which returns a JSON with the timestamp of every word spoken in the transcript; from those timestamps I worked out how long each image should stay on screen. Finally, I used ffmpeg to combine the images and audio into an mp4 file.
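
Roughly, the pipeline looks like the sketch below. The file names, the `first_word_index` values, and the exact ffmpeg flags are my own assumptions; gentle does return a JSON whose `words` list carries per-word start/end timestamps, and ffmpeg's concat demuxer accepts a list of images with per-image durations plus an audio track.

```python
import json
import subprocess

from google_images_download import google_images_download

# 1. Download one image per bracketed keyword with google_images_download.
#    (I'm assuming the downloaded files end up saved as "<keyword>.jpg".)
downloader = google_images_download.googleimagesdownload()
keywords = ["night", "scarecrow"]
downloader.download({"keywords": ",".join(keywords), "limit": 1})

# 2. Read gentle's alignment output: a JSON whose "words" list gives a
#    "start" and "end" time (in seconds) for every aligned word.
with open("align.json") as f:
    alignment = json.load(f)
words = [w for w in alignment["words"] if "start" in w]

# 3. Work out how long each image stays on screen: from the first word of
#    its bracketed segment until the first word of the next segment.
#    first_word_index maps each keyword to that word's position (hypothetical values).
first_word_index = [0, 10]
starts = [words[i]["start"] for i in first_word_index]
starts[0] = 0.0                          # first image is shown from the very start
ends = starts[1:] + [words[-1]["end"]]   # last image runs until the audio ends
durations = [end - start for start, end in zip(starts, ends)]

# 4. Write an ffmpeg concat list: each image followed by how long to show it.
with open("images.txt", "w") as f:
    for keyword, duration in zip(keywords, durations):
        f.write(f"file '{keyword}.jpg'\nduration {duration:.2f}\n")
    f.write(f"file '{keywords[-1]}.jpg'\n")  # concat demuxer expects the last file repeated

# 5. Mux the image slideshow with the narration into an mp4.
subprocess.run([
    "ffmpeg", "-y", "-f", "concat", "-safe", "0", "-i", "images.txt",
    "-i", "narration.wav", "-c:v", "libx264", "-pix_fmt", "yuv420p",
    "-shortest", "output.mp4",
], check=True)
```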

Challenges I ran into

I had a lot of trouble learning ffmpeg and lowerquality/gentle. Gentle was hard because there was no documentation for it.

Accomplishments that I'm proud of

I'm proud of having a working web application by the end of the hackathon. This is my first hackathon, so I'm really glad to have a finished project.

What I learned

I learned how to work with files in Flask. I also learned how to use a lot of new libraries.
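
As a rough sketch of what that looks like, a Flask route that accepts the uploaded narration file could be written like this (the route path and form field name are placeholders I chose):

```python
from flask import Flask, request
from werkzeug.utils import secure_filename

app = Flask(__name__)

@app.route("/upload", methods=["POST"])
def upload():
    # Flask exposes uploaded files through request.files, keyed by the form
    # field name; secure_filename sanitizes the user-supplied file name.
    audio = request.files["narration"]
    filename = secure_filename(audio.filename)
    audio.save(filename)
    return f"Saved {filename}"
```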

What's next for Text2Video

I want to make the website look nicer and improve the UI. I also plan to add more options, such as letting users choose what license the images should be under; right now the video just uses the top results from Google Images.

Built With

flask, google_images_download, gentle, ffmpeg, python
