Inspiration

When you travel and meet people in an increasingly globalized world, not everyone you meet speaks English.

What it does

This app lets users quickly generate a memory artifact: a video of their travels. They start with a conversation about their trip with an expert interviewer (ElevenLabs conversational AI), then upload travel photos and additional details. After a few seconds of processing, they receive a video of their animated pictures (animated with Luma Labs' image-to-video generation).

How we built it

We built the frontend in React and TypeScript, and the backend in Python with FastAPI and FFmpeg, using Cloudflare R2 to host the uploaded pictures for Luma Labs. Most importantly, we used the powerful APIs from ElevenLabs and Luma Labs.
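As a rough illustration of the video-assembly step, here is a minimal sketch of building an FFmpeg command that normalizes each clip to a common resolution and frame rate before concatenating them with the concat filter. The function name, target resolution, and frame rate are our own assumptions, not the project's actual code:

```python
def build_merge_cmd(clips, out_path, w=1280, h=720, fps=24):
    """Build an ffmpeg command that normalizes each clip (scale to fit,
    pad to exact size, unify SAR and frame rate) and then concatenates
    them with the concat filter. Re-encoding through the filter graph
    avoids the mismatched-stream errors the stream-copy concat demuxer
    gives for clips with different aspect ratios."""
    cmd = ["ffmpeg", "-y"]
    for clip in clips:
        cmd += ["-i", clip]
    parts = []
    for i in range(len(clips)):
        # Scale down to fit inside w x h, center-pad to exactly w x h,
        # reset the sample aspect ratio, and force a common frame rate.
        parts.append(
            f"[{i}:v]scale={w}:{h}:force_original_aspect_ratio=decrease,"
            f"pad={w}:{h}:(ow-iw)/2:(oh-ih)/2,setsar=1,fps={fps}[v{i}]"
        )
    # Feed all normalized streams into the concat filter (video only).
    parts.append(
        "".join(f"[v{i}]" for i in range(len(clips)))
        + f"concat=n={len(clips)}:v=1:a=0[out]"
    )
    cmd += ["-filter_complex", ";".join(parts), "-map", "[out]", out_path]
    return cmd
```

The resulting list can be passed straight to `subprocess.run`; normalizing every input first is what makes the concat filter accept clips that started out with different aspect ratios.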

Challenges we ran into

  • Japanese speech was tough for some models (we ended up using ElevenLabs Turbo v2.5, which handles it well, instead of Multilingual v2)
  • Merging the generated videos together was tough even when aspect ratios were specified
  • Uploading images to R2/S3 and generating pre-signed URLs so that the images are not publicly exposed, and only the services we hand the URLs to (Luma Labs, OpenAI) can download them
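In practice the pre-signed URLs would come from an SDK call such as boto3's `generate_presigned_url` pointed at the R2 endpoint. To show what that call actually produces, here is a stdlib-only sketch of SigV4 query presigning for a GET request; the account, bucket, key, and credential values are placeholders, and region "auto" follows R2's S3-compatible API:

```python
import datetime
import hashlib
import hmac
import urllib.parse


def presign_get(account_id, bucket, key, access_key, secret_key,
                expires=900, region="auto"):
    """Build a SigV4 pre-signed GET URL for an R2/S3 object: anyone
    holding the URL can fetch the object until it expires, without the
    bucket being public."""
    host = f"{account_id}.r2.cloudflarestorage.com"
    now = datetime.datetime.now(datetime.timezone.utc)
    amzdate = now.strftime("%Y%m%dT%H%M%SZ")
    datestamp = now.strftime("%Y%m%d")
    scope = f"{datestamp}/{region}/s3/aws4_request"
    uri = f"/{bucket}/" + urllib.parse.quote(key)
    q = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amzdate,
        "X-Amz-Expires": str(expires),
        "X-Amz-SignedHeaders": "host",
    }
    # Canonical query string: sorted keys, percent-encoded values.
    canon_q = "&".join(
        f"{k}={urllib.parse.quote(v, safe='')}" for k, v in sorted(q.items())
    )
    # Canonical request for a presigned GET uses UNSIGNED-PAYLOAD.
    canon_req = "\n".join(
        ["GET", uri, canon_q, f"host:{host}", "", "host", "UNSIGNED-PAYLOAD"]
    )
    to_sign = "\n".join([
        "AWS4-HMAC-SHA256", amzdate, scope,
        hashlib.sha256(canon_req.encode()).hexdigest(),
    ])
    # Derive the signing key by chaining HMACs over the scope parts.
    k = ("AWS4" + secret_key).encode()
    for part in (datestamp, region, "s3", "aws4_request"):
        k = hmac.new(k, part.encode(), hashlib.sha256).digest()
    sig = hmac.new(k, to_sign.encode(), hashlib.sha256).hexdigest()
    return f"https://{host}{uri}?{canon_q}&X-Amz-Signature={sig}"
```

The backend hands URLs like this to Luma Labs or OpenAI; they can download the image for the validity window, and the credentials themselves never leave the server.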

Accomplishments that we're proud of

Building an end-to-end AI pipeline, including using ElevenLabs' new conversational AI API.

What we learned

Coming together as a team and staying focused on building something useful mattered most. This is the best hackathon we've worked on.

What's next for Ryoko Travel

More languages! Users should be able to select the languages they're interested in; currently we generate English and Japanese output.
