Inspiration

We all know the negative phenomenon of "doom scrolling". You pick up your phone, open one TikTok video... and 45 minutes later you are still mindlessly scrolling through completely random, meaningless videos. We recognized that infinite scrolling keeps us addicted to our phones and wanted to leverage this to make learning easier.

What it does

We built a Swift iOS app and an accompanying ML pipeline in Python that summarizes research papers and trending GitHub repos into short videos, served on an infinite feed.

This is our submission for the efficiency track and for the GitHub and Google Cloud challenges.

How we built it

We built the app in two parts: an iOS mobile app and a backend ML pipeline in Python. The former provides the highly addictive infinite-scroll interface for our videos; the latter takes long-form content (research papers and GitHub repos) and creates short videos. We implemented the iOS app in SwiftUI, mimicking the familiar TikTok UI. The backend pipeline had three parts: a summarization service, a text-to-speech (TTS) service and a video creation service.

  • The summarization service took the salient sections of a research paper (abstract and conclusion) or the README.md of a GitHub repository and converted them into a chunked video script. It used the LangChain library with OpenAI's text-davinci-003 model to carry out the summarization. The summary was output as a .srt file, the SubRip subtitle format.
  • The TTS service called the Google Cloud Text-to-Speech API to generate a natural-sounding narration of our script.
  • The video generation service used the MoviePy package to assemble a video from the narration, the subtitles and pictures scraped from the long-form resource.
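
To make the "chunked video script" idea concrete, here is a minimal sketch of turning a summary string into SubRip (.srt) cues. It is illustrative only and not the project's actual code: the function names, the 8-word chunk size and the fixed speaking rate used to estimate timings are all assumptions. In a real pipeline the timings would more likely come from the generated narration audio.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def chunk_words(text: str, max_words: int = 8) -> list[str]:
    """Split text into chunks of at most max_words words (assumed chunking rule)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def script_to_srt(summary: str, max_words: int = 8, words_per_second: float = 2.5) -> str:
    """Build SRT text from a summary.

    Cue durations are estimated from an assumed speaking rate
    (words_per_second); this is a placeholder for real audio timings.
    """
    cues = []
    start = 0.0
    for index, chunk in enumerate(chunk_words(summary, max_words), start=1):
        end = start + len(chunk.split()) / words_per_second
        # An SRT cue is: sequence number, time range, text, then a blank line.
        cues.append(f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{chunk}\n")
        start = end
    return "\n".join(cues)


if __name__ == "__main__":
    print(script_to_srt("ReClip turns long research papers into short narrated videos"))
```

Each chunk becomes one subtitle cue, so the same text can drive both the narration and the on-screen captions.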

Challenges we ran into

  • Token rate limits with the OpenAI API
  • Learning LangChain and picking the correct LangChain modules
  • Using the highly unintuitive MoviePy library
  • Consistently generating images using Diffusion

Accomplishments that we're proud of

  • Pivoting quickly to a replacement way of generating content (summaries and videos) when our initial approach did not work
  • Spreading out work and working effectively as a team!

What we learned

  • OpenAI's GPT API and LangChain
  • iOS development with Swift - calling APIs to fetch content, handling user gestures, working with videos
  • Calling multiple services with Flask

What's next for ReClip

We hope ReClip serves as an inspiration and catalyst for similar generative AI education apps to enter the market. We also hope to improve the current application to make it more efficient.
