Inspiration

I enjoy walking and cooking, and I often use Amazon Audible to listen to books or Spotify to enjoy music during these activities. With Audiblog, I wanted to offer a similar experience, but focused on web pages. My goal is to let users enjoy their favorite content hands-free and without stress.

What it does

Audiblog lets users input the URL of a web page or blog, then reads the text from that page aloud. A key feature is that Audiblog converts the text into SSML, which is then rendered and saved as audio. This enables familiar playback controls such as 10-second rewind and fast-forward, just like a music player. URLs can be added not only within the app but also through the share sheet in browsers like Safari.

Additionally, using feature extraction and backend search, Audiblog recommends related articles that other users are reading. This eliminates the hassle of entering URLs and makes the app more convenient. Moving forward, I plan to diversify and enhance the recommendation feature and design an experience where the next article is prepared and played automatically, much like how Spotify seamlessly plays the next song. Articles are read aloud starting from the title, and if you're not interested, you can easily skip them from your AirPods or the iPhone Control Center.
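As an illustrative sketch of the SSML step (not Audiblog's actual pipeline; the text and pause values here are made up), extracted article text can be wrapped in minimal SSML and turned into an utterance. On iOS 16 and later, `AVSpeechUtterance` accepts an SSML representation directly:

```swift
import AVFoundation

// Hypothetical example: wrap article text in minimal SSML with a short
// pause after the title, then create an utterance from it (iOS 16+).
let title = "Why I Built Audiblog"
let body = "Audiblog reads web pages aloud so you can listen hands-free."
let ssml = """
<speak>\(title)<break time="800ms"/>\(body)</speak>
"""

// The SSML initializer is failable: it returns nil for invalid markup.
if let utterance = AVSpeechUtterance(ssmlRepresentation: ssml) {
    let synthesizer = AVSpeechSynthesizer()
    synthesizer.speak(utterance)
}
```

The pause after the title supports the skip-from-the-title experience described above: the listener hears the title first and can skip before the body begins.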

How we built it

Development was done in Xcode with SwiftUI in a straightforward manner, with occasional use of the AVFoundation framework. Although I am experienced in cross-platform development with tools like Flutter, I started with native iOS development because I was uncertain how well AVFoundation would handle exporting audio.
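The audio-export concern can be sketched roughly as follows. This is a minimal sketch under my own assumptions, not Audiblog's actual code: `AVSpeechSynthesizer.write(_:toBufferCallback:)` delivers synthesized speech as PCM buffers, which can be appended to an `AVAudioFile` to produce a seekable file on disk.

```swift
import AVFoundation

// Hypothetical sketch: synthesize an utterance and save it to a file,
// which makes player-style seek/rewind controls possible later.
func exportSpeech(text: String, to url: URL) {
    let utterance = AVSpeechUtterance(string: text)
    let synthesizer = AVSpeechSynthesizer()
    var file: AVAudioFile?

    synthesizer.write(utterance) { buffer in
        // The callback may deliver empty buffers at the end; skip those.
        guard let pcm = buffer as? AVAudioPCMBuffer, pcm.frameLength > 0 else { return }
        do {
            if file == nil {
                // Create the file lazily, using the format of the first buffer.
                file = try AVAudioFile(forWriting: url, settings: pcm.format.settings)
            }
            try file?.write(from: pcm)
        } catch {
            print("Audio export failed: \(error)")
        }
    }
}
```

Once the speech exists as a file rather than a live synthesis stream, 10-second rewind and fast-forward become ordinary seek operations on an audio player.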

Challenges we ran into

One challenge was my initial lack of familiarity with the basics of AVFoundation. Another was optimizing the backend scraper that loads individual web pages, particularly for specific article services, which required a hands-on approach. I experimented with using LLMs such as OpenAI's models and Gemini for this extraction, but found their performance insufficient, so I wrote the necessary JavaScript myself. While relying on an LLM would have been an easy and attractive option, I chose the more reliable approach, which also makes troubleshooting easier when issues arise.

Another challenging aspect was finding and implementing a simple way to handle articles that require login access.

Accomplishments that we're proud of

First and foremost, I'm proud that I completed the app. While there are still features and experiences I want to refine, I successfully validated the technology and delivered the minimum functionality needed for an experience where articles play back one after another, similar to Spotify. I'm confident that expanding this feature will lead to an excellent user experience, and this discovery is the most significant achievement driving future development.

What we learned

I deepened my knowledge of AVFoundation, particularly its audio components. Additionally, my understanding of vector embeddings, vector search, and using LLMs through APIs in more complex scenarios has become more refined.
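The recommendation idea behind the vector search can be sketched in a few lines. This is an illustrative toy, not the production backend: embeddings here are plain arrays, and candidates are ranked by cosine similarity to the article the user just listened to.

```swift
import Foundation

// Toy sketch: cosine similarity between two embedding vectors.
func cosineSimilarity(_ a: [Double], _ b: [Double]) -> Double {
    precondition(a.count == b.count, "Embeddings must have equal dimensions")
    let dot = zip(a, b).map(*).reduce(0, +)
    let normA = sqrt(a.map { $0 * $0 }.reduce(0, +))
    let normB = sqrt(b.map { $0 * $0 }.reduce(0, +))
    guard normA > 0, normB > 0 else { return 0 }
    return dot / (normA * normB)
}

// Rank candidate articles by similarity to the current article's embedding
// and return the top-k IDs as recommendations.
func recommend(query: [Double],
               candidates: [(id: String, embedding: [Double])],
               top k: Int) -> [String] {
    candidates
        .sorted { cosineSimilarity(query, $0.embedding) > cosineSimilarity(query, $1.embedding) }
        .prefix(k)
        .map { $0.id }
}
```

A real backend would use a vector index rather than a linear scan, but the ranking principle is the same.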

What's next for Audiblog

As mentioned earlier, the next step is to build out the experience where articles play back consecutively, similar to Spotify. This includes diversifying the algorithms used to recommend articles to users. My goal is an app that users can enjoy with as little need to open or interact with it as possible. I will continue to pursue a hands-free, stress-free experience, while also making the app more accessible to all users.
