Inspiration
I use Pocket to save interesting articles I find on Hacker News, Reddit, and forums. One feature missing from the Pocket free plan is full-text search of article content: you can only search by title. Sometimes I remember a few keywords from the body of a saved article, but they are not in the title, so I have to go through the links one by one to find the article I want to revisit.
There are many free and open-source Pocket alternatives and read-it-later projects, such as wallabag, but they require self-hosting on a VPS or Raspberry Pi, which adds a maintenance burden.
The idea of using serverless services to build a Pocket clone came to mind after I learned about TiDB Serverless. Combined with a serverless compute platform like Vercel, it could be an ideal way to run your own personal Pocket at low cost (or for free with infrequent use, given the generous free quotas).
What it does
Just like Pocket, it allows you to:
- Add/remove a link
- List the links you added
- Search the content of saved links (not available in the Pocket free plan)
- Open a simple archive view, in case the original source of the link is removed
How we built it
TiDB Serverless
- Data persistence
Simple full-text search on top of TiDB
- TiDB does not support FULLTEXT indexes as of now
- Instead, we implement search on the application side by building an inverted index stored in TiDB
- Steps:
  - When a link is added, we fetch the full HTML and use the Mozilla Readability library to extract the text content
  - Next, we use an NLP library to tokenize the text content into useful terms
  - Each term is saved to the database with a reference to the link
  - When we search for a term, we look up the matching link IDs and display those links in the front end
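The steps above can be sketched in TypeScript roughly as follows. This is a minimal in-memory illustration, not the project's actual code: the `InvertedIndex` class, the `tokenize` helper, and the stop-word list are hypothetical names made up for this example, the `Map` stands in for the term table in TiDB, and the naive regex tokenizer stands in for a real NLP library. Matching every term of a multi-word query is also an assumption of this sketch.

```typescript
type LinkId = number;

// Tiny illustrative stop-word list; a real NLP library ships a fuller one.
const STOP_WORDS = new Set(["a", "an", "and", "the", "of", "to", "in", "is", "it"]);

// Naive tokenizer (assumption): lowercase, split on non-alphanumerics,
// drop stop words and one-character tokens, and de-duplicate.
function tokenize(text: string): string[] {
  const seen = new Set<string>();
  const terms: string[] = [];
  for (const raw of text.toLowerCase().split(/[^a-z0-9]+/)) {
    if (raw.length < 2 || STOP_WORDS.has(raw) || seen.has(raw)) continue;
    seen.add(raw);
    terms.push(raw);
  }
  return terms;
}

// In-memory stand-in for a (term, link_id) table stored in TiDB.
class InvertedIndex {
  private postings = new Map<string, Set<LinkId>>();

  // Called when a link is added: record each term as pointing at the link.
  addLink(linkId: LinkId, textContent: string): void {
    for (const term of tokenize(textContent)) {
      const ids = this.postings.get(term) ?? new Set<LinkId>();
      ids.add(linkId);
      this.postings.set(term, ids);
    }
  }

  // Called when a link is removed: drop it from every posting set.
  removeLink(linkId: LinkId): void {
    this.postings.forEach((ids) => ids.delete(linkId));
  }

  // Returns the IDs of links that contain every term of the query.
  search(query: string): LinkId[] {
    const terms = tokenize(query);
    if (terms.length === 0) return [];
    let result: LinkId[] = [];
    terms.forEach((term, i) => {
      const ids = this.postings.get(term) ?? new Set<LinkId>();
      result = i === 0 ? Array.from(ids) : result.filter((id) => ids.has(id));
    });
    return result.sort((a, b) => a - b);
  }
}

const index = new InvertedIndex();
index.addLink(1, "TiDB Serverless is a managed database");
index.addLink(2, "Next.js runs on Vercel serverless functions");

console.log(index.search("serverless"));          // [ 1, 2 ]
console.log(index.search("serverless database")); // [ 1 ]
```

In a database-backed version, `addLink` would presumably become one row insert per term and `search` a lookup by term, but the table layout is not described here, so that part is left out.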
Web UI in Next.js and Material UI (MUI)
Challenges we ran into
- Indexing can take some time. Ideally it should run in a background job rather than in the API handler. I tried to integrate with https://www.defer.run/, but it failed to build my project, so indexing is still part of the API handler, which can time out on Vercel (10 s on the free plan)
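Until indexing moves to a background job, one possible stopgap is for the handler to stop waiting shortly before the platform limit and report that indexing did not finish, instead of being cut off. The sketch below is only an illustration of that idea and not something the project currently does: `withBudget` and `Outcome` are hypothetical names, and the budget value is an assumption. It does not cancel the underlying work, so a link reported as timed out would need to be re-indexed by a later request.

```typescript
// Result of waiting for some work under a time budget.
type Outcome<T> = { status: "done"; value: T } | { status: "timeout" };

// Waits for `work` for at most `budgetMs` milliseconds.
// Resolves with "done" and the value if the work finishes in time,
// or with "timeout" otherwise. Errors from `work` are passed through.
// Note: the work itself is NOT cancelled when the budget runs out.
function withBudget<T>(work: Promise<T>, budgetMs: number): Promise<Outcome<T>> {
  return new Promise<Outcome<T>>((resolve, reject) => {
    const timer = setTimeout(() => resolve({ status: "timeout" }), budgetMs);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve({ status: "done", value });
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Example: a stand-in for the indexing step that takes about 20 ms.
const fakeIndexing = new Promise<string>((resolve) =>
  setTimeout(() => resolve("indexed"), 20),
);

// In a real handler the budget might be a little under 10 s (assumption).
withBudget(fakeIndexing, 1000).then((outcome) => {
  console.log(outcome.status);
});
```

A handler could use the outcome to tell the front end whether a link is fully searchable yet or still needs indexing.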
Accomplishments that we're proud of
- Core functionality working well
What we learned
- TiDB
- Next.js
- TypeScript
What's next for Mouth Bag
- Phrase search
- Better organization of links with tags, favourites, unread/read, collections, etc.
- More accurate extraction of text content from links
- Better indexing and ranking, plus support for other languages
- GPT summarization of saved articles to save reading time
Built With
- auth0
- next.js
- tidb
- vercel