Inspiration
The sheer versatility of large language models got me excited about harnessing their power to improve the way I browse the internet. One feature I particularly love in Chrome is the ability to create tab groupings. It's such a handy way to cluster related tabs together and assign them a customized name. However, I noticed it hasn't caught on as much as I thought it would. The main reason seems to be that it requires users to manually create and manage these groupings over time, which can be a bit of a chore. So, I thought, why not simplify this process and make it more user-friendly?
What it does
Say hello to TabularizeLLM, a friendly little Chrome extension that you can add to your browser! Once installed, all it takes is a click on the "Group Tabs" button, and voila! It peers into all your open tabs, considering their titles and URLs, and then cooks up a neat set of categories. The magic behind the scenes is a large language model (LLM), which performs the categorization.
Once these categories are in place, every new tab you open is like a guest at a party. Our extension, working quietly in the background, greets each one, determines which category it fits into best, and shows it to its place. If a tab is a bit unique and doesn't quite fit in with the existing categories, no worries! TabularizeLLM simply rolls out the red carpet and creates a new category just for it.
How we built it
The most fascinating aspect of this project is the AI wizardry that suggests categories for browser tabs. I began by using just the title and URL as input data. Surprisingly, this alone provided pretty impressive results in my experiments. I utilized a JavaScript AI library called langchain to create prompts, issue API calls to an LLM provider, and parse the LLM's output.
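The write-up doesn't include the actual prompt, so here is a hypothetical sketch of how tab titles and URLs might be folded into a categorization prompt. In the extension, the tab list would come from Chrome's real `chrome.tabs.query` API; the function name and prompt wording below are illustrative, not the project's code.

```javascript
// Hypothetical sketch: turn open tabs into a categorization prompt.
// In the extension the tab list would come from the Chrome API:
//   const tabs = await chrome.tabs.query({});
// Here we just take an array of {title, url} objects.
function buildGroupingPrompt(tabs) {
  const tabLines = tabs
    .map((t, i) => `${i + 1}. ${t.title} - ${t.url}`)
    .join("\n");
  return [
    "You are organizing browser tabs into named groups.",
    "Given the tabs below (title - URL), propose a short set of",
    "category names and assign each tab number to one category.",
    'Respond as JSON: {"categories": {"<name>": [tab numbers]}}.',
    "",
    "Tabs:",
    tabLines,
  ].join("\n");
}

// Example usage with fake tab data:
const prompt = buildGroupingPrompt([
  { title: "Hacker News", url: "https://news.ycombinator.com" },
  { title: "MDN: Array.map", url: "https://developer.mozilla.org" },
]);
```

Asking for a structured (JSON) response like this makes the downstream parsing step more tractable, though as noted below, the model won't always comply.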
Without the use of Large Language Models (LLMs), the conventional method would involve performing an unsupervised clustering of web pages. This would be based on features such as the title, URL, and keywords extracted from the content. The most challenging part of this approach is the subsequent step, where each cluster needs to be labeled. However, with the introduction of LLMs, this entire process can be automated with just a few precisely constructed prompts. The true strength of LLMs lies in their ability to democratize the creation of a large variety of applications and automations.
Challenges we ran into
Aside from the regular challenges of developing a Chrome Extension, working on this project introduced a new kind of challenge: prompt engineering and parsing the output of an LLM for use in downstream code.
Prompt Engineering - One of the API calls I make categorizes a new tab when there are already existing tab groups in the browser. I had to ask the model to assign it to an existing category or suggest a new one. My first prompt consistently assigned the new tab to an existing group. I had to adjust my prompt to give the model more confidence to suggest a new category if the existing ones didn't make sense.
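The adjusted prompt isn't reproduced in the write-up; below is a hypothetical version of what that fix can look like, making the "create a new category" escape hatch explicit rather than implied.

```javascript
// Hypothetical sketch of the adjusted classification prompt.
// The key change: explicitly tell the model that suggesting a new
// category is an acceptable, not exceptional, answer.
function buildAssignPrompt(tab, existingCategories) {
  return [
    `A new browser tab is open: "${tab.title}" (${tab.url}).`,
    `Existing tab groups: ${existingCategories.join(", ")}.`,
    "Assign the tab to the single best-fitting group.",
    "If none of the groups is a good fit, do NOT force a match:",
    "instead, suggest a concise name for a new group.",
    "Answer with just the group name.",
  ].join("\n");
}

const p = buildAssignPrompt(
  { title: "Allrecipes: Banana Bread", url: "https://www.allrecipes.com" },
  ["Work", "News"]
);
```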
Parsing LLM output - Generative models are inherently non-deterministic, making traditional computer programming approaches challenging. You can't rely on the model response adhering to strict schemas, nor can you perform error checking or validation as you normally would. While you can try to give the model very specific instructions, it's not foolproof. Your downstream parsing logic needs to be flexible.
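As a minimal sketch of what "flexible parsing" can mean in practice, assume the model was asked to answer with a single category name. The heuristics below (tolerating code fences, quotes, and chatty prefixes like "Category:") are illustrative, not the project's actual code.

```javascript
// Tolerant parser for a single category name returned by an LLM.
// Handles markdown code fences, surrounding quotes, and label
// prefixes that models sometimes add despite instructions.
function parseCategoryName(raw) {
  let text = raw.trim();
  // Strip markdown code fences if the model wrapped its answer.
  text = text.replace(/^```[a-z]*\n?/i, "").replace(/\n?```$/, "").trim();
  // Keep only the first non-empty line.
  text = text.split("\n").find((l) => l.trim() !== "") ?? "";
  // Remove surrounding quotes.
  text = text.replace(/^["'`]+|["'`]+$/g, "").trim();
  // Drop a leading label like "Category:" or "Answer:".
  text = text.replace(/^(category|answer|group)\s*:\s*/i, "");
  // Remove quotes again in case the label wrapped a quoted value.
  text = text.replace(/^["'`]+|["'`]+$/g, "").trim();
  return text;
}
```

The idea is to extract the useful signal from a range of plausible response shapes instead of rejecting anything that deviates from the requested schema.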
What I learned
🈯️ The way you phrase your question can greatly affect how well a language model performs.
💦 Clarify any unclear terms you use to increase the success rate.
👷♀️ To improve accuracy for tasks that require precision, a helpful technique is to make several attempts using an LLM, giving it the opportunity to try different approaches, evaluate its own answer, and recover from mistakes. The downside is that this dramatically increases the response time to the order of minutes and starts to become a poor user experience.
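The retry-and-self-evaluation idea above can be sketched roughly as follows, assuming a generic async `callModel(prompt)` function (a placeholder, stubbed for illustration): the model gets several attempts, a second call asks it to judge its own answer, and the first approved answer wins.

```javascript
// Sketch of retry-with-self-check. `callModel` is an assumed async
// function that sends a prompt to an LLM and returns its text reply.
async function classifyWithRetries(callModel, prompt, maxAttempts = 3) {
  let last = "";
  for (let i = 0; i < maxAttempts; i++) {
    last = await callModel(prompt);
    // Second call: ask the model to evaluate its own answer.
    const verdict = await callModel(
      `Question:\n${prompt}\n\nProposed answer: ${last}\n` +
      "Is this answer reasonable? Reply YES or NO."
    );
    if (/yes/i.test(verdict)) return last; // model approves its answer
  }
  return last; // fall back to the final attempt
}
```

Each attempt costs two round trips to the model, which is exactly where the minutes-long latency mentioned above comes from.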
What's next for Tabularize
Privacy at the forefront: Currently, Tabularize utilizes the might of a cloud-based AI model. However, I'm enthusiastically looking towards a future where your privacy takes the front seat. My aspiration is to revamp the approach, ensuring that your browsing data remains strictly on your device. I envisage implementing a compact, dedicated model that can effortlessly accomplish the task of classification. Such a purpose-built model would not only preserve high performance but also significantly diminish the memory and processing footprint.
Built With
- chrome
- langchain