Inspiration
We were inspired to create Glimpse because both of us are passionate about photography. One of the pain points we discovered while learning photography was the steep learning curve of most major photo editing tools. On top of being difficult to learn, these tools also come at a steep price. So we decided to build our own photo editing tool that leverages modern generative AI to make learning to edit easier for beginners, while adding useful features that experienced users can enjoy.
What it does
As mentioned earlier, we set out to create an agentic photo editor. Glimpse adds predictive AI features on top of a simple photo editing tool that we built. It not only lets users edit the photo directly, but also adds an AI chatbot on the left side. The chatbot can read the state of the image, offer editing suggestions, and apply edits on the user's behalf. For example, if the user asks to make the colors in the image pop, the chatbot automatically increases saturation and contrast. Additionally, based on the actions the user takes while editing, Glimpse predicts what the user may do next. For example, if it sees someone repeatedly lowering the brightness, it might suggest jumping straight to a lower target brightness. Finally, we leverage Statsig insights on global user trends to improve the functionality of the app for all users.
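To make the "make the colors pop" example concrete, here is a minimal sketch of how a chatbot-proposed edit could be applied to the editor's slider state. The slider names, delta values, and clamping range are hypothetical, not Glimpse's actual API:

```python
# Minimal sketch of applying an LLM-proposed adjustment to editor sliders.
# Slider names and the [-100, 100] range are illustrative assumptions.

def apply_tool_call(state: dict, tool_call: dict) -> dict:
    """Apply proposed slider deltas, clamping each slider to [-100, 100]."""
    new_state = dict(state)
    for slider, delta in tool_call.items():
        if slider in new_state:
            new_state[slider] = max(-100, min(100, new_state[slider] + delta))
    return new_state

# Current slider state, as the chatbot would read it.
edit_state = {"brightness": 0, "contrast": 0, "saturation": 0}

# e.g. "make the colors pop" -> the model proposes boosting saturation and contrast
proposal = {"saturation": 25, "contrast": 10}
edit_state = apply_tool_call(edit_state, proposal)
print(edit_state)  # {'brightness': 0, 'contrast': 10, 'saturation': 25}
```

Keeping the model's output as structured deltas like this (rather than free text) is what lets the editor apply suggestions directly.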
How we built it
The frontend of Glimpse was built using Next.js with React, with styling in Tailwind CSS. The backend was built in Python using FastAPI. We initially built this with the Ollama Gemma 2B LLM, but later pivoted to Gemini for more advanced features. We also track user analytics in both the frontend and the backend using the Statsig API, which allows for operation logging as well as dynamic configuration of our LLM.
Challenges we ran into
One of the challenges we ran into was reconciling quick frontend photo previews with the actual backend image processing. At first, we weren't doing any image processing in the frontend, which led to very slow update times and a laggy UI. Eventually, we settled on using quick CSS filters to preview the effect of an edit, and only applying the actual pixel changes once the user downloads the image, much like the way exports work in other photo editing apps.
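The export side of this two-path approach can be sketched as follows. This is illustrative pixel math approximating CSS `filter: brightness()` semantics, not Glimpse's actual implementation:

```python
# Sketch of the export path: the browser previews an edit with a CSS filter
# (e.g. style.filter = "brightness(1.2)"), and only at download time is the
# same adjustment baked into the pixels. Math here approximates CSS semantics.

def bake_brightness(pixels, amount):
    """amount mirrors CSS `brightness(amount)`: 1.0 = unchanged, >1 = brighter."""
    return [
        tuple(min(255, round(channel * amount)) for channel in (r, g, b))
        for (r, g, b) in pixels
    ]

image = [(100, 150, 200), (255, 0, 50)]  # tiny stand-in for real pixel data
print(bake_brightness(image, 1.2))
# [(120, 180, 240), (255, 0, 60)]
```

Deferring the per-pixel loop until export is what keeps the slider interactions instant: the preview only changes a CSS property, which the browser applies on the GPU.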
Another challenge we ran into was switching our LLM infrastructure from a locally hosted model (Ollama) to calling an API. We had to refactor our code base a little, but it was worth it in the end as we received much more detailed responses out of Gemini compared to Gemma 2B.
Accomplishments that we're proud of
One of the main accomplishments we are proud of is working on a pain point that personally affects us. We are both very passionate about photography, and building something that lets us enjoy that passion even more has been very rewarding; our project solves an issue we have actually encountered.
What we learned
We learned a lot during this project, but the main thing we got experience with was just working with LLMs. Neither of us had much experience with LLMs before this project, but we really got to work hands-on with the future of technology.
To be a bit more specific, one of the things we learned about LLMs was image processing. The typical LLM use case is text parsing and generation, but we learned how LLMs process images, and how image inputs must be prepared to get a proper response.
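A common preparation step, which vision APIs including Gemini generally support, is packaging the image bytes as base64-encoded inline data alongside a MIME type. The request shape below is a generic sketch, not Gemini's exact API:

```python
# Sketch of preparing an image for a multimodal LLM request: many vision
# APIs accept base64-encoded image bytes with a MIME type. The dict shape
# here is illustrative, not any specific provider's schema.
import base64

def encode_image_for_llm(image_bytes: bytes, mime_type: str = "image/jpeg") -> dict:
    """Package raw image bytes as a base64 inline part for a prompt."""
    return {
        "mime_type": mime_type,
        "data": base64.b64encode(image_bytes).decode("ascii"),
    }

part = encode_image_for_llm(b"\xff\xd8\xff\xe0fake-jpeg-bytes")
print(part["mime_type"])  # image/jpeg
```

In practice you would also downscale the image before encoding, since resolution directly drives token cost.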
On top of that, we learned how to build an agent-focused pipeline for an application. Because repeatedly passing the image into the LLM is expensive, we had to figure out how to save the state of the photo in a way that is both LLM-readable and cost-effective.
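The idea above can be sketched as a compact text summary of the edit state that replaces the image in most chat turns. The field names are hypothetical, not Glimpse's actual schema:

```python
# Sketch of a cost-saving state summary: instead of re-sending the image on
# every chat turn, send a small JSON description of the photo and its edits.
# Field names are illustrative, not Glimpse's actual schema.
import json

photo_state = {
    "dimensions": [4000, 3000],
    "edits": {"brightness": -15, "contrast": 10, "saturation": 25},
    "history": ["brightness -10", "brightness -5", "contrast +10", "saturation +25"],
}

# This short string goes into the prompt in place of megabytes of pixels.
state_for_prompt = json.dumps(photo_state)
print(len(state_for_prompt) < 500)  # True: a few hundred bytes vs. a full image
```

The full image only needs to be sent when the model genuinely has to look at the pixels; for follow-up questions about the edits, the summary is enough.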
What's next for Glimpse
We hope to keep developing Glimpse and make editing more accessible for all. The main change we want to make is converting Glimpse from a lightweight web application into a true, powerful photo editing application that users can download. That would enable more computationally expensive operations, like vignettes, which can be very slow to run in the browser. Beyond that, we want to make Glimpse scalable to many users and keep improving the UI.