Inspiration

VDBs are highly important to the artificial intelligence world, but exploring and visualizing high dimensional graphs such as VDBs has been historically difficult. Existing tools struggle to find the right balance between interactivity and visualization. We wanted to create a tool that puts human interaction first in the equation, making VDB exploration easy and accessible for any type of given dataset.

What it does

VectorViz is a node-based visualizer for Vector Databases enhanced by conversational AI. We've created 5 databases for users to search through: 2.2 million research papers about COVID, 70K games, 33K words, the transcription of a 3hr video, and 20 million pages of Wikipedia. There are 3 main features of our project: Search, Explore, and Q/A. When searching, users can input keywords to search for the most relevant sources based on their chosen database. While exploring, users can easily interact with similar nodes to dynamically find new information and save desired nodes. For Q/A, users can use their saved nodes as context for asking questions to the OpenAI API. With these features, there is a countless number of potential use cases for our tool.

How we built it

The front-end was built with Svelte and a healthy does of HTML/CSS/JS. The back-end is a combination of Google's BERT model, the OpenAI API, and Python with Flask. The databases are hosted / modified in PostGreSQL, Google Cloud, and FAISS. We used HuggingFace's BERT model for text-to-text retrieval. EvaDB and custom python scripts were used to parse datasets and convert them into VDBs.

Challenges we ran into

As first-time users, setting up Google Cloud in the middle of the hackathon to run properly was challenging. But with the help given during office hours, we were able to do it. The embeddings we had in the middle of the hackathon stopped providing relevant results. It took a while for us to pinpoint where the issue lied. Other than those two big challenges there were the usual coterie of bugs that comes with developing at a break-neck pace with little sleep.

Accomplishments that we're proud of

This was a huge project that we weren't sure if we would be able to finish in time, but we ended up accomplishing a large number of our stretch goals. We were able to learn new things about AI and build a new take on vector DBs.

What we learned

2 of our members never worked with AI and now have experienced first-hand how AI development works. We learned in-depth details of how retrieval is handled in AI pipelines and key limitations of current retrieval models.

What's next for VectorViz

We would love to expand the project with some of the following features: Multi-Modal database searching, displaying visual breadcrumbs of the user's node path, a keyword filter, and more.

Built With

Share this project:

Updates