Inspiration

I've always been fascinated by the interconnected nature of the business world, but it's really difficult to visualize how companies are interlinked through partnerships, investments, and acquisitions. While working with financial data, I realized that most tools show companies in isolation, even though the real insights come from understanding their relationships. I wanted to create something that could make these complex networks visual and accessible to anyone, not just people with SQL knowledge. So my lovely girlfriend and I tackled this challenge using skills we picked up at our university. She's interested in databases and data analysis, while I'm interested in AI and ML. Our skills align perfectly for this project!

What it does

AlphaNet transforms the way people explore and visualize S&P 500 companies and their relationships. It combines AI-powered natural language processing with interactive graph visualization. Users can ask questions in plain English like "Show me all companies connected to Apple" or "What are the relationships between tech companies and semiconductor companies?" Our custom Cypher AI Agent automatically translates these natural language queries into Cypher, Neo4j's query language, and executes them in real time against our Neo4j graph database. It then displays the results as an interactive network visualization using Cytoscape.js. The platform provides detailed company information panels, dynamic graph controls for zooming, and real-time query translation, making complex corporate relationship data accessible to everyone from students researching companies to analysts exploring market connections.

How we built it

We built AlphaNet using Next.js as the frontend framework, with React components handling the user interface and state management. The core architecture revolves around three main API endpoints: one for translating natural language to Cypher queries using the Perplexity Sonar API, another for executing those queries against our Neo4j graph database, and a third for fetching detailed company information from our Supabase database.

For the data, we started with a public CSV file of the current S&P 500 companies. Then I created a research script using Python, pandas, and LangGraph to iterate through each row of the CSV file and gather the company, ticker, sector, and industry. These fields are passed into our agentic workflow, which uses Perplexity's Sonar API to research the company and find all of its connections, including investments, acquisitions, partnerships, etc. With a bit of prompt engineering, we configured the agent to finish its research by outputting a JSON structure of the nodes, which were then passed through a Neo4j API and added to the database. This resulted in nearly 4,000 nodes! The rest of the research information was inserted into our Supabase database, which is used to display information when a node is selected.

The frontend features a custom GraphViewer component built with Cytoscape.js for rendering interactive network graphs. The backend handles the heavy lifting: parsing user queries, communicating with external APIs, transforming Neo4j results into the format needed for visualization, and managing database connections. We chose Neo4j because it was something new for both of us and because a graph database is perfect for representing and querying complex relationships.
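To give a rough idea of the "transforming Neo4j results into the format needed for visualization" step, here is a minimal sketch. It is written in Python to match the research script (the real backend runs in Next.js), and the function name `to_cytoscape_elements` and the input row shape (`source`, `type`, `target`) are assumptions made up for this example, not AlphaNet's actual code. The output follows the Cytoscape.js elements JSON convention, where every element carries a `data` object and edges reference node ids through `source` and `target`.

```python
from typing import Any


def to_cytoscape_elements(rows: list[dict[str, str]]) -> list[dict[str, Any]]:
    """Turn relationship rows into Cytoscape.js element dicts.

    Each row is assumed (hypothetically) to look like:
        {"source": "Apple", "type": "PARTNERS_WITH", "target": "TSMC"}
    """
    nodes: dict[str, dict[str, Any]] = {}
    edges: list[dict[str, Any]] = []

    for row in rows:
        # Register each endpoint once; the company name doubles as the node id.
        for name in (row["source"], row["target"]):
            nodes.setdefault(
                name, {"group": "nodes", "data": {"id": name, "label": name}}
            )

        # Build an edge id from both endpoints and the relationship type.
        edge_id = f'{row["source"]}-{row["type"]}-{row["target"]}'
        edges.append(
            {
                "group": "edges",
                "data": {
                    "id": edge_id,
                    "source": row["source"],
                    "target": row["target"],
                    "label": row["type"],
                },
            }
        )

    # Nodes come first, in the order they were first seen, followed by edges.
    return list(nodes.values()) + edges
```

The resulting list can be serialized to JSON and handed to the Cytoscape.js `elements` option on the frontend.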

Challenges we ran into

One of the biggest challenges we're still working through is graph and computation optimization. When a user opens the website on a browser or device with limited computing power, the page may become unresponsive and take a while to load. (For example, I get a 'Page Unresponsive' warning on my laptop, but on my desktop gaming PC it loads perfectly fine.) For days we tried to find a sweet spot for loading the initial graph, but every solution we came up with sacrificed a key feature of the system. I didn't want to stop loading all 4,000 nodes at the same time, since I believe that's what makes the project special: being able to see all those nodes and the giant web of connections. Another major hurdle was prompt engineering for the AI translation. Getting the system to consistently generate valid, safe Cypher queries from natural language took multiple iterations. I had to construct safeguards to prevent potentially dangerous database operations while still allowing flexible querying.
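As an illustration of what such a safeguard could look like, here is a minimal sketch that rejects generated Cypher containing write or admin clauses. The function name `is_read_only_cypher` and the keyword denylist are assumptions for this example, not AlphaNet's actual implementation.

```python
import re

# Hypothetical denylist of Cypher clauses that modify or administer the database.
WRITE_KEYWORDS = (
    "CREATE", "MERGE", "DELETE", "DETACH", "SET",
    "REMOVE", "DROP", "FOREACH", "LOAD CSV", "CALL",
)

# Matches single- or double-quoted string literals, including escaped quotes.
_STRING_LITERAL = re.compile(
    r"'(?:\\.|[^'\\])*'" + "|" + r'"(?:\\.|[^"\\])*"'
)


def is_read_only_cypher(query: str) -> bool:
    """Return True if the query contains none of the denylisted keywords."""
    # Blank out string literals so a value like 'Create Labs' is not flagged.
    stripped = _STRING_LITERAL.sub("''", query)

    for keyword in WRITE_KEYWORDS:
        # Allow any run of whitespace between the words of a keyword.
        pattern = r"\b" + r"\s+".join(re.escape(w) for w in keyword.split()) + r"\b"
        if re.search(pattern, stripped, flags=re.IGNORECASE):
            return False
    return True
```

A check like this is deliberately conservative (it would also reject a harmless query that reads a property literally named `set`), and it is only a first line of defense. Running the generated query in a read-only transaction or under a read-only database user would likely be a stronger guard.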

Accomplishments that we're proud of

We're extremely proud of the final result. Even though it's built for powerful computers at the moment, it's an amazing sight to see the entire idea in front of us and in production! We're also proud of creating a seamless user experience where someone can go from a natural language question to a fully interactive graph visualization in seconds. Another aspect we're proud of is our ability to work together; this was our first serious project as a team, combining our skills. Most importantly, we built something that's actually useful and solves a real problem: we made corporate relationship data accessible and explorable for people who aren't data engineers!

What we learned

This project taught us a lot about integrating AI APIs effectively and the importance of robust error handling in complex data pipelines. We learned how to work with unfamiliar graph databases and got much better at debugging API integration issues. We also gained even more experience in prompt engineering (which is very useful for my career haha). Working with Cytoscape.js taught us about data visualization libraries and how to handle dynamic, interactive content in React. Perhaps most importantly, we learned how to balance technical complexity with user experience. The backend is doing a lot of sophisticated work, but the frontend presents it in a way that feels simple and intuitive.

What's next for AlphaNet

Here are some next steps that we've already mapped out!

  • Focus heavily on optimizing graph loading.
  • Improve our existing dynamic schema extraction so the agent has better context about the graph.
  • Expand our dataset beyond S&P 500 companies to include more global companies, private companies, and additional relationship types like supply chain connections and competitive relationships.
  • Add a popular queries feature to see what other users have been querying.
  • Populate more research-based content for nodes that were created but are not S&P 500 companies.
