Inspiration
Our original idea was to cluster the embedded data. Further research led us to vector databases, which seemed like a much more elegant improvement on that approach.
What it does
LogUp is a specialized chatbot that holds a natural conversation grounded in information from large text files, while avoiding issues with token limits.
How we built it
We used the LangChain framework to embed our log files and store them in a vector database. For each prompt, we retrieve only the relevant parts of the data and pass those as input to our LLM (GPT-4 through the API, and Llama running locally). For the user interface, we built a full-stack web app with React on the frontend and Flask on the backend.
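To illustrate the retrieve-then-prompt idea described above, here is a minimal sketch in plain Python. It is not our LangChain code: the word-count "embedding", the `ToyVectorStore` class, the helper names, and the sample log lines are all made up for illustration, standing in for a real embedding model and vector database.

```python
import math
import re
from collections import Counter


def chunk_lines(text, lines_per_chunk=3):
    """Split raw log text into chunks of a few non-empty lines each."""
    lines = [line for line in text.splitlines() if line.strip()]
    return [
        "\n".join(lines[i:i + lines_per_chunk])
        for i in range(0, len(lines), lines_per_chunk)
    ]


def embed(text):
    """Toy embedding: lowercase word counts (assumption: stands in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self, chunks):
        # Embed every chunk once, at indexing time.
        self.entries = [(chunk, embed(chunk)) for chunk in chunks]

    def search(self, query, k=2):
        """Return the k chunks most similar to the query."""
        query_vec = embed(query)
        ranked = sorted(
            self.entries, key=lambda e: cosine(query_vec, e[1]), reverse=True
        )
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Put only the retrieved chunks into the prompt, not the whole file."""
    context = "\n---\n".join(context_chunks)
    return f"Answer using only this log excerpt:\n{context}\n\nQuestion: {question}"


# Made-up sample log lines, for illustration only.
SAMPLE_LOG = """\
Jan 10 10:00:01 host sshd[101]: Accepted password for alice
Jan 10 10:00:05 host cron[202]: job backup started
Jan 10 10:02:13 host kernel: disk sda1 error reading sector
Jan 10 10:02:14 host kernel: disk sda1 remounted read-only
"""

store = ToyVectorStore(chunk_lines(SAMPLE_LOG, lines_per_chunk=1))
top = store.search("Which disk had an error?", k=1)
prompt = build_prompt("Which disk had an error?", top)
print(prompt)
```

Because only the top-ranked chunks reach the prompt, the size of the LLM input depends on `k` and the chunk size rather than on the size of the log file, which is how this pattern sidesteps token limits.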
Challenges we ran into
We had a lot of problems building our LangChain pipeline; at times the model had trouble reading its own conversation history. We also initially had very long wait times and sometimes got irrelevant data back, but once we switched to GPT-4 for prompting we got much more consistent results.
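One common way to keep conversation history readable for a model is to pass back only the most recent turns that fit a size budget. The sketch below shows that idea in plain Python. It is an assumption for illustration, not a description of how we fixed the issue: `rough_token_count`, `trim_history`, the word-based size estimate, and the sample turns are all hypothetical.

```python
def rough_token_count(text):
    """Very rough proxy (assumption): one token per whitespace-separated word."""
    return len(text.split())


def trim_history(history, budget):
    """Keep the most recent turns whose combined rough size fits in `budget`."""
    kept = []
    used = 0
    # Walk backwards so the newest turns are kept first.
    for role, text in reversed(history):
        cost = rough_token_count(text)
        if used + cost > budget:
            break
        kept.append((role, text))
        used += cost
    kept.reverse()  # restore chronological order
    return kept


def format_history(history):
    """Render turns as 'role: text' lines for inclusion in a prompt."""
    return "\n".join(f"{role}: {text}" for role, text in history)


# Made-up conversation, for illustration only.
history = [
    ("user", "What happened at 10:02?"),
    ("assistant", "The kernel reported a read error on disk sda1."),
    ("user", "Was the disk remounted afterwards?"),
]

recent = trim_history(history, budget=14)
print(format_history(recent))
```

A real pipeline would count tokens with the model's own tokenizer rather than by words, and might summarize older turns instead of dropping them.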
Accomplishments that we're proud of
We created a chatbot that is consistent and relatively accurate.
What we learned
This project gave us deeper insight into the world of LLMs and the chance to use new technology to implement useful solutions.
What's next for LogUp
We would like to improve our LangChain pipeline to make replies more correct, give the LLM additional data about Linux logs, add better error handling in the UI, build a file management system, and support multiple log files.