Source Code Analysis using LLM

Problem Statement

We find it challenging to analyze GitHub code from a business perspective. For example, given a link to an existing GitHub repository, it is difficult to understand what each script does unless someone explains the code to us. A GenAI-based solution reduces that human effort.

What it does

We built an LLM-based solution (Mistral Large 2 on Snowflake Cortex) that analyzes code on GitHub and provides a descriptive summary and insights. It can identify bugs, suggest optimizations, and explain code functionality in natural language. By analyzing commit histories, it can also track changes, detect patterns, and predict potential vulnerabilities.

How we built it

We built the solution as an LLM-based chatbot that analyzes code on GitHub and provides a descriptive summary and insights. The high-level flow is:
1. Pass the GitHub link to the code.
2. Use the LLM to analyze the code and return a descriptive summary of what the code does.
Tools:
Connect to GitHub
Data ingestion, preprocessing, and cleansing
Embedding: Snowflake Arctic Embed
Load into Snowflake Cortex service/vector store
LLM (Mistral Large 2)
TruLens for guardrails
UI front end for user queries (Streamlit)
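The flow above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `Chunk`, `chunk_source`, `build_prompt`, and `summarize` are hypothetical, the fixed-size line chunking stands in for whatever preprocessing the project uses, and the LLM is passed in as a plain callable so the sketch runs without any Snowflake connection.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Chunk:
    """One window of source lines taken from a single file (hypothetical type)."""
    path: str
    start_line: int  # 1-based line number of the first line in the window
    text: str


def chunk_source(path: str, source: str, max_lines: int = 40) -> List[Chunk]:
    """Split one file into consecutive windows of at most max_lines lines."""
    lines = source.splitlines()
    return [
        Chunk(path, i + 1, "\n".join(lines[i:i + max_lines]))
        for i in range(0, len(lines), max_lines)
    ]


def build_prompt(chunks: Iterable[Chunk], question: str) -> str:
    """Assemble the code context and the user question into one prompt."""
    context = "\n\n".join(
        f"# File: {c.path} (from line {c.start_line})\n{c.text}" for c in chunks
    )
    return (
        "You are explaining source code to a business audience.\n"
        f"Code:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer in plain language, using only the code shown."
    )


def summarize(chunks: Iterable[Chunk], question: str,
              complete: Callable[[str], str]) -> str:
    """Send the prompt to an LLM; `complete` is an assumed prompt-to-text callable."""
    return complete(build_prompt(chunks, question))


if __name__ == "__main__":
    demo = "def total(prices):\n    return sum(prices)\n"
    chunks = chunk_source("billing.py", demo)
    # Stand-in for a real model call, used only to show the wiring.
    fake_llm = lambda prompt: f"(model reply to {len(prompt)} characters)"
    print(summarize(chunks, "What does this script do?", fake_llm))
```

In a deployment like the one described, `complete` could wrap a call to the Snowflake Cortex `COMPLETE` function with the `mistral-large2` model, and the chunks passed to `summarize` would come from the vector store lookup rather than directly from the file.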

Challenges we ran into

1. Integrating the TruLens dashboard into Streamlit proved difficult. We came up with a customized solution to render the TruLens dashboard.

Accomplishments that we're proud of

We developed a plug-and-play package that analyzes GitHub source code and returns a descriptive summary.

What we learned

A GenAI-based solution can substantially reduce the human effort needed to understand technical material such as source code. LLM hallucinations can be reduced using guardrails such as TruLens.

What's next for LLM Analysis

Providing LLM-based solutions that help developers and business analysts carry out their tasks more efficiently.
