Inspiration

Working with databases often requires knowledge of a query language: SQL for relational databases, Cypher for graph databases such as Neo4j. Many users, including analysts, business teams, and students, struggle to write correct queries. We wanted to create a natural language interface where anyone can ask questions about their data and get accurate answers without learning SQL or Cypher.

What it does

The Text-to-SQL & Text-to-Cypher Web App lets users:

- Connect to MySQL or Neo4j databases directly from the Streamlit interface.
- Ask questions in natural language, such as “Show me all employees hired last year” or “Find all nodes connected to Alice”.
- Automatically translate these questions into SQL or Cypher queries, depending on the selected database.
- Execute the queries, fetch results, and present the answers back in clear natural language.
- Maintain chat history so queries can build on context from earlier conversations.

Essentially, it works as an AI-powered database assistant that bridges human language with structured database queries.

How we built it

Frontend/UI: Built using Streamlit with a chat-like interface. Sidebar options let the user choose between SQL (MySQL) and Neo4j and enter connection details.

Backend/LLM Integration: Used LangChain with ChatGroq (Gemma2-9b-it model) for natural language processing.

Created two translation chains:

- NL → SQL: Converts English questions into valid MySQL queries.
- NL → Cypher: Converts English questions into valid Neo4j Cypher queries.

Added strict prompting rules so that generated queries are syntactically correct and use only the available schema.
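As a rough illustration of what strict prompting rules for the NL → SQL chain could look like, here is a minimal sketch in plain Python. The rule wording, the function names `build_sql_prompt` and `extract_query`, and the prompt layout are all assumptions made for this example, not the project's actual prompt.

```python
# Hypothetical rules; the project's real prompt wording is not shown in this writeup.
SQL_RULES = (
    "Return exactly one MySQL query and nothing else.",
    "Use only the tables and columns listed in the schema.",
    "Do not wrap the query in Markdown or add commentary.",
)

FENCE = "`" * 3  # Markdown code fence marker


def build_sql_prompt(schema: str, question: str) -> str:
    """Assemble the text sent to the model for NL -> SQL translation."""
    rules = "\n".join(f"{i}. {rule}" for i, rule in enumerate(SQL_RULES, start=1))
    return (
        "You translate questions into MySQL queries.\n\n"
        f"Schema:\n{schema}\n\n"
        f"Rules:\n{rules}\n\n"
        f"Question: {question}\n"
        "SQL:"
    )


def extract_query(llm_output: str) -> str:
    """Strip a surrounding Markdown fence if the model adds one despite the rules."""
    text = llm_output.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # drop the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence line
        text = "\n".join(lines).strip()
    return text
```

The same pattern would apply to the Cypher chain with a different rule set. Cleaning the model output in code, rather than relying on the rules alone, is a defensive choice: the model may still add formatting it was told to avoid.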

Database Layer: SQL is handled through mysql-connector and LangChain’s SQLDatabase. Neo4j is handled with a custom Neo4jDBConnection class (using Neo4j’s official driver). Schema inspection functions guide the LLM in generating valid queries.
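To make the schema inspection step concrete, the sketch below turns column metadata into a compact text summary that can be placed in a prompt. It assumes the rows come from MySQL's `information_schema.columns` view; `SCHEMA_QUERY` and `format_schema` are hypothetical names for this example, and the project's own inspection functions may work differently.

```python
from collections import defaultdict
from typing import Iterable, Tuple

# Assumed metadata query for MySQL. Run it with a cursor, passing the
# database name as the parameter, then hand the fetched rows to format_schema().
SCHEMA_QUERY = (
    "SELECT table_name, column_name, data_type "
    "FROM information_schema.columns "
    "WHERE table_schema = %s "
    "ORDER BY table_name, ordinal_position"
)


def format_schema(rows: Iterable[Tuple[str, str, str]]) -> str:
    """Group (table, column, type) rows into one 'table(col type, ...)' line per table."""
    tables = defaultdict(list)
    for table, column, data_type in rows:
        tables[table].append(f"{column} {data_type}")
    # Dicts keep insertion order, so tables appear in the order the rows arrived.
    return "\n".join(f"{table}({', '.join(cols)})" for table, cols in tables.items())
```

For example, the rows `("employees", "id", "int")` and `("employees", "name", "varchar")` produce the single line `employees(id int, name varchar)`.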

Answer Generation: After running queries, results are passed back to the LLM with context. The model explains results in simple natural language.
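One way to pass results back to the model with context is to bundle the question, the generated query, and the returned rows into a single prompt. The sketch below is an assumption about how that could be done; the row cap `MAX_ROWS` and the function name `build_answer_prompt` are hypothetical and not taken from the project.

```python
from typing import Sequence

MAX_ROWS = 20  # assumed cap so large result sets do not flood the prompt


def build_answer_prompt(
    question: str, query: str, rows: Sequence[tuple], max_rows: int = MAX_ROWS
) -> str:
    """Build the prompt that asks the model to explain query results in plain language."""
    shown = list(rows[:max_rows])
    omitted = len(rows) - len(shown)
    lines = [repr(row) for row in shown]
    if omitted > 0:
        lines.append(f"... ({omitted} more row(s) not shown)")
    result_text = "\n".join(lines) if lines else "(no rows returned)"
    return (
        "Answer the user's question using only the query results below.\n\n"
        f"Question: {question}\n"
        f"Query: {query}\n"
        f"Results:\n{result_text}\n\n"
        "Answer in simple natural language:"
    )
```

Stating explicitly when no rows came back gives the model something to report instead of leaving it to guess at an answer.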

Session State Management: Used st.session_state to track chat history, database type, and connection state across interactions.
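A common pattern for this kind of state tracking is to fill in default values once, at the top of the script, so that Streamlit reruns do not reset them. The sketch below writes that pattern against any dict-like object so it can run without Streamlit installed; in the app, the argument would be `st.session_state`. The key names and default values are assumptions for illustration.

```python
import copy
from typing import Any, MutableMapping

# Hypothetical keys and defaults; the project's actual keys are not listed in this writeup.
DEFAULTS = {
    "chat_history": [],        # list of {"role": ..., "content": ...} messages
    "db_type": "SQL (MySQL)",  # currently selected backend
    "connected": False,        # whether a connection has been established
}


def init_state(state: MutableMapping[str, Any]) -> None:
    """Set each default only if the key is missing, so existing values survive reruns."""
    for key, default in DEFAULTS.items():
        if key not in state:
            # Deep-copy so sessions never share the same mutable default object.
            state[key] = copy.deepcopy(default)
```

In a Streamlit script this would be called as `init_state(st.session_state)` before any widget reads those keys.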

Challenges we ran into

- Ensuring query accuracy: The LLM sometimes hallucinated columns or functions. Fixed by injecting schema dynamically.
- Managing two database types (SQL & Neo4j) in the same app while keeping the UI simple.
- Handling errors gracefully (e.g., wrong credentials, invalid queries).
- Avoiding chat history overflow by trimming older messages.
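Two of these points, trimming older messages and handling errors gracefully, can be sketched in a few lines. This is only an illustration under assumptions: the limit of 10 messages, the function names `trim_history` and `run_safely`, and the error message wording are hypothetical.

```python
from typing import Any, Callable, List, Optional, Tuple


def trim_history(history: List[dict], max_messages: int = 10) -> List[dict]:
    """Keep only the most recent messages so the prompt stays a manageable size."""
    if max_messages <= 0:
        return []  # guard: history[-0:] would return the whole list
    return history[-max_messages:]


def run_safely(
    run_query: Callable[[str], Any], query: str
) -> Tuple[Optional[Any], Optional[str]]:
    """Run a query and return (result, None), or (None, message) if it fails."""
    try:
        return run_query(query), None
    except Exception as exc:  # broad on purpose: driver errors vary by backend
        return None, f"Could not run the query: {exc}"
```

Returning the error as a value lets the chat interface show a readable message for bad credentials or invalid queries instead of a stack trace.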

Accomplishments that we're proud of

- Built a single unified app that supports both relational and graph databases.
- Achieved smooth end-to-end flow: natural language → query → execution → human-readable answer.
- Designed a modular structure (SQL chain, Cypher chain, response generators) that can be extended to other databases.

What we learned

- Prompt engineering is crucial for getting LLMs to produce syntactically correct queries.
- Database schema context dramatically improves LLM performance.
- Streamlit is excellent for rapidly prototyping conversational apps.
- Supporting both SQL and Neo4j helped us understand the differences between relational and graph-based querying.

What's next for Text-to-SQL & Text-to-Cypher Web App

- Add support for more databases (PostgreSQL, MongoDB).
- Improve schema visualization (ER diagrams for SQL, graph visualizations for Neo4j).
- Add role-based access control to restrict modification queries (e.g., DELETE, UPDATE).
- Deploy as a web service/API for integration with enterprise tools.
- Fine-tune the model or use RAG (Retrieval-Augmented Generation) with schema docs to further reduce query errors.
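As one possible building block for restricting modification queries, a generated query could be checked for write keywords before it is executed. The sketch below is an assumption about how such a check might look, not a planned design; the keyword list and the name `is_read_only` are hypothetical. It is deliberately conservative, so a read query that mentions one of these words inside a string literal is also rejected.

```python
import re

# Assumed list of SQL and Cypher keywords that modify data or schema.
WRITE_KEYWORDS = (
    "INSERT", "UPDATE", "DELETE", "DROP", "ALTER", "TRUNCATE",
    "CREATE", "MERGE", "SET", "REMOVE", "DETACH",
)

# Whole-word match, so column names like "updated_at" or "created_at" pass.
_WRITE_PATTERN = re.compile(r"\b(" + "|".join(WRITE_KEYWORDS) + r")\b", re.IGNORECASE)


def is_read_only(query: str) -> bool:
    """Return True if the query contains none of the write keywords."""
    return _WRITE_PATTERN.search(query) is None
```

A keyword check like this does not replace database-level permissions. Connecting with a read-only database user remains the more reliable safeguard, and a role-based layer could decide which users are allowed to bypass the check.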
