Inspiration

At SCB DataX, our mission is to make data analytics accessible to everyone within the SCBX fintech group. Our goal is to democratize access to valuable insights, streamlining the decision-making process across the entire company.

What it does

We've built a tool that lets anyone extract insights from data by asking questions in plain English.

How we built it

Our solution combines a large language model (LLM) with supporting engineering, designed to maximize the accuracy of results. The system has two key components:

Data Preparation

In this phase, we focus on making your existing data storage comprehensible to the LLM. We achieve this by combining the LLM with a vector database.
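As a rough illustration of this phase, the sketch below indexes short table descriptions so that the most relevant ones can be looked up for a question. Everything in it is an assumption made for illustration: the `SchemaIndex` class, the table names and descriptions, and the toy bag-of-words `embed` function, which stands in for a real embedding model and vector database.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SchemaIndex:
    """Hypothetical in-memory replacement for a vector database."""

    def __init__(self):
        self.entries = []  # (table name, description, vector)

    def add(self, table, description):
        self.entries.append((table, description, embed(description)))

    def search(self, question, k=2):
        """Return the k table descriptions most similar to the question."""
        q = embed(question)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[2]), reverse=True)
        return [(table, desc) for table, desc, _ in ranked[:k]]


# Illustrative metadata only; not the actual SCBX schema.
index = SchemaIndex()
index.add("loans", "loans table: loan amount, interest rate, customer id, approval date")
index.add("branches", "branches table: branch name, province, opening date")
index.add("cards", "cards table: credit card number, card limit, customer id")

top_table = index.search("average interest rate of loans", k=1)[0][0]
```

In a real deployment the descriptions would presumably be written or enriched by the LLM, and the similarity search would be delegated to the vector database.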

Text/Speech to Insight

The LLM analyzes natural-language text or voice input to identify the relevant tables and columns, then automatically generates detailed insights and interactive visualizations without reading your raw data.
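One way to picture this step is a prompt that contains only schema notes and the user's question, never table rows. The sketch below is a minimal version under stated assumptions: `build_prompt` and `estimate_tokens` are hypothetical helpers, the word count is only a crude proxy for a real tokenizer, and the schema notes are invented for illustration.

```python
def estimate_tokens(text):
    """Crude token estimate (whitespace-separated words); an assumption,
    since real tokenizers count differently."""
    return len(text.split())


def build_prompt(question, table_notes, max_tokens=200):
    """Build a text-to-SQL prompt from schema notes only, within a budget.

    table_notes should be ordered most relevant first; notes are added
    until the next one would exceed the remaining budget.
    """
    header = "Write one SQL query that answers the question. Use only these tables."
    footer = f"Question: {question}\nSQL:"
    budget = max_tokens - estimate_tokens(header) - estimate_tokens(footer)

    kept = []
    for note in table_notes:
        cost = estimate_tokens(note)
        if cost > budget:
            break  # stop before exceeding the budget
        kept.append(note)
        budget -= cost
    return "\n".join([header, *kept, footer])


# Illustrative schema notes; no raw data rows are included.
notes = [
    "loans(loan_id, amount, interest_rate): one row per loan",
    "branches(branch_id, name, province): one row per branch",
]
prompt = build_prompt("What is the average interest rate?", notes, max_tokens=30)
```

With the small budget used here, only the first note fits, which shows how ranking tables by relevance before building the prompt keeps the most useful context.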

Challenges we ran into

  • Working within the LLM's token limits.
  • Naively using an LLM for text-to-SQL is not good enough; it often generates SQL that cannot be executed.
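To illustrate the second challenge, generated SQL can be checked for executability before it ever touches real data. The sketch below is one possible approach, not necessarily the one used in this project: it compiles the query with SQLite's `EXPLAIN` against an empty in-memory copy of the schema. The `check_executable` helper and the `loans` table are assumptions for illustration.

```python
import sqlite3


def check_executable(ddl_statements, sql):
    """Return (ok, error_message) for a generated SQL query.

    The query is compiled with EXPLAIN against an empty in-memory
    database built from the schema, so no real data is read.
    """
    conn = sqlite3.connect(":memory:")
    try:
        for ddl in ddl_statements:
            conn.execute(ddl)
        conn.execute("EXPLAIN " + sql)  # raises if the query cannot be compiled
        return True, ""
    except sqlite3.Error as exc:
        return False, str(exc)
    finally:
        conn.close()


# Illustrative schema only.
schema = ["CREATE TABLE loans (loan_id INTEGER, amount REAL, interest_rate REAL)"]

ok_good, _ = check_executable(schema, "SELECT AVG(interest_rate) FROM loans")
ok_bad, error = check_executable(schema, "SELECT AVG(rate) FROM loans")
```

The error message from a failed check could be fed back to the LLM as a hint for a second attempt.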

Accomplishments that we're proud of

  • Nearly 100% of the SQL queries generated by our solution are executable.
  • The solution is scalable.

What we learned

  • An LLM is a powerful tool, but it needs a systematic solution built around it.
  • Metadata matters: a data schema alone may not fully explain what the data means.

What's next for Chat with Your Data

  • Improve the quality of the generated results beyond the current ~70% execution accuracy on the Spider dataset.
  • Integrate with automation pipeline tools such as Zapier.
  • Implement an "ask back for more information" feature to further enhance user interactions.
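For context on the execution accuracy figure above: the metric is commonly understood as the fraction of generated queries whose results match those of the reference query. The sketch below is a simplified version that ignores row order. It is not the official Spider evaluation script, and the `execution_accuracy` helper, table, and rows are invented for illustration.

```python
import sqlite3


def run(conn, sql):
    """Execute a query and return its rows in a canonical order,
    or None if the query fails."""
    try:
        return sorted(conn.execute(sql).fetchall(), key=repr)
    except sqlite3.Error:
        return None


def execution_accuracy(conn, pairs):
    """Fraction of (predicted, gold) query pairs with matching results."""
    hits = 0
    for predicted, gold in pairs:
        got = run(conn, predicted)
        if got is not None and got == run(conn, gold):
            hits += 1
    return hits / len(pairs) if pairs else 0.0


# Toy database with made-up rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loans (loan_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO loans VALUES (?, ?)", [(1, 100.0), (2, 250.0)])

pairs = [
    # Different SQL text, same result: counts as correct.
    ("SELECT COUNT(loan_id) FROM loans", "SELECT COUNT(*) FROM loans"),
    # Wrong aggregate: executes, but the result differs.
    ("SELECT MIN(amount) FROM loans", "SELECT MAX(amount) FROM loans"),
]
accuracy = execution_accuracy(conn, pairs)
```

Note that a query can be executable and still be counted as wrong here, which is why execution accuracy sits below the share of executable queries.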
