Inspiration

As people increasingly hand over repetitive tasks like data cleaning to AI, writing clear prompts matters more and more. But not everyone has the technical skills or experience to do that effectively. Why not build a pre-configured AI agent that lets anyone, regardless of technical background, clean data interactively using plain language?

What it does

The AI data cleaning agent lets users select a dataset, explore it, and interact with it through natural language queries. It can analyze data, detect and address missing values, manage outliers, and finally save a clean version of the dataset. By combining automation with an intuitive interface, the tool makes data preparation faster and more accessible.
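As a rough illustration of the two cleaning steps mentioned above, here is a minimal sketch using only the Python standard library. It is not the project's actual code: the function names `fill_missing` and `clip_outliers`, the choice of median imputation, and the 1.5 × IQR fence are all assumptions made for the example.

```python
from statistics import median, quantiles


def fill_missing(values):
    """Replace None entries with the median of the observed values.

    Assumption: at least one value is present; an all-missing column
    would need separate handling.
    """
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def clip_outliers(values, k=1.5):
    """Clip values lying outside [Q1 - k*IQR, Q3 + k*IQR] to the nearest fence."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


# Hypothetical column with one missing entry and one implausible value.
ages = [23, 25, None, 24, 26, 120, 22]
cleaned = clip_outliers(fill_missing(ages))
```

Clipping keeps every row in the dataset, which is convenient when the cleaned file has to line up with the original. Dropping outlier rows instead would be an equally reasonable choice.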

How we built it

Starting from a workshop prototype, I extended the project by creating a user-friendly interface with Gradio and adding new functionality to handle outliers. The system integrates AI-powered analysis with backend functions that process datasets step by step, enabling a smooth end-to-end workflow.
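To give a sense of how a backend function can be exposed through Gradio, here is a minimal sketch. The backend function `summarize_missing` and the helper `build_interface` are hypothetical names invented for this example, and the real project's interface is more involved. Gradio is imported inside `build_interface` so the backend function can be used and tested without it.

```python
import csv
import io


def summarize_missing(csv_text: str) -> str:
    """Count empty cells per column in CSV text (hypothetical backend step)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return "No data rows found."
    lines = []
    for column in rows[0].keys():
        # Treat empty or whitespace-only cells (and absent fields) as missing.
        missing = sum(1 for row in rows if not (row[column] or "").strip())
        lines.append(f"{column}: {missing} missing of {len(rows)}")
    return "\n".join(lines)


def build_interface():
    """Wrap the backend function in a simple Gradio text-in, text-out UI."""
    import gradio as gr  # imported lazily; requires `pip install gradio`

    return gr.Interface(
        fn=summarize_missing,
        inputs=gr.Textbox(lines=10, label="Paste CSV data"),
        outputs=gr.Textbox(label="Missing-value summary"),
        title="Data cleaning demo",
    )
```

Calling `build_interface().launch()` starts a local web server and serves the form in the browser.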

Challenges we ran into

The biggest challenge was figuring out how to properly guide the AI. It was crucial to design prompts that helped the model understand which resources (functions) were available and how to use them appropriately. Getting this balance right required a lot of iteration and testing.
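One way to tell the model which functions exist is to generate the prompt from a registry of those functions, so the description the model sees never drifts from the code. The sketch below illustrates that idea under stated assumptions: the `tool` decorator, the JSON reply format, and the two example functions are hypothetical, and no particular model API is involved.

```python
import json

TOOLS = {}  # maps tool name -> {"fn": callable, "description": str}


def tool(description):
    """Register a function so it can be listed in the prompt and dispatched."""
    def register(fn):
        TOOLS[fn.__name__] = {"fn": fn, "description": description}
        return fn
    return register


@tool("Count the missing (None) entries in the current column.")
def count_missing(column):
    return sum(v is None for v in column)


@tool("Return the current column with missing (None) entries removed.")
def drop_missing(column):
    return [v for v in column if v is not None]


def build_system_prompt():
    """List every registered tool so the model knows what it may call."""
    lines = [
        "You are a data cleaning assistant.",
        'Reply only with JSON of the form {"tool": "<name>"}, '
        "choosing one of these tools:",
    ]
    for name, spec in TOOLS.items():
        lines.append(f"- {name}: {spec['description']}")
    return "\n".join(lines)


def dispatch(model_reply, column):
    """Run the tool named in the model's JSON reply, rejecting unknown names."""
    request = json.loads(model_reply)
    name = request.get("tool")
    if name not in TOOLS:
        raise ValueError(f"Unknown tool requested: {name!r}")
    return TOOLS[name]["fn"](column)
```

Rejecting unknown tool names in `dispatch` gives a clear error when the model invents a function, which makes prompt iteration easier to debug.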

Accomplishments that we're proud of

This was my first time experimenting with both AI and Gradio, and I really enjoyed the process of turning backend code into a usable website. Seeing something I built become an interactive tool that others can use in their daily work was especially rewarding.

What's next for data_cleaning_agent

  • Improve computational efficiency (some processes can take a while to run).
  • Enhance the user interface (large datasets can still be difficult to view and navigate).
  • Add more advanced features, such as considering correlations between variables and offering richer data exploration options.
