AUTOMATIC EDA is an exploratory data analysis (EDA) dashboard built entirely in Python. It replicates and modernizes the core functionality of tools like ydata-profiling inside an intuitive Streamlit web interface.

The complete workflow covers:

  1. Automatic Column Type Detection
  2. Deep Statistical Profiling
  3. Interactive Correlation Analysis
  4. Missing Value Detection

All without writing a single line of Python code!
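
As a rough illustration of the first step, column type detection can be approximated with pandas dtype checks. This is a minimal sketch, not the app's actual logic: `detect_column_types` and its `max_categories` cutoff are hypothetical names chosen for this example, and the rule that separates categorical from text columns is an assumption.

```python
import pandas as pd


def detect_column_types(df: pd.DataFrame, max_categories: int = 20) -> dict:
    """Label each column as boolean, numeric, datetime, categorical, or text.

    Hypothetical helper: the cutoff between 'categorical' and 'text'
    is an assumption made for this sketch.
    """
    types = {}
    for name in df.columns:
        col = df[name]
        # Check booleans first, because pandas also treats them as numeric.
        if pd.api.types.is_bool_dtype(col):
            types[name] = "boolean"
        elif pd.api.types.is_numeric_dtype(col):
            types[name] = "numeric"
        elif pd.api.types.is_datetime64_any_dtype(col):
            types[name] = "datetime"
        elif col.nunique(dropna=True) <= max_categories:
            types[name] = "categorical"
        else:
            types[name] = "text"
    return types


sample = pd.DataFrame({
    "age": [23, 35, 41, 29],
    "city": ["Pune", "Delhi", "Pune", "Delhi"],
    "signup": pd.to_datetime(["2024-01-01", "2024-02-15", "2024-03-10", "2024-04-05"]),
    "active": [True, False, True, True],
    "note": ["first order", "asked for refund", "repeat buyer", "new lead"],
})
column_types = detect_column_types(sample, max_categories=3)
print(column_types)
```

A cardinality cutoff is only one way to separate categories from free text; a real profiler may also look at string length or the share of unique values.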

Analytical Features

  1. Overview Room: Bird's-eye view of your dataset. See rows, variables, duplicate row counts, and memory size at a glance.
  2. Smart Alerts: Automatically flags missing values, uniform columns, all-zero columns, and highly correlated variables (threshold > 0.5).
  3. Variable Profiling: Detailed per-column dive. Generates histograms for numeric columns, bar charts for categorical columns, and word clouds for text data.
  4. Interactions: Dynamic scatter plot interface to visually inspect the relationship between any two numerical features.
  5. Correlations: Auto-encodes categorical data safely to render readable heatmaps and correlation tables.
  6. Missing Values: Comprehensive missing-data visualization via count bar charts, nullity matrices, and nullity heatmaps using the missingno library.

Workflow Architecture
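
The overview numbers and some of the alerts described in the feature list can be sketched with plain pandas. This is an illustrative outline under stated assumptions, not the project's code: `overview`, `encode_categoricals`, and `smart_alerts` are hypothetical names, categorical columns are encoded with category codes (one possible reading of "auto-encodes"), and the correlation is Pearson. The uniform-column alert is left out because the source does not define it.

```python
import pandas as pd


def overview(df: pd.DataFrame) -> dict:
    """Dataset-level numbers: rows, variables, duplicate rows, memory."""
    return {
        "rows": len(df),
        "variables": df.shape[1],
        "duplicate_rows": int(df.duplicated().sum()),
        "memory_bytes": int(df.memory_usage(deep=True).sum()),
    }


def encode_categoricals(df: pd.DataFrame) -> pd.DataFrame:
    """Replace non-numeric columns with integer category codes (assumption)."""
    encoded = df.copy()
    for name in encoded.columns:
        if not pd.api.types.is_numeric_dtype(encoded[name]):
            codes = encoded[name].astype("category").cat.codes
            # cat.codes marks missing values as -1; turn them back into NaN.
            encoded[name] = codes.where(codes >= 0)
    return encoded


def smart_alerts(df: pd.DataFrame, corr_threshold: float = 0.5) -> list:
    """Flag missing values, all-zero columns, and highly correlated pairs."""
    alerts = []
    for name in df.columns:
        col = df[name]
        missing = int(col.isna().sum())
        if missing:
            alerts.append(f"{name}: {missing} missing values")
        if pd.api.types.is_numeric_dtype(col) and (col == 0).all():
            alerts.append(f"{name}: all values are zero")
    corr = encode_categoricals(df).corr().abs()
    cols = list(corr.columns)
    for i, left in enumerate(cols):
        for right in cols[i + 1:]:
            value = corr.loc[left, right]
            # NaN (e.g. a constant column) never passes this comparison.
            if value > corr_threshold:
                alerts.append(f"{left} / {right}: correlation {value:.2f}")
    return alerts


sample = pd.DataFrame({
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "weight_kg": [50.0, 58.0, 69.0, 80.0, None],
    "zeros": [0, 0, 0, 0, 0],
    "group": ["a", "b", "a", "b", "a"],
})
stats = overview(sample)
alerts = smart_alerts(sample)
print(stats)
print(alerts)
```

Category codes impose an arbitrary order on unordered categories, so correlations involving encoded columns are best read as a rough signal rather than a precise measure.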

Quickstart Guide

  1. Requirements

Before running the app locally, make sure Python 3.8+ is installed on your system.

Clone this repository

git clone https://github.com/GaurRitika/automatic_eda_platform

Install the dependencies

pip install -r requirements.txt

  2. Run the App

Start your local Streamlit server

streamlit run app.py

  3. Usage

Upload Dataset: Drag and drop any tabular .csv file into the sidebar.
Navigate: Use the sidebar radio buttons to switch between views.
Analyze: Let the interface compute everything from memory usage to data outliers in real time.
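
The Analyze step mentions data outliers without saying how they are found. One common convention is the interquartile range (IQR) rule, shown here as a minimal sketch; the choice of this rule, the `iqr_outliers` name, and the `k = 1.5` multiplier are assumptions for illustration, not something taken from the app.

```python
import pandas as pd


def iqr_outliers(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Return the entries lying more than k * IQR outside the quartiles."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    spread = q3 - q1
    mask = (values < q1 - k * spread) | (values > q3 + k * spread)
    return values[mask]


prices = pd.Series([10, 12, 11, 13, 12, 95], name="price")
outliers = iqr_outliers(prices)
print(outliers.tolist())
```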
