Water Detection

Short Description: An AI-powered web app that analyzes images of water and predicts whether the water looks clean or contaminated using machine learning-based visual feature analysis.

Team Members

Name: Samarah Hiba

Role: Full-stack developer + ML engineer

Work: Built the training pipeline, feature extraction, model deployment, and web interface.

Problem Statement

Access to clean water is a global concern, but even in everyday environments—such as dorms, hotels, or shared living spaces—people often rely on visual judgment to determine whether water is safe to drink. This approach is unreliable because many contaminants are not visible.

This project was directly inspired by a personal experience: living in a dorm, I witnessed my roommate become sick due to poor water quality. That moment highlighted how easy it is to overlook potential risks in environments we assume are safe. It became clear that there is a need for a simple, accessible tool that can help individuals quickly assess water quality in casual, real-world settings—without requiring specialized equipment.

AI Usage Explanation / Technical Details

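Water Detection AI classifies water images with a machine learning pipeline: visual feature extraction, feature scaling, and logistic regression, served through a Flask API.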

Data Processing

Labeled clean and dirty water images are converted into numeric visual features using color, brightness, texture, saturation, and thumbnail-based representations, then passed into an sklearn pipeline with StandardScaler and LogisticRegression.

Code Segments

Labeled clean and dirty water images are loaded from separate folders and converted into numeric visual features using a custom feature extraction function:

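from pathlib import Path

X, y = [], []
for label, folder in [(1, 'Clean-samples'), (0, 'Dirty-samples')]:  # 1 = clean, 0 = dirty
    folder_path = Path(folder)  # assumes the folders sit in the working directory
    for img_path in sorted(folder_path.glob('*.jpg')):
        X.append(extract_features(img_path))
        y.append(label)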

Feature Engineering

Each image is transformed into features capturing color, brightness, texture, saturation, and spatial layout:

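import numpy as np
from PIL import Image
img = Image.open(image_path).convert('RGB').resize((64, 64))  # fixed-size RGB
arr = np.asarray(img, dtype=np.float32) / 255.0  # scale pixels to [0, 1]
gray = arr.mean(axis=2)  # per-pixel brightness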
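The snippet above shows only the loading step. As a rough sketch of how the five feature groups (color, brightness, texture, saturation, spatial layout) might be computed from `arr`, something like the following could work. The function name `features_from_array`, the 8x8 thumbnail size, and the specific statistics are assumptions for illustration, not the project's actual `extract_features`.

```python
import numpy as np


def features_from_array(arr: np.ndarray, thumb: int = 8) -> np.ndarray:
    """Turn an (H, W, 3) RGB array scaled to [0, 1] into a flat feature vector.

    Hypothetical stand-in for extract_features; the statistics are illustrative.
    """
    arr = np.asarray(arr, dtype=np.float32)
    gray = arr.mean(axis=2)  # per-pixel brightness

    # Color: mean and standard deviation of each RGB channel (6 values).
    color = np.concatenate([arr.mean(axis=(0, 1)), arr.std(axis=(0, 1))])

    # Brightness: overall mean and spread of the grayscale image (2 values).
    brightness = np.array([gray.mean(), gray.std()])

    # Texture: mean absolute difference between neighboring pixels (2 values).
    texture = np.array([
        np.abs(np.diff(gray, axis=0)).mean(),
        np.abs(np.diff(gray, axis=1)).mean(),
    ])

    # Saturation, HSV-style: (max - min) / max, defined as 0 for black pixels.
    cmax, cmin = arr.max(axis=2), arr.min(axis=2)
    sat = np.where(cmax > 0, (cmax - cmin) / np.maximum(cmax, 1e-8), 0.0)
    saturation = np.array([sat.mean(), sat.std()])

    # Spatial layout: block-averaged grayscale thumbnail, flattened.
    h, w = gray.shape
    bh, bw = h // thumb, w // thumb
    thumbnail = (
        gray[: bh * thumb, : bw * thumb]
        .reshape(thumb, bh, thumb, bw)
        .mean(axis=(1, 3))
        .ravel()
    )

    parts = [color, brightness, texture, saturation, thumbnail]
    return np.concatenate(parts).astype(np.float32)
```

With a 64x64 input and the default thumbnail size, this yields 6 + 2 + 2 + 2 + 64 = 76 features per image.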

Model Training

Features are passed into a scikit-learn pipeline that standardizes them and then fits a logistic regression classifier:

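from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
model = Pipeline([
    ('scaler', StandardScaler()),  # zero mean, unit variance per feature
    ('clf', LogisticRegression(max_iter=5000)),
])
model.fit(X_train, y_train)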

Model: Logistic Regression

Preprocessing: StandardScaler

Output: Probability of clean vs. contaminated water
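The training snippet assumes `X_train` and `y_train` already exist. A minimal sketch of how the split and fit might fit together is shown here, using synthetic features in place of real image data. The 80/20 split, `stratify`, and `random_state` values are assumptions, not settings confirmed by the project.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for extracted features: 40 "dirty" (0) and 40 "clean" (1)
# samples with 10 features each. Real code would use X and y built from images.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (40, 10)), rng.normal(3.0, 1.0, (40, 10))])
y = np.array([0] * 40 + [1] * 40)

# Hold out 20% for evaluation; stratify keeps the class balance in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

model = Pipeline([
    ('scaler', StandardScaler()),
    ('clf', LogisticRegression(max_iter=5000)),
])
model.fit(X_train, y_train)

# Columns of predict_proba follow model.classes_, so column 1 is "clean".
proba = model.predict_proba(X_test)
print(model.classes_, proba.shape)
```

Because the scaler lives inside the pipeline, it is fit on the training split only, which avoids leaking test-set statistics into training.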

Evaluation

Model performance is evaluated and saved:

metrics = {
   'accuracy': accuracy_score(...),
   'confusion_matrix': ...,
   'classification_report': ...
}
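The metric calls above are elided in the original. As a hedged illustration of what such a dictionary could look like with scikit-learn's metric functions, here is a sketch on toy labels. The `y_test` and `y_pred` values are made up for the example, and serializing to JSON is an assumption about how the metrics are saved.

```python
import json

from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Toy labels standing in for a held-out set (1 = clean, 0 = dirty).
y_test = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1]

metrics = {
    'accuracy': accuracy_score(y_test, y_pred),
    # Rows are true classes, columns are predicted classes, ordered as in labels.
    'confusion_matrix': confusion_matrix(y_test, y_pred, labels=[0, 1]).tolist(),
    'classification_report': classification_report(
        y_test, y_pred, labels=[0, 1], target_names=['dirty', 'clean'],
        output_dict=True,
    ),
}

# default=float converts any remaining NumPy scalars so the dict serializes.
metrics_json = json.dumps(metrics, indent=2, default=float)
print(metrics_json)
```

Looking at the confusion matrix alongside accuracy shows which kind of mistake the model makes; for a water screening tool, dirty water predicted as clean is the more costly error.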

Real-Time Prediction

The deployed system processes new images and returns predictions:

clean_prob = model.predict_proba(features)[0, 1]
label = 'clean' if clean_prob >= 0.5 else 'dirty'

Integrated into a Flask API

Supports both UI interaction and backend inference
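To make the Flask integration concrete, here is a rough sketch of what a prediction endpoint might look like. The `/predict` route, the `image` form field, and the `predict_clean_probability` helper are all hypothetical; the helper returns mean brightness as a placeholder so the sketch runs without a trained model, whereas the real app would call its feature extractor and `model.predict_proba`.

```python
import io

import numpy as np
from flask import Flask, jsonify, request
from PIL import Image

app = Flask(__name__)


def predict_clean_probability(arr: np.ndarray) -> float:
    """Placeholder for extract_features + model.predict_proba(...)[0, 1].

    Returns mean brightness so the sketch runs without a trained model.
    """
    return float(arr.mean())


@app.route('/predict', methods=['POST'])  # route name is an assumption
def predict():
    upload = request.files.get('image')  # form field name is an assumption
    if upload is None:
        return jsonify(error='no image uploaded'), 400

    # Same preprocessing as in training: RGB, 64x64, scaled to [0, 1].
    img = Image.open(io.BytesIO(upload.read())).convert('RGB').resize((64, 64))
    arr = np.asarray(img, dtype=np.float32) / 255.0

    clean_prob = predict_clean_probability(arr)
    label = 'clean' if clean_prob >= 0.5 else 'dirty'
    return jsonify(label=label, clean_probability=clean_prob)
```

Both the web UI and any other client could then post an image as multipart form data to the same endpoint and read the JSON response.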

Summary of how solution uses prediction/data

This system predicts the probability that water is clean based on learned visual patterns from labeled data.

Impact & Why It Matters

Water Detection demonstrates how AI can make environmental awareness tools more accessible.

Potential impact: Rapid screening tool for water quality awareness.

Foundation: A starting point for more advanced environmental screening tools.

Source Code

GitHub Repository: link

Accomplishments

Built a fully functional ML pipeline from scratch (not rule-based).

Successfully deployed a working web + API system.

What we learned

For this visual task, a learned model generalized better than hand-written rules.

Small datasets can give misleadingly high accuracy.

Simple models can still be powerful when paired with the right feature representation.

Evaluation must be interpreted carefully.

What's next for Water Detection AI

Find a better dataset.

Add real-time camera/video input.

Built With
