Water Detection
Short Description: An AI-powered web app that analyzes images of water and predicts whether the water looks clean or contaminated using machine learning-based visual feature analysis.
Team Members
Name: Samarah Hiba
Role: Full-stack developer + ML engineer
Work: Built the training pipeline, feature extraction, model deployment, and web interface.
Problem Statement
Access to clean water is a global concern, but even in everyday environments—such as dorms, hotels, or shared living spaces—people often rely on visual judgment to determine whether water is safe to drink. This approach is unreliable because many contaminants are not visible.
This project was directly inspired by a personal experience: living in a dorm, I witnessed my roommate become sick due to poor water quality. That moment highlighted how easy it is to overlook potential risks in environments we assume are safe. It became clear that there is a need for a simple, accessible tool that can help individuals quickly assess water quality in casual, real-world settings—without requiring specialized equipment.
AI Usage Explanation / Technical Details
Water Detection AI classifies each image with a machine learning pipeline: visual feature extraction, feature scaling, and a logistic regression classifier.
Data Processing
Labeled clean and dirty water images are converted into numeric visual features using color, brightness, texture, saturation, and thumbnail-based representations, then passed into an sklearn pipeline with StandardScaler and LogisticRegression.
Code Segments
Labeled clean and dirty water images are loaded from separate folders and converted into numeric visual features using a custom feature extraction function:
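from pathlib import Path
X, y = [], []
for label, folder in [(1, 'Clean-samples'), (0, 'Dirty-samples')]:
    folder_path = Path(folder)  # assumes the folders sit in the working directory
    for img_path in sorted(folder_path.glob('*.jpg')):
        X.append(extract_features(img_path))
        y.append(label)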
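The training step fits on `X_train` and `y_train`, so the loaded lists have to be split at some point. A minimal sketch of one way to do that, assuming a stratified hold-out split. The 80/20 ratio, the `random_state`, and the synthetic stand-in data are illustrative assumptions, not the project's actual settings:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data: 20 fake feature vectors with balanced labels (illustrative only).
rng = np.random.default_rng(0)
X = rng.random((20, 8)).tolist()
y = [1] * 10 + [0] * 10  # 1 = clean, 0 = dirty

X_arr = np.asarray(X, dtype=np.float32)
y_arr = np.asarray(y)

# stratify keeps the clean/dirty ratio the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X_arr, y_arr, test_size=0.2, stratify=y_arr, random_state=42
)
```

Stratifying matters most on small datasets, where a purely random split can leave one class nearly absent from the test set.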
Feature Engineering
Each image is transformed into features capturing color, brightness, texture, saturation, and spatial layout:
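import numpy as np
from PIL import Image
# Load, force RGB, and resize to 64x64 so every image yields the same number of features.
img = Image.open(image_path).convert('RGB').resize((64, 64))
arr = np.asarray(img, dtype=np.float32) / 255.0  # pixel values scaled to [0, 1]
gray = arr.mean(axis=2)  # grayscale as the per-pixel channel average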
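The excerpt stops before the features themselves are computed. As a rough sketch of how the five feature groups named above could be derived from `arr`, here is a hypothetical `features_from_array` helper. The function name, the specific statistics, and the 8x8 thumbnail size are all assumptions for illustration, not the project's actual `extract_features`:

```python
import numpy as np

def features_from_array(arr: np.ndarray, thumb_size: int = 8) -> np.ndarray:
    """Hypothetical feature vector from an (H, W, 3) float array in [0, 1]."""
    gray = arr.mean(axis=2)

    # Color: per-channel mean and standard deviation (6 values).
    color = np.concatenate([arr.mean(axis=(0, 1)), arr.std(axis=(0, 1))])

    # Brightness: mean and spread of the grayscale image (2 values).
    brightness = np.array([gray.mean(), gray.std()])

    # Saturation: HSV-style (max - min) / max per pixel, guarded against division by zero.
    mx, mn = arr.max(axis=2), arr.min(axis=2)
    sat = np.where(mx > 0, (mx - mn) / np.maximum(mx, 1e-8), 0.0)
    saturation = np.array([sat.mean(), sat.std()])

    # Texture: mean absolute difference between neighboring pixels (2 values).
    texture = np.array([
        np.abs(np.diff(gray, axis=0)).mean(),
        np.abs(np.diff(gray, axis=1)).mean(),
    ])

    # Spatial layout: block-averaged grayscale thumbnail, flattened.
    h, w = gray.shape
    block_h, block_w = h // thumb_size, w // thumb_size
    thumb = gray[: block_h * thumb_size, : block_w * thumb_size]
    thumb = thumb.reshape(thumb_size, block_h, thumb_size, block_w).mean(axis=(1, 3))

    return np.concatenate(
        [color, brightness, saturation, texture, thumb.ravel()]
    ).astype(np.float32)
```

For a 64x64 input and the default thumbnail size, this sketch yields a 76-value vector: 12 summary statistics plus 64 thumbnail cells.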
Model Training
Features are passed into a scikit-learn pipeline that standardizes them and then fits a logistic regression classifier:
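from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
model = Pipeline([
    ('scaler', StandardScaler()),
    ('clf', LogisticRegression(max_iter=5000))
])
model.fit(X_train, y_train)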
Model: Logistic Regression
Preprocessing: StandardScaler
Output: Probability of clean vs. contaminated water
Evaluation
Model performance is evaluated and saved:
metrics = {
    'accuracy': accuracy_score(...),
    'confusion_matrix': ...,
    'classification_report': ...
}
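The excerpt above elides the arguments. A self-contained sketch of how such a dictionary might be filled in and serialized, run on synthetic data rather than the project's images. The cluster data, split settings, and JSON serialization step are assumptions made for illustration:

```python
import json
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for extracted features: two well-separated clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 5)), rng.normal(3.0, 1.0, (50, 5))])
y = np.array([0] * 50 + [1] * 50)  # 0 = dirty, 1 = clean

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

model = Pipeline([
    ('scaler', StandardScaler()),
    ('clf', LogisticRegression(max_iter=5000)),
])
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

metrics = {
    'accuracy': float(accuracy_score(y_test, y_pred)),
    # labels=[0, 1] fixes the row/column order: dirty first, clean second.
    'confusion_matrix': confusion_matrix(y_test, y_pred, labels=[0, 1]).tolist(),
    # Plain-text report, so the whole dictionary stays JSON-serializable.
    'classification_report': classification_report(
        y_test, y_pred, labels=[0, 1], target_names=['dirty', 'clean'],
        zero_division=0,
    ),
}

# Serialized form; this string could be written to a file of your choosing.
serialized = json.dumps(metrics, indent=2)
```

The confusion matrix is worth reading alongside accuracy: for a drinking-water screen, dirty samples predicted as clean are the costlier mistake.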
Real-Time Prediction
The deployed system processes new images and returns predictions:
clean_prob = model.predict_proba(features)[0, 1]
label = 'clean' if clean_prob >= 0.5 else 'dirty'
Integrated into a Flask API
Supports both UI interaction and backend inference
Summary of how solution uses prediction/data
This system predicts the probability that water is clean based on learned visual patterns from labeled data.
Impact & Why It Matters
Water Detection demonstrates how AI can make environmental awareness tools more accessible.
Potential impact: A rapid screening tool that raises awareness of water quality.
Foundation: A starting point for more advanced environmental screening tools.
Source Code
GitHub Repository: link
Accomplishments
Built a fully functional ML pipeline from scratch (not rule-based).
Successfully deployed a working web + API system.
What we learned
For visual tasks like this one, a learned model can generalize better than hand-written rules.
Simple models can be powerful when paired with the right feature representation.
Evaluation must be interpreted carefully, because small datasets can give misleadingly high accuracy.
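One common way to check whether a score on a small dataset is stable is k-fold cross-validation, which reports the spread across folds instead of a single number. A minimal sketch on synthetic data; the dataset, fold count, and seeds are illustrative assumptions and this is not part of the project's current code:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Small synthetic dataset standing in for the extracted image features.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (20, 5)), rng.normal(1.0, 1.0, (20, 5))])
y = np.array([0] * 20 + [1] * 20)  # 0 = dirty, 1 = clean

model = Pipeline([
    ('scaler', StandardScaler()),
    ('clf', LogisticRegression(max_iter=5000)),
])

# Five stratified folds: every sample is used for testing exactly once.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring='accuracy')

print(f'fold accuracies: {np.round(scores, 2)}')
print(f'mean = {scores.mean():.2f}, std = {scores.std():.2f}')
```

A large standard deviation across folds is a sign that any single train/test accuracy should not be taken at face value.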
What's next for Water Detection AI
Find a better dataset.
Add real-time camera/video input.