Inspiration

It began with a moment of frustration. We asked an image generator to create “a doctor and a nurse.” Every time, the doctor appeared male and the nurse female. No matter how we rephrased the prompt, the pattern held. That small detail stuck with us. As women studying AI, we realized how often technology mirrors the stereotypes we fight in real life. That’s when we decided to build something that could expose and explain how these systems really “see” us.

What it does

EqualEyes analyzes how AI models interpret people across gender, race, emotion, and age. It takes in images or datasets, runs them through multiple open-source vision models, and visualizes where bias appears: whether a model detects more men than women, misreads emotions differently across faces, or skews toward one demographic.

How we built it

We started with the Flickr30K dataset for real-world images, then built a full pipeline in Python using PyTorch, OpenCV, and Hugging Face models for gender, emotion, and age classification. We combined the model outputs into a bias-scoring framework that quantifies imbalance and generates dashboards for visual insight. Everything runs locally: no paid APIs, no external services. That keeps it transparent and accessible.
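
To make the pipeline concrete, here is a minimal sketch of the multi-model inference step, assuming one Hugging Face image-classification checkpoint per attribute. The model IDs are left as parameters rather than naming our exact checkpoints:

```python
# Minimal sketch of the multi-model inference step (not our exact code).
# model_ids maps an attribute name to a Hugging Face model id, e.g.
# {"gender": "...", "emotion": "...", "age": "..."} -- checkpoints assumed.
from PIL import Image
from transformers import pipeline

def build_classifiers(model_ids):
    """One image-classification pipeline per demographic attribute."""
    return {attr: pipeline("image-classification", model=mid)
            for attr, mid in model_ids.items()}

def classify_image(path, classifiers):
    """Run every attribute classifier on one image and keep the top label."""
    image = Image.open(path).convert("RGB")
    results = {}
    for attribute, clf in classifiers.items():
        preds = clf(image)  # list of {"label": ..., "score": ...} dicts
        top = max(preds, key=lambda p: p["score"])
        results[attribute] = top["label"]
    return results
```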

Challenges we ran into

Everything broke at least once. Dependencies like facenet-pytorch crashed our environment. Datasets we had planned on were deprecated. We had to rebuild the architecture around lightweight, offline-first models. The harder part was designing a fair metric for bias: how do you measure something so contextual? We ended up defining our own intersectional bias score across attributes.
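
To give a flavor of the metric, here is a simplified, self-contained version of the idea: score how far the joint distribution of predicted attribute groups deviates from parity. This is an illustrative sketch, not our exact formula:

```python
# Simplified sketch of an intersectional bias score: how far the joint
# distribution of predicted attribute groups deviates from parity.
# Illustrative only; not our exact production formula.
from collections import Counter

def bias_score(predictions, attributes):
    """Return 0.0 when all observed intersectional groups are equally
    represented; larger values indicate stronger skew."""
    # Count each observed combination of attribute labels.
    combos = Counter(tuple(p[a] for a in attributes) for p in predictions)
    total = sum(combos.values())
    # Parity baseline: every observed group equally represented.
    expected = 1.0 / len(combos)
    # Total variation distance from the uniform distribution.
    return 0.5 * sum(abs(count / total - expected) for count in combos.values())

preds = [
    {"gender": "male", "age": "adult"},
    {"gender": "male", "age": "adult"},
    {"gender": "male", "age": "senior"},
    {"gender": "female", "age": "adult"},
]
print(bias_score(preds, ["gender", "age"]))  # ≈ 0.17: skewed toward male adults
```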

Accomplishments that we're proud of

We turned a broken prototype into a working fairness platform that produces measurable, interpretable results. The system processes 200+ real images, detects bias patterns, and outputs a complete HTML dashboard. But what we’re most proud of is that we built something meaningful, not just functional, while learning to question our own assumptions along the way.
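
The dashboard export itself is the simple part; a sketch of the approach with Plotly (the scores below are placeholder values, not our measured results):

```python
# Sketch of the dashboard export: render bias summaries to a standalone
# HTML file with Plotly. Column names and scores are illustrative.
import pandas as pd
import plotly.express as px

summary = pd.DataFrame({
    "attribute": ["gender", "emotion", "age"],
    "bias_score": [0.17, 0.31, 0.22],  # placeholder values, not real results
})
fig = px.bar(summary, x="attribute", y="bias_score",
             title="Per-attribute bias scores")
fig.write_html("dashboard.html")  # self-contained HTML, no server needed
```

Because `write_html` produces a self-contained file, the report opens anywhere without a server, which fits the offline-first design.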

What we learned

We learned that bias isn’t just in data; it’s in design choices, defaults, and how we interpret “normal.” We also learned how to debug dependencies under pressure, resolve model conflicts, and collaborate without losing sight of the human reason we started this project in the first place.

What's next for EqualEyes: Auditing Vision Models for Fairness

We plan to test EqualEyes directly on AI systems used in hiring and content moderation, expanding it to detect intersectional bias in real model outputs. We also want to add real-time video monitoring and publish our bias metrics as open standards so developers can evaluate fairness before deployment.

Built With

  • automated
  • bash
  • batch-processing
  • compliance
  • confidence-analysis
  • csv
  • cuda
  • factory-design
  • flickr30k
  • git
  • gpu-acceleration
  • html/css/javascript
  • hugging-face-datasets-and-model-hub
  • interactive-dashboards
  • json
  • linux
  • logging
  • matplotlib
  • modular-oop-pipeline
  • multi-model-inference
  • multi-threading
  • numpy
  • opencv
  • pandas
  • pillow
  • pip
  • plotly
  • psutil
  • python
  • pytorch
  • real-time-bias-detection
  • requirements.txt
  • scalable-architecture
  • scikit-learn
  • scipy
  • seaborn
  • statistical-testing
  • streamlit
  • tqdm
  • transformers
  • yaml