BIAS BUDDY

Inspiration

AI is taking the world by storm. It's being used everywhere in our lives, whether you know it or not: recruiting new hires, deciding whether someone qualifies for a bank loan, and identifying criminal suspects through facial recognition are just a few of its applications. However, these applications all have one thing in common: they are at risk of bias. For example, an AI trained on previous hires for a software engineering job may favor male candidates over female candidates simply because its training data reflects a male-dominated field. So how do we fight this bias? That's where our BIAS BUDDY comes in!

What it does

How our BIAS BUDDY works is simple: just upload a .csv file to our innovative website, and it'll go through your data and point out any potential biases your dataset or model may have. Our integrated LLM will also help explain what type of bias it may be and how you can resolve it.
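As a rough illustration of what "pointing out a potential bias" in an uploaded file can mean, here is a minimal sketch that compares positive-outcome rates across groups in CSV data. This is a simpler check than the SHAP-based analysis BIAS BUDDY actually runs, and the column names (`gender`, `hired`), the sample rows, and the helper `positive_rate_by_group` are all made up for the example.

```python
import csv
import io
from collections import defaultdict


def positive_rate_by_group(csv_text, group_col, outcome_col, positive="1"):
    """Return the share of positive outcomes for each value of group_col."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        group = row[group_col]
        totals[group] += 1
        if row[outcome_col] == positive:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


# Hypothetical data standing in for an uploaded .csv file.
sample = """gender,education,hired
male,bachelors,1
male,masters,1
male,bachelors,0
female,masters,1
female,bachelors,0
female,masters,0
"""

rates = positive_rate_by_group(sample, "gender", "hired")
# A large gap between groups is a hint worth investigating, not proof of bias.
gap = max(rates.values()) - min(rates.values())
print(rates, round(gap, 2))
```

A gap like this only flags a difference in outcomes; deciding whether it is unfair still needs context about the data.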

How we built it

We split our group into two teams: front-end and back-end. Our back-end developers (Chloe and Michelle) built BIAS BUDDY's model in Python, training it on datasets with factors such as gender and education level that can influence a final prediction, like whether a person receives a higher salary or should receive a bank loan. Once the model had been fit on the training data and predicted the outcomes of our test data with fairly high accuracy, we used SHAP values to determine which features contributed most to its predictions. This lets us figure out whether a dataset is biased towards a certain attribute, and we can then use the LLM to communicate this bias to the user.

Our front-end developers (Wafeeqa and Zara) connected the back-end scripts to the beautiful website you see before you. Made with a combination of HTML, CSS and Flask, our website lets the user upload a .csv file from their computer, and our integrated Python script then points out any potential biases.
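To give a feel for what SHAP values measure, here is a minimal sketch that computes exact Shapley values for a tiny hand-written scoring function. It is not our actual pipeline: in practice a SHAP library estimates these values for a trained model. The function `toy_model`, its weights, the feature names, and the all-zero baseline are assumptions made up for the example.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance.

    Features outside a coalition are replaced by their baseline value.
    Cost grows exponentially with the number of features, so this is
    only practical for toy examples.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        row = {f: (instance[f] if f in coalition else baseline[f]) for f in names}
        return predict(row)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


# Hypothetical model whose score leans heavily on gender.
def toy_model(row):
    return (0.5 * row["gender"]
            + 0.3 * row["education"]
            + 0.2 * row["gender"] * row["education"])


instance = {"gender": 1, "education": 1}
baseline = {"gender": 0, "education": 0}
phi = shapley_values(toy_model, instance, baseline)
print(phi)
```

The contributions add up to the difference between the prediction for the instance and the prediction for the baseline. A sensitive feature such as gender receiving a large share is the kind of signal that gets flagged as potential bias.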

Challenges we ran into

We ran into many challenges as we pursued this idea. For example, we weren't experts with Kaggle to begin with, so we had to figure out how modelling and datasets work. Along with that:

We had to determine how to find bias in datasets

We had to figure out how to present said bias

We had to learn how to incorporate LLMs

We had to bridge front-end and back-end aspects that were written in different languages (HTML vs. Python)

Regardless of the many challenges we faced, we were still able to pull through and create our BIAS BUDDY!

Accomplishments that we're proud of

We started off knowing basically nothing, and we were still able to create an entire AI system that pairs an LLM with a random forest prediction model! We're also really proud of our website and its clean, easy-to-use UI.

What we learned

We learned how to model Kaggle datasets, how to use SHAP values to interpret predictions, how to build a website, and most importantly, how important it is for data to be unbiased.

What's next for BIAS BUDDY

Using newer Llama 2 models, accepting more file and dataset types, incorporating a visual component (e.g. graphs displaying data biases), engineering better prompts (because explanations tend to be very long), and using larger cloud-based models or an API to make our model more powerful.
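On the prompt-engineering point, one possible way to keep explanations short is to pass the LLM only the top few features and state a sentence limit in the prompt. The sketch below is hypothetical: `build_prompt`, its parameters, the wording, and the importance scores are illustrative assumptions, not our current prompt.

```python
def build_prompt(importances, outcome, top_k=3, max_sentences=3):
    """Build a compact prompt from feature importances (hypothetical format)."""
    # Rank features by absolute importance and keep only the top few.
    ranked = sorted(importances.items(), key=lambda kv: abs(kv[1]), reverse=True)
    lines = [f"- {name}: {score:.2f}" for name, score in ranked[:top_k]]
    return (
        f"A model predicting '{outcome}' relies most on these features:\n"
        + "\n".join(lines)
        + f"\nIn at most {max_sentences} sentences, say which features may "
        "indicate bias and suggest one way to reduce it."
    )


# Made-up importance scores for illustration.
prompt = build_prompt(
    {"gender": 0.6, "education": 0.4, "age": 0.05, "city": 0.01},
    outcome="hired",
)
print(prompt)
```

Limiting both the input (top features only) and the requested output length gives the model less room to ramble, though it cannot guarantee a short answer.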