Inspiration
Our project focuses on women's financial advocacy. The gender pay gap is the difference in wages and salaries between men and women working the same job. This difference is a financial restriction on women, and MindTheGAP aims to help close it by giving women the tools and knowledge to ask for fair pay.
What it does
MindTheGAP has two main features. The first is an occupation-based calculator, where someone can search for the typical salary for an occupation and see a visualization of its gender pay gap. The second is an AI prediction tool: a random forest model trained on US census data in Python and saved with the pickle library, which predicts someone's wage from demographics such as age, race, and sex.
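As a rough illustration of the number behind the pay gap visualization, here is a minimal sketch. It assumes the common definition of the unadjusted gap (the difference between men's and women's median pay, as a share of men's median pay). The function name and the salary figures are hypothetical and are not taken from our code or data.

```python
def pay_gap_percent(median_men: float, median_women: float) -> float:
    """Unadjusted pay gap as a percentage of men's median pay."""
    if median_men <= 0:
        raise ValueError("median_men must be positive")
    return (median_men - median_women) / median_men * 100.0


# Hypothetical figures for illustration only, not real census values.
example_gap = pay_gap_percent(median_men=60000.0, median_women=51000.0)
print(f"Pay gap: {example_gap:.1f}%")
```

A positive result means women in that occupation earn less than men; zero means no gap.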
How we built it
To make our training data useful, we filtered the .csv file with one-time Python scripts. These removed unemployed people, removed large outliers, filtered out non-full-time workers, and simplified the race categories so a machine learning model could interpret them better. We also encoded certain categories, such as education level, in numerical order to help the AI find patterns in the data. From there, we imported the data into a Google Colab notebook with our AI training code, and the resulting model is used to predict salaries on our website.
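The filtering steps above could look roughly like the sketch below. It is not our actual script: the column names, the 35-hour full-time cutoff, the wage cap, the race grouping, and the education ordering are all assumptions made for illustration, and the sample rows are invented.

```python
import csv
import io

# Hypothetical ordinal encoding: a higher number means more schooling.
EDUCATION_ORDER = {
    "No diploma": 0,
    "High school": 1,
    "Bachelor's": 2,
    "Master's": 3,
    "Doctorate": 4,
}

# Hypothetical simplification: keep a few labels, merge the rest into "Other".
KEPT_RACE_LABELS = {"White", "Black", "Asian"}

FULL_TIME_HOURS = 35  # assumed cutoff for full-time work
MAX_WAGE = 500_000    # assumed cap used to drop large outliers


def clean_rows(rows):
    """Return the rows that pass every filter, with simplified fields."""
    cleaned = []
    for row in rows:
        if row["employment_status"] != "Employed":
            continue  # drop unemployed people
        if int(row["hours_per_week"]) < FULL_TIME_HOURS:
            continue  # drop non-full-time workers
        wage = float(row["wage"])
        if wage <= 0 or wage > MAX_WAGE:
            continue  # drop invalid wages and large outliers
        race = row["race"] if row["race"] in KEPT_RACE_LABELS else "Other"
        cleaned.append({
            "age": int(row["age"]),
            "sex": row["sex"],
            "race": race,
            "education": EDUCATION_ORDER[row["education"]],
            "wage": wage,
        })
    return cleaned


# Invented sample rows standing in for the census .csv file.
SAMPLE_CSV = """employment_status,hours_per_week,wage,race,education,age,sex
Employed,40,52000,White,Bachelor's,34,Female
Unemployed,0,0,Black,High school,29,Male
Employed,20,18000,Asian,Master's,41,Female
Employed,45,9000000,White,Doctorate,50,Male
Employed,40,61000,Pacific Islander,High school,38,Male
"""

cleaned = clean_rows(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for row in cleaned:
    print(row)
```

Only the first and last sample rows survive: the others are dropped for being unemployed, part-time, or an extreme wage outlier.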
Challenges we ran into
One of the biggest challenges we ran into was the accuracy of the AI. During initial training, it had an R^2 value of 0.1, but removing major outliers before training increased the accuracy.
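For context, R^2 measures how much of the variation in wages the model explains, so a value of 0.1 means roughly 10% of the variance. The sketch below computes it with the standard formula, 1 minus the residual sum of squares divided by the total sum of squares. The function name and the wage values are hypothetical and are not results from our model.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    # Squared error of the model's predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Squared error of always predicting the mean wage.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Invented wages for illustration only.
actual_wages = [40000.0, 50000.0, 60000.0, 70000.0]
predicted_wages = [42000.0, 48000.0, 61000.0, 69000.0]

score = r_squared(actual_wages, predicted_wages)
print(f"R^2 = {score:.2f}")
```

A model that always predicts the mean wage scores 0, and a perfect model scores 1. A single extreme wage in the data can distort what the model learns, which is consistent with outlier removal improving our score.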
Accomplishments that we're proud of
We're very proud to have gotten as far as we have in this project. A lot of it was in areas the team members had never worked in before.
What we learned
All of the members of this team had little to no experience in web development or AI training. We learned HTML, CSS, Revel, Flask, and AI model training.
What's next for Mind The GAP
The AI's training data is still missing valuable data points, the most valuable being total years of experience or total years with a company. The US census data we had access to did not track this, yet someone with one year of experience will earn much less than someone with 20 years of experience. Creating a training set that includes this information will be imperative to improving our project's accuracy.