Inspiration
We were motivated to build this project for two reasons. First, it was a project that everyone can find some utility in. Farmers, pilots, and even surfers rely on weather data for safety, efficiency, and fun.
We also wanted to test our skills in data science and full-stack web development. This project was a way to unite our desire to build something useful while learning at the same time.
What it does
Rain Predictor tries to predict whether it will rain tomorrow! The user can input ~20 different meteorological measurements and then receive a prediction. The predictions are produced by a machine learning model that we trained on Australian weather data. Before any hyperparameter tuning, the model achieved a prediction accuracy of 80%.
How we built it
The front-end was built with React.js. React renders a web-form with all the relevant meteorological data.
When the user submits the form, a POST request is made to our custom API endpoint, built with FastAPI. The FastAPI server hosts the endpoint, parses each request and re-formats it as a Pandas DataFrame, classifies the input using our model, and then returns the result to the user. We chose FastAPI because it made building out endpoints easy and served as a translation layer between the front-end and our trained model.
We processed our data using Pandas and trained our model using scikit-learn. During data pre-processing we split the data into training and test sets, removed outliers, filled in missing values, encoded nominal features as numerical features, and normalized the features to a fixed range. We chose scikit-learn's linear regression algorithm to train this model.
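The pre-processing steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not our training script: the data is synthetic, the column names are hypothetical, the outlier rule (1.5 times the interquartile range) and the imputation strategies are choices made for the example, and the regression output is thresholded at 0.5 to get a yes/no answer.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

NUMERIC = ["MinTemp", "Humidity3pm"]  # hypothetical numeric features
NOMINAL = ["WindGustDir"]             # hypothetical nominal feature


def make_synthetic_data(n: int = 400, seed: int = 0) -> pd.DataFrame:
    """Generate made-up rows standing in for the Australian weather data."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "MinTemp": rng.normal(12, 5, n),
        "Humidity3pm": rng.uniform(10, 100, n),
        "WindGustDir": rng.choice(["N", "E", "S", "W"], n),
    })
    # Synthetic label: humid afternoons tend to be followed by rain.
    df["RainTomorrow"] = (df["Humidity3pm"] + rng.normal(0, 10, n) > 65).astype(int)
    # Blank out some values to imitate missing measurements.
    df.loc[rng.choice(n, 20, replace=False), "MinTemp"] = np.nan
    return df


def drop_outliers(df: pd.DataFrame, columns: list) -> pd.DataFrame:
    """Keep rows within 1.5 IQR of the quartiles; missing values are left for imputation."""
    keep = pd.Series(True, index=df.index)
    for col in columns:
        q1, q3 = df[col].quantile([0.25, 0.75])
        spread = 1.5 * (q3 - q1)
        keep &= df[col].between(q1 - spread, q3 + spread) | df[col].isna()
    return df[keep]


def build_pipeline() -> Pipeline:
    """Impute, encode, and scale the features, then fit a linear regression."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", MinMaxScaler()),  # normalizes each feature to [0, 1]
    ])
    nominal = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    features = ColumnTransformer([("num", numeric, NUMERIC), ("nom", nominal, NOMINAL)])
    return Pipeline([("features", features), ("model", LinearRegression())])


data = make_synthetic_data()
train, test = train_test_split(data, test_size=0.2, random_state=0)
train = drop_outliers(train, NUMERIC)  # only the training rows are filtered

pipeline = build_pipeline()
pipeline.fit(train[NUMERIC + NOMINAL], train["RainTomorrow"])

# Threshold the regression output at 0.5 to turn it into a rain / no-rain prediction.
predicted = pipeline.predict(test[NUMERIC + NOMINAL]) >= 0.5
accuracy = float((predicted == test["RainTomorrow"].to_numpy().astype(bool)).mean())
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

Keeping the imputers, encoder, and scaler inside one pipeline means the same fitted transformations are applied at prediction time. scikit-learn's `LogisticRegression` is the more common choice for a yes/no target and could be dropped into the same pipeline.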
After pre-processing and training, the model was pickled so that the server can load it at any time without having to re-train.
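A minimal sketch of that save-and-load round trip, using only the standard library. The `model_params` dictionary is a hypothetical stand-in for the trained model, and the file name and helper functions are assumptions for the example; a fitted scikit-learn estimator can be passed to `pickle.dump` in the same way.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for a trained model: one coefficient and an intercept.
model_params = {"intercept": -0.4, "coefficients": {"Humidity3pm": 0.013}}


def save_model(model, path: Path) -> None:
    """Serialize the model once, right after training."""
    with open(path, "wb") as handle:
        pickle.dump(model, handle)


def load_model(path: Path):
    """Deserialize the model, e.g. once when the server starts."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def predict_rain(params: dict, humidity: float) -> bool:
    """Apply the stand-in linear rule and threshold it at 0.5."""
    score = params["intercept"] + params["coefficients"]["Humidity3pm"] * humidity
    return score >= 0.5


with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "rain_model.pkl"
    save_model(model_params, model_path)
    restored = load_model(model_path)

print(predict_rain(restored, humidity=82))
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code. For scikit-learn models, loading with the same library version that trained them is the safest option.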
Challenges we ran into
On the front-end, we vastly underestimated the complexity of React. We were able to build a working web-form and POST the data back to our server, but we ran out of time before we could build out the rest of the site.
On the back-end, we were challenged by feature engineering the data properly (we accidentally encoded some features incorrectly, which prevented us from using user data to make predictions) and by parsing user input into a DataFrame (this turned out to be very laborious).
For everyone on the team, this was the first time we had worked together on a shared repo, so we also had to learn how to use git branches.
Accomplishments that we're proud of
- Training our model which ended up having a reasonably high degree of accuracy
- The form data parser in the server code was a challenge, but it really bridged the gap in understanding what happens in the front-end and how we can process that data in the back-end (and vice versa)
What we learned
- That we take web forms for granted. We found that there are a lot of moving parts between a user entering data and the results being returned to them.
- How to work together as developers, in terms of both communication skills and git skills
- Pandas is awesome
What's next for Rain Predictor
We will work to tweak our model features and form parsing system so that it is more compatible with the web-form. Of course, we would also like to finish building out the front-end.