Inspiration
Due to communication issues, we are unsure whether our team already has a previous submission. Please let us know by email if there is one.
What it does
We first pre-processed the data as follows:
- Dropped categorical columns that had too many (>50) or too few (<2) categories to be useful.
- Converted float-like object columns into numbers, including columns with some non-numerical values in the raw data, provided they met a certain threshold.
- Categorised the remaining object columns, storing the categories so that the same pre-processing can be applied to input datasets.
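The steps above can be sketched roughly as follows. This is a minimal illustration using pandas, not our exact code: the function names fit_preprocess and apply_preprocess, the numeric_threshold parameter and its default of 0.9, and the choice to test for float-like columns before counting categories are all assumptions made for the example.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def fit_preprocess(df, max_cats=50, min_cats=2, numeric_threshold=0.9):
    """Pre-process df and return it with the state needed to repeat the steps.

    numeric_threshold is a hypothetical value: the share of non-missing
    entries that must parse as numbers for a column to count as float-like.
    """
    out = df.copy()
    state = {"dropped": [], "numeric": [], "categories": {}}
    for col in list(out.columns):
        if is_numeric_dtype(out[col]):
            continue  # already numeric, nothing to do
        as_num = pd.to_numeric(out[col], errors="coerce")
        non_null = out[col].notna().sum()
        if non_null and as_num.notna().sum() / non_null >= numeric_threshold:
            # Float-like column: unparseable entries become NaN.
            out[col] = as_num
            state["numeric"].append(col)
            continue
        n_cats = out[col].nunique(dropna=True)
        if n_cats > max_cats or n_cats < min_cats:
            out = out.drop(columns=col)
            state["dropped"].append(col)
            continue
        cat = out[col].astype("category")
        state["categories"][col] = cat.cat.categories  # stored for later inputs
        out[col] = cat.cat.codes
    return out, state


def apply_preprocess(df, state):
    """Apply the stored pre-processing to a new input dataset."""
    out = df.drop(columns=state["dropped"], errors="ignore").copy()
    for col in state["numeric"]:
        out[col] = pd.to_numeric(out[col], errors="coerce")
    for col, cats in state["categories"].items():
        # Categories not seen during fitting are encoded as -1.
        out[col] = pd.Categorical(out[col], categories=cats).codes
    return out
```

Keeping the fitted state separate from the transformed data means a new input dataset receives the same dropped columns and the same category codes as the training data.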
How we built it
We predicted f_purchase_lh using most of the available columns/variables. The scikit-learn KNeighborsClassifier has tunable hyperparameters, including the number of neighbours (n_neighbors), leaf_size, and the power parameter (p), among others. By comparing train and test accuracy across different settings, we can identify the best compromise between model performance and a small gap between train and test accuracy.
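A comparison of this kind might look like the sketch below. It uses synthetic data from make_classification as a stand-in for the real features and the f_purchase_lh target, and the grid values and the selection rule (test accuracy minus the train/test gap) are assumptions chosen for illustration, not the settings we actually used.

```python
from itertools import product

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data; the real dataset is not reproduced here.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

results = []
# Hypothetical grid over the three hyperparameters named in the text.
for n_neighbors, leaf_size, p in product([3, 5, 11, 21], [15, 30], [1, 2]):
    model = KNeighborsClassifier(n_neighbors=n_neighbors, leaf_size=leaf_size, p=p)
    model.fit(X_train, y_train)
    train_acc = model.score(X_train, y_train)
    test_acc = model.score(X_test, y_test)
    results.append({
        "n_neighbors": n_neighbors,
        "leaf_size": leaf_size,
        "p": p,
        "train_acc": train_acc,
        "test_acc": test_acc,
        "gap": abs(train_acc - test_acc),
    })

# Assumed selection rule: reward test accuracy, penalise the train/test gap.
best = max(results, key=lambda r: r["test_acc"] - r["gap"])
print(best)
```

A large gap between train and test accuracy usually points to overfitting, which for k-nearest neighbours tends to appear at small values of n_neighbors.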
Built With
- scikit-learn
- smote