Team 321

Inspiration

We are unsure whether our team has already made a submission, owing to communication issues. Please let us know by email if it has.

What it does

We first pre-processed the data as follows:

Categorical columns with too many (>50) or too few (<2) categories were dropped.
Float-like object columns were converted to numbers, including columns containing non-numerical values in the raw data, if they met a certain threshold.
Other object columns were categorised, with the categories stored so that the same pre-processing can be applied to input datasets.
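
The pre-processing steps above could be sketched roughly as follows. This is a minimal illustration using pandas, not our exact code: the function names fit_preprocess and apply_preprocess, the numeric_threshold parameter and its default of 0.9, and the reading of "threshold" as the share of non-missing values that parse as numbers are all assumptions made for the example.

```python
import pandas as pd


def fit_preprocess(df, min_categories=2, max_categories=50, numeric_threshold=0.9):
    """Pre-process training data; return the result and the stored categories.

    numeric_threshold is a hypothetical parameter: the minimum share of
    non-missing values that must parse as numbers for a column to be
    treated as float-like.
    """
    out = pd.DataFrame(index=df.index)
    categories = {}  # column name -> ordered list of category labels
    for col in df.columns:
        s = df[col]
        if pd.api.types.is_numeric_dtype(s):
            out[col] = s  # already numeric, keep as-is
            continue
        as_num = pd.to_numeric(s, errors="coerce")  # unparseable values become NaN
        n_present = s.notna().sum()
        if n_present > 0 and as_num.notna().sum() / n_present >= numeric_threshold:
            out[col] = as_num  # float-like column
            continue
        n_cat = s.nunique(dropna=True)
        if n_cat < min_categories or n_cat > max_categories:
            continue  # too few or too many categories: drop the column
        cat = pd.Categorical(s)
        categories[col] = list(cat.categories)
        out[col] = cat.codes  # missing values are encoded as -1
    return out, categories


def apply_preprocess(df, columns, categories):
    """Apply the stored column selection and categories to a new dataset."""
    out = pd.DataFrame(index=df.index)
    for col in columns:
        if col in categories:
            codes = {label: i for i, label in enumerate(categories[col])}
            # Labels not seen during fitting are encoded as -1.
            out[col] = df[col].map(codes).fillna(-1).astype(int)
        elif pd.api.types.is_numeric_dtype(df[col]):
            out[col] = df[col]
        else:
            out[col] = pd.to_numeric(df[col], errors="coerce")
    return out
```

Here the list of kept columns (list(out.columns) from fit_preprocess) and the categories dictionary are what get stored and reused on an input dataset, so that category codes stay consistent between training and prediction.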

How we built it

We predicted f_purchase_lh using most of the available columns/variables. The scikit-learn KNeighborsClassifier has tunable hyperparameters, including the number of neighbours (n_neighbors), leaf_size, and the power parameter of the Minkowski metric (p), among others. By comparing train and test accuracy across hyperparameter settings, we identified the best compromise between model performance and a small gap between train and test accuracy.
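
A comparison of this kind might look like the sketch below. It uses synthetic data from make_classification in place of the real dataset, and the grid values, the max_gap tolerance of 0.05, and the helper names compare_knn_settings and pick_compromise are assumptions chosen for illustration. The selection rule shown is one possible way to formalise the compromise, not necessarily the one we applied.

```python
from itertools import product

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def compare_knn_settings(X_train, X_test, y_train, y_test,
                         n_neighbors_grid=(3, 5, 11, 21), p_grid=(1, 2),
                         leaf_size=30):
    """Fit one classifier per setting and record train/test accuracy."""
    results = []
    for k, p in product(n_neighbors_grid, p_grid):
        model = KNeighborsClassifier(n_neighbors=k, p=p, leaf_size=leaf_size)
        model.fit(X_train, y_train)
        train_acc = model.score(X_train, y_train)
        test_acc = model.score(X_test, y_test)
        results.append({"n_neighbors": k, "p": p,
                        "train_acc": train_acc, "test_acc": test_acc,
                        "gap": train_acc - test_acc})
    return results


def pick_compromise(results, max_gap=0.05):
    """Hypothetical rule: best test accuracy among settings with a small gap.

    Falls back to the setting with the smallest gap if none is within max_gap.
    """
    within = [r for r in results if abs(r["gap"]) <= max_gap]
    if within:
        return max(within, key=lambda r: r["test_acc"])
    return min(results, key=lambda r: abs(r["gap"]))


# Synthetic stand-in for the real features and the f_purchase_lh target.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

results = compare_knn_settings(X_train, X_test, y_train, y_test)
best = pick_compromise(results)
print(best)
```

Note that in scikit-learn leaf_size affects the speed and memory use of the tree-based neighbour search rather than the predictions themselves, so the sketch holds it fixed and varies only n_neighbors and p.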