Inspiration

We are unsure if there is a previous submission for our team due to issues with communication. Please let us know through email if there is.

What it does

We first pre-processed the data as follows:

  • Dropped categorical columns that had too many (>50) or too few (<2) categories to consider.
  • Float-like objects are converted into numbers, including non-numerical values in the raw data if they meet a certain threshold.
  • Other objects are categorised; with the categories stored for pre-processing on input datasets.

How we built it

We predicted f_purchase_lh using most of the available columns/variables. The sklearn k neighbours classifier contains tunable hyperparameters, including number of neighbours, leaf_size, and power parameter among others. By comparing the train and test accuracy, we can identify the best compromise between model performance and small difference between train and test data.

Built With

Share this project:

Updates