Please see my full readme at


  • Siamese NNs are interesting to me
  • I wanted to work on a project that involved

What it does

  • Actively selects data points for training a Siamese neural network that makes pairwise recommendations
  • Active learning: instead of training on all points in the data, start with a subset and add points to the training set based on a scoring heuristic
  • Siamese neural network: an NN that takes two inputs instead of one and is used for comparing points, e.g. which point does the user like more?
  • Exposes the algorithm to users through an API and a website, BuckysSmartPub, using beers.csv from Kaggle
  • API calls run on the ML server in the background, so the site does not freeze and multiple clients can join
  • Clients can step through the learning process, get live recommendations, and give feedback
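
The active-learning loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: a small logistic model on feature differences stands in for the Siamese network, the scoring heuristic is assumed to be uncertainty sampling (pick the pair whose predicted preference is closest to 0.5), and every name here (`predict`, `train`, `uncertainty`, `active_learning`, `oracle`) is hypothetical.

```python
import math
import random

def predict(weights, a, b):
    """Probability that item a is preferred over item b."""
    z = sum(w * (x - y) for w, x, y in zip(weights, a, b))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))

def train(pairs, labels, dim, epochs=200, lr=0.5):
    """Fit the stand-in model with stochastic gradient descent on the log loss."""
    weights = [0.0] * dim
    for _ in range(epochs):
        for (a, b), y in zip(pairs, labels):
            err = predict(weights, a, b) - y
            weights = [w - lr * err * (x - z) for w, x, z in zip(weights, a, b)]
    return weights

def uncertainty(weights, pair):
    """Assumed scoring heuristic: highest when the prediction is closest to 0.5."""
    return -abs(predict(weights, *pair) - 0.5)

def active_learning(pool, oracle, dim, n_initial=5, n_rounds=10, seed=0):
    """Start from a random subset, then add one high-scoring pair per round."""
    rng = random.Random(seed)
    pool = list(pool)
    rng.shuffle(pool)
    labeled, unlabeled = pool[:n_initial], pool[n_initial:]
    labels = [oracle(p) for p in labeled]  # oracle = user feedback on a pair
    weights = train(labeled, labels, dim)
    for _ in range(n_rounds):
        if not unlabeled:
            break
        best = max(unlabeled, key=lambda p: uncertainty(weights, p))
        unlabeled.remove(best)
        labeled.append(best)
        labels.append(oracle(best))
        weights = train(labeled, labels, dim)  # retrain on the grown set
    return weights, labeled
```

Here `pool` is a list of `(a, b)` feature-tuple pairs and `oracle` plays the role of the user answering "which one do you like more?" with 1.0 or 0.0. Retraining from scratch each round keeps the sketch short; a real system would likely update the model incrementally.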

How I built it

  • PyTorch for the Siamese neural network
  • Dask for distributing model training
  • NumPy
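
To make the Siamese idea concrete, here is a small PyTorch sketch under my own assumptions. It uses a RankNet-style design (one shared encoder scores each input, and the difference of the two scores is the preference logit), which may differ from the architecture actually used in the project. The class name `SiameseNet`, the helper `train_step`, and the layer sizes are all hypothetical.

```python
import torch
import torch.nn as nn

class SiameseNet(nn.Module):
    def __init__(self, n_features, hidden=16):
        super().__init__()
        # One encoder is applied to both inputs, so the two branches share weights.
        self.encoder = nn.Sequential(
            nn.Linear(n_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, a, b):
        # A positive logit means "a is preferred over b".
        return (self.encoder(a) - self.encoder(b)).squeeze(-1)

def train_step(model, optimizer, a, b, y):
    """One gradient step; y holds 1.0 where a was preferred, else 0.0."""
    optimizer.zero_grad()
    loss = nn.functional.binary_cross_entropy_with_logits(model(a, b), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because both inputs pass through the same encoder, swapping them only flips the sign of the logit, so the model cannot contradict itself about which item of a pair it prefers.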

Challenges I ran into

  • running background code through API calls
  • finding hyperparameters for the model that gave good accuracy

Accomplishments that I'm proud of

  • designed and implemented my own ML theory
  • applied it to real world data
  • made a decent website

What I learned

  • how to run API calls in the background
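
One common way to run API work in the background is to hand it to a thread pool and return a job id immediately, so the client can poll for the result. The sketch below shows that pattern with the standard library only. It is an assumption about the general approach, not the project's actual server code, and `slow_training_step`, `start_job`, and `job_status` are hypothetical names.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)
jobs = {}  # job_id -> Future

def slow_training_step(n_points):
    """Hypothetical stand-in for one round of model training."""
    time.sleep(0.1)
    return {"trained_on": n_points}

def start_job(n_points):
    """What a 'start training' endpoint could do: queue the work, return at once."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = executor.submit(slow_training_step, n_points)
    return job_id

def job_status(job_id):
    """What a 'poll' endpoint could return for a given job id."""
    future = jobs.get(job_id)
    if future is None:
        return {"status": "unknown"}
    if not future.done():
        return {"status": "running"}
    if future.exception() is not None:
        return {"status": "failed", "error": str(future.exception())}
    return {"status": "done", "result": future.result()}
```

The request handler never blocks on training, so the site stays responsive, and each client tracks its own job id, which is what lets several clients use the server at once.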

What's next for Bucky's Smart Pub

  • Host model on EC2 instance
  • Host site on S3
  • Send model results through SQS
