Inspiration

We love data and we love Python, so this was a wonderful challenge!

What it does

We wrote a set of functions and organized, descriptive code notebooks that parse the medium-sized OppenheimerFund website clickthrough dataset, analyze aspects of its behavior, and provide some options for clustering the data to allow for better classification and prediction of user behavior.

How I built it

The data was provided as a Python dataframe, so the pandas library was a natural choice for manipulating and analyzing it. We created visualizations with Matplotlib and performed clustering with the K-means implementation from the scikit-learn library. All organization decisions and analysis were performed by us!
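To illustrate the general workflow, here is a minimal sketch of turning a clickthrough log into per-user features and clustering them with K-means. The column names (`user_id`, `page`), the toy rows, the two features, and the choice of two clusters are all assumptions made up for this example; they are not the actual dataset schema or the settings we used.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical clickthrough log: one row per click event.
events = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 3, 3, 3, 3, 4],
    "page": ["home", "funds", "contact", "home", "home",
             "funds", "funds", "funds", "contact", "home"],
})

# Build one row per user: total clicks and number of distinct pages visited.
features = events.groupby("user_id").agg(
    clicks=("page", "size"),
    distinct_pages=("page", "nunique"),
)

# Standardize so neither feature dominates the distance computation.
scaled = StandardScaler().fit_transform(features)

# Assumed cluster count; random_state is fixed so reruns give the same labels.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
features["cluster"] = model.fit_predict(scaled)

print(features)
```

On a real log, the number of clusters and the feature set would need to be chosen by inspecting the data, for example by comparing inertia across several values of `n_clusters`.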

Challenges I ran into

While the dataset was only medium-sized, it was still large enough that many operations caused very noticeable processing delays. Additionally, we had no prior experience working with this type of data, so understanding how to properly organize it was a fun challenge!

Accomplishments that I'm proud of

We were successful at manipulating the data to reveal some of its behavior.

What I learned

We put significant effort into figuring out how to ask complex questions of the data, but probably spent too much time answering these questions rather than working on techniques for visualizing the data.

What's next for Naive clickthrough clusters

Given more time, we would have developed visualizations of the data clustering. It would also have been nice to finish our experiments with using Markov processes to more fully analyze the correlation between successive events.
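As a rough sketch of the Markov idea, one could estimate first-order transition probabilities between successive events by counting how often each page is followed by each other page. The session lists and the helper name `transition_probabilities` below are hypothetical and only illustrate the approach; this is not code from our unfinished experiments.

```python
from collections import Counter, defaultdict


def transition_probabilities(sessions):
    """Estimate P(next page | current page) from ordered page sequences."""
    counts = defaultdict(Counter)
    for pages in sessions:
        # Pair each event with the one that immediately follows it.
        for current, nxt in zip(pages, pages[1:]):
            counts[current][nxt] += 1
    # Normalize each row of counts so its probabilities sum to 1.
    return {
        current: {nxt: n / sum(following.values())
                  for nxt, n in following.items()}
        for current, following in counts.items()
    }


# Hypothetical sessions, each an ordered list of visited pages.
sessions = [
    ["home", "funds", "contact"],
    ["home", "funds", "funds"],
    ["home", "contact"],
]

probs = transition_probabilities(sessions)
for page, following in probs.items():
    print(page, following)
```

Pages that only ever appear at the end of a session have no outgoing transitions and so do not appear as keys in the result.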
