Inspiration

We had an interest in data science, and Oppenheimer Funds had a large set of data that needed analyzing.

What it does

The project attempts to organize seemingly unrelated clickstream data into a set of clusters that together should cover roughly 80% of web traffic to oppenheimerfunds.com.
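To make the 80% target concrete, here is a minimal sketch of how coverage could be measured once every record has a cluster label. The function name `clusters_covering` and the label format are assumptions for illustration, not the code we actually ran.

```python
from collections import Counter


def clusters_covering(labels, target=0.80):
    """Pick the largest clusters until their combined share reaches `target`.

    `labels` is a hypothetical list with one cluster label per clickstream
    record. Returns (chosen_clusters, fraction_of_records_covered).
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return [], 0.0

    chosen, covered = [], 0
    # most_common() yields clusters from largest to smallest.
    for cluster, size in counts.most_common():
        if covered / total >= target:
            break
        chosen.append(cluster)
        covered += size
    return chosen, covered / total
```

Taking clusters largest-first keeps the number of clusters needed to reach the target as small as possible.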

How we built it

We used Python and Unix terminal commands to break the data down into smaller groups, which were then funnelled into Google Cloud Platform to produce graphs.
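The breakdown step could look roughly like the sketch below, which turns raw click records into counted page-to-page transitions (the edges of a directed graph). The `(visitor_id, timestamp, page)` row layout and the function name `transition_counts` are assumptions for illustration; the real data had its own schema.

```python
from collections import Counter, defaultdict


def transition_counts(rows):
    """Count page-to-page transitions per visitor.

    `rows` is an iterable of hypothetical (visitor_id, timestamp, page)
    tuples. Returns a Counter mapping (from_page, to_page) to the number
    of times that transition was observed.
    """
    # Group each visitor's clicks together.
    by_visitor = defaultdict(list)
    for visitor_id, timestamp, page in rows:
        by_visitor[visitor_id].append((timestamp, page))

    edges = Counter()
    for visits in by_visitor.values():
        visits.sort()  # order each visitor's clicks by timestamp
        pages = [page for _, page in visits]
        # Each consecutive pair of pages is one directed edge.
        for src, dst in zip(pages, pages[1:]):
            edges[(src, dst)] += 1
    return edges
```

Counting transitions instead of keeping every individual click is one way to shrink the data before graphing it.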

Challenges we ran into

The data sets produced graphs that were too complex to generate on a local machine, so we had to use Google Cloud computing resources to generate some of them.

Accomplishments that we're proud of

We succeeded in graphing a number of the data sets we produced.

What we learned

Never try to export a directed graph with 3 million nodes to a .dot file locally.
We gained much greater familiarity with graphing packages in Python.
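One way to keep a .dot export manageable is to stream edges straight to the file and drop rarely seen transitions, instead of building the whole graph in memory first. The sketch below assumes edges stored as a `{(from_page, to_page): count}` mapping; `write_dot` and `min_count` are hypothetical names, and this is not the export path we used.

```python
def _quote(name):
    """Escape double quotes so the name is a valid quoted DOT identifier."""
    return str(name).replace('"', '\\"')


def write_dot(edges, out, min_count=1):
    """Stream a directed graph to `out` in Graphviz DOT format.

    `edges` maps (from_page, to_page) to a count (an assumed layout).
    Edges seen fewer than `min_count` times are skipped to keep the
    output small. Returns the number of edges written.
    """
    out.write("digraph clickstream {\n")
    written = 0
    for (src, dst), count in edges.items():
        if count < min_count:
            continue
        out.write(f'  "{_quote(src)}" -> "{_quote(dst)}" [weight={count}];\n')
        written += 1
    out.write("}\n")
    return written
```

Any writable text file object works for `out`, for example `open("clicks.dot", "w")`. Raising `min_count` trades detail for a smaller file.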

What's next for Data-Mining Clickstreams

Given the results we achieved, no future projects are planned. The conclusions we have drawn could be used to reshape the current Oppenheimer website, but that is the extent of the usefulness of our findings.
