Inspiration
We were inspired by the research in our lab that involves phylogenetic reconstruction for evolutionary biology questions. We thought that a phylogenetic approach to analyzing the distribution and relationship of tacos across the United States would yield an interesting analogy for biological taxonomy.
What it does
We used hierarchical clustering to build a phylogenetic tree that visualizes the relationships between distinct groups of tacos.
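As a rough illustration of the idea, here is a minimal base R sketch of hierarchical clustering on a presence/absence ingredient matrix. The taco names, ingredient columns, distance measure, and linkage method are all assumptions made for the example, not the project's actual data or settings.

```r
# Hypothetical presence/absence matrix: rows are tacos, columns are ingredients.
ingredients <- matrix(
  c(1, 0, 1, 0,
    1, 0, 1, 1,
    0, 1, 0, 1,
    0, 1, 0, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("fish_taco", "shrimp_taco", "carnitas_taco", "asada_taco"),
    c("seafood", "pork_or_beef", "cabbage", "onion")
  )
)

# Binary (Jaccard-style) distance is one reasonable choice for 0/1 traits.
d <- dist(ingredients, method = "binary")

# Average linkage (UPGMA) is assumed here; other linkages give other trees.
tree <- hclust(d, method = "average")

# Cut the tree into two groups, i.e. two candidate taco "species".
species <- cutree(tree, k = 2)
print(species)

# plot(tree) would draw the dendrogram.
```

If a tree object is needed for phylogenetic packages, `ape::as.phylo()` can convert an `hclust` result; whether the project took that route is an assumption.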
How we built it
We used R to parse much of the quantitative data with regular expressions and expanded the dataset into a larger but more rigorously structured data frame. We used k-means clustering to create geographic subsets of our taco populations. We then used this improved dataset to cluster our tacos and burritos into groups that shared common traits. Finally, we examined the hierarchical relationships among our taco "species" using phylogenetic comparative methods.
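The first two steps could look roughly like the sketch below, which uses only base R. The data frame, its column names (`item`, `lat`, `lon`), the price pattern, and the choice of two clusters are hypothetical and stand in for the real dataset.

```r
# Hypothetical raw records; column names and values are illustrative only.
raw <- data.frame(
  item = c("Fish Taco $3.50", "Carnitas Burrito $7.25", "Shrimp Taco $4.00",
           "Asada Taco $2.75", "Bean Burrito $6.00", "Pork Taco $3.00"),
  lat  = c(32.7, 32.8, 33.0, 40.7, 40.8, 40.6),
  lon  = c(-117.2, -117.1, -117.3, -74.0, -73.9, -74.1),
  stringsAsFactors = FALSE
)

# Extract the number that follows the dollar sign in the free-text description.
raw$price <- as.numeric(sub(".*\\$([0-9]+(\\.[0-9]+)?).*", "\\1", raw$item))

# Flag tacos versus other items with a case-insensitive match.
raw$is_taco <- grepl("taco", raw$item, ignore.case = TRUE)

# k-means on raw coordinates treats degrees as planar distances,
# which is only an approximation of geographic distance.
set.seed(1)
geo <- kmeans(raw[, c("lat", "lon")], centers = 2, nstart = 10)
raw$region <- geo$cluster

print(raw[, c("item", "price", "is_taco", "region")])
```

The `region` column can then be used to split the data into geographic subsets before clustering on traits.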
Challenges we ran into
The dataset was difficult to parse. In particular, many of the qualitative fields had no standardized factor levels and contained only free-text descriptions that did not follow any consistent set of categories. In addition, there were many empty data points that couldn't simply be dropped without introducing bias into the analyses.
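One way to handle both problems is sketched below, assuming a hypothetical free-text `cuisine_raw` column: normalize the strings into consistent labels, then keep missing values as their own factor level rather than deleting rows. This is an illustration of the general approach, not necessarily what we did for every field.

```r
# Hypothetical free-text column with inconsistent spellings and gaps.
cuisine_raw <- c("Mexican", " mexican ", "Asian Fusion", "asian-fusion",
                 NA, "", "Tex Mex")

# Normalize case and whitespace, then collapse punctuation and spaces.
cuisine <- tolower(trimws(cuisine_raw))
cuisine <- gsub("[^a-z]+", "_", cuisine)

# Treat empty strings as missing.
cuisine[!is.na(cuisine) & cuisine == ""] <- NA

# Keep missingness as an explicit "unknown" level instead of dropping rows.
cuisine <- addNA(factor(cuisine))
levels(cuisine)[is.na(levels(cuisine))] <- "unknown"

print(table(cuisine))
```

Keeping an explicit "unknown" level preserves every row, so later analyses can decide how to treat the gaps.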
Accomplishments that we're proud of
We familiarized ourselves with a number of machine learning tools, such as hierarchical clustering, that we hadn't used before. We also practiced wrangling a very suboptimal dataset into a more structured data frame that could answer interesting questions.
What we learned
We learned that ingredients cluster in different patterns across taco "species." For example, we see a large overlap between Asian tacos and fish, which suggests that cuisine strongly influences the phylogeny. The regional clustering of the tacos does not correlate with the clustering of the taco "species."
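A simple way to check whether two clusterings line up is to cross-tabulate them and compute an agreement score. The sketch below uses the plain Rand index, written by hand in base R; the cluster labels are made up, `rand_index` is a hypothetical helper, and this is not necessarily the comparison we ran.

```r
# Fraction of item pairs on which two clusterings agree
# (both put the pair together, or both put it apart).
rand_index <- function(a, b) {
  stopifnot(length(a) == length(b))
  same_a <- outer(a, a, "==")
  same_b <- outer(b, b, "==")
  pairs <- upper.tri(same_a)
  mean(same_a[pairs] == same_b[pairs])
}

# Hypothetical labels: geographic cluster and taco "species" for six tacos.
region  <- c(1, 1, 1, 2, 2, 2)
species <- c(1, 2, 1, 2, 1, 2)

# Contingency table of region against species.
print(table(region, species))

# Agreement between the two clusterings; identical clusterings score 1.
print(rand_index(region, species))
print(rand_index(region, region))
```

The plain Rand index is not corrected for chance, so a middling score is weak evidence on its own; a chance-adjusted variant would be a stricter check.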
What's next for On the Origins of Tacos
Calculating the frequency functions for the geographic distributions for each of the species.
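As a starting point, the relative frequency of each species across regions can be tabulated in base R. The data frame and its `region` and `species` values below are hypothetical placeholders.

```r
# Hypothetical assignments of tacos to regions and "species".
tacos <- data.frame(
  region  = c("west", "west", "west", "east", "east", "south"),
  species = c("fish", "fish", "asada", "asada", "carnitas", "fish")
)

# Count tacos per species (rows) and region (columns).
counts <- table(tacos$species, tacos$region)

# Convert counts to proportions so each species row sums to 1.
freq <- prop.table(counts, margin = 1)

print(round(freq, 2))
```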
Built With
- caret
- dplyr
- ggplot2
- ggrepel
- klar
- mapproj
- maps
- phytools
- plotfunctions
- r
- rann