Inspiration

What it does

The code creates ensembles of decision trees that classify sequenced DNA samples as bacterial or not.

How we built it

Sequences from a reference bacterial genome for Fusobacterium, together with randomly generated base sequences, were used to train an ensemble of decision trees. After training, the ensemble scored each DNA sequence in the SRS016297 data set, along with a second set of random sequences. The ensemble classified these inputs with 100% accuracy.
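The training and scoring steps described above can be sketched roughly as follows. This is a minimal illustration, not our actual pipeline. The k-mer count features, the value of K, the sequence lengths, and the synthetic "reference-like" sequences with skewed base composition are all assumptions made for the example. Real positives would be read from the reference genome and the SRS016297 files. ExtraTreesClassifier is the scikit-learn class named under Challenges below.

```python
import random
from itertools import product

import numpy as np
from sklearn.ensemble import ExtraTreesClassifier

BASES = "ACGT"
K = 3  # k-mer length (assumed; not stated in the write-up)
KMERS = ["".join(p) for p in product(BASES, repeat=K)]
KMER_INDEX = {kmer: i for i, kmer in enumerate(KMERS)}


def kmer_features(seq):
    """Return normalized k-mer counts for one DNA sequence."""
    counts = np.zeros(len(KMERS))
    for i in range(len(seq) - K + 1):
        idx = KMER_INDEX.get(seq[i:i + K])
        if idx is not None:  # skip k-mers containing N or other symbols
            counts[idx] += 1
    total = counts.sum()
    return counts / total if total else counts


def random_sequence(rng, length, weights=(0.25, 0.25, 0.25, 0.25)):
    """Generate a random base sequence with the given A, C, G, T weights."""
    return "".join(rng.choices(BASES, weights=weights, k=length))


def make_dataset(rng, n_per_class, length=200):
    """Build features and labels: 1 = reference-like, 0 = random."""
    # Placeholder positives: skewed composition stands in for genome reads.
    skew = (0.35, 0.15, 0.15, 0.35)
    positives = [random_sequence(rng, length, skew) for _ in range(n_per_class)]
    negatives = [random_sequence(rng, length) for _ in range(n_per_class)]
    X = np.array([kmer_features(s) for s in positives + negatives])
    y = np.array([1] * n_per_class + [0] * n_per_class)
    return X, y


rng = random.Random(0)
X_train, y_train = make_dataset(rng, n_per_class=200)
X_test, y_test = make_dataset(rng, n_per_class=50)

# Train the ensemble of randomized decision trees.
model = ExtraTreesClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Score held-out sequences and report the fraction classified correctly.
accuracy = float((model.predict(X_test) == y_test).mean())
print(f"held-out accuracy: {accuracy:.3f}")
```

On synthetic data like this the two classes are easy to separate, so a high score here says little about performance on real reads.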

Challenges we ran into

We first built a neural network model in TensorFlow, but it failed to learn to recognize Fusobacterium sequences. We eventually turned to ExtraTreesClassifier in scikit-learn.

Accomplishments that we are proud of

We did not kill each other. We learned more about DNA datasets and file formats. We cross-trained each other in our areas of strength.

What we learned

Too much to put here.

What's next for Team Immutable

Relaxing vacation.
