The project began as an attempt to predict fraudulent brokers from the given dataset. It culminated in multiple Jupyter notebooks of scientific analysis and data visualization.
What it does
Provides visualizations of firms across the United States and of employees who have engaged in fraudulent activity in the past. Also provides statistical analysis of the dataset provided by FINRA.
How we built it
We parsed the given XML files with the lxml parser in Python after reading through the provided documentation, then used pandas to convert the XML files into DataFrames. We extracted the key features we deemed necessary and performed feature engineering. We used the Google Cloud Geocoding API to obtain geocodes for the firm locations and the folium library to plot those locations on a map. We also used Google Colab to train machine learning models (deep neural networks), since it provides an online GPU.
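A minimal sketch of the XML-to-DataFrame step described above. The tag and attribute names (`Firm`, `crd`, `Name`, `City`) are illustrative placeholders, not FINRA's actual schema:

```python
from io import BytesIO

import pandas as pd
from lxml import etree

# Illustrative sample document; the real FINRA XML layout differs.
xml_data = b"""
<Firms>
  <Firm crd="1234"><Name>Acme Securities</Name><City>New York</City></Firm>
  <Firm crd="5678"><Name>Beta Brokers</Name><City>Chicago</City></Firm>
</Firms>
"""

# Parse with lxml and flatten each <Firm> element into a plain dict.
tree = etree.parse(BytesIO(xml_data))
records = [
    {
        "crd": firm.get("crd"),
        "name": firm.findtext("Name"),
        "city": firm.findtext("City"),
    }
    for firm in tree.iter("Firm")
]

# Convert the extracted records into a pandas DataFrame for analysis.
df = pd.DataFrame(records)
print(df)
```

From here, the city names could be passed to a geocoder and the resulting coordinates plotted with folium markers.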
Challenges we ran into
We planned to deliver a front-end dashboard to give users a better UI experience, but were unable to complete it in time.
Accomplishments that we're proud of
It was a learning process: we delivered some beautiful visualizations, implemented and successfully ran a machine learning algorithm on the dataset we feature-engineered, and established a cloud connection using the Google credits provided.
What we learned
The workflow of performing data analysis and exploring huge datasets. We also learned a great deal about the financial industry through the FINRA dataset and website.
What's next for BitCamp2019_FDA
We plan to complete the UI and incorporate a larger dataset that we are currently scraping, so that we have more features to run the deep learning algorithms on.