The inspiration came from Raj, who invests periodically and enjoys playing around in the market. One thing he enjoys is looking at specific industries and finding the best values within them. Currently, he does this by comparing many different factors, balance sheets, and news stories across hundreds of companies to determine the best company in an industry for his investment. After using the Marquee API, we realized that it provides percentile data on calculated statistics that draw on many of the data points he considers when investing. We determined that the API could be combined with each company's industry to show how each company ranked and how that ranking changed over time.
What it does
The user can select an industry and different factors provided through the Marquee API. Whenever one of the selections or the available date range is changed, the charts update in real time. The user can then visualize how the companies in a specific industry stack up against each other in the chosen factor over time. Because there are two charts, the user can compare multiple factors when taking a deeper look into the chosen industry.
How we built it
Back-End: To get the information needed for the visualizations, the Marquee API is used to grab all the historical data points for the date range provided by the user. The call runs a function that takes in the date range and the industry and uses pandas to find the companies in that industry in a csv file. To build that file, we first did some web scraping of websites that contain GICS classifications and appended an industry column to each company in a .csv. This csv file is then filtered by the given industry to collect the matching companies into a list. The list is used to look up each company's gsid, which in turn is used to request the data from the Marquee dataset. Finally, the data is converted with pandas into the format the front-end needs, and the resulting dataframe is written to a csv for use with Dash.

Front-End: Using Dash's complex yet easy-to-use framework, we were able to take in the values provided by the csv file and create an interactive interface for the user. There are currently two graphs on the main page. For each of them you select what you want on the y-axis, and the x-axis represents time. A slider between the two graphs controls the date range and can help investors see trends for the companies within the same sector. To allow for more interactivity, we added hover features that let users see the values of all the companies within an industry for a given month.
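The back-end filtering and reshaping steps described above can be sketched roughly as follows. This is a minimal illustration and not our actual code: the column names (`gsid`, `name`, `gics_industry`, `growthScore`), the sample rows, and the helper names are all assumptions, and the Marquee request itself is replaced by a small in-memory dataframe.

```python
import io

import pandas as pd

# Hypothetical stand-in for the scraped classification csv.
COMPANIES_CSV = """gsid,name,gics_industry
1001,Alpha Corp,Banks
1002,Beta Inc,Banks
1003,Gamma Ltd,Software
"""


def companies_in_industry(companies: pd.DataFrame, industry: str) -> pd.DataFrame:
    """Return only the rows whose industry column matches the selection."""
    return companies[companies["gics_industry"] == industry]


def to_plot_frame(scores: pd.DataFrame, gsids, start: str, end: str, factor: str) -> pd.DataFrame:
    """Keep the chosen companies and dates, then pivot to one column per gsid."""
    in_range = scores["date"].between(pd.Timestamp(start), pd.Timestamp(end))
    selected = scores[scores["gsid"].isin(gsids) & in_range]
    return selected.pivot(index="date", columns="gsid", values=factor).sort_index()


companies = pd.read_csv(io.StringIO(COMPANIES_CSV))
banks = companies_in_industry(companies, "Banks")

# Hypothetical long-format factor data, standing in for the API response.
scores = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2020-01-31", "2020-01-31", "2020-01-31", "2020-02-29", "2020-02-29"]
        ),
        "gsid": [1001, 1002, 1003, 1001, 1002],
        "growthScore": [0.40, 0.75, 0.10, 0.55, 0.70],
    }
)

plot_frame = to_plot_frame(scores, banks["gsid"], "2020-01-01", "2020-12-31", "growthScore")
print(plot_frame)
```

A wide frame like `plot_frame` (dates as rows, one column per company) is convenient for line charts, since each column becomes one trace; it could be saved with `plot_frame.to_csv(...)` for the front-end to read.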
Challenges we ran into
In the back-end specifically, we were constantly running into issues with the Marquee API's limits and formatting while trying to get the information we needed. When we eventually did get the information, there was some repeated data that needed to be filtered out. During the web scraping process, the GICS classifications were sometimes unusable or missing because some companies in the dataset had gone under or been acquired, so some research was needed to classify them manually. On the front-end, the main issue was finding the best tool for visualizing the data. At first, we decided to use Dash, as it was an open-source data tool that runs on Python, the language most people on our team are comfortable with. Then we decided to switch over to React.js and its many libraries to represent the data, but we ran into many constraints when trying to use it. So we switched back to Dash, as it was something we felt more familiar with and we had about 10 hours left at that point.
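As a rough illustration of the filtering step for repeated data, here is a small pandas sketch. It assumes that a "repeat" means more than one row for the same company on the same date; the write-up does not record what the duplicates actually looked like, and the column names and values are made up.

```python
import pandas as pd

# Hypothetical raw rows, with one company/date pair returned twice.
raw = pd.DataFrame(
    {
        "date": ["2020-01-31", "2020-01-31", "2020-02-29", "2020-01-31"],
        "gsid": [1001, 1001, 1001, 1002],
        "growthScore": [0.40, 0.40, 0.55, 0.75],
    }
)

# Keep a single row per (date, gsid) pair; keep="last" is an arbitrary choice here.
deduped = raw.drop_duplicates(subset=["date", "gsid"], keep="last").reset_index(drop=True)
print(deduped)
```

Deduplicating on the key columns rather than on whole rows also catches repeats whose values differ slightly, at the cost of silently picking one of them.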
Accomplishments that we're proud of
- Figuring out Dash and its data vis. solutions
- Understanding Marquee API
- Having a SICK API!
- Not giving up when it felt like we were going nowhere
What we learned
- "You always want to check your assumptions" --Max Guo (Goldman)
What's next for Marqulityics
Ideally, we want to be able to visualize more industries, show major events in those industries (mergers, acquisitions, etc.), and perhaps show more plots. One plot we would likely want is a scatter of all companies, colored by industry, with a time progression that shows how the companies move over time so investors can follow how industries trend in relation to one another. Another big consideration for the future is a way to baseline our plots against something such as an index, to better show how industries change as the index changes.
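One possible way to baseline against an index, sketched here only as an idea, is to subtract a benchmark series from each company's series on matching dates. Nothing below is implemented in the project; the benchmark values, the column labels, and the choice of a simple difference are all assumptions.

```python
import pandas as pd

dates = pd.to_datetime(["2020-01-31", "2020-02-29", "2020-03-31"])

# Hypothetical factor values, one column per company.
factor = pd.DataFrame(
    {"1001": [0.40, 0.55, 0.50], "1002": [0.75, 0.70, 0.80]},
    index=dates,
)

# Hypothetical benchmark series on the same dates (for example, an index-level value).
benchmark = pd.Series([0.50, 0.60, 0.55], index=dates)

# Subtract the benchmark row by row; positive values mean "above the baseline".
relative = factor.sub(benchmark, axis=0)
print(relative)
```

Plotting `relative` instead of `factor` would center every chart on the baseline, which may make moves that simply follow the index easier to tell apart from company-specific ones.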