When we heard the challenge presented by Vontobel, we were really excited to work on it. We think it's important that more people are aware of what companies do and that companies take more responsibility for what they do. This matters not just for investors but also for communities, governments, and others.
What it does
It takes information about companies from news articles and predicts how "socially conscious" each company is on a given scale.
How I built it
Using the training data Vontobel provided, we set out to build three predictive models:
- Predicting the "theme" of a news item - we want to know what the news is about: is it about social issues? Government? The environment? We used the text description of each news item, represented as term frequencies and embeddings, to predict its theme with XGBoost (87% of the predictions were correct!)
- Predicting the "severity category" of the news. The training data set had 6 severity classes that grade each news item by how severe it is. We predicted this category for each news item with CatBoost, a gradient-boosting library built on decision trees that has native support for categorical features.
- Predicting the sustainability score. The training data set also included manually created sustainability scores. We built a linear regression model to predict them, using features derived from the news items, their severity and their "theme".
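To make the third step more concrete, here is a minimal, dependency-free Python sketch of how per-company features could be derived from already-classified news items and fed into a least-squares fit. It is an illustration under assumptions, not the hackathon code: the theme labels in `THEMES`, the helper names, the numeric encoding of severity, and the toy scores are all hypothetical, and the regression uses a single feature (mean severity) for brevity, whereas the real model used several.

```python
from collections import Counter

# Hypothetical theme labels; the real label set comes from the training data.
THEMES = ("social", "governance", "environment")


def term_frequencies(text):
    """Return relative term frequencies for a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens)
    return {term: n / total for term, n in counts.items()} if total else {}


def company_features(news_items):
    """Aggregate one company's news into [share per theme..., mean severity].

    Each item is a (theme, severity) pair standing in for the outputs of the
    theme and severity classifiers. Treating severity as a number is an
    assumption made for this sketch.
    """
    n = len(news_items)
    theme_counts = Counter(theme for theme, _ in news_items)
    shares = [theme_counts[t] / n for t in THEMES]
    mean_severity = sum(sev for _, sev in news_items) / n
    return shares + [mean_severity]


def fit_simple_regression(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy, made-up data: (predicted theme, predicted severity) per news item.
news_by_company = {
    "A": [("environment", 5), ("social", 3)],
    "B": [("governance", 1), ("social", 1)],
    "C": [("environment", 2), ("environment", 4), ("social", 3)],
}
# Toy, made-up sustainability scores used as regression targets.
toy_scores = {"A": 20.0, "B": 80.0, "C": 40.0}

names = sorted(news_by_company)
mean_severities = [company_features(news_by_company[n])[-1] for n in names]
slope, intercept = fit_simple_regression(
    mean_severities, [toy_scores[n] for n in names]
)
print(f"score ~ {intercept:.1f} + {slope:.1f} * mean_severity")
```

In a fuller version, the complete feature vector from `company_features` (theme shares plus severity) would go into a multivariate linear regression, and the `(theme, severity)` pairs would come from the XGBoost and CatBoost models rather than being typed in by hand.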
Challenges I ran into
We took on tasks and subjects we had not worked with before - none of us has front-end experience, and yet we decided to also create a web demo to show what we believe the final product should look like. As with any data-driven task, we struggled with understanding the structure of the data, deciding which features to use, and working out which new features we could create from the existing ones.
Accomplishments that I'm proud of
We managed to work together even though we had not met before, all come from different countries, and speak different languages.
What I learned
How to work closely with people from different professional backgrounds. I also learned about embeddings, new ways to visualize data, and ways of looking at textual data that are useful in machine learning.
What's next for InGoodCompany
We hope we helped shape an idea that can help the investment world choose more wisely which companies to invest in. There are many options for this product in the future: it can be easily deployed, linked to many news data sources for more up-to-date predictions, and much more.