Inspiration

The entire project revolves around one piece of financial data: trading statements. Trading statements are financial news produced by management (the people who run the company) and targeted at shareholders (the people who own the company). These statements contain a lot of rich information on the operations and outlook of the company and can drastically impact its share price. The caveat is that trading statements are almost always worded extremely positively, in "corporate language" so to speak. Our project has two motivations:

  1. Explain what exactly the content of a trading statement "means" e.g. does a certain sentence reflect good news or is it trying to hide something?
  2. Enable a user to perform further research for themselves, to make more informed decisions on trades and evaluation.

Why should I care?

Research shows that the overall "sentiment" of a trading statement is often decided by a few "key" phrases. These phrases are cleverly hidden in the statement so that management remain honest in their announcements while focusing as little as possible on the negative aspects. A tool that can extract and explain these key phrases would be a powerful addition to the analyst's toolset.

Source: https://research.manchester.ac.uk/en/publications/learning-tone-and-attribution-for-financial-text-mining

What it does

The project allows a user to upload a piece of financial news, e.g. a company trading statement. It analyses the statement and picks out the key sentences, each with a sentiment scaled from -1 to 1. We used a custom fine-tuned BERT language model to assign sentiment scores to sentences from financial news. The website then simplifies and verifies the statement in natural language and provides further links and Twitter posts, so the user can perform further research and gain wider context to make more informed decisions.

How I built it

The back-end comprises an API endpoint server which, given a URL, fetches the website text, rates its sentiment and gives further explanations and context for the text. Every sentence is assigned a score for "negative", "neutral" and "positive" sentiment; the highest score is chosen and aggregated into a final value. The back-end is a Flask server built entirely in Python, and uses the Hugging Face and LangChain libraries to interact with our custom-built sentiment analyser and Google searcher. The UI uses NextJS and makes API calls to the back-end to retrieve sentence relevance and sources. Sentences are highlighted based on their sentiment and relevance, and a user can fetch the explanation and verification for a sentence by hovering over the text. The back-end is packaged as a Docker image and deployed onto Heroku, a PaaS (platform as a service), allowing anyone to query our API.
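The scoring step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: the mapping from label to sign, the use of a plain average as the aggregate, and the names `LABEL_SIGN`, `sentence_value` and `aggregate` are all hypothetical.

```python
from typing import Dict, List

# Assumed mapping from winning label to sign (hypothetical, for illustration).
LABEL_SIGN = {"negative": -1.0, "neutral": 0.0, "positive": 1.0}


def sentence_value(scores: Dict[str, float]) -> float:
    """Pick the highest-scoring label and turn it into a value in [-1, 1]."""
    label = max(scores, key=scores.get)
    return LABEL_SIGN[label] * scores[label]


def aggregate(sentences: List[Dict[str, float]]) -> float:
    """Average the signed per-sentence values (assumed aggregation rule)."""
    if not sentences:
        return 0.0
    return sum(sentence_value(s) for s in sentences) / len(sentences)


# Made-up per-sentence scores, one dict per sentence.
example = [
    {"negative": 0.1, "neutral": 0.2, "positive": 0.7},
    {"negative": 0.8, "neutral": 0.1, "positive": 0.1},
    {"negative": 0.2, "neutral": 0.6, "positive": 0.2},
]
overall = aggregate(example)
```

With a signed average like this, a statement dominated by confident negative sentences drifts towards -1, while neutral sentences pull the result towards 0.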

Challenges I ran into

We had a lot. When building our Docker image, our university virtual machine ran out of disk space, which forced us to move to a completely new machine. Additionally, Heroku has quite a long deployment time, so we had to wait 5 minutes every time we fixed or updated the back-end server. Heroku was also a beast to tame: the Flask server struggled to bind to a port on the Heroku application, which caused our application to crash several times for seemingly no reason. We had to make Flask read the port assigned by Heroku, bind to that port explicitly, and listen on all network interfaces rather than just 127.0.0.1. Fixing these bugs was immensely satisfying.
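A minimal sketch of the kind of port fix described above, assuming the port is read from the `PORT` environment variable that Heroku sets for web processes. The helper names `bind_address`, `create_app` and `run`, the `/health` route and the default port are hypothetical and not taken from the project.

```python
import os
from typing import Tuple


def bind_address(default_port: int = 5000) -> Tuple[str, int]:
    """Return the host and port the server should listen on.

    "0.0.0.0" listens on all interfaces instead of only 127.0.0.1.
    The port comes from the PORT environment variable when it is set.
    """
    port = int(os.environ.get("PORT", default_port))
    return "0.0.0.0", port


def create_app():
    """Build a tiny Flask app (hypothetical route, for illustration only)."""
    from flask import Flask  # imported here so bind_address needs no Flask

    app = Flask(__name__)

    @app.route("/health")
    def health():
        return {"status": "ok"}

    return app


def run() -> None:
    """Start the server on the resolved host and port."""
    host, port = bind_address()
    create_app().run(host=host, port=port)
```

Calling `run()` locally falls back to port 5000, while on a platform that sets `PORT` the server binds to whatever port it was given.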

The custom-built sentiment analyser also ran into a fair number of problems: the model often ran out of GPU memory and the data was often malformed. Additionally, we spent a large amount of time uploading the model to Hugging Face so that it could be queried through serverless requests, as we (correctly) believed that loading several gigabytes of model weights on the back-end server would be a bad design decision.

Accomplishments that I'm proud of

I'm especially proud of the Heroku application and the working API back-end, which anyone can access at any time.

What I learned

This was my first project where I built a full-stack application from front-end to back-end without starting from a (mostly) functioning project. I had little experience with front-end development and NextJS, so this really exposed me to full-stack development.

What's next for ICHack Trading Statement Explainer

One significant direction is to be able to spot when companies are deliberately trying to hide bad news from their shareholders. I would love to be able to show users what sort of language companies use when something is going wrong, and exactly what terminology they use to hide it. This would be done by picking trading statements correlated with large negative shifts in share price (or from a tumultuous period such as the 2008 financial crash) and fine-tuning the sentiment analyser on such data.
