Inspiration

Our group was mainly interested in crypto, and as people who invest in it we wanted to see what affects the crypto market. This led us to the idea of finding out which tweeters affect specific coins.

What it does

The general idea behind our program is to find out which tweeters influence different coins. We scrape Twitter for everything related to a specific coin, e.g. Bitcoin, and check whether any of those tweets correlate with a significant change in the coin's price. For each user who posted one of those tweets, we then look at all of their tweets about the coin and count how many correlate with a significant price change. If 50% or more of a user's tweets are significant, we mark that user as significant.
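The 50% rule can be sketched roughly as follows. This is a minimal illustration and not our actual script: the `significant_users` function, the `(username, is_significant)` input shape and the sample data are all assumptions made for the example.

```python
from collections import defaultdict

def significant_users(tweets, threshold=0.5):
    """Return users whose share of significant tweets meets the threshold.

    tweets: iterable of (username, is_significant) pairs (hypothetical shape).
    """
    totals = defaultdict(int)  # tweets about the coin, per user
    hits = defaultdict(int)    # tweets flagged as significant, per user
    for user, is_significant in tweets:
        totals[user] += 1
        if is_significant:
            hits[user] += 1
    # A user is marked significant when at least `threshold` of their tweets are.
    return {user for user, total in totals.items() if hits[user] / total >= threshold}

sample = [
    ("alice", True), ("alice", False),             # 1 of 2 significant
    ("bob", False), ("bob", False), ("bob", True),  # 1 of 3 significant
]
print(significant_users(sample))
```

With this made-up sample, only `alice` reaches the 50% threshold.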

Our program allows the user to select a coin and track it. In the background, it periodically checks for new tweets that relate to the chosen coin and come from influential tweeters, so users can follow relevant tweets in real time.

How we built it

We used Python for the algorithmic side of our application, writing scripts that scrape Twitter based on our queries and compare price movement before and after each tweet to determine whether the change is statistically significant. To scrape Twitter we used twint, a third-party Twitter scraping tool. For the price movement script, we used finance to download the data and analyse it. The NLP algorithm was built on Google Cloud's NLP analyser for sentiment, which quickly returns a positivity score and a confidence level.
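One way to compare price movement before and after a tweet is a simple z-score test, sketched below. This is only an illustration under assumptions, not necessarily the test we used: the function name, the fixed-interval price list, the window length and the threshold of 2.0 are all hypothetical.

```python
from statistics import mean, stdev

def is_significant_move(prices, tweet_index, window=3, z_threshold=2.0):
    """Check whether the return after a tweet stands out from earlier returns.

    prices: prices sampled at fixed intervals (hypothetical input shape).
    tweet_index: index of the price sample taken when the tweet was posted.
    """
    # We need `window` samples after the tweet and some history before it.
    if tweet_index + window >= len(prices) or tweet_index < window + 1:
        return False
    # Baseline: every `window`-length return that ends at or before the tweet.
    baseline = [
        (prices[i + window] - prices[i]) / prices[i]
        for i in range(tweet_index - window + 1)
    ]
    after = (prices[tweet_index + window] - prices[tweet_index]) / prices[tweet_index]
    spread = stdev(baseline)
    if spread == 0:
        # Perfectly flat history: any deviation from it counts as a move.
        return after != mean(baseline)
    return abs(after - mean(baseline)) / spread >= z_threshold

prices = [100, 101, 100, 99, 100, 101, 100, 99, 100, 100, 110, 115, 120]
print(is_significant_move(prices, tweet_index=9))  # price jumps after index 9
print(is_significant_move(prices, tweet_index=6))  # price stays flat after index 6
```

The baseline only uses prices up to the tweet, so the move being tested never influences the spread it is compared against.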

For the full-stack portion of our application, we used Node.js and Express.js to build the backend, which serves the main React frontend from index.html. From there, all further requests to the backend are handled through an API (using AJAX requests). We also used a number of libraries to support the React frontend.

Challenges we ran into

One of our issues was that someone may tweet many times in a short period, e.g. 10 minutes. In that case 50% or more of their tweets are likely to be considered significant, because they all fall within the same price change time range. To solve this, we only consider one tweet per hour per user.
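A minimal sketch of the one-tweet-per-hour filter is shown below. It assumes tweets arrive as `(username, posted_at)` pairs and keeps a tweet only if at least an hour has passed since the last kept tweet from the same user; the function name and that exact interpretation of "one per hour" are assumptions for the example.

```python
from datetime import datetime, timedelta

def one_tweet_per_hour(tweets, min_gap=timedelta(hours=1)):
    """Keep at most one tweet per user within any `min_gap` stretch.

    tweets: iterable of (username, posted_at) pairs (hypothetical shape).
    """
    last_kept = {}  # username -> timestamp of the last tweet we kept
    kept = []
    # Process in chronological order so the earliest tweet in a burst wins.
    for user, posted_at in sorted(tweets, key=lambda t: t[1]):
        previous = last_kept.get(user)
        if previous is None or posted_at - previous >= min_gap:
            kept.append((user, posted_at))
            last_kept[user] = posted_at
    return kept

burst = [
    ("alice", datetime(2021, 5, 1, 12, 0)),
    ("alice", datetime(2021, 5, 1, 12, 5)),
    ("bob", datetime(2021, 5, 1, 12, 3)),
    ("alice", datetime(2021, 5, 1, 12, 10)),
    ("alice", datetime(2021, 5, 1, 13, 0)),
]
for user, posted_at in one_tweet_per_hour(burst):
    print(user, posted_at)
```

Here the three tweets `alice` posts within ten minutes collapse into one, so a single price move can no longer inflate her share of significant tweets.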

Another of our main issues was that we had never implemented code of this sort before, so we had to quickly learn how to work with scrapers and third-party APIs.

Accomplishments that we're proud of

We are especially proud that we can track significant tweeters in real time through our UI, and that our NLP links to the exchange.

What we learned

We learned how to work with statistical models, handle large amounts of data efficiently, and analyse that data to give a discrete result to the end user. We also learned how to use several technologies, such as libraries for Twitter scraping, market history and NLP analysis.

What's next for StockTweets

Our next upgrade would be to do something similar with wallets. We would want to find out which wallets are affecting the market, perhaps because of inside information. Based on this, we could advise users which wallets to use. Furthermore, we would want to backtest this with historical data and find out the probability of making a profit when actually using the web application.
