Cryptocurrencies are a hot topic. During last year's bull run in particular, many people became interested and tried investing themselves. The cryptocurrency trading space is unusual: markets are open 24 hours a day, and information is exchanged primarily online. No single person can process everything that happens in the space and use it to make predictions about the price. We need a tool that identifies the key messages and sends us signals, alerts and notifications when something big might be happening. That's what we try to achieve with CoinHamster.
What it does
CoinHamster collects data from various sources such as Telegram, Discord, Reddit and websites. The data is then analyzed and aggregated to identify important events. If a certain coin is almost never mentioned on social media for weeks, but then suddenly every channel starts talking about it, something important is probably happening, and traders should be notified as soon as possible so they can get ahead of the crowd.
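The "suddenly every channel starts talking about it" idea can be sketched as a comparison between the latest mention count and a historical baseline. This is only an illustration, not CoinHamster's actual logic: the function name `is_spike`, the z-score rule, and the `threshold` and `min_mentions` defaults are all assumptions made for the example.

```python
from statistics import mean, pstdev


def is_spike(history, current, threshold=3.0, min_mentions=5):
    """Return True if `current` mentions look unusually high versus `history`.

    `history` is a list of mention counts from earlier time windows
    (hypothetical input; the real system may store this differently).
    """
    # Without a baseline, or with only a handful of mentions, stay quiet.
    if not history or current < min_mentions:
        return False
    baseline = mean(history)
    spread = pstdev(history)
    if spread == 0:
        # Perfectly flat history: anything above the baseline stands out.
        return current > baseline
    # Standard z-score: how many standard deviations above the baseline?
    return (current - baseline) / spread >= threshold
```

A coin with a history like `[0, 1, 0, 0, 1, 0]` and 40 mentions in the latest window would be flagged, while a coin that is always discussed at a steady level would not.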
How we built it
From the start we set out to build a modular and scalable system. We split our project into three parts:
- Crawlers: Collecting data from different sources
- Analyzers: Processing the data (sentiment analysis, frequency analysis, etc.)
- Aggregators: Combining the processed data in a way that is helpful to the user (automatic signals)
The crawlers are built in a modular way so we can easily add new sources to our system. We have already run tests to scrape Twitter, Bitcointalk, Slack, Medium and more. Adding a source is as easy as setting up a service that collects data based on some filters and then pushes the data to our queue.
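The crawler contract described above (collect, filter, push to a queue) might look roughly like this. It is a sketch under assumptions: the `Message` fields, the `run_crawler` helper and the use of an in-process `queue.Queue` are hypothetical stand-ins, since the writeup does not say which queue technology or message format is used.

```python
import queue
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Message:
    source: str   # e.g. "telegram" or "reddit"
    channel: str  # channel, subreddit or page the text came from
    text: str


def run_crawler(
    fetch: Callable[[], Iterable[Message]],
    keep: Callable[[Message], bool],
    out: "queue.Queue[Message]",
) -> int:
    """Fetch messages from one source, filter them, push the rest to the queue.

    Returns the number of messages that were pushed.
    """
    pushed = 0
    for msg in fetch():
        if keep(msg):
            out.put(msg)
            pushed += 1
    return pushed
```

With this shape, a new source only needs its own `fetch` function; the filter and the queue stay the same for every crawler.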
The analyzers take the data and enrich it with additional information. Currently, we do sentiment analysis using natural language processing, but other analyzers could be added, for example ones based on neural networks or other machine learning models.
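To illustrate what "enriching" a message can mean, here is a deliberately simple analyzer that attaches a sentiment score. It is not the NLP approach used in the project, which the writeup does not detail: the word lists, the scoring rule and the `analyze` function are made up for the example.

```python
import re

# Toy word lists, invented for this sketch.
POSITIVE = {"moon", "bullish", "pump", "buy", "gain"}
NEGATIVE = {"dump", "bearish", "scam", "sell", "crash"}


def sentiment_score(text):
    """Return a score in [-1.0, 1.0]; 0.0 means neutral or unknown."""
    words = re.findall(r"[a-z]+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def analyze(message):
    """Return a copy of the message dict with a `sentiment` field added."""
    enriched = dict(message)  # leave the original message untouched
    enriched["sentiment"] = sentiment_score(message["text"])
    return enriched
```

Because each analyzer only adds fields to a copy of the message, several analyzers can be chained without knowing about each other.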
The aggregators are the trickiest part of the system and need a lot more tweaking. Right now we aggregate the data over sliding time windows and display the result as a chart in the frontend. In the future, the current data should be compared to historical data to identify important events.
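A sliding time window over mentions could be kept along these lines. This is a minimal sketch, assuming events arrive in non-decreasing timestamp order and that timestamps are plain seconds; the class name `SlidingWindowCounter` and its methods are hypothetical, not taken from the project.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Count mentions per coin within the last `window_seconds`."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._events = deque()    # (timestamp, coin), oldest first
        self._counts = Counter()  # coin -> mentions currently in the window

    def add(self, coin, timestamp):
        self._events.append((timestamp, coin))
        self._counts[coin] += 1
        self._evict(timestamp)

    def count(self, coin, now):
        self._evict(now)
        return self._counts.get(coin, 0)

    def _evict(self, now):
        # Drop events at or before the cutoff, so the window is (now - w, now].
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            _, old_coin = self._events.popleft()
            self._counts[old_coin] -= 1
            if self._counts[old_coin] == 0:
                del self._counts[old_coin]
```

Sampling `count` at regular intervals yields the series for a chart, and the same series could later serve as the historical baseline for comparisons.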
Challenges we ran into
Collecting data from different sources is easy, but collecting a lot of data is hard. Sources have very strict rate limits that are in place to prevent exactly this kind of data aggregation. The only way around them is to either pay for higher rate limits or to fly under the radar and try to avoid detection.
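Whichever limit applies to a source, a crawler has to pace its requests against it. The sketch below spaces calls evenly so that a known limit is never exceeded; the `Throttle` class and its parameters are assumptions for illustration, and the actual limits depend on each source's terms.

```python
import time


class Throttle:
    """Space out calls so at most `max_calls` happen per `per_seconds`."""

    def __init__(self, max_calls, per_seconds,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = per_seconds / max_calls
        self._clock = clock    # injectable for testing
        self._sleep = sleep    # injectable for testing
        self._next_allowed = None

    def wait(self):
        """Block until the next call is allowed; return the seconds waited."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
        return delay
```

A crawler would call `wait()` right before each request to a source, with one `Throttle` instance per source.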
Accomplishments that we're proud of
The whole system is very modular, and new crawler types can be added very quickly. The system is still fairly simple and easy to understand. We are very happy with how quickly the messages are processed. In our tests, we have sometimes seen Telegram messages in our frontend (after processing) before they appeared in the official app.
What we learned
Even though time at hackathons is limited, it's still worth starting with a solid plan and a clear goal. It makes working on the project a lot easier because the important discussions can be held up front, when no code has to be changed because none has been written yet.
What's next for CoinHamster
We believe that CoinHamster can bring a lot of value to traders and cryptocurrency enthusiasts by showing the important events. The project is not quite there yet, but it's a good start. All the pieces are there; now we need to see how it develops over a longer period of time. We also need to add additional data points to the system, such as price and trading data.