Detecting Pump & Dump Schemes with AI
The Problem: A Global Issue Hidden in Plain Sight
Recently, the president of Argentina was accused of taking part in a pump-and-dump (P&D) scheme, in which the price of an asset is artificially inflated before it is dumped on unsuspecting investors. Before that, Donald Trump faced similar allegations involving NFTs and other speculative assets. These high-profile cases, however, are just the tip of the iceberg.
In the crypto world, P&D schemes are rampant. By one estimate, about a quarter of all crypto tokens created in 2022 showed signs of pump-and-dump activity. When we first stumbled upon this statistic while brainstorming project ideas, we were shocked. The scale of the fraud was overwhelming, yet no one seemed to be addressing it. This inspired us to take action.
Our Solution: A Machine Learning Model to Detect Pump & Dump Schemes
We developed a machine learning model trained on historical pump-and-dump schemes, now capable of detecting similar fraudulent activities in real time. Our approach relies on large-scale data scraping from multiple social media platforms, including X (formerly Twitter), Reddit, TikTok, and YouTube. This data is then correlated with the price movement of various cryptocurrencies to identify patterns consistent with past P&D schemes.
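As a rough illustration of what "correlating" social data with price movement can mean, the sketch below buckets post timestamps into hourly counts and computes a Pearson correlation against hourly price returns. It is a minimal example under stated assumptions, not our production pipeline: the function names, the hourly bucket size, and the input shapes (Unix timestamps for posts, a dict of hourly closing prices) are all hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone
from math import sqrt


def hourly_bucket(ts: float) -> datetime:
    """Round a Unix timestamp down to the start of its UTC hour."""
    dt = datetime.fromtimestamp(ts, tz=timezone.utc)
    return dt.replace(minute=0, second=0, microsecond=0)


def mentions_per_hour(post_timestamps):
    """Count posts that mention a token in each hourly bucket."""
    return Counter(hourly_bucket(ts) for ts in post_timestamps)


def hourly_returns(prices):
    """prices: {hour: closing price}. Returns {hour: change vs. the previous hour}."""
    hours = sorted(prices)
    return {h: prices[h] / prices[prev] - 1.0 for prev, h in zip(hours, hours[1:])}


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant or too short."""
    n = len(xs)
    if n < 2:
        return 0.0
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def mention_price_correlation(post_timestamps, prices):
    """Align hourly mention counts with hourly returns and correlate them."""
    mentions = mentions_per_hour(post_timestamps)
    returns = hourly_returns(prices)
    hours = sorted(returns)
    xs = [mentions.get(h, 0) for h in hours]  # hours with no posts count as 0
    ys = [returns[h] for h in hours]
    return pearson(xs, ys)
```

A high correlation on its own is not evidence of fraud; in a setup like this it would only be one input feature among several.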
The Challenges We Faced
Obtaining real-time data was one of our biggest challenges. Pump-and-dump schemes unfold rapidly, often within hours, so we needed to collect and process large volumes of social media data in near real time. A second challenge was pinpointing the exact moment a pump begins: without that precision, the model cannot alert users before the dump occurs. Finally, not every surge in social media activity signals a fraudulent pump, so the model had to distinguish organic hype, such as a major exchange listing, from a coordinated scam.
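One simple way to approximate the onset problem is a trailing-window z-score: flag the first point that sits far above its own recent baseline. The sketch below is illustrative only and is not necessarily how our model locates the start of a pump. The function name `find_pump_onset`, the 24-point window, the threshold of 3.0, and the `min_sigma` floor are all assumptions chosen for the example.

```python
from statistics import mean, pstdev


def find_pump_onset(values, window=24, z_threshold=3.0, min_sigma=1.0):
    """Return the index of the first value that spikes above its trailing baseline.

    values      : evenly spaced series, e.g. hourly mention counts (assumed input)
    window      : number of preceding points used as the baseline
    z_threshold : how many standard deviations above the mean counts as a spike
    min_sigma   : floor on the standard deviation, so a nearly flat baseline
                  does not turn tiny fluctuations into huge z-scores
    Returns None when no point exceeds the threshold.
    """
    for i in range(window, len(values)):
        baseline = values[i - window:i]
        mu = mean(baseline)
        sigma = max(pstdev(baseline), min_sigma)
        z = (values[i] - mu) / sigma
        if z > z_threshold:
            return i
    return None


# Usage: ten quiet hours followed by a sudden burst of mentions.
quiet = [10, 11, 9, 10, 12, 10, 11, 9, 10, 11]
onset = find_pump_onset(quiet + [40, 80], window=10)
```

A detector like this cannot tell an exchange listing from a coordinated scam by itself; it only marks when activity departs from the baseline, which is why it needs to be combined with other signals.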
How We Built It
For data collection and preprocessing, we scraped posts, comments, and videos from X, Reddit, TikTok, and YouTube. We then cleaned this data, time-aligned it with cryptocurrency price movements, and labeled it by cross-referencing historical pump-and-dump events. These labeled historical events served as training samples for the model. We applied NLP (Natural Language Processing) to detect pump-related keywords and sentiment shifts, and we used anomaly detection techniques to flag unusual price movements.
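To make the keyword side of this concrete, here is a minimal lexicon-based scorer. It is a deliberately simple stand-in for the NLP step rather than the model we trained: the `PUMP_TERMS` lexicon and its weights are invented for illustration, and `keyword_shift` is a hypothetical helper that compares two batches of posts.

```python
import re

# Hypothetical lexicon: terms and weights are illustrative, not learned from data.
PUMP_TERMS = {
    "moon": 1.0,
    "gem": 1.0,
    "pump": 2.0,
    "100x": 2.0,
    "buy now": 1.5,
    "last chance": 1.5,
}

WORD_RE = re.compile(r"[a-z0-9]+")


def pump_keyword_score(text):
    """Weighted count of pump-related terms, normalized by post length in words."""
    words = WORD_RE.findall(text.lower())
    if not words:
        return 0.0
    # Build two-word phrases so terms like "buy now" can be matched.
    bigrams = [f"{a} {b}" for a, b in zip(words, words[1:])]
    hits = sum(PUMP_TERMS.get(gram, 0.0) for gram in words + bigrams)
    return hits / len(words)


def keyword_shift(earlier_posts, recent_posts):
    """Difference in average score between a recent batch and an earlier batch."""
    def average(posts):
        return sum(pump_keyword_score(p) for p in posts) / len(posts) if posts else 0.0
    return average(recent_posts) - average(earlier_posts)


# Usage: a hype post scores high, a neutral announcement scores zero.
hype = pump_keyword_score("$DOGE to the moon!!! 100x gem, buy now")
neutral = pump_keyword_score("Exchange maintenance scheduled for Tuesday")
```

A fixed lexicon is easy to evade and misses context, which is one reason to prefer a learned model over keyword matching alone.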
For real-time monitoring, we deployed a streaming pipeline that continuously tracks social media activity. We also built an alert system that warns traders when a possible pump is detected.
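The general shape of such a monitor can be sketched with a sliding window over incoming events. The class below is an assumption-laden illustration, not our deployed pipeline: `PumpAlertMonitor`, its thresholds, and the rule "many mentions plus a sharp price rise inside one window" are hypothetical, and events are assumed to arrive in timestamp order.

```python
from collections import deque


class PumpAlertMonitor:
    """Sliding-window monitor that raises an alert when mentions and price surge together."""

    def __init__(self, window_seconds=3600, mention_threshold=50,
                 price_jump=0.2, cooldown_seconds=1800):
        self.window_seconds = window_seconds
        self.mention_threshold = mention_threshold  # mentions needed inside the window
        self.price_jump = price_jump                # fractional rise, 0.2 means +20%
        self.cooldown_seconds = cooldown_seconds    # minimum gap between alerts
        self._mentions = deque()                    # timestamps of mentions
        self._prices = deque()                      # (timestamp, price) pairs
        self._last_alert = None

    def on_mention(self, ts):
        self._mentions.append(ts)
        return self._check(ts)

    def on_price(self, ts, price):
        self._prices.append((ts, price))
        return self._check(ts)

    def _evict(self, now):
        """Drop events that have fallen out of the window."""
        cutoff = now - self.window_seconds
        while self._mentions and self._mentions[0] < cutoff:
            self._mentions.popleft()
        while self._prices and self._prices[0][0] < cutoff:
            self._prices.popleft()

    def _check(self, now):
        self._evict(now)
        if len(self._mentions) < self.mention_threshold or len(self._prices) < 2:
            return None
        first, last = self._prices[0][1], self._prices[-1][1]
        if first <= 0:
            return None
        change = last / first - 1.0
        if change < self.price_jump:
            return None
        if self._last_alert is not None and now - self._last_alert < self.cooldown_seconds:
            return None  # still cooling down from the previous alert
        self._last_alert = now
        return {"time": now, "mentions": len(self._mentions), "price_change": change}
```

Each `on_mention` or `on_price` call returns either `None` or an alert dictionary, so the caller decides how to notify traders. The cooldown keeps a single pump from producing a flood of duplicate alerts.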
The Impact & Future Plans
With our model, we aim to make crypto markets more transparent and protect traders from falling victim to pump-and-dump schemes. As we refine our system, we plan to expand our data sources to include Telegram and Discord, which are common platforms for pump coordination. We also aim to enhance real-time detection speeds with better computing resources and provide actionable insights for regulators and crypto exchanges to crack down on fraud.
We believe that with the right tools, we can make crypto a safer space for everyone. This is just the beginning.
Built With
- coinmarketcap
- dl
- lstm
- lunarcrush
- ml
- python