Problem to be solved
Researchers and developers are currently limited by the cost and scarcity of the computing resources needed to process large volumes of data and train artificial intelligence models. As a result, most large-scale model development is restricted to large companies.
Solution
To solve this problem, this project aims to develop a platform that allows individuals to share the processing capacity of their GPUs with scientific communities and developers of large-scale language models. In return, collaborators will receive rewards, thus encouraging continued participation and cooperation.
Mission of the project
In this way, the platform not only aims to optimize existing resources, but also positions itself as an essential step toward ensuring that the future of AI is truly democratic, transparent, and accessible to all. It also aims to accelerate progress in scientific research and AI development, and to promote a collaborative economy where contributors are rewarded for their contributions.
In a world where AI, machine learning, and scientific breakthroughs define the next frontier, one force threatens innovation: monopolized computing power. Tech giants hoard resources, locking out independent developers, researchers, and visionaries with sky-high prices and closed ecosystems. This is not the future we accept. We stand for decentralization, accessibility, and freedom. Our mission is to break the stranglehold of GPU monopolies and build a borderless, peer-powered network where computing resources are shared, not controlled: a system where innovation isn’t dictated by corporate gatekeepers, but by those bold enough to build.
Imagine a world where AI researchers, indie developers, and students no longer have to beg for cloud access or drain their funds on overpriced rentals. Instead, they tap into a global, community-supported pool of GPUs, contributed by those who believe in the power of open access, collective intelligence, and technological sovereignty.
This is more than infrastructure. This is a rebellion. We are dismantling the old order and forging a new era of computational freedom where anyone, anywhere, has the power to shape the future. No barriers. No monopolies. Just raw, unleashed potential.
Approaches to distributed training
The field of decentralized computing is evolving rapidly, with continuous advancements enhancing multi-device model training. While we will adapt to future innovations, we currently leverage cutting-edge frameworks and research.
Petals
Enables collaborative inference and fine-tuning by distributing Transformer layers across individual devices, allowing large models to run collectively over the Internet.
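The idea of splitting a model's layers across devices can be illustrated with a small toy sketch. This is not the Petals API: the function names (partition_layers, run_pipeline), the capacity numbers, and the stand-in "layers" are all hypothetical, and network transport is replaced by ordinary function calls. It only shows that running contiguous layer blocks peer by peer gives the same result as running the whole model in one place.

```python
from typing import Callable, List

# A "layer" here is just a function on a number; a real Transformer block
# would operate on tensors. This is an illustrative assumption.
Layer = Callable[[float], float]


def partition_layers(num_layers: int, capacities: List[int]) -> List[range]:
    """Assign contiguous layer blocks to peers, in order, up to each peer's capacity."""
    if sum(capacities) < num_layers:
        raise ValueError("peers cannot hold all layers")
    blocks, start = [], 0
    for cap in capacities:
        end = min(start + cap, num_layers)
        blocks.append(range(start, end))
        start = end
    return blocks


def run_pipeline(x: float, layers: List[Layer], blocks: List[range]) -> float:
    """Pass the activation through each peer's block in turn."""
    for block in blocks:      # each block stands in for one remote peer
        for i in block:       # the peer applies only the layers it hosts
            x = layers[i](x)
    return x


# Six toy layers; layer k adds k to its input.
layers = [lambda x, k=k: x + k for k in range(6)]
blocks = partition_layers(len(layers), capacities=[2, 3, 4])
print(blocks)
print(run_pipeline(0.0, layers, blocks))
```

In a real deployment each hop between blocks would be a network round trip, so the number of peers in the chain directly affects latency.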
DisTrO
Dramatically reduces inter-GPU communication during large model training, cutting data transfer by 4-5 orders of magnitude. This enables training over slow connections and on heterogeneous hardware while matching the performance of standard optimization methods such as AdamW. It was recently used to train a 1.2-billion-parameter LLM, evidence that decentralized training is not just theoretical but already practical.
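To get a feel for what a reduction of 4-5 orders of magnitude means, here is a rough back-of-the-envelope calculation. It assumes 32-bit gradients and treats the per-step baseline as roughly one full gradient per worker; both are simplifying assumptions, the helper names are made up for this sketch, and it says nothing about how DisTrO achieves the reduction.

```python
def gradient_payload_bytes(num_params: int, bytes_per_value: int = 4) -> int:
    """Size of one full gradient, assuming 4 bytes (fp32) per parameter."""
    return num_params * bytes_per_value


def reduced_payload_bytes(full_bytes: int, orders_of_magnitude: int) -> float:
    """Payload after shrinking the baseline by the given power of ten."""
    return full_bytes / 10 ** orders_of_magnitude


full = gradient_payload_bytes(1_200_000_000)  # the 1.2-billion-parameter model
print(f"baseline per step: {full / 1e9:.1f} GB")
for k in (4, 5):
    reduced = reduced_payload_bytes(full, k)
    print(f"reduced by 10^{k}: {reduced / 1e3:.0f} KB")
```

Under these assumptions the per-step payload drops from gigabytes to tens or hundreds of kilobytes, which is what makes ordinary home internet connections plausible.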
DiLoCo
Optimizes decentralized training by minimizing communication, syncing every 500 steps while maintaining comparable or superior performance to standard training on large datasets. It efficiently handles variations in compute resources.
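The sync-every-500-steps pattern can be sketched with a toy example. This is a simplification, not the DiLoCo implementation: the model is a single number, each worker minimizes its own quadratic loss, the inner optimizer is plain gradient descent, and the outer step is plain averaging (outer_lr=1.0), whereas the published method uses AdamW inside and Nesterov momentum outside. All names and values are illustrative.

```python
from typing import List


def local_steps(theta: float, target: float, steps: int, lr: float) -> float:
    """Run `steps` of gradient descent on the loss 0.5 * (theta - target)**2."""
    for _ in range(steps):
        grad = theta - target
        theta -= lr * grad
    return theta


def diloco_round(theta_global: float, targets: List[float],
                 inner_steps: int, inner_lr: float, outer_lr: float) -> float:
    """One communication round: train locally, then apply the averaged change."""
    # Every worker starts from the shared parameters and trains without communicating.
    local = [local_steps(theta_global, t, inner_steps, inner_lr) for t in targets]
    # The "outer gradient" is the average displacement of the workers.
    outer_grad = sum(theta_global - th for th in local) / len(local)
    return theta_global - outer_lr * outer_grad


theta = 0.0
targets = [1.0, 3.0]  # two workers holding different data
rounds = 20
for _ in range(rounds):
    theta = diloco_round(theta, targets, inner_steps=500, inner_lr=0.1, outer_lr=1.0)

print(theta)
print(f"{rounds} synchronizations for {rounds * 500} local steps per worker")
```

The point of the sketch is the communication count: workers exchange information once per round rather than once per step.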
DeMo
Introduces decoupled momentum updates, extracting fast-moving components for efficient synchronization. It significantly reduces communication while matching or exceeding AdamW’s performance in large-scale pre-training.
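A rough sketch of the decoupled-momentum idea follows. Important assumption: the extraction step here is a plain top-k by magnitude, used as a stand-in; the actual DeMo method selects components differently, and this code is not taken from its implementation. Function names and hyperparameters are hypothetical. What the sketch does show is the overall shape: momentum accumulates locally, only a few large components are shared, and the rest stays on the worker.

```python
from typing import List


def extract_top_k(momentum: List[float], k: int) -> List[float]:
    """Return a vector holding only the k largest-magnitude entries of `momentum`."""
    keep = sorted(range(len(momentum)), key=lambda i: abs(momentum[i]), reverse=True)[:k]
    out = [0.0] * len(momentum)
    for i in keep:
        out[i] = momentum[i]
    return out


def demo_like_step(params: List[float], momenta: List[List[float]],
                   grads: List[List[float]], k: int,
                   beta: float = 0.9, lr: float = 0.1) -> List[float]:
    """One step for all workers; `momenta` is updated in place (one vector per worker)."""
    shared = []
    for m, g in zip(momenta, grads):
        for i in range(len(m)):
            m[i] = beta * m[i] + g[i]   # accumulate momentum locally
        q = extract_top_k(m, k)         # stand-in for the fast-component extraction
        for i in range(len(m)):
            m[i] -= q[i]                # the residual never leaves the worker
        shared.append(q)
    # Only the sparse vectors are exchanged and averaged.
    n = len(shared)
    update = [sum(q[i] for q in shared) / n for i in range(len(params))]
    return [p - lr * u for p, u in zip(params, update)]


params = [0.0, 0.0, 0.0]
momenta = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
grads = [[1.0, 0.1, 0.0], [3.0, 0.0, 0.2]]
params = demo_like_step(params, momenta, grads, k=1)
print(params)
print(momenta)
```

With k=1, each worker sends one value per step instead of three; the small components it held back remain in its local momentum and can be sent in a later step once they grow.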
Distributed Data Parallelism (DDP)
Replicates the model across multiple devices and synchronizes gradients at every step, which requires high-bandwidth interconnects. Fully Sharded Data Parallelism (FSDP) improves on this by sharding model parameters, gradients, and optimizer state across devices, reducing memory demands while still requiring frequent communication.
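The per-step synchronization in data parallelism can be sketched without any framework. The toy below is an assumption-laden illustration, not real DDP code: the "model" is a single weight w fitted to y = w * x, each replica holds its own data shard, and all_reduce_mean is a local stand-in for the collective operation that real systems run over the network.

```python
from typing import List, Tuple

Shard = Tuple[List[float], List[float]]  # (inputs, targets) held by one replica


def grad_mse(w: float, xs: List[float], ys: List[float]) -> float:
    """Gradient of the mean squared error for the model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def all_reduce_mean(values: List[float]) -> List[float]:
    """Stand-in for an all-reduce: every replica receives the mean of all values."""
    mean = sum(values) / len(values)
    return [mean] * len(values)


def ddp_step(weights: List[float], shards: List[Shard], lr: float) -> List[float]:
    """Each replica computes a local gradient, then all apply the same averaged one."""
    local = [grad_mse(w, xs, ys) for w, (xs, ys) in zip(weights, shards)]
    synced = all_reduce_mean(local)  # this exchange happens at every single step
    return [w - lr * g for w, g in zip(weights, synced)]


shards = [([1.0, 2.0], [2.0, 4.0]), ([3.0, 4.0], [6.0, 8.0])]  # data follows y = 2x
weights = [0.0, 0.0]  # identical starting copies of the model
for _ in range(200):
    weights = ddp_step(weights, shards, lr=0.01)

print(weights)
```

Because every replica applies the same averaged gradient, the copies stay identical. The cost is one gradient exchange per step, which is the communication burden the approaches above try to reduce.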