Inspiration
"By 2025, it is estimated that there will be 175 zettabytes (ZB) of data stored worldwide, up from 33 ZB in 2018." And where will this data be stored and processed?
Three converging factors inspired our project on data center location optimisation: the rapid advancement of AI technology, the accumulation of untapped dark data alongside the growth of data centres, and the climate crisis. Together, these trends underscore the critical importance of efficient data center operations.
The proliferation of increasingly large and complex large language models (LLMs) such as GPT-3, along with the surge in user queries to these models, has significantly increased computational demand, driving the establishment and expansion of data centers. This rapid growth has raised concerns about environmental impact: the energy consumption and cooling requirements of these centers contribute to carbon emissions and resource utilisation challenges, underscoring the need for sustainable technology solutions.
Extreme weather has reinforced the need to prevent water shortages and water wars in the US. Carbon emissions need to be cut and kept within the limits for each US region. And still, businesses must be able to store and process their data without extortionate costs.
This is why we saw an urgent need for sustainable solutions that strategically place data centers in the US, taking carbon emissions, water availability, the number of LLM queries per region and power cost into account. Our aim is to minimise the environmental impact of US data centres, while harnessing their full potential.
As a team, we are both very passionate about technology and the environment. Kalpana is an AI researcher and scientist based in India and Cliodhna is an IT manager working for Greenpeace Int. Our values align in terms of making tech equitable, diverse and sustainable.
What it does
In short, our tool provides the optimal location for data centres to be based in the US.
Our tool takes into account the key factors in building sustainable data centers: carbon emissions, water usage, cost, data centre capacity and the number of LLM queries. Based on those factors, it then provides the best possible solution for situating data centres in the US. Our comprehensive approach ensures that, in addition to being as environmentally friendly as possible, the data centre is also cost-effective and efficient for businesses and organisations to operate from.
How we built it
The initial stage of our project involved a lot of research. We searched online for datasets that suited our desired outcome. The backend calculations then went through several iterations before we reached a conclusive and satisfying result.
Final Calculation
- Water supply and usage per state, CO2 emissions per state and power costs per state. State = region.
- Simulated variable for LLM users per region until data is found.
- Water usage per question & CO2 emission per question.
- Intrinsic data centre capacities.
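The inputs listed above could be combined in many ways, and our actual formula is documented separately. As a minimal sketch only, the Python below assumes a simple weighted score: each cost factor (water, CO2, power cost) is min-max normalised so that lower is better, while the demand a region can actually serve (LLM queries capped by data centre capacity) counts in its favour. The `Region` class, the function names, the weights and every number are hypothetical placeholders for illustration, not our real data or method.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    water_stress: float    # water usage relative to supply (lower is better)
    co2_per_capita: float  # CO2 emissions per capita (lower is better)
    power_cost: float      # power cost per kWh (lower is better)
    llm_queries: float     # simulated LLM queries per day
    capacity: float        # data centre capacity in queries per day


def min_max(values):
    """Scale values to [0, 1]; return all zeros if they are identical."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def served_queries(region):
    """Queries a region can actually serve, capped by its capacity."""
    return min(region.llm_queries, region.capacity)


def query_footprint(region, water_per_query, co2_per_query):
    """Water and CO2 attributable to the queries served in a region."""
    served = served_queries(region)
    return served * water_per_query, served * co2_per_query


def rank_regions(regions, weights):
    """Return (name, score) pairs sorted from best to worst score."""
    water = min_max([r.water_stress for r in regions])
    co2 = min_max([r.co2_per_capita for r in regions])
    power = min_max([r.power_cost for r in regions])
    demand = min_max([served_queries(r) for r in regions])
    scored = []
    for i, region in enumerate(regions):
        score = (
            weights["water"] * (1 - water[i])
            + weights["co2"] * (1 - co2[i])
            + weights["power"] * (1 - power[i])
            + weights["demand"] * demand[i]
        )
        scored.append((region.name, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Placeholder values only: these are NOT real state statistics.
EXAMPLE_REGIONS = [
    Region("region_a", 0.2, 10.0, 8.0, 100.0, 80.0),
    Region("region_b", 0.8, 20.0, 12.0, 200.0, 300.0),
    Region("region_c", 0.5, 15.0, 10.0, 50.0, 50.0),
]
EXAMPLE_WEIGHTS = {"water": 0.25, "co2": 0.25, "power": 0.25, "demand": 0.25}

if __name__ == "__main__":
    for name, score in rank_regions(EXAMPLE_REGIONS, EXAMPLE_WEIGHTS):
        print(f"{name}: {score:.3f}")
```

Equal weights are an arbitrary starting point. Changing them shifts the balance between environmental cost and serving demand, which is the kind of trade-off the tool is meant to explore.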
The backend is written in Python and JavaScript and takes data from:
- https://www.statista.com/statistics/194176/public-water-supply-per-capita-use-by-leading-states-in-the-us/
- https://www.statista.com/statistics/489494/major-us-state-energy-related-carbon-dioxide-emissions-per-capita/
- https://www.linkedin.com/posts/timnit-gebru-7b3b407_thirsty-data-centers-are-making-hot-summers-activity-7090876217129635841-0tsV/
- https://earth.org/environmental-impact-chatgpt/
- https://dgtlinfra.com/united-states-data-centers/
More on the backend calculations can be found in the Data-Centre-CO2-Research doc attached to the portfolio.
The frontend is built with wix.com. We initially planned to build a single webpage for the tool, but decided to expand it into a website where you can use the tool and then join a community of people who are passionate about a similar mission: making tech and the digital space more sustainable and environmentally friendly.
Challenges we ran into
Our main challenge was time. Alongside full-time jobs, the deadline creeps up quickly!
Another challenge was that many datasets related to LLMs are not readily available. For example, we were keen to determine the exact water usage, carbon emissions and power cost per LLM query, but large LLM companies have not released this data yet. We are sure this will change with legislation.
Accomplishments that we're proud of
An opportunity to learn and spark curiosity. Peer-to-peer learning is a treasure trove of practical insights and shared experiences. It's an interactive way to enhance critical thinking and problem-solving skills while having a great time learning from others, and this was definitely the case in our team.
We also think that we have a really interesting, topical and important solution to an ever-growing issue in the IT industry. The need for data centres is set to increase sharply in the coming years. Cliodhna is Irish and has seen that this is a highly contentious topic in Ireland, as data centres are also damaging the surrounding environment, so this is a topic close to her heart.
What we learned
We learnt about the water wars in the US: https://www.washingtonpost.com/climate-environment/2023/04/25/data-centers-drought-water-use/ We learnt that AI models are atrocious for the environment in terms of the water used and carbon emitted when training them, and that very little data has been released on this issue.
We learnt that there are not a lot of solutions similar to ours on the market and that this could have real impact.
Cliodhna learnt about the mathematics behind the backend equation developed by Kalpana, and it was her first time using wix.com to build a website.
What's next for
Cliodhna and Kalpana will continue to meet in order to develop this tool further. We need to connect the frontend to the backend. We need to build the backend out more. We need to probe LLM companies for more data on the environmental impact of their models. We also plan on writing a research paper on this topic as we are hugely passionate about this area.
Built With
- javascript
- python
- wix.com