Dynamic Load Balancing for Energy-Efficient Cloud Computing

Inspiration

After watching a CNBC video, we became aware of the strain on the US electrical grid: a ChatGPT query requires roughly 10 times as much energy as a Google search. We wanted to know whether specific tasks could be mapped to servers in a way that minimizes the overall energy footprint. With our backgrounds in Electrical Engineering and Math, we wanted to explore an area where our interests overlap.

What it does

Given weather information at server locations, we built a model that provides a metric for the reliability of renewable energy resources at a specific location. We use this metric to weight our load-balancing algorithm so that it optimizes for renewable energy.

How we built it

We framed server load distribution as a k-server problem, often called the holy grail of online algorithms, and added a weighting to account for each location's renewable energy footprint. For our data inputs, we used two APIs: one from the US Energy Information Administration (EIA) and one from Open-Meteo, an open-source weather API. We parsed the JSON responses and processed them with pandas so that the two data sources were aligned and in compatible formats. We also addressed data abnormalities and engineered custom features to improve our model.
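The alignment step described above can be sketched roughly as follows. This is a minimal illustration, not our actual pipeline: the payload shapes, the column names (period, renewable_mwh, wind_speed, cloud_cover), the forward-fill gap handling, and the hour-of-day feature are all assumptions made for the example, and the real EIA and Open-Meteo responses are structured differently.

```python
import json
import pandas as pd

# Hypothetical payloads standing in for the two API responses.
eia_records = json.loads("""
[{"period": "2024-01-01T00", "renewable_mwh": 120.0},
 {"period": "2024-01-01T01", "renewable_mwh": 135.5},
 {"period": "2024-01-01T02", "renewable_mwh": null}]
""")
weather = json.loads("""
{"time": ["2024-01-01T00:00", "2024-01-01T01:00", "2024-01-01T02:00"],
 "wind_speed": [5.2, 6.1, 4.8],
 "cloud_cover": [20, 35, 80]}
""")


def align(eia_records, weather):
    """Put both sources on a shared hourly index and fill simple gaps."""
    eia = pd.DataFrame(eia_records)
    eia["time"] = pd.to_datetime(eia["period"], format="%Y-%m-%dT%H")
    eia["renewable_mwh"] = pd.to_numeric(eia["renewable_mwh"])
    eia = eia.drop(columns="period").set_index("time")

    wx = pd.DataFrame(weather)
    wx["time"] = pd.to_datetime(wx["time"])
    wx = wx.set_index("time")

    # Keep only the hours present in both sources.
    merged = wx.join(eia, how="inner")

    # Assumed abnormality handling: carry the last known value forward.
    merged["renewable_mwh"] = merged["renewable_mwh"].ffill()

    # Assumed engineered feature: hour of day.
    merged["hour"] = merged.index.hour
    return merged


aligned = align(eia_records, weather)
```

An inner join is used here so that every row has both weather and generation values; an outer join with explicit imputation would be a reasonable alternative when one source has gaps.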

The model we chose was an LSTM because of its fast inference and its ability to handle time-series data; its inherent "forgetfulness" is also beneficial given seasonal weather changes. We implemented the model in TensorFlow. After training the model, we inferred renewable reliability for a range of server locations using 3-day weather forecast data as inputs. These reliability scores served as the weights for our k-server problem. Because this problem is NP-hard, we chose a randomized heuristic, which is generally considered one of the best pseudo-optimal approaches for addressing it. Our distance metric in this algorithm is load, and we use our weights to form a probability distribution over servers. Sampling from this distribution gave us a pseudo-optimal load balancing of tasks among our servers while also prioritizing each server’s ecological footprint.
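The sampling step can be illustrated with a short sketch. It assumes the simplest reading of the description above: each task is assigned independently to a server drawn with probability proportional to that server's reliability weight. It does not try to reproduce how load enters as the distance metric, and the function name assign_tasks, the server names, and the weight values are hypothetical.

```python
import random


def assign_tasks(weights, task_loads, seed=None):
    """Assign each task to a server sampled in proportion to its weight.

    weights: dict mapping server name to a non-negative reliability weight.
    task_loads: iterable of per-task load values.
    Returns the per-task assignment and the accumulated load per server.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one server needs a positive weight")

    rng = random.Random(seed)
    servers = list(weights)
    # Normalize the weights into a probability distribution over servers.
    probs = [weights[s] / total for s in servers]

    load = {s: 0.0 for s in servers}
    assignment = []
    for task_load in task_loads:
        server = rng.choices(servers, weights=probs, k=1)[0]
        load[server] += task_load
        assignment.append(server)
    return assignment, load


# Hypothetical weights, standing in for the LSTM's reliability scores.
demo_weights = {"oregon": 0.6, "texas": 0.3, "virginia": 0.1}
demo_assignment, demo_load = assign_tasks(demo_weights, [1.0] * 20, seed=42)
```

Passing a seed makes a run reproducible, which is convenient when comparing weightings on the same stream of tasks.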

We chose a Python Streamlit frontend for its variety of input and output widgets, which allow a high degree of user interaction. Users input server locations and can generate random tasks to test the model's weighting and observe the load balancing in real time. Servers with high weights are assigned a higher load because of their greater ecological reliability, and the resulting load shares converge to a stationary distribution as n (the number of tasks) becomes significantly greater than k (the number of servers).
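The convergence can be made precise under two assumptions that this writeup does not state explicitly: the weights stay fixed during a run, and each task is assigned by an independent draw. Writing w_i for the weight of server i and N_i(n) for the number of the first n tasks sent to server i (both symbols introduced here for illustration), the law of large numbers gives:

```latex
p_i = \frac{w_i}{\sum_{j=1}^{k} w_j},
\qquad
\frac{N_i(n)}{n} \xrightarrow{\text{a.s.}} p_i \quad \text{as } n \to \infty,
\qquad
\operatorname{sd}\!\left(\frac{N_i(n)}{n}\right) = \sqrt{\frac{p_i\,(1 - p_i)}{n}} .
```

The standard deviation shrinks like 1/sqrt(n), so with only a handful of tasks the observed shares can differ noticeably from the weights, while for n much greater than k they settle close to p_i.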

Challenges we ran into

We faced conceptual challenges in understanding the dynamics between electricity suppliers and data centers, trying to balance both parties' renewable energy goals in our model. We encountered data scarcity and cycled through many APIs before finding the right ones for the task. Due to the lack of training data, we prioritized feature engineering to get the most out of our smaller dataset. We also spent time considering which k-server heuristic would best align with the problem. Most material on the k-server problem is theoretical, rather than application-focused, so proceeding without much guidance proved challenging.

Accomplishments that we're proud of

We chose a nuanced problem space with real-world applications. Combining classical computer science theory in the k-server approach with a more modern LSTM architecture gave us a novel solution to a largely unexplored area of sustainability. Although the process was frustrating at times, the outcome was fulfilling because we were able to validate many of our hypotheses.

What we learned

We learned how to mentally overcome development hurdles. We learned how to best utilize the skillsets of our team members, allowing for an efficient division of labor that expedited the development process. We also gained a deeper understanding of the power grid and the field of sustainability, which is often overlooked in CS.

What's next for Dynamic Load Balancing for Energy-Efficient Cloud Computing

We hope the data we generate can facilitate Power Purchase Agreement (PPA) negotiations between power generators and data centers, as data centers and generators face stricter energy quotas. This could reduce overall energy wastage and benefit both parties economically.