Change Log
- Added 11 new dimensions to the player vector to aid LLM-Team's analysis:
- AFFINITY_ABYSS
- AFFINITY_ASCENT
- AFFINITY_BIND
- AFFINITY_BREEZE
- AFFINITY_FRACTURE
- AFFINITY_HAVEN
- AFFINITY_ICEBOX
- AFFINITY_LOTUS
- AFFINITY_PEARL
- AFFINITY_SPLIT
- AFFINITY_SUNSET
- Added substitution generation capability
- Fixed a bug where the last team in the sampled list was always selected over LLM-T's selection
- Tuned prompts to minimize issues with LLM-Team periodically forgetting it has the team's statistics
- Optimized prompts to be more throttle friendly
- UI Updates
- Made loading animation more topical
- Added icons to team memory
- Markdown Support for chat interface
- New chat icons
- Updated write ups to include new changes
- Updated known issues (see full write up)
- Updated challenges (see below)
Project Overview
This is a TL;DR of a TL;DR of the full write-up, which can be found in the links at the end of this article.
This project explores the application of Large Language Models (LLMs) combined with Gaussian Mixture Modeling (GMM) and Principal Component Analysis (PCA) to generate optimal team compositions for the Valorant Championship Tour (VCT). The approach leverages LLMs for analyzing player data, performance metrics, and strategic synergies, while GMM and PCA are employed to reduce dimensionality and identify key factors that influence player compatibility and team dynamics. This integrated method offers recommendations tailored to competitive play, balancing individual skills with team cohesion. Results indicate that the LLM and PCA-based system enhances decision-making, offering data-driven insights that streamline the process of forming high-performance VCT teams. This project highlights the potential for AI-assisted tools in the esports industry, aiming to improve team selection efficiency and performance through advanced data analysis.
Current methods of team formation rely heavily on the experience and intuition of managers, who assess players' skills, roles, and synergy through a time-consuming process. While statistical analysis has long been a part of traditional sports, the esports scene seems to be a bit behind in this regard, likely due to the large cost overhead of maintaining scouting teams. The objective here is to find a way to combine well-studied classical methods with the natural language capabilities of an LLM to make stats-driven team building more accessible and cost-effective for the Valorant community.
Considering that no one on our team has ever worked with AWS before or has experience building LLM/transformer applications, and that we are a very late entry into the competition, we have one main constraint to design around: we need to build this product in 2 weeks. We settled on an implementation that would blend well-understood unsupervised learning methods with the cutting-edge LLM toolbox in AWS Bedrock. This serves to lower the amount of time spent tagging data and wrestling with new, complex concepts, as well as to buy some time to learn the AWS tango and create a quality final product.
Ultimately this project can be divided into two stages: data mining and application creation. The idea is to use GMM and PCA to isolate the best players at each role from each league and region, generate probability distributions on a per-role, per-league, per-region basis, and sample out of these distributions to generate teams. At this point we can pass the teams to an LLM for context-aware filtering and analysis until we end up with the "best" team. Then it is a matter of presenting an LLM session to the user with a natural language breakdown of this team, consistent with the challenge requirements.
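To make the flow concrete, here is a minimal sketch of the two-stage pipeline described above. Every function body is a stand-in stub and the role labels are assumptions for illustration; the real sampling and LLM steps are described in the following sections.

```python
# Hypothetical sketch of the overall flow: sample candidate teams from the
# per-role/per-league PDFs, let LLM-T pick the best one, let LLM-Team explain it.
import random

ROLES = ["duelist", "initiator", "controller", "sentinel", "flex"]  # assumed role split

def sample_player(role, leagues):            # stub: draw from the per-role/per-league PDF
    return f"{role}-from-{random.choice(leagues)}"

def llm_t_select(candidates):                # stub: LLM-T returns the best team + substitutes
    return candidates[0], {}

def llm_team_analyze(team, subs):            # stub: LLM-Team writes the user-facing breakdown
    return f"Analysis of {team} with subs {subs}"

def build_team(leagues, n_samples=50):
    candidates = [{r: sample_player(r, leagues) for r in ROLES} for _ in range(n_samples)]
    best, subs = llm_t_select(candidates)
    return llm_team_analyze(best, subs)

print(build_team(["VCT Americas", "VCT EMEA"]))
```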
We start our data mining with the VCT International dataset, since it samples the best players in Valorant and is also the dataset we are most familiar with, which helps us sanity check results. As mentioned before, our data mining follows a very typical classification workflow, with clustering being a natural place to start. Since we have no idea what our data's shape is, we decided to do probabilistic clustering rather than distance-based clustering. Interestingly, in our tests we found Bayesian GMM (BGMM) to generate more meaningful clusters than traditional GMM, even when limited to the same number of clustering groups, so we went with a BGMM model.
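A rough sketch of the clustering step is below, assuming the per-player stats have already been assembled into a numeric players-by-features matrix (the file name and cluster cap here are placeholders, not our actual configuration):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.mixture import BayesianGaussianMixture, GaussianMixture

# Hypothetical preprocessed player matrix (one row per player).
X = np.load("player_vectors.npy")
X = StandardScaler().fit_transform(X)

# BGMM can prune unnecessary components on its own, which is part of why it
# gave more meaningful clusters than a plain GMM with the same component cap.
bgmm = BayesianGaussianMixture(n_components=10, covariance_type="full",
                               weight_concentration_prior=1e-2,
                               max_iter=500, random_state=0)
bgmm_labels = bgmm.fit_predict(X)

gmm = GaussianMixture(n_components=10, covariance_type="full",
                      max_iter=500, random_state=0)
gmm_labels = gmm.fit_predict(X)

print("BGMM effective clusters:", np.unique(bgmm_labels).size)
print("GMM clusters:", np.unique(gmm_labels).size)
```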
When looking at known players and their clusters, it quickly becomes apparent that the clusters by themselves are not good enough to identify roles, and certainly not good enough to rank something as ethereal as performance. This is where the PCA comes in. By studying patterns in player spectral decompositions we can gain additional insight into the behavioral patterns of roles while remaining agnostic to personal bias. We can then use the patterns we find, along with some intuition about how VCT Champions players rank, to define a generalized ranking algorithm that can be applied to the results of the PCA filtering. Once the ranking and player pools for each role and region are decided, all that remains is another quick normalization to obtain probability distribution functions (PDFs) for each role in each region. Looking at the singular values, we determined that 90% of the variance in our player matrix P is captured by the first 30 principal components (PC1-PC30), so we limit our analysis to those 30 when creating the filters and rankings.
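A minimal sketch of how such a component cutoff can be chosen, assuming `X` is the standardized player matrix from the clustering step:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X = StandardScaler().fit_transform(np.load("player_vectors.npy"))  # hypothetical data

# Keep the smallest number of principal components whose cumulative
# explained variance reaches 90% (about 30 in our case).
pca = PCA().fit(X)
cumvar = np.cumsum(pca.explained_variance_ratio_)
n_keep = int(np.searchsorted(cumvar, 0.90)) + 1

# Project players onto PC1..PC_n_keep; these scores feed the filters and rankings.
scores = pca.transform(X)[:, :n_keep]
print(f"{n_keep} components capture {cumvar[n_keep - 1]:.1%} of the variance")
```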
Now is where the fun stuff happens. When user input comes in requesting a new team, we intercept that message and sample 50 teams, each comprising one player per role, drawn from regions according to the user's request. Then we pass those teams, along with their respective player vectors, into a prompt-engineered LLM (LLM-T) and have it return the best team as well as a list of the best substitutes for each role. We then pass this team and its subs to another prompt-engineered LLM (LLM-Team) to create the final analysis for the end user and a chat session where the user can ask additional questions about the team. This method relies heavily on the LLM base models having good contextual awareness of what Valorant is, what a Valorant game involves, and what the components of our vectors mean, so we pick Claude Sonnet as the basis for LLM-T and LLM-Team to leverage its size and intelligence.

To make the user experience more seamless, we need to classify every user input to see if it is requesting a new team. We can once again turn to the power of an LLM (LLM-A) rather than training up a new classification network. If we detect a new team request we reset the LLM-Team context and generate a new team; otherwise we pass the user input to LLM-Team to give the illusion of a direct chat session. After checking whether the prompt is requesting a new team, we also prompt LLM-A to see whether the request is asking for a team spanning 3 or more different regions. If it is, we make sure to select a diverse group of players; otherwise we try to stick to the VCT import rule of only one imported player. Both of these classifications need to be fast, so we use Claude Haiku to save on cost and minimize latency. Finally, we need to extract which leagues the user would like us to draw from. Since this is also a simple natural language analysis job, we reuse LLM-A with a different primer to return an optimal role-league mapping.
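As a sketch of what one of the LLM-A classification calls could look like against the Bedrock runtime (the system prompt is a stand-in, not our production primer, and the model ID is one of the Claude Haiku IDs available on Bedrock):

```python
import json
import boto3

bedrock = boto3.client("bedrock-runtime")

def is_new_team_request(user_message: str) -> bool:
    """Hypothetical LLM-A check: does this message ask for a new team to be built?"""
    body = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 5,
        "system": "Answer only YES or NO: is the user asking for a new team to be built?",
        "messages": [{"role": "user", "content": user_message}],
    }
    resp = bedrock.invoke_model(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",
        body=json.dumps(body),
    )
    answer = json.loads(resp["body"].read())["content"][0]["text"]
    return answer.strip().upper().startswith("YES")
```

The same pattern, with a different primer, covers the region-diversity check and the role-league mapping extraction.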
Challenges
The biggest hurdle for us was time. We only found out about this hackathon 2 weeks before the due date, and between the two of us, full-time jobs, and family duties, we did not have much bandwidth to work on this. Many compromises had to be made in the interest of time, but we are still proud of the end result.
The other very frustrating thing was learning how to use AWS and its many, many products. We found a lot of the documentation online to be quite obtuse and elusive. Learning how to process data at that scale was an extremely steep learning curve and consumed about a week of our time, leaving very little time to work on the actual application.
We also found out the hard way how fast AWS costs can balloon if you don't know how to use a service correctly. We accidentally blew $300 on Glue ironing out bugs in a notebook before we found out that we were being charged for the duration the notebook was attached to a Glue instance rather than just the time we were running code.
As with all real-life data-based projects, data sanitation and preprocessing was the name of the game, and also the hardest, most frustrating part.
Challenges Pt2 (electric boogaloo)
For the second round, things went a lot smoother. Our multi-LLM architecture handled the new prompt requirements surprisingly well. The small context window footprint and pass-through structure were able to seamlessly meld Claude Sonnet's expansive knowledge base with our statistics to answer a whole variety of follow-up questions. The only two deficiencies we saw were its inability to create substitutions and too much speculation in its map analysis. An extra LLM request to create a substitution pool before building the user session fixed the former, and by some stroke of luck, during data sanitation we had already half-extracted a map affinity metric from the original dataset, so we just finished that processing and passed it along to the analysis LLMs to fix the latter. This time we only spent a few dollars on R&D :)
The main challenge here was the throttling issues that everyone was experiencing. Luckily our primary input's token footprint is relatively small (roughly 10,000 tokens per request), but we do need two of those per team generation. A lot of time was spent wrestling with Amazon Support to little avail. We spent some time doing character optimization on the agent instructions and prompts to get the token count down and throttle less, and then just accepted defeat on this one. With our current quota we still get throttled during team generation, meaning team creation is very slow, but once the interactive session is handed to the user the throttling issues typically go away.
What we would do if we didn't have throttling issues
Because most of the heavy lifting in our project is done by unsupervised learning and filtering, any new data we want to add mostly just amounts to extending the vector we pass to the LLM. This means that while we are being throttled, our project isn't necessarily hampered by it. That being said, the massively extended iteration cycles and the time wasted wrestling with Amazon support did prevent us from pursuing some other augmentations we had in mind:
Frequency Domain Play Style Analysis
Within the game data, there were observer events that contained the (x, y) coordinates of all of the players in the game at that moment. The thought here is that if we were to serialize this data for each player and sample it into a time series signal, we could apply some signal processing techniques and maybe get some interesting data out of it. In particular, taking a Fourier transform of the x and y signals and seeing whether the different frequency spectra of a player's movement are any indication of less tangible ideas of play style: for example, whether a player likes to jiggle peek certain areas or hard push them, or whether they prefer to lurk around or tend to be more of an executor.
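A rough sketch of the idea, assuming the observer events have been resampled onto a uniform time grid (the column meanings and sampling rate are assumptions about the data, not the actual schema):

```python
import numpy as np

def movement_spectrum(t, x, fs=4.0):
    """t: event timestamps in seconds, x: one positional coordinate, fs: resample rate in Hz."""
    grid = np.arange(t.min(), t.max(), 1.0 / fs)
    xs = np.interp(grid, t, x)                 # uniform time series of the player's position
    xs = xs - xs.mean()                        # remove the DC offset (average map position)
    spectrum = np.abs(np.fft.rfft(xs))
    freqs = np.fft.rfftfreq(xs.size, d=1.0 / fs)
    return freqs, spectrum

# Intuition: high-frequency energy would correspond to jiggle peeking and quick
# repositioning, while low-frequency drift would correspond to lurking and slow rotations.
```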
Port parquets to AWS Neptune
Currently, rather than using a database, our application loads a few parquets into memory as dataframes. This was a design decision made primarily in the interest of time, but it is not a good solution and does not scale. Further, the relationship between rounds and games is very complex, with things like tournament performance, mental, and economy greatly affecting a player's performance. This is exactly what a graph database is built for, and it would have been cool to use AWS Neptune to explore it.
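For a sense of what that could look like, here is a sketch of a Gremlin traversal against such a graph via gremlinpython; the endpoint, vertex labels, edge names, and properties are all hypothetical, not an existing schema:

```python
from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

# Placeholder Neptune endpoint; a real cluster endpoint would go here.
conn = DriverRemoteConnection("wss://<neptune-endpoint>:8182/gremlin", "g")
g = traversal().withRemote(conn)

# e.g. "all rounds a given player played, with round number and economy state"
rounds = (g.V().hasLabel("player").has("name", "SomePlayer")
           .out("played_in").hasLabel("round")
           .valueMap("round_number", "economy")
           .toList())

conn.close()
```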
What we would do if we had (a lot) more time
One very interesting idea we came up with was whether we could exploit the transformer construct to do better speculation on team performance. Given enough data, we could be much more accurate with our approximation of a player's vector and form an embedding matrix of all player rounds. The attention step would then be perfect for contextualizing a team's round-by-round effect on each other, and the multilayer perceptrons would be perfect for essentially simulating a round with these players in it. It would be cool to see if this would be able to accurately measure a theoretical team's performance.
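A very rough sketch of what such a model could look like: treat each player's round vector as a token, let self-attention contextualize teammates against each other, and regress a round outcome. The dimensions and the output head are invented for illustration, not a trained or validated design.

```python
import torch
import torch.nn as nn

class RoundSimulator(nn.Module):
    def __init__(self, d_player=30, d_model=128):
        super().__init__()
        self.embed = nn.Linear(d_player, d_model)          # project player round vectors
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, 1)                   # e.g. probability of winning the round

    def forward(self, team):
        # team: (batch, 5, d_player) -- one row per player on the hypothetical team
        ctx = self.encoder(self.embed(team))                 # attention mixes teammate context
        return torch.sigmoid(self.head(ctx.mean(dim=1)))

sim = RoundSimulator()
print(sim(torch.randn(2, 5, 30)).shape)                      # torch.Size([2, 1])
```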
Takeaways
Going into this, neither of us had any idea how the transformer architecture works, let alone how to fine-tune one, use RAG, or build an application around one. But we were able to use this hackathon as an excuse to pick that up. While we didn't end up having time to set up fine-tuning, and our attempts at making our architecture work with RAG were futile, our knowledge of the world has expanded, and that's really why we like to do these sorts of things.
It is also very nice to finally get some AWS experience under our belts, as it is something we have both been meaning to get around to but never had the opportunity until now.
Tooling
AWS Tooling Used:
- AWS Glue - Data Ingestion and Processing
- S3 Buckets - Storage for VCT JSON data and processed parquets from Glue
- AWS Bedrock - LLM API calls
We also used https://valorant-api.com/ to get data on maps and weapons
Further Reading
I would like to take this moment to encourage anyone to check out our write-up on this project in the GitHub repo. We found some very interesting results during our analysis of the data and are also pretty proud of the application architecture we landed on to present a fully interactive chat session with an ultra-small context window footprint.