For the past few years, I have been really into the auto-battler game Teamfight Tactics. Here is a shortened summary:

Teamfight Tactics is a turn-based auto battler. An auto battler, also known as auto chess, is a subgenre of strategy video games with chess-like elements: players place characters on a grid-shaped battlefield during a preparation phase, and those characters then fight the opposing team's characters without any further direct input from the player. The aim of the game is to be the last one left alive out of 8 players, with each player starting at 100 health points. The game is split into multiple stages of 5 rounds each, and each round you battle a single opponent.

The board is similar to a chess board, and it's where you build your ‘team’. The number of characters you can field at once corresponds to your level, starting at 3 and ending at 10. Each character has up to 3 traits that can synergize with other characters in the game. If a trait appears on a certain number of characters, it is activated, adding a combat bonus to the team. The objective is to build a team of characters that work well with each other, and trait synergies are typically the main contributor to this. A successful team will utilize multiple traits that work best with one another to counter the opposition.

There’s a lot left out of this summary as the game has many more dynamics, an incredibly high skill ceiling, and undergoes frequent changes, but these traits will be the main focus of this project. Specifically, the game occasionally offers a side objective: activate 8 traits at any point for significant bonuses. While this is incredibly rewarding in terms of winning the game, it's also genuinely challenging, especially considering the team size constraints. Given any level, there is a finite (and sometimes small) number of team compositions that meet this requirement. Essentially, this is an algorithmically solvable problem.

For this reason, for as long as I’ve played TFT, this challenge has captivated me even beyond the game. In fact, if I just wanted these team compositions as ‘cheat codes’ to help me win more games, I could easily find the data online. Tactics.tools, for example, is a prominent data site with a tool that provides these comps with a built-in filtering system for easy lookup. Instead, I wanted to tackle this myself for the purposes of exploring graph theory, algorithm & data analysis, and search strategy optimization. The primary objective would still be to find all team compositions that reach the minimum of 8 activated traits, but this simply serves as a practical application and completion benchmark for the core utility: an optimized search algorithm to find all subgraphs of size n with a specified number of features.

Note: TFT undergoes seasonal ‘set’ changes in which the entire unit and trait pools are changed. At the time of this project, TFT is on set 14 - Cyber City. These are the units and traits used for this project. 

Admittedly, I did initially consider whether itertools' combinations would be the best method of generating team comps, but the math didn't look viable at all. If I were to generate all possible combinations, a team size of 8 would result in over 2 billion combinations to validate. I also ambitiously decided my completion point would be to calculate at a team size of 9, which nets over 14 billion combinations. That wouldn't be workable, especially considering that the problem is almost an edge maximization task; trait activation minimums vary, so not every edge contributes to a full trait, but for the most part, more edges mean more trait activations. By using connectivity to generate the compositions instead of all combinations, I could narrow down the search pool by a significant amount. To be frank though, I also just wouldn't have been satisfied using combinations alone without exposing myself to any graph theory. For all these reasons, I continued with implementing a data structure representing the TFT set 14 unit pool as a graph.
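For scale, the arithmetic is easy to check with Python's math.comb; a pool of 60 units (an assumption that matches the figures above) gives:

```python
from math import comb

POOL_SIZE = 60  # assumed size of the set 14 unit pool

print(comb(POOL_SIZE, 8))  # 2,558,620,845  -> "over 2 billion"
print(comb(POOL_SIZE, 9))  # 14,783,142,660 -> "over 14 billion"
```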

To start, I coded generic Unit and Trait classes in Python and manually entered the traits and units from the current pool. I also wrote an add_emblem method (which adds a trait to a unit) and a method to initialize the graph by looping through each trait of each unit and saving references to its neighbors. From here, I added a validate method that returns the number of active traits for any team comp passed as input. The Trait class (specifically its list of activation tiers) was integral to this, as traits have varying requirements for activation - some require 3 units, like ‘Street Demon’ or ‘Anima Squad’, whereas ‘Divinicorp’ only requires 1. Once this was implemented, the graph was set up and ready to traverse.
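As a rough sketch of what this looks like (names and structure are illustrative, not the actual implementation):

```python
from collections import Counter

class Trait:
    def __init__(self, name, tiers):
        self.name = name
        self.tiers = tiers  # unit counts that activate the trait, e.g. [3, 5, 7]

    def is_active(self, count):
        return count >= min(self.tiers)

class Unit:
    def __init__(self, name, cost, traits):
        self.name = name
        self.cost = cost
        self.traits = list(traits)
        self.neighbors = set()  # units sharing at least one trait

    def add_emblem(self, trait):
        self.traits.append(trait)

def build_graph(units):
    # link any two units that share a trait
    # (assumes units holding the same trait share the same Trait instance)
    for a in units:
        for b in units:
            if a is not b and set(a.traits) & set(b.traits):
                a.neighbors.add(b)

def validate(comp):
    # number of traits that reach their minimum activation tier
    counts = Counter(t for unit in comp for t in unit.traits)
    return sum(1 for trait, n in counts.items() if trait.is_active(n))
```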

Visualization is provided with each unit’s origin (primary trait) used for coloring. 

My first attempts at graph traversal used DFS and BFS, with every single unit serving as an individual seed. I implemented both and adjusted the algorithms so that each possible path is saved as a comp. At this point, it was immediately apparent that this wouldn't be viable, as the program was already struggling at comp sizes of up to 4. I also realized that a big flaw of this method was that many of the output comps were duplicates in a varying order. For example, one possible path runs from Morgana (sky blue, bottom left) to Renekton, but another runs from Renekton to Morgana. As the depth increases, more of these permutations clutter the validation search pool.
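The path-based version looked roughly like this (a sketch, assuming the Unit class from earlier); every permutation of the same units is saved as a distinct comp:

```python
def dfs_paths(seed, size):
    comps = []

    def walk(path):
        if len(path) == size:
            comps.append(tuple(path))  # order is preserved, so permutations pile up
            return
        for nb in path[-1].neighbors:  # only extend from the end of the path
            if nb not in path:
                walk(path + [nb])

    walk([seed])
    return comps

# (Morgana, Renekton) and (Renekton, Morgana) both appear
# when every unit is used as a seed.
```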

To get around this, I considered incrementally removing units from being identified as neighbors. The idea was that all comps including a given unit should already be yielded from using that unit as a seed. For example, if Morgana will refer to Renekton, then Renekton shouldn't need to refer to Morgana. However, this wouldn't work because, in actuality, using a unit as a seed only yields the comps with that unit at the head. Using Morgana as a seed will never identify [Miss Fortune, Morgana, Renekton] as a valid path, and removing Morgana from the neighbor pool for Miss Fortune and Renekton would prevent this comp from being found at all. In trying to grapple with this complication, I realized that my mindset thus far was even more fundamentally flawed - I had incorrectly been operating under the underlying assumption that all valid comps would be connected in a linear path. As a result, I was looking for ways to find paths rather than subgraphs.

To correct this, I shifted to a recursive method that expands from all units of the passed input. By doing it this way, the resulting outputs are no longer constrained to linear paths and can instead fork from any node. The pool of valid neighbors grows considerably as a result, but this also allows the removal of permutations before each additional iteration. Additionally, the previously mentioned method of removing units is now viable. For example, if each comp is sorted by unit cost and name - and the whole set of comps is sorted as well - then once the iterative method sees the first unit of each comp change, the previously first unit can be removed from the neighbor pool. I also considered adding a lexicographic hierarchy, but unfortunately this wouldn't work as the graph isn't fully connected. With these changes, the algorithm calculates a team size of 4 in 4 seconds. For 5, it takes almost 300 seconds - still very far from the goal of 9-unit comps.
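The core of the idea, sketched with frozensets standing in for sorted comps (the actual implementation sorts by cost and name, but any order-insensitive representation collapses the permutations):

```python
def grow_comps(units, size):
    comps = {frozenset([u]) for u in units}  # every unit is a seed
    for _ in range(size - 1):
        grown = set()
        for comp in comps:
            for unit in comp:              # fork from any node, not just a path's end
                for nb in unit.neighbors:
                    if nb not in comp:
                        grown.add(comp | {nb})  # identical sets dedupe automatically
        comps = grown
    return comps
```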

I also added a parameter to force units into the output comps. On the surface, this works like a filtering system. Internally though, all output comps result from iterating upon this list rather than all units, which cuts runtime down significantly. The depth of the search is also reduced to the team size minus the length of the forced-unit list. As a result, much larger teams can be computed, but only when filtering significantly - which doesn't solve the problem.
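In terms of the sketch above, this amounts to swapping the seed set (a hypothetical variant, not the actual code):

```python
def grow_from(forced, size):
    comps = {frozenset(forced)}          # seed with the forced units
    for _ in range(size - len(forced)):  # reduced search depth
        comps = {comp | {nb}
                 for comp in comps
                 for unit in comp
                 for nb in unit.neighbors
                 if nb not in comp}
    return comps
```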

At this point, I was out of optimization ideas and decided to compare my current progress against itertools' combinations as a benchmark. I added a simple implementation and was shocked to see that the combinations method was able to run in just a second, even for a team size of 9. Upon looking through the documentation, I discovered that this is because combinations generates an iterable rather than holding the entire output in memory. It is also written in highly optimized C, so my algorithm would likely never be as fast at generating the entire list of possible comps. However, as previously mentioned, this comes at the cost of an immensely large pool of combinations to be validated. Essentially, the runtime efficiency that itertools provided in this context was simply offloaded to the validation process.
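The benchmark is effectively this (reusing the validate sketch from earlier); generation is lazy, so all of the real cost lands in the validation loop:

```python
from itertools import combinations

def brute_force(units, size, min_traits):
    # combinations() yields comps one at a time from C code;
    # nothing is materialized until validation touches it
    for comp in combinations(units, size):
        if validate(comp) >= min_traits:
            yield comp
```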

This brought me to my next idea: if graph traversal is too costly in pool generation runtime and combinations are too costly in validation runtime, perhaps I could find combinations of smaller branches.

The new goal was a hybrid method with an input corresponding to varying splits of the desired team size. For example, a team size of 3 could be generated using either [1, 1, 1], [2, 1], or [3]. Passing [1, 1, 1] would perform a regular combination of 3 units, whereas [2, 1] would combine all branches of length 2 with all singular units. [3], on the other hand, would just be raw graph traversal. This way, I could test how different splits perform against regular graph traversal and combination.

I began by adjusting the previously built graph traversal method to no longer be recursive, as I wanted each intermediate pool of branch sizes to be kept in memory. First, a list of all the singular units is saved as seeds. If any number greater than 1 is in the split, the seeds are passed through the method to generate and save all branches of length 2. This same process occurs up to twice more to generate branches of sizes 3 and 4, depending on the split. After all the necessary branches are generated, combinations of the branch pools are taken according to the number of times each branch length appears in the split. If the split is [3, 2, 2, 1], for example, the branches of length 3 and the seeds are each used in combinations of 1, returning the original inputs, while the branches of length 2 are used in combinations of length 2. The list of branches of length 4 is empty anyway, but its combination is still performed with 0 selections. All of these combinations are then passed through itertools' product method, but only if multiple branch lengths appear in the split. If this is not the case, then the combinations of those specific branches are returned as the final result. This prevents product from being called when the split is just a single combination, significantly cutting runtime.
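A sketch of the assembly step, assuming `branches` maps each length to its pool of frozenset branches (skipping lengths absent from the split is equivalent to combining them with 0 selections):

```python
from collections import Counter
from itertools import combinations, product

def hybrid_comps(branches, split):
    # branches: {length: [frozenset of units, ...]}; split: e.g. [3, 2, 2, 1]
    size = sum(split)
    counts = Counter(split)  # how many branches of each length to pick
    pools = [combinations(branches[length], k) for length, k in counts.items()]
    if len(pools) == 1:
        picks = ((group,) for group in pools[0])  # single length: skip product()
    else:
        picks = product(*pools)
    for groups in picks:
        comp = frozenset().union(*(b for group in groups for b in group))
        if len(comp) == size:  # discard picks whose branches share a unit
            yield comp
```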

Essentially, we’re trying to perform a combinatorial optimization of the equation below. The constants used in the combination terms represent the pool sizes of all branches of lengths 1, 2, 3, and 4 respectively.
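Written out (a reconstruction from the description), with $n_\ell$ the pool size for branches of length $\ell$ and $k_\ell$ the number of times $\ell$ appears in the split, the candidate pool for a split is:

$$\binom{n_1}{k_1}\binom{n_2}{k_2}\binom{n_3}{k_3}\binom{n_4}{k_4}$$

Lengths absent from the split contribute $\binom{n_\ell}{0} = 1$.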

For testing, I wrote a separate script that generates all possible splits of every number from 1-9 with a maximum branch length of 4. I then manually passed these through the hybrid method, saving the runtime stats to a CSV. Only the splits of team sizes up to 5 could be validated in a reasonable time, so these runtimes are graphed alongside the total pool sizes.
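The split generation is a standard integer partition with a capped part size; something along these lines:

```python
def splits(total, max_part=4):
    # partitions of `total` into parts <= max_part, parts non-increasing
    # e.g. splits(3) -> [3], [2, 1], [1, 1, 1]
    if total == 0:
        yield []
        return
    for part in range(min(total, max_part), 0, -1):
        for rest in splits(total - part, part):
            yield [part] + rest

all_splits = {n: list(splits(n)) for n in range(1, 10)}
```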

I also plotted the relationship between these pool sizes and the respective validation times. Luckily, this showed a clear linear relationship, which means I can mathematically predict the validation runtimes of larger splits using the combinations equation in conjunction with the branch pools. Keep in mind that a split of only 1's is equivalent to just using the itertools combinations method alone.
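The prediction itself is just the pool-size formula fed through the fitted line (slope and intercept here are placeholders for the fitted values):

```python
from collections import Counter
from math import comb

def predicted_validation_time(split, branch_counts, slope, intercept):
    # branch_counts: {length: pool size n_l}, per the equation above
    pool = 1
    for length, k in Counter(split).items():
        pool *= comb(branch_counts[length], k)
    return slope * pool + intercept
```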

In visualizing these comparisons, it's apparent that most of the splits were actually more costly than just using a sequence of 1's. In fact, only one split per comp size resulted in a smaller pool of possibilities than the combinations alone. However, before I could consider the runtime optimizations successful for the better performing splits, some issues showed up when exporting this data.

First, when validating the splits that I could, I used a minimum trait count equal to the team size. In doing so, I noticed that the splits [1, 1, 1, 1], [2, 1, 1], and [3, 1] all resulted in 378 validated comps with at least 4 traits active, while the [4] split only validated 355. I sifted through the missing comps and realized that this is because the trait Divinicorp only requires a single unit to be activated. As a result, an edge is not necessary to activate it, and searching only across edge-linked subgraphs was preventing a number of valid comps from being identified. Aside from performance issues, this was another fundamental reason graph traversal alone wouldn't work for this project. To get around this, my first idea was to only consider splits that include a 1 in them. This way, a Divinicorp unit (or, in theory, any unit with a trait requiring only 1 unit) could be included without having to be edge-linked.

Additionally, I later noticed something similar happening with the splits for team size 5. Most of them validated 1324 comps with at least 5 traits active, aside from [4, 1], which only found 1239. Upon going through these missing comps, I found that a similar issue was occurring with triplets. For example, [Shyvana, Veigar, Poppy] is a valid branch that activates 3 traits by itself. This, when paired with disconnected branches of length 2, can result in many valid comps that aren't found in the [4, 1] split, as it requires a connected branch of 4. One possible solution would be to simply treat this like the previous issue: disregard splits that don't allow for subgraphs of sizes 1, 2, and 3 - granted that the team size is large enough. Even [2, 1, 1], for example, allows the 2 and a 1 to merge into a branch of length 3. However, as it stands, this solution removes so many splits that I'm effectively removing all optimization options for medium team sizes.

To continue, I decided the way forward was to try and reduce the constants used in the previous equation by pruning branches based on viability. If I could remove branches that fundamentally can’t produce enough traits regardless of what they’re combined with, then I could change the form of the equation and reduce the time cost for splits that currently aren’t performing well. 

This is where I’m currently at on this project. You can follow along here or access the code on my GitHub.
