Last month I won the Greed Tournament, a competition held by CodeCombat. In my previous post I talked about the experience of developing my coin gathering strategy. This time I plan to go through my solution in detail, breaking down the code piece by piece and examining the reasoning behind it.
There are two main components to any Greed strategy, namely:
- Gold Collection
- How to move your collectors around the map to gather gold efficiently.
- Unit Selection
- Choosing when to build new units and which units to build.
As I’ve mentioned before, my gold collection algorithm is based on gravitational forces that pull collectors around, allowing them to move towards clusters of coins with the highest overall payoff while also distributing them evenly around territory. My unit selection algorithm is also reasonably interesting — it uses state machine logic loosely based on the Subsumption architecture to manage a number of interacting behaviours and uses linear regression to predict enemy gold collection performance.
This post will focus on the first half of my code, which is mainly the gold collection algorithm. I’ll talk about unit selection in a future post. You can see the full solution in this Github Gist posted by CodeCombat.
Update: If you’re interested in the unit selection side, you can now read more about this in the next part.
The coin gathering algorithm is the most important part of the code and received the most focus over the course of the competition. Games are primarily decided based on who collects the most coins — after all, more coins means a bigger army — so every gain here translates directly to more victories in the field. Consequently, the coin gathering logic is also where the most statements should be spent, making it the focus of optimizations in order to keep within Greed’s statement limit.
My approach uses gravitational forces to guide collectors towards high-value coin clusters. Collectors are attracted to coins, but repelled from each other and from the edges of the map. Each turn, we take the sum of all these forces on each collector and produce a vector indicating the overall force. This vector indicates the direction in which the collector should move this turn.
This has two key benefits:
- Collectors are able to choose clusters of lower value coins over a single high-value target when the value of the cluster exceeds that of the single target.
- Collectors will naturally spread over the map, evenly covering territory. They will also adjust to cover areas left open by other collectors as they move to collect coins.
Additionally, in my solution, coins that are a similar distance away from the current allied collector and the closest enemy collector have an increased attraction strength. This encourages collectors to go after coins that have a higher risk of being taken soon, depriving the enemy of value as well as adding to our own.
Though I came up with this algorithm independently, this kind of approach has apparently been invented before! In the field of swarm intelligence, it appears that this is known as a Gravitational Search Algorithm. This was pretty cool to learn, since it had never really occurred to me that coin collection in Greed was essentially a swarm intelligence problem.
Coins and Collectors
The main logic for this is found in
(it’s called computeSpringAssignments because “springs” was the word that inspired it,
and that name stuck).
Here’s the body of the main loop,
which computes the overall force vector for an individual collector:
The first for loop iterates over all coins
and applies their influence to the
which acts as an accumulator.
strength of the attraction is given by the coin’s value in gold
multiplied by a scaling factor based on enemy distance.
This is all divided by distance squared,
reducing the strength the further away we are similar to gravity
or an electric field.
Finally, we use this strength value to scale a unit-length vector
pointing in the direction of the coin,
which becomes the final influence added to our
The second for loop does a similar thing, but over allied collectors.
This time, the influence vector is computed by
and then added to the overall force.
However, collectors are meant to be repelled from their friends,
FRIEND_ATTRACTION is negative, resulting in a “negative attraction”.
I won’t show the
computeAttraction function here,
but you can
see it in the gist.
Though these loops perform very similar computations,
the first loop looks quite a bit bigger than the second,
which lets the
computeAttraction function do most of the work.
The difference is the introduction of the coin strength multiplier
in the first loop, computed by
Originally, the first loop also used the
to compute the vector for each coin.
However, since I introduced that multiplier, which is also based on distance,
the code ended up computing the collector’s distance from the coin twice.
I decided to inline the usage of
computeAttraction in the first loop
at the last stages of the competition
so that I could reuse the calculation,
cutting down on some extra statements.
At this point we’ve considered the influence of coins and also that of allied collectors. You might wonder whether it would also be worthwhile to attach some sort of influence to the enemy collectors as well. In earlier versions of the algorithm I experimented extensively with this, trying both attractive and repulsive forces on enemy collectors. However, attractive forces led my collectors to chase hopeless coins, wasting time, and the repulsive forces scared them away, allowing the enemy to control the map. I couldn’t find a way to make this work, but I found a way to add in an enemy influence indirectly via the enemy multiplier, which I’ll talk about next.
The Enemy Multiplier
I’ve talked a bit about the
which is used as part of the coin influence calculation,
but so far I haven’t really talked about how it works.
As I mentioned, its purpose is to scale a coin’s value
based on how close it is to an enemy collector compared to our collector,
encouraging our collectors to nab coins before the enemy.
Here is the code:
In the extreme case, in which we are right next to the coin, our distance will be zero, and the enemy distance will be equal to the total distance (ours + enemy’s), so our scaling factor will be 1.0. As the ratio between these distances changes, so does the value, rising to 2.0 in the case where both collectors are an equal distance away. This means that the attraction to this coin will be twice as powerful as a coin that we are sure to get.
As the ratio turns in favour of the enemy, the output value increases further still, but we need to have a cutoff to prevent chasing coins that are hopelessly within enemy territory. My testing showed that it was profitable to go a bit beyond half way and chase coins that were a little bit closer to the enemy, so I determined the optimal cutoff value to be 2.222…. Beyond this, the multiplier is cut to 0.5, discouraging our collectors from chasing the coin.
By the way, if you’re wondering how I came about an odd recurring value such as 0.222…, it’s because of another optimisation! When I originally wrote this function, it looked like this:
In the final hours of the competition I noticed that,
if I flipped the entire function,
I could get rid of the extra division in the
Since this function is called inside a double-nested loop
(for every collector, for every coin)
getting rid of this one division saved quite a few statements,
which helped immensely in the closing hours of the competition.
Now that I’ve explained how the multiplier works,
I’ll quickly explain how I find the distance of the closest enemy.
In the code snippet at the start of this section
you may have spotted the reference to
This is a dictionary that stores, for each coin,
the distance to the closest enemy collector.
Here’s the function that creates it:
Nothing too exciting here. The code simply iterates over each coin, and for each coin iterates over each enemy collector, scanning for the one with the lowest distance. Once the inner loop is finished, the distance gets stored in the dictionary against the current coin.
Once we’ve computed the influence of all coins and all allied collectors, the next thing to do is add the wall influences:
Here we simply treat the walls as masses just outside the arena
that are always aligned with the collector on the appropriate axis.
Then we can use
computeAttraction to compute the influences.
WALL_ATTRACTION is negative, indicating a repulsive force.
It is likely that this section can be optimized somewhat
to reduce the number of statements,
since each wall only needs to be considered in one dimension.
I didn’t bother for the competition,
as I eventually obtained enough headroom
from optimising elsewhere that I could work on something else.
The Final Direction
Finally, we take our final
force vector and use it to to compute
the collector’s new target position:
I wanted my collectors to always move their full distance each turn,
to avoid wasting time standing still.
To make sure this always happens, my code ignores the length of the force vector
and uses its direction only, setting a target that is always out of reach.
To accomplish this, first we normalize the force vector,
then scale it by
When the next turn comes, our algorithm will compute a new target
which is further away again,
ensuring that the collector is always kept moving.
I also found that sometimes the algorithm can lead collectors
off the edges of the map.
This might happen because other allied collectors are very near,
overpowering the repulsive force of the walls.
In Greed, this is very bad, since a trip over the edge
sends the collector plummeting to its doom, never to be recovered!
Math.max calls are a quick fix for this,
constraining the target position
to within the map boundaries,
preventing any potential cliff-diving.
If you’re familiar with how to issue orders in Greed then you’ll notice that at no point has my code actually issued any. Instead, the data gets put into a list of collector/position pairs, to be acted on by the caller. This is a holdover from my previous greedy strategy, where this separation allowed me to run the same code from my opponent’s perspective and use the resulting data when computing orders for my own collectors. Though my code no longer does this, it is nice to keep a separation of concerns whenever possible.
Thoughts on Falloffs
The distance squared falloff was originally devised out of instinct from my understanding of gravity in the real world. The reason that the force fades according to distance squared can be understood intuitively by imagining a sphere expanding outwards from a single point. If you observe a rectangular patch on the surface of this sphere, it expands in both width and height as the sphere gets bigger. The width and height of this patch both grow proportionally to distance. Therefore, the area of this patch (width*height) grows proportionally to distance*distance, i.e. distance squared. Since the force from the point must project outwards covering the whole sphere, it fades in the same way.
However, you might notice that this result depends on having a 3D environment, where as the Greed map is a two-dimensional plane. In two dimensions, the sphere would instead by a circle, and the rectangular patch would instead by a line segment, which would only grow in one dimension as the circle expands! So you might think that a more appropriate falloff would be a linear one!
I noticed this during the competition, so I experimented with linear, cubic and even quartic powers. My findings were that a squared falloff gave slightly better performance than the rest, but I don’t have a solid explanation as to why. The linear falloff seemed to have some trouble deciding between clusters, taking a less efficient route, while the higher power falloffs tended to focus too much on single coins. Squared seemed to strike a nice balance between the two. I never tried fractional powers, but that could be worth investigating in case it leads to a finer balance.
You might also think that the walls, being planes in the space, would work better with a linear falloff, since the repulsion is emanating from the entire surface of the wall, rather than a single point. I also tried experimenting with this, however I could not find a configuration that performed better than the one I used for the competition.
Now you’ve seen how my coin collection algorithm works and I’ve explained the reasoning behind some of the important decisions that I made while working on the competition. We’ve covered the basics of the algorithm itself, in which collectors are attracted towards coins and away from each other and walls. We’ve also discussed the enemy multiplier, which gives a boost to coins that are reachable but threatened by the enemy. Finally, I’ve talked about my experiments, particularly with different kinds of falloff, that I ran in order to fine-tune the solution.
As of the time of writing, my solution appears to still be doing quite well on the ladder, keeping me firmly in the top 10. A number of new players have shown up in the top 10 since the end of the competition, several of whom appear to have submitted variations of my strategy (it is quite possible to simply copy-and-paste the code from the gist and use my solution exactly), which I suppose I am flattered by. :)
In the next part, I’ll talk about my unit selection strategy, which uses a state machine based on the Subsumption architecture to decide when to attack, when to defend and when to build collectors, and uses a regression model to estimate the enemy’s future strength.