Three weeks ago I entered the Greed tournament, a programming competition held by the great guys over at CodeCombat, spanning Tuesday 20th May to Tuesday 8th June. My aim was to make it into the top 50 and nab myself some free web hosting. However, once I got going, my competitive nature got the better of me and I ended up winning first place. Whoops. :)
You can read more about the strategy itself on the CodeCombat blog and find the code itself on Github. In this post I’ll talk about the journey that took me there and how I came up with the coin collection mechanism behind my winning strategy.
Update: I’ve now written a follow-up post explaining my coin collection code in more detail. Give it a read if you’d like to know more about my coin strategy and its implementation.
Update 2: There is now another post explaining how my code selects which unit to build each turn.
My initial strategy was very simple and I never expected to get very far. The example strategy simply selected the closest coin to the base. My first improvement was to change this so that the code iterated through each collector and chose the closest coin. I then improved the heuristic to weight coins according to their value over distance, allowing collectors to choose further away coins if they were more worthwhile.
The most obvious problem at this point was that collectors would often attempt to chase the same coin, wasting effort. To solve this I added a “taken” set which coins were added to once assigned. In subsequent iterations, coins already in the taken set were skipped. This would make sure I didn’t assign the same coin twice.
This was as far as I went on my first weekend. Already, with this simple strategy, I had made it to around 50th place on the ladder. It was easier than I thought! I knew, however, that there was more that could be done. I set a goal for the next weekend to try and break into the top 20.
Refining the Greedy Strategy
On Saturday 31st May I set to work. The greedy strategy I had so far made the best choice for each individual collector, but not always the best choice for the group as a whole. Depending on the order in which it iterated through the collectors, it was possible for one collector to be assigned a coin that would be better collected by someone else.
To solve this, I changed the algorithm to repeatedly scan through all coin/collector pairs, choosing the pairing with the highest utility overall each time, until either no collector or no coins are left. This way, the collectors worked together to maximise the overall utility gained by the team, rather than by each individual collector.
It was also around this time that I started using version control. I was previously writing my code directly into the CodeCombat editor, but this became somewhat unwieldy as my code grew larder and I wanted to try out more ideas. I switched my workflow so that I wrote code locally and copy-and-pasted it into the editor when it was time to test, using a local Git repository to keep history. I also began tagging each significant new version I sent to be ranked on the ladder and keeping a rough record of how each one did.
The biggest problem at this point was that collectors would often chase coins that the enemy was also chasing when the enemy was much closer. Since collectors all move at the same speed, the enemy would obviously win the race, meaning that my collectors wasted effort that could be better spent collecting other coins.
There were a number of approaches I thought about to try and resolve this. One possibility was to scan through the enemy collectors, look at each collector’s target position and see if this matches a coin. This would work well against algorithms that go directly to each coin, but not all algorithms work this way (my final algorithm is one of them). It also seemed like a lot of work which I didn’t want to do. To save development time, I decided to simply run my algorithm twice — once for the opponent, to decide which coins they were likely to target, and once for myself, ignoring any coins for which the assigned enemy collector was closer.
This worked reasonably well, but I was now bumping up against the statement limit in some games, and there were still players who were able to collect a lot more gold than me. I knew there had to be a better solution out there that didn’t require so much computation. Alas, my second weekend was over, so I got to thinking over the next week about what to do next.
So where could my greedy algorithm be improved? The first problem was that it only considered coins individually, so I would send a collector to grab a solitary gem when grabbing a tight cluster of lower value coins would have been more worthwhile. The other problem was in the end-game. When coins are scarce, it is advantageous to place collectors in positions that minimise their distance between all points on the map, in order to beat the enemy to any new coins that might appear. My greedy algorithm was not aware of this, so collectors would wander randomly and not space themselves out in any sensible fashion. Rather than try to bolt these features onto my existing algorithm, I decided to search for a general approach that might give rise to these behaviours.
I tried for a number of days to come up with an algorithm that could beat my existing greedy approach. I built my own offline coin gathering simulation and visualisation using d3 and experimented with numerous different tweaks and hacks, watching their behaviour against my existing greedy solution. I read several papers on variants of travelling salesman to see if there was anything I could use, but I found that the classical TSP solutions I learned about didn’t really translate to such a dynamic, competitive environment.
I was daydreaming at one point when the word “springs” just popped into my head. Almost immediately I had a vision of spring-like forces pulling collectors around the map towards coins. Of course, what I actually came up with, based on a distance-squared falloff, was much more intuitively like gravitational forces, but the name “springs” stuck. Anyway, I realised at once that this could solve both my problems: the attractive force of a tight group of coins would naturally draw collectors towards them in favour of lonely high-value coins and, in the end-game, having a repulsive force between collectors and walls would cause them to spread out evenly across the map. I also thought that having enemy collectors apply a repulsive force would discourage my own collectors from chasing coins too close to the enemy.
I initially implemented springs during the week in my simulation but it didn’t appear to perform any better than greedy. The curved trajectory that the collectors sometimes took towards coins also seemed wasteful compared to greedy’s straight-line paths. After some ineffectual tweaks I abandoned it and tried looking for something else. However, it really bugged me that it didn’t work, so I improved my simulation to run 100 games at a time and created a number of graphs using c3 showing turn-by-turn statistics. I eventually found that, when I removed the repulsive force from enemies, springs came out with a small but consistent lead. Once I knew I was on to something I was able to push this gain up even higher by fine-tuning the strengths of the forces. Even better, springs consumed fewer statements than greedy, so I now had a bigger budget to spend on unit selection without fear of crippling my performance.
The Final Push
I deployed springs on Saturday 7th June after a full day of work. It did very well, propelling me to first for some time before finally settling among the top five. However, the tournament was still open until Tuesday night, so I became very worried that I had played my hand too soon and that my efforts would be buried as my opponents upgraded their own approaches. After using Sunday to rest, I resumed my work on Monday and tried to search for something more I could do to give my that final edge.
With the help of my simulation, I eventually found that a modifier for coins in “enemy territory” got me a boost. I modified the worth of coins based on the difference in distance from my collectors and the closest enemy collector. Coins with an almost equal distance were made more valuable, before dropping to half value once they got close enough to the enemy that my collectors no longer stood a good chance of getting them. This got me another small but consistent boost compared with the vanilla springs algorithm — about 30 coins per game.
Wanting to give my opponents as little time as possible, I waited to unleash the final version until Tuesday evening and, after some last minute optimization, went to bed with just a couple of hours of competition time remaining. When I awoke the next morning I was on top of both ladders.
After the Competition
Needless to say I was very pleased with my success. The highest score I had observed on the ladder so far was around 6700ish, but my ranking score soared to around 7000 on the human ladder by the close of the competition. That must have put a scare into my opponents. :)
Having had a chance to read posts from other players on the CodeCombat forum about their strategies, I was humbled by how many aspects some of my opponents had considered that I had barely scratched the surface of. The top ranked Ogre player, Ian Elliott (Bellardia) produced a very complex approach that occupied twice as many lines as my own. He also used some much more aggressive optimization techniques in order to keep the statement counter down, a number of which had not really occurred to me in my own code.
Overall, I had a great time with the tournament and would like to thank CodeCombat once again for providing such a great event. I would also like to thank all my competitors for making the challenge so very interesting and really making me work for my first place. I hope to see you on the battlefield again in future to fight for yet more fame and glory!