ICFPC-2020: Gravity is a habit that is hard to shake off: _adept

_adept_

ICFPC-2020: Gravity is a habit that is hard to shake off

Jul 23, 2020 22:23

This is part three of the ICFPC-2020 write up. Part one is here.

Part two ended at T+35h when we managed to draw the first image extracted from interacting with galaxy.txt (and our team name is Just No Pathfinding, for reasons that would be explained later).

Alien message #39, "Interaction protocol", explained that every time you do "interact galaxy", you need to feed in a pair of numbers, and from reading discord we knew, that every time you got an image, you were supposed to "click" on it, and send click coordinates back, thus starting the next iteration cycle.

We made trivial UI, consisting of the single window with some debugging info splashed right across the bottom of the screen and rudimentary mouse support. We knew that interaction results in multiple images, and we are supposed to draw them on top of each other, so we chose different colors for each layer to be able to separate them visually.

After a bunch of small images were designed to test our ability to precisely click on pixels, we finally got to what should've been the "main screen" of the galaxy.txt.

This is how it looks like:

Unfortunately, organizers posted this image to their twitter some hours ago, spoiling the wonderful moment, but it was still an achievement.

We clicked on various items that display pictures of aliens (read wonderful writeup by tonsky to see a video of him exploring them)

This was an achievement that -- we sure -- would allow us to finally start scoring points and advance on the leaderboard.

Clicking on aliens sometimes led to some sort of mini-games the purpose of which was not clear. Two aliens led to the weird static pictures like this one:

Clicking on the center of the galaxy led you to some sort of mural, depicting the Galactic Tournament between civilizations, in which two ships face off and one of them advances to the next round and the other one dies:

Each stage of the tournament was clickable, displaying the "replay" of the battle between two ships, circling the weird square planet. Unfortunately, our evaluator really struggled with this part, taking 2-10 seconds per battle frame.

Still, we scored no points.

Going back to Discord, we realized that we should've paid more in the last 10 hours or so. Minigames were giving points during the lightning round (first 24h) only, so we were too late. Now points were awarded for driving the "ship" and competing against ships controlled by other teams.

To control the ship, you still had to send "alien messages", and you could do that either via your "galaxy UI", or by sending appropriate messages from any other program.

Since our galaxy interpreter was sometimes too slow, we decided to write a separate piece of code to drive the ship.

Protocol to drive the ship was only partially specified:

You were supposed to explore the "galaxy UI" to fill in the gaps in the protocol and figure out what the missing pieces are, and this is where we made our biggest blunder.

It was not a single big mistake, but rather a combination of several smaller ones.

We printed out information everything we sent to the server and everything we received but did not show it in the UI
Console logs were very verbose and noisy
We printed out "alien language" representation of the messages sent and received
We thought that we would be able to "reverse engineer" missing pieces of the protocol from interacting with the server.
Every screen of "galaxy UI" has some numeric glyphs on it. We did not have an image decoder, and we were not able to read them from memory
We were sad that lightning round is over and galaxy UI does not give you points anymore, and we wanted to be on the leaderboard.

Look at our logs. Can you easily see that newState, received from the server, is a tuple with numbers in it? Yeah, we neither.

Interact:
protocol: "galaxy"
state: ap (ap "cons" "3") (ap (ap "cons" (ap (ap "cons" "0") (ap (ap "cons" (ap (ap "cons" "0") (ap (ap "cons" "0") (ap (ap "cons" "0") (ap (ap "cons" "0") (ap (ap "cons" "0") (ap (ap "cons" "0") (ap (ap "cons" "0") (ap (ap "cons" "0") (ap (ap "cons" "0") "nil")))))))))) (ap (ap "cons" "nil") (ap (ap "cons" "0") "nil"))))) (ap (ap "cons" "0") (ap (ap "cons" "nil") "nil")))
vector: ap (ap "cons" "1") "0"
res = ap (ap "cons" "0") (ap (ap "cons" (ap (ap "cons" "3") (ap (ap "cons" (ap (ap "cons" "0") (ap (ap "cons" (ap (ap "cons" "0") (ap (ap "cons" "1") (ap (ap "cons" "0") (ap (ap "cons" "0") (ap (ap "cons" "2") (ap (ap "cons" "0") (ap (ap "cons" "0") (ap (ap "cons" "0") (ap (ap "cons" "0") "nil")))))))))) (ap (ap "cons" "nil") (ap (ap "cons" "0") "nil"))))) (ap (ap "cons" "0") (ap (ap "cons" "nil") "nil"))))) (ap (ap "cons" (ap (ap "cons" (ap (ap "cons" (ap (ap "cons" "1") "0")) "nil")) (ap (ap "cons" (ap (ap "cons" (ap (ap "cons" "0") "0")) (ap (ap "cons" (ap (ap "cons" "1") "0")) (ap (ap "cons" (ap (ap "cons" "2") "0")) (ap (ap "cons" (ap (ap "cons" "0") "1")) (ap (ap "cons" (ap (ap "cons" "2") "1")) (ap (ap "cons" (ap (ap "cons" "0") "2")) (ap (ap "cons" (ap (ap "cons" "1") "2")) (ap (ap "cons" (ap (ap "cons" "2") "2")) "nil"))))))))) (ap (ap "cons" (ap (ap "cons" (ap (ap "cons" "2") "-4")) (ap (ap "cons" (ap (ap "cons" "1") "-4")) (ap (ap "cons" (ap (ap "cons" "2") "-6")) (ap (ap "cons" (ap (ap "cons" "1") "-6")) (ap (ap "cons" (ap (ap "cons" "0") "-4")) (ap (ap "cons" (ap (ap "cons" "0") "-5")) "nil"))))))) "nil")))) "nil"))
flag = "0"
newState = ap ap cons 3 ap ap cons ap ap cons 0 ap ap cons ap ap cons 0 ap ap cons 1 ap ap cons 0 ap ap cons 0 ap ap cons 2 ap ap cons 0 ap ap cons 0 ap ap cons 0 ap ap cons 0 nil ap ap cons nil ap ap cons 0 nil ap ap cons 0 ap ap cons nil nil
pictures = (((1 0)) ((0 0) (1 0) (2 0) (0 1) (2 1) (0 2) (1 2) (2 2))
((2 -4) (1 -4) (2 -6) (1 -6) (0 -4) (0 -5)))
Shifting by 5, 7
waiting for click
clicked: (23,26) => (0,1)

As a result, we completely missed on the fact that mini-games in which you were supposed to control the ship use the same protocol as the one you are supposed to use to control the ship - complete with ship stats, commands that are issued when you click on "buttons" in the UI and so on.

If we played at least one ship tutorial mini-game with good debug output in front of our eyes, we would've noticed this, I am sure. But we did not.

Also, ship control tutorials unlocked "special abilities" that you could've activated via "JOIN GAME" message - like more efficient propulsion - and we completely missed this as well.

We were tired of alien glyphs and did not bother to decode "ship UI" in the tutorial mini-games, and moved straight to the ship control reverse-engineering.

Ship control protocol

Ship control protocol worked like this: you send the "JOIN GAME" message with the stats of your ship.

Stats are four integer numbers with undisclosed meanings, and the server could reject you if you request a non-sensical combination of stats.

The server responds with the "GAME STATE" message that describes whether you are the attacker or the defender, stats of all ships, and then in a loop, you send one of the available ship commands and receive new GAME STATE in return.

The attacker's goal is to kill the defender in the limited number of moves (255 or less).

The defender's goal is to survive until the end of the game or until the attacker dies, whichever comes first.

Ship commands included "accelerate", "shoot" and "detonate", and you could've sent more than one type of command per game tick -- so you can "accelerate" and "shoot" at the same time.

The battle is waged in the square field 255x255 units (there is no third dimension) in the middle of which there is a square planet, 32x32 units. If you crash into the planet, you die. Planet has gravity (attracts all ships).

You can fly out beyond the borders of the field, but you start to sustain damage (kinda like being outside of a safe zone in Fortnite/PUBG).

We started working on Ship Controller at T+46h (after some sleep).

By T+50h, we finally had something working. Something that could join the game and do nothing.
We submitted this version to the organizers and it even started winning some games for us.

The quick investigation had shown that those were the games where equally inept opponents managed to fall to the planet ahead of us.

Gravity is a contributing factor in nearly 73 percent of all accidents involving falling objects

We quickly realized that a weird square planet has equally weird gravity - each "side" of the planet exerted a pull on all points of space that was closer to it than to any other side. This effectively means that diagonals drawn through the square planet separated space into four regions in which vector of gravity was always oriented in the same way (and not directed to the center of the planet as we originally expected).

I marked gravity vectors on the picture below:

The first order of business was to figure out how to generate thrust to combat the pull of gravity and stop falling to our death.

Even though we messed up vectors and directions a couple of times, soon we got a ship that most of the time managed to avoid the fatal fall: battle replay, our ship is orange.

Pretty soon we got to a point where we were able to hover in the same position. That is, until we run out of fuel: again, we are orange

This is how we discovered that the first number in ship stats is the amount of fuel, and you can't have too much fuel: 200-250 seemed to be the max, or the server failed to let you into the game.

We also discovered that you can only generate 1 unit of trust - just enough to counteract gravity. But if you failed to engage your engines on the first turn of the game, you would've acquired a vector of speed directed towards the planet, and then your trust will be enough to counteract further acceleration, but will not be enough to stop you.

Now we started to win against opponents that did not steer at all (enough to get us to the 29th place), but started to discover that there is a more dangerous enemy out there: they shoot to kill.

Getting damage seemed to deplete your stats, mostly fuel, and eventually, your stats will reach zero and you will die.

Foretold is forewarned

We now had two ideas:

1. It would be nice to shoot back. We get the current position and velocity of the enemy ship, and we can compute gravity at its location. So we could precisely predict where he would be if it would decide not to accelerate. Shooting command just needed coordinates of the target. Seem easy.

2. It would be nice to conserve fuel. Just like we can predict the position of the enemy ship, we can predict the position of our ship (if we stop accelerating and will just coast). Moreover, we could extend this prediction arbitrary number of steps in the future, and thus compute how soon we will crash into the planet or run out of game zone bounds. If this number is large enough, we could stop accelerating and coast for a bit, saving fuel.

At T+56h, collision detection change was complete first. We marveled at the massive improvement in the test game played against ourselves: this time, we are both colors

We started "winning" even more, and for a while just sat back and watched games that were unfolded on the server, until we came across a shocking discovery: RGBTeam blew its ship ... and won the game, in which they were attacking and we were defending.

Ooookay. Two can play this game. I screenshotted the final state of the game, measured pixels, and computed that the "radius" of the blast was 10 pixels. Now, whenever we were attacking and found ourselves closer than 8 pixels from the enemy ship (just to be sure), we would blow up.

This would not happen too often, because we were not steering towards (or from) enemy ship, so we were just keeping an eye out for the game in which it will happen.

We still did not know what are the rest of the ship stats, so we resorted to playing games against the top of the leaderboard, which will allow us to observe their ship stats, and we copied them.

This way we discovered that more successful teams will have "attacker loadout" and "defender loadout", and we started doing the same.

Me, at T+58h in the team chat: "I think that tutorials in the Galaxy UI were probably telling you about ship capabilities, but who has time for that?"

Finally, we managed to find ourselves near the enemy ship and blew it up. As luck happened, it was the ship of RGBTeam. Revenge!

And then we had another match against "Cat is #1", where we detonated, and detonation covered their ship, but we did not blow them up.

I tried to estimate the amount of detonation damage from that one game (which was a mistake, as detonation damage fell off with distance, and depended on your ship configuration) and estimated that detonation yields 180-190 damage. We adjusted the detonation code to blow up only when the opposite ship is "weaker" than that.

Since we still haven't had the shooting code, we focused on developing something that would allow us to more reliably get the ship into the blast radius when attacking (and keep our distance when defending).

The solution that we quickly whipped out at T+59h was rather trivial: consider all 8 possible acceleration directions. Compute distance from the enemy ship right now and after the next move. When attacking, strive to reduce the distance. When defending, try to increase it.

This threw fuel efficiency out of the airlock, but we saw a clear improvement in our standings -- 23rd place.

New challenger appears

I already mentioned that the only contained data structure that alien language had was list/tuple. Game state contained a lot of information, including ship stats -- in a list. In our code, we split this list into two elements and called them our_ship and their_ship, to simplify coding.

Suddenly, at T+59.5h, we played a game on the server in which we ... timed out?!

Debug log ended with "assert failed: More than two ship?!"

Indeed, there seemed to be three ships in the last game state received. Alex set out to write a decoder of the last command submitted by enemy ship - it was part of the game state, but we never bothered to decode them before.

This yielded two improvements:

1)Now that we decoded enemy commands from the previous turn, we started to incorporate their thrust vector into our prediction code, which somewhat improved steering

2)We discovered undocumented command emitted by the enemy ship, which allowed you to "split off" part of the stats of your ship to create a new ship. The fourth number in the ship stats was clearly the number of "clones" you can make.

At T+60.5h, we were in a weird state. We knew how the ship splitting worked, but haven't incorporated it into our code. We had nice steering, but it was super inefficient and quite often we would run out of fuel and die. We still did not manage to start shooting back. Sometimes our steering code seemed to get into weird "fixpoint" and made us a sitting duck, like in this game here.

And yet, we were 19th on the leaderboard.

Our biggest worry was opponents that shoot us: despite our nice steering, we were unable to run sufficiently far away

At T+61h, we started to shoot back but somehow failed to inflict enough damage to kill the enemy.

Because we have so many issues that looked solvable and were quite tired, we made another mistake here: good move would be to print out as much stats about shooting as possible: our position, their positions, stats, commands, the damage inflicted, etc, and figure out why they seem to kill us and we are unable to.

Now I know that damage depends on the shooting direction, and it is best to shoot along the principal axis, or at 45 degrees to it. Also, shooting heats up the gun/ship, and you need to let it cool down, otherwise you will damage your own ship (which will look like rapidly decreasing fuel).

Instead, we concluded that guns are useless, disabled them, and moved to solve other better-understood problem.

We decided to ignore game information about 3rd, 4th, and subsequent ships. Our code still only knew our_ship and their_ship, we just stopped crashing if there are more than two ships in play.

Then, when defending, we started to split off ships with 1 fuel and other stats at 0 every time we were being hit: and it worked!

We improved steering code to better avoid enemy ship and game zone boundaries

We noticed that some teams just mirrored our commands, but were unable to think about the effective counter for that, and did see a way to employ this strategy ourselves.

At T+62, we noticed that we set the undocumented parameter in the "shoot" command at all times. Maybe this is "firepower"? We quickly confirmed that and started to cause some damage with our shots.

We quickly discovered, that shooting at every turn overheats your ship and causes catastrophic damage. There was ship stat that rapidly went up every time you shoot, and then slowly decreased down - by 10 units per tick if you coast, and by 2 units per tick if you under power. This was some sort of "heat" or something, and we tried to shoot with max power as often as we can, cooling down in between the shots, and risking the shots when we are not fully cooled down if the enemy ship was close to overheating.

Finally! Finally we started to land some damage. Not as much as enemy ships, but still.

At some point before T+62h, orgs amended the leaderboard to prank team "Cat is #1" - now their team name was displayed as "Cat is #N", where N was their position on the leaderboard :)

We also implemented "aggressive planet evasion mode" - if we found ourselves in less than 10 ticks from the projected crash into the planet, we will choose a side relative to the vector of movement ("left" or "right") that promises the quickest increase in the crash projection, and accelerate hard in that direction for several turns, achieving a close "flyby" of the planet instead of trying to push against the gravity.

Then we noticed the game in which our opponent seemingly split their ship into lots of ships in semi-stable orbits, which were really hard to shoot down: like this.

We quickly adopted the same strategy: if we were defending, and current planet crash projection was more than 100 ticks, we would split a "drone", and started to carry 10 drones when defending.

At T+68h, we changed our defensive loadout to have 100 potential drones, which greatly increased efficiency of this strategy.

We even had enough drones to win against enemies that were splitting their attacking ship to detonate all of them, trying to kill several of our drones at once.

Some enemies were really good at shooting down our drones, though. In this game we were just 3 ticks shy of winning as defenders, but they shot up all our ships.

Sometimes, though, we were lucky. In this game enemy splits their ship multiple time, but they all fly in lockstep, looking like a single point on the map. We managed to shoot and kill them all.

So in the end, our strategy was this:

- When attacking, start with 134 fuel, 64 guns, 10 coolers and 1 "ship"

- When defending, start with 158 fuel, no guns, 8 coolers and 100 "ships"

- Steer towards or away from the enemy ship, depending on the role, while avoiding the planet and going out of bounds of the game area.

- If no direction will move us closer to (or away from) our target, see if we can just coast for 100 turns. If not, correct in the direction that would avoid planet collision.

- If we hit the planet in the next 10 ticks, aggressively accelerate to achieve close flyby.

- Split off ships with 1 fuel when defending, if they have a chance to coast for at least 100 ticks

- Shoot when attacking, as long as we don't overheat. We did not know that direction matters, so we don't account for that

- When attacking, detonate if the opposing ship could be killed with 190 damage and is in range (again, we did not know the proper damage formula).

This allowed us to get 26th place on the final leaderboard, and the highest place we had during contest was 14th, in round 4.

In total, 95 teams were participating in the battles, of which only 65 scored some points.

Still, we scored better than some teams that I recognized: TDB, THIRTEEN, WILD BASHKORT MAGES, Skobochka.

All things considered, this is a solid win in my book.

Especially considering what we did not know or did not do:

- That blast damage rapidly decreases with distance

- That shooting damage depends on direction (remember the mysterious picture from the start of the post? That was the explanation of how shooting damage works)

- That is is possible to get 128 coolers by completing minigame in the Galaxy

- That it is possible to get x2 acceleration by completing minigame in the Galaxy

- That it was possible to figure out the meaning of all the ship stats AND observe ship splitting command in the tutorial missions in the Galaxy

- That there was a formula that describes the relationship between different ship stats and the max number you can have for each

- That ship stats defined detonation damage

- We only steered one ship and ignored all the enemy ships but first

What is in the name?

I promised to explain the team name. Before contest, we agreed that we don't want to solve yet another bot programming problem that devolves to finding the best routes on the map or in some search space with A* algorithm or something similar. Our commitment was so strong and we codified it in the team name :)

And here is why fictitious observatory was called Pegovka : пегий is an equine coat color, dark color with large white patches. Пеговка (Pegovka) is a traditional way to construct a village name from that word, very much like German "-ingen". Коурый is another equine coat color, somewhat reddish. Коуровка (Kourovka) is a very real observatory near Ekaterinburg, Russia

Word from the organizers

While we were programming ship battles, we largely ignored discord, and it seems that lots of other teams did so as well, so I have no interesting quotes from you.

However, after the contest others and I managed to speak with the organizers who said the following:

Participant: I feel sorry for the organizers who have clearly put in an immense amount of work and are now faced with all this negativity.

Organizer: but we probably deserve it for failing to meet expectations of so many people, experienced and new participants alike. That's the hard truth.

me: I really liked the contest, and I do think it is one of the best ever, but at the same time I am really sad about how the lightning part went. I think that small teams (esp. single-person teams) and those who did not participate on Discord since Jul 1 were at a huge disadvantage. Happy to explain

Organizer: actually, the whole pre-contest teaser thing was a result of a very worrying test. We kinda knew that the whole decoding thing was too hard for an average participant to crack in just 24hrs. So we decided to publish some messages in pre-contest teasers to give all participants some head start and a general idea of what we expect. We were sure that was not a big advantage to the teams that could participate in the pre-contest activities, because:
1. We carefully selected 15 messages that did not give out any idea of the task nature.
2. We tried to publish all the pre-contest "findings" in a neat organized way to be easily consumed by the new participants. "Condensed version" was intended to be self-sufficient (albeit mysterious and incomplete, as intended) problem specification. We probably failed that :-(
3. We did not require new participants to be familiar with any Haskell/Rust/Python code that was created before contest: all the new messages were published decoded, and having experience with the scripts gave no advantage. We did not expect anybody to adapt the scripts to decode new glyphs. We probably failed to communicate that.
4. We expected everyone to dedicate 1-2 hours before the contest to study everything already published. We failed to communicate that, because it felt so obvious :frowning: That was the most severe mistake probably.
5. We failed to realize that a chat of 500+ members trying to decode a large mass of messages will become an unmanageable mess. In hindsight, that should have been obvious.

Me: But at the same time:
- dockerized builds(future is here)
- organizers responding in the chat (almost unheard of)
- the infrastructure that remained up and stable throughout the contest
- the live scoreboard that updates without delay
- puzzle-like problem (i adore puzzle hunts -- and I think that if this was ran as puzzle hunt, it would've been awesome)
- hints in tutorials were really neat
- same for "replay battles"
- ability to run submissions locally
...
All of this was so go. SO GOOD. It was just awesome. It felt smooth, and effortless, and neat, and positively 21st century.

Organizer:
We also clinged too hard to the whole astronomer legend:

1. All the message images were generated exactly as if we generated them using the code written by the participants of pre-contest teasers.
2. We gave names to all the operators based on the suggestions from this chat. WHY DID YOU CALL BACKTICK OPERATOR ap?! Aliens even made it look like a backtick in images!! :-D
3. We did not disclose any internal knowledge that was not available to the "astronomer" or clearly stated by some participant in the chat. We pretended that our team of ICFPC organizers studied the message at a pace of an "average team" and published our results to help weaker teams to keep up and stay in the contest. That was a controversial desicion too: we got much hate because some teams spent tens of hours inventing a VM implementation only to see pseudocode published for everyone.

Many of the things we could communicate clearer just "broke the story" of "spoiled the lore", and you can blame this entirely on me, the mastermind of the whole Ivan Zaitsev character

Organizer: I think, in the end we designed a contest that would best fit an imaginary team that we strive to become...

Best played:
1. With a team of 4-5+ engineers.
2. With people that are ready to dedicate the entire 72 hours, each member sleeping only for 6-8 hours total during the contest.
3. For people considerably smarter than us.

We didn't intend it to turn out that way. But in the end it did :(

I can't begin to imagine how hard is it to organize and run ICFPC, and despite all my ranting, I am really happy that we got a chance to play with the novel idea, and not something that got done dozen of times before.

Code

If you want to see the code, it is here: https://github.com/adept/icfpc2020

Expect contest-qualify code :) This entry was originally posted at https://dastapov.dreamwidth.org/132184.html. Please comment there using OpenID.

#n, icfpc