Some possibly-meaningless analyses of Super Smash Bros. matchup charts

Oct 23, 2015 02:40

So here's a question for you: How does one generate a tier list from a matchup chart?

A matchup chart, obviously, contains way more information than a tier list. If you know how literally every character matches up against every other character, you should certainly be able to say which characters are, on the whole, better than others. But how? If you have two characters you want to compare, and one dominates the other, then this is easy; but what if neither dominates?

(I'm assuming here that a "tier list" means just a list of all characters sorted by rank; I'm ignoring any further information, such as numerical ratings or actual divisions into tiers.)

This is the method I came up with: First, we treat the matchup chart as a (zero-sum, symmetric) game. (Since matchup charts are usually presented as being out of 10, you may have to subtract 5 from each entry before you can feed it into your standard tools.) Then, compute the optimal strategy for this game. (I'll assume this is unique, as it will be in all cases analyzed here; I'm not going to worry about what happens if it isn't.) Finally, rank each character by how well they do against this optimal strategy. (There might be ties; I'm just not going to worry about that for now.)
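Here is a minimal sketch of the first and last steps in Python, together with a check that a candidate strategy really is optimal. The three-character chart and the names payoff, score_against, is_optimal, and tier_list are made up for illustration and come from no real game. Finding the optimal strategy itself is a linear-programming problem, which this sketch does not attempt; it relies only on the fact that a symmetric zero-sum game has value 0, so a strategy is optimal exactly when no character scores above even against it.

```python
from fractions import Fraction

# Hypothetical chart (an assumption, not real data):
# chart[a][b] = games out of 10 that a is expected to win against b.
chart = {
    "A": {"A": 5, "B": Fraction(11, 2), "C": 7},
    "B": {"A": Fraction(9, 2), "B": 5, "C": Fraction(13, 2)},
    "C": {"A": 3, "B": Fraction(7, 2), "C": 5},
}

def payoff(chart):
    """Step 1: subtract 5 so that an even matchup is worth 0."""
    return {a: {b: Fraction(v) - 5 for b, v in row.items()}
            for a, row in chart.items()}

def score_against(game, strategy):
    """Expected payoff of each character against a mixed strategy,
    given as a dict mapping character -> probability."""
    return {a: sum(Fraction(w) * game[a][b] for b, w in strategy.items())
            for a in game}

def is_optimal(game, strategy):
    """A symmetric zero-sum game has value 0, so a strategy is optimal
    exactly when no character scores above 0 against it."""
    return all(s <= 0 for s in score_against(game, strategy).values())

def tier_list(game, strategy):
    """Step 3: sort characters by their score against the strategy.
    sorted() is stable, so tied characters keep the chart's order."""
    scores = score_against(game, strategy)
    return sorted(game, key=lambda a: -scores[a])

game = payoff(chart)
always_a = {"A": 1}                  # candidate strategy: always pick A
print(is_optimal(game, always_a))    # is the candidate optimal?
print(tier_list(game, always_a))     # resulting ordering
```

In this made-up chart, B and C both get weight 0 in the strategy, yet B scores well ahead of C against it, which is the distinction the final step is meant to capture.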

EDIT 6 August 2016: Welp, I've now actually encountered the case of the optimal strategy being nonunique; I decided to handle it by taking the centroid of the set of optimal strategies and using that. How meaningful that is, I don't know, but I couldn't think of anything better. (Note that for a two-player zero-sum game, the set of optimal strategies must be convex.)

Originally I had left out the final step, instead thinking that how often a character occurs in the optimal strategy should be used as the method of ranking. But later I realized this makes no sense -- suppose you have two characters, A and B, which are very similar, but A slightly dominates B. Then B should rank just a little bit below A, but this method would instead put B in last place. I don't think that's the intent of a tier list. We can, I suppose, use this as a tiebreaker for the actual method I suggest though.

Now obviously there are some problems with this. In real life, not everyone is actually capable of playing every character to its greatest extent. Still, I think it is an interesting model. What does it result in when applied to Super Smash Bros.? And do the results match the more usual tier lists?

Now, it's quite a bit more work to produce a matchup chart than it is to produce a tier list, so matchup charts are not always so up to date. As such, we'll compare each matchup chart not to the current tier list but to the one closest in time to when the matchup chart was produced.

SSBWiki lists four matchup charts: one for the original Super Smash Bros., one for Melee, one for Brawl, and one for the Japanese version of the original Super Smash Bros. (various things were altered for the better-known international version). (For obvious reasons, there is as yet no Super Smash Bros. 4 matchup chart; indeed, there isn't even a tier list yet. Since the game is still actively being patched, producing one would be premature. Also, without a Super Smash Bros. 4 Back Room, it's not clear who would produce an "authoritative" one.)

The SSB chart will be compared against the 3rd SSB tier list; the SSBM chart against the 10th SSBM tier list; and the SSBB chart against the 8th (current) SSBB tier list. (Again, this is based on the dates the matchup charts are from.) SSBWiki doesn't list a separate Japanese SSB tier list, but there seems to be one implicit in the Japanese SSB matchup charts, i.e., the ordering of the characters used in the chart, so that's what we'll compare it to.

Actually, before I continue, let me observe -- it's not obvious that my method should make sense for the SSBM and SSBB matchup charts listed here, since rather than stating how often a given character wins a given matchup, they just give abstract descriptions such as "slight advantage", "large advantage", "close to unloseable", etc. Nonetheless, as we'll see, this will prove not to be an issue.

The thing about the SSB, SSBM, and SSBB matchup charts is that all three are dominance-solvable; the optimal strategy, it turns out, isn't any sort of mixed strategy at all, but simply to always play the best character (i.e., Pikachu for SSB, Fox for SSBM, and Meta Knight for SSBB). This is why I said it doesn't matter that the SSBM and SSBB charts don't actually give matchup splits, since only qualitative comparisons are needed to establish this result.

In the SSB case, Pikachu dominates all characters but Kirby and Fox; but once we restrict to those three, Pikachu dominates.

In the SSBM case, the characters not dominated by Fox are Falco, Sheik, Marth, Jigglypuff, Peach, Captain Falcon, Ice Climbers, Samus, Ganondorf, Link, and Donkey Kong. Once we restrict to those, Fox dominates everyone but Sheik, Marth, Jigglypuff, Peach, Captain Falcon, Ganondorf, Link, and Donkey Kong. And once we restrict to those, Fox dominates everyone but Sheik and Jigglypuff. Once we restrict to those three, Fox dominates.

In the SSBB case, the characters not dominated by Meta Knight are Ice Climbers, Olimar, Diddy Kong, Marth, Falco, Pikachu, Wario, Lucario, King Dedede, Donkey Kong, and Sheik. Once we restrict to those, Meta Knight dominates everyone but Ice Climbers, Olimar, Diddy Kong, Marth, Pikachu, Lucario, and King Dedede. And once we restrict to those, Meta Knight dominates everyone but Ice Climbers, Olimar, Diddy Kong, Marth, and Lucario. Once we restrict to those six, Meta Knight dominates.
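The repeated "restrict and check again" procedure used in these three cases can be sketched in Python as follows. This is only an illustration under stated assumptions: the four-character chart is invented, the function names are hypothetical, and "dominates" is taken to mean weak dominance (at least as good in every remaining matchup, strictly better in at least one), which may or may not be the exact notion intended above.

```python
# Hypothetical chart (an assumption, not real data):
# chart[a][b] = games out of 10 that a is expected to win against b.
chart = {
    "A": {"A": 5, "B": 6, "C": 6, "D": 6},
    "B": {"A": 4, "B": 5, "C": 6, "D": 7},
    "C": {"A": 4, "B": 4, "C": 5, "D": 8},
    "D": {"A": 4, "B": 3, "C": 2, "D": 5},
}

def dominates(chart, a, b, alive):
    """True if a does at least as well as b against every surviving
    opponent, and strictly better against at least one of them."""
    pairs = [(chart[a][o], chart[b][o]) for o in alive]
    return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)

def eliminate_dominated(chart):
    """Repeatedly drop every character dominated by some survivor.
    Returns the final survivors and the list of characters removed
    in each round."""
    alive = list(chart)
    rounds = []
    while True:
        removed = [b for b in alive
                   if any(dominates(chart, a, b, alive)
                          for a in alive if a != b)]
        if not removed:
            return alive, rounds
        rounds.append(removed)
        alive = [c for c in alive if c not in removed]

survivors, rounds = eliminate_dominated(chart)
print(survivors)  # characters left once nothing more can be removed
print(rounds)     # who was removed in each round
```

In this invented chart, A does not dominate B or C at first, because both do better than A against D; only once D is gone does A dominate the rest. That mirrors the multi-round pattern in the real charts. If a single character survives, the chart is dominance-solvable in this sense.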

In all three cases, then, my method says that for these games, the tier list should be ordered purely based on how each character does against the unique best character (we'll ignore the question of how ties are broken). As you can easily check, this is not how they actually are ordered.

This leaves the case of DSB (Japanese SSB). Here the optimal strategy is in fact mixed! Let's start with a dominance analysis. Pikachu dominates everyone but Kirby, Captain Falcon, and Ness; and once we restrict to those 4, Ness is dominated by Kirby. But once we restrict attention to the remaining three, we find that none of them dominates. In fact, the unique optimal strategy is a mix of 1/4 Pikachu, 1/4 Kirby, and 1/2 Captain Falcon.
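To see how a 1/4, 1/4, 1/2 mix can come out of three characters none of which dominates, here is a small Python sketch. The matchup numbers are NOT the ones from the SSBWiki chart; they are hypothetical values, including the direction of each matchup, chosen only because they reproduce the same mix. The function cycle_mix is likewise made up for this example. It uses the fact that in a three-character cycle (each beats one and loses to one), the optimal weight of each character is proportional to the margin of the matchup it is not part of.

```python
from fractions import Fraction

def cycle_mix(game, a, b, c):
    """Optimal mix for a 3-character cycle: a beats b, b beats c, c beats a.
    game[x][y] is x's payoff against y, with game[y][x] == -game[x][y].
    Each weight is proportional to the margin of the *other* matchup."""
    raw = {a: game[b][c], b: game[c][a], c: game[a][b]}
    total = sum(raw.values())
    return {name: Fraction(v) / total for name, v in raw.items()}

names = ["Pikachu", "Kirby", "Captain Falcon"]

# Hypothetical splits out of 10 (assumptions, not the real chart):
wins = {
    ("Pikachu", "Kirby"): Fraction(6),                   # 6 : 4
    ("Kirby", "Captain Falcon"): Fraction(11, 2),        # 5.5 : 4.5
    ("Captain Falcon", "Pikachu"): Fraction(11, 2),      # 5.5 : 4.5
}

# Build the zero-sum payoff table by subtracting 5 from each split.
game = {x: {y: Fraction(0) for y in names} for x in names}
for (x, y), w in wins.items():
    game[x][y] = w - 5
    game[y][x] = 5 - w

mix = cycle_mix(game, *names)
for name in names:
    # Every character in the mix should break exactly even against it.
    score = sum(mix[o] * game[name][o] for o in names)
    print(name, mix[name], score)
```

With these made-up numbers, Captain Falcon gets the largest share not because he wins the most, but because the one matchup he is not involved in (Pikachu against Kirby) is the most lopsided of the three.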

This is surprising because it suggests that in some sense Captain Falcon is the best character, rather than Pikachu! I'm not sure how believable that really is, but that's what it says.

This case is more interesting than the others, so let's do a full listing. I'm going to rank characters by how they do against this optimal strategy, as discussed above. I'll then break ties by how often they appear in this optimal strategy. Finally I'll break the remaining ties by their rank in the (inferred) Japanese tier list, i.e., I'll try to keep the resulting list as close to the usual tier list as possible within the bounds stated above.

If we do this, we get the following results:

Character       Games won  Games used  Usual rank  Resulting rank  Difference
Captain Falcon  5          5           3           1               +2
Pikachu         5          2.5         1           2               -1
Kirby           5          2.5         2           3               -1
Ness            4.75       0           5           4               +1
Fox             4.25       0           4           5               -1
Jigglypuff      4.25       0           8           6               +2
Mario           4          0           6           7               -1
Luigi           3.5        0           11          8               +3
Samus           3.25       0           7           9               -2
Link            3          0           9           10              -1
Yoshi           3          0           10          11              -1
Donkey Kong     1.75       0           12          12              0

Here "games won" is how many games out of 10 they win against the optimal strategy above, "games used" is how many games out of 10 they're used in the optimal strategy above, "usual rank" is their rank in the usual Japanese tier list (that I've inferred), "resulting rank" is their resulting rank in this new ordering, and "difference" is, well, the difference.
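The ordering rule described above (games won first, then games used, then usual rank) amounts to a three-part sort key. A short Python sketch, using the numbers from the table and listing the characters in their usual order so that the sort has something to do; the variable names are just for illustration:

```python
# (character, games won, games used, usual rank), taken from the table.
rows = [
    ("Pikachu",        5,    2.5, 1),
    ("Kirby",          5,    2.5, 2),
    ("Captain Falcon", 5,    5,   3),
    ("Fox",            4.25, 0,   4),
    ("Ness",           4.75, 0,   5),
    ("Mario",          4,    0,   6),
    ("Samus",          3.25, 0,   7),
    ("Jigglypuff",     4.25, 0,   8),
    ("Link",           3,    0,   9),
    ("Yoshi",          3,    0,   10),
    ("Luigi",          3.5,  0,   11),
    ("Donkey Kong",    1.75, 0,   12),
]

def sort_key(row):
    """More games won first, then more games used, then better usual rank."""
    name, won, used, usual = row
    return (-won, -used, usual)

ranked = sorted(rows, key=sort_key)

# Difference = usual rank minus resulting rank (positive means moved up).
differences = {}
for place, (name, won, used, usual) in enumerate(ranked, start=1):
    differences[name] = usual - place
    print(f"{place:2d}  {name:15s} {usual - place:+d}")
```

Dropping the third component of the key would leave the Fox/Jigglypuff and Link/Yoshi ties unresolved, which is why the usual rank is needed as a last resort.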

So, that's a little different, with Luigi jumping up 3 whole spots. Most of the rest isn't too different, but remember, I deliberately broke ties in a way designed to keep it similar.

Is any of this really meaningful? I have no idea. Captain Falcon getting the edge over Pikachu certainly seems wrong... but then, that was decided on the tiebreaker, i.e., by a different method. Still, I thought it was interesting to see how it would turn out. If anyone wants to try it on other games and report the results, I'd be interested to hear!

-Harry