How Pokémon Stadium 2 utterly fouls up team selection
Alternative title: how to be passive aggressive with asterisks.
Overview
In Pokémon Stadium 2 the vast majority of battles have you and your opponent bring a team of six. From there, both sides choose a subset of three to bring. This post is on how the game decides which subset of three to bring and my descent into madness brought about by trying to understand the implications of all the mistakes the programmers made. To check my understanding of this and to provide a helpful tool for determining the effect of the player’s choice of rentals on the teams the AI brings, I’ve made something you can play with. To understand what exactly is going on and get some intuition for it, read on.
What they tried to do
It’s easiest to start with what the likely intended algorithm was. This is purely to give a rough picture and provide a structure for all the details. Do not take everything in this section as gospel: some of it is just flat-out wrong. I will put an asterisk beside anything that is wrong on some level.
The two Rival fights are of a different format than any other important fight in the game so to make the rest of this post easier to word I’ll just state now how he behaves. He always brings all three of his Pokémon, in Round 1 his lead is random with an equal probability for each, and in Round 2 he always leads Lugia.
First the game assigns a fitness value to each of its six Pokémon. Some trainers do this by picking six independent* random values from 0 to 65535 inclusive. The majority do this by instead looking at each of your six Pokémon, computing a matchup fitness unaffected by the rest of your team* and then summing the six obtained values. Each matchup fitness is based off the worst-case amount of damage you can do with the moves your Pokémon can learn*, the damage they can do to you with each of their moves, and some bonuses for status and other things their moves can do. All of these are weighted with values specific to the opponent in question*.
Next the game looks over all 20 possible teams of three. It is worth noting that the order of the members of each trio are just the order they appear in the team of six. If the mode is Poké Cup, the game first rules out any team with a total level above 155. For specific trainers it will then rule out any teams that do not contain the first Pokémon of the six. For some others it will rule out those that do not contain the last. A few trainers, like Round 1 Erika, must bring both.
Some trainers are not allowed to bring both of their Pokémon with the highest fitness values. For determining what the best two are, earlier Pokémon take precedence in case of a tie. The game then randomly selects one of them to forbid, rather than only filtering off teams containing both. Some trainers must also bring a Pokémon from each column. As a fun aside, Round 1 Min & Lyn have this condition but must also bring Ledyba which locks them out of bringing Hoppip.
How the Pokémon are ordered, and what is meant by columns
For each remaining valid trio, the game assigns a team fitness. To do this, it simply sums the individual members’ fitness values and then applies multipliers for weaknesses shared by multiple team members*. Then the game determines the best eight teams according to their team fitness values (randomly choosing which to keep in the event of a tie). Some opponents are more restrictive though, and only take the best four teams. Out of these trios the game then selects one, assigning an equal probability* to them all.
With the team selected the last thing the game must do is choose the lead. Some opponents just pick a random lead out of the three. Some opponents always go with the first member of the trio. For the others, the game assigns to each member of the trio a lead fitness. This starts out at 0 for each member. 20 is added for the Pokémon with the highest speed (ties broken randomly) and 30 is either added or subtracted for the Pokémon with the highest fitness (ties broken randomly) depending on whether the opponent has a preference for leading with the highest fitness Pokémon or keeping it in back. A final bonus is added for any Pokémon that has Baton Pass, the size of which depends on the opponent. With the lead fitness values assigned, the game picks the Pokémon with the highest lead fitness (again, breaking ties randomly) and swaps it with the first Pokémon of the trio.
Needless to say all random decisions and values come from a PRNG that is very good* and is initialised very well* and there are no significant and easily noticeable restrictions or biases that result from this*.
AI values
As mentioned above certain opponents have specific behaviour for which teams they allow and a bunch of weights giving how much they care about Baton Pass on leads, the damage you can do to them, and so on. So before going deep into the details of the calculations, here is that data.
Individual fitness values
The calculation of the individual fitness values is easily the most involved step in the entire process and they are the means by which the player’s team can influence the AI’s team selection. As mentioned above some opponents just assign random fitness values which gives them a fixed probability distribution for team selection regardless of the player’s team. For the rest though, there are a number of steps (and mistakes…). The game computes six values for the matchups of its Pokémon against each of your six Pokémon and then adds these together. These values are generally independent but a few bugs mean in theory there can occasionally be an interaction.
Step 1: damage you can do
Most opponents don’t read your moves for team selection and consider all moves your Pokémon could know at that level. Some do though, in which case they just go over your moves. First though, note for this step the following moves are ignored: Absorb, Beat Up, Bide, Dream Eater, Explosion, Fissure, Future Sight, Giga Drain, Guillotine, Horn Drill, Hyper Beam, Leech Life, Mega Drain, Razor Wind, Selfdestruct, and Skull Bash. Moves with a power of 0 are also ignored. Moves that directly do damage like set damage moves are assigned a base power of 1 to get around this, except for Guillotine which is assigned a base power of 0.
For moves not filtered out by the above, the game calculates how much damage the move would do against its Pokémon. Moves with variable power are treated as having the following power:
- Flail and Reversal: 60
- Return and Frustration: the correct power for the friendship value
- Present: 52
- Magnitude: 64
- Hidden Power: 1, and treated as normal type.
The OHKO moves (skipped due to the above, but will be important later) are treated as doing 65535 damage, except for Guillotine which is counted as doing 0 damage. Super Fang, Dragon Rage, SonicBoom, Night Shade, and Seismic Toss are treated as doing the correct amount of damage. Psywave is treated as doing damage equal to 0.75 * the user’s level. For other moves the standard damage formula applies, including the effects of Light Ball, Thick Club, and items that boost moves of a specific type like Pink Bow and TwistedSpoon. However, the random factor applied at the end of damage calculation is skipped and a max damage roll is assumed. Additionally, some opponents skip applying the effect of STAB and resistances/weaknesses.
The maximum damage from these moves is then divided by the HP of the AI’s Pokémon and scaled to lie between 0 and 255 inclusive (capping out at 255).
Lastly some notes on how the AI determines if your Pokémon can learn a move. It mostly goes the same as Challenge Cup, except egg moves and learning level-up moves due to breeding only apply to Pokémon at least level 5. Additionally, some special event moves are considered, and these are:
- Fearow and Rapidash: Pay Day, minimum level 1
- Pikachu and Raichu: Fly and Surf, minimum level 1
- Psyduck and Golduck: Amnesia, minimum level 1
- Farfetch’d: Baton Pass, minimum level 5
- Magikarp: Dragon Rage, minimum level 1
- Dratini: ExtremeSpeed, minimum level 15
- Gligar: Earthquake, minimum level 5
These also apply to evolutions in the same way as explained in the Challenge Cup post for typical moves.
Oh, and there is one incredibly stupid bug worth mentioning, especially since as of writing I haven’t implemented it. The problem is when checking if your Pokémon learns a move it fails to properly initialise the level. Now once it’s gone into damage calculation for the first move, the level gets set correctly for the rest of the computation of that matchup’s value. This is part of what stops it being too damaging.
The other saving grace is that the game iterates over the moves in reverse order. The last move is Beat Up, which is skipped anyway. The next two are Whirlpool and Rock Smash, and more than half of Pokémon learn at least one of these via HM which solves the problem for them. Hidden Power is a late move and a universal TM which also puts a cap on things quickly. In total, the moves that can be potentially missed that have an opportunity of being the best move are:
- Caterpie and Metapod: Tackle
- Weedle and Kakuna: Poison Sting
- Ekans and Arbok: Crunch
- Exeggcute and Exeggutor: AncientPower (often tied by Double-Edge, but not always)
- Magikarp: Dragon Rage and Flail
- Unown: Hidden Power (remember it’s counted as Normal BP 1)
- Wobbuffet: Counter and Mirror Coat (remember they’re counted as BP 1)
- Larvitar and Pupitar: Crunch
Worth noting that an uninitialised level might potentially have the opposite effect: a Pokémon potentially being seen as learning one of these moves when it shouldn’t be able to. In that connection should also be mentioned AncientPower for Larvitar and Pupitar: any one that can learn AncientPower can also learn Rock Slide, but it is possible for a legal low-level Pupitar from Crystal to not have access to either yet temporarily be thought of as level 30 or higher and so have access to AncientPower.
It seems that the uninitialised level always starts as zero, regardless of the order of your team. I’ve not done an in-depth investigation of what happens to the value, but I have played around with team order and seen that this doesn’t affect the calculation for Wobbuffet.
Step 2: damage they can do
This is similar to the above, except the game works out the above for each of the AI’s moves and keeps them separate rather than just taking the maximum (see step 4). The game also no longer skips moves like Giga Drain and Hyper Beam. For this calculation, Beat Up is treated as a typical Dark move with base power 10.
Step 3: other things they can do
Here the game comes up with a special bonus for each of the AI’s moves. This is based off effects other than damage, and is affected by various weights each AI has. In the following, note that the chance of the secondary effect activating is a number from 0 to 255 rather than the more standard percentages.
Moves that cause status (sleep, poison, burn, freeze, paralysis)
The value is generally
(28 * (AI weight for given status) * (chance of secondary effect)) / 255,
rounded down. However the value is instead zero if the secondary chance cannot happen for type reasons. Tri Attack is ignored for this.
Moves that raise the AI’s stats
(2 * (AI weight for raising given stat) * (chance of secondary effect) * (stages raised)) / 255.
Evasion is considered a stat for this. Attacks that can raise stats are counted as raising by two stages for some reason, except for AncientPower which is counted as one stage but computes the above expression for the five stats it can raise and adds the results. Curse is ignored for this.
Moves that lower the player’s stats
This is an absolute mess. The intent was clearly to do the same as the above but with a bunch of weights for lowering stats. However, whoever coded that function mixed up the order of the arguments for the stat to effect and the chance of a secondary effect, so as a result the game is accessing different memory and multiplying by a number for the stat in question. The multiplier for the stats thus come to:
- Attack: 0
- Defense: 1
- Speed: 2
- Special Attack: 3 (but no moves do this)
- Special Defense: 4
- Accuracy: 5
- Evasion: 6
For the effect chance, there are five values and they come to the following:
- 10% moves: still falls within the AI data structure and so has an opponent-dependent value
- 20% moves: if this is the first of the AI’s Pokémon, 0. Otherwise the fitness of the first Pokémon divided by 65536, rounded down, and then modulo 256
- 30% moves: if this is one of the first three of the AI’s Pokémon, 0. Otherwise the fitness of the third Pokémon divided by 256, rounded down, and then modulo 256
- 50% moves: 0
- 100% moves: 0
It is worth noting that since the game is now multiplying by the stat ID in place of the effect chance, the resulting value will be smaller. Nevertheless, my program takes account of this.
Flinch moves
The same type of calculation as for status, except 0 if the attacker’s speed is not greater and type matchup is ignored.
Confusion moves
The same type of calculation as for status but type matchup is ignored. Swagger is ignored.
Attract, Baton Pass, Disable, Focus Energy, Light Screen, Mist, Reflect
A fixed opponent-dependent value for each of the six moves in question, except for Attract which is zero if the opponent’s Pokémon cannot infatuate the player’s Pokémon.
Step 4: putting it together
The matchup’s value starts at zero. For each of the AI Pokémon’s four moves, the game first takes the damage the move can do to the player, as a proportion out of 255. The game then multiplies this by 128, the accuracy for the move (as a value from 0 to 255), the weight the AI assigns to damage it causes, then divides by 255 and rounds down. Then the game takes any bonus for secondary effects calculated in step 3, multiplies by the move’s accuracy and a weight assigned to secondary effects, then again divides by 255 and rounds down. As a third and final term the game takes the amount of damage the player can do as calculated in step 1 and multiplies it by 58 and a weight assigned to damage by the player. An exception to this is if the AI’s move is Selfdestruct or Explosion, in which case the AI instead treats the damage done by the player as maximum for this one part.
These three terms are then added together and that is added to the value. Note therefore that the term for damage done by the player is typically multiplied by 4 for the final result, excepting the cases of Selfdestruct and Explosion. This is true even if the AI’s Pokémon has less than four moves: for the empty moves slots, the terms for damage and secondary effects caused by the AI are zero but there is still a deduction for damage done by the player.
Penalty for resistances (yes, really)
This calculation is truly amazing because there are three separate errors here. A note before going into it though: some opponents don’t do this part at all.
The game loops over all attack types. Wait, sorry, that’s not right. Mistake one: the game only loops over physical attack types. This is caused by there being at least two different numbering systems the game uses for types, and there is a mismatch here.
For each physical attack type, the game counts how many members of the trio
are weak to resist that attack type. There’s mistake 2. If the team fitness
is positive, the game will multiply the team fitness with
1 - 0.1 * (number of pairs that resist attack type). If the team fitness is
negative, the factor will instead be 1 + 0.1 * (number of pairs that resist
attack type). Note that the number of pairs will be 0, 1, or 3.
The last mistake is in the multiplication. The game does not multiply by, say, 1.3 directly. It instead multiplies by 13 and then divides by 10, and similarly for any factor it will multiply by (factor * 10) and then divide by 10. This multiplication can cause an overflow. And this isn’t some nitpicky artificial scenario: this can happen and confused me for a good hour when it came up during testing. Specifically, it comes up in Werster’s Complete the Game PB against Chen in R1 Poké Cup Great Ball. The team Haunter / Spinarak / Misdreavus should be impossible with that rental team, but the combined resistances factor is 2.4167 and the final multiplication by 13 is enough to cause the team fitness to overflow and become positive.
How not to do randomness
I was initially planning to do a separate post on how Stadium 2’s PRNG is bad, but the problem is so pervasive in team selection that I feel this post would be incomplete without discussing it.
The short version for people who know about such things is that Stadium 2 uses an LCG for this that returns all the bits of the seed without reordering them in any way. This is especially pernicious because most of the time the game then takes the result modulo 2 or 8. The cherry on top is that the game initialises this PRNG right before team generation with a value that is the bitwise and of two values that for most practical purposes are independent, horrifically skewing the distribution.
For the rest of you, I won’t bore you with the maths behind LCGs since it’s not that necessary to get the salient points. The most important consequence of the choice of PRNG is that if the game generates a sequence of numbers that lie between 0 and 7 inclusive, it will go in the following cycle:
…, 0, 7, 2, 1, 4, 3, 6, 5, 0, 7, 2, 1, 4, 3, 6, 5, …
This is because 8 is a power of two, and other powers of two have this property. The most common one that comes up here is 2, for when the game needs to decide between one of two options. This goes in a cycle of 0, 1, 0, 1, … and is related to the cycle for 8 by taking the remainder from division by 2.
The importance of this cyclic behaviour is it creates strange relationships between random choices. The big one is in the choice of lead when the game has two options with equal highest lead fitness. Most of the time the AI has eight teams to pick from, so it picks one, and then the next number on the cycle determines the choice of lead. If there’s a speed or fitness tie that comes up while computing the lead fitness values, this is also related so the constraint is still there.
It should be noted that if instead of a power of two the number is odd, then the choice is not at all related but the cycle still goes one forward and so there is still a relationship between random numbers before and after. This means that for trainers that just have a random lead, the lead can be treated as independent of what comes before, or close enough. This also comes up for trainers who have less than eight valid teams, often due to the combination of not using both of their highest fitness Pokémon and the Poké Cup level cap. In the case of six teams there’s some relationship: whether the result is odd or even is determined by the position in the cycle, but beyond that it’s fairly independent.
All of this is pretty constraining but I’ve saved the best for last. See, I’ve not told you how the game determines where on the cycle to start and it makes this decision right before going into team selection. And it does it absolutely terribly. It takes the bitwise and of two independent random values, when the combining operation you want is a bitwise xor. This means the starting value satisfies the following distribution:
Starting value | Probability |
---|---|
0 | 42.1875% |
1 | 14.0625% |
2 | 14.0625% |
3 | 4.6875% |
4 | 14.0625% |
5 | 4.6875% |
6 | 4.6875% |
7 | 1.5625% |
This is why most opponents will massively favour a given trio (provided we fix the player’s team). In the general case, we’re likely to start with 0. The game then moves one along to determine which of the two best Pokémon to ban (even if the opponent in question doesn’t care about that) generating the value of 7, or really 1. Then assuming there’s no ties in total team fitness and there are at least eight valid teams, it will generate the value of 2 and so pick the third best team. Then if the opponent doesn’t just pick a random lead, the lead is determined by the position on the cycle.
As a result if you’re trying to make one desired team the common team then you’re typically focusing on which team the game thinks is the third best rather than the best, which is rather counter-intuitive and practically impossible to guess just from playing the game. If the opponent doesn’t bring both of their best Pokémon and you have a specific Pokémon that you really don’t want to see then you really want to steer them towards judging it as their second best Pokémon.
Finally, the same problem happens with random values from 0 to 65535 since 65536 is also a power of two. This means that for trainers with random fitness values, any one of the six fitness values determines the other five and generally their team selection. Also one of the two random values that are combined has two higher bits set to zero which means the initial value in the cycle lies between 0 and 16383 inclusive. But the next value is what is used as the first random fitness, so the first fitness isn’t forced to be low.