For this blog post we decided to jump on the PokémonGO hype and add a bit of science to the craze. Our goal is to give you the optimal portfolio of Pokémon to train, so you can be as effective as possible against a wide variety of opponents. As each Pokémon has its strengths and weaknesses, we grouped Pokémon with similar characteristics into clusters and then picked a few representatives that let the player compete against as many different enemies as possible.
We used the Pokémon API fan service available on the internet to find all the information about the little creatures.
The data we used consists of:
Basic identification information of the Pokémon - for example name, color, shape, happiness, growth rate or the base experience.
Properties and characteristics of the Pokémon, e.g. health or speed.
Moves - this specifies the name of a possible move and its properties.
Abilities of the Pokémon, for example stench, drizzle, speed-boost or battle-armor.
The data is available for 811 Pokémon. Although we have done the analysis for all the Pokémon, in this post, we focus only on the first 150 Pokémon as those are the ones that are available in PokémonGO. These datasets were transformed and 457 features were created for each Pokémon.
Our goal was to identify the optimal portfolio of Pokémon that every trainer should have in order to cover almost all the types. This increases the probability of winning a random fight (provided that the Pokémon have the same CP). Our approach was to cluster all of the Pokémon and then, for each cluster, find the Pokémon nearest to the cluster center, so that the portfolio is as diverse as possible.
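To make the idea concrete, here is a minimal sketch of clustering followed by picking the member nearest to each cluster center. It is not the code behind this analysis: the function names `kmeans` and `representatives`, the deterministic seeding with the first k points, and the two-feature toy data are all assumptions made for illustration.

```python
import math

def kmeans(points, k, iterations=100):
    """Lloyd's algorithm. The first k points seed the centers so runs are repeatable."""
    centers = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iterations):
        # Assignment step: every point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:  # nothing moved, so we have converged
            break
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an emptied cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

def representatives(names, points, labels, centers):
    """Return {cluster label: name of the point closest to that cluster's center}."""
    best = {}
    for name, p, lab in zip(names, points, labels):
        d = math.dist(p, centers[lab])
        if lab not in best or d < best[lab][0]:
            best[lab] = (d, name)
    return {lab: name for lab, (_, name) in best.items()}

# Invented two-feature toy data; the real analysis used 457 features per Pokémon.
names = ["a", "b", "c", "d", "e", "f"]
points = [(0, 0), (0, 2), (1, 1), (10, 10), (10, 12), (11, 11)]
labels, centers = kmeans(points, k=2)
print(representatives(names, points, labels, centers))
```

With real data, features on very different scales would normally be standardized before computing distances, otherwise the largest-valued feature dominates the clustering.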
Using the dendrogram and the elbow rule, we selected 4 as the optimal number of clusters. This also makes sense because it is a reasonable number of Pokémon for one player to train to a strong level. The output of the K-means clustering can be seen in the graphic below. The clusters overlap slightly in the plot because the data was reduced to 2 dimensions using principal component analysis.
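The elbow rule is usually applied by eye, but it can be approximated in code. The sketch below assumes you already have the within-cluster sum of squares for each candidate number of clusters, and picks the k where the curve bends most sharply (the largest second difference). The helper name `elbow` and the numbers are invented for illustration; they are not the values from this analysis, and the dendrogram part of the decision is not covered.

```python
def elbow(inertia_by_k):
    """Return the k where the inertia curve bends most sharply.

    inertia_by_k maps consecutive cluster counts to the within-cluster
    sum of squares. The bend at k is the drop gained by reaching k
    minus the drop gained by going one cluster further.
    """
    ks = sorted(inertia_by_k)
    best_k, best_bend = None, float("-inf")
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        drop_before = inertia_by_k[prev_k] - inertia_by_k[k]
        drop_after = inertia_by_k[k] - inertia_by_k[next_k]
        bend = drop_before - drop_after
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k

# Made-up inertia values for k = 1..7, for illustration only.
toy_inertia = {1: 1000, 2: 700, 3: 480, 4: 300, 5: 270, 6: 250, 7: 235}
print(elbow(toy_inertia))
```

This heuristic is only one of several ways to formalize the elbow, so it is worth checking the result against the plot rather than trusting it blindly.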
Using the unsupervised random forest algorithm, we also explored which features create the biggest differences between the Pokémon.
By far the most important feature is avg_accuracy, the average accuracy of all moves where accuracy is available. It is followed by several features of similar importance: class_no_physical, the number of moves with physical damage; target_avg_pp_all_opponents, the average power points of moves that can be used against all opponents; and type_avg_pp_normal, the average power points of moves that can be used against Pokémon of type normal.
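To show what such features look like in practice, here is a small sketch that derives three of them from a list of moves. The flat record layout, the field names (`accuracy`, `pp`, `damage_class`, `target`), the helper name `move_features` and the sample moves are assumptions for illustration, not the actual data structure used in the analysis.

```python
from statistics import mean

def move_features(moves):
    """Aggregate one Pokémon's moves into per-Pokémon features."""
    # Some moves have no accuracy value; they are left out of the average.
    accuracies = [m["accuracy"] for m in moves if m.get("accuracy") is not None]
    # Power points of moves that can target all opponents.
    pp_all_opponents = [m["pp"] for m in moves if m.get("target") == "all-opponents"]
    return {
        "avg_accuracy": mean(accuracies) if accuracies else None,
        "class_no_physical": sum(
            1 for m in moves if m.get("damage_class") == "physical"
        ),
        "target_avg_pp_all_opponents": (
            mean(pp_all_opponents) if pp_all_opponents else None
        ),
    }

# Invented moves with hypothetical field names and values.
toy_moves = [
    {"name": "move_a", "accuracy": 100, "pp": 35,
     "damage_class": "physical", "target": "selected-pokemon"},
    {"name": "move_b", "accuracy": None, "pp": 20,
     "damage_class": "status", "target": "all-opponents"},
    {"name": "move_c", "accuracy": 90, "pp": 10,
     "damage_class": "physical", "target": "all-opponents"},
]
print(move_features(toy_moves))
```

Returning `None` when no move qualifies is one possible convention; before clustering, such gaps would have to be filled or the feature dropped.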
Below you can find a couple of options for your optimal portfolio. We explain what makes each cluster of Pokémon unique and give you 5 alternatives per cluster, ordered from the ideal selection downwards.
A group of mostly grass, bug and poison Pokémon. This is the group with the most weaknesses, but its members do well against other Pokémon of the same kinds. The best choice in this group is Bellsprout.
The second group consists of mostly flying, normal and a few fire Pokémon. The representatives of this group are on average very strong against steel and fairy Pokémon. Do not use them against psychic Pokémon. The best candidate to have is Pidgey.
Psychic, fairy and electric group. Strong against ice, grass, ghost, fire and fighting Pokémon. They are on average very weak against dragons and poison counterparts. The most valuable Pokémon from this cluster is Clefairy.
Mostly water, rock and dragon Pokémon. Strong against fire and normal kinds. Not very effective against flying and fairy types. Shellder is the must-have from this group.
This is it for our Pokémon analysis. We hope it helps you get a kick-ass portfolio and become a legendary master trainer. So when you head out for the next hunt, remember to keep your eyes open for our suggested candidates. You don't wanna end up in the next gym battle unprepared, do you?