They found 6 groups because they set K = 6. They can set K to whatever they want and it'll produce a result. That's how the thing works! It splits off the outgroup. Watch it go from K = 2 to K = 6. At K = 2 it identifies the dividing line at about the Himalayas. At K = 3 the Sinai appears. At K = 4, the Bering Strait. At K = 5, the Pacific Ocean. At this point they stop, because it's about to start splitting off tiny islands and isolated communities and won't reveal much of use. Already at K = 6 it was starting to split off odd groups such as the Kalash rather than any recognised "races". Furthermore, each of those pixels on the figure is an individual, and look at where they sourced their ethnicities. Look how over-represented the Orkney Islanders and the Basque people are!
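To make the "K is whatever you set it to" point concrete, here is a rough sketch. It is not the program used in the paper; I've swapped in a plain one-dimensional k-means as a stand-in, and the data (600 made-up "individuals" spread perfectly evenly along a line, with no gaps at all) and the helper name `kmeans_1d` are my own assumptions for illustration. The clustering still hands back exactly K groups every time.

```python
import numpy as np

def kmeans_1d(x, k, iters=50):
    """Plain Lloyd's k-means on 1-D data (a stand-in, not the paper's method)."""
    # Start the centres at evenly spaced quantiles of the data.
    centers = np.quantile(x, (np.arange(k) + 0.5) / k)
    for _ in range(iters):
        # Assign every point to its nearest centre.
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        # Move each centre to the mean of the points assigned to it.
        centers = np.array([x[labels == j].mean() for j in range(k)])
    return labels

# Hypothetical data: perfectly continuous variation, no real groups anywhere.
x = np.linspace(0.0, 1.0, 600)

for k in range(2, 7):
    labels = kmeans_1d(x, k)
    # The number of groups reported is simply the K that was asked for.
    print("K =", k, "-> groups found:", len(np.unique(labels)))
```

The data have no structure whatsoever, yet asking for 6 groups yields 6 groups. The number of clusters is an input, not a discovery.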
The paper is a well-constructed exercise in identifying population structure, but it is a poor sample of the human race, because sampling the human race is not what they intended to do, and you should not use it that way!
As they themselves said:
"Because most alleles are widespread, genetic differences among human populations derive mainly from gradations in allele frequencies rather than from distinctive “diagnostic” genotypes. Indeed, it was only in the accumulation of small allele-frequency differences across many loci that population structure was identified."
Human genetic diversity is clinal. Draw a line across a continent, graph a locus or several loci, and the populations from one end to the other will go from all red to all blue by gradation (assuming no recent migrations or evolutionary pressure on the locus, as with adult lactase production).
This is why they need to analyse many hundreds or thousands of loci to produce a result. They need to sum 400 clines to generate the power to make groups appear: 400 clines to identify the blips at major geographical barriers such as the Himalayas and the Pacific Ocean.
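A quick simulation shows why the summing matters. This is only a sketch under made-up assumptions: two hypothetical populations whose allele frequencies differ by a small amount at every locus (0.45 versus 0.55, numbers I picked, not values from the paper), diploid individuals, and a crude "add up the allele counts and compare to the midpoint" rule rather than anything the authors actually ran. The function name `accuracy` and all parameters are mine.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

def accuracy(n_loci, n_per_pop=500, p_a=0.45, p_b=0.55):
    """Fraction of simulated individuals assigned to the right population."""
    # Diploid genotypes: 0, 1 or 2 copies of the allele at each locus.
    geno_a = rng.binomial(2, p_a, size=(n_per_pop, n_loci))
    geno_b = rng.binomial(2, p_b, size=(n_per_pop, n_loci))
    # Sum the allele counts across all loci for each individual.
    score_a = geno_a.sum(axis=1)
    score_b = geno_b.sum(axis=1)
    # Midpoint between the two populations' expected total scores.
    threshold = n_loci * (p_a + p_b)
    correct = (score_a < threshold).sum() + (score_b > threshold).sum()
    return correct / (2 * n_per_pop)

for n_loci in (1, 10, 100, 400):
    print(n_loci, "loci -> fraction assigned correctly:", accuracy(n_loci))
```

With a single locus the two populations are nearly indistinguishable, and the assignment only becomes reliable once hundreds of tiny differences are stacked on top of each other. No single locus is "diagnostic"; the signal exists only in the accumulation.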