DeepMind’s StarCraft 2 AI is now better than 99.8 percent of all human players

Homo neanderthalensis

Mar 21, 2019

DeepMind today announced a new milestone for its artificial intelligence agents trained to play the Blizzard Entertainment game StarCraft II. The Google-owned AI lab’s more sophisticated software, still called AlphaStar, is now grandmaster level in the real-time strategy game, capable of besting 99.8 percent of all human players in competition. The findings are to be published in a research paper in the scientific journal Nature.

Not only that, but DeepMind says it also evened the playing field when testing the new and improved AlphaStar against human opponents who opted into online competitions this past summer. For one, it trained AlphaStar to use all three of the game’s playable races, adding to the complexity of the game at the upper echelons of pro play. It also limited AlphaStar to only viewing the portion of the map a human would see and restricted the number of mouse clicks it could register to 22 non-duplicated actions every five seconds of play, to align it with standard human movement.
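The cap described above, a budget of non-duplicated actions inside a sliding time window, can be sketched as a small rate limiter. This is purely illustrative (the class name, parameters, and dedup rule are assumptions, not DeepMind's actual interface); it only shows the mechanic the article describes:

```python
from collections import deque

class ActionLimiter:
    """Toy sketch of an action-rate cap: at most `max_actions`
    non-duplicated actions per `window` seconds (the article cites
    22 actions per 5 seconds)."""

    def __init__(self, max_actions=22, window=5.0):
        self.max_actions = max_actions
        self.window = window
        self.history = deque()   # timestamps of counted actions
        self.last_action = None

    def allow(self, action, now):
        # Repeating the previous action doesn't count against the cap.
        if action == self.last_action:
            return True
        # Drop timestamps that have fallen outside the sliding window.
        while self.history and now - self.history[0] >= self.window:
            self.history.popleft()
        if len(self.history) >= self.max_actions:
            return False         # budget exhausted: action rejected
        self.history.append(now)
        self.last_action = action
        return True
```

The sliding window (rather than a fixed per-interval quota) matters: it prevents the agent from banking an idle interval and then firing a superhuman burst of clicks at its boundary.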

Still, the AI was capable of achieving grandmaster level, the highest possible online competitive ranking, making it the first system ever to do so in StarCraft II. DeepMind sees the advancement as more proof that general-purpose reinforcement learning, which is the machine learning technique underpinning the training of AlphaStar, may one day be used to train self-learning robots and self-driving cars, and to create more advanced image and object recognition systems.

“The history of progress in artificial intelligence has been marked by milestone achievements in games. Ever since computers cracked Go, chess and poker, StarCraft has emerged by consensus as the next grand challenge,” said David Silver, a DeepMind principal research scientist on the AlphaStar team, in a statement. “The game’s complexity is much greater than chess, because players control hundreds of units; more complex than Go, because there are 10^26 possible choices for every move; and players have less information about their opponents than in poker.”

Back in January, DeepMind announced that its AlphaStar system was able to best top pro players 10 matches in a row during a prerecorded session, but it lost to pro player Grzegorz “MaNa” Komincz in a final match streamed live online. The company kept improving the system between January and June, when it said it would start accepting invites to play the best human players from around the world. The ensuing matches took place in July and August, DeepMind says.

The results were stunning: AlphaStar had become one of the most sophisticated StarCraft II players on the planet, but remarkably still not quite superhuman. Roughly 0.2 percent of players remain capable of defeating it, but it is widely considered only a matter of time before the system improves enough to crush any human opponent.

Image: DeepMind

This research milestone closely aligns with a similar one from San Francisco-based AI research company OpenAI, which has been training AI agents using reinforcement learning to play the sophisticated five-on-five multiplayer game Dota 2. Back in April, the most sophisticated version of the OpenAI Five software, as it’s called, bested the world champion Dota 2 team after only narrowly losing to two less capable e-sports teams the previous summer. The leap in OpenAI Five’s capabilities mirrors that of AlphaStar’s, and both are strong examples of how this approach to AI can produce unprecedented levels of game-playing ability.

Similar to OpenAI’s Dota 2 bots and other game-playing agents, the goal with this type of AI research is not just to crush humans in various games just to prove it can be done. Instead, it’s to prove that — with enough time, effort, and resources — sophisticated AI software can best humans at virtually any competitive cognitive challenge, be it a board game or a modern video game. It’s also to show the benefits of reinforcement learning, a special brand of machine learning that’s seen massive success in the last few years when combined with huge amounts of computing power and training methods like virtual simulation.

Like OpenAI, DeepMind trains its AI agents against versions of themselves and at an accelerated pace, so that the agents can clock hundreds of years of play time in the span of a few months. That has allowed this type of software to stand on equal footing with some of the most talented human players of Go and, now, much more sophisticated games like StarCraft and Dota.
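The self-play idea above can be sketched in miniature. The toy below is not AlphaStar's actual league algorithm (the function name, the rock-paper-scissors stand-in game, and the update rule are all assumptions for illustration): a learner plays against frozen snapshots of its past selves and nudges its mixed strategy toward replies that beat that league.

```python
import random

def self_play_league(iters=20000, lr=0.01, snapshot_every=1000, seed=0):
    """Toy self-play loop in the spirit of league training: the learner
    plays rock-paper-scissors against frozen copies of its past selves,
    periodically freezing a new snapshot into the league."""
    rng = random.Random(seed)
    moves = 3                              # 0 = rock, 1 = paper, 2 = scissors
    beats = {0: 2, 1: 0, 2: 1}             # move -> the move it beats
    policy = [1.0, 0.0, 0.0]               # start as a pure rock player
    league = [list(policy)]                # frozen past versions of the agent

    def sample(p):
        r, acc = rng.random(), 0.0
        for m, pm in enumerate(p):
            acc += pm
            if r <= acc:
                return m
        return moves - 1

    for t in range(1, iters + 1):
        opponent = rng.choice(league)      # face a random past self
        opp_move = sample(opponent)
        best_reply = next(m for m, b in beats.items() if b == opp_move)
        # Shift probability mass toward the reply that beats the opponent.
        for m in range(moves):
            target = 1.0 if m == best_reply else 0.0
            policy[m] += lr * (target - policy[m])
        if t % snapshot_every == 0:
            league.append(list(policy))    # freeze a snapshot into the league
    return policy
```

Playing against a growing league of snapshots, rather than only the latest self, is what discourages the degenerate chase-your-own-tail cycles that plain self-play can fall into; AlphaStar's league training is a far more elaborate version of the same idea.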

Yet the software is still restricted to the narrow discipline it’s designed to tackle. The Go-playing agent cannot play Dota, and vice versa. (DeepMind did let a more general-purpose version of its Go-playing agent try its hand at chess, which it mastered in a matter of eight hours.) That’s because the software isn’t programmed with easy-to-replace rule sets or directions. Instead, DeepMind and other research institutions use reinforcement learning to let the agents figure out how to play on their own, which is why the software often develops novel and wildly unpredictable play styles that have since been adopted by top human players.

“AlphaStar is an intriguing and unorthodox player — one with the reflexes and speed of the best pros but strategies and a style that are entirely its own. The way AlphaStar was trained, with agents competing against each other in a league, has resulted in gameplay that’s unimaginably unusual; it really makes you question how much of StarCraft’s diverse possibilities pro players have really explored,” Diego “Kelazhur” Schwimer, a pro player for team Panda Global, said in a statement. “Though some of AlphaStar’s strategies may at first seem strange, I can’t help but wonder if combining all the different play styles it demonstrated could actually be the best way to play the game.”

DeepMind hopes advances in reinforcement learning achieved by its lab and fellow AI researchers may be more widely applicable at some point in the future. The most likely real-world application for such software is robotics, where the same techniques can train AI agents to perform real-world tasks, like the operation of robotic hands, in virtual simulation. Then, after simulating years upon years of motor control, the AI can take the reins of a physical robotic arm, and maybe one day even control full-body robots. But DeepMind also sees increasingly sophisticated — and therefore safer — self-driving cars as another venue for its specific approach to machine learning.

Correction: A previous version of this article stated DeepMind restricted AlphaStar to 20 actions every five minutes. That is incorrect; the restriction was to 22 non-duplicated actions every five seconds. We regret the error.

Where Do You Find Them?
Sep 15, 2019
There's a massive problem with AIs like this that never seems to come up. They're still vulnerable to the same anti-AI techniques that you can use against normal bots. You find a mistake that they make or a strategy they can't handle and you repeatedly fuck them with it. They always talk about how it "only took a few hours of training" but that time actually represents millions of games. They learn very slowly and lose repeatedly to the best players. A top 0.2 percent human player will be able to take a few games from the world's best. The AIs can't once they've been figured out, which, in both Go and StarCraft, the humans managed to do within a handful of games. Reinforcement learning, while it has shown some great results, is very bad for this as well, because even knowing the killer strategy that the humans are using you can't quickly train the AI to beat it.

We might get self driving cars that are better than most humans (if only because most humans aren't great drivers) within the next 10 years. Imagining they'll be useful for general purpose applications is pure fantasy. They can't even handle novel circumstances within their tiny area of expertise. It'll come one day I'm sure, but we're a long way off.

Heather Mason

Best Girl
True & Honest Fan
Mar 6, 2017
I don't think these dorks realize how big of a power gap exists between 99.8% of players and the top 0.2%. The gap between top 10 and top 100 is probably even bigger than that. Robots drool human rule

Kaede Did Nothing Wrong

True & Honest Fan
Apr 10, 2019
why you guys gotta be so salty? deepmind is really impressive AI and it's fun they train it with games. here's casted commentary of some of its games, gameplay starts at 14:30

No Exit

Hit and Run attack!
True & Honest Fan
Oct 5, 2019
Maybe I've just gotten old and AI learning has lost its sheen but I don't see what the big deal is anymore. Teaching most AIs to play games seems pointless and dumb to me, especially when talking about using them for things offline in an unpredictable environment. Like self-driving cars are cool and all but they're only really safer when all other cars are self-driving. The only real improvement in something like that is the reaction and calculation time, and that hardly requires a sophisticated AI to implement.
All this pointless maximization really does have me believing we're going to end in the paperclip apocalypse though.


It's not gay if you wear a condom
Aug 31, 2019
no shit an AI is going to be better at a strategy game based on fast clicking and constant knowledge of what's happening across the map...


pretends to be Russian for stickers
Sep 9, 2018
That'd probably be a lot easier since you could just build a bunch of high-speed tanks with sawblades that dismember all the opposing players.
they'd all get red cards, but the secret is that you bring more teammates than they do so that you have one tank that can't get a red card as the entirety of the enemy team is dead.


I, Scout, humbly present a toast to Miss Pauling!
Apr 16, 2019
the Korean people
Why’d you have to say it like that?

AI and robots are really fucking cool. I don’t care if they have the potential to kill everybody.


Tactical Autism Response Division
True & Honest Fan
Mar 11, 2019
This is a black day for the Korean people, whose only national accomplishment is being really fucking good at Starcraft.
They're also good at alcoholism
worst korea gets hammered.png

So it's not a complete wash