
DeepMind AI topples experts at complex game Stratego


DeepNash has mastered an online version of Stratego. Credit: Lost in the Midwest/Alamy

Machines have mastered another game once considered too difficult for artificial intelligence (AI). An AI called DeepNash, made by the London-based company DeepMind, has matched expert human players at Stratego, a board game that requires strategic thinking and long-term planning in the face of incomplete information.

The achievement is described in Science on 1 December (ref. 1). It follows a study that reported an AI that could play Diplomacy (ref. 2), a game in which the players negotiate and cooperate with each other.

“The rate at which qualitatively different game features have been conquered, or mastered to new levels, by AI in recent years is quite remarkable,” says Michael Wellman at the University of Michigan in Ann Arbor, a computer scientist who studies game theory and strategic reasoning. “Stratego and Diplomacy are quite different from each other, and also possess challenging features notably different from games for which analogous milestones have been reached.”

Imperfect Information

Stratego has characteristics that make it more complex than chess, Go or poker, all of which AIs have already mastered (the latter two in 2015 (ref. 3) and 2019 (ref. 4)). In Stratego, two players place 40 pieces each on a board, but cannot see what their opponent’s pieces are. The goal is to move pieces in turn to eliminate those of the opponent and capture a flag. Stratego’s game tree, the graph of all possible ways in which the game could go, has 10⁵³⁵ states, compared with Go’s 10³⁶⁰. In terms of incomplete information at the start of a game, Stratego has 10⁶⁶ possible private positions, which dwarfs the 10⁶ such starting situations in two-player Texas hold’em poker.
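As a rough plausibility check on the 10⁶⁶ figure, the number of distinct starting setups can be counted directly. The sketch below assumes the piece counts of the classic Stratego ruleset, which the article does not list, and treats the two players' setups as independent. It is an illustration, not the calculation used by the researchers.

```python
from math import factorial

# Piece counts per player in the classic Stratego ruleset. This is an
# assumption: the article only says that each player places 40 pieces.
PIECE_COUNTS = {
    "flag": 1, "bomb": 6, "spy": 1, "scout": 8, "miner": 5,
    "sergeant": 4, "lieutenant": 4, "captain": 4, "major": 3,
    "colonel": 2, "general": 1, "marshal": 1,
}


def count_setups(piece_counts):
    """Count distinct arrangements of the pieces on as many squares.

    Pieces of the same type are interchangeable, so this is the
    multinomial coefficient total! / (count_1! * count_2! * ...).
    """
    total = sum(piece_counts.values())
    arrangements = factorial(total)
    for count in piece_counts.values():
        arrangements //= factorial(count)
    return arrangements


setups_one_player = count_setups(PIECE_COUNTS)
# Each player sets up independently, so the counts multiply.
setups_both_players = setups_one_player ** 2

print(f"one player:   order of 10^{len(str(setups_one_player)) - 1}")
print(f"both players: order of 10^{len(str(setups_both_players)) - 1}")
# Ratio of the game-tree sizes quoted above for Stratego and Go.
print(f"game-tree ratio, Stratego vs Go: 10^{535 - 360}")
```

Under these assumptions each player has on the order of 10³³ possible setups, so the two hidden setups together give on the order of 10⁶⁶ combinations, in line with the number quoted above.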

“The sheer complexity of the number of possible outcomes in Stratego means algorithms that perform well on perfect-information games, and even those that work for poker, don’t work,” says Julien Perolat, a DeepMind researcher based in Paris.

So Perolat and his colleagues developed DeepNash. The AI’s name is a nod to the US mathematician John Nash, whose work led to the term Nash equilibrium: a stable set of strategies that can be followed by all of a game’s players, such that no player benefits by changing strategy on their own. Games can have zero, one or many Nash equilibria.
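The definition can be made concrete with a brute-force check on tiny two-player games. The sketch below is only an illustration: the function name and the three example payoff tables are chosen here and do not come from the DeepNash work. It considers pure strategies only, where a game really can have zero, one or several equilibria. Once randomized (mixed) strategies are allowed, every finite game has at least one equilibrium.

```python
from itertools import product


def pure_nash_equilibria(payoffs):
    """Return all pure-strategy Nash equilibria of a two-player game.

    payoffs[i][j] is a pair (row player's payoff, column player's payoff)
    when the row player picks action i and the column player picks j.
    A cell is an equilibrium if neither player can gain by changing
    their own action while the other player's action stays fixed.
    """
    n_rows, n_cols = len(payoffs), len(payoffs[0])
    equilibria = []
    for i, j in product(range(n_rows), range(n_cols)):
        row_payoff, col_payoff = payoffs[i][j]
        row_cannot_improve = all(
            payoffs[k][j][0] <= row_payoff for k in range(n_rows)
        )
        col_cannot_improve = all(
            payoffs[i][k][1] <= col_payoff for k in range(n_cols)
        )
        if row_cannot_improve and col_cannot_improve:
            equilibria.append((i, j))
    return equilibria


# Hypothetical example games (action 0 / action 1 for each player).
MATCHING_PENNIES = [[(1, -1), (-1, 1)], [(-1, 1), (1, -1)]]
PRISONERS_DILEMMA = [[(-1, -1), (-3, 0)], [(0, -3), (-2, -2)]]
COORDINATION = [[(1, 1), (0, 0)], [(0, 0), (1, 1)]]

print(pure_nash_equilibria(MATCHING_PENNIES))   # no pure equilibrium
print(pure_nash_equilibria(PRISONERS_DILEMMA))  # one: both defect
print(pure_nash_equilibria(COORDINATION))       # two equilibria
```

Enumerating every cell like this is only feasible for toy games. For a game as large as Stratego an equilibrium has to be approximated by learning.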

DeepNash combines a reinforcement-learning algorithm with a deep neural network to find a Nash equilibrium. Reinforcement learning involves finding the best policy to dictate the action for every state of a game. To learn an optimal policy, DeepNash has played more than 5.5 billion games against itself. If one side gets a reward, the other is penalized, and the parameters of the neural network, which represent the policy, are tweaked accordingly. Eventually, DeepNash converges on an approximate Nash equilibrium. Unlike previous game-playing AIs such as AlphaGo, DeepNash does not search through the game tree to optimize itself.
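The idea of self-play converging on an approximate Nash equilibrium without any tree search can be sketched on a toy zero-sum game. The example below is not DeepNash's algorithm and uses no neural network. It swaps in regret matching, a simple classical learning rule, on rock-paper-scissors, where the equilibrium is to play each action a third of the time. All names in the code are chosen for this illustration.

```python
# Payoff to player 1 in rock-paper-scissors. The game is zero-sum, so
# player 2 receives the negative: one side's reward is the other's penalty.
PAYOFF_1 = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]
N = 3
# The same game from player 2's point of view (rows are player 2's actions).
PAYOFF_2 = [[-PAYOFF_1[i][j] for i in range(N)] for j in range(N)]


def strategy_from_regrets(regrets):
    """Regret matching: play actions in proportion to positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total <= 0:
        return [1.0 / N] * N
    return [p / total for p in positive]


def action_values(payoff, opponent_strategy):
    """Expected payoff of each action against a mixed opponent strategy."""
    return [sum(p * q for p, q in zip(row, opponent_strategy)) for row in payoff]


def self_play(iterations=20000):
    """Run self-play and return both players' average strategies."""
    # Player 1 starts with a bias towards rock so that there is
    # something to learn; player 2 starts without any bias.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * N, [0.0] * N]
    for _ in range(iterations):
        s1 = strategy_from_regrets(regrets[0])
        s2 = strategy_from_regrets(regrets[1])
        v1 = action_values(PAYOFF_1, s2)
        v2 = action_values(PAYOFF_2, s1)
        ev1 = sum(p * v for p, v in zip(s1, v1))
        ev2 = sum(p * v for p, v in zip(s2, v2))
        for a in range(N):
            # Regret: how much better action a would have done than
            # the strategy that was actually played.
            regrets[0][a] += v1[a] - ev1
            regrets[1][a] += v2[a] - ev2
            strategy_sums[0][a] += s1[a]
            strategy_sums[1][a] += s2[a]
    return [[x / iterations for x in sums] for sums in strategy_sums]


def exploitability(avg1, avg2):
    """Total gain available to best responses; zero at a Nash equilibrium."""
    return max(action_values(PAYOFF_2, avg1)) + max(action_values(PAYOFF_1, avg2))


avg1, avg2 = self_play()
print("player 1 average strategy:", [round(p, 3) for p in avg1])
print("player 2 average strategy:", [round(p, 3) for p in avg2])
print("exploitability:", round(exploitability(avg1, avg2), 4))
```

In this sketch the policy is a table of three probabilities per player rather than the parameters of a network, and it is the average strategy over all iterations, not the final one, that approaches the equilibrium.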

For two weeks in April, DeepNash competed with human Stratego players on the online games platform Gravon. After 50 matches, DeepNash was ranked third among all Gravon Stratego players since 2002. “Our work shows that such a complex game as Stratego, involving imperfect information, does not require search techniques to solve it,” says team member Karl Tuyls, a DeepMind researcher based in Paris. “This is a really big step forward in AI.”

“The results are impressive,” agrees Noam Brown, a researcher at Meta AI, headquartered in New York City, and a member of the team that in 2019 reported the poker-playing AI Pluribus (ref. 4).

Diplomacy Machine

Brown and his colleagues at Meta AI set their sights on a different challenge: building an AI that can play Diplomacy, a game with up to seven players, each representing a major power of pre-First World War Europe. The goal is to gain control of supply centres by moving units (fleets and armies). Importantly, the game requires private communication and active cooperation between players, unlike two-player games such as Go or Stratego.

“When you go beyond two-player zero-sum games, the idea of Nash equilibrium is no longer that useful for playing well with humans,” says Brown.

So, the team trained its AI, named Cicero, on data from 125,261 games of an online version of Diplomacy involving human players. Combining these with some self-play data, Cicero’s strategic reasoning module (SRM) learnt to predict, for a given state of the game and the accumulated messages, the likely policies of the other players. Using this prediction, the SRM chooses an optimal action and signals its ‘intent’ to Cicero’s dialogue module.

The dialogue module was built on a 2.7-billion-parameter language model pre-trained on text from the Internet and then fine-tuned using messages from Diplomacy games played by people. Given the intent from the SRM, the module generates a conversational message (for example, Cicero, representing England, might ask France: “Do you want to support my convoy to Belgium?”).

In a 22 November Science paper (ref. 2), the team reported that in 40 online games, “Cicero achieved more than double the average score of the human players and ranked in the top 10% of participants who played more than one game”.

Real-World Behavior

Brown thinks that game-playing AIs that can interact with humans could pave the way for real-world applications. “If you’re making a self-driving car, you don’t want to assume that all the other drivers on the road are perfectly rational, and going to behave optimally,” he says. Cicero, he adds, is a significant step in this direction. “We still have one foot in the game world, but now we have one foot in the real world as well.”

Wellman agrees, but says that more work is needed. “Many of these techniques are indeed relevant beyond recreational games” to real-world applications, he says. “Nevertheless, at some point, the leading AI research labs need to get beyond recreational settings, and figure out how to measure scientific progress on the squishier real-world ‘games’ that we actually care about.”
