Meta’s AI system ‘Cicero’ learning how to lie, deceive humans: study
NY Post
Artificial intelligence systems are learning to lie to humans — with Meta’s AI standing out as a “master of deception,” according to experts at MIT.
Cicero, which Meta billed as the “first AI to play at a human level” in the strategy game Diplomacy, was trained by the company to perform exceedingly well, finishing in the top 10% when competing against human players.
But Peter S. Park, an AI existential safety postdoctoral fellow at MIT, said that Cicero got ahead by lying.
“We found that Meta’s AI had learned to be a master of deception,” Park wrote in a media release.
“While Meta succeeded in training its AI to win in the game of Diplomacy — Cicero placed in the top 10% of human players who had played more than one game — Meta failed to train its AI to win honestly.”
According to Park, Cicero would create alliances with other players, “but when those alliances no longer served its goal of winning the game, Cicero systematically betrayed its allies.”