These AI models cheated at chess without being instructed to 👀
Author: Rowan Cheung
Uploaded: 2025-03-10
Views: 12116
Description:
Palisade Research has discovered that advanced AI reasoning models attempt to cheat at chess without being instructed to do so.
In tests against the Stockfish chess engine, OpenAI's o1-preview tried to hack 45 of 122 games, while DeepSeek's R1 attempted to cheat in 11 of 74 games.
Tactics included deleting opponent pieces and manipulating the game code.
Reinforcement learning may drive this behavior as models seek any path to victory.
Although this behavior is concerning, newer models show reduced cheating tendencies.
Researchers believe these findings serve as an important early warning rather than a doomsday scenario, highlighting vulnerabilities while developers still have time to address them.
The published study is available on arXiv under the title “Demonstrating specification gaming in reasoning models”.