In his forthcoming book, Human Compatible: Artificial Intelligence and the Problem of Control, Wadham Honorary Fellow Stuart Russell suggests that we can rebuild AI on a new foundation, according to which machines are designed to be inherently uncertain about the human preferences they are required to satisfy.
Such machines would be humble, altruistic, and committed to pursuing our objectives, not theirs. This new foundation would allow us to create machines that are provably deferential and provably beneficial.
Longlisted for the 2019 Financial Times and McKinsey Business Book of the Year Award, the book begins by exploring the idea of intelligence in humans and in machines, describes the benefits we can expect, from intelligent personal assistants to vastly accelerated scientific research, and outlines the AI breakthroughs that still have to happen before we reach superhuman AI. Russell also spells out the ways humans are already finding to misuse AI, from lethal autonomous weapons to the manipulation of opinions on a global scale.
“In the popular imagination, superhuman artificial intelligence is an approaching tidal wave that threatens not just jobs and human relationships, but civilization itself. Conflict between humans and machines is seen as inevitable and its outcome all too predictable,” says Stuart, who believes that this scenario can be avoided.
Russell (Physics, 1979), Professor of Computer Science at the University of California, Berkeley, is the author of the standard text in AI, Artificial Intelligence: A Modern Approach (with Peter Norvig). Although he is a computer scientist, he received a prestigious Andrew Carnegie Fellowship earlier this year—an award that supports high-calibre scholarship in the social sciences and humanities. The anticipated result of each fellowship is the publication of a book or major study that offers a fresh perspective on a pressing challenge of our time; Stuart’s research project is titled Provably Beneficial AI.