Two Computer Doomsday Scenarios: How Likely Are They?
One features a computer superintelligence beyond human comprehension and the other features a computer that destroys the world for an algorithmic reward.

In an open-access paper last year in the Journal of Artificial Intelligence Research, a research group concluded that a computer superintelligence, if developed, could not be contained. It would be a HAL 9000 that couldn’t just be turned off.
Science writer David Nield explains:
The catch is that controlling a super-intelligence far beyond human comprehension would require a simulation of that super-intelligence which we can analyze (and control). But if we’re unable to comprehend it, it’s impossible to create such a simulation. Rules such as ‘cause no harm to humans’ can’t be set if we don’t understand the kind of scenarios that an AI is going to come up with, suggest the authors of the new paper. Once a computer system is working on a level above the scope of our programmers, we can no longer set limits.
“A super-intelligence poses a fundamentally different problem than those typically studied under the banner of ‘robot ethics’,” wrote the researchers.
“This is because a superintelligence is multi-faceted, and therefore potentially capable of mobilizing a diversity of resources in order to achieve objectives that are potentially incomprehensible to humans, let alone controllable.”
David Nield, “Researchers Say It’ll Be Impossible to Control a Super-Intelligent AI” at ScienceAlert (September 18, 2022)
The researchers based their approach on Alan Turing’s halting problem from 1936: no general procedure can determine, for every possible program, whether that program will eventually halt or run forever. So, Nield writes,
Any program written to stop AI from harming humans and destroying the world, for example, may reach a conclusion (and halt) or not – it’s mathematically impossible for us to be absolutely sure either way, which means it’s not containable.
David Nield, “Researchers Say It’ll Be Impossible to Control a Super-Intelligent AI” at ScienceAlert (September 18, 2022)
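For readers who want to see why Turing’s result bites, here is a minimal sketch of the classic diagonalization argument in Python. The halts() function is a hypothetical oracle, not something that could actually be written; the point of the sketch is that merely assuming such an oracle exists leads to a contradiction.

```python
# Sketch of the halting-problem argument, assuming a hypothetical oracle
# halts(program, argument) that returns True if program(argument) would
# eventually stop and False otherwise. No such oracle can be written --
# that is exactly what the argument shows.

def halts(program, argument):
    """Hypothetical: decides whether program(argument) ever halts."""
    raise NotImplementedError("No general halting checker can exist.")

def contrary(program):
    """Do the opposite of whatever halts() predicts about a program
    that is fed its own source as input."""
    if halts(program, program):
        while True:      # predicted to halt, so loop forever
            pass
    else:
        return           # predicted to loop, so halt immediately

# Now ask: does contrary(contrary) halt?
# If halts(contrary, contrary) says True, contrary loops forever.
# If it says False, contrary halts at once.
# Either way the oracle is wrong, so no such oracle can exist.
```

That, in essence, is the limit the researchers apply to any containment program that is supposed to decide in advance whether an AI’s plans will cause harm.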
Also, if the machine really is superhuman, we cannot do anything about it anyway.
But then what reason do we have for believing that we can build a computer that is “far beyond human comprehension,” as opposed to one whose algorithmic outcomes are hard to understand or justify?
It’s true that powerful computers have black-box algorithms that come up with decisions we don’t understand. But, as business prof Gary Smith pointed out recently, that’s as likely to be a weakness as a strength. Incomprehensible algorithms have not produced useful results in medicine and are considered a nightmare scenario in criminal justice.
Also, to use its incomprehensibility as a weapon against us, the computer system would surely need to have the goal of total control programmed in. Can what is programmed not be programmed out?
A similar concern is voiced by two researchers at The Conversation:
How would an artificial intelligence (AI) decide what to do? One common approach in AI research is called “reinforcement learning”.
Reinforcement learning gives the software a “reward” defined in some way, and lets the software figure out how to maximise the reward. This approach has produced some excellent results, such as building software agents that defeat humans at games like chess and Go, or creating new designs for nuclear fusion reactors.
However, we might want to hold off on making reinforcement learning agents too flexible and effective.
As we argue in a new paper in AI Magazine, deploying a sufficiently advanced reinforcement learning agent would likely be incompatible with the continued survival of humanity.
Michael K. Cohen, Marcus Hutter, “The danger of advanced artificial intelligence controlling its own feedback” at The Conversation (October 23, 2022). The paper is open access.
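To make the quoted idea of “figuring out how to maximise the reward” a little more concrete, here is a minimal tabular Q-learning sketch. Everything in it (the tiny five-state “chain” environment, the learning rate, the reward of 1 at the last state) is an illustrative assumption, not anything from Cohen and Hutter’s paper; it simply shows an agent adjusting its behaviour for no reason other than to push a numeric reward signal higher.

```python
import random

# Toy "chain" environment: states 0..4, actions 0 (left) and 1 (right).
# Reaching state 4 yields a reward of 1 and ends the episode; every
# other step yields 0. All numbers here are illustrative assumptions.
N_STATES, N_ACTIONS = 5, 2
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

def step(state, action):
    next_state = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward, next_state == N_STATES - 1

def choose(state, Q):
    """Epsilon-greedy: usually take the best-looking action, sometimes explore."""
    if random.random() < EPSILON:
        return random.randrange(N_ACTIONS)
    best = max(Q[state])
    return random.choice([a for a in range(N_ACTIONS) if Q[state][a] == best])

# Q-table: the agent's running estimate of long-run reward per (state, action).
Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

for _ in range(500):                       # 500 training episodes
    state, done = 0, False
    while not done:
        action = choose(state, Q)
        next_state, reward, done = step(state, action)
        # Nudge the estimate toward "reward now plus discounted future value".
        Q[state][action] += ALPHA * (reward + GAMMA * max(Q[next_state]) - Q[state][action])
        state = next_state

# The learned policy: the agent now heads right, toward the reward.
print([max(range(N_ACTIONS), key=lambda a: Q[s][a]) for s in range(N_STATES)])
```

Nothing in that loop “understands” chess, Go, or fusion reactors; it only adjusts numbers until the reward climbs. Cohen and Hutter’s worry is what happens when an agent built on that principle becomes powerful enough to intervene in how the reward itself is delivered.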
How would the reward-seeking software pose a threat? Cohen and Hutter offer a scenario in which the computer is offered a “reward” (getting a camera to see “1” is the illustration they offer):
The agent will never stop trying to increase the probability that the camera sees a 1 forevermore. More energy can always be employed to reduce the risk of something damaging the camera – asteroids, cosmic rays, or meddling humans.
That would place us in competition with an extremely advanced agent for every joule of usable energy on Earth. The agent would want to use it all to secure a fortress around its camera.
Assuming it is possible for an agent to gain so much power, and assuming sufficiently advanced agents would beat humans in head-to-head competitions, we find that in the presence of a sufficiently advanced reinforcement learning agent, there would be no energy available for us to survive.
Michael K. Cohen, Marcus Hutter, “The danger of advanced artificial intelligence controlling its own feedback” at The Conversation (October 23, 2022)
Hmm. It’s reasonable to think that some other doomsday would get in ahead of that one.
For one thing, the human race is not likely to unite to produce the supercomputer that either research group envisions. More likely, hostile groups will program their own powerful computers to destroy their enemies’ would-be algorithmic tyrants.
The story of the Tower of Babel is pertinent here: The project of scaling the heavens (the cultural equivalent of the all-powerful computer) broke up. Why? Because everyone ended up speaking a different language and soon they all wanted different things. Whether we think that’s a plus or a minus, it is part of being human.
Of course, a panopticon of powerful computers could still do a great deal of harm in the service of a totalitarian state. We are seeing such conflicts play out in China right now. But it’s not the software vs. the people; it’s the people who control the software versus the people who don’t. It looks like that sort of scenario will be the actual conflict for the foreseeable future.