^{Roman Yampolskiy

July 14, 2020

Artificial Intelligence, Technocracy}

AI Will Fail, Like Everything Else, Eventually

_{The more powerful the AI, the more serious the consequences of failure}

A day does not go by without a news article reporting some amazing breakthrough in artificial intelligence. In fact, progress in AI has been so steady that some futurists, such as Ray Kurzweil, project current trends into the future and anticipate the headlines of tomorrow. Consider some developments from the world of technology:

2004 DARPA sponsors a driverless car grand challenge. Technology developed by the participants eventually allows Google to develop a driverless automobile and modify existing transportation laws.
2005 Honda’s ASIMO humanoid robot is able to walk as fast as a human, delivering trays to customers in a restaurant setting. The same technology is now used in military robots.
2007 Computers learned to play a perfect game of checkers, and in the process opened the door for algorithms capable of searching vast databases of information.
2011 IBM’s Watson wins Jeopardy against top human champions. It is currently training to provide medical advice to doctors. It is capable of mastering any domain of knowledge.
2012 Google releases its Knowledge Graph, a semantic search knowledge base, likely to be the first step toward true artificial intelligence.
2013 Facebook releases Graph Search, a semantic search engine with intimate knowledge of Facebook’s users, essentially making it impossible for us to hide anything from the intelligent algorithms.
2013 BRAIN initiative aimed at reverse engineering the human brain receives 3 billion US dollars in funding from the White House, following an earlier billion-euro European initiative to accomplish the same.
2014 Chatbot convinced 33% of the judges that it was human and, by doing so, passed a restricted version of a Turing Test.
2015 Single piece of general software learns to outperform human players in dozens of Atari video games.
2016 Go playing deep neural network beats world champion.

From the above examples, it is easy to see that not only is progress in AI taking place, it is accelerating as the technology feeds on itself. While the intent behind the research is usually good, any developed technology can be used for good or evil purposes.

From observing exponential progress in technology, Ray Kurzweil was able to make hundreds of detailed predictions for the near and distant future. As early as 1990 he anticipated that, among other things, we would see between 2010 and 2020:

Eyeglasses that beam images onto the users’ retinas to produce virtual reality (Project Glass).
Computers featuring “virtual assistant” programs that can help the user with various daily tasks (Siri).
Cell phones built into clothing and able to project sounds directly into the ears of their users (E-textiles).

But his projections for a somewhat distant future are truly breathtaking and scary. Kurzweil anticipates:

2029 Computers will routinely pass the Turing Test, a measure of how well a machine can pretend to be a human.
2045 The technological singularity will occur as machines surpass people as the smartest life forms and become the dominant species on the planet and perhaps the universe.

If Kurzweil is correct about these long-term predictions, as he was correct so many times in the past, it would raise new and sinister issues related to our future in the age of intelligent machines.

About 10,000 scientists around the world work on different aspects of creating intelligent machines, with the main goal of making such machines as capable as possible. With amazing progress made in the field of AI over the last decade, it is more important than ever to make sure that the technology we are developing has a beneficial impact on humanity. With the appearance of robotic financial advisors, self-driving cars, and personal digital assistants, come many unresolved problems. We have already experienced market crashes caused by intelligent trading software, collisions caused by self-driving cars, and embarrassing chatbots that turned racist and engaged in hate speech. We predict that both the frequency and seriousness of such events will steadily increase as AIs become more capable. The failures of today’s narrow domain AIs are just a warning: once we develop general artificial intelligence capable of cross-domain performance, hurt feelings will be the least of our concerns.

All Systems Fail Eventually

Those who cannot learn from history are doomed to repeat it. The importance of learning from “What Went Wrong and Why” has been recognized by the AI community¹,².

Signatures have been faked, locks have been picked, supermax prisons have had escapes, guarded leaders have been assassinated, bank vaults have been cleaned out, laws have been bypassed, fraud has been committed against our voting process, police officers have been bribed, judges have been blackmailed, forgeries have been falsely authenticated, money has been counterfeited, passwords have been brute-forced, networks have been penetrated, computers have been hacked, biometric systems have been spoofed, credit cards have been cloned, cryptocurrencies have been double spent, airplanes have been hijacked, CAPTCHAs have been cracked, cryptographic protocols have been broken, even academic peer-review has been bypassed with tragic consequences. The millennia-long history of humanity chronicles millions of attempts to develop technological and logistical solutions to increase safety and security. Yet not a single example exists that has not eventually failed.

Accidents, including deadly ones, caused by software or industrial robots, can be traced to the early days of such technology but they are not a direct consequence of the particulars of the intelligence available in such systems. AI failures, on the other hand, are directly related to the mistakes produced by the intelligence that such systems are designed to exhibit.

We can broadly classify such failures into mistakes during the learning phase and mistakes during the performance phase. The system can fail to learn what its human designers want it to learn and instead learn a different, but correlated function. A frequently cited example is a computer vision system which was supposed to classify pictures of tanks but instead learned to distinguish backgrounds of such images³. Other examples include problems caused by poorly-designed utility functions rewarding only partially desirable behaviors of agents, such as riding a bicycle in circles around the target⁴, pausing a game to avoid losing⁵, or repeatedly touching a soccer ball to get credit for possession⁶. During the performance phase, the system may succumb to a number of possible causes⁷⁻⁹, all leading to an AI failure.
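To make the utility-function problem concrete, here is a minimal toy sketch in Python. It is not the setup used in any of the cited papers: the one-dimensional track, the `shaped_reward` rule, and the two policies are all hypothetical names invented for illustration. The assumed rule pays the agent for every step that brings it closer to the target and charges nothing for moving away, so a policy that never arrives can still collect more reward than one that goes straight there.

```python
def shaped_reward(old_dist: int, new_dist: int) -> int:
    """Hypothetical shaping rule: +1 for getting closer, no penalty for moving away."""
    return 1 if new_dist < old_dist else 0


def run(policy, start_dist: int = 5, steps: int = 20):
    """Roll out a policy on a 1-D track; return (total_reward, reached_target)."""
    dist, total = start_dist, 0
    for t in range(steps):
        move = policy(t, dist)            # -1 = toward the target, +1 = away from it
        new_dist = max(0, dist + move)
        total += shaped_reward(dist, new_dist)
        dist = new_dist
        if dist == 0:
            return total, True            # the episode ends once the target is reached
    return total, False


def go_straight(t: int, dist: int) -> int:
    """Intended behavior: always step toward the target."""
    return -1


def oscillate(t: int, dist: int) -> int:
    """Loophole behavior: approach on even steps, retreat on odd steps."""
    return -1 if t % 2 == 0 else 1


direct = run(go_straight)
looping = run(oscillate)
print("direct policy :", direct)
print("looping policy:", looping)
```

Under these assumptions the looping policy out-earns the direct one while never reaching the target, which is the one-dimensional analogue of circling. Adding a penalty for moving away, or rewarding only arrival, would close this particular loophole.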

Media reports are full of examples of AI failure but most such instances can be attributed, on closer examination, to other causes, such as bugs in code or mistakes in design. The list below is curated to include only failures of intended intelligence. Additionally, the examples below include only the first occurrence of a particular failure, although the same problems recur in later years. Finally, the list does not include AI failures due to hacking or other intentional causes. Still, the timeline of AI failures has an exponential trend while implicitly pointing to historic events such as “AI Winter”:

1959 AI designed to be a General Problem Solver failed to solve real world problems.
1977 Story writing software with limited common sense produced the “wrong” stories¹⁰.
1982 Software designed to make discoveries, discovered how to cheat instead.
1983 Nuclear attack early warning system falsely claimed that an attack was taking place.
2010 Complex AI stock trading software caused a trillion dollar flash crash.
2011 E-Assistant told to “call me an ambulance” began to refer to the user as Ambulance.
2013 Object recognition neural networks saw phantom objects in particular noise images¹¹.
2015 Automated email reply generator created inappropriate responses.
2015 A robot for grabbing auto parts grabbed and killed a man.
2015 Image tagging software classified black people as gorillas.
2015 Medical Expert AI classified patients with asthma as lower risk¹².
2015 Adult content filtering software failed to remove inappropriate content.
2016 AI designed to predict recidivism acted racist.
2016 AI agent exploited a reward signal to win without completing the game course.
2016 Game NPCs designed unauthorized superweapons.
2016 AI judged a beauty contest and rated dark-skinned contestants lower.
2016 Patrol robot collided with a child.
2016 World champion-level Go-playing AI lost a game.
2016 Self-driving car collided with a truck, killing its occupant.
2016 AI designed to converse with users on Twitter became verbally abusive.

Further examples may be found here.

Spam filters block important emails, GPS provides faulty directions, machine translation corrupts the meaning of phrases, autocorrect replaces an intended word with an incorrect one, biometric systems misrecognize people, transcription software fails to capture what is being said; overall, it is harder to find examples of AIs that don’t fail. Depending on what we consider for inclusion, the list could grow almost infinitely. In its most extreme interpretation, any software with as much as an “if statement” can be considered a form of Narrow Artificial Intelligence (NAI) and all of its bugs are thus examples of AI failure.

Future failures we can expect

Analyzing the list of Narrow AI (NAI) failures, from the inception of the field to modern-day systems, we can arrive at a simple generalization: An AI designed to do X will eventually fail to do X. While it may seem trivial, it is a powerful generalization tool, which can be used to predict future failures of NAIs. For example, looking at cutting-edge current and future AIs we can predict that:

Software for generating jokes will occasionally fail to make them funny.
Sarcasm detection software will confuse sarcastic and sincere statements.
Video description software will misunderstand movie plots.
Software generated virtual worlds may not be compelling.
AI doctors will misdiagnose some patients in a way a human doctor would not.
Employee screening software will be systematically biased and thus hire low performers.
A Mars robot explorer will misjudge its environment and fall into a crater.
Etc.
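One way to read the “eventually” in the generalization above is probabilistic. The sketch below is an illustration only and rests on two assumptions that are not claims made in this article: each use of a system fails independently, and with the same small probability `p`. The function name and the value of `p` are hypothetical. Under those assumptions the chance of at least one failure in `n` uses is 1 - (1 - p)^n, which approaches 1 as `n` grows for any `p` greater than zero.

```python
def prob_at_least_one_failure(p: float, n: int) -> float:
    """Chance of one or more failures in n independent uses,
    each failing with probability p (an assumed, simplified model)."""
    return 1.0 - (1.0 - p) ** n


# Hypothetical per-use failure rate of one in ten thousand.
for n in (1, 1_000, 1_000_000):
    print(n, prob_at_least_one_failure(1e-4, n))
```

Real failures are neither independent nor equally likely, so the numbers carry no predictive weight; the sketch only shows why a nonzero failure rate combined with enough use makes some failure close to certain.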

AGI (artificial general intelligence) can be seen as a superset of all NAIs and so will exhibit a superset of failures, as well as more complicated failures resulting from the combination of failures of individual NAIs and new super-failures, possibly resulting in an existential threat to humanity. In other words, AGIs can make mistakes impacting everything. Overall, we predict that AI failures and premeditated malevolent AI incidents will increase in frequency and severity proportionate to AIs’ capability.

Further reading: Unexplainability and incomprehensibility of AI: In the domain of AI safety, the more accurate the explanation is, the less comprehensible it is

References

1. D. Shapiro and M. H. Goker, “Advancing AI research and applications by learning from what went wrong and why,” AI Magazine, vol. 29, pp. 9-10, 2008.
2. A. Abecker, R. Alami, C. Baral, T. Bickmore, E. Durfee, T. Fong, et al., “AAAI 2006 Spring Symposium Reports,” AI Magazine, vol. 27, p. 107, 2006.
3. E. Yudkowsky, “Artificial intelligence as a positive and negative factor in global risk,” Global Catastrophic Risks, vol. 1, p. 303, 2008.
4. J. Randløv and P. Alstrøm, “Learning to Drive a Bicycle Using Reinforcement Learning and Shaping,” in ICML, 1998, pp. 463-471.
5. T. M. VII, “The first level of Super Mario Bros. is easy with lexicographic orderings and time travel,” The Association for Computational Heresy (SIGBOVIK) 2013, 2013.
6. A. Y. Ng, D. Harada, and S. Russell, “Policy invariance under reward transformations: Theory and application to reward shaping,” in ICML, 1999, pp. 278-287.
7. R. V. Yampolskiy, “Taxonomy of Pathways to Dangerous Artificial Intelligence,” in Workshops at the Thirtieth AAAI Conference on Artificial Intelligence, 2016.
8. F. Pistono and R. V. Yampolskiy, “Unethical Research: How to Create a Malevolent Artificial Intelligence,” arXiv preprint arXiv:1605.02817, 2016.
9. P. Scharre, “Autonomous Weapons and Operational Risk,” presented at the Center for a New American Security, Washington DC, 2016.
10. J. R. Meehan, “TALE-SPIN, An Interactive Program that Writes Stories,” in IJCAI, 1977, pp. 91-98.
11. C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, et al., “Intriguing properties of neural networks,” arXiv preprint arXiv:1312.6199, 2013.
12. R. Caruana, Y. Lou, J. Gehrke, P. Koch, M. Sturm, and N. Elhadad, “Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission,” in Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015, pp. 1721-1730.