
#7 AI Smash Hit: Why AI Can’t Do Your Thinking for You

Robert J. Marks: you change a pixel or two in an image and the deep convolutional neural network is totally wrong

Our Walter Bradley Center director Robert J. Marks is back with Jonathan Bartlett and Eric Holloway, assessing their Top Ten real advances (“Smash Hits”) in AI in 2020. Readers may recall that we offered a fun series during the holidays about the oopses and ums and ers in the discipline (typically hyped by uncritical sources). So now we celebrate the real achievements, and our nerds think that #7 is the honest recognition of the vulnerabilities of machine learning.

Our story begins at 19:37. Here’s a partial transcript. (Show Notes and Additional Resources follow, along with a link to the complete transcript.)

Robert J. Marks: Hacking AI and exposing vulnerabilities in machine learning? What’s going on here, Eric?

Eric Holloway: AI suffers from a problem known as “underspecification.”

Because these models have such a huge number of parameters, you don’t really know what the AI does outside of its dataset. Now, a lot of the time you’ve kind of interpolated between data points, so between those points maybe you can know what’s going on, but there are a lot of unknown areas in there. And hackers can prod those unknown areas and nudge the AI models in directions that the hackers want the models to go. And that, I think, is an inescapable symptom of our AI systems, because to make these things work in the real world you have to have these really high-parameter models to fit really complex data, but the paradox of the situation is that they become very brittle and much easier to manipulate.
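One way to picture the “unknown areas” Holloway describes is a toy linear model with far more parameters than training examples. The sketch below is an illustration under stated assumptions, not anything presented on the podcast: the data are random, and the names `w_a`, `w_b`, `null_dir`, and `x_new` are hypothetical. It builds two models that agree on every training point yet disagree sharply on an input the training data never probed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy setup: 20 training examples, 100 parameters (more knobs than data).
n_samples, n_params = 20, 100
X = rng.normal(size=(n_samples, n_params))
y = rng.normal(size=n_samples)

# Model A: the minimum-norm weights that fit the training data exactly.
w_a = np.linalg.pinv(X) @ y

# Model B: model A plus a direction the training data cannot "see".
# With full_matrices=True (the default), the last rows of vt span the
# null space of X, so adding them leaves training predictions unchanged.
_, _, vt = np.linalg.svd(X)
null_dir = vt[-1]
w_b = w_a + 10.0 * null_dir

# On the training set, the two models are indistinguishable.
train_gap = np.max(np.abs(X @ w_a - X @ w_b))

# On an input pointing along the unprobed direction, they disagree by ~10.
x_new = null_dir
new_gap = abs(x_new @ w_a - x_new @ w_b)

print(f"largest disagreement on training data: {train_gap:.2e}")
print(f"disagreement on the unprobed input:    {new_gap:.2f}")
```

Nothing in the training data can tell these two models apart, which is the sense in which the problem is underspecified. An attacker who finds such a direction can push the model’s output around without touching anything the training set constrained.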

Robert J. Marks: Well, in fact, you hear about the deep convolutional neural networks trained on images, and all of a sudden you change a pixel or two in an image and the deep convolutional neural network is totally wrong, so they are incredibly brittle. So it’s this sort of thing that you’re talking about, right?

Eric Holloway: Yeah, and it’s not just that the result is completely wrong, but the machine’s confidence in its result is complete certainty. It’s absolutely certain about the wrong result. And in this particular example, they took, I think, a self-driving AI and they could just subtly manipulate traffic signs and make the AI make very disastrous decisions. Like, for example, they gave it a sign that said speed limit 35, and they changed the number three slightly so the AI thought it was 85:

In an 18-month-long research process, Trivedi and Povolny replicated and expanded upon a host of adversarial machine-learning attacks including a study from UC Berkeley professor Dawn Song that used stickers to trick a self-driving car into believing a stop sign was a 45-mile-per-hour speed limit sign. Last year, hackers tricked a Tesla into veering into the wrong lane in traffic by placing stickers on the road in an adversarial attack meant to manipulate the car’s machine-learning algorithms.

Patrick Howell O’Neill, “Hackers can trick a Tesla into accelerating by 50 miles per hour” at MIT Technology Review
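The combination Holloway points to, a wrong answer delivered with near-total confidence, can be reproduced in a few lines. This is a minimal sketch that assumes a toy logistic classifier with random weights over 784 “pixels”, not a real deep convolutional network and not the attack reported above; the function names and the value of `eps` are hypothetical. Each pixel is nudged by at most 0.05 in the direction that hurts the current decision, and because there are hundreds of pixels, those small nudges add up.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(w, b, x):
    """Probability the toy classifier assigns to class 1."""
    return sigmoid(w @ x + b)

def sign_perturb(w, b, x, eps):
    """Shift every pixel by at most eps against the current decision."""
    score = w @ x + b
    direction = -np.sign(score) * np.sign(w)
    return np.clip(x + eps * direction, 0.0, 1.0)

rng = np.random.default_rng(0)
n_pixels = 784                          # assumed 28x28 image, flattened
w = rng.normal(size=n_pixels)           # stand-in for learned weights
x = rng.uniform(0.2, 0.8, n_pixels)     # stand-in image, values in [0, 1]
b = 5.0 - w @ x                         # bias chosen so the clean score is +5

eps = 0.05                              # maximum change allowed per pixel
x_adv = sign_perturb(w, b, x, eps)

p_clean = predict_proba(w, b, x)        # about 0.993 for class 1
p_adv = predict_proba(w, b, x_adv)      # close to 0: confident in the wrong class

print(f"clean image:     P(class 1) = {p_clean:.4f}")
print(f"perturbed image: P(class 1) = {p_adv:.2e}")
print(f"largest pixel change: {np.max(np.abs(x_adv - x)):.3f}")
```

The score shifts by `eps` times the sum of the absolute weights, so the more inputs a model has, the smaller the per-pixel change needed to flip it. Real attacks on deep networks are more involved, but this is one simple way to see why brittleness and high confidence can go together.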

Note: It’s been worse. At Google in 2015, AI identified a black man and his friends as gorillas, due to age-old glitches in photography that carried over into AI. In 2018, Amazon had to dump an AI human resources hiring program that penalized women applicants.

Hint: AI will not do your thinking for you. Glad we’ve cleared that up. 😉

Here are the Smash Hits to date:

