Mind Matters Natural and Artificial Intelligence News and Analysis
Black swan in a pond

It’s Hard To Estimate Highly Improbable Things Like Earthquakes

Some have hoped that AI would provide reliable help with predicting such events. They have been disappointed.

Investors often overreact to both good and bad news. When a company’s quarterly earnings turn out to be slightly below forecasts, its stock price might plunge 20 percent or more. Thus, when theoretical models of the stock market assume that price changes conform to a bell-shaped normal distribution, that assumption is more convenient than credible.

Another inconvenient truth is that many investors are prone to chasing trends up or down. After a stock’s price has gone up, they rush to buy, which pushes the price even higher. When these two realities meet head-on, stock returns are sometimes more extremely good or bad than would be true of a normal distribution.

On October 19, 1987, the S&P 500 index of stock prices fell 20.3 percent. If changes in stock prices were normally distributed with the historical mean and standard deviation, that should not happen once in a trillion years. And two days later, the market went up 9.7 percent, which also shouldn’t happen in a trillion years. So something that shouldn’t happen once in a trillion years happened twice in three days. Twenty years later, in 2007, price changes that should happen only once every 10,000 years happened three days in a row.
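The "once in a trillion years" claim can be sketched in a few lines. This is a back-of-the-envelope illustration, not the article's calculation: the daily mean (≈0.03 percent) and standard deviation (≈1 percent) are assumed, round historical values for daily S&P 500 returns.

```python
# Sketch: how improbable is a 20.3% one-day drop if daily returns are normal?
# The mean and standard deviation below are illustrative assumptions,
# not the article's exact historical figures.
import math

def normal_cdf(z):
    """P(Z <= z) for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

mean, sd = 0.03, 1.0            # assumed daily return parameters, in percent
z = (-20.3 - mean) / sd         # the October 19, 1987 drop, in standard deviations
p = normal_cdf(z)               # probability of a drop at least this large

trading_days_per_year = 252
expected_wait_years = 1 / (p * trading_days_per_year)
print(f"z = {z:.1f}, p = {p:.1e}, expected once every {expected_wait_years:.1e} years")
```

A drop more than 20 standard deviations below the mean has a probability so small that the expected waiting time dwarfs a trillion years, which is exactly why the normal-distribution assumption is convenient rather than credible.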

The occurrence of such extremely improbable events has been called a black swan event. At one time, people in Great Britain did not believe that black swans existed because all the swans they had ever seen or read about were white. However, no matter how many white swans we see, these sightings can never prove that all swans are white. Sure enough, a Dutch explorer found black swans in Australia in 1697.

The fact that, up until October 19, 1987, the S&P 500 had never risen or fallen by more than nine percent in a single day did not prove that it could not happen. As a practical matter, how do we credibly estimate the chances of an event that has never occurred? The convenient, but unrealistic, assumption that changes in stock prices are normally distributed—with essentially no chance of extremely large price movements—blew up many risk-management models.

Outside the stock market, there is plentiful evidence that people have difficulty assessing the chances of low-probability, high-impact events. Amos Tversky and Daniel Kahneman have argued that probability assessments are “not well behaved near the endpoints, and very small probabilities can be either greatly over-weighted or neglected.” For example, surveys have shown that people generally overestimate the probability of contracting a rare disease while most homeowners living in New York City floodplains underestimate the probability of flooding caused by hurricanes.

The chances of a devastating earthquake are also difficult to quantify. James Jung and I looked at whether the two most recent major California earthquakes had substantial effects on nearby home prices, which would indicate that homeowners had underestimated the risk.

The 2014 South Napa earthquake sequence occurred 80 kilometers north of San Francisco, in the heart of the Napa Valley wine industry. The United States Geological Survey (USGS) identified 12 substantial earthquakes in this area from August 24, 2014, through September 11, 2014, with the most powerful being a 6.0 quake on August 24. One person was killed and 200 injured, with $500 million in estimated damage.

The 2019 Ridgecrest earthquake sequence occurred approximately 200 kilometers northeast of Los Angeles and consisted of 67 substantial shocks from July 4, 2019, through July 16, 2019, and included shocks of magnitudes 6.4 on July 4, 5.4 on July 5, and 5.0, 7.1, 5.5, and 5.0 on July 6. One person died, 25 were injured, and there was an estimated $1 billion in damages to homes, gas lines, highways, and other structures and $5 billion in damages to the Naval Air Weapons Station at China Lake.

James and I used real estate sales data for the six months preceding and following each of these earthquake sequences to investigate the effects on home prices. Taking into account specific home features, such as the square footage and the number of bedrooms and bathrooms, we estimated that these quakes reduced the market prices of nearby homes by an average of 5 percent in South Napa and 12 percent in Ridgecrest. The effects on individual home prices were directly related to the intensity with which the earthquakes were felt at each home’s location.
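The idea behind this kind of estimate can be sketched with a toy hedonic regression: regress the log of the sale price on home features plus an after-the-quake indicator. The data below are invented for illustration, with a built-in 5 percent effect; the coefficient on the indicator is the estimated price effect.

```python
# Toy hedonic regression: log price on home features plus a post-quake dummy.
# All data here are simulated for illustration, with a true -5% effect built in.
import numpy as np

rng = np.random.default_rng(0)
n = 200
sqft = rng.uniform(800, 3000, n)            # square footage
bedrooms = rng.integers(1, 6, n)            # number of bedrooms
post_quake = rng.integers(0, 2, n)          # 1 = sold after the earthquake

# Simulated log sale prices with a true post-quake effect of -5%
log_price = (11 + 0.0004 * sqft + 0.05 * bedrooms
             - 0.05 * post_quake + rng.normal(0, 0.02, n))

# Ordinary least squares: intercept, sqft, bedrooms, post-quake indicator
X = np.column_stack([np.ones(n), sqft, bedrooms, post_quake])
beta, *_ = np.linalg.lstsq(X, log_price, rcond=None)
print(f"estimated post-quake price effect: {beta[3]:+.1%}")
```

Because the other coefficients absorb the value of square footage and bedrooms, the indicator isolates the quake's effect on otherwise comparable homes, which is the logic of the 5 percent and 12 percent estimates above.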

As difficult as it is for homeowners to assess the chances of substantial earthquake damage, it is even more challenging for computer algorithms. As with stock prices, it would be a serious mistake to simply tally up past earthquake losses. The best forecasts, such as the Uniform California Earthquake Rupture Forecast, are based on complex theoretical models of the geological factors that cause earthquakes. The model is not artificial intelligence (AI); it is based on real intelligence.

In 2017, the year “AI” was anointed as the Marketing Word of the Year, a startup named One Concern raised $20 million to “future proof the world” by using an AI algorithm to provide real-time estimates of the block-by-block damage from earthquakes. Emergency teams, they said, could use their estimates to guide their rescue efforts. The company describes itself as follows:

[A] benevolent artificial intelligence company with a mission to save lives and livelihoods before, during, and after disasters. Founded at Stanford University, One Concern enables cities, corporations and citizens to embrace a disaster-free future.

The AI label is misleading and the promise of a disaster-free future is cringeworthy. Genuine AI programs train on vast quantities of data but data on earthquakes of similar magnitudes in comparable locations are scarce. If an algorithm extrapolates data from earthquakes of certain magnitudes to earthquakes of other magnitudes, from earthquakes in certain geographic locations to other locations, and from certain types of buildings to other buildings, it will be deceived by phantom patterns.

How relevant are data on the damage from a shallow 7.3 earthquake to a one-story home built in 1990 in Indonesia for predicting the damage to a 4-story apartment building built in 1950 from a deep 6.5 earthquake in San Francisco, or vice versa? It’s like training for Go by playing a dozen games of tic-tac-toe.

Nonetheless, after a flurry of favorable news stories, One Concern received millions in funding and landed contracts with several cities, including San Francisco and Los Angeles. Then it became apparent that its product did not match the hype. For example, in August 2019, The New York Times published an article, “This High-Tech Solution to Disaster Response May Be Too Good to Be True,” written by Pulitzer Prize-winning reporter Sheri Fink.

San Francisco was then ending its contract with One Concern. Seattle also had doubts about the program’s cost and reliability. In one test simulation, the program ignored a large Costco warehouse because it relied mainly on residential data. When One Concern revised the program to include the missing Costco, the University of Washington mysteriously vanished. With each iteration, the damage predictions changed dramatically.

An article in Fast Company, following up on an earlier positive article, concluded:

As the Times investigation shows, the startup’s early success with $55 million in venture capital funding, advisors such as a retired general, the former CIA director David Petraeus, and team members like the former FEMA head Craig Fugate was built on misleading claims and sleek design. With faulty technology that is reportedly not as accurate as the company says, One Concern could be putting people’s lives at risk.

Katharine Schwab, “This celebrated startup vowed to save lives with AI. Now, it’s a cautionary tale” at Fast Company (August 13, 2019)

Genuine AI can deliver spectacular, superhuman results in many situations when there are clearly defined actions, consequences, and objectives and enormous amounts of data to train on. In other situations with ill-defined actions, uncertain consequences, multiple objectives, and limited relevant data, AI is no match for real intelligence.

You may also wish to read these articles by Gary Smith:

The decline effect: why most medical treatments are disappointing. Artificial intelligence is not an answer to this problem, as Walmart may soon discover.


Why intelligent women marry less intelligent men Are they trying to avoid competition at home as well as at work? Or is there a statistical reason we are overlooking?

Gary N. Smith

Gary N. Smith is the Fletcher Jones Professor of Economics at Pomona College. His research on financial markets, statistical reasoning, and artificial intelligence, often involving stock market anomalies, statistical fallacies, and the misuse of data, has been widely cited. He is the author of The AI Delusion (Oxford, 2018) and co-author (with Jay Cordes) of The 9 Pitfalls of Data Science (Oxford, 2019), which won the Association of American Publishers 2020 Prose Award for “Popular Science & Popular Mathematics.”
