Deep learning used to train AlphaGo
Data used to train neural networks must display ergodicity: the data to which the computer learning program is exposed must also characterize data that it has not seen.
Before applying AI such as deep convolutional neural networks, practitioners need to ask whether the problem under consideration is “ergodic.”1
We are rightly amazed when deep learning wins at Atari arcade games using only display pixels. But in doing so, the AI is exposed to the same game again and again (and again). The scenarios change, but the game and its rules remain static. The same is true with chess or Go. Whether the AI is being trained or tested against a human opponent, we know that it will be playing the same game.
Statisticians know that ergodicity comes in many flavors. Here ergodicity simply means that the data used to train AI must characterize similar data not yet seen.
Examples of Ergodic Systems in Daily Life
Board games are AI ergodic, as we have seen. Another example is prediction of a power load, that is, the demand for energy that a power company must meet.2 If the forecast is too low, the company produces too little power and must purchase power off the grid at the prevailing price. If the forecast is too high, too much power is generated and the company has to sell its excess at an uncertain market price. Both extremes yield uncertainty that the company wants to avoid, so power supply companies try to reduce it as much as possible with accurate consumption forecasts.
Let us say that we want to forecast Tuesday’s power demand. We know how much power was used on Monday. Another relevant decision-making factor (parameter) is Tuesday’s forecast temperature. Air conditioners are turned on when the temperature is high and electric heaters are turned on when the temperature is low. Both temperature extremes increase power consumption.
A lot of historical data is available for local power consumption on Mondays and Tuesdays. So a neural network is trained with the historical data using Monday’s data and Tuesday’s weather forecast on the input and Tuesday’s power consumption at the output. Once it is trained, we can ideally provide inputs to the neural network on Monday and forecast the power usage on Tuesday. Neural networks are used by many power companies because they forecast load demand quite well.3
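The forecasting setup described above can be sketched in a few lines of Python. This is only an illustration under loudly stated assumptions: the "historical" data are synthetic, the rule that generates them (including the U-shaped dependence on temperature and every constant) is invented for the example, and the tiny one-hidden-layer network with hypothetical helpers `make_day` and `predict` is not the forecaster used in the cited papers.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

def make_day(rng):
    """Return one synthetic ((monday_load, tuesday_temp), tuesday_load) pair."""
    monday_load = rng.uniform(0.3, 0.9)    # Monday's consumption (normalized, made up)
    tuesday_temp = rng.uniform(-1.0, 1.0)  # Tuesday's forecast temperature, 0 = mild
    # Assumed rule: both temperature extremes raise demand (U-shaped in temperature).
    tuesday_load = (0.1 + 0.6 * monday_load + 0.3 * tuesday_temp ** 2
                    + rng.gauss(0, 0.01))
    return (monday_load, tuesday_temp), tuesday_load

train = [make_day(rng) for _ in range(200)]  # "historical" days
test = [make_day(rng) for _ in range(100)]   # days the network never sees in training

# One hidden layer of tanh units and a linear output.
HIDDEN = 8
w1 = [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(HIDDEN)]
b1 = [0.0] * HIDDEN
w2 = [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)]
b2 = 0.0

def predict(x):
    """Forward pass: return the forecast and the hidden activations."""
    hidden = [math.tanh(w1[j][0] * x[0] + w1[j][1] * x[1] + b1[j])
              for j in range(HIDDEN)]
    return sum(w2[j] * hidden[j] for j in range(HIDDEN)) + b2, hidden

# Plain stochastic gradient descent on squared error.
LEARNING_RATE = 0.05
for epoch in range(200):
    rng.shuffle(train)
    for x, y in train:
        out, hidden = predict(x)
        err = out - y
        for j in range(HIDDEN):
            # Gradient through tanh, computed before w2[j] is updated.
            grad_h = err * w2[j] * (1.0 - hidden[j] ** 2)
            w2[j] -= LEARNING_RATE * err * hidden[j]
            w1[j][0] -= LEARNING_RATE * grad_h * x[0]
            w1[j][1] -= LEARNING_RATE * grad_h * x[1]
            b1[j] -= LEARNING_RATE * grad_h
        b2 -= LEARNING_RATE * err

def mean_abs_error(predictor, data):
    return sum(abs(predictor(x) - y) for x, y in data) / len(data)

# Baseline: always forecast the average load seen in training.
mean_train_load = sum(y for _, y in train) / len(train)
net_mae = mean_abs_error(lambda x: predict(x)[0], test)
baseline_mae = mean_abs_error(lambda x: mean_train_load, test)
print(f"network MAE on unseen days:  {net_mae:.4f}")
print(f"baseline MAE on unseen days: {baseline_mae:.4f}")
```

The unseen days are drawn from the same made-up process as the training days, which is exactly the sense of "ergodic" used in this article. If the test days came from a different process, nothing in the training would guarantee a useful forecast.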
Note that we specified that the neural network forecast the power load for Tuesday. The power load character for some days of the week may vary greatly from others. Weekends differ from weekdays. Holidays are also outliers that do not fit the statistics of normal daytime usage. My friend and colleague Mohamed El-Sharkawi worked on this problem. He called a spike in power usage on Thanksgiving the Turkey Effect. Everyone turned on their electric ovens to bake their turkeys at about the same time on Thanksgiving, resulting in a big spike in power consumption. But AI forecasting of power consumption on Thanksgiving is difficult: the holiday comes only once a year, so we don’t have enough training data.
Engineering professor Robert Hecht-Nielsen (1947–2019) recognized that credit fraud was AI ergodic, that is, a neural network could function as a “universal approximator” of the best decision, given enough correct information. Developing that idea, he built a company on AI technology that sold for $810 million to Fair-Isaac (now FICO) in 2002.
But What About Non-Ergodic Systems?
AI has been dismissed as an autonomous tool in military applications. The battlefield is not “AI ergodic.” Military Professor Bradley A. Alaniz cites the uncertainty of battlefield data when he says, “…it is often the case in human conflict that neither side truly understands what they were trying to achieve, even after the fact.”
A more everyday non-ergodic process is coin flipping4 and, similarly, roulette wheel spinning. To entice less thoughtful patrons to play, casinos began posting the recent history of roulette wheel spins. The novice would look at past results and conclude, “The last four spins landed on black. Red is overdue! I’ll bet $50 on red!”
Anyone who uses common sense or has taken an introductory course in probability knows that roulette is not AI ergodic in the sense we have been discussing here. If the roulette wheel is fair, past roulette spins have nothing to do with the next spin outcome.
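A short simulation makes the point concrete. The sketch below assumes a fair single-zero wheel (18 red, 18 black, and 1 green pocket) and a fixed random seed; the names `POCKETS`, `spins`, and `after_streak` are just illustrative. It compares how often red comes up overall with how often it comes up immediately after four blacks in a row.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Assumed wheel: single zero, every pocket equally likely.
POCKETS = ["red"] * 18 + ["black"] * 18 + ["green"]

spins = [rng.choice(POCKETS) for _ in range(200_000)]

# How often red comes up over all spins.
overall_red = spins.count("red") / len(spins)

# Keep only the spins that immediately follow four blacks in a row.
after_streak = [spins[i] for i in range(4, len(spins))
                if spins[i - 4:i] == ["black"] * 4]
red_after_streak = after_streak.count("red") / len(after_streak)

print(f"red overall:           {overall_red:.3f}")
print(f"red after four blacks: {red_after_streak:.3f}")
print(f"theoretical 18/37:     {18 / 37:.3f}")
```

Both frequencies should sit close to 18/37. The posted history of the wheel gives the novice nothing to learn from, and it would give a neural network nothing either.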
Another example of a nonergodic process springs from the creativity of invention and entrepreneurism. Economist George Gilder notes that creativity in invention and entrepreneurism is characterized by innovative surprise.5 The surprise is inexplicable in terms of past events. It is outside the box created by a previous occurrence.
There have been many examples in recent decades. From Uber to Alexa to PayPal, we see new and clever applications of AI from the minds of entrepreneurs. On hearing ideas like Uber and PayPal, I have suffered an attack of flat forehead syndrome. I slap my forehead with my palm and say out loud “Why didn’t I think of that?” This reaction is characteristic in the face of creativity. Once a great idea is presented, it is often obvious.
The continuing innovations of inventors and entrepreneurs are not “forecast ergodic.” In other words, spikes of entrepreneurial creativity render the future of a free society unknowable. Those who, like Yuval Harari, forecast a dystopian AI future should be read with this in mind. Innovation is not AI ergodic.
We end with a curious claim that on first reading might sound like doublespeak: the economic impact of creativity, which is not AI ergodic, itself looks to be AI ergodic. The history of Gilder’s surprises from invention and entrepreneurism suggests that this wonderful creativity will continue in the future.6 For this reason, optimists like John Tamny paint a more utopian future where AI and technology allow many to make a living doing what they love. There will be, as Tamny’s book title says, The End of Work.
1 George Gilder’s “The Game of Artificial Intelligence” [in preparation] first pointed out to me the necessity of ergodicity.
2 Park, Dong C., M. A. El-Sharkawi, R. J. Marks, L. E. Atlas, and M. J. Damborg. “Electric load forecasting using an artificial neural network.” IEEE transactions on Power Systems 6, no. 2 (1991): 442-449.
3 Khotanzad, Alireza, Reza Afkhami-Rohani, and Dominic Maratukulam. “ANNSTLF-artificial neural network short-term load forecaster-generation three.” IEEE Transactions on Power Systems 13, no. 4 (1998): 1413-1422.
4 Coin flipping is not AI ergodic, but it is mean ergodic. You flip a coin 1000 times and so do I. Our flip sequences are different, but both will turn up about 50% heads. This illustrates two flavors of ergodicity: the long-run average is the same for every sequence, yet the past flips tell us nothing about the next one.
5 Gilder, George. Knowledge and Power: The information theory of capitalism and how it is revolutionizing our world. (Regnery Publishing 2013).
6 Richards, Jay W. The Human Advantage: The Future of American Work in an Age of Smart Machines. (Crown Forum, 2018).