^{Eric Holloway

January 21, 2020

6

Artificial Intelligence, Machine Learning, Philosophy of Mind, Programming}

Will an AI Win a Nobel Prize for Science All by Itself One Day?

_{No, but Support Vector Machines (SVMs) can allow scientists to frame questions so that a comprehensible answer is more likely}


In recent news from the Okinawa Institute of Science and Technology Graduate University (OIST):

Over the last few decades, machine learning has revolutionized many sectors of society, with machines learning to drive cars, identify tumors and play chess — often surpassing their human counterparts.
Now, a team of scientists based at the Okinawa Institute of Science and Technology Graduate University (OIST), the University of Munich and the CNRS at the University of Bordeaux have shown that machines can also beat theoretical physicists at their own game, solving complex problems just as accurately as scientists, but considerably faster.
In the study, recently published in Physical Review B, a machine learned to identify unusual magnetic phases in a model of pyrochlore — a naturally-occurring mineral with a tetrahedral lattice structure. Remarkably, when using the machine, solving the problem took only a few weeks, whereas previously the OIST scientists needed six years.
Okinawa Institute of Science and Technology (OIST) Graduate University, “Man versus machine: Can AI do science?” at ScienceDaily

Paper. (paywall)

What to make of it? To start with, the Okinawa researchers have discovered a useful technique for accelerating scientific discoveries. This article is not the usual “NEXT!: AI can do science!” stuff where someone feeds a deep learning network a whole bunch of data and everyone is just amazed by the insights it regurgitates.

Here is a useful thing they have really done: They used Support Vector Machines (SVMs), a neat machine learning algorithm which, unlike neural networks, can (in favorable circumstances) give useful information about the problem domain, and extract that information for human observers. But it is not a magic bullet.

First, the algorithm was not fed a bunch of raw data from which it derived a model all by itself. If it had, then we could indeed say that “AI has done science,” as the title chosen by the university invites us to consider.

No, scientists and engineers did all the heavy lifting. They created the model framework and then had the computer do a brute force search until it found parameters that minimized their error bars. In particular, the SVM system depends on a “kernel,” derived by the human engineers, which enables the algorithm to “learn” effectively from the data.

The kernel is a mathematical relationship between data points that defines how closely the points are related. Most machine learning algorithms represent data as numbers and measure distance numerically. Kernels, on the other hand, allow non-numerical items such as text documents and pictures to be compared. That makes SVMs much more widely applicable than standard machine learning algorithms.
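To make the idea of comparing non-numerical items concrete, here is a minimal sketch of a kernel on text. It is an illustration only, not the kernel used in the Okinawa study: the function names, the choice of character trigrams, and the sample sentences are all assumptions made for this example.

```python
from collections import Counter

def ngram_counts(text, n=3):
    """Count overlapping character n-grams in a lowercased string."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def ngram_kernel(a, b, n=3):
    """Similarity of two strings: dot product of their n-gram count vectors."""
    ca, cb = ngram_counts(a, n), ngram_counts(b, n)
    return sum(count * cb[gram] for gram, count in ca.items())

def normalized_kernel(a, b, n=3):
    """Scale the kernel to [0, 1] so that identical texts score 1."""
    denom = (ngram_kernel(a, a, n) * ngram_kernel(b, b, n)) ** 0.5
    return ngram_kernel(a, b, n) / denom if denom else 0.0

# Hypothetical documents, chosen only to show related vs. unrelated text.
docs = [
    "magnetic phases in pyrochlore",
    "magnetic phase of a pyrochlore model",
    "stock prices fell sharply today",
]
for other in docs[1:]:
    print(round(normalized_kernel(docs[0], other), 3), other)
```

The two physics sentences score much closer to each other than either does to the finance sentence, even though no document was ever turned into a hand-built numeric feature vector. Choosing what counts as "similar" (here, shared trigrams) is exactly the human design decision the article is pointing at.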

Once an engineer has derived a good kernel, the SVM does its magic and comes up with a model for the data. So, the secret sauce that makes the SVM work is the human-derived kernel. This is where the real “science” and “learning” take place: the kind of discoveries that are generated by creative thinking and understanding, rather than by probing a parameter space until an error bar is minimized.

Once we have the kernel, what does the SVM do with it? The SVM is a kind of machine learning called supervised learning. The vast majority of useful machine learning is supervised learning: The algorithm is fed labeled data and derives a model to predict labels for unlabeled data.

In the simplest form of supervised learning, we can cleanly separate the differently labeled groups with lines. Different algorithms have different criteria for picking lines. The SVM picks the separating line that lies farthest from the nearest points of each group, which is why it is called a “maximum margin classifier.” It defines this line using only those nearest data points, known as support vectors.
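The maximum margin idea is easiest to see in one dimension, where the "line" is a single threshold. The sketch below is a toy illustration under simplifying assumptions (separable data, every -1 point to the left of every +1 point); the function names and data are made up for this example, and a real SVM solves the same problem in many dimensions with an optimizer.

```python
def fit_max_margin_1d(points):
    """points: list of (x, label) pairs with label in {-1, +1}.
    Assumes every -1 point lies to the left of every +1 point.
    Returns (threshold, margin, defining_points)."""
    left = max(x for x, y in points if y == -1)    # -1 point nearest the gap
    right = min(x for x, y in points if y == +1)   # +1 point nearest the gap
    if left >= right:
        raise ValueError("classes are not separable in this orientation")
    threshold = (left + right) / 2   # equidistant from both groups
    margin = (right - left) / 2      # distance from threshold to nearest point
    return threshold, margin, [(left, -1), (right, +1)]

def predict(threshold, x):
    """Label a new point by which side of the threshold it falls on."""
    return +1 if x > threshold else -1

data = [(0.5, -1), (1.0, -1), (2.0, -1), (5.0, +1), (6.5, +1), (9.0, +1)]
threshold, margin, defining_points = fit_max_margin_1d(data)
print(threshold, margin, defining_points)
print(predict(threshold, 4.0))
```

Of the six training points, only two (the ones at 2.0 and 5.0) determine the boundary. Deleting any of the other four would leave the model unchanged, which is why a result defined by a handful of points is something a researcher can inspect directly.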

The fewer data points it needs to define the line, the better the model. If only a few points are necessary, then the model is likely to have high accuracy when labeling new data. Additionally, if the SVM can minimize the definition down to just a few data points, then the researchers can interpret the result and thus it is no longer a black box. SVMs seem like the one algorithm to rule them all. Some other algorithms, such as certain neural networks, can even be expressed as special cases of SVMs.
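The link between a small set of defining points and accuracy on new data can be written down. The block below uses standard textbook notation, none of which appears in the original article: S is the set of defining points, K is the kernel, and the alpha_i and b are the values the SVM fits. The second line is a classical leave-one-out bound for the separable case, stated here as background rather than as a result of the Okinawa paper.

```latex
% Decision rule: only the defining points x_i, i \in S, appear in the sum.
f(x) = \operatorname{sign}\!\Big( \sum_{i \in S} \alpha_i \, y_i \, K(x_i, x) + b \Big)

% Leave-one-out bound (separable case): the expected error rate of a model
% trained on n-1 points is at most the expected number of defining points
% found on n points, divided by n.
\mathbb{E}\big[\, p_{\mathrm{err}}^{(n-1)} \big] \;\le\; \frac{\mathbb{E}\big[\, |S_n| \,\big]}{n}
```

Read this way, a model that needs only 5 defining points out of 1,000 carries a much stronger guarantee than one that needs 500.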

But that’s where the limitations begin. Identifying the minimal set of data points that define the separating line turns out to be a very hard problem. As the number of data points increases, the number of possible combinations increases exponentially. And in the worst case, the SVM must check all the combinations.

All of them? This is known as an NP-hard problem in computer science. If our problem grows in size, solving it could quickly require more runtime than the lifespan of the entire universe. Thus, it is not possible to just blindly set up an SVM to work on a problem and hope for success. Effective SVM performance requires a significant amount of human engineering to add enough active information to cut the search space down to a reasonable size.
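To give a sense of scale for the worst case described above, the sketch below counts candidate subsets of n data points and converts the count to runtime. The checking speed of one billion subsets per second and the figure of 13.8 billion years for the age of the universe are assumptions chosen for illustration, and the function names are invented for this example.

```python
from math import comb

CHECKS_PER_SECOND = 1e9   # assumed speed, for illustration only
AGE_OF_UNIVERSE_SECONDS = 13.8e9 * 365.25 * 24 * 3600   # roughly 4.35e17 s

def candidate_subsets(n):
    """Number of non-empty subsets of n data points."""
    return 2 ** n - 1

def seconds_to_check_all(n):
    """Runtime of an exhaustive search at the assumed checking speed."""
    return candidate_subsets(n) / CHECKS_PER_SECOND

for n in (10, 50, 100, 300):
    ratio = seconds_to_check_all(n) / AGE_OF_UNIVERSE_SECONDS
    print(f"n={n:>3}: {candidate_subsets(n):.3e} subsets, "
          f"{ratio:.3e} universe lifetimes")

# Prior knowledge shrinks the search: if an engineer can say that at most
# three points define the boundary, only subsets of size 1..3 remain.
restricted = sum(comb(100, k) for k in range(1, 4))
print(restricted)
```

At 50 points the exhaustive search is a matter of days under these assumptions; at 100 points it already exceeds the age of the universe many times over. The restricted search at the end shows how much a single piece of human-supplied information can cut the space down.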

This is why it is, in general, impossible for AI to “do science.” The real science is all the active information provided by the engineer, which cannot come from an algorithm.

Now, turning back to the news from OIST, the real story here is “engineers do science.” Getting the SVM to generate a result required them to expend a significant amount of effort in defining the problem. It’s not easy. The kernel determines how many degrees of freedom the model has. Given too many degrees of freedom, the model can end up fitting anything and suck in a lot of noise along with signal. This capacity to fit arbitrary labelings is measured by the Vapnik–Chervonenkis (VC) dimension of the model.
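The VC dimension can be checked by brute force for a very simple model. The sketch below is a toy, not anything from the study: it takes one-dimensional threshold rules (label everything on one side of a cut as +1 and the other side as -1, in either orientation) and tests whether they can reproduce every possible labeling of a set of distinct points, which is what it means to "shatter" the set. The function names are made up for this example.

```python
from itertools import product

def threshold_labelings(points):
    """Yield every labeling of `points` that a 1D threshold rule
    h(x) = s if x > t else -s can produce, for s in {-1, +1}."""
    xs = sorted(points)
    # One cut below all points, one between each neighboring pair, one above.
    cuts = [xs[0] - 1] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [xs[-1] + 1]
    for t in cuts:
        for s in (-1, +1):
            yield tuple(s if x > t else -s for x in points)

def is_shattered(points):
    """True if threshold rules can realize all 2**len(points) labelings."""
    achievable = set(threshold_labelings(points))
    every_labeling = product((-1, +1), repeat=len(points))
    return all(labeling in achievable for labeling in every_labeling)

print(is_shattered([1.0, 2.0]))        # True
print(is_shattered([1.0, 2.0, 3.0]))   # False
```

Two points can be labeled in all four ways, but three points cannot (no single cut gives +1, -1, +1), so this model has VC dimension 2. A model that could shatter point sets of any size would be able to fit pure noise, which is the failure the paragraph above warns about.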

In order to guarantee that our algorithm has actually “learned” something from the data, the VC dimension must at least be finite. But it is possible to define SVM kernels that have infinite VC dimension, thus allowing the model to suck in anything and landing us right back in black box territory. That’s another potential downside of the SVM algorithm.
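Why finiteness matters can be seen from the classical VC generalization bound, given here as textbook background in standard notation that the original article does not use: R is the true error rate, R_emp the error rate on the training data, n the number of training points, h the VC dimension, and eta the allowed probability of failure.

```latex
% With probability at least 1 - \eta over the choice of n training points,
% for every model f in a class of VC dimension h:
R(f) \;\le\; R_{\mathrm{emp}}(f)
  + \sqrt{\frac{h\left(\ln\frac{2n}{h} + 1\right) - \ln\frac{\eta}{4}}{n}}
```

When h is small relative to n, the square-root term is small and a low training error implies a low true error. As h grows, the term swamps the bound, and with infinite h this kind of guarantee is not available at all, no matter how well the model fits the training data.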

That said, this is a step in the right direction for applying machine learning to science. Done correctly, SVMs have very strong theoretical guarantees that deep learning and neural network approaches do not have: machine learning in science without the black box. At the end of the day, science is done for humans, not machines, and if we cannot interpret what the machine derives, it cannot truly be science.
