The web proved that gathering data and applying machine learning techniques yielded superior performance on a number of central tasks in information extraction and natural language processing: entity extraction, co-reference resolution, sentiment analysis, and many others.
For all practical purposes, the debate raging among AI scientists was resolved definitively by about 2010: the idea of hiring smart people to hand-code “knowledge” in a computer-readable language was quite limited. It had its day, to be sure, but it wasn’t a path to AGI. It wasn’t a path to anything other than hiring philosophers.
My own career transitioned from the manual code-it-all-in approach to training and developing systems by providing them with data. The reason the web was so central to this shift in AI was simple: there’s a lot of data! I mean text messages, tweets, comments, blogs, and, for image recognition, JPEGs and so on. We simply didn’t have this volume of data before the world wide web. What we learned was that the old-school way of thinking about AI (that statistical approaches using only data are brittle) was just wrong. The web started the old approach’s demise.
ChatGPT finished it. It (as far as we know) has literally no concept, in the ontologists’ sense, of anything: it doesn’t “know” that living humans have heads (to take a famous example), or that houses are structures where people live. It doesn’t “know” anything. But the lesson is, and it’s a hard lesson to swallow, that we don’t really need to do all that manual effort in the first place. It’s like digging with a spoon and someone hands you a shovel. Keep digging with a spoon?
The statistical approach is not, after all, “brittle”; it just takes on the order of ten billion dollars, vast quantities of data, and a deep neural network to get the answers and results we’ve been chasing for decades. This is a fact.
ChatGPT Doesn’t Know Anything
There is one very BIG problem. Because large language models rely on statistics (at each step, roughly, they pick the next word by sampling from a probability distribution over possible continuations), ChatGPT or a similar system will occasionally simply start talking nonsense. Since it doesn’t know anything in the first place, there’s no way to talk it into sanity once it goes insane. This is what happened to AI: the statistical turn killed the old idea of manually coding knowledge (the ontologists’ project), but it left us with another problem.
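To see why sampling from probabilities can derail, here’s a toy sketch (not GPT’s actual code; the vocabulary and scores are made up for illustration). The model assigns a probability to every candidate next word and samples from that distribution. Even when the sensible word is overwhelmingly likely, nonsense keeps nonzero probability, and once a bad word is sampled, everything after is conditioned on it.

```python
# Toy illustration of next-word choice as probabilistic sampling.
# NOT how GPT is implemented internally; vocab and logits are invented.
import math
import random

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores after the prompt "Living humans have ..."
vocab = ["heads", "wings", "roofs"]
logits = [4.0, 1.0, 0.5]

probs = softmax(logits)
# "heads" is by far the likeliest pick, but "wings" and "roofs"
# still carry probability mass -- and occasionally get sampled.
next_token = random.choices(vocab, weights=probs)[0]
```

The key point: there is no knowledge check anywhere in that loop, only arithmetic over scores. A low-probability nonsense word, once emitted, becomes part of the context the model continues from.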
SO. I think this fact is actually fatal. If we can’t rely on the results, even though they’re mostly accurate, what will we do with it? It’s like that friend you have who’s brilliant but occasionally strips naked and runs around in traffic. This is AGI? Argh. As in life, AI research advances on one hand and doesn’t on the other. Unintended consequences keep popping up. What seems ideal turns out to be a dead end, even if a sexy one.
The black box problem. This just means that we can’t really understand WHY the GPT “brain” produced a given result: with something on the order of a trillion parameters (learned numeric values), there’s no way to recover the reasoning steps. There aren’t reasoning steps. It’s doing matrix multiplication using a mechanism called attention, which I’ve talked about before.
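For the curious, here’s a minimal sketch of scaled dot-product attention, the mechanism mentioned above, in pure Python with made-up toy vectors. Notice that it is nothing but arithmetic: dot products, a softmax, and a weighted sum. There is no step you can point to and call a “reason.”

```python
# Minimal sketch of scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
# Toy sizes and invented values; real models do this with huge matrices.
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Q, K, V are lists of row vectors. Returns one output row per query."""
    d = len(Q[0])
    # Similarity of each query to each key, scaled by sqrt(d).
    scores = [[sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
               for k in K] for q in Q]
    # Each row of weights sums to 1.
    weights = [softmax(row) for row in scores]
    # Output = weighted average of the value vectors.
    return [[sum(w * v[j] for w, v in zip(row, V))
             for j in range(len(V[0]))] for row in weights]

# Two toy queries attending over three toy key/value pairs.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
out = attention(Q, K, V)  # two output vectors of dimension 2
```

Every output is just a blend of value vectors, mixed according to learned numbers. Scale that up by twelve orders of magnitude and you have the black box.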
So. Self-driving cars? Out, because of safety concerns. Military applications? Thinking it’s an “AGI” useful for deciding nuclear war would be literally insane. Absolutely no one in their right mind would use large language models for this purpose.
The sad fact is that, while the statistical turn definitively resolved a long-standing debate in the field between “statistics” and “logic” (statistics won), it left us in another dead end. I can SEE this and I talk about it frequently, but the world hasn’t quite caught up yet. We’re in another AI winter.
Originally featured at Colligo.