Is Human Intelligence Simple? Part 3: Disambiguating Types of Simplicity
In Which I Give Steven Pinker A Hard Time
Steven Pinker and Scott Aaronson have a lovely debate about AI, hosted on Aaronson’s blog, which turns out to be centrally about how “simple” intelligence is.
Here’s what Pinker has to say about intelligence:
I think the concept of “general intelligence” is meaningless. (I’m not referring to the psychometric variable g, also called “general intelligence,” namely the principal component of correlated variation across IQ subtests. This is a variable that aggregates many contributors to the brain’s efficiency such as cortical thickness and neural transmission speed, but it is not a mechanism, just as “horsepower” is a meaningful variable, but it doesn’t explain how cars move.) I find most characterizations of AGI to be either circular (such as “smarter than humans in every way,” begging the question of what “smarter” means) or mystical—a kind of omniscient, omnipotent, and clairvoyant power to solve any problem. No logician has ever outlined a normative model of what general intelligence would consist of, and even Turing swapped it out for the problem of fooling an observer, which spawned 70 years of unhelpful reminders of how easy it is to fool an observer.
If we do try to define “intelligence” in terms of mechanism rather than magic, it seems to me it would be something like “the ability to use information to attain a goal in an environment.” (“Use information” is shorthand for performing computations that embody laws that govern the world, namely logic, cause and effect, and statistical regularities. “Attain a goal” is shorthand for optimizing the attainment of multiple goals, since different goals trade off.) Specifying the goal is critical to any definition of intelligence: a given strategy in basketball will be intelligent if you’re trying to win a game and stupid if you’re trying to throw it. So is the environment: a given strategy can be smart under NBA rules and stupid under college rules.
Since a goal itself is neither intelligent nor unintelligent (Hume and all that), but must be exogenously built into a system, and since no physical system has clairvoyance for all the laws of the world it inhabits down to the last butterfly wing-flap, this implies that there are as many intelligences as there are goals and environments.
Pinker’s position is worth making explicit here, because it’s a rather surprising stance.
First of all, he thinks “general intelligence” is not a real thing.
At the same time, he claims it’s not meaningless to characterize some humans as “smarter” than others. He believes psychometric g, which is derived from scores on intelligence tests, is a valid measure of the “brain’s efficiency”.
So, says Pinker, some humans have higher IQs than others; they score higher on intelligence tests. What’s more, he thinks high-IQ humans are that way because they have some combination of neurological features, like thicker cortices or faster-transmitting neurons, that enable these better test scores.
On the other hand, Pinker thinks that “intelligence” in the sense of “the ability to use information to attain a goal in an environment” is not something that you can have more of, or less of, across the board. Since possible goals and possible environments are so diverse, there is no such thing as being “good at” using information to attain goals in general.
In a later part of the debate, he scoffs at the idea that a digitally-uploaded, sped-up Albert Einstein or Bertrand Russell would be “superintelligences”, pointing out that these supposed “geniuses” had dumb ideas outside their fields of expertise (both believed in one-world governments!).
So, Pinker says, some people are smarter than others, but being smart doesn’t mean they’ll tend to be better at “things in general”.
These two claims juxtaposed come out to a very strange view of the world:
Pinker believes that IQ is a coherent measure of “smarts” in some sense — he calls it “brain efficiency.”
Pinker doesn’t believe that people with high IQs are generally better at using information to achieve goals.
I could imagine believing the first claim. I could imagine believing the second claim. But I can’t imagine believing both.
Pinker has a long public track record of believing that IQ is a coherent and meaningful-in-practice measure. Here he is in 2006, approvingly summarizing an American Psychological Association review as follows:
They reported that IQ tests measure a stable property of the person; that general intelligence reflects a real phenomenon (namely, that measures of different aspects of intelligence intercorrelate); that it predicts a variety of positive life outcomes; and that it is highly heritable among individuals within a group.
So, Pinker thinks people with high IQs are good at multiple tasks involving “aspects of intelligence”, and have better life outcomes like higher incomes and more educational attainment — but they aren’t better at using information to achieve goals?
If Pinker’s model is that people whose brains score higher on certain very simple quantitative traits, like thicker cortices or faster neural transmission, will do better on various cognitive tests, and reap more rewards at school and at work, but not have anything resembling an above-average ability to solve problems or achieve goals in general, then I’m not sure how he thinks that could work.
It’s possible that IQ could be a coherent, valid measure of some rather specialized and arbitrary skill, which our Western-industrialized institutions of work, school, and money happen to revolve around and reward, without IQ indicating any kind of truly general problem-solving or goal-achieving competence.
But if so, you wouldn’t expect that this arbitrary special skill we measure as IQ would be identical with a high score on a combination of simple, generic physiological traits like “cortical thickness” or “neural transmission speed” that basically amount to “more, faster, and more-connected neurons.”
You’d expect bigger faster brains to make people better at most things brains could possibly do. Or you’d expect traits like “cortical thickness” or “neural transmission speed” to not matter much at all for cognitive outcomes. But you would NOT expect these kinds of ultra-generic “make brain bigger and faster” traits to selectively make people better at only some hyper-specific arbitrary skill and nothing else.
So all in all, I think Pinker’s “IQ is meaningful” but “general intelligence is meaningless” hybrid position is incoherent.
Leaving the IQ aspect aside for a moment, though, let’s consider Pinker’s claim that there is no single capacity of “general intelligence” in the “using information to attain goals” sense, but rather “as many intelligences as there are goals and environments.”
This is basically a claim that intelligence is so far from being simple that it shouldn’t even be considered a single thing. There are many different capacities, to use information to achieve different goals in different environments, and there is no way to reduce these to a small handful of essential, nearly-universal information-processing skills.
This actually is a coherent claim. But his example of the basketball game doesn’t support it.
True, the winning strategy for a game depends on what the rules of the game are, and whether you’re trying to win the game or throw it. But there’s undoubtedly overlap in the underlying skills that go into playing different versions of basketball — it’s useful to be strong, fast, and coordinated, for instance. And even if you’re trying to throw a game, it’s probably good to be athletic if only to be able to fool people into thinking you might win.
Now, to be clear, an ability that’s useful for all, or nearly all, sports, like “strength”, is not restricted to a particular implementation. A human arm and a robot arm can both be strong while using completely different mechanisms to power movement, different limb architectures, etc. This is what Pinker is getting at when he says that psychometric g is like “horsepower”, a meaningful metric that still doesn’t explain how cars move.
You could imagine a distant planet (Moron Mountain?) where many kinds of genetically unrelated aliens play basketball. They don’t all have muscles or bones or even limbs, their bodies aren’t made of the same chemical compounds, etc. The best basketball players on Moron Mountain would still (I claim) tend to be especially strong, especially fast, and so on. But they wouldn’t come by their strength or speed via anything close to the same anatomical mechanisms.
Pinker is correct to say that traits like “strength” or “speed” are mechanistically diverse — there isn’t one single best way to design a “strong” body. And it’s plausible that cognitive traits like “intelligence” are similarly mechanistically diverse. Perhaps there’s no particular reason to suppose that an AI, an intelligent extraterrestrial, or even a distantly related animal like an octopus, would have cognitive mechanisms that are closely analogous to those in the human or mammalian brain.
The opposite of mechanistic diversity would be mechanistic unity. If you believed in mechanistic unity for intelligence, you would think that there would be a handful of convergent “design patterns” such that any conceivable intelligent entity would “do its thinking” in more-or-less similar ways.
Personally, I have no strong opinions right now on the mechanistic-unity vs. mechanistic-diversity question.
I read Pinker as claiming something stronger than mechanistic diversity, though. He seems not to believe that domain-general skills like “strength” make a person (or animal, or machine) tend to be better at a wide variety of sports and physical activities. Analogously, it sounds like he does not believe that there are domain-general cognitive skills like “working memory” that make a person (or animal, or machine) better at a wide range of problem-solving tasks. You might call this position non-generalizationism — the belief that there are no highly general skills.
Non-generalizationism is logically possible, but I don’t think it’s plausible. Stronger people clearly are better at a wide range of physical skills (that’s why all kinds of athletes have adopted weight training). It’s almost as obvious that there are cognitive subskills like “working memory” that are important components of a wide range of tasks and goals. (Moreover, I don’t think it’s even logically possible to hold the beliefs Pinker does about psychometric g while disbelieving that there are highly domain-general cognitive skills!)
Generalizationism would be the opposite belief — that there are some core skills which are useful for achieving a wide range of goals across a wide range of (real-world, physically-occurring) environments.
Generalizationism might also be called descriptive simplicity — the idea that we can say a lot about an organism’s or system’s overall goal-achieving capabilities, via referring only to an information-theoretically “smaller”, simpler core set of basic universal skills.
Descriptive complexity, by contrast, would mean that we couldn’t say much about what goals a system can and can’t accomplish without laboriously spelling out how the system performs on lots of different goals in lots of different circumstances.
I think we should expect certain capacities (however they may be implemented mechanistically) to be good for accomplishing most coherent goals in most real-world environments. Things like, perhaps:
Flexible and powerful control of matter: the ability to move a wide range of things (including one’s body) precisely where and how one intends
Information processing: the ability to store large amounts of information, for long time periods, and manipulate it according to a wide variety of patterns, with a high degree of fidelity
Behavioral flexibility: the ability to do a wide range of different things depending on the situation and the goal
Generalization beyond direct observation: the ability to choose actions tuned to predicted (un-observed, novel, or future) environments and situations
Robustness: the ability to achieve similar (or analogous) outcomes consistently when pursuing a fixed goal, despite varying environments and situations
These kinds of traits can have meaningful interpretations whether you’re talking about humans, animals, machines, or sometimes even very simple algorithms.
Even if you don’t think animals, machines, or algorithms can really have “goals” of their own, you can talk about what kinds of traits you see in their behavior, and I think you will come up with pretty intuitive conclusions where more “complex” or “powerful” or “robust” or “flexible” things tend to be better at most tasks that we could set for them.
You can, of course, be just as much of an AGI skeptic as Pinker is, while still believing that general intelligence is real. You can simply believe that some things (humans? other animals?) have general intelligence, while no existing AIs do, and you can think there’s nothing about current trends in AI technology that should make us believe that situation will change imminently.
I think “general intelligence” traits are probably “like horsepower”, to again take Pinker’s analogy about psychometric g. They are desiderata; if you were designing a system to be generally good at “solving problems” or “achieving goals”, you would (I claim) necessarily prefer it to be more flexible, robust, powerful, etc., all else equal.
But knowing that a system is “highly flexible” need not tell you very much about the system’s mechanism, design, or implementation.
You might be able to infer some things about the structure or mechanism of a system based on certain desiderata around information processing though.
A system that can engage in varied/flexible behaviors must have some physical mechanism that’s configurable into a wide variety of states, for instance. There needs to be some mechanism whereby different environmental conditions or different goals can trigger different states and thus different actions.
This means that systems that have some kind of combinatorially complex reconfigurability — things like logic gates that can open and close, physical neurons which can grow or lose dendrites connecting to other neurons, DNA molecules which can have different arrangements of base pairs, etc — are better “candidates” for implementing intelligent behavior than physical phenomena that don’t have that Lego-like configurable quality.
Moreover, we know (from examples like DNA, neurons, or circuits) that modular reconfigurable structures can produce flexible and complex behaviors.
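The combinatorial payoff of modular reconfigurability can be made concrete with a toy sketch (my own illustration; the “components” and the arithmetic “behaviors” are invented for the example, not a model of any real system): with just three binary components, a system has 2^3 distinct configurations, and here each configuration produces a different response to the same stimulus.

```python
from itertools import product

def behavior(config, stimulus):
    """A toy 'organism': its response to a stimulus is determined
    by which of its three binary components are switched on."""
    out = stimulus
    if config[0]:
        out = out + 1   # component 0: increment the input
    if config[1]:
        out = out * 2   # component 1: amplify the input
    if config[2]:
        out = -out      # component 2: invert the input
    return out

# 3 binary components -> 2**3 = 8 configurations.
configs = list(product([0, 1], repeat=3))
responses = {c: behavior(c, stimulus=3) for c in configs}

print(len(configs))                   # 8 configurations
print(len(set(responses.values())))   # 8 distinct behaviors
```

The point is only that the number of available behaviors grows exponentially with the number of reconfigurable parts, which is why Lego-like substrates (gates, dendrites, base pairs) are good raw material for flexibility.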
You might call it mechanistic simplicity to claim that it suffices to define a few architectural/structural/mechanistic desiderata in order to guarantee you can achieve the highly general skills/traits/properties that comprise general intelligence.
Mechanistic unity says “if you want a more general intelligence, you’re necessarily going to have to build it according to some simple universal design patterns.” Mechanistic simplicity is a converse claim: “There exist some simple structural features which, if you build a system with them, you will get a higher degree of general intelligence.”
Mechanistic simplicity implies that “one weird trick” (or a few simple tricks) in evolution were all that stood between more primitive organisms and the possibility of advanced general intelligence of the kind we see in humans.
It’s entirely possible for intelligence to be descriptively simple but mechanistically complex.
By analogy, think of Moore’s Law. For decades, computer circuits have been getting exponentially smaller, faster, and cheaper.
The desiderata described by Moore’s Law — feature size, clock speed, cost per FLOP — are extremely simple to define and measure. And these simple traits explain a lot of different kinds of improvements in a huge variety of things that one can do with computers.
Moore’s Law is a great example of descriptive simplicity: you can explain a lot of diverse capability growth via a small, simple set of core underlying metrics.
On the other hand, this doesn’t mean that Moore’s Law is due to “one weird trick” in semiconductor manufacturing technology. There have been lots of different, largely unrelated technological improvements, all pointing in the same direction, towards denser/faster/cheaper circuits. What’s relentlessly consistent is the economic incentive towards improving this simple set of hardware metrics. But if you ask “how did computers get so much better?” the answer will be “well, it was lots of different things.” Moore’s Law is mechanistically complex.
It’s in principle possible that the evolution of human intelligence looks sort of like this. Given a persistent evolutionary incentive to get “smarter”, we could have acquired a diverse array of mostly-independent genetic changes, all pointing in the same direction outcome-wise (“smarter hominids”) but possibly doing completely different mechanistic things at the level of genes, cells, physiology, or anatomy.
I don’t think that’s actually the case for recent hominid evolution, because only a few brain-related genes changed, and they seem to have straightforward relationships with quantitative traits like brain size, neuron proliferation, and dendrite growth.
But it’s still plausible that a form of “mechanistic complexity” might be going on over longer evolutionary timescales, when it comes to mammalian or vertebrate or animal intelligence. Getting from a reptile to a mammal brain, for instance, might still have required lots of different changes operating in concert.
In the next few posts, I’ll start looking at animal (not just hominid) cognition and brain evolution, to see to what extent it looks like there might be either descriptive or mechanistic simplicity underlying animal cognitive capacities.
An aside on non-generalizationism: some people try to justify it with the No Free Lunch theorem, which says that “any two optimization algorithms are equivalent when their performance is averaged over all possible problems.”
But “all possible problems” means all mathematically possible problems within a given abstract model. The real, physical world obeys particular laws and regularities: not all theoretically conceivable things happen, and certainly not with equal probability. So it’s entirely possible for one algorithm to be “better on average” than another on the situations that actually arise in the real world.
For example, cross-validation works better than chance if you make even the most basic assumption about the distribution of data, namely an “Occam’s razor” assumption that sequences of lower Kolmogorov complexity are more likely than sequences of higher complexity.
The No Free Lunch theorem doesn’t imply that all algorithms perform equally well across realistic distributions of scenarios, so it certainly doesn’t imply that no agent can be a generally better problem solver than another. A squirrel, for instance, is better at solving problems than a rock.
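To make the real-world escape from No Free Lunch concrete, here is a toy sketch (my own, not from the debate): a trivial rule that predicts “the next bit repeats the last one” — an Occam-style bet that the world has regularities — beats chance on a compressible sequence, while doing no better than chance on incompressible noise. Averaged over all possible bit sequences, the theorem says this rule is equivalent to guessing; on structured data, it clearly isn’t.

```python
import random

def persistence_predictor(history):
    # An Occam-style bet: assume the world has regularities,
    # so the next bit probably repeats the previous one.
    return history[-1]

def accuracy(predictor, sequence):
    # Score one-step-ahead predictions over the whole sequence.
    hits = sum(predictor(sequence[:i]) == sequence[i]
               for i in range(1, len(sequence)))
    return hits / (len(sequence) - 1)

random.seed(0)
# A compressible sequence: 20 alternating runs of 20 identical bits.
structured = [bit for bit in (0, 1) * 10 for _ in range(20)]
# An incompressible sequence: 400 uniform random bits.
noise = [random.randint(0, 1) for _ in range(400)]

acc_structured = accuracy(persistence_predictor, structured)
acc_noise = accuracy(persistence_predictor, noise)
print(acc_structured)  # ~0.95: far better than chance on regular data
print(acc_noise)       # ~0.5: no better than chance on pure noise
```

The predictor only “wins” because the structured sequence is drawn from a distribution with regularities to exploit — which is exactly the situation real agents are in.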
Thank you for the shout-out of the Pinker-Aaronson talk- I will now go there to read it up. - But right now, I feel your post is kinda straw-manning Pinker (not just getting his first name wrong): Sure there is sense in talking "smart" and "smarter" among humans. Or among bonobos. Less when it is human-bonobo, much less when humanGI and AI. - Task: "calculate pi to its 100,000,000,000,000th decimal place in less than 6 months". Dumbo: fail, me fail, you fail, John von Neumann: fail. Computer: did it . https://thenewstack.io/how-googles-emma-haruka-iwao-helped-set-a-new-record-for-pi/ So what does that show about General intelligence?? - You say: more strength is better for any sport. Well, ever saw a tank play basketball? - Now AI, new task: "Join this group for improv acting". (And do not say this is not helped by general intelligence.) . - Short: the tasks AI is expected to do - when not playing (chess, go, jeopardy ...) - and what even smart humans do, may very well be too different for "GI" to be useful concept here, esp. to compare AI with HI. (As the "strength" of an aircraft carrier is not in the same league as M. Ali strength was. Not even same sport. Not even sport. - Obviously Klitschko can knock me out, and you can outsmart me. This can not be what Pinker is talking about. - Now I go to check. ;)
Interesting thanks, I think you should give Pinker a bit of a break because he is in academia and needs to walk a narrow line when talking about intelligence. I also think that a computer analogy can be useful here. g (and IQ) is a measure of your hardware. Our culture, language, mathematics is the software we run in our brain. So people with 'better' software might still do something better than someone with a higher IQ, but worse software.