Discussion about this post

Steve Byrnes:

Nice writeup! I’m gonna do the annoying thing where I self-promote my own cerebellum theory, and see how it compares to your discussion :) Feel free to ignore this.

My theory of the cerebellum is very simple. (Slightly more details here — https://www.lesswrong.com/posts/Y3bkJ59j4dciiLYyw/intro-to-brain-like-agi-safety-4-the-short-term-predictor#4_6__Short_term_predictor__example__1__The_cerebellum )

MY THEORY: The cerebellum has hundreds of thousands of input ports, with 1:1 correspondence to hundreds of thousands of corresponding output ports. Its goal is to emit a signal at each Output Port N a fraction of a second BEFORE it receives a signal at Input Port N. So the cerebellum is like a little time-travel box, able to reduce the latency of an arbitrary signal (by a fixed, fraction-of-a-second time-interval), at the expense of occasional errors. It works by the magic of supervised learning—it has a massive amount of information (context) about everything happening in the brain and body, and it searches for patterns in that context data that systematically indicate that the input is about to arrive in X milliseconds, and after learning such a pattern, it will fire the output accordingly.
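
A toy sketch of this in Python (purely illustrative; the port counts, the perceptron-style learning rule, and the synthetic "context" are my own stand-ins, not anything from cerebellar physiology): the teaching target at time t is the input that will arrive at time t + LEAD, so whatever the learner converges to ends up firing the output early.

```python
import numpy as np

rng = np.random.default_rng(0)

LEAD = 3    # fire the output this many timesteps before the input arrives
T = 2000    # length of the training stream
CTX = 8     # number of context features (the "massive context" in miniature)

# Synthetic data: context feature 0 is a reliable cue that precedes the
# input signal by LEAD timesteps; the other features are noise.
context = rng.random((T, CTX)) < 0.05
signal = np.zeros(T, dtype=bool)
signal[LEAD:] = context[:-LEAD, 0]   # input arrives LEAD steps after the cue

# Supervised learning: the teaching target at time t is the input that
# will arrive at time t + LEAD.
X = context[:-LEAD].astype(float)
y = signal[LEAD:].astype(float)

# A simple perceptron stands in for the Purkinje-cell machinery.
w = np.zeros(CTX)
lr = 0.1
for _ in range(20):
    pred = (X @ w) > 0.5
    w += lr * X.T @ (y - pred)

# The learner should latch onto the one systematically predictive feature.
print("learned cue feature:", int(np.argmax(np.abs(w))))
```

The point of the toy is just the shape of the problem: rich context in, one wire's future value as the training signal, early firing as the payoff.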

Out of the hundreds of thousands of signals that enter the cerebellum time machine, some seem to be motor control signals destined for the periphery; reducing latency there is important because we need fast reactions to correct for motor errors before they spiral out of control. Others seem to be proprioceptive signals coming back from the periphery; reducing latency there is important for the same reason, and also because there’s a whole lot of latency in the first place (from signal propagation time). I’m a bit hazy on the others, but I think that some are "cognitive"—outputs related to attention-control, symbolic manipulation, and so on—and that reducing latency on those allows generally more complex thinking to happen in a given amount of time.

OK, now I’m going to go through your article and try to explain everything in terms of my theory. See how I do…

MOTOR SYMPTOMS: As above, without a cerebellum, you’re emitting motor commands but getting much slower feedback about their consequences, hence bad aim, overcorrection and so on.

ANATOMY: Briefly discussed at my link above, including links to the literature, but anyway, without getting into details, I claim that the configuration of Purkinje cells, climbing fibers, granule cells, etc. is plausibly compatible with my theory. It especially explains the remarkable uniformity of the cerebellar cortex, despite cerebellar involvement in many seemingly-different things (motor, cognition, emotions).

SIZE OF CEREBELLUM GROWING FASTER THAN OVERALL BRAIN SIZE IN HUMAN PREHISTORY: A big cerebellum presumably allows it to be a time-machine for more signals, or a better (less noisy) time-machine for the same number of signals, or both. Which is it? My low-confidence guess is "more signals"; I think it’s time-machining prefrontal cortex outputs (among others), and the number of such signals grew a lot in human prehistory, if memory serves. But it could be other things too.

CLASSICAL CONDITIONING: I don’t think your claim “the cerebellum is necessary and sufficient for learning conditioned responses” is true. I think it’s necessary and sufficient for eyeblink conditioning specifically, and some other things like that. For example, fear conditioning famously centers around the amygdala. I know you were quoting a source, but I think the source was poorly worded—I think it was specifically talking about eyeblink conditioning and "other discrete behavioural responses for example limb flexion", as opposed to ALL classical conditioning.

But anyway, for eyeblink conditioning, there seems to be a little specific brainstem circuit that goes (1) detect irritation on cornea, (2) use the cerebellum to time-travel backwards by a fraction of a second, (3) blink. Step 2 involves the cerebellum searching through all the context data (from all around the brain) for systematic hints that the cornea irritation is about to happen (a.k.a. supervised learning), and thus the cerebellum will notice the CS if there is one.
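
That three-step circuit can be cartooned in Python (again illustrative; the tone/puff setup and the delay-scanning rule are my stand-ins, not a claim about the actual brainstem wiring): the learner scans candidate delays for one at which a puff reliably follows the tone, then schedules the blink on the tone itself.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy eyeblink conditioning: a tone (the CS, one "systematic hint" in the
# context) precedes a corneal air puff (the US) by a fixed, unknown delay.
T = 2000
TRUE_DELAY = 7                      # unknown to the learner

tone = rng.random(T) < 0.02
puff = np.zeros(T, dtype=bool)
puff[TRUE_DELAY:] = tone[:-TRUE_DELAY]

# Supervised timing estimate: for each candidate delay d, how often does
# a puff follow a tone by exactly d steps?
MAX_DELAY = 20
score = np.array([
    np.mean(puff[d:][tone[:-d]]) if d > 0 else 0.0
    for d in range(MAX_DELAY)
])
learned_delay = int(np.argmax(score))   # should recover TRUE_DELAY

# After learning: on each tone, schedule the blink just before the
# predicted puff time (the "time-travel" payoff).
blink_offset = learned_delay - 1
print("learned delay:", learned_delay)
```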

INDIVIDUAL PURKINJE CELLS CAN LEARN INFORMATION ABOUT THE TIMING OF STIMULI: If evolution is trying to make an organ that will reliably emit a signal 142 milliseconds before receiving a certain input signal, on the basis of a giant assortment of contextual information arriving at different times, then this kind of capability is obviously useful.

CEREBELLAR PATIENTS MAKE STRANGE ERRORS IN GENERATING LOGICAL SENTENCES: As above, I think there are cortex output channels that manipulate other parts of the cortex (via attention-control and such), and the cerebellum learns to speed up those signals like everything else, and this effectively allows for more complex thoughts, because there is only so much time for a "thought" to form (constrained by brain oscillations and other things), and the signals have to do whatever they do within that restricted time. I acknowledge that I’m being vague and unconvincing here.

ANTICIPATION: self-explanatory—this part is where you’re closest to my perspective, I think.

ELECTROLOCATION: I don’t know why electrolocation demands a particularly large cerebellum. Maybe latency is really problematic for some reason? Maybe the patterns are extremely complicated and thus require a bigger cerebellum to find them?

Sean A:

If you want some examples of the cerebellum in computational brain models (including modelling that goes beyond just weights between neurons), the lab I did my master's in built some: https://github.com/ctn-waterloo/cogsci2020-cerebellum

The author of that paper would be happy to chat with you if you're interested, including about how the cerebellum features in a model of the motor control system: https://royalsocietypublishing.org/doi/full/10.1098/rspb.2016.2134?rss=1
