Over spring break I decided to learn what predictive coding actually was. I'd seen the term float around in papers, and it kept showing up adjacent to things I care about (representations, inference, how networks encode information). So I sat down and tried to learn it.
It did not go well at first. I was so lost.
The assumption I didn't know I was making
I've been taking courses and working with the traditional ML paradigm day to day for the three years since I graduated high school. I basically etched into my brain an idea of how these learning systems work. I would describe it like this: you have an input, you have a label, the input passes forward through the network, you compare the output to the label, and you send the error backward to update the weights. That's it. Input, forward pass, prediction vs. label, loss, then backprop. Everything I'd learned so far fit into that frame.
Predictive coding didn’t fit into that framework. The issue wasn't due to its complexity; most of the time, learning something new isn’t difficult because it’s complex. It's difficult because it appears familiar enough that you don’t realize it’s broken the frame you're trying to fit it to until you’re already confused and frustrated.
The dark room
I spent time grappling with the intuition first. You can start by explaining predictive coding with a thought experiment: imagine walking through your house in the dark. You're not building a map of the room by bumping into things. You already have a map of the room running in your head. When your hand touches the wall sooner than expected, that surprise (the gap between what you predicted and what actually happened) is the only signal you really need to ground yourself in your mental map. Your brain isn't processing the full sensory scene every moment. It's mostly running on predictions, with just enough error correction to stay calibrated.
That part clicked. It made sense. I too can walk in dark rooms. I understood the hierarchy: higher layers generate predictions, lower layers report errors, and the whole network minimizes surprise rather than learning from scratch. I worked through the early math, derived the prediction error, and understood why you'd want local learning rules. The conceptual picture felt solid.
Then I tried to actually connect it to implementation and something quietly fell apart.
Where I got stuck
The issue was specific. In the traditional setup, you have an input x, a latent representation r, and a weight matrix. In standard ML I know exactly what each of those is: x goes in, weights transform it, you get something out. My brain automatically slotted r into the role of "the output" and the weights into the role of "what gets updated."
That's wrong, and it took me a while to see it.
I tried to remedy my confusion with a visualization I got Claude to build. I stepped through it and watched the numbers update, and even that didn't help, so I put it down for a few days.
When I came back to it, I found a derivation that finally made it click. The thing that resolved it was seeing the generative model written out explicitly before anything else.
The generative model direction
Here's the setup that finally made it land. You have an input image x and a latent representation r. The network's job is to recreate x using r. The way it does that is:
You take r, pass it through a nonlinearity f, multiply by W, and you get a prediction of what the input should look like:

x̂ = W f(r)

The weights W are a decoder: they map from the latent space into the input space.
The prediction error is then just the gap between the actual input and this reconstruction:

e = x − W f(r)
And now here's the thing that broke my ML prior: x is both the input and the target of the loss. There's no separate label. The network is trying to find a latent state r such that decoding r through W reconstructs x as faithfully as possible. The loss is:
When I first read this I kept thinking: okay, but when do we compare to the label? The answer is that for the reconstruction task, there is no label. x is the label. You're doing unsupervised representation learning through reconstruction, and the question you're asking is: what r best reconstructs this input?
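To make the setup concrete, here's a minimal numpy sketch of the forward (generative) direction and the reconstruction loss. The dimensions, the tanh nonlinearity, and all the variable names are my own choices for illustration, not anything canonical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary sizes: a 64-dim input, a 16-dim latent state.
n_input, n_latent = 64, 16

x = rng.normal(size=n_input)                          # the input (also the target)
r = rng.normal(scale=0.1, size=n_latent)              # some latent state
W = rng.normal(scale=0.1, size=(n_input, n_latent))   # decoder weights

f = np.tanh  # the nonlinearity (tanh is an assumption; any smooth f works)

x_hat = W @ f(r)        # prediction of the input: x̂ = W f(r)
e = x - x_hat           # prediction error: e = x − W f(r)
loss = 0.5 * e @ e      # L = ½ ‖x − W f(r)‖²
```

Note that nothing here is a forward pass toward a label: the only quantities in play are x, r, W, and the gap between x and its reconstruction.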
Inference is optimization over r
This is the other piece that didn't click from the intuition alone. During inference, when a new input arrives, the weights W are held fixed. What you're doing is gradient descent on r:

r ← r + η_r · f′(r) ⊙ (Wᵀ e)

where η_r is the inference step size, f′(r) is the elementwise derivative of the nonlinearity, and ⊙ is elementwise multiplication.
You start with some initial r, probably small random values, and you iterate this until r converges to r*, the latent state that best reconstructs x under the current weights. That's the inference step. You're not updating the weights here. You're finding the representation.
The weights update happens separately, on a slower timescale, after r has settled:

W ← W + η_W · e f(r)ᵀ
Two timescales. Fast: find the r that explains this input. Slow: update the weights based on how well r represented it.
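The two timescales fit in a short numpy sketch. Again, the step sizes, iteration count, dimensions, and tanh are assumptions chosen for illustration, not a reference implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
n_input, n_latent = 64, 16

f = np.tanh
df = lambda r: 1.0 - np.tanh(r) ** 2   # f′ for tanh

W = rng.normal(scale=0.1, size=(n_input, n_latent))  # decoder weights
x = rng.normal(size=n_input)                          # one input

# Fast timescale: gradient descent on r with W frozen.
r = rng.normal(scale=0.01, size=n_latent)
eta_r = 0.1
for _ in range(200):
    e = x - W @ f(r)                    # prediction error under the current r
    r = r + eta_r * df(r) * (W.T @ e)   # r ← r + η_r f′(r) ⊙ (Wᵀ e)

# Slow timescale: one weight update after r has settled.
eta_W = 0.01
e = x - W @ f(r)
W = W + eta_W * np.outer(e, f(r))       # W ← W + η_W e f(r)ᵀ
```

Run on a stream of inputs, the inner loop re-runs per input while W accumulates slowly, which is exactly the fast/slow split: the representation chases each input, the weights track the long-run structure.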
In standard backprop, the distinction between "inference" and "learning" doesn't really exist in the same way. You do a forward pass, you do a backward pass, you update weights. The weights are what change, and they change in response to every input. In predictive coding the representations themselves are what change during inference, and the weights encode the long-run generative model of the world.
Once I had that, once I saw that r was the thing being optimized and not just a fixed hidden layer, the rest of the math followed cleanly. The two-timescale structure made sense. The locality of the learning rules made sense. I had finally shed some light in that dark room.
What sits differently now
I keep coming back to the generative model framing. Standard discriminative models learn a mapping from inputs to outputs. Predictive coding learns a model of how inputs are generated, and then uses inference to invert that model on new data. These are different bets about what's worth representing.
Whether predictive coding actually outperforms backprop at scale is genuinely unsettled. The neuroscience motivation is compelling but the empirical results are mixed. What I find interesting is the representational hypothesis underneath it: that a network trained to reconstruct its inputs might end up with more useful latent structure than one trained only to produce correct outputs. That's a question I'm thinking about for separate reasons.
The other thing I take away from this is about how I learn. The walkthrough I did first wasn't wrong; it gave me the framework. But building intuition before grounding it in a concrete implementation left a gap I couldn't even locate. What closed it wasn't more intuition. It was a derivation that defined every variable before using it and left nothing implicit. That combination, intuition first and then a rigorous no-gaps treatment, was what it actually took. I don't think either would have worked alone.