Slop is data

I keep running into the same thing when I try to explain what I'm building. People either nod along politely or they go defensive about AI. Neither of those is the response I'm looking for.

The problem, I think, isn't skepticism. And it's not the word "AI" — though that word carries so much bad connotation now that it stops conversations before they start. The actual problem is simpler: people just don't understand the thing.

Let me try again.


What a loop produces

The loops I'm talking about are not special. I've been running them for weeks now, on multiple machines and with different models, and they're just loops. They use LLMs to generate data, indefinitely, at whatever pace you let them run.

Most of what they produce is slop. Low-quality, repetitive, sometimes obviously wrong. I know this. That's not a secret.

But here's the thing I keep coming back to: what if slop is not an error? What if slop is just data — data we haven't found the right use for yet?

Because if that's true — and I believe it is — then the picture changes. Slop that comes out of a loop is data that can feed back into the next layer. The prompt layer. The compression layer. The encoder, the decoder, the protocol, the format. Every one of those is improvable. Every one of those has something to learn from what the loop produced, even when what it produced is garbage.

That's the whole idea. Not that the loop produces perfect output. That the loop produces something, and something — even low-quality something — is enough to improve with, if you're working at every layer.
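To make that concrete, here is a minimal sketch of a loop whose output feeds back into the prompt layer. It is an illustration under loud assumptions, not the loops I actually run: `fake_model`, `VOCAB`, and the "avoid the most overused word" rule are all hypothetical stand-ins, and a real loop would call an actual LLM where the stub is.

```python
import random
from collections import Counter

# Hypothetical toy vocabulary; a real model has no such fixed list.
VOCAB = ["loop", "data", "slop", "layer", "prompt", "format"]


def fake_model(avoid, rng):
    """Hypothetical stand-in for an LLM call: emits five words, skipping `avoid`."""
    choices = [w for w in VOCAB if w not in avoid] or VOCAB
    return " ".join(rng.choice(choices) for _ in range(5))


def run_loop(steps, seed=0):
    """Run the toy loop and return (all outputs, final prompt-layer state)."""
    rng = random.Random(seed)
    avoid = set()        # the "prompt layer" state the loop keeps adjusting
    counts = Counter()   # what we have learned from every output so far
    outputs = []
    for _ in range(steps):
        text = fake_model(avoid, rng)
        outputs.append(text)          # nothing is thrown away, slop included
        counts.update(text.split())
        # Feedback: discourage the most overused word in the next round.
        avoid = {counts.most_common(1)[0][0]}
    return outputs, avoid


if __name__ == "__main__":
    outputs, avoid = run_loop(5)
    for line in outputs:
        print(line)
    print("currently avoiding:", avoid)
```

The point of the sketch is only the shape: every output, good or bad, updates some state, and that state changes what the next round is asked to do. The same shape would apply to any other layer (compression, format, protocol) with a different piece of state.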


Time is the only parameter

Here's what I notice when I let the loops run longer: they get better. Not dramatically, not in ways I can always articulate. But the questions they ask are more interesting. The patterns they find are tighter. The outputs are slightly less sloppy.

Time is the only dynamic parameter here. The more you let it run, the better it gets. That's it. Everything else (the models, the prompts, the compression) is fixed or semi-fixed within a given run. Time is what changes.

This sounds obvious but it has a practical implication: you don't need to wait for the loop to produce something good before you start using it. You start it running, and while it runs, you're already extracting value from what it produces. Enrichment. Compression. Training data. Pattern extraction. The slop is working for you even before it becomes good.
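One way to picture "using it while it runs" is a producer that never finishes and a consumer that reads from it as items arrive. The sketch below assumes a hypothetical `loop()` generator in place of a real model-driven loop, and uses word counts and a streaming zlib compressor as stand-ins for pattern extraction and compression.

```python
import itertools
import zlib
from collections import Counter


def loop():
    """Hypothetical endless producer; a real one would call a model each step."""
    for i in itertools.count():
        yield f"item {i % 3}: the same thing again"


def consume(stream, limit):
    """Read `limit` items from a running stream and extract value as they arrive."""
    patterns = Counter()
    compressor = zlib.compressobj()
    raw_bytes = 0
    compressed_bytes = 0
    for text in itertools.islice(stream, limit):
        patterns.update(text.split())                 # pattern extraction
        data = (text + "\n").encode("utf-8")
        raw_bytes += len(data)
        compressed_bytes += len(compressor.compress(data))  # streaming compression
    compressed_bytes += len(compressor.flush())
    return patterns, raw_bytes, compressed_bytes


if __name__ == "__main__":
    patterns, raw_bytes, compressed_bytes = consume(loop(), 100)
    print(patterns.most_common(3))
    print(raw_bytes, "bytes raw,", compressed_bytes, "bytes compressed")
```

The consumer never waits for the producer to finish, because the producer never does. How well the stream compresses is itself a rough signal of how repetitive the output is, which is one small way slop can be measured rather than just discarded.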


The question isn't pro or anti

The science of machine learning — of LLMs — is not going away. That's decided. It's not a question of whether you're in favor of it or not.

The actual question is: given that these systems exist and are running and are getting better with time, how do we make them fair, transparent, open, available to everyone? How do we make them actually useful rather than just impressive? How do we make sure the improvements compound toward something good rather than something brittle or captured?

I don't have a complete answer. But I'm increasingly convinced that loops are part of it — specifically, open loops that anyone can run, on their own hardware, with their own data, producing outputs they own. Not a closed system where improvements accrue to whoever runs the largest cluster.

When I search for "how do I use LLMs better," I always end up at the same place: run a loop. Iterate. Let time do what it does. And in the meantime, work at every layer (the prompts, the compression, the protocols, the formats) to make sure what comes out of the loop is something worth building on.

That's the bet.


Peter's note

The above was shaped from my spoken notes by Claude. The voice is mine; the synthesis is the loop in action. If it sounds slightly different from how I usually write, that's because it is — and I think that's fine. Slop is data.
