Generality Is the Wrong Goal

I’ve stopped arguing about AGI. Not because the question got answered, but because I realized I was debating a target nobody can define, on behalf of a future nobody can see, against my own daily experience of what these tools actually do. So I’m going to write down where I’ve landed, mostly so I stop relitigating it in my own head at 2 a.m.

The debate has no definitions

Here’s what finally broke the spell for me. We are arguing, with enormous confidence and enormous capital, about whether machines can become “generally intelligent” — while we still can’t agree on what intelligence, consciousness, or personhood even are.

This isn’t a strawman. David Deutsch, a physicist about as serious as they come, won’t grant that animals are conscious or intelligent. You can disagree with him, but the point stands: if people this thoughtful can’t settle whether a dog has an inner life, what exactly are we measuring when we ask whether a Python process is “generally intelligent”? We’ve turned a word with no agreed meaning into an engineering roadmap. You cannot build toward a spec that no one has written.

Most of the AGI conversation runs on this kind of borrowed certainty. The belief that mind is just computation — and that enough computation must therefore eventually produce mind — gets treated as a discovered fact. It isn’t. It’s a metaphysical bet wearing an engineer’s lanyard.

Nobody can say what problem it solves

Set the philosophy aside, though, because there’s a more practical hole: I have never heard a clear answer to what problem general intelligence would actually solve in the work I do.

Take coding, the use case everyone reaches for first. Why would I want generality there? In real software, generality is the wrong goal. The thing I want is the opposite — specialization. The mid-level engineer who has lived in a codebase for years writes better code in it than a freshly hired PhD who’s never seen it, and it isn’t close. The value was never raw horsepower. It was carrying the whole system around in your head long enough that the right move becomes obvious.

That’s the part that doesn’t scale. And “the system” is never just the source. It’s the business context, the migration nobody documented, the politics of which team owns which service, the fact that a designer will escalate to the CEO if you nudge a button two pixels. Engineering is a context problem and an immersion problem. The hard part is being soaked in the specific universe of the thing you’re building.

This is why I think the current generation of models is structurally limited in a serious codebase, and the reason has nothing to do with how clever they are. They can’t hold the whole thing — code plus all the intangibles — in working memory, and they can’t immerse. They can pattern-match brilliantly inside the distribution they were trained on, then fall apart the moment the problem leaves it. More parameters don’t fix that. They make the in-distribution demo more convincing, which is a different thing entirely.

The mechanism matters

I used to wave away the limitations with the standard line — it can’t do that yet, give it twelve months. I’ve retired that sentence. Not because progress won’t happen, but because “give it twelve months” had quietly become a license to never evaluate the tool actually in front of me.

So my rule now is boring and strict: judge the present state, on its merits, the way I judge anything else I pay for. I don’t buy a car on the promise of next year’s model. I shouldn’t restructure a team around one either.

And judged on the present, what we have is something extraordinarily good at seeming intelligent, which is not the same as being it. The mechanism matters here. Deutsch makes a sharp point about the Turing test: if you pass it by tricking the evaluator rather than by understanding, you’ve defeated the entire purpose of the test. A convincing imitation of reasoning is still an imitation. You can polish it forever and it never crosses over into the thing it’s imitating; you just get a more polished imitation.

There’s an obvious objection, and I want to be fair to it rather than swat it away: maybe human brains are also “just” statistics, just wetware prediction engines. That’s a real argument, and I don’t think anyone has actually closed it. My honest position is that I don’t know — which is exactly why I’m not willing to make billion-dollar org decisions as if it’s settled in the optimistic direction.

The real test is on our side of the screen

Here’s the reframing that stuck with me. The interesting variable was never the model. A system doesn’t need to be intelligent to reshape an industry — it only needs to be more convincing than the buyer is skeptical.

So the test that actually matters runs on our side of the screen. Not “can the model fool a human into thinking it’s human,” but “is the human credulous enough to hand an entire function to a tool that demos beautifully and degrades silently the moment it’s outside its comfort zone.” That’s a decision about discipline and skepticism, and it’s ours to fail.

Where I land

None of this makes me a doomer about the tooling. I use these models every day, and for the right-shaped tasks (scaffolding, transformations, the tedious 60%) they’re a real multiplier. I’m not arguing they’re useless. I’m arguing against the story wrapped around them.

So my stance is deliberately unglamorous:

Stop debating undefined targets. We don’t have a definition of intelligence, so “artificial general intelligence” is a roadmap to nowhere in particular.
Stop treating generality as the goal. In engineering, depth and context beat breadth, and that’s the dimension scale doesn’t address.
Evaluate the present, not the promise. The tool you have is the only fact you’re allowed to plan around.
Keep the skepticism on the human side. The expensive mistakes won’t come from the model overperforming. They’ll come from us overtrusting it.

In a conversation that runs almost entirely on extrapolation, “judge the thing in front of you” turns out to be a surprisingly radical position. I’m fine holding it.
