
From black boxes to black holes

Apologies for the wall of text; I normally keep my essays to a three-tweet thread and a commit message, but I’m feeling philosophical.
I added some horizontal spacing to make it look less dense.

I have been writing code for long enough to remember when I knew exactly what my code was doing. Not approximately. Not statistically. Exactly1.

That certainty was never just a technical convenience. It was the entire philosophical foundation of the craft. Software engineering is, at its root, applied mathematics. A function is a function in the algebraic sense: a strict, unambiguous mapping between an input and an output. The computer does not interpret your intentions. It does not approximate your meaning. It does not care about your feelings. It executes precisely what you wrote, and if what you wrote was wrong, it fails in a way that is reproducible, traceable, and ultimately explainable. There is a strange comfort in that rigour. A kind of intellectual honesty that, I suspect, is what drew most of us into this profession in the first place. That, and, let’s be honest, the job market of the early 2000s2.
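To make the "function in the algebraic sense" concrete, here is a toy sketch (a hypothetical example, not code from this post): the same input always produces the same output, and a wrong output is a reproducible, findable wrong answer.

```python
def celsius_to_fahrenheit(c: float) -> float:
    # A function in the algebraic sense: a strict mapping
    # from input to output. No interpretation, no approximation,
    # no feelings. Run it a million times; it never drifts.
    return c * 9 / 5 + 32

assert celsius_to_fahrenheit(100.0) == 212.0
assert celsius_to_fahrenheit(0.0) == 32.0
```

If the formula were wrong, the assertion would fail the same way every single time, which is exactly the property the rest of this essay is mourning.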

But somewhere along the way, the glass started to cloud.


The first time it happened, we barely noticed. We started importing libraries without reading their source code. We called third-party APIs without understanding their internals. We copied solutions from StackOverflow with the same blind faith our grandparents reserved for religion. We accepted this because the contract still held: the documentation was there, the type signatures were there, a human being with intent had written the code on the other side. The internal logic became a black box, but the boundaries of that box were still governed by the same strict, deterministic physics. Boolean algebra does not care whether you can see the gears or not. It still turns them the same way every time.

Then came legacy code. The engineer who built the system left the company, possibly in a hurry, possibly without saying goodbye to anyone, and their glass box became our black box overnight. Suddenly we were maintaining systems we could not fully explain, adding comments like // DO NOT TOUCH, NO ONE KNOWS WHY THIS WORKS3 with the quiet reverence of people maintaining a very fragile, very expensive shrine. We told ourselves this was a documentation problem, a knowledge transfer problem, a process problem. Something fixable. What we were really watching was the first crack in our relationship with understanding itself.

And now there is AI.


I want to be precise about what disturbs me, because it is not what most people assume. I am not worried about my job… well, not only about my job. I am not worried about AI writing bad code, at least not in the way a senior developer worries about a junior’s pull request at 4:58pm on a Friday. What disturbs me is something quieter and more fundamental.

When I accept a function generated by an AI assistant, I am importing a piece of logic whose genesis is purely statistical. The model did not reason about my system’s state. It did not understand the invariants I am trying to preserve. It did not have a single coherent thought about my problem domain. It produced a syntactically plausible artifact because, in the vast ocean of its training data, this sequence of tokens tends to follow that sequence of tokens. It is not logic. It is the shadow of logic, cast by an enormous statistical model trained on the accumulated output of people who were actually thinking.

We have moved from black boxes to black holes.

A black box hides its internal logic from you, but the logic is still there. Deterministic. Intentional. Traceable in principle, even if not in practice. A black hole is something else entirely. It does not hide the logic; it replaces the logic with probability. When you step through a debugger and a function you wrote behaves unexpectedly, there is a wrong answer buried in there, and you can find it. When an AI-generated system behaves unexpectedly, you are not looking for a wrong answer. You are staring into the output of a stochastic process, trying to apply the forensic tools of deterministic engineering to something that was never deterministic in the first place. Good luck with that. Bring snacks.

What unsettles me most is the remarkable silence around this. The industry has decided, with extraordinary speed and very little apparent anxiety, that the ability to generate code is equivalent to the ability to engineer systems. But engineering is not the generation of syntactically plausible artifacts. It is the deliberate construction of logical flows that obey the strict, unforgiving physics of computation. A CPU does not run on probability. A memory address does not point to a likely location. An if statement does not evaluate to “probably true, let’s go with it.” The machine underneath has not changed at all. It still demands absolute determinism. It is only us who have started pretending otherwise, mostly in blog posts and investor decks.


I use these tools too. That is the part I find hardest to sit with. I press Tab and accept the completion4, and somewhere in that gesture I am outsourcing a small piece of the reasoning that used to be entirely mine. I am not sure when small pieces become the whole. I am not sure anyone is keeping score.

What I do know is that the thing I loved about this craft was its relationship with mathematical truth. The proof either holds or it does not. The logic either flows or it breaks. There was something deeply satisfying about building a system that behaved exactly as reason predicted, especially when it was hard. That difficulty was not an obstacle to the work. It was the work.


I want to be clear about something before I wrap this up, because I am not here to bury mathematics. This is still applied mathematics. All of it: the weights, the activations, the attention mechanisms; it is mathematics from top to bottom, and extraordinarily sophisticated mathematics at that. Anyone who tells you AI is “not real math” has not read the papers, and I respect them for it because those papers are dense.

But here is the thing. When I write code, even the kind of code I quietly pray no one discovers during a refactor, I can explain it. The explanation might be incoherent. It might reveal a spectacular misunderstanding of the problem. It might make my colleagues question not just my technical decisions but my broader judgment as a human being. But the chain of reasoning exists, connecting the problem I was trying to solve to the solution I arrived at. I am the author. I am accountable to the logic, for better or for worse, mostly for worse at 2am.

I cannot explain how a trillion weights produced a Hello World. Not because the process is hidden from me, but because there is no explanation in the human sense of the word. There is no reasoning to reconstruct, no internal monologue to audit5, no moment where the model paused and thought “ah yes, an assignment (=) where an equals() was intended”. There was convergence. Statistical, mathematical, alien convergence onto a sequence of tokens that resembles what I asked for6.

And that is the black hole. Not the absence of mathematics. The absence of a mind I can argue with.

We spent decades building systems we could reason about, then systems we could only observe, and now systems we can only prompt and hope. The physics of the machine has not moved an inch. We are the ones who drifted.

And yet. I find myself wondering if this is a question of trust accumulation rather than a fundamental dead end. We did not always trust libraries either. There was a time when importing someone else’s code felt reckless, irresponsible, vaguely immoral. Then came testing frameworks, formal verification, open source audits, and enough collective battle-testing that the internal implementation quietly stopped mattering. The black box earned its place through layers of accumulated evidence, not through transparency.

Maybe the black hole follows the same path, just with more layers. More validation pipelines, more human checkpoints, more adversarial testing, more generations of systems that failed in production and were quietly patched by engineers who still understood the underlying physics even if the AI did not. Maybe at some point the stochastic artifact becomes so reliably constrained by everything surrounding it that its internal opacity becomes genuinely irrelevant, the same way the internals of a sorting algorithm stopped being your problem sometime around 2005.
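One way to picture "reliably constrained by everything surrounding it": a gate that judges the artifact purely by external behavior, never by how it was produced. This is a minimal hypothetical sketch (the names `gate`, `generated_sort`, and the cases are all made up for illustration):

```python
def gate(candidate, reference_cases):
    # A validation layer in miniature: we do not ask how the
    # candidate function came to exist, only whether it survives
    # every check we can throw at it.
    return all(candidate(x) == expected for x, expected in reference_cases)

# Pretend `generated_sort` fell out of a black hole; here we use
# the builtin as a stand-in for an AI-generated artifact.
generated_sort = sorted

cases = [([3, 1, 2], [1, 2, 3]), ([], []), ([5], [5])]
assert gate(generated_sort, cases)
```

Scale that idea up through property-based tests, adversarial suites, and production canaries, and the internal opacity starts to matter less, which is precisely the bet the industry seems to be making.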

Maybe. The uncomfortable part is that we cannot yet know how many layers that requires, who is responsible for building them, and whether the industry will bother to wait long enough to find out… or maybe no one cares anymore.

Thanks for surviving the read!

A brief aside: how AI changes the developer’s craft and how AI changes the behavior of the products we deliver are two different problems.


  1. As exactly as in turning it off and on again and calling it a ‘hotfix’ :-P ↩︎

  2. Yes, the era of “anyone with a laptop and a dream.” ↩︎

  3. That comment is the repo’s shrine. Silence, candles, and, hopefully, offerings of fresh unit tests. ↩︎

  4. The Accept Tab: a developer’s Pavlovian reflex. Also known as “delegation by autocomplete.” ↩︎

  5. To anyone arguing that LLMs can now show their chain of thought: it’s still probabilistic intermediate output. ↩︎

  6. Formal verification, type systems and provable invariants are how we historically translated human intent into machine-checkable guarantees; statistical convergence is rigorous math, but it is not the same kind of explanatory chain. ↩︎

Author
HyperTesto
Hey, I’m the same guy from the homepage. Incredible, isn’t it?