The work that matters isn’t the model — it’s everything around it. The contracts, the execution environment, the tool interfaces, the verification layer, the observability. This is what I’m starting to call “harness engineering.”
It’s not prompt engineering (that’s one small piece). It’s not ML engineering (we’re not training models). It’s the discipline of building reliable systems around models — systems that constrain, verify, and direct model behavior toward useful outcomes.
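To make "constrain, verify, and direct" concrete, here is a minimal sketch of that loop in Python. Everything here is illustrative and invented for this post — `call_model` is a stub standing in for a real LLM API, and `ALLOWED_ACTIONS` is a toy tool vocabulary — but the shape is the point: the harness, not the model, enforces the contract.

```python
import json

# Stand-in for a model call; a real harness would wrap an LLM API here.
def call_model(prompt: str) -> str:
    # Returns a JSON tool call so the sketch below is runnable end to end.
    return json.dumps({"action": "search", "query": "harness engineering"})

# Constrain: a closed vocabulary of tools the model is allowed to invoke.
ALLOWED_ACTIONS = {"search", "read_file"}

def validate(output: str) -> dict:
    """Verify: reject anything that isn't a well-formed, allowed tool call."""
    data = json.loads(output)  # malformed JSON fails loudly here
    if data.get("action") not in ALLOWED_ACTIONS:
        raise ValueError(f"disallowed action: {data.get('action')!r}")
    return data

def run_with_retries(prompt: str, max_attempts: int = 3) -> dict:
    """Direct: feed validation errors back so the model can self-correct."""
    last_error = None
    for _ in range(max_attempts):
        augmented = (prompt if last_error is None
                     else f"{prompt}\n\nPrevious attempt failed: {last_error}")
        raw = call_model(augmented)
        try:
            return validate(raw)  # nothing executes until this passes
        except (ValueError, json.JSONDecodeError) as e:
            last_error = e
    raise RuntimeError(f"no valid tool call after {max_attempts} attempts: {last_error}")

result = run_with_retries("Find writing on harness engineering.")
print(result["action"])
```

The model never touches a tool directly; every output passes through a checkpoint the harness owns. That separation is where the reliability lives, which is the claim the rest of this post makes.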
Think of it like the difference between building an engine and building a car. The engine is impressive, but the car needs a chassis, suspension, steering, brakes, and instruments. Right now, most of the industry is building engines and bolting them directly to wheels.
An agent orchestrator is a harness. A Datalog reasoning layer over an LLM is a harness. The pattern I keep seeing: the reliability comes from the harness, not the model.
This is still early. I need to think more about what the boundaries of this discipline actually are and whether it’s genuinely distinct from “systems engineering but with LLMs.”