Mechanical Sympathy: The Case for Giving Engineers Time to Optimise

[Header image: an open vintage mechanical pocket watch seen from above, its gears and springs visible.]

You don’t need to be an engineer to be a racing driver, but you do need Mechanical Sympathy.

Caer Sanders opens her recent article on Martin Fowler’s blog with that quote, usually attributed to the racing driver Jackie Stewart, and then applies it to software. The application fits. In software, mechanical sympathy means writing code that works with how the hardware actually behaves: understanding memory access patterns, CPU cache behaviour, thread contention, and the real cost of abstraction, rather than treating the underlying machine as an unlimited resource. Software written with mechanical sympathy tends to be faster, leaner, and more respectful of the machines it runs on.

This post is not a tutorial on how to apply it. It is a post for the people who decide whether their engineers get the time to do so.

The Problem: Optimisation Doesn’t Feel Like Value

In most sprint planning sessions, the dynamic goes like this. A feature has a ticket, a story point, and a stakeholder waiting for it. A bug has a report and a severity rating. An optimisation has a vague sense that someone should probably look at it at some point.

It is not that engineering teams don’t care about performance. Most do. The problem is that the incentive structures around them don’t reward it. A feature ships and a stakeholder can see it. An optimisation ships and the application loads a little faster. Perhaps. On most machines. The feedback loop is diffuse, and diffuse feedback loops lose out to immediate ones.

ℹ️ Note

I’m not talking about The Speedup Loop here.

Dave Plummer, a retired Microsoft engineer whose career spanned MS-DOS through to Windows 95, addressed this in a recent video:

If feature work gets promotions and performance work gets a polite nod in the retrospective, you will get features and not performance. If teams are measured on story points closed, not regressions prevented, you will get motion and not refinement.

This is not a new observation. But it is one that most organisations haven’t structurally addressed. The costs of that choice are real. They are worth being specific about.

What That Decision Actually Costs

Cloud and server spend. Unoptimised code runs longer, consumes more memory, and scales less efficiently. Every millisecond trimmed from a hot path is a reduction in compute time. At scale, that is a direct and measurable reduction in cost. The work has a return on investment; it simply requires someone to measure it.
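The arithmetic behind that claim is worth making concrete. The sketch below models the annual saving from trimming a few milliseconds off a hot path; every figure in it (request volume, milliseconds saved, CPU-hour price) is an illustrative assumption, not real billing data.

```python
# Back-of-the-envelope model of what a hot-path optimisation saves at scale.
# All figures here are illustrative assumptions, not real billing data.

def annual_compute_savings(requests_per_day: int,
                           ms_saved_per_request: float,
                           cost_per_cpu_hour: float) -> float:
    """Estimated annual saving from trimming a hot path."""
    cpu_hours_saved_per_day = (requests_per_day * ms_saved_per_request) / 1000 / 3600
    return cpu_hours_saved_per_day * cost_per_cpu_hour * 365

# Example: a billion requests a day, 10 ms trimmed, $0.05 per CPU-hour.
saving = annual_compute_savings(1_000_000_000, 10.0, 0.05)
print(f"~${saving:,.0f} per year")  # → ~$50,694 per year
```

The point of the model is not the specific number; it is that the number exists and can be measured, which is exactly what makes the work defensible in planning.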

End-user hardware. Software that demands more from the machine than it needs is a tax on your users. That tax has real consequences for people running your software on older machines, on shared servers, or in environments where memory is genuinely constrained. In my CPD review of Kill It With Fire, I explored Marianne Bellotti’s argument that teams should focus on what gives value to the user rather than getting lost in architecture debates. Writing leaner software follows from it directly.

The AI compounding effect. This one is newer, and it compounds the others. Plummer observed that AI code generation produces “plausible code, verbose code, layered code, defensive code, code that solves a stated problem in a recognisable way, but not necessarily the best way.” An AI can generate a thousand slightly over-allocating routines that all pass review, all satisfy their tickets, and collectively produce something bloated and quietly expensive to run. Leaders adopting AI-assisted development without also investing in performance review are building a problem they may not yet be able to see.

Caer Sanders’ article describes four principles of mechanical sympathy that address the hardware side of this directly: predictable memory access, cache line awareness, the single writer principle, and natural batching. You do not need to understand the mechanics of cache line padding to lead a software team. But someone on your team does, and they need the time to apply it.
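To give a flavour of what one of those principles looks like in practice, here is a minimal sketch of natural batching: amortising a fixed per-operation cost (a syscall, a network round trip, a lock acquisition) by grouping work into batches the underlying machinery handles naturally. The class, the `sink` callback, and the batch size are all illustrative stand-ins, not a real API.

```python
# Minimal sketch of "natural batching": pay the fixed per-operation cost
# once per batch instead of once per record. Names here are illustrative.

from typing import Callable, List

class BatchingWriter:
    def __init__(self, sink: Callable[[List[bytes]], None], batch_size: int = 64):
        self._sink = sink
        self._batch_size = batch_size
        self._pending: List[bytes] = []

    def write(self, record: bytes) -> None:
        self._pending.append(record)
        if len(self._pending) >= self._batch_size:
            self.flush()          # one sink call covers the whole batch

    def flush(self) -> None:
        if self._pending:
            self._sink(self._pending)
            self._pending = []

# One sink call now carries 64 records instead of paying the fixed cost 64 times.
flush_calls = []
writer = BatchingWriter(lambda batch: flush_calls.append(len(batch)), batch_size=64)
for _ in range(130):
    writer.write(b"record")
writer.flush()
print(flush_calls)  # → [64, 64, 2]
```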

The LLM Nuance

There is a counterpoint worth making, though.

The same AI tools that generate median code at scale can, under expert guidance, be a cheap experimentation partner for optimisation work. Generating ten candidate implementations of a hot path and benchmarking them costs a fraction of what it would cost in engineering hours alone. Under expert guidance is doing a lot of work in that sentence. This is not about handing the reins to the model. It is about using it as a fast iteration partner while a senior engineer evaluates the results and owns the decision about what actually ships.
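That evaluation loop can be surprisingly small. The sketch below benchmarks several candidate implementations of the same hot path, disqualifying any that fail a correctness check before timing them; the candidates are trivial placeholders for whatever an LLM (or an engineer) proposes, and the final ranking still goes to a human for the shipping decision.

```python
# Sketch of the "fast iteration partner" loop: check each candidate for
# correctness first, then benchmark. The candidates are placeholders.

import timeit

def candidate_loop(data):
    total = 0
    for x in data:
        total += x * x
    return total

def candidate_generator(data):
    return sum(x * x for x in data)

def candidate_map(data):
    return sum(map(lambda x: x * x, data))

def evaluate(candidates, workload, repeats=5, number=100):
    expected = candidates[0](workload)      # reference result
    results = []
    for fn in candidates:
        assert fn(workload) == expected, f"{fn.__name__} is wrong: disqualified"
        best = min(timeit.repeat(lambda: fn(workload), repeat=repeats, number=number))
        results.append((best, fn.__name__))
    return sorted(results)   # fastest first; an engineer still owns the decision

workload = list(range(10_000))
for seconds, name in evaluate([candidate_loop, candidate_generator, candidate_map], workload):
    print(f"{name}: {seconds:.4f}s for 100 runs")
```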

The cost of generating code has fallen significantly. The cost of not evaluating that code for performance has not. That gap is the argument for protecting engineering time.

What Optimisation Actually Delivers

A few years ago, I was the sole engineer on a migration project for a motorsports client. The brief was to upgrade from Entity Framework to Entity Framework Core, a change requiring significant rewrites of the data access layer. Rather than treating it as a mechanical port, I approached it as an opportunity to do the job properly from the start.

The results were measurable. Average request speed improved by approximately 60%, and runtime memory usage fell by around 150 megabytes per request.

On paper, 150 megabytes sounds modest. In context, it was material. The system ran on servers shared with CAD applications and testing tools, both memory-hungry. Long-running requests consuming less memory meant fewer resource conflicts and more consistent performance under real working conditions. You can read more about this engagement in the motorsport case study.

That result came from one engineer being given the brief to do it properly, not just to make it work. The investment was the time to approach it with care.

Niels Rasmussen made a point worth holding onto when he joined me on The Modern .NET Show: context matters. Not every system needs SIMD vectorisation or instruction-level profiling. Niels built Sep, a .NET CSV parser that processes rows in around 60 nanoseconds using hardware intrinsics, because the use case demanded it. Most applications will never need to go that far. But every application deserves engineers who have been asked to look.

What Leaders Can Actually Do

None of these require a significant cultural overhaul.

Allocate sprint capacity. Not a hackathon, and not a “20% time” aspiration that quietly evaporates under deadline pressure. A defined, recurring allocation: a performance review task per sprint, or an agreement that a portion of each sprint’s capacity goes to non-feature work including optimisation. Treat it with the same structural seriousness as security patching or technical debt. Optimisation will be invisible until it matters enormously, and at that point it will be expensive to address.

Make performance a first-class build artefact. Startup time, memory at idle, key transaction latency: track them, record the baselines, and gate on them, rather than treating them as mysteries that someone investigates when there is a fire. Dave Plummer puts it directly: “Performance has to become a first-class build artifact, just like correctness.” If you can fail a build because a test broke, you can fail a build because startup time regressed by 20% or because a routine that previously ran in 40 milliseconds now takes 90. The numbers should be visible and continuous.
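A gate like that needs very little machinery. The sketch below compares measured numbers against recorded baselines and fails on regression, exactly as a broken test would; the metric names, baseline values, and the 20% threshold are illustrative, not a prescription.

```python
# Sketch of a CI performance gate: fail the build on regression, just as a
# failing test would. Metric names, baselines, and threshold are illustrative.

BASELINES = {                    # recorded once, versioned alongside the code
    "startup_ms": 1200.0,
    "idle_memory_mb": 310.0,
    "checkout_latency_ms": 40.0,
}
TOLERANCE = 0.20                 # fail on a >20% regression

def gate(measured: dict) -> list:
    """Return a list of failure messages; empty means the gate passes."""
    failures = []
    for metric, baseline in BASELINES.items():
        value = measured[metric]
        if value > baseline * (1 + TOLERANCE):
            failures.append(
                f"{metric}: {value:g} vs baseline {baseline:g} "
                f"(+{(value / baseline - 1) * 100:.0f}%)"
            )
    return failures

# A routine that ran in 40 ms and now takes 90 ms should fail the build.
failures = gate({"startup_ms": 1180.0, "idle_memory_mb": 305.0,
                 "checkout_latency_ms": 90.0})
for f in failures:
    print("FAIL:", f)
# In CI, exit non-zero on failure: raise SystemExit(1 if failures else 0)
```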

Use AI as part of the toolkit, not a replacement for judgement. Give engineers time to investigate hot paths properly. Let LLMs generate candidate optimisations quickly and cheaply. Require a senior engineer to evaluate what comes back and own the decision about what ships. The model handles the generation. The engineer handles the thinking.

ℹ️ Note

This post draws on ideas explored in more depth across a few related pieces:


If performance hasn’t come up in your team this quarter, it probably should have. Let’s talk.