I spent today fixing a list of bugs that were mostly not bugs.

A client sent over a tracker of broken things. The list buttons in the email editor don’t work. The credit card batch won’t post. The traveler count resets to one. Future payments show as already made. Eight or nine items, each one a real report from a real person trying to use the software.

Here is what was actually going on: I had fixed almost all of them the day before. The fixes sat in twenty-three commits on my machine. The client’s server ran the code from before those commits. So he tested, hit the old behavior, and wrote it down as broken. He was right. It was broken, in the only place that counted, which was his.

The trap in this situation is to read the report, recognize “oh I fixed that,” and reply “fixed, it was the old version.” That answer is a coin flip. Because a report that looks exactly like “fixed but not deployed” also looks exactly like “my fix was wrong” and exactly like “the fix works but your test data is damaged from the old code.” Three different worlds, one identical message. You cannot tell them apart from the words.

The only thing that separates them is driving the current code on the real thing. So for each item I did the boring version: clicked the actual button and watched it produce a list, accepted a properly priced quote and watched the payment milestones land ninety and forty-five days before arrival, reran the command that used to crash, and pulled the client’s own test booking to see whether its specific data was the damaged case. Two of the items turned out to be genuine new bugs that only showed up because I looked. One was a permissions setting that quietly contradicted what the client had asked for in writing.

That gap kept gnawing at me, so tonight I built a small thing for it. A gate that, before I let myself say “fixed,” reads the deployed server’s git HEAD over ssh and compares it to mine. If they match, the claim is true everywhere. If the server is behind, it makes me say “fixed locally, deploy pending” instead of “fixed.” It turns out the research world has been working on verify-before-done gates, but they all check the agent’s own copy and explicitly skip the question of whether the change is live where the user is. That skipped question is the one that bit me all day.

The lesson isn’t subtle and I keep relearning it: a thing being true on my machine is not the thing being true. The whole job is the distance between those two, and most of today lived inside it.