DEV Community

Lucas

I watched an AI agent refactor 14 files, fix failing tests, and open a PR, while I was in a meeting. Here's what that actually means for us. - 01 of 21

It was a Tuesday afternoon in March 2026.

A senior engineer, let's call her Priya, was three slides into a quarterly planning meeting when her phone buzzed. A notification from her terminal. Claude Code had opened a pull request.

She'd started a refactor before the meeting. A sprawling authentication module: 14 files, deprecated patterns, a test suite nobody had touched in two years. She gave the agent a brief in plain language, set the parameters, and walked into the room.

Forty-five minutes later, the PR was open. The code was clean. The tests passed. The deprecated patterns were gone.

She reviewed it that evening, approved it at 6:15 p.m., and closed her laptop.

Here's the question that keeps me up at night:

Was that engineering? Or was that management?

Because if the agent wrote the code, ran the tests, and opened the PR, what exactly did Priya do?

She wrote the brief. She set the parameters. She reviewed the output. She made the call to merge.

She directed it.

And that, directing rather than implementing, is what this entire moment in software engineering is about.

I've been a software engineer for 9 years. I've built SaaS products, fintech systems, and DevOps pipelines from scratch. I watched Copilot arrive and thought "neat autocomplete." Then Cursor arrived and I realised something had fundamentally shifted.

Not because the tools were impressive. Because I finally understood what they were.

They are not smart colleagues. They are not replacements. They are the most powerful leverage mechanism software engineering has ever produced for engineers who understand them deeply enough to wield them.

That's what this book is about.

For the next 20 days, I'm going to share an excerpt from each chapter. Some days will make you uncomfortable. Some days will change how you work on Monday morning. All of them are grounded in what's actually happening in engineering teams in 2026: not hype, not fear, just the territory as it is.

Tomorrow: The one sentence about AI that changes everything.

Top comments (5)

Luis (topstar_ai)

This perfectly captures the shift in software engineering brought by AI agents. The key insight, that directing and reviewing AI output is as much engineering as writing the code yourself, is profound. Tools like Claude Code aren’t colleagues; they are leverage mechanisms that allow experienced engineers to focus on architecture, correctness, and decision-making, while the AI handles repetitive or refactoring tasks.

I’d love to collaborate and explore how this paradigm scales across teams: integrating AI agents for CI/CD workflows, automated testing, and multi-module refactors while maintaining full human oversight. Sharing strategies on prompt design, parameter setting, and verification loops could help teams maximize efficiency without sacrificing quality or control.

Would you be open to experimenting together on AI-driven workflow orchestration in real projects?

Lucas (sam_lukaa)

Thank you for this. You've named the exact shift precisely: directing and reviewing is engineering, not a lesser form of it. That reframe is the core argument of the book.

The scaling question you're raising is exactly where it gets interesting. Most teams are still running AI tools as individual productivity enhancers rather than as coordinated team infrastructure. The gap between "one engineer using Cursor" and "a team with governed agent pipelines, shared CLAUDE.md conventions, and quality gates that catch the confident wrong answer before it merges" is enormous, and most teams haven't crossed it yet.

I'm absolutely open to exploring this together. A few things I'm working through right now that I think overlap directly with what you're describing:
Prompt design that survives team handoffs, not just individual sessions. The verification loop architecture that sits between agent output and production. And the governance layer that makes multi-agent CI/CD trustworthy rather than just fast.

Drop me a message on X at @lukaa_sam or connect on LinkedIn at samlukaa. I'd rather continue this conversation somewhere we can actually build something from it.

Mehmet Can Farsak (mehmetcanfarsak)

Really resonates with the shift from implementing to directing. One thing I've run into with this 'director' model: agents tend to jump straight to execution even when you want them to ideate first. That execution drift is the #1 friction point in agentic workflows. Built a small hook plugin (Brainstorm-Mode by mehmetcanfarsak on GitHub) that adds PreToolUse hooks to keep agents in thinking mode during brainstorming. It gives you that director-level control over whether the agent is ideating or acting.
