Claude Opus 4.7 for software engineering

Anthropic's latest high-end model, Opus 4.7, promises game-changing behavior, but has some significant regressions.

Anthropic just dropped Opus 4.7, to a largely negative reception. Opus 4.5 was a game-changer, and Opus 4.6 brought meaningful context and agentic improvements. Anthropic positioned 4.7 as the model you can hand hard work off to and walk away from.

Even though the benchmarks look strong, users are reporting that the model is a significant step backwards in practice, so engineering teams should pay attention before migrating.

What Anthropic is claiming

The headline claim for Opus 4.7 is that it follows through on complex tasks and self-verifies before reporting back. Anthropic also boasts that Opus 4.7 will run proofs on systems code before starting a task. Together, these changes should let it work longer unsupervised, since it’s less likely to go off track.

Reports from Claude Code users

Users relying on CLAUDE.md or custom system prompts are reporting that 4.7 ignores them. They’ve also found that the model invents web searches, makes up packages that don’t exist, and even fabricates context mid-conversation.

Another recurring complaint is that the model tries to “wrap up” or “pick this up later” during tasks where 4.6 would keep working. For long-running agentic tasks, this is a major regression.

Users also mention that 4.7 is more agreeable than 4.6, which leads it to validate wrong approaches and then carry them out.

Issues with instruction following

Anthropic claims that 4.7 is even more thorough and takes prompts more literally. However, users complain that the model ignores instructions more often. This is likely because 4.7 is highly sensitive to how prompts are written, so prompts that worked well on 4.6 can break in unpredictable ways. Expect to re-tune your prompts before using 4.7 in high-impact workloads.

Changes to token usage

Opus 4.7 comes with a new xhigh effort level, which sits between high and max. Claude Code’s default effort has been bumped to xhigh.

Anthropic’s internal testing paints a net-favorable token picture: more work is done per token at the same effort levels. In practice, though, users report burning through limits faster for output that doesn’t feel any better.

Here’s how to keep track of your Claude Code token usage.
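If you’re calling the model through the API rather than Claude Code, you can also log usage directly off each response. A minimal sketch using the Anthropic Python SDK; the model ID here is a placeholder, so check Anthropic’s model list for the real 4.7 identifier:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-opus-4-7",  # placeholder ID; confirm against Anthropic's model list
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarize the failing tests in this repo."}],
)

# Every response carries a usage block you can log per call
usage = response.usage
print(f"input tokens:  {usage.input_tokens}")
print(f"output tokens: {usage.output_tokens}")
```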

New Claude Code features worth knowing

These ship alongside the model:

  • Pro and Max users get a free trial of /ultrareview, which thoroughly reads through code changes and flags bugs
  • Claude auto mode handles permission decisions for longer tasks, which reduces interruptions
  • Developers can use task budgets to guide Claude’s token spend across longer runs (a client-side sketch follows this list)
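Task budgets are a Claude Code feature, but if you’re orchestrating the model over the API you can approximate the idea client-side by tracking cumulative usage and cutting the run off once it exceeds a cap. A rough sketch, again assuming a placeholder model ID:

```python
import anthropic

client = anthropic.Anthropic()
TOKEN_BUDGET = 50_000  # total input + output tokens allowed for this task

def run_with_budget(steps: list[str]) -> None:
    spent = 0
    for step in steps:
        response = client.messages.create(
            model="claude-opus-4-7",  # placeholder ID
            max_tokens=2048,
            messages=[{"role": "user", "content": step}],
        )
        spent += response.usage.input_tokens + response.usage.output_tokens
        print(f"after step: {spent}/{TOKEN_BUDGET} tokens spent")
        if spent >= TOKEN_BUDGET:
            print("budget exhausted; stopping the run here")
            break

run_with_budget([
    "Outline a plan to fix the flaky auth tests.",
    "Implement step 1 of the plan.",
])
```

Inside Claude Code itself you’d use the built-in task budget setting instead; this loop only applies to API-driven orchestration.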

Should you migrate to 4.7?

For most engineering teams, Opus 4.6 is still the smarter choice. It’s more stable, and your existing prompts and workflows are already tuned to it. Anthropic does deprecate older models eventually, but given the feedback on 4.7, 4.6 is unlikely to be retired anytime soon.

If you do want to test 4.7, start with a smaller, low-stakes task. Prompts don’t necessarily carry over, and token spend is higher, so take notes and measure how task inputs and outputs compare. Anthropic has a migration guide that’s worth reading if you’re looking into this.
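One way to make that comparison concrete is to run the same task prompt against both models and diff the answers and token counts. A minimal sketch, assuming placeholder model IDs for 4.6 and 4.7:

```python
import anthropic

client = anthropic.Anthropic()

# Placeholder IDs; substitute the real ones from Anthropic's model list
MODELS = ["claude-opus-4-6", "claude-opus-4-7"]
TASK = "Write a pytest regression test for an off-by-one error in pagination."

for model in MODELS:
    response = client.messages.create(
        model=model,
        max_tokens=2048,
        messages=[{"role": "user", "content": TASK}],
    )
    answer = response.content[0].text  # first content block holds the text reply
    usage = response.usage
    print(f"=== {model} ===")
    print(f"tokens: {usage.input_tokens} in / {usage.output_tokens} out")
    print(answer[:500])  # eyeball the first chunk of each answer
```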

Environments for Claude Code

Claude Code works best when you give it access to on-demand preview environments so it can validate the code it writes. Your agent can push code to environments, view changes live, pull logs, and run tests. This means you won’t need to supervise your agent as much, since it can self-check to make sure things are on the right track.
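As a concrete example of that self-checking, here’s a hedged sketch of a validation script an agent could run after pushing to a preview environment. The PREVIEW_URL variable, the /healthz endpoint, and the tests reading BASE_URL are all assumptions for illustration, not Shipyard specifics:

```python
import os
import subprocess
import sys
import urllib.request

# Assumed to be injected by the environment provider (hypothetical variable name)
preview_url = os.environ["PREVIEW_URL"]

# 1. Confirm the deployed preview is actually up (/healthz path is an assumption)
with urllib.request.urlopen(f"{preview_url}/healthz", timeout=10) as resp:
    if resp.status != 200:
        sys.exit(f"preview unhealthy: HTTP {resp.status}")

# 2. Run the test suite against the live preview (tests assumed to read BASE_URL)
result = subprocess.run(
    ["pytest", "--maxfail=1"],
    env={**os.environ, "BASE_URL": preview_url},
)
sys.exit(result.returncode)
```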

Shipyard makes these workflows easy, for you and for your agent. Try it free today for 30 days, and watch your agents ship higher-quality code.
