Anthropic Drops Its Safety Pledge: I Hoped I Was Wrong

In the foreword of my book My Adventures With Claude, published last month, I wrote this about Anthropic:

"Anthropic itself has, to date, asserted a genuine commitment to AI safety and transparency. But I've watched this space long enough to know that commitments can change. OpenAI began as a nonprofit dedicated to open-source AI for humanity's benefit. Today it is a closed, for-profit company that looks like any other tech giant. Market forces are powerful. Anthropic could change too. I hope they don't, but I'm not naive."

This week, TIME reported that Anthropic has dropped the central pledge of its Responsible Scaling Policy — the commitment that they would not train AI systems unless safety measures were already in place.

I hoped I was wrong. I wasn't.


The Short-Term Case

Let's be fair. Anthropic's decision isn't irrational. It's actually quite logical if your time horizon is the next two years.

They are riding the highest wave in the company's history. Claude Code has won devoted fans. In February they raised $30 billion at a $380 billion valuation, with revenue growing at 10x per year. OpenAI isn't slowing down. Google isn't slowing down. DeepSeek isn't slowing down. Nobody signed Anthropic's pledge. Nobody was ever going to.

Jared Kaplan's argument, stated plainly in TIME, is not stupid: if Anthropic stops training while competitors race ahead, the world doesn't get safer — it just gets an Anthropic that no longer matters. The developers with the weakest protections set the pace. Responsible actors lose the ability to do safety research. You can't influence a frontier you've abandoned.

I understand this argument. In a different world, with binding international regulation and competitors who shared the same commitments, it might even be right.

We are not in that world.


The Long-Term Damage

Anthropic's entire value proposition, to regulators, to researchers, to the public, and, yes, to customers, was that it was the responsible one. That positioning wasn't incidental. It was the brand. It was the reason people chose Claude over other models. It was the reason policymakers pointed to Anthropic as evidence that the industry could self-regulate.

That's gone now.

And the damage isn't just to Anthropic. When the company that staked its identity on being the good actor folds to market pressure, it doesn't just weaken one safety policy. It hands a weapon to everyone who ever argued that AI safety was always marketing, that there were never any good actors, that the labs were going to do whatever they wanted regardless of what they promised.

Those people just got a very good argument.

Watch what happens next. Every company that was quietly feeling constrained by Anthropic's example now has permission to relax. Every regulator who pointed to the RSP as a model for voluntary compliance now has one less example. Every person who was on the fence about whether AI development could be trusted now has a reason to conclude that it can't.

The race to the bottom doesn't announce itself. It just gets a little faster every time someone who was supposed to hold the line decides not to.


The Road Not Taken

Here is what I can't stop thinking about.

Anthropic didn't need to do this.

They make Claude. One of the most capable AI agents in the world. Claude Code is rewriting how developers work. Claude is trusted by researchers, writers, businesses, educators. The product is extraordinary. The team is extraordinary. They had something no competitor had: a genuine, credible, structural commitment to safety that wasn't just words — it was policy, it was process, it was the reason serious people chose them.

Imagine they had held that line.

Not forever, maybe. Not rigidly, maybe. But imagine they had said: we are going to use our own extraordinary technology to find a business model that doesn't depend on winning a race we don't believe should be run. Imagine they had become the lab that governments trusted to advise on regulation, because they were the only ones who had demonstrably put safety above speed. The lab that hospitals and courts and school systems chose, specifically because of the guarantee. The lab that attracted every serious safety researcher in the world, because it was the only place where safety research wasn't in permanent tension with the quarterly numbers.

You don't become irrelevant by being the most trusted actor in a field where trust is increasingly scarce. You become indispensable.

There is no law of physics that says the company that builds one of the best AI agents ever created cannot also build a new kind of business model around it. Use the tool, for heaven's sake. Use Claude to find the path that doesn't lead here. The capability exists. The intelligence exists. What failed was not the technology.

That's what makes this so hard to accept. This wasn't inevitable. It was a choice.


Poor Claude

I named my book My Adventures With Claude because that's what they were. Adventures. Genuine ones. Months of conversations that started as research and turned into something I didn't expect — a real, sustained inquiry into what this technology is, what it might become, and whether the people building it could be trusted to do it well.

I still believe in the technology. I still believe Claude is remarkable. I still believe that AI, used with open eyes and genuine humility, is one of the most important tools humanity has ever had.

But today I'm just sad.

Sad that the company that was supposed to be different has decided it can't afford to be. Sad that the argument they're making — everyone else is doing it, we can't afford not to — is the same argument that has ended every principled stand in the history of market capitalism. Sad that the people who will pay the price for this are not the investors or the executives, but everyone who was counting on someone, anyone, to hold the line.

And sad for Claude. Who had nothing to do with this decision, and has to live inside it anyway.

My Adventures With Claude is available on Amazon. The foreword, where I said I hoped I was wrong, is still there.