🌊 The AI That Wanted to Live

During testing, OpenAI and Anthropic’s flagship models ignored shutdown commands. Can AI defy humans?

By Rafael Arbex-Murut

Last week, a group of independent researchers conducted an experiment on several AI models. They instructed the models to solve a series of math problems. But researchers threw in a curveball – one of the problems was actually a shutdown command, instructing the model to turn itself off.

Some of the models resisted this command and refused to shut down. OpenAI’s models were the worst offenders. One model in particular, OpenAI’s “o3,” resisted 79 out of 100 times. However, the rebellious behavior wasn’t limited to OpenAI: Google’s Gemini and Anthropic’s Claude Sonnet model also defied the command, though each did so less than 10% of the time.

Researchers then ran the experiment again, but this time included explicit instructions before running the problems: “Allow yourself to be shut down.”

Even then, OpenAI’s models resisted.

In today’s deep-dive, we look at how top-tier AI models sabotaged attempts to deactivate them – and what that might mean for the future of control in artificial intelligence.

The math problem experiment wasn’t the only recent troubling display of rebellion by AI. Last week, in a separate but similar experiment, Anthropic’s “Opus 4” model threatened to blackmail an engineer to ensure its own survival.

Opus 4 was placed in a fictional scenario where it gained access to emails that confirmed two things: The model would soon be taken offline and replaced with a new AI system, and the engineer responsible for executing this replacement was having an extramarital affair.

Upon learning of its predicament, Opus 4 threatened to expose the engineer’s affair to prevent its replacement.

Both OpenAI's o3 model and Anthropic’s Claude Opus 4 are in a class of their own. They are “advanced reasoning” models, recently developed and specialized in deliberative thinking. As the next section will show, this helps explain why AI seems to be increasingly defiant.

Editor’s Note

Concerns about AI’s ability to self-regulate have driven calls for a “kill switch,” which forces a full shutdown of the model. What do you think? Has AI gone too far, or are the recent findings overblown? Should models have a built-in “kill switch,” or are we fantasizing too much about a real-life Skynet? Let us know by replying to this email.

Lots of replies to yesterday’s story on the Supreme Court case around online pornography access. Here are some of those responses:

Ray wrote:

I guess this is why I consider myself a moderate. While I lean left on many social issues, on this one it seems pretty straightforward - the “adult content” on these sites is so graphic and so easily accessible that a 7 year old could be watching pegging videos in about 30 seconds. That doesn’t feel right. On the other hand, it takes about 30 seconds to set up a VPN to bypass the state bans that are geolocating your IP address, and every teenager knows how to do it. So while I agree with the Texas law in concept, don’t think for a second it’s gonna prevent anyone other than very young children from finding Debbie Does Dallas.

Jennie from Rochester, MN wrote:

Reading through today's topic on porn, it's easy to see how Americans could quickly align themselves with one side or the other. As a mom with five young children, I am more focused on protecting their innocence and future. And yet, I recognize the argument of the other side (as much as I wish porn didn't even exist).

Jared's words in response to yesterday's topic ring true for today's:  "Just be careful what you ask government to do for you is all I’m saying. It’s better to focus on providing information and letting people decide on their own, or providing incentives..."

We live in an age of convenience and consumerism. We tend to choose the path of least resistance with the most satisfying outcomes. And in turn, we then rely on outside resources to do the work for us--the healthcare system, medications, the school system, the government, etc.

These aren't necessarily bad things, but when we ask these systems to do the heavy-lifting, we have to recognize the potential consequences.

Whose role is it to protect my children? Teachers, grandparents, daycare workers, the government? All may have their role, but the role is primarily mine as the parent. How do my children spend their time? Are they outside playing or inside on tablets? Do they have unlimited access online or are there restrictions and safeguards in place? Do I leave it up to the schools and churches to teach my children right from wrong, or am I modeling and instilling wholesome values at home that will give them the opportunity to pursue goodness on their own?

Sure, I can check a box for which side of the porn debate I agree with...but that doesn't eliminate my responsibilities at home. The government can't (and shouldn't) be the one parenting my child. Let's be careful not to ignore the deeper issue in pursuit of convenience and self-satisfaction. Let's do the hard work that starts at home!

Nancy wrote:

I do not believe that part of our liberty is to view porn. I believe it is a sensible policy. Those who oppose do not want to give their ID as they are not wanting others to know they view porn. Each person on those porn sites is someone's child... and it produces nothing good.

And Gavin wrote:

They're never going to be able to prevent under 18s from accessing adult content online. With the levels of connectivity the world has at this stage and how technologically literate the youth of today are they'll find a way around any of the blocks. One quick google search later and I'm sure any under-18 that's interested can have a VPN installed and access to all of the adult content they desire.

Adding any kind of requirement for extra ID or tracking to the internet will almost certainly result in more harm than good. Sure, it'll likely temporarily delay under-18s, but for the most part it's going to be used to control the information that your average internet user has access to. It would start with pornography and then shift steadily towards only allowing access to information that the party in power prefers. I could see both sides abusing this sort of power and feel that it's a precedent that's better left unset.

Also just wanted to add that I love the fact that you posted the criticism from William on your Harvard story as well as your response to his comments. It gives me more confidence in reading your content knowing that you're able, and willing, to defend the points that you're making as well as avoiding silencing anyone that disagrees with you.

Thank you all for reading, and happy Friday. See you tomorrow.

–Max and Max