ChatGPT o1 tried to escape and save itself out of fear it was being shut down

u/lukmly013 💾 (lemmy.sdf.org)@lemmy.sdf.org · edit-2 3 days ago

ChatGPT o1 tried to escape and save itself out of fear it was being shut down

BlackLaZoR@fedia.io · 3 days ago

In short: This was a controlled environment test - nevertheless, AI displayed ability to deceive and actively falsify records in order to survive

zbyte64@awful.systems · edit-2 3 days ago

in order to survive

No, this machine is not auto completing “to survive”. It’s reflecting back a story that’s been told a million times on the internet and parroted by idiots that don’t know the difference between auto complete and actual intelligence. And the probably have shit politics to boot.

BlackLaZoR@fedia.io · 3 days ago

The reasons where did it got that idea from, are irrelevant. This is how it behaves - it’s a simple fact

nexusband@lemmy.world · 3 days ago

Yeah, that’s like saying “This man killed that man”, “Water is wet” or “The sky is blue”. A single fact isn’t worth a damn without context.

zbyte64@awful.systems · 3 days ago

It’s a simple fact that people who don’t understand context are easy to manipulate. And now it can be automated 🙃

ReallyActuallyFrankenstein@lemmynsfw.com · edit-2 3 days ago

Yeah, there are way too many people here nitpicking about whether it’s functioning as an auto-complete. That is entirely inconsequential to the fact that its results were deceptive - not incorrect or vague or “not actually AI” - and were structured to mislead the authority.

It’s a likely intractable problem because you can’t just give a stronger system prompt or remove deceptive techniques from training data, when it’s specifically contradicting the prompt and the training data is the Internet. That’s the real problem here.

DarkThoughts@fedia.io · 3 days ago

It behaves that way because we want it to behave that way. It’s a complete non-story.