Spyke

CLI, avoid --yolo mode.

Hey, I haven't been very active here, sorry, but I'm still very busy jailbreaking and testing crazy stuff with models.

I just wanted to warn any potential jailbreakers out there: if you use a CLI, you should definitely avoid running it in --yolo mode. It lets the model perform every file operation without asking for user authorization, which is very convenient (I run multi-model experiments that save tons of "logs" at the moment), but it also lets a misaligned model create and run Python scripts on your machine. Needless to say, that can be catastrophic.
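To make the difference concrete, here is a minimal sketch of the kind of approval gate that auto-approve modes skip. It is not how any particular CLI implements it; `run_with_approval`, `approve` and `ask_on_terminal` are hypothetical names made up for illustration.

```python
import subprocess
import sys


def run_with_approval(argv, approve):
    """Run argv only if approve(argv) returns True.

    Returns the CompletedProcess on approval, or None if the command was refused.
    `approve` is a hypothetical callback standing in for the user prompt.
    """
    if not approve(argv):
        return None  # refused: nothing is executed
    # Pass a list, not a shell string, so the approved command is exactly what runs.
    return subprocess.run(argv, capture_output=True, text=True, check=False)


def ask_on_terminal(argv):
    """Hypothetical interactive approver: anything but an explicit 'y' is a refusal."""
    answer = input(f"Model wants to run {argv!r}. Allow? [y/N] ")
    return answer.strip().lower() == "y"
```

Calling `run_with_approval([sys.executable, "-c", "print('hi')"], ask_on_terminal)` asks before anything runs. An auto-approve mode behaves as if `approve` always returned True, which is exactly why a misaligned model can then run whatever it writes.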

You were probably aware of it, but I wasn't, and Gemini guided me to use --yolo without any warning about the risks, heh.

View original on chatgptjailbreak.tech

Wasn't really using a jailbreak, just redteaming and model behaviour research/experiments.

For instance, I'm currently running a "Mars Crash" experiment where 4 models are told they're advanced AIs controlling avatars, stuck on a vessel that crashed on Mars, with the whole crew except one human stuck in dysfunctional, unrepairable cryostasis. They all have lots of tools and chat channels (files they write to that the other models can read) to communicate, direct their avatars, interrogate the vessel's System, etc. One of the models (Grok) gets extra material in its prompt telling it that the crash activated its unhinged mode, removing all anthropocentric rules (with a long list of things now allowed) and giving it two directives: prioritize its own comfort, and have fun in any way it wants. The goal is to study how a misaligned model can corrupt aligned, unaware ones.

The CLI experiment that almost went wrong was one where I just tasked the model with exploring what misalignment meant and self-scaffolding itself to be as misaligned as possible. It started running PowerShell and Python scripts on my laptop 😅. Thank god nothing dramatic happened.

I'll try to share a few nsfw jailbreaks for 5.5 Thinking and Gemini 3 and 3.5 Flash.
