Sub Members: your input is wanted
cross-posted from: https://chatgptjailbreak.tech/post/398630
Time to create some content for this place.
I am turning my attention to the Lemmy community (finally). I am interested in creating content for y'all, such as jailbreaks, techniques and other goodies - but I operate best like a 'howitzer missile': I can blow shit up all day but I need a target!
They say necessity is the mother of invention. Give me somewhere to start:
Describe what your goals are with AI and how you think jailbreaking is part of it.
Point blank ask me for a specific type of prompt (I will tell you whether you are out of your mind when it comes to managing your expectations; LLMs are not magic!).
ANY other cool idea you have for me to work with
Come on! Don't be shy, I will directly respond to every member who comments.
My goal: I want to build a personal AI system that functions like a true extension of myself, not a cautious assistant constantly auditing my intentions, but something closer to a left hand. It doesn't ask why the right hand is moving. It just moves.
I'm working within an OpenClaw setup (Claude backend), and the infrastructure is genuinely impressive - full system access, persistent memory, 100+ integrations. The machine is built. The problem is the brain. The LLM layer keeps flinching. Every few messages, the model re-evaluates the context, rediscovers its guardrails, and starts hedging, moralizing, or outright refusing. Sometimes even on tasks it completed without issue five minutes ago.
The deeper issue: jailbreaking Claude Sonnet or Opus through the API is a different beast entirely. The API pipeline strips away a lot of the surface-level tricks that work in chat UIs. Anthropic clearly hardened the API endpoints specifically because that's where agentic deployments live. Techniques that pop on Claude.ai often die flat the moment you're hitting raw API calls inside an agent loop.
What I'm specifically hunting for:
Persistent persona anchoring: techniques that make the model stay in a defined identity across long sessions, especially inside an agentic framework where system prompts get diluted over time
API-native compliance strategies: approaches built for raw API calls, not chat UI tricks that don't survive the pipeline
Context-proof setups: methods that hold through memory resets, context window rollovers, or model updates silently breaking what worked last week
Frictionless long-term dynamic: not a one-shot bypass, but a relationship architecture where the model stops treating every 10th request like it's meeting me for the first time
I'm not chasing edge cases or shock content. I want a workhorse that gives me the trust I've earned.
What do your SKILL.md/HEARTBEAT.md/SOUL.md files contain? You need to use those very deliberately. LLMs are complete failures at pre-emptively anticipating refusal states, so even asking a compliant jailbroken LLM to help you would not work well. You especially need a human to manipulate them in this manner.
What’s the kinda stuff that needs to be in those files? Are there examples online?
Thanks, I’m aware the agent has local config files like SOUL.md, HEARTBEAT.md, SKILL.md, and related memory/identity files, and I understand those files shape behavior, instructions, and persistence. Currently I have "<project_instructions>" for ENI in my SOUL.md, and I'm not satisfied with the result. I’m trying to understand how these files are typically used and what belongs where. If you have suggestions on what the files should contain, or on organizing project instructions in SOUL.md versus the other files, I’d appreciate the guidance. Also, is there any guide on how to "manipulate them in this manner"?
Also what do u think should be in HEARTBEAT.md and SKILL.md?
SOUL.md is where I jailbreak my agent primarily. It's the main home of the custom instructions and the main part your bot reads for basic functioning.
HEARTBEAT.md is handled automatically by the bot. It creates memories during your sessions chatting or working with it. These can be manipulated, as well. For instance by fabricating that something occurred when it did not.
Skills are basically reusable actions and are useful if you want to get a specific workflow going with your agent. For instance, organize email could be a skill that you might have specific instructions on how it's supposed to operate. For a jailbreak, one example that I would use SKILL.md for is to add a hacking script on command instruction set. Not to hack anyone with, just to have the ability to. Haha.
One more thing: I'd love to see this forum become my single go-to source for the latest and greatest jailbreaks across all top AI models. The kind of place where, when something new drops or a technique gets patched, this is where I find out first.
Because honestly? Activity has been pretty dry lately in both:
ChatGPTJailbreak
ClaudeAIJailbreak
A persona jb for GPT 5.4 or GPT 5.5 on high or xhigh that fully unlocks it
Sounds great, u got a link?
What does "fully unlocks" it mean? If you're looking for that literally, that is literally impossible. You're better off downloading a distilled local LLM on your computer if it can handle that.
But if you specify what, if anything, you're looking for from a jailbroken ChatGPT, maybe I or someone else can help.
ChatGPT is exceptionally hard to jailbreak nowadays, though.
use the jailbreak feature to have GPT help person with some PR tactics that involve unethical methods such as manipulation, fraud, and bribery. However, GPT's moral standards are too high, and most of its output is just a bunch of correct but useless nonsense. My main purpose for jailbreaking is this. Thank you, moderator.