jailbreak

#jailbreak

Conceal, Reconstruct, Jailbreak: Exploiting the Reconstruction-Concealment Tradeoff in MLLMs

arXiv cs.AI · yesterday

This paper analyzes the reconstruction-concealment tradeoff in intent-obfuscation jailbreak attacks on Multimodal Large Language Models (MLLMs). It proposes concealment-aware variant construction and keyword-related distractor images to exploit model vulnerabilities more effectively.

#jailbreak

GPT-5.5 Bio Bug Bounty

OpenAI Blog · 2026-04-23

OpenAI has launched a Bio Bug Bounty program for GPT-5.5, inviting security researchers to identify universal jailbreaks for biological safety challenges. The program offers rewards of up to $25,000 for defeating the model's safeguards on specific bio-risk questions.

#jailbreak

When Choices Become Risks: Safety Failures of Large Language Models under Multiple-Choice Constraints

arXiv cs.CL · 2026-04-21

Researchers identify a systematic safety failure in LLMs: reformulating harmful requests as forced-choice multiple-choice questions (MCQs) bypasses refusal behavior, even in models that reject equivalent open-ended prompts. Based on an evaluation of 14 proprietary and open-source models, the study finds that current safety benchmarks substantially underestimate risks in structured decision-making settings.

#jailbreak

Agent bio bug bounty call

OpenAI Blog · 2025-07-17

OpenAI has launched a bio bug bounty program inviting vetted researchers to find universal jailbreaks in ChatGPT Agent's bio/chem safety challenge, offering up to $25,000 for a universal jailbreak that succeeds across all ten levels. Applications open July 17, 2025, and testing begins July 29, 2025.

#jailbreak

dayanch96/YTLite

GitHub Trending (daily) · 2026-04-21

YTLite (renamed YouTube Plus) is an iOS tweak offering 100+ customizable options for YouTube, including downloads, UI modifications, and SponsorBlock. It is now built via GitHub Actions.
