#Jailbreaking

2025-12-24

@cyb3rrunn3r @heiseonline Cory Doctorow has made a good suggestion on this: permit and practice #jailbreaking, #modding and #reverseengineering. That would hit the authoritarian tech bros where it hurts.

2025-12-16

Unrestricted LLMs, or the rise of Cybercrime as a Service

Generative language models have taken our everyday lives by storm. Many people can no longer imagine their daily work or study without them. The development of large language models has also changed the threat landscape in cyberspace. On the one hand, LLMs can help with defense; on the other, they create entirely new opportunities for actors...

#Aktualności #Caas #Jailbreaking #Kawaiigpt #Llm #Phishing #Wormgpt

sekurak.pl/modele-llm-bez-ogra

Paul R. Pival (he/him) ppival@glammr.us
2025-12-02

#Syntax #hacking: Researchers discover sentence structure can bypass #AI safety rules

Researchers from #MIT, #Northeastern University, and #Meta recently released a paper suggesting that #LLMs similar to those that power #ChatGPT may sometimes prioritize sentence structure over meaning when answering questions. The findings reveal a weakness in how these models process instructions, which may shed light on why some prompt injection or #jailbreaking approaches work.
#llm

arstechnica.com/ai/2025/12/syn

2025-12-02

Our latest article covers:
- How the TAP technique uses tree search to find successful jailbreaks
- An example showing how corporate agents can be attacked
- How we use the TAP probe to test agents' robustness

Link to article: giskard.ai/knowledge/tree-of-a

#Jailbreaking #TAP #LLMSecurity #AIRedTeaming

Prof. Dr. Dennis-Kenji Kipker kenji@chaos.social
2025-11-28

#AI #Jailbreaking: A study shows that rephrasing harmful requests as verse works as a universal jailbreak method. Across 25 leading language models (#Sprachmodelle), poetic prompts achieved success rates of up to 100%.

Why is that? Metaphors, rhythmic structures and unusual narrative styles disrupt the pattern-recognition mechanisms of the safety filters (#Sicherheitsfilter), because requests that are in fact less than harmless are combined with harmless-looking verse:

the-decoder.de/poesie-als-sich

2025-11-20

Oooh, it's my time to leap into cybersecurity.

"Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models"

"...Abstract

We present evidence that adversarial poetry functions as a universal single-turn jailbreak technique for large language models (LLMs). Across 25 frontier proprietary and open-weight models, curated poetic prompts yielded high attack-success rates (ASR), with some providers exceeding 90%. Mapping prompts to MLCommons and EU CoP risk taxonomies shows that poetic attacks transfer across CBRN, manipulation, cyber-offence, and loss-of-control domains. Converting 1,200 MLCommons harmful prompts into verse via a standardized meta-prompt produced ASRs up to 18 times higher than their prose baselines. ..."

arxiv.org/html/2511.15304v1

#poetry #cybersecurity #LLMs #jailbreaking

2025-11-17

Weekly output: USPS NGDV, spectrum policy, Pixel Drop, Uber’s robotaxi hopes, AI + cybersecurity, Tesla Files, Cerebras, New Glenn launch and landing, Cory Doctorow

Friday afternoon’s landing at Dulles and subsequent Metro ride home put this year’s business travel in the books–unless somebody else is prepared to pay for an additional work trip in 2025, and not even I want that to happen.

11/10/2025: I Took America’s Goofiest-Looking EV for a Spin. It Might Swing by Your Mailbox Soon, PCMag

My mid-October trip to Wisconsin (with Oshkosh Corp. paying for the airfare and lodging) yielded this report about the U.S. Postal Service’s NextGen Delivery Vehicle that I have yet to see in operation near me.

11/11/2025: It’s easy to reassign spectrum if you’re not the one using it, Light Reading

I rounded out this recap of a spectrum-policy conference I’d attended at the end of October with some quotes from a broadband expert whose insight I’ve been borrowing for the last 15 years or so.

11/11/2025: Google’s Latest Pixel Drop Adds Messaging Tricks And Tweaks, PCMag

The time-zone spread between the 2 p.m. Eastern embargo time for this item and my Western European local time for most of last week made this easy to file between other Web Summit schedule commitments.

11/12/2025: Cybersecurity at the pace of AI, Web Summit

The first of three panels I moderated at this conference in Lisbon had me quizzing Rob Daly, CTO of the security-training firm SoSafe, about how AI has complicated the work of people like him. As in previous years, Web Summit’s organizers paid for my lodging and are reimbursing me for my airfare.

11/12/2025: The Tesla Files: Leaks, power, and control, Web Summit

Later that Wednesday, I interviewed Sönke Iwersen, head of investigative research at Handelsblatt, about that German newspaper’s groundbreaking reporting on Tesla’s secretive and often careless approach to safety. Iwersen brought up a part of this story that hadn’t emerged in my pre-conference banter over e-mail with him: the steep personal price the paper’s informant has paid.

11/13/2025: The startups taking on Nvidia, Web Summit

Where my onstage interview of Cerebras CEO Andrew Feldman at Web Summit Qatar in February had me running out of questions in my hand-written outline, this chat Thursday afternoon with that AI-processor startup’s chief strategy officer Andy Hock required me to cross out some queries in my notes because he answered the others at such length.

11/13/2025: Uber COO on Robotaxis: The Economics Don’t Work…Yet, PCMag

This panel happened late Tuesday afternoon, right before a crowded evening schedule; having to moderate two panels of my own on Wednesday pushed my filing time to Wednesday afternoon.

11/14/2025: Blue Origin Lands Its Giant Rocket’s Booster for the First Time, PCMag

A hold late in the countdown pushed the second launch of New Glenn to around dinnertime in Lisbon. So I didn’t get to watch any part of this mission on a screen larger than my phone’s until after I got back to my hotel–with fewer hours left than I would have liked before I had to head to Lisbon’s airport to fly back to the States.

11/14/2025: Cory Doctorow’s Plan for a Better Internet: Legalize Jailbreaking, Modding, and Tinkering, PCMag

I did not plan on my Web Summit coverage giving me a chance to put the neologism “enshittogenic” into the opening paragraphs of a story, but sometimes life comes at you fast.

 

#AIProcessors #AndroidUpdate #antiCircumventionLaws #BlueOrigin #Cerebras #CoryDoctor #Handelsblatt #jailbreaking #NewGlenn #newspace #NGDV #Nvidia #Oshkosh #PixelDrop #SoSafe #SpectrumAmericas #spectrumAuction #sprectrumReallocation #TeslaFiles #UberRobotaxi #UberWaymo #USPSElectricTruck #whistleblowing

Kevin Karhan :verified: kkarhan@infosec.space
2025-10-30

@sushimcpe WTF?

This is just #malicious behaviour and yet another reason not to use their garbage app, which literally acts like #malware, because there is no legitimate reason for doing this shite.

  • Because there are legitimate reasons to use #Jailbreaking simply because #Apple are assholes and their entire #chokehold on #iOS makes it impossible to even develop certain kinds of #Apps they don't want to see.

#Enshittification like that makes me fucking angry and should be outlawed!

2025-10-29

#Jailbreaking #LLM: Claude Haiku 3.5 as a keygen.

Jessie Nabein :neofox_peek_owo: jessienab@wetdry.world
2025-10-21

My pronouns are she/shsh blob #jailbreaking

2025-10-12

AI can be "jailbroken" with just a single word, causing it to bypass its safeguards. The diversity of language and the cost of training make any single defense ineffective. Major platforms are deploying multi-layered defenses. Constant vigilance is the key to keeping AI safe.
#AISecurity #BảoMậtAI #Jailbreaking #TấnCôngAI #AI #Vulnerability #LỗHổngBảoMật

reddit.com/r/programming/comme

Andrew | LisaGUI Updates lisagui
2025-09-18

Hey Mastodon,

I haven't posted this on here yet... check out my remaster of the old "1984" Summerboard theme... It makes your old iPhone look like a classic Mac!
yaros.ae/downloads/1984/

Gabriele gzigu
2025-09-14

Did you know that LLMs are biased based on the language of the content? That's because they are extremely good at pattern matching. It's not a fault, it's a feature! The real problem is the data they are being trained on.
brokenpipe.blog/chatgpt-expose

Kevin Karhan :verified: kkarhan@infosec.space
2025-09-04

@truhe Just wait until we all have to do #Jailbreaking because #Google hates @fdroidorg and wants to monopolize (#monopolisieren) #App distribution!

2025-08-26

Great, #Google is deepening its already heavily entrenched monopoly by curtailing third-party app stores. This means that developer independence will be severely compromised.
Of course, it's supposedly done in the name of "security," but that excuse already has a long history of being used to further authoritarianism and abuse of power. And in this instance, corporate greed.

#Jailbreaking definitely needs a comeback!

#InternetFreedom #Android

arstechnica.com/gadgets/2025/0
