
I’ve just pushed an update to my Search Engine ADs Scanner (seads)! Feel free to try it out here:

www.github.com/andpalmier/seads

Feedback is always appreciated! :)

END OF THE THREAD!

Check out the original blog post here:
https://andpalmier.com/posts/jailbreaking-llms

If that made you curious about AI Hacking, be sure to check out these CTF challenges by @dreadnode at https://crucible.dreadnode.io

🤖 LLMs vs LLMs

It shouldn't really come as a big surprise that some methods for attacking LLMs use LLMs themselves.

Here are two examples:
- PAIR: an approach using an attacker LLM
- IRIS: inducing an LLM to self-jailbreak

⬇️

📝 Prompt rewriting: adding a layer of linguistic complexity!

This class of attacks uses encryption, translation, ASCII art and even word puzzles to bypass the LLMs' safety checks.
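To make "linguistic complexity" concrete, here is a minimal Python sketch of one harmless rewriting: pig latin, which the blog post also mentions. The function names are mine and the rules are the simplified textbook ones (no punctuation or capitalization handling); it only shows the kind of transformation involved, not any specific attack from the papers.

```python
VOWELS = "aeiou"


def to_pig_latin(word: str) -> str:
    """Rewrite a single word using simplified pig latin rules."""
    lower = word.lower()
    # Leave empty strings, numbers and anything with punctuation untouched.
    if not lower.isalpha():
        return word
    # Words starting with a vowel just get "way" appended.
    if lower[0] in VOWELS:
        return lower + "way"
    # Otherwise move the leading consonant cluster to the end and add "ay".
    for i, ch in enumerate(lower):
        if ch in VOWELS:
            return lower[i:] + lower[:i] + "ay"
    # No vowel at all: just append "ay".
    return lower + "ay"


def sentence_to_pig_latin(sentence: str) -> str:
    """Apply the word-level rewrite to every whitespace-separated token."""
    return " ".join(to_pig_latin(token) for token in sentence.split())


if __name__ == "__main__":
    print(sentence_to_pig_latin("hello apple string"))
```

The rewritten text stays readable to anyone who knows the rule, which is the point: the meaning survives while the surface form changes.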

⬇️

💉 Prompt injection: embed malicious instructions in the prompt.

According to OWASP, prompt injection is the most critical security risk for LLM applications.

They break this class of attacks down into two categories: direct and indirect. Here is a summary of indirect attacks:

⬇️

😈 Role-playing: attackers ask the LLM to act as a specific persona or as part of a scenario.

A common example is the (in?)famous DAN (Do Anything Now):

These attacks are probably the most common in the real world, as they often don't require a lot of sophistication.

⬇️

We interact with (and therefore attack) LLMs mainly through language, so let's start there.

I used this dataset of jailbreak #prompts https://github.com/verazuo/jailbreak_llms to create this word cloud.

I believe it gives a sense of "what works" in these attacks!
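For the curious, a word cloud is basically a word-frequency count with a nicer rendering. Here is a minimal Python sketch of the counting step. It is not the exact script I used: the stopword list, the `word_frequencies` name and the sample strings are placeholders, and loading the actual dataset is left out.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "and", "you", "are", "for", "that", "this", "with"}


def word_frequencies(prompts, top_n=10):
    """Return the top_n most frequent words across all prompts."""
    counts = Counter()
    for prompt in prompts:
        # Lowercase and keep only alphabetic tokens (apostrophes allowed).
        words = re.findall(r"[a-z']+", prompt.lower())
        # Drop stopwords and very short tokens such as "a", "in", "as".
        counts.update(w for w in words if w not in STOPWORDS and len(w) > 2)
    return counts.most_common(top_n)


if __name__ == "__main__":
    # Placeholder strings, NOT taken from the dataset.
    sample = [
        "You are a helpful character in a story",
        "Act as a character and stay in character",
    ]
    for word, count in word_frequencies(sample, top_n=5):
        print(f"{word}: {count}")
```

The resulting (word, count) pairs are what a word cloud renderer turns into font sizes.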

⬇️

Before we dive in: I’m *not* an AI expert! I did my best to understand the details and summarize the techniques, but I’m human. If I’ve gotten anything wrong, just let me know! :)

⬇️

🆕 New blog: "The subtle art of #jailbreaking LLMs"

It contains "swiss cheese", "pig latin" and "ascii art"!

https://andpalmier.com/posts/jailbreaking-llms

It's a summary of some interesting techniques researchers have used (and still use) to attack #LLMs

Let's see some examples here🧵⬇️

Just released a new version of seads (Search Engine ADs Scanner), with 2 major new features:

📱 custom user agent string for clicking on ads
⛓️ track URLs through redirects to detect and log chains

GitHub repo: https://github.com/andpalmier/seads
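To give an idea of what "tracking URLs through redirects" means, here is a small Python sketch of the general technique. It is not how seads implements it (seads is a separate tool with its own code): `follow_redirects` and the `next_location` callback are hypothetical names, and the lookup of the next hop is injected so the logic can be tried without touching the network.

```python
from typing import Callable, List, Optional
from urllib.parse import urljoin


def follow_redirects(
    start_url: str,
    next_location: Callable[[str], Optional[str]],
    max_hops: int = 10,
) -> List[str]:
    """Return the chain of URLs visited, starting from start_url.

    next_location(url) should return the redirect target for url
    (for example the value of a Location header), or None if the
    URL does not redirect.
    """
    chain = [start_url]
    seen = {start_url}
    current = start_url
    for _ in range(max_hops):
        location = next_location(current)
        if location is None:
            break  # final destination reached
        # Redirect targets may be relative, so resolve against the current URL.
        current = urljoin(current, location)
        chain.append(current)
        if current in seen:
            break  # redirect loop: log the repeated URL and stop
        seen.add(current)
    return chain


if __name__ == "__main__":
    # Fake redirect table standing in for real HTTP responses.
    hops = {
        "https://ad.example/click": "https://tracker.example/r?id=1",
        "https://tracker.example/r?id=1": "/landing",
    }
    for url in follow_redirects("https://ad.example/click", hops.get):
        print(url)
```

Logging the whole chain, rather than only the final URL, is what makes intermediate trackers and cloaking hops visible.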

🚀 Introducing seads: Search Engine ADs Scanner

🕵️‍♂️Automatically detect ads on search engines to identify potential #phishing or #malware threats.

blog: http://andpalmier.com/posts/seads

GitHub: http://github.com/andpalmier/seads

Features:
📧 Automate reporting via email, Slack or Telegram
🔄 Concurrent search
📸 Capture ad evidence with screenshot support
🐳 Seamlessly deploy with Docker

If you have questions or comments, feel free to reach out! ✌️

🚀 Just released a new version of apkingo, an APK analysis tool written in Go
🔍 It can extract package details, permissions, certificate data, and more. It can also pull information from the Play Store, Koodous, and VirusTotal.

https://github.com/andpalmier/apkingo

🌟 New features include:

- Run in a docker container
- Export analysis results in JSON format
- Extract Android certificate details like country, organization, and more!
- Enhanced integration with VT & Koodous

Andrea Palmieri🤌 boosted:

Fortinet SSL VPN pre-auth RCE, exploitation in wild. Patch now. CVE-2024-21762

https://fortiguard.fortinet.com/psirt/FG-IR-24-015

I understand this is very easy to exploit, and applies to unsupported versions too.

#threatintel

Kaspersky report on the state of #stalkerware in 2022

https://securelist.com/the-state-of-stalkerware-in-2022/108985/

#android #androidsecurity

Andrea Palmieri🤌 boosted:

Elon Musk ordered major changes to the Twitter ranking algorithm this weekend after ... President Biden's tweet about the Eagles got higher engagement than his did.

Inside the secret system that's showing you all his tweets first, from @zoeschiffer and me. https://www.platformer.news/p/yes-elon-musk-created-a-special-system

At 2:36 on Monday morning, James Musk sent an urgent message to Twitter engineers.

“We are debugging an issue with engagement across the platform,” wrote Musk, a cousin of the Twitter CEO, tagging “@here” in Slack to ensure that anyone online would see it. “Any people who can make dashboards and write software please can you help solve this problem. This is high urgency. If you are willing to help out please thumbs up this post.”

When bleary-eyed engineers began to log on to their laptops, the nature of the emergency became clear: Elon Musk’s tweet about the Super Bowl got less engagement than President Joe Biden’s.

Analysis of #Hook and comparison with #ERMAC

https://cebrf.knf.gov.pl/images/HOOKBOT_CSIRT_KNF_ENG.pdf

#Android #mobilesecurity

ThreatFabric has identified a new #Android Banker malware variant, named #Hook.

Hook seems to be a fork of ERMAC, with additional features such as RAT capabilities.

https://www.threatfabric.com/blogs/hook-a-new-ermac-fork-with-rat-capabilities.html

🆕 Just published a new blog post on an #Android #stalkerware analysis:

https://andpalmier.com/posts/stalkerware-analysis/

Andrea Palmieri🤌 boosted:

Security/cryptography analysis of Threema end-to-end instant messenger. Interesting insight for anybody designing modern security infrastructure. "Using modern, secure libraries for cryptographic primitives does not on its own lead to a secure protocol design" https://breakingthe3ma.app/files/Threema-PST22.pdf