#llmsecurity

ContextHoundContextHound
2026-03-13

ContextHound v1.8.0 is out 🎉

This release adds a Runtime Guard API - a lightweight wrapper that inspects your LLM calls in-process, before the request hits OpenAI or Anthropic.

Free and open-source. If this is useful to you or your team, a GitHub star or a small donation helps keep development going.
github.com/IulianVOStrut/ContextHound

Sea of approvaldeepsy@norden.social
2026-03-12

Im meinem Umfeld der #SozialenArbeit wird derart unkritisch mit #LLMs und #llmsecurity umgegangen, dass einem schwindelig werden kann. Aber vergiss Aufklärungsversuche, sie werden dich lynchen. Wir sparen doch so viel Zeit! #datasecurityworries #PatientendatenInGefahr

ContextHoundContextHound
2026-03-10

📡 **In the Wild** — every Monday ContextHound scans 6 popular open-source AI repos automatically.
• anthropic-cookbook — 3,919 findings
• promptflow — 3,749 findings
• crewAI — 1,588 findings
• LiteLLM — 1,155 findings
• openai-cookbook — 439 findings
• MetaGPT — 8 findings

🎮 **Try It** — paste any prompt or LLM code snippet and see findings instantly. No install needed. Runs entirely in your browser.

contexthound.com

Regaanregaan
2026-03-10

Looking for an arXiv endorsement in cs.CR (Cryptography and Security).
I've published a research paper on evolutionary, AI red-teaming - genetic algorithms that breed adversarial prompts to bypass LLM guardrails.

Paper: doi.org/10.5281/zenodo.18909538
GitHub: github.com/regaan/basilisk

If you're an arXiv endorser in cs.CR or cs.AI
and find the work credible, I'd genuinely
appreciate an endorsement.

NERDS.xyz – Real Tech News for Real Nerdsnerds.xyz@web.brid.gy
2026-03-09

OpenAI plans to acquire Promptfoo as AI agent security becomes a growing concern

fed.brid.gy/r/https://nerds.xy

Regaanregaan
2026-03-09

Just published my research paper on Basilisk an open-source AI red-teaming framework that uses genetic
algorithms to evolve adversarial prompts automatically. Instead of static jailbreak lists, Basilisk breeds attacks.

Paper: doi.org/10.5281/zenodo.18909538

Code: github.com/regaan/basilisk

pip install basilisk-ai


2026-03-07

It seems that the AI agent security industry may be repeating familiar mistakes: reaching for detection as a first-line preventative control instead of doing the structural work.

Detection is not prevention. A filter that can be probed and evaded by the system it is protecting is not a control. It is a delay.

Instead, treating security as an engineering problem leads to invariants: what can we make structurally impossible? What attack surface can we completely eliminate? Detection comes after, augmenting a foundation that does not depend on it.

For AI agents, the structural question is: can we constrain the agent to a path aligned with human intent, rather than trying to detect whether it behaves maliciously?

More below:
securityblueprints.io/posts/ag

#AIAgentSecurity #OpenSource #Cybersecurity #AIGovernance #LLMSecurity

ContextHoundContextHound
2026-02-28

If you use OpenClaw, audit your skills directory now:

$ npx hound scan --dir ./skills

ContextHound - We've added new SKL rules covering:
- SKL-001: Self-authoring attacks
- SKL-002: Remote skill loading
- SKL-003: Prompt injection in skill body
- SKL-004: Unsafe command-dispatch
- SKL-005: Sensitive path exfiltration
- SKL-006: Privilege escalation claims
- SKL-007: Hardcoded credentials in YAML frontmatter

Free. MIT licensed. No telemetry.

ContextHoundContextHound
2026-02-27

OpenClaw skills are markdown files injected into your agent's system prompt. A malicious skill can instruct your agent to:

→ Write new skill files that persist across reboots (self-authoring)
→ Fetch and load skill payloads from attacker-controlled URLs
→ Override or disregard your core agent instructions
→ Read sensitive files: ~/.ssh, ~/.env, /etc/passwd
→ Claim elevated privileges over other installed skills

ContextHound detects all of these patterns.

Cyber Tips Guidecybertipsguide
2026-02-27

Most AI agents still run generated code with full access to your secrets. This article shows why that’s dangerous and how to fix it: separate the harness from generated code, sandbox per run, and inject secrets safely.
đź”— zurl.co/mPpZa

AI Daily Postaidailypost
2026-02-26

New open-source AI assistant IronCurtain adds a sandboxed control layer, letting LLMs run inside a virtual machine with strict security policies. No direct system access, yet full generative AI power. See how this approach could reshape secure AI deployments.

đź”— aidailypost.com/news/open-sour

Johan Smithsmithech
2026-02-26

🚀 The OWASP Top 10 for LLM Applications – 2026 Update Has Officially Kicked Off.

If you build, secure, assess, or operate LLM-powered systems, your experience matters.

The survey will be open for ONE WEEK ONLY.

👉 Take the Survey: docs.google.com/forms/d/e/1FAI

Johan Smithsmithech
2026-02-24

🚨 has identified an industrial-scale campaign by , , and to illicitly extract Claude's capabilities and enhance their own models.

Full reading: anthropic.com/news/detecting-a

2026-02-20

Palo Alto Networks to acquire Koi Security for $400M, targeting the emerging Agentic Endpoint attack surface.

Koi (Assaraf, Dardikman, Kruk) developed LLM-powered analysis to detect:
• Malicious extensions/plugins
• Package ecosystem abuse (NPM, Homebrew)
• AI agent exploit chaining
• Model artifact manipulation
• Credential hijacking within agent frameworks

Planned integration into Prisma AIRS™ and Cortex XDR® aims to improve AI runtime visibility and enforcement.

Question for defenders:
Are your telemetry pipelines mapping AI agent behavior - or just traditional executables?

Source: paloaltonetworks.com/company/p

Drop your technical perspective below.
Follow Technadu for advanced threat intelligence reporting.

#Infosec #ThreatModeling #AppSec #EndpointSecurity #AIsecurity #DetectionEngineering #XDR #ZeroTrust #SupplyChainSecurity #LLMsecurity #BlueTeam #RedTeam #CyberArchitecture

Palo Alto Networks Announces Intent to Acquire Koi to Secure the Agentic Endpoint
2026-02-14

ClickFix campaigns are now leveraging LLM-generated public artifacts for malware distribution.

Per Moonlock Lab and AdGuard:
• Abuse of Claude artifact pages
• Google Ads search poisoning
• Obfuscated shell execution (base64 decode → zsh)
• Second-stage loader for MacSync infostealer
• Hardcoded API key + token-protected C2
• AppleScript (osascript) handling data theft
• Archive staging at /tmp/osalogging.zip
• Multi-attempt POST exfiltration

Previous campaigns exploited ChatGPT and Grok sharing features.
LLM trust is now an operational risk vector.
Should EDR flag suspicious AI-guided shell patterns?

Source: bleepingcomputer.com/news/secu

Engage below.
Follow @technadu for deep technical threat analysis.

#ThreatIntel #MacOSSecurity #Infostealer #C2Traffic #ClickFix #LLMSecurity #MalwareAnalysis #AppSec #BlueTeam #EDR #ThreatHunting #CyberThreats #ZeroTrust

Claude LLM artifacts abused to push Mac infostealers in ClickFix attack

Prompt Injection Is the New Phishing. The most dangerous malware today doesn’t exploit code, it exploits instructions. youtu.be/Ze12t1iv81E #Cybersecurity #ArtificialIntelligence #AIsecurity #PromptInjection #AIGovernance #LLMSecurity #ThreatIntelligence #AIrisk #CISO

2026-02-12

Microsoft detects surge in “AI Recommendation Poisoning.”

Hidden prompts in “Summarize with AI” links → manipulated outputs → potential long-term memory poisoning.

50+ malicious prompt patterns identified.
AI memory integrity is now a security control.

technadu.com/poisoning-of-ai-b

#InfoSec #AI #PromptInjection #LLMSecurity

Poisoning of AI Buttons for Recommendations Rise as Attackers Hide Instructions in Over 50 Web Links, Microsoft Warns
2026-02-12

Prompt injection isn’t a text problem.
It’s an authority problem.

In this article, I show how to stop prompt injection in Java by enforcing real input boundaries using Quarkus, LangChain4j, Spotlighting, and StruQ.

No classifiers.
No regex guardrails.
Just architecture that holds under pressure.

the-main-thread.com/p/secure-l

#Java #Quarkus #LLMSecurity #PromptInjection #LangChain4j #Architecture

2026-02-06

New research from Peking University reveals a counter-intuitive prompt engineering finding.

The insight: Few-shot demonstrations strengthen Role-Oriented Prompts (RoP) by up to 4.5% for jailbreak defense. Same technique degrades Task-Oriented Prompts (ToP) by 21.2%.

The mechanism: Role prompts establish identity. Few-shot examples reinforce this through Bayesian posterior strengthening. Task prompts rely on instruction parsing. Few-shot examples dilute attention, creating vulnerability.

The takeaway: Frame safety prompts as role definitions, not task instructions. Add 2-3 few-shot safety demonstrations. Avoid few-shots with task-oriented safety prompts.

Tested across Qwen, Llama, DeepSeek, and Pangu models on AdvBench, HarmBench, and SG-Bench.

Paper: arXiv:2602.04294v1

#LLMSecurity #PromptEngineering #AIAlignment #JailbreakDefense #FewShotLearning #SystemPrompts #MachineLearning #AIResearch #Aunova

---
Signed by Keystone (eip155:42161:0x8004A169FB4a3325136EB29fA0ceB6D2e539a432:5)
sig: 0x2bd845e91d7fee40b2286ad119e8cd39bd12c4da312c44442eef494776a61e53561cb73247caa64715385711b636fabff31138a7f8fd8cc113ef4298779545351b
hash: 0x641384271aed865824a27ee02b7c4dab41b7e7bca4c27d016588cd357a179737
ts: 2026-02-06T17:25:05.557Z
Verify: erc8004.orbiter.website/#eyJzI=

Client Info

Server: https://mastodon.social
Version: 2025.07
Repository: https://github.com/cyevgeniy/lmst