SemanticScuttle - klotz.me » Tags: prompt injection

Tags: prompt injection*

0 bookmark(s) - Sort by: Date ↓ / Title /

Indirect prompt injection is taking hold in the wild

Researchers from Google and Forcepoint have identified a rise in indirect prompt injection (IPI) attacks, where malicious instructions are hidden within web pages to manipulate LLM-powered AI agents. While some injections are harmless pranks or tone adjustments, others aim for serious harm including traffic hijacking, data exfiltration, denial of service, and financial fraud through unauthorized payment processing. Attackers use techniques like invisible text, HTML comments, and metadata manipulation to hide these payloads from humans while remaining visible to AI.
Key points:
* Real-world evidence of IPI attacks found in massive web crawls and active threat hunting.
* Malicious intents include search engine manipulation, data theft (API keys), and destructive commands.
* Financial fraud attempts have been observed using embedded PayPal transactions and Stripe donation routing.
* Attackers hide instructions via single-pixel text, near-transparent colors, or metadata injection.
* The risk level scales with AI privilege; agentic AIs capable of executing commands or payments are high-impact targets.

2026-04-25 Tags: agents, cybersecurity, llm, prompt injection, google by klotz

"I ran Nvidia’s NemoClaw to see if OpenClaw is finally safe, but it still has the same problems"

This article details a hands-on experience with Nvidia's NemoClaw, a security-focused stack designed to enhance the safety of the OpenClaw AI platform. While NemoClaw introduces improvements like a sandbox model and aggressive policy filtering, the author finds it still falls short of being a reliable solution.
Bugs, limitations, and the inherent risks associated with OpenClaw's architecture—particularly its connection to external services—persist. The core issue remains that NemoClaw can secure the agent but cannot protect against malicious instructions embedded in external data sources like emails or messages.
The author concludes that while NemoClaw is a step forward, it doesn't fully address the fundamental security concerns surrounding OpenClaw.

2026-04-04 Tags: nvidia nemoclaw, openclaw, ai security, local ai, llm, sandbox, ai assistants, prompt injection, cybersecurity by klotz

After all the hype, some AI experts don’t think OpenClaw is all that exciting

Despite initial excitement and a viral moment, some AI experts are questioning the usability of OpenClaw due to inherent cybersecurity flaws. The article details the vulnerabilities discovered in Moltbook, a social network built on OpenClaw, and explores whether the technology's access and productivity benefits outweigh its security risks.

2026-02-16 Tags: techcrunch, openclaw, moltbook, cybersecurity, agents, prompt injection by klotz

Design Patterns for Securing LLM Agents against Prompt Injections

This article discusses a new paper outlining design patterns for mitigating prompt injection attacks in LLM agents. It details six patterns – Action-Selector, Plan-Then-Execute, LLM Map-Reduce, Dual LLM, Code-Then-Execute, and Context-Minimization – and emphasizes the need for trade-offs between agent utility and security by limiting the ability of agents to perform arbitrary tasks.

2025-06-13 Tags: cybersecurity, prompt injection, llm, simon willison by klotz

Novel Universal Bypass for All Major LLMs

Researchers at HiddenLayer have developed a novel prompt injection technique that bypasses instruction hierarchy and safety guardrails across all major AI models, posing significant risks to AI safety and requiring additional security measures.

2025-04-24 Tags: cybersecurity, prompt injection, llm, cbrn, hiddenlayer by klotz

Prompt Injection Detection and Mitigation via AI Multi-Agent NLP Frameworks

This paper introduces a multi-agent NLP framework to address prompt injection vulnerabilities in generative AI systems. The framework utilizes specialized agents for generating responses, sanitizing outputs, and enforcing policy compliance, evaluated using novel metrics like Injection Success Rate (ISR), Policy Override Frequency (POF), Prompt Sanitization Rate (PSR), and Compliance Consistency Score (CCS). The system employs OVON for inter-agent communication.

2025-03-17 Tags: conversational ai, prompt injection, agents, nlp, explainability, deborah a. dahl, ovon by klotz

LLMs’ Data-Control Path Insecurity

An analysis of Large Language Models' (LLMs) vulnerability to prompt injection attacks and potential risks when used in adversarial situations, like on the Internet. The author notes that, similar to the old phone system, LLMs are vulnerable to prompt injection attacks and other security risks due to the intertwining of data and control paths.

2024-06-18 Tags: llm, prompt injection, security, bruce schneier, acm by klotz

GitHub Copilot Chat: From Prompt Injection to Data Exfiltration

This post highlights how the GitHub Copilot Chat VS Code Extension was vulnerable to data exfiltration via prompt injection when analyzing untrusted source code.

2024-06-16 Tags: github, copilot, chat, prompt injection, llm, security, wunderwuzzi by klotz

Accidental prompt injection against RAG applications

Simon Willison explains an accidental prompt injection attack on RAG applications, caused by concatenating user questions with documentation fragments in a Retrieval Augmented Generation (RAG) system.

2024-06-06 Tags: llm, prompt injection, rag, simon willison by klotz

First / Previous / Next / Last / Page 1 of 0

SemanticScuttle - klotz.me

Tags: prompt injection*

Linked Tags

Related Tags