Researchers at HiddenLayer have developed a novel prompt injection technique that bypasses instruction hierarchy and safety guardrails across all major AI models, posing significant risks to AI safety and requiring additional security measures.
SWE-agent is an agent that uses a language model (like GPT-4) to automatically fix GitHub issues, perform web tasks, solve cybersecurity challenges, or execute custom tasks through configurable agent-computer interfaces.
SWE-agent is an open-source tool that utilizes large language models (LLMs) like GPT-4o and Claude 3.5 Sonnet to autonomously fix bugs in GitHub repositories, solve cybersecurity challenges, and perform complex tasks. It features a mode called EnIGMA for offensive cybersecurity and prioritizes simplicity and adaptability.
Researchers from AWS and Intuit have designed a zero-trust security framework for the Model Context Protocol (MCP), addressing threats like tool poisoning and unauthorized access through multi-layered defenses including Just-in-Time access control and behavior-based monitoring.
This article examines the dual nature of Generative AI in cybersecurity, detailing how it can be exploited by cybercriminals and simultaneously used to enhance defenses. It covers the history of AI, the emergence of GenAI, potential threats, and mitigation strategies.
The article explores the concept of Large Language Model (LLM) red teaming, a practice where practitioners provide inputs to LLMs to test their boundaries and assess risks. It discusses the characteristics of LLM red teaming, including its manual, collaborative, and exploratory nature, and examines the motivations behind red teaming, the strategies employed, and how the findings contribute to model security and safety.
An article discussing ten predictions for the future of data science and artificial intelligence in 2025, covering topics such as AI agents, open-source models, safety, and governance.
The article discusses how open-source Large Language Models (LLMs) are helping security teams to better detect and mitigate evolving cyber threats.
AI Risk Database is a tool for discovering and reporting risks associated with publicly available machine learning models, providing a comprehensive overview of their known vulnerabilities.