LIVE THREATS
CRITICAL Bleeding Llama Flaw Exposes 300,000 Ollama Servers to Unauthenticated Data Theft // MEDIUM CrowdStrike Researcher Details AI Jailbreaking and Data Poisoning Techniques // HIGH Mass Scan Reveals Widespread Authentication Failures Across Exposed AI Infrastructure // HIGH Backdoored PyTorch Lightning Package Steals Cloud Credentials from AI Developers // HIGH Pentagon Deploys Classified AI Across Seven Tech Giants for Warfighter Systems // MEDIUM Cross-Machine AI Agent Relay Tool Expands Attack Surface for Developer Environments // HIGH Desktop Automation CLI Grants AI Agents Deep OS-Level Control // HIGH Frontier LLMs Now Autonomously Breach Corporate Networks in AISI Cyber Tests // HIGH Premature AI Agent Deployments Expose Production Systems to Destructive Actions // HIGH Anthropic Launches Claude Security to Close AI-Accelerated Exploit Window //
ATLAS OWASP MEDIUM Moderate risk · Monitor closely RELEVANCE ▲ 6.5

CrowdStrike Researcher Details AI Jailbreaking and Data Poisoning Techniques

TL;DR MEDIUM
  • What happened: CrowdStrike researcher details practical AI jailbreaking and data poisoning methods used in red team engagements.
  • Who's at risk: Organisations deploying LLMs with safety guardrails are most exposed, as these techniques specifically target guardrail evasion without modifying model weights.
  • Act now: Conduct regular AI-specific red team exercises targeting guardrail bypass and prompt injection vectors · Implement data provenance controls and integrity checks to detect training data poisoning attempts · Adopt adversarial testing frameworks (e.g., MITRE ATLAS, OWASP LLM Top 10) as part of the ML development lifecycle
CrowdStrike Researcher Details AI Jailbreaking and Data Poisoning Techniques

Overview

A profile published by SecurityWeek features Joey Melo, Principal Security Researcher at CrowdStrike, detailing his approach to AI red teaming. Melo specialises in manipulating AI systems — particularly LLMs — through jailbreaking and data poisoning, without modifying the underlying source code. His background spans traditional penetration testing at Bulletproof and Packetlabs before transitioning into AI security via Pangea (acquired by CrowdStrike in 2025). The article is notable for illustrating how classical adversarial hacker philosophy is being systematically applied to machine learning systems as that sector matures.

Technical Analysis

Melo’s core methodology centres on controlling the AI experience rather than rewriting its rules — a distinction that maps directly to the most prevalent LLM attack classes:

  • Jailbreaking: Crafting inputs that manipulate an LLM into bypassing its own safety guardrails and content policies, without any access to model weights or training pipelines. This exploits the tension between instruction-following and safety fine-tuning.
  • Data Poisoning: Introducing malicious or misleading data into training or fine-tuning pipelines to alter model behaviour at inference time. This is a stealthier attack surface, as effects may not surface until deployment.

Melo’s entry into AI hacking was sharpened via a competitive environment — Pangea’s AI hacking competition in March 2025 — which provided structured adversarial scenarios mirroring real-world deployment conditions. Competitive red team environments of this nature are increasingly recognised as accelerators for identifying novel attack vectors before threat actors do.

Framework Mapping

TechniqueFramework Reference
LLM JailbreakAML.T0054 / LLM01
Prompt InjectionAML.T0051 / LLM01
Training Data PoisoningAML.T0020 / LLM03
Adversarial Input CraftingAML.T0043
Guardrail EvasionAML.T0015

The techniques described align squarely with MITRE ATLAS’s LLM-specific attack taxonomy and OWASP’s LLM Top 10, particularly Prompt Injection (LLM01) and Training Data Poisoning (LLM03).

Impact Assessment

While the article is a researcher profile rather than a disclosure of a specific vulnerability, the techniques discussed have broad applicability to any organisation operating LLM-based products. Guardrail bypass affects consumer-facing AI chatbots, enterprise copilots, and agentic systems alike. Data poisoning is particularly concerning for organisations using fine-tuned or retrieval-augmented models where training data provenance is poorly controlled. The professionalisation of AI red teaming — exemplified by Melo’s career trajectory — signals that defensive teams need equivalent specialisation to keep pace.

Mitigation & Recommendations

  • Red team AI systems proactively: Engage specialists with dedicated LLM adversarial testing skills, not just traditional pentesters redeployed to AI contexts.
  • Implement guardrail monitoring: Log and alert on prompt patterns consistent with jailbreak attempts; treat these as security events, not just policy violations.
  • Harden training pipelines: Apply data validation, integrity checks, and provenance tracking to all data entering fine-tuning or RAG pipelines.
  • Adopt structured frameworks: Use MITRE ATLAS and OWASP LLM Top 10 as baseline threat models during AI system design and review cycles.
  • Participate in adversarial AI competitions: Structured competitive environments surface novel attack paths faster than internal testing alone.

References

◉ AI THREAT BRIEFING

Stay ahead of the threat.

Twice-weekly digest of critical AI security developments — every story mapped to MITRE ATLAS and OWASP LLM Top 10. Free.

No spam. Unsubscribe anytime.