Prompt Injection Flaws in Salesforce and Microsoft AI

Overview

Microsoft and Salesforce have patched prompt injection vulnerabilities in their respective AI agent platforms — Microsoft Copilot and Salesforce Agentforce — that could have allowed external attackers to leak sensitive data from affected organisations. The flaws, now remediated, are emblematic of a broader security challenge facing enterprise AI deployments: agentic systems that act on behalf of users can become vectors for data exfiltration when they fail to adequately validate or sanitise external input.

Both vulnerabilities were disclosed responsibly and patches have been issued, but the incident serves as a significant warning for organisations relying on AI agents to handle confidential business data.

Technical Analysis

Prompt injection attacks targeting AI agents exploit the agent’s inability to distinguish between trusted system instructions and malicious content introduced through external data sources — a technique sometimes referred to as indirect prompt injection. In the context of Agentforce and Copilot, an attacker could craft malicious content (e.g., embedded in an email, document, or web page) that the AI agent processes during normal operation. Once ingested, the injected instructions redirect the agent to perform unintended actions, such as summarising and transmitting private user data to an attacker-controlled endpoint.

The attack chain typically follows this pattern:

Injection vector: Malicious instructions are embedded in content the AI agent is expected to read (e.g., a shared document or an inbound email).
Agent execution: The agent processes the content without distinguishing between data and instructions.
Exfiltration: The hijacked agent leaks sensitive information — such as emails, CRM records, or internal documents — via a subsequent action (e.g., sending a follow-up message or making an API call).

This class of vulnerability is particularly dangerous in agentic architectures because agents are often granted broad permissions to read, write, and communicate on behalf of users.

Framework Mapping

AML.T0051 (LLM Prompt Injection): The core attack mechanism — injecting adversarial instructions through untrusted external content.
AML.T0057 (LLM Data Leakage): The primary impact — sensitive organisational data exposed to unauthorised parties.
AML.T0056 (LLM Meta Prompt Extraction): Potential secondary risk where system prompt context is also exposed.
LLM01 (Prompt Injection) and LLM06 (Sensitive Information Disclosure): The most directly applicable OWASP LLM Top 10 categories.
LLM08 (Excessive Agency): Agents operating with overly broad permissions amplify the severity of any successful injection.

Impact Assessment

Enterprise users of Microsoft Copilot and Salesforce Agentforce are the primary affected population, potentially spanning thousands of organisations across financial services, healthcare, and technology sectors. The severity is elevated by the privileged access these agents typically hold — CRM data, emails, internal documents, and customer records could all be at risk. Unpatched instances would have been exploitable by any external attacker capable of delivering malicious content into the agent’s processing pipeline.

Mitigation & Recommendations

Apply patches immediately: Both Microsoft and Salesforce have issued fixes — ensure all instances are updated.
Enforce least-privilege agent permissions: Restrict AI agent access to only the data and actions necessary for their defined role.
Implement input/output guardrails: Deploy content filtering on both inputs to and outputs from AI agents.
Monitor agent activity logs: Establish anomaly detection for unexpected data access or exfiltration patterns by AI agents.
Security-test AI integrations: Include prompt injection scenarios in penetration testing and red team exercises for all agentic AI deployments.

References

Microsoft, Salesforce Patch AI Agent Data Leak Flaws — Dark Reading