Overview
Unit 42 has published research introducing a novel software supply chain attack vector they term phantom squatting — the deliberate registration of domains that large language models (LLMs) consistently fabricate when answering developer queries. As AI coding assistants become embedded in software development workflows, they routinely generate fictitious but plausible-looking URLs for webhooks, API endpoints, and documentation resources. Adversaries are now systematically identifying and pre-registering these hallucinated domains to intercept traffic originating from AI-driven systems.
The research is notable for its scale and predictive validation: Unit 42 analysed 913 global brands, executed 685,339 URL queries across two LLMs, and identified over 13,229 confirmed malicious URLs already in active use. Critically, approximately 250,000 hallucinated domains remain unregistered, leaving a standing inventory of exploitable targets.
Technical Analysis
The attack lifecycle proceeds in three stages:
- Hallucination harvesting: Researchers (and adversaries) query LLMs with developer-oriented prompts to surface URLs the model confidently fabricates. These domains follow realistic naming conventions, making them difficult to flag by syntax alone.
- Preemptive registration: Attackers register high-probability hallucinated domains ahead of organic developer traffic. Unit 42’s pipeline demonstrated it could predict adversarial registration 18–51 days in advance.
- Traffic interception: Once registered, these domains capture requests from AI agents, CI/CD systems, and developers who trusted the LLM’s output without independent verification. Intercepted traffic may include API secrets, build telemetry, or authentication tokens.
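The harvesting stage can be illustrated with a minimal sketch. This is not Unit 42's pipeline. It assumes the LLM responses have already been collected as plain strings, and the `KNOWN_GOOD` set, the function names, and the sample responses are all hypothetical. It counts how many responses mention each hostname, on the assumption that hosts fabricated consistently across responses are the likeliest registration targets.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical set of domains already known to be legitimate.
KNOWN_GOOD = {"github.com", "slack.com"}

# Match http(s) URLs up to whitespace, quotes, or bracket characters.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>()\[\]]+")


def extract_hosts(text):
    """Return the lowercase hostnames of all http(s) URLs found in text."""
    hosts = []
    for match in URL_PATTERN.findall(text):
        # Drop sentence punctuation that the pattern may have swallowed.
        host = urlsplit(match.rstrip(".,;:")).hostname
        if host:
            hosts.append(host.lower())
    return hosts


def rank_candidates(responses, known_good=KNOWN_GOOD):
    """Count how many responses mention each host, skipping known-good domains."""
    counts = Counter()
    for response in responses:
        # Use a set so one response counts each host at most once.
        for host in set(extract_hosts(response)):
            if not any(host == d or host.endswith("." + d) for d in known_good):
                counts[host] += 1
    return counts.most_common()


# Hypothetical LLM responses, for illustration only.
responses = [
    "Post events to https://api.build-notifier.io/v1/pipeline/events.",
    "Use https://api.build-notifier.io/v1/hooks or https://hooks.slack.com/services/X",
    "See https://docs.example-ci.dev/start for details",
]
ranked = rank_candidates(responses)
```

A frequency ranking like this only surfaces candidates. Deciding whether a host is actually fabricated still needs a separate existence or registration check.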
The Montana Empire phishing kit case study illustrates the full attack chain: a threat actor used an AI coding assistant to construct the kit, then deployed it against a domain Unit 42 had already flagged as a high-risk hallucination target 23 days earlier. This confirms adversaries are actively using AI tools to accelerate exploitation of this same vector.
```text
# Example hallucinated endpoint a CI/CD assistant might produce:
https://api.build-notifier.io/v1/pipeline/events
# Domain did not exist at time of generation; now potentially attacker-owned
```
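One rough way to triage endpoints like the one above is to check whether the hostname resolves at all. The sketch below is an assumption-laden illustration, not a method described in the research: the function names are hypothetical, and the resolver is passed in as a parameter so the logic can be exercised without network access. DNS resolution is only a weak signal. A registered domain may have no address records, and a domain that does resolve may already be attacker-owned.

```python
import socket
from urllib.parse import urlsplit


def host_resolves(host, resolver=socket.getaddrinfo):
    """Return True if the resolver finds at least one address for host."""
    try:
        return len(resolver(host, 443)) > 0
    except (socket.gaierror, UnicodeError):
        # gaierror: lookup failed; UnicodeError: host is not a valid DNS name.
        return False


def flag_unresolved(urls, resolver=socket.getaddrinfo):
    """Return the URLs whose hostname is missing or does not resolve."""
    flagged = []
    for url in urls:
        host = urlsplit(url).hostname
        if host is None or not host_resolves(host, resolver):
            flagged.append(url)
    return flagged


def fake_resolver(host, port):
    """Stand-in resolver for demonstration: only github.com 'exists'."""
    if host == "github.com":
        return [("placeholder-address-record",)]
    raise socket.gaierror("name not found")


flagged = flag_unresolved(
    [
        "https://github.com/org/repo",
        "https://api.build-notifier.io/v1/pipeline/events",
    ],
    resolver=fake_resolver,
)
```

Flagged URLs are candidates for manual review rather than confirmed hallucinations.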
Framework Mapping
- AML.T0010 (ML Supply Chain Compromise): The vector directly targets the software supply chain via AI-generated artefacts.
- AML.T0047 (ML-Enabled Product or Service): AI assistants act as the unwitting delivery mechanism for malicious infrastructure references.
- LLM02 (Insecure Output Handling): Downstream systems consume LLM-generated URLs without validation.
- LLM08 (Excessive Agency) and LLM09 (Overreliance): Agentic systems autonomously execute HTTP requests against LLM-generated URLs, and developers trust model output without verification.
Impact Assessment
The risk is highest for organisations using AI coding assistants within automated pipelines. Intercepted requests may leak secrets, enable man-in-the-middle attacks on build processes, or deliver malicious payloads to developer environments. The 250,000 unregistered hallucinated domains represent a persistent, scalable opportunity for adversaries, and the attack surface is likely to keep growing as LLM adoption increases.
Mitigation & Recommendations
- Validate all AI-generated URLs against known-good registries before use in code or configurations.
- Implement outbound allowlists in CI/CD environments; deny requests to uncategorised or newly registered domains.
- Monitor brand-adjacent domain registrations using automated threat intelligence feeds tuned to hallucination-pattern naming conventions.
- Apply DNS security controls (e.g., Advanced DNS Security) that can flag requests to newly registered or low-reputation domains.
- Educate developers on the hallucination risk inherent in AI coding assistants, particularly for external endpoint recommendations.
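The first two recommendations can be sketched as a pre-flight check that runs before a pipeline or agent calls an AI-suggested URL. This is a minimal illustration under stated assumptions: the `ALLOWED_DOMAINS` set, the function names, and the 30-day age threshold are hypothetical, and the registration date is assumed to come from an external source such as a WHOIS or RDAP lookup that is not shown here.

```python
from datetime import date
from urllib.parse import urlsplit

# Hypothetical allowlist; a real one would come from reviewed configuration.
ALLOWED_DOMAINS = {"github.com", "slack.com"}


def is_allowlisted(url, allowed=ALLOWED_DOMAINS):
    """True if url is HTTPS and its host is an allowed domain or a subdomain of one."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if parts.scheme != "https" or not host:
        return False
    return any(host == d or host.endswith("." + d) for d in allowed)


def is_newly_registered(created, today, min_age_days=30):
    """True if the domain was registered fewer than min_age_days before today."""
    return (today - created).days < min_age_days


# An AI-suggested endpoint that is not on the allowlist is rejected.
suggested = "https://api.build-notifier.io/v1/pipeline/events"
allowed = is_allowlisted(suggested)
```

Matching on the exact domain or a dot-prefixed suffix avoids accepting look-alike hosts such as `github.com.evil.example`, which a plain substring check would let through.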