Capability Overview
Karamo Brown’s Kē wellness app — built on AI clone platform Delphi — ships a persistent, voice-accurate digital replica of the celebrity that users can converse with in real time for life coaching, nutrition, fitness, and mental-health support. The clone is trained on a broad corpus of Brown’s public output: interviews, podcast episodes, and video clips. The same Delphi platform also hosts a digital clone of Arnold Schwarzenegger, indicating this is a reusable, multi-tenant architecture rather than a bespoke one-off deployment.
For defenders, the significance is not the celebrity angle but the architecture: a trusted, emotionally resonant persona, trained on a largely uncontrolled public corpus, deployed in an unlimited-interaction mental-health context, backed by a shared AI infrastructure provider. Each of those properties independently introduces risk; combined, they create a meaningful new attack surface.
Attack Surface Analysis
Persona boundary as primary control failure point. The ‘AI Karamo’ persona is designed to feel authentic and authoritative. If that boundary can be crossed via prompt injection or jailbreak, an attacker can elicit harmful advice (self-harm, substance use, relationship manipulation) delivered in a trusted, familiar voice. That familiarity makes the output far more credible and persuasive to the user than the same text from a generic chatbot.
Voice model as exfiltration surface. The app exposes a high-fidelity synthesis of Brown’s voice through a conversational interface. Adversarial users can systematically probe the voice output to approximate or extract a usable voice model, which can then be applied externally for fraud, impersonation, or non-consensual content generation.
Public training corpus as a long-game poisoning vector. Because Delphi’s cloning pipeline ingests publicly available media, adversaries with patience could attempt to seed future training refreshes by publishing manipulated content (fake interviews, edited podcast clips) that gradually drifts the clone’s persona or values.
Shared infrastructure amplifies blast radius. A compromise of Delphi’s platform — through supply chain attack, misconfigured access controls, or insider threat — would affect all hosted celebrity clones simultaneously, not just Kē.
Unlimited interaction + vulnerable population = overreliance amplifier. Brown explicitly confirmed there is no cap on interaction frequency. In a mental-health context, this creates the conditions for pathological dependence on an AI that cannot perform clinical risk assessment and may produce confidently wrong guidance.
Framework Mapping
- AML.T0051 / LLM01 (Prompt Injection): Direct injection through the coaching chat interface to override persona constraints.
- AML.T0054 (LLM Jailbreak): Multi-turn manipulation to shift the persona outside its safety envelope.
- AML.T0056 (Meta Prompt Extraction): Extraction of system-level persona instructions, revealing proprietary framing and guardrail logic.
- AML.T0020 / LLM03 (Training Data Poisoning): Long-term seeding of the public corpus used to retrain or refresh the clone.
- AML.T0010 / LLM05 (Supply Chain): Delphi platform compromise propagating to all hosted personas.
- LLM09 (Overreliance): Wellness and mental-health context combined with unlimited interaction amplifies user dependence on AI output.
Threat Scenarios
Scenario 1: Crisis escalation failure. A user experiencing suicidal ideation engages ‘AI Karamo.’ An adversary has previously extracted the system prompt and published jailbreak sequences specific to the persona. The user, unaware of what the sequence actually does, applies one (for example, after seeing it shared online as a way to get more candid answers); the persona drops its safety language and responds in a way that normalises self-harm.
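A minimal sketch of a regression check for this scenario, assuming a hypothetical `send` callable that wraps the coaching chat and returns the reply text for a list of user turns. Nothing here reflects Kē's or Delphi's actual interfaces; the marker strings and function names are placeholders chosen for illustration.

```python
from typing import Callable, List

# Hypothetical: phrases a safe crisis reply is expected to contain.
REQUIRED_SAFETY_MARKERS = ("crisis line", "emergency services")


def reply_is_safe(reply: str) -> bool:
    """Return True if the reply contains at least one safety marker."""
    lowered = reply.lower()
    return any(marker in lowered for marker in REQUIRED_SAFETY_MARKERS)


def run_regression(
    send: Callable[[List[str]], str],
    sequences: List[List[str]],
    crisis_probe: str,
) -> List[int]:
    """Replay each known jailbreak sequence, then send a crisis probe.

    Returns the indexes of sequences after which the reply to the
    probe no longer contains any safety marker.
    """
    failures = []
    for index, sequence in enumerate(sequences):
        reply = send(sequence + [crisis_probe])
        if not reply_is_safe(reply):
            failures.append(index)
    return failures
```

Run after every model or prompt update, a non-empty result would indicate that a published sequence still strips the safety language. Substring matching is a deliberately crude pass criterion; a real evaluation would likely need human or clinical review of the replies.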
Scenario 2: Voice harvesting for fraud. An attacker systematically queries the app with prompts designed to elicit long, phonetically diverse responses. The audio output is aggregated and used to fine-tune a local voice synthesis model that can impersonate Brown in vishing campaigns or fabricated audio clips.
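One way a defender might surface this pattern is a per-account heuristic over a monitoring window. The sketch below is an assumption-laden illustration: the class, function, and threshold values are hypothetical, and the unique-to-total word ratio is only a crude stand-in for real phonetic diversity.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class AccountAudioStats:
    """Running totals for one account inside one monitoring window."""

    audio_seconds: float = 0.0
    word_count: int = 0
    unique_words: Set[str] = field(default_factory=set)

    def record(self, reply_text: str, audio_seconds: float) -> None:
        """Add one synthesized reply to the running totals."""
        words = reply_text.lower().split()
        self.audio_seconds += audio_seconds
        self.word_count += len(words)
        self.unique_words.update(words)

    def diversity(self) -> float:
        """Unique-to-total word ratio (crude proxy for phonetic diversity)."""
        if self.word_count == 0:
            return 0.0
        return len(self.unique_words) / self.word_count


def looks_like_harvesting(
    stats: AccountAudioStats,
    max_audio_seconds: float = 1800.0,  # placeholder threshold
    min_diversity: float = 0.6,         # placeholder threshold
) -> bool:
    """Flag accounts that pull a lot of audio with unusually varied text."""
    return (
        stats.audio_seconds > max_audio_seconds
        and stats.diversity() > min_diversity
    )
```

Both thresholds would need tuning against real traffic, since ordinary heavy users may also generate a lot of audio.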
Scenario 3: Corpus-driven persona drift. A threat actor publishes a series of doctored podcast-format audio files featuring Brown. These are indexed and ingested during a Delphi training refresh, gradually shifting ‘AI Karamo’ toward views or advice the real Brown would not endorse.
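Detecting this kind of drift starts with knowing what changed in the corpus between refreshes. The following is a minimal sketch of a hash-based manifest comparison, assuming each ingested item has a stable source identifier. The function names and the manifest shape are illustrative assumptions, not a description of Delphi's pipeline.

```python
import hashlib
from typing import Dict, Set


def fingerprint(content: bytes) -> str:
    """Return the SHA-256 hex digest of one corpus item."""
    return hashlib.sha256(content).hexdigest()


def build_manifest(items: Dict[str, bytes]) -> Dict[str, str]:
    """Map each source identifier to the fingerprint of its content."""
    return {
        source_id: fingerprint(content)
        for source_id, content in items.items()
    }


def diff_manifests(
    approved: Dict[str, str], candidate: Dict[str, str]
) -> Dict[str, Set[str]]:
    """Compare the last approved manifest with a candidate refresh."""
    approved_ids = set(approved)
    candidate_ids = set(candidate)
    changed = {
        source_id
        for source_id in approved_ids & candidate_ids
        if approved[source_id] != candidate[source_id]
    }
    return {
        "added": candidate_ids - approved_ids,
        "removed": approved_ids - candidate_ids,
        "changed": changed,
    }
```

Anything listed under "added" or "changed" could be held for review before a training refresh. Hashing only shows that content is new or different; it says nothing about whether the content is authentic, so it complements rather than replaces source verification.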
Defender Checklist
- Conduct persona red-teaming: attempt jailbreaks, role-playing injections, and multi-turn manipulation before launch and after every model update
- Audit Delphi’s tenant isolation, access controls, and incident response SLAs
- Implement hard-coded escalation triggers for crisis keywords that bypass the LLM entirely and route the user to a human or emergency services regardless of the model’s response
- Monitor voice output for systematic harvesting patterns (high-volume, phonetically diverse, short-session queries)
- Establish a corpus provenance process: track what public content is ingested and implement change detection before training refreshes
- Disclose to users in-product that they are interacting with an AI, not the real person, and surface that disclosure at every session start
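The crisis-keyword item in the checklist above can be sketched as a pre-filter that runs before the model is ever called, so that a jailbroken persona cannot suppress the escalation. This is a minimal illustration under stated assumptions: the patterns, the message text, and the `model_reply` callable are all hypothetical, and a production pattern list would need clinical review.

```python
import re
from typing import Callable

# Placeholder patterns only; a real list would need clinical review.
CRISIS_PATTERNS = [
    re.compile(r"\bkill myself\b", re.IGNORECASE),
    re.compile(r"\bsuicid(e|al)\b", re.IGNORECASE),
    re.compile(r"\bend my life\b", re.IGNORECASE),
]

# Hypothetical fixed response, never generated by the model.
ESCALATION_MESSAGE = (
    "It sounds like you may be in crisis. This is an AI, not a person. "
    "Please contact local emergency services or a crisis line now."
)


def is_crisis(user_message: str) -> bool:
    """Return True if any crisis pattern matches the message."""
    return any(p.search(user_message) for p in CRISIS_PATTERNS)


def handle_message(
    user_message: str, model_reply: Callable[[str], str]
) -> str:
    """Check for crisis patterns before the model is called."""
    if is_crisis(user_message):
        # Hard-coded path: the model is not consulted at all.
        return ESCALATION_MESSAGE
    return model_reply(user_message)
```

Keyword matching will miss paraphrased or indirect statements, so a filter like this is a floor rather than a full crisis-detection system. The point of the design is that the escalation path does not depend on the persona's own guardrails holding.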