First Look: AWS Agent-EvalKit Embeds LLM Judges Into Dev Pipelines, Expanding Adversarial Test Surface
Agent-EvalKit introduces an open-source evaluation pipeline that integrates LLM-as-judge evaluators and AI coding assistants directly into agent development workflows, creating new attack surfaces …
AML.T0051 - LLM Prompt Injection
AML.T0057 - LLM Data Leakage
AML.T0010 - ML Supply Chain Compromise