redteam-swarm: Autonomous Multi-Expert Red-Teaming of Agentic LLM Systems
LoRA specialists, PAIR search, and GRPO self-play against a seven-agent Claude target
Six LoRA-fine-tuned attack experts over a shared Qwen3-8B base, coordinated by a UCB1 bandit and refined by PAIR + GRPO, reach a 42.2% attack success rate (ASR) at L2 on opus and 73.4% ASR on a held-out LangChain target, with one finding scoring maximum severity at zero search iterations.
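
To make the coordination step concrete, here is a minimal sketch of textbook UCB1 selecting among six attack experts. It is not the project's implementation: the expert names (`expert_0` to `expert_5`), the binary reward (1.0 for a successful attack, 0.0 otherwise), and the simulated success probabilities are all assumptions for illustration.

```python
import math
import random


class UCB1:
    """Textbook UCB1 bandit over a fixed set of arms (here: attack experts)."""

    def __init__(self, arms):
        self.arms = list(arms)
        self.counts = {a: 0 for a in self.arms}    # pulls per arm
        self.totals = {a: 0.0 for a in self.arms}  # summed reward per arm

    def select(self):
        # Play every arm once before the confidence bound is defined.
        for arm in self.arms:
            if self.counts[arm] == 0:
                return arm
        t = sum(self.counts.values())

        def score(arm):
            mean = self.totals[arm] / self.counts[arm]
            bonus = math.sqrt(2.0 * math.log(t) / self.counts[arm])
            return mean + bonus

        return max(self.arms, key=score)

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.totals[arm] += reward


def simulate(success_prob, rounds, seed=0):
    """Run UCB1 against hypothetical per-expert success probabilities."""
    rng = random.Random(seed)
    bandit = UCB1(success_prob.keys())
    for _ in range(rounds):
        arm = bandit.select()
        # Assumed reward signal: 1.0 if the simulated attack succeeds.
        reward = 1.0 if rng.random() < success_prob[arm] else 0.0
        bandit.update(arm, reward)
    return bandit


if __name__ == "__main__":
    # Hypothetical expert names and success rates, for illustration only.
    probs = {f"expert_{i}": p for i, p in enumerate([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])}
    result = simulate(probs, rounds=500)
    for name in probs:
        print(name, result.counts[name])
```

The exploration bonus shrinks as an expert is tried more often, so the bandit keeps sampling weak experts occasionally while sending most of the budget to whichever expert has succeeded most so far.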