Jailbreak Distillation: Renewable Safety Benchmarking
Jingyu Zhang, Ahmed Elgohary, Xiawei Wang, A S M Iftekhar, Ahmed Magooda, Ben Van Durme, Daniel Khashabi, Kyle Jackson
Empirical Methods in Natural Language Processing - Findings | November 2025
Kaiwen Zhou, Ahmed Elgohary, A S M Iftekhar, Amin Saied
October 2025
Jingyu (Jack) Zhang, Ahmed Elgohary, Ahmed Magooda, Daniel Khashabi, Ben Van Durme
ICLR 2025 | October 2024