Detecting backdoored language models at scale
We’re releasing new research on detecting backdoors in open-weight language models and highlighting a practical scanner designed to identify backdoored models at scale and improve trust in AI systems.