Secure Featurization and Applications to Secure Phishing Detection

Secure inference allows a server holding a machine learning (ML) inference algorithm with private weights, and a client with a private input, to obtain the output of the inference algorithm, without revealing their respective private inputs to one another. While this problem has received plenty of attention, existing systems are not applicable to a large class of ML algorithms (such as in the domain of Natural Language Processing) that perform featurization as their first step. In this work, we address this gap and make the following contributions:

1. We initiate the formal study of secure featurization and its use in conjunction with secure inference protocols. 2. We build secure featurization protocols in the one/two/three-server settings that provide a tradeoff between security and efficiency. 3. Finally, we apply our algorithms in the context of secure phishing detection and evaluate our end-to-end protocol on models that are commonly used for phishing detection.