We examine the feasibility of private set intersection (PSI) over massive
datasets. PSI, which allows two parties to find the intersection of their sets
without revealing them to each other, has numerous applications including to
privacy-preserving data mining, location-based services and genomic
computations. Unfortunately, the most efficient constructions only scale to
sets containing a few thousand elements—even in the semi-honest model and
over a LAN.

In this work, we design PSI protocols in the server-aided setting, where the
parties have access to a single untrusted server that makes its
computational resources available as a service. We show that by exploiting the
server-aided model and by carefully optimizing and parallelizing our
implementations, PSI is feasible for billion-element sets even while
communicating over the Internet. As far as we know, ours is the first
attempt to scale PSI to billion-element sets which represents an increase of
five orders of magnitude over previous work.

Our protocols are secure in several adversarial models including against a
semi-honest, covert and malicious server; and address a range of security and
privacy concerns including fairness and the leakage of the intersection size.
Our protocols also yield efficient server-aided private equality-testing (PET)
with stronger security guarantees than prior work.