We present a technique for partially replicating data items at scale according to expressive policy specifications. Recent projects have addressed the challenge of policy-based replication of personal data (photos, music, etc.) within a network of devices, as an alternative to centralized online services. To date, the policies supported by such systems have been kept relatively simple so that the policy calculation can scale to large numbers of items.

In this paper, we show how such replication systems can scale while supporting much more expressive policies than previous schemes: item replication expressed as constraints, devices referred to by predicates rather than named explicitly, and replication to storage nodes acquired on demand from the cloud. These extensions introduce considerable complexity in policy evaluation, but we show that a system can still scale well by using equivalence classes to reduce the problem space. We validate our approach via deployment on an ensemble of devices (phones, PCs, cloud virtual machines, etc.), and we use simulations and real data based on personal usage in our group to show that it supports rich policies and high data volumes.
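
As a rough illustration of the equivalence-class idea, and not the algorithm evaluated in this paper, the following Python sketch groups items by which policy rules they match, then evaluates device predicates once per group rather than once per item. The names `Item`, `Device`, `RULES`, `signature`, and `place` are hypothetical, and the two rules are invented for the example. The policies described above are constraint-based and richer than these simple predicate pairs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    name: str
    kind: str    # e.g. "photo" or "music" (illustrative attribute)
    rating: int  # illustrative attribute


@dataclass(frozen=True)
class Device:
    name: str
    kind: str    # e.g. "phone", "pc", "cloud_vm" (illustrative attribute)


# Hypothetical policy: each rule pairs an item predicate with a device
# predicate. Devices are selected by predicate, never by explicit name.
RULES = [
    (lambda i: i.kind == "photo", lambda d: d.kind in ("pc", "cloud_vm")),
    (lambda i: i.kind == "music" and i.rating >= 4, lambda d: d.kind == "phone"),
]


def signature(item):
    """Return which rules match the item.

    Items with equal signatures are indistinguishable to the policy,
    so they fall into the same equivalence class.
    """
    return tuple(item_pred(item) for item_pred, _ in RULES)


def place(items, devices):
    """Compute target devices per item, evaluating once per class."""
    classes = defaultdict(list)
    for item in items:
        classes[signature(item)].append(item)

    placement = {}
    evaluations = 0
    for sig, members in classes.items():
        evaluations += 1  # one device-side evaluation per class
        targets = frozenset(
            d.name
            for d in devices
            for matched, (_, dev_pred) in zip(sig, RULES)
            if matched and dev_pred(d)
        )
        # Every member of the class shares the same target set.
        for member in members:
            placement[member.name] = targets
    return placement, evaluations
```

In this sketch the number of device-side evaluations depends on the number of distinct signatures, which is at most 2^k for k rules, and not on the number of items. That is the sense in which grouping reduces the problem space here.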