Context-sensitive points-to analysis is critical for several program optimizations.
However, as the number of contexts grows exponentially, storage
requirements for the analysis increase tremendously for large programs, making
the analysis non-scalable. We propose a scalable flow-insensitive contextsensitive
inclusion-based points-to analysis that uses a specially designed multidimensional
bloom filter to store the points-to information. Two key observations
motivate our proposal: (i) points-to information (between pointer-object and between
pointer-pointer) is sparse, and (ii) moving from an exact to an approximate
representation of points-to information only leads to reduced precision without
affecting correctness of the (may-points-to) analysis. By using an approximate
representation a multi-dimensional bloom filter can significantly reduce the memory
requirements with a probabilistic bound on loss in precision. Experimental
evaluation on SPEC 2000 benchmarks and two large open source programs reveals
that with an average storage requirement of 4MB, our approach achieves
almost the same precision (98.6%) as the exact implementation. By increasing
the average memory to 27MB, it achieves precision upto 99.7% for these benchmarks.
Using Mod/Ref analysis as the client, we find that the client analysis is
not affected that often even when there is some loss of precision in the points-to
representation. We find that the NoModRef percentage is within 2% of the exact
analysis while requiring 4MB (maximum 15MB) memory and less than 4 minutes
on average for the points-to analysis. Another major advantage of our technique
is that it allows to trade off precision for memory usage of the analysis.