Learning patterns in data requires to extract interesting, statistically significant regularities in (large) data sets, e.g. detection of cancer cells in tissue microarrays and estimating their staining or role mining in security permission management. Admissible solutions or hypotheses specify the context of pattern analysis problems which have to cope with model mismatch and noise in data. An information theoretic approach is developed which estimates the precision of inferred solution sets and regularizes solutions in a noise adapted way. The tradeoff between “informativeness” and “robustness” is mirrored by the balance between high information content and identifiability of solution sets, thereby giving rise to a new notion of context sensitive information. Cost function to rank solutions and, more abstractly, algorithms are considered as noisy channels with a generalization capacity. The effectiveness of this concept is demonstrated by model validation for spectral clustering based on different variants of graph cuts. The concept also enables us to measure how many bit are extracted by sorting algorithms when the input and thereby the pairwise comparisons are subject to fluctuations.