To evaluate an innovation in computer systems, performance analysts measure execution time or other metrics using one or more standard workloads. A careful analyst may minimize the amount of measurement instrumentation, control the environment in which measurement takes place, and repeat each measurement multiple times. Finally, the analyst may use statistical techniques to characterize the data.
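The protocol above (repeat the measurement, then characterize the samples statistically) can be sketched as follows; `workload()` is a hypothetical stand-in for a standard benchmark, not part of the talk itself:

```python
# A minimal sketch of a careful measurement protocol, assuming a
# hypothetical workload() standing in for the benchmark under study.
import statistics
import time


def workload():
    # Hypothetical stand-in computation for a standard workload.
    return sum(i * i for i in range(100_000))


def measure(n_runs=30):
    """Repeat the timing measurement n_runs times; return all samples."""
    samples = []
    for _ in range(n_runs):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return samples


samples = measure()
mean = statistics.mean(samples)
stdev = statistics.stdev(samples)
# Approximate 95% confidence interval for the mean (normality assumed).
half_width = 1.96 * stdev / len(samples) ** 0.5
print(f"mean = {mean:.6f}s +/- {half_width:.6f}s (95% CI)")
```

Reporting a confidence interval rather than a single number is one common statistical characterization; as the talk argues, even this discipline does not by itself protect against bias or the observer effect.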
Unfortunately, even with such a responsible approach, the collected data may be misleading due to bias and the observer effect. Bias occurs when an experimental setup inadvertently favors one particular outcome. The observer effect occurs when data collection alters the behavior of the system being measured. This talk demonstrates that bias and the observer effect are (i) large enough to mislead performance analysts and (ii) common enough that they cannot be ignored.
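The observer effect can be illustrated with a toy timing experiment: adding per-iteration instrumentation changes the very system being timed. Here `work()` is a hypothetical stand-in, not any specific system from the talk:

```python
# Sketch of the observer effect: the instrumented run includes extra
# clock reads and list appends, so it measures a perturbed system.
# work() is a hypothetical stand-in for the system under measurement.
import time


def work():
    return sum(range(50_000))


def run_plain(n):
    """Measure total time with a single pair of clock reads."""
    start = time.perf_counter()
    for _ in range(n):
        work()
    return time.perf_counter() - start


def run_instrumented(n):
    """Same loop, but read the clock around every iteration."""
    log = []
    start = time.perf_counter()
    for _ in range(n):
        t0 = time.perf_counter()
        work()
        log.append(time.perf_counter() - t0)
    return time.perf_counter() - start, log


plain = run_plain(200)
instrumented, log = run_instrumented(200)
# On a noisy machine the difference may be small or lost in variance,
# but the measured system is no longer the original one.
print(f"plain: {plain:.4f}s  instrumented: {instrumented:.4f}s")
```

Even when the added overhead looks negligible, instrumentation can perturb caches, code layout, and scheduling, which is precisely why the effect is easy to overlook.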
While these phenomena are well known in the natural and social sciences, this talk will demonstrate that research in computer systems typically does not take adequate measures to guard against bias and the observer effect.