We present distributed regression, an efficient and general framework for in-network modeling of sensor data. In this framework, the nodes of the sensor network collaborate to optimally fit a global function to each of their local measurements. The algorithm is based upon kernel linear regression, where the model takes the form of a weighted sum of local basis functions; this provides an expressive yet tractable class of models for sensor network data. Rather than transmitting data to one another or outside the network, nodes communicate constraints on the model parameters, drastically reducing the communication required. After the algorithm is run, each node can answer queries for its local region, or the nodes can efficiently transmit the parameters of the model to a user outside the network. We present an evaluation of the algorithm based upon data from a 48-node
sensor network deployment at the Intel Berkeley research lab.