The geometry of an acoustic environment can provide important information for many audio signal processing applications. To estimate this geometry, previous work has relied on large microphone arrays, multiple test sources, moving sources, or the assumption of a 2D room. In this paper, we lift these requirements and present a novel method that uses a compact microphone array to estimate the 3D geometry of a room, delivering effective estimates with low-cost hardware. Our approach first probes the environment with a known test signal emitted by a loudspeaker co-located with the array, from which the room impulse responses (RIRs) are estimated. It then uses an l1-regularized least-squares minimization to fit synthetically generated reflections to the RIRs, producing a sparse set of reflections. By enforcing structural constraints derived from the image model, these reflections are classified into 1st-, 2nd-, and 3rd-order reflections, from which the room geometry is derived. Using this method, we detect walls with off-the-shelf teleconferencing hardware at a typical range resolution of about 1 cm. We present results from simulations and from data recorded in real environments.
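
To make the sparse fitting step concrete, the following is a minimal sketch of an l1-regularized least-squares fit of candidate reflections to a single impulse response. It is illustrative only: the Gaussian pulse shape, the delay grid, the regularization weight, and the use of ISTA (iterative soft thresholding) as the solver are all assumptions made here, and the function names `reflection_dictionary` and `ista_lasso` are hypothetical rather than taken from the method itself.

```python
import numpy as np


def reflection_dictionary(n_samples, delays, width=2.0):
    """Build a dictionary whose columns are candidate reflections.

    ASSUMPTION: each synthetic reflection is a unit-norm Gaussian pulse
    centred at a candidate delay (in samples). The real reflection model
    may differ.
    """
    t = np.arange(n_samples)[:, None]
    d = np.asarray(delays, dtype=float)[None, :]
    D = np.exp(-0.5 * ((t - d) / width) ** 2)
    return D / np.linalg.norm(D, axis=0)


def ista_lasso(A, y, lam, n_iter=500):
    """Minimize 0.5 * ||y - A x||_2^2 + lam * ||x||_1 with ISTA."""
    # Step size 1/L, where L is the Lipschitz constant of the gradient
    # (squared spectral norm of A).
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - A.T @ (A @ x - y) / L          # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return x


# Toy impulse response containing two reflections plus a little noise.
delays = np.arange(10, 190, 5)                 # hypothetical delay grid
A = reflection_dictionary(200, delays)
x_true = np.zeros(len(delays))
x_true[delays == 40] = 1.0
x_true[delays == 95] = 0.6
rng = np.random.default_rng(0)
y = A @ x_true + 0.005 * rng.standard_normal(200)

x_hat = ista_lasso(A, y, lam=0.05)
strongest = sorted(delays[np.argsort(-np.abs(x_hat))[:2]].tolist())
print("Delays of the two strongest recovered reflections:", strongest)
```

In this toy setting the l1 penalty drives most coefficients to exactly zero, so the few nonzero entries of `x_hat` can be read as detected reflections. Classifying them by reflection order using the image model is a separate step that this sketch does not attempt.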