We present a novel energy-based algorithm to estimate the positions of microphones and speakers in an ad hoc microphone array setting. Compared to traditional time-of-flight based approaches, the energy-based approach has the advantage that it does not require accurate time synchronization. This property is particularly useful for ad hoc microphone arrays, where the microphones usually belong to different devices and highly accurate synchronization across them may be difficult to obtain. The new algorithm extends our previous energy-based position estimation algorithm [1] in that it does not assume that the speakers are in the same positions as their corresponding microphones. Instead, it estimates the positions of both the microphones and the speakers simultaneously. Experimental results demonstrate its performance improvement over the previous approach in [1] and evaluate its robustness against time synchronization errors.