In this paper, we present a systematic study of the prevailing ResNet architecture by connecting a general deeply-fused net view to ensembling. We start by empirically demonstrating the resemblance between the expanded form of a deeply-fused net and an ensemble of neural networks. Our empirical results reveal that the deepest network among the ensemble components does not contribute the most to the overall performance; instead, it provides a way to introduce many layers and thus guarantee the ensemble size. Guided by this study and observation, we develop a new deeply-fused network that combines two networks in a merge-and-run fusion manner. It is less deep than a ResNet with the same number of parameters but yields an ensemble of the same number of more capable component networks, thus improving classification accuracy. We evaluate the proposed network on standard recognition tasks. Our approach demonstrates consistent improvements over ResNet under a comparable setup and achieves state-of-the-art results (e.g., 3.57% test error on CIFAR-10, 19.00% on CIFAR-100, and 1.51% on SVHN).