Robust training of deep neural networks requires a large amount of data. However gathering and labeling this data can be expensive and determining which distribution of features are needed for training is not a trivial problem. This is compounded when training neural networks for autonomous navigation in continuous nondeterministic environments using only visual input. Increasing the quantity of demonstrated data does not solve this problem as the demonstrated sequences of actions are not guaranteed to produce the same outcomes and slight changes in orientation generate drastically different visual representations. This results in a training set with a different distribution than what the agent will typically encounter in application. Here, we develop a method that can grow a training set from the same distribution as the agent’s experiences and capture useful features not found in demonstrated behavior. Additionally, we show that our approach scales to efficiently handle complex tasks that require a large amount of data (experiences) for training. Concretely, we propose the deep asynchronous Dagger framework, which combines the Dagger algorithm with an asynchronous actor-learner architecture for parallel dataset aggregation and network policy learning. We apply our method to the task of navigating 3D mazes in Minecraft with randomly changing block types and analyze our results.