Attackers often create systems that automatically rewrite and reorder their malware to avoid detection. Typical machine learning approaches, which learn a classifier based on a handcrafted feature vector, are not sufficiently robust to such reorderings. We propose a different approach, which, similar to natural language modeling, learns the language of malware spoken through the executed instructions and extracts robust, time domain features. Echo state networks (ESNs) and recurrent neural networks (RNNs) are used for the projection stage that extracts the features. These models are trained in an unsupervised fashion. A standard classifier uses these features to detect malicious files. We explore a few variants of ESNs and RNNs for the projection stage, including Max-Poolingand Half-Frame models which we propose. The best performing hybrid model uses an ESN for the recurrent model, Max-Pooling for non-linear sampling, and logistic regression for the final classification. Compared to the standard trigram of events model, it improves the true positive rate by 98.3% at a false positive rate of 0.1%.