Automatic Stylistic Composition of Bach Chorales with Deep LSTM

18th International Society for Music Information Retrieval Conference

This work was carried out as part of the Cambridge lab's involvement in Cambridge University's MPhil in Machine Learning, Speech, and Language Technology, by two students working under our supervision: Feynman Liang and Martin Tomczak. The goal of the project was to build an LSTM-based AI system capable of composing music in the style of Johann Sebastian Bach. This presented several challenges:

  1. How do you represent music for use in AI?
  2. How is music composed, and can an AI reproduce that process?
  3. How do you determine whether a piece of generated music is “in the style” of a composer such as Bach?

Addressing each of these was an interesting challenge that required expertise from both inside and outside the lab. One of our key collaborators, Mark Gotham, is a computational musicologist. He helped us answer questions (1) and (2) by introducing us to notations amenable to machine learning, and by walking us through how composition students learn to write in the style of particular composers. In particular, he introduced us to the task of “completing” music, in which one or more parts are provided and the student must compose the remaining parts.
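To give a flavour of what a machine-learnable notation might look like, here is a minimal sketch of serializing a chorale into a flat token sequence: each time step becomes a “frame” of (MIDI pitch, tied-from-previous?) pairs, one per voice, closed by a delimiter token. This is an illustrative assumption for this post, not the exact encoding used in the paper; the token names and helper functions are hypothetical.

```python
# Hypothetical token encoding for a four-voice chorale (illustrative only).
# Each frame is a list of (midi_pitch, tied) pairs, ordered soprano to bass;
# a "~" suffix marks a note tied over from the previous frame.

FRAME_END = "|||"  # delimiter closing one simultaneous chord

def encode_frames(frames):
    """Flatten a list of frames into a single token sequence."""
    tokens = []
    for frame in frames:
        for pitch, tied in frame:
            tokens.append(f"{pitch}{'~' if tied else ''}")
        tokens.append(FRAME_END)
    return tokens

def decode_frames(tokens):
    """Invert encode_frames: rebuild the list of frames."""
    frames, frame = [], []
    for tok in tokens:
        if tok == FRAME_END:
            frames.append(frame)
            frame = []
        else:
            frame.append((int(tok.rstrip("~")), tok.endswith("~")))
    return frames

# Example: a C-major chord struck once, then held for a second frame.
chord = [(72, False), (67, False), (64, False), (48, False)]
held = [(pitch, True) for pitch, _ in chord]
seq = encode_frames([chord, held])
assert decode_frames(seq) == [chord, held]
```

The point of such a flattening is that a sequence model like an LSTM can then treat a score exactly like text: one token at a time, predicting the next.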

In this context, we applied our AI expertise to train a deep LSTM model to compose and complete musical scores. Intriguingly, analysis of the trained model provided evidence of neurons specializing, without prior knowledge or explicit supervision, to detect common music-theoretic concepts such as tonics, chords, and cadences. That left the problem of evaluation. For this we built a website, http://bachbot.com (try it out yourself!), and used it to conduct one of the largest musical discrimination tests ever performed, with 2,336 participants. Among the results: participants correctly distinguished BachBot’s output from genuine Bach only 1% more often than random guessing.
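For readers unfamiliar with the architecture, the sketch below shows one forward step of a single-unit LSTM cell in plain Python, to make concrete what “deep LSTM” refers to: gated memory that decides what to forget, what to admit, and what to expose at each time step. The real model stacks several such layers over the token vocabulary and is trained by backpropagation; none of that, and none of the paper's actual weights or hyperparameters, is shown here. The toy weights are assumptions for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, W):
    """One forward step of a single-unit LSTM cell (scalar state, for clarity).

    x      -- scalar input at this time step
    h_prev -- previous hidden state
    c_prev -- previous cell (memory) state
    W      -- dict mapping gate name to (input_weight, recurrent_weight, bias)
    """
    gates = {}
    for g in ("i", "f", "o", "g"):  # input, forget, output gates; candidate
        wx, wh, b = W[g]
        pre = wx * x + wh * h_prev + b
        gates[g] = math.tanh(pre) if g == "g" else sigmoid(pre)
    # Forget gate scales the old memory; input gate admits the candidate update.
    c = gates["f"] * c_prev + gates["i"] * gates["g"]
    # Output gate controls how much memory is exposed as the hidden state.
    h = gates["o"] * math.tanh(c)
    return h, c

# Run the cell over a short input sequence with toy weights.
W = {g: (0.5, 0.1, 0.0) for g in ("i", "f", "o", "g")}
h, c = 0.0, 0.0
for x in [1.0, 0.0, 1.0]:
    h, c = lstm_step(x, h, c, W)
assert -1.0 < h < 1.0  # hidden state stays bounded by the tanh nonlinearity
```

It is this persistent cell state `c`, carried across time steps, that lets the network track long-range structure in music, and it is individual hidden units of a trained model of this kind in which we observed specialization for music-theoretic concepts.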