The Microsoft Cognitive Toolkit

A free, easy-to-use, open-source, commercial-grade toolkit that trains deep learning algorithms to learn like the human brain.

The Microsoft Cognitive Toolkit (previously known as CNTK) empowers you to harness the intelligence within massive datasets through deep learning, providing uncompromised scaling, speed, and accuracy with commercial-grade quality and compatibility with the programming languages and algorithms you already use. Hear about the team that developed the Cognitive Toolkit, or read more below.

Learn more at GitHub

Speed & Scalability

The Microsoft Cognitive Toolkit trains and evaluates deep learning algorithms faster than other available toolkits, scaling efficiently in a range of environments—from a CPU, to GPUs, to multiple machines—while maintaining accuracy.

Commercial-Grade Quality

The Microsoft Cognitive Toolkit is built with sophisticated algorithms and production readers to work reliably with massive datasets. Skype, Cortana, Bing, Xbox, and industry-leading data scientists already use the Microsoft Cognitive Toolkit to develop commercial-grade AI.



The Microsoft Cognitive Toolkit offers a highly expressive, easy-to-use architecture. Working with languages you already know, such as C++ and Python, it empowers you to customize any of the built-in training algorithms or to use your own.


Highly optimized, built-in components

  • Components can handle multi-dimensional dense or sparse data from Python, C++ or BrainScript
  • FFN, CNN, RNN/LSTM, Batch normalization, Sequence-to-Sequence with attention and more
  • Reinforcement learning, generative adversarial networks, supervised and unsupervised learning
  • Ability to add new user-defined core-components on the GPU from Python
  • Automatic hyperparameter tuning
  • Built-in readers optimized for massive datasets

Efficient resource usage

  • Parallelism with accuracy on multiple GPUs/machines via 1-bit SGD and Block Momentum
  • Memory sharing and other built-in methods to fit even the largest models in GPU memory
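To give a feel for the idea behind 1-bit SGD, here is a minimal NumPy sketch of gradient quantization with error feedback: each gradient value is compressed to a single sign bit plus a shared scale, and the quantization error is carried into the next step so nothing is lost over time. This is an illustrative sketch of the general technique, not CNTK's actual implementation.

```python
import numpy as np

def one_bit_quantize(grad, residual):
    """Quantize a gradient to 1 bit per value, carrying the
    quantization error over to the next step (error feedback)."""
    g = grad + residual                # add error left over from last step
    sign = np.where(g >= 0, 1.0, -1.0)
    scale = np.mean(np.abs(g))        # one shared magnitude per tensor
    quantized = sign * scale           # what actually gets communicated
    new_residual = g - quantized       # remember what was lost
    return quantized, new_residual

rng = np.random.default_rng(0)
grad = rng.normal(size=8)
residual = np.zeros_like(grad)
q, residual = one_bit_quantize(grad, residual)
```

Because the residual is fed back in, the quantized gradient plus the stored error always reconstructs the true accumulated gradient, which is what keeps accuracy intact despite the aggressive compression.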

Easily express your own networks

  • Full APIs for defining networks, learners, readers, training and evaluation from Python, C++ and BrainScript
  • Evaluate models with Python, C++, C# and BrainScript
  • Interoperation with NumPy
  • Both high-level and low-level APIs available for ease of use and flexibility
  • Automatic shape inference based on your data
  • Fully optimized symbolic RNN loops (no unrolling needed)
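To illustrate what automatic shape inference buys you, here is a toy pure-Python/NumPy layer that defers creating its weights until the first input reveals its dimension, in the spirit of how CNTK infers dimensions from your data. The `Dense` class here is a hypothetical stand-in for illustration, not CNTK's API.

```python
import numpy as np

class Dense:
    """Toy layer illustrating deferred shape inference: the weight
    matrix is only materialized once the first input arrives and
    reveals its dimension."""
    def __init__(self, units):
        self.units = units
        self.W = None
        self.b = None

    def __call__(self, x):
        if self.W is None:             # infer the input dim on first use
            in_dim = x.shape[-1]
            rng = np.random.default_rng(42)
            self.W = rng.normal(scale=0.1, size=(in_dim, self.units))
            self.b = np.zeros(self.units)
        return x @ self.W + self.b

layer = Dense(4)                       # input dimension left unspecified
out = layer(np.ones((2, 7)))           # inferred as 7 from the data
```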

Training and hosting with Azure

  • Takes advantage of high-speed compute when used with Azure GPUs and Azure networking
  • Host trained models easily on Azure and add real-time training if desired

Model Gallery

To help get you started, we’ve assembled 48 code samples, recipes, and tutorials spanning a variety of scenarios and datasets: image, numeric, speech, and text.

Fast R-CNN

Train object detection from images by adapting pre-trained classification models on arbitrarily sized regions of interest using ROI pooling.

  • Language(s): BrainScript
  • Type: Recipe, Tutorial
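The core trick named above, ROI pooling, can be sketched in a few lines of NumPy: an arbitrarily sized region of the feature map is max-pooled onto a fixed grid, so variable-size regions can feed fixed-size classifier layers. This is a simplified single-channel sketch of the technique, not the recipe's code.

```python
import numpy as np

def roi_max_pool(feature_map, roi, out_size=(2, 2)):
    """Max-pool an arbitrarily sized region of interest onto a fixed
    output grid. `roi` is (y0, x0, y1, x1) in feature-map
    coordinates, with exclusive end indices."""
    y0, x0, y1, x1 = roi
    region = feature_map[y0:y1, x0:x1]
    oh, ow = out_size
    # Split the region into an oh x ow grid of (roughly equal) cells.
    ys = np.linspace(0, region.shape[0], oh + 1).astype(int)
    xs = np.linspace(0, region.shape[1], ow + 1).astype(int)
    out = np.empty(out_size)
    for i in range(oh):
        for j in range(ow):
            out[i, j] = region[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].max()
    return out

fmap = np.arange(36, dtype=float).reshape(6, 6)
pooled = roi_max_pool(fmap, (1, 1, 5, 5))  # 4x4 region -> fixed 2x2 grid
```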

Grapheme to Phoneme (G2P)

Sequence-to-sequence model with attention mechanism for a grapheme to phoneme translation task on the CMUDict dataset.

  • Language(s): BrainScript, Python
  • Type: Recipe
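The attention mechanism at the heart of this recipe can be sketched in NumPy: at each decoding step, every encoder state is scored against the current decoder state, the scores are normalized with a softmax, and the weighted sum of encoder states becomes the context vector. This is a minimal dot-product-attention illustration, not the recipe's actual model.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(decoder_state, encoder_states):
    """One attention step: score each encoder state against the current
    decoder state, normalize, and return the context vector and weights."""
    scores = encoder_states @ decoder_state        # dot-product scores
    weights = softmax(scores)                      # attention distribution
    context = weights @ encoder_states             # weighted sum of states
    return context, weights

enc = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # 3 encoder states
dec = np.array([1.0, 0.0])                            # current decoder state
context, weights = attend(dec, enc)
```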


ResNet

Deep residual learning, invented by Microsoft Research. ResNet was the winning model of the ILSVRC and MS-COCO challenges in 2015.

  • Language(s): BrainScript, Python
  • Type: Recipe, Tutorial

See more …

Other GitHub Resources

  • Articles providing tips and detailed overviews of how to put the Cognitive Toolkit to work
  • Extensive documentation on setting up, testing, and training your first datasets using the Cognitive Toolkit