Established: October 1, 2016


Microsoft Research Blog


LightGBM is a new gradient boosting tree framework, which is highly efficient and scalable and can support many different algorithms including GBDT, GBRT, GBM, and MART. LightGBM is evidenced to be several times faster than existing implementations of gradient boosting trees, due to its fully greedy tree-growth method and histogram-based memory and computation optimization. It also has a complete solution for distributed training, based on the DMTK framework. The distributed version of LightGBM takes only one or two hours to finish the training of a CTR predictor on the Criteo dataset, which contains 1.7 billion records with 67 features, on a cluster of 16 machines.