Established: October 1, 2016

Gradient Boosting Decision Tree (GBDT) is a popular machine learning algorithm with quite a few effective implementations. Although these implementations adopt many engineering optimizations, their efficiency and scalability are still unsatisfactory when the feature dimension is high and the data size is large. A major reason is that, for each feature, they need to scan all the data instances to estimate the information gain of all possible split points, which is very time-consuming. LightGBM is an open-source GBDT tool that enables highly efficient training over large-scale datasets with low memory cost. LightGBM adopts two novel techniques: Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB). With GOSS, LightGBM can train each tree with only a small fraction of the full dataset. With EFB, LightGBM handles high-dimensional sparse features much more efficiently. LightGBM also supports distributed training with low communication cost and fast training on GPUs.
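To make the GOSS idea concrete, here is a minimal sketch of one sampling step in plain Python. It is an illustration under stated assumptions, not LightGBM's actual implementation: the function name `goss_sample` and its parameters `top_rate`, `other_rate`, and `seed` are hypothetical, and the gradient values are made up. The sketch keeps the instances with the largest absolute gradients, randomly samples from the remaining ones, and up-weights the sampled instances by (1 - top_rate) / other_rate so that the small-gradient part of the data is not under-represented when information gain is estimated.

```python
import random


def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Return (indices, weights) for one GOSS-style sample.

    Illustrative sketch only; names and defaults are assumptions.
    """
    if not 0.0 < other_rate <= 1.0 or not 0.0 <= top_rate < 1.0:
        raise ValueError("rates must satisfy 0 <= top_rate < 1 and 0 < other_rate <= 1")

    n = len(gradients)
    n_top = int(top_rate * n)
    n_other = int(other_rate * n)

    # Order instance indices by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    top = order[:n_top]    # large-gradient instances: always kept
    rest = order[n_top:]   # small-gradient instances: randomly sampled

    rng = random.Random(seed)
    sampled = rng.sample(rest, min(n_other, len(rest)))

    # Amplify the sampled small-gradient instances to compensate
    # for the ones that were dropped.
    amplify = (1.0 - top_rate) / other_rate
    indices = top + sampled
    weights = [1.0] * len(top) + [amplify] * len(sampled)
    return indices, weights


# Made-up gradients for ten instances.
grads = [0.9, -0.05, 0.02, -0.8, 0.01, 0.03, -0.02, 0.04, 0.6, -0.01]
idx, w = goss_sample(grads, top_rate=0.2, other_rate=0.2)
print(idx, w)
```

With these settings, only four of the ten instances are used for the tree: the two with the largest absolute gradients at weight 1, plus two randomly chosen others at a larger weight.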



Tie-Yan Liu

Distinguished Scientist, Assistant Managing Director

Weidong Ma

Associate Researcher II

Qi Meng

Senior Researcher

Yu Shi

Associate Researcher