LightGBM

What is LightGBM?

LightGBM (Light Gradient Boosting Machine) is an open-source gradient boosting library developed by Microsoft. It builds decision trees using histogram-based split finding and leaf-wise growth, which keeps training fast and memory use low, especially on large datasets.

Common uses:

Large tabular datasets
When training speed is important
Data with many features or high-cardinality categorical columns

Why Use LightGBM?

Very fast training
Low memory usage
Handles categorical features natively
Works for both regression and classification

Key Parameters

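Parameter	Purpose
`n_estimators`	Number of boosting rounds (trees built)
`learning_rate`	Step size shrinkage applied to each tree's contribution
`max_depth`	Maximum depth of a tree (-1 means no limit)
`num_leaves`	Maximum number of leaves per tree
`feature_fraction`	Fraction of features sampled per tree (`colsample_bytree` in the scikit-learn API)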

Step-by-Step Example: LightGBM for Regression

import lightgbm as lgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import numpy as np

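# Example data: a noisy linear relationship (synthetic).
# 500 rows gives LightGBM enough samples to split on; with only a handful of rows,
# the default min_child_samples=20 prevents any split and every prediction is the mean.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 1))
y = 2 * X[:, 0] + rng.normal(0, 1, size=500)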

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Create and fit LightGBM regressor
lgb_reg = lgb.LGBMRegressor(n_estimators=100, learning_rate=0.1, max_depth=3, random_state=0)
lgb_reg.fit(X_train, y_train)

# Predict and evaluate
preds = lgb_reg.predict(X_test)
mse = mean_squared_error(y_test, preds)
print('Predictions:', preds)
print('MSE:', mse)

Explanation:

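LGBMRegressor: LightGBM's scikit-learn-compatible estimator for regression tasks.
fit(X_train, y_train): Trains the model on the training split.
predict(X_test): Returns predictions for the held-out test split.
mean_squared_error(y_test, preds): Measures the average squared difference between the true and predicted values; lower is better.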

When to Use LightGBM

Large datasets
When you need fast training
When you have many features or high-cardinality categorical columns