Random Forest
What is a Random Forest?
A random forest is an ensemble of many decision trees. Each tree makes a prediction, and the forest combines them (by majority vote or averaging) into a result that is usually more accurate and stable than any single tree's.
Common uses:
- Predicting categories (classification) or numbers (regression)
- Handling complex data with many features
- Reducing overfitting compared to a single tree
Why Use Random Forests?
- More accurate than a single tree
- Reduces overfitting
- Works for both regression and classification
How Does a Random Forest Work?
- Builds many trees, each on a random bootstrap sample of the data, considering a random subset of features at each split
- Combines their predictions: majority vote for classification, average for regression (see the sketch below)
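To make the idea concrete, here is a minimal hand-rolled sketch of the bagging step (bootstrap samples plus averaging) using plain DecisionTreeRegressor objects and made-up toy data; RandomForestRegressor does all of this for you, and additionally randomizes which features each split considers:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = np.arange(20).reshape(-1, 1)
y = X.ravel() + rng.normal(0, 2, size=20)  # toy noisy linear data (illustrative only)

trees = []
for _ in range(10):
    idx = rng.integers(0, len(X), size=len(X))  # bootstrap: sample rows with replacement
    trees.append(DecisionTreeRegressor(max_depth=3).fit(X[idx], y[idx]))

# Regression: average the individual tree predictions
forest_pred = np.mean([t.predict(X) for t in trees], axis=0)
print(forest_pred[:5])
```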
Key Parameters in scikit-learn
| Parameter | Purpose |
|---|---|
| n_estimators | Number of trees in the forest |
| max_depth | Maximum depth of each tree |
| max_features | Number of features to consider at each split |
| random_state | Controls randomness for reproducibility |
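For illustration, these parameters might be set like this when building a forest (the values below are arbitrary examples, not recommendations):

```python
from sklearn.ensemble import RandomForestRegressor

rf = RandomForestRegressor(
    n_estimators=200,     # number of trees in the forest
    max_depth=5,          # maximum depth of each tree
    max_features='sqrt',  # features considered at each split
    random_state=42,      # reproducible results
)
```

Larger n_estimators generally makes predictions more stable, at the cost of training time.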
Step-by-Step Example: Random Forest for Regression
```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import numpy as np

# Example data
X = np.arange(5).reshape(-1, 1)
y = np.array([1, 2, 3, 4, 5])

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Create and fit random forest
rf = RandomForestRegressor(n_estimators=100, max_depth=3, random_state=0)
rf.fit(X_train, y_train)

# Predict and evaluate
preds = rf.predict(X_test)
mse = mean_squared_error(y_test, preds)
print('Predictions:', preds)
print('MSE:', mse)
```

Explanation:
- RandomForestRegressor(n_estimators=100, max_depth=3): builds a forest of 100 trees, each at most 3 levels deep.
- fit(X_train, y_train): trains the forest.
- predict(X_test): makes predictions.
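For completeness, here is the classification counterpart, where the forest takes a majority vote; this sketch uses scikit-learn's built-in iris dataset purely as convenient example data:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each tree votes for a class; the forest predicts the majority class
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print('Accuracy:', accuracy_score(y_test, clf.predict(X_test)))
```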
How is Random Forest Different from a Single Tree?
- Uses many trees, not just one
- Each tree sees a random subset of data and features
- More robust and less likely to overfit (see the comparison sketch below)
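One way to see these differences is to fit a single unpruned tree and a forest on the same noisy data and compare test error; a minimal sketch with synthetic data (exact numbers will vary by dataset, but the forest typically generalizes better):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X.ravel()) + rng.normal(0, 0.3, size=200)  # noisy sine (synthetic)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tree = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)

# An unpruned single tree tends to fit the noise; averaging many trees smooths it out
print('Single tree MSE:', mean_squared_error(y_test, tree.predict(X_test)))
print('Forest MSE:     ', mean_squared_error(y_test, forest.predict(X_test)))
```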