Random Forest vs Gradient Boosting

Section 1

Single Tree Baseline

A single decision tree can fit nonlinear data, but a deep tree overfits easily, and its predictions can change substantially when the training data change only slightly (high variance). Establishing this baseline shows why ensembles help.

Tree Prediction (1D Regression)

Legend: true function, tree prediction (step), training data, validation data

Controls

Dataset preset
Sample count: 60
Noise σ: 0.40
Max depth: 4
Toggles: show split boundaries, show leaf means

Readouts: train MSE, validation MSE, number of leaves

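Move the depth slider to trade flexibility against variance: a deeper tree follows the training data more closely, but its fit changes more from one sample to the next.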
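The depth trade-off can also be reproduced offline. The following is a minimal sketch, not the page's own implementation: the sine target, the [0, 6] input range, the fixed seed, and the helper names (make_data, fit_tree, predict, mse, count_leaves) are all assumptions made for illustration. Only the sample count (60) and noise σ (0.40) mirror the control values listed above.

```python
import math
import random

def make_data(n, sigma, rng):
    """Sample n noisy points from an assumed target, y = sin(x) + noise."""
    pts = []
    for _ in range(n):
        x = rng.uniform(0.0, 6.0)
        pts.append((x, math.sin(x) + rng.gauss(0.0, sigma)))
    return sorted(pts)

def sse(vals):
    """Sum of squared deviations from the mean."""
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals)

def fit_tree(points, depth):
    """Greedy 1D regression tree on x-sorted (x, y) pairs.
    Returns a float (leaf mean) or a tuple (threshold, left, right)."""
    ys = [y for _, y in points]
    best = None
    if depth > 0:
        for i in range(1, len(points)):
            if points[i - 1][0] == points[i][0]:
                continue  # no threshold separates equal x values
            cost = sse(ys[:i]) + sse(ys[i:])
            if best is None or cost < best[0]:
                best = (cost, i)
    if best is None:
        return sum(ys) / len(ys)  # leaf: predict the mean target
    i = best[1]
    threshold = (points[i - 1][0] + points[i][0]) / 2
    return (threshold, fit_tree(points[:i], depth - 1), fit_tree(points[i:], depth - 1))

def predict(tree, x):
    """Walk down the tree until a leaf value is reached."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x < threshold else right
    return tree

def mse(tree, points):
    return sum((predict(tree, x) - y) ** 2 for x, y in points) / len(points)

def count_leaves(tree):
    if not isinstance(tree, tuple):
        return 1
    return count_leaves(tree[1]) + count_leaves(tree[2])

rng = random.Random(0)          # fixed seed, chosen arbitrarily
train = make_data(60, 0.40, rng)
val = make_data(60, 0.40, rng)

results = []  # (depth, leaves, train MSE, validation MSE)
for depth in range(0, 9):
    tree = fit_tree(train, depth)
    results.append((depth, count_leaves(tree), mse(tree, train), mse(tree, val)))

for depth, leaves, tr, va in results:
    print(f"depth={depth}  leaves={leaves:3d}  train MSE={tr:.3f}  val MSE={va:.3f}")
```

Training error can only fall as depth grows, because each extra split refines an existing leaf. Validation error typically stops improving well before the tree is fully grown, which is the overfitting the slider is meant to expose.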

Section 2 · Bagging

Random Forest Builder

A random forest trains many trees independently, each on a bootstrap sample of the training data, and averages their predictions. Averaging reduces variance, so the result is usually more robust than any single tree.

Forest Average vs Individual Trees

Legend: true function, individual trees, forest average, training data

Bagging pipeline

Bootstrap sample → train tree independently → repeat for B trees → average predictions
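The pipeline above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the page's code: the sine target, the seed, and the helper names (make_data, fit_tree, fit_forest, forest_predict) are hypothetical, and the sketch omits the split-candidate randomness option and uses bootstrap sampling only. The tree count (15) and depth (4) mirror the control values shown in this section.

```python
import math
import random

def make_data(n, sigma, rng):
    """Sample n noisy points from an assumed target, y = sin(x) + noise."""
    pts = []
    for _ in range(n):
        x = rng.uniform(0.0, 6.0)
        pts.append((x, math.sin(x) + rng.gauss(0.0, sigma)))
    return sorted(pts)

def sse(vals):
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals)

def fit_tree(points, depth):
    """Greedy 1D regression tree on x-sorted (x, y) pairs.
    Returns a float (leaf mean) or a tuple (threshold, left, right)."""
    ys = [y for _, y in points]
    best = None
    if depth > 0:
        for i in range(1, len(points)):
            if points[i - 1][0] == points[i][0]:
                continue  # duplicates are common in bootstrap samples
            cost = sse(ys[:i]) + sse(ys[i:])
            if best is None or cost < best[0]:
                best = (cost, i)
    if best is None:
        return sum(ys) / len(ys)
    i = best[1]
    threshold = (points[i - 1][0] + points[i][0]) / 2
    return (threshold, fit_tree(points[:i], depth - 1), fit_tree(points[i:], depth - 1))

def predict(tree, x):
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x < threshold else right
    return tree

def fit_forest(train, n_trees, depth, rng):
    """Bagging: fit each tree on its own bootstrap sample of the training set."""
    forest = []
    for _ in range(n_trees):
        boot = sorted(rng.choice(train) for _ in train)  # draw with replacement
        forest.append(fit_tree(boot, depth))
    return forest

def forest_predict(forest, x):
    """Average the predictions of all trees."""
    return sum(predict(tree, x) for tree in forest) / len(forest)

def mse(predict_fn, points):
    return sum((predict_fn(x) - y) ** 2 for x, y in points) / len(points)

rng = random.Random(1)          # fixed seed, chosen arbitrarily
train = make_data(60, 0.40, rng)
val = make_data(60, 0.40, rng)

forest = fit_forest(train, n_trees=15, depth=4, rng=rng)
tree_val_mses = [mse(lambda x, t=t: predict(t, x), val) for t in forest]
mean_tree_val_mse = sum(tree_val_mses) / len(tree_val_mses)
forest_val_mse = mse(lambda x: forest_predict(forest, x), val)

print(f"mean single-tree val MSE: {mean_tree_val_mse:.3f}")
print(f"forest-average val MSE:   {forest_val_mse:.3f}")
```

Because squared error is convex, the averaged prediction never has a higher MSE than the mean MSE of its member trees on the same points. How much lower it is depends on how much the individual trees disagree.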

Controls

Number of trees: 15
Tree depth: 4
Toggles: bootstrap sampling (with replacement), split-candidate randomness, show individual trees faintly

Readouts: number of trees, average leaves per tree, train MSE, validation MSE

More trees → lower variance of the average prediction.
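Why averaging helps, and why the benefit eventually levels off, can be made precise under a simplifying assumption that the page itself does not state: at a fixed input x, the B tree predictions T_b(x) are identically distributed with variance σ_tree² and pairwise correlation ρ. The symbols T_b, σ_tree, and ρ are notation introduced here. Under that assumption:

```latex
\operatorname{Var}\!\left(\frac{1}{B}\sum_{b=1}^{B} T_b(x)\right)
  = \frac{1}{B^{2}}\Bigl(B\,\sigma_{\text{tree}}^{2} + B(B-1)\,\rho\,\sigma_{\text{tree}}^{2}\Bigr)
  = \rho\,\sigma_{\text{tree}}^{2} + \frac{1-\rho}{B}\,\sigma_{\text{tree}}^{2}
```

The second term shrinks as B grows, but the first does not. Adding trees therefore lowers variance only down to a floor set by how correlated the trees are, which is why decorrelating them through bootstrap sampling matters as much as adding more of them.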

Tree gallery: different bootstrap samples produce different trees

Section 3 · Boosting

Gradient Boosting Builder

Gradient boosting builds trees sequentially. Each new tree is fit to the residuals of the current model, and its output, scaled by the learning rate, is added to the ensemble, so the final prediction is a sum of many small corrections.

Cumulative ensemble prediction

Legend: true function, boosted prediction, training data

Residuals + next weak learner

Legend: residual = y − F(x), next tree fit to residuals
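The residual-fitting step can be written out as a worked update. This is a sketch for squared-error loss only. The symbols h_m (the m-th weak learner), ν (the learning rate), and F_0 (an initial constant model, assumed here to be the mean target) are notation introduced for illustration, not taken from the page.

```latex
\begin{aligned}
F_0(x) &= \bar{y}, \\
r_i^{(m)} &= y_i - F_{m-1}(x_i), \qquad i = 1, \dots, n, \\
h_m &= \arg\min_{h} \sum_{i=1}^{n} \bigl(r_i^{(m)} - h(x_i)\bigr)^{2}, \\
F_m(x) &= F_{m-1}(x) + \nu\, h_m(x).
\end{aligned}
```

For the loss ½(y − F)², the residual r_i is exactly the negative gradient of the loss with respect to F(x_i), which is where the "gradient" in the name comes from.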

Controls

Boosting stages: 10
Learning rate: 0.30
Weak-learner depth: 2
Current stage: 0

Readouts: stage (0), trees so far (0), train MSE, validation MSE

Higher learning rate → larger jumps per stage; lower → slower, smoother improvement.
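The effect of the learning rate can be checked with a small squared-error boosting loop. This is a hedged sketch, not the page's implementation: it uses depth-1 stumps instead of the depth-2 weak learners shown in the controls, and the sine target, the seed, and the helper names (make_data, fit_stump, stump_predict, boost) are assumptions made for illustration.

```python
import math
import random

def make_data(n, sigma, rng):
    """Sample n noisy points from an assumed target, y = sin(x) + noise."""
    pts = []
    for _ in range(n):
        x = rng.uniform(0.0, 6.0)
        pts.append((x, math.sin(x) + rng.gauss(0.0, sigma)))
    return pts

def fit_stump(points):
    """Best single split on (x, residual) pairs under squared error.
    Returns (threshold, left_mean, right_mean)."""
    pts = sorted(points)
    rs = [r for _, r in pts]
    n, total = len(pts), sum(rs)
    best = None
    left_sum = 0.0
    for i in range(1, n):
        left_sum += rs[i - 1]
        if pts[i - 1][0] == pts[i][0]:
            continue  # no threshold separates equal x values
        left_mean = left_sum / i
        right_mean = (total - left_sum) / (n - i)
        # Minimising the split's squared error is equivalent to maximising this score.
        score = i * left_mean ** 2 + (n - i) * right_mean ** 2
        if best is None or score > best[0]:
            best = (score, (pts[i - 1][0] + pts[i][0]) / 2, left_mean, right_mean)
    return best[1:]

def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x < threshold else right_mean

def mean_sq(errors):
    return sum(e * e for e in errors) / len(errors)

def boost(train, n_stages, learning_rate):
    """Return (f0, stumps, history); history[m] is the train MSE after m stages."""
    f0 = sum(y for _, y in train) / len(train)  # stage 0: constant model
    preds = [f0] * len(train)
    stumps = []
    history = [mean_sq([y - p for (_, y), p in zip(train, preds)])]
    for _ in range(n_stages):
        # Fit the next weak learner to the current residuals y - F(x).
        residuals = [(x, y - p) for (x, y), p in zip(train, preds)]
        stump = fit_stump(residuals)
        stumps.append(stump)
        # Add the scaled correction to the running prediction.
        preds = [p + learning_rate * stump_predict(stump, x)
                 for (x, _), p in zip(train, preds)]
        history.append(mean_sq([y - p for (_, y), p in zip(train, preds)]))
    return f0, stumps, history

rng = random.Random(2)          # fixed seed, chosen arbitrarily
train = make_data(60, 0.40, rng)

histories = {}
for lr in (0.1, 0.3, 1.0):
    _, _, histories[lr] = boost(train, n_stages=10, learning_rate=lr)
    print(f"lr={lr}: " + " ".join(f"{m:.3f}" for m in histories[lr]))
```

Each printed row lists the training MSE from stage 0 to stage 10. For learning rates between 0 and 1 the training error never rises from one stage to the next, and a larger rate removes more of it in the first stage. This sketch tracks training error only, so it says nothing about which rate generalizes best.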

Ensemble build stack

Section 4

Side-by-Side Comparison

On the same training data, compare a single tree against a random forest and a gradient boosting model. The shapes of their predictions reveal what each method is doing.

Shared prediction plot

Legend: true function, single tree, random forest, gradient boosting, training data

Controls

Dataset preset
Sample count: 70
Noise σ: 0.45
Outliers: 0
RF number of trees: 25
RF depth: 4
GB stages: 25
GB learning rate: 0.30
GB depth: 2

Random Forest

Parallel trees, trained independently
Bootstrap samples create diverse training subsets
Final prediction is the average across trees
Mainly fights variance & instability

Readouts: train MSE, validation MSE, number of trees, average depth

Averaging smooths out individual tree noise.

Gradient Boosting

Sequential trees built in order
Each new tree fits the residual error
Final prediction is an additive sum of corrections
Mainly improves fit / bias step by step

Readouts: train MSE, validation MSE, number of stages, stage depth

Each stage chips away at the remaining error.

Section 5

Key Differences Summary

A concise side-by-side summary to review after you've explored both ensembles.

Concept	Random Forest (bagging)	Gradient Boosting (boosting)
Build style	Parallel	Sequential
Data usage	Bootstrap resampling	Residual fitting on the same training set
Combination	Average predictions	Add scaled corrections
Main effect	Reduce variance	Reduce residual error step by step
Typical weakness	Larger model, less interpretable	Can overfit if too aggressive
Key control	Number / depth of trees	Learning rate + number of stages

One-line takeaway. Random forest improves stability by averaging many imperfect models. Gradient boosting improves fit by repeatedly correcting mistakes.

Sources

Slides/ML_Lecture3Sp26_ML_Review.pdf pp. 14–15 — random forest regression alongside decision trees
Slides/ML_Lecture3Sp26_ML_Review.pdf pp. 20–24 — gradient boosting concept & sequential residual-correction steps
Slides/ML_Lecture3Sp26_ML_Review.pdf p. 26 — iterative process of gradient boosting
Slides/ML_Lecture3Sp26_ML_Review.pdf p. 27 — learning-rate effect
Slides/ML_Lecture3Sp26_ML_Review.pdf p. 28 — random forest bagging / bootstrap aggregation / averaging
Slides/ML_Lecture3Sp26_ML_Review.pdf p. 29 — random forest example
Slides/ML_Lecture3Sp26_ML_Review.pdf p. 68 — random forest steps, strengths, and tradeoffs