EE508 · ML Fundamentals

Random Forest vs Gradient Boosting

A single tree can fit nonlinear patterns, but it can also be unstable. Random forests and gradient boosting improve tree models in different ways: one averages many trees, the other adds trees sequentially to correct remaining errors.

Section 1

Single Tree Baseline

A single decision tree can fit nonlinear data, but it can also overfit, and its predictions can change substantially when the training data changes slightly. Establishing this baseline motivates why ensembles help.
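To make the baseline concrete, here is a minimal sketch of a 1D regression tree in plain Python, swept over several depths. The target function sin(x), the random seed, and every function name (`fit_tree`, `predict`, `make_data`, and so on) are assumptions chosen for illustration; the page does not state which function or implementation its demo uses. Only the sample count (60) and noise σ (0.40) come from the controls.

```python
import math
import random

def sse(ys):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def fit_tree(points, depth):
    """Greedy 1D regression tree: a float leaf, or (threshold, left, right)."""
    values = sorted({x for x, _ in points})
    if depth == 0 or len(values) < 2:
        return sum(y for _, y in points) / len(points)

    def cost(t):
        left = [y for x, y in points if x <= t]
        right = [y for x, y in points if x > t]
        return sse(left) + sse(right)

    # Candidate thresholds are observed x values (except the largest),
    # so both children are always non-empty.
    t = min(values[:-1], key=cost)
    return (t,
            fit_tree([p for p in points if p[0] <= t], depth - 1),
            fit_tree([p for p in points if p[0] > t], depth - 1))

def predict(tree, x):
    """Walk from the root to a leaf and return the leaf mean."""
    while isinstance(tree, tuple):
        t, left, right = tree
        tree = left if x <= t else right
    return tree

def count_leaves(tree):
    if not isinstance(tree, tuple):
        return 1
    return count_leaves(tree[1]) + count_leaves(tree[2])

def mse(tree, points):
    return sum((predict(tree, x) - y) ** 2 for x, y in points) / len(points)

def make_data(n, sigma, rng):
    """Noisy samples of sin(x) on [0, 6]; the target function is an assumption."""
    data = []
    for _ in range(n):
        x = rng.uniform(0, 6)
        data.append((x, math.sin(x) + rng.gauss(0, sigma)))
    return data

rng = random.Random(0)
train = make_data(60, 0.40, rng)
val = make_data(60, 0.40, rng)

# results[depth] = (train MSE, validation MSE, number of leaves)
results = {}
for depth in (1, 2, 4, 8):
    tree = fit_tree(train, depth)
    results[depth] = (mse(tree, train), mse(tree, val), count_leaves(tree))
    print(depth, results[depth])
```

Because each deeper tree refines the splits of the shallower one, training MSE can only fall as depth grows. Validation MSE carries no such guarantee, which is the overfitting behavior this section describes.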

Figure: Tree Prediction (1D Regression)
Legend: true function, tree prediction (step), training data, validation data
Controls: sample count = 60, noise σ = 0.40, max depth = 4
Readouts: train MSE, val MSE, leaves

Move the depth slider to see the trade-off: deeper trees are more flexible but have higher variance.

Section 2 · Bagging

Random Forest Builder

Random forest trains many trees independently on bootstrapped samples and averages their predictions. The averaging reduces variance and usually makes the result more robust than a single tree.
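The procedure described above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the page's implementation: the sin(x) target, the seed, and all function names are hypothetical. With a single input feature there is nothing to subsample at each split, so the sketch is plain bagging and omits the per-split feature subsampling that random forests use on multi-feature data.

```python
import math
import random

def sse(ys):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def fit_tree(points, depth):
    """Greedy 1D regression tree: a float leaf, or (threshold, left, right)."""
    values = sorted({x for x, _ in points})
    if depth == 0 or len(values) < 2:
        return sum(y for _, y in points) / len(points)

    def cost(t):
        left = [y for x, y in points if x <= t]
        right = [y for x, y in points if x > t]
        return sse(left) + sse(right)

    t = min(values[:-1], key=cost)  # both children stay non-empty
    return (t,
            fit_tree([p for p in points if p[0] <= t], depth - 1),
            fit_tree([p for p in points if p[0] > t], depth - 1))

def predict(tree, x):
    while isinstance(tree, tuple):
        t, left, right = tree
        tree = left if x <= t else right
    return tree

def fit_forest(points, n_trees, depth, rng):
    """Train each tree independently on its own bootstrap sample."""
    forest = []
    for _ in range(n_trees):
        # Bootstrap: draw len(points) examples with replacement.
        boot = [rng.choice(points) for _ in points]
        forest.append(fit_tree(boot, depth))
    return forest

def predict_forest(forest, x):
    """Average the predictions of all trees."""
    return sum(predict(tree, x) for tree in forest) / len(forest)

def mse(predict_fn, points):
    return sum((predict_fn(x) - y) ** 2 for x, y in points) / len(points)

def make_data(n, sigma, rng):
    """Noisy samples of sin(x) on [0, 6]; the target function is an assumption."""
    data = []
    for _ in range(n):
        x = rng.uniform(0, 6)
        data.append((x, math.sin(x) + rng.gauss(0, sigma)))
    return data

rng = random.Random(0)
train = make_data(60, 0.40, rng)
val = make_data(60, 0.40, rng)
forest = fit_forest(train, n_trees=15, depth=4, rng=rng)

tree_val = [mse(lambda x, t=t: predict(t, x), val) for t in forest]
forest_val = mse(lambda x: predict_forest(forest, x), val)
print("mean single-tree val MSE:", sum(tree_val) / len(tree_val))
print("forest val MSE:          ", forest_val)
```

Since squared error is convex, the averaged prediction's MSE can never exceed the mean MSE of the individual trees on the same data. How much lower it lands depends on how much the trees disagree with each other.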

Figure: Forest Average vs Individual Trees
Legend: true function, individual trees, forest average, training data
Bagging pipeline: bootstrap sample → train tree independently → repeat for B trees → average predictions
Controls: number of trees = 15, tree depth = 4
Readouts: trees, avg leaves, train MSE, val MSE

More trees → lower variance of the average prediction.
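A standard variance identity makes this precise. The symbols are assumptions introduced here rather than taken from the page: T_b(x) is the prediction of tree b, s^2 is the variance of a single tree's prediction, and ρ ≥ 0 is the pairwise correlation between trees, with all trees treated as identically distributed.

```latex
\operatorname{Var}\!\left(\frac{1}{B}\sum_{b=1}^{B} T_b(x)\right)
  = \rho\, s^{2} + \frac{1-\rho}{B}\, s^{2}
```

The second term shrinks as B grows, while the first does not. Adding trees therefore gives diminishing returns, and the remaining variance is set by how correlated the trees are.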

Figure: Tree gallery. Different bootstrap samples produce different trees.

Section 3 · Boosting

Gradient Boosting Builder

Gradient boosting builds trees sequentially. Each new tree learns a correction to the current model's residual error, so the final prediction is a sum of many small refinements.
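The sequential procedure can be sketched in plain Python for squared-error loss, where each new tree is fit to the residuals y − F(x). This is a sketch under assumptions: starting from the constant mean of y as the stage-0 model is a common choice but is not stated on the page, and the sin(x) target, seed, and function names are hypothetical. The stage count, learning rate, and weak-learner depth mirror the default control values.

```python
import math
import random

def sse(ys):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def fit_tree(points, depth):
    """Greedy 1D regression tree: a float leaf, or (threshold, left, right)."""
    values = sorted({x for x, _ in points})
    if depth == 0 or len(values) < 2:
        return sum(y for _, y in points) / len(points)

    def cost(t):
        left = [y for x, y in points if x <= t]
        right = [y for x, y in points if x > t]
        return sse(left) + sse(right)

    t = min(values[:-1], key=cost)  # both children stay non-empty
    return (t,
            fit_tree([p for p in points if p[0] <= t], depth - 1),
            fit_tree([p for p in points if p[0] > t], depth - 1))

def predict(tree, x):
    while isinstance(tree, tuple):
        t, left, right = tree
        tree = left if x <= t else right
    return tree

def mse(preds, points):
    return sum((f - y) ** 2 for f, (_, y) in zip(preds, points)) / len(points)

def fit_boosting(points, n_stages, learning_rate, depth):
    """Return (base, trees, history); history[m] is train MSE after m trees."""
    base = sum(y for _, y in points) / len(points)  # stage 0: constant mean (assumption)
    current = [base] * len(points)
    trees = []
    history = [mse(current, points)]
    for _ in range(n_stages):
        # Residual = y - F(x) under the current ensemble.
        residuals = [(x, y - f) for (x, y), f in zip(points, current)]
        tree = fit_tree(residuals, depth)
        trees.append(tree)
        # Add a scaled correction; earlier trees are left unchanged.
        current = [f + learning_rate * predict(tree, x)
                   for (x, _), f in zip(points, current)]
        history.append(mse(current, points))
    return base, trees, history

def predict_boosted(base, trees, learning_rate, x):
    """Final prediction: base value plus the sum of scaled corrections."""
    return base + learning_rate * sum(predict(tree, x) for tree in trees)

def make_data(n, sigma, rng):
    """Noisy samples of sin(x) on [0, 6]; the target function is an assumption."""
    data = []
    for _ in range(n):
        x = rng.uniform(0, 6)
        data.append((x, math.sin(x) + rng.gauss(0, sigma)))
    return data

rng = random.Random(0)
train = make_data(60, 0.40, rng)
base, trees, history = fit_boosting(train, n_stages=10, learning_rate=0.30, depth=2)
for stage, err in enumerate(history):
    print(stage, err)
```

With squared-error loss and a learning rate between 0 and 1, each stage can only lower the training MSE, so `history` is non-increasing. Validation error is not tracked here and can rise once the ensemble starts fitting noise.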

Figure: Cumulative ensemble prediction
Legend: true function, boosted prediction, training data
Figure: Residuals + next weak learner
Residual = y − F(x). The next tree is fit to these residuals. Each new tree does not replace the model; it adds a correction to the mistakes that remain.
Controls: boosting stages = 10, learning rate = 0.30, weak-learner depth = 2, current stage = 0
Readouts: stage (0), trees so far (0), train MSE, val MSE

Higher learning rate → larger jumps per stage; lower → slower, smoother improvement.
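A short worked equation shows the effect of the learning rate. The notation is an assumption introduced here: F_m is the ensemble after m stages, h_m is the tree added at stage m, ν is the learning rate, and r_i^{(m)} is the residual of training point i before stage m. The last line idealizes by supposing each tree reproduces the residuals exactly on the training points, which real shallow trees do not.

```latex
r_i^{(m)} = y_i - F_{m-1}(x_i), \qquad
F_m(x) = F_{m-1}(x) + \nu\, h_m(x)

\text{If } h_m(x_i) = r_i^{(m)} \text{ exactly: } \quad
r_i^{(m+1)} = (1-\nu)\, r_i^{(m)}
\;\Longrightarrow\;
r_i^{(M+1)} = (1-\nu)^{M}\, r_i^{(1)}
```

With ν = 0.30 and M = 10 stages, the idealized residual shrinks by a factor of 0.7^10 ≈ 0.028. A smaller ν shrinks it more slowly per stage, which is why lower learning rates usually need more stages.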

Figure: Ensemble build stack

Section 4

Side-by-Side Comparison

On the same training data, compare a single tree against a random forest and a gradient boosting model. The shapes of their predictions reveal what each method is doing.

Figure: Shared prediction plot
Legend: true function, single tree, random forest, gradient boosting, training data
Controls: sample count = 70, noise σ = 0.45, outliers = 0
Random forest controls: # trees = 25, depth = 4
Gradient boosting controls: stages = 25, learning rate = 0.30, depth = 2
Random Forest
  • Parallel trees, trained independently
  • Bootstrap samples create diverse training subsets
  • Final prediction is the average across trees
  • Mainly fights variance & instability
Readouts: train MSE, val MSE, # trees, avg depth

Averaging smooths out individual tree noise.

Gradient Boosting
  • Sequential trees built in order
  • Each new tree fits the residual error
  • Final prediction is an additive sum of corrections
  • Mainly reduces bias by improving the fit step by step
Readouts: train MSE, val MSE, # stages, stage depth

Each stage chips away at the remaining error.

Section 5

Key Differences Summary

A concise side-by-side summary for after you have explored both ensembles.

Concept          | Random Forest (bagging)          | Gradient Boosting (boosting)
Build style      | Parallel                         | Sequential
Data usage       | Bootstrap resampling             | Residual fitting on the same training set
Combination      | Average predictions              | Add scaled corrections
Main effect      | Reduce variance                  | Reduce residual error step by step
Typical weakness | Larger model, less interpretable | Can overfit if too aggressive
Key control      | Number / depth of trees          | Learning rate + number of stages

One-line takeaway. Random forest improves stability by averaging many imperfect models. Gradient boosting improves fit by repeatedly correcting mistakes.
