
Polynomial Regression

Where to Use Polynomial Regression

Polynomial regression is useful when the relationship between your features and the target is curved or non-linear, but you still want a simple, interpretable model.

Common use cases:

  • Predicting growth that accelerates or decelerates (e.g., population, sales)
  • Modeling physical phenomena with curves (e.g., projectile motion)
  • Fitting data that looks like a curve, not a straight line

Why Use Polynomial Regression?

  • Flexibility: Can fit curved data better than linear regression
  • Simplicity: Still easy to understand and explain
  • Interpretability: You can see how each power of the feature affects the prediction

How to Use Polynomial Regression

  1. Prepare your data: Make sure your features (X) and target (y) are numbers.
  2. Transform features: Use PolynomialFeatures to add powers of your features (e.g., x, x², x³).
  3. Build a pipeline: Combine feature transformation and regression into one step.
  4. Train the model: Fit the pipeline to your training data.
  5. Make predictions: Use the pipeline to predict on new data.
  6. Evaluate: Check how well your model predicts using metrics like mean squared error (see the sketch after this list).
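
These steps map onto scikit-learn almost one-to-one. As a preview, here is a minimal end-to-end sketch; for brevity it evaluates on the training data, whereas a real workflow would score on a held-out test set.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Toy data: y is exactly a quadratic function of x
X = np.arange(1, 6).reshape(-1, 1)   # step 1: numeric features
y = X.ravel() ** 2                   # step 1: numeric target

# Steps 2-3: feature transformation + regression in one pipeline
model = make_pipeline(PolynomialFeatures(degree=2, include_bias=False),
                      LinearRegression())
model.fit(X, y)                        # step 4: train
preds = model.predict(X)               # step 5: predict
print(mean_squared_error(y, preds))    # step 6: evaluate (near 0 here)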

What are the Inputs and Outputs?

  • Input (X): Table of numbers (features). Each row is a sample, each column is a feature (see the shape check after this list).
  • Output (y): A single number for each sample (the value you want to predict).
  • Prediction: The model outputs a number for each input row, which is its guess for the target value.
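
In scikit-learn terms that means X is a 2-D array and y a 1-D array; a quick shape check makes the convention concrete:

import numpy as np

X = np.array([[1.0], [2.0], [3.0]])  # 3 samples, 1 feature -> shape (3, 1)
y = np.array([1.0, 4.0, 9.0])        # one target per sample -> shape (3,)
print(X.shape, y.shape)              # prints (3, 1) (3,)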

How Does Polynomial Regression Work?

Polynomial regression transforms your features by adding powers (like x², x³) and then fits a linear model to these new features. This lets the model fit curves, not just straight lines.
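
To make that concrete, here is a hand-rolled sketch of the same idea using plain NumPy least squares instead of scikit-learn; the explicit column of ones plays the role of the intercept that LinearRegression would otherwise fit for us.

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.0, 4.0, 9.0, 16.0, 25.0])  # y = x^2 exactly

# Build the polynomial design matrix [1, x, x^2], then solve an
# ordinary linear least-squares problem on those columns
X_design = np.column_stack([np.ones_like(x), x, x ** 2])
coef, *_ = np.linalg.lstsq(X_design, y, rcond=None)
print(coef)  # close to [0, 0, 1] because y is exactly x^2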


Step 1: Visualize the Problem

Suppose you have data that looks like a curve, not a straight line.

import numpy as np
import matplotlib.pyplot as plt

# Create some curved data
X = np.arange(1, 6).reshape(-1, 1)
y = np.array([1, 4, 9, 16, 25])  # y = x^2

plt.scatter(X, y)
plt.xlabel('X')
plt.ylabel('y')
plt.title('Curved Data Example')
plt.show()

Explanation:

  • This code creates and plots data where y = x², which is a curve.

Step 2: Transform Features for Polynomial Regression

To fit a curve, we need to add new features: x², x³, etc. PolynomialFeatures does this for us.

from sklearn.preprocessing import PolynomialFeatures

# 1. Create the transformer (degree=2 for x and x^2)
poly = PolynomialFeatures(degree=2, include_bias=False)

# 2. Transform the features
X_poly = poly.fit_transform(X)
print('Transformed features:\n', X_poly)

Explanation:

  • PolynomialFeatures(degree=2, include_bias=False): Adds x² as a new feature; include_bias=False drops the constant column, since LinearRegression fits its own intercept.
  • fit_transform(X): Makes a new table with both x and x² for each sample.
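
For the X from step 1 (values 1 through 5), the printed table should look like this, give or take formatting (each row holds x and x²):

[[ 1.  1.]
 [ 2.  4.]
 [ 3.  9.]
 [ 4. 16.]
 [ 5. 25.]]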

Step 3: Implement Polynomial Regression via Pipeline

A pipeline lets you combine the transformation and regression steps so you don't have to do them separately.

from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# 1. Make a pipeline: transform features, then fit regression
pipeline = make_pipeline(PolynomialFeatures(degree=2, include_bias=False),
                         LinearRegression())

# 2. Train the pipeline
pipeline.fit(X, y)

# 3. Make predictions
preds = pipeline.predict(X)
print('Predictions:', preds)

Explanation:

  • make_pipeline(...): Chains together the feature transformation and regression steps.
  • fit(X, y): Trains the whole pipeline.
  • predict(X): Makes predictions using the pipeline.
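
Once the pipeline is fitted, you can pull the learned weights back out to see how each power of x contributes; make_pipeline names its steps after the lowercased class names, so the regression step is 'linearregression'. Continuing from the code above:

# Inspect the linear model inside the fitted pipeline
lin = pipeline.named_steps['linearregression']
print('Coefficients:', lin.coef_)    # weights for x and x^2, close to [0, 1]
print('Intercept:', lin.intercept_)  # constant term, close to 0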

Step 4: Visualize the Fitted Curve

Let's see how well the model fits the data.

plt.scatter(X, y, label='Original Data')
plt.plot(X, preds, color='red', label='Polynomial Fit')
plt.xlabel('X')
plt.ylabel('y')
plt.title('Polynomial Regression Fit')
plt.legend()
plt.show()

Explanation:

  • This code plots the original data and the curve predicted by the model.
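
With only five training points the red line looks angular. A common touch-up, which also demonstrates prediction on genuinely new inputs, is to reuse the pipeline from step 3 on a dense grid:

# Predict on a dense grid of new x values for a smoother curve
X_new = np.linspace(0, 6, 100).reshape(-1, 1)

plt.scatter(X, y, label='Original Data')
plt.plot(X_new, pipeline.predict(X_new), color='red', label='Polynomial Fit')
plt.xlabel('X')
plt.ylabel('y')
plt.title('Polynomial Regression Fit (Dense Grid)')
plt.legend()
plt.show()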

Visualizing the Polynomial Regression Process