This tutorial provides a comprehensive, step-by-step example of polynomial regression using NumPy, with Matplotlib for visualizing the results.
import numpy as np
import matplotlib.pyplot as plt
Set a random seed
np.random.seed(12)
Set the number of data points N
and the degree D
of the polynomial that is going to fit to the data.
Note: for D = 1
this is linear regression.
N = 100 # number of points
D = 2 # degree of polynomial
To generate some random data that is suitable for polynomial regression we're going to use the following functions:
np.random.uniform
draws samples from a uniform distributionnp.random.sample
returns random floats in the half-open interval $[0.0, 1.0)$.np.random.normal
draws random samples from a normal (Gaussian) distribution.np.polyval
evaluates a polynomial (defined by its coefficients) at the given pointsWe are first going to generate a set of $x$ values taken at random in a given range ($[-N, N[$).
The array c
is used to store the coefficients of a random polynomial. To generate a random polynomial of degree $D$ we're going to generate $D$ random coefficients and set the coefficient of the leading term to $1$.
x = np.random.uniform(-N , N, N) # x values are uniformly distributed between -100 and +100
c = 200*np.random.sample((D,))-100 # random polynomial coefficients drawn from the interval [-100, 100)
c = np.insert(c, 0, 1) # first coeff. of the random poly is 1
y = np.polyval([1] + c, x)
noise = np.random.normal(-0.5*np.std(y), 0.5*np.std(y), N) # some normally distributed noise
y += noise # add noise to values
plt.scatter(x, y)
plt.show()
t = np.linspace(-N, N, num=N)
p = np.poly1d(np.polyfit(x, y, D))
plt.plot(x, y, 'o', t, p(t), '-')
plt.show()
Show the coefficients of the approximating polynomial p
.
np.set_printoptions(formatter={'float': '{: 12.6f}'.format})
print("Coefficients of p: {}".format(p.c))
print("Original coefficients: {}".format(c))
Coefficients of p: [ 1.782148 -21.511448 -3135.140655] Original coefficients: [ 1.000000 -22.956604 19.148379]
Visualize the original polynomial p_orig
and its approximation p
obtained with polynomial regression.
p_orig = np.poly1d(c)
plt.plot(x, y, 'o', t, p(t), '-', t, p_orig(t), '-')
plt.show()