%matplotlib inline
An implementation of Ordinary Least Squares simple and multiple linear regression.
from mlxtend.regressor import LinearRegression
Illustration of a simple linear regression model:
In Ordinary Least Squares (OLS) Linear Regression, our goal is to find the line (or hyperplane) that minimizes the vertical offsets. In other words, we define the best-fitting line as the line that minimizes the sum of squared errors (SSE) or mean squared error (MSE) between our target variable (y) and our predicted output over all samples $i$ in our dataset of size $n$:
$$SSE = \sum_i \big(\text{target}^{(i)} - \text{output}^{(i)}\big)^2$$

$$MSE = \frac{1}{n} \times SSE$$

Now, `LinearRegression` implements a linear regression model for performing ordinary least squares regression using one of the following three approaches:

- Normal Equations (closed-form solution)
- Gradient Descent (GD)
- Stochastic Gradient Descent (SGD)
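As a quick numeric illustration of the two error measures before turning to the solvers, here is a minimal NumPy sketch; the target and output values are made up for this example:

import numpy as np

target = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # true target values
output = np.array([1.1, 1.9, 3.2, 3.9, 5.1])   # hypothetical model predictions

sse = np.sum((target - output) ** 2)   # sum of squared errors
mse = sse / target.shape[0]            # mean squared error
print('SSE: %.3f, MSE: %.3f' % (sse, mse))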
The closed-form solution should be preferred for "smaller" datasets where calculating (a "costly") matrix inverse is not a concern. For very large datasets, or datasets where the inverse of $[X^T X]$ may not exist (the matrix is non-invertible or singular, e.g., in case of perfect multicollinearity), the gradient descent or stochastic gradient descent approaches are to be preferred.
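The singular case is easy to demonstrate; a minimal NumPy sketch with a made-up, perfectly collinear feature matrix:

import numpy as np

X = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])                       # 2nd feature = 2 * 1st: perfect multicollinearity
Xb = np.hstack((np.ones((X.shape[0], 1)), X))    # prepend the bias column x_0 = 1
try:
    np.linalg.inv(Xb.T @ Xb)                     # (X^T X)^{-1} does not exist here
except np.linalg.LinAlgError as err:
    print(err)                                   # "Singular matrix"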
The linear function (linear regression model) is defined as:
$$y = w_0x_0 + w_1x_1 + \dots + w_mx_m = \sum_{j=0}^{m} w_j x_j = \mathbf{w}^T\mathbf{x}$$

where $y$ is the response variable, $\mathbf{x}$ is an $m$-dimensional sample vector, and $\mathbf{w}$ is the weight vector (vector of coefficients). Note that $w_0$ represents the y-axis intercept of the model and therefore $x_0=1$.
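With the bias input fixed at $x_0 = 1$, a prediction is just a dot product; a minimal sketch (the weights here are borrowed from the closed-form fit shown further below):

import numpy as np

w = np.array([0.25, 0.81])   # [w_0 (intercept), w_1 (slope)]
x = np.array([1.0, 3.6])     # [x_0 = 1 (bias input), x_1]
print('Prediction: %.2f' % np.dot(w, x))   # w^T x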
Using the closed-form solution (normal equation), we compute the weights of the model as follows:
$$\mathbf{w} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$$

For the gradient descent (GD) and stochastic gradient descent (SGD) solvers, random shuffling is implemented as:

- for one or more epochs
    - randomly shuffle the samples in the training set
        - for training sample $i$: compute the gradients and perform the weight updates
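Returning to the closed-form solution: as a quick sanity check, here is a minimal NumPy sketch that applies the normal equation directly to the toy dataset used in the examples below (in practice, `np.linalg.pinv` or `np.linalg.lstsq` is numerically safer than an explicit inverse):

import numpy as np

X = np.array([1.0, 2.1, 3.6, 4.2, 6.0])[:, np.newaxis]
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

Xb = np.hstack((np.ones((X.shape[0], 1)), X))   # prepend the bias column x_0 = 1
w = np.linalg.inv(Xb.T @ Xb) @ (Xb.T @ y)       # (X^T X)^{-1} X^T y
print('Intercept: %.2f' % w[0])
print('Slope: %.2f' % w[1])

The same fit via `LinearRegression` with `minibatches=None` (closed-form solver):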
import numpy as np
import matplotlib.pyplot as plt
from mlxtend.regressor import LinearRegression
X = np.array([ 1.0, 2.1, 3.6, 4.2, 6])[:, np.newaxis]
y = np.array([ 1.0, 2.0, 3.0, 4.0, 5.0])
ne_lr = LinearRegression(minibatches=None)
ne_lr.fit(X, y)
print('Intercept: %.2f' % ne_lr.b_)
print('Slope: %.2f' % ne_lr.w_[0])
def lin_regplot(X, y, model):
    plt.scatter(X, y, c='blue')
    plt.plot(X, model.predict(X), color='red')
    return
lin_regplot(X, y, ne_lr)
plt.show()
Intercept: 0.25
Slope: 0.81
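The fitted model can then score new, unseen inputs via `predict` (see the API reference at the end of this page):

ne_lr.predict(np.array([[5.0]]))   # approx. 0.25 + 0.81 * 5

Next, the same data fit with plain gradient descent (`minibatches=1`):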
import numpy as np
import matplotlib.pyplot as plt
from mlxtend.regressor import LinearRegression
X = np.array([ 1.0, 2.1, 3.6, 4.2, 6])[:, np.newaxis]
y = np.array([ 1.0, 2.0, 3.0, 4.0, 5.0])
gd_lr = LinearRegression(eta=0.005,
                         epochs=100,
                         minibatches=1,
                         random_seed=123,
                         print_progress=3)
gd_lr.fit(X, y)
print('Intercept: %.2f' % gd_lr.b_)
print('Slope: %.2f' % gd_lr.w_[0])
def lin_regplot(X, y, model):
    plt.scatter(X, y, c='blue')
    plt.plot(X, model.predict(X), color='red')
    return
lin_regplot(X, y, gd_lr)
plt.show()
Iteration: 100/100 | Cost 0.08 | Elapsed: 0:00:00 | ETA: 0:00:00
Intercept: 0.22
Slope: 0.82
# Visualizing the cost to check for convergence:
plt.plot(range(1, gd_lr.epochs+1), gd_lr.cost_)
plt.xlabel('Epochs')
plt.ylabel('Cost')
plt.ylim([0, 0.2])
plt.tight_layout()
plt.show()
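If the cost curve suggests the solver has not fully converged, `fit` can resume from the current weights by passing `init_params=False` (documented in the API section at the end of this page):

gd_lr.fit(X, y, init_params=False)   # continue training from the previous weights
print('Intercept: %.2f' % gd_lr.b_)
print('Slope: %.2f' % gd_lr.w_[0])

Stochastic gradient descent uses `minibatches=len(y)`, i.e., one weight update per training sample: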
import numpy as np
import matplotlib.pyplot as plt
from mlxtend.regressor import LinearRegression
X = np.array([ 1.0, 2.1, 3.6, 4.2, 6])[:, np.newaxis]
y = np.array([ 1.0, 2.0, 3.0, 4.0, 5.0])
sgd_lr = LinearRegression(eta=0.01,
                          epochs=100,
                          random_seed=0,
                          minibatches=len(y))
sgd_lr.fit(X, y)
print('Intercept: %.2f' % sgd_lr.b_)
print('Slope: %.2f' % sgd_lr.w_[0])
def lin_regplot(X, y, model):
    plt.scatter(X, y, c='blue')
    plt.plot(X, model.predict(X), color='red')
    return
lin_regplot(X, y, sgd_lr)
plt.show()
Intercept: 0.24
Slope: 0.82
plt.plot(range(1, sgd_lr.epochs+1), sgd_lr.cost_)
plt.xlabel('Epochs')
plt.ylabel('Cost')
plt.ylim([0, 0.2])
plt.tight_layout()
plt.show()
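Gradient-based solvers are sensitive to feature scaling, so it is common to standardize the inputs before fitting; a minimal sketch (the `X_std` preprocessing and the `sgd_lr_std` name are illustrative, and the learned coefficients then refer to the standardized scale):

X_std = (X - X.mean(axis=0)) / X.std(axis=0)   # zero mean, unit variance per feature
sgd_lr_std = LinearRegression(eta=0.01,
                              epochs=100,
                              random_seed=0,
                              minibatches=len(y))
sgd_lr_std.fit(X_std, y)
print('Intercept: %.2f' % sgd_lr_std.b_)
print('Slope: %.2f' % sgd_lr_std.w_[0])

Finally, setting `1 < minibatches < len(y)` performs minibatch gradient descent; the example below uses `minibatches=3`: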
import numpy as np
import matplotlib.pyplot as plt
from mlxtend.regressor import LinearRegression
X = np.array([ 1.0, 2.1, 3.6, 4.2, 6])[:, np.newaxis]
y = np.array([ 1.0, 2.0, 3.0, 4.0, 5.0])
sgd_lr = LinearRegression(eta=0.01,
                          epochs=100,
                          random_seed=0,
                          minibatches=3)
sgd_lr.fit(X, y)
print('Intercept: %.2f' % sgd_lr.b_)
print('Slope: %.2f' % sgd_lr.w_[0])
def lin_regplot(X, y, model):
    plt.scatter(X, y, c='blue')
    plt.plot(X, model.predict(X), color='red')
    return
lin_regplot(X, y, sgd_lr)
plt.show()
Intercept: 0.24
Slope: 0.82
plt.plot(range(1, sgd_lr.epochs+1), sgd_lr.cost_)
plt.xlabel('Epochs')
plt.ylabel('Cost')
plt.ylim([0, 0.2])
plt.tight_layout()
plt.show()
with open('../../api_modules/mlxtend.regressor/LinearRegression.md', 'r') as f:
    print(f.read())
## LinearRegression

*LinearRegression(eta=0.01, epochs=50, minibatches=None, random_seed=None, print_progress=0)*

Ordinary least squares linear regression.

**Parameters**

- `eta` : float (default: 0.01)

    Solver rate (between 0.0 and 1.0)

- `epochs` : int (default: 50)

    Passes over the training dataset.
    Prior to each epoch, the dataset is shuffled
    if `minibatches > 1` to prevent cycles in stochastic gradient descent.

- `minibatches` : int (default: None)

    The number of minibatches for gradient-based optimization.
    If None: Normal Equations (closed-form solution)
    If 1: Gradient Descent learning
    If len(y): Stochastic Gradient Descent learning
    If 1 < minibatches < len(y): Minibatch learning

- `random_seed` : int (default: None)

    Set random state for shuffling and initializing the weights.

- `print_progress` : int (default: 0)

    Prints progress in fitting to stderr if not solver='normal equation'
    0: No output
    1: Epochs elapsed and cost
    2: 1 plus time elapsed
    3: 2 plus estimated time until completion

**Attributes**

- `w_` : 2d-array, shape={n_features, 1}

    Model weights after fitting.

- `b_` : 1d-array, shape={1,}

    Bias unit after fitting.

- `cost_` : list

    Sum of squared errors after each epoch;
    ignored if solver='normal equation'

### Methods

<hr>

*fit(X, y, init_params=True)*

Learn model from training data.

**Parameters**

- `X` : {array-like, sparse matrix}, shape = [n_samples, n_features]

    Training vectors, where n_samples is the number of samples and
    n_features is the number of features.

- `y` : array-like, shape = [n_samples]

    Target values.

- `init_params` : bool (default: True)

    Re-initializes model parameters prior to fitting.
    Set False to continue training with weights from
    a previous model fitting.

**Returns**

- `self` : object

<hr>

*predict(X)*

Predict targets from X.

**Parameters**

- `X` : {array-like, sparse matrix}, shape = [n_samples, n_features]

    Training vectors, where n_samples is the number of samples and
    n_features is the number of features.

**Returns**

- `target_values` : array-like, shape = [n_samples]

    Predicted target values.