Concise example of a basic linear regression in Julia, using GLM and RDatasets.
using RDatasets, GLM
# List the datasets that RDatasets provides from the "ISLR" R package (many other packages are available)
RDatasets.datasets("ISLR")
11 rows × 5 columns
| Row | Package | Dataset | Title | Rows | Columns |
|---|---|---|---|---|---|
| | String15 | String31 | String | Int64 | Int64 |
| 1 | ISLR | Auto | Auto Data Set | 392 | 9 |
| 2 | ISLR | Caravan | The Insurance Company (TIC) Benchmark | 5822 | 86 |
| 3 | ISLR | Carseats | Sales of Child Car Seats | 400 | 11 |
| 4 | ISLR | College | U.S. News and World Report's College Data | 777 | 19 |
| 5 | ISLR | Default | Credit Card Default Data | 10000 | 4 |
| 6 | ISLR | Hitters | Baseball Data | 322 | 20 |
| 7 | ISLR | OJ | Orange Juice Data | 1070 | 18 |
| 8 | ISLR | Portfolio | Portfolio Data | 100 | 2 |
| 9 | ISLR | Smarket | S&P Stock Market Data | 1250 | 9 |
| 10 | ISLR | Wage | Mid-Atlantic Wage Data | 3000 | 12 |
| 11 | ISLR | Weekly | Weekly S&P Stock Market Data | 1089 | 9 |
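The catalog returned above is an ordinary DataFrame, so it can be filtered like any other table. The following is a minimal sketch, assuming the column names shown in the listing (`Package`, `Dataset`, `Rows`, `Columns`); the 1000-row threshold is arbitrary and chosen only for illustration.

```julia
using RDatasets, DataFrames

# The ISLR catalog is a plain DataFrame
catalog = RDatasets.datasets("ISLR")

# Keep only the larger datasets (threshold is an arbitrary illustration)
large = filter(row -> row.Rows > 1000, catalog)
println(large[:, [:Dataset, :Rows, :Columns]])

# Calling datasets() with no argument lists every package in the catalog
all_packages = unique(RDatasets.datasets().Package)
println(length(all_packages), " R packages available")
```

Calling `datasets()` without an argument is a quick way to see which other R packages are bundled before choosing one.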
# RDatasets has a dataset() function that reads a dataset from the catalog into a DataFrame
auto = dataset("ISLR", "Auto");
first(auto, 5)
5 rows × 9 columns (omitted printing of 1 columns)
| Row | MPG | Cylinders | Displacement | Horsepower | Weight | Acceleration | Year | Origin |
|---|---|---|---|---|---|---|---|---|
| | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 |
| 1 | 18.0 | 8.0 | 307.0 | 130.0 | 3504.0 | 12.0 | 70.0 | 1.0 |
| 2 | 15.0 | 8.0 | 350.0 | 165.0 | 3693.0 | 11.5 | 70.0 | 1.0 |
| 3 | 18.0 | 8.0 | 318.0 | 150.0 | 3436.0 | 11.0 | 70.0 | 1.0 |
| 4 | 16.0 | 8.0 | 304.0 | 150.0 | 3433.0 | 12.0 | 70.0 | 1.0 |
| 5 | 17.0 | 8.0 | 302.0 | 140.0 | 3449.0 | 10.5 | 70.0 | 1.0 |
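Before fitting a model it can help to check the table's shape and get a per-column summary. This is a small sketch using standard DataFrames functions (`size`, `names`, `describe`); it reloads the data so that it runs on its own.

```julia
using RDatasets, DataFrames

auto = dataset("ISLR", "Auto")

# Number of rows and columns; should agree with the catalog entry for Auto
println(size(auto))

# Column names, as strings
println(names(auto))

# One summary row per column: mean, min, median, max, missing count, element type
summary_table = describe(auto)
println(summary_table)
```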
# Fit the model with GLM's lm(); inside @formula the column names are written bare, not as Strings or Symbols
ols = lm(@formula(MPG ~ Cylinders + Displacement), auto)
StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}}}}, Matrix{Float64}}

MPG ~ 1 + Cylinders + Displacement

Coefficients:
─────────────────────────────────────────────────────────────────────────────
                   Coef.  Std. Error      t  Pr(>|t|)   Lower 95%   Upper 95%
─────────────────────────────────────────────────────────────────────────────
(Intercept)   36.5377     1.19661     30.53    <1e-99  34.1851     38.8903
Cylinders     -0.576348   0.443276    -1.30    0.1943  -1.44786     0.295169
Displacement  -0.0511185  0.00722576  -7.07    <1e-11  -0.0653249  -0.0369121
─────────────────────────────────────────────────────────────────────────────
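The fitted model can also be queried programmatically rather than read off the printed table. The sketch below refits the same formula and uses GLM's `coef`, `stderror`, `r2`, and `predict`; the new car's values (4 cylinders, displacement 120) are hypothetical and chosen only for illustration.

```julia
using RDatasets, GLM, DataFrames

auto = dataset("ISLR", "Auto")
ols = lm(@formula(MPG ~ Cylinders + Displacement), auto)

# Estimates in the same order as the printed table:
# intercept, Cylinders, Displacement
b = coef(ols)
se = stderror(ols)
println("coefficients:    ", b)
println("standard errors: ", se)

# Share of the variance in MPG explained by the model
println("R² = ", r2(ols))

# Predict MPG for a hypothetical car (illustrative values, not from the dataset)
newcar = DataFrame(Cylinders = [4.0], Displacement = [120.0])
yhat = predict(ols, newcar)

# The prediction is just the linear combination of the coefficients
manual = b[1] + b[2] * 4.0 + b[3] * 120.0
println("predict: ", yhat[1], "  manual: ", manual)
```

Comparing `predict` against the hand-computed linear combination is a cheap sanity check that the coefficient order is what you expect.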