Simple linear regression is a statistical method for analyzing the relationship between two continuous variables: an independent variable and a dependent variable. It has many practical applications, for example analyzing the relationship between a person's height and weight, or between the price of a product and the number of units sold. This tutorial shows how to perform simple linear regression using scikit-learn.
Prepare environment
- Install the following package using pip:
pip install scikit-learn
Code
The following code uses a simple linear regression model to predict the value of y for a given value of x. The relationship between the x and y variables is described by the formula y = 2 * x + 1.
The code defines the data points that will be used to train the model. The xs array contains the independent variable values, and the ys array contains the dependent variable values. Each value in xs has exactly one associated value in ys. The xs array is two-dimensional, with one value per row, because scikit-learn expects one row per sample. After the data points are defined, we create a linear regression model and train it on the xs and ys arrays.
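import numpy as np
from sklearn.linear_model import LinearRegression

# Training data: each row is one sample generated from y = 2 * x + 1
xs = np.array([[-2.0], [-1.0], [0.0], [1.0], [2.0], [3.0], [4.0]])
ys = np.array([[-3.0], [-1.0], [1.0], [3.0], [5.0], [7.0], [9.0]])

# Create the model and fit it to the training data
model = LinearRegression()
model.fit(xs, ys)

# Predict y for a value of x that was not in the training data
x = 15.0
y = model.predict([[x]])
print(y[0])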
Once the model has been trained, we use it to predict the value of y for a previously unseen value of x. In our case, if x is 15.0, the trained model returns a y of 31. This can be verified with the formula:
y = 2 * x + 1 = 2 * 15 + 1 = 31
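As a further cross-check, the same line can be recovered without scikit-learn by applying the closed-form ordinary least-squares formulas for a single independent variable. The following is a minimal sketch in plain Python. The fit_line helper is a hypothetical name introduced for illustration, and the data is the tutorial's training data rewritten as flat lists. It illustrates the underlying calculation and is not a description of how scikit-learn computes the fit internally.

```python
# Training data from the tutorial, written as plain Python lists.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-3.0, -1.0, 1.0, 3.0, 5.0, 7.0, 9.0]


def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line.

    Hypothetical helper for illustration only.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x) ** 2)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(xs, ys)
prediction = slope * 15.0 + intercept
print(slope, intercept, prediction)
```

Because the training points lie exactly on the line y = 2 * x + 1, the slope and intercept should come out as 2 and 1. The fitted scikit-learn model exposes the corresponding values through its coef_ and intercept_ attributes, so the two results can be compared directly.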