Perform Simple Linear Regression using scikit-learn

Simple linear regression is a statistical method for analyzing the relationship between two continuous variables: an independent variable and a dependent variable. It has many practical applications, for example analyzing the relationship between a person's height and weight, or between the price of a product and the number of units sold. This tutorial shows how to perform simple linear regression using scikit-learn.
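In equation form, the method can be sketched as follows. This assumes the standard ordinary least squares formulation, which the tutorial itself does not spell out; the symbols \beta_0 (intercept), \beta_1 (slope), \varepsilon (error term), and the bars denoting sample means are notation introduced here for illustration.

```latex
y = \beta_0 + \beta_1 x + \varepsilon,
\qquad
\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
```

Fitting the model means estimating the slope and intercept from the data; predicting means plugging a new x into the fitted line.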

Prepare environment

  • Install the following package using pip:
pip install scikit-learn

Code

The following code trains a simple linear regression model and uses it to predict the value of y for a given value of x. The relationship between x and y is described by the formula y = 2 * x + 1.

The code first defines the data points used to train the model. The xs array contains the independent variable values, and the ys array contains the dependent variable values. Each x value has exactly one associated y value. The xs array is two-dimensional (one row per sample, one column per feature) because scikit-learn expects its input in that shape. After the data points are defined, we create a linear regression model and train it on the xs and ys arrays.

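import numpy as np
from sklearn.linear_model import LinearRegression

# Training data: points that lie on the line y = 2 * x + 1.
# Each row is one sample; the single column is the single feature.
xs = np.array([[-2.0], [-1.0], [0.0], [1.0], [2.0], [3.0], [4.0]])
ys = np.array([[-3.0], [-1.0], [1.0], [3.0], [5.0], [7.0], [9.0]])

# Create the model and fit it to the training data.
model = LinearRegression()
model.fit(xs, ys)

# Predict y for a value of x that was not in the training data.
# predict() expects a 2-D input, hence the nested list.
x = 15.0
y = model.predict([[x]])
print(y[0])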

Once the model has been trained, we use it to predict y for a value of x that was not in the training data. In our case, for x = 15.0 the trained model predicts y = 31 (up to floating-point rounding), returned as a one-element array. The result can be verified against the formula:

y = 2 * x + 1 = 2 * 15 + 1 = 31
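Another way to check the result is to look at the parameters the model has learned. The sketch below refits the same data and reads the coef_ and intercept_ attributes of LinearRegression; the variable names slope and intercept are introduced here for illustration only. If the fit worked as expected, the values should be very close to 2 and 1.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same training data as above: points on the line y = 2 * x + 1.
xs = np.array([[-2.0], [-1.0], [0.0], [1.0], [2.0], [3.0], [4.0]])
ys = np.array([[-3.0], [-1.0], [1.0], [3.0], [5.0], [7.0], [9.0]])

model = LinearRegression()
model.fit(xs, ys)

# coef_ holds the learned slope and intercept_ the learned intercept.
# np.ravel flattens them, so this works whether ys is 1-D or 2-D.
slope = float(np.ravel(model.coef_)[0])
intercept = float(np.ravel(model.intercept_)[0])

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

Because the training points lie exactly on a line, the fitted parameters reproduce the formula almost exactly; with noisy real-world data they would only approximate the underlying relationship.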
