**Binary classification** is the task of assigning each data point to one of two classes. Examples include predicting whether a customer will buy a product, whether an email is spam, or whether a patient has a certain disease.

Let’s say we have a set of labeled points:

| f1 | 3.2 | 6.3 | 6.2 | 3.3 | 3.1 | 5.6 | 3.9 | 5.5 |
|---|---|---|---|---|---|---|---|---|
| f2 | 5.5 | 5.9 | 5.7 | 5.6 | 5.4 | 5.8 | 5.5 | 5.7 |
| label | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 0 |

Each point has two features, `f1` and `f2`, and one of two class labels. Label 0 represents the first class, and label 1 represents the second class.

And we have some unlabeled points:

| f1 | 4.3 | 5.3 | 5.0 | 4.2 |
|---|---|---|---|---|
| f2 | 5.4 | 5.9 | 5.8 | 5.6 |

A trained model should predict whether a new point should be labeled 0 or 1.

This tutorial provides an example of how to create and train a model that classifies data points into one of two classes. We will use TensorFlow 2.

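Install the `tensorflow` package from the command line using the `pip` package manager. The plotting code in this tutorial also needs `matplotlib`, which can be installed the same way.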

`pip install tensorflow`

To better understand the data, we display a scatter plot in which each data point is represented by a dot. We declare arrays of features (`trainF1`, `trainF2`) and labels (`trainLabels`) for training the model. The unlabeled points used for testing are declared as `testF1` and `testF2`.

```
import numpy as np
import matplotlib.pyplot as plt
trainF1 = np.array([3.2, 6.3, 6.2, 3.3, 3.1, 5.6, 3.9, 5.5], dtype=float)
trainF2 = np.array([5.5, 5.9, 5.7, 5.6, 5.4, 5.8, 5.5, 5.7], dtype=float)
trainLabels = np.array([1, 0, 0, 1, 1, 0, 1, 0], dtype=float)
testF1 = np.array([4.3, 5.3, 5.0, 4.2], dtype=float)
testF2 = np.array([5.4, 5.9, 5.8, 5.6], dtype=float)
colors = ['red', 'green']
colorsList = [colors[int(label)] for label in trainLabels]
plt.scatter(trainF1, trainF2, c=colorsList)
plt.scatter(testF1, testF2, c='black', marker='x')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.grid()
plt.show()
```

Label 0 is represented by red dots and label 1 by green dots. The unlabeled points are represented by black crosses.

We build a model with three `Dense` layers. We have two features, so the model has two inputs. The model has one output.

The inputs for the model should be presented in a single array, so the `trainF1` and `trainF2` arrays are joined along a new axis using the `np.stack` function.
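As a minimal sketch of what stacking along a new axis does, the snippet below uses only the first three training points from the table above. The names `f1`, `f2`, and `features` are illustrative and are not part of the tutorial's scripts.

```python
import numpy as np

# First three training points from the table above (illustrative subset).
f1 = np.array([3.2, 6.3, 6.2], dtype=float)
f2 = np.array([5.5, 5.9, 5.7], dtype=float)

# axis=1 pairs the values point by point:
# one row per point, one column per feature.
features = np.stack((f1, f2), axis=1)
print(features.shape)  # (3, 2)
print(features[0])     # [3.2 5.5]

# axis=0 would instead give one row per feature, which is not
# the layout a model with input_shape=(2,) expects.
print(np.stack((f1, f2), axis=0).shape)  # (2, 3)
```

Each row of `features` is one data point, which matches a model that takes two inputs per sample.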

The model is compiled with the binary cross-entropy loss function and the Adam optimizer. We train the model for 300 epochs. After training, the model is saved in the `HDF5` format.

```
from tensorflow import keras
import numpy as np
trainF1 = np.array([3.2, 6.3, 6.2, 3.3, 3.1, 5.6, 3.9, 5.5], dtype=float)
trainF2 = np.array([5.5, 5.9, 5.7, 5.6, 5.4, 5.8, 5.5, 5.7], dtype=float)
trainFeatures = np.stack((trainF1, trainF2), 1)
trainLabels = np.array([1, 0, 0, 1, 1, 0, 1, 0], dtype=float)
model = keras.Sequential([
    keras.layers.Dense(64, activation='relu', input_shape=(2,)),
    keras.layers.Dense(32, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam', loss='binary_crossentropy')
model.fit(trainFeatures, trainLabels, epochs=300)
model.save('model.h5')
```
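To give a feel for what the binary cross-entropy loss measures, here is a small hand-written sketch in plain Python. It is not the Keras implementation: the function name `binary_cross_entropy`, the clipping constant `eps`, and the example predictions are all assumptions made for illustration.

```python
import math

def binary_cross_entropy(labels, predictions, eps=1e-7):
    """Mean binary cross-entropy over all points.

    eps is an assumed clipping constant that keeps log() away from zero.
    """
    total = 0.0
    for y, p in zip(labels, predictions):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)

labels = [1, 0, 0, 1]

# Hypothetical model outputs: one confident set, one hesitant set.
confident = [0.9, 0.1, 0.2, 0.8]
unsure = [0.6, 0.4, 0.5, 0.5]

loss_confident = binary_cross_entropy(labels, confident)
loss_unsure = binary_cross_entropy(labels, unsure)
print(loss_confident < loss_unsure)  # True
```

Predictions that sit close to the true labels give a smaller loss than predictions near 0.5, and this is the quantity the optimizer tries to reduce during training.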

The model is trained, and now we can predict labels for the given data points.

The model has one output whose value is in the range from 0 to 1. Outputs less than 0.5 are assigned label 0, and outputs greater than or equal to 0.5 are assigned label 1.

- If prediction < 0.5 then label = 0
- If prediction >= 0.5 then label = 1
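The threshold rule above can be sketched with NumPy. The prediction values here are made up for illustration. They are shaped `(n, 1)` because that is the shape `model.predict` returns for a model with a single output unit.

```python
import numpy as np

# Hypothetical sigmoid outputs, shaped (n, 1) like the result of model.predict.
predictions = np.array([[0.08], [0.93], [0.50], [0.49]])

# Flatten to shape (n,), compare with the threshold, and cast booleans to 0/1.
labels = (predictions.ravel() >= 0.5).astype(int)
print(labels)  # [0 1 1 0]
```

Note that a prediction of exactly 0.5 is assigned label 1, as the rule states.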

```
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
trainF1 = np.array([3.2, 6.3, 6.2, 3.3, 3.1, 5.6, 3.9, 5.5], dtype=float)
trainF2 = np.array([5.5, 5.9, 5.7, 5.6, 5.4, 5.8, 5.5, 5.7], dtype=float)
trainLabels = np.array([1, 0, 0, 1, 1, 0, 1, 0], dtype=float)
testF1 = np.array([4.3, 5.3, 5.0, 4.2], dtype=float)
testF2 = np.array([5.4, 5.9, 5.8, 5.6], dtype=float)
testFeatures = np.stack((testF1, testF2), 1)
model = keras.models.load_model('model.h5')
# model.predict returns an array of shape (n, 1); flatten it before thresholding
predictedLabels = model.predict(testFeatures).ravel()
predictedLabels = [int(label >= 0.5) for label in predictedLabels]
colors = ['red', 'green']
colorsList = [colors[int(label)] for label in trainLabels]
colorsListPredicted = [colors[label] for label in predictedLabels]
plt.scatter(trainF1, trainF2, c=colorsList)
plt.scatter(testF1, testF2, c=colorsListPredicted, marker='x')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.grid()
plt.show()
```

The scatter plot shows that the labels for the data points were predicted correctly.
