
Understanding F1 Score in Machine Learning

Introduction

The F1 Score is a metric for evaluating a model's predictive performance in binary classification problems. It is particularly useful when the classes are imbalanced. The F1 Score is the harmonic mean of Precision and Recall, providing a single metric that balances both concerns.

Precision and Recall

Before diving into the F1 Score, it's essential to understand Precision and Recall.

Precision is the ratio of correctly predicted positive observations to the total predicted positives. It answers the question: "What proportion of positive identifications was actually correct?"

Precision = True Positives / (True Positives + False Positives)

Recall (also known as Sensitivity) is the ratio of correctly predicted positive observations to all observations in the actual positive class. It answers the question: "What proportion of actual positives was identified correctly?"

Recall = True Positives / (True Positives + False Negatives)
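As a quick sketch, both ratios can be computed directly from confusion-matrix counts (the counts below are hypothetical, chosen only for illustration):

```python
# Hypothetical confusion-matrix counts, for illustration only
tp, fp, fn = 8, 2, 4

precision = tp / (tp + fp)  # 8 / 10 = 0.8
recall = tp / (tp + fn)     # 8 / 12 ≈ 0.6667

print("Precision:", precision)
print("Recall:", recall)
```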

F1 Score

The F1 Score is the harmonic mean of Precision and Recall. The formula is:

F1 Score = 2 * (Precision * Recall) / (Precision + Recall)

The F1 Score ranges between 0 and 1. A score of 1 indicates perfect Precision and Recall, while a score of 0 indicates the worst performance.
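The formula translates into a one-line function; the only edge case worth guarding is Precision and Recall both being zero, which would otherwise divide by zero (the convention of returning 0.0 in that case is an assumption, though it matches common practice):

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall; defined as 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# The harmonic mean is pulled toward the smaller of the two values,
# so a model cannot score well by excelling at only one of them.
print(f1(0.9, 0.1))
```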

Why Use the F1 Score?

The F1 Score is particularly useful when you have an uneven class distribution. For example, if you have a dataset with 95% negative and 5% positive instances, accuracy might not be a good metric because predicting all instances as negative would give you a 95% accuracy. However, the F1 Score would give a more balanced view of the model's performance.
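A small sketch with made-up labels makes this concrete: a lazy classifier that always predicts negative reaches 95% accuracy on such a dataset, yet its F1 Score is zero.

```python
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical imbalanced dataset: 95 negatives, 5 positives
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100  # a model that always predicts negative

print("Accuracy:", accuracy_score(y_true, y_pred))          # 0.95
print("F1 Score:", f1_score(y_true, y_pred, zero_division=0))  # 0.0
```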

Example Calculation

Let's consider a confusion matrix for a binary classification problem:

                True Positives (TP) = 70
                True Negatives (TN) = 90
                False Positives (FP) = 20
                False Negatives (FN) = 10

First, calculate Precision and Recall:

                Precision = TP / (TP + FP) = 70 / (70 + 20) ≈ 0.7778
                Recall = TP / (TP + FN) = 70 / (70 + 10) = 0.875

Now, calculate the F1 Score:

                F1 Score = 2 * (Precision * Recall) / (Precision + Recall)
                         = 2 * (0.7778 * 0.875) / (0.7778 + 0.875)
                         ≈ 0.8235

The F1 Score in this example is approximately 0.8235.
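The hand calculations above can be checked with a few lines of Python, using the same confusion-matrix counts:

```python
# Counts from the worked example above
tp, fp, fn = 70, 20, 10

precision = tp / (tp + fp)  # 70 / 90 ≈ 0.7778
recall = tp / (tp + fn)     # 70 / 80 = 0.875
f1 = 2 * precision * recall / (precision + recall)

print(round(f1, 4))  # 0.8235
```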

Using F1 Score in Python

In Python, you can easily calculate the F1 Score using the scikit-learn library. Here is an example:

import numpy as np
from sklearn.metrics import f1_score

# Sample data
y_true = np.array([0, 1, 1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([0, 1, 0, 0, 1, 1, 1, 0, 1, 0])

# Calculate F1 Score
f1 = f1_score(y_true, y_pred)
print("F1 Score:", f1)

Output:
F1 Score: 0.8
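Note that f1_score defaults to binary classification. For multiclass problems, scikit-learn's average parameter controls how per-class scores are combined; a brief sketch with hypothetical labels:

```python
from sklearn.metrics import f1_score

# Hypothetical three-class labels, for illustration only
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 2, 1, 0]

# 'macro' computes F1 per class, then averages with equal weight per class
print("Macro F1:", f1_score(y_true, y_pred, average='macro'))
```

Other options include 'micro' (global counts across classes) and 'weighted' (per-class scores weighted by class frequency).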

Conclusion

The F1 Score is a valuable metric for evaluating the performance of a classification model, especially when dealing with imbalanced datasets. By understanding and utilizing Precision, Recall, and the F1 Score, you can get a better sense of your model's strengths and weaknesses.