sklearn.metrics.f1_score
sklearn.metrics.f1_score(y_true, y_pred, labels=None, pos_label=1, average='weighted')
Compute the F1 score, also known as balanced F-score or F-measure.
The F1 score can be interpreted as the harmonic mean of precision and recall; it reaches its best value at 1 and its worst at 0. Precision and recall contribute equally to the F1 score. The formula for the F1 score is:
F1 = 2 * (precision * recall) / (precision + recall)
In the multi-class and multi-label case, with the default average='weighted', this is the average of the per-class F1 scores, weighted by support.
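The formula above can be sketched in plain Python. This is an illustrative helper, not part of scikit-learn: the name f1_from_precision_recall is hypothetical, and returning 0.0 when precision and recall are both zero is an assumed convention chosen to avoid a division by zero.

```python
def f1_from_precision_recall(precision, recall):
    """Return 2 * (precision * recall) / (precision + recall).

    Assumption: the score is defined as 0.0 when both inputs are 0.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * (precision * recall) / (precision + recall)


# Class 0 of the data in the Examples section has precision 2/3 and
# recall 1, which gives an F1 score of approximately 0.8.
print(f1_from_precision_recall(2 / 3, 1.0))
```

Because the two inputs enter the formula symmetrically, swapping precision and recall leaves the score unchanged.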
Parameters :
y_true : array-like or list of labels or label indicator matrix
Ground truth (correct) target values.
y_pred : array-like or list of labels or label indicator matrix
Estimated targets as returned by a classifier.
labels : array
Integer array of labels.
pos_label : str or int, 1 by default
If average is not None and the classification target is binary, only this class’s scores will be returned.
average : string, [None, ‘micro’, ‘macro’, ‘samples’, ‘weighted’ (default)]
If None, the scores for each class are returned. Otherwise, unless pos_label is given in binary classification, this determines the type of averaging performed on the data:
- 'micro':
Calculate metrics globally by counting the total true positives, false negatives and false positives.
- 'macro':
Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.
- 'weighted':
Calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label). This alters ‘macro’ to account for label imbalance; it can result in an F-score that is not between precision and recall.
- 'samples':
Calculate metrics for each instance, and find their average (only meaningful for multilabel classification where this differs from accuracy_score).
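The 'micro', 'macro' and 'weighted' options described above can be sketched in plain Python for single-label targets. This is a minimal illustration under stated assumptions, not scikit-learn's implementation: the helper names (label_counts, f1_from_counts, averaged_f1) are hypothetical, a label with no true or predicted instances is assumed to score 0.0, and 'samples' is omitted because it only applies to multilabel input.

```python
def label_counts(y_true, y_pred, label):
    """Return (true positives, false positives, false negatives) for one label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    return tp, fp, fn


def f1_from_counts(tp, fp, fn):
    """F1 written in terms of counts: 2*TP / (2*TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2.0 * tp / denom if denom else 0.0  # assumed convention for 0/0


def averaged_f1(y_true, y_pred, average=None):
    """Hypothetical re-implementation of the averaging modes for illustration."""
    labels = sorted(set(y_true) | set(y_pred))
    counts = [label_counts(y_true, y_pred, label) for label in labels]
    per_label = [f1_from_counts(*c) for c in counts]
    if average is None:
        return per_label
    if average == 'micro':
        # Pool the counts over all labels, then compute a single score.
        tp, fp, fn = (sum(col) for col in zip(*counts))
        return f1_from_counts(tp, fp, fn)
    if average == 'macro':
        # Unweighted mean of the per-label scores.
        return sum(per_label) / len(per_label)
    if average == 'weighted':
        # Mean of the per-label scores, weighted by support.
        support = [sum(1 for t in y_true if t == label) for label in labels]
        return sum(f * s for f, s in zip(per_label, support)) / sum(support)
    raise ValueError("unsupported average: %r" % (average,))


y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]
for mode in (None, 'micro', 'macro', 'weighted'):
    print(mode, averaged_f1(y_true, y_pred, average=mode))
```

On this data every label has the same support (two true instances each), so 'macro' and 'weighted' coincide; they differ only when the classes are imbalanced.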
Returns :
f1_score : float or array of float, shape = [n_unique_labels]
F1 score of the positive class in binary classification or weighted average of the F1 scores of each class for the multiclass task.
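For the binary case, the "F1 score of the positive class" can be sketched as follows. The function binary_f1 is a hypothetical helper written for illustration, not a scikit-learn function; it treats pos_label as the positive class and assumes a score of 0.0 when that class never appears in either array.

```python
def binary_f1(y_true, y_pred, pos_label=1):
    """F1 score of the class pos_label, treating every other value as negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == pos_label and p == pos_label)
    fp = sum(1 for t, p in pairs if t != pos_label and p == pos_label)
    fn = sum(1 for t, p in pairs if t == pos_label and p != pos_label)
    denom = 2 * tp + fp + fn
    return 2.0 * tp / denom if denom else 0.0  # assumed convention for 0/0


y_true = [0, 1, 1, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1]
# With pos_label=1: TP=3, FP=0, FN=1, so F1 = 6/7.
print(binary_f1(y_true, y_pred, pos_label=1))
# With pos_label=0: TP=2, FP=1, FN=0, so F1 = 4/5.
print(binary_f1(y_true, y_pred, pos_label=0))
```

The two calls give different scores on the same predictions, which is why the choice of pos_label matters for binary targets.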
References
[R158] Wikipedia entry for the F1-score
Examples
>>> from sklearn.metrics import f1_score
>>> y_true = [0, 1, 2, 0, 1, 2]
>>> y_pred = [0, 2, 1, 0, 0, 1]
>>> f1_score(y_true, y_pred, average='macro')
0.26...
>>> f1_score(y_true, y_pred, average='micro')
0.33...
>>> f1_score(y_true, y_pred, average='weighted')
0.26...
>>> f1_score(y_true, y_pred, average=None)
array([ 0.8, 0. , 0. ])