Any predicted bounding box with a confidence above the threshold is counted as a prediction. Next, we go through the ground truths you provide. For each ground truth, we filter for the predictions whose IoU with it is above the IoU threshold.
Of these predictions, we select the one with the highest confidence and match it to that ground truth. This is repeated for all ground truths. The predictions that remain unmatched and have a high enough confidence are counted as FP background predictions.
For your example, the values in the row kizu are predictions with the label kizu and a confidence above the set threshold. The values in this row and in the background column are predictions with the label kizu but without a matched ground truth (matched spatially, without considering the label). In other words, each of these is a FP for kizu that is not linked to any of your ground truth labels, whether correctly labelled or not.
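
To make the matching procedure described above concrete, here is a minimal sketch in plain Python. It is not the library's actual code: the function names `iou` and `confusion_counts`, the `(x1, y1, x2, y2)` box format, and the threshold values are all assumptions chosen for illustration. It also assumes that each prediction can be matched to at most one ground truth, and that a ground truth with no match is counted in a background row, which the explanation above does not cover.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def confusion_counts(preds, truths, conf_thres=0.25, iou_thres=0.45):
    """preds: list of (box, label, confidence); truths: list of (box, label).

    Returns a dict mapping (predicted_label, true_label) to a count,
    with "background" standing in for "no match". Thresholds are placeholders.
    """
    # Step 1: keep only predictions above the confidence threshold.
    kept = [p for p in preds if p[2] > conf_thres]
    used = set()  # assumption: a prediction is matched at most once
    counts = {}

    for t_box, t_label in truths:
        # Step 2: candidates overlap this ground truth enough; labels are ignored.
        candidates = [i for i, p in enumerate(kept)
                      if i not in used and iou(p[0], t_box) > iou_thres]
        if candidates:
            # Step 3: the most confident candidate is matched to this ground truth.
            best = max(candidates, key=lambda i: kept[i][2])
            used.add(best)
            key = (kept[best][1], t_label)
        else:
            # Assumption: an unmatched ground truth lands in the background row.
            key = ("background", t_label)
        counts[key] = counts.get(key, 0) + 1

    # Step 4: leftover predictions are FP and go to the background column.
    for i, p in enumerate(kept):
        if i not in used:
            key = (p[1], "background")
            counts[key] = counts.get(key, 0) + 1
    return counts


# One kizu ground truth, one overlapping kizu prediction, one stray kizu
# prediction elsewhere, and one prediction below the confidence threshold.
truths = [((0, 0, 10, 10), "kizu")]
preds = [
    ((0, 0, 10, 10), "kizu", 0.9),
    ((50, 50, 60, 60), "kizu", 0.8),
    ((0, 0, 10, 9), "kizu", 0.1),
]
example = confusion_counts(preds, truths)
```

In this toy case `example` holds one count for `("kizu", "kizu")` and one for `("kizu", "background")`: the stray prediction is the kind of entry you see in the kizu row under the background column, and the low-confidence prediction is dropped before matching.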
