In addition to the precision-recall analysis, we developed a novel method for quantifying intersubject consistency that reduces the disagreement arising simply because some subjects label possible occlusions more exhaustively than others. We defined the most conservative subject (MCS) for a given image as the subject who had labeled the fewest pixels. Using the MCS labeling, we generated a binary image mask F, which is 1 for any pixel within R pixels of an occlusion labeled by the MCS and 0 for all other pixels. Applying this mask to the labeling of each subject yields a “reduced” labeling, which is valid for intersubject comparison because it includes only the most prominent occlusions labeled by all of the subjects. To compare two subjects, we randomly assigned one binary edgemap as the “reference” (I_ref) and the other binary edgemap as the “test” (I_test). We used the reference map to define a weighting function f_γ(r), applied to every pixel in the test map, that quantifies how close that pixel is to a pixel in the reference map. Mathematically, our index is given by
where r((x, y), I_ref) is the distance between (x, y) and the closest pixel in I_ref, N_t is the number of pixels in the test set, and the function f_γ(r) is defined for 0 ≤ γ < ∞ by
and for γ = ∞ by
where R is the radius of the mask, which we set to R = 10 pixels in our analysis. The parameter γ in our weighting function sets the sensitivity of f_γ to the distance between the reference labeling and the test labeling. Setting γ = 0 measures the fraction of pixels in the test edgemap that lie inside the mask generated by the reference edgemap, and setting γ = ∞ measures the fraction of pixels in complete agreement between the two edgemaps.
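
The procedure can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the text: labelings are represented as sets of (x, y) pixel coordinates, the distance r is Euclidean, and the weighting function takes the hypothetical form (1 - r/R)^γ inside the mask. That form is chosen only because it reproduces the two limiting cases described above (γ = 0 and γ = ∞); it should not be read as the definition used in the analysis. All function names are illustrative.

```python
import math

def most_conservative(labelings):
    """Return the labeling (a set of (x, y) pixels) with the fewest labeled pixels."""
    return min(labelings, key=len)

def nearest_distance(pixel, pixels):
    """Distance from `pixel` to the closest element of `pixels` (Euclidean: an assumption)."""
    x, y = pixel
    return min(math.hypot(x - u, y - v) for u, v in pixels)

def reduce_labeling(labeling, mcs, R):
    """Keep only the pixels where the mask F is 1, i.e. within R pixels of the MCS labeling."""
    return {p for p in labeling if nearest_distance(p, mcs) <= R}

def weight(r, gamma, R):
    """ASSUMED weighting function: only its gamma = 0 and gamma = inf behavior is from the text."""
    if math.isinf(gamma):
        return 1.0 if r == 0 else 0.0   # exact agreement only
    if r > R:
        return 0.0                      # outside the mask of the reference map
    return (1.0 - r / R) ** gamma       # equals 1 everywhere inside the mask when gamma = 0

def consistency_index(test, ref, gamma, R):
    """Mean weight over the N_t test pixels, using each pixel's distance to the reference map."""
    if not test or not ref:
        raise ValueError("both labelings must contain at least one pixel")
    return sum(weight(nearest_distance(p, ref), gamma, R) for p in test) / len(test)

# Toy example with two subjects and R = 10.
subject_a = {(0, 0), (1, 0), (2, 0)}             # fewest pixels, so this is the MCS
subject_b = {(0, 0), (1, 0), (2, 1), (30, 30)}   # labels one extra, distant occlusion
R = 10

mcs = most_conservative([subject_a, subject_b])
ref = reduce_labeling(subject_a, mcs, R)    # reference map (fixed here rather than chosen at random)
test = reduce_labeling(subject_b, mcs, R)   # the pixel (30, 30) falls outside the mask and is dropped

index_gamma_0 = consistency_index(test, ref, 0, R)           # every test pixel lies inside the mask
index_gamma_1 = consistency_index(test, ref, 1, R)           # graded penalty for the offset pixel
index_gamma_inf = consistency_index(test, ref, math.inf, R)  # two of three pixels coincide exactly
```

The brute-force nearest-pixel search keeps the sketch dependency-free but scales poorly. For full-size images, a distance transform of the reference map (for example `scipy.ndimage.distance_transform_edt`) would give the same distances far more efficiently.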