
Kappa Coefficient

A statistical measure used to assess the accuracy of classification maps, especially in remote sensing, by comparing observed and expected agreement between classified data and reference data.


How do you define the Kappa Coefficient?

The Kappa Coefficient, also called Cohen's Kappa, is a statistical metric that assesses the accuracy and reliability of classification results by comparing observed accuracy with the accuracy expected by chance.


Definition: The Kappa Coefficient quantifies the degree of agreement between two raters or datasets (for example, a classified map versus ground truth) while accounting for the agreement that could arise by chance.


Formula:


\kappa = \frac{P_o - P_e}{1 - P_e}

  • P_o = Observed agreement

  • P_e = Expected agreement by chance
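
As a concrete illustration, the following Python sketch computes P_o, P_e, and κ by hand from a small error matrix; the class names and cell counts are invented for demonstration.

import numpy as np

# Hypothetical 3-class error matrix (rows: classified map, columns: reference data).
confusion = np.array([
    [50,  3,  2],   # water
    [ 4, 45,  6],   # forest
    [ 1,  5, 40],   # urban
])

n = confusion.sum()
p_o = np.trace(confusion) / n  # observed agreement (overall accuracy)
# Expected chance agreement: sum over classes of (row total * column total) / n^2
p_e = (confusion.sum(axis=1) * confusion.sum(axis=0)).sum() / n**2
kappa = (p_o - p_e) / (1 - p_e)
print(f"P_o = {p_o:.3f}, P_e = {p_e:.3f}, kappa = {kappa:.3f}")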


Interpretation:

  • 1.00 → Perfect agreement

  • 0.81–1.00 → Almost perfect agreement

  • 0.61–0.80 → Substantial agreement

  • 0.41–0.60 → Moderate agreement

  • 0.21–0.40 → Fair agreement

  • 0.00–0.20 → Slight agreement

  • < 0 → Less than chance agreement
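
For convenience, the scale above can be encoded as a small helper; the function below is a simple sketch that maps a κ value to the labels listed here.

def interpret_kappa(kappa: float) -> str:
    """Map a kappa value to the agreement categories listed above."""
    if kappa < 0:
        return "Less than chance agreement"
    if kappa <= 0.20:
        return "Slight agreement"
    if kappa <= 0.40:
        return "Fair agreement"
    if kappa <= 0.60:
        return "Moderate agreement"
    if kappa <= 0.80:
        return "Substantial agreement"
    return "Almost perfect agreement"

print(interpret_kappa(0.72))  # Substantial agreement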

Related Keywords

Cohen's Kappa coefficient measures the agreement between two raters beyond what would be expected by chance. A value of 1 denotes perfect agreement, 0 denotes agreement no better than chance, and -1 denotes complete disagreement. The ranges are generally 0.01–0.20 for slight agreement, 0.21–0.40 for fair agreement, 0.41–0.60 for moderate agreement, 0.61–0.80 for substantial agreement, and 0.81–1.00 for almost perfect agreement.
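
In practice, Cohen's kappa is rarely computed by hand; scikit-learn, for example, provides cohen_kappa_score. The rater labels below are hypothetical.

from sklearn.metrics import cohen_kappa_score

# Hypothetical categorical labels assigned by two raters to the same ten items.
rater_a = ["water", "forest", "forest", "urban", "water", "urban", "forest", "water", "urban", "forest"]
rater_b = ["water", "forest", "urban",  "urban", "water", "urban", "forest", "water", "forest", "forest"]
print(f"Cohen's kappa: {cohen_kappa_score(rater_a, rater_b):.2f}")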

The weighted kappa coefficient extends kappa to ordinal data by taking the degree of disagreement between two raters into account. It compares observed and chance agreement using a weight matrix whose values range from -1 (total disagreement) to 1 (perfect agreement).
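
A weighted kappa can be computed with the same scikit-learn function by passing a weighting scheme; note that scikit-learn expresses the weights as disagreement penalties rather than the -1 to 1 agreement weights described above, but the resulting statistic is the standard weighted kappa. The ordinal scores below are hypothetical.

from sklearn.metrics import cohen_kappa_score

# Hypothetical ordinal scores (1 = lowest category, 4 = highest) from two raters.
rater_a = [1, 2, 3, 4, 2, 3, 1, 4, 3, 2]
rater_b = [1, 2, 4, 4, 2, 2, 1, 3, 3, 2]

# "linear" penalizes disagreement in proportion to the distance between categories;
# "quadratic" penalizes larger gaps more heavily.
print(f"linear:    {cohen_kappa_score(rater_a, rater_b, weights='linear'):.2f}")
print(f"quadratic: {cohen_kappa_score(rater_a, rater_b, weights='quadratic'):.2f}")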

Cohen's kappa measures the degree of agreement between two raters on categorical data while accounting for chance agreement. Fleiss's kappa is an extension that permits more than two raters, providing a single statistic for overall agreement across several assessors. Both range from -1 to 1, with higher values indicating stronger agreement beyond chance.
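
Fleiss's kappa for more than two raters is available in, for example, the statsmodels package; the ratings below are hypothetical.

import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Hypothetical ratings: 6 subjects (rows) rated by 3 raters (columns) into categories 0-2.
ratings = np.array([
    [0, 0, 1],
    [1, 1, 1],
    [2, 2, 0],
    [0, 0, 0],
    [1, 2, 1],
    [2, 2, 2],
])

# aggregate_raters converts the subject-by-rater table into subject-by-category counts.
counts, _ = aggregate_raters(ratings)
print(f"Fleiss's kappa: {fleiss_kappa(counts):.2f}")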

The Kappa coefficient measures agreement beyond chance. For instance, if two physicians agree on 70% of diagnoses but 50% agreement would be expected by chance, then Kappa = (0.70 − 0.50) / (1 − 0.50) = 0.40, which sits at the boundary between fair and moderate agreement on the scale above.
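
The arithmetic of that example can be checked directly:

p_o = 0.70  # physicians agree on 70% of diagnoses
p_e = 0.50  # agreement expected by chance
kappa = (p_o - p_e) / (1 - p_e)
print(f"kappa = {kappa:.2f}")  # kappa = 0.40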
