Rainer Ferdinand Wunderlich

Postdoctoral researcher at INRAE

Two alternative evaluation metrics to replace the true skill statistic in the assessment of species distribution models


Journal article


R. Wunderlich, Yu-Pin Lin, Johnathen Anthony, Joy R. Petway
Nature Conservation, vol. 35, 2019, pp. 97-116


APA
Wunderlich, R., Lin, Y.-P., Anthony, J., & Petway, J. R. (2019). Two alternative evaluation metrics to replace the true skill statistic in the assessment of species distribution models. Nature Conservation, 35, 97–116. https://doi.org/10.3897/natureconservation.35.33918


Chicago/Turabian
Wunderlich, R., Yu-Pin Lin, Johnathen Anthony, and Joy R. Petway. “Two Alternative Evaluation Metrics to Replace the True Skill Statistic in the Assessment of Species Distribution Models.” Nature Conservation 35 (2019): 97–116.


MLA
Wunderlich, R., et al. “Two Alternative Evaluation Metrics to Replace the True Skill Statistic in the Assessment of Species Distribution Models.” Nature Conservation, vol. 35, 2019, pp. 97–116, doi:10.3897/natureconservation.35.33918.


BibTeX

@article{wunderlich2019a,
  title = {Two alternative evaluation metrics to replace the true skill statistic in the assessment of species distribution models},
  year = {2019},
  journal = {Nature Conservation},
  pages = {97-116},
  volume = {35},
  doi = {10.3897/natureconservation.35.33918},
  author = {Wunderlich, R. and Lin, Yu-Pin and Anthony, Johnathen and Petway, Joy R.}
}

Abstract

Model evaluation metrics play a critical role in the selection of adequate species distribution models for conservation and for any application of species distribution modelling (SDM) in general. The responses of these metrics to modelling conditions, however, are rarely taken into account, leading to inadequate model selection, flawed downstream analyses and uninformed decisions. To aid modellers in critically assessing modelling conditions when choosing and interpreting model evaluation metrics, we analysed the responses of the True Skill Statistic (TSS) under a variety of presence-background modelling conditions using purely theoretical scenarios. We then compared these responses with those of two evaluation metrics commonly applied in the field of meteorology which have potential for use in SDM: the Odds Ratio Skill Score (ORSS) and the Symmetric Extremal Dependence Index (SEDI). We demonstrate that model evaluation with TSS is compromised both by (1) large cell number totals in the confusion matrix, which is strongly biased towards ‘true’ absences in presence-background SDM, and by (2) low prevalence. This is because (1) TSS fails to differentiate useful from random models at extreme prevalence levels once the confusion matrix cell number total exceeds ~30,000 cells and (2) TSS converges to the hit rate (sensitivity) when prevalence falls below ~2.5%. We conclude that SEDI is optimal for most presence-background SDM initiatives, while ORSS may provide a better alternative if absence data are available or if equal error weighting is strictly required.
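All three metrics compared in the abstract are computed from the four cells of a 2×2 confusion matrix. The sketch below uses the standard forecast-verification definitions of TSS (Peirce skill score), ORSS (Yule's Q) and SEDI; it is an illustration of those general formulas, not code from the paper itself:

```python
import math

def sdm_metrics(a, b, c, d):
    """Skill scores from a 2x2 confusion matrix.

    a = true positives   (presence predicted, presence observed)
    b = false positives  (presence predicted, absence observed)
    c = false negatives  (absence predicted, presence observed)
    d = true negatives   (absence predicted, absence observed)

    SEDI requires 0 < H < 1 and 0 < F < 1 (logs are undefined otherwise).
    """
    H = a / (a + c)  # hit rate (sensitivity)
    F = b / (b + d)  # false alarm rate (1 - specificity)

    tss = H - F                                     # True Skill Statistic
    orss = (a * d - b * c) / (a * d + b * c)        # Odds Ratio Skill Score
    sedi = (
        (math.log(F) - math.log(H) - math.log(1 - F) + math.log(1 - H))
        / (math.log(F) + math.log(H) + math.log(1 - F) + math.log(1 - H))
    )                                               # Symmetric Extremal Dependence Index
    return tss, orss, sedi

# Example: a balanced matrix with 80% of cases classified correctly.
tss, orss, sedi = sdm_metrics(a=40, b=10, c=10, d=40)
```

For this balanced example TSS equals 0.6 (H = 0.8, F = 0.2), while ORSS and SEDI weight the same matrix differently; the paper's argument concerns how these responses diverge as prevalence and total cell counts become extreme.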

