TY - JOUR
T1 - A probabilistic framework for behavioral identification from animal-borne accelerometers
AU - Dentinger, Jane E.
AU - Börger, Luca
AU - Holton, Mark D.
AU - Jafari-Marandi, Ruholla
AU - Norman, Durham A.
AU - Smith, Brian K.
AU - Oppenheimer, Seth F.
AU - Strickland, Bronson K.
AU - Wilson, Rory P.
AU - Street, Garrett M.
N1 - Funding Information:
Funding for this research was provided by the Mississippi Agriculture and Forestry Research Station (MAFES) Strategic Research Initiative and the Noble Research Institute. We also thank AJ Benney, Clay Gibson, Bryant Haley, Gwen Jones, Kelsey Paolini, and numerous other volunteers for their contributions. All procedures were approved by the MSU Institutional Animal Care and Use Committee (Protocol #16–062). G.M.S., L.B., and R.P.W. conceived the ideas for the paper. M.D.H., B.K.S., B.K.S., R.J.M., G.M.S., and S.F.O. combined expertise to create the methodology. J.E.D. and D.A.N. collected the data. J.E.D. performed the analyses in collaboration with M.D.H., R.J.M., and G.M.S. J.E.D. and G.M.S. led the writing of the manuscript. All authors contributed to the drafts and gave final approval for publication. Data will be archived at the Mississippi State University Repository. All R and MATLAB scripts are available in Appendix VI: 1:15.
Publisher Copyright:
© 2021 Elsevier B.V.
PY - 2022/2
Y1 - 2022/2
N2 - Many studies of animal distributions use habitat and climatic variables to explain patterns of observed space use. However, without behavioral information, we can only speculate as to why and how these characteristics are important to species persistence. Animal-borne accelerometer and magnetometer data loggers can be used to detect behaviors and, when coupled with telemetry, improve our understanding of animal space use and habitat requirements. However, these loggers collect tremendous quantities of data, requiring automated machine learning techniques to identify patterns in the data. Supervised machine learning requires a set of training signals with known behaviors to train the model to identify the unique signal characteristics associated with each behavior. In contrast, unsupervised approaches aggregate unlabeled signals into groups based purely on signal similarity but, without additional information, do not identify specific behaviors. In this paper, we propose a probabilistic framework for interpreting uncertainty in machine learning techniques—the probability profile—and demonstrate how to post hoc identify behaviors within signal groups. We assess model performance using a matrix-based measure of dissimilarity. We used a Random Forest (RF) and a clustered self-organizing map (CSOM) for comparison and demonstrate the use of a behavioral profile for each using a data set of high-frequency accelerometer and magnetometer data collected from 7 captive wild pigs (Sus scrofa) moving in a 1 ha outdoor enclosure. We found that the RF had more discrimination than the CSOM, which had fewer clusters associated with high probabilities of a single behavior (>50%). The leave-p-out cross validation statistic of the probability matrix (L1¯) indicated that there was an average maximum dissimilarity of 20% and 65% between the training and test data sets for the RF and CSOM methods, respectively.
Using a probability profile to describe groups predicted from machine learning allows the variation and error inherent in behavioral prediction to be incorporated directly into the model to better reflect the nuances of behavior derived from accelerometer and/or magnetometer signals. We discuss the data requirements of this framework, demonstrate its application to field data, highlight critical assumptions and caveats, and examine how it may be used to generate new ecological inference.
AB - Many studies of animal distributions use habitat and climatic variables to explain patterns of observed space use. However, without behavioral information, we can only speculate as to why and how these characteristics are important to species persistence. Animal-borne accelerometer and magnetometer data loggers can be used to detect behaviors and, when coupled with telemetry, improve our understanding of animal space use and habitat requirements. However, these loggers collect tremendous quantities of data, requiring automated machine learning techniques to identify patterns in the data. Supervised machine learning requires a set of training signals with known behaviors to train the model to identify the unique signal characteristics associated with each behavior. In contrast, unsupervised approaches aggregate unlabeled signals into groups based purely on signal similarity but, without additional information, do not identify specific behaviors. In this paper, we propose a probabilistic framework for interpreting uncertainty in machine learning techniques—the probability profile—and demonstrate how to post hoc identify behaviors within signal groups. We assess model performance using a matrix-based measure of dissimilarity. We used a Random Forest (RF) and a clustered self-organizing map (CSOM) for comparison and demonstrate the use of a behavioral profile for each using a data set of high-frequency accelerometer and magnetometer data collected from 7 captive wild pigs (Sus scrofa) moving in a 1 ha outdoor enclosure. We found that the RF had more discrimination than the CSOM, which had fewer clusters associated with high probabilities of a single behavior (>50%). The leave-p-out cross validation statistic of the probability matrix (L1¯) indicated that there was an average maximum dissimilarity of 20% and 65% between the training and test data sets for the RF and CSOM methods, respectively.
Using a probability profile to describe groups predicted from machine learning allows the variation and error inherent in behavioral prediction to be incorporated directly into the model to better reflect the nuances of behavior derived from accelerometer and/or magnetometer signals. We discuss the data requirements of this framework, demonstrate its application to field data, highlight critical assumptions and caveats, and examine how it may be used to generate new ecological inference.
KW - Accelerometers
KW - Behavior
KW - Machine learning
KW - Random forest
KW - SOM
KW - k-means clustering
UR - http://www.scopus.com/inward/record.url?scp=85119077225&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85119077225&partnerID=8YFLogxK
U2 - 10.1016/j.ecolmodel.2021.109818
DO - 10.1016/j.ecolmodel.2021.109818
M3 - Article
AN - SCOPUS:85119077225
VL - 464
JO - Ecological Modelling
JF - Ecological Modelling
SN - 0304-3800
M1 - 109818
ER -