1. Introduction
Understanding behavior and physiology of animals in their natural environments is fundamental to ecology [
1]. For centuries, animal behaviorists and ecologists have relied on direct observations to gather insights on animals’ activities. Wild animals may be difficult to observe [
2,
3]; direct observations can introduce observer bias [
1,
4] as well as the potential to affect animal behavior [
3,
5,
6,
7]. Bio-loggers, animal-borne devices that provide data on animal movement, behavior, and physiology without the need for direct observation, have proven to be powerful tools for studying animal behavior. GPS trackers, video cameras, temperature loggers, depth recorders, and physiological loggers, among others, have helped behavioral ecologists observe and understand animal behavior [
8,
9]. Animal-borne accelerometers, devices that provide data of static and dynamic acceleration, are particularly powerful tools that aid in the study of animal behavior and have applications in the fields of captive animal welfare [
10,
11,
12,
13], behavioral ecology [
14,
15,
16], and evolutionary studies [
16,
17,
18].
Accelerometers have enabled animal behavior researchers to study species that may otherwise be very difficult or impossible to observe directly, either due to their cryptic nature, behaviors or qualities that make them less easily detected by predators or prey [
19], or the difficulty of accessing or navigating their environments. The first study to utilize accelerometers to discern behavioral patterns was conducted by Yoda et al. (1999) [
20] to classify the movement behaviors of Adélie penguins (
Pygoscelis adeliae). Since then, improvements in technology (e.g., logger size as well as battery and storage capacity) enabled opportunities to research a wider range of species that were previously inaccessible. For example, Nakamura, Goto, and Sato (2015) [
21] attached accelerometers to sunfish that dive up to 200 m for extended periods of time, making them nearly impossible to observe directly. Nocturnal mammals are particularly difficult to observe, and researchers often rely on indirect methods, such as vocalizations to determine abundance, or on radio tracking [
22]. Although radio tracking is useful to discern general movement patterns and social organization, animals may remain completely out of view. To date, accelerometers have been used most widely in studies of birds and marine mammals [
15]. Only a few studies have used accelerometers to study primate behavior [
13,
15,
23,
24,
25,
26], and the majority of these sought to identify broad activity categories rather than specific behaviors. Even fewer studies have focused on nocturnal primates (i.e., those for which accelerometers would be the most useful).
Another benefit of bio-logging is the elimination of observer bias, since the presence of humans can unintentionally influence animal behavior [
6,
7,
27]. Even when animals are habituated, human presence can affect the behavior of non-habituated animals and influence their interactions with the habituated focal animals [
5,
28]. Direct observations are also limited by the boundaries of our own physical and sensory abilities; our individual experiences implicitly cause us to focus on certain events and subjects more than others [
1,
2,
4,
29].
Modern accelerometers tend to last for longer periods than older models and collect data continuously for an animal’s entire active period, which a human observer is rarely capable of unless through video recording. Despite improvements, battery life of accelerometers continues to be a major challenge. For instance, battery life can be affected by weather and humidity; seasonal variation must also be considered when planning deployment and retrieval of devices in the field [
30,
31,
32]. Battery life of accelerometers is also affected by the frequency at which the accelerometer is set to record data [
32]. High recording frequencies (>25 Hz) drain the device’s battery more quickly than low frequencies. Some research has been carried out to determine whether lowering recording frequency to extend battery life significantly affected precision of behavior classification. Hounslow et al. (2019) [
33] tested a range of frequencies (1–30 Hz) on lemon sharks (
Negaprion brevirostris) and found that classification precision of fine-scale behaviors did not decrease significantly until recording frequency reached as low as 5 Hz. McGowan et al. (2022) [
34] compared two accelerometer models and found that the model with higher capacity and higher recording frequency outperformed the other. Generally, it is recommended to use mid to high range frequencies when attempting to classify more complex behaviors, but low frequencies are acceptable to classify less complex behaviors and will extend the life of the device’s battery.
The detailed three-dimensional datasets derived from accelerometers can be used to identify specific animal behaviors and require complex stochastic analytical methods to infer behavior [
35]. Because the raw accelerometer dataset provides only acceleration and orientation information, models are needed to infer the actual behaviors. Machine-learning models are used to develop an algorithm that automatically identifies patterns within the dataset. Broadly, there are two categories of machine-learning algorithms: supervised and unsupervised [
36]. The most important difference between supervised and unsupervised learning algorithms is their inputs and outputs. Supervised learning algorithms produce classifications based on the labels researchers assign to the training dataset while unsupervised learning algorithms produce associative clusters of data using pattern recognition. There are several ways to cluster the data, which means there are multiple possible outcomes, so researchers must indicate a similarity measure for the model to follow. Unsupervised learning algorithms are more complex and less precise than supervised algorithms but may be used to identify the labels that can then be applied to a supervised learning algorithm [
37].
A supervised learning algorithm is used by behavioral ecologists when an ethogram, a list of distinct behaviors and their descriptions, is already known [
38,
39,
40]. These behaviors are used to label a portion of the training dataset. A statistical model is then applied to the data subset to classify behaviors using acceleration signatures [
15,
41]. This method requires that the researchers have pre-existing knowledge of the species, which is often not the case for cryptic or difficult-to-access species that researchers know very little about. In these cases, an unsupervised learning algorithm is used, which forgoes the need for direct observations [
15,
42,
43]. Several supervised learning models have been used to develop classification algorithms for animal acceleration data including decision trees, support vector machines, and random forest models [
44,
45].
One group of primates that lends itself particularly well to wearing accelerometers is the Lorisidae: African pottos (Perodicticinae) and Asian lorises (Lorisinae). Their cryptic lifestyles make them particularly difficult to observe, but their non-jumping, often slow movements can be picked up well by an accelerometer [
46]. Direct observations by human researchers have provided detail about the behavior of slow lorises in the wild in particular, but for significant portions of time the animals are out of view [
47]. Despite the challenge of following these nocturnal primates, studies have shown that they eat gum, nectar, and insects; that their activity patterns are influenced by weather and moon phase; that they go into torpor, often in dense foliage where this behavior may be missed; and that they are frequently social, a behavior said to be rare for nocturnal primates [
47,
48,
49,
50]. Although all slow loris species are arboreal and prefer tree connectivity, several slow loris species occur in agroforests with reduced canopy connectivity that may disrupt loris activities or impact their energetics [
51,
52,
53,
54]. Understanding the impacts of these factors is particularly important for Javan slow lorises, which are classified as Critically Endangered on the International Union for Conservation of Nature (IUCN) Red List due to intense deforestation and fragmentation for agriculture [
52,
55]. Indeed, as natural forests shift more and more to agriculture, there is a call to understand behavior and ecology within agroforestry matrix environments [
56,
57].
Here we present a case study of applying a supervised learning approach to train a model to identify behaviors from accelerometer data of a wild Javan slow loris (
Nycticebus javanicus), from a well-known population occurring within an agroforest in Indonesia. Using direct behavioral observation data, we applied a supervised learning approach to train a random forest model [
58,
59]. Next, we validated the accuracy of the model’s predictions against our observations and present the results here. We predicted that movement complexity would affect the model’s classification accuracy: resting behaviors would be classified with the highest accuracy, and feeding and locomotor behaviors such as climbing and walking would be classified with lower accuracy. We divided the results by broad behavioral categories: Locomotive, Feeding, and Resting. This is the first time accelerometry and machine learning have been applied to wild slow lorises to identify specific behaviors. For this reason, we tested the method with a single animal as proof of principle. The results imply exciting applications of accelerometry to the behavioral ecology of cryptic arboreal mammals.
2. Materials and Methods
Using data extracted from an accelerometer worn by a wild male Javan slow loris, we developed an algorithm using a random forest model to identify behaviors. Direct behavioral observations were used to validate the algorithm. The study area lies outside the village of Cipaganti in West Java, Indonesia (7°16′44.30′′ S, 107°46′7.80′′ E, 1200 m asl) [
52] and is part of the Little Fireface Project (LFP), which has been consistently studying a wild population of Javan slow lorises since 2011. LFP is the longest continuous research project of any nocturnal primate species, which is ideal to validate the methods of this study. Cipaganti is located on the Gunung Puntang Mountain at 1345 m above sea level [
60] and exists nearby, but outside a strictly protected nature reserve, Gunung Papandayan. The landscape of Cipaganti is agroforest, which is characterized by patchworks of forest fragments, agricultural fields, and human settlements [
61] (
Figure 1). The climate of the region is tropical rainforest with annual precipitation exceeding 2500 mm [
52] and temperatures remain relatively constant throughout the year but vary more between day and night [
62]. Between January and August 2022, minimum lows reached 22 °C at night while maximum highs reached 35 °C during the day [
60].
2.1. Field Methods
A team of three to five people from LFP, using protocol recommended by Nekaris, Munds, and Pimley (2020) [
63], captured an adult male loris on 7 March 2022, fitted him with a collar carrying a radio transmitter and an accelerometer, and recaptured him on 11 April 2022 to retrieve the accelerometer. All medical check-ups, sample collection, measurements, notes, and collar fittings were conducted in situ and without the use of anesthetic (
Figure 2).
We recorded the target slow loris’ behavior between the hours of 18:30 and 23:00 on 10, 15, and 18 March 2022. We recorded general behavior and positional and locomotor behaviors using a scan sampling method at five-minute intervals plus ad libitum observations. For the purposes of this study, and based on validation of an accelerometer in captivity, we used a reduced ethogram combining six behavioral categories alongside 11 postures (
Table 1,
Table 2 and
Table 3). Since the captive slow loris was on her own, we did not have validation data to include any social behaviors for the current study.
2.2. Materials
We used a Technosmart Axy 5s accelerometer, with dimensions of 22 mm × 13 mm × 10 mm, weighing 2.5 g, mounted to a Lotek VHF radio collar using two zip ties (
Figure 2). The combined weight of the collar and accelerometer is around 19 g, which is below the recommended 5% of the animal’s body mass [
64,
65]. We set the accelerometer to record at a frequency of 25 Hz. The device in this study lasted the manufacturer-suggested 60 days at this rate [
32].
2.3. Data Analysis
We conducted all data processing in Microsoft Excel and we used R and RStudio version 2022.02.2+485 to run the Random Forest model and validate the R script. Raw data from the Technosmart Axy 5s accelerometer provided information on acceleration and orientation through measurements of 15 variables (
Table 4).
We extracted accelerometer data from 10–18 March 2022, the period covering the direct observations taken in the field. After aligning the timestamps of the two datasets, we labelled the raw accelerometer data with the behaviors from the direct behavioral observations. Behaviors were recorded at the moment the stopwatch signalled, whereas the accelerometer recorded data 25 times per second. We therefore labelled all 25 datapoints within the corresponding second with the same behavior. For instance, if a behavior was recorded for the time stamp 18:20 and 25 datapoints corresponded to the time 18:20:00, all 25 accelerometer datapoints were labelled with that behavior. The labelled subset of accelerometer data consists of a total of 2900 datapoints. We divided the data into three subsets based on broad behavioral categories (locomotive: explore and travel; feeding: feeding on gum, nectar, insects, etc.; and resting: alert, groom, rest), then ran the Random Forest model three times, once for each subset.
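The labelling step described above can be sketched as follows. This is a minimal illustration in Python rather than the R script used in the study; the function name `label_samples`, the tuple layout, and the example data are all hypothetical, and it assumes each observation is keyed by a whole clock second.

```python
from datetime import datetime, timedelta

SAMPLE_RATE_HZ = 25  # recording frequency stated in the text


def label_samples(samples, observations):
    """Attach an observed behavior to every sample in the same clock second.

    samples: list of (timestamp, x, y, z) tuples from the logger (assumed layout).
    observations: dict mapping an observation time (whole second) to a label.
    Samples whose second has no matching observation are left out.
    """
    labelled = []
    for ts, x, y, z in samples:
        second = ts.replace(microsecond=0)  # truncate to the whole second
        behavior = observations.get(second)
        if behavior is not None:
            labelled.append((ts, x, y, z, behavior))
    return labelled


# Hypothetical data: two seconds of 25 Hz samples, one observation at 18:35:00.
start = datetime(2022, 3, 10, 18, 35, 0)
samples = [
    (start + timedelta(milliseconds=i * 1000 // SAMPLE_RATE_HZ), 0.0, 0.0, 1.0)
    for i in range(2 * SAMPLE_RATE_HZ)
]
observations = {start: "rest"}

labelled = label_samples(samples, observations)
print(len(labelled))  # 25 samples share the observed second
```

Only the 25 samples that fall within the observed second receive a label; the following second, which has no matching observation, is excluded from the labelled subset.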
We used the labelled accelerometer dataset to train a Random Forest model to classify behaviors. We ran a Random Forest script in RStudio derived from the one used in Nekaris et al. (2022) [
13]. See
Appendix A for the Random Forest script we used.
Random Forest models can be defined as:
“a classifier consisting of a collection of tree-structured classifiers {h(x, Θ_k), k = 1, …} where the {Θ_k} are independent identically distributed random vectors and each tree casts a unit vote for the most popular class at input x”
The benefits to using random forests as opposed to a single decision tree are an increase in prediction accuracy and outputs of variable importance and prediction uncertainty [
58,
59]. A single decision tree is prone to overestimating the importance of certain variables and to overfitting. Random forests avoid this problem by introducing two random selection processes each time a tree is grown (a bootstrap sample of the observations and a random subset of the predictor variables considered at each split), so that each tree differs from the next. This variability reduces the risk of overfitting and of overemphasizing certain variables. Once all of the trees in the forest have made their predictions, the predictions are aggregated, and the most popular class is the result of the model. A branch of a decision tree terminates when the data in a node cannot be classified any further, that is, when the node is ‘pure’. The purity or impurity of each node is quantified with the Gini impurity index, which tends towards zero when the subset is pure, i.e., contains only one class (in this case, one behavior). The model runs a subset through a decision tree, which splits the data at nodes with the goal of minimizing the Gini impurity index.
$$G = 1 - \sum_{i=1}^{n} p_i^{2}$$
where $n$ is the number of behavioral classes and $p_i$ is the proportion of class $i$ in a set of observations.
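As a small worked illustration of the Gini impurity index and of the voting step, the following Python sketch computes the impurity of a node, the size-weighted impurity of a candidate split, and the majority vote across trees. It is not taken from the study's R script; the function names and the example labels are assumptions made for illustration.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity, 1 - sum(p_i^2), over the classes present in `labels`."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def split_impurity(left, right):
    """Size-weighted impurity of a candidate split into two child nodes."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)


def majority_vote(tree_predictions):
    """Aggregate per-tree predictions by taking the most common class."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical node contents.
pure = ["rest"] * 4
mixed = ["rest", "rest", "groom", "alert"]
print(gini_impurity(pure))   # 0.0
print(gini_impurity(mixed))  # 0.625
print(majority_vote(["rest", "rest", "alert"]))  # rest
```

A pure node scores zero, whereas the mixed node (proportions 0.5, 0.25, 0.25) scores 1 − 0.375 = 0.625, so a tree favors splits that move the child nodes towards the first case.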
First, we randomly selected a training subset (70%) from the labelled dataset, and the remaining 30% served as a validation dataset for testing the accuracy of the Random Forest model’s predictions. Once we had built the model, we used it to predict the behaviors in the validation dataset. We then compared the predicted behaviors with the observed behaviors and produced a confusion matrix to assess the accuracy of the model.
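The validation procedure can be sketched as below, again in Python and under stated assumptions: the helper names, the fixed random seed, and the toy labels are hypothetical, and the model-fitting step itself is omitted. The sketch shows only the random 70/30 split and how accuracy is read off a confusion matrix.

```python
import random
from collections import defaultdict


def train_validation_split(rows, validation_fraction=0.3, seed=0):
    """Shuffle rows and hold out `validation_fraction` of them for validation."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


def confusion_matrix(observed, predicted):
    """Count (observed, predicted) pairs; keys are tuples of class labels."""
    matrix = defaultdict(int)
    for obs, pred in zip(observed, predicted):
        matrix[(obs, pred)] += 1
    return dict(matrix)


def accuracy(matrix):
    """Share of predictions that fall on the diagonal of the confusion matrix."""
    correct = sum(n for (obs, pred), n in matrix.items() if obs == pred)
    return correct / sum(matrix.values())


# Hypothetical validation labels and model predictions.
observed = ["rest", "rest", "groom", "alert", "rest"]
predicted = ["rest", "groom", "groom", "alert", "rest"]
matrix = confusion_matrix(observed, predicted)
print(accuracy(matrix))  # 0.8

train, validation = train_validation_split(range(100))
print(len(train), len(validation))  # 70 30
```

Off-diagonal entries such as `("rest", "groom")` show which behaviors the model confuses, which is the information used to compare accuracy between behavioral categories.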
4. Discussion
Our aim in this study was to see if an accelerometer could accurately predict behaviors of a wild Javan slow loris, compared to those recorded by a human observer. By combining direct behavioral observation data and accelerometer data within a Random Forest model framework, we have successfully identified 21 combinations of six behaviors and 18 movement/position modifiers from a wild Javan slow loris with a mean accuracy of 91.6% in the training datasets and 94.6% in the validation datasets. The Random Forest model identified resting behaviors with the greatest accuracy (99.16%) and locomotive behaviors with the lowest accuracy (85.54%), which is consistent with the results of similar studies in other species [
13,
15]. This disparity may be due to fundamental differences between the two behavioral categories. Locomotive behaviors are more complex and varied than resting behaviors [
65], which likely increases the chances of confusion in the Random Forest model. Evidence of this complexity can be seen by looking at the number of combinations of behavior and position or locomotion; these included 11 combinations for locomotive behaviors (see
Table 5) and seven combinations for resting behaviors (see
Table 7). Interestingly, only three combinations were identified for feeding behaviors (see
Table 6), yet these were not identified with as high accuracy as resting behaviors. This may be due to the relatively small sample size of feeding behaviors (150 datapoints) as compared with resting (400 datapoints) or locomotive (2125 datapoints). Further study is needed to determine how sample size influences classification accuracy.
Within locomotive behaviors (explore and travel), we found seven positional or locomotive modifiers. The primary difference between the two behaviors, explore and travel, is the perceived intentionality of the loris by the human observer. The ethogram (see
Table 2) defines explore as “movement associated with looking for food or exploring the habitat”. This implies that the purpose of the loris’ movement is to search for food or simply to explore its environment. Travel is defined as “continuous, directed movement from one location to another”, which implies that the purpose of the movement is simply to get to another location. Both behaviors involve moving from one place to another and so look very similar to an accelerometer, which may not be sensitive enough to detect visual and olfactory searching. Human observers are still important to interpret subtle differences in behaviors such as these. For the purposes of future accelerometer studies, it may be beneficial to classify explore and travel behaviors under one behavioral category to avoid confusion until improvements in accelerometer technology make the devices sensitive enough to discern the nuances of behavior.
Across all behavior categories the most important variables were static_DorsoVentral, static_Lateral, accZ, and pitch. This somewhat reflects the results of Nekaris et al. (2022) [
13], which examined accelerometer data from a captive individual of a different species of loris,
Nycticebus bengalensis. They found Static_Lateral and Static_DorsoVentral to be the first and second most important variables, respectively, for predicting behaviors, while accZ was the third most important variable for just one behavior. In the current study, the second and third most important variables in the feeding category could not be determined because a different variable ranked second and third for each of the three distinct feeding behaviors. Of the three behavior categories, feeding behaviors had the smallest sample size. Replication of the model with a much larger sample size might reveal truer variable importance for feeding behaviors.
A greater sample size is needed to further validate and increase the reliability of the algorithm before we can run unlabeled data with confidence. The objective is to have ample sample size with the maximum number of subjects and behavior variability to build an algorithm that is robust enough to apply to any Javan slow loris accelerometer data without the need for correlating behavioral data. A reliable algorithm may potentially be further tested and then applied to any species with similar morphology and ecology [
66], such as other loris species, although some scientists caution against the use of one algorithm across different species [
67]. This study provides proof of method that can be applied to any lorisid species with an established ethogram.
Comparison of results with previous studies is difficult to do for a variety of reasons. Of the few studies that seek to investigate primate behavior using accelerometers, only two [
13,
15] classify behaviors using similar methods and present their results using the same metrics as those in the present study, i.e., model accuracy and variable importance.
Table 8 shows a comparison of the results of the present study, two similar primate studies, and three non-primate studies that classify animal behavior using accelerometers and Random Forest models. Other studies using accelerometers may seek to distinguish between periods of activity and inactivity [
30], overall activity patterns [
68], or may be concerned with identifying just one type of behavior [
69], requiring different methods and results metrics.
Boyd et al. (2004) [
3] pose a definition of bio-logging as the “investigation of phenomena in or around free-ranging organisms that are beyond the boundary of our visibility or experience”. Animals such as the Javan slow loris are small, arboreal, and nocturnal—all conditions that make them difficult for humans to observe in the wild. Bio-logging devices such as accelerometers effectively extend the capacity of our senses to allow us a previously inaccessible view into the activities and behaviors of animals such as the Javan slow loris, deep diving sunfish [
21], flying and diving seabirds [
43], or arctic muskox [
70]. The information obtained from such studies is important for understanding wildlife responses and resistance to global climate change and to anthropogenic environmental modification and destruction [
3,
65,
71]. At the same time, we can use bio-logging data to reconstruct environmental state and fluctuations since animal behavior is affected by the surrounding environment and therefore contains environmental information [
9]. These insights can be integrated into ecosystem management programs to resist the effects of climate change and environmental degradation, including for species across a broad geographic range [
72,
73,
74].