Article

The Effects of GPS-Based Buffer Size on the Association between Travel Modes and Environmental Contexts

by 1 and 2,3,4,*
1
Illinois Informatics Institute, University of Illinois at Urbana-Champaign, 616 E Green Street Suite 210, Champaign, IL 61820, USA
2
Department of Geography and Resource Management, The Chinese University of Hong Kong, Shatin, Hong Kong, China
3
Institute of Space and Earth Information Science, The Chinese University of Hong Kong, Shatin, Hong Kong, China
4
Department of Human Geography and Spatial Planning, Utrecht University, 3584 CB Utrecht, The Netherlands
*
Author to whom correspondence should be addressed.
ISPRS Int. J. Geo-Inf. 2019, 8(11), 514; https://0-doi-org.brum.beds.ac.uk/10.3390/ijgi8110514
Received: 15 October 2019 / Revised: 8 November 2019 / Accepted: 11 November 2019 / Published: 13 November 2019
(This article belongs to the Special Issue Geospatial Methods in Social and Behavioral Sciences)

Abstract

To investigate the association between physical activity (including active travel modes) and environmental factors, much research has estimated contextual influences based on zones or areas delineated with buffer analysis. However, few studies to date have examined the effects of different buffer sizes on estimates of individuals’ dynamic exposures along their daily trips recorded as GPS trajectories. Thus, using a 7-day GPS dataset collected in the Chicago Regional Household Travel Inventory (CRHTI) Survey, this study addresses the methodological issue of how the associations between environmental contexts and active travel modes (ATMs) as a subset of physical activity vary with GPS-based buffer size. The results indicate that buffer size influences such associations and the significance levels of the seven environmental factors selected as predictors. Further, the findings on the effects of buffer size on such associations and the significance levels are clearly different between the ATMs of walking and biking. Such evidence of the existence of buffer-size effects for multiple environmental factors not only confirms the importance of the uncertain geographic context problem (UGCoP) but provides a resounding cautionary note to all future research on human mobility involving individuals’ GPS trajectories, including studies on physical activity and travel behaviors, especially on the reliable estimation of individual exposures to environmental factors and their health outcomes.
Keywords: buffer analysis; GIS; environmental context; GPS trajectories; active travel modes

1. Introduction

With an increasing interest in the relationships between environmental contexts and human health-related behaviors and outcomes, many physical activity (PA) researchers have investigated the environmental influences on people’s moderate to vigorous PA. PA is “any bodily movement produced by skeletal muscles that results in energy expenditure” [1]. Moderate to vigorous PA especially, such as brisk walking and running, brings various health benefits [2]. Many studies have used geographic information system (GIS)-based buffer analyses due to the simplicity of using them to delineate contextual areas or zones and the effectiveness in deriving contextual variables with them. Buffer zones with a pre-specified distance are, thus, often used in research on PA to delineate areas within which individuals are potentially affected by specific environmental factors.
Home addresses have been a popular type of geographic location for delineating buffer zones. For example, previous studies investigated the effects of neighborhood green spaces around individuals’ home locations on their PA [3,4,5,6]. Some researchers have sought to identify a reasonable and reliable distance to capture true neighborhood effects. For instance, McGinn and colleagues [7] suggested that a 20-min walking distance—roughly 1.6 km or 1 mile—was appropriate for defining neighborhood areas from individual home locations in physical health research, whereas Berke and colleagues [8] claimed that a slightly smaller size, 1 km or 0.6 miles, might better capture the characteristics of people’s residential neighborhoods.
Compared with static locations such as home addresses, only a few studies have explored the effects of the sizes of GPS-based buffers when estimating dynamic exposure along individuals’ daily GPS trajectories. For GPS-based buffers, dynamic GPS points are used as entities to construct buffers instead of static home locations. By identifying where and when people spend their time and are exposed to a specific environmental influence in their daily lives, GPS data may help mitigate the uncertain geographic context problem (UGCoP), which arises due to the spatial and temporal uncertainties of contextual influences on people’s health-related behaviors or outcomes [9]. In the last decade or so, buffer analysis (e.g., 50 or 100 m buffers) has been used to delineate environmental contexts along individuals’ GPS trajectories that have spatially immediate and momentary influences on individuals’ moderate to vigorous PA [10,11]. However, because previous studies have reached no consensus on the best distance for creating GPS-based buffers, and because research results may vary with the buffer size used, there is an urgent need to investigate the effects of different buffer sizes on research findings to provide insights into public health and transportation research.
Thus, this study addresses the methodological issue of the effects of different buffer sizes on the estimated relationships between environmental contexts and active travel modes (ATMs), which is a manifestation of the UGCoP. As a subset of PA, ATMs include only walking and biking in this study, although there are other ATMs in our daily lives, such as running and roller-skating. Sensitivity analyses were conducted to investigate the varying associations between seven environmental factors (crime, trees, parks and open spaces, neighborhood median household income, neighborhood population, transit availability, and traffic collision) and walking and biking, taking into consideration 11 different buffer sizes ranging from 20 to 200 m. In addition, this study used multinomial logistic regression to examine the relationships between buffer size and the impacts of the seven environmental factors on people’s ATMs, and explored the statistical significance of each predictor across the 11 different buffer sizes.
This study contributes to the literature on the accurate estimation of individual exposures to various environmental contexts.
This article is structured as follows. Past studies on the estimation of environmental exposure using buffers are reviewed in Section 2. The GPS dataset, the GIS dataset, and the analytical methods used in this study are explained in Section 3. Section 4 presents the results obtained from the statistical analyses on the associations between ATMs and multiple environmental factors, and Section 5 discusses the research findings and concludes the article.

2. Estimation of Individual Environmental Exposure using Buffer Analysis

GIS methods have been essential for estimating the impact of environmental influences on PA. A large body of research has used buffer analysis to delineate neighborhoods around individuals’ home addresses and find empirical evidence regarding the associations between PA and neighborhood characteristics [5,7,8,12,13]. Buffering is one type of spatial analysis that can be used to assess the effects of various factors by defining zones around geometric primitives, such as points, lines, or polygons, which represent geographic entities or objects. Circular (radial) buffers with the same distance in all directions are widely used to define neighborhood or contextual areas in an isotropic manner, whereas network buffers are used to define anisotropic contextual areas taking into account individuals’ reachable distances along road networks (Figure 1). A circular shape has been the most popular form to delineate contextual areas based on a specific distance from a set of entities (e.g., people’s home locations). For instance, McGinn and colleagues [7] created 1.6 km (1 mile) circular buffers around the homes of 1270 adults in Forsyth County, NC, and Jackson City, MS, to examine the associations between the built-environment and adults’ PA for leisure and transportation purposes. The 1.6 km distance is equivalent to a 20-min walking distance, which seems a reasonable distance for delineating people’s residential neighborhoods based on their home locations. The study found that, in Jackson, MS, people who live in neighborhoods with low-traffic volumes were less likely to meet physical activity recommendations.
Because a neighborhood or contextual area can be delineated based on different characteristics of participants or the environment, various buffer sizes were used in some studies [5,8,13,14,15]. Berke et al. [8], for instance, used smaller buffer sizes (e.g., 100 m, 500 m, and 1 km), which may better capture the walkable areas for older adults around their homes. As one of the characteristics of the built-environment, the effects of green space on people’s PA may depend on the proximity between green space and people’s homes. For example, Maas and colleagues [13] used 1 km and 3 km circular buffers to measure the percentages of green space around participants’ homes, while Cerin and colleagues [16] applied 500 m and 1 km network buffers to delineate reachable green space for adults in 12 countries. Browning and Lee [17] conducted a systematic literature review and found that when buffers were created around individuals’ home addresses, a large number of studies found significant associations between greenness and better physical health or health behaviors, including PA, as the buffer size increases up until 1999 m. Further, Nagel and colleagues [5] concluded that the significance of the associations between some built-environment factors and PA could vary depending on the sizes of the buffers that represent different ranges of neighborhood or contextual areas.
The increasing adoption of GPS in PA research has led to the identification of the locations where PA occurs. In some studies, GPS points falling within 400 m to 1600 m circular or network buffers from people’s home locations were considered [18,19,20,21]. These studies combined GPS points with objectively measured PA to understand the effect of residential neighborhoods on people’s PA. Boruff and colleagues [18] further investigated different buffer types and their influence on research findings. Further, different sizes of buffers were considered in recent studies to delineate more accurate residential neighborhoods for specific PA types (e.g., walking or bicycling) [22,23]. In other words, with respect to the size of buffers, the types of PA in question became a critical factor that needed to be taken into account to find appropriate buffer sizes.
However, only a few studies to date have considered buffers along individual trips traced by GPS trajectories for assessing the dynamic influence of environmental factors. Among these studies, Rodríguez et al. [10] justified the use of 50 m buffers around each GPS point to estimate daily exposures of adolescent females to built-environment characteristics. The purpose of using a 50 m distance was to avoid the potential dependence in the estimated effects of the built-environment between two consecutive GPS points. Further, Burgoine and colleagues [11] applied a hybrid method by using 100 m circular buffers for estimating environmental exposures during children’s trips from and to school and 800 m network buffers for their residential and school neighborhoods. Regarding the trips from and to the school of children, Harrison and colleagues [24] used the actual routes derived from GPS points and the predicted routes calculated using the shortest path algorithm to compare the food and PA environments in the 100 m buffers along the two kinds of routes. Yin and colleagues [25] highlighted that, although the moderate to vigorous PA of youths usually occurred within a 0.25 or 0.3-mile radius around their residences considering their daily trips, their space-time paths were not uniformly distributed in the radial area. In addition, Houston [26] tested 50, 250, and 500 m buffers created around the GPS trajectories of 55 adults and found that the results varied between different buffer sizes. For example, the magnitude of the impact of green space on moderate to vigorous PA was diminished as the buffer size increased to 500 m.
Therefore, this study conducted an in-depth investigation into the effects of buffer size on research results concerning PA based on GPS trajectories. With regard to the UGCoP, Kwan [27] highlighted the need for performing sensitivity analysis in particular, in order to better understand the extent to which research findings and contextual influences are affected by different delineations of contextual units. Hence, this study examines whether GPS-based buffer size affects the associations between ATMs, including walking and biking, and multiple physical and social environmental factors that previous studies have not explored. Further, when compared to Houston’s research [26], this study examines individual environmental exposures using smaller buffers (e.g., from 50 m to 200 m).

3. Method

3.1. GPS Data

The GPS data and daily activity diaries collected in the Chicago Regional Household Travel Inventory (CRHTI) project were used in this study. Chicago is the third-largest metropolitan area in the U.S., where people are exposed to various urban opportunities and built-environments. This study was approved by the University of Illinois Institutional Review Board. In the CRHTI survey, GPS trajectory data were recorded from members of 147 households for 7 days between September 2007 and December 2007, using GlobalSat Data Logger. Daily activity diaries regarding destinations and trips were reported during the first of the 7 survey days. Among these participants, only 178 persons from 73 households had both complete personal and household information in addition to GPS data and activity diaries, and 168 adults who were 18 years old or older were selected as subjects for this study. The GPS data were recorded at a 5-s interval when a participant was moving at a speed of at least one mile per hour (mph), which is the speed of slow walking. Because GPS points at 5-s intervals are too numerous to process for buffer analysis, the points were sampled at a 10-s interval, which halved their number and reduced the computation time for the following analyses. Before data analysis, Kalman filtering [28], based on linear quadratic estimation, was performed to increase the accuracy of the GPS data. Further, trips with a duration of less than 3 min were excluded because such GPS tracks are most likely not real trips (e.g., they may result from drifting GPS points) and can be wrongly identified as trips.
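The two data-reduction steps above (sampling at a 10-s interval and dropping trips shorter than 3 min) can be illustrated with a minimal Python sketch. The study itself used R, and its code is not given; the tuple layout, function names, and toy coordinates below are assumptions made purely for illustration.

```python
from typing import List, Tuple

# A GPS fix is (timestamp_in_seconds, latitude, longitude); this layout is hypothetical.
Fix = Tuple[float, float, float]

def downsample(trip: List[Fix], interval_s: float = 10.0) -> List[Fix]:
    """Keep the first fix, then every fix at least interval_s after the last kept one."""
    kept: List[Fix] = []
    for fix in trip:
        if not kept or fix[0] - kept[-1][0] >= interval_s:
            kept.append(fix)
    return kept

def drop_short_trips(trips: List[List[Fix]], min_duration_s: float = 180.0) -> List[List[Fix]]:
    """Discard trips whose first-to-last duration is below min_duration_s (3 min by default)."""
    return [t for t in trips if t and t[-1][0] - t[0][0] >= min_duration_s]

# Toy data: a 5-minute trip logged every 5 s and a 1-minute fragment (e.g., GPS drift).
long_trip = [(float(t), 41.88, -87.63) for t in range(0, 301, 5)]
short_trip = [(float(t), 41.88, -87.63) for t in range(0, 61, 5)]

trips = drop_short_trips([long_trip, short_trip])
sampled = [downsample(t) for t in trips]
print(len(long_trip), "fixes reduced to", len(sampled[0]))
```

Sampling on elapsed time rather than on every second record keeps the spacing close to 10 s even if the logger skipped fixes while the participant was stationary.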

3.2. Travel Mode Classification

To obtain the travel modes of the trips recorded in the survey, the travel mode classification algorithm developed by Lee and Kwan [28] was adopted. The algorithm identifies travel modes like walking and biking using machine learning. Among the three main processes that constitute the travel mode classification algorithm, the “classification using GPS data” process was modified, optimized, and implemented to classify walking, in-vehicle status, biking, and running for this study, since GPS data were the only sensor data collected by the CRHTI project. This optimization process includes the test of different sets of distance and time windows—10 to 500 m distance and 10 to 300 s time—to find the best predictive accuracy, which was not done in the previous version of the algorithm (10 to 200 m distance and 10 to 180 s time). Distance and time windows play an important role in better predicting such travel modes with GPS data by creating many variables related to the movements of humans and vehicles.
The overall predictive accuracy of the optimized classification algorithm was 97.00% (walking: 97.58%, in-vehicle: 97.24%, biking: 89.33%, and running: 99.01% in Figure 2) with 10 to 300 s time windows and 10 to 500 m distance windows when the Geolife GPS data described in Section 3.1 in [28] were used as training and test datasets. When 10 to 180 s time windows and 10 to 200 m distance windows—which were the minimum ranges of window sizes in the test—were used, the algorithm achieved the lowest predictive accuracy (95.04%). Next, the optimized classification algorithm was also tested using real-world GPS trajectories collected from three subjects described in Lee and Kwan [28]. The three subjects were young adults and all males with no health issues. It improved the accuracy of walking (99.38%) and in-vehicle status (95.81%) when compared to the previous version (walking: 98.25%; in-vehicle: 91.98%), whereas biking (90.47%) showed a slightly lower accuracy than the original version of the algorithm (90.95%). Running observations in the real-world GPS trajectories were replaced in this study with almost 2-h records from TrackProfiler (Višnjan, Croatia), which is an online service to upload and share GPS tracks, because the number of observations for running was too small to measure the performance of the algorithm in predicting running in the previous study. With the replaced observations, the optimized classification achieved an accuracy of 90.95% in identifying running. Predicted results of the CRHTI GPS data and their clustering patterns were represented using kernel density estimation, as shown in Figure 3.

3.3. Statistical Analyses

GPS-based buffers are more spatially specific and can better capture the environmental contexts along individuals’ GPS trajectories when compared to other delineation methods, including activity spaces [29,30,31,32,33], kernel densities [34,35], and daily potential path areas [36]. Further, buffers created using GPS points can also take into account the temporal dimension of exposure, enabling the estimation of people’s cumulative exposures to environmental influences (e.g., more exposure when individuals stay for a longer time at specific places), whereas other delineation methods often neglect it (cf. Wang and Kwan [37]).
Circular buffers are created around each GPS point with the predicted travel modes (Figure 4). As for the impact of the environment on people’s PA, the in-vehicle status contrasts with ATMs, such as biking (Winters et al., 2010). Therefore, the in-vehicle status can serve as a reference category when the associations between each travel mode and the environmental context are examined.
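As a rough illustration of a circular buffer around a single GPS point, the sketch below counts point events (for example, crime or collision locations) that fall within a given radius. It uses great-circle (haversine) distance on latitude/longitude pairs, which is an assumption; the study's actual GIS workflow, projection, and data layout are not described at this level, and all names and coordinates here are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, a common approximation

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude pairs."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def count_within_buffer(gps_point, incidents, radius_m):
    """Number of incident locations inside the circular buffer around gps_point."""
    lat, lon = gps_point
    return sum(1 for ilat, ilon in incidents
               if haversine_m(lat, lon, ilat, ilon) <= radius_m)

# Toy example: one GPS point and three incidents roughly 33 m, 111 m, and 1.1 km to the north.
point = (41.8800, -87.6300)
incidents = [(41.8803, -87.6300), (41.8810, -87.6300), (41.8900, -87.6300)]

for radius in (20, 50, 200):
    print(radius, "m buffer:", count_within_buffer(point, incidents, radius))
```

Even in this toy case the count changes with the radius, which is the kind of sensitivity the study examines across its buffer sizes.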
For the statistical model in this study, multinomial logistic regression was used to examine the associations between ATMs and environmental contexts. For the response variable, predicted inactive travel and ATMs were compared (i.e., in-vehicle versus walking, and in-vehicle versus biking). Logistic regression analyses provided odds ratios (ORs) for each predictor as estimates of how the odds of a specific outcome (e.g., active travel) relative to another (e.g., inactive travel) change with that predictor. In this paper, an OR greater than 1 means that walking or biking becomes more likely relative to in-vehicle status as a given predictor increases. On the other hand, an OR less than 1 indicates that in-vehicle status becomes more likely. In logistic regression, ORs are derived by exponentiating the beta coefficient of a given variable as follows.
$$\text{Odds Ratio} = e^{\beta},$$
where β is a coefficient of a given explanatory variable in logistic regression.
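The relationship between a coefficient and its OR can be checked numerically. The short Python sketch below is only an illustration: the function names are hypothetical, and the OR of 2.06 is the value reported later in this article for transit availability and walking in Model 1.

```python
import math

def odds_ratio(beta: float) -> float:
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(beta)

def beta_from_odds_ratio(or_value: float) -> float:
    """Inverse mapping: the coefficient implied by a reported odds ratio."""
    return math.log(or_value)

# A coefficient of zero means no association (OR = 1).
print(odds_ratio(0.0))

# An OR of 2.06 corresponds to a coefficient of about 0.72.
print(round(beta_from_odds_ratio(2.06), 2))

# An OR below 1 corresponds to a negative coefficient.
print(beta_from_odds_ratio(0.97) < 0)
```

Because the mapping is exponential, ORs very close to 1.00 correspond to coefficients very close to zero per unit of the predictor, so their practical size depends on the unit in which the predictor is measured.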
Each observation considered in the models was a GPS point, and 7 different environmental variables were included in the models based on the buffer areas delineated by different distances and assigned to each observation. As shown in Table 1, those 7 predictor variables represent different aspects of the physical, social, and safety-related environmental contexts. Among the 7 predictors, trees, parks and open spaces, transit accessibility, crime, and traffic collision were selected because they were widely used as important variables in previous studies [4,5,10,18,38,39,40,41,42,43,44,45,46]. Further, since many studies reported mixed or non-significant findings regarding the influence of safety-related factors on people’s active travel, this study specifically included crime and traffic collisions as public safety factors [5,7,18,46,47]. Individuals’ perceptions regarding neighborhood environments are also influential factors on PA [48], and thus, neighborhood-level income and population were included as predictors in this study.
Trees and parks and open spaces were included in the environment category. The transit availability index in the transport category is a measure that “takes into account transit service frequency, pedestrian friendliness, network distance to transit stops, and the number of subzone connections” [49]. It represents transit accessibility as an index from 1 to 5 (1 represents the lowest while 5 the highest level of accessibility to transit). The crime predictor is the incidence of 11 types of violent crimes [50,51]. Traffic collisions involving pedestrians and cyclists associated with public safety were taken into account in this study. Aside from those predictors, age, race, household income, and weekday/weekend were included as confounding factors.
Three models were iteratively generated using 11 buffer sizes—20, 30, 40, 50, 60, 70, 80, 90, 100, 150, and 200 m; therefore, a total of 33 models were created. For each buffer size, the 7 environmental variables were calculated iteratively. Specifically, for parks and open spaces, areas included in each buffer size were computed. Regarding transit availability index and neighborhood median household income and population, area-weighted averages were calculated for each buffer size. The three models included a slightly different number of environmental variables. The first model included the number of cases of violent crime, the percentage of tree areas, the density of park and open space, the transit availability index, and the number of traffic collisions adjusted for age, race, household income, and weekday/weekend. The second model added the density of African Americans in the neighborhood as a predictor, in addition to those predictors already included in the first model. The third model included neighborhood median household income in addition to those predictors already included in the second model. The second and third models were used to explore the effects of neighborhood socioeconomic status on the likelihood of individuals adopting walking or biking against traveling in vehicles as a travel mode. Regarding using the percentage of African Americans in a neighborhood as a predictor, a past study observed that “African Americans perceived their neighborhoods as less safe and less pleasant for physical activity than did whites, regardless of the racial composition of the neighborhood” [48]. In other words, African Americans tend to rate their neighborhoods lower than whites on both safety and pleasantness, and thus, neighborhoods with a high percentage of African Americans tend to be perceived negatively for physical activity (which in turn, would negatively influence people’s intention to undertake physical activity).
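The model grid and the area-weighted averaging described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' R code: it assumes the overlap areas between a buffer and each zone have already been computed in a GIS, and the function name and toy numbers are hypothetical.

```python
# The 11 buffer radii (in metres) and the three model specifications named in the text.
BUFFER_SIZES_M = [20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200]
MODELS = (1, 2, 3)

# One fitted model per (model, buffer size) pair.
model_grid = [(model, radius) for radius in BUFFER_SIZES_M for model in MODELS]
print(len(model_grid))  # 33

def area_weighted_mean(pieces):
    """Area-weighted average of zone values inside one buffer.

    pieces is a list of (overlap_area, zone_value) pairs, where overlap_area is
    the part of the buffer that falls in a zone (assumed to be precomputed).
    Returns None when the buffer overlaps no zone.
    """
    total_area = sum(area for area, _ in pieces)
    if total_area == 0:
        return None
    return sum(area * value for area, value in pieces) / total_area

# Toy buffer overlapping two zones: 6000 m^2 with transit index 4, 2000 m^2 with index 2.
print(area_weighted_mean([(6000.0, 4.0), (2000.0, 2.0)]))  # 3.5
```

Weighting by overlap area means a zone that only clips the edge of a buffer contributes little to the value assigned to that GPS point, and the balance shifts as the radius grows.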
Three goodness-of-fit measures (pseudo-R-squared, AIC (Akaike information criterion), and BIC (Bayesian information criterion)) were used to compare the fit of the three models.
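For reference, the standard definitions of these measures can be written as a small Python sketch. The article does not state which pseudo-R-squared variants were used, so McFadden's version is shown here only as one common choice; the log-likelihood values and parameter count in the example are made up for illustration.

```python
import math

def aic(log_lik: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_lik

def bic(log_lik: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: k ln(n) - 2 ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2 * log_lik

def mcfadden_r2(log_lik: float, log_lik_null: float) -> float:
    """McFadden's pseudo-R-squared: 1 - ln(L_model) / ln(L_null)."""
    return 1 - log_lik / log_lik_null

# Hypothetical values, not results from the study.
ll_model, ll_null, k = -900.0, -1000.0, 10
n = 156627  # number of GPS-point observations reported in this article

print(aic(ll_model, k))
print(bic(ll_model, k, n))
print(mcfadden_r2(ll_model, ll_null))
```

With this many observations, BIC penalizes each extra parameter considerably more than AIC does, so agreement between the two measures is a stronger signal than either alone.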
R statistical software was used to compute the variables for a large number of GPS points with different buffer sizes and to perform the statistical analyses. Extreme Science and Engineering Discovery Environment (XSEDE) Jetstream with Intel Xeon E5-2680v3 CPUs (24 cores) accelerated the variable calculations in this study [52,53].

4. Result

4.1. Descriptive Statistics

Table 2 shows the descriptive statistics of the 168 adults, including their personal characteristics and average daily travel time predicted using the optimized travel mode classification algorithm described in Section 3.2. In the statistical summary, the percentage of females was slightly higher than that of males. Most of the participants were white and middle-aged. According to the population statistics of Chicago (United States Census Bureau, 2010), whites made up 45% of the population of the study area but 81.5% of our sample, and African Americans accounted for 33% of the population but only 10.7% of our sample. High-income people comprised the dominant group, and vehicles, including private cars and public transport, were the most-used modes for daily travel, accounting for one hour per day on average. Since running was rarely performed in the daily lives of the 168 participants, it was excluded from this study. The other three travel modes (i.e., walking, traveling in a vehicle, and biking) were considered.
The total number of GPS points was 156,627 (156,627 observations). The environmental characteristics within the 50 m buffers around these GPS points are described in Table 3 to give a sense of how the participants were dynamically exposed to different environmental contexts in their daily lives. Regarding the predicted travel modes, 40,999 GPS points were identified as walking, whereas only 2545 points were classified as biking. The buffer areas of the GPS points associated with walking had, on average, higher tree density, higher transit availability and battery incidence, and more traffic collisions involving pedestrians and cyclists than the buffer areas of the GPS points associated with biking and in-vehicle. On the other hand, buffer areas of the GPS points associated with biking had, on average, more park and open space areas and higher neighborhood median household incomes. The correlations among the seven predictors were analyzed to evaluate multicollinearity using correlation coefficients, and it was found that there was very little correlation between any pairs of predictors (not presented).

4.2. Sensitivity Analyses of the 11 Sizes of Buffers

The varying associations between participants’ ATMs and the seven environmental factors in Models 1, 2, and 3 are shown in Figure 5. In all three models, different buffer sizes affected such associations in terms of both the significance levels of the variables and the ORs. When trees, parks and open spaces, transit availability, battery, and traffic collision were included as predictors in Model 1, the associations between participants’ ATMs and these environmental factors were mostly significant for walking across the 20 to 200 m buffers, whereas only trees and traffic collisions involving pedestrians and cyclists had consistently significant associations for biking across buffer sizes. When the density of neighborhood African Americans was added as a predictor in Model 2, transit availability became more significant for all the buffer sizes for biking versus in-vehicle; however, parks and open spaces and crime remained non-significant for most of the buffer sizes for biking. The added density of African Americans in a neighborhood was also not significantly associated with biking. In contrast, the results of Model 3 with the additional neighborhood median household income variable indicate that the effects of buffer size on significance levels are largely alleviated for trees, transit availability, and neighborhood median household income, for both walking and biking. The rest of the predictors, however, do not show consistent significance levels across the buffer sizes. In particular, parks and open spaces, crime, and the density of neighborhood African Americans become significant for biking only with the relatively large buffer sizes (150 and 200 m). Further, the predictor of traffic collision is not significant for biking versus in-vehicle until the buffer size reaches 40 m.
In all three models, the associations between walking versus in-vehicle and all predictors mostly had high significance levels (p < 0.001), and the ORs varied as the buffer size changed, while the graphs of biking versus in-vehicle mostly show stable trends across the buffer sizes, except for crime and traffic collisions. For these two safety-related factors in particular, the ORs of walking and biking compared to in-vehicle status show common characteristics. They begin with similar ORs for small buffer sizes around 20 and 30 m, diverge increasingly as the buffer size becomes larger, and either cross at a certain size (crime) or widen further (traffic collisions).
Since Model 3 had more significant variables across different buffer sizes, and 200 m was the distance with the largest number of highly significant predictors in Model 3, as shown in Figure 6, 200 m was selected as the most appropriate buffer distance to examine the associations between ATMs and the environmental factors in this study. Figure 6 indicates the significance levels of the seven predictors for walking and biking, and thus, the maximum number of significant variables is 14. The histogram in Figure 6 indicates that as buffer size gets closer to 200 m, more environmental variables become significant.
The associations of the seven predictors derived with 200 m buffers and travel modes are shown in Table 4. All the model fit measures, including the three kinds of pseudo-R-squared values, indicate that Model 3 with the added neighborhood median household income better explains variations in the outcome variable—predicted travel modes—than the other two models. The higher percentages of tree areas (OR: 1.05), transit availability (OR: 2.06), incidence of crime (OR: 1.00), and traffic collision (OR: 1.01) were significantly associated with higher odds of walking, whereas traffic collisions (OR: 0.99) were significantly associated with lower odds of biking compared to in-vehicle status in Model 1. For walking, the density of parks and open spaces (OR: 0.97) had a significant association with lower odds of walking against in-vehicle status in Model 1. The results of Model 2 were similar to Model 1, and the model fit measures of Model 2 were not improved much after the density of neighborhood African Americans was added.
In contrast to Models 1 and 2, all the variables had significant associations with walking and biking in Model 3, which also showed much-improved pseudo-R-squared, AIC, and BIC values. Tree density, transit availability, and crime had significant associations with higher odds of both walking (OR: 1.05, 1.92, and 1.00, respectively) and biking (OR: 1.00, 1.09, and 1.00, respectively) compared to in-vehicle status in Model 3. Neighborhood median household income, density of neighborhood African Americans, and traffic collisions were associated with higher odds for walking (OR: 1.00, 1.00, and 1.01, respectively), but with lower odds for biking (OR: 0.99, 0.99, and 0.99, respectively) compared to in-vehicle status. On the other hand, parks and open spaces were associated with lower odds of walking (OR: 0.98) and biking (OR: 0.99) against in-vehicle status.
To determine the effects of all the environmental factors on walking or biking on the probability scale, the average marginal effects were calculated. In Model 3, the average marginal effect of transit availability on walking was the highest (0.1061) among all the predictors, indicating that the probability of walking is approximately 10 percentage points higher in areas with high transit accessibility than in areas with low transit accessibility. It was also found that areas with higher tree density had, on average, an approximately 1% higher probability of walking than areas with lower tree density.
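For readers unfamiliar with average marginal effects, the following is a minimal sketch of how they can be computed for a multinomial logit with in-vehicle status as the reference outcome. The coefficients and observations are hypothetical placeholders, not estimates from this study, and the function names are assumptions. The sketch uses the standard analytic derivative dP_j/dx_m = P_j (b_jm - sum_k P_k b_km), averaged over observations.

```python
import math
from typing import List, Sequence


def mnl_probabilities(x: Sequence[float], betas: Sequence[Sequence[float]]) -> List[float]:
    """Outcome probabilities of a multinomial logit for one observation.

    `betas[j]` holds the coefficients of outcome j; the reference outcome
    has an all-zero coefficient vector.
    """
    scores = [sum(b * v for b, v in zip(beta, x)) for beta in betas]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def average_marginal_effects(
    X: Sequence[Sequence[float]],
    betas: Sequence[Sequence[float]],
    var_index: int,
) -> List[float]:
    """Average over observations of dP_j/dx_m = P_j * (b_jm - sum_k P_k * b_km)."""
    n_outcomes = len(betas)
    totals = [0.0] * n_outcomes
    for x in X:
        p = mnl_probabilities(x, betas)
        weighted = sum(p[k] * betas[k][var_index] for k in range(n_outcomes))
        for j in range(n_outcomes):
            totals[j] += p[j] * (betas[j][var_index] - weighted)
    return [t / len(X) for t in totals]


# Hypothetical coefficients for [intercept, transit index, tree %].
betas = [
    [0.0, 0.0, 0.0],     # in-vehicle (reference outcome)
    [-4.0, 0.65, 0.05],  # walking
    [-5.0, 0.09, 0.00],  # biking
]
# Hypothetical observations: [1 (intercept), transit index, tree %].
X = [[1.0, 4.5, 20.0], [1.0, 3.0, 5.0], [1.0, 5.0, 40.0]]

ame_transit = average_marginal_effects(X, betas, var_index=1)
```

Because the outcome probabilities always sum to one, the marginal effects of a predictor across the three outcomes sum to zero: a gain in the probability of walking is offset by losses in the other modes.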

5. Discussion and Conclusions

This study explored how different buffer sizes affect the associations between ATMs and multiple environmental factors (including the physical, social, and safety environments) when spatially immediate and temporally momentary exposures are estimated around individuals’ GPS trajectories for PA and transportation research. In addition, the sensitivity analysis with different buffer sizes addressed the UGCoP by showing that the study results are sensitive to the choice of buffer size and that larger buffers yield more significant findings. Among the three models, Model 3 had more significant variables across different buffer sizes, and 200 m was the most appropriate buffer distance for the model. Based on the ORs and the significance levels of the environmental variables, the study found that buffer size influences the associations between ATMs and the environmental factors, and that the findings about the ORs and significance levels clearly differ between walking and biking. Specifically, the associations between biking and/or walking and parks and open spaces, crime, and traffic collisions do not remain consistent: the ORs increase or decrease, and the associations move from positive to negative or vice versa, as the buffer size increases. A possible explanation for this inconsistency is that the changes in the direction of the associations between 20 m and 30 m, when compared to other sizes, are caused by the insufficient size of the 20 m buffer areas, which do not include any park areas or incidents of crime and traffic collisions around individual GPS trajectories. Among these three predictors, parks and open spaces in particular showed a decrease in the magnitude of their influence on biking over in-vehicle status, which corroborates Houston’s findings [26].
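The explanation offered for the 20 m buffers can be illustrated with a small sketch that counts point incidents inside circular buffers of increasing radius around a single GPS point. The coordinates are hypothetical planar values in meters (a projected coordinate system is assumed), the list of radii is an example rather than the full set used in this study, and the function names are illustrative.

```python
import math
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, float]  # planar coordinates in meters (projected CRS assumed)


def count_within(center: Point, incidents: Iterable[Point], radius_m: float) -> int:
    """Number of incident points inside a circular buffer around `center`."""
    cx, cy = center
    return sum(1 for (x, y) in incidents if math.hypot(x - cx, y - cy) <= radius_m)


def exposure_by_buffer(center: Point, incidents: List[Point],
                       radii_m: List[float]) -> Dict[float, int]:
    """Incident counts for each buffer radius around one GPS point."""
    return {r: count_within(center, incidents, r) for r in radii_m}


def density_per_km2(count: int, radius_m: float) -> float:
    """Convert a count inside a circular buffer into incidents per square kilometer."""
    area_km2 = math.pi * (radius_m / 1000.0) ** 2
    return count / area_km2


# Hypothetical GPS point and crime locations (meters).
gps_point = (0.0, 0.0)
crimes = [(25.0, 0.0), (0.0, 60.0), (-120.0, 80.0), (300.0, 0.0)]

counts = exposure_by_buffer(gps_point, crimes, [20, 30, 50, 100, 150, 200])
```

In this toy example the 20 m buffer captures no incidents while the 30 m buffer captures one, so the measured exposure jumps from zero to a very high density between two adjacent buffer sizes. A jump of this kind is consistent with the unstable associations described above for the smallest buffers.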
The associations between ATMs and the environmental factors are more sensitive for biking than for walking, showing varying statistical significance levels across different buffer sizes over parks and open spaces, transit availability, crime, the density of neighborhood African Americans, and traffic collisions. One common characteristic in the outcomes is that non-significant associations become significant when the buffer size reaches a relatively large distance, like 150 or 200 m. Using 200 m buffers, in this study particularly, produced more significant variables in Model 3 and obtained better model fit assessments based on several pseudo-R-squared measures than the other two models. Furthermore, Model 3 when using 200 m buffers showed the best fit when compared to all the other, shorter buffer distances. Neighborhood-level demographics and socioeconomic characteristics derived with each buffer at a GPS point played an important role in producing the better model, which may be relevant to individuals’ perceptions of opportunities for health-promoting behaviors [48]. Thus, the evidence on the existence of buffer-size effects on multiple environmental factors obtained in this study provides more systematic insights into PA and transportation research than previous studies regarding GPS-based buffers [10,26].
In the physical environment, the percentage of tree areas as a proxy of greenness derived with 200 m buffers around each GPS point was one of the consistent predictors, showing stable and significant associations in the three models. Tree density has higher ORs for walking and biking when compared to in-vehicle travel. With the objectively measured tree density, this study shows that greenness is likely to be associated with more walking (relative to motorized travel modes), which is consistent with the findings in previous studies [38,39]. The role of parks and open spaces in promoting walking and biking is, however, inconsistent with other studies [4,18,40,41,42,43,44,45]: the results here suggest that the more park area individuals are exposed to in their daily trips, the lower the odds of active travel relative to the motorized travel mode. One possible explanation is that some adults intentionally take a detour when they drive home to enjoy the fleeting natural landscape, including green space, which may give in-vehicle status higher odds than the two ATMs [54]. In addition, since parks and open spaces have more complex characteristics, such as quality and availability [55], which may affect their associations with the use of ATMs, the findings about the effects of parks and open spaces are not as consistent as those of tree density. The higher ORs of non-motorized travel modes (walking and biking in this study) against in-vehicle status with regard to transit availability also correspond to past studies, indicating that transit facilities encourage people’s use of ATMs and PA [41,46]. Thus, to promote walking or biking, urban planners may need to consider such effects of trees, parks and open spaces, and transit availability on active travel.
One piece of salient evidence that this study yielded is that safety-related factors, including crime and traffic collisions, have significant associations with walking and biking. More traffic collisions involving pedestrians and pedal cyclists are significantly associated with a lower likelihood of biking compared to traveling by private vehicles or public transit, which provides empirical evidence that traffic collisions constrain PA [56]. Conversely, there are mixed findings concerning walking. Unlike biking, walking is more likely to occur in areas with more traffic collisions. Furthermore, a higher incidence of battery is significantly associated with higher ORs of walking and biking when compared to in-vehicle status. The higher ORs of walking relative to in-vehicle status in the associations between ATMs and crime and traffic collisions were unexpected, suggesting that walking is more likely to happen than in-vehicle travel in areas with more crime cases and traffic collisions in the immediate surroundings. One possible reason behind these inconsistent associations is that larger buffers included more crimes and traffic collisions around places where walking and biking occurred, and this may have affected the results. Furthermore, with the different findings in the associations between walking and biking and traffic collisions, this study provides empirical evidence that the mechanisms underlying the associations between different travel modes and environmental factors may not work identically. The influence of neighborhood median household income and African American density also indicates that some environmental factors could have opposite effects on the two active modes. For example, relative to in-vehicle status, walking is more likely in neighborhoods with a high percentage of African Americans and high median household incomes, while biking shows the opposite pattern.
The optimized travel mode classification algorithm adopted to automatically identify walking, running, biking, and in-vehicle status is one of the innovative parts of this study. Such automatic classification of travel modes only uses GPS trajectories and achieved remarkable accuracy in identifying those four travel modes. With the newly-adopted travel mode classification algorithm, this study suggests a novel way of using estimated travel modes in health, transportation, and urban planning research to understand individuals’ dynamic exposures to environmental factors and their impacts on individuals’ PA, taking into account people’s daily trips recorded by GPS trajectories.
This study, however, has some limitations. First, this study did not consider trip chains as an analytical unit. Physical activity research tends to use each GPS point as the analytical unit for exposure estimation, while transportation research uses trips as the analytical unit to estimate exposures around people’s travel routes [57]. Specifically, the point-by-point approach can make the results more sensitive to GPS points with low accuracy. Second, this study does not address the biases associated with selective daily mobility. Because this study focused only on associations rather than causal relationships, it could not ascertain the causal effects of environmental factors on people’s travel (cf. [58]). For example, this study could not identify the reasons why people selected a particular type of environment to walk or bike or why exposure to specific environments made people walk or bike. Third, correlations between the observations were not addressed in this study. Due to the large number of GPS points recorded at a high frequency (10-s interval), consecutive GPS points may have had very similar values of environmental characteristics for a subject. Observations from different subjects can also be related to each other, since the GPS data were collected from household members, and therefore, parents and their children and siblings may share the same trips. Many statistical tests require observations to be independent, and ignoring the correlations between observations can cause the overestimation of p-values [59]. This issue should be examined in future studies by, for instance, comparing the results to those obtained through random samples of the GPS points. Fourth, the optimal buffer size identified in this study may not be generalizable. The optimal buffer size may extend beyond 200 m for other study areas, and different environmental factors may have different optimal buffer sizes. Hence, such variabilities should be further investigated.
In addition, the sampling rate of the GPS points may affect the consistency of the results. In this study, GPS points were sampled at a 10-s interval to generate a smaller GPS dataset and to increase computational efficiency for generating the buffers. However, coarser (e.g., 60 s) or finer (e.g., 5 s) sampling scales may have implications for the findings on the associations between ATMs and environmental factors and the effects of different buffer sizes on these associations, since an initial analysis with the GPS dataset at a 60-s interval obtained somewhat different results for some buffer sizes, although the results were mostly similar in general. Further, the intensity of the ATMs was not considered in this study, although it could enrich our understanding of the associations. Walking, for instance, can be further divided into light and brisk walking depending on its intensity, which may be affected differently by different environmental factors, as many studies have demonstrated by using accelerometers to identify the intensity levels of individuals’ PA. Lastly, the sample of subjects used in this study is not representative of the larger population of the study areas. The participants in the sample were mostly wealthy, middle-aged whites, which restricts the applicability of our findings about the effects of different buffer sizes on the associations between ATMs and environmental factors to this specific social group. Further, this study did not deal with the temporal aspects of the buffer size effects. Different buffer sizes may have different effects on the study results at different time points, and this should be addressed in future studies in order to better understand the time-sensitive effects of buffer size.
Considering these limitations, future work should further investigate different delineation methods and their impacts on research findings. More aggregated methods, such as activity space and kernel density, that are based on trips instead of individual GPS points should be explored to mitigate the problems due to the use of a point-by-point approach and to take into account the correlations between observations. The correlations among observations can also be addressed using specialized statistical tests that consider the hierarchical structure of trips and participants [59]. Nonlinearity that might exist in the associations between ATMs and environmental factors should also be handled in future work using nonlinear statistical models. In addition, the impacts of GPS data sampling rates on research findings will need to be examined. Instead of using only the 10-s interval, it would be useful to compare the results obtained with the original 5-s intervals and larger intervals to see how the study results vary depending on the sampling rate. Such an investigation will contribute to mobility research in various fields by suggesting a minimum sampling frequency for GPS data. In addition, the categories of ATMs need to be expanded by considering the intensity of ATMs, which can be based on data obtained with accelerometers or on estimations using people’s physiological information, such as age, height, weight, and velocity of walking or biking. Moreover, further research is needed to enhance our understanding of the inconsistencies in the results by focusing on different genders, racial or ethnic groups, and socioeconomic groups. Spatio-temporal analysis will also be needed for exploring some of the predictors, such as parks and open spaces and safety-related factors, which may be time-sensitive and vary between weekdays and weekends.
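As one way to set up the sampling-rate comparison suggested above, a trajectory recorded at 5-s intervals can be thinned to coarser intervals before the buffers are generated. The sketch below is a minimal illustration under that assumption; the data structure, the function name, and the rule of keeping a fix once the chosen interval has elapsed since the last kept fix are choices made for this example, not the procedure used in this study.

```python
from typing import List, Tuple

Fix = Tuple[int, float, float]  # (timestamp in seconds, x, y)


def thin_trajectory(fixes: List[Fix], interval_s: int) -> List[Fix]:
    """Keep the first fix, then every fix recorded at least `interval_s`
    seconds after the previously kept fix."""
    kept: List[Fix] = []
    for fix in sorted(fixes, key=lambda f: f[0]):
        if not kept or fix[0] - kept[-1][0] >= interval_s:
            kept.append(fix)
    return kept


# Hypothetical two-minute track with one fix every 5 s.
track = [(t, float(t), 0.0) for t in range(0, 121, 5)]

track_10s = thin_trajectory(track, 10)
track_60s = thin_trajectory(track, 60)
```

Running the same models on each thinned version of the data would show how the number of observations, and potentially the estimated associations, change with the sampling interval.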

Author Contributions

K.L. conceived, designed, and implemented the experiments; K.L. analyzed the results and wrote the paper; M.-P.K. contributed to refining and revising the paper.

Funding

This research was supported by a grant from the XSEDE Startup Project at the University of Illinois at Urbana-Champaign, which was funded under the Extreme Science and Engineering Discovery Environment (XSEDE) Program, U.S. National Science Foundation. In addition, Mei-Po Kwan was supported by a grant from the U.S. National Science Foundation (#BCS-1832465).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Caspersen, C.J.; Powell, K.E.; Christenson, G.M. Physical activity, exercise, and physical fitness: Definitions and distinctions for health-related research. Public Health Rep. 1985, 100, 126–131.
  2. Physical Activity Guidelines Advisory Committee. Physical Activity Guidelines Advisory Committee Report; US Department of Health and Human Services: Washington, DC, USA, 2008.
  3. Cohen, D.A.; Ashwood, J.S.; Scott, M.M.; Overton, A.; Evenson, K.R.; Staten, L.K.; Porter, D.; McKenzie, T.L.; Catellier, D. Public Parks and Physical Activity Among Adolescent Girls. Pediatrics 2006, 118, e1381–e1389.
  4. Coombes, E.; Jones, A.P.; Hillsdon, M. The relationship of physical activity and overweight to objectively measured green space accessibility and use. Soc. Sci. Med. 2010, 70, 816–822.
  5. Nagel, C.L.; Carlson, N.E.; Bosworth, M.; Michael, Y.L. The relation between neighborhood built environment and walking activity among older adults. Am. J. Epidemiol. 2008, 168, 461–468.
  6. Schipperijn, J.; Bentsen, P.; Troelsen, J.; Toftager, M.; Stigsdotter, U.K. Associations between physical activity and characteristics of urban green space. Urban For. Urban Green. 2013, 12, 109–116.
  7. McGinn, A.P.; Evenson, K.R.; Herring, A.H.; Huston, S.L.; Rodriguez, D.A. Exploring associations between physical activity and perceived and objective measures of the built environment. J. Urban Health 2007, 84, 162–184.
  8. Berke, E.M.; Koepsell, T.D.; Moudon, A.V.; Hoskins, R.E.; Larson, E.B. Association of the Built Environment With Physical Activity and Obesity in Older Persons. Am. J. Public Health 2007, 97, 486–492.
  9. Kwan, M.-P. The Uncertain Geographic Context Problem. Ann. Assoc. Am. Geogr. 2012, 102, 958–968.
  10. Rodríguez, D.A.; Cho, G.-H.; Evenson, K.R.; Conway, T.L.; Cohen, D.; Ghosh-Dastidar, B.; Pickrel, J.L.; Veblen-Mortenson, S.; Lytle, L.A. Out and about: Association of the built environment with physical activity behaviors of adolescent females. Health Place 2012, 18, 55–62.
  11. Burgoine, T.; Jones, A.P.; Namenek Brouwer, R.J.; Benjamin Neelon, S.E. Associations between BMI and home, school and route environmental exposures estimated using GPS and GIS: Do we see evidence of selective daily mobility bias in children? Int. J. Health Geogr. 2015, 14, 8.
  12. Hillsdon, M.; Panter, J.; Foster, C.; Jones, A. The relationship between access and quality of urban green space with population physical activity. Public Health 2006, 120, 1127–1132.
  13. Maas, J.; Verheij, R.A.; Spreeuwenberg, P.; Groenewegen, P.P. Physical activity as a possible mechanism behind the relationship between green space and health: A multilevel analysis. BMC Public Health 2008, 8, 206.
  14. Mitchell, C.; Clark, A.; Gilliland, J. Built Environment Influences of Children’s Physical Activity: Examining Differences by Neighbourhood Size and Sex. Int. J. Environ. Res. Public Health 2016, 13, 130.
  15. Chambers, T.; Pearson, A.L.; Kawachi, I.; Rzotkiewicz, Z.; Stanley, J.; Smith, M.; Barr, M.; Ni Mhurchu, C.; Signal, L. Kids in space: Measuring children’s residential neighborhoods and other destinations using activity space GPS and wearable camera data. Soc. Sci. Med. 2017, 193, 41–50.
  16. Cerin, E.; Mitáš, J.; Cain, K.L.; Conway, T.L.; Adams, M.A.; Schofield, G.; Sarmiento, O.L.; Reis, R.S.; Schipperijn, J.; Davey, R.; et al. Do associations between objectively-assessed physical activity and neighbourhood environment attributes vary by time of the day and day of the week? IPEN adult study. Int. J. Behav. Nutr. Phys. Act. 2017, 14, 34.
  17. Browning, M.; Lee, K. Within what distance does “Greenness” best predict physical health? A systematic review of articles with GIS buffer analyses across the lifespan. Int. J. Environ. Res. Public Health 2017, 14, 675.
  18. Boruff, B.J.; Nathan, A.; Nijënstein, S. Using GPS technology to (re)-examine operational definitions of ‘neighbourhood’ in place-based health research. Int. J. Health Geogr. 2012, 11, 22.
  19. Almanza, E.; Jerrett, M.; Dunton, G.; Seto, E.; Ann Pentz, M. A study of community design, greenness, and physical activity in children using satellite, GPS and accelerometer data. Health Place 2012, 18, 46–54.
  20. Dunton, G.F.; Almanza, E.; Jerrett, M.; Wolch, J.; Pentz, M.A. Neighborhood Park Use by Children. Am. J. Prev. Med. 2014, 46, 136–142.
  21. James, P.; Berrigan, D.; Hart, J.E.; Aaron Hipp, J.; Hoehner, C.M.; Kerr, J.; Major, J.M.; Oka, M.; Laden, F. Effects of buffer size and shape on associations between the built environment and energy balance. Health Place 2014, 27, 162–170.
  22. Hirsch, J.A.; Winters, M.; Ashe, M.C.; Clarke, P.J.; McKay, H.A. Destinations That Older Adults Experience Within Their GPS Activity Spaces: Relation to Objectively Measured Physical Activity. Environ. Behav. 2016, 48, 55–77.
  23. Prins, R.G.; Pierik, F.; Etman, A.; Sterkenburg, R.P.; Kamphuis, C.B.M.; Van Lenthe, F.J. How many walking and cycling trips made by elderly are beyond commonly used buffer sizes: Results from a GPS study. Health Place 2014, 27, 127–133.
  24. Harrison, F.; Burgoine, T.; Corder, K.; Van Sluijs, E.M.; Jones, A. How well do modelled routes to school record the environments children are exposed to? A cross-sectional comparison of GIS-modelled and GPS-measured routes to school. Int. J. Health Geogr. 2014, 13, 5.
  25. Yin, L.; Raja, S.; Li, X.; Lai, Y.; Epstein, L.; Roemmich, J. Neighbourhood for Playing: Using GPS, GIS and Accelerometry to Delineate Areas within which Youth are Physically Active. Urban Stud. 2013, 50, 2922–2939.
  26. Houston, D. Implications of the modifiable areal unit problem for assessing built environment correlates of moderate and vigorous physical activity. Appl. Geogr. 2014, 50, 40–47.
  27. Kwan, M.-P. The Limits of the Neighborhood Effect: Contextual Uncertainties in Geographic, Environmental Health, and Social Science Research. Ann. Am. Assoc. Geogr. 2018, 108, 1482–1490.
  28. Lee, K.; Kwan, M. Automatic physical activity and in-vehicle status classification based on GPS and accelerometer data: A hierarchical classification approach using machine learning techniques. Trans. GIS 2018, 22, 1522–1549.
  29. Zenk, S.N.; Schulz, A.J.; Matthews, S.A.; Odoms-Young, A.; Wilbur, J.; Wegrzyn, L.; Gibbs, K.; Braunschweig, C.; Stokes, C. Activity space environment and dietary and physical activity behaviors: A pilot study. Health Place 2011, 17, 1150–1161.
  30. Hirsch, J.A.; Winters, M.; Clarke, P.; McKay, H. Generating GPS activity spaces that shed light upon the mobility habits of older adults: A descriptive analysis. Int. J. Health Geogr. 2014, 13, 51.
  31. Perchoux, C.; Kestens, Y.; Thomas, F.; Hulst, A.V.; Thierry, B.; Chaix, B. Assessing patterns of spatial behavior in health studies: Their socio-demographic determinants and associations with transportation modes (the RECORD Cohort Study). Soc. Sci. Med. 2014, 119, 64–73.
  32. Lee, N.C.; Voss, C.; Frazer, A.D.; Hirsch, J.A.; McKay, H.A.; Winters, M. Does activity space size influence physical activity levels of adolescents?—A GPS study of an urban environment. Prev. Med. Rep. 2016, 3, 75–78.
  33. Rundle, A.G.; Sheehan, D.M.; Quinn, J.W.; Bartley, K.; Eisenhower, D.; Bader, M.M.D.; Lovasi, G.S.; Neckerman, K.M. Using GPS Data to Study Neighborhood Walkability and Physical Activity. Am. J. Prev. Med. 2016, 50, e65–e72.
  34. Thierry, B.; Chaix, B.; Kestens, Y. Detecting activity locations from raw GPS data: A novel kernel-based algorithm. Int. J. Health Geogr. 2013, 12, 14.
  35. Jankowska, M.M.; Natarajan, L.; Godbole, S.; Meseck, K.; Sears, D.D.; Patterson, R.E.; Kerr, J. Kernel Density Estimation as a Measure of Environmental Exposure Related to Insulin Resistance in Breast Cancer Survivors. Cancer Epidemiol. Prev. Biomark. 2017, 26, 1078–1084.
  36. Kwan, M.-P. Gender and Individual Access to Urban Opportunities: A Study Using Space–Time Measures. Prof. Geogr. 1999, 51, 211–227.
  37. Wang, J.; Kwan, M.-P. An Analytical Framework for Integrating the Spatiotemporal Dynamics of Environmental Context and Individual Mobility in Exposure Assessment: A Study on the Relationship between Food Environment Exposures and Body Weight. Int. J. Environ. Res. Public Health 2018, 15, 2022.
  38. Gong, Y.; Gallacher, J.; Palmer, S.; Fone, D. Neighbourhood green space, physical function and participation in physical activities among elderly men: The Caerphilly Prospective study. Int. J. Behav. Nutr. Phys. Act. 2014, 11, 40.
  39. McMorris, O.; Villeneuve, P.J.; Su, J.; Jerrett, M. Urban greenness and physical activity in a national survey of Canadians. Environ. Res. 2015, 137, 94–100.
  40. Troped, P.J.; Saunders, R.P.; Pate, R.R.; Reininger, B.; Addy, C.L. Correlates of recreational and transportation physical activity among adults in a New England community. Prev. Med. 2003, 37, 304–310.
  41. Sallis, J.F.; Bowles, H.R.; Bauman, A.; Ainsworth, B.E.; Bull, F.C.; Craig, C.L.; Sjöström, M.; De Bourdeaudhuij, I.; Lefevre, J.; Matsudo, V.; et al. Neighborhood Environments and Physical Activity Among Adults in 11 Countries. Am. J. Prev. Med. 2009, 36, 484–490.
  42. Gómez, L.F.; Parra, D.C.; Buchner, D.; Brownson, R.C.; Sarmiento, O.L.; Pinzón, J.D.; Ardila, M.; Moreno, J.; Serrato, M.; Lobelo, F. Built Environment Attributes and Walking Patterns Among the Elderly Population in Bogotá. Am. J. Prev. Med. 2010, 38, 592–599.
  43. Astell-Burt, T.; Feng, X.; Kolt, G.S. Green space is associated with walking and moderate-to-vigorous physical activity (MVPA) in middle-to-older-aged adults: Findings from 203 883 Australians in the 45 and Up Study. Br. J. Sports Med. 2014, 48, 404–406.
  44. Brown, G.; Schebella, M.F.; Weber, D. Using participatory GIS to measure physical activity and urban park benefits. Landsc. Urban Plan. 2014, 121, 34–44.
  45. Fisher, K.J.; Li, F.; Michael, Y.; Cleveland, M. Neighborhood-Level Influences on Physical Activity among Older Adults: A Multilevel Analysis. J. Aging Phys. Act. 2004, 12, 45–63.
  46. Hoehner, C.M.; Brennan Ramirez, L.K.; Elliott, M.B.; Handy, S.L.; Brownson, R.C. Perceived and objective environmental measures and physical activity among urban adults. Am. J. Prev. Med. 2005, 28, 105–116.
  47. Troped, P.J.; Wilson, J.S.; Matthews, C.E.; Cromley, E.K.; Melly, S.J. The Built Environment and Location-Based Physical Activity. Am. J. Prev. Med. 2010, 38, 429–438.
  48. Boslaugh, S.E.; Luke, D.A.; Brownson, R.C.; Naleid, K.S.; Kreuter, M.W. Perceptions of Neighborhood Environment for Physical Activity: Is It “Who You Are” or “Where You Live”? J. Urban Health Bull. N. Y. Acad. Med. 2004, 81, 671–681.
  49. Transit Availability Index—CMAP Data Hub. Available online: https://datahub.cmap.illinois.gov/dataset/access-to-transit-index (accessed on 18 July 2019).
  50. Bureau of Justice Statistics (BJS)—Violent Crime. Available online: https://www.bjs.gov/index.cfm?ty=tp&tid=31 (accessed on 18 July 2019).
  51. Violent Crimes. Available online: https://www.nij.gov:443/topics/crime/violent/Pages/welcome.aspx (accessed on 18 July 2019).
  52. Towns, J.; Cockerill, T.; Dahan, M.; Foster, I.; Gaither, K.; Grimshaw, A.; Hazlewood, V.; Lathrop, S.; Lifka, D.; Peterson, G.D.; et al. XSEDE: Accelerating Scientific Discovery. Comput. Sci. Eng. 2014, 16, 62–74.
  53. Stewart, C.A.; Turner, G.; Vaughn, M.; Gaffney, N.I.; Cockerill, T.M.; Foster, I.; Hancock, D.; Merchant, N.; Skidmore, E.; Stanzione, D.; et al. Jetstream: A self-provisioned, scalable science and engineering cloud environment. In Proceedings of the 2015 XSEDE Conference on Scientific Advancements Enabled by Enhanced Cyberinfrastructure—XSEDE ’15; ACM Press: St. Louis, MO, USA, 2015; pp. 1–8.
  54. Bell, S.L.; Phoenix, C.; Lovell, R.; Wheeler, B.W. Using GPS and geo-narratives: A methodological approach for understanding and situating everyday green space encounters. Area 2015, 47, 88–96.
  55. Lee, A.C.K.; Maheswaran, R. The health benefits of urban green spaces: A review of the evidence. J. Public Health 2011, 33, 212–222.
  56. Foster, S.; Giles-Corti, B. The built environment, neighborhood crime and constrained physical activity: An exploration of inconsistent findings. Prev. Med. 2008, 47, 241–251.
  57. Badland, H.M.; Schofield, G.M.; Garrett, N. Travel behavior and objectively measured urban design variables: Associations for adults traveling to work. Health Place 2008, 14, 85–95.
  58. Chaix, B.; Méline, J.; Duncan, S.; Merrien, C.; Karusisi, N.; Perchoux, C.; Lewin, A.; Labadi, K.; Kestens, Y. GPS tracking in neighborhood and health studies: A step forward for environmental exposure assessment, a step backward for causal inference? Health Place 2013, 21, 46–51.
  59. Sainani, K. The Importance of Accounting for Correlated Observations. PM&R 2010, 2, 858–861.
Figure 1. Circular and network buffers.
Figure 2. Performance test of the optimized travel mode classification algorithm using GPS data with different combinations of features depending on the sizes of the distance and time windows (x-axis).
Figure 3. Predicted travel modes based on the Chicago Regional Household Travel Inventory GPS data and their clustering visualization using kernel density estimation. (a) Walking, (b) in-vehicle, and (c) biking.
Figure 4. The 20 m circular buffers created around each GPS point with the predicted travel modes.
Figure 5. Varying odds ratios and standard errors resulting from Models 1, 2, and 3 across the 11 buffer sizes. Model 1: the number of cases of violent crimes; percentage of tree areas; park and open space density; transit availability index; and traffic collisions, adjusted for age, race, household income, and weekday/weekend. Model 2: Model 1 + density of neighborhood African Americans as an additional predictor. Model 3: Model 2 + neighborhood median household income as an additional predictor. *** p < 0.001; ** p < 0.01; * p < 0.1. Gray regions: standard errors of coefficients.
Figure 6. The total number of significant variables in Model 3 according to different buffer sizes.
Table 1. Description of the seven environmental context variables.
| Category | Predictors | Data Source | Measure | Measuring Unit | Resolution/Unit | Time | Comments |
|---|---|---|---|---|---|---|---|
| Environment | Tree | Land Cover data from the Chicago Metropolitan Agency for Planning Data Hub | Percentage | Area (m²) | 1 m pixel | 2010 | |
| Environment | Park and open space | Chicago Data Portal | Percentage | Area (m²) | Polygon | 2010 | |
| Transport | Transit availability index | Chicago Metropolitan Agency for Planning Data Hub | Average | Index (1–5) | Polygon | 2010 | |
| Safety | Crime | Chicago Data Portal | Count | Number of crimes/km² | Point | 2007 | Violent crimes |
| Safety | Traffic collision | Illinois Department of Transportation | Count | Number of crashes/km² | Point | 2007 | Pedestrian and pedal cyclists |
| Neighborhood socio-economy status | Neighborhood median household income | American Community Survey of the United States Census Bureau | Average | Dollars ($) | Census tract (polygon) | 2010 | |
| Neighborhood socio-economy status | Density of neighborhood African Americans | American Community Survey of the United States Census Bureau | Average | Number of people/km² | Census tract (polygon) | 2010 | |
Table 2. Descriptive statistics of the 168 adult participants in the Chicago Regional Household Travel Inventory (CRHTI) project.
| n = 168 Persons | Percentage (%) |
| --- | --- |
| Female | 53 |
| Race | |
| White | 81.5 |
| African American | 10.7 |
| American Indian or Alaska Native | 1.2 |
| Asian | 1.8 |
| Hispanic | 3.0 |
| Other | 1.8 |
| Household income (73 households) | |
| < $20,000 | 8.7 |
| $20,000–$34,999 | 5.4 |
| $35,000–$49,999 | 3.3 |
| $50,000–$59,999 | 9.8 |
| $60,000–$74,999 | 10.9 |
| $75,000–$99,999 | 21.7 |
| $100,000+ | 39.1 |
| NA | 1.1 |

| | Mean ± standard deviation |
| --- | --- |
| Age | 43.2 ± 11.4 |
| Predicted average daily travel time (hours) | |
| Walking | 0.3 ± 0.2 |
| Running | 0.0003 ± 0.0 |
| Biking | 0.02 ± 0.5 |
| In-vehicle | 1.1 ± 0.7 |
| Number of recorded days | 5.3 ± 1.5 |
Table 3. Descriptive statistics of seven predictors measured around all GPS points using 50 m circular buffers.
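Observations: walking, 40,999; biking, 2545; in-vehicle, 113,083.

| Predictors | Walking Mean | Walking SD | Walking Min | Walking Max | Biking Mean | Biking SD | Biking Min | Biking Max | In-Vehicle Mean | In-Vehicle SD | In-Vehicle Min | In-Vehicle Max |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Physical environment | | | | | | | | | | | | |
| Tree (%) | 18.00 | 19.14 | 0 | 100.00 | 9.00 | 10.49 | 0 | 52.60 | 11.80 | 13.15 | 0 | 100.00 |
| Park and open space (%) | 5.50 | 19.12 | 0 | 100.00 | 8.70 | 25.76 | 0 | 100.00 | 8.00 | 23.73 | 0 | 100.00 |
| Transit availability index (1–5) | 4.67 | 0.46 | 3.41 | 5.00 | 4.31 | 0.45 | 4.00 | 5.00 | 4.47 | 0.51 | 1.00 | 5.00 |
| Social environment and safety | | | | | | | | | | | | |
| Crime (count/km²) | 41.00 | 74.81 | 0 | 844.00 | 19.90 | 31.94 | 0 | 214.90 | 28.00 | 57.17 | 0 | 1146.00 |
| Neighborhood median household income ($) | 59,484 | 22,832 | 13,177 | 127,736 | 69,253 | 19,753 | 27,866 | 127,460 | 55,161 | 22,122 | 10,217 | 150,281 |
| Density of neighborhood African Americans (population/km²) | 1742 | 2354 | 4 | 10,325 | 269 | 335 | 4 | 2630 | 1207 | 1767 | 0 | 10,325 |
| Traffic collision (count/km²) | 3.80 | 8.65 | 0 | 71.60 | 3.10 | 7.33 | 0 | 47.70 | 3.20 | 7.70 | 0 | 79.60 |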
Table 4. Odds ratios and standard errors from Models 1, 2, and 3, which examine the associations between travel modes and tree cover, park and open space density, transit availability, the incidence of violent crimes, neighborhood African American density, neighborhood median household income, and traffic collisions involving pedestrians and pedal cyclists, adjusted for age, race, household income, and weekday/weekend. Estimates are derived with 200 m buffers.
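In each model, walking and biking are each compared with in-vehicle travel.

Model 1

| Predictors | Walking OR | Walking 95% CI | Walking AME | Biking OR | Biking 95% CI | Biking AME |
| --- | --- | --- | --- | --- | --- | --- |
| Physical environment | | | | | | |
| Tree | 1.05 | (1.04, 1.05) | 0.0079 | 1.01 | (1.00, 1.01) | −0.0001 |
| Park and open space | 0.97 | (0.97, 0.97) | −0.0036 | 1.00 | (0.99, 1.00) | 0.0001 |
| Transit availability index | 2.06 | (2.03, 2.09) | 0.1194 | 1.06 | (1.00, 1.12) | −0.0015 |
| Social environment and safety | | | | | | |
| Crime | 1.00 | (1.00, 1.00) | 0.0001 | 1.00 | (1.00, 1.00) | 0.0000 |
| Traffic collision | 1.01 | (1.01, 1.01) | 0.0025 | 0.99 | (0.99, 0.99) | −0.0001 |

Model 2

| Predictors | Walking OR | Walking 95% CI | Walking AME | Biking OR | Biking 95% CI | Biking AME |
| --- | --- | --- | --- | --- | --- | --- |
| Physical environment | | | | | | |
| Tree | 1.05 | (1.05, 1.05) | 0.0078 | 1.01 | (1.00, 1.01) | −0.0001 |
| Park and open space | 0.97 | (0.97, 0.97) | −0.0036 | 1.00 | (0.99, 1.00) | 0.0001 |
| Transit availability index | 2.04 | (2.01, 2.07) | 0.1173 | 1.07 | (1.05, 1.08) | −0.0014 |
| Social environment and safety | | | | | | |
| Crime | 1.00 | (1.00, 1.00) | 0.0000 | 1.00 | (1.00, 1.00) | 0.0000 |
| Density of neighborhood African Americans | 1.00 | (1.00, 1.00) | 0.0000 | 1.00 | (1.00, 1.00) | −0.0000 |
| Traffic collision | 1.01 | (1.01, 1.01) | 0.0027 | 0.99 | (0.99, 0.99) | −0.0001 |

Model 3

| Predictors | Walking OR | Walking 95% CI | Walking AME | Biking OR | Biking 95% CI | Biking AME |
| --- | --- | --- | --- | --- | --- | --- |
| Physical environment | | | | | | |
| Tree | 1.05 | (1.04, 1.05) | 0.0077 | 1.00 | (1.00, 1.00) | −0.0000 |
| Park and open space | 0.98 | (0.97, 0.98) | −0.0033 | 0.99 | (0.99, 0.99) | 0.0001 |
| Transit availability index | 1.92 | (1.92, 1.92) | 0.1061 | 1.09 | (1.09, 1.09) | −0.0008 |
| Social environment and safety | | | | | | |
| Crime | 1.00 | (1.00, 1.00) | 0.0000 | 1.00 | (1.00, 1.00) | 0.0000 |
| Neighborhood median household income | 1.00 | (1.00, 1.00) | 0.0000 | 0.99 | (0.99, 0.99) | −0.0000 |
| Density of neighborhood African Americans | 1.00 | (1.00, 1.00) | 0.0000 | 0.99 | (0.99, 0.99) | −0.0000 |
| Traffic collision | 1.01 | (1.01, 1.01) | 0.0021 | 0.99 | (0.99, 0.99) | −0.0001 |

Model fit

| Statistic | Model 1 | Model 2 | Model 3 |
| --- | --- | --- | --- |
| McFadden’s R² | 0.128 | 0.129 | 0.140 |
| Nagelkerke’s R² | 0.212 | 0.212 | 0.229 |
| Cox and Snell’s R² | 0.154 | 0.155 | 0.167 |
| AIC | 178,352 | 178,285 | 175,644 |
| BIC | 178,611 | 178,564 | 175,943 |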
*** p < 0.001; ** p < 0.01; * p < 0.1. OR: odds ratio; 95% CI: 95% confidence interval of the odds ratio; AME: average marginal effect; AIC: Akaike information criterion; BIC: Bayesian information criterion.
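As a reading aid for the OR and 95% CI columns, the sketch below shows the standard conversion from a logit coefficient and its standard error to an odds ratio with a Wald-type interval. The function name and the coefficient values are hypothetical, and the article does not state that its intervals were computed in exactly this way.

```python
import math

def odds_ratio_with_ci(beta, se, z=1.96):
    """Return (odds ratio, (lower, upper)) for a logit coefficient.

    beta: coefficient on the log-odds scale; se: its standard error;
    z: normal quantile for the interval (1.96 gives roughly 95%).
    """
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return odds_ratio, (lower, upper)

# Hypothetical coefficient and standard error, for illustration only.
or_value, (ci_lower, ci_upper) = odds_ratio_with_ci(beta=0.5, se=0.1)
```

An odds ratio above 1 means the odds of the travel mode (relative to in-vehicle travel) rise as the predictor increases by one unit, and a value below 1 means they fall. With large samples the standard errors are small, so intervals can collapse to a single two-decimal value, as in several rows of the table.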