In the past few years, the range of location-based services has been progressively extended from outdoor to indoor environments, as well as to applications such as path finding, emergency planning, and object tracking. This implies a demand for more accurate and robust indoor and outdoor localization and tracking technology on mobile devices. Outdoor localization services can be provided by GPS with reliable accuracy, but in GPS-denied indoor spaces, alternative technology needs to be explored. Many existing indoor localization methods rely on dedicated infrastructure such as Wi-Fi access points [1], ultrasonic networks [2], synthetic aperture radar (SAR) [3], Bluetooth [4], ultra-wideband (UWB) [5], or magnetic fields [7]. However, such infrastructure is often expensive and labor-intensive to deploy at large scale, and these methods suffer from discontinuous tracking during pedestrian movement. Moreover, the computational complexity of the algorithms is another challenge when they are applied to resource-limited smartphones.
Recently, the pedestrian dead reckoning (PDR) approach has become a promising solution to indoor localization; it estimates the distance and heading of every step from the accelerometer and gyroscope embedded in smartphones to obtain a continuous trajectory. However, the error accumulation caused by low-cost sensors makes the PDR method alone unable to achieve acceptable accuracy. To solve this problem, many combined approaches have been explored. Some rely on frequent external position fixes (e.g., Wi-Fi access points) [8]. Some exploit the compatibility of human motion patterns and map information to determine the possible trajectory [9]. Others use sensor fusion methods such as Kalman filters [10] or particle filters [11] to account for the measurement uncertainty.
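The dead-reckoning update described above can be illustrated with a minimal sketch. It assumes headings measured clockwise from north (the convention used later in the experiments), with east as +x and north as +y; the function names and the 0.7 m step length in the usage note are hypothetical and not taken from the paper.

```python
import math


def pdr_step(x, y, step_length_m, heading_deg):
    """Advance a 2D position by one step.

    heading_deg is assumed to be clockwise from north
    (0 deg = north = +y, 90 deg = east = +x).
    """
    theta = math.radians(heading_deg)
    return (x + step_length_m * math.sin(theta),
            y + step_length_m * math.cos(theta))


def pdr_track(start, steps):
    """Accumulate (step_length_m, heading_deg) pairs into a trajectory."""
    path = [start]
    for length, heading in steps:
        path.append(pdr_step(path[-1][0], path[-1][1], length, heading))
    return path
```

For example, `pdr_track((0.0, 0.0), [(0.7, 0), (0.7, 90)])` moves one step north and then one step east. Because every position is built on the previous one, any error in step length or heading carries forward, which is the accumulation problem noted above.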
Among these approaches, combining the particle filter algorithm with map constraints to estimate the appropriate PDR result has been widely used for its effectiveness and ease of use. In these approaches, a non-wall-crossing constraint is often used to eliminate invalid particles in position updating iterations [12]. However, in such cases, the map information is often insufficiently used, resulting in poor convergence. To exploit deeper map information, there are also methods that utilize the indoor skeleton map (e.g., Voronoi graph) to recover the indoor topological structure for indoor localization [15], which can achieve a lower complexity while maintaining an accuracy of a few meters. However, these methods assume a fixed edge length (e.g., equal to the average pedestrian step length) and constrain the pedestrian position to the nodes of the graph, which may not be applicable for diverse pedestrian step lengths. Moreover, the conventional approaches estimate the localization result using only the geometric coordinates, while the semantic location (which can provide potential compatibility of user locations with the indoor structure) is ignored. Goodchild [18] argued that human cognition is based on named places rather than the geometrical space of latitude and longitude. Therefore, semantic accuracy should be considered an important component of indoor localization, rather than purely providing sub-meter accuracy.
In this paper, an adaptive smartphone-based semantically-constrained indoor localization method is proposed, overcoming the problems of infrastructure dependence and purely geometric dependence. Differing from the previous methods, which rely entirely on the geometric constraints of maps, the proposed method incorporates semantic information (which complies with the human cognition of indoor space and implies the indoor space occupancy) into a semantic augmented route network graph. Each node is constructed from an indoor landmark (e.g., corners, doors, stairs) and each edge from the route between landmarks; the semantic augmented route network graph is then input as a prior map for the trajectory calibration using a particle filter algorithm, providing geometric (location, length of edges), topological (connectivity and orientation), and semantic information (human cognition). In this way, the rich semantic information can be exploited to avoid the localization errors caused by purely depending on geometric coordinates in the conventional methods. Furthermore, in order to be applicable to diverse user step lengths, the edge lengths of the proposed route network graph are adapted according to the user motion and space continuity. Based on the semantic constraints and conformance information imposed by the constructed graph, an enhanced particle filter is adopted to calibrate the PDR trajectory as soon as it is estimated from the smartphone sensors, which can achieve high-accuracy pedestrian trajectory estimation in terms of both semantic and geometric accuracy.
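One possible way to hold such a graph in memory is sketched below. It is only an illustration of the three kinds of information listed above (geometric, topological, semantic); the class names, field names, and the choice of Euclidean edge length are assumptions, not the data model defined in Section 3.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    x: float
    y: float
    semantic: str  # landmark label, e.g. "corner", "door", "stairs"


@dataclass
class RouteGraph:
    nodes: dict = field(default_factory=dict)
    edges: dict = field(default_factory=dict)  # node_id -> {neighbor_id: length_m}

    def add_node(self, node):
        self.nodes[node.node_id] = node
        self.edges.setdefault(node.node_id, {})

    def add_edge(self, a, b):
        """Connect two landmarks; length is their straight-line distance."""
        na, nb = self.nodes[a], self.nodes[b]
        length = math.hypot(nb.x - na.x, nb.y - na.y)
        self.edges[a][b] = length
        self.edges[b][a] = length

    def bearing_deg(self, a, b):
        """Edge orientation from a to b, clockwise from north (+y), in [0, 360)."""
        na, nb = self.nodes[a], self.nodes[b]
        return math.degrees(math.atan2(nb.x - na.x, nb.y - na.y)) % 360.0
```

Here the node coordinates and edge lengths carry the geometric information, the adjacency dictionary and bearings carry the topological information, and the `semantic` label carries the landmark type.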
The remainder of this paper is organized as follows. Section 2 briefly reviews related works. Section 3 explicitly introduces the proposed method, including the definition and extraction of the semantically-augmented route network graph based on a prior floorplan, the PDR algorithm that tracks the user trajectory from the smartphone inertial sensors, and the graph match-based particle filter. The experimental results and comparisons are presented in Section 4. Finally, the conclusions are drawn in Section 5.
2. Related Works
Nowadays, the most prominent techniques for indoor localization are built upon positioning hardware, including wireless modules (e.g., Wi-Fi, Bluetooth, ultra-wideband (UWB)) and motion sensors (e.g., accelerometers, gyroscopes, and compasses) [19]. The methodology involved can be categorized into triangulation, fingerprinting, and PDR. The triangulation technique estimates the location of a target using the distances or angles between the target and beacons. The fingerprinting approach computes the location on demand by matching the online signal fingerprints with those collected and stored in the database in the offline phase. Although the methodologies vary, both triangulation and fingerprinting rely on wireless signals that share time-varying and multipath characteristics, and require intensive human labor in calibration and maintenance. In contrast, the PDR algorithm, advanced by lightweight micro-electro-mechanical systems (MEMS) sensors, has become a practical indoor localization approach with handheld devices [20]. However, the low-cost nature of the sensors can cause significant drift and bias in long-term tracking tasks. In order to solve this problem, many efforts have been devoted to body-fixed sensors (e.g., foot-mounting [21] and waist-mounting [24]), or to refining the intrinsic algorithms (step detection and heading estimation) that directly affect the localization result, regardless of the different smartphone placements and user motions [25].
Although achievements have been made, the PDR method alone still cannot provide reliable long-term pedestrian tracking. Therefore, many prior attempts focus on combining multiple methods to compensate for each other’s drawbacks. Lee et al. [30] combined PDR with Wi-Fi by using the absolute localization provided by Wi-Fi to determine the initial position as well as to correct the long-term accumulated drift of PDR. Similarly, Jin et al. [31] proposed to reduce the uncertainty of the PDR result based on sparse and partial locations sampled from the available wireless signals. In addition, map information and a particle filter are often combined with PDR to enhance the accuracy by implying possible routes and barriers, thus eliminating invalid particles [8]. Bojja et al. [33] extended the particle filter to three dimensions and combined it with collision detection techniques to navigate and localize vehicles in a parking garage. Other probabilistic approaches, such as the Kalman filter [10] and conditional random field (CRF) [34], are often used instead of a particle filter to estimate the appropriate location.
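The idea of eliminating particles that cross barriers can be sketched as follows. This is a generic illustration, not the filter of any cited work: walls are assumed to be 2D line segments, the noise parameters are hypothetical, headings are assumed clockwise from north, and collinear overlaps between a step and a wall are not handled.

```python
import math
import random


def _orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p): 1, 0, or -1."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)


def crosses(a, b, wall):
    """True if the step segment a-b intersects the wall segment."""
    c, d = wall
    return (_orient(a, b, c) != _orient(a, b, d)
            and _orient(c, d, a) != _orient(c, d, b))


def propagate(particles, step_length_m, heading_deg, walls, rng,
              length_sigma=0.1, heading_sigma_deg=10.0):
    """Move each (x, y) particle by one noisy step; drop wall-crossing ones."""
    survivors = []
    for x, y in particles:
        length = rng.gauss(step_length_m, length_sigma)
        theta = math.radians(rng.gauss(heading_deg, heading_sigma_deg))
        nx = x + length * math.sin(theta)
        ny = y + length * math.cos(theta)
        if not any(crosses((x, y), (nx, ny), w) for w in walls):
            survivors.append((nx, ny))
    return survivors
```

A caller would pass its own generator, for example `rng = random.Random(42)`, and resample from the survivors after each step so that the particle count stays constant.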
To improve the localization accuracy, incorporating contextual information derived from a map has become a significant research aspect. Specifically, the map information includes coarse information such as the corridors and the boundaries of rooms, and high-level information such as the geometric, topological, and semantic relationships, as well as the landmarks that depict the interior structures of a room [12]. In terms of the level to which a map is exploited, the methods can be classified as landmark matching, trajectory matching, and graph matching methods. Landmarks are distinguished features or unique signatures that can be easily observed in the environment, including seed landmarks (e.g., elevators and staircases) and organic ones (e.g., magnetic anomaly spots) [12]. Wang et al. [13] detected landmarks with a smartphone by recognizing the measurement fluctuation in the accelerometer when the user enters an elevator, the sudden rise in the gyroscope when turning a corridor corner, or the unusual reading in the magnetometer when passing a magnetic anomaly spot. Combined with map information, the user trajectory determined by PDR can be recalibrated when sensing a landmark. Chen et al. [10] incorporated the user motion state (e.g., going up elevators, walking, stationary) into classifying the landmarks, and combined this with PDR to achieve a more robust localization performance. However, this method does not consider the topological information.
In contrast, trajectory matching aims to obtain a globally optimal estimation by taking into account the geometric structures, such as the straight corridors. Bao et al. [14] computed the geometric similarity between a trace of the user trajectory and the map (the length of the straight line and the angle of turning) to reference the latest corner, and hence eliminated the error caused by gyroscope noise. Khan et al. [32] defined pathways (e.g., hallways) and barriers (e.g., walls) that the user can traverse and cannot pass, respectively. Combined with the particle filter, PDR, and Wi-Fi scans, this method can achieve zero-effort calibration for Wi-Fi fingerprinting. However, it may fail when the constraints are sparse.
Studies have suggested that graph models can be used in indoor localization by providing the necessary geometric (e.g., space layout) as well as non-geometric cues (e.g., landmarks) when people navigate in space [15]. Jensen et al. [15] and Bercer and Dürr [35] extended this from outdoor to indoor localization by constructing indoor location models. The location model representing the indoor topology at different levels offers a uniform data management infrastructure for different positioning technologies. Based on this, Park and Teller [9] decoded the user trajectory with a sequence of low-level motions (e.g., standing still, walking straight, going upstairs, turning left, etc.) estimated by the smartphone’s inertial sensors. Then, the indoor locations are determined with respect to a discrete route map by constructing a hidden Markov model (HMM). Zhou et al. [16] adopted the same idea of examining the conformance between an activity sequence and a prior route-based graph. Wasiq et al. [36] used motion information from the accelerometer and directions from the magnetometer and gyroscope to achieve localization in sparse Wi-Fi environments. Similar to our method, Hilsenbeck et al. [17] undertook data fusion of a pedometer and Wi-Fi in a graph-based representation of the indoor environment. Compared with the previous graph-based localization methods, our proposed method has the advantage of constructing the graph with an adaptive sampling rate and varying edge lengths to allow multiple step length modes, and of promoting semantic accuracy in addition to pure geometric accuracy.
4. Experimental Evaluation
In order to evaluate the performance of the proposed model, two sets of experiments were conducted in the indoor space to verify its applicability to a complex environment and diverse users, respectively.
4.1. Experimental Setup
The two experiments were performed in the building of the State Key Laboratory of Information Engineering in Survey, Mapping and Remote Sensing (LIESMARS) at Wuhan University. This is a four-level office environment consisting of typical indoor structure, including individual offices, corridors, stairs, halls, and walls. Each level has an area of about 38 m × 51 m. The experiments were conducted on the first and second floors of the building, which are connected by stairs to the south. The participants of the experiments were asked to hold their smartphones in hand and behave as usual when walking in the space. At the same time, the sensor readings were recorded and processed to determine the user location. The smartphone used in the experiments was the Xiaomi 2 smartphone (Xiaomi Inc., Beijing, China) running the Android 4.4 operating system. To obtain the ground truth at the sampling time of the tracking system, we marked the ground with a 1 m grid on the pre-specified route and used a camera to record the walking process. We then manually measured the locations of each step of the pedestrians. In order to verify the proposed semantic augmented route network localization approach, we first evaluated the performance of the approach in the complex indoor environment by conducting two experiments containing all kinds of scenarios of people traversing the indoor space. We then invited six volunteers to walk along a predefined route and recorded their trajectory while, in the meantime, exploring whether the proposed method is robust to diverse users and brings additional improvement to the tracking performance.
4.2. Applicability in a Complicated Indoor Environment
In a real-world setting, users usually carry their smartphones as they walk through various sections of an indoor space. Moreover, they are likely to walk, stop, or go upstairs or downstairs; for example, walking between locations of interest or dwelling at certain locations for a significant length of time. Our experiment was aimed at emulating these practical scenarios in an office environment, considering all the contexts defined before in our model. Therefore, the following two routes were designed as the ground truth for verifying the performance of our approach, which are the combination of all possible user movement modes, such as walking along the corridor, walking in the open area, going in and out of the rooms, and going up and down stairs. Scenario (1): A enters the building from the front gate and walks through the open area and corridor to go to their seat and then comes out of the office to the lobby; Scenario (2): B walks upstairs from the first floor to the second floor and walks around the lobby to the office door.
The PDR result is calculated first, and the semantic augmented route network calibration is performed simultaneously to correct it. The results of PDR and the semantically-calibrated approach are as follows.
For the peak detection algorithm in PDR, the result in scenario one reveals 107 steps, whereas 110 true steps were taken. In scenario two, the result shows 126 steps, while 127 true steps were taken. It can be seen that the different motions, such as taking a pause, turning left, or going upstairs, have no negative impact on the detection result. In summary, the peak detection algorithm achieves an average accuracy of 97.86%. It can be seen that all of the algorithms have a median error rate of less than 3%, and are more inclined to undercount than overcount. We attribute this to the fact that the algorithms are unlikely to find multiple cycles where only one exists, but may only find one cycle when multiple cycles exist. This is particularly likely to happen at the start and end of a walk, where the steps typically have different properties (lower energy, longer duration, etc.).
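A generic peak-based step counter is sketched below to make the idea concrete. It is not the detector defined in Section 3: the threshold of 10.5 m/s², the minimum gap of three samples, and the function name are assumptions chosen only for illustration.

```python
def count_steps(accel_norm, threshold=10.5, min_gap=3):
    """Count step peaks in an acceleration-magnitude series (m/s^2).

    A sample counts as a step if it exceeds the threshold, is a local
    maximum, and lies at least min_gap samples after the previous peak.
    """
    steps = 0
    last_peak = -min_gap
    for i in range(1, len(accel_norm) - 1):
        v = accel_norm[i]
        is_local_max = v >= accel_norm[i - 1] and v > accel_norm[i + 1]
        if v > threshold and is_local_max and i - last_peak >= min_gap:
            steps += 1
            last_peak = i
    return steps
```

A detector of this kind tends to undercount rather than overcount: a weak step whose peak stays below the threshold is missed, whereas a single stride rarely produces two peaks that are both strong and far enough apart.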
Figure 9 demonstrates the heading estimation result of PDR. Figure 9a,b show the orientation change during one step and the overall heading estimation of scenario one, respectively. Figure 9c,d show the corresponding results of scenario two. North is defined as the reference direction (0°), and the angle increases clockwise. It can be inferred from the figure that A starts their route facing west, takes a slight turn to the north, walks for a while towards the north, then turns slightly back to the west, walks for a second before a sharp U-turn to the east, and then keeps this direction until the end. Similarly, B starts their route facing south, and after a second of straight walking, makes a U-turn to the north for a while, which is followed by three consecutive turns to the east, to the south, and finally to the west. The turns can be easily detected during the process. It can be seen that although random noise exists in the estimated orientation change during one step, the accumulated heading can achieve an acceptable estimation result.
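The relation between per-step orientation changes and the accumulated heading can be sketched as follows, using the stated convention (north is 0°, angles increase clockwise). The function names and the four-way compass labelling are illustrative assumptions.

```python
def accumulate_heading(initial_deg, step_changes_deg):
    """Sum per-step orientation changes into absolute headings in [0, 360)."""
    headings = []
    h = initial_deg % 360.0
    for change in step_changes_deg:
        h = (h + change) % 360.0
        headings.append(h)
    return headings


def compass_label(heading_deg):
    """Nearest cardinal direction for a clockwise-from-north heading."""
    labels = ["north", "east", "south", "west"]
    return labels[int(((heading_deg % 360.0) + 45.0) // 90.0) % 4]
```

For instance, starting at 270° (west) and applying changes of +90°, -90°, and +180° gives headings that map to north, west, and east, which is the same pattern of turns as in scenario one. Summing also explains why zero-mean noise in single-step changes has limited effect on the accumulated heading.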
Figure 10 shows the trajectories of the PDR and the semantically-augmented route network corrected path. In the first experiment, the mean localization errors of the PDR and route network graph-based MM were 1.76 m and 1.12 m, respectively. The adoption of the semantically-augmented route network therefore reduced the mean error by 36.4%. The PDR can provide a relatively accurate initial result (Figure 10a), but encounters orientation bias when coming out of the office after several minutes (Figure 10b). The semantically-augmented route network graph-based MM constrains the route to the predefined edges of the graph, and thus corrects the orientation bias. The output route is not continuous because the map is predefined as discontinuous. This could be improved by either fixing the semantic augmented route network nodes or interpolating along the route to obtain a continuous route. In the second experiment, the mean localization errors of the PDR and route network-based MM were 2.68 m and 1.34 m, respectively. The mean error is therefore reduced by 50% when considering the map information. Figure 10c,d are the side-look and overlook views of route two. It can be seen that the PDR algorithm alone has a step length problem when going upstairs, and the accumulated heading error leads the user trajectory to wall-crossing paths and wrong rooms, while the semantic augmented route network corrected method conforms well to the indoor graph and achieves an average tracking error of 1.23 m. The results show that our approach can accurately track the pedestrian trajectory under a complicated combination of indoor user movements.
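The reported percentages follow directly from the mean errors. The short check below reproduces the arithmetic, assuming the improvement is defined as the relative reduction of the mean error; the function name is hypothetical.

```python
def error_reduction_pct(baseline_m, calibrated_m):
    """Relative reduction of the mean localization error, in percent."""
    return 100.0 * (baseline_m - calibrated_m) / baseline_m


# Mean errors (PDR, calibrated) in metres, as reported for the two scenarios.
scenario_one = (1.76, 1.12)
scenario_two = (2.68, 1.34)

reduction_one = error_reduction_pct(*scenario_one)
reduction_two = error_reduction_pct(*scenario_two)
mean_calibrated_error = (scenario_one[1] + scenario_two[1]) / 2.0
```

Rounded to one decimal place, these give 36.4% and 50.0%, and the mean of the two calibrated errors is 1.23 m.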
4.3. Applicability for Diverse Users
In this experiment, we evaluate whether the proposed approach can achieve satisfactory localization results for diverse users with different heights and step lengths. Six volunteers, two female and four male, were involved in our experiment. Their detailed information can be seen in Table 3. The volunteers were asked to hold the smartphone and walk along a predefined rectangular route, with a total length of 40.85 m, in the lobby of the third floor of the LIESMARS building. The true steps and smartphone sensor data of their walk were recorded at the same time. We first estimated their PDR trajectory, and then applied the proposed calibration algorithm to demonstrate its validity for diverse users and its ability to improve the localization accuracy in a pedestrian-invariant manner.
The PDR results of the six volunteers are demonstrated in Figure 11 and Figure 12. Specifically, Figure 11 shows the number of steps detected for each volunteer. Their true steps were 77, 80, 74, 70, 54, and 57, respectively. The average accuracy for step detection was 99.54%. Figure 12 demonstrates the heading estimation result of the six volunteers. Three kinds of headings were recorded in the experiments. The first is the estimated heading change during one step, the second is the estimated heading relative to the start orientation, and the last is the local heading change accumulated over three steps. The direct way to detect turns is to employ either the first or the second heading. However, the thresholds for both headings are difficult to quantify, as the first is susceptible to different participants and the second can rarely observe a step when a turn is taken within several steps. For that reason, we chose the local heading accumulation over three steps for turn detection, since a turn is usually taken within three steps. A turn is detected if the accumulation exceeds the threshold (40° in our case) and the last three headings are not a turn. It can be observed in Figure 12 that three turns can be clearly detected for each volunteer.
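The turn detection rule can be sketched as follows. The 40° threshold and the three-step window come from the text; reading "the last three headings are not a turn" as "no turn was reported in the previous three steps" is an interpretation, and the function name is hypothetical.

```python
def detect_turns(step_changes_deg, threshold_deg=40.0, window=3):
    """Return the step indices at which a turn is detected.

    A turn is reported when the heading change accumulated over the last
    `window` steps exceeds the threshold and no turn was reported within
    the previous `window` steps (assumed debounce rule).
    """
    turns = []
    last_turn = -(window + 1)
    for i in range(len(step_changes_deg)):
        start = max(0, i - window + 1)
        accumulated = sum(step_changes_deg[start:i + 1])
        if abs(accumulated) > threshold_deg and i - last_turn > window:
            turns.append(i)
            last_turn = i
    return turns
```

With per-step changes such as `[2, -1, 30, 35, 30, 1, 0, -2, 1, 3, 30, 30, 30, 0, 0]`, each gradual turn is reported once, at the first step where the three-step sum passes 40°, even though no single step exceeds the threshold.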
After obtaining the PDR estimation of each trajectory, the proposed semantic augmented route network graph-based localization algorithm was applied simultaneously to calibrate the trajectory. The corresponding results are demonstrated in Figure 13. The green line is the original PDR estimation, and the red line is the calibrated result. It can be clearly observed that the route estimated by the proposed method approaches the true rectangular route around the lobby and adaptively updates the location when encountering corners, without orientation bias or accumulated error. Consequently, it can be concluded that the proposed method scales to different people with different walking characteristics, and can improve the localization accuracy by introducing semantic constraints.
4.4. Computation Performance
This experiment was undertaken to verify the efficiency of the proposed method by evaluating the average time required to complete a single update step for a single particle in the semantic augmented route network graph-based filter. In the proposed method, the semantic augmented route network graph replaces the continuous indoor space with 237 feature points, reducing the state space of the particle filter from two dimensions to one dimension and thus shortening the time required for the conversion of the particles. Compared with the traditional methods that demand almost 100 particles to obtain an acceptable result [17], the proposed method can achieve a comparable result with only 20 particles, which indicates that the continuous filter needs to resample five times as many particles as our solution, resulting in additional time consumption. The time required for a single particle to complete a single update at different indoor places on a ThinkPad X240 personal computer (Lenovo Group Ltd., Beijing, China) is given in Table 4. The places evaluated are the corridors, the stairs, the rooms, and the related user motions as defined in Section 3.3. This is a normalized average obtained by running the filtering for the same route 10 times offline. The result reveals that the time consumption for all the places is less than 1 ms, and the method fulfills the requirement of low computational complexity.
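The reduction from a two-dimensional to a one-dimensional state can be illustrated with a small sketch: a particle constrained to a route is fully described by its arc length along that route. This is a simplified illustration assuming a single polyline route with no zero-length segments, not the graph filter of Section 3; the function name is hypothetical.

```python
import bisect
import math


def position_on_route(waypoints, s):
    """Map a 1D arc length s (m) along a polyline route to an (x, y) point."""
    # Cumulative length at each waypoint.
    cumulative = [0.0]
    for (x0, y0), (x1, y1) in zip(waypoints, waypoints[1:]):
        cumulative.append(cumulative[-1] + math.hypot(x1 - x0, y1 - y0))
    # Clamp s to the route, then find the segment that contains it.
    s = min(max(s, 0.0), cumulative[-1])
    i = min(bisect.bisect_right(cumulative, s) - 1, len(waypoints) - 2)
    t = (s - cumulative[i]) / (cumulative[i + 1] - cumulative[i])
    (x0, y0), (x1, y1) = waypoints[i], waypoints[i + 1]
    return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
```

Under this representation, a particle update amounts to adding a noisy step length to one scalar, so no wall-crossing test is needed and fewer particles are required to cover the state space.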
5. Conclusions and Future Work
In this work, we have proposed a semantically-augmented route network-based pedestrian indoor localization approach using smartphones. Differing from the previous localization algorithms, the proposed method exploits the topological and semantic information of the indoor environment, as well as the compatibility of human motion and indoor structure, to assist with and calibrate the pedestrian trajectory. The context-enhanced particle filter, which integrates the route network graph and the PDR output, is performed at step frequency to sample the most appropriate indoor locations and orientations. The effectiveness and efficiency of the method were confirmed in two sets of experiments involving typical user movements in an indoor space and different users. The experimental results showed that a clear improvement in localization accuracy is achieved when incorporating the semantic augmented route network graph. In conclusion, the proposed approach can achieve high-accuracy localization of about 1 m while maintaining low computational complexity, with only mobile phone inertial sensors and map information.
In our future work, we will exploit the specific patterns of smartphone sensors (e.g., accelerometer) in dedicated locations and user activities to identify landmarks for the initial location determination of PDR; for example, using the accelerometer to distinguish user activities such as sitting, walking, or going upstairs. We will also relax the limitation of holding the smartphone in hand to allow flexible placement. Since the proposed method is easily extensible with other methods, this information could be incorporated to achieve an even higher accuracy. Additionally, Wi-Fi signals could be combined with the proposed method to provide an absolute location correction source for the tracking.