Article

ACK-MSCKF: Tightly-Coupled Ackermann Multi-State Constraint Kalman Filter for Autonomous Vehicle Localization

State Key Laboratory of Automotive Simulation and Control, Jilin University, Changchun 130022, China
*
Author to whom correspondence should be addressed.
Submission received: 8 October 2019 / Revised: 29 October 2019 / Accepted: 3 November 2019 / Published: 5 November 2019
(This article belongs to the Collection Positioning and Navigation)

Abstract

Visual-Inertial Odometry (VIO) is subject to additional unobservable directions under the special motions of ground vehicles, resulting in larger pose estimation errors. To address this problem, a tightly-coupled Ackermann visual-inertial odometry, ACK-MSCKF, is proposed to fuse Ackermann error state measurements and the Stereo Multi-State Constraint Kalman Filter (S-MSCKF) with a tightly-coupled filter-based mechanism. In contrast with S-MSCKF, in which the inertial measurement unit (IMU) propagates the vehicle motion and the propagation is then corrected by stereo visual measurements, we successively update the propagation with Ackermann error state measurements and visual measurements after the process model and state augmentation. In this way, additional constraints from the Ackermann measurements are exploited to improve the pose estimation accuracy. Both qualitative and quantitative experimental results on real-world datasets from an Ackermann steering vehicle demonstrate that ACK-MSCKF significantly improves the pose estimation accuracy of S-MSCKF under the special motions of autonomous vehicles, and keeps pose estimation accurate and robust under different vehicle driving cycles and environmental conditions. The source code accompanying this paper is released to the robotics community.

1. Introduction

Providing a real-time, accurate vehicle position is one of the key technologies to enhance the active safety of advanced driver assistance systems (ADASs) and achieve autonomous driving [1]. Wheel odometry is a fundamental method in most autonomous vehicles, owing to its low cost and high reliability. It uses information from proprioceptive sensors, including the drive motor encoder and the steering wheel angle sensor, to estimate the position relative to a starting location by dead-reckoning based on the kinematics model. It provides high positioning accuracy in short-term operation. However, it is sensitive to systematic and non-systematic errors [2], and the accumulated errors cause the position estimate to drift gradually over time. Therefore, it is necessary to fuse data from other sensors to improve positioning accuracy [3]. The fusion of the Global Navigation Satellite System (GNSS) and the Inertial Navigation System (INS) can provide high-accuracy global positioning [4]. However, the performance degrades quickly in GNSS-denied environments. Positioning based on Light Detection and Ranging (LIDAR) odometry [5,6] has the advantages of algorithm maturity, high precision and real-time performance, but its application in vehicles is limited by the high cost of LIDAR.
On the other hand, visual odometry (VO), which relies on visual sensors, is gaining significant interest in the robotics community owing to its abundant perceptive information, small size and low cost [7,8,9]. A detailed review of VO methods can be found in the literature [10,11,12]. While VO algorithms lack robustness when confronted with motion blur, low-texture scenes, illumination changes, sharp turns, partial or full occlusions and the influence of dynamic objects [13], visual-inertial odometry (VIO) [14,15,16,17] algorithms can significantly improve the performance by fusing low-cost IMUs, and have attracted much attention in the field of micro air vehicles (MAVs). For vehicle positioning, Ramezani and Khoshelham [18] propose a stereo Multi-State Constraint Kalman Filter (Stereo-MSCKF) based on MSCKF [19], which shows a promising potential to provide positioning with sub-meter accuracy during short periods of GNSS signal loss. To date, VIO algorithms mainly consist of EKF-based methods, including MSF [20], ROVIO [21], MSCKF [19], Stereo-MSCKF [18], S-MSCKF [22] and R-VIO [23], and optimization-based methods, including the Open Keyframe-based Visual-Inertial SLAM (OKVIS) [24], VINS-Mono [25], PL-VIO [26] and Basalt [27]. Delmerico and Scaramuzza [28] performed a thorough evaluation of open source VIO, suggesting that EKF-based algorithms are more computationally efficient, whereas optimization-based algorithms provide higher accuracy. Therefore, the trade-off between efficiency and accuracy should be considered in practical applications in order to choose an appropriate off-the-shelf VIO algorithm.
However, autonomous vehicle localization poses enormous challenges for VIO algorithms. Firstly, the outdoor traffic scene is completely uncontrolled: it can be cluttered and contains a number of moving objects, such as pedestrians and vehicles. Therefore, VIO must work precisely and robustly in such challenging situations to ensure the safety of pedestrians and passengers. Moreover, the on-board computer is an embedded system with limited computational resources; hence, VIO should possess low computational complexity for the convenience of in-vehicle deployment [29]. Finally and most importantly, autonomous vehicles usually move in an approximately planar environment, and the motion is often constrained to constant local acceleration or no rotation. Under these special motions, VIO is subject to additional unobservable directions [30], which leads to larger positioning errors.
To address the above problems, several studies integrate VIO algorithms with additional sources of information. VINS-RGBD [31] is proposed to fuse RGB-D and IMU data for small ground rescue robots, which overcomes the problem of scale unobservability mentioned in [30]. However, the RGB-D sensor is not suited to the outdoor traffic scene in direct sunlight [32] for autonomous vehicles. He et al. [26] propose a tightly-coupled monocular VIO system, PL-VIO, which integrates more structured line features. PL-VIO improves positioning accuracy, especially in uncontrolled outdoor scenarios; nevertheless, line detection and matching are the bottlenecks in the efficiency of the system. Zheng and Liu [33] propose an SE(2)-constrained pose parameterization for optimization-based VIO on ground vehicles, which obtains better accuracy under sharp-turn motion. Wu et al. [34] prove that the VIO scale, or two additional degrees of freedom (DOF) of global orientation, become unobservable when ground mobile robots move with constant acceleration or no rotation, respectively. They propose to fuse wheel odometry measurements into VIO, which significantly improves positioning accuracy under special motions. Dang et al. [35] propose a tightly-coupled VIO that fuses wheel encoder measurements while accounting for wheel slippage, which achieves great improvements in positioning accuracy. Huai and Huang [23] propose a robocentric VIO (R-VIO) by reformulating the system with respect to the IMU frame, and show that this system can resolve the scale unobservability issue under special motions without using additional sensors.
Although the above tightly-coupled VIO systems achieve high accuracy and robustness, they are sensitive to initialization [36] and impose a higher computational burden due to the iterative optimization back-end. Moreover, these methods are designed for differential drive robots or hand-held equipment. In contrast to the wide use of inertial measurements, KO-Fusion [37] integrates a Mecanum wheel motion constraint and an RGB-D sensor for small ground rescue robots, which improves robustness under fast and loopy motion.
However, the RGB-D sensor and Mecanum wheels do not directly apply to autonomous vehicles, which usually run on asphalt roads in outdoor high-light traffic scenes; moreover, its runtime efficiency would need to be further optimized for the convenience of in-vehicle deployment. As on-road and off-road autonomous vehicles in outdoor environments are mostly Ackermann steering robots [38], whether the above methods are applicable to Ackermann steering robots remains to be demonstrated in further research.
Therefore, the purpose of this paper is to effectively utilize the wheel odometry and steering model of Ackermann steering robots to improve the pose estimation accuracy under the special motions of autonomous vehicles, and to keep pose estimation accurate and robust under different vehicle driving cycles and environmental conditions. In view of its efficiency, an open source EKF-based VIO is selected as the basis of our work, i.e., S-MSCKF [22], which follows the same fundamental principle as Stereo-MSCKF [18]. We propose a tightly-coupled Ackermann Multi-State Constraint Kalman Filter, termed ACK-MSCKF, to fuse Ackermann error state measurements and S-MSCKF with a tightly-coupled EKF-based mechanism. The main contributions of this paper are highlighted as follows:
(1) We formulate and implement a tightly-coupled visual-inertial odometry system, ACK-MSCKF, which significantly improves the pose estimation accuracy under the special motions of autonomous vehicles, and keeps the position and orientation estimation accurate and robust under different vehicle driving cycles and environmental conditions.
(2) We present a rotation extrinsic parameter calibration method between the vehicle coordinate system and the VIO coordinate system, together with a performance comparison.
(3) We facilitate the reproducibility of related research by making the source code [39] of ACK-MSCKF publicly available.
(4) We compare the performance of ACK-MSCKF with the state-of-the-art open source stereo VIO algorithms S-MSCKF and OKVIS [24] on real-world datasets acquired by our experimental vehicle, with both qualitative and quantitative analysis.
The remainder of this paper is structured as follows: the formulation and implementation of the proposed approach are introduced in Section 2, and Section 3 presents the corresponding experiments and results. The experimental results are then discussed in Section 4. Finally, our conclusions are drawn in Section 5.

2. The Proposed Approach

The basic principle of ACK-MSCKF is shown in Figure 1. First, in the process model, the state, which includes the parameters to be estimated, is propagated using IMU measurements. Then, the state is augmented with the new camera state of the left camera. Finally, the state is successively updated by Ackermann measurements and visual measurements.

2.1. Coordinate Systems and Notations

(1) Vehicle Coordinate System {B}. The frame {B} is fixed to the vehicle body, with its origin at the center of the automotive rear axle. The x-axis points forward and the y-axis points to the left, with the z-axis following the right-hand rule.
(2) World Coordinate System {W}. {W} has the same origin and axes as {B} at the initial time.
(3) VIO Coordinate System {I}. The origin of {I} lies at the center of the stereo camera, which is installed at the front of the vehicle. The x-axis points upward and the y-axis points to the right, with the z-axis following the right-hand rule.
(4) Camera Coordinate System {C}. The origin of {C} lies at the center of the left camera. The x-axis points to the right and the y-axis points downward, with the z-axis following the right-hand rule.
(5) Inertial Coordinate System {G}. The origin of {G} coincides with that of {I} at the time VIO is initialized. The axes of {G} are computed at the time VIO is initialized, and its z-axis is aligned with earth gravity.
In this paper, we adopt the JPL quaternion convention [40,41], which is used in S-MSCKF, to derive ACK-MSCKF, and the Hamilton quaternion convention [42] to implement the rotation extrinsic parameter calibration.

2.2. Process Model and State Augmentation

The IMU state at the IMU sampling time $t_k$ is defined as,
$$\mathbf{X}_{I_k} = \begin{bmatrix} {}^{I_k}_{G}q^T & \mathbf{b}_{g,k}^T & {}^{G}\mathbf{v}_{I_k}^T & \mathbf{b}_{a,k}^T & {}^{G}\mathbf{p}_{I_k}^T & {}^{C}_{I}q^T & {}^{I}\mathbf{p}_{C}^T \end{bmatrix}^T \tag{1}$$
where ${}^{I_k}_{G}q$ denotes the rotation quaternion from {G} to {I}, ${}^{G}\mathbf{v}_{I_k}$ and ${}^{G}\mathbf{p}_{I_k}$ are the velocity and position of {I} in {G}, and ${}^{C}_{I}q$ and ${}^{I}\mathbf{p}_{C}$ denote the extrinsic parameters between {I} and {C}. The symbols $\mathbf{b}_{g,k}$ and $\mathbf{b}_{a,k}$ are the gyroscope and accelerometer biases of the IMU, respectively.
The camera state at the camera sampling time $t_j$ is defined as,
$$\mathbf{X}_{C_j} = \begin{bmatrix} {}^{C_j}_{G}q^T & {}^{G}\mathbf{p}_{C_j}^T \end{bmatrix}^T \tag{2}$$
where ${}^{C_j}_{G}q$ denotes the rotation quaternion from {G} to {C} and ${}^{G}\mathbf{p}_{C_j}$ is the position of {C} in {G}.
According to Equations (1) and (2), the full state vector with $N$ camera states at time $t_k \in [t_j,\, t_{j+1})$ is given by,
$$\mathbf{X}_k = \begin{bmatrix} \mathbf{X}_{I_k}^T & {}^{C_{j-N+1}}_{G}q^T & {}^{G}\mathbf{p}_{C_{j-N+1}}^T & {}^{C_{j-N+2}}_{G}q^T & {}^{G}\mathbf{p}_{C_{j-N+2}}^T & \cdots & {}^{C_j}_{G}q^T & {}^{G}\mathbf{p}_{C_j}^T \end{bmatrix}^T \tag{3}$$
Following Equation (3), the full error state $\tilde{\mathbf{X}}_k \in \mathbb{R}^{21+6N}$ at time $t_k$ is defined as,
$$\tilde{\mathbf{X}}_k = \begin{bmatrix} \tilde{\mathbf{X}}_{I_k}^T & {}^{C_{j-N+1}}_{G}\tilde{\theta}^T & {}^{G}\tilde{\mathbf{p}}_{C_{j-N+1}}^T & {}^{C_{j-N+2}}_{G}\tilde{\theta}^T & {}^{G}\tilde{\mathbf{p}}_{C_{j-N+2}}^T & \cdots & {}^{C_j}_{G}\tilde{\theta}^T & {}^{G}\tilde{\mathbf{p}}_{C_j}^T \end{bmatrix}^T \tag{4}$$
where,
$$\tilde{\mathbf{X}}_{I_k} = \begin{bmatrix} {}^{I_k}_{G}\tilde{\theta}^T & \tilde{\mathbf{b}}_{g,k}^T & {}^{G}\tilde{\mathbf{v}}_{I_k}^T & \tilde{\mathbf{b}}_{a,k}^T & {}^{G}\tilde{\mathbf{p}}_{I_k}^T & {}^{C}_{I}\tilde{\theta}^T & {}^{I}\tilde{\mathbf{p}}_{C}^T \end{bmatrix}^T \tag{5}$$
For the process model and state augmentation, we follow [22].

2.3. Measurement Model

2.3.1. Vehicle Relative Rotation and Translation Estimation

From the geometric constraints depicted in Figure 2, the estimation of vehicle relative rotation and translation can be given by the following equation:
$${}^{B_{j-1}}_{B_j}T = \left({}^{C}_{B}T\right)^{-1}\left({}^{G}_{C_{j-1}}T\right)^{-1}\,{}^{G}_{C_j}T\;{}^{C}_{B}T \tag{6}$$
Equation (6) can be represented as,
$$\begin{aligned} C({}^{B_{j-1}}_{B_j}q) &= C({}^{C}_{B}q)^T\, C({}^{C_{j-1}}_{G}q)\, C({}^{C_j}_{G}q)^T\, C({}^{C}_{B}q) \\ {}^{B_{j-1}}\mathbf{p}_{B_j} &= C({}^{C}_{B}q)^T\left[ C({}^{C_{j-1}}_{G}q)\left( C({}^{C_j}_{G}q)^T\,{}^{C}\mathbf{p}_{B} + {}^{G}\mathbf{p}_{C_j} - {}^{G}\mathbf{p}_{C_{j-1}} \right) - {}^{C}\mathbf{p}_{B} \right] \end{aligned} \tag{7}$$
where,
$$C({}^{C}_{B}q)^T = C({}^{B}_{I}q)\, C({}^{C}_{I}q)^T, \qquad {}^{C}\mathbf{p}_{B} = {}^{I}\mathbf{p}_{C} + {}^{B}\mathbf{p}_{I} \tag{8}$$
where $C(\cdot)$ denotes the function converting a quaternion to the corresponding rotation matrix [41]. $C({}^{B}_{I}q)$ and ${}^{B}\mathbf{p}_{I}$ denote the rotation and translation extrinsic parameters between {B} and {I}, respectively. $C({}^{C}_{I}q)$ and ${}^{I}\mathbf{p}_{C}$ denote the rotation and translation extrinsic parameters between {C} and {I}, respectively. $C({}^{B_{j-1}}_{B_j}q)$ and ${}^{B_{j-1}}\mathbf{p}_{B_j}$ are the estimated vehicle relative rotation and translation from time $t_j$ to time $t_{j-1}$.
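As a concrete illustration of the geometric chain in Equations (6) and (7), the minimal Eigen sketch below composes the vehicle-frame relative rotation and translation from two camera poses and the {B}–{C} extrinsics. It is only an illustration under the conventions stated above; the function and variable names are ours and are not taken from the released ACK-MSCKF code.

```cpp
#include <Eigen/Dense>

// Sketch of Equations (6)-(7): relative vehicle pose between two camera
// timestamps, given the camera poses in {G} and the {B}->{C} extrinsics.
// R_BC maps {B} to {C}, p_B_in_C is the {B} origin expressed in {C};
// R_GCx maps {G} to {Cx}, p_Cx_in_G is the camera position in {G}.
void relativeVehiclePose(const Eigen::Matrix3d& R_BC, const Eigen::Vector3d& p_B_in_C,
                         const Eigen::Matrix3d& R_GCjm1, const Eigen::Vector3d& p_Cjm1_in_G,
                         const Eigen::Matrix3d& R_GCj,   const Eigen::Vector3d& p_Cj_in_G,
                         Eigen::Matrix3d* R_rel, Eigen::Vector3d* p_rel) {
  // Rotation part of Equation (7).
  *R_rel = R_BC.transpose() * R_GCjm1 * R_GCj.transpose() * R_BC;
  // Translation part of Equation (7).
  *p_rel = R_BC.transpose() *
           (R_GCjm1 * (R_GCj.transpose() * p_B_in_C + p_Cj_in_G - p_Cjm1_in_G) - p_B_in_C);
}
```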

2.3.2. Ackermann Error State Measurements

After the state is augmented with the new camera state, it is successively updated by Ackermann measurements and visual measurements. Here, the tightly-coupled Ackermann error state measurement model is presented as the primary contribution of this paper.
Based on Equation (7), the measurement residual $\mathbf{r}_{B_j}$ can be defined as,
$$\mathbf{r}_{B_j} = \begin{pmatrix} \mathbf{r}^{B}_{\theta_{j-1,j}} \\ \mathbf{r}^{B}_{v_j} \end{pmatrix} = \begin{pmatrix} \mathbf{z}^{B}_{\theta_{j-1,j}}\, C({}^{B_{j-1}}_{B_j}q)^{-1} \\ \mathbf{z}^{B}_{v_j} - {}^{B_{j-1}}\mathbf{p}_{B_j} / \Delta t_{j-1,j} \end{pmatrix} \tag{9}$$
where $\mathbf{r}^{B}_{\theta_{j-1,j}}$ and $\mathbf{r}^{B}_{v_j}$ are the Ackermann steering measurement residual from time $t_{j-1}$ to time $t_j$ and the velocity measurement residual at time $t_j$, respectively, and $\mathbf{z}^{B}_{\theta_{j-1,j}}$ and $\mathbf{z}^{B}_{v_j}$ are the Ackermann steering measurement and the velocity measurement, respectively. The camera sampling period is $\Delta t_{j-1,j} = t_j - t_{j-1}$. We assume that the velocity measurement $\mathbf{z}^{B}_{v_j}$ is constant over the low-speed motion from time $t_{j-1}$ to time $t_j$.
Considering the nonholonomic constraint of a ground wheeled vehicle [43], the vehicle velocity in the plane perpendicular to the moving direction is modeled as zero-mean Gaussian white noise. The velocity measurement $\mathbf{z}^{B}_{v_j}$ can therefore be represented as,
$$\mathbf{z}^{B}_{v_j} = \begin{bmatrix} v^{B}_{x,j} & 0 & 0 \end{bmatrix}^T \tag{10}$$
where $v^{B}_{x,j}$ is the vehicle longitudinal velocity at time $t_j$.
The Ackermann steering measurement $\mathbf{z}^{B}_{\theta_{j-1,j}}$ can be derived from the Ackermann steering geometry used in car-like wheeled mobile robots; a closer look at this geometry is illustrated in Figure 3. It avoids wheel slippage by rotating the inner wheel through a larger angle than the outer wheel. As shown in Figure 3, the Instantaneous Center of Rotation (ICR) is located at the intersection of the extension of the automotive rear axle and the perpendicular bisectors of the front wheels [44].
It is assumed that the vehicle follows approximately planar motion, and that there is no wheel slippage or sideslip angle during low-speed motion [45]. The equation of the Ackermann steering geometry can then be derived as follows,
$$\frac{L}{\tan\lvert\alpha_o\rvert} - \frac{L}{\tan\lvert\alpha_c\rvert} = \frac{B}{2}, \qquad R = \frac{L}{\tan\alpha_c} \tag{11}$$
According to Equation (11), the kinematic relation between $R$ and $\alpha_o$ is:
$$R = \begin{cases} L/\tan\alpha_o - B/2 & \text{for } \alpha_o > 0 \\ L/\tan\alpha_o + B/2 & \text{for } \alpha_o < 0 \end{cases} \tag{12}$$
$$\alpha_o = \delta / \eta \tag{13}$$
where the steering wheel angle $\delta$ can be acquired from the vehicle CAN-bus, and $\eta$ is the angular transmission ratio obtained from offline calibration; in this paper, $\eta = 17$.
Knowing the vehicle longitudinal velocity $v^{B}_{x}$, the steering radius $R$ and the vehicle CAN-bus sampling time $t_{\tau}$, the change of yaw angle $\phi^{B}_{j-1,j}$ from time $t_{j-1}$ to time $t_j$ is obtained [46]:
$$\phi^{B}_{j-1,j} = \int_{\tau \in [j-1,\, j]} \frac{v^{B}_{x,\tau}}{R_{\tau}}\, d\tau \tag{14}$$
Considering the nonholonomic constraint and the approximately planar motion assumption, the roll and pitch angles are modeled as zero-mean Gaussian white noise,
$$\theta^{B}_{j-1,j} = \begin{bmatrix} 0 & 0 & \phi^{B}_{j-1,j} \end{bmatrix}^T \tag{15}$$
By using the Rodrigues rotation formula, the Ackermann steering measurement $\mathbf{z}^{B}_{\theta_{j-1,j}}$ can be represented by the exponential map of $\theta^{B}_{j-1,j}$ [42]:
$$\mathbf{z}^{B}_{\theta_{j-1,j}} = \exp\!\left(\lfloor \theta^{B}_{j-1,j} \times \rfloor\right) \tag{16}$$
where the operator $\lfloor\,\cdot\,\times\rfloor$ maps a vector to its skew-symmetric matrix.
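To make Equations (10) and (12)–(16) concrete, the sketch below turns the CAN-bus samples of one camera interval into the Ackermann rotation and velocity measurements. This is a minimal sketch under the planar, low-speed assumptions above; the `CanSample` type, helper names and numerical guards are ours, not the API of the released implementation.

```cpp
#include <cmath>
#include <vector>
#include <Eigen/Dense>

// One vehicle CAN-bus sample inside the camera interval [t_{j-1}, t_j].
struct CanSample {
  double t;      // timestamp [s]
  double v_x;    // longitudinal velocity [m/s]
  double delta;  // steering wheel angle [rad]
};

// Steering radius from the outer-wheel steering angle, Equation (12).
double steeringRadius(double alpha_o, double L, double B) {
  return (alpha_o > 0.0) ? L / std::tan(alpha_o) - B / 2.0
                         : L / std::tan(alpha_o) + B / 2.0;
}

// Build the Ackermann measurements of Equations (10) and (12)-(16):
// integrate the yaw increment over the CAN samples and wrap it into a
// rotation matrix; eta is the steering-wheel-to-wheel transmission ratio.
void ackermannMeasurement(const std::vector<CanSample>& samples,
                          double L, double B, double eta,
                          Eigen::Matrix3d* z_theta, Eigen::Vector3d* z_v) {
  double phi = 0.0;
  for (size_t i = 1; i < samples.size(); ++i) {
    const double dt = samples[i].t - samples[i - 1].t;
    const double alpha_o = samples[i - 1].delta / eta;   // Equation (13)
    if (std::fabs(alpha_o) < 1e-6) continue;             // straight driving: no yaw change
    const double R = steeringRadius(alpha_o, L, B);      // Equation (12)
    phi += dt * samples[i - 1].v_x / R;                  // Equation (14)
  }
  // Equations (15)-(16): only the yaw component is observed (planar assumption).
  *z_theta = Eigen::AngleAxisd(phi, Eigen::Vector3d::UnitZ()).toRotationMatrix();
  // Equation (10): nonholonomic constraint, lateral and vertical velocity ~ 0.
  *z_v = Eigen::Vector3d(samples.back().v_x, 0.0, 0.0);
}
```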
By linearizing Equation (9) at the current error state $\tilde{\mathbf{X}}_k$, $\mathbf{r}_{B_j}$ can be approximated as,
$$\mathbf{r}_{B_j} = \mathbf{H}_{B_j}\,\tilde{\mathbf{X}}_k + \mathbf{n}_{B_j} \tag{17}$$
where the Ackermann error state measurement Jacobian matrix $\mathbf{H}_{B_j}$ can be derived as,
$$\mathbf{H}_{B_j} = \begin{bmatrix} \mathbf{0}_{3\times 15} & \dfrac{\partial \mathbf{r}^{B}_{\theta_{j-1,j}}}{\partial\,{}^{C}_{I}\tilde{\theta}} & \mathbf{0}_{3\times 3} & \mathbf{0}_{3\times(6N-6)} & \dfrac{\partial \mathbf{r}^{B}_{\theta_{j-1,j}}}{\partial\,{}^{C_j}_{G}\tilde{\theta}} & \mathbf{0}_{3\times 3} \\ \mathbf{0}_{3\times 15} & \dfrac{\partial \mathbf{r}^{B}_{v_j}}{\partial\,{}^{C}_{I}\tilde{\theta}} & \dfrac{\partial \mathbf{r}^{B}_{v_j}}{\partial\,{}^{I}\tilde{\mathbf{p}}_{C}} & \mathbf{0}_{3\times(6N-6)} & \dfrac{\partial \mathbf{r}^{B}_{v_j}}{\partial\,{}^{C_j}_{G}\tilde{\theta}} & \dfrac{\partial \mathbf{r}^{B}_{v_j}}{\partial\,{}^{G}\tilde{\mathbf{p}}_{C_j}} \end{bmatrix} \tag{18}$$
where
$$\begin{aligned}
\frac{\partial \mathbf{r}^{B}_{\theta_{j-1,j}}}{\partial\,{}^{C_j}_{G}\tilde{\theta}} &= C({}^{C}_{B}q)^T\, C({}^{C_{j-1}}_{G}q)\, C({}^{C_j}_{G}q)^T \\
\frac{\partial \mathbf{r}^{B}_{\theta_{j-1,j}}}{\partial\,{}^{C}_{I}\tilde{\theta}} &= C({}^{C}_{B}q)^T\, C({}^{C_{j-1}}_{G}q)\, C({}^{C_j}_{G}q)^T - C({}^{C}_{B}q)^T \\
\frac{\partial \mathbf{r}^{B}_{v_j}}{\partial\,{}^{C_j}_{G}\tilde{\theta}} &= C({}^{C}_{B}q)^T\, C({}^{C_{j-1}}_{G}q)\, C({}^{C_j}_{G}q)^T \lfloor {}^{C}\mathbf{p}_{B}\times\rfloor / \Delta t_{j-1,j} \\
\frac{\partial \mathbf{r}^{B}_{v_j}}{\partial\,{}^{G}\tilde{\mathbf{p}}_{C_j}} &= C({}^{C}_{B}q)^T\, C({}^{C_{j-1}}_{G}q) / \Delta t_{j-1,j} \\
\frac{\partial \mathbf{r}^{B}_{v_j}}{\partial\,{}^{C}_{I}\tilde{\theta}} &= C({}^{C}_{B}q)^T\, C({}^{C_{j-1}}_{G}q)\, \lfloor ({}^{G}\mathbf{p}_{C_j} - {}^{G}\mathbf{p}_{C_{j-1}})\times\rfloor / \Delta t_{j-1,j} \\
\frac{\partial \mathbf{r}^{B}_{v_j}}{\partial\,{}^{I}\tilde{\mathbf{p}}_{C}} &= \left( C({}^{C}_{B}q)^T\, C({}^{C_{j-1}}_{G}q)\, C({}^{C_j}_{G}q)^T - C({}^{C}_{B}q)^T \right) / \Delta t_{j-1,j}
\end{aligned} \tag{19}$$
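The Jacobian in Equation (18) is block-sparse: only the columns of the I–C extrinsic errors and of the newest camera state are non-zero. The sketch below only shows how such 3×3 blocks are placed in the 6×(21+6N) matrix; the block values, i.e. the derivatives of Equation (19), are passed in as arguments, and the function is an illustration of the structure rather than the released implementation.

```cpp
#include <Eigen/Dense>

// Place the non-zero 3x3 blocks of Equation (18) into H_Bj (6 x (21+6N)).
// The error-state ordering follows Equations (4)-(5): 15 IMU error states,
// then the I-C extrinsic errors (rotation, translation), then N camera
// states of 6 errors each, with the newest camera state last.
Eigen::MatrixXd assembleAckermannJacobian(
    int N,
    const Eigen::Matrix3d& dRtheta_dThetaIC, const Eigen::Matrix3d& dRtheta_dThetaGCj,
    const Eigen::Matrix3d& dRv_dThetaIC,     const Eigen::Matrix3d& dRv_dPIC,
    const Eigen::Matrix3d& dRv_dThetaGCj,    const Eigen::Matrix3d& dRv_dPGCj) {
  const int cols = 21 + 6 * N;
  Eigen::MatrixXd H = Eigen::MatrixXd::Zero(6, cols);
  const int c_thetaIC = 15;                 // I-C rotation extrinsic error
  const int c_pIC     = 18;                 // I-C translation extrinsic error
  const int c_thetaCj = 21 + 6 * (N - 1);   // newest camera rotation error
  const int c_pCj     = c_thetaCj + 3;      // newest camera position error
  // Rotation residual rows (first block row of Equation (18)).
  H.block<3, 3>(0, c_thetaIC) = dRtheta_dThetaIC;
  H.block<3, 3>(0, c_thetaCj) = dRtheta_dThetaGCj;
  // Velocity residual rows (second block row of Equation (18)).
  H.block<3, 3>(3, c_thetaIC) = dRv_dThetaIC;
  H.block<3, 3>(3, c_pIC)     = dRv_dPIC;
  H.block<3, 3>(3, c_thetaCj) = dRv_dThetaGCj;
  H.block<3, 3>(3, c_pCj)     = dRv_dPGCj;
  return H;
}
```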
Meanwhile, $\mathbf{n}_{B_j}$ in Equation (17) denotes the measurement noise, and the measurement covariance matrix of $\mathbf{n}_{B_j}$ is given by,
$$\mathbf{S}_{B_j} = \begin{bmatrix} \mathbf{S}^{B}_{\theta_{j-1,j}} & \mathbf{0}_{3\times 3} \\ \mathbf{0}_{3\times 3} & \mathbf{S}^{B}_{v_j} \end{bmatrix} \tag{20}$$
where $\mathbf{S}^{B}_{v_j} = \mathrm{diag}\{ (\sigma^{B}_{v_x})^2,\; (\sigma^{B}_{v_y})^2,\; (\sigma^{B}_{v_z})^2 \}$, $\mathbf{S}^{B}_{\theta_{j-1,j}} = \mathrm{diag}\{ (\sigma^{B}_{\theta_{roll}})^2,\; (\sigma^{B}_{\theta_{pitch}})^2,\; S^{B}_{\phi_{j-1,j}} \}$, and $S^{B}_{\phi_{j-1,j}}$ can be given by the following recursive formula,
$$S^{B}_{\phi_{j-1,\tau}} = S^{B}_{\phi_{j-1,\tau-1}} + \mathbf{V}_{\tau-1}\,\mathbf{M}\,\mathbf{V}_{\tau-1}^T \tag{21}$$
where the input Jacobian matrix $\mathbf{V}_{\tau-1}$ can be obtained according to Equation (14),
$$\phi^{B}_{j-1,\tau} = \phi^{B}_{j-1,\tau-1} + \Delta t_{\tau-1,\tau}\, \frac{v^{B}_{x,\tau-1}}{R_{\tau-1}}, \qquad \mathbf{V}_{\tau-1} = \frac{\partial \phi^{B}_{j-1,\tau}}{\partial \begin{bmatrix} v^{B}_{x,\tau-1} & \delta_{\tau-1} \end{bmatrix}^T} = \begin{bmatrix} V^{\phi}_{v^{B}_{x,\tau-1}} & V^{\phi}_{\delta_{\tau-1}} \end{bmatrix} \tag{22}$$
where $\Delta t_{\tau-1,\tau} = t_{\tau} - t_{\tau-1}$ is the vehicle CAN-bus sampling period, and $V^{\phi}_{v^{B}_{x,\tau-1}}$ and $V^{\phi}_{\delta_{\tau-1}}$ can be derived as,
$$V^{\phi}_{v^{B}_{x,\tau-1}} = \frac{\Delta t_{\tau-1,\tau}}{R_{\tau-1}}, \qquad V^{\phi}_{\delta_{\tau-1}} = \frac{L\, v^{B}_{x,\tau-1}\, \Delta t_{\tau-1,\tau}\left(\tan^2(\delta_{\tau-1}/\eta)+1\right)}{\eta\left(R_{\tau-1}\tan(\delta_{\tau-1}/\eta)\right)^2} \tag{23}$$
and the input noise covariance $\mathbf{M}$ is given by,
$$\mathbf{M} = \begin{bmatrix} (\sigma^{B}_{v_x})^2 & 0 \\ 0 & \sigma_{\delta}^2 \end{bmatrix} \tag{24}$$
where $\sigma^{B}_{v_x}$ and $\sigma_{\delta}$ are the standard deviations of the vehicle longitudinal velocity and the steering wheel angle, respectively. Meanwhile, $\sigma^{B}_{v_y}$ and $\sigma^{B}_{v_z}$ are the white-noise standard deviations of the vehicle lateral velocity and vertical velocity, respectively. Moreover, $\sigma^{B}_{\theta_{roll}}$ and $\sigma^{B}_{\theta_{pitch}}$ are the white-noise standard deviations of the vehicle roll angle and pitch angle, respectively. In this paper, $\sigma^{B}_{v_x} = 0.3$, $\sigma^{B}_{v_y} = 0.3$, $\sigma^{B}_{v_z} = 5$, $\sigma_{\delta} = 17$, $\sigma^{B}_{\theta_{roll}} = 10\,S^{B}_{\phi_{j-1,j}}$ and $\sigma^{B}_{\theta_{pitch}} = 10\,S^{B}_{\phi_{j-1,j}}$.
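A minimal sketch of the covariance recursion in Equations (21)–(24) is given below, reusing the `CanSample` type and `steeringRadius` helper assumed in the earlier sketch; the noise standard deviations are the tunable values listed above, and this is an illustration of the recursion rather than the released code.

```cpp
#include <cmath>
#include <vector>
#include <Eigen/Dense>

// Propagate the variance of the yaw increment over the CAN samples of one
// camera interval, Equations (21)-(24).  sigma_vx and sigma_delta are the
// CAN measurement standard deviations; L, B, eta as before.
double yawIncrementVariance(const std::vector<CanSample>& samples,
                            double L, double B, double eta,
                            double sigma_vx, double sigma_delta) {
  const Eigen::Matrix2d M =
      Eigen::Vector2d(sigma_vx * sigma_vx, sigma_delta * sigma_delta).asDiagonal();  // Eq. (24)
  double S_phi = 0.0;
  for (size_t i = 1; i < samples.size(); ++i) {
    const double dt = samples[i].t - samples[i - 1].t;
    const double alpha_o = samples[i - 1].delta / eta;
    if (std::fabs(alpha_o) < 1e-6) continue;            // straight driving: no yaw update
    const double R = steeringRadius(alpha_o, L, B);     // Equation (12)
    const double tan_a = std::tan(alpha_o);
    Eigen::RowVector2d V;                               // Equations (22)-(23)
    V(0) = dt / R;                                      // d(phi)/d(v_x)
    V(1) = L * samples[i - 1].v_x * dt * (tan_a * tan_a + 1.0) /
           (eta * (R * tan_a) * (R * tan_a));           // d(phi)/d(delta)
    const Eigen::Matrix<double, 1, 1> inc = V * M * V.transpose();  // Equation (21)
    S_phi += inc(0, 0);
  }
  return S_phi;
}
```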
Finally, the Kalman gain, the updated state vector and the updated covariance matrix at time $t_j$ can be obtained as follows,
$$\begin{aligned} \mathbf{K} &= \mathbf{P}_{k|k}\,\mathbf{H}_{B_j}^T\left(\mathbf{H}_{B_j}\,\mathbf{P}_{k|k}\,\mathbf{H}_{B_j}^T + \mathbf{S}_{B_j}\right)^{-1} \\ \mathbf{X}_k^{+} &= \mathbf{X}_k + \mathbf{K}\,\mathbf{r}_{B_j} \\ \mathbf{P}_{k|k}^{+} &= \left(\mathbf{I}_{\xi} - \mathbf{K}\mathbf{H}_{B_j}\right)\mathbf{P}_{k|k}\left(\mathbf{I}_{\xi} - \mathbf{K}\mathbf{H}_{B_j}\right)^T + \mathbf{K}\,\mathbf{S}_{B_j}\,\mathbf{K}^T \end{aligned} \tag{25}$$
where $\mathbf{I}_{\xi} \in \mathbb{R}^{(21+6N)\times(21+6N)}$ represents the identity matrix.
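Equation (25) is a standard EKF measurement update with the Joseph-form covariance update. A generic Eigen sketch is shown below; the error-state injection into the nominal state and the numerical safeguards of a production filter are omitted, and the names are ours.

```cpp
#include <Eigen/Dense>

// Generic EKF measurement update of Equation (25): Kalman gain, error-state
// correction and Joseph-form covariance update.  P is (21+6N)x(21+6N),
// H is 6x(21+6N), r is 6x1 and S is the 6x6 measurement covariance.
void ackermannUpdate(const Eigen::MatrixXd& H, const Eigen::VectorXd& r,
                     const Eigen::MatrixXd& S, Eigen::VectorXd* dx,
                     Eigen::MatrixXd* P) {
  const Eigen::MatrixXd K =
      (*P) * H.transpose() * (H * (*P) * H.transpose() + S).inverse();
  *dx = K * r;  // error-state correction, folded back into the nominal state
  const Eigen::MatrixXd I_KH =
      Eigen::MatrixXd::Identity(P->rows(), P->cols()) - K * H;
  *P = I_KH * (*P) * I_KH.transpose() + K * S * K.transpose();  // Joseph form
}
```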

2.3.3. Visual Measurements

In the visual front-end, the FAST feature detector and KLT optical flow tracker are used. The configuration of the visual front-end and the visual feature measurement model in this paper are the same as S-MSCKF. Please refer to [22] for further details.
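For readers unfamiliar with that front-end, the sketch below shows the generic OpenCV pattern of FAST detection followed by pyramidal KLT optical flow tracking between consecutive images. It is only a generic illustration: the thresholds, grid-based feature management and stereo matching of the actual S-MSCKF front-end are omitted.

```cpp
#include <vector>
#include <opencv2/features2d.hpp>
#include <opencv2/video/tracking.hpp>

// Generic FAST + KLT sketch (not the S-MSCKF implementation): detect FAST
// corners in the previous image and track them into the current one.
void detectAndTrack(const cv::Mat& prev_img, const cv::Mat& curr_img,
                    std::vector<cv::Point2f>* prev_pts,
                    std::vector<cv::Point2f>* curr_pts) {
  std::vector<cv::KeyPoint> keypoints;
  cv::FAST(prev_img, keypoints, /*threshold=*/20, /*nonmaxSuppression=*/true);
  cv::KeyPoint::convert(keypoints, *prev_pts);

  std::vector<uchar> status;
  std::vector<float> err;
  cv::calcOpticalFlowPyrLK(prev_img, curr_img, *prev_pts, *curr_pts, status, err,
                           cv::Size(21, 21), /*maxLevel=*/3);

  // Keep only successfully tracked features.
  size_t k = 0;
  for (size_t i = 0; i < status.size(); ++i) {
    if (status[i]) {
      (*prev_pts)[k] = (*prev_pts)[i];
      (*curr_pts)[k] = (*curr_pts)[i];
      ++k;
    }
  }
  prev_pts->resize(k);
  curr_pts->resize(k);
}
```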

2.4. Extrinsic Parameters Calibration

Accurate extrinsic parameters greatly affect the accuracy and reliability of the overall multi-sensor fusion system. In this paper, the rotation extrinsic parameters $C({}^{C}_{I}q)$ and translation extrinsic parameters ${}^{I}\mathbf{p}_{C}$ between {C} and {I} are calibrated using the Kalibr visual-inertial calibration toolbox [47]. The translation extrinsic parameters ${}^{B}\mathbf{p}_{I}$ between {B} and {I} are obtained from the vehicle design model, and we adopt the singular value decomposition (SVD) [48] rotation calibration and the weighted average quaternion method [49] to calibrate the rotation extrinsic parameters $C({}^{B}_{I}q)$ between {B} and {I}. A dual-antenna Spatial NAV982-RG inertial navigation system with real-time kinematic positioning (GPS-RTK) is used to assist the calibration. The GPS-RTK is fixed just above the automotive rear axle center and has the same axes as {B}. Assuming that the installation error can be ignored, the rotation extrinsic parameters $C({}^{B}_{I}q)$ can be obtained by aligning two rotation sequences from the GPS-RTK and frame {I}. The following equation holds for any $\gamma$,
$${}^{B_{\gamma-1}}_{B_\gamma}q \otimes {}^{B}_{I}q = {}^{B}_{I}q \otimes {}^{I_{\gamma-1}}_{I_\gamma}q \tag{26}$$
where ${}^{B_{\gamma-1}}_{B_\gamma}q$ and ${}^{I_{\gamma-1}}_{I_\gamma}q$ are unit quaternions of the GPS-RTK rotation measurement and the gyroscope-integrated rotation of frame {I} from time $t_{\gamma}$ to time $t_{\gamma-1}$, respectively, and $\otimes$ is the quaternion multiplication operator.
Equation (26) can be represented as,
$$\left[\Omega_L({}^{B_{\gamma-1}}_{B_\gamma}q) - \Omega_R({}^{I_{\gamma-1}}_{I_\gamma}q)\right]\,{}^{B}_{I}q = \Omega_{\gamma}^{\gamma-1}\,{}^{B}_{I}q = \mathbf{0} \tag{27}$$
where
$$\Omega_L(q) = \begin{bmatrix} q_w & -q_{xyz}^T \\ q_{xyz} & q_w\mathbf{I}_3 + \lfloor q_{xyz}\times\rfloor \end{bmatrix}, \qquad \Omega_R(q) = \begin{bmatrix} q_w & -q_{xyz}^T \\ q_{xyz} & q_w\mathbf{I}_3 - \lfloor q_{xyz}\times\rfloor \end{bmatrix} \tag{28}$$
where $\Omega_L(\cdot)$ and $\Omega_R(\cdot)$ are the matrix representations of left and right Hamilton [42] quaternion multiplication, and $q_w$ and $q_{xyz}$ are the real part and the imaginary part of a quaternion, respectively.
By collecting a period of rotation sequences, the overconstrained linear system can be represented as,
$$\begin{bmatrix} \Omega_{1}^{0} \\ \Omega_{2}^{1} \\ \vdots \\ \Omega_{\gamma}^{\gamma-1} \end{bmatrix} {}^{B}_{I}q = \boldsymbol{\Omega}_{\gamma}\,{}^{B}_{I}q = \mathbf{0} \tag{29}$$
where the index $\gamma$ keeps increasing until the rotation calibration is completed.
Equation (29) can be solved using SVD: the right unit singular vector corresponding to the smallest singular value of $\boldsymbol{\Omega}_{\gamma}$ is the solution. We set a threshold on the number of iterations $N_s$, and the rotation calibration process terminates if $\gamma > N_s$.
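A minimal Eigen sketch of Equations (27)–(29) is given below: the 4×4 blocks are built from each pair of relative rotations (Hamilton convention, real part first), stacked, and the right singular vector of the smallest singular value is taken as the solution. The function and variable names are ours, and the per-round bookkeeping of the paper's procedure is omitted.

```cpp
#include <vector>
#include <Eigen/Dense>
#include <Eigen/SVD>

// Skew-symmetric matrix of a 3-vector.
Eigen::Matrix3d skew(const Eigen::Vector3d& v) {
  Eigen::Matrix3d S;
  S <<      0, -v.z(),  v.y(),
       v.z(),      0, -v.x(),
      -v.y(),  v.x(),      0;
  return S;
}

// Left / right Hamilton quaternion multiplication matrices, Equation (28).
Eigen::Matrix4d omegaL(const Eigen::Quaterniond& q) {
  Eigen::Matrix4d O;
  O(0, 0) = q.w();
  O.block<1, 3>(0, 1) = -q.vec().transpose();
  O.block<3, 1>(1, 0) = q.vec();
  O.block<3, 3>(1, 1) = q.w() * Eigen::Matrix3d::Identity() + skew(q.vec());
  return O;
}
Eigen::Matrix4d omegaR(const Eigen::Quaterniond& q) {
  Eigen::Matrix4d O;
  O(0, 0) = q.w();
  O.block<1, 3>(0, 1) = -q.vec().transpose();
  O.block<3, 1>(1, 0) = q.vec();
  O.block<3, 3>(1, 1) = q.w() * Eigen::Matrix3d::Identity() - skew(q.vec());
  return O;
}

// Solve the stacked system of Equation (29) for the {I}->{B} rotation.
// q_B[i], q_I[i] are the relative rotations of the GPS-RTK ({B}) and the
// gyroscope-integrated {I} frame over the same interval.
Eigen::Quaterniond solveRotationExtrinsic(const std::vector<Eigen::Quaterniond>& q_B,
                                          const std::vector<Eigen::Quaterniond>& q_I) {
  Eigen::MatrixXd Omega(4 * static_cast<int>(q_B.size()), 4);
  for (size_t i = 0; i < q_B.size(); ++i)
    Omega.block<4, 4>(4 * i, 0) = omegaL(q_B[i]) - omegaR(q_I[i]);   // Equation (27)
  Eigen::JacobiSVD<Eigen::MatrixXd> svd(Omega, Eigen::ComputeFullV);
  const Eigen::Vector4d x = svd.matrixV().col(3);   // smallest singular value
  return Eigen::Quaterniond(x(0), x(1), x(2), x(3)).normalized();    // (w, x, y, z)
}
```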
After $n$ rounds of rotation calibration with random sampling of rotation sequences, the weighted average quaternion [50] is utilized to obtain the rotation extrinsic parameters ${}^{B}_{I}q$,
$${}^{B}_{I}q = \arg\max_{q \in \mathbb{S}^3} \; q^T \mathbf{W} q \tag{30}$$
where $\mathbb{S}^3$ denotes the unit 3-sphere, and $\mathbf{W}$ is given by,
$$\mathbf{W} = \sum_{i=1}^{n} \sigma_i^{\mathrm{sec}}\; {}^{B}_{I}q_i\, {}^{B}_{I}q_i^T \tag{31}$$
where ${}^{B}_{I}q_i$ denotes the rotation calibration solution in the $i$-th round, and $\sigma_i^{\mathrm{sec}}$ is the second smallest singular value of $\boldsymbol{\Omega}_{\gamma}$ in the $i$-th round.
Equation (30) can be solved by using SVD; the rotation extrinsic parameters ${}^{B}_{I}q$ are given by the eigenvector of $\mathbf{W}$ corresponding to its maximum eigenvalue.
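Since $\mathbf{W}$ is a symmetric 4×4 matrix, Equations (30)–(31) can equivalently be solved with an eigen-decomposition; a hedged sketch is shown below, assuming the per-round quaternions have been flipped to a consistent hemisphere (q and −q represent the same rotation) and that the weights are the second-smallest singular values described above.

```cpp
#include <vector>
#include <Eigen/Dense>
#include <Eigen/Eigenvalues>

// Weighted average of the per-round calibration quaternions, Equations (30)-(31).
// 'weights' holds the second-smallest singular value of each round's stacked
// Omega matrix; quaternions follow the Hamilton (w, x, y, z) convention.
Eigen::Quaterniond weightedAverageQuaternion(
    const std::vector<Eigen::Quaterniond>& q_rounds,
    const std::vector<double>& weights) {
  Eigen::Matrix4d W = Eigen::Matrix4d::Zero();
  for (size_t i = 0; i < q_rounds.size(); ++i) {
    Eigen::Vector4d q(q_rounds[i].w(), q_rounds[i].x(), q_rounds[i].y(), q_rounds[i].z());
    W += weights[i] * q * q.transpose();              // Equation (31)
  }
  // Equation (30): the maximizer of q^T W q on the unit sphere is the
  // eigenvector of W with the largest eigenvalue.
  Eigen::SelfAdjointEigenSolver<Eigen::Matrix4d> es(W);
  const Eigen::Vector4d q_avg = es.eigenvectors().col(3);  // eigenvalues ascending
  return Eigen::Quaterniond(q_avg(0), q_avg(1), q_avg(2), q_avg(3)).normalized();
}
```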

3. Experiments and Results

The experimental vehicle setup and experimental scenarios are first introduced in this section. Then, the experimental results are presented for different vehicle driving cycles.

3.1. Experimental Vehicle Setup

To the best of our knowledge, there are no public datasets collected by Ackermann steering vehicles in an outdoor environment that contain vehicle CAN-bus data, visual data, IMU data and ground truth. The public datasets commonly used for VO and VIO are the EuRoC MAV Dataset [51], the RGBD SLAM Dataset [52], the KITTI Dataset [53] and the PennCOSYVIO Dataset [54]. The EuRoC MAV Dataset is mainly suitable for MAV applications; the RGBD SLAM Dataset only contains RGB-D camera data; the PennCOSYVIO Dataset is collected by handheld devices; and the KITTI Dataset is acquired by an Ackermann steering vehicle but does not contain time-synchronized vehicle CAN-bus data. Therefore, the performance of the proposed method is evaluated on real-world data collected by ourselves.
The experimental vehicle is an Ackermann steering vehicle with dual in-wheel motor drive designed by our research group. As shown in Figure 4, the experimental vehicle can be remotely controlled using the smart phone application (APP) developed by our research group. The Telematics BOX (T-BOX) is mainly used for communicating with the smart phone APP and the Vehicle Control Unit (VCU) to realize the display and remote control of the experimental vehicle. The datasets were collected in the outdoor environment of the test field. We used the MYNTEYE-S1030 stereo camera (MYNTAI Inc., Beijing, China) with a focal length of 8 mm as the visual-inertial sensor. The camera, with a resolution of 720 pixels × 480 pixels and a frame rate of 30 Hz, is equipped with a time-synchronized IMU. The acquisition frequency of the IMU is 200 Hz. The vehicle longitudinal speed and steering wheel angle were acquired from the vehicle CAN-bus at a sampling frequency of 150 Hz using a Controller Area Network (CAN) interface card.
A dual-antenna Spatial NAV982-RG inertial navigation system, combined with the FindCM positioning service from Qianxun Spatial Intelligence Inc., was used to provide centimeter-level positioning of the vehicle as the experimental ground truth.
The data acquisition was performed on a domestic mini-computer (Shenzhen Yingchi Technology Co., Ltd, Shenzhen, China) running on an Intel Core i3-5005U CPU with 2.00 GHz, 4 GB RAM (Intel, Santa Clara, CA, USA). The algorithm evaluation was performed on a Lenovo laptop (Lenovo, Beijing, China) running on an Intel Core i5-8300H CPU with 2.30 GHz, 8 GB RAM (Intel, Santa Clara, CA, USA). The data acquisition software and entire algorithm were implemented in C++ programming language (Free Software Foundation, Inc., Boston, Mass., USA) using ROS Kinetic (Open Source Robotics Foundation, Inc., Mountain View, CA, USA). We evaluated the trajectories using the methods provided by [55].

3.2. Experimental Results

Taking into account the different vehicle driving cycles and visually distinctive environmental features, and in order to evaluate the algorithm more comprehensively, five experimental datasets were acquired: AM_01, AM_02, AM_03, AM_04 and AM_05. AM_01 and AM_02 were acquired under the straight line driving cycle and the slalom driving cycle, respectively, by traveling counterclockwise around the building in a test field for one circle in the forenoon. In both of these datasets, there are strong illumination changes against the sunlight and motion blur at the sharp turns. AM_03 was acquired in a low-light environment at dusk, which poses a great challenge for the VIO visual front-end. AM_04, which includes dynamic road objects such as other vehicles and pedestrians, was acquired under a long-distance driving cycle on the trafficway. AM_05 was acquired under a free driving cycle to verify the extrinsic parameter calibration method proposed in Section 2.4. The details of the experimental datasets are shown in Table 1, and Figure 5 shows several images from the AM_01, AM_02, AM_03 and AM_04 datasets.

3.2.1. Extrinsic Parameters Calibration Performance

The original S-MSCKF was utilized to evaluate the extrinsic parameters calibration performance on the AM_05 dataset. S-MSCKF with the calibration method proposed in Section 2.4 is named S-MSCKF(s), and S-MSCKF with its rotation extrinsic parameter manually measured is named S-MSCKF(m).
Figure 6 shows the performance of S-MSCKF(s) and S-MSCKF(m) on the AM_05 dataset. For quantitative evaluation, we chose the absolute translation error (ATE) and the relative error (RE) as the evaluation metrics. The open-source package [55] offers an off-the-shelf interface to perform a comprehensive evaluation of VIO. According to this package, a yaw-only transformation computed from only the first state was used for trajectory alignment. Figure 6a,b show the top view and side view of the estimated trajectory, respectively. As presented in Figure 6c, the boxplots of the relative translation error are expressed in percentages by dividing the error by the sub-trajectory length. The boxplots of the relative yaw error are displayed in Figure 6d.
From Figure 6a, it can be seen that the estimated trajectory of S-MSCKF(s) aligns with the ground truth better than that of S-MSCKF(m). Figure 6c clearly shows that S-MSCKF(s) performs better than S-MSCKF(m) in terms of relative translation error. Meanwhile, Figure 6b illustrates that both methods exhibit large scale drift along the gravitational direction.
The absolute translation error on the AM_05 dataset is shown in Table 2, and the relative translation error and relative yaw error on the AM_05 dataset are shown in Table 3.
It can be seen from the quantitative results in Table 2 and Table 3 that the absolute translation error of S-MSCKF(s) is reduced by 27.55% relative to S-MSCKF(m). The relative translation error and relative yaw error of S-MSCKF(s) are smaller than those of S-MSCKF(m) over all sub-trajectory lengths. By comprehensive analysis of Table 2 and Table 3, S-MSCKF(s) performs better than S-MSCKF(m), indicating that the proposed extrinsic parameter calibration method can significantly improve the position accuracy.

3.2.2. ACK-MSCKF Performance

The performance of ACK-MSCKF is compared with state-of-the-art open source stereo VIO, S-MSCKF and OKVIS on the AM_01, AM_02, AM_03 and AM_04 datasets acquired by our experimental vehicle. The extrinsic parameters obtained by our proposed calibration method are used for all of these algorithms.
Figure 7 and Figure 8 show the trajectory estimation results in a top view and aligned to Google Earth, respectively.
From Figure 7 we can see that the estimated trajectories of our proposed ACK-MSCKF algorithm align to the ground truth better than others. Meanwhile, it can be seen that the position estimation results of S-MSCKF and OKVIS have larger scale drift, especially in the straight line driving cycle and slalom line driving cycle (Figure 7a,b). Figure 7c suggests that OKVIS has larger y-axis error in the low-light environment.
By fusing Ackermann error state measurements and S-MSCKF, it can be seen intuitively that the accuracy of the trajectory estimation is significantly improved.
Figure 8 shows that the trajectory of ground truth (magenta line) is highly identical with the outline of our test field in Google Earth. Therefore, the ground truth has high credibility to evaluate the performance of our algorithm.
The boxplots of the relative translation error are shown in Figure 9. Figure 10 presents the boxplots of the relative yaw error.
Figure 9 demonstrates that the relative translation error of OKVIS on the AM_01, AM_02 and AM_04 datasets is the largest within the sub-trajectory length of 5 m, while S-MSCKF has the largest relative translation error within the sub-trajectory length of 50 m on the AM_03 dataset. Meanwhile, there is larger scale drift, especially in the straight line driving cycle and the slalom driving cycle (Figure 9a,b).
It can be seen from Figure 10 that OKVIS has the largest relative yaw error within the sub-trajectory length of 20 m on all of the datasets. The relative yaw error of OKVIS increases significantly as the sub-trajectory length increases, especially in the straight line driving cycle and the low-light environment (Figure 10a,c). S-MSCKF performs best on the relative yaw error within the sub-trajectory length of 100 m on the AM_03 dataset.
It can be concluded from Figure 9 and Figure 10 that the relative translation error of the ACK-MSCKF algorithm is significantly reduced, and the relative yaw error is close to S-MSCKF within the sub-trajectory length of 5 m.
The absolute translation error along the whole trajectory of each experimental dataset is shown in Table 4, and the overall relative translation error and relative yaw error on all of the real-world datasets are reflected in Table 5.
It can be seen from the quantitative results in Table 4 that the absolute translation error of ACK-MSCKF is the smallest on all of the datasets. Larger absolute translation errors occur especially under the straight line driving cycle and the outdoor long-distance driving cycle on the road. Analyzing the quantitative results in Table 5, the relative translation errors of S-MSCKF and OKVIS increase significantly as the sub-trajectory length increases. At the same time, the relative yaw error of S-MSCKF is smaller than that of OKVIS over the different sub-trajectory lengths.
By comprehensive analysis of Table 4 and Table 5, we can see that the relative translation error of ACK-MSCKF is significantly reduced for all sub-trajectory lengths; it is 4.18 m for the 160 m sub-trajectory length, a reduction of 59.85% and 67.70% compared with S-MSCKF and OKVIS under the same conditions, respectively. Meanwhile, the relative yaw error of ACK-MSCKF is close to that of S-MSCKF. The relative yaw error of ACK-MSCKF at the 160 m sub-trajectory length is 0.80 deg, the lowest among the three methods, corresponding to reductions of 23.08% and 39.85% compared with S-MSCKF and OKVIS under the same conditions, respectively.

4. Discussion

The experimental results can be summarized as follows:
Accurate extrinsic parameter calibration greatly affects the accuracy and reliability of the overall system. The extrinsic parameter calibration method proposed in this paper can significantly improve the position accuracy, with the absolute translation error reduced by 27.55%.
S-MSCKF and OKVIS confront the problem of scale unobservability under the special motions of autonomous vehicles. The reasons can be explained as follows: visual features in outdoor scenes are mostly far away relative to the stereo baseline, making the stereo camera degenerate into a monocular one; besides, the scale unobservability of monocular VIO under the special motions of ground vehicles cannot be neglected.
ACK-MSCKF achieves accurate and robust pose estimation under different vehicle driving cycles and environmental conditions, which mainly benefits from the improved observability under the special motions of autonomous vehicles obtained by fusing Ackermann error state measurements. The relative translation error of ACK-MSCKF is reduced by 59.85% and 67.70% compared with S-MSCKF and OKVIS, respectively, at the 160 m sub-trajectory length. Moreover, the relative yaw error is reduced by 23.08% and 39.85%, respectively, under the same conditions.

5. Conclusions

In conclusion, this paper proposes a tightly-coupled Ackermann visual-inertial odometry, ACK-MSCKF. The Ackermann steering geometry and the Ackermann error state measurement model are described in detail, and the rotation extrinsic parameter calibration method is presented. The performance is evaluated on real-world datasets acquired by an Ackermann steering vehicle. The experimental results demonstrate that the proposed ACK-MSCKF improves the pose estimation accuracy of S-MSCKF under the special motions of autonomous vehicles, and maintains accurate and robust pose estimation under different vehicle driving cycles and environmental conditions. This work provides guidance for the study of high-precision pose estimation of Ackermann steering vehicles through in-vehicle proprioceptive sensors and a visual-inertial sensor with a tightly-coupled filter-based mechanism.
However, the limitation of this work derives from the assumption that vehicles follow approximate planar motion without wheel slippage and sideslip angle for low-speed operation. In the future, we will further study the Ackermann error state measurement model-aided VIO under non-planar motion, and use incremental optimization-based algorithms [56,57,58,59] to further improve the pose estimation accuracy of autonomous vehicles.

Author Contributions

Conceptualization, F.M. and J.S.; Data curation, J.S.; Funding acquisition, F.M.; Investigation, J.S.; Methodology, J.S.; Project administration, F.M.; Resources, F.M. and K.D.; Software, J.S.; Supervision, F.M.; Validation, J.S.; Visualization, J.S.; Writing-original draft, J.S.; Writing-review & editing, F.M., J.S., Y.Y. and J.L.

Funding

This research was funded by the Jilin Province Key Technology and Development Program (Grant No. 20190302077GX), the National Natural Science Foundation of China (Grant No. 51705185) and the National Key Technologies R&D Program of China (Grant No. 2017YFC0601604).

Acknowledgments

We would like to thank the support received from Linhe Ge and Shouren Zhong. We acknowledge http://www.shenlanxueyuan.com/ for providing professional training and technical exchange about the latest algorithms in VIO.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Lee, K.; Li, S.E.; Kum, D. Synthesis of Robust Lane Keeping Systems: Impact of Controller and Design Parameters on System Performance. IEEE Trans. Intell. Transport. Syst. 2018, 20, 3129–3141. [Google Scholar] [CrossRef]
  2. Lee, K.; Chung, W.; Yoo, K. Kinematic parameter calibration of a car-like mobile robot to improve odometry accuracy. Mechatronics 2010, 20, 582–595. [Google Scholar] [CrossRef]
  3. Toledo, J.; Piñeiro, J.; Arnay, R.; Acosta, D.; Acosta, L. Improving odometric accuracy for an autonomous electric cart. Sensors 2018, 18, 200. [Google Scholar] [CrossRef]
  4. Groves, P.D. Principles of GNSS, Inertial, and Multisensor Integrated Navigation Systems; Artech House: Norwood, MA, USA, 2013. [Google Scholar]
  5. Xue, H.; Fu, H.; Dai, B. IMU-Aided High-Frequency Lidar Odometry for Autonomous Driving. Appl. Sci. 2019, 9, 1506. [Google Scholar] [CrossRef]
  6. Zhang, J.; Singh, S. Visual-lidar odometry and mapping: Low-drift, robust, and fast. In Proceedings of the 2015 IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, USA, 26–30 May 2015; pp. 2174–2181. [Google Scholar]
  7. Mur-Artal, R.; Tardós, J.D. Orb-slam2: An open-source slam system for monocular, stereo, and rgb-d cameras. IEEE Trans. Robot. 2017, 33, 1255–1262. [Google Scholar] [CrossRef]
  8. Engel, J.; Koltun, V.; Cremers, D. Direct sparse odometry. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 611–625. [Google Scholar] [CrossRef]
  9. Forster, C.; Zhang, Z.; Gassner, M.; Werlberger, M.; Scaramuzza, D. SVO: Semidirect visual odometry for monocular and multicamera systems. IEEE Trans. Robot. 2017, 33, 249–265. [Google Scholar] [CrossRef]
  10. Younes, G.; Asmar, D.; Shammas, E.; Zelek, J. Keyframe-based monocular SLAM: Design, survey, and future directions. Robot. Auton. Syst. 2017, 98, 67–88. [Google Scholar] [CrossRef]
  11. Aqel, M.O.A.; Marhaban, M.H.; Saripan, M.I.; Ismail, N.B. Review of visual odometry: Types, approaches, challenges, and applications. Springerplus 2016, 5, 1897. [Google Scholar] [CrossRef]
  12. Saputra, M.R.U.; Markham, A.; Trigoni, N. Visual SLAM and Structure from Motion in Dynamic Environments: A Survey. ACM Comput. Surv. 2018, 51, 37. [Google Scholar] [CrossRef]
  13. Aladem, M.; Rawashdeh, S. Lightweight visual odometry for autonomous mobile robots. Sensors 2018, 18, 2837. [Google Scholar] [CrossRef] [PubMed]
  14. Huang, G. Visual-inertial navigation: A concise review. arXiv 2019, arXiv:1906.02650. [Google Scholar]
  15. Gui, J.; Gu, D.; Wang, S.; Hu, H. A review of visual inertial odometry from filtering and optimisation perspectives. Adv. Robot. 2015, 29, 1289–1301. [Google Scholar] [CrossRef]
  16. Mur-Artal, R.; TardÓs, J.D. Visual-Inertial Monocular SLAM with Map Reuse. IEEE Robot. Autom. Lett. 2017, 2, 796–803. [Google Scholar] [CrossRef]
  17. Von Stumberg, L.; Usenko, V.; Cremers, D. Direct sparse visual-inertial odometry using dynamic marginalization. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia, 21–25 May 2018; pp. 2510–2517. [Google Scholar]
  18. Ramezani, M.; Khoshelham, K. Vehicle Positioning in GNSS-Deprived Urban Areas by Stereo Visual-Inertial Odometry. IEEE Trans. Intell. Veh. 2018, 3, 208–217. [Google Scholar] [CrossRef]
  19. Mourikis, A.I.; Roumeliotis, S.I. A multi-state constraint Kalman filter for vision-aided inertial navigation. In Proceedings of the 2007 IEEE International Conference on Robotics and Automation, Roma, Italy, 10–14 April 2007; pp. 3565–3572. [Google Scholar]
  20. Lynen, S.; Achtelik, M.W.; Weiss, S.; Chli, M.; Siegwart, R. A robust and modular multi-sensor fusion approach applied to MAV navigation. In Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, Tokyo, Japan, 3–7 November 2013; pp. 3923–3929. [Google Scholar]
  21. Bloesch, M.; Omari, S.; Hutter, M.; Siegwart, R. Robust visual inertial odometry using a direct EKF-based approach. In Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany, 28 September–2 October 2015; pp. 298–304. [Google Scholar]
  22. Sun, K.; Mohta, K.; Pfrommer, B.; Watterson, M.; Liu, S.; Mulgaonkar, Y.; Taylor, C.J.; Kumar, V. Robust stereo visual inertial odometry for fast autonomous flight. IEEE Robot. Autom. Lett. 2018, 3, 965–972. [Google Scholar] [CrossRef]
  23. Huai, Z.; Huang, G. Robocentric visual-inertial odometry. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 6319–6326. [Google Scholar]
  24. Leutenegger, S.; Lynen, S.; Bosse, M.; Siegwart, R.; Furgale, P. Keyframe-based visual–inertial odometry using nonlinear optimization. Int. J. Robot. Res. 2014, 34, 314–334. [Google Scholar] [CrossRef]
  25. Qin, T.; Li, P.; Shen, S. VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator. IEEE Trans. Robot. 2018, 34, 1004–1020. [Google Scholar] [CrossRef]
  26. He, Y.; Zhao, J.; Guo, Y.; He, W.; Yuan, K. Pl-vio: Tightly-coupled monocular visual-inertial odometry using point and line features. Sensors 2018, 18, 1159. [Google Scholar] [CrossRef]
  27. Usenko, V.; Demmel, N.; Schubert, D.; Stückler, J.; Cremers, D. Visual-Inertial Mapping with Non-Linear Factor Recovery. arXiv 2019, arXiv:1904.06504. [Google Scholar]
  28. Delmerico, J.; Scaramuzza, D. A benchmark comparison of monocular visual-inertial odometry algorithms for flying robots. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia, 21–25 May 2018; pp. 2502–2509. [Google Scholar]
  29. Wu, M.; Lam, S.; Srikanthan, T. A Framework for Fast and Robust Visual Odometry. IEEE Trans. Intell. Transport. Syst. 2017, 18, 3433–3448. [Google Scholar] [CrossRef]
  30. Wu, K.J.; Roumeliotis, S.I. Unobservable Directions of VINS under Special Motions; Department of Computer Science & Engineering, University of Minnesota: Minneapolis, MN, USA, 2016. [Google Scholar]
  31. Shan, Z.; Li, R.; Schwertfeger, S. RGBD-Inertial Trajectory Estimation and Mapping for Ground Robots. Sensors 2019, 19, 2251. [Google Scholar] [CrossRef] [PubMed]
  32. Schmid, K.; Tomic, T.; Ruess, F.; Hirschmüller, H.; Suppa, M. Stereo vision based indoor/outdoor navigation for flying robots. In Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, Tokyo, Japan, 3–7 November 2013; pp. 3955–3962. [Google Scholar]
  33. Zheng, F.; Liu, Y.H. SE(2)-Constrained Visual Inertial Fusion for Ground Vehicles. IEEE Sens. J. 2018, 18, 9699–9707. [Google Scholar] [CrossRef]
  34. Wu, K.J.; Guo, C.X.; Georgiou, G.; Roumeliotis, S.I. Vins on wheels. In Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, 29 May–3 June 2017; pp. 5155–5162. [Google Scholar]
  35. Dang, Z.; Wang, T.; Pang, F. Tightly-coupled Data Fusion of VINS and Odometer Based on Wheel Slip Estimation. In Proceedings of the 2018 IEEE International Conference on Robotics and Biomimetics (ROBIO), Kuala Lumpur, Malaysia, 12–15 December 2018; pp. 1613–1619. [Google Scholar]
  36. Hong, E.; Lim, J. Visual-Inertial Odometry with Robust Initialization and Online Scale Estimation. Sensors 2018, 18, 4287. [Google Scholar] [CrossRef] [PubMed]
  37. Houseago, C.; Bloesch, M.; Leutenegger, S. KO-Fusion: Dense Visual SLAM with Tightly-Coupled Kinematic and Odometric Tracking. In Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019. [Google Scholar]
  38. Huskić, G.; Buck, S.; Zell, A. GeRoNa: Generic Robot Navigation. J. Intell. Robot. Syst. 2018, 95, 419–442. [Google Scholar] [CrossRef]
  39. ACK-MSCKF ROS Package. Available online: https://github.com/qdensh/ACK-MSCKF (accessed on 8 October 2019).
  40. Breckenridge, W.G. Quaternions proposed standard conventions. Jet Propuls. Lab. Pasadena CA Interoff. Memo. IOM 1999, 343–379. [Google Scholar]
  41. Trawny, N.; Roumeliotis, S.I. Indirect Kalman filter for 3D attitude estimation. Univ. Minn. Dept. Comp. Sci. Eng. Tech. Rep 2005, 2, 2005. [Google Scholar] [CrossRef]
  42. Sola, J. Quaternion kinematics for the error-state Kalman filter. arXiv 2017, arXiv:1711.02508. [Google Scholar]
  43. Scaramuzza, D.; Fraundorfer, F.; Pollefeys, M.; Siegwart, R. Absolute scale in structure from motion from a single vehicle mounted camera by exploiting nonholonomic constraints. In Proceedings of the 2009 IEEE 12th International Conference on Computer Vision, Kyoto, Japan, 29 September–2 October 2009; IEEE: Piscataway, NJ, USA, 2009; pp. 1413–1419. [Google Scholar]
  44. Kevin, M.L.; Park, F.C. Modern Robotics: Mechanics, Planning, and Control; Cambridge University Press: Cambridge, UK, 2017. [Google Scholar]
  45. Rajamani, R. Vehicle Dynamics and Control, 2nd ed.; Springer: New York, NY, USA, 2012. [Google Scholar]
  46. Marín, L.; Vallés, M.; Soriano, Á.; Valera, Á.; Albertos, P. Multi sensor fusion framework for indoor-outdoor localization of limited resource mobile robots. Sensors 2013, 13, 14133–14160. [Google Scholar] [CrossRef]
  47. Furgale, P.; Rehder, J.; Siegwart, R. Unified temporal and spatial calibration for multi-sensor systems. In Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, Tokyo, Japan, 3–7 November 2013; pp. 1280–1286. [Google Scholar]
  48. Choi, S.; Park, J.; Yu, W. Simplified epipolar geometry for real-time monocular visual odometry on roads. Int. J. Control Autom. Syst. 2015, 13, 1454–1464. [Google Scholar] [CrossRef]
  49. Yang, Z.; Shen, S. Monocular visual-inertial state estimation with online initialization and camera-IMU extrinsic calibration. IEEE Trans. Autom. Sci. Eng. 2016, 14, 39–51. [Google Scholar] [CrossRef]
  50. Markley, F.L.; Cheng, Y.; Crassidis, J.L.; Oshman, Y. Averaging Quaternions. J. Guid. Control Dyn. 2007, 30, 1193–1197. [Google Scholar] [CrossRef]
  51. Burri, M.; Nikolic, J.; Gohl, P.; Schneider, T.; Rehder, J.; Omari, S.; Achtelik, M.W.; Siegwart, R. The EuRoC micro aerial vehicle datasets. Int. J. Robot. Res. 2016, 35, 1157–1163. [Google Scholar] [CrossRef]
  52. Sturm, J.; Engelhard, N.; Endres, F.; Burgard, W.; Cremers, D. A benchmark for the evaluation of RGB-D SLAM systems. In Proceedings of the 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, Vilamoura, Portugal, 7–12 October 2012; pp. 573–580. [Google Scholar]
  53. Geiger, A.; Lenz, P.; Urtasun, R. Are we ready for autonomous driving? the kitti vision benchmark suite. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, USA, 16–21 June 2012; pp. 3354–3361. [Google Scholar]
  54. Pfrommer, B.; Sanket, N.; Daniilidis, K.; Cleveland, J. Penncosyvio: A challenging visual inertial odometry benchmark. In Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, 29 May–3 June 2017; pp. 3847–3854. [Google Scholar]
  55. Zhang, Z.; Scaramuzza, D. A tutorial on quantitative trajectory evaluation for visual (-inertial) odometry. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 7244–7251. [Google Scholar]
  56. Kaess, M.; Ranganathan, A.; Dellaert, F. ISAM: Incremental Smoothing and Mapping. IEEE Trans. Robot. 2008, 24, 1365–1378. [Google Scholar] [CrossRef]
  57. Kaess, M.; Johannsson, H.; Roberts, R.; Ila, V.; Leonard, J.J.; Dellaert, F. ISAM2: Incremental smoothing and mapping using the Bayes tree. Int. J. Robot. Res. 2012, 31, 216–235. [Google Scholar] [CrossRef]
  58. Liu, H.; Chen, M.; Zhang, G.; Bao, H.; Bao, Y. Ice-ba: Incremental, consistent and efficient bundle adjustment for visual-inertial slam. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 1974–1982. [Google Scholar]
  59. Li, X.; Wang, Y.; Khoshelham, K. Comparative analysis of robust extended Kalman filter and incremental smoothing for UWB/PDR fusion positioning in NLOS environments. Acta Geod. Geophys. 2019, 54, 157–179. [Google Scholar] [CrossRef]
Figure 1. The basic workflow of Ackermann visual-inertial odometry (ACK-MSCKF).
Figure 2. Geometric relation between the {B} and {C} frames from time $t_{j-1}$ to time $t_j$. The symbol ${}^{W}_{B_{j-1}}T$ denotes the vehicle pose at time $t_{j-1}$ in {W}, ${}^{W}_{B_j}T$ denotes the vehicle pose at time $t_j$ in {W}, ${}^{B_{j-1}}_{B_j}T$ represents the homogeneous coordinate transformation from time $t_j$ to time $t_{j-1}$, ${}^{C}_{B}T$ represents the homogeneous coordinate transformation from {B} to {C}, ${}^{G}_{C_{j-1}}T$ denotes the camera pose at time $t_{j-1}$ in {G} and, finally, ${}^{G}_{C_j}T$ denotes the camera pose at time $t_j$ in {G}.
Figure 3. Ackermann steering geometry used in car-like wheeled mobile robots. $B$ is the distance between the king pins, $L$ is the wheel base (distance between the two axles), $\alpha_i$ is the steering angle of the inner wheel from straight ahead, $\alpha_c$ is the steering angle of the virtual wheel at the midpoint between the front wheels, $\alpha_o$ is the steering angle of the outer wheel from straight ahead and $R$ is the steering radius (distance between the Instantaneous Center of Rotation (ICR) and the centerline of the vehicle). The signs of $\alpha_i$, $\alpha_c$, $\alpha_o$ and $R$ follow the rule that a left turn assumes a positive value and a right turn a negative value in the coordinate system {B}.
Figure 4. Experimental vehicle setup: (a) the Ackermann steering vehicle built by our research group; (b) MYNTEYE-S1030; (c) dual-antenna Spatial NAV982-RG; (d) telematics box (communicating with the smart phone application (APP) and Vehicle Control Unit (VCU)); (e) smart phone APP; (f) Controller Area Network (CAN) interface card; (g) mini-computer for data acquisition.
Figure 5. Acquired images in the experimental datasets. (a) AM_01; (b) AM_02; (c) AM_03; (d) AM_04.
Figure 6. The performance from the Stereo Multi-State Constraint Kalman Filter (S-MSCKF)(s) (blue line), S-MSCKF(m) (green line) and ground truth (magenta line) on the AM_05 dataset. (a) Top-view of estimated trajectory; (b) Side-view of estimated trajectory; (c) The relative translation error; (d) The relative yaw error. Relative errors are measured over different sub-trajectories of length {1, 5, 20, 50, 100, 160} m.
Figure 7. The top-views of estimated trajectory from ACK-MSCKF (blue line), S-MSCKF (green line), Open Keyframe-based Visual-Inertial SLAM (OKVIS) (red line) and ground truth (magenta line) of the dataset: (a) AM_01; (b) AM_02; (c) AM_03; (d) AM_04.
Figure 8. The estimated trajectories aligned to Google Earth from ACK-MSCKF (blue line), S-MSCKF (green line), OKVIS (red line) and ground truth (magenta line) of the dataset: (a) AM_01; (b) AM_02; (c) AM_03; (d) AM_04.
Figure 9. The relative translation error from ACK-MSCKF (blue line), S-MSCKF (green line) and OKVIS (red line) of the dataset: (a) AM_01; (b) AM_02; (c) AM_03; (d) AM_04.
Figure 10. The relative yaw error from ACK-MSCKF (blue line), S-MSCKF (green line) and OKVIS (red line) of the dataset: (a) AM_01; (b) AM_02; (c) AM_03; (d) AM_04.
Table 1. List of our experimental datasets. The data bulk columns give the number of recorded vehicle CAN-bus, stereo image, IMU and ground truth samples.

| Dataset | Vehicle Driving Cycle | Environmental Condition | Duration (s) | Vehicle CAN-Bus | Stereo Images | IMU | Ground Truth |
| AM_01 | Straight | High-light | 151 | 24,171 | 4532 | 30,164 | 30,244 |
| AM_02 | Slalom | Motion blur | 170 | 27,135 | 5099 | 33,933 | 34,025 |
| AM_03 | Free | Low-light | 131 | 17,387 | 3930 | 26,258 | 13,111 |
| AM_04 | Free | Dynamic objects | 493 | 65,147 | 14,801 | 98,651 | 49,291 |
| AM_05 | Free | High-light | 238 | 38,020 | 7136 | 47,439 | 47,583 |
Table 2. Absolute translation error on the AM_05 dataset.

| Methods | RMSE (m) |
| S-MSCKF(s) | 3.05 |
| S-MSCKF(m) | 4.21 |
Table 3. The relative error on the AM_05 dataset.

| Sub-trajectory Length (m) | Rel. Translation Error (m): S-MSCKF(s) | Rel. Translation Error (m): S-MSCKF(m) | Rel. Yaw Error (deg): S-MSCKF(s) | Rel. Yaw Error (deg): S-MSCKF(m) |
| 1 | 0.17 | 0.19 | 0.25 | 0.27 |
| 5 | 0.61 | 0.67 | 0.46 | 0.48 |
| 20 | 1.56 | 1.76 | 0.76 | 0.79 |
| 50 | 1.94 | 2.39 | 0.91 | 0.93 |
| 100 | 2.20 | 2.96 | 1.00 | 1.02 |
| 160 | 2.97 | 4.14 | 1.18 | 1.20 |
Table 4. Absolute translation error along the whole trajectory of each experimental dataset.

| Methods | AM_01 (m) | AM_02 (m) | AM_03 (m) | AM_04 (m) |
| ACK-MSCKF | 2.25 | 3.43 | 1.57 | 5.13 |
| S-MSCKF | 15.24 | 12.26 | 5.23 | 21.66 |
| OKVIS | 24.62 | 13.01 | 9.68 | 13.63 |
Table 5. The overall relative error on all of the real-world datasets.

| Sub-trajectory Length (m) | Rel. Translation Error (m): ACK-MSCKF | Rel. Translation Error (m): S-MSCKF | Rel. Translation Error (m): OKVIS | Rel. Yaw Error (deg): ACK-MSCKF | Rel. Yaw Error (deg): S-MSCKF | Rel. Yaw Error (deg): OKVIS |
| 1 | 0.25 | 0.41 | 0.46 | 0.14 | 0.15 | 0.36 |
| 5 | 0.34 | 0.75 | 0.80 | 0.21 | 0.20 | 0.41 |
| 20 | 0.97 | 2.48 | 2.65 | 0.43 | 0.39 | 0.68 |
| 50 | 1.85 | 4.99 | 5.88 | 0.59 | 0.55 | 0.87 |
| 100 | 2.99 | 7.70 | 9.99 | 0.74 | 0.78 | 1.11 |
| 160 | 4.18 | 10.41 | 12.94 | 0.80 | 1.04 | 1.33 |
