The mean centre (MC) tool in ArcGIS (Environmental Systems Research Institute, Redlands, CA, USA) was used to identify the spatio-temporal change in TB in Beijing from 2009 to 2014. The MC identifies the geographic centre of a set of points to measure the central tendency, which is calculated as follows:

$${MC}_{t}=({X}_{t},{Y}_{t}),$$

$${X}_{t}=\frac{1}{n}\sum_{j=1}^{n}{x}_{j},\tag{9}$$

$${Y}_{t}=\frac{1}{n}\sum_{j=1}^{n}{y}_{j},\tag{10}$$

where ${MC}_{t}$ denotes the coordinates of the MC in the $t$th ($t=1,2,\dots ,m$) year, $n$ is the number of points over the study area in the $t$th year, and ${x}_{j}$ and ${y}_{j}$ are the coordinates of the $j$th ($j=1,2,\dots ,n$) point in the $t$th year. The IR MCs of the same geographic area in a time series could reveal the movement of the IR central tendency. The MCs of NSP TB patients in the resident population (NSPRP), NSP TB patients in the TB patient population (NSPTBP), RSP TB patients in the resident population (RSPRP) and RSP TB patients in the TB patient population (RSPTBP) from 2010 to 2014 were calculated to identify and compare the yearly movement of the central tendency.
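As an illustration of the mean centre calculation, the following minimal Python sketch computes an unweighted mean centre for each year and collects the results as a track. It is not the ArcGIS tool itself; the function name `mean_centre` and the coordinates are hypothetical and are not data from this study.

```python
from statistics import fmean


def mean_centre(points):
    """Return (X_t, Y_t), the unweighted mean of the x and y coordinates."""
    if not points:
        raise ValueError("at least one point is required")
    xs, ys = zip(*points)
    return fmean(xs), fmean(ys)


# Hypothetical projected coordinates (km) for two years; illustrative only.
points_by_year = {
    2013: [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)],
    2014: [(1.0, 1.0), (5.0, 1.0), (5.0, 5.0), (1.0, 5.0)],
}

# One mean centre per year; read in year order, these form a track.
track = {year: mean_centre(pts) for year, pts in points_by_year.items()}
print(track)
```

The sketch assumes planar (projected) coordinates, so that an arithmetic mean of the coordinates is meaningful.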

A trajectory is a time-stamped series of spatial locations of a moving object. The central tendencies of IR, NSPRP, NSPTBP, RSPRP and RSPTBP over time can each be regarded as a type of trajectory. The Euclidean distance between tracks can be used to measure how similar the trajectory of the IR central tendency is to those of the other four categories. The Euclidean distance between tracks is based on the Euclidean distance between points. First, the distance between the points of the two tracks in the same year is calculated, and the weighted sum of these distances is then calculated. The Euclidean distance of two tracks can be calculated as follows:

$$dist(MC1,MC2)=\sum_{t=1}^{m}{p}_{t}\cdot dist(MC{1}_{t},MC{2}_{t}),$$

where $dist(MC1,MC2)$ is the Euclidean distance of two different tracks $MC1$ and $MC2$. The smaller the distance, the higher the trajectory similarity between them. Both tracks contain the same number of points, $m$.

${p}_{t}$ is the weight assigned to the point pair of the $t$th year. Because the situation is very similar in adjacent years, we assume that the IR of the current year is more strongly correlated with the IR of the previous year than with that of two years earlier, three years earlier and so on. Because we focused on the IR close to the current year, we therefore assigned a greater weight to the most recent year. For an exponential function ${a}^{k}$, the base $a$ must satisfy $a>0$ and $a\ne 1$. When $0<a<1$, the curve of the trend component is monotonically decreasing; when $a>1$, it is monotonically increasing. Therefore, we used the exponential function model ($0<a<1$) to give the point pairs different weights and unitized the results so that the weights sum to 1, which can be expressed as follows:

$${p}_{t}=\frac{{a}^{k}}{\sum_{i=0}^{m-1}{a}^{i}},$$

where $k=0,1,2,\dots ,m-1$ is counted back from the current year, so that $k=0$ corresponds to the most recent year. Because the curve has a sharper slope when $a$ is close to 0, the weights for years far from the current year would be too small if $a$ were small. Therefore, we restricted $a$ to $0.5\le a<1$. We generated 1000 random numbers between 0.5 and 1 and then performed point estimation on the sample, which gave $a=0.75$. We therefore established five gradient values for $a$ ($a=0.55$, $a=0.65$, $a=0.75$, $a=0.85$ and $a=0.95$), which are symmetric about 0.75. However, the number of gradient values is not fixed, as long as the selected values are symmetric about 0.75.
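To make the weighting scheme concrete, the sketch below computes exponential weights ${a}^{k}$ for $k=0,1,\dots ,m-1$ and divides them by their sum, for each of the five gradient values of $a$. It assumes that "unitized" means normalised to sum to 1; the function name `exponential_weights` and the choice $m=5$ are illustrative assumptions.

```python
def exponential_weights(a, m):
    """Return weights a**k / sum(a**i) for k = 0 (current year) to m - 1."""
    if not 0 < a < 1:
        raise ValueError("this sketch assumes 0 < a < 1")
    if m < 1:
        raise ValueError("m must be at least 1")
    raw = [a ** k for k in range(m)]
    total = sum(raw)
    return [w / total for w in raw]


m = 5  # hypothetical number of years in each track
for a in (0.55, 0.65, 0.75, 0.85, 0.95):
    weights = exponential_weights(a, m)
    # The first weight belongs to the current year and is the largest.
    print(a, [round(w, 3) for w in weights])
```

Values of $a$ closer to 1 give flatter weights, whereas smaller values concentrate the weight on the most recent years.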

$MC{1}_{t}$ and $MC{2}_{t}$ are the point pairs of the track $MC1$ and the track $MC2$ in the $t$th year, respectively. $dist(MC{1}_{t},MC{2}_{t})$ is the Euclidean distance between the point pairs $MC{1}_{t}$ and $MC{2}_{t}$, which can be calculated as follows:

$$dist(MC{1}_{t},MC{2}_{t})=\sqrt{{(X{1}_{t}-X{2}_{t})}^{2}+{(Y{1}_{t}-Y{2}_{t})}^{2}},$$

where $X{1}_{t}$ and $X{2}_{t}$ can be calculated using Equation (9) and $Y{1}_{t}$ and $Y{2}_{t}$ can be calculated using Equation (10). The unit of distance is the kilometre (km).
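The weighted track distance described above can be sketched in Python as follows. The sketch assumes that the distance is the weighted sum $\sum_{t}{p}_{t}\cdot dist(MC{1}_{t},MC{2}_{t})$, with normalised exponential weights that are largest for the most recent year. The function names and the example tracks are hypothetical and are not results from this study.

```python
import math


def point_distance(p, q):
    """Euclidean distance between two (X, Y) mean centres."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def track_distance(track1, track2, a=0.75):
    """Weighted Euclidean distance between two tracks of equal length.

    Both tracks are lists of (X, Y) points ordered from the earliest
    year (t = 1) to the most recent year (t = m).
    """
    if len(track1) != len(track2) or not track1:
        raise ValueError("tracks must be non-empty and of equal length")
    if not 0 < a < 1:
        raise ValueError("this sketch assumes 0 < a < 1")
    m = len(track1)
    # k = m - t, so the most recent year (t = m) has k = 0 and raw weight 1.
    raw = [a ** (m - t) for t in range(1, m + 1)]
    total = sum(raw)
    return sum(
        (w / total) * point_distance(p, q)
        for w, p, q in zip(raw, track1, track2)
    )


# Hypothetical mean-centre tracks in km, earliest year first.
track_ir = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
track_other = [(0.0, 3.0), (1.0, 4.0), (2.0, 5.0)]

for a in (0.55, 0.65, 0.75, 0.85, 0.95):
    print(a, round(track_distance(track_ir, track_other, a), 3))
```

Because the weights sum to 1, the result is a weighted average of the yearly point distances and keeps the unit of the input coordinates (km here).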