# An Interpolation and Prediction Algorithm for XCO_{2} Based on Multi-Source Time Series Data


## Abstract


_{2} dataset at a resolution of 0.25°, and achieve a validated determination coefficient of 0.92. Secondly, introducing technologies such as Temporal Convolutional Networks (TCN), the Channel Attention Mechanism (CAM), and Long Short-Term Memory networks (LSTM), we conduct atmospheric CO_{2} concentration interpolation and prediction. When conducting predictive analysis for the Yangtze River Delta region, we train the model using quarterly data from 2016 to 2020; the correlation coefficient is 0.94 in summer and 0.91 in winter. These experimental results indicate that the proposed algorithm performs significantly better than the other algorithms compared.

## 1. Introduction

Carbon dioxide (CO_{2}) is one of the most significant greenhouse gases in the atmosphere, constituting 0.04% of the total atmospheric composition [1]. Due to human activities, its concentration has risen from 280 ppm before the Industrial Revolution to the current level of 414 ppm. This increase, coupled with other greenhouse gas emissions, has resulted in a global average temperature rise of approximately 1.09 °C over the past century, causing irreversible damage to ecosystems [2]. The United Nations Framework Convention on Climate Change and the Paris Agreement aim to control and reduce atmospheric CO_{2} concentration [3], making climate change an integral part of the United Nations’ Sustainable Development Goals with profound implications for global health and sustainable development [4]. As an important technical step, the accurate prediction of atmospheric CO_{2} concentration is crucial for formulating emission reduction plans to achieve the “net-zero” target by 2050, aligning with both international and national emission reduction goals [5]. This study aims to establish an impartial carbon emission monitoring system by utilizing environmental variables, with the goal of providing reference and support for managing future carbon emissions from anthropogenic economic activity.

_{2} concentration observations provide long-term, high-precision data but are sparsely distributed with limited spatial coverage. In contrast, satellite observations overcome the limitations of ground stations by covering extensive spatial ranges [7].

_{2} concentrations [8]. These satellite instruments utilize near-infrared solar radiation reflected from the Earth’s surface in the CO_{2} spectral bands and the O_{2} A band to generate XCO_{2}, aiming to enhance estimates of the spatial distribution of carbon sources and sinks [9]. Despite the numerous advantages of using carbon satellites for monitoring CO_{2} concentrations, two challenges are unavoidable.

- Due to insufficient satellite data coverage, the acquisition of long-term time series data is limited, which makes the accurate prediction of future CO_{2} concentrations more challenging.

_{2} has a relatively low coverage. This low coverage of XCO_{2} concentration data adversely impacts the accurate estimation of carbon sources and sinks [13]. Therefore, filling the gaps in XCO_{2} data is crucial for subsequent predictions.

_{2} (carbon dioxide column-averaged dry air mole fraction) data [14], including spatial interpolation [15], multisensor fusion [16], and modeling based on machine learning [8].

_{2} data, followed by the reconstruction of CO_{2} concentrations in regional or global atmospheres. For instance, Siabi et al. [8] employed a multilayer perceptron model to construct a nonlinear correspondence between OCO-2 satellite XCO_{2} data and multiple data sources, effectively filling gaps in satellite observations. He et al. [17] combined elevation, meteorological conditions, and CarbonTracker XCO_{2} data and employed LightGBM to achieve comprehensive XCO_{2} data coverage for China. Using extreme random forest and random forest models, Li et al. [18] and Wang et al. [19] generated continuous spatiotemporal atmospheric CO_{2} concentration data at both global and regional scales. However, most studies are limited to constructing datasets, without delving into the subsequent prediction of CO_{2} concentration changes.

_{2} column concentrations. For example, Zheng et al. [20] utilized the GOSAT dataset and applied autoregressive integrated moving average (ARIMA) models and long short-term memory (LSTM) neural network models to predict the trend of CO_{2} concentration changes in the near-surface region of China. However, this experiment did not consider meteorological and vegetation factors related to CO_{2}, and the data resolution was relatively low, which made regional prediction difficult and left the prediction accuracy less than satisfactory. Meng et al. [21] employed a heterogeneous spatiotemporal dataset obtained from OCO-2, GOSAT, and self-built wireless carbon sensors, attempting to use the LSTM model for prediction. However, they tested only one location, lacking a more comprehensive validation. Li et al. [22] selected OCO-2 satellite spectral data from 2019 and used five machine learning models, considering various meteorological, surface, and vegetation factors for estimation. However, they did not adequately account for regional seasonal variations in CO_{2} and long-term trends. Moreover, there is currently no publicly available dataset.

_{2} column concentrations. The primary advantage of deep learning methods lies in their powerful ability to automatically learn advanced features from extensive datasets, a crucial step in bridging the gap between data patterns at different feature levels. Given the outstanding feature extraction performance of deep learning neural networks, they hold significant potential in fusing multisource data to extract crucial spatial information [23,24].

_{2} data gaps, enhance the spatiotemporal resolution of the data, and use a deep learning neural network for prediction, with a specific focus on estimating medium- to long-term, fully covered, daily-scale CO_{2} data. Given the advantages of Temporal Convolutional Networks (TCN), the Channel Attention Mechanism (CAM), and Long Short-Term Memory networks (LSTM), this paper combines them to address the interpolation and prediction of XCO_{2} data. The contributions of this study are as follows:

- Ground semantic information has been incorporated to augment the existing multisource data, enhancing the predictive capabilities of the model.
- A daily dataset of seamless XCO_{2} in the Yangtze River Delta region with a spatial resolution of 0.25°, derived from the fusion of multisource data spanning from 2016 to 2020, has been established.
- The adoption of the TCN-Attention module has improved the quality and efficiency of feature aggregation, enabling the better capture of both local and global spatial features.
- Leveraging the LSTM structure, long-term trends in multisource spatiotemporal data are effectively modeled, facilitating the integration of features across multiple time steps.

_{2}data from satellite observations, and auxiliary data, and details the data processing and analysis procedures. This section delves into the prediction methodology, encompassing the deep learning approach and the model’s schematic diagram. Section 3 encompasses the model evaluation, along with a detailed discussion of the spatiotemporal distribution. Section 4 gives a conclusion and future prospects. Figure 1 provides an overview of the whole workflow.

## 2. Materials and Methods

#### 2.1. Study Area

_{2} (https://earthdata.nasa.gov/, accessed on 10 May 2024) growth from 2016 to 2020. The red line depicts the CO_{2} concentration dynamics, showing lower levels in summer, higher levels in winter, and an overall upward trend. The data are sourced from the MOD13C2 product, available for download from https://modis.gsfc.nasa.gov/ (accessed on 10 May 2024).

_{2} in the YRD region [26]. This prediction serves as a scientific basis for regional ecological environment quality monitoring, environmental health assessment, and decision-making management to achieve pollution reduction, carbon reduction, and coordinated efficiency enhancement. In the bottom right corner, the trend chart illustrates the CO_{2} concentration in the YRD region from 2016 to 2020. It reveals a seasonal cyclic variation in CO_{2}, with concentrations continuously increasing.

#### 2.2. Multisource Data

_{2} from multiple sources, covering seven main categories: OCO-2 XCO_{2}, CAMS XCO_{2}, vegetation data, the Fifth Generation European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis (ERA5) meteorological variables, land cover data, elevation data, and Total Carbon Column Observing Network (TCCON) station XCO_{2} measurement data.

#### 2.2.1. OCO-2 XCO_{2} Data

_{2} column concentration data used in this study are sourced from the OCO-2 satellite product (OCO2_L2_Lite_FP). OCO-2, launched by NASA in July 2014, is NASA’s first dedicated carbon observation satellite designed for measuring XCO_{2} and monitoring near-surface carbon sources and sinks. The satellite observes the Earth around 13:30 local time, with a spatial resolution of 2.25 km × 1.29 km (∼0.02°) and a revisit cycle of 16 days [27]. In comparison to other CO_{2} observation satellites, OCO-2 satellite data offer superior spatial resolution and monitoring accuracy [12]. The XCO_{2} data utilized in this study cover the period from 1 January 2018 to 31 December 2020. Figure 3 depicts the mean OCO-2 XCO_{2} values for the year 2016 in the Chinese region.

#### 2.2.2. CAMS XCO_{2} Data

_{2} and CH_{4}, currently spans from 2003 to 2020, with temporal and spatial resolutions of 3 h and 0.75°, respectively. OCO-2 data are not assimilated in the generation of CAMS XCO_{2}. Therefore, the fusion of CAMS XCO_{2} with OCO-2 XCO_{2} data holds the potential to integrate advantages from multiple data sources [29]. Verification indicates that CAMS XCO_{2} data demonstrate potential and feasibility for atmospheric CO_{2} analysis. CAMS XCO_{2} data are generated by the Integrated Forecasting System (IFS) model and the 4DVar data assimilation system at ECMWF; this study uses the “Column-Averaged Mole Fraction of CO_{2}” variable obtained from the Atmosphere Data Store.

#### 2.2.3. Vegetation Data

_{2}concentration [30]. Therefore, in the reconstruction process, NDVI is employed as one of the auxiliary predictive factors. The MODIS instrument (https://modis.gsfc.nasa.gov/, accessed on 10 May 2024), a crucial tool on the Terra and Aqua satellites, is widely utilized for vegetation growth monitoring due to its large observation coverage (approximately 2330 km) and high data quality. Thus, monthly MOD13C2 products were obtained at a resolution of 0.05° for this study [31,32].

#### 2.2.4. Meteorological Data

_{2} concentration. Meteorological factors have a significant impact on the temporal and spatial variations of CO_{2} concentration; the key factors include wind speed, temperature, and humidity [33,34]. ERA5, the fifth-generation ECMWF global climate and weather reanalysis dataset, is distributed on a regular grid with a spatial resolution of 0.25° × 0.25° and a temporal resolution of 1 h. ERA5 incorporates more historical observational data, particularly satellite data, into advanced data assimilation and modeling systems to estimate more accurate atmospheric conditions. In this context, wind speed (wspd) and wind direction (wdir) are calculated from the U-component (UW, m/s) and V-component (VW, m/s) of wind velocity. Additionally, temperature (TEM, K) and relative humidity (RH, %) are introduced for modeling CO_{2} concentration estimation. All meteorological data used here are from the time interval between 13:00 and 14:00, which matches the satellite overpass time [35].
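The conversion from the U and V components to wind speed and direction can be sketched as follows. This is a minimal illustration, assuming the usual meteorological convention (direction is where the wind blows from, measured clockwise from north); the function name `wind_speed_direction` is hypothetical and the exact formula used in the study is not reproduced in the text.

```python
import math


def wind_speed_direction(u, v):
    """Return (speed in m/s, direction in degrees) from u (eastward) and v (northward).

    Assumption: direction follows the meteorological convention, i.e. the
    bearing the wind blows FROM, clockwise from north (0 = northerly wind).
    """
    speed = math.hypot(u, v)
    direction = (270.0 - math.degrees(math.atan2(v, u))) % 360.0
    return speed, direction


# Example: a wind blowing toward the east (u > 0, v = 0) is a westerly wind.
speed, direction = wind_speed_direction(5.0, 0.0)
print(speed, direction)
```

With this convention a pure westerly wind has a direction of 270°, and a wind with components (3, 4) m/s has a speed of 5 m/s.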

#### 2.2.5. Elevation Data

#### 2.2.6. Land Cover Data

_{2}absorption and emission. Therefore, this study incorporates the Chinese Land Cover Dataset (CLCD). Created by the team at Wuhan University, this dataset, based on Landsat imagery, characterizes land use and land cover across various regions in China. It typically includes multiple categories such as forests, grasslands, water bodies, wetlands, and farmland. The spatial resolution of the CLCD used in this study is 30 m.

#### 2.2.7. TCCON XCO_{2} Data

_{2} detection, TCCON station data are widely utilized for validating satellite-derived CO_{2} products [36,37]. Hence, in this study, TCCON data are utilized as ground-based in situ CO_{2} data to assess the reconstruction performance. The research region includes a ground monitoring station, the Hefei station (located at 117.17°E, 31.9°N), with data collection spanning from January 2016 to December 2020 [38].

#### 2.2.8. Data Preprocessing

_{2} data based on quality flags, eliminating pixels with poor quality (where xco2_quality_flag values of 0 and 1 represent good and poor quality, respectively) [40]. Subsequently, the CAMS fields at 13:00 are selected as the daily CAMS data, and the ERA5 meteorological values averaged over the different pressure levels are used [19].
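The quality screening described above can be sketched as a simple filter. This is an illustrative example only: the record layout and the sample values are hypothetical, and only the meaning of the flag (0 = good, 1 = poor) is taken from the text.

```python
def keep_good_soundings(soundings):
    """Keep soundings whose quality flag is 0 (good); drop those flagged 1 (poor)."""
    return [s for s in soundings if s["xco2_quality_flag"] == 0]


# Hypothetical soundings; the values are made up for illustration.
soundings = [
    {"lat": 31.2, "lon": 121.5, "xco2": 412.3, "xco2_quality_flag": 0},
    {"lat": 30.9, "lon": 120.8, "xco2": 398.1, "xco2_quality_flag": 1},
    {"lat": 32.0, "lon": 118.8, "xco2": 413.0, "xco2_quality_flag": 0},
]

good = keep_good_soundings(soundings)
print(len(good), "of", len(soundings), "soundings kept")
```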

_{2} results. Although CO_{2} observation stations in the YRD region are limited, the TCCON Hefei station’s data cover the period from 2015 to 2020, along with some climate background station observations of near-surface CO_{2} concentrations [41,42]. Comparing the XCO_{2} results from the Hefei station with the reconstructed XCO_{2} model data, the average deviation was approximately 0.4 ppm, the standard deviation (SD) was about 0.75 ppm, and the root mean square error (RMSE) was around 1.01 ppm. As shown in Table 2, Li et al. estimated an RMSE of 1.71 ppm for XCO_{2} from 2015 to 2020 compared to ground-based TCCON data [43]. Zhang et al. validated XCO_{2} from the Hefei TCCON site against ML results, showing an average deviation of −0.60 ppm, an SD of 0.99 ppm, and an RMSE of 1.18 ppm [44]. He et al. validated XCO_{2} results generated by random forest against ground-based data, with an RMSE of 1.123 ppm. These results are consistent with our analysis, further supporting the reliability and validity of our findings [45]. The error of the validation results is depicted in Figure 4. The x-axis shows time, the left y-axis (XCO_{2}/ppm) represents the concentration of XCO_{2}, and the right y-axis (bias) shows the difference between the actual station data and the reconstructed data. Clearly, the reconstructed results at the Hefei station closely align with the TCCON observations, indicating the good performance of the model data in simulating XCO_{2}. Therefore, this dataset is named Yangtze River Delta XCO_{2} (YRD_XCO_{2}) and serves as the research dataset in this paper.
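The three validation statistics quoted above (average deviation, SD, and RMSE) can be computed as in the following sketch. It assumes that the deviation is defined as reconstructed minus station value and that SD is the sample standard deviation of those deviations; the text does not state either convention, and the function name and sample numbers are hypothetical.

```python
import math


def validation_stats(station, reconstructed):
    """Return (mean bias, SD of the differences, RMSE) in the units of the inputs.

    Assumption: difference = reconstructed - station, SD uses n - 1.
    """
    diffs = [m - s for s, m in zip(station, reconstructed)]
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    return bias, sd, rmse


# Made-up values in ppm, for illustration only.
bias, sd, rmse = validation_stats([410.0, 411.0, 412.0], [411.0, 411.0, 413.0])
print(round(bias, 3), round(sd, 3), round(rmse, 3))
```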

_{2} concentration requires a parameterized model, with each parameter or variable having different scales in the dataset. To prevent parameters with large value ranges from exerting excessive influence, feature normalization is performed to bring all features onto a common scale. This normalization eliminates the influence of absolute values across different units, enabling fair comparisons among indicators. The min-max normalization method is employed, transforming the original data of each feature into the range $[0,1]$.
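Min-max normalization maps each feature value x to (x − min) / (max − min). A minimal sketch for one feature column follows; the function name is hypothetical, and mapping a constant column to zeros is an assumption made here to avoid division by zero.

```python
def min_max_normalize(values):
    """Scale one feature column into [0, 1] using its own minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature carries no information, map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Example: temperatures in kelvin.
print(min_max_normalize([280.0, 290.0, 300.0]))  # [0.0, 0.5, 1.0]
```

In practice the minimum and maximum would be taken from the training data and reused for the test data, so that no information leaks from the test period.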

#### 2.3. Data Analysis

#### 2.3.1. Seasonal Analysis

_{2} concentration is influenced by seasonal variations. Figure 5a shows the original satellite data, which are decomposed using Seasonal-Trend decomposition using LOESS (STL), where LOESS denotes locally estimated scatterplot smoothing [46,47]. This method decomposes the original time series into a secular trend (Figure 5b), a seasonal variation (Figure 5c), and a residual term (Figure 5d).

_{2} concentrations exhibit clear periodic patterns and strong autocorrelation in Figure 5, confirming that CO_{2} concentration data collected at past time points can be used for subsequent predictions. This supports the rationale for using time series to construct the Temporal Convolutional Network (TCN) model in the study. However, on 11 December 2018, the carbon dioxide concentration was approximately 419 ppmv, 8.69 ppmv higher than the expected value of approximately 409 ppmv, which produces a large residual. This abnormally high concentration was possibly influenced by non-periodic meteorological factors, which may be related to extreme weather conditions, and poses a challenge for accurate predictions of YRD_XCO_{2} concentrations [48,49]. Considering that CO_{2} concentration is influenced by various factors, including time, weather, vegetation, elevation, and semantic information, we selected time information, meteorological input parameters, vegetation parameters, elevation information, and semantic data as important variables for model prediction. Additionally, CAMS XCO_{2} reanalysis data were used as an auxiliary input variable to improve the spatiotemporal resolution of satellite XCO_{2} data.

_{2} concentrations in the YRD region from 2016 to 2020, showing a yearly increase in the mean CO_{2} concentration. The annual average growth of YRD_XCO_{2} falls within the range of 2.8 ± 0.8 ppm/yr. Differences in CO_{2} concentrations are observed among the four seasons, with noticeably higher average concentrations in spring and winter compared to summer and autumn. Specifically, variations in CO_{2} concentrations are observed in April (spring) and September (autumn), with fluctuations occurring mainly between spring and summer and from the arrival of winter to the next spring. Changes between summer and autumn are relatively small. According to the study by Falahatkar et al. [50], the rise in spring temperatures and vegetation recovery accelerate soil microbial activity, leading to increased CO_{2} release. Simultaneously, the combustion of fossil fuels during winter releases a substantial amount of CO_{2}, contributing to the rise in atmospheric CO_{2} concentrations during spring. Subsequently, enhanced vegetation growth and photosynthesis during spring and summer gradually reduce CO_{2} concentrations. In autumn and winter, when vegetation growth ceases and photosynthesis weakens, coupled with fossil fuel combustion for heating during winter, CO_{2} concentrations gradually increase. The following section explores the statistical relationships between the variables, as depicted in Figure 6.

#### 2.3.2. Statistical Relationship between Variables

_{2} denotes column-averaged carbon dioxide data, CAMS refers to reanalysis data, r represents relative humidity, ndvi indicates vegetation coverage, t denotes temperature, u signifies the eastward (zonal) wind component, v stands for the northward (meridional) wind component, classid represents surface semantic data, and dem refers to elevation data. In Figure 7, the statistical relationships and importance between variables are illustrated, with correlation coefficients (r) used to indicate their correlations. The formula is shown below.
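The correlation formula itself is not reproduced in the text above. A minimal sketch follows, assuming that the coefficient in question is the Pearson correlation coefficient, i.e. the covariance of the two variables divided by the product of their standard deviations; the function name is hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Perfectly linear increasing and decreasing relationships.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))
```

The coefficient ranges from −1 to 1; the sign gives the direction of the linear relationship and the magnitude its strength.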

_{2} and CAMS_XCO_{2} is 0.71, while it shows a negative correlation with vegetation data (r = −0.20). Regarding emissions, there is a significant correlation with meteorological factors, such as a negative correlation with temperature (r = −0.52) and an r value of −0.28 with sea-level pressure. The correlation with elevation and ground semantic information is weaker, with negative correlations of −0.04 and −0.02, respectively. The interrelation among these variables is intricate. Although some correlations are not pronounced, the model constructed in this study is capable of extracting valuable information from these complex relationships. Therefore, meteorological input parameters, vegetation parameters, elevation information, and semantic data were chosen as auxiliary data for training in this study.

#### 2.4. Prediction Models

_{2} concentration prediction model based on feature fusion. By incorporating a Temporal Convolutional Network (TCN), the model is able to effectively extract mid-term and periodic variations in the CO_{2} concentration sequence. The Channel Attention Mechanism (CAM) aids in learning the relationships between different features, and the Long Short-Term Memory (LSTM) network is employed to capture the long-term dependencies in the time series. The research objective is to comprehensively predict the variation trends of CO_{2} concentration in the Yangtze River Delta across different seasons in 2020. To thoroughly assess the model’s performance, this study employs evaluation metrics for time series regression models and analyzes the model’s performance on the test data in depth.

_{2} concentration is a time series forecasting problem with non-linear features. Factors influencing YRD_XCO_{2} concentration include meteorological conditions, vegetation, and semantic information from the ground. In this study, the TCN and CAM modules are fused and combined with the LSTM model to construct a CATCN-LSTM model that predicts atmospheric CO_{2} concentration with non-linear features from multi-source data. The structure of CATCN-LSTM is illustrated in Figure 8, and the primary process for predicting YRD_XCO_{2} concentration is described as follows.

#### 2.4.1. TCN Module

_{2}concentration. TCN is a simple and versatile convolutional neural network architecture designed for addressing time-series problems, primarily composed of multiple stacked residual units [51]. The residual module comprises two convolutional units and a non-linear mapping unit. TCN exhibits several advantages in time-series prediction tasks: (a) it can address the issues of gradient vanishing and exploding; (b) it can compute convolutions in parallel, thereby accelerating training speed; (c) TCN possesses a highly effective historical length, making it capable of capturing temporal correlations for discontinuous and widely spaced historical time series data.

**Causal Convolution**

The causal convolution imposes a strict temporal constraint on the TCN module with respect to the input XCO_{2} sequence ${x}_{0}$, ${x}_{1}$, …, ${x}_{t-1}$, ${x}_{t}$, …. The output ${y}_{t}$ at time t depends only on the inputs up to and including time t. As illustrated in Figure 8b, its mathematical representation is as follows:

$${y}_{t}=f({x}_{0},{x}_{1},\cdots ,{x}_{t}).$$

Here, ${x}_{t}$ is a one-dimensional vector containing n features, and ${y}_{t}$ is the variable to be predicted. The relationship between the inputs and ${y}_{t}$ is denoted by the function f. To ensure that the output tensor and input tensor have the same length, zero-padding is applied on the left side of the input tensor. Causal convolution is a unidirectional structure: the value at time t is computed using only data at or before time t, which preserves the temporal order of the data. However, obtaining longer and more complete historical information requires a deeper network, and as the depth increases, issues such as gradient vanishing, computational complexity, and poor fitting may arise. Therefore, dilated convolution is introduced.

**Dilated Convolution**

Dilated convolution allows the receptive field to grow exponentially without increasing the number of parameters or the model complexity. The network structure of dilated convolution is shown in Figure 8c. Unlike traditional convolutional neural networks (CNN), dilated convolution samples its input at intervals controlled by the dilation factor d. In the bottom layer, d = 1 means that the input is sampled at every time point, and in the hidden layers, d = 2 means that the input is sampled at every second time point.

For a one-dimensional XCO_{2} concentration sequence X = (${x}_{0}$, ${x}_{1}$, …, ${x}_{t-1}$, ${x}_{t}$), the dilated convolution $F(s)$ with a filter f on 0, …, k − 1 and the resulting receptive field w are given as follows:

$$F(s)=\sum _{i=0}^{k-1}f(i)\,{x}_{s-d\cdot i},$$

$$w=1+(k-1)\cdot \frac{{b}^{n}-1}{b-1}.$$

Here, k is the filter size, n is the number of layers, and b is the base of the dilation (dilation factor d = ${b}^{i}$ for layer i = 0, 1, …, n − 1). For example, when the filter size is 3 and the dilation factors are [1, 2, 4], the receptive field is w = 1 + 2 × 7 = 15, so the output ${y}_{t}$ at time t is determined by up to 15 consecutive inputs ending at ${x}_{t}$; for sequences no longer than the receptive field, it covers all values in the input sequence.

**Residual block**

The residual structure of TCN is illustrated in Figure 8a. The output of different layers is added to the input data, forming a residual block. After passing through an activation function, the output is obtained. The residual block connection mechanism enhances the network’s feedback and convergence, and helps avoid issues like gradient vanishing and exploding commonly found in traditional neural networks. Each residual unit consists of two one-dimensional dilated causal convolutional layers and a non-linear mapping. Initially, the input data ${h}_{t-1}$ undergoes a one-dimensional dilated causal convolution, followed by weight normalization to address gradient explosion and accelerate network training. Subsequently, a ReLU activation function is applied for the non-linear operation. Dropout is added after each dilated convolution to prevent overfitting. Additionally, a 1 × 1 convolution is introduced to return to the original number of channels. Finally, the obtained result is summed with the input to generate the output vector:

$${f}_{i}=conv({w}_{i}\times {F}_{j}+{b}_{i}),$$

$${h}_{t}=\{{f}_{0},{f}_{1},\cdots ,{f}_{t-1},{f}_{t}\}.$$

Here, ${f}_{i}$ represents the feature vector obtained through convolution at time i, ${w}_{i}$ denotes the weights of the convolution calculation at time i, ${F}_{j}$ represents the convolutional kernel of the j-th layer, and ${b}_{i}$ is the bias vector. Weight normalization expresses a weight vector w as $w=g\frac{v}{\|v\|}$, where g represents the magnitude of the weight w and $\frac{v}{\|v\|}$ indicates the unit vector in the same direction as w. The activation is Relu(x) = max(0, x). ${h}_{t}$ represents the feature map obtained after the complete convolution of the j-th layer.
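The dilated causal convolution and the receptive field formula above can be sketched in a few lines. This is an illustrative single-channel version, not the implementation used in the study: the function names are hypothetical, and the stack is assumed to use one convolution per layer with dilations 1, b, …, b^(n−1).

```python
def dilated_causal_conv(x, weights, dilation):
    """F(s) = sum_i weights[i] * x[s - dilation * i], with implicit left zero-padding.

    The output has the same length as x, and out[s] never uses x[j] for j > s.
    """
    out = []
    for s in range(len(x)):
        total = 0.0
        for i, w in enumerate(weights):
            idx = s - dilation * i
            if idx >= 0:  # indices before the start act as zero padding
                total += w * x[idx]
        out.append(total)
    return out


def receptive_field(k, b, n):
    """w = 1 + (k - 1) * (b**n - 1) / (b - 1) for n layers with dilations b**0 .. b**(n-1)."""
    if b == 1:
        return 1 + (k - 1) * n
    return 1 + (k - 1) * (b ** n - 1) // (b - 1)


print(dilated_causal_conv([1, 2, 3, 4], [1, 1], dilation=2))  # [1.0, 2.0, 4.0, 6.0]
print(receptive_field(k=3, b=2, n=3))  # 15
```

Stacking layers with dilations 1, 2, and 4 and a filter size of 3 therefore lets each output see 15 time steps, while each layer still has only 3 weights per filter.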

#### 2.4.2. TCN-CAM Module

_{2} time series, calculate their attention scores, and further capture temporal relationships, this study designs a channel attention module suitable for TCN. The attention mechanism weights and sums the feature vectors input to the TCN network, as shown in Figure 8c. Two pooling layers, global average pooling and global maximum pooling, are used to obtain the importance of these features. The input is the hidden layer output vector $h_{t}$ (with shape N × C × T) from the TCN layer, where C is the number of features or channels, T is the time sequence length, and N is the number of samples. After passing through the two global pooling layers, a channel feature of size C × 1 × 1 is obtained. Then, channel dimension reduction is performed through a 1 × 1 convolutional layer. This process is expressed as

$h_{t}$ is subjected to channel attention, resulting in the new feature $y_{i}$. Subsequently, the weighted new feature $y_{i}$ is input into an LSTM module for further predictions.
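Because the attention formula is not reproduced in the text, the following sketch only illustrates the general idea for a single sample of shape C × T. It assumes a common design in which the average-pooled and max-pooled channel vectors pass through the same reduce-and-expand pair of 1 × 1 convolutions (written here as plain matrix products), are summed, and are squashed by a sigmoid; the function name, the weight matrices `w_reduce` and `w_expand`, and the sample values are all hypothetical.

```python
import math


def channel_attention(h, w_reduce, w_expand):
    """Return (channel weights, reweighted feature map) for one C x T feature map.

    w_reduce has shape (C_reduced x C) and w_expand has shape (C x C_reduced).
    """
    avg = [sum(ch) / len(ch) for ch in h]  # global average pooling over time
    mx = [max(ch) for ch in h]             # global max pooling over time

    def shared_layers(vec):
        hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w_reduce]
        return [sum(w * v for w, v in zip(row, hidden)) for row in w_expand]

    scores = [a + m for a, m in zip(shared_layers(avg), shared_layers(mx))]
    weights = [1.0 / (1.0 + math.exp(-s)) for s in scores]  # sigmoid, in (0, 1)
    y = [[wgt * v for v in ch] for wgt, ch in zip(weights, h)]
    return weights, y


# Two channels, three time steps; made-up numbers.
weights, y = channel_attention(
    [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]],
    w_reduce=[[0.5, 0.5]],
    w_expand=[[1.0], [-1.0]],
)
print(weights)
```

Each channel of the feature map is multiplied by a single weight between 0 and 1, so informative channels are passed on to the LSTM with more emphasis than uninformative ones.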

#### 2.4.3. LSTM Module

_{t}. The current cell state is determined by

Here, $f_{t}$, $i_{t}$, and $o_{t}$ represent the forget gate, input gate, and output gate, respectively; $\sigma $ and tanh denote the sigmoid function and hyperbolic tangent function; $w_{f}$, $w_{i}$, $w_{o}$, and $w_{c}$ are the weight matrices of the LSTM model; ${h}_{t-1}$ is the state information passed from the previous time step; $b_{f}$, $b_{i}$, $b_{o}$, and $b_{c}$ are the bias matrices of the LSTM; $\tilde{{c}_{t}}$ represents the candidate memory cell; $c_{t}$ denotes the current cell state. The symbol ⊙ represents element-wise multiplication of two matrices.
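The gate equations are not reproduced in the text above. The sketch below assumes the standard LSTM formulation, which matches the symbols just defined, and uses a scalar state so that every step is visible; the function names and the layout of `W` and `b` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, b):
    """One step of a standard LSTM with a scalar input and a scalar state.

    W[g] = (weight on h_prev, weight on x_t) and b[g] is the bias,
    for g in "f" (forget), "i" (input), "o" (output), "c" (candidate).
    """
    def pre(g):
        w_h, w_x = W[g]
        return w_h * h_prev + w_x * x_t + b[g]

    f_t = sigmoid(pre("f"))               # forget gate
    i_t = sigmoid(pre("i"))               # input gate
    o_t = sigmoid(pre("o"))               # output gate
    c_tilde = math.tanh(pre("c"))         # candidate memory cell
    c_t = f_t * c_prev + i_t * c_tilde    # current cell state
    h_t = o_t * math.tanh(c_t)            # hidden state passed to the next step
    return h_t, c_t


W = {g: (0.0, 0.0) for g in "fioc"}
b = {g: 0.0 for g in "fioc"}
print(lstm_step(x_t=1.0, h_prev=0.0, c_prev=2.0, W=W, b=b))
```

With all weights set to zero every gate equals 0.5, so the cell state is simply halved; trained weights let the gates decide per time step how much history to keep.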

_{2} in the next time step. The output vector learned by the LSTM layer is then fed into a fully connected network. Through iterative training, the final estimate of YRD_XCO_{2} is obtained.

#### 2.5. Model Evaluation Metrics

## 3. Results and Discussion

#### 3.1. Experimental Environment

_{2}concentration values for the YRD region in 2020.

_{2} concentration prediction model based on the multi-input CATCN-LSTM architecture is developed. The prediction results are illustrated in Figure 10, demonstrating a strong positive correlation between the predicted values and the observed values with a fitting degree of 93%. The left side of Figure 10 shows that the model performs well in predicting regions with high amplitude and frequency. Some outliers can be attributed to extreme weather conditions or industrial incidents, such as during the COVID-19 pandemic, when certain regions experienced abnormal fluctuations in CO_{2} concentration due to lockdowns and reduced economic activities. Hence, the occurrence of these outliers may be closely linked to environmental factors and human activities.

#### 3.2. Sensitivity Analysis

_{2}data for January 2020 to December 2020. In order to validate the effectiveness of the proposed model, sensitivity analysis, involving five different combinations, is conducted:

- LSTM, denoted as Model 1;
- TCN, denoted as Model 2;
- TCN-CAM, denoted as Model 3;
- TCN-LSTM, denoted as Model 4;
- CATCN-LSTM, representing the integrated model proposed in this paper.

#### 3.3. Comparison of CATCN-LSTM with Other Models

_{2}concentration, the training data are divided into seasons: spring (March–May), summer (June–August), autumn (September–November), and winter (December–February). Each data subset is used for model training. Since the test set data for winter only extended until December 2020, predictions are made solely for this month.
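The seasonal split described above can be sketched as a lookup from calendar month to season. The month ranges are taken from the text; the function name and the sample dates are hypothetical.

```python
from datetime import date


def season_of(month):
    """Map a calendar month (1-12) to the season definition used in the text."""
    if month in (3, 4, 5):
        return "spring"
    if month in (6, 7, 8):
        return "summer"
    if month in (9, 10, 11):
        return "autumn"
    if month in (12, 1, 2):
        return "winter"
    raise ValueError(f"month must be in 1..12, got {month}")


# Group daily records by season before training one model per subset.
days = [date(2020, 1, 15), date(2020, 4, 2), date(2020, 7, 30), date(2020, 12, 1)]
subsets = {}
for d in days:
    subsets.setdefault(season_of(d.month), []).append(d)
print(sorted(subsets))
```

Note that winter spans the turn of the year, so the winter subset combines December with the following January and February.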

_{2}concentration and the predicted values of each model. CATCN-LSTM consistently provides more accurate predictions across the entire forecast range compared to the other models. XGBOOST and SVR exhibited relatively weaker performances, while RNN and CNN-LSTM showed noticeable lags. The model achieves the best prediction performance during the summer season, considering that the Yangtze River Delta region experiences a subtropical monsoon climate during this period, typically characterized by higher temperatures. This season sees significant impacts on ecosystem activities and processes like plant photosynthesis, resulting in notable fluctuations in atmospheric CO

_{2}concentration. The model’s ability to accurately capture these seasonal variations contributes to its precision in predictions. In contrast, winter temperatures are generally lower, and the region experiences significant temperature fluctuations due to the convergence of cold and warm air masses. This can lead to phenomena such as snowfall, human activities related to heating facilities, and complex factors like emissions and energy consumption, introducing more noise and resulting in comparatively poorer model performance during this season.

_{2} and the impact of extreme weather. Firstly, the robustness, memory capacity, nonlinear mapping ability, and self-learning capability of TCN make it more effective than other models in predicting CO_{2} concentration and capturing global information. Secondly, despite the influence of periodic patterns and weather conditions on CO_{2}, the residual blocks of the TCN model add the input to the output of the convolutional layer, which aids gradient propagation and model training and enables better capture of local and short-term dependencies in the sequence. With the addition of the attention mechanism, the model can strengthen its focus on the most informative features. Lastly, LSTM handles the long-term dependencies of the entire sequence, further improving the accuracy of the final predictions. The method is therefore practical for capturing the nonlinearity of atmospheric chemistry and physics. It can estimate CO_{2} concentration trends for each season, providing essential data support for understanding and addressing climate change and environmental issues and contributing to the realization of carbon neutrality goals. Table 5 presents a comparison of the prediction errors between the proposed method and other typical machine learning methods.
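The residual and attention mechanisms described above can be illustrated with a minimal sketch. It is not the trained CATCN-LSTM: the function names, the kernel weights, and the parameter-free sigmoid gate are assumptions chosen for illustration, and the LSTM stage is omitted. It only shows how a dilated causal convolution uses past samples, how a residual block adds the input back to the convolution output, and how a channel gate rescales whole feature channels.

```python
import math


def causal_dilated_conv(x, weights, dilation):
    """1-D causal convolution over a list of floats.

    weights[j] multiplies x[t - j * dilation], so output[t] never depends
    on future samples. Positions before the series start count as zero.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = t - j * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def residual_block(x, weights, dilation):
    """ReLU(conv(x)) + x: the skip connection adds the input to the output."""
    conv = causal_dilated_conv(x, weights, dilation)
    return [max(0.0, c) + xi for c, xi in zip(conv, x)]


def channel_attention(channels):
    """Simplified channel gate: sigmoid of each channel's mean (no learned layers).

    Returns the rescaled channels and the gate value used for each channel.
    """
    means = [sum(ch) / len(ch) for ch in channels]
    gates = [1.0 / (1.0 + math.exp(-m)) for m in means]
    scaled = [[g * v for v in ch] for g, ch in zip(gates, channels)]
    return scaled, gates


# Hypothetical toy series and kernel, not values from the study.
series = [1.0, 2.0, 3.0, 4.0, 5.0]
print(residual_block(series, weights=[0.5, 0.5], dilation=2))
# [1.5, 3.0, 5.0, 7.0, 9.0]
```

A practical channel attention module would normally learn its gate through small fully connected layers; the mean-based gate here only makes the rescaling step visible.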

_{2} concentrations for each season in 2020 were 415.11 ppm, 413.05 ppm, 413.18 ppm, and 414.71 ppm, with errors relative to the true values of 0.20 ppm, 0.13 ppm, 0.14 ppm, and 0.21 ppm, respectively. When concentration varies little, as in spring, the model’s predictions agree satisfactorily with the actual values. During the extreme increases or decreases in concentration in summer and winter, the proposed model still fits well, whereas the other models show noticeable lag. This suggests that the proposed model has potential practical applications in tracking changes in CO_{2} concentration in the field of carbon emissions. Figure 12 illustrates the annual average CO_{2} values for YRD in 2020. The estimated CO_{2} values align well with the annual average XCO_{2} values, showing the high consistency of these results. These findings provide robust support for future climate and carbon emission management and highlight the model’s applicability across different seasons and conditions.
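The error measures reported in the comparison tables (R^{2}, MAE, RMSE, MAPE) can be computed as in the following sketch. The function name and the sample values are hypothetical, R^{2} is taken here as the coefficient of determination, and MAPE is expressed as a fraction rather than a percentage; both conventions are assumptions about how the tabulated values were obtained.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, MAE, RMSE and MAPE (as a fraction) for paired observations."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"R2": r2, "MAE": mae, "RMSE": rmse, "MAPE": mape}


# Hypothetical concentrations in ppm, not data from the study.
observed = [410.0, 412.0, 414.0, 416.0]
predicted = [410.5, 411.5, 414.5, 415.5]
print(regression_metrics(observed, predicted))
```

Because CO_{2} concentrations sit near 410 to 420 ppm, an MAE of a few tenths of a ppm corresponds to a fractional MAPE on the order of 0.001.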

## 4. Conclusions and Prospects

#### 4.1. Conclusions

- To address the spatiotemporal sparsity of raw carbon satellite observations, this paper employs bilinear interpolation to resample multiple auxiliary datasets together with the XCO_{2} data, achieving daily data at a 0.25° resolution. Subsequently, an Extreme Random Forest algorithm is used to reconstruct the data from 2016 to 2020. Ten-fold cross-validation verifies the model’s robustness, and the reconstruction shows a high concordance of 92% with ground measurement station data.
- The CATCN-LSTM algorithm is proposed for predicting CO_{2} concentrations in the Yangtze River Delta across the four seasons; it achieved higher predictive accuracy in summer and relatively weaker accuracy in winter. Compared to the LSTM model previously used by Meng and Li [21,22], this model effectively addresses the challenges posed by interdependent features in long sequences and provides a new approach for predicting CO_{2} concentrations.
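The bilinear resampling step mentioned in the first conclusion can be sketched as follows, assuming a regular latitude/longitude source grid. The function name, grid origin, and cell values are hypothetical; only the 0.75° source spacing and the 0.25° target spacing come from the data description.

```python
import math


def bilinear_interpolate(grid, lat0, lon0, step, lat, lon):
    """Interpolate a regular grid at (lat, lon).

    grid[i][j] holds the value at (lat0 + i * step, lon0 + j * step).
    The grid needs at least two rows and two columns.
    """
    rows, cols = len(grid), len(grid[0])
    fi = (lat - lat0) / step   # fractional row index
    fj = (lon - lon0) / step   # fractional column index
    if not (0 <= fi <= rows - 1 and 0 <= fj <= cols - 1):
        raise ValueError("target point lies outside the source grid")
    i = min(math.floor(fi), rows - 2)   # clamp so that i + 1 stays in range
    j = min(math.floor(fj), cols - 2)
    ti, tj = fi - i, fj - j
    # Interpolate along longitude on the two bracketing rows, then along latitude.
    row_lo = grid[i][j] * (1 - tj) + grid[i][j + 1] * tj
    row_hi = grid[i + 1][j] * (1 - tj) + grid[i + 1][j + 1] * tj
    return row_lo * (1 - ti) + row_hi * ti


# Hypothetical 0.75° cell corners (ppm), sampled at a 0.25° target node.
coarse = [[410.0, 413.0],
          [416.0, 419.0]]
value = bilinear_interpolate(coarse, lat0=30.0, lon0=118.0, step=0.75,
                             lat=30.25, lon=118.25)
print(round(value, 6))
```

Looping this function over every 0.25° node inside the study area would bring a coarser product onto the common grid before the reconstruction model is trained.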

#### 4.2. Prospects

- In terms of data, since satellite XCO_{2} observations are typically more accurate than reconstructed XCO_{2} data, future studies can integrate more satellite data to enhance accuracy. For example, data from satellites such as OCO-3 and GOSAT can be integrated, and deep learning techniques can be employed for interpolation when integrating high spatiotemporal resolution XCO_{2} data. In addition, this study estimates XCO_{2} using environmental variables but does not incorporate anthropogenic factors into the modeling process. Existing research has not adequately addressed this point [43,44,45]; in the future, incorporating social science factors into the model may improve estimation accuracy.
- In terms of the model, more advanced deep learning architectures or ensemble methods can be explored to further improve the predictive accuracy of CO_{2} concentrations. Technologies such as Transformers and spatiotemporal attention mechanisms could be incorporated to better capture the complex spatiotemporal relationships of atmospheric CO_{2} concentrations. Tuning model parameters and conducting sensitivity analyses are recommended to ensure model robustness and stability.
- In terms of ground stations, it is advisable to build more CO_{2} ground stations to enhance data reliability and coverage. Real-time monitoring data from ground stations can serve as crucial references for model validation and calibration, thereby increasing the credibility of the model in practical applications.

## Data Availability Statement

_{2} product is available from https://earthdata.nasa.gov/, accessed on 10 May 2024. The CAMS product is available from https://ads.atmosphere.copernicus.eu/, accessed on 10 May 2024. The NDVI dataset is available from https://modis.gsfc.nasa.gov/, accessed on 10 May 2024. The ERA5 dataset is available from https://cds.climate.copernicus.eu/, accessed on 10 May 2024. The CLCD dataset is available from https://engine-aiearth.aliyun.com/, accessed on 10 May 2024. The TCCON dataset is available from https://tccondata.org/, accessed on 10 May 2024.

## References

1. Brethomé, F.M.; Williams, N.J.; Seipp, C.A.; Kidder, M.K.; Custelcean, R. Direct air capture of CO_{2} via aqueous-phase absorption and crystalline-phase release using concentrated solar power. Nat. Energy **2018**, 3, 553–559.
2. Ofipcc, W.G.I. Climate Change 2013: The Physical Science Basis. Contrib. Work. **2013**, 43, 866–871.
3. Zickfeld, K.; Azevedo, D.; Mathesius, S.; Matthews, H.D. Asymmetry in the climate–carbon cycle response to positive and negative CO_{2} emissions. Nat. Clim. Change **2021**, 11, 613–617.
4. Zhenmin, L.; Espinosa, P. Tackling climate change to accelerate sustainable development. Nat. Clim. Change **2019**, 9, 494–496.
5. Zhao, X.; Ma, X.; Chen, B.; Shang, Y.; Song, M. Challenges toward carbon neutrality in China: Strategies and countermeasures. Resour. Conserv. Recycl. **2022**, 176, 105959.
6. Jeong, S.; Zhao, C.; Andrews, A.E.; Dlugokencky, E.J.; Sweeney, C.; Bianco, L.; Wilczak, J.M.; Fischer, M.L. Seasonal variations in N_{2}O emissions from central California. Geophys. Res. Lett. **2012**, 39, L16805.
7. Chiba, T.; Haga, Y.; Inoue, M.; Kiguchi, O.; Nagayoshi, T.; Madokoro, H.; Morino, I. Measuring regional atmospheric CO_{2} concentrations in the lower troposphere with a non-dispersive infrared analyzer mounted on a UAV, Ogata Village, Akita, Japan. Atmosphere **2019**, 10, 487.
8. Siabi, Z.; Falahatkar, S.; Alavi, S.J. Spatial distribution of XCO_{2} using OCO-2 data in growing seasons. J. Environ. Manag. **2019**, 244, 110–118.
9. Wang, H.; Jiang, F.; Wang, J.; Ju, W.; Chen, J.M. Terrestrial ecosystem carbon flux estimated using GOSAT and OCO-2 XCO_{2} retrievals. Atmos. Chem. Phys. **2019**, 19, 12067–12082.
10. Hammerling, D.M.; Michalak, A.M.; Kawa, S.R. Mapping of CO_{2} at high spatiotemporal resolution using satellite observations: Global distributions from OCO-2. J. Geophys. Res. Atmos. **2012**, 117, D6.
11. Mao, J.; Kawa, S.R. Sensitivity studies for space-based measurement of atmospheric total column carbon dioxide by reflected sunlight. Appl. Opt. **2004**, 43, 914–927.
12. Liang, A.; Gong, W.; Han, G.; Xiang, C. Comparison of satellite-observed XCO_{2} from GOSAT, OCO-2, and ground-based TCCON. Remote Sens. **2017**, 9, 1033.
13. Chen, L.; Zhang, Y.; Zou, M.; Xu, Q.; Tao, J. Overview of atmospheric CO_{2} remote sensing from space. J. Remote Sens. **2015**, 19, 1–11.
14. Pei, Z.; Han, G.; Ma, X.; Shi, T.; Gong, W. A method for estimating the background column concentration of CO_{2} using the lagrangian approach. IEEE Trans. Geosci. Remote Sens. **2022**, 60, 4108112.
15. He, Z.; Lei, L.; Zhang, Y.; Sheng, M.; Wu, C.; Li, L.; Zeng, Z.C.; Welp, L.R. Spatio-temporal mapping of multi-satellite observed column atmospheric CO_{2} using precision-weighted kriging method. Remote Sens. **2020**, 12, 576.
16. Jin, C.; Xue, Y.; Jiang, X.; Zhao, L.; Yuan, T.; Sun, Y.; Wu, S.; Wang, X. A long-term global XCO_{2} dataset: Ensemble of satellite products. Atmos. Res. **2022**, 279, 106385.
17. He, C.; Ji, M.; Li, T.; Liu, X.; Tang, D.; Zhang, S.; Luo, Y.; Grieneisen, M.L.; Zhou, Z.; Zhan, Y. Deriving full-coverage and fine-scale XCO_{2} across China based on OCO-2 satellite retrievals and CarbonTracker output. Geophys. Res. Lett. **2022**, 49, e2022GL098435.
18. Li, J.; Jia, K.; Wei, X.; Xia, M.; Chen, Z.; Yao, Y.; Zhang, X.; Jiang, H.; Yuan, B.; Tao, G.; et al. High-spatiotemporal resolution mapping of spatiotemporally continuous atmospheric CO_{2} concentrations over the global continent. Int. J. Appl. Earth Obs. Geoinf. **2022**, 108, 102743.
19. Wang, W.; He, J.; Feng, H.; Jin, Z. High-Coverage Reconstruction of XCO_{2} Using Multisource Satellite Remote Sensing Data in Beijing–Tianjin–Hebei Region. Int. J. Environ. Res. Public Health **2022**, 19, 10853.
20. Jingzhi, Z. Research on the Temporal Data Processing and Prediction Model of Atmospheric CO_{2}. Ph.D. Thesis, Anhui University of Science and Technology, Huainan, China, 2020.
21. Meng, J.; Ding, G.; Liu, L. Research on a prediction method for carbon dioxide concentration based on an optimized LSTM network of spatio-temporal data fusion. IEICE Trans. Inf. Syst. **2021**, 104, 1753–1757.
22. Li, J.; Zhang, Y.; Gai, R. Estimation of CO_{2} Column Concentration in Spaceborne Short Wave Infrared Based on Machine Learning. China Environ. Sci. **2023**, 43, 1499–1509.
23. Voulodimos, A.; Doulamis, N.; Doulamis, A.; Protopapadakis, E. Deep learning for computer vision: A brief review. Comput. Intell. Neurosci. **2018**, 2018, 7068349.
24. Hu, K.; Jin, J.; Zheng, F.; Weng, L.; Ding, Y. Overview of behavior recognition based on deep learning. Artif. Intell. Rev. **2023**, 56, 1833–1865.
25. Li, H.; Mu, H.; Zhang, M.; Li, N. Analysis on influence factors of China’s CO_{2} emissions based on Path–STIRPAT model. Energy Policy **2011**, 39, 6906–6911.
26. Wu, Y.; Peng, Z.; Ma, Q. A Study on the Factors Influencing Carbon Emission Intensity in the Yangtze River Delta Region. J. Liaoning Tech. Univ. Soc. Sci. Ed. **2023**, 25, 28–34.
27. Nassar, R.; Hill, T.G.; McLinden, C.A.; Wunch, D.; Jones, D.B.; Crisp, D. Quantifying CO_{2} emissions from individual power plants from space. Geophys. Res. Lett. **2017**, 44, 10–045.
28. Inness, A.; Ades, M.; Agustí-Panareda, A.; Barré, J.; Benedictow, A.; Blechschmidt, A.M.; Dominguez, J.J.; Engelen, R.; Eskes, H.; Flemming, J.; et al. The CAMS reanalysis of atmospheric composition. Atmos. Chem. Phys. **2019**, 19, 3515–3556.
29. Agustí-Panareda, A.; Barré, J.; Massart, S.; Inness, A.; Aben, I.; Ades, M.; Baier, B.C.; Balsamo, G.; Borsdorff, T.; Bousserez, N.; et al. The CAMS greenhouse gas reanalysis from 2003 to 2020. Atmos. Chem. Phys. **2023**, 23, 3829–3859.
30. Yang, W.; Zhao, Y.; Wang, Q.; Guan, B. Climate, CO_{2}, and anthropogenic drivers of accelerated vegetation greening in the Haihe River Basin. Remote Sens. **2022**, 14, 268.
31. Zhang, Y.; Hu, Z.; Wang, J.; Gao, X.; Yang, C.; Yang, F.; Wu, G. Temporal upscaling of MODIS instantaneous FAPAR improves forest gross primary productivity (GPP) simulation. Int. J. Appl. Earth Obs. Geoinf. **2023**, 121, 103360.
32. Lian, Y.; Li, H.; Renyang, Q.; Liu, L.; Dong, J.; Liu, X.; Qu, Z.; Lee, L.C.; Chen, L.; Wang, D.; et al. Mapping the net ecosystem exchange of CO_{2} of global terrestrial systems. Int. J. Appl. Earth Obs. Geoinf. **2023**, 116, 103176.
33. Liu, B.; Ma, X.; Ma, Y.; Li, H.; Jin, S.; Fan, R.; Gong, W. The relationship between atmospheric boundary layer and temperature inversion layer and their aerosol capture capabilities. Atmos. Res. **2022**, 271, 106121.
34. Zhang, Z.; Lou, Y.; Zhang, W.; Wang, H.; Zhou, Y.; Bai, J. Assessment of ERA-Interim and ERA5 reanalysis data on atmospheric corrections for InSAR. Int. J. Appl. Earth Obs. Geoinf. **2022**, 111, 102822.
35. Berrisford, P.; Soci, C.; Bell, B.; Dahlgren, P.; Horányi, A.; Nicolas, J.; Radu, R.; Villaume, S.; Bidlot, J.; Haimberger, L. The ERA5 global reanalysis: Preliminary extension to 1950. Q. J. R. Meteorol. Soc. **2021**, 147, 4186–4227.
36. Toon, G.; Blavier, J.F.; Washenfelder, R.; Wunch, D.; Keppel-Aleks, G.; Wennberg, P.; Connor, B.; Sherlock, V.; Griffith, D.; Deutscher, N.; et al. Total column carbon observing network (TCCON). In Proceedings of the Hyperspectral Imaging and Sensing of the Environment, Vancouver, BC, Canada, 26–30 April 2009; Optica Publishing Group: Washington, DC, USA, 2009; p. JMA3.
37. Hu, K.; Zhang, Q.; Gong, S.; Zhang, F.; Weng, L.; Jiang, S.; Xia, M. A review of anthropogenic ground-level carbon emissions based on satellite data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. **2024**, 17, 8339–8357.
38. Zhang, L.L.; Yue, T.X.; Wilson, J.P.; Zhao, N.; Zhao, Y.P.; Du, Z.P.; Liu, Y. A comparison of satellite observations with the XCO_{2} surface obtained by fusing TCCON measurements and GEOS-Chem model outputs. Sci. Total Environ. **2017**, 601, 1575–1590.
39. Ren, W.; Wang, Z.; Xia, M.; Lin, H. MFINet: Multi-Scale Feature Interaction Network for Change Detection of High-Resolution Remote Sensing Images. Remote Sens. **2024**, 16, 1269.
40. Wunch, D.; Wennberg, P.O.; Osterman, G.; Fisher, B.; Naylor, B.; Roehl, C.M.; O’Dell, C.; Mandrake, L.; Viatte, C.; Kiel, M.; et al. Comparisons of the orbiting carbon observatory-2 (OCO-2) XCO_{2} measurements with TCCON. Atmos. Meas. Tech. **2017**, 10, 2209–2238.
41. Wunch, D.; Toon, G.C.; Blavier, J.F.L.; Washenfelder, R.A.; Notholt, J.; Connor, B.J.; Griffith, D.W.; Sherlock, V.; Wennberg, P.O. The total carbon column observing network. Philos. Trans. R. Soc. Math. Phys. Eng. Sci. **2011**, 369, 2087–2112.
42. Laughner, J.L.; Toon, G.C.; Mendonca, J.; Petri, C.; Roche, S.; Wunch, D.; Blavier, J.F.; Griffith, D.W.; Heikkinen, P.; Keeling, R.F.; et al. The Total Carbon Column Observing Network’s GGG2020 data version. Earth Syst. Sci. Data **2024**, 16, 2197–2260.
43. Li, T.; Wu, J.; Wang, T. Generating daily high-resolution and full-coverage XCO_{2} across China from 2015 to 2020 based on OCO-2 and CAMS data. Sci. Total Environ. **2023**, 893, 164921.
44. Zhang, M.; Liu, G. Mapping contiguous XCO_{2} by machine learning and analyzing the spatio-temporal variation in China from 2003 to 2019. Sci. Total Environ. **2023**, 858, 159588.
45. He, S.; Yuan, Y.; Wang, Z.; Luo, L.; Zhang, Z.; Dong, H.; Zhang, C. Machine Learning Model-Based Estimation of XCO_{2} with High Spatiotemporal Resolution in China. Atmosphere **2023**, 14, 436.
46. Fichtner, F.; Mandery, N.; Wieland, M.; Groth, S.; Martinis, S.; Riedlinger, T. Time-series analysis of Sentinel-1/2 data for flood detection using a discrete global grid system and seasonal decomposition. Int. J. Appl. Earth Obs. Geoinf. **2023**, 119, 103329.
47. Qiu, Y.; Zhou, J.; Chen, J.; Chen, X. Spatiotemporal fusion method to simultaneously generate full-length normalized difference vegetation index time series (SSFIT). Int. J. Appl. Earth Obs. Geoinf. **2021**, 100, 102333.
48. Wang, X.; Li, L.; Gong, K.; Mao, J.; Hu, J.; Li, J.; Liu, Z.; Liao, H.; Qiu, W.; Yu, Y.; et al. Modelling air quality during the EXPLORE-YRD campaign – Part I. Model performance evaluation and impacts of meteorological inputs and grid resolutions. Atmos. Environ. **2021**, 246, 118131.
49. Li, Q.; Zhang, K.; Li, R.; Yang, L.; Yi, Y.; Liu, Z.; Zhang, X.; Feng, J.; Wang, Q.; Wang, W.; et al. Underestimation of biomass burning contribution to PM2.5 due to its chemical degradation based on hourly measurements of organic tracers: A case study in the Yangtze River Delta (YRD) region, China. Sci. Total Environ. **2023**, 872, 162071.
50. Falahatkar, S.; Mousavi, S.M.; Farajzadeh, M. Spatial and temporal distribution of carbon dioxide gas using GOSAT data over IRAN. Environ. Monit. Assess. **2017**, 189, 627.
51. Shi, Q.; Zhuo, L.; Tao, H.; Yang, J. A fusion model of temporal graph attention network and machine learning for inferring commuting flow from human activity intensity dynamics. Int. J. Appl. Earth Obs. Geoinf. **2024**, 126, 103610.
52. Guo, X.; Hou, B.; Yang, C.; Ma, S.; Ren, B.; Wang, S.; Jiao, L. Visual explanations with detailed spatial information for remote sensing image classification via channel saliency. Int. J. Appl. Earth Obs. Geoinf. **2023**, 118, 103244.
53. Yin, H.; Weng, L.; Li, Y.; Xia, M.; Hu, K.; Lin, H.; Qian, M. Attention-guided siamese networks for change detection in high resolution remote sensing images. Int. J. Appl. Earth Obs. Geoinf. **2023**, 117, 103206.
54. Ren, H.; Xia, M.; Weng, L.; Hu, K.; Lin, H. Dual-Attention-Guided Multiscale Feature Aggregation Network for Remote Sensing Image Change Detection. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. **2024**, 17, 4899–4916.
55. Hu, K.; Shen, C.; Wang, T.; Shen, S.; Cai, C.; Huang, H.; Xia, M. Action Recognition Based on Multi-Level Topological Channel Attention of Human Skeleton. Sensors **2023**, 23, 9738.
56. Hu, K.; Zhang, E.; Xia, M.; Wang, H.; Ye, X.; Lin, H. Cross-dimensional feature attention aggregation network for cloud and snow recognition of high satellite images. Neural Comput. Appl. **2024**, 36, 7779–7798.
57. Hu, K.; Shen, C.; Wang, T.; Xu, K.; Xia, Q.; Xia, M.; Cai, C. Overview of Temporal Action Detection Based on Deep Learning. Artif. Intell. Rev. **2024**, 57, 26.
58. Jiang, S.; Lin, H.; Ren, H.; Hu, Z.; Weng, L.; Xia, M. MDANet: A High-Resolution City Change Detection Network Based on Difference and Attention Mechanisms under Multi-Scale Feature Fusion. Remote Sens. **2024**, 16, 1387.

**Figure 2.** Study area: (**a**) the NDVI coverage map of China, (**b**) the study area, (**c**) the growth trend of XCO_{2} in the YRD from 2016 to 2020.

**Figure 4.** Validation chart of TCCON and reconstructed XCO_{2} at Hefei site from 2016 to 2020 (publicly available sites in the Yangtze River Delta region).

**Figure 5.** Time series after STL decomposition: (**a**) original data, (**b**) trend component, (**c**) seasonal component, (**d**) residual component.

**Figure 6.** Seasonal and annual changes in CO_{2} concentrations in the Yangtze River Delta from 2016 to 2020.

**Figure 8.** Model architecture: (**a**) CATCN-LSTM, (**b**) Dilated Causal Convolution, (**c**) CAM, (**d**) LSTM, (**e**) TCN-LSTM.

**Figure 12.** Annual average CO_{2} concentration map in the Yangtze River Delta region for the year 2020.

Data | Variables | Spatial Resolution | Temporal Resolution | Source |
---|---|---|---|---|
Satellite Data | XCO_{2} | 1.29 × 2.25 km | 16 day | https://earthdata.nasa.gov/ |
Reanalysis Data | XCO_{2} | 0.75° | 3 h | https://ads.atmosphere.copernicus.eu/ |
Meteorological Data | Relative Humidity (Rh) | 0.25° | 3 h | https://cds.climate.copernicus.eu/ |
Meteorological Data | 10-m U Component of Wind (U) | 0.25° | 3 h | https://cds.climate.copernicus.eu/ |
Meteorological Data | 10-m V Component of Wind (V) | 0.25° | 3 h | https://cds.climate.copernicus.eu/ |
Meteorological Data | 2 m Temperature (T2M) | 0.25° | 3 h | https://cds.climate.copernicus.eu/ |
Elevation Data | DEM | 90 m × 90 m | - | https://engine-aiearth.aliyun.com/ |
CLCD | Land, Forest, Grassland, Water Body, Shrubland and so on | 30 m | - | https://engine-aiearth.aliyun.com/ |
Station Data | XCO_{2} | Point | ∼2 m | https://tccondata.org/ |

Lag Order | AC Value | PAC Value | Q Statistic | p Value |
---|---|---|---|---|
1st | 0.938 | 0.938 | 290,163.434 | 0.030 |
2nd | 0.888 | 0.069 | 550,369.730 | 0.020 |
3rd | 0.843 | 0.022 | 784,843.896 | 0.001 |
4th | 0.801 | 0.011 | 996,667.104 | 0.000 |
5th | 0.763 | 0.014 | 1,188,779.593 | 0.000 |
6th | 0.728 | 0.018 | 1,363,936.930 | 0.000 |
7th | 0.697 | 0.014 | 1,524,238.478 | 0.000 |
8th | 0.669 | 0.027 | 1,672,195.034 | 0.000 |
9th | 0.645 | 0.026 | 1,809,783.745 | 0.000 |
10th | 0.624 | 0.022 | 1,938,508.098 | 0.000 |

Model | $R^{2}$ | MAE | RMSE | MAPE |
---|---|---|---|---|
LSTM | 0.75 | 0.77 | 1.25 | 0.010 |
TCN | 0.85 | 0.58 | 0.92 | 0.014 |
TCN-CAM | 0.86 | 0.54 | 0.90 | 0.014 |
TCN-LSTM | 0.90 | 0.40 | 0.69 | 0.009 |
CATCN-LSTM | 0.92 | 0.34 | 0.62 | 0.007 |

Season | Model | $R^{2}$ | MAE | RMSE | MAPE |
---|---|---|---|---|---|
Spring | CATCN-LSTM | 0.917 | 0.403 | 0.681 | 0.0009 |
Spring | CNN-LSTM | 0.878 | 0.595 | 0.901 | 0.0014 |
Spring | RNN | 0.754 | 0.774 | 1.250 | 0.0018 |
Spring | SVR | 0.699 | 0.916 | 1.385 | 0.0022 |
Spring | XGBOOST | 0.602 | 1.027 | 1.594 | 0.0024 |
Summer | CATCN-LSTM | 0.941 | 0.344 | 0.559 | 0.0008 |
Summer | CNN-LSTM | 0.863 | 0.588 | 0.926 | 0.0014 |
Summer | RNN | 0.748 | 0.821 | 1.279 | 0.0023 |
Summer | SVR | 0.685 | 1.074 | 1.390 | 0.0026 |
Summer | XGBOOST | 0.620 | 1.255 | 1.624 | 0.0033 |
Autumn | CATCN-LSTM | 0.937 | 0.333 | 0.515 | 0.0008 |
Autumn | CNN-LSTM | 0.855 | 0.604 | 1.006 | 0.0019 |
Autumn | RNN | 0.721 | 0.871 | 1.483 | 0.0021 |
Autumn | SVR | 0.682 | 0.916 | 1.425 | 0.0022 |
Autumn | XGBOOST | 0.640 | 1.062 | 1.590 | 0.0024 |
Winter | CATCN-LSTM | 0.915 | 0.410 | 0.697 | 0.0010 |
Winter | CNN-LSTM | 0.880 | 0.567 | 0.992 | 0.0012 |
Winter | RNN | 0.734 | 0.821 | 1.304 | 0.0019 |
Winter | SVR | 0.659 | 0.937 | 1.476 | 0.0022 |
Winter | XGBOOST | 0.582 | 0.860 | 1.534 | 0.0026 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Hu, K.; Zhang, Q.; Feng, X.; Liu, Z.; Shao, P.; Xia, M.; Ye, X.
An Interpolation and Prediction Algorithm for XCO_{2} Based on Multi-Source Time Series Data. *Remote Sens.* **2024**, *16*, 1907.
https://doi.org/10.3390/rs16111907
