The overall proposed hybrid approach requires in situ meteorological variables, stream flow input information and spatially explicit data. Below we briefly present the main data sources.
2.2.1. In Situ Information Procurement
The climatic data were derived from a hydrometeorological station located within our region of interest (longitude 21.678881, latitude 40.690079). The hydrometeorological data were obtained from the Region of Western Macedonia, which keeps the records of the local stations in the catchment up to date. The climatic data comprise daily precipitation, temperature, relative humidity, wind speed and solar radiation for the period from 2006 to 2019. Maximum temperatures are recorded in summer, in July and August, while minimum temperatures are recorded in winter, in January. The driest month receives less than 30 mm of precipitation, while the wettest winter month receives about three times as much. Wind speeds correspond to a light breeze, and relative humidity and solar radiation are at moderate-to-high levels. A brief overview of the climatic observations is presented in Table 1.
Furthermore, for model calibration and validation, monthly observed stream flow data for the years 2011 to 2014 were collected at the inlet of Zazari Lake. The stream flow data were obtained from the Aristotle University of Thessaloniki, which maintains historical archives of these records. In addition, within the framework of the AquaNEX project, one telemetric station was installed at the inlet of Zazari Lake, providing monthly nitrate values and observed sediment data for the second half (July–December) of 2019.
2.2.2. Spatially Explicit Data
For stream delineation purposes, we obtained a two-meter resolution digital elevation model (DEM) from an airborne campaign conducted in 2018, which was used to delineate the sub-basin watershed and the stream network (Figure 3a). Notably, the DEM did not cover the entire Zazari-Chimaditida river sub-basin (Figure 2, red circle). Hence, a small area (approximately 7 km²) in the western part was excluded from the analysis, without significantly affecting the final results or the stability of the model. Based on the DEM elevation values, the slope map was reclassified into four slope classes (0–2.6%, 2.6–10.4%, 10.4–20.1% and more than 20.1%). For the soil data, we used a gridded soil map from the digital soil data hub of the Food and Agriculture Organization of the United Nations. Cambisols, Lithosols, Luvisols and Regosols are the dominant soils in the area (Figure 3b).
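The slope reclassification described above can be sketched as a simple threshold lookup. This is a minimal illustration, not the authors' GIS workflow: the function names are hypothetical, and assigning a value that falls exactly on a break to the lower class is an assumption, since the text does not say how boundaries were handled.

```python
from bisect import bisect_left

# Upper bounds (percent slope) of the first three classes, taken from the text.
# Anything above the last bound falls into the fourth class.
SLOPE_BREAKS = [2.6, 10.4, 20.1]


def slope_class(slope_percent):
    """Return the slope class (1 to 4) for a slope given in percent."""
    if slope_percent < 0:
        raise ValueError("slope must be non-negative")
    # Assumption: a value exactly on a break belongs to the lower class.
    return bisect_left(SLOPE_BREAKS, slope_percent) + 1


def reclassify(slope_grid):
    """Reclassify a 2D grid (list of rows) of percent slopes into classes."""
    return [[slope_class(value) for value in row] for row in slope_grid]
```

For example, `reclassify([[1.0, 12.0], [30.0, 3.0]])` yields `[[1, 3], [4, 2]]`. For a full two-meter DEM an array library would be the more practical choice; plain lists are used here only to keep the sketch self-contained.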
In addition to the topographic data, other data sources were needed, such as land cover (LC). In this study, we used a dataset of geo-referenced land parcel polygons with information on each parcel's land cover type and area. This product is a combination of (i) the Greek Integrated Administration and Control System of 2017; (ii) in situ recordings for ground truth validation; and (iii) photo interpretation of very high-resolution spaceborne images. Note that the in situ data were not collected by volunteers but by a group of highly trained experts during the field campaigns of the AquaNEX research project. Water and built-up surfaces were derived from the national data archive. Prior to merging, these data were checked for consistency and, if needed, reprojected to the Universal Transverse Mercator (UTM) coordinate system, resampled to 10 m and retiled into the Sentinel-2 tiling grid. This information was then converted into raster format. This land cover product is hereinafter referred to as the combined reference dataset (Figure 3c). The final reference data include 2,158,324 pixels distributed over 33 classes (see Appendix A). Based on this, the LC was categorized into agricultural land (24.26%), forest (68.50%), orchard (0.23%), grassland (3.15%), water (1.97%), wetland (0.42%), urban land (1.46%), and barren land (0.01%).
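The aggregation of detailed classes into broad land cover categories, with each category's share of the total pixel count, can be sketched as follows. The class names and the mapping are hypothetical placeholders; the actual 33-class legend is given in Appendix A and is not reproduced here.

```python
from collections import Counter

# Hypothetical mapping from detailed class labels to broad categories.
# The real 33-class legend (Appendix A) would replace these entries.
CLASS_TO_CATEGORY = {
    "maize": "agricultural land",
    "wheat": "agricultural land",
    "oak": "forest",
    "lake": "water",
}


def category_shares(pixel_counts):
    """Return each broad category's share (in percent) of all pixels.

    pixel_counts maps a detailed class label to its number of pixels.
    """
    totals = Counter()
    for label, count in pixel_counts.items():
        totals[CLASS_TO_CATEGORY[label]] += count
    n_pixels = sum(totals.values())
    return {cat: round(100.0 * n / n_pixels, 2) for cat, n in totals.items()}
```

With the made-up counts `{"maize": 200, "wheat": 50, "oak": 700, "lake": 50}`, the function returns 25.0% agricultural land, 70.0% forest and 5.0% water.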
An object-partitioning approach was then adopted to guarantee that calibration and independent validation pixels were located in different land objects (e.g., neighborhoods of pixels). A set of 70,000 random points was generated within the boundary shapefile of the entire river basin, and a layer of smaller polygons was created from these points using Voronoi polygons. Then, the labelled raster (33 classes) was polygonized and intersected with the sub-polygon layer to ensure that every area of the polygonized labelled data was contained within a single sub-polygon. The resulting objects were randomly separated into a training set (90%) and a test set (10%), with the total number of pixels varying between classes.
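The idea behind the object-based split can be illustrated with a small sketch. It relies on the fact that a point lies in the Voronoi cell of its nearest seed, so no explicit polygon geometry is needed. All function names are hypothetical, and treating every (Voronoi cell, class label) pair as one object is a simplification of the polygonize-and-intersect step in the text, which would also separate non-contiguous patches of the same class.

```python
import random


def nearest_seed(point, seeds):
    """Index of the seed closest to point, i.e. the point's Voronoi cell."""
    px, py = point
    return min(
        range(len(seeds)),
        key=lambda i: (seeds[i][0] - px) ** 2 + (seeds[i][1] - py) ** 2,
    )


def split_objects(object_ids, test_fraction=0.1, seed=0):
    """Randomly split unique object ids into (train_ids, test_ids) sets."""
    rng = random.Random(seed)
    ids = sorted(set(object_ids))
    rng.shuffle(ids)
    n_test = max(1, round(len(ids) * test_fraction))
    return set(ids[n_test:]), set(ids[:n_test])


def partition_pixels(pixels, seeds, test_fraction=0.1, seed=0):
    """Split labelled pixels (x, y, label) so that no object is shared.

    Simplifying assumption: an object is a (Voronoi cell, label) pair.
    """
    objects = [(nearest_seed((x, y), seeds), label) for x, y, label in pixels]
    train_ids, test_ids = split_objects(objects, test_fraction, seed)
    train = [p for p, obj in zip(pixels, objects) if obj in train_ids]
    test = [p for p, obj in zip(pixels, objects) if obj in test_ids]
    return train, test
```

Because the split is drawn over objects rather than pixels, the pixel counts of the two sets only approximate the 90/10 ratio and differ from class to class, which matches the behaviour noted in the text.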
One main improvement of the proposed approach is the use of spaceborne EO imagery to provide annual updates of the land cover map at 10 m spatial resolution (see Section 2.3). In the current work, only the visible (B2, B3, B4), near-infrared (B5–B8 and B8a) and shortwave infrared (B11, B12) bands of Sentinel-2A and Sentinel-2B Level-2A products were used. To mitigate the limitations imposed by cloud cover, we applied a cloud percentage criterion (<10%) when generating our nearly cloud-free time series. Sitokonstantinou et al. [21] indicated that using dense satellite time series regardless of cloud coverage offered only a marginal increase in accuracy for a disproportionately larger cost in processing time. Consequently, it was decided to select satellite acquisitions that cover critical phenological stages of the targeted agricultural classes. In this context, the available Sentinel-2 images for the study area were downloaded from the Copernicus Open Access Hub. Figure 4 illustrates the selected Sentinel-2 acquisitions for the 2017–2019 period, along with the phenological stages of the agricultural classes (aquatic and urban classes are not presented). In the last pre-processing step, the bands at 20 m resolution (B5–B7, B8a, B11 and B12) were resampled to 10 m using bicubic interpolation. The multispectral bands from the selected satellite observations constituted the classification features and were extracted for further processing.
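The scene selection and the resulting feature layout can be sketched as below. The scene records, their keys (`date`, `cloud_percent`) and the function names are hypothetical. Only the band list and the 10% cloud threshold come from the text; the further selection by phenological stage and the bicubic resampling are not modelled.

```python
# The ten Sentinel-2 bands used as classification features, as listed in the text.
BANDS = ["B2", "B3", "B4", "B5", "B6", "B7", "B8", "B8a", "B11", "B12"]
MAX_CLOUD_PERCENT = 10.0


def select_scenes(scenes, max_cloud=MAX_CLOUD_PERCENT):
    """Keep scenes whose cloud cover is strictly below the threshold.

    Each scene is a dict with hypothetical keys "date" (ISO string) and
    "cloud_percent". The result is sorted chronologically; ISO dates sort
    correctly as plain strings.
    """
    kept = [s for s in scenes if s["cloud_percent"] < max_cloud]
    return sorted(kept, key=lambda s: s["date"])


def feature_names(scenes):
    """One feature per (acquisition date, band) pair, in date order."""
    return [f'{s["date"]}_{band}' for s in scenes for band in BANDS]
```

With this layout, n selected acquisitions yield 10 × n features per pixel, so each added date lengthens the feature vector by ten values.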