The marked increase in freshwater algal blooms is of major interest to governments and public health agencies responsible for maintaining the ecological services provided by these systems. Cyanobacteria threaten the ecological integrity of some of the world’s most important lake environments, including Lake Erie [1
], Lake Ontario [2
], Lake Taihu [3
], Lake Okeechobee [4
], and Lake Victoria [5
]. Their increasing frequency and sustained presence affect the structure and functioning of aquatic food webs [6
], limit recreational activities [7
], and threaten drinking water sources [8
]. Monitoring phytoplankton blooms remains difficult and onerous, particularly because their spatial and temporal distribution is highly variable when dominated by buoyant cyanobacteria [9
Several models have been developed to describe the mechanisms involved in bloom onset, development, maintenance and decline, including ecological models based on phytoplankton growth response mechanisms [11
], and hydrological models involving nutrient production, transport, and accumulation [15
]. Other studies have focused on the relationships between the development of phytoplankton blooms and the environmental conditions prevailing in water bodies or watersheds [16
]. Among others, Hu et al. [19
] and Liu et al. [20
] have demonstrated the impact of climatic variables (air temperature, relative humidity, wind speed and direction) on the development of cyanobacterial blooms. Several other studies have shown relations between chlorophyll a concentration (Chl-a) and key physicochemical factors associated with phytoplankton development [21
]. The main disruptive elements identified are phosphorus [23
] and nitrogen when phosphorus is no longer limiting [23
]. The N:P ratio is also used as an indicator of the occurrence of cyanobacterial [27
] and other phytoplankton blooms [30
]. Water temperature and light availability have also been shown to play a major role in bloom development [32
] in a way that is specific to each species [33
These models, most of which are empirical, have made it possible to target disruptive elements and characterize their effect on the spatiotemporal patterns of phytoplankton blooms. It remains fairly difficult, however, to identify all the disruptive elements acting upon a given water body or on an annual basis. In order to develop solutions for the protection or restoration of water bodies, it is necessary to identify the major sources of disruption among the multiple anthropogenic disturbances taking place concurrently [34
] and to define their specific effect on the frequency, intensity, and duration of blooms. The supply of local or diffuse sources of nutrients to a water body is related to the physiography of its watershed. For example, the nutrient storage capacity is influenced by the size of the watershed; the type of soil affects runoff and infiltration rates; and the shape and topography of the watershed act upon peak flows and soil erosion [36
]. Moreover, climatic variables play an undeniable role in the development of blooms, including air temperature, precipitation, wind, and hours of sunlight [37
]. Bloom intensity and frequency were shown to increase in response to climate change [38
] and to occur earlier in the spring [44
This study is part of a project aiming to estimate lake predisposition to phytoplankton blooms based on the environmental conditions prevailing in its watershed. The ultimate goal is to provide a tool to project future scenarios of bloom phenology in response to climate change and anthropogenic developments, or to test the efficiency of mitigation approaches. To do so, the first step was to set up a database of near-surface phytoplankton bloom phenological characteristics (frequency, intensity, surface area, onset date, end date, and duration) between 2000 and 2016 from 580 lakes in Quebec, Canada, using satellite images from the MODIS (Moderate Resolution Imaging Spectroradiometer) sensor. This database was then used to target key environmental variables involved in defining this phenology, including the morphological, physiographic, anthropogenic, and climatic characteristics of the lakes and their watersheds, through a canonical correlation analysis (CCA). We further illustrate the results for two lakes particularly affected by blooms, the Missisquoi Bay of Lake Champlain and Lake Brome. A frequency analysis model linking the phenological characteristics of phytoplankton blooms with the environmental conditions at the regional scale will be presented later in a companion paper.
2. Materials and Methods
The lakes studied are located in the province of Quebec, Canada, between 44° N and 50° N and between 67° W and 80° W, covering an area of approximately 600,000 km²
). The spatial distribution of the lakes was homogeneous throughout the study area, as verified by Ripley's index, which suggested a regular pattern [46
]. This territory mainly consists of podzolic soils and is characterized by a humid continental climate in the south and a subarctic climate in the north. Industrial and agricultural developments are widespread in the southwestern area of the territory, while the northern part is much less developed.
A database of historical Chl-a concentrations was generated from satellite data acquired by the MODIS sensor (bands 1–7) on board NASA's Earth Observing System Terra platform, at a temporal frequency of 1 day. The spatial resolution of bands 3–7 was refined from 500 m to 250 m using a spatial resolution downscaling approach developed by Trishchenko et al. [47
]. This approach was validated using high spatial resolution data (Landsat ETM+ at 30 m), demonstrating that the radiometric properties of the downscaled bands were not altered. Image pre-processing, including downscaling, projection, and atmospheric correction, was carried out using an automated procedure developed by the Canada Centre for Remote Sensing [47
]. An estimation algorithm for Chl-a concentrations based on ensemble methods [48
] was then applied to all MODIS images extracted. This algorithm was specifically developed for inland waters and performed well on databases with Chl-a ranging from low to high concentrations (see Appendix A
). In order to reduce the uncertainty related to heterogeneous elements along lakeshores (e.g., built environments, lake bottom), mixed pixels (water-land boundary) were removed over a 250 m band by applying a ground mask. This procedure generated a composite image formed by the minima of near-infrared reflectance in images captured between May and October of 2000 to 2016. This avoids interference by macrophytes, since the reflectance of water pixels with phytoplankton will be lower than that of pixels occupied by macrophytes, which are more permanent elements in the littoral zone during the open-water season. In order to have enough pixels to adequately represent each water body, only lakes with a minimum area of 3.5 km²
were considered for this study. Areas affected by haze or cloud cover were then removed using a cloud mask [49
] specifically developed for inland waters (lakes, rivers, estuaries). Only lakes having less than 25% cloud cover for a given image were selected. Overall, Chl-a concentrations were extracted from 580 lakes in southern Quebec between May and October of the years 2000 to 2016, for a total of 1572 images.
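As a rough illustration of the two screening steps described above (the minimum near-infrared composite and the cloud-cover criterion), the following Python sketch operates on small nested lists. The function names, the use of None for missing pixels, and the data layout are assumptions made for illustration only; they do not reproduce the automated processing chain used in this study, and the rule for converting the composite into a ground mask is not specified here.

```python
def min_composite(images):
    """Per-pixel minimum over a stack of equally sized 2-D grids.

    Each image is a list of rows; None marks a missing (e.g., cloudy) pixel.
    """
    n_rows, n_cols = len(images[0]), len(images[0][0])
    composite = [[None] * n_cols for _ in range(n_rows)]
    for image in images:
        for r in range(n_rows):
            for c in range(n_cols):
                value = image[r][c]
                if value is None:
                    continue  # skip missing observations
                if composite[r][c] is None or value < composite[r][c]:
                    composite[r][c] = value
    return composite


def is_usable(cloud_flags, max_cloud_fraction=0.25):
    """True when strictly less than 25% of a lake's pixels are cloud-flagged."""
    return sum(cloud_flags) / len(cloud_flags) < max_cloud_fraction


# Toy example: two 1 x 2 near-infrared reflectance images of the same scene.
nir_stack = [[[0.30, 0.10]], [[0.20, None]]]
composite = min_composite(nir_stack)
```

In this toy case the composite keeps the lowest reflectance seen at each pixel, and a lake image with exactly 25% of flagged pixels is rejected because the criterion is "less than 25%".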
An operational biomass threshold of 10 µg Chl-a L⁻¹ was chosen to define the onset of a bloom and characterize its phenology on the studied lakes. It corresponds to the lower end of the eutrophic lake class (8–25 µg L⁻¹
) and to the decision threshold for recreational water level 1 as established by the World Health Organization [50
] and the government of Quebec [51
]. When the level 1 threshold is exceeded and the biomass becomes dominated by cyanobacteria, there is considered to be a risk of minor health effects (irritation and allergies). This threshold is already used by watershed-based organizations, water management officers, and municipalities.
For each studied year, phenological variables were established as follows: (1) the frequency, which is the number of days when Chl-a concentrations remained above the threshold, (2) the intensity, which is the maximum concentration of Chl-a detected during a bloom, (3) the relative area, which is the maximum area occupied by a bloom normalized by the lake area, (4) the onset date and (5) the end date, which are the first and last days of the year when a bloom is detected, and (6) the duration, which is the number of days between the onset date and the end date. Determining the end date and duration of blooms is challenging because the studied region is frequently covered by clouds during the fall, which significantly reduces the number of MODIS images available for this period. This is especially true for October, for which there were on average half as many MODIS images without full cloud cover as for the months between May and September. Therefore, end date and duration were discarded from the study. The retained variables describe what can be considered an annual phenology, compiling the days with less than 25% cloud cover on which remotely sensed Chl-a was above the established threshold for any given pixel.
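The four retained variables can be sketched in a few lines of Python for a single lake and year. The tuple layout (day of year, cloud fraction, per-pixel Chl-a), the function name, and the use of the lake pixel count as a proxy for lake area are illustrative assumptions, not the procedure implemented for this study; the strict inequality against the 10 µg Chl-a L⁻¹ threshold is also a choice made for this sketch.

```python
THRESHOLD = 10.0  # bloom threshold in µg Chl-a per litre (from the text)
MAX_CLOUD = 0.25  # images with 25% cloud cover or more are skipped


def annual_phenology(observations, lake_pixel_count):
    """Summarize one lake-year of daily Chl-a maps.

    observations: iterable of (day_of_year, cloud_fraction, chl_values),
    where chl_values lists per-pixel Chl-a and None marks a masked pixel.
    """
    frequency = 0         # number of usable days with a bloom
    intensity = None      # maximum Chl-a detected during a bloom
    relative_area = 0.0   # maximum bloom extent as a fraction of the lake
    onset = None          # first day of year with a bloom
    for day, cloud_fraction, chl_values in sorted(observations, key=lambda o: o[0]):
        if cloud_fraction >= MAX_CLOUD:
            continue  # too cloudy to be counted
        bloom = [v for v in chl_values if v is not None and v > THRESHOLD]
        if not bloom:
            continue  # no pixel above the threshold on this day
        frequency += 1
        day_max = max(bloom)
        intensity = day_max if intensity is None else max(intensity, day_max)
        relative_area = max(relative_area, len(bloom) / lake_pixel_count)
        if onset is None:
            onset = day
    return {
        "frequency": frequency,
        "intensity": intensity,
        "relative_area": relative_area,
        "onset": onset,
    }


# Hypothetical lake represented by four pixels, observed on four days.
observations = [
    (150, 0.10, [5.0, 12.0, None, 8.0]),
    (160, 0.50, [30.0, 30.0, 30.0, 30.0]),  # rejected: too cloudy
    (170, 0.00, [11.0, 25.0, 9.0, 14.0]),
    (140, 0.00, [2.0, 3.0, 4.0, 5.0]),      # clear day, no bloom
]
result = annual_phenology(observations, lake_pixel_count=4)
```

Note that the cloudy day is ignored even though its values are high, which mirrors the conservative nature of a frequency count restricted to clear-sky images.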
A geo-referenced database of the morphological, physiographic, and climatic characteristics of the watershed of each studied lake was established. The boundaries and morphological descriptors of the watersheds (area, slope) were calculated from the Canadian Digital Elevation Model [52
] with a spatial resolution of approximately 30 m. Climate data from North American Regional Reanalyses [53
], with a spatial resolution of approximately 32 km, were used. The cumulative degree-days (°C day) were calculated by summing the recorded degrees (°C) each day above 20 °C, a value that has been considered a threshold for cyanobacterial growth [54
]. Even though the remote sensing approach used here is not specific to cyanobacterial biomass, this climate proxy is considered valid. Land use data (at 40 m spatial resolution) and agricultural and ecumene data (at 25 m spatial resolution) were provided by Natural Resources Canada [56
]. The environmental indicators were considered stationary over the period 2000–2016. A total of 27 environmental variables were extracted for each lake and its watershed. The variables with the highest correlation to phytoplankton bloom phenology were selected (see Section 3.2
) for statistical analysis (Table 1
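The cumulative degree-day variable described above can be illustrated with a short Python sketch. It assumes the conventional definition, in which only the excess of the daily temperature over the 20 °C base is accumulated; the text could also be read as summing the full daily temperature on days above 20 °C, so this interpretation, the function name, and the sample values are assumptions.

```python
def cumulative_degree_days(daily_temps_c, base_c=20.0):
    """Sum the daily temperature excess above the base (°C day).

    Assumed definition: days at or below the base contribute nothing.
    """
    return sum(t - base_c for t in daily_temps_c if t > base_c)


# Hypothetical daily air temperatures (°C) for four days.
degree_days = cumulative_degree_days([18.0, 21.5, 25.0, 20.0])
```

With these sample values, only the second and third days contribute (1.5 and 5.0 °C day, respectively).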
The spatial variability in phenological data is presented according to the latitude or longitude of the concerned lakes, and was statistically tested using the Kruskal and Wallis [58
] test by randomly generating a subsample of lakes (by longitude or latitude) to ensure the independence of variables (runs test [59
]). Temporal trend tests were conducted on median and annual extreme values (5th or 95th percentile; normality test [60
]). Given the size and diversity of the generated data sets and the complexity of the interactions governing bloom phenology, analyses were carried out using canonical methods. The aim was to explore all possible correlations between phenological and environmental variables without using one group of variables to justify the other. Canonical correlation analysis (CCA) makes it possible to analyze two groups of variables simultaneously by quantifying their association. The objects (lakes) under study are described by two sets of quantitative descriptors: the first set $X_1$ of $p$ phenological descriptors, and the second set $X_2$ of $q$ environmental descriptors. Linear combinations $U_i = A_i X_1$ and $V_i = B_i X_2$ ($A_i$ and $B_i$ are parameters; $i = 1, \ldots, K$) of each set of descriptors are calculated in such a way that the canonical correlation between $U_i$ and $V_i$ is the highest possible. The first pair of canonical variables ($i = 1$) is the pair of linear combinations $U_1$ and $V_1$ that maximizes this correlation. The second pair of canonical variables ($i = 2$) is the pair of linear combinations $U_2$ and $V_2$ that maximizes this correlation while being uncorrelated with the first pair of canonical variables. This process is repeated until $K$ pairs of canonical variables are obtained, with $K = \min(p, q)$. The significance of canonical correlations was tested using Bartlett's approximate chi-squared statistic [61
]. The interpretation of canonical variables was based on the identification of: (1) standardized canonical coefficients, (2) structure coefficients, and (3) canonical communality coefficients. The contribution of the original variables to a given canonical variable was estimated from the standardized canonical coefficients. These weights are generated to maximize the canonical correlation, and are thus similar to the weights of a regression. Standardized coefficients assess the importance of one variable in relation to the others, and thus reflect their contribution to the canonical correlation. Structure coefficients were also calculated to evaluate the importance of a given variable taken independently of the others. These coefficients correspond to the correlations between the original variables and the canonical variables. When squared, these correlations indicate the proportion of variance linearly shared by a given original variable with the canonical variable. Note that these correlations are not affected by the standardization of the original variables. Finally, the canonical communality coefficients correspond to the sum of the squared structure coefficients over all the canonical variables interpreted in the analysis. They provide information on the proportion of variance of a variable that is explained by the set of canonical variables used in the analysis. Variables with low values (<45%) are generally omitted from the analysis [63
]. Although multicollinearity between variables does not present any analytical difficulties when using a CCA, it can complicate the interpretation of the results by blurring the origin of the observed effects [64
]. The combined use of standardized and structure coefficients is therefore recommended, since the latter are not affected by multicollinearity and inform us on the potential contribution of the observed variables to the development of canonical variables [65
]. Satellite data processing and statistical analyses were performed using MATLAB (R2018b).
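For reference, the quantities used in the canonical analysis can be written compactly as follows. This is a sketch in standard textbook notation, not a transcription of the equations used in this study: the covariance matrices $\Sigma_{11}$, $\Sigma_{22}$, $\Sigma_{12}$, the symbols $r_s$ and $h^2$, the number $m$ of interpreted canonical variables, and the sample size $n$ (number of lakes) are notational assumptions introduced here, with $A_i$ and $B_i$ treated as row vectors.

```latex
% Canonical correlation of the i-th pair (U_i = A_i X_1, V_i = B_i X_2):
\rho_i = \max_{A_i, B_i} \operatorname{corr}(U_i, V_i)
       = \max_{A_i, B_i}
         \frac{A_i \Sigma_{12} B_i^{\top}}
              {\sqrt{A_i \Sigma_{11} A_i^{\top}}\,\sqrt{B_i \Sigma_{22} B_i^{\top}}},
\qquad i = 1, \ldots, K, \quad K = \min(p, q),
% subject to each new pair being uncorrelated with all earlier pairs:
\operatorname{corr}(U_i, U_j) = \operatorname{corr}(V_i, V_j)
  = \operatorname{corr}(U_i, V_j) = 0, \qquad j < i.

% Structure coefficient of original variable x_j on canonical variable U_i,
% and canonical communality over the m interpreted canonical variables:
r_{s,ij} = \operatorname{corr}(x_j, U_i),
\qquad
h_j^2 = \sum_{i=1}^{m} r_{s,ij}^2.

% Bartlett's approximate test that canonical correlations k, ..., K are all zero:
\chi^2 = -\left[\, n - 1 - \tfrac{1}{2}(p + q + 1) \right]
         \sum_{i=k}^{K} \ln\!\left(1 - \rho_i^2\right),
\qquad
\mathrm{df} = (p - k + 1)(q - k + 1).
```

Under this notation, a communality $h_j^2$ below 0.45 corresponds to the <45% criterion mentioned above for omitting a variable.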
This study presents the spatiotemporal dynamics of phytoplankton blooms across 580 lakes in southern Quebec between 2000 and 2016, and their potential relationships with the physiographic, morphological, and climatic descriptors of the lakes and their watersheds. The results show realistic and expected trends, which validates the usefulness of this approach for studying the response of lake trophic state to historical and future changes in land use and climate. For instance, the data show the expected increase in the magnitude of phytoplankton blooms (expressed by frequency, surface area, or intensity) from north to south and east to west, as well as an earlier onset date in the southern and western parts of the studied region. These spatial trends had been observed over the last decades by water quality monitoring services (MELCC) [66
], with blooms typically located in sectors of highly developed areas.
A particularly interesting result is the temporal increase of blooms observed between 2000 and 2016 in Quebec lakes, invalidating the hypothesis that the increasing trend would simply be related to the greater attention given to the phenomenon. This trend has also been observed by Winter et al. [69
] from 1994 to 2009 on lakes of Ontario, and Ho et al. [42
] over a 30-year period on 71 large lakes across the planet. The recent review by Huisman et al. [70
] describes in particular the overall increase in frequency, intensity, and duration of cyanobacterial blooms observed on lakes globally. For example, this was demonstrated from the analysis of cyanobacterial pigments in sediment cores from over one hundred lakes of North America and Europe [71
]. Our study made it possible to quantify phenological trends over time and with respect to landscape characteristics, supporting the development of predictive models such as the one put forward by Cremona et al. [17
], who developed a cyanobacterial biomass prediction model with respect to regional climatic variables and hydrological indicators. They found that cyanobacteria biomass will increase from 2% to 10% in future decades.
4.1. Phenological Trends
During the period under study, phytoplankton biomass exceeded the threshold of 10 µg Chl-a L⁻¹ on 15 days per year (May to October) on average for all lakes studied. This result is rather conservative, as only days with cloud cover under 25% of the observable surface area were included in the database, and it can be assumed that additional bloom days occurred on the cloudy days excluded from the count [27
]. The areal extent of the blooms reached 19% of the overall lake surface on average (47% for the 95th percentile). Hence, most bloom events were rather restricted in terms of surface coverage. A bloom is qualified as very limited when the surface area remains under 25% (Sylvie Blais from the government of Quebec, pers. comm.). Our results show that all studied lakes had at least one pixel reaching a biomass of
or above at least once over the studied period, while this proportion drops to 12% of the lakes for blooms (biomass > 10 µg L⁻¹
) covering more than 50% of the lake area. Therefore, high biomass values are often reached, but the problem is seldom extended to the entire lake surface area.
High annual bloom frequencies were mostly observed on lakes located in highly developed sectors with heavy urbanization and agricultural use. Land use (and its associated inputs of nutrients) appears to play a major role in algal bloom phenology; this relationship was supported by the CCA, which showed that the relative proportion of urban areas and population ecumene were significant in controlling the frequency, intensity, and onset date. Comparable results were obtained by Weber et al. [72
], who showed a significant relationship between percent forest and cyanobacteria cell densities for 771 waterbodies in Georgia, USA. The bloom areal extent also varied spatially, with a significant increase between longitudes 74° W and 75° W, a corridor with intense seasonal resort operations stretching southwest of Montreal. However, this regional increase was partly caused by the phenology of two fluvial lakes (see Appendix D
), Lake Des-Deux-Montagnes and Lake Saint-François, showing extended blooms (on average 35% of the lake area) of low biomass (median values below 15 µg L⁻¹
for days exceeding the threshold). Nevertheless, this spatial trend in bloom areal extent still existed when removing fluvial lakes from the database.
A 23% increase in bloom frequency was observed between 2000 and 2016 (all lakes). It was mostly observed on lakes with moderate to high frequencies (i.e., between 15 and 28 days of blooms per year on average). While the median areal extent did not increase significantly over the studied period, events covering a large fraction of the lake (95th percentile) showed a significant increase over the years. Hence, lakes with large and frequent blooms, for instance Missisquoi Bay north of Lake Champlain and Lake Brome, are the ones most clearly showing a rising trend over the studied period. These two lakes have been widely studied due to the intensity of the blooms occurring there and the resulting socio-economic issues, for example regarding drinking water quality [10
]. The rising importance of blooms on these lakes, often composed of buoyant cyanobacteria, has been associated with land-use changes (nutrient inputs) and increasing water temperature [76
The first blooms occurred 3 days earlier in 2016 than in 2000 on average over the studied territory, although the accuracy of onset date estimates is influenced by the amount of missing data. Other studies have shown bloom onset date becoming earlier over time on Lake Taihu in China (subtropical climate), but at a much faster rate (~10 days earlier per year between 1998 and 2009, [38
]). Interestingly, our results show that the onset date advanced the most (by almost 2 weeks) on lakes located in the northern part of the studied territory, although this specific trend is not significant. Direct impacts of increased air temperatures include an extended summer season, higher surface water temperatures, and intensified thermal stratification of lacustrine environments [78
]. These conditions can stimulate the growth of phytoplankton communities in eutrophic environments [79
] and particularly of cyanobacteria [70
]. Therefore, these effects may be more apparent in northern regions where other anthropogenic factors are not acting, although the observed trends need to be substantiated.
4.2. Links to Climate and Environmental Physiography
It is not simple to identify the causes of the rising trend in bloom frequency in a context where population expansion occurs in parallel with global warming. Partitioning the causes will require the development of a regional model that allows various scenarios to be tested for a given water body with respect to the specific conditions prevailing in its watershed. The spatiotemporal trends in bloom onset date, development, maintenance, and decline observed on the studied territory are linked to the lake's morphological characteristics, the watershed's physiographic characteristics, and the prevailing climatic conditions, as shown by the canonical analysis, which highlighted the characteristics underpinning the regional dynamics of phytoplankton blooms.
The frequency and onset date of blooms are the phenological variables most strongly linked to the environmental characteristics prevailing on the lakes and their watersheds. The most significant environmental variables are the lake area, watershed area, settlement, and degree-days. Ultimately, the input of nutrients to lakes will be larger in the urbanized and agricultural sectors in the southwest of the territory, where lakes are more exposed to non-point source pollution (typically related to agriculture) and point source pollution (e.g., sewage systems, domestic or industrial wastes). On the other hand, since the studied territory covers about 600,000 km², climate is very likely to play a role in phytoplankton bloom phenology. For example, across the studied territory, there is a difference of 37 days in the onset date of phytoplankton blooms. Zhang et al. [38
] showed that the onset date and duration of blooms are strongly related to climate (temperature, sunshine hours, and global radiation). The southwestern part of the territory therefore offers more favorable conditions for phytoplankton growth, particularly when nutrients are abundant (Huisman et al. [70
] and references therein).
Results indicate that the urbanized area is the physiographic variable that best explains bloom frequency and intensity. This relationship has been noted in other studies, which mention that key forcing factors for the development of blooms include modifications resulting from anthropogenic activities such as contaminants from effluent and stormwater discharges, natural resource extraction, and agricultural runoff [80
]. Interestingly, the CCA results indicate that urbanization (settlement and population ecumene) better explains the spatiotemporal variability of phytoplankton blooms than agricultural variables (cropland and agriculture ecumene), but both variables are linked on this relative occupancy scale. In the studied region, urban area varied extensively (0–64% of the lake's drainage basin), as did farming area (0–55%). Farming has often been raised as a major controlling factor on bloom phenology through its influence on P and N loads.
The impact of lake surface area on frequency, intensity, and onset date revealed in the present study (larger lakes presenting higher frequency, intensity, and earlier onset date) has yet to be explained, although lake morphology is of definite importance in the development of cyanobacteria. For instance, dominance of filamentous species is observed in shallow lakes while colony-forming species dominate deeper lakes [82
]. Others have also shown the influence of hydrologic retention time on the establishment of blooms and their composition [83
]. Since larger lakes generally tend to have longer retention times, this factor could explain the relationships revealed by the CCA. However, we cannot exclude that the frequency of events could increase with the observable lake surface area (by the remote sensor), increasing the probability of detecting a bloom.
Degree-days is the climatic variable best explaining bloom phenology, followed by total annual precipitation. However, annual or seasonal water temperatures (Table 3
and Appendix C
) do not show any clear relationships with the phenological variables. Although several temperature descriptors have often been related to phytoplankton growth, including atmospheric temperature [19
], water temperature [85
], hours of sunlight [38
], and degree-days [86
], it is the latter descriptor that most directly impacts the growth of ectothermic organisms [88
]. For instance, Ralston et al. [87
] used degree-days to assess inter-annual variability in the onset date of algal blooms, their development and date of decline in the Nauset estuary, and proposed this metric as an efficient warning indicator. Trombetta et al. [89
] also pointed out that water temperature is a key factor controlling the phenology and community structure of phytoplankton blooms in a Mediterranean shallow coastal system. The recent study by Ho and Michalak [90
], exploiting over twelve hundred summertime lake observations from across the continental U.S., showed that summer temperature drives total phytoplankton abundance, while the length of summer is linked to cyanobacterial abundance. The impact of precipitation on bloom frequency, intensity, and onset date suggested by the CCA of the present study is also discussed by Ho and Michalak [90
], who note the effect of increased nutrient runoff on bloom development, while precipitation could also raise flushing rates and slow down growth, confounding the relationships.