3.1. Model Performances for 1–10 Days
We first ask how each model performs against the observations at each grid point. This information helps identify each model's regional skill. For this purpose, a sample of precipitation forecast skill, expressed as minimum absolute error, is illustrated in Figure 1a–j for days 1 through 10. At each grid point of the domain, model performance is evaluated, and the member model carrying the smallest absolute error among all models is shown for days 1 through 10. A few noteworthy observations can be made from the spatial distribution in panels (a) through (j): the best model, shown in orange, is the UKMO model, which outperformed all other operational models. Over the ocean, the UKMO, NCEP, ECMWF, and CPTEC models perform well for all days, whereas the performance of the CMA and CMC models is inferior. Another critical point drawn from the figure is that no single model color covers the entire domain; rather, every model shows local significance in terms of the best absolute error.
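The per-gridpoint selection behind such a best-model map can be sketched as follows (a minimal illustration with synthetic fields; the grid size and random data are assumptions, not the study's actual data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs: 6 member models on a small lat-lon grid.
models = ["ECMWF", "UKMO", "NCEP", "CMA", "CMC", "CPTEC"]
forecasts = rng.gamma(2.0, 4.0, size=(6, 40, 40))   # model precipitation (mm/day)
observed = rng.gamma(2.0, 4.0, size=(40, 40))       # analysis field (mm/day)

# Absolute error of each model at each grid point, then the index of the
# model with the minimum absolute error (the "best model" map of Figure 1).
abs_err = np.abs(forecasts - observed)
best_idx = np.argmin(abs_err, axis=0)               # (40, 40) map of model indices

# Fraction of the domain where each model is best: no model covers it all.
for i, name in enumerate(models):
    print(f"{name}: best at {(best_idx == i).mean():.1%} of grid points")
```

Plotting `best_idx` with one color per model index reproduces the kind of map shown in Figure 1a–j.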
This also shows that it is unwise to rely on forecasts from a single model. Extracting useful information from both the best- and worst-performing operational models is essential and is the first step toward achieving the best forecast from currently available resources. To this end, it is vital to extract useful information from the history of the forecast data and to correct the estimates based on that information.
3.2. Stability of Coefficients
SE is one such technique, based on least-squares minimization of errors from the anomaly correlation matrix. It assigns unequal coefficients to each model at each grid point based on the model's performance history during the training phase, and it performs better than equal-coefficient criteria [24]. In the multimodel SE technique, coefficients can be positive or negative, which is one advantage over the ensemble mean, where coefficients are positive only. It therefore becomes essential to have an SE forecast: a single consensus forecast built from the best available operational models. This is done by comparing the model forecasts with TRMM analysis fields at each grid point, with the primary objective of developing statistics based on a least-squares fit. The technique assigns coefficients to the models derived from their past performance, awarding positive, negative, and even fractional coefficients, thereby limiting model overfitting, and it evaluates model biases geographically. We carried out a stability analysis of the regression coefficients against the length of the training period to obtain optimal coefficient values. Out of 732 days of JJAS ((30 + 31 + 31 + 30) × 6), the first 600 days were used in the training phase to calculate stable statistical coefficients. A safe selection was made by using the projection that satisfies the stability criteria on the spatially averaged curve in Figure 2a. While robust results require a large number of cases, a minimum of 90 training days was used in [24], where only a restricted number of cases was available for the training phase.
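The SE construction at a single grid point can be sketched as a least-squares fit of observed anomalies on the member-model anomalies over the training period (a minimal sketch with synthetic data; the array shapes and the underlying weights are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical training data at ONE grid point: 600 training days, 6 models.
n_days, n_models = 600, 6
fcst = rng.gamma(2.0, 4.0, size=(n_days, n_models))      # member forecasts (mm/day)
obs = fcst @ np.array([0.4, 0.3, 0.2, 0.1, -0.1, 0.05]) + rng.normal(0, 1, n_days)

# SE regression on anomalies: remove the training-period mean of each model
# and of the observations, then fit least-squares coefficients a_i in
#   obs' = sum_i a_i * fcst'_i   (primes denote anomalies).
fcst_anom = fcst - fcst.mean(axis=0)
obs_anom = obs - obs.mean()
coeffs, *_ = np.linalg.lstsq(fcst_anom, obs_anom, rcond=None)

# Forecast-phase SE value for a new day: observed climatology plus the
# weighted sum of model anomalies. Coefficients may be positive or negative.
new_fcst = rng.gamma(2.0, 4.0, size=n_models)
se_forecast = obs.mean() + (new_fcst - fcst.mean(axis=0)) @ coeffs
print(coeffs, se_forecast)
```

In practice this fit is repeated independently at every grid point, so the coefficients vary geographically.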
Figure 2a shows the domain-averaged regression coefficients of the six member models (day-1 forecasts). The abscissa indicates the number of training days, increased cumulatively in steps of 10 days, and the ordinate shows the corresponding domain-averaged coefficients. The spatially averaged coefficients vary, remaining positive, within the range [0.07–2.2] over the Indian domain, which can also be verified in the spatial distribution of coefficients shown in Figure 2b–g. A number of studies have sought criteria for the temporal stability of regression models given a minimum number of cases. The authors of [24] claim that 90–120 days are sufficient to obtain reliable coefficients, which can also be seen in Figure 2a, where the coefficients begin to stabilize (becoming parallel to the abscissa). Almost all models (except CMA and NCEP) achieve stable values after 130 training cases, and all models do so after 545 days. The use of 600 training days in this study is therefore justified and safe. These coefficients are then passed to the forecast phase and used to construct the SE real-time forecasts for onset dates during the southwest monsoon of 2014.
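The stability check of Figure 2a can be sketched by refitting the coefficients on a cumulatively growing training window and tracking how much they still change between consecutive windows (synthetic single-gridpoint data and the 0.02 stability threshold are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic training series at one grid point (shapes assumed for illustration).
n_days, n_models = 600, 6
fcst = rng.gamma(2.0, 4.0, size=(n_days, n_models))
obs = fcst @ np.array([0.4, 0.3, 0.2, 0.1, -0.1, 0.05]) + rng.normal(0, 1, n_days)

def se_coeffs(f, o):
    """Least-squares SE coefficients on anomalies for one training window."""
    fa = f - f.mean(axis=0)
    oa = o - o.mean()
    return np.linalg.lstsq(fa, oa, rcond=None)[0]

# Grow the training window in steps of 10 days (as in Figure 2a) and track
# the maximum coefficient change between consecutive windows.
windows = list(range(90, n_days + 1, 10))
history = np.array([se_coeffs(fcst[:n], obs[:n]) for n in windows])
drift = np.abs(np.diff(history, axis=0)).max(axis=1)

# Declare stability once the change between consecutive windows is small.
stable_at = next((n for n, d in zip(windows[1:], drift) if d < 0.02), None)
if stable_at is not None:
    print(f"coefficients stable (change < 0.02) after {stable_at} training days")
```

Plotting each column of `history` against `windows` gives curves analogous to Figure 2a, flattening as the coefficients stabilize.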
3.3. Rainfall Variability from 6 Models
The onset of the southwest monsoon over Kerala signals the monsoon’s arrival over the Indian subcontinent and represents the beginning of India’s rainy season. We applied these stable coefficients to the real-time case studies with a forecast lead time of 10 days.
The spatial distribution of precipitation on the onset date, 5 June 2014 (day-1 forecast only), for the observed field, the SE, and the member models is shown in Figure 3.
Figure 3 highlights the inability of the numerical weather prediction models to represent the observed rainfall accurately: they either overestimate or underestimate the rainfall, and the locations of the precipitation maxima do not agree closely with observation. Only the SE and ECMWF captured the event adequately in terms of area. The precipitation skill forecasts demonstrate that the SE achieves higher skill for days 1 through 10 than the best model in the suite, making the SE a handy tool for such real-time events. Two further examples show the spatial distribution of the SE forecasts on the onset date (5 June 2014) and on the withdrawal date (13 September 2014): Figure 4a–j and Figure 5a–j present the SE forecasts, with the corresponding TRMM observations shown in Figure 4k–t and Figure 5k–t, for the Indian summer monsoon over the forecast span of 10 days (day 1 to day 10).
This work was carried out in real time; one example of ten SE precipitation forecasts valid at 12 UTC (5–14 June) for the onset date is shown in Figure 4. After the onset date (5 June 2014), the monsoon's progress over the Indian landmass is evident. Another example, in Figure 5, shows the monsoon withdrawal isochrones (blue lines) for 13 September 2014 from the SE forecasts; the SE forecast shows a systematic withdrawal of the southwest monsoon.
Prediction of the onset day of the Indian summer monsoon rainfall remains an important topic for monsoon meteorologists. Figure 6 shows the RMSE and correlation on 5 June 2014 up to day 10. The SE carries the lowest RMSE and the highest correlation from day 1 to day 10: compared with the six member models, the SE has the lowest RMSE of 8.5 mm/day and the highest correlation of 0.49.
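The two scores plotted in Figure 6 can be computed from a forecast field and the corresponding observed field as follows (a minimal sketch with synthetic fields; the grid size and error magnitude are assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical day-1 fields: observation and a forecast with random error.
obs = rng.gamma(2.0, 4.0, size=(40, 40))          # observed rainfall (mm/day)
fcst = obs + rng.normal(0, 3, size=(40, 40))      # forecast rainfall (mm/day)

# Domain RMSE (mm/day) and spatial (pattern) correlation over all grid points.
rmse = np.sqrt(np.mean((fcst - obs) ** 2))
corr = np.corrcoef(fcst.ravel(), obs.ravel())[0, 1]
print(f"RMSE = {rmse:.2f} mm/day, correlation = {corr:.2f}")
```

Repeating this for each lead time (day 1 through day 10) yields the curves of Figure 6.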
We have shown the qualitative effect of the spatial distribution of rainfall, which we further demonstrate quantitatively in the form of systematic errors, root mean square errors, and standard deviations relative to the observations. All these quantitative measures can be combined in a Taylor diagram. Figure 7a–d shows skills based on the Taylor diagram [25] for case studies starting on 1 June 2014, 5 June 2014, 1 July 2014, and 10 July 2014. This is a polar diagram: the radial direction carries the normalized standard deviation of the forecasts of all models and the SE (normalized with respect to the observed TRMM rainfall estimates for each day), and the azimuth carries the spatial correlation of the forecasts. Forecasts from day 1 to day 10 are each marked with a single dot labeled with the forecast day number; the color scheme is explained in the figure legend. Normalized standard deviations closer to 1 denote a good forecast, and points closer to the REF point along the azimuth are good forecasts [25]. Figure 7 shows the normalized values of the RMSE, correlation, and standard deviation from day 1 to day 8 of the forecast on four selected days of 2014. The clustering of points near these optimal skills is best seen for the SE, denoted by red dots; the member models show a more extensive spread of errors in comparison. A careful look at the case study starting on 1 June 2014 in Figure 7a shows the clustering of the red points over the ten days: the degradation of skill scores is least for the SE, whereas all other models show a large spread of points, indicating relatively fast decreasing skill with time for the member models. Similar information can be drawn from the other cases in Figure 7b–d. It is also observed that the ECMWF forecasts outperform all other ensemble members, while the CPTEC model forecast shows the lowest skill in terms of Root Mean Square Error (RMSE) and Correlation Coefficient (CC). The overall conclusion from the Taylor diagram is that the SE technique possesses the best skill score for domain-averaged rainfall. The SE shows a standard deviation of ~0.42 mm and a correlation of ~0.49 for 5 June 2014, the onset day of the Indian monsoon. We have developed a real-time package that can forecast rainfall two weeks in advance, consisting of six ensemble members with the option to add more. We also submitted SE forecasts to the USA-based company Weather Predict Consulting Inc. for the complete 2014 season (JJAS).
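The coordinates of a single point on such a Taylor diagram [25] can be sketched as follows (a minimal illustration with synthetic fields; the grid size and the 0.8 damping factor are assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical daily fields: one observation, one forecast.
obs = rng.gamma(2.0, 4.0, size=(40, 40))
fcst = 0.8 * obs + rng.normal(0, 2, size=(40, 40))

# Taylor-diagram coordinates: normalized standard deviation (radius) and
# spatial correlation (azimuth). The REF point sits at radius 1, correlation 1.
sigma_norm = fcst.std() / obs.std()
corr = np.corrcoef(fcst.ravel(), obs.ravel())[0, 1]

# Polar-to-Cartesian position of the point on the diagram.
x = sigma_norm * corr                      # sigma_norm * cos(theta)
y = sigma_norm * np.sqrt(1 - corr ** 2)    # sigma_norm * sin(theta)
print(f"normalized sigma = {sigma_norm:.2f}, correlation = {corr:.2f}")
```

Computing `(x, y)` for each model and each forecast day, then scattering them on one axis, reproduces the layout of Figure 7a–d, where tighter clustering near REF indicates slower skill degradation.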