Article

Evaluation of Genomic Selection Methods for Wheat Quality Traits in Biparental Populations Indicates Inclination towards Parsimonious Solutions

1
Department for Cereal Breeding and Genetics, Agricultural Institute Osijek, Južno predgrađe 17, 31000 Osijek, Croatia
2
Centre of Excellence for Biodiversity and Molecular Plant Breeding (CoE CroP-BioDiv), Svetošimunska cesta 25, 10000 Zagreb, Croatia
3
Department of Plant Breeding, Genetics and Biometrics, Faculty of Agriculture, University of Zagreb, Svetošimunska cesta 25, 10000 Zagreb, Croatia
4
Department for Maize Breeding and Genetics, Agricultural Institute Osijek, Južno predgrađe 17, 31000 Osijek, Croatia
*
Author to whom correspondence should be addressed.
Submission received: 25 February 2022 / Revised: 2 May 2022 / Accepted: 5 May 2022 / Published: 7 May 2022
(This article belongs to the Section Crop Breeding and Genetics)

Abstract

Breeding for end-use quality traits is often challenging since their assessment requires larger quantities of grain and flour samples, which are usually not available early in the breeding process. Using the mixograph as a fast and effective method of evaluating dough quality together with genomic selection (GS) can help in pre-selecting high-performing progenies earlier in the breeding process and achieve a higher gain per unit of time and cost. In the present study, the potential of GS to predict seven end-use quality traits, including mixograph traits, in two biparental wheat populations was investigated. Field trials with both populations were conducted at two locations in Croatia (Osijek and Slavonski Brod) over three years. Results showed that the size of the training population (TP) plays an important role in achieving higher prediction accuracies, while marker density is not a major limitation. Additionally, results of the present study did not support the optimization of TP based on phenotypic variance as a tool to increase prediction accuracy. The performance of eight prediction models was compared and among them elastic net showed the lowest prediction accuracy for all traits. Bayesian models provided slightly higher prediction accuracy than the ridge regression best linear unbiased prediction (RR-BLUP) model, which is negligible considering the time required to perform an analysis. Although RR-BLUP was not the best performing model in all cases, no advantage of using any other model studied here was observed. Furthermore, strong differences between environments in terms of the prediction accuracy achieved were observed, suggesting that environments that are less predictive should be removed from the dataset used to train the prediction model. The prediction accuracies obtained in this study support implementation of GS in wheat breeding for end-use quality, including some mixograph traits.

1. Introduction

The importance of wheat (Triticum aestivum L.) is underlined by the fact that wheat products are the most important source of dietary proteins and energy supply for humankind [1,2]. Therefore, suitable wheat quality is of great importance. Many traits have been identified to determine wheat quality, namely grain protein content (GPC), wet gluten content (WGC), gluten quality, grain hardness, test weight (TW), etc. [3]. High wheat quality is associated with high GPC, while TW is often used as an indicator of flour yield. On the other hand, the baking quality of wheat is mainly influenced by gluten content and, more importantly, its composition and quality [4]. However, breeding for end-use quality traits, and especially baking quality traits, is often challenging because larger quantities of grain and flour samples are needed, which are usually not available early in the breeding process. The mixograph allows for the precise evaluation of flour quality using a relatively small sample (2–35 g), making it ideal for plant breeding, especially in the early breeding generations. It is a dough mixer that rapidly develops a dough sample and establishes its rheological profile, which provides information on gluten quality and dough strength, optimum development time, etc. [5].
Reduced costs and the introduction of novel genotyping technologies have enabled high-density genotyping and increased the use of molecular markers in plant breeding. Because phenotyping is often very time consuming, breeders are increasingly turning to alternative breeding approaches to reduce the need for phenotyping and speed up the selection process. Phenotyping for end-use and baking quality of wheat is not only time consuming but also costly. For an equal number of lines, phenotyping for end-use quality and processing traits can be up to fifty times more expensive than high-density genotyping-by-sequencing (GBS), which is widely available today [6]. Therefore, approaches based on molecular markers are increasingly used in plant breeding, including breeding for wheat quality. One of the widely used methods of marker-based selection is genomic selection (GS), which was first proposed as a promising breeding strategy in 2001 when Meuwissen et al. [7] revealed that high-density markers can be used to estimate the breeding values of non-phenotyped genotypes. In GS, the training population (TP) is used to evaluate marker effects, followed by model validation in a related (preselected) validation population (VP), while selection is performed in a test or a target breeding population that contains candidate genotypes for which phenotypic data are not available. Marker effects estimated by predictive statistical models using genotypic and phenotypic data of the TP are used to calculate the genomic estimated breeding value (GEBV) of the selection candidates. GS helps to reduce the duration of a breeding cycle, improves selection accuracy, and allows more effective use of genetic diversity to increase genetic gain in breeding programs [8,9]. It allows for the selection of lines earlier in the breeding cycle, thereby reducing the potential cost of phenotyping [10]. 
Since GS takes into account all available markers without pre-selecting them, it has been reported to be particularly suitable for predicting polygenic traits, the expression of which is influenced by a large number of low-effect loci, such as the end-use quality traits of wheat [11].
The first step towards the successful implementation of GS in practical breeding programs is the correct adjustment of the parameters that can affect prediction accuracy. An overview of these parameters has been given elsewhere [12,13,14,15], and, as reported, the interrelatedness of the population structure, the TP size, and the marker density plays the most important role [16,17]. When designing the TP, the VP must be taken into account, i.e., the TP should be designed in accordance with the desired outcomes in the VP [18,19]. To achieve acceptable prediction accuracies, the TP should be highly related to the VP or contain genotypes that are related to the genotypes present in the VP [20,21]. Prediction accuracy increases with the size of the TP as well as with the marker density until it reaches a plateau [22,23,24]. The more closely related the TP and VP are, the smaller the TP and marker density required to reach the plateau of prediction accuracy [25,26]. In addition, the extent of linkage disequilibrium (LD) affects the number of markers required to reach a given level of GS prediction accuracy [9]. The extensive LD between quantitative trait loci (QTL) and markers in highly related populations, such as biparental populations, ensures that more than one marker accompanies each QTL. Consequently, a lower marker density is required to reach a plateau of prediction accuracy when GS is applied in biparental populations [27]. Optimization of the TP for GS has been shown to be important in achieving higher prediction accuracy. Many sampling algorithms have been proposed for TP optimization, namely random sampling, stratified sampling, sampling based on coefficient of determination (CD) mean or predictor error variance (PEV) mean, etc. [28,29]. Isidro et al. [28] showed that the strategy to optimize the TP depends on the population structure. 
According to their results, for structured populations, it is preferable to build a TP with the largest phenotypic variance to achieve high prediction accuracy. Using a biparental population, Marulanda et al. [30] demonstrated that optimization strategies based on genetic properties of the population do not lead to an increase in prediction accuracy. It was shown that only phenotypic variance in the TP is related to prediction accuracy. Since phenotyping is currently the most expensive phase of GS, determining the optimal size of the TP and the potential need for its optimization is critical to reduce the need for phenotyping. One of the key elements for the successful and cost-effective implementation of GS in plant breeding programs is achieving the desired prediction accuracy in conjunction with effective resource allocation.
Different prediction models have been developed to address the problem of high-dimensional data sets in GS, which arises from the large amount of data collected in high-throughput genotyping. Most of the differences between prediction models relate to different assumptions about the distribution and variance of marker effects, i.e., how marker effects contribute to the overall variance of the observed trait [7]. The specific features of the different prediction models have been presented in detail in previous publications [31,32,33,34]. Due to its robustness and reliability of results, ridge regression best linear unbiased prediction (RR-BLUP) is the most widely used prediction model in GS [35]. In addition to the traditionally used models, such as genomic best linear unbiased prediction (G-BLUP), RR-BLUP, and Bayesian alphabet models, machine learning and deep learning approaches have also been applied to GS in recent years. As in other models, the performance of machine and deep learning models has been shown to be trait-dependent. While some studies suggest that approaches based on deep learning methods outperform conventional models in predicting wheat quality traits [36,37,38], other studies have not shown significant improvement in performance [39].
This study investigated the potential of GS to predict seven different end-use quality traits, including mixograph traits, in two recombinant inbred line (RIL) winter wheat populations. Specific objectives included: (1) assessment of the need for TP optimization based on phenotypic variance, (2) identification of the effect of TP size and marker density on prediction accuracy using the RR-BLUP model, and (3) evaluation of the performance of the RR-BLUP model and seven other prediction models, including one machine learning model. The results obtained should provide insights and recommendations for the implementation of GS in wheat breeding for end-use and baking quality.

2. Materials and Methods

2.1. Plant Material, Field Trials, and Phenotyping

Two biparental (RIL) winter wheat populations were used in the present study: Bezostaya-1 × Klara (BK) and Monika × Golubica (MG). The pedigree of the genotypes used is given in Table S4 in the Supplementary Material. After crossing of parental cultivars and selfing, plants were randomly selected up to the F7 generation, which was used for field trials in the growing season of 2008/2009. Originally, the BK population consisted of 145 genotypes and the MG population of 175 genotypes, including parental cultivars. Due to the insufficient quality of samples, some genotypes could not be successfully genotyped, so a total of 139 and 153 RILs were used for this study for the BK and MG populations, respectively. Field trials with both populations were conducted at two locations in Croatia (Osijek and Slavonski Brod) over three years (2009–2011, denoting the year of harvest). Each individual year–location combination represented an environment designated by the following abbreviations: OS09 (Osijek–2009), OS10 (Osijek–2010), OS11 (Osijek–2011), SB09 (Slavonski Brod–2009), SB10 (Slavonski Brod–2010), and SB11 (Slavonski Brod–2011). In each of the six environments, a field trial was set up with two replicates according to a row–column design. Data collected for the BK population in the SB11 environment were discarded due to the low quality of the flour samples. The analysis included seven quality traits, which are listed and described in Table 1. Further details on the selection of parental cultivars, the experimental design, the soil type and weather conditions at the experimental sites, the fertilization rate applied, and the phenotyping procedure are described in a previously published article [40].

2.2. Statistical Analysis and Heritability Estimation

The combined data from the individual trials were used to compute genotypic best linear unbiased estimates (BLUEs) using the mixed model:
$$Y = G + E + G \cdot E + REP \cdot E + ROW \cdot REP \cdot E + COL \cdot REP \cdot E, \tag{1}$$
which included the fixed effects of genotype (G), environment (E), genotype-by-environment interaction (G·E), and replicates within environments (REP·E), as well as the random effects of rows and columns within replicates within environments (ROW·REP·E and COL·REP·E, respectively). The resulting predicted values for all genotype–environment combinations were used as input for all subsequent GS analyses. A more detailed description of the calculation procedure can be found in a previously published paper [40].
Broad-sense heritability ($H^2$) for all traits was calculated as the ratio of total genetic variance to total phenotypic variance. Across-environmental heritability was assessed using the following equation:
$$H^2 = \frac{\sigma_g^2}{\sigma_g^2 + \dfrac{\sigma_{ge}^2}{e} + \dfrac{\sigma_e^2}{er}}, \tag{2}$$
where $\sigma_g^2$ is genotypic variance, $\sigma_{ge}^2$ is genotype-by-environment interaction (GEI) variance, and $\sigma_e^2$ is the error variance component, while $e$ and $r$ represent the number of environments and the number of replicates per environment, respectively. For the assessment of heritability across environments, the variance components were calculated using model (1) treating all effects as random.
For the assessment of within environment repeatability, each environment was analyzed separately using the model:
$$Y = G + REP + ROW \cdot REP + COL \cdot REP, \tag{3}$$
which includes the random effects of genotype (G), replicate (REP), and the effects of rows and columns within replicates (ROW·REP and COL·REP, respectively). The variance components thus obtained were used to calculate within-environment repeatability according to the following equation:
$$H^2 = \frac{\sigma_g^2}{\sigma_g^2 + \dfrac{\sigma_e^2}{r}}, \tag{4}$$
which is a simplified version of Equation (2) that only takes $\sigma_g^2$ as the genotypic and $\sigma_e^2$ as the error variance component and $r$ as the number of replicates per environment. Statistical analysis and heritability/repeatability estimations were performed within the R environment [42] using the commercial package “asreml” [43] and the free add-on package “asremlPlus” [44].
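As an illustration of how the heritability and repeatability formulas relate, the following minimal Python sketch evaluates both from variance components. The function name and the numeric variance components are hypothetical and chosen only for the example; in the study the components were estimated with "asreml" in R, which this sketch does not reproduce.

```python
def broad_sense_heritability(var_g, var_e, n_rep, var_ge=0.0, n_env=1):
    """Ratio of genotypic to phenotypic variance on an entry-mean basis.

    With var_ge=0 and n_env=1 this reduces to the within-environment
    repeatability; otherwise it is the across-environment heritability.
    """
    if min(var_g, var_ge, var_e) < 0:
        raise ValueError("variance components must be non-negative")
    phenotypic = var_g + var_ge / n_env + var_e / (n_env * n_rep)
    return var_g / phenotypic


# Hypothetical variance components (not taken from the study).
h2_across = broad_sense_heritability(var_g=1.0, var_e=1.2, n_rep=2,
                                     var_ge=0.5, n_env=6)
rep_within = broad_sense_heritability(var_g=1.0, var_e=1.2, n_rep=2)
print(round(h2_across, 3), round(rep_within, 3))
```

Using one function for both quantities makes explicit that repeatability is the special case of a single environment without a GEI term.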

2.3. Genotypic Characterization

For genotyping, all genotypes (F7 generation) were grown in a climate chamber. Leaf tissue samples were collected at the 4–5 leaf stage and used for the isolation of genomic DNA. The collected samples were frozen and subjected to a lyophilization procedure at a temperature of −50 °C and a pressure of 0.1 millibar. The samples prepared in this way were ground in a Retsch Schwing mill using Eppendorf tubes and metal beads before DNA isolation. The further procedure for DNA isolation followed the protocol described in Karp et al. [45] for the isolation of DNA from plant material. The concentration and purity of the isolated DNA were determined spectrophotometrically using an Eppendorf BioPhotometer instrument. After DNA quantification, the isolated DNA was diluted to the optimal concentration for subsequent analysis (50–100 ng/µL). DNA samples were sent to Diversity Arrays Technology (DArT) at the University of Canberra, Bruce, Australia for sequencing. For the purposes of this study, DArT SNP markers were used. The initial SNP call provided 4874 and 7192 markers for the BK and MG populations, respectively. After excluding markers with incomplete chromosomal position data, heterozygosity greater than 30%, or minor allele frequency (MAF) <0.05, missing data were imputed using Beagle software, version 5.1 [46]. The final dataset used for GS contained 1087 (BK population) and 2231 (MG population) filtered and imputed SNPs. Figure S1 in the Supplementary Materials shows the distribution of SNPs on each chromosome and genetic map relative to the population.
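The heterozygosity and MAF filters described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: it assumes a lines × markers matrix coded 0/1/2 (copies of one allele) with NaN for missing calls, assumes every marker has at least one call, and omits both the chromosomal-position filter and the Beagle imputation step. The function name and the toy matrix are hypothetical.

```python
import numpy as np


def filter_markers(geno, maf_min=0.05, het_max=0.30):
    """Return column indices of markers passing the MAF and heterozygosity filters.

    geno: 2-D float array (lines x markers), coded 0/1/2, np.nan = missing.
    """
    n_called = (~np.isnan(geno)).sum(axis=0)           # non-missing calls per marker
    p = np.nansum(geno, axis=0) / (2.0 * n_called)     # frequency of the counted allele
    maf = np.minimum(p, 1.0 - p)                       # minor allele frequency
    het = (geno == 1).sum(axis=0) / n_called           # share of heterozygous calls
    keep = (maf >= maf_min) & (het <= het_max)
    return np.flatnonzero(keep)


# Toy matrix: marker 1 is monomorphic, marker 2 is mostly heterozygous.
geno = np.array([
    [0, 0, 1, 2],
    [2, 0, 1, 0],
    [0, 0, 1, 2],
    [2, 0, 0, np.nan],
], dtype=float)
kept = filter_markers(geno)
print(kept)
```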

2.4. Cross-Validation Strategies for Assessment of Different Genomic Selection Problems

The first phase of the GS analysis aimed to determine whether the TP needed to be optimized based on phenotypic variance, what size of TP was required, and to investigate the influence of marker density, all with the goal of improving prediction accuracy. Because it is robust and less computationally intensive compared to other models, only the RR-BLUP model was used to estimate the marker effects at this stage of the analysis.
The RR-BLUP is a parametric prediction model which can be represented by the following equation in matrix notation:
$$y = WGu + e, \tag{5}$$
where $y$ is a vector of phenotypic values, $W$ and $G$ are the design and the genotype matrix, respectively, $u$ is a vector of marker effects which follow a normal distribution with a common variance, $u \sim N(0, I\sigma_u^2)$, and the residual error is represented by $e \sim N(0, I\sigma_e^2)$. The BLUP solution for the calculation of marker effects can be written as:
$$\hat{u} = (Z^{T}Z + \lambda I)^{-1} Z^{T} y, \tag{6}$$
where $\lambda$ is a ridge regression parameter calculated as the ratio of residual and marker variances ($\sigma_e^2/\sigma_u^2$), $I$ is an identity matrix, and $Z = WG$. The same penalty parameter is applied to all marker effects, causing an equal shrinkage towards zero regardless of the size of the marker effect. This model uses restricted maximum likelihood (REML) to estimate the variance components needed for marker effect estimation [35].
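The BLUP solution for the marker effects can be illustrated with a few lines of NumPy. This is a sketch under stated assumptions, not the BWGS implementation: the ridge parameter `lam` is supplied by hand rather than derived from REML variance estimates, markers and phenotypes are mean-centered in place of an explicit intercept, and the data are simulated.

```python
import numpy as np


def rr_blup_marker_effects(Z, y, lam):
    """Solve (Z'Z + lam * I) u = Z'y for the marker effects u.

    Z: lines x markers matrix, y: phenotype vector, lam: sigma_e^2 / sigma_u^2.
    """
    Zc = Z - Z.mean(axis=0)              # center markers (assumption of this sketch)
    yc = y - y.mean()                    # center phenotypes instead of fitting a mean
    m = Z.shape[1]
    lhs = Zc.T @ Zc + lam * np.eye(m)
    return np.linalg.solve(lhs, Zc.T @ yc)


# Simulated data: 20 lines, 50 markers (more markers than lines).
rng = np.random.default_rng(0)
Z = rng.integers(0, 3, size=(20, 50)).astype(float)
y = rng.normal(size=20)
u_small = rr_blup_marker_effects(Z, y, lam=1.0)
u_large = rr_blup_marker_effects(Z, y, lam=100.0)
# A larger penalty shrinks all effects more strongly towards zero.
print(np.linalg.norm(u_small), np.linalg.norm(u_large))
```

Because the penalty term makes the left-hand side invertible, the system has a unique solution even when the number of markers exceeds the number of lines, which is the usual situation in GS.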
GS analysis using RR-BLUP was evaluated for both populations and all quality traits separately in each available environment. Prediction accuracy was estimated using 100 independent (nTimes) 10-fold cross-validations. The Pearson correlation between GEBVs and actual phenotypic values was calculated for each of the nTimes replicates, and the final prediction accuracy was expressed as the mean value over nTimes. The mean-squared error of prediction (MSEP) was reported as the mean over nTimes as a criterion for the quality of the model. The RR-BLUP model was implemented within the R environment [42], using the “BWGS” pipeline [47] and the “glmnet” package [48].
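The repeated cross-validation scheme can be sketched as below. The sketch makes several assumptions that are not stated in the text: predictions from the ten folds are pooled before the correlation is computed within each replicate, the ridge parameter is fixed rather than estimated, and the data are simulated. The function names are hypothetical and do not correspond to BWGS functions.

```python
import numpy as np


def ridge_fit_predict(Z_tr, y_tr, Z_te, lam):
    """Fit ridge marker effects on the training fold and predict the test fold."""
    mu_z, mu_y = Z_tr.mean(axis=0), y_tr.mean()
    Zc = Z_tr - mu_z
    u = np.linalg.solve(Zc.T @ Zc + lam * np.eye(Z_tr.shape[1]), Zc.T @ (y_tr - mu_y))
    return mu_y + (Z_te - mu_z) @ u


def repeated_kfold_accuracy(Z, y, lam=1.0, n_times=100, k=10, seed=0):
    """Mean Pearson correlation and mean MSEP over n_times independent k-fold CVs."""
    rng = np.random.default_rng(seed)
    n = len(y)
    accs, mseps = [], []
    for _ in range(n_times):
        folds = np.array_split(rng.permutation(n), k)   # new random partition each time
        gebv = np.empty(n)
        for test_idx in folds:
            train_idx = np.setdiff1d(np.arange(n), test_idx)
            gebv[test_idx] = ridge_fit_predict(Z[train_idx], y[train_idx],
                                               Z[test_idx], lam)
        accs.append(np.corrcoef(gebv, y)[0, 1])         # accuracy of this replicate
        mseps.append(np.mean((gebv - y) ** 2))          # MSEP of this replicate
    return float(np.mean(accs)), float(np.mean(mseps))


# Simulated example: 80 lines, 30 markers, strong genetic signal.
rng = np.random.default_rng(42)
Z = rng.integers(0, 2, size=(80, 30)).astype(float) * 2.0
y = Z @ rng.normal(size=30) + rng.normal(scale=0.5, size=80)
acc, msep = repeated_kfold_accuracy(Z, y, lam=1.0, n_times=5, k=10)
print(acc, msep)
```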

2.4.1. Effect of Training Population Phenotypic Variance on Prediction Accuracy

For each randomly selected TP, phenotypic variance was calculated for 100 independent 10-fold cross-validations to determine if there was a correlation between the phenotypic variance of the TP and the resulting prediction accuracy. The TP sizes were set to 25, 50, and 75 lines for both BK and MG populations (representing percentages of approximately 15, 35, and 50 of the total number of lines in the population). The aim of this step of the analysis was to evaluate whether phenotypic variance should be taken into account when optimizing the TP to achieve higher prediction accuracy. Because no strong correlation was found between phenotypic variance and prediction accuracy, all subsequent analyses were conducted using only randomly selected TP.

2.4.2. Effect of Marker Density on Prediction Accuracy

The final dataset used for GS for the MG population contained twice as many SNPs (2231) compared to the BK population (1087). To determine whether the differences in prediction accuracy between populations reflected differences in marker density, an additional subset of marker data was generated for the MG population. To avoid potential bias from a completely random subset of the data, marker pruning was performed based on LD. For each pair of markers with coefficient of LD greater than 0.9, only one SNP was left in the final data set. This resulted in a subset of markers that contained 1123 SNPs, which was used to approximate the number of SNPs in the full marker dataset for the BK population.
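One way to implement such LD-based pruning is a greedy pass over the markers, sketched below. The text does not state which LD coefficient was used or in what order the markers were visited, so this sketch assumes the squared Pearson correlation of genotype codes as the LD measure, visits markers in column order, and assumes monomorphic markers have already been removed. The function name and the toy data are hypothetical.

```python
import numpy as np


def ld_prune(geno, threshold=0.9):
    """Greedy pruning: keep a marker only if its r^2 with every kept marker <= threshold.

    geno: lines x markers matrix with no missing values and no monomorphic columns.
    """
    r2 = np.corrcoef(geno, rowvar=False) ** 2   # pairwise squared correlations
    kept = []
    for j in range(geno.shape[1]):
        if all(r2[j, i] <= threshold for i in kept):
            kept.append(j)
    return kept


# Toy data: marker 1 duplicates marker 0, marker 3 is its mirror image.
geno = np.array([
    [0, 0, 0, 2],
    [0, 0, 2, 2],
    [2, 2, 0, 0],
    [2, 2, 2, 0],
    [0, 0, 2, 2],
    [2, 2, 0, 0],
], dtype=float)
pruned = ld_prune(geno)
print(pruned)
```

Compared with drawing a random subset, this keeps one representative per group of tightly linked markers, so the reduced set still covers the same genomic regions.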

2.4.3. Effect of Training Population Size on Prediction Accuracy

Three different TP sizes were used to investigate the effect of TP size on prediction accuracy. The population size corresponding to 50, 65, and 80% of the total number of lines was used as the TP, while the remaining lines (50, 35, and 20%) served as the VP. These percentages correspond to 70, 90, and 111 lines for the BK population and 77, 99, and 122 lines for the MG population. In each scenario, the TP was randomly selected for all traits. This procedure was performed for the BK population and the MG population using both the full and reduced marker datasets.
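The line counts follow from the population sizes given in Section 2.1 (139 BK and 153 MG RILs). The short check below reproduces them under the assumption that fractional sizes were rounded to the nearest integer with halves rounded up; the rounding rule itself is not stated in the text.

```python
def tp_size(n_lines, percent):
    """Training population size, rounded half up using integer arithmetic."""
    return (n_lines * percent + 50) // 100


populations = {"BK": 139, "MG": 153}
tp_sizes = {name: [tp_size(n, pct) for pct in (50, 65, 80)]
            for name, n in populations.items()}
vp_sizes = {name: [populations[name] - tp for tp in sizes]
            for name, sizes in tp_sizes.items()}
print(tp_sizes)
print(vp_sizes)
```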

2.5. Comparison of Genomic Selection Models

The second phase of the analysis compared the performance of different GS models for predicting all seven quality traits examined in this study with the performance of the RR-BLUP model. This part of the analysis was performed only with the MG population, as it had a larger number of markers and lines compared to the BK population. The entire marker data set available for the MG population without reduction was used to estimate marker effects. The TP included 80% of the lines (122 RILs), while the VP included the remaining 20% (31 RILs). In addition to the RR-BLUP model, which is explained in more detail in Section 2.4, five other parametric models and two semi-parametric models were used. The parametric models included elastic net (EN) and four Bayesian models: BayesA (BA), BayesB (BB), BayesC (BC), and BayesLASSO (BL). The semi-parametric models used in this study were random forest (RF) and reproducing kernel Hilbert spaces (RKHS).
All Bayesian regression models can be described using the following equation:
$$y = \mu + \sum_{k=1}^{m} x_k \beta_k + e, \tag{7}$$
where $y$ is a vector of phenotypic values, $\mu$ is the overall mean, $x_k$ is the vector of genotypes for the $k$th marker, $\beta_k$ is the effect of the $k$th marker, $m$ is the number of markers, and $e$ is a vector of residuals with the assumption $e \sim N(0, I\sigma_e^2)$. Bayesian models differ in the prior assumptions of the effects of markers ($\beta_k$), i.e., they assign distinct prior distributions for the estimation of marker effects. In the BA and BB models, $\beta_k$ follows the inverted chi-square distribution and the $\pi$ value determines the probability that the marker has zero effect. For the BA model $\pi = 0$, which assumes that all markers have a non-zero effect [7]. The BB model applies a distribution with point mass at zero, thus allowing for many markers to have a zero effect [7,49]. From a breeder's point of view, it is a more realistic assumption given that certain regions of the genome are not associated with QTL; hence, the effects of some markers would be absent. The BC model assumes that some of the markers ($1 - \pi$) have zero effect, while the rest of them ($\pi$) follow a Gaussian distribution [36]. The BL model represents the L1 regularization norm in a Bayesian framework, to obtain a form of least absolute shrinkage and selection operator (LASSO) regression described by Park and Casella [50]. This model applies a double exponential distribution for the estimation of marker effects and assigns a unique variance to each marker, thus causing stronger shrinkage of regression coefficients closer to zero (markers with small effect) and weaker shrinkage of coefficients with high absolute value (markers with greater effect) [51].
The EN model applies a weighted combination of the penalties used in the RR (L2 regularization of marker effects) and LASSO (L1 regularization of marker effects) methods. It introduces the elastic-net penalty $P_\alpha$, which determines how much weight is given to each of the two methods. The lower the $\alpha$ value, the more closely EN resembles RR ($\alpha = 0$), while EN with an $\alpha$ value closer to 1 behaves more like LASSO ($\alpha = 1$). Therefore, EN can select groups of correlated markers while performing automatic variable selection and continuous shrinkage simultaneously [48,52].
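For reference, the elastic-net penalty can be written out explicitly. The form below follows the parameterization used by the "glmnet" package; the symbols $N$ (number of lines), $\beta_0$ (intercept), and $\lambda$ (overall penalty strength) are introduced here for illustration, and the specific $\alpha$ and $\lambda$ values used in the study are not given in the text.

```latex
\min_{\beta_0,\,\beta}\;
\frac{1}{2N}\sum_{i=1}^{N}\bigl(y_i-\beta_0-x_i^{T}\beta\bigr)^2
+\lambda\,P_\alpha(\beta),
\qquad
P_\alpha(\beta)=\sum_{k=1}^{m}\Bigl[\tfrac{1}{2}(1-\alpha)\,\beta_k^{2}+\alpha\,\lvert\beta_k\rvert\Bigr]
```

Setting $\alpha = 0$ leaves only the squared (ridge) term, and setting $\alpha = 1$ leaves only the absolute-value (LASSO) term.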
The RF is a machine learning model which can be represented by the following equation:
$$\hat{y}_i = \frac{1}{B}\sum_{b=1}^{B} T_b(x_i), \tag{8}$$
where $\hat{y}_i$ is the phenotypic prediction for the genotype $x_i$, $T_b$ is the tree built on the $b$th bootstrap sample, and $B$ represents the number of bootstrap samples (and hence trees). Briefly, the RF method is based on the construction of numerous identically distributed trees. For each tree, an individual prediction is made using the regression model, and the final prediction value represents an average of the outputs from all trees. The bootstrap method is used to draw the subset of training data for the construction of each tree. The splitting at each tree node is carried out in such a way that the loss function is reduced with each bootstrapped sample [53].
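The averaging over bootstrap trees can be illustrated with a deliberately simplified sketch. It is not the "randomForest" algorithm: each tree here is a single-split stump, and the random subset of markers is drawn once per tree rather than at every node. All function names and the simulated data are hypothetical.

```python
import numpy as np


def fit_stump(X, y, features):
    """Find the single split (marker, threshold) with the lowest squared error."""
    best = None
    for j in features:
        for t in np.unique(X[:, j])[:-1]:            # thresholds leaving both sides non-empty
            left = X[:, j] <= t
            sse = (((y[left] - y[left].mean()) ** 2).sum()
                   + ((y[~left] - y[~left].mean()) ** 2).sum())
            if best is None or sse < best[0]:
                best = (sse, j, t, y[left].mean(), y[~left].mean())
    if best is None:                                  # no usable split: predict the mean
        return (None, None, y.mean(), y.mean())
    return best[1:]


def predict_stump(stump, X):
    j, t, left_value, right_value = stump
    if j is None:
        return np.full(X.shape[0], left_value)
    return np.where(X[:, j] <= t, left_value, right_value)


def bagged_predict(X_tr, y_tr, X_te, n_trees=50, n_feat=None, seed=0):
    """Average the predictions of n_trees stumps, each fit on a bootstrap sample."""
    rng = np.random.default_rng(seed)
    n, m = X_tr.shape
    n_feat = n_feat or max(1, m // 3)
    total = np.zeros(X_te.shape[0])
    for _ in range(n_trees):
        rows = rng.integers(0, n, size=n)             # bootstrap sample of lines
        feats = rng.choice(m, size=n_feat, replace=False)
        total += predict_stump(fit_stump(X_tr[rows], y_tr[rows], feats), X_te)
    return total / n_trees


# Simulated data: the trait depends only on marker 0.
rng = np.random.default_rng(1)
X = rng.integers(0, 3, size=(60, 6)).astype(float)
y = np.where(X[:, 0] == 2, 2.0, 0.0)
pred = bagged_predict(X[:40], y[:40], X[40:], n_trees=25, n_feat=6)
print(pred)
```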
The RKHS model carries out the semi-parametric regression on marker genotypes. To control the distribution of marker effects, this model uses genetic distance and a kernel function which is based on the Euclidean measure of marker similarity [54]. Briefly, the covariance structure is constructed using the markers, $\mathrm{Cov}(g_i, g_{i'}) \propto K(x_i, x_{i'})$, where $x_i$ and $x_{i'}$ are vectors of marker genotypes and $K(\cdot,\cdot)$ is a positive definite function, e.g., a reproducing kernel [55].
For each of the models used, 100 independent 10-fold cross-validations were performed. As with the RR-BLUP model, prediction accuracy and MSEP were reported as the average of the cross-validation replicates for each model. The parameters for iterative models (Bayesian models and RKHS) were set to 5000 iterations with a burn-in (number of discarded samples) of 1000 and a thinning of three. This part of the analysis was performed within the R environment [42] using the “BWGS” pipeline [47] and packages “glmnet” [48] (EN model), “BGLR” [56] (Bayesian models and RKHS), and “randomForest” [53] (RF).
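The chain settings determine how many posterior samples contribute to each estimate. The small sketch below counts them under the assumption that a sample is retained when its iteration number exceeds the burn-in and is a multiple of the thinning interval; the exact bookkeeping inside "BGLR" may differ slightly, and the function name is hypothetical.

```python
def retained_samples(n_iter, burn_in, thin):
    """Count post-burn-in iterations kept after thinning."""
    return sum(1 for it in range(burn_in + 1, n_iter + 1) if it % thin == 0)


n_kept = retained_samples(n_iter=5000, burn_in=1000, thin=3)
print(n_kept)
```

Under this assumption, roughly a quarter of the 5000 iterations are used for inference, which is part of why the Bayesian models are slower per retained sample than RR-BLUP.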

3. Results

3.1. Heritability and Repeatability of the Traits

Estimates of heritability for all traits and both populations across environments and repeatability within environments are shown in Table 2. In both observed populations, heritabilities for the GPC, WGC, and TW traits were high, ranging from 0.78 to 0.92. In general, mixograph traits had high heritabilities (≥0.71), but these were somewhat lower compared with the other three quality traits. Only the MPT trait in the BK population had moderate heritability, with a value of 0.45. Comparing the within-environmental repeatabilities between the two populations, it is noticeable that they were equal or slightly higher in the BK population for most trait–environment combinations. In the BK population, repeatabilities were moderate to high, ranging from 0.55 to 0.95 for all trait–environment combinations, except in the SB09 environment where values were mostly moderate, ranging from 0.13 (MPT) to 0.78 (TW). Overall, in the BK population, the highest repeatabilities for all traits were observed in the OS09 environment. In the MG population, repeatabilities for the majority of trait–environment combinations were moderate to high, ranging from 0.52 to 0.96, with the exception of the OS11 environment, where repeatabilities were mostly low, ranging from 0.18 (GPC) to 0.43 (MPT).

3.2. The Influence of Training Population Phenotypic Variance on Prediction Accuracy

Scatter plots showing the relationship between the phenotypic variance of randomly selected TP in three different sizes and the prediction accuracy obtained with the RR-BLUP model for two traits of each population are included in Figure 1 (GPC and TW for the BK population, MTI and MPH for the MG population), and all other plots are included in Figure S2 (BK population) or Figure S3 (MG population) in the Supplementary Materials. The number in the corner of each scatter plot represents the observed correlation coefficient. For some trait–environment combinations, the correlation was nearly zero, regardless of the TP size used (Figure 1d, Figures S2c,d and S3c–e in the Supplementary Materials). Looking at each population–trait–environment combination separately, in some cases a slight decrease in the correlation coefficient was observed along with a shift from positive to negative values as the TP size increased (Figure 1a, Figures S2a and S3a in the Supplementary Materials). In some cases, the correlation coefficient even increased with increasing TP size (Figure 1b: OS11 environment, Figure 1d: OS09 and SB10 environment). In addition, the strength and direction of the correlation varied considerably among the different environments of the same combination of population and trait. In general, the observed correlation coefficients were low (r ≤ 0.35) and no consistent pattern in the strength or direction of correlation was observed for any of the population–trait combinations examined.

3.3. The Influence of Training Population Size and Marker Density on Prediction Accuracy

The boxplots in Figure 2 and Figures S4 and S5 (in the Supplementary Materials) show the influence of the size of TP and marker density (NM) on the prediction accuracy of the RR-BLUP model in each environment tested. MSEP values for all combinations are shown in Table S1 in the Supplementary Materials. Figure 2 includes three traits per population for which prediction accuracy was highest in all or most environments, i.e., GPC, WGC, and TW for the BK population (Figure 2a–c) and MPT, MTW, and MPH for the MG population (Figure 2d–f in the case of NM = 1123, and Figure 2g–i in the case of NM = 2231). Boxplots for all other traits are included in Figure S4 (BK population) and Figure S5 (MG population) in the Supplementary Materials. When comparing the two populations in cases in which approximately the same number of markers was used (NM = 1087 and NM = 1123 for the BK and MG populations, respectively), it is noticeable that the prediction accuracy for the traits GPC, WGC, and TW was higher in the BK population, whereas the mixograph traits showed better predictability in the MG population. An exception is the MTI trait, the predictability of which was higher in the BK population, although it was still low in most environments (<0.3). Reducing the TP size from 80% to 50% of the total number of lines in a population had a negative effect on the achieved prediction accuracy in all observed population–trait–environment combinations. Although the effect was negative, it was not severe, implying that even when 50% of the population is used as the TP, the prediction accuracy can still be moderate to high (Figure 2a–c). It is also noticeable that the prediction accuracy strongly depends on the environment, regardless of the TP size, and it can vary substantially, e.g., the largest difference in prediction accuracy was observed at TP size 80% for the trait TW in the MG population, ranging from 0.06 in the OS11 environment to 0.49 in the OS10 environment.
When comparing the influence of different marker densities on the predictability of quality traits within the MG population (Figure 2d–i and Figure S5 in the Supplementary Materials), it is noticeable that higher values of prediction accuracy were obtained when a higher marker density was used (NM = 2231 compared to NM = 1123) for all trait–environment combinations and all TP sizes used. Nevertheless, the differences found in prediction accuracy were not large. Considering the TP size of 80%, the largest difference in prediction accuracy was found for trait WGC in environment SB11, with values of 0.20 and 0.32 when NM = 1123 and NM = 2231 were used, respectively (Figure S5b,f in the Supplementary Materials). These results suggest that a lower number of markers is already sufficient to achieve good predictability, and that with a higher number of markers a plateau of prediction accuracy may have been reached for this population and the observed traits.

3.4. Performance of Different Prediction Models

Figure 3 shows the mean prediction accuracy for the MG population and the traits (a) GPC, (b) TW, (c) MTW, and (d) MPH obtained with eight different prediction models. Results for the remaining three traits (WGC, MPT, and MTI) are shown in Figure S6 in the Supplementary Materials. In both Figure 3 and Figure S6, the error bars indicate the standard deviation. The mean prediction accuracies over 100 independent 10-fold cross-validations are shown along with the standard deviations in Table S2 in the Supplementary Materials. The highest prediction accuracy achieved for each trait–environment combination is shown in bold. The mean MSEP values for all trait–environment–model combinations are listed in Table S3 in the Supplementary Materials. In general, trait predictability was found to be good in some environments while being low in other environments. For example, the prediction accuracy of TW (Figure 3b) was low (−0.08–0.31) in all environments except environment OS10, where the prediction accuracy was moderate (0.36–0.49) for all models examined, whereas the prediction accuracy of WGC (Figure S6a in the Supplementary Materials) was moderate for most environment–model combinations except environment SB09, where the prediction accuracy was low and even negative for the model EN. However, overall, GPC, WGC, and two mixograph traits (MPT and MTW) showed moderate predictability with prediction accuracies up to 0.57 (Figure 3a,c and Figure S6a,b and Table S2 in the Supplementary Materials). The predictability of TW and the other two mixograph traits (MTI and MPH) was rather low and varied substantially between environments, resulting in negative values of prediction accuracy in some cases (Figure 3b,d and Figure S6c and Table S2 in the Supplementary Materials). When comparing the performance of the different models, it is noticeable that the model with best performance depends strongly on the observed environment and not so much on the trait. 
According to Table S2, the model EN had the lowest values of prediction accuracy for most combinations of traits and environments and was also the model with the highest number of cases with negative prediction accuracy values. In only one case did the EN model achieve the highest prediction accuracy (MPT trait in environment SB10). In 35 of 42 possible trait–environment combinations, the Bayesian alphabet models (BA, BB, and BC) proved to be superior, whereas the BL model performed best in only two cases, followed by the RF model with five and the RKHS model with seven cases. Although the RR-BLUP model performance was superior in only one case (TW trait in OS10 environment), the performance of all other models was not substantially better compared to it. Indeed, the prediction accuracy of the best performing model was on average only 0.05 points higher, with the difference ranging from 0 (in the case of the TW trait in the OS10 environment where the BA and BC models had the same prediction accuracy as RR-BLUP) to 0.14 (in the case of the MPT trait in the OS09 environment, where the superior model was BB with a prediction accuracy of 0.18 compared to 0.04 of the RR-BLUP model). Observed MSEP values (Table S3 in the Supplementary Materials) were relatively low (0.44 or lower) for the majority of the trait–environment combinations with little or no difference among prediction models used. The highest MSEP values were recorded for MTI, ranging from 0.49 to 3.42. One of the biggest differences among the implemented models is their computational efficiency, i.e., the time required for one analysis. In the present study, conducted on a 64-bit Windows 10 workstation with a 2.90 GHz Intel (R) Xeon (R) processor and 32 GB RAM, the least demanding model was EN, which took approximately 19 min to compute a prediction accuracy. It was followed by RR-BLUP, RF, and RKHS with computation times of 38, 71, and 92 min, respectively. The most demanding were the Bayesian models, which required almost 3 h to compute an analysis (166, 162, 161, and 171 min for BA, BB, BC, and BL, respectively).
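The computation times reported above can be put on a common scale with a few lines of arithmetic. The sketch below uses only the values stated in the text, taking the figures given for the Bayesian models to be in minutes, in line with the "almost 3 h" mentioned; the variable names are illustrative.

```python
# Computation times per analysis (minutes), as reported in the text above.
minutes = {"EN": 19, "RR-BLUP": 38, "RF": 71, "RKHS": 92,
           "BA": 166, "BB": 162, "BC": 161, "BL": 171}

baseline = minutes["RR-BLUP"]

# List models from fastest to slowest, in hours and relative to RR-BLUP.
for model, t in sorted(minutes.items(), key=lambda kv: kv[1]):
    print(f"{model:8s} {t:4d} min  {t / 60:4.1f} h  {t / baseline:4.1f}x RR-BLUP")

# Average time of the four Bayesian models.
bayesian_mean = sum(minutes[m] for m in ("BA", "BB", "BC", "BL")) / 4
print(f"Bayesian mean: {bayesian_mean:.1f} min ({bayesian_mean / baseline:.1f}x RR-BLUP)")
```

Set against the average gain of 0.05 in prediction accuracy for the best performing model, this makes the cost side of the trade-off explicit: the Bayesian models take more than four times as long as RR-BLUP per analysis.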

4. Discussion

The present study investigated the potential of the GS approach for predicting seven end-use quality traits, including some rheological properties of the dough obtained with the mixograph. While assessment of dough rheological traits is typically labor-intensive and time-consuming, the mixograph can provide good insight into baking quality using only a small flour sample. When combined with GS, it has the potential to effectively support end-use quality improvement, especially in filial or early wheat breeding generations.

4.1. Heritability

In the present study, the broad-sense heritability estimated across all environments was high (>0.7) for all traits, with the exception of MPT in the BK population, the heritability of which was 0.45 (Table 2). Although repeatability varied considerably within environments, it was high for most of the trait–environment combinations, with a value above 0.7, suggesting that heritability itself should not be a barrier to achieving good prediction accuracy. High heritability generally indicates that genetic factors account for the majority of trait variance, making GS the ideal approach for predicting these traits, as it takes into account all available markers while attempting to capture the total genetic variance [57]. In general, the lowest repeatabilities for the BK and MG populations were obtained in the SB09 and OS11 environments, respectively, indicating the presence of a stronger non-genetic variance effect within each environment. Two recent studies by Sandhu et al. [37,38] reported similar heritabilities for TW, but the reported values for GPC were lower (0.35–0.63) than those obtained in the present study. Lado et al. [58] reported low to moderate heritabilities for WGC, TW, and some mixograph traits, while Hayes et al. [59] showed that the vast majority of dough rheology and baking traits had higher heritabilities compared to grain traits such as GPC and TW. Nevertheless, the heritabilities reported to date for wheat end-use quality traits appear to be sufficient to achieve acceptable prediction accuracy and have not been reported to be a significant limiting factor for GS [60,61,62]. Regardless, several previous studies have shown that predictability can be low in some environments despite high heritability [63], which can be explained by various environmental factors [64]. Additionally, Sandhu et al. [38] have shown that although heritability for a single trait can vary substantially between tested environments, this variation does not significantly affect prediction accuracy.
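For orientation, broad-sense heritability across environments and repeatability within a single environment are often written on an entry-mean basis as shown below. This is a common textbook formulation given here as an assumption for illustration; the estimators actually used in this study are defined in its methods and may differ. The symbols are assumed notation: genotypic variance $\sigma^2_G$, genotype-by-environment variance $\sigma^2_{GE}$, residual variance $\sigma^2_{\varepsilon}$, number of environments $e$, and number of replicates per environment $r$.

```latex
H^2 = \frac{\sigma^2_G}{\sigma^2_G + \dfrac{\sigma^2_{GE}}{e} + \dfrac{\sigma^2_{\varepsilon}}{e\,r}},
\qquad
R = \frac{\sigma^2_G}{\sigma^2_G + \dfrac{\sigma^2_{\varepsilon}}{r}}
```

Under this formulation, a larger residual variance within one environment lowers $R$ for that environment, which matches the interpretation of the low repeatabilities in SB09 and OS11 as a sign of stronger non-genetic variance.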

4.2. Optimization of Training Population and Marker Density

When deciding to include GS in a breeding program, it is essential to take the right steps to optimize the factors that might lead to unnecessarily high costs. Although the cost of genotyping has decreased significantly in recent years, the cost of phenotyping wheat traits for end-use quality has remained relatively high [11]. Therefore, it is important to optimize TP to reduce the potential cost of phenotyping while maintaining the same level of prediction accuracy. Prediction accuracy was significantly affected by the size of the TP in almost all studies that investigated this issue [39,62,65,66,67]. In addition, the relatedness of TP and VP was found to play an important role in choosing the optimal size of TP [68]. The results of this study are consistent with those of previous studies [24,27]. In the present study, the highest prediction accuracy for all traits was achieved when 80% of the dataset was used as TP, i.e., when the TP included 111 and 122 RILs for the BK and MG populations, respectively (Figure 2, Figures S4 and S5 in the Supplementary Materials). On the other hand, the results obtained with half of the dataset as TP (70 and 77 RILs for BK and MG, respectively) show that acceptable levels of prediction accuracy may be achieved even with a smaller TP. Previous research has also shown that a reasonably small TP size is sufficient to achieve high prediction accuracies [58,62,69,70], especially in highly related populations such as biparental populations [24]. This is particularly important for resource allocation, i.e., deciding on the number of genotypes to include in the experiment, especially when phenotyping is time-consuming or expensive. Additionally, some other criteria for selecting TP individuals, such as PEV mean or CD mean, were recommended in previous research in order to maximize prediction accuracy. Using two diverse groups of maize inbreds, Rincent et al. [29] showed that TP optimization based on CD mean values maximizes the reliability of GS. The authors justified the reported results by stating that the CD mean reduces the variance due to the higher relatedness of individuals in the selected TP. Marulanda et al. [30] studied the influence of different parameters on the variability of prediction accuracy using a simulated biparental maize population. Of the parameters studied, only TP phenotypic variance was found to be positively correlated with the prediction accuracy and was suggested as a tool for the optimization of TP. This correlation was stronger when a smaller TP was used, while it was weaker and even negative in the case of a larger TP. In the present study, no consistent correlation was found between phenotypic variance and prediction accuracy, regardless of the trait or size of TP used (Figure 1, Figures S2 and S3 in the Supplementary Materials). Therefore, results of the present study do not support the optimization of TP based on phenotypic variance as a tool to increase prediction accuracy for wheat end-use quality traits.
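The relationship between TP phenotypic variance and prediction accuracy discussed above can be examined with a short sketch: draw many random TPs, record the phenotypic variance of each TP and the accuracy achieved in the remaining lines, and correlate the two. The data below are simulated, ridge regression stands in for RR-BLUP with an assumed shrinkage parameter, accuracy is assumed to be the Pearson correlation in the VP, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_lines, n_markers, n_tp = 150, 500, 75

# Hypothetical marker matrix (-1/1) with linkage between neighbouring markers.
start = rng.choice([-1.0, 1.0], size=(n_lines, 1))
flips = np.where(rng.random((n_lines, n_markers - 1)) < 0.03, -1.0, 1.0)
Z = np.hstack([start, start * np.cumprod(flips, axis=1)])

# Simulated phenotype: a few markers carry effects, noise variance equals genetic variance.
effects = rng.normal(0.0, 1.0, n_markers) * (rng.random(n_markers) < 0.05)
g = Z @ effects
y = g + rng.normal(0.0, g.std(), n_lines)

def ridge_accuracy(tp, vp, lam):
    """Fit ridge regression on the TP and return the Pearson correlation in the VP."""
    mu, means = y[tp].mean(), Z[tp].mean(axis=0)
    Zc = Z[tp] - means
    alpha = np.linalg.solve(Zc @ Zc.T + lam * np.eye(len(tp)), y[tp] - mu)
    pred = mu + (Z[vp] - means) @ (Zc.T @ alpha)
    return np.corrcoef(pred, y[vp])[0, 1]

tp_variance, accuracy = [], []
for _ in range(200):
    perm = rng.permutation(n_lines)
    tp, vp = perm[:n_tp], perm[n_tp:]
    tp_variance.append(y[tp].var(ddof=1))          # phenotypic variance of this TP
    accuracy.append(ridge_accuracy(tp, vp, lam=float(n_markers)))

r = np.corrcoef(tp_variance, accuracy)[0, 1]
print(f"correlation between TP phenotypic variance and accuracy: {r:.2f}")
```

Repeating the loop for several TP sizes and traits would show whether the sign of the correlation is consistent, which is the criterion applied in the text before recommending phenotypic variance as an optimization tool.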
Although the cost of genotyping has decreased substantially over the years, it remains a significant source of expense for breeders. Therefore, it is important to optimize the marker density used for GS. According to previous studies, increasing marker density has a positive effect on prediction accuracy but reaches a plateau after which further increases have no significant effect on prediction accuracy [24,71]. Whether the plateau is reached with a lower or higher marker density depends on the relatedness of the population. Liu et al. [66] have shown that the plateau is reached at approximately 3000 markers and can be as low as 500 markers when TP and VP are more closely related. Due to the low rate of recombination, closely related plant populations usually have large linkage blocks, resulting in a high LD between markers and QTL. Consequently, a lower marker density is required to reach the plateau of prediction accuracy in biparental populations, which have a high LD compared to populations with low relatedness [27]. Using a doubled haploid (DH) population and a breeding panel, Haile et al. [71] showed that the plateau for GPC is reached at 2000 markers. Juliana et al. [72] reported that once genomic resolution is achieved, increasing marker density has little effect on the predictability of quality traits in biparental wheat populations. It has been reported that, for wheat quality traits, this genomic resolution can be achieved even at low marker density, i.e., 256 markers in biparental populations and 768 markers in multi-family populations [24,73]. In the present study using the MG population, we compared the prediction accuracies when the entire available marker dataset (NM = 2231) and half of it (NM = 1123) were used (Figure 2 and Figure S5 in the Supplementary Materials). According to the existing literature, 1123 markers should be sufficient to achieve acceptable prediction accuracy in the biparental population. In general, no large increase in prediction accuracy was achieved when 2231 markers were used compared with 1123 markers (Figure 2, Figures S4 and S5 in the Supplementary Materials). However, the results presented in this study show that for some traits, such as GPC (Figure S5a,e in the Supplementary Materials), WGC (Figure S5b,f in the Supplementary Materials), and MPT (Figure 2d,g), the increase was slightly larger compared with the other four studied traits. This suggests that for some traits, such as TW, MTW, MTI, and MPH, the plateau was already reached at 1123 markers, whereas for other traits, a further increase in marker density may still improve prediction accuracy. Gorjanc et al. [74] reported that low coverage GBS combined with increased TP size doubles the value of prediction accuracy and can be successfully used for GS in biparental populations. These results suggest that the size of TP plays a more important role in achieving high prediction accuracy than marker density [27].

4.3. Prediction Accuracies of Different Models

Sufficiently high prediction accuracies to allow for the inclusion of GS in the breeding program and selection early in the breeding cycle have already been reported for end-use quality traits [59,66]. Comparing the prediction accuracy of the RR-BLUP model for the two populations examined in this study, it can be seen that some traits are more predictable in one population than in the other (Figure 2, Figures S4 and S5 in the Supplementary Materials). With the exception of some environments, higher prediction accuracy was obtained for GPC, WGC, and TW in the BK population, while the mixograph traits MPT, MTW, and MPH showed better predictability in the MG population. MTI showed low predictability in both populations together with high MSEP values, from which it can be concluded that this trait is not a good target trait for GS. Trait predictabilities observed in this study varied by environment but were generally comparable to results from the existing literature [24,75]. When lines were randomly assigned to TP or VP, Kristensen et al. [11] achieved a prediction accuracy of 0.5 or higher for wheat quality traits, including GPC. Using different prediction models for biparental wheat populations, Charmet et al. [76] reported accuracies up to 0.7 for TW, which is higher than the values obtained in the present study, where the highest prediction accuracy of TW was approximately 0.6. Lado et al. [58] showed prediction accuracies ranging from 0.24 to 0.43 for eight bread baking quality traits, including WGC and mixograph traits. On the other hand, Battenfield et al. [39] obtained moderate prediction accuracies (up to 0.62) for several mixograph traits while showing low predictability of TW. Nevertheless, some authors reported that lower prediction accuracies can be successfully used to exploit GS in early generations when the selection of lines is performed simultaneously based on GEBV and BLUP values [10].
Although numerous prediction models have been developed to date for GS, none has shown a clear advantage over other models by achieving higher prediction accuracy regardless of the trait being evaluated [51,60]. Previous research has shown that there is no significant difference in performance between BLUP and Bayesian models for most wheat end-use quality traits [11,57]. When comparing the performance of RR-BLUP and BC models for wheat quality traits in two biparental populations, Heffner et al. [24] found little or no difference in average performance of the two models. However, when looking at each population separately, they concluded that RR-BLUP performed better than BC in one population, while it was less accurate in the other population, which the authors explained by the different marker effects in each population. Some studies have shown that Bayesian models are better at capturing LD between markers and QTL and are, therefore, better for predicting genotype performance when TP and VP have low relatedness [11,77,78]. Of the seven prediction models used, Battenfield et al. [39] reported the lowest prediction accuracies for RF in general. Sandhu et al. [38] found that deep learning models outperformed the RR-BLUP model in a biparental population. The RR-BLUP model provided a prediction accuracy of 0.48 and 0.45 for GPC and TW, respectively, while one of the deep learning models used provided approximately 10% higher prediction accuracy. In another study by Sandhu et al. [37], the authors confirmed that deep learning models are generally superior to other models used for predicting end-use quality and processing traits in wheat breeding populations. Average prediction accuracy across all traits was highest for deep learning models (0.63–0.64), followed by machine learning models (0.63 for both RF and support vector machine, SVM) and RR-BLUP (0.61). The lowest average prediction accuracies were obtained for Bayesian models. In addition to the deep learning models, SVM and RF were the best performing models for the traits GPC and TW, followed by RR-BLUP and Bayesian models, although the differences in prediction accuracy were minor. In the present study, we compared the performance of eight models, including one machine learning model (RF) (Figure 3, Figure S6 and Table S2 in the Supplementary Materials). In general, the model with the lowest prediction accuracy was EN. However, in some cases EN was as successful as RR-BLUP and it was also the least computationally intensive model, so it is recommended for cases where a breeding program includes a large number of lines and selection needs to be carried out quickly and not with high precision. The RF and RKHS models outperformed RR-BLUP only for some trait–environment combinations and, therefore, cannot be recommended as models of choice for end-use traits in general, as was done in some previous studies [37]. For the majority of trait–environment combinations, Bayesian models (BA, BB, and BC) had the highest prediction accuracy. Nevertheless, the obtained values were not substantially higher than those of RR-BLUP, which could be explained by the high relatedness of TP and VP in the present study [11,77]. Bayesian models were also the most computationally intensive and time-consuming models in the present study, requiring more than 2 h for one analysis. Therefore, Bayesian models could be recommended for breeding programs with fewer lines where selection must be performed with a higher degree of precision. Since no clear superiority of one model over another in terms of achieved prediction accuracy could be shown in the present study, less computationally intensive models that also achieve a reasonable level of prediction accuracy, such as RR-BLUP, represent the best choice.

4.4. Genotype-by-Environment Interaction

GEI represents another major challenge in implementing GS in breeding programs [79]. According to Bernardo [80], there are three possible ways of dealing with GEI when breeding for quantitative traits in plants. The first approach is to ignore it, the second is to reduce it, and the third is to exploit it. Due to the high heritability of the traits investigated in the present study, the first approach was applied, and the analysis was performed for each of the environments separately. Indeed, it has already been reported that the prediction accuracy varies considerably between the environments tested [81,82], and the results presented in this study are no exception. Looking at the prediction accuracy within the MG population (Figure 3 and Figure S6 in the Supplementary Materials), it is clear that the prediction accuracy for some traits, such as TW (Figure 3b), is moderate in one environment (OS10), while it is low in all other environments. A similar pattern can be observed for MTW (Figure 3c), which has low predictability in the OS10 environment, while the prediction accuracy is moderate to high in the other environments, and for WGC (Figure S6a in the Supplementary Materials) and MPT (Figure S6b in the Supplementary Materials), for which substantially lower prediction accuracy was observed in the SB09 and OS09 environments, respectively. Comparing these results with those from our previous publication in which we examined GEI in the same dataset [40], certain assumptions can be made. Environments that are characterized by unusually high or low values of prediction accuracy compared to the rest of the environments tend to be those that produce the greatest GEI and stand out most clearly from the others. The clearest example of this is TW, the predictability of which was highest in the OS10 environment. This was the only outstanding environment, while all others were grouped together on the AMMI2 biplot (see Plavšin et al. [40] Figure 3b). Nevertheless, the stability of the prediction models and the accuracies achieved in different environments are still largely unknown. Some research suggests that modeling GEI in GS [83,84,85] or incorporating information from correlated environments [86,87] leads to higher prediction accuracy. Ornella et al. [87] showed that high correlation between environments allows for the prediction of one environment based on a model trained with data from another environment. Furthermore, identifying less predictive environments and removing them from the dataset used to train the prediction model proved to be a successful strategy to improve prediction accuracy [64,65]. This would be a good strategy for the WGC, MPT, and MTW traits from this study, as only one environment was found to be less predictive, while moderate predictive abilities were seen in all the others.
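One simple way to screen for an environment that behaves differently from the rest, before deciding whether to keep it in the training data, is to compare phenotypic correlations between environments. The sketch below is an assumed screening criterion for illustration only and is not the procedure used in the cited studies; the environment labels E1 to E6, the variance settings, and the simulated phenotypes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)
n_lines = 150
envs = ["E1", "E2", "E3", "E4", "E5", "E6"]

g = rng.normal(0.0, 1.0, n_lines)                        # genotypic values shared by all environments
# Hypothetical GEI standard deviation per environment; E4 is made the outlier on purpose.
gei_sd = {"E1": 0.3, "E2": 0.3, "E3": 0.3, "E4": 2.0, "E5": 0.3, "E6": 0.3}

# One phenotype column per environment: genotype + GEI deviation + residual error.
pheno = np.column_stack([
    g + rng.normal(0.0, gei_sd[e], n_lines) + rng.normal(0.0, 0.5, n_lines)
    for e in envs
])

corr = np.corrcoef(pheno, rowvar=False)                  # environments x environments
mean_corr = (corr.sum(axis=1) - 1.0) / (len(envs) - 1)   # mean correlation with the other environments

for e, m in zip(envs, mean_corr):
    print(e, round(float(m), 2))

least_connected = envs[int(np.argmin(mean_corr))]
print("candidate to exclude from the training data:", least_connected)
```

An environment flagged in this way is only a candidate: whether dropping it actually improves prediction accuracy would still need to be checked by cross-validation with and without it.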

5. Conclusions

In the present study, the potential of GS to predict seven end-use quality traits in two biparental wheat populations was investigated. As in previous studies, it was found that the size of TP plays an important role in achieving high prediction accuracies, while marker density is not a major limitation nowadays due to the use of high-throughput genotyping. Moreover, no advantage of TP optimization based on phenotypic variance was found in this study. Although RR-BLUP was not the best performing model in all cases presented, no significant advantage of using any other model studied here was observed. Some Bayesian models provided slightly higher prediction accuracy than RR-BLUP, which can be considered negligible considering the time required to perform an analysis. Furthermore, we observed strong differences between environments in terms of the prediction accuracy achieved, suggesting that environments that are less predictive should be removed from the dataset used to train the prediction model. Nonetheless, we provided evidence that GS is a good potential selection tool for end-use quality traits, including some mixograph traits. End-use quality traits, and especially dough rheology traits, are typically difficult to breed for because their evaluation is time-consuming and requires a larger quantity of seed, which is usually not available in early generations. Therefore, using the mixograph as a fast and effective method of evaluating dough quality together with GS can help in pre-selecting high-performing lines earlier in the breeding process and achieve a higher gain per unit of time and cost.

Supplementary Materials

The following supporting information can be downloaded at: https://0-www-mdpi-com.brum.beds.ac.uk/article/10.3390/agronomy12051126/s1, Figure S1: The distribution of SNPs on each chromosome for (a) BK population and (b) MG population together with genetic map of all available SNPs for (c) BK population and (d) MG population, Figure S2: Phenotypic variance of randomly selected TP plotted against prediction accuracy values obtained using RR-BLUP model for following traits of BK population: (a) WGC, (b) MPT, (c) MTW, (d) MTI, and (e) MPH, Figure S3: Phenotypic variance of randomly selected TP plotted against prediction accuracy values obtained using RR-BLUP model for following traits of MG population: (a) GPC, (b) WGC, (c) TW, (d) MPT, and (e) MTW, Figure S4: Prediction accuracy values obtained using RR-BLUP model and three different sizes of TP (50%, 65% and 80% of the total number of lines in a population) for following traits of BK population: (a) MPT, (b) MTW, (c) MTI, and (d) MPH, Figure S5: Prediction accuracy values obtained using RR-BLUP model and three different sizes of TP (50%, 65% and 80% of the total number of lines in a population) for following traits of MG population: (a) GPC, (b) WGC, (c) TW, and (d) MTI, Figure S6: Prediction accuracies for MG population and traits (a) WGC, (b) MPT, and (c) MTI evaluated with eight different prediction models. Error bars denote standard deviation, Table S1: Mean MSEP values estimated for both populations using RR-BLUP model. Standard deviation values are indicated in parentheses, Table S2: Mean prediction accuracy values estimated for MG population using eight different prediction models. Standard deviation values are indicated in parentheses, Table S3: Mean MSEP values estimated for MG population using eight different prediction models. Standard deviation values are indicated in parentheses, Table S4: Pedigree of winter wheat genotypes used in the study.

Author Contributions

Conceptualization, I.P., J.G. and D.N.; methodology, I.P., V.G. and J.G.; formal analysis, I.P. and J.G.; writing—original draft preparation, I.P.; writing—review and editing, J.G., V.G. and D.N.; visualization, I.P.; supervision, J.G. and D.N. All authors have read and agreed to the published version of the manuscript.

Funding

This study has been fully supported by the project KK.01.1.1.01.0005 Biodiversity and Molecular Plant Breeding, Centre of Excellence for Biodiversity and Molecular Plant Breeding (CoE CroP-BioDiv), Zagreb, Croatia.

Data Availability Statement

The data were obtained from the Agricultural Institute Osijek and are available upon request from the corresponding author with the permission of the Agricultural Institute Osijek.

Acknowledgments

The authors gratefully acknowledge Ruđer Šimek for his valuable technical assistance.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Shewry, P.R. Improving the protein content and composition of cereal grain. J. Cereal Sci. 2007, 46, 239–250.
  2. Shewry, P.R.; Hey, S.J. The contribution of wheat to human diet and health. Food Energy Secur. 2015, 4, 178–202.
  3. Bordes, J.; Ravel, C.; Le Gouis, J.; Lapierre, A.; Charmet, G.; Balfourier, F. Use of a global wheat core collection for association analysis of flour and dough quality traits. J. Cereal Sci. 2011, 54, 137–147.
  4. Shewry, P.R.; Tatham, A.S.; Barro, F.; Barcelo, P.; Lazzeri, P. Biotechnology of breadmaking: Unraveling and manipulating the multi-protein gluten complex. Bio/Technology 1995, 13, 1185–1190.
  5. Swanson, C.O.; Working, E.B. Testing of the quality of flour by the recording dough mixer. Cereal Chem. 1933, 10, 1–29.
  6. Guzman, C.; Peña, R.J.; Singh, R.; Autrique, E.; Dreisigacker, S.; Crossa, J.; Rutkoski, J.; Poland, J.; Battenfield, S. Wheat quality improvement at CIMMYT and the use of genomic selection on it. Appl. Transl. Genom. 2016, 11, 3–8.
  7. Meuwissen, T.H.E.; Hayes, B.J.; Goddard, M.E. Prediction of total genetic value using genome-wide dense marker maps. Genetics 2001, 157, 1819–1829.
  8. Lorenz, A.J.; Chao, S.; Asoro, F.G.; Heffner, E.L.; Hayashi, T.; Iwata, H.; Smith, K.P.; Sorrells, M.E.; Jannink, J.L. Genomic selection in plant breeding: Knowledge and prospects. Adv. Agron. 2011, 110, 77–123.
  9. Sorrells, M.E. Genomic selection in plants: Empirical results and implications for wheat breeding. In Advances in Wheat Genetics: From Genome to Field; Ogihara, Y., Takumi, S., Handa, H., Eds.; Springer Japan KK: Yokohama, Japan, 2015; pp. 401–409.
  10. Belamkar, V.; Guttieri, M.J.; Hussain, W.; Jarquín, D.; El-basyoni, I.; Poland, J.; Lorenz, A.J.; Baenziger, P.S. Genomic selection in preliminary yield trials in a winter wheat breeding program. G3 Genes Genomes Genet. 2018, 8, 2735–2747.
  11. Kristensen, P.S.; Jahoor, A.; Andersen, J.R.; Cericola, F.; Orabi, J.; Janss, L.L.; Jensen, J. Genome-wide association studies and comparison of models and cross-validation strategies for genomic prediction of quality traits in advanced winter wheat breeding lines. Front. Plant Sci. 2018, 9, 69.
  12. Plavšin, I.; Gunjača, J.; Šatović, Z.; Šarčević, H.; Ivić, M.; Dvojković, K.; Novoselović, D. An overview of key factors affecting genomic selection for wheat quality traits. Plants 2021, 10, 745.
  13. Wang, X.; Xu, Y.; Hu, Z.; Xu, C. Genomic selection methods for crop improvement: Current status and prospects. Crop J. 2018, 6, 330–340.
  14. Combs, E.; Bernardo, R. Accuracy of genomewide selection for different traits with constant population size, heritability, and number of markers. Plant Genome 2013, 6, plantgenome2012-11.
  15. Krishnappa, G.; Savadi, S.; Tyagi, B.S.; Singh, S.K.; Mamrutha, H.M.; Kumar, S.; Mishra, C.N.; Khan, H.; Gangadhara, K.; Uday, G.; et al. Integrated genomic selection for rapid improvement of crops. Genomics 2021, 113, 1070–1086.
  16. Robertsen, C.; Hjortshøj, R.; Janss, L. Genomic Selection in Cereal Breeding. Agronomy 2019, 9, 95.
  17. Riedelsheimer, C.; Endelman, J.B.; Stange, M.; Sorrells, M.E.; Jannink, J.L.; Melchinger, A.E. Genomic predictability of interconnected biparental maize populations. Genetics 2013, 194, 493–503.
  18. Jannink, J.L.; Lorenz, A.J.; Iwata, H. Genomic selection in plant breeding: From theory to practice. Brief. Funct. Genom. Proteom. 2010, 9, 166–177.
  19. Crossa, J.; Jarquín, D.; Franco, J.; Pérez-Rodríguez, P.; Burgueño, J.; Saint-Pierre, C.; Vikram, P.; Sansaloni, C.; Petroli, C.; Akdemir, D.; et al. Genomic prediction of gene bank wheat landraces. G3 Genes Genomes Genet. 2016, 6, 1819–1834.
  20. Asoro, F.G.; Newell, M.A.; Beavis, W.D.; Scott, M.P.; Jannink, J.-L. Accuracy and training population design for genomic selection on quantitative traits in elite North American oats. Plant Genome J. 2011, 4, 132–144.
  21. Hickey, J.M.; Dreisigacker, S.; Crossa, J.; Hearne, S.; Babu, R.; Prasanna, B.M.; Grondona, M.; Zambelli, A.; Windhausen, V.S.; Mathews, K.; et al. Evaluation of genomic selection training population designs and genotyping strategies in plant breeding programs using simulation. Crop Sci. 2014, 54, 1476–1488.
  22. Maulana, F.; Kim, K.S.; Anderson, J.D.; Sorrells, M.E.; Butler, T.J.; Liu, S.; Baenziger, P.S.; Byrne, P.F.; Ma, X.-F. Genomic selection of forage quality traits in winter wheat. Crop Sci. 2019, 59, 2473–2483.
  23. Arruda, M.P.; Brown, P.J.; Lipka, A.E.; Krill, A.M.; Thurber, C.; Kolb, F.L. Genomic selection for predicting Fusarium head blight resistance in a wheat breeding program. Plant Genome 2015, 8, plantgenome2015-01.
  24. Heffner, E.L.; Jannink, J.-L.; Iwata, H.; Souza, E.; Sorrells, M.E. Genomic selection accuracy for grain quality traits in biparental wheat populations. Crop Sci. 2011, 51, 2597–2606.
  25. Rutkoski, J.; Singh, R.P.; Huerta-Espino, J.; Bhavani, S.; Poland, J.; Jannink, J.-L.; Sorrells, M.E. Efficient use of historical data for genomic selection: A case study of stem rust resistance in wheat. Plant Genome 2015, 8, eplantgenome2014-09.
  26. Herter, C.P.; Ebmeyer, E.; Kollers, S.; Korzun, V.; Würschum, T.; Miedaner, T. Accuracy of within- and among-family genomic prediction for Fusarium head blight and Septoria tritici blotch in winter wheat. Theor. Appl. Genet. 2019, 132, 1121–1135.
  27. Lorenzana, R.E.; Bernardo, R. Accuracy of genotypic value predictions for marker-based selection in biparental plant populations. Theor. Appl. Genet. 2009, 120, 151–161.
  28. Isidro, J.; Jannink, J.L.; Akdemir, D.; Poland, J.; Heslot, N.; Sorrells, M.E. Training set optimization under population structure in genomic selection. Theor. Appl. Genet. 2015, 128, 145–158.
  29. Rincent, R.; Laloë, D.; Nicolas, S.; Altmann, T.; Brunel, D.; Revilla, P.; Rodríguez, V.M.; Moreno-Gonzalez, J.; Melchinger, A.; Bauer, E.; et al. Maximizing the reliability of genomic selection by optimizing the calibration set of reference individuals: Comparison of methods in two diverse groups of maize inbreds (Zea mays L.). Genetics 2012, 192, 715–728.
  30. Marulanda, J.J.; Melchinger, A.E.; Würschum, T. Genomic selection in biparental populations: Assessment of parameters for optimum estimation set design. Plant Breed. 2015, 134, 623–630.
  31. de los Campos, G.; Hickey, J.M.; Pong-Wong, R.; Daetwyler, H.D.; Calus, M.P.L. Whole-genome regression and prediction methods applied to plant and animal breeding. Genetics 2013, 193, 327–345.
  32. Hayes, B.J.; Visscher, P.M.; Goddard, M.E. Increased accuracy of artificial selection by using the realized relationship matrix. Genet. Res. 2009, 91, 47–60.
  33. Heffner, E.L.; Sorrells, M.E.; Jannink, J.L. Genomic selection for crop improvement. Crop Sci. 2009, 49, 1–12.
  34. Merrick, L.F.; Carter, A.H. Comparison of genomic selection models for exploring predictive ability of complex traits in breeding programs. Plant Genome 2021, 14, e20158.
  35. Endelman, J.B. Ridge regression and other kernels for genomic selection with R package rrBLUP. Plant Genome J. 2011, 4, 250–255. [Google Scholar] [CrossRef] [Green Version]
  36. Hu, X.; Carver, B.F.; Powers, C.; Yan, L.; Zhu, L.; Chen, C. Effectiveness of Genomic Selection by Response to Selection for Winter Wheat Variety Improvement. Plant Genome 2019, 12, 180090. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  37. Sandhu, K.S.; Aoun, M.; Morris, C.F.; Carter, A.H. Genomic selection for end-use quality and processing traits in soft white winter wheat breeding program with machine and deep learning models. Biology 2021, 10, 689. [Google Scholar] [CrossRef] [PubMed]
  38. Sandhu, K.S.; Lozada, D.N.; Zhang, Z.; Pumphrey, M.O.; Carter, A.H. Deep Learning for Predicting Complex Traits in Spring Wheat Breeding Program. Front. Plant Sci. 2021, 11, 613325. [Google Scholar] [CrossRef] [PubMed]
  39. Battenfield, S.D.; Guzmán, C.; Gaynor, R.C.; Singh, R.P.; Peña, R.J.; Dreisigacker, S.; Fritz, A.K.; Poland, J.A. Genomic Selection for Processing and End-Use Quality Traits in the CIMMYT Spring Bread Wheat Breeding Program. Plant Genome 2016, 9, plantgenome2016-01. [Google Scholar] [CrossRef] [Green Version]
  40. Plavšin, I.; Gunjača, J.; Šimek, R.; Novoselović, D. Capturing GEI patterns for quality traits in biparental wheat populations. Agronomy 2021, 11, 1022. [Google Scholar] [CrossRef]
  41. Prashant, R.; Mani, E.; Rai, R.; Gupta, R.K.; Tiwari, R.; Dholakia, B.; Oak, M.; Röder, M.; Kadoo, N.; Gupta, V. Genotype × environment interactions and QTL clusters underlying dough rheology traits in Triticum aestivum L. J. Cereal Sci. 2015, 64, 82–91. [Google Scholar] [CrossRef]
  42. R Core Team R. A Language and Environment for Statistical Computing; Foundation for Statistical Computing: Vienna, Austria, 2020. [Google Scholar]
  43. Butler, D.G.; Cullis, B.R.; Gilmour, A.R.; Gogel, B.J.; Thompson, R. ASReml-R Reference Manual Version 4; VSN International Ltd.: Hemel Hempstead, UK, 2017. [Google Scholar]
  44. Brien, C. Asremlplus: Augments “ASReml-R” in Fitting Mixed Models and Packages Generally in Exploring Prediction Differences 2021. Package Version 4.2-32. Available online: https://cran.r-project.org/web/packages/asremlPlus/index.html (accessed on 20 October 2021).
  45. Karp, A.; Isaac, P.G.; Ingram, D.S. Molecular Tools for Screening Biodiversity; Karp, A., Isaac, P.G., Ingram, D.S., Eds.; Chapman & Hall: London, UK, 1998; ISBN 9789401064965. [Google Scholar]
  46. Browning, B.L.; Zhou, Y.; Browning, S.R. A One-Penny Imputed Genome from Next-Generation Reference Panels. Am. J. Hum. Genet. 2018, 103, 338–348. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  47. Charmet, G.; Tran, L.G.; Auzanneau, J.; Rincent, R.; Bouchet, S. BWGS: A R package for genomic selection and its application to a wheat breeding programme. PLoS ONE 2020, 15, e0222733. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  48. Friedman, J.; Hastie, T.; Tibshirani, R. Regularization Paths for Generalized Linear Models via Coordinate Descent. J. Stat. Softw. 2010, 33, 1–22. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  49. Habier, D.; Fernando, R.L.; Kizilkaya, K.; Garrick, D.J. Extension of the bayesian alphabet for genomic selection. BMC Bioinform. 2011, 12, 186. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  50. Park, T.; Casella, G. The Bayesian Lasso. J. Am. Stat. Assoc. 2008, 103, 681–686. [Google Scholar] [CrossRef]
  51. Heslot, N.; Yang, H.P.; Sorrells, M.E.; Jannink, J.L. Genomic selection in plant breeding: A comparison of models. Crop Sci. 2012, 52, 146–160. [Google Scholar] [CrossRef]
  52. Zou, H.; Hastie, T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B Stat. Methodol. 2005, 67, 301–320. [Google Scholar] [CrossRef] [Green Version]
  53. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef] [Green Version]
  54. Gianola, D.; Van Kaam, J.B.C.H.M. Reproducing kernel Hilbert spaces regression methods for genomic assisted prediction of quantitative traits. Genetics 2008, 178, 2289–2303. [Google Scholar] [CrossRef] [Green Version]
  55. De Los Campos, G.; Gianola, D.; Rosa, G.J.M.; Weigel, K.A.; Crossa, J. Semi-parametric genomic-enabled prediction of genetic values using reproducing kernel Hilbert spaces methods. Genet. Res. 2010, 92, 295–308. [Google Scholar] [CrossRef] [Green Version]
  56. Pérez, P.; de los Campos, G. BGLR: A Statistical Package for Whole Genome Regression and Prediction. Genetics 2014, 198, 483–495. [Google Scholar] [CrossRef] [PubMed]
  57. Tsai, H.Y.; Janss, L.L.; Andersen, J.R.; Orabi, J.; Jensen, J.D.; Jahoor, A.; Jensen, J. Genomic prediction and GWAS of yield, quality and disease-related traits in spring barley and winter wheat. Sci. Rep. 2020, 10, 3347. [Google Scholar] [CrossRef]
  58. Lado, B.; Vázquez, D.; Quincke, M.; Silva, P.; Aguilar, I.; Gutiérrez, L. Resource allocation optimization with multi-trait genomic prediction for bread wheat (Triticum aestivum L.) baking quality. Theor. Appl. Genet. 2018, 131, 2719–2731. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  59. Hayes, B.J.; Panozzo, J.; Walker, C.K.; Choy, A.L.; Kant, S.; Wong, D.; Tibbits, J.; Daetwyler, H.D.; Rochfort, S.; Hayden, M.J.; et al. Accelerating wheat breeding for end-use quality with multi-trait genomic predictions incorporating near infrared and nuclear magnetic resonance-derived phenotypes. Theor. Appl. Genet. 2017, 130, 2505–2519. [Google Scholar] [CrossRef] [PubMed]
  60. Yao, J.; Zhao, D.; Chen, X.; Zhang, Y.; Wang, J. Use of genomic selection and breeding simulation in cross prediction for improvement of yield and quality in wheat (Triticum aestivum L.). Crop J. 2018, 6, 353–365. [Google Scholar] [CrossRef]
  61. Michel, S.; Gallee, M.; Löschenberger, F.; Buerstmayr, H.; Kummer, C. Improving the baking quality of bread wheat using rapid tests and genomics: The prediction of dough rheological parameters by gluten peak indices and genomic selection models. J. Cereal. Sci. 2017, 77, 24–34. [Google Scholar] [CrossRef]
  62. Kristensen, P.S.; Jensen, J.; Andersen, J.R.; Guzmán, C.; Orabi, J.; Jahoor, A. Genomic Prediction and Genome-Wide Association Studies of Flour Yield and Alveograph Quality Traits Using Advanced Winter Wheat Breeding Material. Genes 2019, 10, 669. [Google Scholar] [CrossRef] [Green Version]
  63. Dawson, J.C.; Endelman, J.B.; Heslot, N.; Crossa, J.; Poland, J.; Dreisigacker, S.; Manès, Y.; Sorrells, M.E.; Jannink, J.L. The use of unbalanced historical data for genomic selection in an international wheat breeding program. Filed Crops Res. 2013, 154, 12–22. [Google Scholar] [CrossRef] [Green Version]
  64. Heslot, N.; Jannink, J.L.; Sorrells, M.E. Using genomic prediction to characterize environments and optimize prediction accuracy in applied breeding data. Crop Sci. 2013, 53, 921–933. [Google Scholar] [CrossRef]
  65. Michel, S.; Ametz, C.; Gungor, H.; Epure, D.; Grausgruber, H.; Löschenberger, F.; Buerstmayr, H. Genomic selection across multiple breeding cycles in applied bread wheat breeding. Theor. Appl. Genet. 2016, 129, 1179–1189. [Google Scholar] [CrossRef] [Green Version]
  66. Liu, G.; Zhao, Y.; Gowda, M.; Longin, C.F.H.; Reif, J.C.; Mette, M.F. Predicting hybrid performances for quality traits through genomic-assisted approaches in Central European wheat. PLoS ONE 2016, 11, e0158635. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  67. Lorenz, A.J. Resource allocation for maximizing prediction accuracy and genetic gain of genomic selection in plant breeding: A simulation experiment. G3 Genes Genomes Genet. 2013, 3, 481–491. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  68. Edwards, S.M.K.; Buntjer, J.B.; Jackson, R.; Bentley, A.R.; Lage, J.; Byrne, E.; Burt, C.; Jack, P.; Berry, S.; Flatman, E.; et al. The effects of training population design on genomic prediction accuracy in wheat. Theor. Appl. Genet. 2019, 132, 1943–1952. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  69. Verges, V.L.; van Sanford, D.A. Genomic selection at preliminary yield trial stage: Training population design to predict untested lines. Agronomy 2020, 10, 60. [Google Scholar] [CrossRef] [Green Version]
  70. Lozada, D.N.; Mason, R.E.; Sarinelli, J.M.; Brown-Guedira, G. Accuracy of genomic selection for grain yield and agronomic traits in soft red winter wheat. BMC Genet. 2019, 20, 82. [Google Scholar] [CrossRef]
  71. Haile, J.K.; N’Diaye, A.; Clarke, F.; Clarke, J.; Knox, R.; Rutkoski, J.; Bassi, F.M.; Pozniak, C.J. Genomic selection for grain yield and quality traits in durum wheat. Mol. Breed. 2018, 38, 75. [Google Scholar] [CrossRef]
  72. Juliana, P.; Poland, J.; Huerta-Espino, J.; Shrestha, S.; Crossa, J.; Crespo-Herrera, L.; Toledo, F.H.; Govindan, V.; Mondal, S.; Kumar, U.; et al. Improving grain yield, stress resilience and quality of bread wheat using large-scale genomics. Nat. Genet. 2019, 51, 1530–1539. [Google Scholar] [CrossRef]
  73. Heffner, E.L.; Jannink, J.-L.; Sorrells, M.E. Genomic selection accuracy using multifamily prediction models in a wheat breeding program. Plant Genome 2011, 4, 65–75. [Google Scholar] [CrossRef] [Green Version]
  74. Gorjanc, G.; Dumasy, J.F.; Gonen, S.; Gaynor, R.C.; Antolin, R.; Hickey, J.M. Potential of low-coverage genotyping-by-sequencing and imputation for cost-effective genomic selection in biparental segregating populations. Crop Sci. 2017, 57, 1404–1420. [Google Scholar] [CrossRef]
  75. Michel, S.; Kummer, C.; Gallee, M.; Hellinger, J.; Ametz, C.; Akgöl, B.; Epure, D.; Löschenberger, F.; Buerstmayr, H. Improving the baking quality of bread wheat by genomic selection in early generations. Theor. Appl. Genet. 2018, 131, 477–493. [Google Scholar] [CrossRef]
  76. Charmet, G.; Storlie, E.; Oury, F.X.; Laurent, V.; Beghin, D.; Chevarin, L.; Lapierre, A.; Perretant, M.R.; Rolland, B.; Heumez, E.; et al. Genome-wide prediction of three important traits in bread wheat. Mol. Breed. 2014, 34, 1843–1852. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  77. Gao, H.; Su, G.; Janss, L.; Zhang, Y.; Lund, M.S. Model comparison on genomic predictions using high-density markers for different groups of bulls in the Nordic Holstein population. J. Dairy Sci. 2013, 96, 4678–4687. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  78. Zhao, Y.; Mette, M.F.; Gowda, M.; Longin, C.F.H.; Reif, J.C. Bridging the gap between marker-assisted and genomic selection of heading time and plant height in hybrid wheat. Heredity 2014, 112, 638–645. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  79. Heslot, N.; Akdemir, D.; Sorrells, M.E.; Jannink, J.L. Integrating environmental covariates and crop modeling into the genomic selection framework to predict genotype by environment interactions. Theor. Appl. Genet. 2014, 127, 463–480. [Google Scholar] [CrossRef] [PubMed]
  80. Bernardo, R. Genotype x Environment Interaction. In Breeding for Quantitative Traits in Plants; Stemma Press: Woodbury, MN, USA, 2010; p. 422. [Google Scholar]
  81. Crossa, J.; De Los Campos, G.; Maccaferri, M.; Tuberosa, R.; Burgueño, J.; Pérez-Rodríguez, P. Extending the marker × environment interaction model for genomic-enabled prediction and genome-wide association analysis in durum wheat. Crop Sci. 2016, 56, 2193–2209. [Google Scholar] [CrossRef] [Green Version]
  82. Crossa, J.; de los Campos, G.; Pérez, P.; Gianola, D.; Burgueño, J.; Araus, J.L.; Makumbi, D.; Singh, R.P.; Dreisigacker, S.; Yan, J.; et al. Prediction of genetic values of quantitative traits in plant breeding using pedigree and molecular markers. Genetics 2010, 186, 713–724. [Google Scholar] [CrossRef] [Green Version]
  83. Lado, B.; Barrios, P.G.; Quincke, M.; Silva, P.; Gutiérrez, L. Modeling genotype × environment interaction for genomic selection with unbalanced data from a wheat breeding program. Crop Sci. 2016, 56, 2165–2179. [Google Scholar] [CrossRef] [Green Version]
  84. Jarquín, D.; Lemes da Silva, C.; Gaynor, R.C.; Poland, J.; Fritz, A.; Howard, R.; Battenfield, S.; Crossa, J. Increasing genomic-enabled prediction accuracy by modeling genotype × environment interactions in kansas wheat. Plant Genome 2017, 10, plantgenome2016-12. [Google Scholar] [CrossRef] [Green Version]
  85. Lopez-Cruz, M.; Crossa, J.; Bonnett, D.; Dreisigacker, S.; Poland, J.; Jannink, J.L.; Singh, R.P.; Autrique, E.; de los Campos, G. Increased prediction accuracy in wheat breeding trials using a marker × environment interaction genomic selection model. G3 Genes Genomes Genet. 2015, 5, 569–582. [Google Scholar] [CrossRef] [Green Version]
  86. Burgueño, J.; de los Campos, G.; Weigel, K.; Crossa, J. Genomic prediction of breeding values when modeling genotype × environment interaction using pedigree and dense molecular markers. Crop Sci. 2012, 52, 707–719. [Google Scholar] [CrossRef] [Green Version]
  87. Ornella, L.; Sukhwinder-Singh; Perez, P.; Burgueño, J.; Singh, R.; Tapia, E.; Bhavani, S.; Dreisigacker, S.; Braun, H.J.; Mathews, K.; et al. Genomic prediction of genetic values for resistance to wheat rusts. Plant Genome 2012, 5, 136–148. [Google Scholar] [CrossRef] [Green Version]
Figure 1. Phenotypic variance of the randomly selected TP plotted against prediction accuracy values obtained using the RR-BLUP model. The top two scatter plots refer to observations for the (a) GPC and (b) TW traits of the BK population, and the bottom two to the (c) MTI and (d) MPH traits of the MG population. For each population–trait–environment combination, three different sizes of TP were used (25, 50, and 75 lines), with the remaining lines serving as VP.
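The resampling scheme summarized in the Figure 1 caption (a random TP of 25, 50, or 75 lines, the remaining lines as VP, and the phenotypic variance of the TP paired with the resulting prediction accuracy) can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are simulated, the population size (100 lines), marker number (200), number of repeats (20), and all function names are hypothetical, and RR-BLUP is approximated by ridge regression with a fixed shrinkage parameter set to the ratio of residual to marker-effect variance rather than estimated by REML.

```python
import numpy as np


def fit_ridge(Z_train, y_train, lam):
    """Ridge regression on markers: solve (Z'Z + lam*I) u = Z'(y - mean(y))."""
    col_means = Z_train.mean(axis=0)          # center markers on the TP
    Zc = Z_train - col_means
    mu = y_train.mean()
    n_markers = Zc.shape[1]
    u = np.linalg.solve(Zc.T @ Zc + lam * np.eye(n_markers), Zc.T @ (y_train - mu))
    return mu, col_means, u


def tp_variance_and_accuracy(Z, y, tp_size, lam, rng):
    """Draw a random TP, predict the remaining lines (VP), and return
    (phenotypic variance of the TP, Pearson correlation in the VP)."""
    order = rng.permutation(len(y))
    tp, vp = order[:tp_size], order[tp_size:]
    mu, col_means, u = fit_ridge(Z[tp], y[tp], lam)
    predicted = mu + (Z[vp] - col_means) @ u
    accuracy = np.corrcoef(predicted, y[vp])[0, 1]
    return y[tp].var(ddof=1), accuracy


# Simulated stand-in for one population-trait-environment combination
# (all sizes and variances below are assumptions for illustration only).
rng = np.random.default_rng(1)
n_lines, n_markers = 100, 200
effect_var, residual_var = 0.01, 1.0
Z = rng.choice([-1.0, 1.0], size=(n_lines, n_markers))   # biallelic marker scores
marker_effects = rng.normal(0.0, np.sqrt(effect_var), n_markers)
y = Z @ marker_effects + rng.normal(0.0, np.sqrt(residual_var), n_lines)

lam = residual_var / effect_var   # ridge parameter as a variance ratio
results = {
    size: [tp_variance_and_accuracy(Z, y, size, lam, rng) for _ in range(20)]
    for size in (25, 50, 75)
}

for size, pairs in results.items():
    variances, accuracies = zip(*pairs)
    print(f"TP = {size}: mean TP variance = {np.mean(variances):.2f}, "
          f"mean accuracy = {np.mean(accuracies):.2f}")
```

Plotting the (variance, accuracy) pairs for each TP size would give a scatter of the kind shown in Figure 1. With real data, the shrinkage parameter would normally come from estimated variance components instead of known simulation settings.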
Figure 2. Prediction accuracy values obtained using the RR-BLUP model and three different sizes of TP (50%, 65%, and 80% of the total number of lines in a population). The top three boxplots represent results obtained for the BK population and traits (a) GPC, (b) WGC, and (c) TW. The middle three boxplots refer to observations for the MG population and traits (d) MPT, (e) MTW, and (f) MPH when half of the marker dataset was used (NM = 1123 SNPs), while the bottom three (g–i) refer to the same traits, respectively, but using the whole available marker dataset (NM = 2231 SNPs).
Figure 3. Prediction accuracies for the MG population and traits (a) GPC, (b) TW, (c) MTW, and (d) MPH evaluated with eight different prediction models. Error bars denote standard deviation.
Table 1. Quality trait abbreviations, descriptions, units, and measuring instruments used in the present study.
| Trait Abbreviation | Description | Unit | Measuring Instrument |
|---|---|---|---|
| GPC | Grain protein content measured on whole grain samples | percent | Infratec 121 Grain Analyzer |
| WGC | Wet gluten content measured using flour samples | percent | Glutomatic 2200 Gluten System/Glutomatic Centrifuge 2015 (Perten) |
| TW | Test weight measured on whole grain samples | kg hL⁻¹ | Infratec 121 Grain Analyzer |
| MPT | Midline peak time measured using flour samples (denotes time required for optimal dough development) [41] | min | Mixograph (National MFG Co., National Manufacturing Company, Lincoln, NE, USA); MixSmart software (v 3.40) |
| MTW | Midline curve tail width measured using flour samples (designates the consistency and stability of the dough) [41] | percent | Mixograph and MixSmart software, as for MPT |
| MTI | Midline curve tail integral measured using flour samples (describes energy used during the mixing process) [41] | unitless | Mixograph and MixSmart software, as for MPT |
| MPH | Midline peak height measured using flour samples (denotes dough strength) [41] | percent | Mixograph and MixSmart software, as for MPT |
Table 2. Broad-sense heritability (across environments, H²) and repeatability (within environment) estimates for both biparental populations used.
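| Population | Environment ¹ | GPC ² | WGC | TW | MPT | MTW | MTI | MPH |
|---|---|---|---|---|---|---|---|---|
| Bezostaya-1/Klara (BK) | OS09 | 0.95 | 0.94 | 0.91 | 0.75 | 0.79 | 0.87 | 0.88 |
| Bezostaya-1/Klara (BK) | OS10 | 0.86 | 0.86 | 0.92 | 0.61 | 0.77 | 0.84 | 0.85 |
| Bezostaya-1/Klara (BK) | OS11 | 0.93 | 0.92 | 0.92 | 0.57 | 0.73 | 0.87 | 0.84 |
| Bezostaya-1/Klara (BK) | SB09 | 0.49 | 0.54 | 0.78 | 0.13 | 0.52 | 0.49 | 0.51 |
| Bezostaya-1/Klara (BK) | SB10 | 0.65 | 0.73 | 0.77 | 0.55 | 0.74 | 0.83 | 0.77 |
| Bezostaya-1/Klara (BK) | Across environments | 0.91 | 0.92 | 0.88 | 0.45 | 0.71 | 0.81 | 0.77 |
| Monika/Golubica (MG) | OS09 | 0.86 | 0.83 | 0.86 | 0.68 | 0.9 | 0.8 | 0.84 |
| Monika/Golubica (MG) | OS10 | 0.85 | 0.86 | 0.87 | 0.94 | 0.96 | 0.86 | 0.88 |
| Monika/Golubica (MG) | OS11 | 0.18 | 0.25 | 0.19 | 0.43 | 0.31 | 0.24 | 0.35 |
| Monika/Golubica (MG) | SB09 | 0.76 | 0.72 | 0.65 | 0.52 | 0.89 | 0.73 | 0.72 |
| Monika/Golubica (MG) | SB10 | 0.82 | 0.82 | 0.83 | 0.81 | 0.93 | 0.83 | 0.86 |
| Monika/Golubica (MG) | SB11 | 0.66 | 0.71 | 0.79 | 0.7 | 0.89 | 0.73 | 0.77 |
| Monika/Golubica (MG) | Across environments | 0.90 | 0.89 | 0.78 | 0.84 | 0.91 | 0.72 | 0.76 |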
¹ Environment abbreviations represent a combination of year and location of experiment and are as follows: OS09 (Osijek–2009), OS10 (Osijek–2010), OS11 (Osijek–2011), SB09 (Slavonski Brod–2009), SB10 (Slavonski Brod–2010), and SB11 (Slavonski Brod–2011).
² Trait abbreviations: grain protein content (GPC), wet gluten content (WGC), test weight (TW), midline peak time (MPT), midline curve tail width (MTW), midline curve tail integral (MTI), midline peak height (MPH).
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Plavšin, I.; Gunjača, J.; Galić, V.; Novoselović, D. Evaluation of Genomic Selection Methods for Wheat Quality Traits in Biparental Populations Indicates Inclination towards Parsimonious Solutions. Agronomy 2022, 12, 1126. https://0-doi-org.brum.beds.ac.uk/10.3390/agronomy12051126