3.1. Identification and Quantification of Phytochemicals in CP Samples
It was previously reported that coffee by-products contain a significant amount of caffeine [32], so HPLC analysis was carried out using both the aqueous CP extracts without (Figure 1, in black) and with (Figure 1, in red) chloroform (CHCl3) treatment (1:5, v/v), which could facilitate the identification of caffeine in the samples.
HPLC analysis showed that a total of 10 compounds (peaks 1–10, numbered following the sample order) were observed across the 8 CP samples, and they were then identified by HPLC, MS (and/or MS/MS), and UV-vis spectrum analysis.
As shown in Figure 2a, five ions with m/z 137.98, 275.06, 412.07, 549.07, and 686.06 were observed on the MS spectrum of compound 1. The predominant ion with m/z 137.98 corresponds to the [M + H]+ ion of compound 1, whereas the other four minor ions are the [M + H]+ ions of the dimer, trimer, tetramer, and pentamer of compound 1. Further MS/MS analysis (Figure 2b) shows that the ion with m/z 137.98 is mainly fragmented into four ions with m/z 121.12, 120.32, 110.06, and 94.06. The MS and MS/MS data of compound 1 are quite similar to those of trigonelline [33]. Therefore, compound 1 was assigned to trigonelline, and its identification was further confirmed by HPLC (Figure 2c) and UV-vis spectrum (Figure 2d) analysis using authentic trigonelline.
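The multimer assignment for compound 1 can be checked with simple arithmetic: in a series of protonated monomer, dimer, trimer and so on, consecutive ions are separated by one neutral monomer mass. The following Python sketch applies this check to the five reported m/z values. The function name and the 0.2 Da tolerance are illustrative assumptions, and the proton mass is the standard value; none of this is part of the original analysis.

```python
# Illustrative check of a protonated multimer series (not the authors' code).
PROTON_MASS = 1.00728  # Da, standard value for the mass of a proton


def multimer_spacings(mz_values):
    """Return the differences between consecutive m/z values (ascending order)."""
    ordered = sorted(mz_values)
    return [round(b - a, 2) for a, b in zip(ordered, ordered[1:])]


# m/z values reported for compound 1 (Figure 2a)
observed = [137.98, 275.06, 412.07, 549.07, 686.06]

# Neutral monomer mass implied by treating the lowest ion as [M + H]+
monomer_mass = observed[0] - PROTON_MASS
spacings = multimer_spacings(observed)

# Assumed tolerance of 0.2 Da, chosen here only for illustration
is_series = all(abs(s - monomer_mass) <= 0.2 for s in spacings)

print(f"implied monomer mass: {monomer_mass:.2f} Da")
print("spacings:", spacings)
print("consistent with a multimer series:", is_series)
```

Each spacing is compared with the monomer mass implied by the base peak rather than with a literature value, so the check makes no assumption about the identity of the compound.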
The HPLC elution fraction of compound 2 was collected and directly analyzed by MS for identification. However, because of the low sample concentration, the [M + H]+ ion corresponding to compound 2 could not be distinguished from the strong background ions. Therefore, an elution fraction of 100 mL was prepared, lyophilized, and redissolved in 1 mL pure methanol, giving a highly concentrated sample of compound 2. As shown in Figure 3a, this concentrated sample contains only compound 2, at a much higher level than the original samples (Figure 1, CP samples 1, 2 and 4), and exhibits a yellow color (Figure 2b). Upon the second MS analysis of compound 2 (Figure 3b), three major ions with m/z 142.97, 338.12 and 675.36 were observed on the spectrum. The ion with m/z 338.12 most likely corresponds to the [M + H]+ ion of compound 2, because the other major ion with m/z 675.36 should correspond to the [2M + H]+ ion derived from m/z 338, and the minor ion with m/z 360.10 observed in Figure 3b should correspond to the [M + Na]+ ion derived from m/z 338. Further MS/MS fragmentation of the [M + H]+ ion with m/z 338 gave two major ions with m/z 320.92 and 303.38 (Figure 3c). In addition, as shown in Figure 3c, several minor fragment ions with m/z less than 100 were also observed on the MS/MS spectrum of the ion with m/z 338; these were found on the MS/MS spectrum of the ion with m/z 142.97 as well (Figure 3d), indicating that the ion with m/z 142.97 observed in Figure 3b was actually derived from m/z 338. Therefore, it could be concluded that the yellow-colored compound 2 has a molecular weight of 337 Da and can be fragmented into at least three ions with m/z 321, 303, and 143.
However, compound 2 still could not be successfully identified based on the MS and MS/MS data alone.
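The adduct reasoning used for compound 2 can be reproduced numerically. The sketch below assumes that m/z 338.12 is the [M + H]+ ion, predicts where the [2M + H]+ and [M + Na]+ ions should then appear, and compares those predictions with the reported ions. The proton and sodium-cation masses are standard values; the helper name and the 0.3 Da tolerance are assumptions made for illustration only.

```python
# Illustrative adduct consistency check for compound 2 (not the authors' code).
PROTON = 1.00728       # Da, mass of H+
SODIUM_ION = 22.98922  # Da, mass of Na+


def predict_adducts(mz_protonated):
    """Predict adduct m/z values from an assumed singly charged [M + H]+ ion."""
    neutral = mz_protonated - PROTON
    return {
        "[M + H]+": neutral + PROTON,
        "[2M + H]+": 2 * neutral + PROTON,
        "[M + Na]+": neutral + SODIUM_ION,
    }


# Ions reported in Figure 3b
observed = {"[M + H]+": 338.12, "[2M + H]+": 675.36, "[M + Na]+": 360.10}

predicted = predict_adducts(observed["[M + H]+"])
TOLERANCE = 0.3  # Da, assumed for illustration

matches = {
    label: abs(predicted[label] - observed[label]) <= TOLERANCE
    for label in observed
}

for label in observed:
    print(f"{label}: predicted {predicted[label]:.2f}, "
          f"observed {observed[label]:.2f}, match={matches[label]}")
```

The neutral mass implied by this check is about 337.1 Da, in line with the molecular weight of 337 Da stated above.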
As shown in Figure 4a, two major ions with m/z 125.10 and 169.04 were observed on the MS spectrum of compound 3. Further MS/MS of the latter (Figure 4b) shows that it is mainly broken down into one fragment ion with m/z 125.11, suggesting that the ion with m/z 125.10 observed in Figure 4a was already due to the fragmentation of m/z 169. Therefore, the ion with m/z 169 is the [M–H]− ion of compound 3, indicating that compound 3 has a molecular weight of 170 Da and that its specific MS/MS fragment ion is that with m/z 125. These results were consistent with the reported MS and MS/MS data of gallic acid [34]. Finally, compound 3 was unambiguously identified as gallic acid by HPLC (Figure 4c) and UV-vis spectrum (Figure 4d) analysis using commercial gallic acid as the external standard.
According to the HPLC analysis (Figure 1), compound 4 was present in the samples at a very low concentration, so an elution fraction of 50 mL was collected by HPLC, lyophilized, and redissolved in 1 mL pure methanol, giving a more concentrated sample of compound 4 (Figure 5a) for further identification by MS. This concentrated sample also exhibits a light-yellow color, but it could not be concluded that compound 4 is itself yellow, because a certain level of contamination by compound 2 was also found in the sample (Figure 5a), which could contribute to the yellow color.
As shown in Figure 5b, the MS and MS/MS spectra obtained for compound 4 are, surprisingly, quite similar to those of compound 2 (Figure 3b,c), on the basis of which we could conclude that compound 4 also has a molecular weight of 337 Da, the same as that of compound 2.
It should be noted that compound 3 (gallic acid, peak 3) is eluted between compound 2 (peak 2) and compound 4 (peak 4) based on HPLC analysis (Figure 1, CP sample 1, in black). If the peak of contaminant compound 2 observed in Figure 5a had been introduced by a poor manual collection of peak 4 during HPLC purification, then a peak corresponding to compound 3 should also be observed in Figure 5a, which is not the case. In addition, UV-vis spectrum analysis of the concentrated samples of compounds 2 and 4 (Figure 5d) showed that these two compounds have almost the same absorption behavior in the 190–600 nm spectral range. Combined with the MS analysis results described above, this information indicated a strong relationship between compounds 4 and 2: compound 4 is most likely a tautomer of compound 2 that is less stable than the latter in solution, so that the two tautomeric forms exist in equilibrium. Like compound 2, compound 4 could not be identified based on the current data.
On the MS spectrum of compound 5 (Figure 6a), only one major ion with m/z 153.04 was observed, which corresponds to the [M–H]− ion of compound 5, and its MS/MS fragmentation gives only one ion with m/z 109.24 as a specific fragment ion (Figure 6b). It was previously reported that protocatechuic acid (PA), whose molecular weight is 154 Da, is one of the main phenolics in coffee pulp [35]. Therefore, compound 5 was assigned to PA, and further confirmation was performed by HPLC (Figure 6c) and UV-vis spectrum analysis (Figure 6d) using authentic PA.
As shown in Figure 1, a significant decrease in the peak of compound 6 (in black) was observed after CHCl3 treatment (in red), indicating that compound 6 is most likely caffeine. As expected, compound 6 was assigned to caffeine after HPLC verification using the commercial caffeine standard (Figure 7a). Moreover, a small peak 7 at the HPLC retention time of caffeine was found after CHCl3 treatment, which could be either caffeine residue or a new compound 7 that co-eluted with caffeine during HPLC analysis. As shown in Figure 7b, the UV-vis spectrum of peak 7 was distinct from that of caffeine, indicating that caffeine was completely removed by CHCl3 treatment and that peak 7 corresponds to a new compound 7, which cannot be detected in the presence of higher levels of caffeine. Finally, compound 7 was unambiguously identified as 3-caffeoylquinic acid (theoretical molecular weight of 354 Da) by HPLC-MS (Figure 7c) and HPLC (Figure 7d) analysis using commercial 3-caffeoylquinic acid as the external standard.
For compound 8, only one major ion with m/z 481.47 was observed on its MS spectrum (Figure 8a), which certainly corresponds to the [M–H]− ion of compound 8. Further MS/MS fragmentation showed that this ion was broken down into several fragment ions with m/z 463.26, 437.32, 372.20, 319.41, and 301.32 (Figure 8b). Among these fragment ions, the one with m/z 463.26 was derived from the precursor ion with m/z 481.47 by water elimination (481 − 18 = 463 Da), and the one with m/z 319.41 could be obtained either from the precursor ion or the fragment ion with m/z 463 by loss of the glycosyl moiety (481 − 162 = 319 Da). This result suggested that compound 8 is most likely a glycosidic compound, and its specific MS/MS fragment ion is that with m/z 319, which also corresponds to the [M–H]− ion of its aglycone (320 Da). Therefore, compound 8 was identified as dihydromyricetin glycoside.
Compound 9, whose molecular weight is 628 Da according to the MS analysis (Figure 8c), was found only in CP sample 4 (Figure 1). Although several fragment ions were observed on the MS/MS spectrum of compound 9 (Figure 8d), the presence of those with m/z 481 and 319 was more important for identification, as it would indicate that compound 9 is most likely a derivative of dihydromyricetin glycoside (compound 8). In addition, the fragment ions with m/z 609.84 and 481.44 would result from the precursor ion with m/z 627.50 by the loss of one water molecule (627 − 18 = 609 Da) and of a rhamnosyl moiety (627 − 146 = 481 Da), respectively, and further elimination of a second glycosyl from the latter ion would yield the ion with m/z 319.23 (481 − 162 = 319 Da). Therefore, compound 9 was identified as dihydromyricetin rhamnosylglycoside.
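The neutral-loss arithmetic used for compounds 8 and 9 can be sketched as a small lookup. The code below computes the mass difference between a precursor and a fragment ion and labels it against a short table of nominal losses taken from the text (18 Da for water, 146 Da for a rhamnosyl unit, 162 Da for a glycosyl unit). The function name and the 0.5 Da tolerance are assumptions made for illustration and are not part of the original analysis.

```python
# Illustrative neutral-loss labelling (not the authors' code).
# Nominal losses as used in the text, in Da
NOMINAL_LOSSES = {"water": 18, "rhamnosyl": 146, "glycosyl": 162}


def label_loss(precursor_mz, fragment_mz, tolerance=0.5):
    """Return the nominal loss matching precursor - fragment, or None."""
    delta = precursor_mz - fragment_mz
    for name, mass in NOMINAL_LOSSES.items():
        if abs(delta - mass) <= tolerance:
            return name
    return None


# (precursor m/z, fragment m/z) pairs reported for compounds 8 and 9
transitions = [
    (481.47, 463.26),  # compound 8
    (481.47, 319.41),  # compound 8
    (627.50, 609.84),  # compound 9
    (627.50, 481.44),  # compound 9
    (481.44, 319.23),  # compound 9, second step
]

labels = [label_loss(p, f) for p, f in transitions]
for (p, f), name in zip(transitions, labels):
    print(f"{p:.2f} -> {f:.2f}: loss of {p - f:.2f} Da ({name})")
```

Fragments whose mass difference is not in the table, such as m/z 437.32 from compound 8, are returned as None rather than being forced onto a label.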
Compound 10 was observed only in CP sample 7 and was unambiguously identified as theobromine based on HPLC and UV-vis spectrum analysis using commercial theobromine as the external standard (Figure 9).
In summary, compounds 1, 3, 5, 6, 7, and 10 were unambiguously identified as trigonelline, gallic acid, protocatechuic acid, caffeine, 3-caffeoylquinic acid, and theobromine, respectively, including 3 alkaloids and 3 phenolic acids.
Compounds 2 and 4 were still unknown, but current data indicated that they were most likely two tautomers with a molecular weight of 337 Da. As for compounds 8 and 9, they were successfully identified as two dihydroflavonol glycosides, namely dihydromyricetin glycoside (DMG) and dihydromyricetin rhamnosylglycoside (DMRG); however, determining their exact structures requires further NMR analysis.
In addition, the quantification and distribution of these 10 compounds in the different CP samples are summarized in Table 2, and their identification information is summarized in Table 3.