1. Introduction
Cardiac auscultation, i.e., listening to heart sounds via a stethoscope, has long been practiced by physicians as an essential part of the physical examination [1,2]. Several pathological conditions, such as valvular stenosis or regurgitation, abnormalities in heart rhythm, or heart failure, may be detected via auscultation well before the appearance of any symptoms. For this reason, the assessment of heart sounds plays a key role in the early diagnosis of many cardiovascular diseases (CVDs) [1,2,3,4].
The heart sounds originate from the motion of the heart valves, the contraction and relaxation of the cardiac muscle, pressure variations in the heart cavities, and the flow of blood through the heart and great vessels during the cardiac cycle. These events cause mechanical vibrations that are transmitted to the chest surface, where they can be heard. Generally, two physiological heart sounds can be auscultated in a healthy adult. They mainly capture the high-frequency, high-amplitude vibrations produced by heart valve activity. Specifically, the first physiological heart sound, commonly referred to as “S1”, is generated by the closure of the mitral and tricuspid valves and the subsequent opening of the semilunar valves at the onset of ventricular systole, while the second physiological heart sound, commonly referred to as “S2”, is caused by the closure of the aortic and pulmonary valves at the onset of ventricular diastole [1,2,3,4,5,6]. Most of the information content of S1 and S2 is found at frequencies lower than 150 Hz [3]. Occasionally, additional heart sounds, due to low-pitch and low-intensity vibrations of cardiac structures, can be heard. The third heart sound (S3) is attributed to vibrations of the ventricular walls, which are produced by blood deceleration when the ventricles reach their limit of distensibility at the end of rapid filling. S3 is commonly auscultated in young people, but above the age of 40 years it could be a sign of various CVDs, such as heart failure, mitral stenosis, and aortic regurgitation. The fourth heart sound (S4) is caused by atrial contraction and blood flowing through the valvular orifice in late ventricular diastole. The presence of S4 is usually associated with a severe pathological condition, such as coronary artery disease, left ventricular hypertension, or ischemic heart disease. Apart from S1, S2, S3, and S4, other acoustic phenomena originating from turbulent blood flow, known as murmurs, can be auscultated. Pathological murmurs are very useful in detecting valvular dysfunctions, such as stenoses and regurgitations [1,2,3,4,5,6].
Despite its great diagnostic capability, cardiac auscultation is a qualitative examination, which strongly depends on the hearing acuity and the expertise of the physician [1,3,4]. To overcome these limitations, some techniques have been proposed that apply frequency shifts to heart sounds, in order to move their energy to a frequency band where the human ear is more sensitive [7,8,9].
Phonocardiography (PCG) is the method of recording the heart sounds by means of electronic stethoscopes. This technique preserves the diagnostic value of cardiac auscultation while obviating the subjectivity of human hearing. PCG enables not only the amplification, digitization, storage, and visualization of the heart sounds, but also the recording of vibrations that cannot be perceived by the human ear. Visual inspection of PCG signals allows a more effective discrimination between physiological and pathological heart sounds, as well as between heart sounds and murmurs, and different types of murmurs can also be distinguished. In addition, information on the timing of the heart sounds with respect to the cardiac cycle, as well as measurements of their intensity, frequency, and duration, can be obtained from the analysis of PCG signals. The correlation between variations of these parameters and different pathological conditions could provide a valuable aid in the diagnosis of many CVDs [2,3,6,10,11,12,13,14,15,16].
A challenging task in PCG signal analysis is the localization of the heart sounds, especially S1 and S2. Because stethoscopes are sensitive to environmental noise and other sounds from the human body (e.g., respiratory sounds, lung sounds, rumbling of the stomach and intestine), denoising is strongly required to improve the accuracy of heart sound localization [10,17,18,19,20,21]. Several techniques, such as the short-time Fourier transform, fast wavelet transform, tunable-Q wavelet transform, and S transform, have been used to accomplish this task [17,18,19,20,21,22,23,24,25,26,27]. Different approaches have been proposed in the literature for heart sound localization. Most of these methods take advantage of a simultaneous Electrocardiography (ECG) tracing as a reference signal. However, in the last two decades, research has focused on the development of automated algorithms that perform heart sound localization without using any reference signal. These tools can be essentially categorized into envelogram-based methods and artificial intelligence-based methods [10,17,18,19,20,21].
Envelogram-based methods generally extract an envelope from the PCG signal by using a Shannon energy operator [24,28,29], a Hilbert transform [30,31,32], or a Teager–Kaiser energy operator [26], among others [33,34]. A fixed or adaptive threshold is then applied to the envelope to locate the peaks and thereby identify the boundaries of the signal chunks corresponding to heart sounds. However, thresholding alone often fails to selectively detect peaks related to heart sounds. Moreover, most envelogram-based methods are unable to discriminate S1 and S2 [23,26,29,32], or they need to consider additional criteria, such as the typical duration of systole and diastole, to distinguish them [22,24,28,31,33,34].
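As a rough illustration of the envelogram idea described above, the following Python sketch computes a moving-average Shannon energy envelope and marks the spans where it exceeds a fixed threshold. The function names, the window length, and the threshold value are assumptions made for this example; they are not taken from any of the cited methods.

```python
import math


def shannon_envelope(signal, window=20):
    """Moving-average Shannon energy of an amplitude-normalized signal.

    `window` (in samples) is a hypothetical smoothing length.
    """
    peak = max(abs(x) for x in signal) or 1.0  # avoid division by zero
    energy = []
    for x in signal:
        x2 = (x / peak) ** 2
        # Shannon energy: -x^2 * log(x^2), taken as 0 when x == 0
        energy.append(-x2 * math.log(x2) if x2 > 0 else 0.0)

    half = window // 2
    envelope = []
    for i in range(len(energy)):
        lo, hi = max(0, i - half), min(len(energy), i + half + 1)
        envelope.append(sum(energy[lo:hi]) / (hi - lo))
    return envelope


def segments_above(envelope, threshold):
    """Return (start, end) index pairs where the envelope exceeds the threshold."""
    segments, start = [], None
    for i, value in enumerate(envelope):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(envelope)))
    return segments
```

Note that the segments returned this way carry no label: nothing in the envelope itself says whether a given segment is S1 or S2, which is the limitation discussed above.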
On the other hand, artificial intelligence-based methods extract many time-domain, frequency-domain, or time-frequency-domain features from each PCG segment, which are then used to train a classification model. Several machine learning or deep learning algorithms, such as k-nearest neighbor, support vector machine, hidden Markov model, decision tree, k-means clustering, logistic regression, and neural networks, have been implemented to discriminate between S1, S2, S3, and S4 heart sounds, to distinguish heart sounds from murmurs [25,27,35,36,37,38,39,40,41,42,43], and to recognize abnormal heart sounds [44,45]. Although they achieve high classification performance, artificial intelligence-based methods carry a far greater computational burden than envelogram-based algorithms and require a priori knowledge of the heart sounds for labelling PCG segments and training a classifier.
Recently, Forcecardiography (FCG) has been introduced as a novel, non-invasive technique for cardio-respiratory monitoring [46,47,48,49]. FCG records the local forces induced on the chest wall by the mechanical activity of the heart and lungs by means of piezoresistive and piezoelectric force sensors, which have also been used to record sphygmic waves [50]. These sensors are equipped with dome-shaped mechanical couplers to ensure an efficient transmission of the force from human tissues to the sensors' active area [46,47]. The wide bandwidth of FCG sensors allows them to monitor respiration [48], infrasonic cardiac vibrations [46,49], and heart sounds, all simultaneously from a single contact point on the chest [47]. This capability supports the accurate estimation of inter-breath and inter-beat intervals [46,47,48], as well as cardiac time intervals, such as the pre-ejection period and the left ventricular ejection time [51,52]. The infrasonic cardiac vibrations captured using FCG can be divided into two components: a low-frequency component, related to the emptying and filling of the heart chambers, and a high-frequency component, related to the opening and closure of the heart valves, which also exhibits a very high similarity with accelerometric Seismocardiography (SCG) signals [46,47]. The infrasonic FCG components have also been shown to be affected by respiration, which causes both amplitude modulations and morphology variations [48,52,53]. In [47], a comparison was carried out that confirmed the high similarity between PCG signals and the audible component of FCG signals, both in terms of morphology and acoustic impression.
All measurements and analyses related to events of the cardiac cycle rely on a fundamental task, i.e., the localization of heartbeats. In cardio-mechanical signals, such as PCG, SCG, and FCG, this task is usually performed with the support of a concurrent ECG tracing, which, however, limits the standalone application of these cardio-mechanical monitoring techniques. To address this issue, a template matching algorithm has been proposed for ECG-free heartbeat localization in cardio-mechanical signals [54,55]. A template is selected from the analyzed signal to capture the typical heartbeat morphology, and the algorithm evaluates the similarity between the template and the whole signal by calculating the normalized cross-correlation (NCC) function. A high similarity between the template and any signal chunk results in a local maximum of the NCC function, which is taken as a heartbeat marker. The template matching algorithm has been tested on SCG and Gyrocardiography signals from 29 healthy subjects and 100 patients with valvular pathologies. The high accuracy achieved in heartbeat localization, as well as the high correlation and negligible errors obtained in inter-beat interval estimation, demonstrated that the template matching approach is a very simple, effective, and robust solution for continuous heart rate monitoring via a standalone cardio-mechanical approach. Moreover, the feasibility of heart rate variability (HRV) analysis on the inter-beat intervals obtained by using the ECG-free heartbeat localization method based on template matching was investigated [56]. Many time-domain, frequency-domain, and non-linear HRV indices were computed from the time series of these inter-beat intervals and were found to be in very close agreement with those provided by the reference ECG signals.
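The NCC-based matching described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in [54,55]: the function names, the min_height and min_distance parameters, and the simple peak-picking rule are all hypothetical choices made for the example.

```python
import math


def normalized_cross_correlation(signal, template):
    """NCC between the template and every same-length chunk of the signal."""
    m = len(template)
    t_mean = sum(template) / m
    t_dev = [t - t_mean for t in template]
    t_norm = math.sqrt(sum(d * d for d in t_dev))

    ncc = []
    for start in range(len(signal) - m + 1):
        chunk = signal[start:start + m]
        c_mean = sum(chunk) / m
        c_dev = [c - c_mean for c in chunk]
        c_norm = math.sqrt(sum(d * d for d in c_dev))
        denom = t_norm * c_norm
        # A flat chunk has zero variance; report zero similarity for it
        if denom > 0:
            ncc.append(sum(a * b for a, b in zip(t_dev, c_dev)) / denom)
        else:
            ncc.append(0.0)
    return ncc


def local_maxima(values, min_height=0.8, min_distance=1):
    """Indices of interior local maxima above min_height.

    When two maxima are closer than min_distance samples, the higher one is kept.
    """
    peaks = []
    for i in range(1, len(values) - 1):
        if values[i] >= min_height and values[i - 1] < values[i] >= values[i + 1]:
            if peaks and i - peaks[-1] < min_distance:
                if values[i] > values[peaks[-1]]:
                    peaks[-1] = i
            else:
                peaks.append(i)
    return peaks
```

Because each chunk is mean-centered and scaled by its own norm, a beat that matches the template in shape scores close to 1 whatever its amplitude. Inter-beat intervals then follow from the differences between consecutive marker indices divided by the sampling frequency.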
In this study, the template matching algorithm described above was applied to the heart sound component of the FCG signals (HS-FCG) of healthy subjects to investigate its suitability for localizing and discriminating physiological S1 and S2 heart sounds without using any reference tracing. Then, the temporal locations of S1 and S2 were used to estimate inter-beat intervals. Finally, an HRV analysis was carried out on these intervals. Both the inter-beat interval estimates and the HRV indices obtained using the proposed approach were compared with those provided by simultaneous ECG recordings.
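To make the HRV step concrete, the sketch below computes a few common time-domain indices from a series of inter-beat intervals in milliseconds. It assumes the conventional definitions (SDNN as the sample standard deviation of the intervals, RMSSD as the root mean square of successive differences, NN50 as the count of successive differences larger than 50 ms); the function name and the returned dictionary layout are illustrative and may differ from the procedure actually used in the study.

```python
import math


def time_domain_hrv(ibi_ms):
    """Time-domain HRV indices from inter-beat intervals given in milliseconds.

    Requires at least two intervals.
    """
    n = len(ibi_ms)
    mean_nn = sum(ibi_ms) / n
    # Sample standard deviation of the intervals
    sdnn = math.sqrt(sum((x - mean_nn) ** 2 for x in ibi_ms) / (n - 1))

    # Differences between consecutive intervals
    diffs = [b - a for a, b in zip(ibi_ms, ibi_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    nn50 = sum(1 for d in diffs if abs(d) > 50)
    pnn50 = 100.0 * nn50 / len(diffs)

    return {
        "mean_nn": mean_nn,
        "sdnn": sdnn,
        "rmssd": rmssd,
        "nn50": nn50,
        "pnn50": pnn50,
    }
```

The same function can be applied to the intervals derived from the heart sound markers and to those derived from the ECG, so that the two sets of indices can be compared one by one.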
The article is organized as follows:
Section 2 summarizes the measurement setup and data collection procedure adopted for FCG signal acquisition in [47] and describes the proposed template matching approach and the methodology for performance evaluation;
Section 3 describes the results of performance evaluation;
Section 4 discusses the results and highlights the limitations of the study;
Section 5 presents the conclusions of the study and suggests possible future developments.
4. Discussion
This study proposes the use of the template-matching technique for accurate recognition of heart sounds. In particular, the effectiveness of this technique in separately recognizing S1 and S2 heart sounds was investigated. The template-matching technique has several advantages with respect to other approaches. The approaches based on energy or envelope operators can easily recognize the presence of heart sounds but are usually unable to distinguish between S1 and S2 sounds and are very sensitive to various sources of noise. Instead, template matching based on normalized cross-correlation is able to recognize the waveform of a specific heart sound because it evaluates the morphological similarity, which also prevents noise from being incorrectly recognized as a heart sound, regardless of its amplitude. On the other hand, machine learning techniques have a much higher computational load than the template-matching technique, which could easily be implemented in real time on microcontrollers, paving the way for long-term patient monitoring applications. It is worth underlining that the proposed technique does not require a synchronously acquired ECG signal: the ECG was only used here for performance evaluation.
The results of this preliminary study are very encouraging. In fact, the proposed approach proved capable of separately classifying S1 and S2 sounds in more than 96% of all heartbeats. Linear regression, correlation, and Bland–Altman analyses showed that the template matching method allowed the estimation of inter-beat intervals with high accuracy. Indeed, 95% of the estimation errors were confined within 10 ms, which corresponds to relative errors of at most 2% for heart rates between 50 and 120 bpm (at 120 bpm the inter-beat interval is 500 ms, so 10 ms amounts to 2%; at 50 bpm it is 1200 ms, so 10 ms amounts to about 0.8%). Further statistical analyses showed that HRV indices were estimated with reasonable accuracy, with mean absolute percentage errors within 8% for all time-domain and non-linear indices, apart from NN50 and pNN50. Higher errors, within 32%, were found for frequency-domain indices.
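For readers unfamiliar with the agreement metrics mentioned above, the following Python sketch shows how a Bland–Altman bias with limits of agreement and a mean absolute percentage error could be computed from paired estimates. It assumes the conventional definitions (limits of agreement taken as the bias plus or minus 1.96 sample standard deviations of the differences); the function names are hypothetical, and the exact procedure used in the study may differ.

```python
import math


def bland_altman(estimates, reference):
    """Return (bias, lower limit, upper limit) of agreement for paired values."""
    diffs = [e - r for e, r in zip(estimates, reference)]
    bias = sum(diffs) / len(diffs)
    # Sample standard deviation of the paired differences
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (len(diffs) - 1))
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def mean_absolute_percentage_error(estimates, reference):
    """MAPE in percent; reference values must be non-zero."""
    ratios = [abs(e - r) / abs(r) for e, r in zip(estimates, reference)]
    return 100.0 * sum(ratios) / len(ratios)
```

Here the reference values would be the ECG-derived inter-beat intervals or HRV indices, and the estimates those obtained from the heart sound markers.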
It is interesting to note that the best results were obtained by considering only the first sound S1. This can be explained by recalling that the second sound undergoes substantial morphological changes in relation to respiration (the well-known physiologic splitting of S2).
Limitations of the Study
The heart sound recordings considered in this study were acquired in a quiet environment. The proposed algorithm needs to be further verified in the presence of relevant external noise before it can be adopted as a heart sound recognizer for long-term monitoring of patients during their daily activities.
All the tests were carried out on a limited number of subjects sitting at rest. Studies on a larger cohort of subjects are undoubtedly needed for a proper assessment of this technique, also considering that subjects may change posture and move freely. In such cases, morphological variations of heart sounds might occur, which might make the template-matching technique less effective. However, a dynamic update of the template might be considered to overcome such possible drawbacks.
The cardiac sound recordings considered in this study were acquired only from healthy subjects. It was therefore not possible to verify the recognition of S1 and S2 sounds in conjunction with other heart sounds (i.e., S3 and S4) or murmurs, which are generally associated with pathologies. Further tests on pathological subjects are planned to thoroughly assess the accuracy of heart sound recognition via template matching.