Article

An Intelligent Athlete Signal Processing Methodology for Balance Control Ability Assessment with Multi-Headed Self-Attention Mechanism

1 Sports Training Institute, Shenyang Sport University, Shenyang 110115, China
2 Department of Kinesiology, Shenyang Sport University, Shenyang 110115, China
3 School of Aerospace Engineering, Shenyang Aerospace University, Shenyang 110136, China
4 Key Laboratory of Structural Dynamics of Liaoning Province, College of Sciences, Northeastern University, Shenyang 110819, China
* Authors to whom correspondence should be addressed.
Submission received: 9 July 2022 / Revised: 29 July 2022 / Accepted: 3 August 2022 / Published: 6 August 2022
(This article belongs to the Special Issue Applied Computing and Artificial Intelligence)

Abstract
In many kinds of sports, balance control ability plays an important role for every athlete. Coaches and athletes therefore need accurate and efficient assessments of balance control ability in order to improve training performance scientifically. With the fast growth of sports technology and training devices, intelligent and automatic assessment methods have been in high demand in recent years. This paper proposes a deep-learning-based method for balance control ability assessment that analyzes time-series signals collected from athletes. The proposed method processes the raw data directly and provides the assessment results through an end-to-end structure, which facilitates its practical application. A deep learning model with a multi-headed self-attention mechanism is employed to explore the target features, a new approach to sports assessment. In the experiments, real athlete balance control ability assessment data are used to validate the proposed method. In comparisons with existing methods, the accuracy of the proposed method exceeds 95% for all four tasks and is higher than that of the compared methods on the tasks containing more than one athlete per level. The results show that the proposed method works effectively and efficiently in real scenarios for athlete balance control ability evaluation. However, reducing the proposed method's computational cost remains an important task for future studies.

1. Introduction

Because of its significance, almost every sport requires accurate and efficient assessment of athletes' balance control ability [1]. Meanwhile, the scientific management of athletes, including selection, training, and competition, depends on good assessments of this ability. Accurate assessment is very difficult, however: massive and complex data are produced during training and events, and large amounts of expert knowledge and human labor are required to uncover the underlying ability of the athletes from these data, which makes such assessments hard to carry out in practical scenarios [2,3].
In recent years, with the rapid development of measurement devices and artificial intelligence technology, data-driven methods of balance control ability assessment have demonstrated excellent results [4]. In this paper, all of the utilized data were collected from the athletes using a movement pressure measurement machine. When an athlete stands on the machine, pressure signals that reflect the athlete's balance control ability are collected. In general, smaller movement pressure indicates better balance control, while larger movement pressure indicates a lower level of balance control [5]. Therefore, we can analyze the data to assess the balance control ability of the athletes and even explore their underlying abilities.
In traditional methods, statistical features such as the mean and root mean square are usually used to assess balance control ability. However, such features are too simple to capture the complex patterns in the collected data. In recent years, many signal processing methods have been used to extract better features, such as wavelet analysis [6] and stochastic resonance [7] techniques. In addition, machine learning and statistical inference techniques are popular for related problems, such as artificial neural networks (ANNs) [8], support vector machines (SVMs), random forests, and fuzzy inference [9,10,11]. Although these existing methods have achieved success, they are generally less capable of dealing with the collected movement pressure data, which contain a lot of noise. Furthermore, distinguishing the balance control ability of athletes at different levels is quite hard, especially for professional high-level freestyle skiing athletes, which makes assessments of balance control ability difficult with existing methods. This is also a great challenge for traditional data-driven approaches to related problems.
With the rapid development of computing technologies, deep neural networks (DNNs) have become the advanced methods of choice in artificial intelligence in recent years [12,13,14,15,16], achieving effective and fruitful results in many fields, such as image recognition [17,18,19,20,21] and natural language processing [22]. DNNs can achieve high prediction accuracy by training on big data to automatically learn the mapping between the input data and the output target, and they can analyze the input data without prior signal processing knowledge or domain expertise. Therefore, DNNs are well suited to the assessment of balance control ability from freestyle skiing athlete data.
For the analysis of time-series data, recent studies [23,24,25,26,27,28] show many good applications of deep neural network models, which can achieve higher feature-learning efficiency. Therefore, deep learning is being applied to various types of time-series data, such as financial analysis, traffic monitoring, industrial optimization, and machinery fault diagnosis [29,30,31,32,33]. In a related study [34], a deep-learning-based LSTM method was used for COVID-19 pandemic spread forecasting and achieved great success.
However, basic deep neural network models with simple structures cannot handle real tasks with complicated data well. Adding neurons and layers normally enhances the learning ability, but the computing-power consumption increases at the same time. Moreover, very deep architectures generally cause losses of feature information.
In this paper, a novel multi-headed self-attention mechanism is proposed to address the assessment problem of balance control ability for freestyle skiing athletes. The main novelties and contributions of this paper are as follows:
  • A multi-headed self-attention mechanism is used to automatically learn features with many standalone heads and to process the information with a residual connection structure. The heads share the same structure but have different initial parameters, so they can explore different features at the same time. This structure is very advanced in the field of deep learning, and through it the deep features of the data can be explored efficiently;
  • As one of the first attempts, this paper proposes a deep-learning-based method for automatic feature exploration and freestyle skiing athlete balance control ability assessment, which has seldom been studied in the current literature;
  • A real freestyle skiing athlete under-feet movement pressure measurement dataset is used to validate the proposed method, which shows high assessment accuracy and promise for applications in real scenarios.
However, it must be noted that the proposed method becomes inefficient when processing high-dimensional data, because the multi-headed self-attention structure incurs excessive calculation costs.
In this paper, the preliminary aspects are described in Section 1. The proposed method is presented in Section 2. The experiments used to validate the proposed method and the results are presented in Section 3. We close the paper with our conclusions in Section 4.

2. Dataset and Methodology

2.1. Dataset

2.1.1. Introduction of the Dataset

In this paper, a dataset collected from real freestyle skiing aerial athletes involved in a balance control ability assessment task is used to validate the proposed method. The dataset includes a number of participants at different balance control levels. They are required to stand on a balance meter and try their best to keep still with their eyes closed; closing the eyes reduces visual disturbance and makes the participant focus on body control, yielding a better measure of balance control. The balance meter measures 65 cm × 40 cm and collects movement pressure data in the anteroposterior and mediolateral directions, denoted as Y and X, respectively. The data collection scenarios are shown in Figure 1.
The levels of balance control ability are divided into four classes. Specifically, the participants come from different groups: top freestyle skiing athletes, professional skill-sport athletes, normally trained students from non-skill sports, and common people. The four classes are denoted as A, B, C, and D, respectively, and the balance control ability decreases from A to D; for instance, group A has the best ability and group B the second best. We select three athletes from each level, represented by #1, #2, and #3, respectively. The athletes are required to keep their upper bodies stationary and to stand on both feet with their eyes closed in order to reduce body sway. The movement pressure data are sampled at 100 Hz. The information for the dataset used in this study is shown in Table 1.

2.1.2. Pre-Processing of the Dataset

In this study, the task is to predict the athletes' balance control ability levels by learning features from the collected data with the proposed method. To fully examine the performance of the proposed method, we implement five tasks with different training and testing datasets, which include different athletes at each level. The tasks are listed in Table 2. Every sample contains 200 consecutive points. These tasks, covering a wide range of experimental settings, allow the proposed and compared methods to be evaluated fairly. A windowing sketch is given below.
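As an illustration of this pre-processing, the following sketch segments a two-channel pressure recording into non-overlapping 200-point samples and splits them 4:1 into training and testing sets, as in Table 2. The array names, recording length, and random data are assumptions for illustration, not the authors' actual pipeline.

```python
import numpy as np

def make_windows(signal, window=200):
    # Cut a (T, 2) recording (X and Y pressure channels) into
    # non-overlapping samples of shape (window, 2).
    n = signal.shape[0] // window
    return signal[:n * window].reshape(n, window, signal.shape[1])

# Hypothetical recording: 60 s at 100 Hz, two channels (X, Y).
recording = np.random.randn(6000, 2)
samples = make_windows(recording)              # shape (30, 200, 2)

# 4:1 ratio of training to testing, as in Table 2.
split = int(0.8 * samples.shape[0])
train, test = samples[:split], samples[split:]
```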

2.2. Methods

The flow chart of our proposed method is displayed in Figure 2.

2.2.1. Proposed Method

In this paper, we propose a novel method based on the Transformer. The Transformer is a type of auto-encoder (AE). Auto-encoders are among the most popular neural network structures in current research and are widely used in many application scenarios, such as image classification, speech recognition, and video processing [35,36,37].
In general, an auto-encoder includes an encoder and a decoder, which are symmetric. The encoder compresses the input data, while the decoder decompresses the encoder's output back toward the original data. In brief, the auto-encoder is trained to reproduce its input, and in doing so it learns the features of the input data automatically. The process can be expressed as:
$$h(f(x)) \approx x$$
where $x$ is the input data, the function $f$ represents the encoder, and the function $h$ represents the decoder, which are inverse processes. The encoder and decoder can be built in different ways. For example, the Vanilla Auto-Encoder, the most primitive variant, is made of fully connected neural networks; convolutional neural networks (CNNs) are also used to build auto-encoders [38].
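To make the encoder and decoder roles concrete, here is a minimal fully connected auto-encoder in the spirit of the Vanilla Auto-Encoder described above; the layer sizes and the mean-squared-error objective are illustrative assumptions.

```python
import torch
import torch.nn as nn

class VanillaAutoEncoder(nn.Module):
    def __init__(self, in_dim=400, hidden=64):
        super().__init__()
        # Encoder f: compresses the input to a low-dimensional code.
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        # Decoder h: reconstructs the input from the code.
        self.decoder = nn.Linear(hidden, in_dim)

    def forward(self, x):
        return self.decoder(self.encoder(x))    # h(f(x)) ≈ x

model = VanillaAutoEncoder()
x = torch.randn(8, 400)                          # e.g., flattened 200 × 2 samples
loss = nn.functional.mse_loss(model(x), x)       # reconstruction objective
```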
As one of the latest and most powerful auto-encoders, the Transformer was originally used in natural language processing [39]; its encoder and decoder rely mainly on the self-attention mechanism. Beyond natural language processing, the Transformer has also been adapted to image classification and video processing tasks [40,41], and its effectiveness has been well validated for analyzing time-series signals.
The basic Transformer consists of an input layer, a multi-headed self-attention block, a normalization layer, a feedforward layer, and residual connections. Because the basic Transformer is used in natural language processing, the input layer includes word embedding and position embedding. The word embedding transforms the words of the input sentences into a series of vectors, while the position embedding describes the corresponding positions of the words in the sentence. The multi-headed self-attention block is the most important part for exploring the features of the input data. The structure of the basic Transformer is illustrated in Figure 3.
The basic Transformer consists of an encoder and a decoder. It is mostly used in natural language processing tasks, in which both the input and target data are sentences containing complex information. In such tasks, the encoder analyzes the input data, the decoder analyzes the target data, and the underlying connection between the two is explored. In this paper, only the encoder part of the basic Transformer is adopted, because the target data for our task are class labels without the complex structure of sentences; we only need to explore the input data and predict its class. The detailed structure is illustrated in Figure 4.
Before the Transformer encoder block, the dimensions of the input data are extended with a trainable pre-training linear layer in order to expose the deep information; the experiments will show the significant effectiveness of this layer. In addition, similar to most existing methods for natural language processing [42], we prepend a learnable embedding to the input data, whose state at the output of the Transformer encoder serves as the representation of the input data.
After the aforementioned operations, the input data for the Transformer encoder are already a series of vectors, so word embedding layers are not required. However, a position embedding layer is still needed to describe the time-point order of the series, analogous to the word positions in a sentence.
There are two common ways to achieve position embedding. The first is to randomly generate a series of vectors and update them during training; the second is to encode the positions with sine and cosine functions. We choose the first method in this paper. In both methods, a matrix with the same shape as the input data is created, its parameters are assigned by one of the above schemes, and the matrix is then added to the input data, as sketched below.
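A minimal sketch of the first (learnable) option, assuming input of shape (batch, seq_len, dim): a trainable matrix with the same shape as one sample is added element-wise to the input.

```python
import torch
import torch.nn as nn

class LearnablePositionEmbedding(nn.Module):
    def __init__(self, seq_len, dim):
        super().__init__()
        # Randomly initialized; updated by back-propagation during training.
        self.pos = nn.Parameter(torch.randn(1, seq_len, dim) * 0.02)

    def forward(self, x):            # x: (batch, seq_len, dim)
        return x + self.pos          # broadcast over the batch dimension

emb = LearnablePositionEmbedding(seq_len=200, dim=512)
out = emb(torch.randn(4, 200, 512))
```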
The core of the Transformer is the self-attention mechanism. Its function is to calculate the relationships between all parts of the input data, which are usually sequential; the relationships are expressed as a series of probabilities whose sum is one. According to these probabilities, the mechanism assigns different weights to the corresponding parts of the input data.
The self-attention mechanism is a modification of the attention mechanism. In the attention mechanism, the input consists of three matrices, Q, K, and V; K and V come from the input data, while Q generally comes from the output data. In the self-attention mechanism, all three matrices come from the input data. In addition, the attention mechanism is usually used to connect the outputs of the encoder and decoder, whereas the self-attention mechanism is the core of the encoder and decoder structures themselves. The generation of the three matrices can be expressed as:
$$X W^Q = Q, \qquad X W^K = K, \qquad X W^V = V$$
where $X$ is the input data, whose length equals the number of time steps; $W^Q$, $W^K$, and $W^V$ are three matrices with the same shape but different parameters, which are updated during training; and the operation is the matrix (dot) product. The calculation of the self-attention mechanism is defined as:
$$\mathrm{Self\text{-}Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{T}}{\sqrt{d_k}}\right) V$$
where $d_k$ is the dimension of the matrix $K$; scaling by $\sqrt{d_k}$ prevents the dot products from growing too large. The softmax function transforms the dot-product results into probabilities, which serve as the weights describing the relationships among all parts of the data.
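The two equations above translate directly into a few lines of code. The following is a generic single-head self-attention sketch, not the authors' exact implementation:

```python
import math
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        # W_Q, W_K, W_V: same shape, different trainable parameters.
        self.w_q = nn.Linear(dim, dim, bias=False)
        self.w_k = nn.Linear(dim, dim, bias=False)
        self.w_v = nn.Linear(dim, dim, bias=False)

    def forward(self, x):                            # x: (batch, seq_len, dim)
        q, k, v = self.w_q(x), self.w_k(x), self.w_v(x)
        scores = q @ k.transpose(-2, -1) / math.sqrt(k.size(-1))
        weights = scores.softmax(dim=-1)              # each row sums to one
        return weights @ v                            # weighted sum of the values

attn = SelfAttention(dim=512)
out = attn(torch.randn(4, 200, 512))                  # shape is preserved
```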
Based on the self-attention mechanism, the multi-headed self-attention block can explore the features of the input data effectively. The multi-headed approach uses many standalone self-attention blocks to process the same data in parallel and then integrates the results of every block; each such block is called a head.
In general, there are two ways to build a multi-headed self-attention block. In the first, the input data are mapped to Q, K, and V without changing the shape, and each matrix is evenly divided into many small matrices, which are then processed by the self-attention mechanism. In the second, the input data are mapped to Q, K, and V whose shape equals the per-head dimension multiplied by the number of heads; self-attention is computed on each head, and the result is finally mapped back to the input shape by a linear projection. In this way, the number of heads can be set freely, at the cost of more computing power. In this paper, we select the latter approach, sketched below.
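Below is a sketch of this second approach under common conventions: the input is projected to head_dim × heads dimensions per matrix, attention is computed per head, and a final linear projection restores the input shape. The class and parameter names are our own, not the authors'.

```python
import math
import torch
import torch.nn as nn

class MultiHeadSelfAttention(nn.Module):
    def __init__(self, dim, heads, head_dim):
        super().__init__()
        inner = heads * head_dim                      # expand instead of splitting
        self.heads = heads
        self.to_qkv = nn.Linear(dim, inner * 3, bias=False)
        self.proj = nn.Linear(inner, dim)             # map back to the input shape

    def forward(self, x):                             # x: (batch, seq_len, dim)
        b, n, _ = x.shape
        qkv = self.to_qkv(x).chunk(3, dim=-1)
        # Reshape each of q, k, v to (batch, heads, seq_len, head_dim).
        q, k, v = (t.view(b, n, self.heads, -1).transpose(1, 2) for t in qkv)
        scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
        out = scores.softmax(dim=-1) @ v              # attention per head
        out = out.transpose(1, 2).reshape(b, n, -1)   # concatenate the heads
        return self.proj(out)

mhsa = MultiHeadSelfAttention(dim=512, heads=8, head_dim=32)
y = mhsa(torch.randn(4, 200, 512))                    # shape preserved: (4, 200, 512)
```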
In the softmax function, let $x^{(i)}$ denote the input samples and $r^{(i)}$ the corresponding class labels, $i = 1, 2, \ldots, N$, where $N$ is the number of training samples. We also have $x^{(i)} \in \mathbb{R}^{d \times 1}$ and $r^{(i)} \in \{1, 2, \ldots, L\}$, where $L$ is the total number of target classes in this paper. Given the input data $x^{(i)}$, the function gives the probability $p(r^{(i)} = j \mid x^{(i)})$ for each class label. The calculation is:
$$J_{\lambda}(x^{(i)}) = \begin{bmatrix} p(r^{(i)} = 1 \mid x^{(i)}; \lambda) \\ p(r^{(i)} = 2 \mid x^{(i)}; \lambda) \\ \vdots \\ p(r^{(i)} = L \mid x^{(i)}; \lambda) \end{bmatrix} = \frac{1}{\sum_{l=1}^{L} e^{\lambda_l^{T} x^{(i)}}} \begin{bmatrix} e^{\lambda_1^{T} x^{(i)}} \\ e^{\lambda_2^{T} x^{(i)}} \\ \vdots \\ e^{\lambda_L^{T} x^{(i)}} \end{bmatrix}$$
where $\lambda = [\lambda_1, \lambda_2, \ldots, \lambda_L]^T$ represents the coefficients of the softmax function. The outputs of the softmax function are all positive and sum to one. Therefore, the softmax result can be used both to predict the probabilities of the target classes and to evaluate the relationships among the parts of the input data in the self-attention mechanism.
After the multi-headed self-attention block, a feedforward block further processes the output of the Transformer encoder block. Its core is an MLP consisting of two linear layers with a GELU non-linearity. In the Transformer encoder, a normalization layer is applied before both the multi-headed self-attention block and the feedforward block, and a residual connection follows each block. An MLP head, containing two linear layers, is placed after the Transformer encoder and acts as the classifier that predicts the class of the input data. A sketch of one encoder block is given below.
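Putting the pieces together, a pre-norm encoder block with residual connections and a GELU feedforward, followed by a simple MLP head, might look as follows. MultiHeadSelfAttention is the sketch from above; the layer sizes and the LayerNorm in the head are illustrative assumptions rather than the authors' exact configuration.

```python
import torch
import torch.nn as nn

class EncoderBlock(nn.Module):
    def __init__(self, dim=512, heads=8, head_dim=32, ff_dim=64):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)                # normalization before attention
        self.attn = MultiHeadSelfAttention(dim, heads, head_dim)
        self.norm2 = nn.LayerNorm(dim)                # normalization before feedforward
        # Feedforward block: two linear layers with a GELU non-linearity.
        self.ff = nn.Sequential(
            nn.Linear(dim, ff_dim), nn.GELU(), nn.Linear(ff_dim, dim))

    def forward(self, x):
        x = x + self.attn(self.norm1(x))              # residual connection
        x = x + self.ff(self.norm2(x))                # residual connection
        return x

block = EncoderBlock()
y = block(torch.randn(4, 200, 512))                   # shape preserved

# MLP head acting as the classifier (two linear layers, four classes A–D).
mlp_head = nn.Sequential(nn.LayerNorm(512), nn.Linear(512, 64),
                         nn.GELU(), nn.Linear(64, 4))
```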
Finally, we select the Adam optimizer for training the proposed method.
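A minimal training step under these settings might look as follows; the stand-in model and the cross-entropy objective are assumptions for illustration, with the batch size, learning rate, and sample dimension taken from Table 3.

```python
import torch
import torch.nn as nn

# Stand-in classifier: a pre-training linear layer on flattened 200 × 2 samples
# followed by a toy head; substitute the full encoder stack in practice.
model = nn.Sequential(nn.Flatten(), nn.Linear(400, 512),
                      nn.GELU(), nn.Linear(512, 4))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # learning rate, Table 3
criterion = nn.CrossEntropyLoss()

x = torch.randn(32, 200, 2)           # batch size 32, sample dimension 200 × 2
y = torch.randint(0, 4, (32,))        # labels: classes A–D encoded as 0–3

optimizer.zero_grad()
loss = criterion(model(x), y)         # cross-entropy over the softmax outputs
loss.backward()
optimizer.step()
```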

2.2.2. Compared Methods

The proposed Transformer model offers a new perspective on the assessment of athletes' balance control performance with artificial intelligence technology. To demonstrate the effectiveness and superiority of the proposed method, we also implement several popular methods from the current literature for comparison. The following methods are included.
1. NN. As a typical neural connection method, we select the basic neural network (NN), which includes one hidden layer with 1000 neurons, a leaky ReLU activation function, and other typical operations.
2. DNN. The deep neural network (DNN) extends the basic neural network structure. The DNN used here consists of three hidden layers with 1000, 1000, and 500 neurons; similar techniques, such as the leaky ReLU activation function, are likewise employed.
3. DSCNN. The deep single-scale convolutional neural network (DSCNN) is a basic and popular deep learning network, widely used as a building block of more complex networks such as LeNet-5, AlexNet, and VGG-16 [43,44,45]. In the comparison, we use a basic network with a single convolutional filter size for feature extraction.
4. RNN. The recurrent neural network (RNN) is a typical deep learning network that works well with sequential data, which makes it a strong baseline for demonstrating the advantage of the proposed method.
5. Random Forest. The random forest is a classical machine learning approach widely used for classification tasks. It consists of many decision trees, each of which works independently, and it performs well in the presence of noise, making it a suitable comparison method.

3. Results

3.1. Experiment Description

We organize our experiments as follows. Experiment 1 shows the necessity of the pre-training linear layer before the Transformer encoder. Experiment 2 searches for suitable hyper-parameters for the proposed method. Experiment 3 demonstrates the superiority of the proposed method through comparison with the other methods. The parameters used in the experiments are listed in Table 3. It should be noted that the test data are involved in the parameter selection process, so the reported accuracy scores might be slightly biased. The selected parameters are popular choices in deep learning frameworks and can be generally used in different applications.

3.2. Experiment and Results Analysis

3.2.1. Experiment 1

In experiment 1, we investigate the influence of the pre-training linear layer by comparing one group with the linear layer against a control group without it. We set 512 neurons for the pre-training linear layer and use a 12-layer Transformer encoder with 8 heads for the multi-headed self-attention part; each head has 32 dimensions, and the output layer of the feedforward part contains 64 neurons. The results of the 5 tasks are displayed in Figure 5.
In the figure, the accuracy of the method with the pre-training linear layer is significantly higher than that of the method without it. The pre-training linear layer plays an important role in exploring the features of the input data, which we investigate further in the following section.

3.2.2. Experiment 2

In experiment 2, we investigate the optimal hyper-parameters for the proposed method, which include the depth of the Transformer encoder block, the number of pre-training layer neurons, the number of multi-headed self-attention heads, the dimension of each head, and the output dimension of the feedforward block. We select the T2 dataset to train the proposed method.
Firstly, we investigate the influence of the depth of the Transformer encoder block. The remaining parameters, namely the number of pre-training layer neurons, the number of multi-headed self-attention heads, the dimension of each head, and the output dimension of the feedforward block, are set to (512, 8, 32, 64). The results are shown in Figure 6.
According to the results, multi-layer configurations are much more effective than a single layer. However, greater encoder depth does not necessarily lead to better results: the accuracy does not increase significantly as the depth of the multi-layer configuration grows.
Secondly, the number of pre-training layer neurons is an important factor in the effectiveness of the proposed method. We set the remaining parameters to (6, 8, 32, 64). According to Figure 7a, the accuracy of the proposed method increases with the number of pre-training linear layer neurons; in particular, the accuracy becomes significantly higher once the number of neurons exceeds the dimension of the input data. According to Figure 7b, when the neuron number is smaller than the input dimension, the training loss decreases slowly or barely changes, whereas it decreases rapidly once the neuron number exceeds the input dimension. In this study, the number of neurons is found to have the largest influence on the training of the proposed method.
Thirdly, the influence of the number of multi-headed self-attention heads is shown in Figure 8. The parameters are set to (6, 512, 32, 64). It is shown that more than 2 heads is suitable, which means multi-headed self-attention is more effective than basic self-attention.
In addition, the dimension of each head also plays an important role in the proposed method, as displayed in Figure 9. The parameters are set to (2, 64, 4, 64). In general, the testing accuracy increases as the dimension of each head increases.
Finally, the output dimension of the feedforward layer is investigated with the parameters set to (4, 128, 4, 64). The results are shown in Figure 10. It can be observed that this dimension has a great influence on the testing accuracy: the minimum accuracy is about 8 percentage points lower than the maximum.

3.2.3. Experiment 3

In this experiment, we compare the proposed method with the existing methods mentioned above in order to demonstrate its superiority and effectiveness. The results are displayed in Figure 11.
In Figure 11, it is obvious that the learning ability of the proposed method is much better than that of the others: its testing accuracy is higher than 95% for all tasks. Every method performs well on task T0, which contains only one athlete's data per level. Task T1 also contains only one athlete's data per level, yet the accuracy of all methods except the proposed one is reduced to different degrees. According to the results for T2 and T4, when data from different athletes are used, every method's accuracy drops; the proposed method stays above 95%, while all the others fall below 90%.
In addition, in Figure 12, Figure 13 and Figure 14, we use the t-SNE algorithm to reduce the dimensions of the learned features for visualization [46]. In particular, we compare the proposed method with the DNN method. The discrimination effect of the proposed method is clearly better: the clusters of the DNN method overlap far more than those of the proposed method.
From Figure 11, we can see that the DNN is the best of the compared methods, and its accuracy is sometimes close to the proposed method's, such as on task T3. However, the scatter diagrams in Figure 13 show their clustering effects: the proposed method produces clusters with clear boundaries, whereas the clusters of the DNN are mixed together. This indicates a great difference between the features learned by the two methods, with the proposed method performing significantly better.
In addition, Figure 12 and Figure 14 show that the proposed method maintains its excellent ability to explore the deep features of the data when the sampling becomes sparser and the number of athletes at every level increases, whereas the results of the DNN method become more chaotic.

4. Conclusions and Future Works

In this paper, a simplified Transformer-based deep neural network model was proposed for the assessment of athlete balance control ability, which processes and analyzes the time-series pressure measurement data from the balance meter. The original data were used directly as the model inputs for automatic assessment without any prior knowledge, which makes the method well suited for real applications in various industries. The multi-headed self-attention process is the core of the proposed method: it calculates the deep connections between every point of the input time-series data and explores complex features through these calculations. The pre-training linear layer is also necessary; it expands the dimensions of the raw input data to expose the deep information. The combination of the two parts enhances the training efficiency and quality of the model, making it well suited for many tasks with time-series data. A real freestyle skiing athlete under-feet pressure measurement dataset was used for validation in the experiments. The results showed that the proposed method has many advantages in the intelligent assessment of freestyle skiing athletes' balance control abilities and holds promise for significant success in practical implementations in real scenarios.
However, it should be pointed out that the proposed method generally requires significant computing power, especially with large amounts of freestyle skiing athlete data. In addition, to assess balance control ability and other related abilities more efficiently and accurately, high-dimensional data for freestyle skiing athletes could be used in the future. Therefore, reducing the computational burden of the proposed method will be investigated, along with optimization of the deep neural network architecture. Better pre-processing methods will also be proposed in the next study.

Author Contributions

Conceptualization, X.W.; Formal analysis, N.X.; Investigation, X.C.; Project administration, W.Z.; Resources, T.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This study was funded by the Key R&D Plan of China for the Winter Olympics (No. 2021YFF0306401), the Key Special Project of the National Key Research and Development Program “Technical Winter Olympics” (2018YFF0300502 and 2021YFF0306400), and the Key Research Program of Liaoning Province (2020JH2/10300112).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Snyder, N.; Cinelli, M. Comparing balance control between soccer players and non-athletes during a dynamic lower limb reaching task. Res. Q. Exerc. Sport 2020, 91, 166–171. [Google Scholar] [CrossRef] [PubMed]
  2. Nolan, L.; Grigorenko, A.; Thorstensson, A. Balance control: Sex and age differences in 9- to 16-year-olds. Child. Neurol. 2005, 47, 449–454. [Google Scholar]
  3. Andreeva, A.; Melnikov, A.; Skvortsov, D.; Akhmerova, K.; Vavaev, A.; Golov, A.; Draugelite, V.; Nikolaev, R.; Chechelnickaia, S.; Zhuk, D.; et al. Postural stability in athletes: The role of age, sex, performance level, and athlete shoe features. Sports 2020, 8, 89. [Google Scholar] [CrossRef]
  4. Cloak, R.; Nevill, A.; Day, S.; Wyon, M. Six-week combined vibration and wobble board training on balance and stability in footballers with functional ankle instability. Clin. J. Sport Med. 2013, 23, 384–391. [Google Scholar] [CrossRef] [Green Version]
  5. Bruijn, S.M.; van Dieën, J.H. Control of human gait stability through foot placement. J. R. Soc. Interface 2018, 15, 20170816. [Google Scholar] [CrossRef]
  6. Adewusi, S.A.; Al-Bedoor, B.O.B. Wavelet analysis of vibration signals of an overhang rotor with a propagating transverse crack. J. Sound Vib. 2001, 246, 777–793. [Google Scholar] [CrossRef]
  7. Chen, X.; Cheng, G.; Shan, X.L.; Hu, X.; Guo, Q.; Liu, H.G. Research of weak fault feature information extraction of planetary gear based on ensemble empirical mode decomposition and adaptive stochastic resonance. Measurement 2015, 73, 55–67. [Google Scholar] [CrossRef]
  8. Singh, A.N.R.; Peters, B.E.M. Artificial neural networks in the determination of the fluid intake needs of endurance athletes. AASRI Procedia 2014, 8, 9–14. [Google Scholar] [CrossRef]
  9. Li, X.; Jia, X.; Yang, Q.; Lee, J. Quality analysis in metal additive manufacturing with deep learning. J. Intell. Manuf. 2020, 31, 2003–2017. [Google Scholar] [CrossRef]
  10. Zhao, T.; Li, K.; Ma, H. Study on dynamic characteristics of a rotating cylindrical shell with uncertain parameters. Anal. Math. Phys. 2022, 12, 1–18. [Google Scholar] [CrossRef]
  11. Zhao, T.Y.; Yan, K.; Li, H.W.; Wang, X. Study on theoretical modeling and vibration performance of an assembled cylindrical shell-plate structure with whirl motion. Appl. Math. Model. 2022, 110, 618–632. [Google Scholar] [CrossRef]
  12. Lv, J.; Wang, C.; Gao, W.; Zhao, Q. An economic forecasting method based on the lightgbm-optimized lstm and time-series model. Comput. Intell. Neurosci. 2021, 2021, 8128879. [Google Scholar] [CrossRef] [PubMed]
  13. Chen, S.; Han, X.; Shen, Y.; Ye, C. Application of improved lstm algorithm in macroeconomic forecasting. Comput. Intell. Neurosci. 2021, 2021, 4471044. [Google Scholar] [CrossRef] [PubMed]
  14. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  15. Zhang, W.; Li, X.; Ma, H.; Luo, Z.; Li, X. Transfer learning using deep representation regularization in remaining useful life prediction across operating conditions. Reliab. Eng. Syst. Saf. 2021, 211, 107556. [Google Scholar] [CrossRef]
  16. Li, X.; Zhang, W.; Xu, N.X.; Ding, Q. Deep learning-based machinery fault diagnostics with domain adaptation across sensors at different places. IEEE Trans. Ind. Electron. 2019, 67, 6785–6794. [Google Scholar] [CrossRef]
  17. Wang, K.; Chen, K.; Du, H.; Liu, S.; Xu, J.; Zhao, J.; Chen, H.; Liu, Y.; Liu, Y. New image dataset and new negative sample judgment method for crop pest recognition based on deep learning models. Ecol. Inform. 2022, 69, 101620. [Google Scholar] [CrossRef]
  18. Fujiyosh, H.; Hirakawa, T.; Yamashita, T. Deep learning-based image recognition for autonomous driving. IATSS Res. 2019, 43, 244–252. [Google Scholar] [CrossRef]
  19. Qian, C.; Xu, F.; Li, H. User Authentication by Gait Data from Smartphone Sensors Using Hybrid Deep Learning Network. Mathematics 2022, 10, 2283. [Google Scholar]
  20. De-Prado-Gil, J.; Osama, Z.; Covadonga, P.; Martínez-García, R. Prediction of Splitting Tensile Strength of Self-Compacting Recycled Aggregate Concrete Using Novel Deep Learning Methods. Mathematics 2022, 10, 2245. [Google Scholar] [CrossRef]
  21. Shankar, K.; Kumar, S.; Dutta, A.K.; Alkhayyat, A.; Jawad, A.J.A.M.; Abbas, A.H.; Yousif, Y.K. An Automated Hyperparameter Tuning Recurrent Neural Network Model for Fruit Classification. Mathematics 2022, 10, 2358. [Google Scholar] [CrossRef]
  22. Anand, M.; Sahay, K.B.; Ahmed, M.A.; Sultan, D.; Chandan, R.R.; Singh, B. Deep learning and natural language processing in computation for offensive language detection in online social networks by feature selection and ensemble classification techniques. Theor. Comput. Sci. 2022. [Google Scholar] [CrossRef]
  23. Yang, G.; Li, J.; Xu, W.; Feng, H.; Zhao, F.; Long, H.; Meng, Y.; Chen, W.; Yang, H.; Yang, G. Fusion of optical and SAR images based on deep learning to reconstruct vegetation NDVI time series in cloud-prone regions. Int. J. Appl. Earth Obs. Geoinf. 2022, 112, 102818. [Google Scholar]
  24. Yang, H.; Li, X.; Zhang, W. Interpretability of deep convolutional neural networks on rolling bearing fault diagnosis. Meas. Sci. Technol. 2022, 33, 055005. [Google Scholar] [CrossRef]
  25. Wu, B.; Cai, W.; Cheng, F.; Chen, H. Simultaneous-fault diagnosis considering time series with a deep learning transformer architecture for air handling units. Energy Build. 2022, 257, 111608. [Google Scholar] [CrossRef]
  26. Saadallah, A.; Abdulaaty, O.; Morik, K.; Büscher, J.; Panusch, T.; Deuse, J. Early quality prediction using deep learning on time series sensor data. Procedia CIRP 2022, 107, 611–616. [Google Scholar] [CrossRef]
  27. Zhang, W.; Li, X.; Li, X. Deep learning-based prognostic approach for lithium-ion batteries with adaptive time-series prediction and on-line validation. Measurement 2020, 164, 108052. [Google Scholar] [CrossRef]
  28. Li, X.; Li, X.; Ma, H. Deep representation clustering-based fault diagnosis method with unsupervised data applied to rotating machinery. Mech. Syst. Signal Processing 2020, 143, 106825. [Google Scholar] [CrossRef]
  29. Li, X.; Zhang, W.; Ma, H.; Luo, Z.; Li, X. Degradation alignment in remaining useful life prediction using deep cycle-consistent learning. IEEE Trans. Neural Netw. Learn. Syst. 2021, 1–12. [Google Scholar] [CrossRef]
  30. Zhang, W.; Li, X.; Ma, H.; Luo, Z.; Li, X. Universal domain adaptation in fault diagnostics with hybrid weighted deep adversarial learning. IEEE Trans. Ind. Inform. 2021, 17, 7957–7967. [Google Scholar] [CrossRef]
  31. Zhang, W.; Li, X.; Ma, H.; Luo, Z.; Li, X. Federated learning for machinery fault diagnosis with dynamic validation and self-supervision. Knowl.-Based Syst. 2021, 213, 106679. [Google Scholar] [CrossRef]
  32. Zhang, W.; Li, X. Federated transfer learning for intelligent fault diagnostics using deep adversarial networks with data privacy. IEEE/ASME Trans. Mechatron. 2021, 27, 430–439. [Google Scholar] [CrossRef]
  33. Zhang, W.; Li, X.; Ma, H.; Luo, Z.; Li, X. Open-set domain adaptation in machinery fault diagnostics using instance-level weighted adversarial learning. IEEE Trans. Ind. Inform. 2021, 17, 7445–7455. [Google Scholar] [CrossRef]
  34. Mwata-Velu, T.Y.; Avina-Cervantes, J.G.; Ruiz-Pinales, J.; Garcia-Calva, T.A.; González-Barbosa, E.A.; Hurtado-Ramos, J.B.; González-Barbosa, J.J. Improving Motor Imagery EEG Classification Based on Channel Selection Using a Deep Learning Architecture. Mathematics 2022, 10, 2302. [Google Scholar] [CrossRef]
  35. Ghasrodashti, E.K.; Sharma, N. Hyperspectral image classification using an extended Auto-Encoder method. Signal Process. Image Commun. 2021, 92, 116111. [Google Scholar] [CrossRef]
  36. Sertolli, B.; Zhao, R.; Schuller, B.W.; Cummins, N. Representation transfer learning from deep end-to-end speech recognition networks for the classification of health states from speech. Comput. Speech Lang. 2021, 68, 101204. [Google Scholar] [CrossRef]
  37. Ribeiro, M.; Lazzaretti, A.E.; Lopes, H.S. A study of deep convolutional auto-encoders for anomaly detection in videos. Pattern Recognit. Lett. 2018, 105, 13–22. [Google Scholar] [CrossRef]
  38. Yu, F.; Liu, J.; Liu, D.; Wang, H. Supervised convolutional autoencoder-based fault-relevant feature learning for fault diagnosis in industrial processes. J. Taiwan Inst. Chem. Eng. 2022, 132, 104200. [Google Scholar] [CrossRef]
  39. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Processing Syst. 2017, 30. [Google Scholar]
  40. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929. [Google Scholar]
  41. Huang, K.; Tian, C.; Su, J.; Lin, J.C.W. Transformer-based cross reference network for video salient object detection. Pattern Recognit. Lett. 2022, 160, 122–127. [Google Scholar] [CrossRef]
  42. Rai, N.; Kumar, D.; Kaushik, N.; Raj, C.; Ali, A. Fake news classification using transformer based enhanced LSTM and BERT. Int. J. Cogn. Comput. Eng. 2022, 3, 98–105. [Google Scholar] [CrossRef]
  43. Islam, M.R.; Martin, A. Detection of COVID 19 from CT image by the novel LeNet-5 CNN architecture. In Proceedings of the 2020 23rd International Conference on Computer and Information Technology (ICCIT), IEEE, Dhaka, Bangladesh, 19–21 December 2020; pp. 1–5. [Google Scholar]
  44. Sun, J.; Cai, X.; Sun, F.; Zhang, J. Scene image classification method based on Alex-Net model. In Proceedings of the 2016 3rd International Conference on Informative and Cybernetics for Computational Social Systems (ICCSS), IEEE, Jinzhou, China, 26–29 August 2016; pp. 363–367. [Google Scholar]
  45. Rezaee, M.; Zhang, Y.; Mishra, R.; Tong, F.; Tong, H. Using a vgg-16 network for individual tree species detection with an object-based approach. In Proceedings of the 2018 10th IAPR Workshop on Pattern Recognition in Remote Sensing (PRRS), IEEE, Beijing, China, 19–20 August 2018; pp. 1–7. [Google Scholar]
  46. Gisbrecht, A.; Schulz, A.; Hammer, B. Parametric nonlinear dimensionality reduction using kernel t-SNE. Neurocomputing 2015, 147, 71–82. [Google Scholar] [CrossRef] [Green Version]
Figure 1. The scenarios of the athlete movement pressure data collection experiments.
Figure 2. The general flow chart for our proposed method.
Figure 3. The architecture of the basic Transformer.
Figure 4. The detailed architecture of our proposed Transformer.
Figure 5. The experimental results of the proposed method with the pre-training linear layer and the method without the pre-training linear layer.
Figure 6. The influence of the depth of the Transformer encoder.
Figure 7. The influence of the dimensions of the pre-training linear layer. (a) influence on testing accuracy; (b) influence on training loss.
Figure 8. The influence of the number of multi-headed self-attention heads.
Figure 9. The influence of the dimension of the multi-headed self-attention heads.
Figure 10. The influence of the dimensions of the feedforward layer.
Figure 11. The results of the different compared methods for different tasks.
Figure 12. The visualization results of the learned features using different methods for task T2. The different colors represent different athletes' balance control ability levels: (a) the result of our proposed method; (b) the result of the DNN method.
Figure 13. The visualization results of the learned features using different methods for task T3. The different colors represent different athletes' balance control ability levels: (a) the result of our proposed method; (b) the result of the DNN method.
Figure 14. The visualization results of the learned features using different methods for task T4. The different colors represent different athletes' balance control ability levels: (a) the result of our proposed method; (b) the result of the DNN method.
Table 1. Information for the athlete movement pressure measurement dataset used in this paper.

| Athlete Level | Number of Athletes | Code Names | Sampling Frequency |
|---|---|---|---|
| A | 3 | A#1, A#2, A#3 | 100 Hz |
| B | 3 | B#1, B#2, B#3 | 100 Hz |
| C | 3 | C#1, C#2, C#3 | 100 Hz |
| D | 3 | D#1, D#2, D#3 | 100 Hz |
Table 2. Information for the different athlete balance control ability evaluation tasks used in this study.

| Task Name | Concerned Athletes | Sample Number of Every Athlete | Ratio of Training to Testing |
|---|---|---|---|
| T0 | A#1, B#1, C#1, D#1 | 200 | 4:1 |
| T1 | A#2, B#2, C#2, D#2 | 200 | 4:1 |
| T2 | A#1, B#1, C#1, D#1; A#2, B#2, C#2, D#2 | 200 | 4:1 |
| T3 | A#1, B#1, C#1, D#1; A#2, B#2, C#2, D#2 | 400 | 4:1 |
| T4 | A#1, B#1, C#1, D#1; A#2, B#2, C#2, D#2; A#3, B#3, C#3, D#3 | 200 | 4:1 |
Table 3. Parameter information.

| Parameter | Value | Parameter | Value |
|---|---|---|---|
| Batch size | 32 | Learning rate | 1 × 10⁻⁴ |
| Epoch number | 100 | Sample dimension | 200 × 2 |