3.1. Results of Using CNN-LSTM Networks to Simulate Triplets of RBC Types
We trained a total of 84 different neural network models, one for each combination of architecture type (3 types), dataset (4 subsets), and window size w (7 values). We used the mean absolute percentage error (MAPE) as the loss function, defined in Equation (3). The MAPE between the actual values A_i and the predicted values P_i is calculated as:

\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{A_i - P_i}{A_i} \right| \qquad (3)

where n is the number of samples.
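For reference, a minimal NumPy sketch of this metric (equivalent to the built-in mean_absolute_percentage_error loss in Keras); the function name and the example values are ours, chosen for illustration only:

import numpy as np

def mape(actual, predicted):
    # Mean absolute percentage error, as in Equation (3):
    # the mean of |A_i - P_i| / |A_i|, expressed in percent.
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.mean(np.abs((actual - predicted) / actual))

# Illustrative only: three cells with true elastic coefficients
# 0.03, 0.1 and 0.3 and hypothetical predictions for them.
print(mape([0.03, 0.1, 0.3], [0.009, 0.08, 0.31]))  # ~31.1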
Figure 5 shows the MAPE for each possible combination of the above options. The plot shows that as the size of the training time window increases, the MAPE also increases, a trend likely caused by the noise and bias introduced during augmentation.
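As an illustration of the windowing step, the following sketch cuts a cell trajectory into overlapping windows of length w and expands the training set with Gaussian noise. The noise level and the number of augmented copies are illustrative assumptions, not the values used in our experiments:

import numpy as np

def make_windows(trajectory, w):
    # Cut a trajectory of shape (T, n_features) into overlapping
    # windows of length w, shape (T - w + 1, w, n_features).
    T = len(trajectory)
    return np.stack([trajectory[i:i + w] for i in range(T - w + 1)])

def augment(windows, sigma=0.01, copies=4, seed=0):
    # Expand the training set by adding zero-mean Gaussian noise.
    rng = np.random.default_rng(seed)
    noisy = [windows + rng.normal(0.0, sigma, windows.shape) for _ in range(copies)]
    return np.concatenate([windows, *noisy])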
For the data that simulate the information obtainable from video recordings, the MAPE falls in a higher range, while combining these data can reduce the error by half. The data using information along all three x-, y-, and z-axes reach the lowest error level, which confirms our hypothesis that information about the elasticity of the observed cell can be obtained from the data used.
Table 4 shows the MAPE values by architecture and the subset of data used, for the value of the hyperparameter w at which the resulting model had the smallest error. The results show that the most accurate neural network model was CNN-LSTM_Conv2D for three of the four data subsets, with time window sizes w of 3, 3, and 5, respectively, while for the remaining 2D projection the best model was the LSTM. CNN-LSTM_Conv2D trained on the full xyz data had the lowest MAPE value of all the trained models. Among the single 2D projections, the LSTM was the best model for one of them, and for the concatenation of the two 2D projections (xy_xz), the best model was again CNN-LSTM_Conv2D.
The MAPE distribution for all the elastic blood cell types we simulated can be seen in Figure 6 for CNN-LSTM_Conv2D on the xyz data and in Figure 7 for CNN-LSTM_Conv2D on the xy_xz data. The green triangles represent the average MAPE for each value of the elastic coefficient, and the yellow lines indicate the medians of the values. The plots on both sides show the same phenomenon; plots (b) and (d) additionally show outliers.
The largest average MAPEs are seen at elasticity values of 0.03 and 0.1 for the CNN-LSTM_Conv2D models, shown in Figure 8 and Figure 9, respectively. For both models, we can notice that for a part of the corpuscles with an elasticity of 0.03, the predicted value was around 0.009, and similarly for the elasticity value of 0.1, where the greater part of the predictions falls into an interval below the true value.
The results of the experiment described in Section 3.1 suggest that it might be possible to determine the elasticity of a red blood cell from the data we can obtain from video recordings of blood flow in microfluidic devices. However, if we use only one view of the blood flow, i.e., a single 2D projection, the resulting prediction contains a relatively large error. If video from two sides (the xy and xz projections) is used, the prediction achieves an average error of less than 8% of the true value. This result would be difficult to obtain from data measured in a real blood flow experiment. In the simulation experiment, we have the values of the x, y, and z coordinates of all points of the RBC surface discretisation. Thanks to this, we can estimate how large the deviation of the real blood cell is from its appearance on the video recording. By fitting an estimate of the elasticity parameter, we can estimate the error made compared to a situation in which we would have complete information. Our results show that, when solving this parameter estimation problem, we obtain better results with video recordings from multiple axes of symmetry of the monitored channel.
3.2. Use of Regression Neural Network for Red Blood Cell Elasticity Classification
As we mentioned in the introduction, the problem of determining the elasticity of RBCs as a classification problem has been studied before, and we assume that it is a more frequently studied and easier-to-solve problem. In order to be able to compare the results, we transformed the output of the regression method into a simple classifier. For this, we used the same simulation data as in Section 3.1, Sim3a-c.
We decided to investigate the ability of the neural network to predict cell elasticity approximately, and thus to assign blood cells to categories. An intuitive way was to train a classification neural network. However, alongside it, we developed another method of classifying the corpuscles that used the output of the regression neural network directly. Converting the output of the regression model into a classification consisted of de-normalizing the predicted elastic coefficient and then assigning it to the appropriate category. The categories were created by identifying eight boundaries among the nine elasticity values used, with each boundary located midway between two adjacent coefficients (ordered ascending). This approach is referred to as RegToClass.
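A minimal sketch of the RegToClass conversion, assuming the nine training coefficients are available in ascending order (the function and variable names are ours):

import numpy as np

def reg_to_class(predictions, ks_values):
    # ks_values: the nine elastic coefficients used in training, ascending.
    # Boundaries are placed midway between adjacent coefficients,
    # giving eight thresholds that separate the nine categories.
    ks = np.sort(np.asarray(ks_values, dtype=float))
    boundaries = (ks[:-1] + ks[1:]) / 2.0        # eight midpoints
    return np.digitize(predictions, boundaries)  # class indices 0..8

Here np.digitize returns, for each prediction, the index of the midpoint interval it falls into, i.e., the category of the nearest coefficient in the midpoint sense.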
The neural network classification model had the same architecture as the regression neural networks and was created by altering the last layer. In this output layer, the single output neuron was replaced with nine, one per category, and the activation function was changed from linear to softmax. At the same time, we changed the loss function to categorical cross-entropy and transformed the target variables to one-hot encoding.
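A sketch of this modification in Keras, assuming the regression network is a functional model whose penultimate layer feeds a single linear output neuron; the helper name and the choice of optimizer are our assumptions:

from tensorflow import keras

NUM_CLASSES = 9  # number of elasticity categories

def to_classifier(regression_model):
    # Reuse everything up to the penultimate layer and replace the
    # single linear output neuron with a 9-way softmax layer.
    features = regression_model.layers[-2].output
    outputs = keras.layers.Dense(NUM_CLASSES, activation="softmax")(features)
    clf = keras.Model(regression_model.input, outputs)
    clf.compile(optimizer="adam", loss="categorical_crossentropy",
                metrics=["accuracy"])
    return clf

# Targets become one-hot vectors:
# y_onehot = keras.utils.to_categorical(class_idx, NUM_CLASSES)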
The classification success rates for the best models from the comparison in Table 4 are given in Table 5. In all cases, we see improved classification performance for the versions of the models directly optimized for the classification task. The largest difference in accuracy occurs for the model that learned on data representing a projection onto 2D (Figure 10), while the improvement is minimal for the model in Figure 11. Again, a bias is evident for the blood cells with an elastic coefficient value of 0.03, where the predicted class was 0.009, to a greater extent for one of the two neural network models. We also note the lower classification success for coefficient values of 0.225 and 0.3, for which the corpuscle is very stiff and the differences in elasticity are small, making a correct prediction difficult. (It can be said that such stiff RBCs are rarely seen in reality.)
3.3. Validating Models on the Different Simulations
The previous experiments worked with a triplet of simulations, where each was divided into training, validation, and test parts. The next experiment focused on detecting model errors and possible overfitting. The dataset was constructed in a different way in this case:
Training: one simulation with nine cell types (distinguished by the value of the elastic coefficient);
Validation: one simulation with nine cell types (the same as for training, but with a different initial seeding) and the same simulation parameters.
In the previous experiment, the best-performing subset of data was shown to be the data along the xy and xz projections simultaneously (the subset labeled xy_xz), not counting the xyz subset, which we cannot obtain in practice. For this combination of model and data subset, we trained models for different window lengths w. As in the previous experiment, we pre-processed the data and then augmented them to an amount sufficient for training a neural network. The resulting MAPE values are shown in Figure 12.
The validation showed significantly degraded model performance despite our efforts to limit overfitting by adding noise-augmented data and using dropout. This deterioration was likely due to the components of the data containing information about the y and z axes. The neural network overfitted, which in this case means that it over-focused on this subsection of the data, predicting the value of the elastic coefficient from the position of the cell within the channel. In the previous experiment, each red blood cell trajectory was divided into parts of length equal to the parameter w, which were then randomly split between training and validation, so this positional information was present in both datasets; this explains the better performance of the model there. It also implies that, despite splitting the dataset into four parts to prevent data leakage, such leakage did occur. The best model in this validation was again CNN-LSTM_Conv2D, but its MAPE remained remarkably high.
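The difference between the two splitting strategies can be made explicit in code. The following is a sketch under our own naming, where windows_by_sim is assumed to map a simulation identifier to its array of windows:

import numpy as np

def random_window_split(windows, val_frac=0.2, seed=0):
    # Previous experiment: windows from the SAME simulations are
    # shuffled, so windows of one trajectory can land in both sets,
    # leaking positional cues between training and validation.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(windows))
    n_val = int(val_frac * len(windows))
    return windows[idx[n_val:]], windows[idx[:n_val]]

def per_simulation_split(windows_by_sim, train_sim, val_sim):
    # This experiment: whole simulations are kept apart, so cues
    # specific to one run (e.g., cell position in the channel)
    # cannot leak into the validation set.
    return windows_by_sim[train_sim], windows_by_sim[val_sim]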
An important observation, despite the significantly degraded results, was the consistently better performance of the CNN-LSTM_Conv2D model in all experiments. Unlike the original CNN-LSTM_Conv1D architecture from the paper [36], this model contained a convolution filter of size (4, 3), as opposed to a filter of size (number of features, 3) that spans all features at once. This finding is further evidence that CNNs are among the most powerful architectures in use today.
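The architectural difference can be sketched as follows. The feature count, window length, and filter count are illustrative, and we orient the input as (features, time) so that the (4, 3) filter spans four features and three time steps; this orientation is our assumption:

from tensorflow import keras

n_features, w = 6, 5  # illustrative input dimensions

# CNN-LSTM_Conv1D (as in [36]): the filter covers ALL features at once
# and slides along the time axis only, i.e., size (n_features, 3).
conv1d_front = keras.Sequential([
    keras.layers.Input(shape=(w, n_features)),
    keras.layers.Conv1D(filters=32, kernel_size=3, activation="relu"),
])

# CNN-LSTM_Conv2D: a (4, 3) filter slides over BOTH the feature and
# the time axes, so it can also learn local patterns across subsets
# of neighbouring features.
conv2d_front = keras.Sequential([
    keras.layers.Input(shape=(n_features, w, 1)),
    keras.layers.Conv2D(filters=32, kernel_size=(4, 3), activation="relu"),
])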
Comparison with Multiple Linear Regression
To better understand the quality of the results obtained using the neural networks described above, we solved a similar problem using classical regression tools. We used a modified MLR (Multivariate Linear Regression) on the same dataset. Regression coefficients were calculated solely from the training data and then used to predict values for the validation data. The obtained results were compared with those of the CNN-LSTM_Conv2D architecture at the window size w for which we obtained the smallest MAPE value.
The input data matrix was created by arranging the data for each simulation step, resulting in a matrix with dimensions 54 × 15,997. Its data were centered and scaled. Since the number of regression parameters is larger than the number of observations, the standard MLR method cannot be applied, because the covariance matrix of the independent variables becomes singular. For this reason, it is necessary to first reduce the observation space, for which we used the PCA (Principal Component Analysis) method. After projecting the space using PCA, the first seven principal components preserve most of the information in the original data. This effective reduction of the 15,997-dimensional space is possible due to the significant “similarity” of the processed data. This combination of methods is called the PCR (Principal Component Regression) method, and we subsequently used it to predict values in this reduced space.
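The PCR pipeline described above can be expressed compactly with scikit-learn; the variable names are ours, with X and y standing for the 54 × 15,997 data matrix and the elasticity targets:

from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Centering/scaling, projection onto the first seven principal
# components, then ordinary least squares in the reduced space = PCR.
pcr = make_pipeline(StandardScaler(),
                    PCA(n_components=7),
                    LinearRegression())
# pcr.fit(X_train, y_train); y_pred = pcr.predict(X_val)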
The obtained results, along with the actual values for each cell, are plotted in Figure 13, part (a). Recall that the multivariate linear regression used the simulation data to estimate the elasticity of each cell just once; therefore, a total of 54 values were predicted. With CNN-LSTM_Conv2D, values were predicted for sequences of 50 consecutive positions of individual RBCs (given the chosen window size), so there were 486 estimated values in total. In Figure 13, we see that the MLR method was able to capture the increasing trend of the values, but its MAPE was considerably higher. Compared to the MAPE of 48.86% achieved by the CNN-LSTM_Conv2D model, this is an expected deterioration. The PCR method can only capture linear dependencies in the data structure. In contrast, neural networks are generally able to find non-linear relationships in the data as well.