We used Scikit-learn [84] and Keras [85] running in Google Colab notebooks to build various models capable of automatically inferring the interpersonal trust of a user, classified into two classes: “Low Interpersonal Trust (0)” or “High Interpersonal Trust (1)”. We split all datasets into a 70% training set and a 30% test set. We analyze the results with and without resampling, as well as with and without considering the neighboring edges. Lastly, we consider both shallow and deep learning methods, namely: decision trees, random forest, logistic regression, a standard feature-concatenation-based deep neural network (FC-DNN), and our proposed NADAL architecture.
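The evaluation pipeline described above can be sketched as follows; the feature matrix, label balance, and classifier settings are placeholders, not the actual study data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 23))            # placeholder feature matrix
y = (rng.random(1000) < 0.15).astype(int)  # imbalanced binary trust labels

# 70/30 split, stratified so both splits keep the class ratio
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42)

clf = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
```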
5.1.1. TrustHWF Dataset
We consider multiple variants of the dataset to quantify the effect of various factors on the classification performance. Table 3 compares all (subsets of) the TrustHWF dataset considered in this work and the number of features/rows in each. (Note that while the training data (70%) is balanced between the two classes by creating artificial samples (SMOTE+Tomek), the testing (30%) is done on the imbalanced dataset, consistent with the real-world scenario where such an algorithm is likely to be applied.)
Sampling Technique: As-Is vs. SMOTE+Tomek Resampling
As mentioned in Section 4.1, we try to counter the problem of class imbalance by creating more balanced training datasets using SMOTE+Tomek resampling. To quantify the performance difference based on the resampling, we run two versions of each experiment: one with and one without the resampling.
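SMOTE’s core oversampling step interpolates between a minority-class sample and one of its nearest minority-class neighbors. The sketch below is a simplified illustration of that interpolation step only; the actual experiments use a library implementation of the combined SMOTE+Tomek procedure, where the Tomek step additionally removes majority samples that form nearest-neighbor pairs with minority samples to clean the class boundary:

```python
import numpy as np

def smote_sketch(X_min, n_new, k=5, seed=0):
    """Simplified SMOTE: create n_new synthetic minority samples by
    interpolating a randomly picked sample toward one of its k nearest
    minority-class neighbors."""
    rng = np.random.default_rng(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        # distances from sample i to every other minority sample
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbors = np.argsort(d)[1:k + 1]   # skip the sample itself
        j = rng.choice(neighbors)
        gap = rng.random()                   # random point on the segment
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synthetic)

X_min = np.random.default_rng(1).normal(size=(30, 23))  # toy minority class
X_new = smote_sketch(X_min, n_new=70)                   # 70 synthetic rows
```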
Neighbor Awareness: Individual Path (Non-Neighbor-Aware) vs. Neighbor-Aware
We compare models that utilize only the individual node’s features and the features of its edge to the target node against models that additionally utilize the edge data of the two closest neighbors. While the individual-path approaches had access to 23 features (20 node features + 3 edge features), the neighbor-aware approaches had access to 29 (20 node features + 3 × 3 edge features). While the difference in the number of features had little impact on the architectures of the shallow learning approaches and FC-DNN, the NADAL architecture was adapted for the individual-path setting to use only the layers that lie in the path of the abovementioned 23 features.
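The two feature layouts can be illustrated as simple concatenations; the feature values below are random placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
node = rng.normal(size=20)        # 20 node features for the source user
edge_direct = rng.normal(size=3)  # 3 features of the edge to the target node
edge_nbr1 = rng.normal(size=3)    # edge features of the two closest neighbors
edge_nbr2 = rng.normal(size=3)

# Individual-path (non-neighbor-aware): 20 + 3 = 23 features
x_individual = np.concatenate([node, edge_direct])

# Neighbor-aware: 20 + 3 * 3 = 29 features
x_neighbor_aware = np.concatenate([node, edge_direct, edge_nbr1, edge_nbr2])
```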
Machine Learning Approach: Shallow Learning (Random Forest) vs. Deep Learning (FC-DNN and NADAL)
The first step in the classification process is to compare multiple shallow learning algorithms for predicting interpersonal trust, to select the best one to compare against the deep learning approaches, as shown in Table 4. Note that AUCROC stands for area under the receiver operating characteristic curve and Acc stands for accuracy. While a higher score is generally better for each of these metrics, multiple researchers have argued against using classification accuracy to interpret results on highly imbalanced datasets [86]. For instance, a simple baseline (Majority Zero-R) algorithm that classifies all ties as “not trusted” achieves an accuracy of 84.79% on the TrustHWF dataset and 98.75% on the TrustF dataset. However, such an algorithm would be useless in practice. Hence, we use AUCROC, which balances performance on the majority and minority classes, as the primary metric to compare algorithms.
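The inadequacy of accuracy on imbalanced data is easy to reproduce: a majority-class (Zero-R) predictor matches the majority fraction in accuracy yet carries no ranking information, so its AUCROC is 0.5. A small sketch with synthetic labels roughly matching the TrustHWF class ratio:

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

rng = np.random.default_rng(0)
# Imbalanced labels with roughly a 15% positive class, as in TrustHWF
y_true = (rng.random(10_000) < 0.1521).astype(int)

y_zero_r = np.zeros_like(y_true)        # predict "not trusted" for everyone
acc = accuracy_score(y_true, y_zero_r)  # ~0.85: looks deceptively good
auc = roc_auc_score(y_true, np.zeros(len(y_true)))  # constant scores -> 0.5
```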
As can be seen, logistic regression performs the worst in all four combinations of paths and sampling approaches. Random forest performs the best, achieving the highest AUCROC of the three algorithms in all four combinations of paths and sampling approaches.
Next, we consider three types of machine learning approaches. The first is random forest, as a representative shallow algorithm useful for comparison. The second is the baseline deep learning approach, which builds upon feature concatenation in the first layer (FC-DNN). The last is the NADAL approach, which has been custom-designed to capture the interactions between neighboring nodes’ edges.
After running each experiment 10 times, the average results summarized in Table 5 show the following trends:
For FC-DNN, we passed all the features through a multilayer perceptron (23/40/40/20/40/40/1), with all layers activated by ReLU (rectified linear unit) except the output layer, which was activated by sigmoid, using a batch size of 16 and 50 epochs, as presented in Figure 5. For NADAL, the features were passed through different layers, as shown in Figure 4. All layers in NADAL are activated by ReLU except the output layer, which is activated by sigmoid, again with a batch size of 16 and 50 epochs.
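A minimal Keras sketch of the FC-DNN baseline described above (23/40/40/20/40/40/1, ReLU hidden layers, sigmoid output). The optimizer, loss, and metric are assumptions, as they are not specified in the text:

```python
from tensorflow import keras
from tensorflow.keras import layers

fc_dnn = keras.Sequential([
    keras.Input(shape=(23,)),               # 23 concatenated input features
    layers.Dense(40, activation="relu"),
    layers.Dense(40, activation="relu"),
    layers.Dense(20, activation="relu"),
    layers.Dense(40, activation="relu"),
    layers.Dense(40, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # P(high interpersonal trust)
])
# Assumed training configuration; the text specifies only batch size and epochs
fc_dnn.compile(optimizer="adam", loss="binary_crossentropy",
               metrics=[keras.metrics.AUC()])
# fc_dnn.fit(X_train, y_train, batch_size=16, epochs=50)  # as in the text
```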
For the same algorithmic approach and level of neighbor awareness, the models created with SMOTE+Tomek resampling scored higher in AUCROC. This trend is consistent with expectations and with recent research on dealing with imbalanced datasets [86]. When considering the SMOTE+Tomek results (lower half of the table), we notice that the neighbor-aware approaches consistently outperformed the non-neighbor-aware approaches. Within the neighbor-aware approaches, the proposed architecture (NADAL) outperformed both the shallow learning approach and the baseline deep learning approach. The best performing configuration overall is the one with SMOTE+Tomek sampling, neighbor-aware features, and the NADAL deep learning architecture, which was found to be statistically significantly better, using two-tailed unpaired t-tests (at the α = 0.05 level), than all combinations of the other algorithmic approaches (random forest and FC-DNN) with either data consideration (i.e., individual path vs. neighbor-aware). This outcome shows the importance of using neighboring edge properties and a custom-designed deep learning architecture (NADAL) for inferring interpersonal trust between two people, thus supporting the two key contributions of this work.
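The significance testing reads as follows in code; the per-run AUCROC scores below are hypothetical stand-ins for the real 10-run results:

```python
from scipy import stats

# Hypothetical AUCROC scores from 10 runs of each configuration
nadal_scores = [0.710, 0.698, 0.705, 0.712, 0.701,
                0.708, 0.699, 0.704, 0.711, 0.702]
fcdnn_scores = [0.682, 0.675, 0.690, 0.678, 0.684,
                0.671, 0.688, 0.679, 0.683, 0.676]

# Two-tailed unpaired t-test, as used in the text
t_stat, p_value = stats.ttest_ind(nadal_scores, fcdnn_scores)
significant = p_value < 0.05  # alpha = 0.05
```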
However, we note that this model still has a relatively modest performance (70.38% AUCROC). We posit that this may be because machine learning approaches, especially deep learning approaches, tend to need large datasets before they start performing well. Acknowledging this as a limitation, we move to the larger TrustF dataset to examine various models’ performance over a larger dataset.
5.1.2. TrustF Dataset
Table 6 shows all (subsets of) the TrustF dataset and the number of features/rows in each, following the same approach described in Section 5.1.1.
Table 7 shows that the decision tree performs the worst in all four combinations of paths and sampling approaches. Between random forest and logistic regression, each performs better than the other in two of the four scenarios. However, random forest achieves a consistently high AUCROC across all four combinations of paths and sampling approaches.
Next, we consider three types of machine learning approaches again. First is random forest; next is the baseline deep learning approach (FC-DNN); and lastly the NADAL approach.
Table 8 shows the average results of running experiments with each of the abovementioned settings 10 times. It compares the representative shallow method (random forest) with the proposed deep learning approach (NADAL) as well as a baseline deep learning approach (FC-DNN).
The results summarized in Table 8 show the following trends. For the same algorithmic approach and level of neighbor awareness, the models created with SMOTE+Tomek resampling scored higher in AUCROC; the only exception was random forest. When considering the SMOTE+Tomek results (lower half of the table), we notice that the deep learning approaches (both FC-DNN and NADAL) outperform the shallow learning approach (random forest). This finding is again along expected lines, as deep learning approaches have more opportunity to capture linear and non-linear associations between different features and create comprehensive models.
Further, the neighbor-aware approach yields better performance in both shallow and deep machine learning approaches. All comparisons between the same algorithmic approach with different data considerations (i.e., individual path vs. neighbor-aware) showed that the neighbor-aware approaches obtained higher scores. In the case of NADAL and random forest, these gains were found to be statistically significant using two-tailed unpaired t-tests (at the α = 0.05 level). This outcome validates the first major contribution of this work, i.e., proposing the use of neighboring edge properties for inferring interpersonal trust between two people, whether in shallow or deep learning.
Lastly, the proposed deep learning architecture (NADAL) scored statistically significantly higher than both the baseline deep learning approach (FC-DNN) and the random forest shallow learning approach when using the neighboring edge properties, which validates the second contribution. This finding suggests that early fusion of features might not allow the interrelationships within a channel to be learned adequately without the influence of other channels. The stepwise unification of different channels across the architecture seems to have provided better opportunities for the social channels to learn both intra-channel and inter-channel relationships.
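The contrast between early fusion and stepwise unification can be illustrated with the Keras functional API. The branch widths and merge order below are purely illustrative, not the actual NADAL layers from Figure 4:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Per-channel inputs: node features and three edge-feature groups
node_in = keras.Input(shape=(20,), name="node")
edge_in = keras.Input(shape=(3,), name="edge_direct")
nbr1_in = keras.Input(shape=(3,), name="edge_neighbor1")
nbr2_in = keras.Input(shape=(3,), name="edge_neighbor2")

# Each channel first learns intra-channel relationships separately
node_h = layers.Dense(16, activation="relu")(node_in)
edge_h = layers.Dense(8, activation="relu")(edge_in)
nbr1_h = layers.Dense(8, activation="relu")(nbr1_in)
nbr2_h = layers.Dense(8, activation="relu")(nbr2_in)

# Stepwise unification: merge the edge channels first, then the node channel
edges = layers.Dense(16, activation="relu")(
    layers.concatenate([edge_h, nbr1_h, nbr2_h]))
merged = layers.Dense(16, activation="relu")(
    layers.concatenate([node_h, edges]))
out = layers.Dense(1, activation="sigmoid")(merged)

model = keras.Model([node_in, edge_in, nbr1_in, nbr2_in], out)
```

An early-fusion baseline such as FC-DNN would instead concatenate all 29 raw features before the first dense layer, so no channel gets a chance to learn its internal structure in isolation.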
The highest overall AUCROC score of 93.23% was obtained using SMOTE+Tomek sampling, neighbor-aware features, and the NADAL architecture. A score of 93.23% indicates that the model could learn both the majority and minority classes reasonably well and could be useful in practice where interpersonal trust needs to be inferred from phone-based metadata. Lastly, the noticeable improvement in performance for the models with access to more training data (TrustF compared to TrustHWF) suggests that the proposed approach might work well in scenarios where the dataset has a large number of rows, a scenario that we expect to become increasingly common in the future.