Article

Research on the Classification Methods of Social Bots

China Academy of Electronics and Information Technology, Beijing 100041, China
* Author to whom correspondence should be addressed.
Submission received: 29 April 2023 / Revised: 24 June 2023 / Accepted: 29 June 2023 / Published: 10 July 2023
(This article belongs to the Special Issue Intelligent Data Analysis in Cyberspace)

Abstract:
To support the healthy development of social networks, social harmony and stability, and effective oversight by regulatory authorities, this paper proposes a method for classifying social bots that builds on earlier bot identification work. First, topic-related introductions are used to expand the topic, and the SBERT (Sentence-BERT) model then judges the relevance between each micro-blog text and the expanded topics to identify polluters. Next, an opinion sentence recognition method that combines the opinion-sentence generation rules of social bots with the deep learning model TextCNN is proposed to further distinguish commenters from spreaders. Finally, to improve classification, transfer learning is applied: the model is trained with the help of a large number of micro-blogs from ordinary Weibo accounts. Comparative experiments show that the topic expansion method effectively improves the SBERT model's judgments of the relevance between micro-blog texts and topics. When the expansion parameter k is set to 20, the expanded topic sequence best matches the core content of most Weibo text sequences and the resulting model performs best. By analyzing the rules by which social bots generate opinion-based micro-blog texts and focusing on opinion-bearing keywords, the difficulty of recognizing the low-quality opinion sentences produced by social bots is largely resolved, improving opinion sentence recognition by more than 10%. The introduction of transfer learning effectively alleviates the shortage of social bot data and greatly improves bot classification, with an increase of more than 10%.

1. Introduction

With the rise of social networks such as Twitter, Weibo, WeChat and live streaming, people can communicate and share widely on different topics at any time. Meanwhile, the rapid development of artificial intelligence technology has led to the emergence of social bots. Social bots come in many varieties: some generate benign, interactive and informative blog posts and keep social networks active [1], while others publish and spread false information and lure users to malicious websites, making it difficult to tell truth from falsehood on social networks. Beyond degrading the user experience, malicious social bots may even create panic, publish biased political views or damage company reputations.
Because social bots are so varied, it is difficult to distinguish the genuine from the malicious, and it is therefore necessary to study both their detection and their classification. On the one hand, classification helps regulators trace bots to their sources; on the other hand, regulators can apply different control measures to different types of bots. Social bots with positive influence can be allowed to conduct business and services normally within certain limits, while social bots with negative effects must be controlled to limit their breeding and development, thereby creating a healthy and safe network environment for real users and promoting social harmony and stability.
At present, there are many studies on the identification of social bots in social networks, but most of the studies only focus on distinguishing legitimate users from social bots, and few have conducted in-depth research and analysis on the classification of social bots. Existing research mainly selects the account features of social bots and then uses classifiers to classify them.
The literature [2] divides abnormal users into product-marketing advertisement publishers, content polluters whose posts do not match the hashtag information, and publishers of abusive speech such as attacks and insults. It extracts user content, behavior, attribute and relationship features from social network datasets and selects the XGBoost (eXtreme Gradient Boosting) algorithm to build a classification model, which effectively exploits multi-dimensional features and achieves good results even when the sample set is severely unbalanced. The literature [3] divides social accounts into actively harassing spam users, over-following spam users, repetitively posting spam users, marketing-advertising spam users and normal users. It first uses a one-vs-rest SVM (support vector machine) to construct a multi-class classifier and then applies fuzzy clustering to resolve the missing-classification problem of one-vs-rest SVM.
In addition to the above classification studies, the literature [1] proposes a classification method that considers both benign and malicious bots, dividing social bots into three categories: broadcast bots, consumer bots and spam bots. Broadcast bots are managed by a specific organization and are mainly used for information dissemination. Consumer bots aggregate content from different sources and provide update services, while spam bots deliver malicious content and mainly cover malicious bots. The paper first plots the cumulative distribution functions (CDFs) of several key attributes to understand the activity patterns of bot and human accounts, then proposes corresponding classification features, and finally applies naive Bayes, random forest, support vector machine and logistic regression models for classification. The literature [4] divides social accounts into normal users, authenticated users, promoters and trend hijackers. Promoters include accounts that publish information containing malicious URLs. Trend hijackers include accounts that post tweets unrelated to the topic event in order to promote a particular product or service, as well as accounts that tweet about the topic event for opinion manipulation and political propaganda. That paper links similar accounts through their shared applications and builds a Markov Random Field model on the resulting similarity graph for classification.
The studies above extract various account features, perform feature selection or plot CDF curves to test whether the selected features are effective, and finally use machine learning methods to perform multi-class classification. However, they do not state explicit criteria for distinguishing the categories, nor do they propose targeted features for different categories of social bots, which limits their interpretability.
The literature [5] proposes more targeted detection features by analyzing the characteristics of the tweets posted by each type of social bot. The paper divides social bots into bots, cyborgs and human spammers. Bots' tweets use a very limited vocabulary and follow a very structured pattern. Cyborgs tend to copy content from other sources and have a much larger vocabulary than regular bots. Spammers abuse the algorithm to post a series of almost indistinguishable tweets in order to trick Twitter's spam detection protocols. Compared with methods that select common account features, this approach analyzes the differences between bot types in depth, summarizes their rules, extracts features accordingly and further advances research on social bot classification.
In summary, existing research extracts classification features from the behavior and blog post content of social bots, but bot accounts may adjust their behavior and speech as detection mechanisms and generation technologies change. These detection schemes therefore cannot reliably identify bot accounts that depart from the modeled paradigm, and cannot evolve over time.
Beyond the classification and detection of social bot accounts, deep learning networks have also been applied to blog-post feature processing and sentiment classification for ordinary social users. For example, motivated by the growing fear of the COVID-19 pandemic expressed in online tweets and reviews, the literature [6] designed a new multi-channel convolutional neural network (MCNN) for analyzing and classifying the sentiment of these tweets. The network integrates multiple convolutional channels to capture multi-scale information for better classification. Using the proposed feature extraction method and the MCNN model, tweets are divided into three sentiment categories (positive, neutral and negative). Tej et al. [7] surveyed approaches ranging from traditional ML-based algorithms to the latest deep learning (DL)-based algorithms for fast automatic analysis, focusing on the deep generative VAE (variational autoencoder) model, which can be used to analyze text semantics and identify sentiment. Li et al. [8] proposed a text sentiment classification method based on an improved BERT-BiGRU model. The improved model feeds the output of BERT into a BiGRU for feature extraction, combining the advantages of both models; it is trained with a cross-entropy loss, and its parameters are fully optimized for better sentiment classification. The experimental results of that work indicate that the method achieves good sentiment classification and is more accurate and efficient than traditional methods. Pan et al. [9] proposed a sentiment classification method for short comments on e-commerce platforms that combines adversarial networks and BERT.
First, the BERT-based text representation is input into a convolutional neural network, and a dual generative adversarial network (DGAN) model is then used for feature extraction and sentiment analysis. Xie et al. [10] proposed a text sentiment classification method based on a multi-feature LSTM (Self-Attention) model. The multi-feature LSTM used in the model's feature extraction module combines the global and local attention weights of the self-attention mechanism, which better captures the semantic information of the text and the correlations between sentences. Experiments show that the model achieves good sentiment classification performance on multiple datasets and is more accurate and efficient than traditional methods.
According to our research and analysis, the purpose of each social bot is fixed and unchanging, and all of its behavior serves that ultimate goal. It would therefore be more effective to study the purpose of social bots directly than to extract formal features from their behavior. To achieve their goals, different types of social bots take different actions on the same event, express different remarks and contribute differently to the development of the event.
Therefore, from the perspective of the purpose of social bots, according to the relationship between Weibo texts and Weibo topics and the characteristics of text generated by social bots, this paper classifies social bots under specific topics into three categories: polluters, commenters and spreaders. In particular, polluters are accounts that publish micro-blog texts that are not related to the topic; commenters are accounts that publish micro-blog texts that are related to the topic and express opinions and views; and spreaders are accounts that publish micro-blog texts that are related to the topic and spread information and explain the objective situation of the event.
The framework of the social bot classification research proposed in this paper is shown in Figure 1. Building on earlier bot identification results, the SBERT model first judges the relevance between the expanded topics and each micro-blog text, and accounts whose posts are unrelated to the topic are marked as polluters, completing polluter identification. Second, opinion-sentence features are extracted for an initial round of opinion sentence identification, after which the TextCNN model completes further identification; commenters and spreaders are then distinguished according to whether a micro-blog text is related to the topics and whether it expresses an opinion. Finally, because deep learning models require large amounts of data while relatively little social bot data is available, the classification model is trained by transferring human Weibo data, which in turn enables the classification of social bots.
The main contributions of this paper are as follows:
  • Aiming at the data sparseness caused by the large difference in length between micro-blog texts and topics, a topic expansion method is proposed. First, the topics are expanded using the introductory texts of related topics; TextRank is then used to extract keywords so that the topics express the event content more richly and effectively. Finally, relevance is calculated with the SBERT model to identify polluters.
  • A method to distinguish commenters from spreaders through opinion sentence recognition is proposed. Unlike other opinion sentence recognition methods, this paper starts from the opinion-sentence generation principle of commenting social bots and, on the basis of keyword, position, semantic and length features, builds a TextCNN model for opinion sentence recognition.
  • In response to sparse social bot data and low detection accuracy, a social bot classification method based on transfer learning is proposed. With transfer learning, a large number of blog posts from ordinary Weibo accounts are transferred to the social bot classification model, so that rich data can be used to train a more accurate classifier.

2. Methods

2.1. Polluter Recognition

Existing topic relevance research mainly adopts dictionary-based methods or machine learning-based methods [11,12], calculates the similarity of words through synonyms or extracts relevance features, and then uses machine learning models for binary classification. However, these methods rely on feature engineering, which not only has a large workload but also has great deficiencies in semantic disambiguation and language understanding. The method based on deep learning does not require complex feature engineering, and the model can capture the relationship between topics and texts by itself through the training of a large amount of data, which is in line with human cognitive logic [13].
At present, there has been some research on text similarity algorithms based on the Siamese network structure [14,15,16], among which the SBERT model is the most widely used and the most effective. This method treats the entire phrase, sentence or document as one input sequence. Although simple, this approach has an obvious problem: most current topics contain only a few event keywords, so when judging the relevance between micro-blog texts and topics of very different lengths, data sparseness easily arises due to the shared parameters between the two network branches.
In order to solve this problem, this paper proposes a text relevance calculation method based on topic expansion, and its model framework is shown in Figure 2. The topic expansion module includes topic expansion and text compression. First, collect topic-related content and expand it to make the semantic information of the topic more complete. Then, the expanded topics are compressed to extract important key phrases, thereby avoiding the problem of data sparseness. Finally, the SBERT model is used to calculate the relevance of the expanded topics and micro-blog text.
In the topic expansion part, the introduction of each topic is used to expand the topic content. Take the "新疆棉花" (Xinjiang cotton) incident, which became a hot topic early on, as an example. First, search Weibo for topics containing "新疆棉花", crawl all of the #XX# topic introductions and organize them into one document. Because these introductions are specific explanations of each topic and have been verified by Weibo, they are indeed sentences related to the Xinjiang cotton topic.
In the text compression part, this paper uses the TextRank algorithm to extract keywords from the expanded topic content. The basic idea of the TextRank algorithm comes from Google’s PageRank algorithm, which builds a graph model by dividing the micro-blog text into several constituent units (words, sentences). The important components in the micro-blog text are sorted through a voting mechanism, and keyword extraction can be achieved only by using the information of a single micro-blog text.
The TextRank algorithm extracts the introductory keywords as follows:
  • Divide the topic content expanded by the introduction into complete sentences.
  • For each sentence, perform operations such as jieba segmentation, part-of-speech tagging and removing stop words.
  • Construct a topic candidate keyword graph G = (V, E), where V is the node set composed of the candidate keywords generated in step 2, and edges are built from co-occurrence relations: an edge exists between two nodes only if their corresponding words co-occur within a window of K words, where K is the window size.
  • According to the TextRank formula, iteratively propagate the weight of each node until convergence.
    WS(V_i) = (1 - d) + d × Σ_{V_j ∈ In(V_i)} ( w_ji / Σ_{V_k ∈ Out(V_j)} w_jk ) × WS(V_j)
    where WS(V_i) denotes the importance of node V_i, d is the damping coefficient, In(V_i) is the set of nodes with edges pointing to V_i, Out(V_j) is the set of nodes that V_j points to and w_ji is the weight of the edge from V_j to V_i.
  • Sort the node weights obtained in step 4 and take the k most important words as the keywords of the introduction, completing the construction of the topic expansion sequence.
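The keyword-ranking steps above can be sketched as a minimal unweighted TextRank over a co-occurrence graph. This is an illustrative sketch: segmentation, part-of-speech tagging and stop-word removal are assumed to have already produced the candidate word list, and the window size, damping factor and iteration count are placeholder defaults, not the paper's settings.

```python
def textrank_keywords(words, window=3, d=0.85, iters=50, top_k=3):
    """Rank candidate keywords by iterating the TextRank weight formula."""
    # Build an undirected co-occurrence graph over a sliding window
    neighbors = {w: set() for w in words}
    for i, w in enumerate(words):
        for j in range(max(0, i - window + 1), min(len(words), i + window)):
            if words[j] != w:
                neighbors[w].add(words[j])
                neighbors[words[j]].add(w)
    # Iteratively propagate node weights:
    # WS(Vi) = (1 - d) + d * sum over neighbors Vj of WS(Vj) / degree(Vj)
    ws = {w: 1.0 for w in neighbors}
    for _ in range(iters):
        ws = {w: (1 - d) + d * sum(ws[v] / len(neighbors[v]) for v in neighbors[w])
              for w in neighbors}
    return [w for w, _ in sorted(ws.items(), key=lambda kv: -kv[1])[:top_k]]
```

On a toy word sequence, the word with the most co-occurrence neighbors ends up ranked first; in the paper's pipeline the top-k ranked words then form the expanded topic sequence.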
The SBERT model [14], which uses the Siamese network framework, is introduced to judge topic relevance. The Siamese network proposed by Chopra et al. is a neural network structure built from a group of networks with identical parameters [14]; the symmetry brought by parameter sharing gives the Siamese network good modeling ability for texts with the same structure. The SBERT model uses the Siamese network structure to fine-tune the pre-trained BERT model so that it yields sentence embeddings suitable for semantic similarity calculation. The architecture for topic relevance judgment with the SBERT model is shown in Figure 3.
The process of using the SBERT model for topic relevance judgment is as follows:
(1) First, input the micro-blog text and the expanded topics; BERT encodes them into two feature vectors u and v representing the sentences.
(2) Then, apply the MEAN pooling strategy: take all output vectors of the last layer of the sequence and average them to obtain the sentence vector.
(3) Finally, use the mean squared error as the training objective and compute the cosine similarity of u and v to measure the similarity between the micro-blog text and the topics. When the similarity exceeds a threshold, the micro-blog text is judged to be related to the topics. Since different thresholds produce different results, the threshold with the highest accuracy is taken as the final threshold.
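A minimal sketch of the final thresholding step, with toy vectors standing in for the SBERT sentence embeddings u and v; both the stub embeddings and the threshold value are illustrative placeholders.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two dense sentence vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def is_topic_related(text_vec, topic_vec, threshold=0.5):
    """Judge a micro-blog text as topic-related when similarity exceeds the threshold."""
    return cosine_similarity(text_vec, topic_vec) > threshold

# Toy vectors standing in for SBERT embeddings of a text and the expanded topic
related = is_topic_related([1.0, 0.9, 0.1], [1.0, 1.0, 0.0])
unrelated = is_topic_related([0.0, 0.1, 1.0], [1.0, 1.0, 0.0])
```

In the actual pipeline, the vectors come from the two SBERT branches and the threshold is selected on validation data as described above.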

2.2. Commenter and Spreader Recognition

For spreaders, their purpose is to automatically publish news, blogs, breaking events, etc. They facilitate the dissemination of information primarily by collecting objective descriptions of events for human consumption and use. In other words, the blog posts of spreaders are based on factual descriptions and do not involve the expression of opinions. For commenters, their purpose is to influence the group’s cognition of things by expressing their own opinions, and to a certain extent, they can manipulate public opinion [17]. In a specific topic, commenters only need to occupy a large enough proportion to change public opinion, so that the opinions they disseminate will eventually become mainstream opinions [18]. As a result, commenters’ blog posts generally have clear opinions. Therefore, commenters and spreaders can be recognized by judging whether a blog post published by social bots has an opinion.
Opinions in blog posts usually appear in opinion sentences, so identifying whether a post has an opinion mainly depends on whether it contains an opinion sentence. The question of whether there is an opinion is thus transformed into the problem of identifying opinion sentences. Opinion sentence recognition extracts sentences that express opinions, attitudes, positions and so on from blog posts; it is a typical text classification task and follows the same processing flow [19]. However, the blog posts generated by commenters lack human writing skill and therefore do not exhibit good opinion sentence characteristics.
From the perspective of text generation by social bots, in order to make commenters express specific opinions, the generator usually adopts a corpus containing specific keywords [20]. Specifically, a new opinion sentence can be generated by inputting the opinion sentence example of the specific event corpus, and then adopting the method of synonym deformation. It is also possible to input the opinion sentence examples of the general corpus, extract sentence pattern features and then generate opinion sentences about specific events according to the given keywords and given emotions. Therefore, according to the social bots opinion sentence generation rules, this paper identifies opinion sentences from four perspectives: keyword feature, position feature, semantic feature and length feature.
First, each social bot blog post is split into n sentences, d = {s_1, s_2, …, s_n}, and each sentence is then segmented into l words, s_i = {w_i1, w_i2, …, w_il}. The opinion sentence score is obtained as the weighted sum of the four features of each sentence, as shown in Equation (2), where λ_1, λ_2, λ_3 and λ_4 denote the weights of the four features, sum to 1 and can be adjusted according to the situation.
f(s_i) = λ_1 · f_keyword(s_i) + λ_2 · f_position(s_i) + λ_3 · f_semantic(s_i) + λ_4 · f_length(s_i)
Keyword feature: The TextRank algorithm is used to extract keywords with nouns and verbs for all blog posts posted by social bots. When selecting a key sentence, if a keyword appears in the sentence, the probability that the sentence is an opinion sentence will be very high. Therefore, the score function of the keyword feature is as follows:
f_keyword(s_i) = Σ_l keyword(w_il), where keyword(w_il) = 1 if w_il is a keyword and 0 otherwise.
Position feature: In a blog post that expresses an opinion, the central point is usually at the beginning or end. Therefore, the first or last sentence of a blog post is more likely to be an opinion sentence, and the score function of the position feature is as follows:
f_position(s_i) = 1 if s_i is the first or last sentence of the post, and 0 if s_i is a middle sentence.
As the function shows, sentences in the middle of the text receive low scores, while sentences at the beginning and end of the text receive high scores.
Semantic feature: Opinion sentences often contain words that express opinions, as well as words that are subjective, summarizing or transitional. Typical opinion words include "support, oppose, resist, agree, believe"; subjective words include "I, think, estimate, should, maybe, probably"; summarizing words include "therefore, in a word, so, to sum up, it can be seen from this"; and transitional words include "but, despite, although, however". Therefore, the score function of the semantic feature is as follows:
f_semantic(s_i) = 1.0 if s_i contains opinion words; 1.0 if s_i contains subjective words; 0.8 if s_i contains summarizing words; 0.6 if s_i contains transitional expressions.
Length feature: Considering the fact that news blog posts published by spreaders are mostly objective factual descriptions of events, most blog posts are longer in length, while opinion sentences generated by commenters based on rules are relatively short. Therefore, this paper takes the length feature as a feature of opinion sentence recognition, and the score function of the length feature is as follows:
f_length(s_i) = 1 if the sentence length is below 200 characters, and 0 otherwise.
Finally, the four features of each sentence are normalized and summed to obtain the opinion sentence score of each sentence in a social bot blog post. A threshold θ is set for judging whether a sentence is an opinion sentence: if several sentences in a blog post score above θ, the sentence with the highest score is taken as the opinion sentence, completing the rule-based opinion sentence recognition. Accounts whose blog posts contain opinion sentences are judged to be commenters.
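A minimal sketch of the rule-based scoring and selection described above; the cue-word lists, keyword set, weights λ and threshold θ are illustrative placeholders rather than the paper's tuned values.

```python
# Illustrative cue-word lists (placeholders, not the paper's lexicons)
OPINION_WORDS = {"support", "oppose", "agree", "believe"}
SUBJECTIVE_WORDS = {"I", "think", "should", "maybe"}
SUMMARY_WORDS = {"therefore", "so"}
TRANSITION_WORDS = {"but", "however", "although"}

def score_sentence(words, index, n_sentences, keywords,
                   weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of the keyword, position, semantic and length features."""
    token_set = set(words)
    f_keyword = sum(1 for w in words if w in keywords)
    f_position = 1.0 if index in (0, n_sentences - 1) else 0.0
    if token_set & OPINION_WORDS or token_set & SUBJECTIVE_WORDS:
        f_semantic = 1.0
    elif token_set & SUMMARY_WORDS:
        f_semantic = 0.8
    elif token_set & TRANSITION_WORDS:
        f_semantic = 0.6
    else:
        f_semantic = 0.0
    f_length = 1.0 if len(words) < 200 else 0.0
    l1, l2, l3, l4 = weights
    return l1 * f_keyword + l2 * f_position + l3 * f_semantic + l4 * f_length

def pick_opinion_sentence(sentences, keywords, theta=0.5):
    """Return the index of the highest-scoring sentence above theta, or None."""
    scores = [score_sentence(s, i, len(sentences), keywords)
              for i, s in enumerate(sentences)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best if scores[best] > theta else None
```

A post whose first sentence contains a topic keyword and subjective cue words scores highest and is selected; a post with no sentence above θ yields no opinion sentence and moves on to the TextCNN stage.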
To identify opinion sentences more comprehensively, blog posts judged not to contain opinion sentences are input into the TextCNN model for a further judgment. The recognition process is as follows: first, the blog posts are vectorized with the Word2Vec model, and the resulting word vectors are used as the input to the embedding layer of the TextCNN model. Then, feature representations of the blog post are extracted through filters of different sizes. Next, the most significant feature vectors are selected by a pooling operation. Finally, a fully connected layer maps the features to classes, completing the opinion sentence recognition.
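A minimal TextCNN sketch in PyTorch following that pipeline. The vocabulary size, embedding dimension and filter count are illustrative assumptions; the paper's model feeds Word2Vec vectors into the embedding layer and, per the experimental setup, uses kernel sizes 2, 3 and 4.

```python
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    """Minimal TextCNN: parallel convolutions over word embeddings, max-pooled."""
    def __init__(self, vocab_size=1000, embed_dim=64, num_classes=2,
                 kernel_sizes=(2, 3, 4), num_filters=32):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes)
        self.dropout = nn.Dropout(0.5)
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, token_ids):                      # (batch, seq_len)
        x = self.embedding(token_ids).transpose(1, 2)  # (batch, embed, seq)
        # Max-pool each filter map down to one value, then concatenate
        pooled = [conv(x).max(dim=2).values for conv in self.convs]
        return self.fc(self.dropout(torch.cat(pooled, dim=1)))

model = TextCNN()
logits = model(torch.randint(0, 1000, (8, 20)))  # batch of 8 posts, 20 tokens each
```

The two logits per post correspond to the binary opinion/non-opinion decision made by the fully connected layer.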
After completing the identification of opinion sentences by the rule-based method and the TextCNN classification model-based method, the commenters and spreaders can be obtained by combining the previous blog post topic relevance judgment results. The overall identification process of commenters and spreaders is shown in Figure 4.

2.3. Social Bot Classification Based on Transfer Learning

In real social networks, the number of social bots is far less than the number of normal human users, so the deep learning-based social bots classification method has a serious problem of insufficient samples. Transfer learning has the ability to draw inferences about other cases from one instance, allowing algorithms to quickly train models for new scenarios based on models from existing domains (source domains) and a small amount of data when dealing with problems in new domains (target domains) [21]. Tan et al. define deep transfer learning in [22], and divide deep transfer learning methods into four categories according to the techniques used, including instance-based methods, mapping-based methods, network-based methods and adversarial-based methods. Among them, network-based deep transfer learning, also known as parameter transfer, refers to reusing part of the neural network pre-trained in the source domain and transferring it to the target domain as part of the new network and is currently the most widely used transfer learning method. It is based on the assumption that the neural network is similar to the processing mechanism of the human brain and is an iterative and continuous abstract process. Therefore, this paper uses the sufficient blog post data of real Weibo accounts to effectively improve the accuracy of social bots classification by using a network-based deep transfer learning method.
The network-based deep transfer learning method used in this paper is shown in Figure 5. The deep neural network is first pre-trained on a large amount of human data from the source domain, and the network structure and parameters are then transferred to the target domain. Since the target data are relatively scarce, and social bot blog posts are generated by imitating human posts, the features of both shallow and deep layers are very similar across the two domains. Therefore, we freeze all parameters of the network except the output layer, fine-tune the output layer with the social bot training data and finally evaluate the classification effect of the model on the test data.
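A minimal sketch of this parameter-transfer step in PyTorch, using a toy classifier in place of the paper's pre-trained network; the layer sizes are illustrative assumptions.

```python
import torch.nn as nn

# Toy stand-in for a network pre-trained on human (source-domain) blog posts
model = nn.Sequential(
    nn.Linear(128, 64),   # shallow feature layers, transferred as-is
    nn.ReLU(),
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Linear(32, 3),     # output layer: polluter / commenter / spreader
)

# Freeze everything except the output layer before fine-tuning on bot data
output_layer = model[-1]
for param in model.parameters():
    param.requires_grad = False
for param in output_layer.parameters():
    param.requires_grad = True

# Only the output layer's weight and bias remain trainable
trainable = [p for p in model.parameters() if p.requires_grad]
```

The optimizer is then built over `trainable` only, so fine-tuning on the small target-domain bot dataset updates just the output layer.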

3. Results

3.1. Data Collection and Preprocessing

The source domain dataset consists of two parts. The first part was collected by writing crawler code: we crawled the original blog posts published by all accounts under the topic "#新疆棉花#" (Xinjiang cotton) and related topics on the Weibo platform, annotated them manually and selected 400 posts unrelated to the topic and 400 news-style posts as the source data for polluters and spreaders, respectively. For the second part, 1200 opinion blog posts were generated by the social bot sample data generation model [23] as the source data for commenters. This completes the construction of the source domain dataset for transfer learning.
The target domain dataset is the real data of 188 bot accounts detected by the social bot identification model; after manual annotation and blog post deduplication, 139 valid social bot accounts remain: 59 polluters, 63 commenters and 17 spreaders.

3.2. Evaluation Metrics

To reflect the overall classification effect more realistically, we use four commonly used indicators, Accuracy, Precision, Recall and F1-score, to measure the performance of the proposed multi-class method for social bots. For a given class A, TP denotes samples of class A correctly classified as class A, TN denotes samples of other classes correctly classified as classes other than A, FP denotes samples of other classes classified as class A and FN denotes samples of class A classified as other classes.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\text{Precision} = \frac{TP}{TP + FP}$$
$$\text{Recall} = \frac{TP}{TP + FN}$$
$$\text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
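These four metrics can be computed directly from the one-vs-rest confusion counts defined above; a minimal sketch:

```python
def classification_metrics(tp, tn, fp, fn):
    """The four metrics above, computed from one-vs-rest confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

# Illustrative counts: 8 polluters found, 2 missed, 1 false alarm, 9 true negatives.
acc, p, r, f1 = classification_metrics(tp=8, tn=9, fp=1, fn=2)
```

The guards against empty denominators matter in multi-class evaluation, where a class may receive no predictions at all.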

3.3. Experimental Environment and Model Parameters

This experiment uses PyTorch as the deep learning framework. For the SBERT model, the batch size is 64, the number of epochs is 100, the pre-trained model is RoBERTa and the loss function is cosine similarity. For the TextCNN model, the convolutional layers use kernels of sizes 2, 3 and 4, respectively, with a learning rate of 0.001 and a dropout of 0.5.
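A TextCNN matching the reported hyperparameters (kernel sizes 2, 3 and 4; dropout 0.5; learning rate 0.001) might be sketched as follows. The vocabulary size, embedding dimension and filter count are illustrative assumptions, since the paper does not report them.

```python
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    """TextCNN with the kernel sizes and dropout reported in the paper;
    embedding dimension and filter count are illustrative assumptions."""
    def __init__(self, vocab_size=5000, embed_dim=128,
                 num_filters=100, kernel_sizes=(2, 3, 4), num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes)
        self.dropout = nn.Dropout(0.5)
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, x):                      # x: (batch, seq_len) token ids
        e = self.embedding(x).transpose(1, 2)  # (batch, embed_dim, seq_len)
        # Convolve with each kernel size, then max-pool over time.
        pooled = [torch.relu(c(e)).max(dim=2).values for c in self.convs]
        return self.fc(self.dropout(torch.cat(pooled, dim=1)))

model = TextCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
```

The three parallel convolutions act as 2-, 3- and 4-gram detectors whose pooled activations are concatenated before the final classifier.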
The processor of the experimental equipment is an Intel® Core™ i7-10875H CPU @ 2.3 GHz with 16 GB of memory, the operating system is Windows 10 and the programs are written in Python.

3.4. Result Analysis

3.4.1. Comparative Experiment of Polluters Identification Effect

We crawl all the introductory content on the home pages of the topic search pages related to “Xinjiang Cotton” as the expanded topic texts, combine the sentences in these texts into a single paragraph and use the TextRank algorithm to extract keywords. The keyword sequence is fed into one branch of SBERT and each micro-blog text into the other branch, and the model then judges whether the micro-blog text is related to the expanded topic sequence. To determine the number of keywords k, this experiment compares topic expansion with 20, 25 and 30 extracted keywords against the case without topic expansion (the baseline SBERT model); a comparison of the classification results at the optimal threshold is shown in Figure 6.
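The keyword-extraction step can be illustrated with a simplified, pure-Python TextRank over a word co-occurrence graph. This is only a sketch: a real pipeline for Chinese text would first segment words and filter by part of speech (e.g. with jieba), and the window size and damping factor below are conventional defaults, not values from the paper.

```python
def textrank_keywords(tokens, k=20, window=3, damping=0.85, iters=50):
    """Rank words by TextRank over a co-occurrence graph and return
    the top-k keywords (a simplified sketch of the extraction step)."""
    # Build an undirected co-occurrence graph within a sliding window.
    neighbors = {}
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window, len(tokens))):
            u = tokens[j]
            if u != w:
                neighbors.setdefault(w, set()).add(u)
                neighbors.setdefault(u, set()).add(w)
    # PageRank-style iteration over the graph.
    score = {w: 1.0 for w in neighbors}
    for _ in range(iters):
        score = {w: (1 - damping) + damping * sum(
                     score[u] / len(neighbors[u]) for u in neighbors[w])
                 for w in neighbors}
    return [w for w, _ in sorted(score.items(), key=lambda kv: -kv[1])][:k]

# Toy English stand-in for the segmented Chinese topic introductions.
tokens = ("xinjiang cotton quality cotton industry supply "
          "cotton export trade industry cotton").split()
top = textrank_keywords(tokens, k=3)
```

Words that co-occur with many well-connected words accumulate the highest scores, so frequent topical terms surface at the top of the ranking.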
As can be seen from Figure 6, compared with the experiment without topic expansion (the baseline SBERT model), the three topic-expansion cases improve or match the baseline in Accuracy, Recall and F1-score, while the k = 20 topic expansion model decreases slightly in Precision.
The results are analyzed as follows. The Precision and Recall of the baseline model are both 0.8333, while the k = 20 topic expansion model has a Precision of 0.8 and a Recall of 1. This indicates that the k = 20 model uses a lower classification threshold and also predicts accounts with less obvious polluter characteristics as polluters. Although the prediction error rate increases slightly, more polluters are identified overall. In addition, the Precision and Recall of the k = 25 and k = 30 topic expansion models are both higher than those of the baseline, so the classification effect after topic expansion is better than that of the baseline model. This indicates that topic expansion effectively enriches the semantic information of the topic, which helps extract important phrases and alleviates the data sparsity problem when the SBERT model computes relevance.
When k = 20, the Precision of the topic expansion model is slightly lower than that of the baseline SBERT model, while Accuracy, Recall and F1-score all improve to some extent. The best results are achieved when k = 25: adding keywords enriches the topic content and thus yields a better classification effect. When k = 30, Accuracy, Precision, Recall and F1-score all begin to decline, which shows that more keywords are not always better; the number must be set according to how closely the keywords match the micro-blog texts. The comprehensive analysis shows that when k = 25, the content of the expanded topic sequence best matches the core content of most micro-blog text sequences, so the model achieves its best effect.

3.4.2. Comparative Experiment of Commenters and Spreaders Identification Effect

(1)
Threshold θ selection experiment
In the opinion sentence recognition algorithm, a sentence whose score exceeds the threshold θ is judged to be an opinion sentence, so the value of θ ∈ (0, 1] directly affects the accuracy of the algorithm. We therefore run experiments to find the optimal value of θ. As θ sweeps from 0 to 1, the changes in each evaluation metric are shown in Figure 7.
It can be seen from Figure 7 that Accuracy and F1-score first increase and then decrease with θ, reaching their maxima at θ = 0.55; Precision increases with θ, while Recall decreases. When θ is too small, more sentences that contain no opinion are judged to be opinion sentences; when θ is too large, more sentences that do contain opinions go unrecognized. Therefore, for the dataset in this paper, setting θ to 0.55 gives a training-set Accuracy of up to 87.38% and an F1-score of up to 85.10% for opinion sentence recognition. At this setting, the test-set Accuracy is 75%, Precision is 80%, Recall is 61.53% and F1-score is 69.56%.
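The threshold selection procedure amounts to sweeping θ over (0, 1] in fixed steps and keeping the value that maximizes the chosen metric. A sketch that maximizes F1 follows; taking F1 as the single objective is an assumption, since the experiment above tracks several metrics.

```python
def best_threshold(scores, labels, step=0.05):
    """Sweep theta over (0, 1] in fixed steps and return the value that
    maximizes F1 (choosing F1 as the objective is an assumption)."""
    best_f1, best_theta = -1.0, step
    for i in range(1, round(1 / step) + 1):
        theta = i * step
        pred = [s > theta for s in scores]
        tp = sum(p and l for p, l in zip(pred, labels))
        fp = sum(p and not l for p, l in zip(pred, labels))
        fn = sum(l and not p for p, l in zip(pred, labels))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        if f1 > best_f1:
            best_f1, best_theta = f1, theta
    return best_theta, best_f1
```

The sweep exhibits exactly the trade-off described above: raising θ increases Precision and lowers Recall, with F1 peaking in between.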
(2)
Comparative experiments
In order to verify the effectiveness of the social bots opinion sentence recognition method proposed in this paper, we take opinion sentence recognition with the deep learning model TextCNN as the experimental baseline and compare it with two alternatives: recognition using the social bots opinion sentence generation rules alone, and recognition combining the generation rules with the TextCNN model. The experimental results are shown in Figure 8.
As can be seen from Figure 8, the general deep learning TextCNN recognition method performs well, with both Accuracy and F1-score above 80%. This is because the deep learning model can expand the dataset through transfer learning and, with its strong capacity for modelling large amounts of data, learn more opinion sentence features, effectively improving recognition. Recognition using the generation rules alone is comparatively ineffective: the rules can identify opinion sentences that contain keyword features, but their coverage is limited.
In this paper, the two methods are combined, and the Rule-TextCNN model is used to judge opinion sentences. First, the rules focus on keywords that express opinions, so that some opinion sentences can be identified accurately; then the remaining sentences are classified by the TextCNN model, so that as many of the remaining opinion sentences as possible are identified. The experimental results show that, compared with the two individual methods, all four evaluation metrics of the combined method improve considerably. The combination effectively exploits the advantages of both methods, which proves the effectiveness of the proposed opinion sentence recognition method.
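The two-stage Rule-TextCNN decision can be sketched as follows. The keyword lexicon, the token-fraction scoring rule and the classifier stub are illustrative placeholders for the paper's generation rules and trained TextCNN; only the "rules first, classifier for the rest" control flow reflects the described method.

```python
OPINION_KEYWORDS = {"支持", "反对", "认为", "应该", "必须"}  # illustrative lexicon
THETA = 0.55  # threshold selected in the experiment above

def rule_score(tokens):
    """Stand-in for the generation-rule score: the fraction of tokens
    that are opinion keywords."""
    return sum(t in OPINION_KEYWORDS for t in tokens) / max(len(tokens), 1)

def is_opinion(tokens, cnn_predict):
    """Stage 1: accept sentences whose rule score exceeds theta;
    stage 2: defer the remaining sentences to the TextCNN classifier."""
    if rule_score(tokens) > THETA:
        return True
    return bool(cnn_predict(tokens))

# Trivial stand-in for the trained TextCNN:
stub_cnn = lambda tokens: "认为" in tokens
```

Sentences the rules accept never reach the classifier, which is why the combination keeps the rules' precision on keyword-heavy bot sentences while recovering the classifier's broader coverage.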

3.4.3. Comparative Experiment of Transfer Learning

In order to verify the effect of transfer learning on social bots classification, three groups of model comparison experiments were set up according to the transfer learning strategy. The first group tests the SBERT model for identifying polluters; the second tests the TextCNN model for distinguishing commenters and spreaders; the third tests the overall social bots classification model. The experimental results are shown in Table 1, and the social bots classification confusion matrix heat map is shown in Figure 9.
It can be seen from Table 1 that, for the first two groups of experiments, the Accuracy and F1-score of the models with transfer learning improve greatly. For the third group, the proposed social bots classification method with transfer learning improves significantly on all four evaluation metrics compared with the baseline classification method, with Precision and F1-score rising by more than 10%.
This is because the baseline model is trained only on the small amount of real blog post data available for social bots, ignoring the importance of large amounts of data for deep feature extraction, so its classification results are poor, whereas the method with transfer learning benefits from training on a large amount of human data and improves markedly. Figure 9 shows that the diagonal of the confusion matrix is darker with transfer learning, demonstrating that the classification task is handled more effectively under the transfer learning strategy. We also found that in both models some spreaders are wrongly predicted as polluters or commenters. This is because the 139 social bots contain only 17 spreaders, so the trained model tends to predict accounts as the larger classes, polluters and commenters, resulting in poorer predictions for the small spreader class.
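One common remedy for such imbalance, which the paper itself does not apply, is to weight the training loss inversely to class frequency so that the minority spreader class contributes more per sample; a sketch using the class counts reported in Section 3.1:

```python
def inverse_frequency_weights(counts):
    """Class weights inversely proportional to class frequency, normalized
    to average 1; the minority class receives the largest weight."""
    total = sum(counts.values())
    raw = {c: total / n for c, n in counts.items()}
    mean = sum(raw.values()) / len(raw)
    return {c: w / mean for c, w in raw.items()}

# Class counts from Section 3.1: 59 polluters, 63 commenters, 17 spreaders.
weights = inverse_frequency_weights(
    {"polluters": 59, "commenters": 63, "spreaders": 17})
```

In a PyTorch pipeline these weights would typically be passed as the `weight` tensor of `nn.CrossEntropyLoss`.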

3.4.4. Comparative Experiments with Traditional Classification Algorithms

Because current research lacks a unified standard for classifying social bot types, the categories of social bots differ across the literature. Therefore, this paper is compared with other methods that select effective features via CDF and classify them with machine learning. Among them, Oentaryo et al. [1] used LR (logistic regression), Yuan et al. [2] used the XGBoost algorithm and Yang et al. [3] used the SVM algorithm. The multi-classification results obtained by the four algorithms are shown in Table 2.
Table 2 shows that, overall, the classification performance of our method is superior to the LR and SVM methods but slightly inferior to the XGBoost method. XGBoost can effectively exploit multidimensional features and remains effective even when the sample set is severely imbalanced, which accounts for its excellent performance.

3.4.5. Comparative Experiments with Deep Learning Classification Algorithms

In addition to the traditional classification models above, this section compares the proposed method with typical text semantic classification models: the multi-channel convolutional neural network (MCNN) from the literature [6] and the deep learning generative model VAE designed in the literature [7]. The blog posts in the dataset are trained and classified with our model, the MCNN model and the VAE model, respectively.
(1)
Introduction to Model Parameters and Training Process
(1)
Model in this article
This model is the main model used in this experiment. It combines the advantages of a traditional rule model and a deep learning model and can effectively handle the complex problem of social bot classification.
The specific steps of the proposed model are as follows:
Step 1: Use the SBERT model to judge the relevance between blog posts and the expanded topics to identify content polluters.
Step 2: Use the opinion sentence recognition method to distinguish commenters from spreaders.
Step 3: Use the transfer learning method, training on blog data from ordinary micro-blog accounts, to ensure the generalization performance and effect of the model.
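The three steps above can be condensed into a single decision function. The two predicates below stand in for the trained SBERT relevance model and the Rule-TextCNN opinion recognizer, and the majority-vote aggregation over an account's posts is an assumption, since the paper does not specify how post-level decisions are aggregated.

```python
def classify_bot(posts, is_topic_relevant, has_opinion):
    """Three-step classification from the paper: mostly irrelevant posts
    mark a polluter; relevant, opinionated posts a commenter; relevant,
    non-opinionated posts a spreader. The predicates stand in for the
    trained SBERT and Rule-TextCNN models; majority vote is an assumption."""
    relevant = [p for p in posts if is_topic_relevant(p)]
    if len(relevant) < len(posts) / 2:
        return "polluter"
    opinionated = sum(bool(has_opinion(p)) for p in relevant)
    return "commenter" if opinionated >= len(relevant) / 2 else "spreader"
```

With trivial keyword predicates, an account posting off-topic ads comes out as a polluter, one posting opinions on the topic as a commenter, and one relaying topic news as a spreader.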
(2)
Multi-channel convolutional neural network (MCNN)
The multi-channel convolutional neural network (MCNN) mainly consists of three convolutional layers and two fully connected layers, used to classify content polluters, commenters and spreaders.
MCNN model parameters: three convolutional layers with 64 convolutional kernels per layer, batch size = 32, max_seq_len = 128, learning rate = 1 × 10−3 and training epochs = 10.
The specific steps are as follows:
Step 1: Input the preprocessed dataset and map each word with word2vec.
Step 2: Extract features with the multi-channel convolutional neural network (MCNN) model and compute the class probabilities with the softmax function.
Step 3: Output the predicted category for each sample.
(3)
Deep learning generative model VAE
The deep learning generative model VAE mainly compresses the data into latent features and classifies the data during decoding.
VAE model parameters: two fully connected layers with 512 neurons per layer, batch size = 64, max_seq_len = 128, learning rate = 1 × 10−3 and training epochs = 10.
The specific steps are as follows:
Step 1: Use the VAE to encode the input and extract effective feature vectors.
Step 2: Classify the feature vectors and output the predicted category for each sample.
(2)
Performance analysis of model comparison results
The results obtained from training and testing the dataset using the above three models are shown in Table 3.
In the comparative experiment, the proposed model, the MCNN model and the VAE model were compared in terms of Accuracy, Precision, Recall and F1-score for content polluters, commenters and spreaders.
Firstly, for the identification of content polluters, the Accuracy, Precision, Recall and F1-score of our model are all higher than those of the MCNN and VAE models, with the highest Accuracy reaching 0.8323. This indicates that our model classifies content polluters with better accuracy and robustness. Secondly, for the identification of commenters, the metrics of our model and the MCNN model are higher than those of the VAE model, with our model achieving the highest Accuracy of 0.8083; the proposed model and MCNN thus perform better in identifying commenters, while the VAE model is relatively poor. Finally, for the identification of spreaders, all four metrics of our model are higher than those of the MCNN and VAE models, with the highest Accuracy reaching 0.7925, indicating that our model also performs best in identifying spreaders.
Overall, the proposed model achieves the best classification effect, with higher Accuracy, Precision, Recall and F1-score, and can better distinguish the different types of social bots.
(3)
Analysis of experimental results in statistical testing
The Wilcoxon test uses a statistical significance level to determine whether two samples or datasets differ. The commonly used significance levels are 0.05 and 0.01.
For the social bot classification problem, the Wilcoxon test can statistically compare the classification results of different models, evaluating each model by testing for significant differences between the true data categories and the categories the model predicts. We take H0 to be the hypothesis that a classification model's results (our model, MCNN or VAE) do not differ significantly from the true categories, and H1 the hypothesis that they do. We choose a statistical significance level of 0.05, which corresponds to an error rate of 5%.
The Wilcoxon signed-rank test involves four groups of samples. The first group is the ground-truth data; the second to fourth groups are the classifications produced by the model proposed in this paper, the MCNN model and the VAE model, respectively. The test results show that the p-value between the first and second groups is 0.021, between the first and third groups 0.025 and between the first and fourth groups 0.027, all below the statistical significance level of 0.05. Moreover, the p-value obtained by the model of this paper is the smallest, so its classification effect can be considered better than that of the MCNN and VAE models. According to these results, the proposed model outperforms the comparison models in social bot classification, and the significance test supports this conclusion.
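For small samples such as these four groups, the exact two-sided Wilcoxon signed-rank p-value can be computed by enumerating sign assignments; a pure-Python sketch follows (in practice one would use `scipy.stats.wilcoxon`).

```python
from itertools import product

def wilcoxon_exact(x, y):
    """Exact two-sided Wilcoxon signed-rank test for small paired samples:
    rank the absolute differences, then enumerate every sign assignment to
    find how often a rank sum at least as extreme as the observed one occurs."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):  # assign average ranks to tied |differences|
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    total = sum(ranks)
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_obs = min(w_plus, total - w_plus)
    hits = 0
    for signs in product((0, 1), repeat=len(ranks)):  # all 2^n sign patterns
        wp = sum(s * r for s, r in zip(signs, ranks))
        hits += min(wp, total - wp) <= w_obs
    return hits / 2 ** len(ranks)
```

Exhaustive enumeration is feasible only for small n (2^n patterns), which matches the small paired samples used in this comparison.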

4. Conclusions

In this paper, social bots are classified into three categories: polluters, commenters and spreaders, and their classification is studied for specific topics. Firstly, we expand the specific topic using topic introductions and then compute the relevance between micro-blog texts and the expanded topics to identify polluters. Then, starting from the principles by which social bots generate opinion sentences, we identify commenters and spreaders by judging whether the posted blogs contain opinions. Finally, transfer learning is used to alleviate the shortage of social bots data by crawling abundant blog posts from ordinary Sina Weibo accounts, thus advancing research on social bots classification. The experimental results show that the topic expansion method and the opinion sentence recognition method are effective, and that the proposed classification method achieves better recognition results under the transfer learning strategy. The topic expansion module expresses event content more richly and effectively, improving the judgment of topic-blog relevance. The opinion sentence recognition method combining generation rules and deep learning fully exploits the advantages of both: through keywords it accurately identifies opinion sentences that follow the social bot generation rules even when they are not fluent, and through the TextCNN model it identifies as many of the remaining opinion sentences as possible, greatly improving the accuracy of opinion sentence recognition for social bots. With the help of transfer learning, the training results from the blog data of a large number of ordinary accounts are transferred to the social bots classification model, yielding more accurate classification results.
It is worth noting that using the purpose of social bots as the classification standard is more interpretable, and that classification by mining blog posts is more general, so the method in this paper is also applicable to classifying abnormal users on social networks other than Sina Weibo. In future work, we will study the sentiment polarity of commenters' opinions, which can further reveal the public opinion tendencies of social bots; by determining whether they hold a positive or negative attitude towards a specific topic, the purposes of social bots can be explored and studied further.

Author Contributions

Conceptualization, X.L. and Y.Z. (Yue Zhan); methodology, X.L.; software, X.L.; validation, X.L., Y.Z. (Yue Zhan) and H.J.; formal analysis, Y.Z. (Yue Zhan); investigation, Y.W. and H.J.; resources, Y.W.; data curation, H.J.; writing—original draft preparation, X.L.; writing—review and editing, X.L. and Y.Z. (Yue Zhan); visualization, H.J.; supervision, X.L.; project administration, Y.Z. (Yi Zhang); funding acquisition, Y.Z. (Yi Zhang). All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant numbers U21B2045 and U21B2043.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Oentaryo, R.J.; Murdopo, A.; Prasetyo, P.K.; Lim, E.-P. On profiling bots in social media. In Proceedings of the Social Informatics: 8th International Conference, SocInfo 2016, Bellevue, WA, USA, 11–14 November 2016; Part I, pp. 92–109.
2. Yuan, L.; Gu, Y.; Zhao, D. Research on abnormal user detection technology in social network based on XGBoost method. Appl. Res. Comput. 2020, 37, 814–817.
3. Yang, Y.; Xu, G.; Lei, J. A multi-classification method for detecting microblog spam users. J. Chongqing Univ. 2018, 41, 44–55.
4. El-Mawass, N.; Honeine, P.; Vercouter, L. Supervised classification of social spammers using a similarity-based Markov random field approach. In Proceedings of the 5th Multidisciplinary International Social Networks Conference, Saint-Etienne, France, 16–18 July 2018; pp. 1–8.
5. Clark, E.M.; Williams, J.R.; Jones, C.A.; Galbraith, R.A.; Danforth, C.M.; Dodds, P.S. Sifting robotic from organic text: A natural language approach for detecting automation on Twitter. J. Comput. Sci. 2016, 16, 1–7.
6. Sitaula, C.; Shahi, T.B. Multi-channel CNN to classify Nepali COVID-19 related tweets using hybrid features. arXiv 2022, arXiv:2203.10286.
7. Shahi, T.B.; Sitaula, C. Natural language processing for Nepali text: A review. Artif. Intell. Rev. 2022, 55, 3401–3429.
8. Yun, L.; Yali, P.; Dong, X. Research on text emotion classification based on improved BERT-BiGRU model. Electron. Technol. Appl. 2023, 49, 9–14.
9. Mengqiang, P.; Nao, L.; Wei, D. Emotional classification of short comments on e-commerce platforms combining adversarial networks and BERT. J. Chongqing Univ. Posts Telecommun. (Nat. Sci. Ed.) 2022, 34, 147–154.
10. Zhang, X.; Wu, Z.; Liu, K.; Zhao, Z.; Wang, J.; Wu, C. Text sentiment classification based on multi-feature LSTM self-attention. Comput. Simul. 2021, 38, 479–489.
11. He, Y.; Xiao, M.; Zhang, Y. A research on hot topic emotional tendency combined with topic relevance. Data Anal. Knowl. Discov. 2017, 1, 46–53.
12. Tian, J.; Zhao, W. Words similarity algorithm based on Tongyici Cilin in semantic web adaptive learning system. J. Jilin Univ. 2010, 28, 602–608.
13. Yin, P.; Pan, W.; Zhang, H.; Chen, D. Clickbait recognition research based on BERT-BiGA model. Data Anal. Knowl. Discov. 2021, 5, 126–134.
14. Zhao, C.; Guo, J.; Yu, Z.; Huang, Y.; Liu, Q.; Song, R. Correlation analysis of news and cases based on unbalanced Siamese network. J. Chin. Inf. Process. 2020, 34, 99–106.
15. Yang, D.; Ke, X.; Yu, Q. A question similarity calculation method based on RCNN. J. Comput. Eng. Sci. 2021, 43, 1076–1080.
16. Viji, D.; Revathy, S. A hybrid approach of weighted fine-tuned BERT extraction with deep Siamese Bi-LSTM model for semantic text similarity identification. Multimed. Tools Appl. 2022, 81, 6131–6157.
17. Zhou, Y.; Min, Y.; Jiang, T.; Wu, Y.; Jin, X.; Cai, H. Research status, challenges and prospects of social media bots. J. Chin. Comput. Syst. 2022, 43, 2113–2121.
18. Cheng, C.; Luo, Y.; Yu, C. Dynamic mechanism of social bots interfering with public opinion in network. Phys. A Stat. Mech. Its Appl. 2020, 551, 124163.
19. Wang, M.; Fu, C.; Xu, F.; Hong, H. A new Chinese subjective sentences recognition method based on word co-occurrence relationship graphic model. J. Chin. Inf. Process. 2015, 29, 185–192.
20. Gehrmann, S.; Strobelt, H.; Rush, A.M. GLTR: Statistical detection and visualization of generated text. arXiv 2019, arXiv:1906.04043.
21. Shang, X.; Han, M.; Wang, S.; Jia, T.; Xu, G. A skin diseases diagnosis method combining transfer learning and neural networks. CAAI Trans. Intell. Syst. 2020, 15, 452–459.
22. Tan, C.; Sun, F.; Kong, T.; Zhang, W.; Yang, C.; Liu, C. A survey on deep transfer learning. In Proceedings of the Artificial Neural Networks and Machine Learning—ICANN 2018: 27th International Conference on Artificial Neural Networks, Rhodes, Greece, 4–7 October 2018; Part III, pp. 270–279.
23. Liu, X.; Xu, Y. Research on expansion method of detection dataset for "human-like" socialbots. J. Univ. Electron. Sci. Technol. China 2022, 51, 130–137.
Figure 1. Framework diagram of social bots classification research.
Figure 2. Framework diagram of topic relevance model.
Figure 3. SBERT model architecture.
Figure 4. Commenters and spreaders identification process.
Figure 5. The network-based deep transfer learning process.
Figure 6. Comparison of evaluation index values before and after topic expansion.
Figure 7. The value of the evaluation index changes with the threshold θ.
Figure 8. Comparison experiment of opinion sentence recognition methods.
Figure 9. Social bots classification confusion matrix heatmap.
Table 1. Model comparison experiments with transfer learning methods.

| Classification Model | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|
| SBERT model | 0.6785 | 1.0 | 0.25 | 0.4 |
| SBERT model with transfer learning | 0.9285 | 0.8571 | 1.0 | 0.923 |
| TextCNN model | 0.7143 | 0.6316 | 0.9231 | 0.7500 |
| TextCNN model with transfer learning | 0.8571 | 0.9091 | 0.7692 | 0.8333 |
| Social bots classification model | 0.6071 | 0.7025 | 0.6071 | 0.5995 |
| Social bots classification model with transfer learning | 0.6428 | 0.8732 | 0.6428 | 0.6919 |
Table 2. Model comparison experiments on different classification algorithms.

| Evaluation Index | Category | This Paper | LR | XGBoost | SVM |
|---|---|---|---|---|---|
| Accuracy | polluters | 0.95 | 0.7152 | 0.9933 | 0.8145 |
| Accuracy | commenters | 0.95 | 0.6622 | 0.9867 | 0.7615 |
| Accuracy | spreaders | 0.95 | 0.9470 | 0.9933 | 0.9470 |
| Precision | polluters | 0.95 | 0.6486 | 1.0 | 0.9062 |
| Precision | commenters | 0.98 | 0.6509 | 0.9879 | 0.7079 |
| Precision | spreaders | 0.87 | 0.875 | 0.9333 | 1.0 |
| Recall | polluters | 0.87 | 0.4444 | 0.9814 | 0.5370 |
| Recall | commenters | 0.99 | 0.8313 | 0.9879 | 0.9638 |
| Recall | spreaders | 0.93 | 0.5 | 1.0 | 0.4285 |
| F1-score | polluters | 0.91 | 0.5274 | 0.9906 | 0.6744 |
| F1-score | commenters | 0.98 | 0.7301 | 0.9879 | 0.8163 |
| F1-score | spreaders | 0.90 | 0.6363 | 0.9655 | 0.6 |
Table 3. Model comparison experiments on different classification algorithms.

| Evaluation Index | Category | This Paper | MCNN | VAE |
|---|---|---|---|---|
| Accuracy | polluters | 0.8323 | 0.7836 | 0.7252 |
| Accuracy | commenters | 0.8083 | 0.7741 | 0.7163 |
| Accuracy | spreaders | 0.7925 | 0.7556 | 0.7043 |
| Precision | polluters | 0.8162 | 0.7514 | 0.7031 |
| Precision | commenters | 0.7852 | 0.7451 | 0.6942 |
| Precision | spreaders | 0.7752 | 0.7323 | 0.6827 |
| Recall | polluters | 0.7901 | 0.7348 | 0.6806 |
| Recall | commenters | 0.7638 | 0.73 | 0.6722 |
| Recall | spreaders | 0.7542 | 0.7113 | 0.6562 |
| F1-score | polluters | 0.8132 | 0.7434 | 0.6853 |
| F1-score | commenters | 0.7942 | 0.7372 | 0.6782 |
| F1-score | spreaders | 0.776 | 0.7216 | 0.6627 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Liu, X.; Zhan, Y.; Jin, H.; Wang, Y.; Zhang, Y. Research on the Classification Methods of Social Bots. Electronics 2023, 12, 3030. https://0-doi-org.brum.beds.ac.uk/10.3390/electronics12143030
