# SC-Political ResNet: Hashtag Recommendation from Tweets Using Hybrid Optimization-Based Deep Residual Network

School of Computer Science and Technology, Dalian University of Technology, Ganjingzi District, Dalian 116024, China

Author to whom correspondence should be addressed.

Academic Editors: Ida Mele and Luis Martínez López

Received: 23 July 2021 / Revised: 7 September 2021 / Accepted: 9 September 2021 / Published: 22 September 2021

(This article belongs to the Special Issue Recommendation Algorithms and Web Mining)

Hashtags are important in various real-world applications, including tweet mining, query expansion, and sentiment analysis. Hence, recommending hashtags from tagged tweets has attracted considerable attention from the research community. However, although many hashtag recommendation methods have been developed, extracting features from dictionary and thematic words has not yet been achieved effectively. Therefore, we developed a hashtag recommendation method using the proposed Sine Cosine Political Optimization-based Deep Residual Network (SC-Political ResNet) classifier. The proposed Sine Cosine Political Optimization (SCPO) algorithm integrates the Sine Cosine Algorithm (SCA) with the Political Optimizer (PO) algorithm. By employing the parametric features of both algorithms, the optimization can help acquire the global best solution when training the weights of the classifier. The hybrid features acquired from the keyword set capture information on dictionary words, thematic words, and more relevant keywords. Extensive experiments are conducted on the Apple Twitter Sentiment and Twitter datasets. Our empirical results demonstrate that the proposed model can significantly outperform state-of-the-art methods in hashtag recommendation tasks.

The micro-blogging platform Twitter has become one of the best-known social networks on the internet. Twitter offers a platform on which users can create a set of follower connections in order to share their views and subscribe to subjects posted by their followers. Twitter was among the first social networking sites to use the hashtag concept [1]. With the proliferation of micro-blogging services, the volume of short text on the internet continues to expand, and the huge number of micro-posts creates a need for effective categorization and data searching. Twitter is one of the most highly rated micro-blogging sites and permits users to employ hashtags to classify their day-to-day posts. However, some tweets do not contain tags, which degrades search quality [2]. On Twitter, users are free to assign hashtags to their tweets; a hashtag is simply a string prefixed with the hash symbol, used to catalog posts and highlight certain content. The hashtag is considered a community-driven convention that adds extra context to tweets. It assists in searching, rapidly propagates a topic among millions of users, and allows users to join specific discussions [3]. Since its introduction, Twitter has become a popular channel for electronic communication. Tweets reflect an enormous variety of topics, including news, personal ideas, political affairs, events, technologies, and celebrities. Furthermore, Twitter permits its users to follow other users in order to track their interests [4].

Twitter has a deep reach, owing to its use on mobile devices such as smartphones. Tweets are communal in nature and are publicly available to everyone once posted on Twitter [2]. Because the culture of tagging is widespread, hashtag recommendation models have attracted the interest of researchers. Recent studies have proposed methods to recommend specific hashtags or to infer the topics hidden in tweets. Even though such systems help motivate and assist users in acquiring tagging habits, they are not adequate for the information seeker who wishes to discover promising hashtags. Recommending well-known hashtags reflects timely topics, but may involve heavily used generic hashtags, so that the suggestions are not personalized [5]. Hashtag recommendation addresses the task of suggesting one or more hashtags with which to tag a post made on a social platform. Users who assign hashtags to posts on a given social network rely on the key facets of the message [6]. Hashtags are a major unit of Twitter, operating as a tagging method for grouping messages related to similar topics. The dynamic nature of hashtags has given rise to various research problems in recent years, involving topics such as semantic hashtag classification and hashtag recommendation-based discovery of events. As hashtags are generated by users, many different hashtags may be created for the same topic. In addition, searching for associated hashtags is a complex and lengthy process for the user. In this circumstance, recommending pertinent hashtags to a user, based on their interests and preferences, is complex [7].

Hashtag recommendation for Twitter has attracted considerable interest among researchers in the natural language processing (NLP) field. Previous studies have focused on the content similarity of tweets [8] and modeled topics using Latent Dirichlet Allocation (LDA) [9]. Various techniques [10] have been devised based on deep learning [11], which are considered promising methods. However, the majority of hashtag recommendation techniques for micro-blogging platforms are based on syntactic criteria or conventional tagging methodologies. They range from probabilistic models to classification and collaborative filtering. These techniques share a common shortcoming: they utilize sparse lexical features, such as bag-of-words (BoW), to represent tweets, while ignoring the semantics behind the tweets. Popular techniques which have been adapted for hashtag recommendation are Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM), which have a good ability to learn sequence representations [12,13,14,15]. The classical methods of hashtag recommendation range from collaborative filtering to classification and probabilistic models, including topic models and Naïve Bayes. In addition, the majority of techniques rely on sparse lexical features that involve BoW models and carefully designed patterns. However, feature engineering is labor-intensive, as sparse and discrete features encode syntactic data from words. Meanwhile, neural techniques have been shown to learn effective representations and have presented improved performance in different NLP tasks [16]. The LSTM and modified RNN have been extensively employed in this respect, due to their ability to capture long-term dependencies [17].

In this research, we employ a Deep Residual Network to recommend hashtags from Twitter data. The proposed SCPO integrates the SCA and the PO algorithm. The SCA creates multiple initial random candidate solutions and requires them to fluctuate outwards or towards the best solution, using a mathematical model based on sine and cosine functions. This algorithm can explore different regions of a search space, avoid local optima, converge towards the global optimum, and exploit promising regions of a search space effectively during optimization. It can be highly effective in solving real problems with constrained and unknown search spaces. The PO is a mathematical mapping of all of the major phases of politics, including constituency allocation, party switching, election campaigns, inter-party elections, and parliamentary affairs. This algorithm assigns each solution a dual role by logically dividing the population into political parties and constituencies, which allows each candidate to update its position with respect to the party leader and the constituency winner. The PO algorithm has excellent convergence speed, with good exploration capability in early iterations. This algorithm is invariant to function shifting and performs consistently in very high-dimensional search spaces. We integrate the PO algorithm with the SCA in order to attain the global best solution.

The input data considered are Twitter data, called tweets. Input tweets are first pre-processed to remove the stop words, which increases the quality of the data for further processing. After removing stop words, the keyword set is computed by finding dictionary words, thematic words, and more relevant words. The dictionary words are the unique keywords retrieved from all the tweets in the training data. Thematic words are extracted based on the co-occurrence frequency of words, whereas more relevant words are those that occur repeatedly among the tweets. After forming the keyword set, the features associated with the keyword set are extracted by considering the dictionary term frequency (TF), the document inverse document frequency (IDF), the thematic words, and the more relevant word features. For each tweet, the document TF, document IDF, thematic word, and more relevant word features are extracted, and their respective variance measures are computed. The input to the Deep LSTM consists of the features and their variance measures, such that the Deep LSTM can compute the hashing score by considering all the features to generate the feature vector. The Deep Residual Network takes the feature vector as input and processes the features in order to recommend the hashtag. The SCPO algorithm carries out the training of the deep learning classifier.

The major contributions of this research are as follows:

- We model an effective Hashtag Recommendation system using a Deep Residual Network Classifier, trained by the SCPO algorithm, which is derived by integrating the SCA and the PO algorithm;
- The incorporation of features in the optimization algorithm pushes the solution towards the global optimum rather than local optima;
- We conduct extensive experiments on different datasets, in which the proposed method is shown to outperform state-of-the-art methods in the Hashtag Recommendation task.

The remainder of this paper is organized as follows: Section 2 provides a review of different Hashtag Recommendation methods. Section 3 briefly introduces the architecture of the proposed framework. The system implementation and evaluation are described in Section 4, and the results and discussions are given in Section 5. Finally, Section 6 concludes the overall work and provides recommendations for future research.

Some traditional hashtag recommendation methods are reviewed in this section. Nada Ben-Lhachemi and El Habib Nfaoui [18] introduced a DBSCAN clustering technique for partitioning tweets into clusters consisting of semantically and syntactically relevant tweets. This method has the potential to recommend syntactically and semantically pertinent hashtags for a given tweet; however, it did not apply deep architectures to various semantic knowledge bases in order to enhance the accuracy. Yang Li et al. [17] introduced a Topical Co-Attention Network (TCAN) to guide content and topic attention. This method was shown to be useful in determining structures from large documents; however, the framework failed to involve temporal information. Nada Ben-Lhachemi et al. [15] introduced an LSTM network for encoding the tweet vector representation. The skip-gram model was employed to generate the tweet embeddings, considering the vector-based representation. It was capable of recommending suitable hashtags, but failed to utilize semantic knowledge bases to improve the accuracy. Asma Belhadi et al. [19] introduced the PM-HRec model to address the issues of hashtag recommendation in two stages; namely, offline and online processing. Offline processing was used to transform the tweet corpus into a transactional database based on temporal information, whereas online processing was used to extract related hashtags. This method works effectively on huge data, but did not use parallel GPUs to handle large data. Areej Alsini et al. [20] introduced a community-based hashtag recommendation model by applying a tweet similarity process. This method makes it easy to understand the different factors influencing hashtag recommendation, but it involves a lengthy process.

Qi Yang et al. [21] developed an AMNN to learn a representation of multimodal micro-blogs and to recommend related hashtags. The features from both images and text were extracted using a hybrid neural network model. This method improved performance on multimodal micro-blogs, but failed to adapt to external knowledge, such as comments and user information, for the purpose of recommendation. Renfeng Ma et al. [22] developed a Co-attention memory network for the recommendation of hashtags, in which users were allowed to introduce new hashtags dynamically. This method can manage a huge number of hashtags; however, the generation of high-quality candidate sets is complex. Da Cao et al. [23] developed a Neural Network-based LOGO model for recommending hashtags by considering multiple modalities. It assists in capturing sequential and considerate features simultaneously, but fails to include a recommendation on micro-videos. Can Li et al. [24] developed a tag recommendation technique, the Tag recommendation method with Deep learning and Collaborative filtering (TagDC), which consists of two modules: first, word learning is performed using an enhanced CNN; then, collaborative filtering is employed for tag recommendation on software sites. The performance was evaluated based on the F-Measure, recall, and precision. However, unpopular tags were not considered. Sahraoui et al. [25] developed a hybrid filtering technique for mining the data in a social network. In this technique, the Big Five personality traits were used for the mining; then, the user's interest was represented in a graph and predicted using the meta path discovery method. Its efficiency was evaluated using the precision, recall, and F-Measure. However, its performance in signed networks was not evaluated. A community-based hashtag recommendation model [20] was presented for performing recommendations with tweets. However, it involves a complex and lengthy process for searching related hashtags. Moreover, the classical techniques of hashtag recommendation utilize probabilistic models, such as topic models and Naïve Bayes. In addition, the majority of techniques rely on sparse lexical features involving the Bag-of-Words (BoW) model and devised structures. However, the associated feature engineering is challenging, and these methods cannot encode semantic and syntactic information [17]. Table 1 depicts the literature review in detail.

The overall architecture of the SCPO-based Deep Residual Network for Hashtag Recommendation is illustrated in Figure 1, comprising several components. The details of each component are presented in the following.

A user creates a hashtag by inserting a hash symbol (#) in front of a word, in order to make the data easier to access. Due to the growth of information being shared among social services, such as Flickr and Instagram, a number of micro-blogs focused on certain hashtags are available. Existing hashtag recommendation methods have been designed solely considering textual data. Hence, in this research, we focus on designing a model using an SCPO-based Deep Residual Network for recommending hashtags. Here, the input tweet is passed through a stop word removal phase, in order to eliminate the stop words associated with the input tweet data. The keyword set is computed based on dictionary words, thematic words, and more relevant words. The features extracted from the keyword set and the hashing score computed from the Deep LSTM are passed as input to the Deep Residual Network, which performs the recommendation process.

Informal language is commonly used in tweets, which are restricted to 140 characters. This restriction forces the user to be imaginative, in terms of shortening words, writing abbreviations, and using symbols and emoticons. Moreover, a tweet may contain special components, such as URLs, media, user mentions, and hashtags. Let us consider a dataset comprised of a number of tweets, represented as

$$\alpha =\{{T}_{1},{T}_{2},\dots ,{T}_{j},\dots ,{T}_{n}\},$$

where $\alpha $ denotes the database, ${T}_{j}$ indicates the tweet located at the ${j}^{th}$ index of the dataset, and n represents the total number of tweets.

The words associated with tweet ${T}_{j}$ are expressed as

$${T}_{j}=\{{W}_{1},{W}_{2},\dots ,{W}_{k},\dots ,{W}_{L}\},$$

where ${W}_{k}$ denotes the ${k}^{th}$ word of the tweet and L denotes the total number of words in the tweet.

A stop word is a word that appears frequently but carries little meaning on its own, such as 'the', 'a', 'is', and 'an'. Stop word removal is the process of removing these words. As stop words do not carry any sentiment, it is important to remove them from the tweet. The stop word removal process increases the performance of hashtag recommendation by reducing the dimensionality of the data. After removing the stop words from tweet ${T}_{j}$, the resulting output tweet is represented as ${T}_{sr}$.
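The stop word removal step can be illustrated with a minimal Python sketch. The stop word list and the whitespace tokenization below are assumptions made for illustration only; the paper does not specify which list or tokenizer was used.

```python
from typing import List

# Hypothetical, deliberately small stop-word list (an assumption, not the paper's list).
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in", "on"}


def remove_stop_words(tweet: str) -> List[str]:
    """Return the words of a tweet T_j with stop words removed (T_sr)."""
    words = tweet.lower().split()  # naive whitespace tokenization (assumption)
    return [w for w in words if w not in STOP_WORDS]


t_sr = remove_stop_words("The new iPhone is a hit")
print(t_sr)
```

The remaining words form the reduced tweet from which the keyword set is built.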

In the keyword set process, the extraction of dictionary words, thematic words, and more relevant words is carried out on the tweet.

The keyword set comprises three subsets—namely, $DW$, $TW$, and $MR$—such that the keyword set can be written as ${T}_{j}=\{DW,TW,MR\}$.

After finding the keyword set, the features associated with the keyword set are selected by constructing the feature map. This section explains the construction of feature maps, feature reconstruction, and the computation of hashing scores.

The feature map is constructed for the keyword set by considering each tweet word in the dataset. For an individual word in a tweet, the feature map is constructed based on dictionary keywords, thematic keywords, and more relevant keywords. The dictionary feature is constructed using the $TF$ and $IDF$ for each word ${W}_{K}$ of the tweet, where ${W}_{K}$ indicates the ${K}^{th}$ word of the tweet. Figure 2 represents the schematic view of feature map construction. $TF$ is the number of times the word ${W}_{K}$ is present in the tweet. The $IDF$ is computed as the logarithm of the ratio of the total number of tweets, ${T}_{n}$, to the number of tweets that contain the word ${W}_{K}$, $DF\left({W}_{K}\right)$, measured as

$$IDF=\log \frac{{T}_{n}}{DF\left({W}_{K}\right)}.$$
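As a rough illustration of the $TF$ and $IDF$ definitions above, the following Python sketch computes both quantities for tokenized tweets. The function names and the toy corpus are hypothetical, and the natural logarithm is an assumption, since the base of the logarithm is not stated.

```python
import math
from typing import Dict, List


def term_frequency(tweet: List[str]) -> Dict[str, int]:
    """TF: number of times each word appears in a single tweet."""
    tf: Dict[str, int] = {}
    for word in tweet:
        tf[word] = tf.get(word, 0) + 1
    return tf


def inverse_document_frequency(word: str, tweets: List[List[str]]) -> float:
    """IDF = log(total number of tweets / number of tweets containing the word)."""
    df = sum(1 for tweet in tweets if word in tweet)
    if df == 0:
        raise ValueError("word does not occur in any tweet")
    return math.log(len(tweets) / df)  # natural log (assumption)


# Toy corpus (illustrative only).
tweets = [
    ["apple", "stock", "rises"],
    ["apple", "iphone", "launch"],
    ["stock", "market", "falls"],
    ["weather", "sunny"],
]
print(term_frequency(tweets[0]))
print(inverse_document_frequency("apple", tweets))
```

Words that occur in few tweets receive a larger $IDF$, so they contribute more strongly to the dictionary feature.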

Let us consider the number of words in a tweet to be four; hence, the document $TF$ computed for the four words of the tweet is represented as ${X}^{1}$, document $IDF$ measured for the same words of the tweet is represented as ${X}^{2}$, and thematic and more relevant keywords measured for the four words of the tweet are denoted by ${X}^{3}$ and ${X}^{4}$, respectively.

After constructing the feature map, the features obtained from the keyword set using the dictionary $TF$, dictionary $IDF$, thematic keywords, and more relevant keywords are represented as ${f}_{1}$, ${f}_{3}$, ${f}_{5}$, and ${f}_{7}$, respectively, each indicating a mean value over the keyword set. Here, ${f}_{1}$ denotes the feature acquired using the dictionary $TF$, and the variance measure of ${f}_{1}$ is represented as ${f}_{2}$:

$${f}_{1}=\frac{1}{L}\sum _{v=1}^{L}{X}_{v}^{1},$$

$${f}_{2}=\frac{1}{L}\sum _{v=1}^{L}{({X}_{v}^{1}-{f}_{1})}^{2},$$

$${f}_{3}=\frac{1}{L}\sum _{v=1}^{L}{X}_{v}^{2},$$

$${f}_{4}=\frac{1}{L}\sum _{v=1}^{L}{({X}_{v}^{2}-{f}_{3})}^{2}.$$

The features acquired from the keyword set based on the dictionary $IDF$ are denoted as ${f}_{3}$, and their respective variance is indicated as ${f}_{4}$.

$${f}_{5}=\frac{1}{L}\sum _{v=1}^{L}{X}_{v}^{3},$$

$${f}_{6}=\frac{1}{L}\sum _{v=1}^{L}{({X}_{v}^{3}-{f}_{5})}^{2}.$$

The feature ${f}_{5}$ is acquired from the keyword set, based on thematic words, and the variance measure of the respective feature ${f}_{5}$ is represented as ${f}_{6}$.

$${f}_{7}=\frac{1}{L}\sum _{v=1}^{L}{X}_{v}^{4},$$

$${f}_{8}=\frac{1}{L}\sum _{v=1}^{L}{({X}_{v}^{4}-{f}_{7})}^{2}.$$

The features acquired based on more relevant words are represented as ${f}_{7}$, and their corresponding variance measure is represented as ${f}_{8}$. Here, L indicates the total number of words in the tweet, and features ${f}_{2}$, ${f}_{4}$, ${f}_{6}$, and ${f}_{8}$ denote the variances of the corresponding features ${f}_{1}$, ${f}_{3}$, ${f}_{5}$, and ${f}_{7}$, respectively. The features acquired from the keyword set have the dimension of $[V\times 8]$.
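The mean and variance features ${f}_{1}$ to ${f}_{8}$ can be sketched in a few lines of Python. The column values below are made-up numbers for a four-word tweet, and the function names are hypothetical; the variance divides by L, as in the equations above.

```python
from typing import List, Sequence, Tuple


def mean_and_variance(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and population variance (divide by L), matching f1/f2, f3/f4, etc."""
    count = len(values)
    mean = sum(values) / count
    variance = sum((x - mean) ** 2 for x in values) / count
    return mean, variance


def keyword_features(x1: Sequence[float], x2: Sequence[float],
                     x3: Sequence[float], x4: Sequence[float]) -> List[float]:
    """Return [f1, ..., f8] for one tweet from its four feature-map columns."""
    features: List[float] = []
    for column in (x1, x2, x3, x4):
        mean, variance = mean_and_variance(column)
        features.extend([mean, variance])
    return features


# Illustrative columns X^1..X^4 for a tweet with L = 4 words (made-up values).
f = keyword_features([1, 2, 1, 0], [0.0, 0.5, 0.5, 1.0], [1, 0, 1, 0], [0, 0, 1, 1])
print(f)
```

Applying `keyword_features` to every tweet yields one row of eight values per tweet, which corresponds to the $[V\times 8]$ feature dimension stated above.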

Deep LSTM [26] is employed to determine the hashing score, considering the keyword set features as input. A deep neural network constructed using a number of layers within the LSTM framework is called a Deep LSTM. It takes advantage of the benefits of both deep neural networks and LSTM networks to obtain the hashing score. It consists of internal state cells, which act as memory cells. The output of the Deep LSTM is the hashing score, which is computed based on the cell states. The working process of this network solely depends on the memory cells.

The input node receives the input feature f, where $f=\{{f}_{1},{f}_{2},{f}_{3},{f}_{4},{f}_{5},{f}_{6},{f}_{7},{f}_{8}\}$, at the input layer, and is computed as

$${Q}_{F}=\tanh (f\ast {K}_{Of}+{M}_{F-1}{K}_{QM}+{Z}_{Q}),$$

where $\tanh$ denotes the hyperbolic tangent function, ${Q}_{F}$ denotes the input node at time F, f denotes the input feature, ${K}_{Of}$ indicates the weights between the input layer and the input nodes of the memory cell, ${M}_{F-1}$ represents the input of the hidden state, ${K}_{QM}$ denotes the weight matrix for the hidden state, and ${Z}_{Q}$ indicates the bias of the input node.

The input gate, ${Y}_{F}$, is equivalent to that of the input node, and it receives the input feature f. The input gate blocks information flow between the nodes. The input gate is represented as

$${Y}_{F}=X(f\ast {K}_{Of}+{M}_{F-1}{K}_{QM}+{Z}_{Y}),$$

where ${Y}_{F}$ denotes the input gate, ${Z}_{Y}$ denotes the bias of the input gate, and X specifies the sigmoid activation function.

The internal state, ${\lambda}_{F}$, is a node with a self-loop recurrent edge consisting of a linear activation function and a unit weight, such that the internal state can be represented as

$${\lambda}_{F}={Y}_{F}\odot {Q}_{F}+{\lambda}_{F-1},$$

where ${\lambda}_{F}$ denotes the internal state at time F.

The forget gate is the sub-unit used for re-initiating the internal state of the memory cell, which is computed as

$${\beta}_{F}=X(f\ast {K}_{Mf}+{M}_{F-1}{K}_{\beta M}+{Z}_{\beta}),$$

where ${\beta}_{F}$ denotes the forget gate, ${K}_{\beta M}$ denotes the weights between the forget gate and the hidden states, ${K}_{Mf}$ represents the weights between the forget gate and the input layer, and ${Z}_{\beta}$ denotes the bias of the forget gate.

Finally, the output gate, ${\gamma}_{F}$, is expressed as

$${\gamma}_{F}=X(f\ast {K}_{\gamma f}+{M}_{F-1}{K}_{\gamma M}+{Z}_{\gamma}),$$

where ${K}_{\gamma f}$ denotes the weights between the output gate and the input layer, ${K}_{\gamma M}$ represents the weights between the output gate and the hidden states, and ${Z}_{\gamma}$ denotes the bias of the output gate. The output of the memory cell is calculated as

$${M}_{F}=\tanh \left({\lambda}_{F}\right)\odot {\gamma}_{F},$$

where ${\lambda}_{F}={Q}_{F}\odot {Y}_{F}+{\lambda}_{F-1}\odot {\beta}_{F}$ and $\odot$ denotes element-wise multiplication. The score computed using the Deep LSTM is termed the hashing score, and is represented as ${f}_{9}$ with the dimension of $[V\times 1]$. It is used to perform the hashtag recommendation, by passing it as an input to the Deep Residual Network, in addition to the features of the keyword set. Figure 3 shows the structure of the Deep LSTM classifier.
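The gate equations of the memory cell can be traced with a minimal single-step sketch in plain Python. This is an illustration under assumptions, not the authors' implementation: each gate is given its own weight set (as in a standard LSTM), the internal state uses the form that includes the forget gate, the gating is taken as element-wise multiplication, and the names `affine`, `lstm_step`, and the parameter keys are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]
GateParams = Tuple[Matrix, Matrix, Vector]  # (input weights, hidden weights, bias)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def affine(f: Vector, m_prev: Vector, params: GateParams) -> Vector:
    """Compute f*K_f + M_{F-1}*K_M + Z for every hidden unit."""
    k_f, k_m, z = params
    return [
        sum(f[i] * k_f[i][j] for i in range(len(f)))
        + sum(m_prev[i] * k_m[i][j] for i in range(len(m_prev)))
        + z[j]
        for j in range(len(z))
    ]


def lstm_step(f: Vector, m_prev: Vector, lam_prev: Vector,
              params: Dict[str, GateParams]) -> Tuple[Vector, Vector]:
    """One time step: returns (M_F, lambda_F)."""
    q = [math.tanh(a) for a in affine(f, m_prev, params["node"])]      # input node Q_F
    y = [sigmoid(a) for a in affine(f, m_prev, params["input"])]       # input gate Y_F
    beta = [sigmoid(a) for a in affine(f, m_prev, params["forget"])]   # forget gate
    gamma = [sigmoid(a) for a in affine(f, m_prev, params["output"])]  # output gate
    # Internal state: gated input plus gated previous state.
    lam = [qi * yi + li * bi for qi, yi, li, bi in zip(q, y, lam_prev, beta)]
    # Cell output: squashed internal state scaled by the output gate.
    m = [math.tanh(li) * gi for li, gi in zip(lam, gamma)]
    return m, lam


# Toy usage: 8 input features, 1 hidden unit, all weights zero (illustrative only).
zero: GateParams = ([[0.0] for _ in range(8)], [[0.0]], [0.0])
params = {name: zero for name in ("node", "input", "forget", "output")}
m, lam = lstm_step([0.1] * 8, [0.0], [2.0], params)
print(m, lam)
```

With all weights set to zero, every gate evaluates to 0.5 and the input node to 0, so the new internal state is simply half of the previous one; this makes the role of the forget gate easy to see.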

The feature vector is computed by considering the features of the keyword set, along with the hashing score. The feature vector used to perform the process of hashtag recommendation is represented as

$$\mu =\{{f}_{1},{f}_{2},{f}_{3},{f}_{4},{f}_{5},{f}_{6},{f}_{7},{f}_{8},{f}_{9}\},$$

where $\mu $ denotes the feature vector, with the dimension of $[V\times 9]$.

To make the tweet searching process easy, Twitter users use hashtags for classifying their tweets. In this paper, the process of hashtag recommendation is carried out using a Deep Residual Network, and the training of this deep learning classifier is carried out using the proposed SCPO algorithm.

The structure of a Deep Residual Network [27] is composed of different layers; namely, residual blocks, convolutional (conv) layers, average pooling layers, and linear classifiers. Figure 4 represents the structure of the Deep Residual Network.

$$Conv2d\left(B\right)=\sum _{z=0}^{b-1}\sum _{c=0}^{b-1}{A}_{z,c}\cdot {\mu}_{(u+z),(v+c)},$$

$$Conv1d\left(B\right)=\sum _{l=0}^{{X}_{in}-1}{A}_{l}\ast \mu ,$$

$${z}_{out}=\frac{{z}_{in}-{l}_{z}}{a}+1,$$

$${c}_{out}=\frac{{c}_{in}-{l}_{c}}{a}+1,$$

$$B\left(\mu \right)=\left\{\begin{array}{c}0;\mu <0\hfill \\ \mu ;\mu \ge 0\hfill \end{array},\right.$$

$$C=g\left(\mu \right)+\mu ,$$

$$C=g\left(\mu \right)+{w}_{f}\mu ,$$

$$C={w}_{U\ast V}{\mu}_{V\ast 1}+{D}_{U\ast 1},$$

$$Softmax\left({\mu}_{u}\right)=\frac{{e}^{{\mu}_{u}}}{{\displaystyle \sum _{l=1}^{{X}_{0}}}{e}^{{\mu}_{l}}};\quad u=1,2,\dots ,{X}_{0},$$
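The building blocks expressed by the equations above (output size of a convolution, ReLU activation, identity skip connection, and softmax) can be sketched in plain Python. The reading of the symbols is an assumption, since the accompanying definitions are not available here: ${l}_{z}$ is taken to be the kernel size, a the stride, and g the residual mapping; the function names are hypothetical, and the integer division and the max-shift in the softmax are implementation choices of this sketch.

```python
import math
from typing import Callable, List

Vector = List[float]


def conv_output_size(size_in: int, kernel: int, stride: int) -> int:
    """z_out = (z_in - l_z) / a + 1, assuming l_z is the kernel size and a the stride."""
    return (size_in - kernel) // stride + 1


def relu(mu: Vector) -> Vector:
    """B(mu): 0 for negative inputs, identity otherwise."""
    return [max(0.0, x) for x in mu]


def residual_block(mu: Vector, g: Callable[[Vector], Vector]) -> Vector:
    """C = g(mu) + mu: the block output adds the input back (identity skip connection)."""
    return [a + b for a, b in zip(g(mu), mu)]


def softmax(mu: Vector) -> Vector:
    """Softmax over the class scores; subtracting the max improves numerical stability."""
    shift = max(mu)
    exps = [math.exp(x - shift) for x in mu]
    total = sum(exps)
    return [e / total for e in exps]


print(conv_output_size(9, 3, 2))
print(residual_block([1.0, -2.0], relu))
print(softmax([0.0, 0.0]))
```

Here `relu` stands in for the residual mapping g purely for demonstration; in a real residual block, g would be a stack of convolution and activation layers.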

The Deep Residual Network training process is carried out using the proposed SCPO, which was derived through the integration of the SCA [28] and PO [29] algorithms. The SCA generates a number of candidate solutions randomly and fluctuates them towards an optimal solution using a mathematical model based on sine and cosine functions. Adaptive parameters are used to balance the exploitation and exploration of the search space. It considers test cases that move the solution towards global optima and offer better convergence. Meanwhile, the PO algorithm considers two perspectives: each solution optimizes its goodwill to win the election, whereas each party tries to maximize its number of seats in parliament, in order to form a government. Here, the party members are considered as the candidate solutions, and the position of a candidate in the solution space is termed its goodwill. The goodwill of a political member is defined based on performance-related factors, mimicking the components or variables of the candidate solution. The PO involves five phases; namely, party formation, election campaign, party switching, inter-party election, and parliamentary affairs. The algorithmic procedure involved in the proposed SCPO is elaborated in the following:

$$E=\{{E}_{1},{E}_{2},\dots ,{E}_{k},\dots ,{E}_{q}\}.$$

Each party ${E}_{k}$ contains q members or candidates in the solution space, and is represented as

$${E}_{k}=\{R{}_{k}^{1},R{}_{k}^{2},\dots ,R{}_{k}^{q}\}.$$

Each ${r}^{th}$ member, $R{}_{k}^{r}$, is assumed to be a potential solution, and each solution is a p-dimensional vector, given as

$${R}_{k}^{r}=\left\{{R}_{k,1}^{r},{R}_{k,2}^{r},\dots ,{R}_{k,p}^{r}\right\},$$

where p denotes the number of input variables and ${R}_{k,i}^{r}$ denotes the ${i}^{th}$ dimension of $R{}_{k}^{r}$.

$$F=\frac{1}{N}\sum _{t=1}^{N}{[{C}_{t}-{Q}_{t}]}^{2},$$

$${R}_{k}^{*}={R}_{k}^{x},$$

$$x=\underset{1\le r\le q}{\operatorname{argmin}}\,S\left({R}_{k}^{r}\right);\quad \forall k=\{1,2,\dots ,q\},$$

$${R}^{*}=\{{R}_{1}^{*},{R}_{2}^{*},\dots ,{R}_{q}^{*}\}.$$

$${H}_{r}^{*}={R}_{x}^{*},$$

$$x=\underset{1\le k\le q}{\operatorname{argmin}}\,S\left({R}_{k}^{r}\right).$$
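The fitness measure and the selection of party leaders and constituency winners can be illustrated with a short Python sketch. It assumes that ${C}_{t}$ and ${Q}_{t}$ in the fitness equation are the target and the classifier output for sample t, that the population is stored as a list of parties (each a list of q members), and that lower fitness is better; all function names are hypothetical.

```python
from typing import Callable, List, Sequence

Solution = List[float]


def mse_fitness(targets: Sequence[float], outputs: Sequence[float]) -> float:
    """F = (1/N) * sum_t (C_t - Q_t)^2 (targets vs. outputs is an assumed reading)."""
    n = len(targets)
    return sum((c - q) ** 2 for c, q in zip(targets, outputs)) / n


def party_leaders(parties: List[List[Solution]],
                  fitness: Callable[[Solution], float]) -> List[Solution]:
    """R_k^*: the fittest (minimum-fitness) member within each party k."""
    return [min(party, key=fitness) for party in parties]


def constituency_winners(parties: List[List[Solution]],
                         fitness: Callable[[Solution], float]) -> List[Solution]:
    """H_r^*: the fittest among the r-th members of all parties."""
    members_per_party = len(parties[0])
    return [
        min((party[r] for party in parties), key=fitness)
        for r in range(members_per_party)
    ]


# Toy population: 2 parties x 2 members, 1-dimensional solutions (illustrative only).
parties = [[[3.0], [1.0]], [[0.5], [2.0]]]
toy_fitness = lambda s: s[0] ** 2
print(party_leaders(parties, toy_fitness))
print(constituency_winners(parties, toy_fitness))
```

Each solution therefore plays two roles at once: it competes within its own party for leadership and within its constituency against the corresponding members of the other parties.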

$${R}_{k}^{h+1}={s}^{*}+d({s}^{*}-{R}_{k}^{h});\quad \text{if }{R}_{k}^{h-1}\le {R}_{k}^{h}\le {s}^{*}\text{ or }{R}_{k}^{h-1}\ge {R}_{k}^{h}\ge {s}^{*},$$

$${R}_{k}^{h+1}=\left\{\begin{array}{cc}{R}_{k}^{h}+{d}_{1}sin\left({d}_{2}\right)\times \left|{d}_{3}{G}_{k}^{h}-{R}_{k}^{h}\right|\hfill & ;{d}_{4}<0.5\hfill \\ {R}_{k}^{h}+{d}_{1}cos\left({d}_{2}\right)\times \left|{d}_{3}{G}_{k}^{h}-{R}_{k}^{h}\right|\hfill & ;{d}_{4}\ge 0.5\hfill \end{array}.\right.$$
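The sine/cosine position update above can be written as a one-dimensional Python sketch. The parameters ${d}_{1}$ to ${d}_{4}$ are passed in explicitly rather than sampled, because their sampling ranges are not restated here; the function name `sca_update` is hypothetical.

```python
import math


def sca_update(r_cur: float, g_dest: float,
               d1: float, d2: float, d3: float, d4: float) -> float:
    """SCA position update for one dimension.

    Uses the sine branch when d4 < 0.5 and the cosine branch otherwise.
    """
    trig = math.sin(d2) if d4 < 0.5 else math.cos(d2)
    return r_cur + d1 * trig * abs(d3 * g_dest - r_cur)


# Cosine branch: cos(0) = 1, so the step equals d1 * |d3 * G - R|.
print(sca_update(1.0, 2.0, d1=1.0, d2=0.0, d3=1.0, d4=0.9))
# Sine branch: sin(0) = 0, so the position is unchanged.
print(sca_update(1.0, 2.0, d1=1.0, d2=0.0, d3=1.0, d4=0.1))
```

The parameter `d1` scales the step size, while `d2` determines, through the sine or cosine value, whether the candidate moves towards or away from the destination point.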

The branch corresponding to the condition ${d}_{4}<0.5$, which uses the sine function, is selected for training the weights.

$${R}_{k}^{h+1}={R}_{k}^{h}+{d}_{1}sin\left({d}_{2}\right)\times \left|{d}_{3}{G}_{k}^{h}-{R}_{k}^{h}\right|.$$

Let us assume that ${d}_{3}{G}_{k}^{h}>{R}_{k}^{h}$. Thus, the absolute value $|\cdot |$ can be removed, and

$${R}_{k}^{h+1}={R}_{k}^{h}+{d}_{1}sin\left({d}_{2}\right)\times \left({d}_{3}{G}_{k}^{h}-{R}_{k}^{h}\right),$$

$${R}_{k}^{h+1}={R}_{k}^{h}+{d}_{1}{d}_{3}sin\left({d}_{2}\right){G}_{k}^{h}-{d}_{1}sin\left({d}_{2}\right){R}_{k}^{h},$$

$${R}_{k}^{h+1}={R}_{k}^{h}\left[1-{d}_{1}sin\left({d}_{2}\right)\right]+{d}_{1}{d}_{3}sin\left({d}_{2}\right){G}_{k}^{h},$$

$${R}_{k}^{h}\left[1-{d}_{1}sin\left({d}_{2}\right)\right]={R}_{k}^{h+1}-{d}_{1}{d}_{3}sin\left({d}_{2}\right){G}_{k}^{h},$$

$${R}_{k}^{h}=\frac{{R}_{k}^{h+1}-{d}_{1}{d}_{3}sin\left({d}_{2}\right){G}_{k}^{h}}{1-{d}_{1}sin\left({d}_{2}\right)}.$$

By substituting Equation (44) into Equation (37), we have

$${R}_{k}^{h+1}={s}^{*}+d\left({s}^{*}-\left(\frac{{R}_{k}^{h+1}-{d}_{1}{d}_{3}sin\left({d}_{2}\right){G}_{k}^{h}}{1-{d}_{1}sin\left({d}_{2}\right)}\right)\right),$$

$${R}_{k}^{h+1}+\frac{d{R}_{k}^{h+1}}{1-{d}_{1}sin\left({d}_{2}\right)}={s}^{*}+d\left({s}^{*}+\frac{{d}_{1}{d}_{3}sin\left({d}_{2}\right){G}_{k}^{h}}{1-{d}_{1}sin\left({d}_{2}\right)}\right),$$

$$\frac{{R}_{k}^{h+1}\left(1-{d}_{1}sin\left({d}_{2}\right)\right)+d{R}_{k}^{h+1}}{1-{d}_{1}sin\left({d}_{2}\right)}={s}^{*}+d\left({s}^{*}+\frac{{d}_{1}{d}_{3}sin\left({d}_{2}\right){G}_{k}^{h}}{1-{d}_{1}sin\left({d}_{2}\right)}\right),$$

$${R}_{k}^{h+1}\left[\frac{\left(1-{d}_{1}sin\left({d}_{2}\right)\right)+d}{1-{d}_{1}sin\left({d}_{2}\right)}\right]={s}^{*}+d\left({s}^{*}+\frac{{d}_{1}{d}_{3}sin\left({d}_{2}\right){G}_{k}^{h}}{1-{d}_{1}sin\left({d}_{2}\right)}\right),$$

$${R}_{k}^{h+1}=\frac{1-{d}_{1}sin\left({d}_{2}\right)}{\left(1-{d}_{1}sin\left({d}_{2}\right)\right)+d}\left[{s}^{*}+d\left({s}^{*}+\frac{{d}_{1}{d}_{3}sin\left({d}_{2}\right){G}_{k}^{h}}{1-{d}_{1}sin\left({d}_{2}\right)}\right)\right].$$

Hence, the update solution of the proposed SCPO algorithm can be expressed as

$${R}_{k}^{h+1}=\left\{\begin{array}{c}\frac{1-{d}_{1}sin\left({d}_{2}\right)}{\left(1-{d}_{1}sin\left({d}_{2}\right)\right)+d}\left[{s}^{*}+d\left({s}^{*}+\frac{{d}_{1}{d}_{3}sin\left({d}_{2}\right){G}_{k}^{h}}{1-{d}_{1}sin\left({d}_{2}\right)}\right)\right];{d}_{4}<0.5\hfill \\ \frac{1-{d}_{1}cos\left({d}_{2}\right)}{\left(1-{d}_{1}cos\left({d}_{2}\right)\right)+d}\left[{s}^{*}+d\left({s}^{*}+\frac{{d}_{1}{d}_{3}cos\left({d}_{2}\right){G}_{k}^{h}}{1-{d}_{1}cos\left({d}_{2}\right)}\right)\right];{d}_{4}\ge 0.5\hfill \end{array},\right.$$

where ${R}_{k}^{h}$ denotes the position of the current solution in the ${k}^{th}$ dimension at the ${h}^{th}$ iteration; d, ${d}_{1}$, ${d}_{2}$, ${d}_{3}$, and ${d}_{4}$ are random numbers; ${G}_{k}^{h}$ indicates the position of the destination point in the ${k}^{th}$ dimension; and $|\cdot |$ specifies the absolute value.
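The closed-form SCPO update can be checked numerically with a small Python sketch for a single dimension. The function name `scpo_update` and the guard against a zero denominator are additions made for this illustration; the random parameters are passed in explicitly so that the result is reproducible.

```python
import math


def scpo_update(s_star: float, g_dest: float, d: float,
                d1: float, d2: float, d3: float, d4: float) -> float:
    """Closed-form SCPO update for one dimension.

    The sine branch is used when d4 < 0.5, the cosine branch otherwise.
    """
    trig = math.sin(d2) if d4 < 0.5 else math.cos(d2)
    base = 1.0 - d1 * trig
    if base == 0.0 or base + d == 0.0:
        # Guard added for this sketch; the equation is undefined here.
        raise ZeroDivisionError("update undefined for these parameter values")
    inner = s_star + d * (s_star + d1 * d3 * trig * g_dest / base)
    return base / (base + d) * inner


# Cosine branch with d2 = 0 (cos = 1): base = 0.5, result = 0.5 * (1 + 0.5 * 3) = 1.25.
r_next = scpo_update(s_star=1.0, g_dest=2.0, d=0.5, d1=0.5, d2=0.0, d3=1.0, d4=0.9)
print(r_next)

# Consistency check against the two equations the closed form was derived from.
r_cur = (r_next - 0.5 * 1.0 * 1.0 * 2.0) / 0.5  # R^h recovered from the SCA relation
print(1.0 + 0.5 * (1.0 - r_cur))                # PO update s* + d (s* - R^h)
```

Both printed values agree, which confirms that the closed form is algebraically consistent with the PO update and the rearranged SCA update under the stated assumption on the sign of the absolute-value term.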

Algorithm 1 provides the pseudo-code for the proposed SCPO-based Deep Residual Network.

Algorithm 1: Pseudo-code of the proposed SCPO-based Deep Residual Network.

$$\begin{array}{l}
\mathbf{Input:}\ {R}_{k}^{h} \\
\mathbf{Output:}\ {R}_{k}^{h+1} \\
\text{Initialize } E \\
\text{Compute fitness measure} \\
\text{Compute } {R}^{*} \\
\text{Compute } {H}^{*} \\
h=1 \\
E(h-1)=E \\
S\left(E(h-1)\right)=S\left(E\right) \\
J={J}_{max} \\
\mathbf{while}\ h\le {h}_{max}\ \mathbf{do} \\
\quad {E}_{\mathrm{temp}}=E \\
\quad S\left({E}_{\mathrm{temp}}\right)=S\left(E\right) \\
\quad \mathbf{for\ each}\ {E}_{k}\in E\ \mathbf{do} \\
\qquad \mathbf{for\ each}\ {R}_{k}^{r}\in {E}_{k}\ \mathbf{do} \\
\qquad \quad {R}_{k}^{r}=\text{Election campaign}\left({R}_{k}^{r},{R}_{k}^{r}(h-1),{R}^{*},{H}^{*}\right) \\
\qquad \mathbf{end} \\
\quad \mathbf{end} \\
\quad \text{Party switching}\left(E,J\right) \\
\quad \text{Compute } {R}^{*} \text{ and } {H}^{*} \text{ at election phase} \\
\quad \text{Parliamentary affairs}\left({H}^{*},E\right) \\
\quad E(h-1)={E}_{\mathrm{temp}} \\
\quad S\left(E(h-1)\right)=S\left({E}_{\mathrm{temp}}\right) \\
\quad J=J-{J}_{max}/{h}_{max} \\
\quad h=h+1 \\
\mathbf{end}
\end{array}$$
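The control flow of Algorithm 1 can be sketched in Python as follows. This is only a structural outline under several assumptions: the function name `run_scpo` is hypothetical, the three phases (election campaign, party switching, parliamentary affairs) are passed in as placeholder callables because their internals are not reproduced here, lower fitness is treated as better, every party is assumed to have the same number of members, and the cached fitness values $S(E)$ are omitted for brevity. ${R}^{*}$ is read as the best member of each party and ${H}^{*}$ as the best candidate at each member index across parties, following the usual Political Optimizer description.

```python
import copy


def run_scpo(parties, fitness, election_campaign, party_switching,
             parliamentary_affairs, h_max, j_max):
    """Structural sketch of Algorithm 1; the phase callables are placeholders."""

    def party_leaders(pop):
        # R*: best member of each party (assumes lower fitness is better).
        return [min(party, key=fitness) for party in pop]

    def constituency_winners(pop):
        # H*: best candidate at each member index across all parties.
        return [min((party[r] for party in pop), key=fitness)
                for r in range(len(pop[0]))]

    previous = copy.deepcopy(parties)        # E(h-1) = E
    leaders = party_leaders(parties)
    winners = constituency_winners(parties)
    j = j_max
    for _ in range(h_max):
        snapshot = copy.deepcopy(parties)    # E_temp = E
        for k, party in enumerate(parties):
            for r, member in enumerate(party):
                party[r] = election_campaign(member, previous[k][r],
                                             leaders, winners)
        party_switching(parties, j)
        leaders = party_leaders(parties)     # recompute R* and H*
        winners = constituency_winners(parties)
        parliamentary_affairs(winners, parties)
        previous = snapshot                  # E(h-1) = E_temp
        j -= j_max / h_max                   # linear decay of J
    return min((m for party in parties for m in party), key=fitness)


# Toy run: members are floats, fitness is |x|, and the campaign halves each
# member. The other two phases are no-ops, purely to exercise the loop.
toy = [[4.0, -2.0], [1.0, 8.0]]
best = run_scpo(toy, fitness=abs,
                election_campaign=lambda m, prev, leaders, winners: 0.5 * m,
                party_switching=lambda pop, j: None,
                parliamentary_affairs=lambda winners, pop: None,
                h_max=3, j_max=1.0)
print(best)
```

In an actual SCPO run, the election campaign callable would apply the position update derived above, and the fitness function would be the training error of the Deep Residual Network for a candidate weight vector.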

In this section, we first present the datasets used and then describe the experimental setup, the evaluation metrics, and the baseline methods.

To evaluate our system, we used the Apple Twitter Sentiment dataset https://data.world/crowdflower/apple-twitter-sentiment/workspace/file?filename=Apple-Twitter-Sentiment-DFE.csv (accessed on 15 April 2021) and the Twitter dataset https://www.kaggle.com/kavita5/twitter-dataset-avengersendgame?select=tweets.csv (accessed on 15 April 2021). These datasets are described below.

- Apple Twitter Sentiment Dataset: This dataset contains 12 columns and 3886 rows, with fields such as id, sentiment, query, and text. The size of the dataset is 798.47 KB. The tweet text contains hashtags, and the tweets carry positive, negative, or neutral sentiments.
- Twitter Dataset: This dataset contains 10,000 records collected from Twitter for the trending topic “AvengersEndgame”. These data can be used for sentiment analysis. It contains 17 columns: 8 string-type variables, 4 Boolean-type variables, 3 integer-type variables, and 2 variables of other types.
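As a rough illustration of how the hashtags embedded in a tweet text field could be pulled out of such a CSV file, the sketch below uses only the Python standard library. The regular expression, the helper name `extract_hashtags`, and the two sample rows are assumptions made up for this example; only the column names `id`, `sentiment`, and `text` are taken from the dataset description above.

```python
import csv
import io
import re

# Assumption: a hashtag is a '#' followed by one or more word characters.
HASHTAG_PATTERN = re.compile(r"#\w+")


def extract_hashtags(text):
    """Return all hashtags found in a tweet, in order of appearance."""
    return HASHTAG_PATTERN.findall(text)


# Made-up sample rows standing in for the real CSV file.
sample_csv = io.StringIO(
    "id,sentiment,text\n"
    "1,3,Loving the new #iPhone camera #Apple\n"
    "2,1,Battery drains too fast\n"
)

rows = list(csv.DictReader(sample_csv))
tagged = [(row["id"], extract_hashtags(row["text"])) for row in rows]
for tweet_id, tags in tagged:
    print(tweet_id, tags)
```

Tweets for which the list is empty correspond to the untagged posts that a recommender is meant to serve.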

The proposed method was implemented in Python using the TensorFlow/PyTorch neural network libraries. Our networks were trained on an NVIDIA GTX 1080 GPU in a 64-bit computer with an Intel(R) Core(TM) i7-6700 CPU, 16 GB RAM, and the Ubuntu 16.04 operating system.

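The performance of the developed scheme was analyzed using the Precision ($PR$), Recall ($RC$), and F1-Score ($FS$) metrics:

$$PR=\frac{{A}_{p}}{{A}_{p}+{B}_{p}},$$

$$RC=\frac{{A}_{p}}{{A}_{p}+{B}_{n}},$$

$$FS=2\times \frac{PR\times RC}{PR+RC},$$

where ${A}_{p}$, ${B}_{p}$, and ${B}_{n}$ denote the numbers of true positives, false positives, and false negatives, respectively.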
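As a minimal sketch of how the three metrics relate, the Python function below computes them from raw counts. The function and argument names are illustrative assumptions, and ${A}_{p}$, ${B}_{p}$, and ${B}_{n}$ are read as true-positive, false-positive, and false-negative counts, following the standard definitions of precision and recall. Returning zero when a denominator vanishes is a convention chosen for this example.

```python
def precision_recall_f1(a_p, b_p, b_n):
    """Compute PR, RC, and FS from true-positive, false-positive,
    and false-negative counts."""
    pr = a_p / (a_p + b_p) if (a_p + b_p) else 0.0
    rc = a_p / (a_p + b_n) if (a_p + b_n) else 0.0
    fs = 2 * pr * rc / (pr + rc) if (pr + rc) else 0.0
    return pr, rc, fs


# Illustrative counts: 90 correct recommendations, 10 wrong ones, 30 missed.
pr, rc, fs = precision_recall_f1(a_p=90, b_p=10, b_n=30)
print(f"PR={pr:.3f} RC={rc:.3f} FS={fs:.3f}")  # PR=0.900 RC=0.750 FS=0.818
```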

In order to evaluate the effectiveness of the proposed framework, our method was compared with several existing algorithms, including:

- RNN-LSTM [15]: For this method, the hashtag recommendation was designed by encoding the tweet vector using the LSTM-RNN technique;
- Pattern Mining for Hashtag Recommendation (PM-HRec) [19]: PM-HRec was designed using the top k high-average utility patterns for temporal transformation tweets, from which the hashtag recommendation was devised;
- Attention-based multimodal neural network model (AMNN) [21]: The AMNN was designed to extract features from text and images, after which correlations are captured for hashtag recommendation;
- Deep LSTM [26]: Deep-LSTM was originally designed to predict daily pan evaporation;
- Emhash [30]: For this method, the hashtag recommendation was developed using a neural network based on BERT embeddings;
- Community-based hashtag recommendation [20]: For this method, the hashtag recommendation was designed using the communities extracted from social networks.

The performance results of our proposed model are presented in this section. The results are compared with previously introduced methods, which were tested on the same datasets.

This section describes the analysis on the Apple Twitter Sentiment dataset. Figure 5 shows the analysis of the proposed model when using features ${f}_{1}$–${f}_{6}$. Figure 5a shows the precision measure. With 70% of the training data, the precision measured by the proposed SCPO-based Deep Residual network was 0.88, making it 14.27%, 13.01%, 7.85%, 7.48%, 5.31%, and 4.30% superior to RNN-LSTM, PM-HRec, AMNN, Deep LSTM, Emhash, and Community-based hashtag recommendation, respectively. The analysis of the recall measure is presented in Figure 5b. With 80% of the training data, the recall measured by the proposed SCPO-based Deep Residual network was 0.91, making it 8.64%, 7.98%, 5.98%, 5.36%, 3.81%, and 3.02% superior to RNN-LSTM, PM-HRec, AMNN, Deep LSTM, Emhash, and Community-based hashtag recommendation, respectively. The analysis of the F1-score measure of the proposed method is presented in Figure 5c. With 70% of the training data, the F1-score measured by the proposed SCPO-based Deep Residual network was 0.87, making it 16.01%, 14.62%, 8.89%, 8.47%, 6.63%, and 5.36% superior to RNN-LSTM, PM-HRec, AMNN, Deep LSTM, Emhash, and Community-based hashtag recommendation, respectively.

The analysis of the developed approach, considering the feature f, is illustrated in Figure 6. Figure 6a portrays the precision metric. With 70% of the training data, the precision measured by the proposed SCPO-based Deep Residual network was 0.90, making it 13.56%, 11.16%, 8.82%, 7.15%, 5.66%, and 4.13% superior to RNN-LSTM, PM-HRec, AMNN, Deep LSTM, Emhash, and Community-based hashtag recommendation, respectively. The analysis of the recall measure is presented in Figure 6b. With 80% of the training data, the recall measured by the proposed SCPO-based Deep Residual network was 0.92, making it 9.89%, 9.23%, 7.44%, 6.83%, 5.42%, and 3.37% superior to RNN-LSTM, PM-HRec, AMNN, Deep LSTM, Emhash, and Community-based hashtag recommendation, respectively. The analysis of the F1-score measure is presented in Figure 6c. With 70% of the training data, the F1-score measured by the proposed SCPO-based Deep Residual network was 0.88, making it 16.31%, 13.82%, 8.73%, 6.36%, 5.00%, and 3.61% superior to RNN-LSTM, PM-HRec, AMNN, Deep LSTM, Emhash, and Community-based hashtag recommendation, respectively.

The comparative analysis carried out considering the Twitter dataset is detailed in this section. Figure 7 presents the analysis when considering the features ${f}_{1}$–${f}_{6}$. Figure 7a presents the analysis with respect to the precision measure. With 80% of the training data, the precision measured by the proposed SCPO-based Deep Residual network was 0.87, making it 5.14%, 4.65%, 2.81%, 2.60%, 1.79%, and 0.76% superior to RNN-LSTM, PM-HRec, AMNN, Deep LSTM, Emhash, and Community-based hashtag recommendation, respectively. The analysis of the recall measure is presented in Figure 7b. With 60% of the training data, the recall measured by the proposed SCPO-based Deep Residual network was 0.78, making it 7.94%, 6.54%, 5.24%, 4.74%, 3.80%, and 2.26% superior to RNN-LSTM, PM-HRec, AMNN, Deep LSTM, Emhash, and Community-based hashtag recommendation, respectively. The analysis of the F1-score measure is presented in Figure 7c. With 70% of the training data, the F1-score measured by the proposed SCPO-based Deep Residual network was 0.85, making it 9.05%, 8.23%, 5.66%, 5.51%, 4.10%, and 2.80% superior to RNN-LSTM, PM-HRec, AMNN, Deep LSTM, Emhash, and Community-based hashtag recommendation, respectively.

Figure 8 indicates the comparative analysis of the proposed scheme using the feature f. Figure 8a portrays the analysis based on the precision measure. With 80% of the training data, the precision measured by the proposed SCPO-based Deep Residual network was 0.87, making it 5.56%, 5.18%, 3.29%, 3.05%, 1.94%, and 1.23% superior to RNN-LSTM, PM-HRec, AMNN, Deep LSTM, Emhash, and Community-based hashtag recommendation, respectively. The analysis of the recall measure is presented in Figure 8b. With 60% of the training data, the recall measured by the proposed SCPO-based Deep Residual network was 0.87, making it 8.92%, 7.68%, 6.21%, 6.12%, 5.04%, and 2.78% superior to RNN-LSTM, PM-HRec, AMNN, Deep LSTM, Emhash, and Community-based hashtag recommendation, respectively. The analysis of the F1-score measure is presented in Figure 8c. With 70% of the training data, the F1-score measured by the proposed SCPO-based Deep Residual network was 0.86, making it 10.09%, 9.04%, 7.08%, 7.04%, 5.81%, and 4.59% superior to RNN-LSTM, PM-HRec, AMNN, Deep LSTM, Emhash, and Community-based hashtag recommendation, respectively.

Table 2 presents a comparison of the results of the tested methods, showing the values obtained when considering all of the features of the dataset and 90% of the training data. For the Apple Twitter Sentiment dataset, the precision achieved by the proposed SCPO-based Deep Residual Network was 0.968, making it 3.51%, 3.31%, 2.58%, 2.48%, 1.86%, and 1.55% superior to RNN-LSTM, PM-HRec, AMNN, Deep LSTM, Emhash, and Community-based hashtag recommendation, respectively. The recall achieved by the proposed SCPO-based Deep Residual Network was 0.958, making it 4.28%, 4.07%, 3.03%, 2.92%, 2.71%, and 1.57% superior to RNN-LSTM, PM-HRec, AMNN, Deep LSTM, Emhash, and Community-based hashtag recommendation, respectively. Likewise, the F1-score obtained by the proposed SCPO-based Deep Residual Network was 0.963, making it 3.84%, 3.74%, 2.80%, 2.70%, 2.28%, and 1.56% better than RNN-LSTM, PM-HRec, AMNN, Deep LSTM, Emhash, and Community-based hashtag recommendation, respectively. Similarly, the proposed SCPO-based Deep Residual Network had the maximum precision, recall, and F1-score values on the Twitter dataset, compared to the considered state-of-the-art methods.
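The percentage gains quoted for Table 2 appear to be consistent with a gain measured relative to the proposed method's own score, that is, (proposed − baseline) / proposed × 100. The short Python check below recomputes the Apple-dataset precision gains from the Table 2 entries on that assumption; the helper name `relative_gain` is illustrative and not taken from the paper.

```python
def relative_gain(proposed, baseline):
    """Percentage gain over a baseline, relative to the proposed score."""
    return 100.0 * (proposed - baseline) / proposed


# Precision values on the Apple Twitter Sentiment dataset, from Table 2.
baseline_precision = {
    "RNN-LSTM": 0.934,
    "PM-HRec": 0.936,
    "AMNN": 0.943,
    "Deep LSTM": 0.944,
    "Emhash": 0.950,
    "Community-based": 0.953,
}
proposed_precision = 0.968

gains = {name: round(relative_gain(proposed_precision, score), 2)
         for name, score in baseline_precision.items()}
for name, gain in gains.items():
    print(f"{name}: {gain:.2f}%")
```

The same computation applied to the recall (0.958) and F1-score (0.963) rows reproduces the remaining percentages in the paragraph above.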

The proposed method outperformed all of the other comparative methods in terms of precision, recall, and F1-score. We attribute this improvement to the population diversity maintained by the proposed algorithm, which broadens the exploration of the search space. Furthermore, complex optimization problems can be handled in parallel, which enhances efficiency and accuracy. The proposed algorithm also has a fast convergence rate and obtained the best solution by avoiding local optima.

In this research, we developed a hashtag recommendation model based on the SCPO-based Deep Residual Network method, in which a deep learning classifier (the Deep Residual Network) is trained by the proposed SCPO algorithm. The proposed method performs hashtag recommendation by considering the features of the keyword set. The input tweet first undergoes stop word removal. The keyword set is then computed by finding the dictionary words, thematic words, and more relevant words in the tweet data. The features of the keyword set are passed to the Deep LSTM classifier in order to find the hashing score. In addition to the features of the keyword set, the hashing score is also employed as an input to the Deep Residual Network, which recommends the hashtags. Empirical results demonstrated that the proposed model can outperform state-of-the-art models on the Apple Twitter Sentiment dataset and the Twitter dataset. In the future, the dimensionality of the features can be further reduced by employing a feature selection model. Moreover, the training process may be further enhanced by testing other optimization algorithms.

S.K.B. designed and wrote the paper; H.L. (Hongfei Lin) supervised the work; S.K.B. performed the experiments with advice from B.X.; and H.L. (Haifeng Liu) organized and proofread the paper. All authors have read and agreed to the published version of the manuscript.

This research was supported by the National Natural Science Foundation of China No. 61572102.


The authors declare no conflict of interest.

The following abbreviations are used in this manuscript:

Abbreviation | Definition |
---|---|
AMNN | Attention-based Multimodal Neural Network |
BoW | Bag of Words |
CNN | Convolutional Neural Network |
DNN | Deep Neural Networks |
LSTM | Long Short-Term Memory |
NB | Naive Bayes |
NLP | Natural Language Processing |
PO | Political Optimizer |
RNN | Recurrent Neural Network |
SCA | Sine Cosine Algorithm |

1. Kowald, D.; Pujari, S.C.; Lex, E. Temporal effects on hashtag reuse in twitter: A cognitive-inspired hashtag recommendation approach. In Proceedings of the 26th International Conference on World Wide Web, Perth, Australia, 3–7 April 2017; pp. 1401–1410.
2. Godin, F.; Slavkovikj, V.; De Neve, W.; Schrauwen, B.; Van de Walle, R. Using topic models for twitter hashtag recommendation. In Proceedings of the 22nd International Conference on World Wide Web, Rio de Janeiro, Brazil, 13–17 May 2013; pp. 593–596.
3. Alvari, H. Twitter hashtag recommendation using matrix factorization. arXiv **2017**, arXiv:1705.10453.
4. Kabakus, A.T.; Kara, R. “TwitterSpamDetector”: A Spam Detection Framework for Twitter. Int. J. Knowl. Syst. Sci. (IJKSS) **2019**, 10, 1–14.
5. Otsuka, E.; Wallace, S.A.; Chiu, D. A hashtag recommendation system for twitter data streams. Comput. Soc. Netw. **2016**, 3, 1–26.
6. Dey, K.; Shrivastava, R.; Kaushik, S.; Subramaniam, L.V. Emtagger: A word embedding based novel method for hashtag recommendation on twitter. In Proceedings of the 2017 IEEE International Conference on Data Mining Workshops (ICDMW), New Orleans, LA, USA, 18–21 November 2017; pp. 1025–1032.
7. Cui, A.; Zhang, M.; Liu, Y.; Ma, S.; Zhang, K. Discover breaking events with popular hashtags in twitter. In Proceedings of the 21st ACM international conference on Information and Knowledge Management, Maui, HI, USA, 29 October–2 November 2012; pp. 1794–1798.
8. Li, T.; Wu, Y.; Zhang, Y. Twitter hash tag prediction algorithm. In Proceedings of the International Conference on Internet Computing (ICOMP), Las Vegas, NV, USA, 18–21 July 2011.
9. Krestel, R.; Fankhauser, P.; Nejdl, W. Latent dirichlet allocation for tag recommendation. In Proceedings of the Third ACM Conference on Recommender Systems, New York, NY, USA, 23–25 October 2009; pp. 61–68.
10. Khabiri, E.; Caverlee, J.; Kamath, K.Y. Predicting semantic annotations on the real-time web. In Proceedings of the 23rd ACM Conference on Hypertext and Social Media, Milwaukee, WI, USA, 25–28 June 2012; pp. 219–228.
11. Weston, J.; Chopra, S.; Adams, K. #TagSpace: Semantic embeddings from hashtags. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; pp. 1822–1827.
12. Cristin, R.; Cyril Raj, V.; Marimuthu, R. Face image forgery detection by weight optimized neural network model. Multimed. Res. **2019**, 2, 19–27.
13. Gangappa, M.; Mai, C.; Sammulal, P. Enhanced Crow Search Optimization Algorithm and Hybrid NN-CNN Classifiers for Classification of Land Cover Images. Multimed. Res. **2019**, 2, 12–22.
14. Vidyadhari, C.; Sandhya, N.; Premchand, P. A semantic word processing using enhanced cat swarm optimization algorithm for automatic text clustering. Multimed. Res. **2019**, 2, 23–32.
15. Ben-Lhachemi, N.; Boumhidi, J. Hashtag Recommender System Based on LSTM Neural Reccurent Network. In Proceedings of the 2019 Third International Conference on Intelligent Computing in Data Sciences (ICDS), Marrakech, Morocco, 28–30 October 2019; pp. 1–6.
16. Gers, F.A.; Schmidhuber, J.; Cummins, F. Learning to forget: Continual prediction with LSTM. Neural Comput. **2000**, 12, 2451–2471.
17. Li, Y.; Liu, T.; Hu, J.; Jiang, J. Topical co-attention networks for hashtag recommendation on microblogs. Neurocomputing **2019**, 331, 356–365.
18. Ben-Lhachemi, N. Using tweets embeddings for hashtag recommendation in Twitter. Procedia Comput. Sci. **2018**, 127, 7–15.
19. Belhadi, A.; Djenouri, Y.; Lin, J.C.W.; Cano, A. A data-driven approach for Twitter hashtag recommendation. IEEE Access **2020**, 8, 79182–79191.
20. Alsini, A.; Datta, A.; Huynh, D.Q. On utilizing communities detected from social networks in hashtag recommendation. IEEE Trans. Comput. Soc. Syst. **2020**, 7, 971–982.
21. Yang, Q.; Wu, G.; Li, Y.; Li, R.; Gu, X.; Deng, H.; Wu, J. AMNN: Attention-Based Multimodal Neural Network Model for Hashtag Recommendation. IEEE Trans. Comput. Soc. Syst. **2020**, 7, 768–779.
22. Ma, R.; Qiu, X.; Zhang, Q.; Hu, X.; Jiang, Y.G.; Huang, X. Co-attention Memory Network for Multimodal Microblog’s Hashtag Recommendation. IEEE Trans. Knowl. Data Eng. **2019**, 33, 388–400.
23. Cao, D.; Miao, L.; Rong, H.; Qin, Z.; Nie, L. Hashtag our stories: Hashtag recommendation for micro-videos via harnessing multiple modalities. Knowl. Based Syst. **2020**, 203, 106114.
24. Li, C.; Xu, L.; Yan, M.; Lei, Y. TagDC: A tag recommendation method for software information sites with a combination of deep learning and collaborative filtering. J. Syst. Softw. **2020**, 170, 110783.
25. Dhelim, S.; Aung, N.; Ning, H. Mining user interest based on personality-aware hybrid filtering in social networks. Knowl. Based Syst. **2020**, 206, 106227.
26. Majhi, B.; Naidu, D.; Mishra, A.P.; Satapathy, S.C. Improved prediction of daily pan evaporation using Deep-LSTM model. Neural Comput. Appl. **2020**, 32, 7823–7838.
27. Chen, Z.; Chen, Y.; Wu, L.; Cheng, S.; Lin, P. Deep residual network based fault detection and diagnosis of photovoltaic arrays using current-voltage curves and ambient conditions. Energy Convers. Manag. **2019**, 198, 111793.
28. Mirjalili, S. SCA: A sine cosine algorithm for solving optimization problems. Knowl. Based Syst. **2016**, 96, 120–133.
29. Askari, Q.; Younas, I.; Saeed, M. Political optimizer: A novel socio-inspired meta-heuristic for global optimization. Knowl. Based Syst. **2020**, 195, 105709.
30. Kaviani, M.; Rahmani, H. Emhash: Hashtag recommendation using neural network based on bert embedding. In Proceedings of the 2020 6th International Conference on Web Research (ICWR), Tehran, Iran, 22–23 April 2020; pp. 113–118.

Authors | Methods | Advantages | Disadvantages |
---|---|---|---|
Ben-Lhachemi, N. et al. [18] | DBSCAN clustering technique. | Has the potential to recommend pertinent hashtags syntactically and semantically from a given tweet. | Fails to involve deep architectures on various semantic knowledge bases to enhance accuracy. |
Li, Y. et al. [17] | Topical Co-Attention Network (TCAN). | Useful for determining structures from large documents. | Fails to involve temporal information. |
Ben-Lhachemi, N. and Boumhidi, J. [15] | Long Short-Term Memory Recurrent Neural Network. | Capable of recommending suitable hashtags. | Fails to utilize semantic knowledge bases to improve accuracy. |
Belhadi, A. et al. [19] | Pattern Mining for Hashtag Recommendation (PM-HRec). | Works effectively on huge data. | Fails to include parallel methods (e.g., GPUs) to handle large data. |
Alsini, A. et al. [20] | Community-based hashtag recommendation model. | Easy to understand different factors that influence hashtag recommendation. | Lengthy process. |
Yang, Q. et al. [21] | Attention-based multimodal neural network model (AMNN). | Able to attain improved performance with multimodal micro-blogs. | Fails to adapt external knowledge, such as comments and user information, for recommendation. |
Ma, R. et al. [22] | Co-attention memory network. | Can manage a huge number of hashtags. | The generation of high-quality candidate sets is complex. |
Cao, D. et al. [23] | Neural network-based LOGO. | Assists in capturing sequential and considerate features at the same time. | Fails to include recommendation on micro-videos. |
Li, C. et al. [24] | Tag recommendation method with Deep learning and Collaborative filtering (TagDC). | Helps to locate similar software. | Fails to consider unpopular tags. |
Dhelim, S. et al. [25] | Hybrid filtering technique in a social network for data mining. | Predicts the user’s interest in a social network. | Fails to evaluate the performance in the signed network. |

Dataset | Metric | RNN-LSTM | PM-HRec | AMNN | Deep LSTM | Emhash | Community-based | Proposed SCPO-DRN |
---|---|---|---|---|---|---|---|---|
Apple | Precision | 0.934 | 0.936 | 0.943 | 0.944 | 0.950 | 0.953 | 0.968 |
Apple | Recall | 0.917 | 0.919 | 0.929 | 0.930 | 0.932 | 0.943 | 0.958 |
Apple | F1-score | 0.926 | 0.927 | 0.936 | 0.937 | 0.941 | 0.948 | 0.963 |
Twitter | Precision | 0.899 | 0.900 | 0.908 | 0.909 | 0.913 | 0.921 | 0.929 |
Twitter | Recall | 0.870 | 0.871 | 0.882 | 0.882 | 0.898 | 0.904 | 0.907 |
Twitter | F1-score | 0.884 | 0.886 | 0.895 | 0.895 | 0.905 | 0.912 | 0.918 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).