Natural Language Engineering: Methods, Tasks and Applications

A special issue of Future Internet (ISSN 1999-5903). This special issue belongs to the section "Smart System Infrastructure and Applications".

Deadline for manuscript submissions: closed (31 October 2021) | Viewed by 23824

Printed Edition Available!
A printed edition of this Special Issue is available.

Special Issue Editors


Guest Editor
Institute for High Performance Computing and Networking (ICAR), National Research Council (CNR), 80131 Naples, Italy
Interests: artificial intelligence; conversational systems; natural language processing; knowledge management; modelling; reasoning

Guest Editor
Institute for High Performance Computing and Networking, National Research Council of Italy (ICAR-CNR), Naples, Italy
Interests: fuzzy modeling; explainable AI; natural language processing; deep neural networks

Special Issue Information

Dear Colleagues,

This Special Issue is intended to provide an overview of research being carried out in natural language processing to address open issues in emerging approaches for learning, understanding, analysing, generating and grounding single and multiple languages, interactively or autonomously from data. It also covers potential and real applications of these approaches in different domains and in everyday devices, with attention to explainability and reduced memory footprints.

To this end, this Special Issue gathers researchers with broad expertise in various fields—natural language processing, cognitive science and psychology, artificial intelligence and neural networks, computational modeling and neuroscience—to discuss their cutting-edge work as well as perspectives on future directions in this exciting field. Original contributions are sought covering the whole range of theoretical and practical aspects, technologies, and systems in this research area.

The topics of interest for this Special Issue include but are not limited to:

  • Natural language understanding, generation and grounding;
  • Multilingual and cross-lingual neural language models;
  • Conversational systems, question answering and visual question answering;
  • Extractive and abstractive summarization;
  • Sentiment analysis, emotion detection and opinion mining;
  • Document analysis, information extraction and text mining;
  • Machine translation;
  • Text de-identification;
  • Search and information retrieval;
  • Common-sense reasoning;
  • Computer/human interactive learning;
  • Low-resource natural language processing;
  • Knowledge distillation and model compression;
  • Neuroscience-inspired cognitive architectures;
  • Trustworthy and explainable artificial intelligence;
  • Cognitive and social robotics;
  • Applications in science, engineering, medicine, healthcare, finance, business, law, education, industry, transportation, retailing, telecommunication and multimedia.

Dr. Massimo Esposito
Dr. Giovanni Luca Masala
Dr. Aniello Minutolo
Dr. Marco Pota
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Future Internet is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • natural language processing
  • text analytics
  • interactive and reinforcement learning
  • machine/deep learning
  • transfer learning
  • explainability
  • knowledge distillation
  • cognitive systems

Published Papers (7 papers)


Editorial


3 pages, 179 KiB  
Editorial
Special Issue “Natural Language Engineering: Methods, Tasks and Applications”
by Massimo Esposito, Giovanni Luca Masala, Aniello Minutolo and Marco Pota
Future Internet 2022, 14(4), 106; https://doi.org/10.3390/fi14040106 - 26 Mar 2022
Viewed by 1971
Abstract
Natural language engineering includes a continuously enlarging variety of methods for solving natural language processing (NLP) tasks within a pervasive number of applications [...]
(This article belongs to the Special Issue Natural Language Engineering: Methods, Tasks and Applications)

Research


18 pages, 1450 KiB  
Article
Introducing Various Semantic Models for Amharic: Experimentation and Evaluation with Multiple Tasks and Datasets
by Seid Muhie Yimam, Abinew Ali Ayele, Gopalakrishnan Venkatesh, Ibrahim Gashaw and Chris Biemann
Future Internet 2021, 13(11), 275; https://doi.org/10.3390/fi13110275 - 27 Oct 2021
Cited by 14 | Viewed by 3421
Abstract
The availability of different pre-trained semantic models has enabled the quick development of machine learning components for downstream applications. However, even if texts are abundant for low-resource languages, very few semantic models are publicly available. Most of the publicly available pre-trained models are built as multilingual versions of semantic models that do not fit the needs of low-resource languages well. We introduce different semantic models for Amharic, a morphologically complex Ethio-Semitic language. After investigating the publicly available pre-trained semantic models, we fine-tune two pre-trained models and train seven new models. The models include Word2Vec embeddings, a distributional thesaurus (DT), BERT-like contextual embeddings, and DT embeddings obtained via network embedding algorithms. Moreover, we employ these models for different NLP tasks and study their impact. We find that the newly trained models perform better than pre-trained multilingual models. Furthermore, models based on contextual embeddings from FLAIR and RoBERTa perform better than Word2Vec models for the NER and POS tagging tasks. DT-based network embeddings are suitable for the sentiment classification task. We publicly release all the semantic models, machine learning components, and several benchmark datasets for NER, POS tagging and sentiment classification, as well as Amharic versions of WordSim353 and SimLex999.
(This article belongs to the Special Issue Natural Language Engineering: Methods, Tasks and Applications)
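Benchmarks such as WordSim353 and SimLex999, released above in Amharic versions, are typically scored by comparing the cosine similarity of word vectors against human ratings via Spearman rank correlation. A minimal, dependency-free sketch of that evaluation; the vectors and ratings below are illustrative, not taken from the paper:

```python
import math

def cosine(u, v):
    # Cosine similarity between two dense word vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def spearman(xs, ys):
    # Spearman correlation computed as Pearson on ranks (no tie handling).
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0] * len(vals)
        for rank, idx in enumerate(order):
            r[idx] = rank
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Illustrative embeddings and human similarity ratings for three word pairs.
vectors = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
pairs = [("a", "b"), ("a", "c"), ("b", "c")]
human = [9.0, 1.0, 2.0]
model = [cosine(vectors[w1], vectors[w2]) for w1, w2 in pairs]
rho = spearman(model, human)
```

A higher rho means the embedding space's similarity structure better matches human judgements, which is how the different Amharic models can be compared on these datasets.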

15 pages, 534 KiB  
Article
A Sentiment-Aware Contextual Model for Real-Time Disaster Prediction Using Twitter Data
by Guizhe Song and Degen Huang
Future Internet 2021, 13(7), 163; https://doi.org/10.3390/fi13070163 - 25 Jun 2021
Cited by 18 | Viewed by 4158
Abstract
The massive amount of data generated by social media presents a unique opportunity for disaster analysis. As a leading social platform, Twitter generates over 500 million Tweets each day. Due to its real-time nature, more agencies employ Twitter to track disaster events and make speedy rescue plans. However, it is challenging to build an accurate predictive model to identify disaster Tweets, which may lack sufficient context due to the length limit. In addition, disaster Tweets and regular ones can be hard to distinguish because of word ambiguity. In this paper, we propose a sentiment-aware contextual model named SentiBERT-BiLSTM-CNN for disaster detection using Tweets. The proposed learning pipeline consists of SentiBERT, which generates sentimental contextual embeddings from a Tweet, a bidirectional long short-term memory (BiLSTM) layer with attention, and a 1D convolutional layer for local feature extraction. We conduct extensive experiments to validate certain design choices of the model and compare it with its peers. Results show that the proposed SentiBERT-BiLSTM-CNN demonstrates superior performance in the F1 score, making it a competitive model for Tweet-based disaster prediction.
(This article belongs to the Special Issue Natural Language Engineering: Methods, Tasks and Applications)
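The two downstream stages of such a pipeline, attention pooling over BiLSTM hidden states and 1D convolution for local features, can be illustrated with a dependency-free sketch. The toy hidden states stand in for real BiLSTM outputs over a Tweet's tokens; all names and sizes are assumptions, not the paper's implementation:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention_pool(hidden_states, scores):
    # Weight each time step by its (softmax-normalised) attention
    # score and sum, producing one fixed-size vector per sequence.
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [0.0] * dim
    for w, h in zip(weights, hidden_states):
        for i in range(dim):
            pooled[i] += w * h[i]
    return pooled

def conv1d(sequence, kernel):
    # Valid 1D convolution, extracting local features along a sequence.
    k = len(kernel)
    return [sum(sequence[i + j] * kernel[j] for j in range(k))
            for i in range(len(sequence) - k + 1)]

# Toy BiLSTM outputs for a 3-token Tweet, hidden size 2.
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled = attention_pool(hidden, scores=[0.1, 0.1, 0.1])  # uniform attention
features = conv1d([h[0] for h in hidden], kernel=[0.5, 0.5])
```

In the full model the pooled and convolved features would feed a classification head that outputs disaster vs. non-disaster; that part is omitted here.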

24 pages, 893 KiB  
Article
Generating Synthetic Training Data for Supervised De-Identification of Electronic Health Records
by Claudia Alessandra Libbi, Jan Trienes, Dolf Trieschnigg and Christin Seifert
Future Internet 2021, 13(5), 136; https://doi.org/10.3390/fi13050136 - 20 May 2021
Cited by 9 | Viewed by 4252
Abstract
A major hurdle in the development of natural language processing (NLP) methods for Electronic Health Records (EHRs) is the lack of large, annotated datasets. Privacy concerns prevent the distribution of EHRs, and the annotation of data is known to be costly and cumbersome. Synthetic data present a promising solution to the privacy concern, provided that the synthetic data have comparable utility to real data and preserve the privacy of patients. However, the generation of synthetic text alone is not useful for NLP because of the lack of annotations. In this work, we propose the use of neural language models (LSTM and GPT-2) for generating artificial EHR text jointly with annotations for named-entity recognition. Our experiments show that artificial documents can be used to train a supervised named-entity recognition model for de-identification, which outperforms a state-of-the-art rule-based baseline. Moreover, we show that combining real data with synthetic data improves the recall of the method without manual annotation effort. We conduct a user study to gain insights into the privacy of artificial text. We highlight privacy risks associated with language models to inform future research on privacy-preserving automated text generation and metrics for evaluating privacy preservation during text generation.
(This article belongs to the Special Issue Natural Language Engineering: Methods, Tasks and Applications)
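Generating text "jointly with annotations" typically means the language model emits inline entity markers that are later parsed into standoff NER spans for training. A minimal sketch of that parsing step; the tag names and example sentence are made up for illustration, and the paper's actual markup scheme may differ:

```python
import re

# Matches inline markers like <NAME>...</NAME>; the backreference \1
# ensures opening and closing tags carry the same label.
TAG = re.compile(r"<(\w+)>(.*?)</\1>")

def parse_annotated(generated):
    # Turn generated text with inline entity markers into a
    # (plain_text, [(start, end, label), ...]) training pair.
    text, spans, cursor = "", [], 0
    for m in TAG.finditer(generated):
        text += generated[cursor:m.start()]
        start = len(text)
        text += m.group(2)           # entity surface form, markers stripped
        spans.append((start, len(text), m.group(1)))
        cursor = m.end()
    text += generated[cursor:]
    return text, spans

sample = "Patient <NAME>J. Doe</NAME> was admitted on <DATE>12 May</DATE>."
plain, entities = parse_annotated(sample)
```

The resulting (text, spans) pairs can be fed directly to a supervised NER trainer, which is what lets synthetic documents substitute for manually annotated EHRs.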

16 pages, 12428 KiB  
Article
A Classification Method for Academic Resources Based on a Graph Attention Network
by Jie Yu, Yaliu Li, Chenle Pan and Junwei Wang
Future Internet 2021, 13(3), 64; https://doi.org/10.3390/fi13030064 - 4 Mar 2021
Cited by 1 | Viewed by 2522
Abstract
Classification of resources can help us effectively reduce the work of filtering massive academic resources, such as selecting relevant papers and focusing on the latest research by scholars in the same field. However, existing graph neural networks do not take into account the associations between academic resources, leading to unsatisfactory classification results. In this paper, we propose an Association Content Graph Attention Network (ACGAT), which is based on the association features and content attributes of academic resources. Semantic relevance and academic relevance are introduced into the model. The ACGAT makes full use of the association commonality and influence information of resources and introduces an attention mechanism to improve the accuracy of academic resource classification. We conducted experiments on a self-built scholar network and two public citation networks. Experimental results show that the ACGAT is more effective than existing classification methods.
(This article belongs to the Special Issue Natural Language Engineering: Methods, Tasks and Applications)
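The core operation in a graph attention layer, as used by models of this kind, is that each node aggregates its neighbours' features weighted by normalised attention coefficients. A minimal, dependency-free sketch; the scoring function, graph, and feature sizes below are illustrative assumptions, not the ACGAT itself:

```python
import math

def softmax(xs):
    # Numerically stable softmax over raw attention scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attend(node, neighbours, features, score):
    # Aggregate neighbour features with softmax-normalised
    # attention coefficients, one per neighbour.
    raw = [score(features[node], features[n]) for n in neighbours]
    alphas = softmax(raw)
    dim = len(features[node])
    out = [0.0] * dim
    for a, n in zip(alphas, neighbours):
        for i in range(dim):
            out[i] += a * features[n][i]
    return out

# Toy citation graph: paper 0 cites papers 1 and 2.
feats = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [0.0, 1.0]}
dot = lambda u, v: sum(a * b for a, b in zip(u, v))
agg = attend(0, [1, 2], feats, dot)  # equal scores -> mean of neighbours
```

A full layer would also apply a learned linear transform and nonlinearity, and the ACGAT additionally builds its graph from semantic and academic relevance between resources; the sketch shows only the attention-weighted aggregation.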

12 pages, 330 KiB  
Article
Pat-in-the-Loop: Declarative Knowledge for Controlling Neural Networks
by Dario Onorati, Pierfrancesco Tommasino, Leonardo Ranaldi, Francesca Fallucchi and Fabio Massimo Zanzotto
Future Internet 2020, 12(12), 218; https://doi.org/10.3390/fi12120218 - 2 Dec 2020
Cited by 5 | Viewed by 2478
Abstract
The dazzling success of neural networks over natural language processing systems is imposing an urgent need to control their behavior with simpler, more direct declarative rules. In this paper, we propose Pat-in-the-Loop as a model to control a specific class of syntax-oriented neural networks by adding declarative rules. In Pat-in-the-Loop, distributed tree encoders make it possible to exploit parse trees in neural networks, heat parse trees visualize the activation of parse trees, and parse subtrees are used as declarative rules in the neural network. Hence, Pat-in-the-Loop is a model for including human control in specific natural language processing (NLP)-neural network (NN) systems that exploit syntactic information, where we generically call the human in the loop Pat. A pilot study on question classification showed that declarative rules representing human knowledge, injected by Pat, can be effectively used in these neural networks to ensure correctness, relevance, and cost-effectiveness.
(This article belongs to the Special Issue Natural Language Engineering: Methods, Tasks and Applications)

12 pages, 5817 KiB  
Article
Paranoid Transformer: Reading Narrative of Madness as Computational Approach to Creativity
by Yana Agafonova, Alexey Tikhonov and Ivan P. Yamshchikov
Future Internet 2020, 12(11), 182; https://doi.org/10.3390/fi12110182 - 27 Oct 2020
Cited by 5 | Viewed by 3566
Abstract
This paper revisits receptive theory in the context of computational creativity. It presents a case study of the Paranoid Transformer, a fully autonomous text generation engine whose raw output can be read as the narrative of a mad digital persona without any additional human post-filtering. We describe the technical details of the generative system, provide examples of its output, and discuss the impact of receptive theory, chance discovery, and the simulation of a fringe mental state on the understanding of computational creativity.
(This article belongs to the Special Issue Natural Language Engineering: Methods, Tasks and Applications)
