
Big Data Cogn. Comput., Volume 5, Issue 1 (March 2021) – 15 articles

Cover Story: Traditional IoT devices using Wi-Fi connectivity have inherent compatibility issues. Seamless integration among IoT devices is required to offer smart, data-driven sensor controls and insightful user decisions. When information collected by one device is shared with others non-intrusively and intelligently, user acceptance of a smart automation of the future becomes achievable. This research factors optimisation considerations of big data and machine learning approaches into a novel methodology for modelling a non-intrusive smart automation system. To validate it, we developed a prototype of our model that uniquely combines personalisation with an IoT hub implementation in a contemporary home environment. A real-time smart home automation use case was demonstrated by employing our model for big data processing and smart analytics via frameworks such as Apache Spark and Apache NiFi [...]
Article
ParlTech: Transformation Framework for the Digital Parliament
Big Data Cogn. Comput. 2021, 5(1), 15; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc5010015 - 15 Mar 2021
Abstract
Societies are entering the age of technological disruption, which also impacts governance institutions such as parliamentary organizations. Thus, parliaments need to adjust swiftly by incorporating innovative methods into their organizational culture and novel technologies into their working procedures. The Inter-Parliamentary Union's World e-Parliament Reports capture digital transformation trends towards open data production, standardized and knowledge-driven business processes, and the implementation of inclusive and participatory schemes. Nevertheless, there is still limited consensus on how these trends will materialize into specific tools, products, and services with added value for parliamentary and societal stakeholders. This article outlines the rapid evolution of the digital parliament from the user perspective. In doing so, it describes a transformational framework based on the evaluation of empirical data from an expert survey of parliamentarians and parliamentary administrators. Basic sets of tools and technologies that intra-parliamentary stakeholders perceive as vital for future parliamentary use, such as systems and processes for information and knowledge sharing, are analyzed. Moreover, boundary conditions for the development and implementation of parliamentary technologies are set out and highlighted. Concluding recommendations regarding the expected investments, interdisciplinary research, and cross-sector collaboration within the defined framework are presented.
(This article belongs to the Special Issue Semantic Web Technology and Recommender Systems)

Article
Stacked Community Prediction: A Distributed Stacking-Based Community Extraction Methodology for Large Scale Social Networks
Big Data Cogn. Comput. 2021, 5(1), 14; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc5010014 - 12 Mar 2021
Abstract
Nowadays, due to the extensive use of information networks in a broad range of fields, e.g., bio-informatics, sociology, digital marketing, and computer science, graph theory applications have attracted significant scientific interest. Due to its apparent abstraction, community detection has become one of the most thoroughly studied graph partitioning problems. However, the existing algorithms principally propose iterative solutions of high polynomial order that repetitively require exhaustive analysis. These methods can undoubtedly be considered resource-wise overdemanding, unscalable, and inapplicable to big data graphs, such as today's social networks. In this article, a novel, near-linear, and highly scalable community prediction methodology is introduced. Specifically, using a distributed, stacking-based model built on plain network topology characteristics of bootstrap-sampled subgraphs, the underlying community hierarchy of any given social network is efficiently extracted regardless of its size and density. The effectiveness of the proposed methodology has been examined diligently on numerous real-life social networks and proven superior to various similar approaches in terms of performance, stability, and accuracy.
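The core idea, computing plain topology features on bootstrap-sampled subgraphs and combining them in an ensemble, can be sketched in a few lines. The graph, the Jaccard neighbourhood feature, and the plain average standing in for the stacked meta-learner are all illustrative assumptions for this sketch, not the paper's actual pipeline:

```python
import random

def jaccard(g, u, v):
    # Neighbourhood overlap: a plain topology feature of the graph
    nu, nv = g.get(u, set()), g.get(v, set())
    if not nu or not nv:
        return 0.0
    return len(nu & nv) / len(nu | nv)

def bootstrap_subgraph(g, frac, rng):
    # Keep a random fraction of the nodes and restrict edges to them
    nodes = list(g)
    kept = set(rng.sample(nodes, max(2, int(frac * len(nodes)))))
    return {u: g[u] & kept for u in kept}

def same_community_score(g, u, v, n_samples=50, frac=0.8, seed=0):
    # Average the feature over bootstrap-sampled subgraphs; the plain
    # average stands in for the trained stacking meta-learner.
    rng = random.Random(seed)
    scores = []
    for _ in range(n_samples):
        sub = bootstrap_subgraph(g, frac, rng)
        if u in sub and v in sub:
            scores.append(jaccard(sub, u, v))
    return sum(scores) / len(scores) if scores else 0.0
```

On a graph of two cliques joined by a single bridge edge, intra-clique pairs score high while the bridge pair scores near zero, which is the signal a community extractor thresholds on.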

Article
From Data Processing to Knowledge Processing: Working with Operational Schemas by Autopoietic Machines
Big Data Cogn. Comput. 2021, 5(1), 13; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc5010013 - 10 Mar 2021
Abstract
Knowledge processing is an important feature of intelligence in general and artificial intelligence in particular. To develop computing systems that work with knowledge, it is necessary to elaborate the means of working with knowledge representations (as opposed to data), because knowledge is an abstract structure. There are different forms of knowledge representation derived from data. One of the basic forms is called a schema, which can belong to one of three classes: operational, descriptive, and representation schemas. The goal of this paper is the development of theoretical and practical tools for processing operational schemas. To achieve this goal, we use schema representations elaborated in the mathematical theory of schemas and employ structural machines as a powerful theoretical tool for modeling parallel and concurrent computational processes. We describe the schema of autopoietic machines as physical realizations of structural machines. An autopoietic machine is a technical system capable of regenerating, reproducing, and maintaining itself through the production, transformation, and destruction of its components and of the networks of processes they contain. We present the theory and practice of designing and implementing autopoietic machines as information processing structures that integrate both symbolic computing and neural networks. Autopoietic machines use knowledge structures capturing the behavioral evolution of the system and its interactions with the environment to maintain stability by counteracting fluctuations.
(This article belongs to the Special Issue Big Data Analytics and Cloud Data Management)

Article
Processing Big Data with Apache Hadoop in the Current Challenging Era of COVID-19
Big Data Cogn. Comput. 2021, 5(1), 12; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc5010012 - 09 Mar 2021
Abstract
Big data have become a global strategic issue, as increasingly large amounts of unstructured data challenge the IT infrastructure of global organizations and threaten their capacity for strategic forecasting. As experienced in former massive information issues, big data technologies such as Hadoop should efficiently tackle the incoming large amounts of data and provide organizations with relevant processed information that was formerly neither visible nor manageable. After having briefly recalled the strategic advantages of big data solutions in the introductory remarks, in the first part of this paper, we focus on the advantages of big data solutions in the currently difficult time of the COVID-19 pandemic. We characterize it as an endemic heterogeneous data context; we then outline the advantages of technologies such as Hadoop and their IT suitability in this context. In the second part, we identify two specific advantages of Hadoop solutions, globality combined with flexibility, and we note that they are at work in a "Hadoop Fusion Approach" that we describe as an optimal response to the context. In the third part, we justify the selected qualifications of globality and flexibility by the fact that Hadoop solutions enable comparable returns in the opposite contexts of models of partial submodels and models of final exact systems. In the fourth part, we note that in both of these opposite contexts, Hadoop's solutions allow a large range of needs to be fulfilled, which fits the requirements previously identified as the current heterogeneous data structure of COVID-19 information. In the final part, we propose a framework of strategic data processing conditions. To the best of our knowledge, these appear to be the most suitable to overcome COVID-19 massive information challenges.
(This article belongs to the Special Issue Big Data Analytics and Cloud Data Management)

Article
A Network-Based Analysis of a Worksite Canteen Dataset
Big Data Cogn. Comput. 2021, 5(1), 11; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc5010011 - 08 Mar 2021
Abstract
The provision of wellness in workplaces has gained interest in recent decades. A factor that contributes significantly to workers' health is their diet, especially when provided by canteen services. Assessing such a service involves questions such as food cost, sustainability, quality, nutritional facts and variety, as well as employees' health and disease prevention, productivity increases, and economic convenience versus eating satisfaction when using canteen services. Even if food habits have already been studied using traditional statistical approaches, here we adopt an approach based on Network Science that allows us to study in depth, for instance, the interconnections among people, companies and meals, and that can easily be reused for further analysis. In particular, this work concerns a multi-company dataset of workers and the dishes they chose at a worksite canteen. We study eating habits and their health consequences, also considering the presence of different companies and the corresponding contact network among workers. The macro-nutrient content and caloric values are assessed both for dishes and for employees, in order to establish when food is balanced and healthy. Moreover, network analysis lets us discover hidden correlations between people and the environment, such as communities that usually cannot be inferred with traditional methods since they are not known a priori. Finally, we represent the dataset as a tripartite network to investigate relationships between companies, people, and dishes. In particular, so-called network projections can be extracted, each being a network among a specific kind of node; further community analysis tools then provide hidden information about people and their food habits.
In summary, the contribution of the paper is twofold: it provides a study of a real dataset spanning several years that gives a new and interesting point of view on food habits and healthcare, and it proposes a new approach based on Network Science. Results prove that this kind of analysis can provide significant information that complements other, traditional methodologies.
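The tripartite representation and its one-mode projections are straightforward to illustrate. The toy (company, person, dish) records below are invented for this sketch; the projection links two workers with a weight equal to the number of distinct dishes they both chose, which is the kind of derived network the community analysis then operates on:

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records standing in for the worksite-canteen dataset:
# each (company, person, dish) triple is one choice in the tripartite network.
records = [
    ("AcmeCo", "ana", "pasta"), ("AcmeCo", "ana", "salad"),
    ("AcmeCo", "bob", "pasta"), ("BetaInc", "carl", "salad"),
    ("BetaInc", "carl", "soup"), ("BetaInc", "dina", "soup"),
]

def project_people_by_dish(records):
    # One-mode projection onto people: two workers are linked with a
    # weight equal to the number of distinct dishes they both chose.
    dishes = defaultdict(set)            # person -> set of dishes chosen
    for _company, person, dish in records:
        dishes[person].add(dish)
    weights = {}
    for p, q in combinations(sorted(dishes), 2):
        shared = len(dishes[p] & dishes[q])
        if shared:
            weights[(p, q)] = shared
    return weights

proj = project_people_by_dish(records)
```

The same pattern, grouping by company or by dish instead of by person, yields the other two projections of the tripartite network.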
(This article belongs to the Special Issue Big Data and Cognitive Computing: Feature Papers 2020)

Article
Automatic Defects Segmentation and Identification by Deep Learning Algorithm with Pulsed Thermography: Synthetic and Experimental Data
Big Data Cogn. Comput. 2021, 5(1), 9; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc5010009 - 26 Feb 2021
Abstract
In quality evaluation (QE) in the industrial production field, infrared thermography (IRT) is one of the most crucial techniques used for evaluating composite materials due to its low cost, fast inspection of large surfaces, and safety. The application of deep neural networks tends to be a prominent direction in IRT Non-Destructive Testing (NDT). The Achilles' heel of training a neural network is the need for a large database, and collecting huge amounts of training data is a highly expensive task. In NDT with deep learning, the contribution of synthetic data to training for infrared thermography remains relatively unexplored. In this paper, synthetic data from standard Finite Element Models are combined with experimental data to build repositories for Mask Region-based Convolutional Neural Networks (Mask-RCNN), strengthening the neural network so that it learns the essential features of objects of interest and achieves defect segmentation automatically. These results indicate the possibility of merging inexpensive synthetic data with a certain amount of experimental data to train neural networks, achieving compelling performance from a limited collection of annotated experimental data from a real-world practical thermography experiment.
(This article belongs to the Special Issue Machine Learning and Data Analysis for Image Processing)

Review
IoT Technologies for Livestock Management: A Review of Present Status, Opportunities, and Future Trends
Big Data Cogn. Comput. 2021, 5(1), 10; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc5010010 - 26 Feb 2021
Abstract
The world population currently stands at about 7 billion, with an expected increase beyond 2030 from 9.4 billion to around 10 billion in 2050. This burgeoning population continues to drive upward demand for animal food. Moreover, the management of finite resources such as land, the need to reduce the livestock contribution to greenhouse gases, and the need to manage the inherently complex, highly contextual, and repetitive day-to-day livestock management (LsM) routines are some examples of challenges to overcome in livestock production. The usefulness of the Internet of Things (IoT) in other vertical industries (OVI) shows that its role in LsM will be significant. This work uses the systematic review methodology of Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) to guide a review of the existing literature on IoT in OVI. The goal is to identify the IoT's ecosystem, architecture, and technicalities, including its present status, opportunities, and expected future trends, regarding its role in LsM. Among the identified IoT roles in LsM, the authors found that data will be its main contributor. The traditional approach of reactive data processing will give way to the proactive approach of augmented analytics to provide insights about animal processes. This will undoubtedly free LsM from the drudgery of repetitive tasks, with opportunities for improved productivity.

Article
NLA-Bit: A Basic Structure for Storing Big Data with Complexity O(1)
Big Data Cogn. Comput. 2021, 5(1), 8; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc5010008 - 24 Feb 2021
Abstract
This paper introduces a novel approach for storing Resource Description Framework (RDF) data based on the possibilities of Natural Language Addressing (NLA) and on a special NLA basic structure for storing Big Data, called the "NLA-bit", which aims to support middle-sized or large distributed RDF triple or quadruple stores with time complexity O(1). The main idea of NLA is to use letter codes as coordinates (addresses) for storing data. This avoids indexing and provides high-speed direct access to the data with time complexity O(1). An NLA-bit is a structured set of all RDF instances with the same "Subject". An example based on a document system is discussed, in which every document is stored as an NLA-bit containing all data connected to it by metadata links. NLA-bits open up a wide field for research and practical implementation in the field of large databases with dynamic semi-structured data (Big Data). Important advantages of the approach are as follows: (1) a reduction in the amount of occupied memory due to the complete absence of additional indexes, absolute addresses, pointers, and additional files; (2) a reduction in processing time due to the complete absence of search: the data are stored and retrieved at a direct address.
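The letters-as-coordinates idea can be sketched as follows. Nested dicts stand in here for the fixed-alphabet coordinate arrays the paper envisages, and the class and method names are illustrative, not from the paper; the point of the sketch is that lookup cost depends only on the length of the Subject, not on the number of stored triples, and no separate index structure exists:

```python
class NLABitStore:
    """Sketch of Natural Language Addressing: the letters of the RDF
    Subject serve as coordinates into nested tables, so no separate
    index is built or searched."""

    def __init__(self):
        self.root = {}

    def _cell(self, subject):
        node = self.root
        for ch in subject:          # each letter is one coordinate step
            node = node.setdefault(ch, {})
        return node

    def add(self, subject, predicate, obj):
        # "#triples" is a multi-character key, so it cannot collide
        # with a single-letter coordinate.
        self._cell(subject).setdefault("#triples", []).append((predicate, obj))

    def get(self, subject):
        node = self.root
        for ch in subject:
            if ch not in node:
                return []
            node = node[ch]
        return node.get("#triples", [])
```

All triples sharing a Subject land in the same cell, mirroring the NLA-bit as "a structured set of all RDF instances with the same Subject".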

Article
The Potential of the SP System in Machine Learning and Data Analysis for Image Processing
Big Data Cogn. Comput. 2021, 5(1), 7; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc5010007 - 23 Feb 2021
Abstract
This paper aims to describe how pattern recognition and scene analysis may with advantage be viewed from the perspective of the SP system (meaning the SP theory of intelligence and its realisation in the SP computer model (SPCM), both described in an appendix), and the strengths and potential of the system in those areas. In keeping with evidence for the importance of information compression (IC) in human learning, perception, and cognition, IC is central to the structure and workings of the SPCM. Most of that IC is achieved via the powerful concept of SP-multiple-alignment, which is largely responsible for the AI-related versatility of the system. With examples from the SPCM, the paper describes: how syntactic parsing and pattern recognition may be achieved, with corresponding potential for visual parsing and scene analysis; how those processes are robust in the face of errors in input data; how, in keeping with what people do, the SP system can "see" things in its data that are not objectively present; how the system can recognise things at multiple levels of abstraction, via part-whole hierarchies, and via an integration of the two; and how the system has potential for the creation of a 3D construct from pictures of a 3D object taken from different viewpoints, and for the recognition of 3D entities.

Article
Big Data and Personalisation for Non-Intrusive Smart Home Automation
Big Data Cogn. Comput. 2021, 5(1), 6; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc5010006 - 30 Jan 2021
Abstract
With the advent of the Internet of Things (IoT), many different smart home technologies are commercially available. However, the adoption of such technologies is slow, as many of them are not cost-effective and focus on specific functions such as energy efficiency. Recently, IoT devices and sensors have been designed to enhance the quality of personal life by being capable of generating continuous data streams that the user can monitor and draw inferences from. While smart home devices connect to the home Wi-Fi network, there are still compatibility issues between devices from different manufacturers. Smart devices get even smarter when they can communicate with and control each other. The information collected by one device can be shared with others to achieve an enhanced automation of their operations. This paper proposes a non-intrusive approach to integrating and collecting data from open-standard IoT devices for personalised smart home automation using big data analytics and machine learning. We demonstrate the implementation of our proposed novel technology instantiation approach for achieving non-intrusive IoT-based big data analytics with a use case of a smart home environment. We employ open-source frameworks such as Apache Spark, Apache NiFi and FB-Prophet, along with popular vendor tech-stacks such as Azure and DataBricks.
(This article belongs to the Special Issue Big Data and Cognitive Computing: Feature Papers 2020)

Article
An Exploratory Study of COVID-19 Information on Twitter in the Greater Region
Big Data Cogn. Comput. 2021, 5(1), 5; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc5010005 - 28 Jan 2021
Abstract
The outbreak of COVID-19 led to a burst of information on major online social networks (OSNs). Facing this constantly changing situation, OSNs have become an essential platform for people to express opinions and seek up-to-the-minute information. Thus, discussions on OSNs may become a reflection of reality. This paper aims to figure out how Twitter users in the Greater Region (GR) and related countries reacted differently over time by conducting a data-driven exploratory study of COVID-19 information using machine learning and representation learning methods. We find that tweet volume and COVID-19 cases in the GR and related countries are correlated, but this correlation exists only in a particular period of the pandemic. Moreover, we chart how topics changed in each country and region from 22 January 2020 to 5 June 2020, identifying the main differences between the GR and related countries.
(This article belongs to the Special Issue Big Data and Cognitive Computing: Feature Papers 2020)

Article
NLP-Based Customer Loyalty Improvement Recommender System (CLIRS2)
Big Data Cogn. Comput. 2021, 5(1), 4; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc5010004 - 19 Jan 2021
Abstract
Structured data on customer feedback are becoming more costly and time-consuming to collect and organize. On the other hand, unstructured opinionated data, e.g., in the form of free-text comments, are proliferating and available on public websites, such as social media sites, blogs, forums, and websites that provide recommendations. This research proposes a novel method to develop a knowledge-based recommender system from unstructured (text) data. The method applies an opinion mining algorithm, extracts an aspect-based sentiment score per text item, and thereby transforms text into a structured form. An action rule mining algorithm is then applied to the data table constructed from sentiment mining. The proposed application of the method is the problem of improving customer satisfaction ratings. Results obtained from a dataset of customer comments related to repair services were evaluated for accuracy and coverage. Further, the results were incorporated into a web-based, user-friendly recommender system that advises the business on how to maximally increase its profits by introducing minimal sets of changes to its service. Experiments and evaluation results comparing the structured-data-based version of the system, CLIRS (Customer Loyalty Improvement Recommender System), with the unstructured-data-based version (CLIRS2) are provided.
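The action-rule step, recommending attribute changes that move customers toward a target decision class, can be illustrated on a toy table. The attributes, values, and the naive most-common-value heuristic below are invented for this sketch; real action rule miners, including the one behind CLIRS, are considerably more elaborate:

```python
from collections import Counter

# Hypothetical structured table, as would be produced by aspect-based
# sentiment mining: one row per customer, flexible attributes plus a
# decision class.
rows = [
    {"price": "high", "staff": "rude",   "decision": "detractor"},
    {"price": "high", "staff": "polite", "decision": "promoter"},
    {"price": "low",  "staff": "rude",   "decision": "detractor"},
    {"price": "low",  "staff": "polite", "decision": "promoter"},
]

def action_rules(rows, flexible, target="promoter"):
    # Naive action-rule search: for each flexible attribute, propose the
    # value change that best separates target rows from the rest.
    rules = []
    for attr in flexible:
        good = Counter(r[attr] for r in rows if r["decision"] == target)
        bad = Counter(r[attr] for r in rows if r["decision"] != target)
        if good and bad:
            to_v, _ = good.most_common(1)[0]
            from_v, _ = bad.most_common(1)[0]
            if to_v != from_v:
                rules.append((attr, from_v, to_v))
    return rules
```

Here the search discovers that changing staff behaviour from "rude" to "polite" is the actionable lever, while price does not separate the classes at all.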
(This article belongs to the Special Issue Big Data and Cognitive Computing: Feature Papers 2020)

Editorial
Acknowledgment to Reviewers of Big Data and Cognitive Computing in 2020
Big Data Cogn. Comput. 2021, 5(1), 3; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc5010003 - 14 Jan 2021
Abstract
Rigorous peer review is the cornerstone of high-quality academic publishing [...]
Review
Forecasting Plant and Crop Disease: An Explorative Study on Current Algorithms
Big Data Cogn. Comput. 2021, 5(1), 2; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc5010002 - 12 Jan 2021
Abstract
Every year, plant diseases cause significant losses of valuable food crops around the world. The plant and crop disease management practices implemented to mitigate damage have changed considerably. Today, through the application of new information and communication technologies, it is possible to predict the onset of disease, or changes in its severity, using modern big data analysis techniques. In this paper, we present an analysis and classification of research studies conducted over the past decade that forecast the onset of disease at a pre-symptomatic stage (i.e., symptoms not visible to the naked eye) or at an early stage. We examine the specific approaches and methods adopted, the pre-processing techniques and data used, the performance metrics, and the expected results, highlighting the issues encountered. The results of the study reveal that this practice is still in its infancy and that many barriers need to be overcome.

Review
A Review of Local Outlier Factor Algorithms for Outlier Detection in Big Data Streams
Big Data Cogn. Comput. 2021, 5(1), 1; https://0-doi-org.brum.beds.ac.uk/10.3390/bdcc5010001 - 29 Dec 2020
Abstract
Outlier detection is a statistical procedure that aims to find suspicious events or items that are different from the normal form of a dataset. It has drawn considerable interest in the field of data mining and machine learning. Outlier detection is important in many applications, including fraud detection in credit card transactions and network intrusion detection. There are two general types of outlier detection: global and local. Global outliers fall outside the normal range for an entire dataset, whereas local outliers may fall within the normal range for the entire dataset, but outside the normal range for the surrounding data points. This paper addresses local outlier detection. The best-known technique for local outlier detection is the Local Outlier Factor (LOF), a density-based technique. There are many LOF algorithms for a static data environment; however, these algorithms cannot be applied directly to data streams, which are an important type of big data. In general, local outlier detection algorithms for data streams are still deficient and better algorithms need to be developed that can effectively analyze the high velocity of data streams to detect local outliers. This paper presents a literature review of local outlier detection algorithms in static and stream environments, with an emphasis on LOF algorithms. It collects and categorizes existing local outlier detection algorithms and analyzes their characteristics. Furthermore, the paper discusses the advantages and limitations of those algorithms and proposes several promising directions for developing improved local outlier detection methods for data streams.
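For orientation, the classic static LOF computation that the surveyed stream algorithms build on fits in a short function. This is a textbook sketch of the density-based technique (k-distance, reachability distance, local reachability density, and the LOF ratio), not one of the stream variants reviewed:

```python
import math

def lof_scores(points, k=2):
    # Minimal Local Outlier Factor for a small static dataset.
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # k nearest neighbours of each point (excluding the point itself)
    knn = [sorted(range(n), key=lambda j: dist[i][j])[1:k + 1]
           for i in range(n)]
    kdist = [dist[i][knn[i][-1]] for i in range(n)]   # k-distance
    def reach(i, j):                                  # reachability distance
        return max(kdist[j], dist[i][j])
    # local reachability density: inverse mean reachability to the k-NN
    lrd = [k / sum(reach(i, j) for j in knn[i]) for i in range(n)]
    # LOF: average lrd of the neighbours relative to the point's own lrd
    return [sum(lrd[j] for j in knn[i]) / (k * lrd[i]) for i in range(n)]
```

Points inside a cluster score close to 1, while a point that is sparse relative to its own neighbourhood scores well above 1, which is exactly the local (rather than global) notion of an outlier described above.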
