Data, Volume 3, Issue 2 (June 2018) – 12 articles

  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive the tables of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view a paper in PDF format, click its "PDF Full-text" link and open the file with the free Adobe Reader.
10 pages, 1547 KiB  
Data Descriptor
Taguchi Orthogonal Array Dataset for the Effect of Water Chemistry on Aggregation of ZnO Nanoparticles
by Rizwan Khan, Muhammad Ali Inam, Du Ri Park, Saba Zam Zam and Ick Tae Yeom
Data 2018, 3(2), 21; https://0-doi-org.brum.beds.ac.uk/10.3390/data3020021 - 15 Jun 2018
Cited by 8 | Viewed by 4380
Abstract
The dynamic nature of engineered nanoparticle (ENP) aggregation behavior and kinetics is of paramount importance in the field of toxicological and environmental nanotechnology. The Taguchi orthogonal array (OA) L27(3^13) matrix, based on a fractional factorial design, was applied to systematically evaluate the contribution and significance of water chemistry parameters (pH, temperature, electrolyte, natural organic matter (NOM), content and type) and their interactions in the aggregation behavior of zinc oxide nanoparticles (ZnO NPs). The NPs were dispersed into the solution using a probe-sonicator cell crusher (Bio-safer, 1200-90, Nanjing, China). The data were obtained from UV–Vis spectroscopy (Optizen 2120 UV, Mecasys, Daejeon, Korea), Fourier transform infrared spectroscopy (FT-IR 4700, JASCO Analytical Instruments, Easton, Pennsylvania, USA) and particle electrophoresis (NanoZS, Zetasizer, Malvern Instruments Ltd., Worcestershire, UK). The dataset revealed that the Taguchi OA matrix is an efficient approach to study the main and interactive effects of environmental parameters on the aggregation of ZnO NPs. In addition, the aggregation profile of ZnO NPs was significantly influenced by divalent cations and NOM. The FT-IR data present a possible mechanism of ZnO NP stabilization in the presence of different NOM. These data may be helpful to predict the aggregation behavior of ZnO NPs in environmental and ecotoxicological contexts.
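For readers unfamiliar with the L27(3^13) notation (27 runs, 13 columns, 3 levels each), the following is a minimal sketch of how such an array can be built and checked. It is not the authors' procedure: it uses a standard finite-field construction, the function names are hypothetical, levels are coded 0, 1, 2, and the column order will not match published Taguchi tables.

```python
from itertools import product


def projective_points(q=3, dim=3):
    """One representative per 1-D subspace of GF(q)^dim (first nonzero entry is 1)."""
    points = []
    for v in product(range(q), repeat=dim):
        nonzero = [x for x in v if x != 0]
        if nonzero and nonzero[0] == 1:
            points.append(v)
    return points


def build_l27():
    """Build a 27-run, 13-column, 3-level orthogonal array of strength 2."""
    columns = projective_points()               # 13 column generators
    runs = list(product(range(3), repeat=3))    # 27 runs
    # Each entry is the dot product of the run vector and the column generator, mod 3.
    return [[sum(a * b for a, b in zip(run, col)) % 3 for col in columns]
            for run in runs]


def is_orthogonal(array, levels=3):
    """Check that every pair of columns contains each level pair equally often."""
    n_cols = len(array[0])
    expected = len(array) // (levels * levels)
    for i in range(n_cols):
        for j in range(i + 1, n_cols):
            counts = {}
            for row in array:
                key = (row[i], row[j])
                counts[key] = counts.get(key, 0) + 1
            if len(counts) != levels * levels or set(counts.values()) != {expected}:
                return False
    return True


oa = build_l27()
```

In a design like the one described, each factor or interaction would be assigned to a column, and each row would define one experimental run.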

20 pages, 4248 KiB  
Article
Interactive Data Framework and User Interface for Wisconsin’s Oversize-Overweight Vehicle Permits
by Ahmed S. Shatnawi, Nicholas Coley and Hani H. Titi
Data 2018, 3(2), 20; https://0-doi-org.brum.beds.ac.uk/10.3390/data3020020 - 15 Jun 2018
Cited by 2 | Viewed by 3672
Abstract
With continuing increases in the number of Oversize-Overweight (OSOW) vehicle permits issued in recent years, the management and analysis of OSOW permit data is becoming increasingly inefficient and time-consuming. Large quantities of archived OSOW permit data are held by Departments of Transportation (DOTs) across the United States, and manual extraction and analysis of this data requires significant effort. In this paper, the authors present a new framework for analyzing Wisconsin’s historic OSOW permit program data. This framework provides an interactive, web-based interface to query the OSOW permit data, link OSOW records to geospatial data features, and dynamically visualize query results. The web-based interface offers scalability and broad accessibility to the data across different DOT divisions and use cases. Furthermore, a user survey and heuristic evaluation of the interface demonstrate the project’s utility and identify goals for future system development.

21 pages, 3568 KiB  
Article
UAT ADS-B Data Anomalies and the Effect of Flight Parameters on Dropout Occurrences
by Asma Tabassum and William Semke
Data 2018, 3(2), 19; https://0-doi-org.brum.beds.ac.uk/10.3390/data3020019 - 08 Jun 2018
Cited by 10 | Viewed by 6564
Abstract
An analysis of the performance of automatic dependent surveillance-broadcast (ADS-B) data received from the Grand Forks, North Dakota International Airport was carried out in this study. The purpose was to understand the vulnerabilities of the universal access transceiver (UAT) ADS-B system and recognize the effects on present and future air traffic control (ATC) operation. The Federal Aviation Administration (FAA) mandated that all general aviation aircraft be equipped with ADS-B. Aircraft flying within the United States and below the transition altitude (18,000 feet) are more likely to install a UAT ADS-B. At present, unmanned aircraft systems (UAS) and autonomous ATC towers are being integrated into the aviation industry, and UAT ADS-B is a basic sensor for both class 1 and class 2 detect-and-avoid (DAA) systems. Because ADS-B is a fundamental component of future surveillance systems, its anomalies and vulnerabilities need to be identified to enable a fully utilized airspace with enhanced situational awareness. The received data were archived in GDL-90 format and parsed into readable data. The anomaly detection of ADS-B messages was based on the FAA ADS-B performance assessment report. The data investigation revealed that ADS-B messages suffered from different anomalies, including dropout, missing payload, data jump, low confidence data, and altitude discrepancy. Among those studied, the most severe was dropout: 32.49% of messages suffered from this anomaly. Dropout is an incident where ADS-B fails to update within a specified rate. Considering the potential danger, an in-depth analysis was carried out to characterize message dropout. Three flight parameters were selected to investigate their effect on dropout. Statistical analysis was carried out, and the Friedman statistical test identified that altitude affected dropout more than any other flight parameter.
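The dropout definition above (a message stream that fails to update within a specified rate) can be illustrated with a short sketch. This is an assumption-laden illustration, not the study's method: the function names, the made-up timestamps, and the 3.0 s threshold are all hypothetical, and the fraction here is computed over update intervals, which may differ from how the reported 32.49% was derived.

```python
def find_dropouts(timestamps, max_gap_s):
    """Return (start, end, gap) for each consecutive pair spaced more than max_gap_s apart."""
    ordered = sorted(timestamps)
    dropouts = []
    for prev, curr in zip(ordered, ordered[1:]):
        gap = curr - prev
        if gap > max_gap_s:
            dropouts.append((prev, curr, gap))
    return dropouts


def dropout_fraction(timestamps, max_gap_s):
    """Fraction of update intervals that count as dropouts."""
    intervals = max(len(timestamps) - 1, 0)
    if intervals == 0:
        return 0.0
    return len(find_dropouts(timestamps, max_gap_s)) / intervals


# Made-up message times in seconds; the 3.0 s threshold is an illustrative assumption.
example = [0.0, 1.0, 2.0, 6.5, 7.5, 8.5, 12.0]
events = find_dropouts(example, max_gap_s=3.0)
```

Each returned tuple marks the last message before a gap, the first message after it, and the gap length, which is the kind of record one would then relate to flight parameters such as altitude.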

16 pages, 1557 KiB  
Article
Improving the Efficiency of the ERS Data Analysis Techniques by Taking into Account the Neighborhood Descriptors
by Stanislav Yamashkin, Milan Radovanović, Anatoliy Yamashkin and Darko Vuković
Data 2018, 3(2), 18; https://doi.org/10.3390/data3020018 - 30 May 2018
Cited by 5 | Viewed by 3245
Abstract
Planning based on reliable information about the Earth’s surface is an important approach to minimize economic expenses conditioned by natural factors. Data collected by Earth remote sensing (ERS), as well as the analysis of such data using automated classification methods, are becoming more and more important for research and practice activities related to assessing the spatio-temporal structure and sustainability of the Earth’s surface. The analysis of the authenticity of the surrounding areas enables a more objective classification of land plots on the basis of spatial patterns. Combined use of various environmental descriptors enables high-quality handling of neighborhood properties, as each descriptor provides its own specific information about a geospatial system. Experiments have shown that diagnosing the emergent properties of such internal structure by analyzing the diversity of dynamic characteristics reduces exposure to noise, yields a generalized result, and improves classification accuracy.
(This article belongs to the Special Issue Data in Astrophysics & Geophysics: Research and Applications)

9 pages, 1511 KiB  
Data Descriptor
Benthic Macroinvertebrate Diversity in the Middle Doce River Basin, Brazil
by Gabriel Estevão Nogueira Aguila, Diego Guimarães Florencio Pujoni, Maria Margarida Marques, Liss Gato Cupertino Santos, Natália Murta de Lima Dornelas, Karine Andrade, Ivan Menezes Monteiro, Paulina Maria Maia-Barbosa and Francisco Antônio Rodrigues Barbosa
Data 2018, 3(2), 17; https://0-doi-org.brum.beds.ac.uk/10.3390/data3020017 - 22 May 2018
Cited by 3 | Viewed by 3512
Abstract
This resource contains a checklist of the benthic macroinvertebrate community sampled biannually from 1999 to 2010 in eight natural lakes from the middle Rio Doce Valley lake system and eight river segments in the Piracicaba River basin (sub-basin of Doce river), Minas Gerais State, Brazil. Three of the lakes are located inside a protected state park and are surrounded by preserved vegetation (Atlantic Forest). The other five lakes are on private properties, surrounded by Eucalyptus plantations. The seven stretches of rivers are subject to differing degrees of anthropogenic impact. Samples were collected with a kick net and fixed with formaldehyde solution. Four phyla were represented: Mollusca, Annelida, Arthropoda, and Platyhelminthes. For Insecta, 76 families were identified, one family was identified for Crustacea, and nine families were identified for Mollusca. This subproject belongs to the International Long-Term Ecological Research Project (ILTER; Programa de Pesquisas Ecológicas de Longa Duração, PELD) site 4.

10 pages, 3215 KiB  
Data Descriptor
Plant Trait Dataset for Tree-Like Growth Forms Species of the Subtropical Atlantic Rain Forest in Brazil
by Arthur Vinicius Rodrigues, Fábio Leal Viana Bones, Alisson Schneiders, Laio Zimermann Oliveira, Alexander Christian Vibrans and André Luís de Gasper
Data 2018, 3(2), 16; https://0-doi-org.brum.beds.ac.uk/10.3390/data3020016 - 08 May 2018
Cited by 5 | Viewed by 6708
Abstract
Plant functional traits have been incorporated in studies of vegetation ecology to better understand the mechanisms of ecological processes. For this reason, a global effort has been made to collect functional trait data for as many species as possible. In light of this, we identified the most common species of an area of 15,335 km² within the subtropical Atlantic Rain Forest in Southern Brazil. Then, we compiled functional trait information mostly from field samples, but also from herbarium and literature sources. The dataset presents traits of leaf, branch, maximum potential height, seed mass, and dispersion syndrome of 117 species, including trees, tree ferns, and palms. We also share images of anatomical features of branches used to measure wood traits. Data tables present mean trait values at individual and species level. Images of wood and stomatal features may be useful to assess other anatomical traits that were not covered in the data tables, for the anatomical determination of species, and/or for educational purposes.

10 pages, 1945 KiB  
Data Descriptor
Datasets for Aspect-Based Sentiment Analysis in Bangla and Its Baseline Evaluation
by Md. Atikur Rahman and Emon Kumar Dey
Data 2018, 3(2), 15; https://0-doi-org.brum.beds.ac.uk/10.3390/data3020015 - 04 May 2018
Cited by 63 | Viewed by 13802
Abstract
With the extensive growth of user interactions through prominent advances of the Web, sentiment analysis has received more attention from both academic and commercial points of view. Recently, sentiment analysis in the Bangla language has increasingly been considered an important task, for which previous approaches have attempted to detect the overall polarity of a Bangla document. To the best of our knowledge, there is no research on the aspect-based sentiment analysis (ABSA) of Bangla text. This can be attributed to the lack of available datasets for ABSA. In this paper, we provide two publicly available datasets to perform the ABSA task in Bangla. One of the datasets consists of human-annotated user comments on cricket, and the other dataset consists of customer reviews of restaurants. We also describe a baseline approach for the subtask of aspect category extraction to evaluate our datasets.

6 pages, 666 KiB  
Data Descriptor
RetroTransformDB: A Dataset of Generic Transforms for Retrosynthetic Analysis
by Svetlana Avramova, Nikolay Kochev and Plamen Angelov
Data 2018, 3(2), 14; https://0-doi-org.brum.beds.ac.uk/10.3390/data3020014 - 21 Apr 2018
Cited by 8 | Viewed by 7328
Abstract
Presently, software tools for retrosynthetic analysis are widely used by organic, medicinal, and computational chemists. Rule-based systems extensively use collections of retro-reactions (transforms). While there are many public datasets with reactions in the synthetic direction (usually non-generic reactions), there are no publicly available databases with generic reactions in a computer-readable format that can be used for the purposes of retrosynthetic analysis. Here we present RetroTransformDB, a dataset of transforms that we compiled and coded in SMIRKS line notation. The collection comprises more than 100 records, each including the reaction name, SMIRKS linear notation, the functional group to be obtained, and the transform type classification. All SMIRKS transforms were tested syntactically, semantically, and from a chemical point of view in different software platforms. The overall dataset design and the retrosynthetic fitness were analyzed and curated by organic chemistry experts. The RetroTransformDB dataset may be used by open-source and commercial software packages, as well as chemoinformatics tools.

15 pages, 2875 KiB  
Data Descriptor
Sigfox and LoRaWAN Datasets for Fingerprint Localization in Large Urban and Rural Areas
by Michiel Aernouts, Rafael Berkvens, Koen Van Vlaenderen and Maarten Weyn
Data 2018, 3(2), 13; https://0-doi-org.brum.beds.ac.uk/10.3390/data3020013 - 10 Apr 2018
Cited by 123 | Viewed by 12797
Abstract
Because of the increasing relevance of the Internet of Things and location-based services, researchers are evaluating wireless positioning techniques, such as fingerprinting, on Low Power Wide Area Network (LPWAN) communication. In order to evaluate fingerprinting in large outdoor environments, extensive, time-consuming measurement campaigns need to be conducted to create useful datasets. This paper presents three LPWAN datasets that were collected in large-scale urban and rural areas. The goal is to provide the research community with a tool to evaluate fingerprinting algorithms in large outdoor environments. Over a period of three months, numerous mobile devices periodically obtained location data via a GPS receiver, and these data were transmitted via a Sigfox or LoRaWAN message. Together with network information, this location data is stored in the appropriate LPWAN dataset. The first results of our basic fingerprinting implementation, which is also explained in this paper, indicate a mean location estimation error of 214.58 m for the rural Sigfox dataset, 688.97 m for the urban Sigfox dataset and 398.40 m for the urban LoRaWAN dataset. In the future, we will enlarge our current datasets and use them to evaluate and optimize our fingerprinting methods. We also intend to collect additional datasets for Sigfox, LoRaWAN and NB-IoT.
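To make the idea of fingerprinting and a "mean location estimation error" concrete, here is a minimal sketch assuming a k-nearest-neighbour matcher over RSSI vectors and great-circle error in metres. The abstract does not specify the authors' implementation, so the matcher, the function names, and all signal values and coordinates below are hypothetical.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def knn_estimate(train, rssi, k=1):
    """Average the positions of the k training fingerprints closest in RSSI space."""
    ranked = sorted(train, key=lambda rec: math.dist(rec[0], rssi))
    nearest = ranked[:k]
    lat = sum(rec[1][0] for rec in nearest) / len(nearest)
    lon = sum(rec[1][1] for rec in nearest) / len(nearest)
    return lat, lon


def mean_error_m(train, test, k=1):
    """Mean distance in metres between estimated and true test positions."""
    errors = [haversine_m(*knn_estimate(train, rssi, k), *truth) for rssi, truth in test]
    return sum(errors) / len(errors)


# Made-up fingerprints: (RSSI per gateway in dBm, (latitude, longitude)).
train = [
    ((-110.0, -95.0, -120.0), (51.20, 4.40)),
    ((-90.0, -115.0, -100.0), (51.22, 4.42)),
    ((-100.0, -100.0, -105.0), (51.21, 4.41)),
]
test = [((-108.0, -96.0, -119.0), (51.20, 4.40))]
error = mean_error_m(train, test, k=1)
```

With real data, each message's per-gateway signal strengths form the fingerprint and the GPS fix supplies the ground truth used to score the estimate.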

12 pages, 6220 KiB  
Article
Comparison between Simulation and Analytical Methods in Reliability Data Analysis: A Case Study on Face Drilling Rigs
by Seyed Hadi Hoseinie, Hussan Al-Chalabi and Behzad Ghodrati
Data 2018, 3(2), 12; https://0-doi-org.brum.beds.ac.uk/10.3390/data3020012 - 10 Apr 2018
Cited by 9 | Viewed by 5269
Abstract
Collecting failure data and performing reliability analysis in an underground mining operation is challenging due to the harsh environment and high level of production pressure. Therefore, achieving an accurate, fast, and applicable analysis in a fleet of underground equipment is usually difficult and time-consuming. This paper discusses the main reliability analysis challenges in mining machinery by comparing three main approaches: two analytical methods (white-box and black-box modeling) and a simulation approach. For this purpose, maintenance data from a fleet of face drilling rigs in a Swedish underground metal mine were extracted from the MAXIMO system over a period of two years and used for analysis. The investigations reveal that these approaches perform differently in ranking the studied machines and in estimating their reliability. All of the methods provide similar outputs but, in general, the simulation estimates the reliability of the studied machines at a higher level. The simulation and white-box methods sometimes provide exactly the same results, owing to their similar structure of analysis. On average, 9% of the data are lost in the white-box analysis due to a lack of sufficient data in some of the subsystems of the studied rigs.

13 pages, 810 KiB  
Data Descriptor
SIMADL: Simulated Activities of Daily Living Dataset
by Talal Alshammari, Nasser Alshammari, Mohamed Sedky and Chris Howard
Data 2018, 3(2), 11; https://0-doi-org.brum.beds.ac.uk/10.3390/data3020011 - 01 Apr 2018
Cited by 26 | Viewed by 8890
Abstract
With the realisation of the Internet of Things (IoT) paradigm, the analysis of the Activities of Daily Living (ADLs) in a smart home environment is becoming an active research domain. The existence of representative datasets is a key requirement to advance the research in smart home design. Such datasets are an integral part of the visualisation of new smart home concepts as well as the validation and evaluation of emerging machine learning models. Machine learning techniques that can learn ADLs from sensor readings are used to classify, predict and detect anomalous patterns. Such techniques require data that represent relevant smart home scenarios for training, testing and validation. However, the development of such machine learning techniques is limited by the lack of real smart home datasets, due to the excessive cost of building real smart homes. This paper provides two datasets for classification and anomaly detection. The datasets are generated using OpenSHS (Open Smart Home Simulator), a simulation tool for dataset generation. OpenSHS records the daily activities of a participant within a virtual environment. Seven participants simulated their ADLs for different contexts, e.g., weekdays, weekends, mornings and evenings. Eighty-four files in total were generated, representing approximately 63 days' worth of activities. Forty-two files of ADLs were simulated for the classification dataset, and the other forty-two files are for anomaly detection problems, in which anomalous patterns were simulated and injected into the anomaly detection dataset.

17 pages, 1052 KiB  
Article
Associative Root–Pattern Data and Distribution in Arabic Morphology
by Bassam Haddad, Ahmad Awwad, Mamoun Hattab and Ammar Hattab
Data 2018, 3(2), 10; https://0-doi-org.brum.beds.ac.uk/10.3390/data3020010 - 29 Mar 2018
Cited by 6 | Viewed by 5038
Abstract
This paper presents a large-scale dataset for Arabic morphology from a cognitive point of view, considering the uniqueness of the root–pattern phenomenon. The focus is on studying this singularity in terms of estimating associative relationships between roots, as a higher level of abstraction for word meaning, and all their potential occurrences with multiple morpho-phonetic patterns. A major advantage of this approach resides in providing a novel balanced large-scale language resource, which can be viewed as an instantiated global root–pattern network consisting of roots, patterns, stems, and particles, estimated statistically for studying the morpho-phonetic level of cognition of Arabic. In this context, this paper asserts that balanced root distribution is an additional significant key criterion for evaluating topic coverage in an Arabic corpus. Furthermore, some additional novel probabilistic morpho-phonetic measures and their distribution have been estimated in the form of root and pattern entropies, besides bi-directional conditional probabilities of bi-grams of stems, roots, and particles. Around 29.2 million webpages of ClueWeb were extracted, filtered to remove non-Arabic text, and converted into a large textual dataset containing around 11.5 billion word forms and 9.3 million associative relationships. As this dataset predominantly considers the root–pattern phenomenon in Semitic languages, the acquired data might provide significant support for researchers interested in studying phenomena of Arabic such as visual word cognition, morpho-phonetic perception, morphological analysis, and cognitively motivated query expansion, spell-checking, and information retrieval. Furthermore, based on data distribution and frequencies, constructing balanced corpora will be easier.
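As an illustration of what an entropy or conditional probability over root–pattern pairs can look like, the sketch below computes, for each root, the Shannon entropy of the patterns it occurs with, and a relative-frequency estimate of P(pattern | root). The abstract does not give the paper's exact definitions, so these formulas are assumed textbook versions, and the function names and placeholder observations are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy_bits(counts):
    """Shannon entropy in bits of the distribution implied by a mapping of counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c > 0)


def pattern_entropy_per_root(pairs):
    """Map each root to the entropy of the patterns it occurs with."""
    by_root = defaultdict(Counter)
    for root, pattern in pairs:
        by_root[root][pattern] += 1
    return {root: entropy_bits(patterns) for root, patterns in by_root.items()}


def conditional_probability(pairs, root, pattern):
    """Estimate P(pattern | root) by relative frequency."""
    root_total = sum(1 for r, _ in pairs if r == root)
    joint = sum(1 for r, p in pairs if r == root and p == pattern)
    return joint / root_total if root_total else 0.0


# Placeholder observations, not real corpus data: (root, pattern) per word form.
pairs = [
    ("root_A", "pattern_1"), ("root_A", "pattern_2"),
    ("root_A", "pattern_1"), ("root_A", "pattern_2"),
    ("root_B", "pattern_1"), ("root_B", "pattern_1"),
]
entropies = pattern_entropy_per_root(pairs)
```

A root that combines evenly with many patterns gets a high entropy, while a root seen with a single pattern gets zero, which is one way to summarize how productive a root is in a corpus.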