Data, Volume 6, Issue 3 (March 2021) – 12 articles

Cover Story: This work presents LeafLive-DB, a software platform that helps map and characterize species from the Brazilian plant biodiversity, offering the possibility of worldwide distribution. Developed by Brazilian and Peruvian researchers, this platform, which is available in its first version, features some functions for consulting and registering plant species and their taxonomy, among other information, through intuitive interfaces and an environment that promotes collaboration and data and research sharing.
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive table of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view the papers in PDF format, click the "PDF Full-text" link and use the free Adobe Reader to open them.
8 pages, 2359 KiB  
Data Descriptor
A Data Descriptor for Black Tea Fermentation Dataset
by Gibson Kimutai, Alexander Ngenzi, Rutabayiro Ngoga Said, Rose C. Ramkat and Anna Förster
Data 2021, 6(3), 34; https://0-doi-org.brum.beds.ac.uk/10.3390/data6030034 - 19 Mar 2021
Cited by 2 | Viewed by 3841
Abstract
Tea is currently the most popular beverage after water. Tea contributes to the livelihood of more than 10 million people globally. There are several categories of tea, but black tea is the most popular, accounting for about 78% of total tea consumption. Processing of black tea involves the following steps: plucking, withering, crushing, tearing and curling, fermentation, drying, sorting, and packaging. Fermentation is the most important step in determining the final quality of the processed tea. Fermentation is a time-bound process and it must take place under certain temperature and humidity conditions. During fermentation, tea color changes from green to coppery brown to signify the attainment of optimum fermentation levels. These parameters are currently manually monitored. At present, there is only one existing dataset on tea fermentation images. This study makes a tea fermentation dataset available, composed of tea fermentation conditions and tea fermentation images.
(This article belongs to the Special Issue Machine Learning in Image Analysis and Pattern Recognition)

10 pages, 4608 KiB  
Data Descriptor
Tools for Remote Exploration: A Lithium (Li) Dedicated Spectral Library of the Fregeneda–Almendra Aplite–Pegmatite Field
by Joana Cardoso-Fernandes, João Silva, Filipa Dias, Alexandre Lima, Ana C. Teodoro, Odile Barrès, Jean Cauzid, Mônica Perrotta, Encarnación Roda-Robles and Maria Anjos Ribeiro
Data 2021, 6(3), 33; https://0-doi-org.brum.beds.ac.uk/10.3390/data6030033 - 16 Mar 2021
Cited by 17 | Viewed by 4866
Abstract
The existence of diagnostic features in the visible and infrared regions makes it possible to use reflectance spectra not only to identify mineral assemblages but also for calibration and classification of satellite images, considering lithological and/or mineral mapping. For this purpose, a consistent spectral library with the target spectra of minerals and rocks is needed. Currently, there is strong market pressure for raw materials, including lithium (Li), which has driven new satellite image applications for Li exploration. However, there are no reference spectra for petalite (a Li mineral) in large, open spectral datasets. In this work, a spectral library was built exclusively dedicated to Li minerals and Li pegmatite exploration through satellite remote sensing. The database includes field and laboratory spectra collected in the Fregeneda–Almendra region (Spain–Portugal) from (i) distinct Li minerals (spodumene, petalite, lepidolite); (ii) several Li pegmatites and other outcropping lithologies to allow satellite-based lithological mapping; (iii) areas previously misclassified as Li pegmatites using machine learning algorithms to allow comparisons between these regions and the target areas. Ancillary data include (i) sample location and coordinates, (ii) sample conditions, (iii) sample color, (iv) type of face measured, (v) equipment used, and for the laboratory spectra, (vi) sample photographs, (vii) continuum-removed spectra files, and (viii) statistics on the main absorption features automatically extracted. The potential future uses of this spectral library are reinforced by its major advantages: (i) data are provided in a universal file format; (ii) it allows users to compare field and laboratory spectra; (iii) a large number of complementary data allow the comparison of shape, asymmetry, and depth of the absorption features of the distinct Li minerals.
(This article belongs to the Section Spatial Data Science and Digital Earth)

12 pages, 5869 KiB  
Data Descriptor
A High-Accuracy GNSS Dataset of Ground Truth Points Collected within Îles-de-Boucherville National Park, Quebec, Canada
by Kathryn Elmer and Margaret Kalacska
Data 2021, 6(3), 32; https://0-doi-org.brum.beds.ac.uk/10.3390/data6030032 - 14 Mar 2021
Cited by 2 | Viewed by 3433
Abstract
A new ground truth dataset generated with high-accuracy Global Navigation Satellite Systems (GNSS) positional data of the invasive reed Phragmites australis subsp. australis within Îles-de-Boucherville National Park (Quebec, Canada) is described. The park is one of five study sites for the Canadian Airborne Biodiversity Observatory (CABO) and has stands of invasive P. australis spread throughout the park. Previously, within the context of CABO, no ground truth data had been collected within the park consolidating the locations of P. australis. This dataset was collected to serve as training and validation data for CABO airborne hyperspectral imagery acquired in 2019 to assist with the detection and mapping of P. australis. The locations of the ground truth points were found to be accurate within one pixel of the hyperspectral imagery. Overall, 320 ground truth points were collected, representing 158 locations where P. australis was present and 162 locations where it was absent. Auxiliary data include field photographs and digitized field notes that provide context for each point.
(This article belongs to the Section Spatial Data Science and Digital Earth)

12 pages, 1017 KiB  
Data Descriptor
KazNewsDataset: Single Country Overall Digital Mass Media Publication Corpus
by Kirill Yakunin, Maksat Kalimoldayev, Ravil I. Mukhamediev, Rustam Mussabayev, Vladimir Barakhnin, Yan Kuchin, Sanzhar Murzakhmetov, Timur Buldybayev, Ulzhan Ospanova, Marina Yelis, Akylbek Zhumabayev, Viktors Gopejenko, Zhazirakhanym Meirambekkyzy and Alibek Abdurazakov
Data 2021, 6(3), 31; https://0-doi-org.brum.beds.ac.uk/10.3390/data6030031 - 14 Mar 2021
Cited by 5 | Viewed by 2676
Abstract
Mass media is one of the most important elements influencing the information environment of society. The mass media is not only a source of information about what is happening but is often the authority that shapes the information agenda and the boundaries and forms of discussion on socially relevant topics. A multifaceted and, where possible, quantitative assessment of mass media performance is crucial for understanding their objectivity, tone, thematic focus, and quality. The paper presents a corpus of Kazakhstan media, which contains over 4 million publications from 36 primary sources (each of which has at least 500 publications). The corpus also includes more than 2 million texts of Russian media for comparative analysis of publication activity of the countries, as well as about 4000 sections of state policy documents. The paper briefly describes the natural language processing and multiple-criteria decision-making methods, which are the algorithmic basis of the text and mass media evaluation method, and describes the results of several research cases, such as identification of propaganda, assessment of the tone of publications, calculation of the level of socially relevant negativity, and comparative analysis of publication activity in the field of renewable energy. Experiments confirm the general possibility of evaluating socially significant news, identifying texts with propagandistic content, and evaluating the sentiment of publications using the topic model of the text corpus, since area under the receiver operating characteristic curve (ROC AUC) values of 0.81, 0.73 and 0.93 were achieved on the abovementioned tasks. The described cases do not exhaust the possibilities of thematic, tonal, dynamic, etc., analysis of the considered corpus of texts. The corpus will be interesting to researchers considering both multiple publications and mass media analysis, including comparative analysis and identification of common patterns inherent in the media of different countries.

12 pages, 5849 KiB  
Data Descriptor
Dataset of the Optimization of a Low Power Chemoresistive Gas Sensor: Predictive Thermal Modelling and Mechanical Failure Analysis
by Andrea Gaiardo, David Novel, Elia Scattolo, Alessio Bucciarelli, Pierluigi Bellutti and Giancarlo Pepponi
Data 2021, 6(3), 30; https://0-doi-org.brum.beds.ac.uk/10.3390/data6030030 - 09 Mar 2021
Cited by 4 | Viewed by 2085
Abstract
Over the last few years, employment of the standard silicon microfabrication techniques for the gas sensor technology has allowed for the development of ever-smaller, low-cost, and low-power-consumption devices. Specifically, the development of silicon microheaters (MHs) has become well established to produce MOS gas sensors. Therefore, the development of predictive models that help to define a priori the optimal design and layout of the device has become crucial, in order to achieve both low power consumption and high mechanical stability. In this research dataset, we present the experimental data collected to develop a specific and useful predictive thermal-mechanical model for high-performing silicon MHs. To this aim, three MH layouts over three different membrane sizes were developed by using the standard silicon microfabrication process. Thermal and mechanical performances of the produced devices were experimentally evaluated by using probe stations and mechanical failure analysis, respectively. The measured thermal curves were used to develop the predictive thermal model towards low power consumption. Moreover, a statistical analysis was introduced to cross-correlate the mechanical failure results and the thermal predictive model, aiming at MH design optimization for gas sensing applications. All the data collected in this investigation are shown.

14 pages, 20876 KiB  
Article
LeafLive-DB: Classification and Data Storage of Botanical Studies
by Jorge Rodolfo Beingolea, Diego Ramos-Pires, Jorge Rendulich, Milagros Zegarra, Juan Borja-Murillo and Simone A. Siqueira da Fonseca
Data 2021, 6(3), 29; https://0-doi-org.brum.beds.ac.uk/10.3390/data6030029 - 09 Mar 2021
Viewed by 2131
Abstract
The development of studies, projects, and technologies that contribute to the understanding and preservation of plant biodiversity is becoming highly necessary, as well as tools and software platforms that enable the storage and classification of information resulting from studies on biodiversity. This work presents LeafLive-DB, a software platform that helps map and characterize species from the Brazilian plant biodiversity, offering the possibility of worldwide distribution. Developed by Brazilian and Peruvian researchers, this platform, which is available in its first version, features some functions for consulting and registering plant species and their taxonomy, among other information, through intuitive interfaces and an environment that promotes collaboration and data and research sharing. The platform innovates in data processing, functionality, and development architecture. It has ten thousand registers, and it should start to be distributed in partnership with schools and higher education institutions.

12 pages, 240 KiB  
Data Descriptor
Stark Width Data for Tb II, Tb III and Tb IV Spectral Lines
by Milan S. Dimitrijević
Data 2021, 6(3), 28; https://0-doi-org.brum.beds.ac.uk/10.3390/data6030028 - 08 Mar 2021
Viewed by 1677
Abstract
A dataset of Stark widths for Tb II, Tb III and Tb IV is presented. To the data obtained before, the results of new calculations for 62 Tb III lines from the 5d to 6pj(6,j)o transition array have been added. Calculations have been performed by using the simplified modified semiempirical method for temperatures from 5000 to 80,000 K for an electron density of 10¹⁷ cm⁻³. The results were also used to discuss the regularities within multiplets and a supermultiplet.
(This article belongs to the Special Issue Astronomy in the Big Data Era: Perspectives)
5 pages, 677 KiB  
Data Descriptor
Collection of Environmental Variables and Bacterial Community Compositions in Marian Cove, Antarctica, during Summer 2018
by Hyo-Ryeon Kim, Jae-Hyun Lim, Ju-Hyoung Kim and Il-Nam Kim
Data 2021, 6(3), 27; https://0-doi-org.brum.beds.ac.uk/10.3390/data6030027 - 05 Mar 2021
Viewed by 1862
Abstract
Marine bacteria, which are known as key drivers for marine biogeochemical cycles and Earth’s climate system, are mainly responsible for the decomposition of organic matter and production of climate-relevant gases (i.e., CO₂, N₂O, and CH₄). However, research is still required to fully understand the correlation between environmental variables and bacterial community composition. Marine bacteria living in Marian Cove, where the inflow of freshwater has been rapidly increasing due to substantial glacial retreat, must be undergoing significant environmental changes. During the summer of 2018, we conducted a hydrographic survey to collect environmental variables and bacterial community composition data at three different layers (i.e., the seawater surface, middle, and bottom layers) from 15 stations. Across all the bacterial data, 17 phyla and 21 classes were found; Proteobacteria accounted for 50.3% at the phylum level, followed by Bacteroidetes. Gammaproteobacteria and Alphaproteobacteria, which belong to Proteobacteria, made up the highest proportions at the class level. Gammaproteobacteria showed the highest relative abundance in all three seawater layers. The collection of environmental variables and bacterial composition data contributes to improving our understanding of the significant relationships between marine Antarctic regions and the marine bacteria that live in the Antarctic.

10 pages, 2068 KiB  
Data Descriptor
FIKWater: A Water Consumption Dataset from Three Restaurant Kitchens in Portugal
by Lucas Pereira, Vitor Aguiar and Fábio Vasconcelos
Data 2021, 6(3), 26; https://0-doi-org.brum.beds.ac.uk/10.3390/data6030026 - 02 Mar 2021
Cited by 9 | Viewed by 2845
Abstract
With the advent of the Internet of Things (IoT) and low-cost sensing technologies, the availability of data has reached levels never imagined before by the research community. However, independently of their size, data are only as valuable as the ability to have access to them. This paper presents the FIKWater dataset, which contains time series data for hot and cold water demand collected from three restaurant kitchens in Portugal for consecutive periods between two and four weeks. The measurements were taken using ultrasonic flow meters, at a sampling frequency of 0.2 Hz. Additionally, some details of the monitored spaces are also provided.

11 pages, 797 KiB  
Data Descriptor
FIKWaste: A Waste Generation Dataset from Three Restaurant Kitchens in Portugal
by Lucas Pereira, Vitor Aguiar and Fábio Vasconcelos
Data 2021, 6(3), 25; https://0-doi-org.brum.beds.ac.uk/10.3390/data6030025 - 26 Feb 2021
Cited by 3 | Viewed by 2477
Abstract
In the era of big data and artificial intelligence, public datasets are becoming increasingly important for researchers to build and evaluate their models. This paper presents the FIKWaste dataset, which contains time series data for the volume of waste produced in three restaurant kitchens in Portugal. Organic (undifferentiated) and inorganic (glass, paper, and plastic) waste bins were monitored for a consecutive period of four weeks. In addition to the time series measurements, the FIKWaste dataset contains labels for waste disposal events, i.e., when the waste bins are emptied, and technical and non-technical details of the monitored kitchens.

10 pages, 1364 KiB  
Data Descriptor
A Data Resource for Sulfuric Acid Reactivity of Organic Chemicals
by William Bains, Janusz Jurand Petkowski and Sara Seager
Data 2021, 6(3), 24; https://0-doi-org.brum.beds.ac.uk/10.3390/data6030024 - 25 Feb 2021
Cited by 6 | Viewed by 2757
Abstract
We describe a dataset of the quantitative reactivity of organic chemicals with concentrated sulfuric acid. As well as being a key industrial chemical, sulfuric acid is of environmental and planetary importance. In the absence of measured reaction kinetics, the reaction rate of a chemical with sulfuric acid can be estimated from the reaction rate of structurally related chemicals. To allow an approximate prediction, we have collected 589 sets of kinetic data on the reaction of organic chemicals with sulfuric acid from 262 literature sources and used a functional group-based approach to build a model of how the functional groups would react in any sulfuric acid concentration from 60% to 100%, and between −20 °C and 100 °C. The dataset provides the original reference data and kinetic measurements, parameters, intermediate computation steps, and a set of first-order rate constants for the functional groups across the range of conditions −20 °C to 100 °C and 60% to 100% sulfuric acid. The dataset will be useful for a range of studies in chemistry and atmospheric sciences where the reaction rate of a chemical with sulfuric acid is needed but has not been measured.
(This article belongs to the Section Chemoinformatics)

17 pages, 3876 KiB  
Article
Information System for Selection of Conditions and Equipment for Mammalian Cell Cultivation
by Natalia Menshutina, Elena Guseva, Diana Batyrgazieva and Igor Mitrofanov
Data 2021, 6(3), 23; https://0-doi-org.brum.beds.ac.uk/10.3390/data6030023 - 25 Feb 2021
Viewed by 2396
Abstract
Over the past few decades, animal cell culture technology has advanced significantly. It is now considered a reliable, functional, and relatively well-developed technology. At present, biotherapeutic drugs are synthesized using cell culture techniques by large manufacturing enterprises that produce products for commercial use and clinical research. The reliable implementation of mammalian cell culture technology requires the optimization of a number of variables, including the culture environment and bioreactor conditions, suitable cell lines, operating costs, efficient process management and, most importantly, quality. Successful implementation also requires an appropriate process development strategy, industrial scale, and characteristics, as well as the certification of sustainable procedures that meet the requirements of current regulations. All of this has led to a trend of increasing research in the field of biotechnology and, as a result, to a great accumulation of scientific information which, however, remains fragmentary and non-systematic. The development of information and network technologies allows us to solve this problem. Information system creation allows for implementation of the modern concept of integrating various structured and unstructured data, as well as the collection of information from internal and external sources. We propose and develop an information system which contains the conditions and various parameters of cultivation processes. The associated ranking system produces a set of recommendations, covering both technological and hardware solutions, that allow the optimal conditions for the cultivation of mammalian cells to be chosen at the stage of scientific research, thereby significantly reducing the time and cost of work. The proposed information system allows for the accumulation of experience regarding existing technologies for the cultivation of mammalian cells, along with application to the development of new technologies.
The main goal of the present work is to discuss information systems and the organizational support of scientific research in the field of mammalian cell cultivation, and to provide a detailed description of the developed system and its main modules, including the conceptual and logical scheme of the database.
(This article belongs to the Special Issue Data Quality and Data Access for Research)