Information, Volume 10, Issue 9 (September 2019) – 24 articles

Cover Story: A polysemous term has many potential translation equivalents in a target language. The translation could lose its meaning if the term translation and domain knowledge are not taken into account. The evaluation of terminology translation has been one of the least-explored areas in machine translation (MT) research. To the best of our knowledge, as of now, no one has proposed any effective way to evaluate terminology translation in MT automatically. This work presents a semi-automatic terminology annotation strategy from which a gold standard for evaluating terminology translation in automatic translation can be created. The paper also introduces a classification framework that can automatically classify term translation-related errors and expose specific problems in relation to terminology translation in MT.
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive the tables of contents of newly released issues.
  • PDF is the official format for papers, which are published in both HTML and PDF forms. To view a paper in PDF format, click the "PDF Full-text" link and open it with the free Adobe Reader.
17 pages, 3598 KiB  
Article
Clustering Algorithms and Validation Indices for a Wide mmWave Spectrum
by Bogdan Antonescu, Miead Tehrani Moayyed and Stefano Basagni
Information 2019, 10(9), 287; https://0-doi-org.brum.beds.ac.uk/10.3390/info10090287 - 19 Sep 2019
Cited by 4 | Viewed by 2494
Abstract
Radio channel propagation models for the millimeter wave (mmWave) spectrum are extremely important for planning future 5G wireless communication systems. Transmitted radio signals are received as clusters of multipath rays. Identifying these clusters provides better spatial and temporal characteristics of the mmWave channel. This paper deals with the clustering process and its validation across a wide range of frequencies in the mmWave spectrum below 100 GHz. By way of simulations, we show that in outdoor communication scenarios, clustering of received rays is influenced by the frequency of the transmitted signal. This demonstrates the sparse characteristic of the mmWave spectrum (i.e., fewer rays reach the receiver for the same urban scenario). We use the well-known k-means clustering algorithm to group arriving rays at the receiver. The accuracy of this partitioning is studied with both cluster validity indices (CVIs) and score fusion techniques. Finally, we analyze how the clustering solution changes with narrower-beam antennas, and we provide a comparison of the cluster characteristics for different types of antennas.
(This article belongs to the Special Issue Emerging Topics in Wireless Communications for Future Smart Cities)
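As an illustration of the clustering-and-validation pipeline the abstract describes, the sketch below groups synthetic multipath rays with k-means and rates candidate partitions with a cluster validity index (the silhouette score). The data and parameter choices are invented for the example and are not taken from the paper's simulations.

```python
# Illustrative sketch (not the authors' code): cluster synthetic multipath
# rays by delay and azimuth with k-means, then rate each partition with a
# cluster validity index (silhouette score) to pick the number of clusters.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Synthetic rays: columns are [delay (ns), azimuth (deg)], three true clusters.
rays = np.vstack([rng.normal(loc=c, scale=[2.0, 3.0], size=(40, 2))
                  for c in ([20, -60], [45, 10], [70, 75])])

best_k, best_score = None, -1.0
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(rays)
    score = silhouette_score(rays, labels)  # higher is better
    if score > best_score:
        best_k, best_score = k, score
print(f"best k = {best_k}, silhouette = {best_score:.3f}")
```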
26 pages, 6453 KiB  
Article
Copy-Move Forgery Detection and Localization Using a Generative Adversarial Network and Convolutional Neural-Network
by Younis Abdalla, M. Tariq Iqbal and Mohamed Shehata
Information 2019, 10(9), 286; https://0-doi-org.brum.beds.ac.uk/10.3390/info10090286 - 16 Sep 2019
Cited by 28 | Viewed by 6393
Abstract
The problem of forged images has become a global phenomenon that is spreading mainly through social media. New technologies have provided both the means and the support for this phenomenon, but they are also enabling a targeted response to overcome it. Deep convolution learning algorithms are one such solution. These have been shown to be highly effective in dealing with image forgery derived from generative adversarial networks (GANs). In this type of algorithm, the image is altered such that it appears identical to the original image and is nearly undetectable to the unaided human eye as a forgery. The present paper investigates copy-move forgery detection using a fusion processing model comprising a deep convolutional model and an adversarial model. Four datasets are used. Our results indicate a significantly high detection accuracy (~95%) exhibited by the deep learning CNN and discriminator forgery detectors. Consequently, an end-to-end trainable deep neural network approach to forgery detection appears to be the optimal strategy. The network is developed based on a two-branch architecture and a fusion module. The two branches are used to localize and identify copy-move forgery regions through CNN and GAN.
(This article belongs to the Section Information and Communications Technology)
16 pages, 416 KiB  
Article
Low-Cost, Low-Power FPGA Implementation of ED25519 and CURVE25519 Point Multiplication
by Mohamad Ali Mehrabi and Christophe Doche
Information 2019, 10(9), 285; https://0-doi-org.brum.beds.ac.uk/10.3390/info10090285 - 14 Sep 2019
Cited by 18 | Viewed by 4976
Abstract
Twisted Edwards curves have been at the center of attention since their introduction by Bernstein et al. in 2007. The curve ED25519, used for the Edwards-curve Digital Signature Algorithm (EdDSA), provides faster digital signatures than existing schemes without sacrificing security. CURVE25519 is a Montgomery curve that is closely related to ED25519. It provides simple, constant-time, and fast point multiplication, which is used by the key exchange protocol X25519. Software implementations of EdDSA and X25519 are used in many web-based PC and mobile applications. In this paper, we introduce a low-power, low-area FPGA implementation of ED25519 and CURVE25519 scalar multiplication that is particularly relevant for Internet of Things (IoT) applications. The efficiency of the arithmetic modulo the prime number 2^255 − 19, in particular the modular reduction and modular multiplication, is key to the efficiency of both EdDSA and X25519. To reduce the complexity of the hardware implementation, we propose a high-radix interleaved modular multiplication algorithm. One benefit of this architecture is to avoid the use of large-integer multipliers relying on FPGA DSP modules.
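The special form of the prime makes reduction cheap: since 2^255 ≡ 19 (mod 2^255 − 19), the high bits of a product can be folded back in after scaling by 19. The minimal Python sketch below shows only this arithmetic identity; the paper's contribution is a high-radix interleaved hardware multiplier, which this sketch does not model.

```python
# Minimal sketch of reduction modulo p = 2^255 - 19, using the identity
# 2^255 ≡ 19 (mod p): fold the high bits back in, scaled by 19.
P = 2**255 - 19

def reduce_p(x: int) -> int:
    # Repeatedly replace x = hi * 2^255 + lo with 19 * hi + lo.
    while x >= 2**255:
        hi, lo = x >> 255, x & (2**255 - 1)
        x = 19 * hi + lo
    return x - P if x >= P else x

def mulmod(a: int, b: int) -> int:
    return reduce_p(a * b)

assert mulmod(P - 1, P - 1) == pow(P - 1, 2, P)  # sanity check
```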
8 pages, 511 KiB  
Article
Another Step in the Ladder of DNS-Based Covert Channels: Hiding Ill-Disposed Information in DNSKEY RRs
by Marios Anagnostopoulos and John André Seem
Information 2019, 10(9), 284; https://0-doi-org.brum.beds.ac.uk/10.3390/info10090284 - 12 Sep 2019
Cited by 1 | Viewed by 2914
Abstract
Covert channel communications are of vital importance for the ill-motivated purposes of cyber-crooks. Through these channels, they are capable of communicating in a stealthy way, unnoticed by the defenders and bypassing the security mechanisms of protected networks. Covert channels facilitate the hidden distribution of data to internal agents. For instance, a stealthy covert channel could be beneficial for a botmaster that desires to send commands to their bot army, or for exfiltrating corporate and sensitive private data from the internal network of an organization. During the evolution of the Internet, a plethora of network protocols has been exploited as covert channels. The DNS protocol, however, has a prominent position in this exploitation race, as it is one of the few protocols that is rarely restricted by security policies or filtered by firewalls, and it thus fulfills a covert channel's requirements perfectly. Therefore, there are more than a few cases where the DNS protocol and infrastructure have been exploited in well-known security incidents. In this context, the work at hand investigates the feasibility of exploiting the DNS Security Extensions (DNSSEC) as a covert channel. We demonstrate that it is beneficial and quite straightforward to embed arbitrary data of an aggressor's choice within the DNSKEY resource record, which normally provides the public key of a DNSSEC-enabled domain zone. Since DNSKEY contains the public key encoded in base64 format, it can easily be exploited for the dissemination of an encrypted or stego message, or even for the distribution of a malware's binary encoded as a base64 string. To this end, we implement a proof of concept based on two prominent nameserver software packages, namely BIND and NSD, and we publish in the DNS hierarchy custom data of our choice concealed as the public key of the DNS zone under our jurisdiction, in order to demonstrate the effectiveness of the proposed covert channel.
(This article belongs to the Special Issue Botnets)
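The hiding step the authors describe amounts to placing arbitrary base64 data where a resolver expects a public key. The sketch below builds such a DNSKEY record in zone-file syntax; the owner name and payload are hypothetical, and the field values (flags 256, protocol 3, algorithm 8) follow RFC 4034 conventions rather than the paper's exact proof of concept.

```python
# Sketch of the hiding step: arbitrary bytes dressed up as the base64 key
# field of a DNSKEY resource record (RFC 4034 zone-file syntax). The domain
# and payload are hypothetical; a resolver would treat this as a public key.
import base64

def covert_dnskey(owner: str, payload: bytes) -> str:
    key_b64 = base64.b64encode(payload).decode()
    # flags=256 (zone key), protocol=3, algorithm=8 (RSA/SHA-256)
    return f"{owner} 3600 IN DNSKEY 256 3 8 {key_b64}"

print(covert_dnskey("example.org.", b"attacker command: beacon every 60s"))
```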
38 pages, 16344 KiB  
Article
Modelling and Resolution of Dynamic Reliability Problems by the Coupling of Simulink and the Stochastic Hybrid Fault Tree Object Oriented (SHyFTOO) Library
by Ferdinando Chiacchio, Jose Ignacio Aizpurua, Lucio Compagno, Soheyl Moheb Khodayee and Diego D’Urso
Information 2019, 10(9), 283; https://0-doi-org.brum.beds.ac.uk/10.3390/info10090283 - 11 Sep 2019
Cited by 17 | Viewed by 4626
Abstract
Dependability assessment is one of the most important activities in the analysis of complex systems. Classical analysis techniques of safety, risk, and dependability, like Fault Tree Analysis or Reliability Block Diagrams, are easy to implement, but they produce inaccurate dependability estimates due to their simplifying hypotheses, which assume the components' malfunctions to be independent of each other and of the system working conditions. Recent contributions within the umbrella of Dynamic Probabilistic Risk Assessment have shown the potential to improve the accuracy of classical dependability analysis methods. Among them, the Stochastic Hybrid Fault Tree Automaton (SHyFTA) is a promising methodology because it can combine a Dynamic Fault Tree model with the physics-based deterministic model of a system process, and it can generate dependability metrics along with performance indicators of the physical variables. This paper presents the Stochastic Hybrid Fault Tree Object Oriented (SHyFTOO) library, a Matlab® software library for the modelling and resolution of a SHyFTA model. One of the novel features discussed in this contribution is the ease of coupling with a Matlab® Simulink model, which facilitates the design of complex system dynamics. To demonstrate the utilization of this software library and the augmented capability of generating further dependability indicators, three different case studies are discussed and solved, with a thorough description of the implementation of the corresponding SHyFTA models.
15 pages, 4519 KiB  
Article
A Novel Approach to Component Assembly Inspection Based on Mask R-CNN and Support Vector Machines
by Haisong Huang, Zhongyu Wei and Liguo Yao
Information 2019, 10(9), 282; https://0-doi-org.brum.beds.ac.uk/10.3390/info10090282 - 11 Sep 2019
Cited by 15 | Viewed by 3081
Abstract
Assembly is a very important manufacturing process in the age of Industry 4.0. To address the problems of part identification and assembly inspection in industrial production, this paper proposes an assembly inspection method based on machine vision and a deep neural network. First, an image acquisition platform is built to collect part and assembly images. We use the Mask R-CNN model to identify and segment the shape from each part image and to obtain the part category and position coordinates in the image. Then, according to the image segmentation results, the area, perimeter, circularity, and Hu invariant moments of the contour are extracted to form the feature vector. Finally, an SVM classification model is constructed to identify the assembly defects, with a classification accuracy rate of over 86.5%. The accuracy of the method is verified by constructing an experimental platform. The results show that the method effectively identifies missing and misaligned parts in the assembly, and has good robustness.
(This article belongs to the Special Issue IoT Applications and Industry 4.0)
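A sketch of the second stage under stated assumptions: given a binary part mask (which the paper obtains from Mask R-CNN segmentation), contour features are computed with OpenCV and fed to a support vector machine. The toy masks and labels below are invented for illustration.

```python
# Sketch of the feature/classification stage (assumes a binary part mask
# already produced by the segmentation step): contour area, perimeter,
# circularity, and Hu invariant moments feed a support vector machine.
import cv2
import numpy as np
from sklearn.svm import SVC

def contour_features(mask: np.ndarray) -> np.ndarray:
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    c = max(contours, key=cv2.contourArea)          # largest contour in the mask
    area = cv2.contourArea(c)
    perimeter = cv2.arcLength(c, True)
    circularity = 4 * np.pi * area / perimeter**2 if perimeter > 0 else 0.0
    hu = cv2.HuMoments(cv2.moments(c)).flatten()    # 7 Hu invariant moments
    return np.concatenate(([area, perimeter, circularity], hu))

# Toy example: a disc vs. a square stand in for "correct" / "defective" parts.
disc, square = np.zeros((64, 64), np.uint8), np.zeros((64, 64), np.uint8)
cv2.circle(disc, (32, 32), 20, 255, -1)
cv2.rectangle(square, (12, 12), (52, 52), 255, -1)
X = np.array([contour_features(m) for m in (disc, square)])
clf = SVC(kernel="rbf", gamma="scale").fit(X, [0, 1])
print(clf.predict(X))
```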
18 pages, 1492 KiB  
Article
Factors Influencing Online Hotel Booking: Extending UTAUT2 with Age, Gender, and Experience as Moderators
by Chia-Ming Chang, Li-Wei Liu, Hsiu-Chin Huang and Huey-Hong Hsieh
Information 2019, 10(9), 281; https://0-doi-org.brum.beds.ac.uk/10.3390/info10090281 - 09 Sep 2019
Cited by 49 | Viewed by 17164
Abstract
As people feel more comfortable using the Internet, online hotel booking has become popular in recent years. Understanding the drivers of online booking intention and behavior can help hotel managers apply corresponding strategies to increase hotel booking rates. Thus, the purpose of this study is to investigate the factors influencing the use intention and behavioral intention of online hotel booking. The proposed model assimilates factors from the extended Unified Theory of Acceptance and Use of Technology (UTAUT2), along with age, gender, and experience as moderators. Data were collected through a field survey questionnaire completed by 488 participants. The results showed that behavioral intention is significantly and positively influenced by performance expectancy, social influence, facilitating condition, hedonic motivation, price value, and habit behavior. Use behavior is positively influenced by facilitating condition and hedonic motivation. As for the moderators, gender moderates the relationships between performance expectancy, social influence, and behavioral intention. Age moderates the relationships between effort expectancy, social influence, hedonic motivation, and behavioral intention. Experience moderates the relationships between social influence, price value, and behavioral intention, and between habit behavior and use behavior. Based on the results, recommendations for hotel managers are proposed. Furthermore, research limitations and future directions are discussed.
22 pages, 10442 KiB  
Article
Constructing and Visualizing High-Quality Classifier Decision Boundary Maps
by Francisco C. M. Rodrigues, Mateus Espadoto, Roberto Hirata, Jr. and Alexandru C. Telea
Information 2019, 10(9), 280; https://0-doi-org.brum.beds.ac.uk/10.3390/info10090280 - 09 Sep 2019
Cited by 21 | Viewed by 4558
Abstract
Visualizing decision boundaries of machine learning classifiers can help in classifier design, testing and fine-tuning. Decision maps are visualization techniques that overcome the key sparsity-related limitation of scatterplots for this task. To increase the trustworthiness of decision map use, we perform an extensive evaluation considering the dimensionality-reduction (DR) projection techniques underlying decision map construction. We extend the visual accuracy of decision maps by proposing additional techniques to suppress errors caused by projection distortions. Additionally, we propose ways to estimate and visually encode the distance-to-decision-boundary in decision maps, thereby enriching the conveyed information. We demonstrate our improvements and the insights that decision maps convey on several real-world datasets.
(This article belongs to the Special Issue Information Visualization Theory and Applications (IVAPP 2019))
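A simplified sketch of the decision-map construction the paper evaluates: project the data to 2D with a DR technique, rasterize the projected plane, map each pixel back to data space with an inverse projection, and color it by the classifier's prediction there. PCA and a k-NN classifier stand in for the projection and classifier choices studied in the paper.

```python
# Simplified decision-map sketch: project to 2D (PCA stands in for the DR
# techniques evaluated in the paper), rasterize the plane, send each pixel
# back to data space with the inverse projection, and color it by the
# classifier's prediction there.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)
clf = KNeighborsClassifier().fit(X, y)
pca = PCA(n_components=2).fit(X)
p2 = pca.transform(X)

# 100x100 pixel grid spanning the projected data.
xs = np.linspace(p2[:, 0].min(), p2[:, 0].max(), 100)
ys = np.linspace(p2[:, 1].min(), p2[:, 1].max(), 100)
grid = np.array([[x, y_] for y_ in ys for x in xs])
decision_map = clf.predict(pca.inverse_transform(grid)).reshape(100, 100)
print(decision_map.shape, np.unique(decision_map))
```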
3 pages, 149 KiB  
Editorial
Editorial for the Special Issue on “Natural Language Processing and Text Mining”
by Pablo Gamallo and Marcos Garcia
Information 2019, 10(9), 279; https://0-doi-org.brum.beds.ac.uk/10.3390/info10090279 - 06 Sep 2019
Cited by 1 | Viewed by 2382
Abstract
Natural language processing (NLP) and Text Mining (TM) are a set of overlapping strategies working on unstructured text [...]
(This article belongs to the Special Issue Natural Language Processing and Text Mining)
15 pages, 925 KiB  
Article
An Efficient Dummy-Based Location Privacy-Preserving Scheme for Internet of Things Services
by Yongwen Du, Gang Cai, Xuejun Zhang, Ting Liu and Jinghua Jiang
Information 2019, 10(9), 278; https://0-doi-org.brum.beds.ac.uk/10.3390/info10090278 - 05 Sep 2019
Cited by 9 | Viewed by 3677
Abstract
With the rapid development of GPS-equipped smart mobile devices and mobile computing, location-based services (LBS) are increasing in popularity in the Internet of Things (IoT). Although LBS provide enormous benefits to users, they inevitably introduce some significant privacy concerns. To protect user privacy, a variety of location privacy-preserving schemes have recently been proposed. Among these schemes, the dummy-based location privacy-preserving (DLP) scheme is a widely used approach to achieve location privacy for mobile users. However, the computation cost of existing dummy-based location privacy-preserving schemes is too high to meet the practical requirements of resource-constrained IoT devices. Moreover, the DLP scheme is inadequate to resist an adversary with side information. Thus, how to effectively select a dummy location is still a challenge. In this paper, we propose a novel lightweight dummy-based location privacy-preserving scheme, named the enhanced dummy-based location privacy-preserving (Enhanced-DLP) scheme, to address this challenge by considering both computational costs and side information. Specifically, the Enhanced-DLP adopts an improved greedy scheme to efficiently select dummy locations to form a k-anonymous set. A thorough security analysis demonstrated that our proposed Enhanced-DLP can protect user privacy against attacks. We performed a series of experiments to verify the effectiveness of our Enhanced-DLP. Compared with the existing scheme, the Enhanced-DLP obtains lower computational costs for the selection of a dummy location and can resist side-information attacks. The experimental results illustrate that the Enhanced-DLP scheme can effectively be applied to protect the user's location privacy in IoT applications and services.
(This article belongs to the Special Issue The End of Privacy?)
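A toy sketch of the general dummy-selection idea (not the exact Enhanced-DLP algorithm): greedily pick k-1 dummy cells whose historical query probabilities resemble the real location, to resist side information, while keeping the chosen set spatially spread out. All names, weights, and data below are illustrative.

```python
# Toy sketch of greedy dummy selection (the general DLP idea, not the exact
# Enhanced-DLP algorithm): pick k-1 dummies whose historical query
# probabilities match the real cell (resisting side information) while
# maximizing spread, yielding a k-anonymous location set.
import math, random

def pick_dummies(real, cells, query_prob, k):
    chosen = [real]
    candidates = [c for c in cells if c != real]
    while len(chosen) < k:
        def score(c):
            prob_gap = abs(query_prob[c] - query_prob[real])  # similar popularity
            spread = min(math.dist(c, s) for s in chosen)     # far from chosen set
            return spread - 5.0 * prob_gap                    # illustrative weighting
        best = max(candidates, key=score)
        candidates.remove(best)
        chosen.append(best)
    return chosen

random.seed(1)
cells = [(i, j) for i in range(10) for j in range(10)]
query_prob = {c: random.random() for c in cells}
print(pick_dummies((5, 5), cells, query_prob, k=4))
```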
14 pages, 2624 KiB  
Article
Network Model for Online News Media Landscape in Twitter
by Ford Lumban Gaol, Tokuro Matsuo and Ardian Maulana
Information 2019, 10(9), 277; https://0-doi-org.brum.beds.ac.uk/10.3390/info10090277 - 05 Sep 2019
Cited by 7 | Viewed by 2901
Abstract
Today, most studies of audience networks analyze the landscape of the news media on the web. However, media ecology has been drastically reconfigured by the emergence of social media. In this study, we use Twitter follower data to build an online news media network that represents the pattern of news consumption on Twitter. This study adopted the weighted network model proposed by Mukerjee et al. and implemented the Filter Disparity Method suggested by Majó-Vázquez et al. to identify the most significant overlaps in the network. Applying the model to news media outlets in three countries, namely Indonesia, Malaysia, and Singapore, shows that network analysis of follower overlap data can offer relevant insights about media diet and the way readers navigate the various news sources available on social media.
(This article belongs to the Section Information and Communications Technology)
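The core construction is an audience-overlap network: outlets are nodes, and an edge is weighted by the followers two outlets share, after which the disparity filter keeps only significant overlaps. The sketch below shows just the overlap-network step with networkx and made-up follower sets.

```python
# Sketch of the audience-overlap network: outlets are nodes, and an edge's
# weight is the number of Twitter followers two outlets share. The outlet
# names and follower sets are made up; the paper additionally prunes edges
# with the disparity filter to keep only significant overlaps.
import networkx as nx

followers = {  # hypothetical follower-ID sets per outlet
    "outletA": {1, 2, 3, 4, 5, 6},
    "outletB": {4, 5, 6, 7, 8},
    "outletC": {1, 9, 10},
}
G = nx.Graph()
outlets = list(followers)
for i, a in enumerate(outlets):
    for b in outlets[i + 1:]:
        shared = len(followers[a] & followers[b])
        if shared:
            G.add_edge(a, b, weight=shared)
print(G.edges(data=True))
```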
13 pages, 1479 KiB  
Article
Adverse Drug Event Detection Using a Weakly Supervised Convolutional Neural Network and Recurrent Neural Network Model
by Min Zhang and Guohua Geng
Information 2019, 10(9), 276; https://0-doi-org.brum.beds.ac.uk/10.3390/info10090276 - 04 Sep 2019
Cited by 13 | Viewed by 2708
Abstract
Social media and health-related forums, including the expression of customer reviews, have recently provided data sources for adverse drug reaction (ADR) identification research. However, in existing methods, the neglect of noisy data and the need for manually labeled data reduce the accuracy of the prediction results and greatly increase manual labor. We propose a novel architecture named the weakly supervised mechanism (WSM) convolutional neural network (CNN) long short-term memory (WSM-CNN-LSTM), which combines the strengths of a CNN and bi-directional long short-term memory (Bi-LSTM). The WSM applies weakly labeled data to pre-train the parameters of the model and then uses the labeled data to fine-tune the initialized network parameters. The CNN employs a convolutional layer to study the characteristics of the drug reviews and extract active features at different scales, and then the feed-forward and feed-back neural networks of the Bi-LSTM utilize these salient features to output the regression results. The experimental results effectively demonstrate that our model marginally outperforms the comparison models in ADR identification and that a small quantity of labeled samples results in optimal performance, which decreases the influence of noise and reduces the manual data-labeling requirements.
(This article belongs to the Section Information Applications)
15 pages, 779 KiB  
Article
Least Squares Consensus for Matching Local Features
by Qingming Zhang, Buhai Shi and Haibo Xu
Information 2019, 10(9), 275; https://0-doi-org.brum.beds.ac.uk/10.3390/info10090275 - 02 Sep 2019
Cited by 1 | Viewed by 2088
Abstract
This paper presents a new approach to estimating the consensus set in data. Under the RANSAC framework, perturbation of the data has not been considered sufficiently. We analyze the computation of homography in RANSAC and find that the variance of its estimate monotonically decreases as the sample size increases. From this result, we derive an approach that can suppress the perturbation and estimate the consensus set simultaneously. Different from other consensus estimators based on random sampling methods, our approach builds on the least squares method and order statistics, and is therefore an alternative scheme for consensus estimation. Combined with a nearest neighbour-based method, our approach reaches higher matching precision than plain RANSAC and MSAC, as shown in our simulations.
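For context, a minimal RANSAC baseline of the kind the paper compares against, with line fitting standing in for homography estimation; the authors' point is that least-squares fits over larger samples reduce the variance of the estimate relative to minimal random samples. Data and thresholds below are synthetic.

```python
# Minimal RANSAC baseline (line fitting stands in for homography estimation):
# random minimal samples propose a model, the consensus set is the points
# within a residual threshold, and the best consensus set is refit by least
# squares — the step the paper's approach generalizes.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 2.0 * x + 1.0 + rng.normal(0, 0.1, 100)
y[:20] += rng.uniform(-10, 10, 20)  # 20% gross outliers

best_inliers = np.zeros(100, dtype=bool)
for _ in range(200):
    i, j = rng.choice(100, size=2, replace=False)   # minimal sample
    a = (y[j] - y[i]) / (x[j] - x[i] + 1e-12)
    b = y[i] - a * x[i]
    inliers = np.abs(y - (a * x + b)) < 0.3          # consensus test
    if inliers.sum() > best_inliers.sum():
        best_inliers = inliers

a, b = np.polyfit(x[best_inliers], y[best_inliers], 1)  # least-squares refit
print(f"a~{a:.2f}, b~{b:.2f}, inliers={best_inliers.sum()}")
```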
21 pages, 942 KiB  
Article
Encrypting and Preserving Sensitive Attributes in Customer Churn Data Using Novel Dragonfly Based Pseudonymizer Approach
by Kalyan Nagaraj, Sharvani GS and Amulyashree Sridhar
Information 2019, 10(9), 274; https://0-doi-org.brum.beds.ac.uk/10.3390/info10090274 - 31 Aug 2019
Cited by 5 | Viewed by 3783
Abstract
With miscellaneous information accessible in public repositories, consumer data is the knowledge base for anticipating client preferences. For instance, subscriber details are inspected in the telecommunication sector to ascertain growth, customer engagement, and imminent opportunities for the advancement of services. Amongst such parameters, the churn rate is substantial for scrutinizing migrating consumers. However, predicting churn often comes with the prevalent risk of invading sensitive information of subscribers. Hence, it is worth safeguarding subtle details prior to customer-churn assessment. A dual approach is adopted, based on dragonfly and pseudonymizer algorithms, to secure the lucidity of customer data. This twofold approach ensures that sensitive attributes are protected prior to churn analysis. The exactitude of this method is investigated by comparing the performance of conventional privacy-preserving models against the current model. Furthermore, churn detection is substantiated prior to and after data preservation for detecting information loss. It was found that the privacy-based feature selection method secured sensitive attributes effectively compared to traditional approaches. Moreover, information loss estimated prior to and after security concealment identified the random forest classifier as the best churn detection model, with an enhanced accuracy of 94.3% and minimal data forfeiture of 0.32%. Likewise, this approach can be adopted in several domains to shield vulnerable information prior to data modeling.
(This article belongs to the Special Issue The End of Privacy?)
28 pages, 657 KiB  
Article
Terminology Translation in Low-Resource Scenarios
by Rejwanul Haque, Mohammed Hasanuzzaman and Andy Way
Information 2019, 10(9), 273; https://0-doi-org.brum.beds.ac.uk/10.3390/info10090273 - 30 Aug 2019
Cited by 2 | Viewed by 4225
Abstract
Evaluating term translation quality in machine translation (MT), which is usually performed by domain experts, is a time-consuming and expensive task. In fact, this is unimaginable in an industrial setting, where customised MT systems often need to be updated for many reasons (e.g., the availability of new training data or leading MT techniques). To the best of our knowledge, as of yet there is no publicly available solution for evaluating terminology translation in MT automatically. Hence, there is a genuine need for a faster and less expensive solution to this problem, which could help end-users identify term translation problems in MT instantly. This study presents a faster and less expensive strategy for evaluating terminology translation in MT. The high correlation of our evaluation results with human judgements demonstrates the effectiveness of the proposed solution. The paper also introduces a classification framework, TermCat, that can automatically classify term translation-related errors and expose specific problems in relation to terminology translation in MT. We carried out our experiments with a low-resource language pair, English–Hindi, and found that our classifier, whose accuracy varies across translation directions, error classes, the morphological nature of the languages, and MT models, generally performs competently in the terminology translation classification task.
(This article belongs to the Special Issue Computational Linguistics for Low-Resource Languages)
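A sketch of the evaluation idea, not the TermCat implementation: a gold standard maps each source term to its acceptable target equivalents, and an MT hypothesis is scored by whether it realizes any of them. The example sentences and term sets are invented.

```python
# Sketch of gold-standard term evaluation (an illustration of the idea, not
# the TermCat implementation): each source term carries a set of acceptable
# target equivalents, and an MT hypothesis is checked for any of them.
def term_accuracy(examples):
    hits = 0
    for hypothesis, gold_equivalents in examples:
        if any(term in hypothesis.lower() for term in gold_equivalents):
            hits += 1
    return hits / len(examples)

examples = [  # (MT output, acceptable translations of the source term)
    ("the court dismissed the appeal", {"court", "tribunal"}),
    ("the yard dismissed the appeal", {"court", "tribunal"}),  # term error
]
print(term_accuracy(examples))  # 0.5
```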
19 pages, 320 KiB  
Essay
Correlations and How to Interpret Them
by Harald Atmanspacher and Mike Martin
Information 2019, 10(9), 272; https://0-doi-org.brum.beds.ac.uk/10.3390/info10090272 - 29 Aug 2019
Cited by 5 | Viewed by 3849
Abstract
Correlations between observed data are at the heart of all empirical research that strives to establish lawful regularities. However, there are numerous ways to assess these correlations, and there are numerous ways to make sense of them. This essay presents a bird's-eye perspective on different interpretive schemes for understanding correlations. It is designed as a comparative survey of the basic concepts. Many important details to back it up can be found in the relevant technical literature. Correlations can (1) extend over time (diachronic correlations) or (2) relate data in an atemporal way (synchronic correlations). Within class (1), the standard interpretive accounts are based on causal models or on predictive models that are not necessarily causal. Examples within class (2) are (mainly unsupervised) data mining approaches, relations between domains (multiscale systems), nonlocal quantum correlations, and eventually correlations between the mental and the physical.
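A small numerical illustration of the essay's two classes, on synthetic data: a synchronic correlation relates two variables at the same time, while a diachronic one only appears when one series is shifted in time.

```python
# Numeric illustration of the essay's two classes of correlation: synchronic
# (two variables at the same time) versus diachronic (one series correlated
# with a time-shifted other), on synthetic data.
import numpy as np

rng = np.random.default_rng(0)
n, lag = 500, 3
x = rng.normal(size=n)
y_syn = 0.8 * x + rng.normal(scale=0.5, size=n)          # synchronic partner
y_dia = np.roll(x, lag) + rng.normal(scale=0.5, size=n)  # diachronic partner

print("synchronic r:", np.corrcoef(x, y_syn)[0, 1])
print("diachronic r at lag 0:", np.corrcoef(x, y_dia)[0, 1])
print("diachronic r at lag 3:", np.corrcoef(x[:-lag], y_dia[lag:])[0, 1])
```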
16 pages, 6117 KiB  
Article
Waveform Optimization of Compressed Sensing Radar without Signal Recovery
by Quanhui Wang and Ying Sun
Information 2019, 10(9), 271; https://0-doi-org.brum.beds.ac.uk/10.3390/info10090271 - 29 Aug 2019
Cited by 1 | Viewed by 2723
Abstract
Radar signal processing mainly focuses on target detection, classification, estimation, filtering, and so on. Compressed sensing radar (CSR) technology can potentially provide additional tools to simultaneously reduce computational complexity and effectively solve inference problems. CSR allows direct compressive signal processing without the need to reconstruct the signal. This study aimed to solve the problem of CSR detection without signal recovery by optimizing the transmit waveform. Therefore, a waveform optimization method is introduced to improve the output signal-to-interference-plus-noise ratio (SINR) in the case where the target signal is corrupted by colored interference and noise with known statistical characteristics. Two different target models are discussed: deterministic and random. In the case of a deterministic target, the optimum transmit waveform is derived by maximizing the SINR, and a suboptimum solution is also presented. In the case of a random target, an iterative waveform optimization method is proposed to maximize the output SINR. This approach ensures that SINR performance improves in each iteration step. The performance of these methods is illustrated by computer simulation.
(This article belongs to the Section Information Processes)
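For the deterministic-target case, a standard way to obtain an SINR-maximizing waveform is to take the dominant generalized eigenvector of the target and interference-plus-noise covariance pair; the sketch below uses that textbook formulation, which may differ in detail from the paper's derivation.

```python
# Sketch of the classical SINR-maximizing waveform for a deterministic target
# in colored interference: the optimum is the dominant generalized eigenvector
# of (T, R), where T is the target correlation matrix and R the
# interference-plus-noise covariance. Textbook formulation, not necessarily
# the paper's exact derivation.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
N = 16
t = rng.normal(size=N)                    # deterministic target response
T = np.outer(t, t)                        # rank-one target correlation matrix
A = rng.normal(size=(N, N))
R = A @ A.T + np.eye(N)                   # colored interference + noise

vals, vecs = eigh(T, R)                   # generalized eigenproblem T s = lam R s
s_opt = vecs[:, -1]                       # eigenvector of the largest eigenvalue
sinr = (s_opt @ T @ s_opt) / (s_opt @ R @ s_opt)
print(f"max SINR = {sinr:.3f} (equals top generalized eigenvalue {vals[-1]:.3f})")
```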
16 pages, 2307 KiB  
Article
Process Discovery in Business Process Management Optimization
by Paweł Dymora, Maciej Koryl and Mirosław Mazurek
Information 2019, 10(9), 270; https://0-doi-org.brum.beds.ac.uk/10.3390/info10090270 - 29 Aug 2019
Cited by 7 | Viewed by 4489
Abstract
Appropriate business process management (BPM) within an organization can help attain organizational goals. It is particularly important to effectively manage the lifecycle of these processes for organizational effectiveness in improving ever-growing performance and competitiveness across the company. This paper presents process discovery and how it can be used within a broader framework supporting self-organization in BPM. Process discovery is intrinsically associated with the process lifecycle. We made a pre-evaluation of the usefulness of our approach using a generated log file. We also compared visualizations of the outcomes of our approach in different cases and show performance characteristics of the cash loan sales process.
(This article belongs to the Section Information Systems)
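As a concrete illustration of process discovery from a generated log, the sketch below mines a Petri net with the inductive miner using the pm4py library; pm4py and the toy cash-loan-like log are assumptions of this example, not tooling stated in the paper.

```python
# Sketch of process discovery from an event log with the pm4py library
# (an assumption of this example; the paper does not state its tooling).
import pandas as pd
import pm4py

# Minimal synthetic event log for a cash-loan-like process, using the
# standard XES column names pm4py expects.
log = pd.DataFrame({
    "case:concept:name": ["1", "1", "1", "2", "2", "2"],
    "concept:name": ["apply", "score", "offer", "apply", "score", "reject"],
    "time:timestamp": pd.date_range("2019-01-01", periods=6, freq="h"),
})

# Discover a Petri net with the inductive miner and inspect it.
net, initial_marking, final_marking = pm4py.discover_petri_net_inductive(log)
print(net)
```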
17 pages, 854 KiB  
Concept Paper
Computer Vision-Based Unobtrusive Physical Activity Monitoring in School by Room-Level Physical Activity Estimation: A Method Proposition
by Hans Hõrak
Information 2019, 10(9), 269; https://0-doi-org.brum.beds.ac.uk/10.3390/info10090269 - 28 Aug 2019
Cited by 9 | Viewed by 3849
Abstract
As sedentary lifestyles and childhood obesity are becoming more prevalent, research in the field of physical activity (PA) has gained much momentum. Monitoring the PA of children and adolescents is crucial for ascertaining and understanding the phenomena that facilitate and hinder PA in order to develop effective interventions for promoting physically active habits. Popular individual-level measures are sensitive to social desirability bias and subject reactivity. Intrusiveness of these methods, especially when studying children, also limits the possible duration of monitoring and assumes strict submission to human research ethics requirements and vigilance in personal data protection. Meanwhile, growth in computational capacity has enabled computer vision researchers to successfully use deep learning algorithms for real-time behaviour analysis such as action recognition. This work analyzes the weaknesses of existing methods used in PA research; gives an overview of relevant advances in video-based action recognition methods; and proposes the outline of a novel action intensity classifier utilizing sensor-supervised learning for estimating ambient PA. The proposed method, if applied as a distributed privacy-preserving sensor system, is argued to be useful for monitoring the spatio-temporal distribution of PA in schools over long periods and assessing the efficiency of school-based PA interventions.
16 pages, 441 KiB  
Article
The Usefulness of Imperfect Speech Data for ASR Development in Low-Resource Languages
by Jaco Badenhorst and Febe de Wet
Information 2019, 10(9), 268; https://0-doi-org.brum.beds.ac.uk/10.3390/info10090268 - 28 Aug 2019
Cited by 6 | Viewed by 3273
Abstract
When the National Centre for Human Language Technology (NCHLT) Speech corpus was released, it created various opportunities for speech technology development in the 11 official, but critically under-resourced, languages of South Africa. Since then, the substantial improvements in acoustic modeling that deep architectures achieved for well-resourced languages ushered in a new data requirement: their development requires hundreds of hours of speech. A suitable strategy for the enlargement of speech resources for the South African languages is therefore required. The first possibility was to look for data that had already been collected but had not been included in an existing corpus. Additional data was collected during the NCHLT project that was not included in the official corpus, which only contains a curated but limited subset of the data. In this paper, we first analyze the additional resources that could be harvested from the auxiliary NCHLT data. We also measure the effect of this data on acoustic modeling. The analysis incorporates recent factorized time-delay neural networks (TDNN-F). These models significantly reduce phone error rates for all languages. In addition, data augmentation and cross-corpus validation experiments for a number of the datasets illustrate the utility of the auxiliary NCHLT data.
(This article belongs to the Special Issue Computational Linguistics for Low-Resource Languages)
12 pages, 582 KiB  
Article
Study on Unknown Term Translation Mining from Google Snippets
by Bin Li and Jianmin Yao
Information 2019, 10(9), 267; https://0-doi-org.brum.beds.ac.uk/10.3390/info10090267 - 28 Aug 2019
Cited by 2 | Viewed by 2574
Abstract
Bilingual web pages are widely used to mine translations of unknown terms. This study focused on an effective solution for obtaining relevant web pages, extracting translations with correct lexical boundaries, and ranking the translation candidates. The research adopted co-occurrence information to obtain the subject terms and then expanded the source query with the translation of the subject terms to collect effective bilingual search engine snippets. Afterwards, valid candidates were extracted from small-sized, noisy bilingual corpora using an improved frequency change measurement that combines adjacency information. A method was developed that considers surface patterns, frequency–distance, and phonetic features to select an appropriate translation. The experimental results revealed that the proposed method performs remarkably well for mining translations of unknown terms.
23 pages, 15211 KiB  
Article
Enhanced Grid-Based Visual Analysis of Retinal Layer Thickness with Optical Coherence Tomography
by Martin Röhlig, Ruby Kala Prakasam, Jörg Stüwe, Christoph Schmidt, Oliver Stachs and Heidrun Schumann
Information 2019, 10(9), 266; https://0-doi-org.brum.beds.ac.uk/10.3390/info10090266 - 23 Aug 2019
Cited by 10 | Viewed by 10063
Abstract
Optical coherence tomography enables high-resolution 3D imaging of retinal layers in the human eye. The thickness of the layers is commonly assessed to understand a variety of retinal and systemic disorders. Yet, the thickness data are complex and currently need to be considerably reduced prior to further processing and analysis. This leads to a loss of information on localized variations in thickness, which is important for early detection of certain retinal diseases. We propose an enhanced grid-based reduction and exploration of retinal thickness data. Alternative grids are computed, their representation quality is rated, and best fitting grids for given thickness data are suggested. Selected grids are then visualized, adapted, and compared at different levels of granularity. A visual analysis tool bundles all computational, visual, and interactive means in a flexible user interface. We demonstrate the utility of our tool in a complementary analysis procedure, which eases the evaluation of ophthalmic study data. Ophthalmologists successfully applied our solution to study localized variations in thickness of retinal layers in patients with diabetes mellitus.
(This article belongs to the Special Issue Information Visualization Theory and Applications (IVAPP 2019))
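The reduction step at the heart of the approach can be pictured as aggregating a dense thickness map into per-cell statistics, with the discarded within-cell variance indicating how well a grid represents the data. The numpy sketch below is a simplified illustration on synthetic data, not the authors' grid-rating method.

```python
# Sketch of the grid-based reduction step: a dense retinal-layer thickness
# map is summarized per grid cell; rating a grid by how much within-cell
# variance it hides mirrors the paper's "representation quality" idea.
import numpy as np

rng = np.random.default_rng(0)
thickness = rng.normal(loc=300, scale=15, size=(512, 512))  # synthetic map (um)

def grid_reduce(t: np.ndarray, n: int):
    h, w = t.shape
    cells = t[: h - h % n, : w - w % n].reshape(n, h // n, n, w // n)
    means = cells.mean(axis=(1, 3))      # per-cell mean thickness
    variances = cells.var(axis=(1, 3))   # information lost by the reduction
    return means, variances.mean()

for n in (2, 4, 8):
    means, loss = grid_reduce(thickness, n)
    print(f"{n}x{n} grid: mean residual variance = {loss:.1f}")
```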
16 pages, 331 KiB  
Article
Breaking the MDS-PIR Capacity Barrier via Joint Storage Coding
by Hua Sun and Chao Tian
Information 2019, 10(9), 265; https://0-doi-org.brum.beds.ac.uk/10.3390/info10090265 - 22 Aug 2019
Cited by 15 | Viewed by 2859
Abstract
The capacity of private information retrieval (PIR) from databases coded using maximum distance separable (MDS) codes was previously characterized by Banawan and Ulukus, under the assumption that the messages are encoded and stored separably in the databases. This assumption was also usually made in other related works in the literature, and this capacity is colloquially referred to as the MDS-PIR capacity. In this work, we considered the question of whether and when this capacity barrier can be broken through joint encoding and storage of the messages. Our main results are two classes of novel code constructions that allow joint encoding, as well as the corresponding PIR protocols, which indeed outperform the separate MDS-coded systems. Moreover, we show that a simple but novel expansion technique allows us to generalize these two classes of codes, resulting in a wider range of cases where this capacity barrier can be broken.
(This article belongs to the Special Issue Private Information Retrieval: Techniques and Applications)
31 pages, 861 KiB  
Review
A Systematic Mapping Study of MMOG Backend Architectures
by Nicos Kasenides and Nearchos Paspallis
Information 2019, 10(9), 264; https://0-doi-org.brum.beds.ac.uk/10.3390/info10090264 - 21 Aug 2019
Cited by 4 | Viewed by 4086
Abstract
The advent of utility computing has revolutionized almost every sector of traditional software development. Commercial cloud computing services in particular, pioneered by the likes of Amazon, Google, and Microsoft, have provided an unprecedented opportunity for the fast and sustainable development of complex distributed systems. Nevertheless, existing models and tools aim primarily at systems where resource usage (by humans and bots alike) is logically and physically quite dispersed, resulting in a low likelihood of conflicting resource access. However, a number of resource-intensive applications, such as Massively Multiplayer Online Games (MMOGs) and large-scale simulations, introduce a requirement for a very large common state with many actors accessing it simultaneously, and thus a high likelihood of conflicting resource access. This paper presents a systematic mapping study of the state of the art in software technology aiming explicitly to support the development of MMOGs, a class of large-scale, resource-intensive software systems. By examining the main focus of a diverse set of related publications, we identify a list of criteria that are important for MMOG development. Then, we categorize the selected studies based on the inferred criteria in order to compare their approaches, unveil the challenges faced in each of them, and reveal research trends that might be present. Finally, we attempt to identify research directions that appear promising for enabling the use of standardized technology for this class of systems.
(This article belongs to the Section Review)