Quality of Open Data

A special issue of Information (ISSN 2078-2489). This special issue belongs to the section "Information Processes".

Deadline for manuscript submissions: closed (30 January 2020)

Special Issue Editors


Dr. Włodzimierz Lewoniewski
Guest Editor
Department of Information Systems, Poznań University of Economics and Business, 61-875 Poznań, Poland
Interests: data quality; information quality; fake news; disinformation; misinformation; Wikipedia; DBpedia; Wikidata; wiki; Open Data; Linked Open Data

Dr. Anisa Rula
Guest Editor
Department of Informatics, Systems and Communication, University of Milano-Bicocca, Milan, Italy
Interests: data quality in linked data; time-related quality dimensions; semantic web; information extraction; big data

Dr. Krzysztof Węcel
Guest Editor
Department of Information Systems, Poznań University of Economics, Poznań, Poland
Interests: linked data; open data; big data; value of information; semantic web; data quality

Special Issue Information

Dear Colleagues,

The 2nd International Workshop on Quality of Open Data (QOD 2019) will be held in June 2019 in Seville, in conjunction with the 22nd International Conference on Business Information Systems. The goal of the workshop is to bring together different communities working on quality in Wikipedia, DBpedia, Wikidata, OpenStreetMap, Wikimapia, and other open knowledge bases. The workshop calls for sharing research experiences and knowledge related to quality assessment in open data. We invite papers that provide methodologies and techniques that can help verify and enrich various community-based services in different languages. This Special Issue of Information intends to attract submissions on the topic of quality issues in Open Data. There is no Article Processing Charge for extended versions of papers accepted at the QOD 2019 workshop.

QOD 2019 website: http://qod.bisconf.info

Dr. Włodzimierz Lewoniewski
Dr. Anisa Rula
Dr. Krzysztof Węcel
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Information is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Data quality
  • Open data
  • LOD
  • Information enrichment
  • Geospatial data
  • Knowledge base
  • Wikipedia
  • DBpedia
  • Wikidata
  • OpenStreetMap
  • Wikimapia

Published Papers (5 papers)


Research

10 pages, 338 KiB  
Article
Main Influencing Factors of Quality Determination of Collaborative Open Data Pages
by Ralf-Christian Härting and Włodzimierz Lewoniewski
Information 2020, 11(6), 283; https://doi.org/10.3390/info11060283 - 27 May 2020
Abstract
Collaborative knowledge bases allow anyone to create and edit information online. One example of a resource with collaborative content is Wikipedia. Although this free encyclopedia is one of the most popular sources of information in the world, it is often criticized for the poor quality of its content. Articles in Wikipedia in different languages on the same topic can be created and edited independently of each other. Some of these language versions can provide very different but valuable information on each topic. Measuring the quality of articles using metrics is intended to make open data pages such as Wikipedia more reliable and trustworthy. A major challenge is that the ‘gold standard’ in determining the quality of an open data page is unknown. Therefore, we investigated which factors influence the potential of quality determination of collaborative open data pages and their sources. Our model is based on empirical data derived from the experience of international experts on knowledge management and data quality. It was developed using semi-structured interviews and a qualitative content analysis based on Grounded Theory (GT). Important influencing factors are: better outcomes, better decision making, limitations, more efficient workflows for article creation and review, process efficiency, quality improvement, and reliable and trustworthy utilization of data.
(This article belongs to the Special Issue Quality of Open Data)

37 pages, 3305 KiB  
Article
Modeling Popularity and Reliability of Sources in Multilingual Wikipedia
by Włodzimierz Lewoniewski, Krzysztof Węcel and Witold Abramowicz
Information 2020, 11(5), 263; https://doi.org/10.3390/info11050263 - 13 May 2020
Cited by 17
Abstract
One of the most important factors impacting the quality of content in Wikipedia is the presence of reliable sources. By following references, readers can verify facts or find more details about the described topic. A Wikipedia article can be edited independently in any of over 300 languages, even by anonymous users; therefore, information about the same topic may be inconsistent. This also applies to the use of references in different language versions of a particular article, so the same statement can have different sources. In this paper we analyzed over 40 million articles from the 55 most developed language versions of Wikipedia to extract information about over 200 million references and find the most popular and reliable sources. We presented 10 models for the assessment of the popularity and reliability of the sources based on analysis of meta information about the references in Wikipedia articles, page views, and authors of the articles. Using DBpedia and Wikidata we automatically identified the alignment of the sources to a specific domain. Additionally, we analyzed the changes of popularity and reliability over time and identified growth leaders in each of the considered months. The results can be used for quality improvements of the content in different language versions of Wikipedia.
(This article belongs to the Special Issue Quality of Open Data)
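
As a rough illustration of the kind of measure such models formalize, the sketch below counts how widely each source domain is cited across articles and weights it by page views. This is a minimal sketch under stated assumptions: the input format, the page-view weighting, and the function names are illustrative, not the authors' actual 10 models.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Illustrative sketch only: one plausible popularity measure for cited
# sources, loosely inspired by the paper's description. The record format
# and weighting scheme below are assumptions, not the authors' models.

def source_popularity(articles):
    """articles: iterable of dicts like
    {"page_views": int, "references": ["https://example.org/...", ...]}.
    Returns {domain: page-view-weighted citation score}."""
    scores = defaultdict(float)
    for article in articles:
        weight = article.get("page_views", 0)
        # Count each domain once per article so a single heavily
        # referenced article does not dominate the ranking.
        domains = {urlparse(url).netloc for url in article["references"]}
        for domain in domains:
            scores[domain] += weight
    return dict(scores)

if __name__ == "__main__":
    demo = [
        {"page_views": 1200, "references": ["https://doi.org/10.1000/x1"]},
        {"page_views": 300, "references": ["https://doi.org/10.1000/x2",
                                           "https://example.org/a"]},
    ]
    for domain, score in sorted(source_popularity(demo).items(),
                                key=lambda kv: -kv[1]):
        print(domain, score)
```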

15 pages, 677 KiB  
Article
Quality of Open Research Data: Values, Convergences and Governance
by Tibor Koltay
Information 2020, 11(4), 175; https://doi.org/10.3390/info11040175 - 25 Mar 2020
Cited by 11
Abstract
This paper focuses on the characteristics of research data quality and aims to cover the most important issues related to it, giving particular attention to its attributes and to data governance. The corporate world's considerable interest in the quality of data is obvious in several thoughts and issues reported in business-related publications, even if there are apparent differences between values and approaches to data in corporate and in academic (research) environments. The paper also takes into consideration that addressing data quality would be unimaginable without considering big data.
(This article belongs to the Special Issue Quality of Open Data)

13 pages, 738 KiB  
Article
Error Detection in a Large-Scale Lexical Taxonomy
by Yinan An, Sifan Liu and Hongzhi Wang
Information 2020, 11(2), 97; https://doi.org/10.3390/info11020097 - 11 Feb 2020
Cited by 3
Abstract
A knowledge base (KB) is an important asset in artificial intelligence. One significant challenge in KB construction is that it contains much noise, which prevents its effective usage. Even though some KB cleansing algorithms have been proposed, they focus on the structure of the knowledge graph and neglect the relations between concepts, which could be helpful for discovering wrong relations in a KB. Motivated by this, we measure the relation of two concepts by the distance between their corresponding instances and detect errors within the intersection of the conflicting concept sets. For efficient and effective knowledge base cleansing, we first apply a distance-based model to determine the conflicting concept sets using two different methods. Then, we propose and analyze several algorithms on how to detect and repair the errors based on our model, where we use a hash method as an efficient way to calculate distance. Experimental results demonstrate that the proposed approaches can cleanse knowledge bases efficiently and effectively.
(This article belongs to the Special Issue Quality of Open Data)
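
To make the distance-based idea concrete, the sketch below estimates the distance between two concepts as the Jaccard distance of their instance sets, approximated with MinHash signatures for efficiency. The paper only says a "hash method" is used; MinHash, the thresholds, and the toy concepts here are this sketch's assumptions, not the authors' algorithm.

```python
import random

# Illustrative sketch only: concept distance as estimated Jaccard distance
# between instance sets, with MinHash as the "hash method". MinHash is an
# assumption; the paper does not specify its hashing scheme.

NUM_HASHES = 128
P = (1 << 61) - 1  # large prime modulus for the hash family
random.seed(42)
# Random hash functions of the form h(x) = (a*x + b) mod P.
HASHES = [(random.randrange(1, P), random.randrange(P)) for _ in range(NUM_HASHES)]

def minhash_signature(instances):
    """Compute a MinHash signature for a set of instance labels."""
    items = [hash(x) % P for x in instances]
    return [min((a * x + b) % P for x in items) for a, b in HASHES]

def estimated_jaccard_distance(sig1, sig2):
    """Fraction of disagreeing signature slots approximates Jaccard distance."""
    agree = sum(1 for s1, s2 in zip(sig1, sig2) if s1 == s2)
    return 1.0 - agree / len(sig1)

if __name__ == "__main__":
    fruit = {"apple", "banana", "cherry", "mango", "pear"}
    tropical_fruit = {"banana", "mango", "papaya"}
    vehicle = {"car", "bus", "bicycle"}
    sig_f, sig_t, sig_v = map(minhash_signature, (fruit, tropical_fruit, vehicle))
    # Related concepts yield a small distance; unrelated ones approach 1.0,
    # so instances shared by distant concepts become error candidates.
    print("fruit vs tropical_fruit:", estimated_jaccard_distance(sig_f, sig_t))
    print("fruit vs vehicle:", estimated_jaccard_distance(sig_f, sig_v))
```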

15 pages, 4429 KiB  
Article
Text and Data Quality Mining in CRIS
by Otmane Azeroual
Information 2019, 10(12), 374; https://doi.org/10.3390/info10120374 - 28 Nov 2019
Cited by 6
Abstract
Scientific institutions that maintain comprehensive and well-curated documentation of their research information in a current research information system (CRIS) have the best prerequisites for implementing text and data mining (TDM) methods. Using TDM helps to better identify and eliminate errors, improve processes, develop the business, and make informed decisions. In addition, TDM increases understanding of the data and its context. This not only improves the quality of the data itself, but also the institution's handling of the data and, consequently, the analyses. This paper deploys TDM in a CRIS to analyze, quantify, and correct unstructured data and its quality issues. Bad data leads to increased costs or wrong decisions. Ensuring high data quality is an essential requirement when creating a CRIS project. User acceptance of a CRIS depends, among other things, on data quality. Not only is objective data quality the decisive criterion, but also the subjective quality that the individual user assigns to the data.
(This article belongs to the Special Issue Quality of Open Data)
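
In the spirit of the TDM-based quality checks the abstract describes, the sketch below runs two toy checks over CRIS publication records: required-field completeness and near-duplicate titles. The record fields, similarity threshold, and rules are assumptions for illustration, not the author's pipeline.

```python
import re
from difflib import SequenceMatcher

# Illustrative sketch only: toy data quality checks over CRIS publication
# records. Field names and the 0.9 threshold are assumptions.

REQUIRED_FIELDS = ("title", "authors", "year")

def completeness_issues(record):
    """Return names of required fields that are missing or empty."""
    return [f for f in REQUIRED_FIELDS if not record.get(f)]

def normalize_title(title):
    """Lowercase and strip punctuation so near-duplicate titles compare equal."""
    return re.sub(r"[^a-z0-9 ]", "", title.lower()).strip()

def near_duplicates(records, threshold=0.9):
    """Flag record pairs whose normalized titles are suspiciously similar."""
    pairs = []
    for i, r1 in enumerate(records):
        for r2 in records[i + 1:]:
            ratio = SequenceMatcher(None, normalize_title(r1["title"]),
                                    normalize_title(r2["title"])).ratio()
            if ratio >= threshold:
                pairs.append((r1["title"], r2["title"], round(ratio, 2)))
    return pairs

if __name__ == "__main__":
    records = [
        {"title": "Text and Data Quality Mining in CRIS",
         "authors": "Azeroual", "year": 2019},
        {"title": "Text and data quality mining in CRIS.",
         "authors": "", "year": 2019},
    ]
    for rec in records:
        print(rec["title"], "-> missing:", completeness_issues(rec))
    print("possible duplicates:", near_duplicates(records))
```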
