Article
Peer-Review Record

Capturing Protein Domain Structure and Function Using Self-Supervision on Domain Architectures

by Damianos P. Melidis 1,* and Wolfgang Nejdl 1,2
Reviewer 1:
Reviewer 2: Anonymous
Submission received: 28 December 2020 / Revised: 6 January 2021 / Accepted: 15 January 2021 / Published: 19 January 2021

Round 1

Reviewer 1 Report

The revised paper is in better shape and, as in my previous report, I vote for acceptance.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

This work presents dom2vec, an approach for learning protein domain embeddings. The embeddings were learned from a well-known knowledge base using publicly available datasets, so the experiments seem replicable.
Moreover, as an important contribution, the authors make the trained domain embeddings available for use by the research community.


The authors should mention their own related previous research and highlight how much it differs from the present work.
Examples of related work I found online are:
-dom2vec: Capturing domain structure and function using self-supervision on protein domain architectures
-dom2vec: Assessable domain embeddings and their use for protein prediction tasks


Minor remarks
row 26, "There exist two ways to represent domains...": it would be interesting to read more on the graph-based approach and how the one used here is better (e.g., less computationally expensive).


row 297, "This same protein, Diphthine synthase, was picked as example illustration for annotations in the latest InterPro work [40].": So what? I expected a comparison or something with [40]. In 45 they use InterPro version 70; does it matter? For a reader not familiar with InterPro, it would be interesting to read more information and to learn why [40] is important to the submitted work.

In the following, suggestions related to the bibliography are provided.

row 43, 43
the authors write: "The use of word embeddings improved the performance on most of the tasks such as sentiment analysis, Named Entity Recognition (NER), etc."
Another such task is negation detection; one example is the following reference:
Giuseppe Attardi, Vittoria Cozza, Daniele Sartiano. Detecting the scope of negations in clinical notes. Proceedings of the Second Italian Conference on Computational Linguistics, CLiC-it 2015, pp. 130-135, ISBN 978-88-99200-62-6

The authors already cited Collobert et al., but there is a more recent reference, which is a journal paper:
-"Ronan Collobert et al. 2011. Natural Language Processing (Almost) from Scratch. Journal of Machine Learning Research, 12, 2461–2505."

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

The article presents dom2vec, an algorithmic framework that uses word embeddings to handle protein information efficiently. The main goal of the authors is to show that unsupervised protein domain embeddings capture domain structure and function, providing data-driven insights into collocations in domain architectures.

The idea of employing word embeddings to characterize biological annotations seems novel, and the authors appear to successfully map the structure and function in domain architectures to local linguistic features, semantic and syntactic, in natural languages. The authors also present an extensive experimental study and successfully show that inputting both sequence and dom2vec embeddings can boost performance on protein prediction tasks. One contribution of the article is an established quantitative intrinsic evaluation method based on the most significant biological information for a domain; moreover, the authors made the trained domain embeddings available for use by the research community.

The article is nicely written and the ideas seem well worked out; moreover, the experimental section seems detailed enough and convincing. Therefore, I vote for acceptance.

Reviewer 2 Report

dom2vec: Unsupervised protein domain embeddings capture domains structure and function providing data-driven insights into collocations in domain architectures

 

The title is too long, and it does not explain the content of the manuscript.

 

The abstract does not show the findings of this work. It is necessary to mention the outstanding quantitative results of the research. Tell us why your study should be published in this prestigious journal, and how it differs from other similar works.

 

I do not think the keywords are appropriate for the abstract; they are too many and overly elaborate.

Try to use the passive voice instead of first-person constructions such as "we confirm two", "we make five major", "We propose", and "We established."

 

The structure of the paper must be indicated at the end of the introduction.

 

Improve your images; for instance, Figure 1 looks like a draft of a student's homework. It must be fully vectorized. If you are using LaTeX, please use the matrix command to typeset that figure separately.

 

You must understand that figure captions, such as that of Figure 2, should be just a brief reference, not many sentences; in that case, move the discussion into the paragraphs below. The same applies to Figures 4 and 5.

 

It is tough to follow the structure of the "Materials and Methods" section. At times the reader does not know where they are; it is confusing.

 

In rows 245 to 347, did you mean "x" as a vector product?

The authors should be careful about the consistency of their mathematical notation, because they sometimes write in one form and then change it, e.g., "k = 2", "w=5", "SKIP,w=5,dim=50,ep=40)".

 

My major concern is that there is no comparison with the literature; I recommend the authors create a table addressing this issue.

 
