Bioinformatics and Computational Biology for Cancer Prediction and Prognosis

A special issue of Genes (ISSN 2073-4425). This special issue belongs to the section "Bioinformatics".

Deadline for manuscript submissions: 1 May 2024 | Viewed by 11371

Special Issue Editors


Dr. Garrett M. Dancik
Guest Editor
Department of Computer Science, Eastern Connecticut State University, Willimantic, CT, USA
Interests: bioinformatics and computational biology; cancer bioinformatics

Dr. Spiros Vlahopoulos
Guest Editor
First Department of Pediatrics, National and Kapodistrian University of Athens, 11527 Goudi-Athens, Greece
Interests: cancer biology; leukemia

Special Issue Information

Dear Colleagues,

Bioinformatics tools play a vital role in understanding the biological complexity of cancer through the extraction of meaningful information from a large volume of diverse datasets. Of utmost importance are tools for data analysis, visualization, and interpretation that would aid in the realization of personalized medicine based on omics (genomic, transcriptomic, or proteomic) data, as well as on images and text.

This Special Issue aims to provide an overview of new and current bioinformatics tools for cancer prediction and prognosis. Contributions may describe novel approaches, or the application of new and existing ones, that aid in the identification of diagnostic, prognostic, or predictive cancer biomarkers; that identify potential therapeutic targets and important cancer-related pathways; or that otherwise provide valuable insight into cancer biology and treatment. To advance the field of cancer bioinformatics, we welcome contributions from experts in the form of research papers and critical reviews.

Dr. Garrett M. Dancik
Dr. Spiros Vlahopoulos
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Genes is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • cancer bioinformatics
  • biomarkers
  • biostatistics
  • genomic sequencing
  • image recognition
  • databases

Published Papers (10 papers)

Research

30 pages, 6195 KiB  
Article
Comprehensive Bioinformatic Investigation of TP53 Dysregulation in Diverse Cancer Landscapes
by Ruby Khan, Bakht Pari and Krzysztof Puszynski
Genes 2024, 15(5), 577; https://doi.org/10.3390/genes15050577 (registering DOI) - 30 Apr 2024
Viewed by 146
Abstract
P53 overexpression plays a critical role in cancer pathogenesis by disrupting the intricate regulation of cellular proliferation. Despite its firmly established function as a tumor suppressor, elevated p53 levels can paradoxically contribute to tumorigenesis, influenced by factors such as exposure to carcinogens, genetic mutations, and viral infections. This phenomenon is observed across a spectrum of cancer types, including bladder (BLCA), ovarian (OV), cervical (CESC), cholangiocarcinoma (CHOL), colon adenocarcinoma (COAD), diffuse large B-cell lymphoma (DLBC), esophageal carcinoma (ESCA), head and neck squamous cell carcinoma (HNSC), kidney chromophobe (KICH), kidney renal clear cell carcinoma (KIRC), liver hepatocellular carcinoma (LIHC), lung adenocarcinoma (LUAD), lung squamous cell carcinoma (LUSC), and uterine corpus endometrial carcinoma (UCEC). This broad spectrum of cancers is often associated with increased aggressiveness and recurrence risk. Effective therapeutic strategies targeting tumors with p53 overexpression require a comprehensive approach, integrating targeted interventions aimed at the p53 gene with conventional modalities such as chemotherapy, radiation therapy, and targeted drugs. In this extensive study, we present a detailed analysis shedding light on the multifaceted role of TP53 across various cancers, with a specific emphasis on its impact on disease-free survival (DFS). Leveraging data from the TCGA database and the GTEx dataset, along with GEPIA, UALCAN, and STRING, we identify TP53 overexpression as a significant prognostic indicator, notably pronounced in prostate adenocarcinoma (PRAD). Supported by compelling statistical significance (p < 0.05), our analysis reveals the distinct influence of TP53 overexpression on DFS outcomes in PRAD.
Additionally, graphical representations of overall survival (OS) underscore the notable disparity in OS duration between tumors exhibiting elevated TP53 expression (depicted by the red line) and those with lower TP53 levels (indicated by the blue line). The hazard ratio (HR) further emphasizes the profound impact of TP53 on overall survival. Moreover, our investigation delves into the intricate TP53 protein network, unveiling genes exhibiting robust positive correlations with TP53 expression across 13 out of 27 cancers. Remarkably, negative correlations emerge with pivotal tumor suppressor genes. This network analysis elucidates critical proteins, including SIRT1, CBP, p300, ATM, DAXX, HSP 90-alpha, Mdm2, RPA70, 14-3-3 protein sigma, p53, and ASPP2, pivotal in regulating cell cycle dynamics, DNA damage response, and transcriptional regulation. Our study underscores the paramount importance of deciphering TP53 dynamics in cancer, providing invaluable insights into tumor behavior, disease-free survival, and potential therapeutic avenues.

17 pages, 7948 KiB  
Article
Integrated Pleiotropic Gene Set Unveils Comorbidity Insights across Digestive Cancers and Other Diseases
by Xinnan Wu, Guangwen Luo, Zhaonian Dong, Wen Zheng and Gengjie Jia
Genes 2024, 15(4), 478; https://doi.org/10.3390/genes15040478 - 10 Apr 2024
Viewed by 528
Abstract
Comorbidities are prevalent in digestive cancers, intensifying patient discomfort and complicating prognosis. Identifying potential comorbidities and investigating their genetic connections in a systemic manner prove to be instrumental in averting additional health challenges during digestive cancer management. Here, we investigated 150 diseases across 18 categories by collecting and integrating various factors related to disease comorbidity, such as disease-associated SNPs or genes from sources like MalaCards, GWAS Catalog and UK Biobank. Through this extensive analysis, we have established an integrated pleiotropic gene set comprising 548 genes in total. In particular, the set includes genes that encode the major histocompatibility complex or are related to antigen presentation. Additionally, we have unveiled patterns in protein-protein interactions and key hub genes/proteins including TP53, KRAS, CTNNB1 and PIK3CA, which may elucidate the co-occurrence of digestive cancers with certain diseases. These findings provide valuable insights into the molecular origins of comorbidity, offering potential avenues for patient stratification and the development of targeted therapies in clinical trials.

13 pages, 7985 KiB  
Article
Exploring Immune-Related Gene Profiling and Infiltration of Immune Cells in Cervical Squamous Cell Carcinoma and Endocervical Adenocarcinoma
by Jialu Li and Juqun Xi
Genes 2024, 15(1), 121; https://doi.org/10.3390/genes15010121 - 19 Jan 2024
Viewed by 999
Abstract
Cervical cancer is a widespread malignancy among women, leading to a substantial global health impact. Despite extensive research, our understanding of the basic molecules and pathogenic processes of cervical squamous cell carcinoma is still insufficient. This investigation aims to uncover immune-related genes linked to CESC and delineate their functions. Leveraging data from the GEO and ImmPort databases, a total of 22 immune-related genes were identified. Multiple tools, including DAVID, the Human Protein Atlas, STRING, GeneMANIA, and TCGA, were employed to delve into the expression and roles of these immune genes in CESC, alongside their connections to the disease’s pathological features. Through RT-PCR, the study confirmed notable disparities in CXCL8 and CXCL10 mRNA expression between CESC and normal cervical tissue. The TCGA dataset’s immune-related information reinforced the association of CXCL8 and CXCL10 with immune infiltration in CESC. This research sheds light on the potential of CXCL8 and CXCL10 as promising therapeutic targets and essential prognostic factors for individuals diagnosed with CESC.

19 pages, 3162 KiB  
Article
High Expression of THY1 in Intestinal Gastric Cancer as a Key Factor in Tumor Biology: A Poor Prognosis-Independent Marker Related to the Epithelial–Mesenchymal Transition Profile
by Paulo Rohan, Everton Cruz dos Santos, Eliana Abdelhay and Renata Binato
Genes 2024, 15(1), 28; https://doi.org/10.3390/genes15010028 - 24 Dec 2023
Viewed by 1027
Abstract
Gastric cancer (GC) is an important cause of cancer-related death worldwide. Among its histological subtypes, intestinal gastric cancer (IGC) is the most common. A previous work showed that increased expression of the THY1 gene was associated with poor overall survival in IGC. Furthermore, it was shown that IGC tumor cells with high expression of THY1 have a greater capacity for tumorigenesis and metastasis in vitro. This study aimed to identify molecular differences between IGC with high and low expression of THY1. Using a feature selection method, a group of 35 genes was found to be the most informative gene set for THY1-high IGC tumors. Through a classification model, these genes differentiate THY1-high from THY1-low tumors with 100% accuracy both in the test subset and the independent test set. Additionally, this group of 35 genes correctly clustered 100% of the samples. An extensive validation of this potential molecular signature in multiple cohorts successfully distinguished between THY1-high and THY1-low IGC tumors (>95%), proving to be independent of the gene expression quantification methodology. These genes are involved in processes central to tumor biology, such as the epithelial–mesenchymal transition (EMT) and remodeling of the tumor tissue composition. Moreover, patients with THY1-high IGC demonstrated poor survival and a more advanced clinicopathological staging. Our findings revealed a molecular signature for IGC with high THY1 expression. This signature showed EMT and remodeling of the tumor tissue composition potentially related to the biology of IGC. Altogether, our results indicate that THY1-high IGC tumors are a particular subset of tumors with a specific molecular and prognosis profile.

11 pages, 1881 KiB  
Article
Uncovering the Molecular Drivers of NHEJ DNA Repair-Implicated Missense Variants and Their Functional Consequences
by Raghad Al-Jarf, Malancha Karmakar, Yoochan Myung and David B. Ascher
Genes 2023, 14(10), 1890; https://doi.org/10.3390/genes14101890 - 29 Sep 2023
Cited by 1 | Viewed by 1074
Abstract
Variants in non-homologous end joining (NHEJ) DNA repair genes are associated with various human syndromes, including microcephaly, growth delay, Fanconi anemia, and different hereditary cancers. However, very little has been done previously to systematically record the underlying molecular consequences of NHEJ variants and their link to phenotypic outcomes. In this study, a list of over 2983 missense variants of the principal components of the NHEJ system, including DNA Ligase IV, DNA-PKcs, Ku70/80 and XRCC4, reported in the clinical literature, was initially collected. The molecular consequences of variants were evaluated using in silico biophysical tools to quantitatively assess their impact on protein folding, dynamics, stability, and interactions. Cancer-causing and population variants within these NHEJ factors were statistically analyzed to identify molecular drivers. A comprehensive catalog of NHEJ variants from genes known to be mutated in cancer was curated, providing a resource for better understanding their role and molecular mechanisms in diseases. The variant analysis highlighted different molecular drivers among the distinct proteins, where cancer-driving variants in anchor proteins, such as Ku70/80, were more likely to affect key protein–protein interactions, whilst those in the enzymatic components, such as DNA-PKcs, were likely to be found in intolerant regions undergoing purifying selection. We believe that the information acquired in our database will be a powerful resource to better understand the role of non-homologous end-joining DNA repair in genetic disorders, and will serve as a source to inspire other investigations to understand the disease further, vital for the development of improved therapeutic strategies.

10 pages, 743 KiB  
Article
Radiogenomic Features of GIMAP Family Genes in Clear Cell Renal Cell Carcinoma: An Observational Study on CT Images
by Federico Greco, Andrea Panunzio, Alessandro Tafuri, Caterina Bernetti, Vincenzo Pagliarulo, Bruno Beomonte Zobel, Arnaldo Scardapane and Carlo Augusto Mallio
Genes 2023, 14(10), 1832; https://doi.org/10.3390/genes14101832 - 22 Sep 2023
Cited by 2 | Viewed by 1030
Abstract
GTPases of immunity-associated proteins (GIMAP) genes include seven functional genes and a pseudogene. Most of the GIMAPs have a role in the maintenance and development of lymphocytes. GIMAPs could inhibit the development of tumors by increasing the amount and antitumor activity of infiltrating immunocytes. Knowledge of key factors that affect the tumor immune microenvironment for predicting the efficacy of immunotherapy and establishing new targets in clear cell renal cell carcinoma (ccRCC) is of great importance. A computed tomography (CT)-based radiogenomic approach was used to detect the imaging phenotypic features of GIMAP family gene expression in ccRCC. In this retrospective study, we enrolled 193 ccRCC patients divided into two groups: ccRCC patients with GIMAP expression (n = 52) and ccRCC patients without GIMAP expression (n = 141). Several imaging features were evaluated on preoperative CT scans. A statistically significant correlation was found with absence of endophytic growth pattern (p = 0.049), tumor infiltration (p = 0.005), advanced age (p = 0.018), and high Fuhrman grade (p = 0.024). This study demonstrates CT imaging features of GIMAP expression in ccRCC. These results could allow the collection of data on GIMAP expression through a CT-based approach and could be used for the development of a targeted therapy.

15 pages, 2850 KiB  
Article
Aldehyde Dehydrogenase Genes as Prospective Actionable Targets in Acute Myeloid Leukemia
by Garrett M. Dancik, Lokman Varisli, Veysel Tolan and Spiros Vlahopoulos
Genes 2023, 14(9), 1807; https://doi.org/10.3390/genes14091807 - 16 Sep 2023
Cited by 2 | Viewed by 1269
Abstract
It has been previously shown that the aldehyde dehydrogenase (ALDH) family member ALDH1A1 has a significant association with acute myeloid leukemia (AML) patient risk group classification and that AML cells lacking ALDH1A1 expression can be readily killed via chemotherapy. In the past, however, a redundancy between the activities of subgroup members of the ALDH family has hampered the search for conclusive evidence to address the role of specific ALDH genes. Here, we describe the bioinformatics evaluation of all nineteen member genes of the ALDH family as prospective actionable targets for the development of methods aimed at improving AML treatment. We implicate ALDH1A1 in the development of recurrent AML, and we show that from the nineteen members of the ALDH family, ALDH1A1 and ALDH2 have the strongest association with AML patient risk group classification. Furthermore, we discover that the sum of the expression values for RNA from the genes ALDH1A1 and ALDH2 has a stronger association with AML patient risk group classification and survival than either one gene alone does. In conclusion, we identify ALDH1A1 and ALDH2 as prospective actionable targets for the treatment of AML in high-risk patients. Substances that inhibit both enzymatic activities constitute potentially effective pharmaceutics.

15 pages, 1933 KiB  
Article
Predicting Patterns of Distant Metastasis in Breast Cancer Patients following Local Regional Therapy Using Machine Learning
by Audrey Shiner, Alex Kiss, Khadijeh Saednia, Katarzyna J. Jerzak, Sonal Gandhi, Fang-I Lu, Urban Emmenegger, Lauren Fleshner, Andrew Lagree, Marie Angeli Alera, Mateusz Bielecki, Ethan Law, Brianna Law, Dylan Kam, Jonathan Klein, Christopher J. Pinard, Alex Shenfield, Ali Sadeghi-Naini and William T. Tran
Genes 2023, 14(9), 1768; https://doi.org/10.3390/genes14091768 - 07 Sep 2023
Viewed by 1454
Abstract
Up to 30% of breast cancer (BC) patients will develop distant metastases (DM), for which there is no cure. Here, statistical and machine learning (ML) models were developed to estimate the risk of site-specific DM following local-regional therapy. This retrospective study cohort included 175 patients diagnosed with invasive BC who later developed DM. Clinicopathological information was collected for analysis. Outcome variables were the first site of metastasis (brain, bone or visceral) and the time interval (months) to developing DM. Multivariate statistical analysis and ML-based multivariable gradient boosting machines identified factors associated with these outcomes. Machine learning models predicted the site of DM, demonstrating an area under the curve of 0.74, 0.75, and 0.73 for brain, bone and visceral sites, respectively. Overall, most patients (57%) developed bone metastases, with increased odds associated with estrogen receptor (ER) positivity. Human epidermal growth factor receptor-2 (HER2) positivity and non-anthracycline chemotherapy regimens were associated with a decreased risk of bone DM, while brain metastasis was associated with ER-negativity. Furthermore, non-anthracycline chemotherapy alone was a significant predictor of visceral metastasis. Here, clinicopathologic and treatment variables used in ML prediction models predict the first site of metastasis in BC. Further validation may guide focused patient-specific surveillance practices.

13 pages, 2074 KiB  
Article
An Automated Prognostic Model for Pancreatic Ductal Adenocarcinoma
by Ioannis Vezakis, Antonios Vezakis, Sofia Gourtsoyianni, Vassilis Koutoulidis, Andreas A. Polydorou, George K. Matsopoulos and Dimitrios D. Koutsouris
Genes 2023, 14(9), 1742; https://doi.org/10.3390/genes14091742 - 31 Aug 2023
Cited by 1 | Viewed by 1187
Abstract
Pancreatic ductal adenocarcinoma (PDAC) constitutes a leading cause of cancer-related mortality despite advances in detection and treatment methods. While computed tomography (CT) serves as the current gold standard for initial evaluation of PDAC, its prognostic value remains limited, as it relies on diagnostic stage parameters encompassing tumor size, lymph node involvement, and metastasis. Radiomics have recently shown promise in predicting postoperative survival of PDAC patients; however, they rely on manual pancreas and tumor delineation by clinicians. In this study, we collected a dataset of pre-operative CT scans from a cohort of 40 PDAC patients to evaluate a fully automated pipeline for survival prediction. Employing nnU-Net trained on an external dataset, we generated automated pancreas and tumor segmentations. Subsequently, we extracted 854 radiomic features from each segmentation, which we narrowed down to 29 via feature selection. We then combined these features with the Tumor, Node, Metastasis (TNM) system staging parameters, as well as the patient’s age. We trained a random survival forest model to perform an overall survival prediction over time, as well as a random forest classifier for the binary classification of two-year survival, using repeated cross-validation for evaluation. Our results exhibited promise, with a mean C-index of 0.731 for survival modeling and a mean accuracy of 0.76 in two-year survival prediction, providing evidence of the feasibility and potential efficacy of a fully automated pipeline for PDAC prognostication. By eliminating the labor-intensive manual segmentation process, our streamlined pipeline demonstrates an efficient and accurate prognostication process, laying the foundation for future research endeavors.

17 pages, 4064 KiB  
Article
Deciphering the Tumor–Immune–Microbe Interactions in HPV-Negative Head and Neck Cancer
by Min Hu, Samuel Coleman, Muhammad Zaki Hidayatullah Fadlullah, Daniel Spakowicz, Christine H. Chung and Aik Choon Tan
Genes 2023, 14(8), 1599; https://doi.org/10.3390/genes14081599 - 08 Aug 2023
Cited by 1 | Viewed by 1185
Abstract
Patients with human papillomavirus-negative head and neck squamous cell carcinoma (HPV-negative HNSCC) have worse outcomes than those with HPV-positive HNSCC. In our study, we used a published dataset and investigated the microbes enriched in molecularly classified tumor groups. We showed that microbial signatures could distinguish Hypoxia/Immune phenotypes similar to the gene expression signatures. Furthermore, we identified three microbes highly correlated with immune processes that are crucial for immunotherapy response. The survival of patients in a molecularly heterogeneous group shows significant differences based on the co-abundance of the three microbes. Overall, we present evidence that tumor-associated microbiota are critical components of the tumor ecosystem that may impact tumor microenvironment and immunotherapy response. The results of our study warrant future investigation to experimentally validate the conclusions, which have significant impacts on clinical decision-making, such as treatment selection.