As of 24 May 2020, over 5 million people worldwide have been confirmed as having the 2019 novel coronavirus disease (COVID-19), an infection with Severe Acute Respiratory Syndrome coronavirus 2 (SARS-CoV-2; called 2019-nCoV before 11 February 2020). SARS-CoV-2 belongs to the Coronaviridae family of positive-sense single-stranded RNA viruses, which includes SARS-CoV and MERS-CoV (Middle East Respiratory Syndrome coronavirus), both of which also cause severe respiratory infections. The death count in China so far has been over 1700, but the number is expected to go higher with the increasing number of confirmed and non-confirmed cases. The medical research community is vigorously seeking a treatment to control the infection and save the lives of severely infected patients.
Just a few weeks after the COVID-19 outbreak, the complete genome of SARS-CoV-2 was determined and deposited in GenBank (accession MN908947). Viruses were also isolated from patients to understand the genomic characteristics and mechanism of the viral infection. The analysis revealed that SARS-CoV-2 shares 79% sequence identity with SARS-CoV. In one study, SARS-CoV-2 was found to be closely related to two bat-derived Severe Acute Respiratory Syndrome (SARS)-like coronaviruses, with 87.5% and 87.6% shared identity [1]. In another study, SARS-CoV-2 was 96% identical at the whole-genome level to a bat coronavirus [2].
Despite the high sequence identity between SARS-CoV-2 and SARS-CoV in the open reading frame regions, the envelope spike protein (S-protein) [3], which mediates the infection of SARS-CoV via the human host protein ACE-2, has only about 80% shared sequence identity between SARS-CoV and SARS-CoV-2 [1]. Within the S-protein, the receptor docking domain shows greater divergence, with four of the five critical ACE-2-interacting amino acid residues replaced in SARS-CoV-2. However, structural modeling indicated that the four residues in SARS-CoV-2 retain a structural conformation similar to that of SARS-CoV, and that the SARS-CoV-2 S-protein should be able to bind ACE-2 with reasonable affinity [4]. Indeed, studies by Zhou et al. using cells expressing human ACE-2 confirmed that SARS-CoV-2 could infect cells via the same ACE-2 receptor as SARS-CoV [2]. Thus, one option to treat the infection is to search for an inhibitor that can prevent the interaction of the SARS-CoV-2 S-protein with human ACE-2. The availability of the genome sequence of SARS-CoV-2 allows us to establish structural models for the S-protein [4].
The RNA of coronaviruses encodes polyproteins that are processed by viral proteases to yield mature proteins. The same mechanism is shared by picornaviruses and retroviruses. Patients treated with protease inhibitors appeared to have much better clinical outcomes than those who did not receive the inhibitors (SARS death: 2.4% vs. 28.8%) [5]. Molecular dynamics simulations have revealed that, by docking to the active site of the main protease 3CL of SARS-CoV, both lopinavir and ritonavir could induce conformational changes and potentially interfere with infection by the SARS virus [6]. We expect the same to apply to SARS-CoV-2. The crystal structure of the SARS-CoV-2 protease (3CLpro) was recently reported by Liu et al. [7]. Thus, another option to treat the SARS-CoV-2 infection is to search for inhibitors of the SARS-CoV-2 3CLpro.
With these models and crystal data, we performed in silico studies of potential inhibitors of the SARS-CoV-2 S-protein and 3CLpro.
2. Computational Methods
All calculations were run on Dell PowerEdge C6220 servers. The chemical structures were prepared with AutoDockTools-1.5.6 [8], Chimera 1.14 [9], and Avogadro [10]. The docking studies were performed with Autodock 4.2.6 (Autodock4 and AutoDockTools4) [11] and Autodock Vina 1.1.2 [12].
2.1. Preparation of Receptor and Ligands
The three-dimensional crystal structure of the 3CL protease was retrieved from the Protein Data Bank (PDB ID: 6LU7) and, after cleaning with Chimera, was used as the receptor for molecular docking. The ligands screened, i.e., FDA-approved drugs (2454 structures in total), were retrieved from BindingDB (https://www.bindingdb.org), and their structures were further optimized with Avogadro. The force field applied for geometry optimization was MMFF94.
The SARS-CoV-2/ACE-2 structure was built using the comparative modeling interface of Chimera to Modeller (version 9.23) [13]. For the preparation of the SARS-CoV-2/ACE-2 structure, the target sequence was taken from Zhang et al.’s work, and the SARS-CoV/ACE-2 complex (PDB ID: 6ACD) served as the template, as it was also the top candidate from the Basic Local Alignment Search Tool (BLAST) results. Because SARS-CoV and SARS-CoV-2 have an 88% similarity, the 3D structure can be predicted with high accuracy. Next, the sequences were aligned using SARS-CoV as the template. The model was then built, followed by loop refinement, side-chain optimization, and model optimization. Once the homology model was generated, it was validated with the WHATCHECK/PROCHECK program [14] for basic parameters such as torsion angles, rotational angles, and bond lengths. Finally, this model was used as the receptor for docking. The loop refinement and side-chain optimization were performed in Chimera 1.14 on the selected active region, with all parameters left at the default values of that version.
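The template choice above rests on how closely the two sequences match. As a rough illustration only, the sketch below computes percent identity over the ungapped columns of a pairwise alignment. The function name and the toy fragments are hypothetical, and the 88% figure quoted above is a similarity value, which may have been computed differently (for example with a substitution matrix).

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not the real S-protein sequences).
template_fragment = "NLYQ-AGSTPC"
target_fragment = "NFYQSAGSKPC"
print(f"{percent_identity(template_fragment, target_fragment):.1f}% identity")
```

Other conventions divide by the full alignment length or by the shorter sequence length, so the denominator should be stated whenever an identity value is reported.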
It is noteworthy that this computational work was performed before the crystal structures of the SARS-CoV-2 S-protein were released (6LZG, 6VW1, etc.). After the crystal structures were released, they were compared with our model, and the structures overlaid well (Figure 1), with 93.22% of the residues in the allowed region and a minor difference in the top right loop. Because that loop is not a site that interacts with ACE-2, the calculations were not repeated with the new crystal structures.
2.2. Molecular Docking with Autodock Vina
For the SARS-CoV-2 3CL inhibition calculation, the input files for Autodock Vina were prepared from the receptor’s original file (PDB format) and the ligand files (SDF format) using AutoDockTools-1.5.6. After minimization, the grid box was set to 22.00 Å × 22.00 Å × 22.00 Å along the x, y, and z axes, respectively. The docking site was defined at 1.00 Å when using Autodock Vina. The grid box was placed on the docking site at the H41, C145, and E166 regions, following the docking site of the coronavirus main proteinase (3CL) of Severe Acute Respiratory Syndrome (SARS). The receptor file (PDBQT format, for docking purposes) was then prepared by adding polar hydrogen atoms only, removing all water molecules, and calculating the Gasteiger charges. The docking runs were launched from the command prompt. The docking output file includes the docking energy (in kcal/mol, an indication of the binding affinity of a specific ligand for the receptor molecule) and the interactions of the ligand with the receptor (hydrogen bonds, pi-pi stacking, etc.).
For the SARS-CoV-2 S-protein inhibition calculation, the PDB files of the SARS-CoV-2 S-protein were generated by homology modeling in Chimera, using the SARS-CoV S-protein as the template. After minimization, the input file was prepared using AutoDockTools-1.5.6. The grid box, a rectangular region covering all the possible docking sites of the SARS-CoV-2 S-protein with its receptor ACE-2, was set to 22.00 Å × 42.00 Å × 22.00 Å along the x, y, and z axes, respectively. The docking site was defined at 1.00 Å when using Autodock Vina. The receptor file (PDBQT format, for docking purposes) was then prepared by adding polar hydrogen atoms only, removing all water molecules, and calculating the Gasteiger charges.
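The grid-box setup described in this section can be sketched as follows. This is an illustration under stated assumptions, not the script used in this work: it assumes the box is centered on the centroid of the H41, C145, and E166 atoms in chain A of the cleaned 6LU7 file, and the helper names and file names are hypothetical. The keys written to the configuration text (`receptor`, `ligand`, `center_x`, `size_x`, `out`, and so on) are standard Autodock Vina configuration options.

```python
from typing import Iterable, Set, Tuple


def residue_centroid(pdb_lines: Iterable[str], residues: Set[int],
                     chain: str = "A") -> Tuple[float, float, float]:
    """Mean x, y, z of all ATOM records in the given residues of one chain."""
    xs, ys, zs = [], [], []
    for line in pdb_lines:
        if not line.startswith("ATOM") or len(line) < 54:
            continue
        # Fixed-column PDB fields: chain ID (col 22), residue number (cols 23-26).
        if line[21] != chain or int(line[22:26]) not in residues:
            continue
        xs.append(float(line[30:38]))
        ys.append(float(line[38:46]))
        zs.append(float(line[46:54]))
    if not xs:
        raise ValueError("no atoms found for the requested residues")
    n = len(xs)
    return sum(xs) / n, sum(ys) / n, sum(zs) / n


def vina_config(receptor: str, ligand: str,
                center: Tuple[float, float, float],
                size: Tuple[float, float, float], out: str) -> str:
    """Build the text of an Autodock Vina configuration file."""
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    for axis, value in zip("xyz", center):
        lines.append(f"center_{axis} = {value:.3f}")
    for axis, value in zip("xyz", size):
        lines.append(f"size_{axis} = {value:.3f}")
    lines.append(f"out = {out}")
    return "\n".join(lines) + "\n"


# Placeholder center; in practice it would come from residue_centroid()
# applied to the cleaned receptor file with residues {41, 145, 166}.
print(vina_config("receptor.pdbqt", "ligand.pdbqt",
                  center=(0.0, 0.0, 0.0), size=(22.0, 22.0, 22.0),
                  out="ligand_out.pdbqt"))
```

The resulting text can be saved to a file and passed to Vina with `--config`. For the S-protein run, the same helper would be called with a size of 22.0, 42.0, and 22.0 Å.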
2.3. Analyzing the Docking Results with Chimera and BioLuminate
The docking results were ranked from high to low affinity across the different modes according to the docking scores (docking energy, kcal/mol). The ligands with the most negative docking scores, i.e., the highest affinities, were selected for the visualization of the docked complexes using Chimera [9].
The docking energies of the SARS-CoV-2 S-protein and human ACE-2 were calculated using BioLuminate [15] and then compared with the docking energies of the SARS-CoV S-protein and human ACE-2. To verify whether these ligands can be used for blocking the interaction of the S-protein with human ACE-2, the docking energies of the SARS-CoV-2 S-protein/ligands and human ACE-2 were also calculated. The solvation model used was VSGB [18], and the force field chosen was OPLS_2005 [19] for all the docking energy predictions.
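The ranking step described at the start of this section could look like the sketch below. It assumes each ligand has an Autodock Vina output file in PDBQT format, in which every docked mode carries a `REMARK VINA RESULT:` line whose first number is the docking energy in kcal/mol. The function names are hypothetical, and the ligand names and scores in the demo are invented placeholders rather than results from this study.

```python
from typing import Dict, List, Tuple


def best_vina_score(pdbqt_text: str) -> float:
    """Most negative docking energy (kcal/mol) among the modes in one Vina output."""
    scores = [
        float(line.split()[3])
        for line in pdbqt_text.splitlines()
        if line.startswith("REMARK VINA RESULT:")
    ]
    if not scores:
        raise ValueError("no 'REMARK VINA RESULT:' lines found")
    return min(scores)


def rank_ligands(outputs: Dict[str, str],
                 cutoff: float = -8.5) -> List[Tuple[str, float]]:
    """Sort ligands from most to least negative score; keep those below the cutoff."""
    ranked = sorted(
        ((name, best_vina_score(text)) for name, text in outputs.items()),
        key=lambda pair: pair[1],
    )
    return [(name, score) for name, score in ranked if score < cutoff]


# Placeholder outputs with invented scores, for illustration only.
demo_outputs = {
    "ligand_A": "MODEL 1\nREMARK VINA RESULT:      -7.9      0.000      0.000\n",
    "ligand_B": "MODEL 1\nREMARK VINA RESULT:      -9.3      0.000      0.000\n",
    "ligand_C": "MODEL 1\nREMARK VINA RESULT:      -8.6      0.000      0.000\n",
}
for name, score in rank_ligands(demo_outputs):
    print(f"{name}: {score:.1f} kcal/mol")
```

The default cutoff of −8.5 kcal/mol mirrors the threshold quoted in the text; it is a parameter so that other thresholds can be explored.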
Disulfiram, lopinavir, and ritonavir are the three approved protease inhibitors that are active against SARS and MERS. Indeed, lopinavir and ritonavir were successfully used to treat a patient in Thailand in January 2020. Our results show that, among these ligands, saquinavir, tadalafil, rivaroxaban, sildenafil, dasatinib, vardenafil, and montelukast are the most promising because their docking scores are more negative than those of the others (<−8.5 kcal/mol, which corresponds to an IC50 of <1 μM). All of these scores appear better than that of the antiviral drug lopinavir (−8.2 kcal/mol). For comparison, the docking score reported for lopinavir with the viral RNA polymerase is −8.3 kcal/mol [18]. Remarkably, some SARS-CoV-2 inhibitors such as indinavir could not block the interaction of the SARS-CoV-2 S-protein with ACE-2, whereas other inhibitors, such as ergotamine and amphotericin B, can effectively inhibit this interaction. This is somewhat puzzling, since all three compounds dock on the same site, marked by the red circles in Figure 3: the groove between an extended insertion that contains the β5/β6 strands and the receptor-binding motif (RBM) loop [28]. To understand what caused the significant difference, we overlaid the structures of the three docked compounds and ACE-2 on the SARS-CoV-2 spike protein in Figure 4. The comparison clearly shows that ergotamine (red) and amphotericin B (blue) extend further out toward ACE-2 and thus effectively block the interaction of the SARS-CoV-2 spike protein with ACE-2, while indinavir (green) clings to the SARS-CoV-2 spike protein, leaving room for ACE-2 to interact with the spike protein.
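The correspondence quoted above between a docking score of −8.5 kcal/mol and sub-micromolar potency can be checked with the standard relation ΔG = RT ln Kd. The sketch below is only an order-of-magnitude estimate under several assumptions that are not stated in the text: the docking score is treated as a binding free energy, the temperature is taken as 298.15 K, and Kd is taken as a stand-in for IC50 (the two differ in general). The function name is hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def dissociation_constant(delta_g_kcal_per_mol: float,
                          temperature_k: float = 298.15) -> float:
    """Kd in mol/L from a binding free energy, using delta_G = R*T*ln(Kd)."""
    return math.exp(delta_g_kcal_per_mol / (R_KCAL_PER_MOL_K * temperature_k))


kd = dissociation_constant(-8.5)
print(f"Kd at -8.5 kcal/mol: {kd * 1e6:.2f} uM")
```

Because Kd depends exponentially on the energy, each additional −1.4 kcal/mol lowers the estimate by roughly a factor of ten at room temperature, so small errors in the docking score translate into large errors in the predicted potency.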
For the inhibition of the SARS-CoV-2 3CL protease, saquinavir, tadalafil, rivaroxaban, sildenafil, dasatinib, vardenafil, and montelukast are the most promising because of their strongly negative docking scores (<−8.5 kcal/mol), which were more negative than those of the other ligands.
Among the compounds in Table 3 that showed an excellent ability to block the interaction of the SARS-CoV-2 S-protein with ACE-2, ergotamine, amphotericin B, and vancomycin are the most promising, since they are also among the strongest binders to the SARS-CoV-2 S-protein, as shown in Table 2.
To achieve greater activity, a combination of 3CL protease inhibitors and ergotamine may be considered.