Article

Evolution and Classification of tfd Gene Clusters Mediating Bacterial Degradation of 2,4-Dichlorophenoxyacetic Acid (2,4-D)

Ufa Institute of Biology, Ufa Federal Research Centre, Russian Academy of Sciences, Prospekt Oktyabrya, 69, 450054 Ufa, Russia
Int. J. Mol. Sci. 2023, 24(18), 14370; https://0-doi-org.brum.beds.ac.uk/10.3390/ijms241814370
Submission received: 30 July 2023 / Revised: 11 September 2023 / Accepted: 16 September 2023 / Published: 21 September 2023
(This article belongs to the Section Molecular Microbiology)

Abstract

The tfd clusters (tfdI and tfdII) are gene clusters originally discovered in plasmid pJP4 that are involved in the bacterial degradation of 2,4-dichlorophenoxyacetic acid (2,4-D) via the ortho-cleavage pathway of chlorinated catechols. They share this activity, with respect to substituted catechols, with clusters tcb and clc. Although great effort has been devoted over nearly forty years to exploring the structural diversity of these clusters, their evolution has been poorly resolved to date, and their classification is clearly obsolete. Comparative genomic and phylogenetic approaches revealed that all tfd clusters can be classified as one of four different types. The following four-type classification and new nomenclature are proposed: tfdI, tfdII, tfdIII and tfdIV(A,B,C). Horizontal gene transfer between Burkholderiales and Sphingomonadales provides phenomenal linkage between tfdI, tfdII, tfdIII and tfdIV type clusters and their mosaic nature. It is hypothesized that the evolution of tfd gene clusters proceeded within first (tcb, clc and tfdI), second (tfdII and tfdIII) and third (tfdIV(A,B,C)) evolutionary lineages, in each of which the genes were clustered in specific combinations. Their clustering is discussed through the prism of hot spots and driving forces of various models, theories, and hypotheses of cluster and operon formation. Two hypotheses about series of gene deletions and displacements are also proposed to explain the structural variations across members of clusters tfdII and tfdIII, respectively. Taken together, these findings reconstruct the phylogeny of tfd clusters, delineate their evolutionary trajectories, and allow the contribution of various evolutionary processes to be assessed.

1. Introduction

Currently, tfd gene clusters are model objects for studying the microbial acquisition of xenobiotic degradation capacity. They encode enzymes involved in the degradation of 2,4-dichlorophenoxyacetic acid (2,4-D), which poses human health and ecological risks [1,2] but is still used worldwide as an herbicide in agriculture [3]. Continued exploration of microbial 2,4-D degradation in various environments in China [4,5], Brazil [6], Vietnam [7,8], Russia [9] and Japan [10] suggests that this problem is still in the spotlight.
Conjugative plasmids containing genes of the 2,4-D and 4-chloro-2-methylphenoxyacetic acid (MCPA) degradation pathways were previously described under the name “TFD” (TFD plasmids) [11]. At present, the abbreviation tfd designates two clusters of genes involved in the degradation of 2,4-D, which are located on the pJP4 plasmid of strain Cupriavidus pinatubonensis JMP134 (previously identified as Ralstonia eutropha, Alcaligenes eutrophus, Wautersia eutropha and Cupriavidus necator). Each of these clusters, tfdBIFIEIDICIT (designated as tfd-I/tfdI) and tfdKBIIFIIEIICIIDIIR (designated as tfd-II/tfdII), encodes a core set of genes for the ortho-cleavage pathway of chlorocatechol. In the case of cluster tfdII, this core set of genes is extended by the tfdA and tfdK genes [12,13,14,15,16,17,18].
The reactions and genes controlling 2,4-D degradation are as follows: acetate side chain cleavage of 2,4-D by α-ketoglutarate-dependent 2,4-D dioxygenase (tfdA) into 2,4-dichlorophenol (2,4-DCP); hydroxylation reaction of 2,4-DCP by 2,4-DCP hydroxylase (tfdB) into 3,5-dichlorocatechol (3,5-DCC); ortho-cleavage of 3,5-dichlorocatechol by chlorocatechol 1,2-dioxygenase (tfdC) to form 2,4-dichloro-cis,cis-muconate (2,4-dichloromuconic acid); conversion of 2,4-dichloro-cis,cis-muconate to 2-chlorodienelactone catalyzed by chloromuconate cycloisomerase (tfdD); conversion of 2-chlorodienelactone to 2-chloromaleylacetate by chlorodienelactone hydrolase (tfdE); and conversion of 2-chloromaleylacetate to 3-oxoadipate via maleylacetate by chloromaleylacetate reductase and maleylacetate reductase (tfdF), respectively, which is further funneled to the tricarboxylic acid cycle (TCA) [19]. Functionally corresponding proteins of both clusters are complementary, since they have different efficiencies in terms of the performed reactions [20,21,22].
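The reaction sequence described above can be written down as an ordered table, which makes the substrate-to-product chaining easy to check. The following Python sketch is illustrative only: `PATHWAY` and `trace` are hypothetical names introduced here, and the two reductase reactions attributed to tfdF (via maleylacetate) are collapsed into a single step for brevity.

```python
# Ordered steps of 2,4-D degradation as described in the text.
# Each entry: (gene, enzyme, substrate, product).
PATHWAY = [
    ("tfdA", "alpha-ketoglutarate-dependent 2,4-D dioxygenase",
     "2,4-D", "2,4-dichlorophenol"),
    ("tfdB", "2,4-DCP hydroxylase",
     "2,4-dichlorophenol", "3,5-dichlorocatechol"),
    ("tfdC", "chlorocatechol 1,2-dioxygenase",
     "3,5-dichlorocatechol", "2,4-dichloro-cis,cis-muconate"),
    ("tfdD", "chloromuconate cycloisomerase",
     "2,4-dichloro-cis,cis-muconate", "2-chlorodienelactone"),
    ("tfdE", "chlorodienelactone hydrolase",
     "2-chlorodienelactone", "2-chloromaleylacetate"),
    # Simplification (assumption of this sketch): chloromaleylacetate
    # reductase and maleylacetate reductase are merged into one tfdF step.
    ("tfdF", "(chloro)maleylacetate reductase",
     "2-chloromaleylacetate", "3-oxoadipate"),
]


def trace(start):
    """Follow the pathway from `start`; return (genes used in order, end product)."""
    genes, current = [], start
    for gene, _enzyme, substrate, product in PATHWAY:
        if substrate == current:
            genes.append(gene)
            current = product
    return genes, current


genes, end_product = trace("2,4-D")
print(" -> ".join(genes), "=>", end_product)
```

Starting the trace at 3,5-dichlorocatechol instead returns only the tfdC to tfdF steps, which corresponds to the chlorocatechol ortho-cleavage core that tfd shares with tcb and clc.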
The ortho-cleavage pathway of chlorinated catechols is a catabolic function shared by the tfd gene clusters on the one hand and the tcb and clc gene clusters on the other. The tcb gene clusters are located on catabolic plasmid pP51, which contains two gene clusters, tcbAB and tcbRCDEF. The tcbAB cluster controls the degradation of chlorinated benzenes to chlorinated catechols, while tcbRCDEF is directly responsible for the degradation of chlorinated catechols [23,24]. The clc gene cluster (chlorocatechols, Clc) of plasmid pAC27, with a clcRABDE architecture, controls the degradation of 3-chlorobenzoate (3-CB) via the ortho-cleavage pathway of chlorinated catechols [25,26,27]. It has been shown that functionally related proteins of the tfd, tcb and clc gene clusters have high levels of similarity and identity between their amino acid and nucleotide sequences. Additionally, all clusters have shown strong DNA homology and similar organization [28,29,30,31]. Schlömann and colleagues [32] noted that, phylogenetically, the tfd-I, tcb and clc gene clusters diverged from a common ancestral pathway of chlorocatechols. However, further phylogenetic analyses have indicated independent recruitment of diverse genes during assembly of the tfd gene cluster [33] and the probable independent evolution of the tfdA gene [34].
In the past three decades, comparative genomic explorations of tfd gene clusters have focused primarily on matching them to the canonical gene structures of the tfdI and tfdII gene clusters. Many canonical and non-canonical clusters have been identified, amongst them: tfdT-CDEF [35], tfdDRCEBKAF [36], tfdCDEF and tfdCIIEIIBKA (Delftia acidovorans P4a), tfdRCEBKA [37], tfdC2E2F2 and tfdDRFCE [38], tfdAKBECR [39], tfdRDCEFBKA [40], tfdBCDEFKR [41], tfdFAKBIEICIDIRI and tfdCIIEIIBII [42], tfdFEDCT [43], tfdBIFIEIDICIT and variations of the tfdKBIIFEIICIIDIIR cluster with deletions [44], tfdBFET, tfdAKBFECDR and tfdBFEDC [5], tfdBFEDCS, tfdFE(tctC)DCS [7], tfdBaFRDEC [10], tfdCDEF [45], tfdEICIFIRDI,K,BI, tfdDII,FIIEIICII,BII [46] and other cluster structures. Amongst the clusters related to tcb and clc, the following have been identified: tetRtetCDEF [47], mocpDCBAR [39], cbnR-ABCD [48,49], clcRABCD [50], clcRABCDE [51], clcr1a1b1d1e1 [52], dccEDCBAR [53] and others. Some of the tfd, tcb and clc gene clusters have been sequenced but not annotated [54,55,56,57]. While the studies referred to above have provided knowledge on a great variety of gene cluster structures controlling the chlorocatechol ortho-cleavage pathway (belonging to the tfd, tcb and clc gene clusters), their phylogeny and evolution are still poorly resolved, their classification is obsolete, and the genes are perhaps still misannotated.
Comparative genomic and phylogenetic approaches were applied to fill these knowledge gaps. The findings allowed the proposal of a new classification and nomenclature of these gene clusters, into tfdI, tfdII, tfdIII and tfdIV(A,B,C) types. Additionally, the findings indicated that tfd are a unique family of mosaic clusters whose clustering occurred in several evolutionary lineages with active recruitment of ancestral genes from Burkholderiales and Sphingomonadales through horizontal gene transfer. Possible paths for the clustering of tfd genes within each lineage in light of the various models of operon formation were discussed. These results constitute a further contribution to the understanding of bacterial genome organization and will be beneficial for the correct annotation of tfd clusters, as well as further studies of their diversity, propagation, and evolution.

2. Results

The tfd, tcb and clc gene clusters were identified from publicly available genomes and plasmids by: (i) using canonical gene clusters of the plasmids pJP4 (AY365053), pP51 (M57629) and pAC27 (M16964) as NCBI BLAST query sequences; and (ii) analyzing published articles. Clusters retrieved in the search of 1 October 2023 that matched two or more genes were included in the analysis. In total, these search methods identified eighty-five tfd, tcb and clc gene clusters with different genetic structures.
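The inclusion rule stated above (two or more matching genes) amounts to a simple threshold filter. The sketch below assumes the BLAST results have already been summarized as a mapping from candidate region to the set of query genes it matched; the candidate names and gene sets are placeholders, not results from the study, and `select_clusters` is a hypothetical helper.

```python
# Hypothetical BLAST summary: candidate region -> set of query genes matched.
# Names and gene sets are placeholders for illustration only.
hits = {
    "candidate_1": {"tfdC", "tfdD", "tfdE", "tfdF"},
    "candidate_2": {"tfdA"},
    "candidate_3": {"tcbC", "tcbD"},
    "candidate_4": set(),
}

MIN_MATCHED_GENES = 2  # threshold stated in the text


def select_clusters(hits, min_genes=MIN_MATCHED_GENES):
    """Keep candidates whose number of matched genes meets the threshold."""
    return sorted(name for name, genes in hits.items() if len(genes) >= min_genes)


print(select_clusters(hits))
```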

2.1. Comparative Genomics of tfd, tcb and clc Gene Clusters

2.1.1. The tfdI Gene Cluster

A blastn search revealed seventeen gene clusters (excluding canonical tfdI) which had an over 90% match with the tfdBIFIEIDICIT structure of the tfdI gene cluster. They were designated as tfdT1T2CD (unpublished), tfdCDEF or tfdT-CDEF [35,45], tfdCDEF [58], tfdFEDCT [43], tfdBIFIEIDICIT [44], tfdBFEDC, tfdBFET [5], tfdBFEDCS [7] or not annotated (CP016005, CP026110) (Table S1). Fourteen of those seventeen tfdI gene clusters were located on plasmids in species belonging to the Burkholderiaceae family. Amongst these plasmids were pJP4 (AY365053), plasmid5 (= unnamed5 = plas5) (CP038640), a number of pPO line plasmid contigs (pPO1, 2, 3, 4, 7, 10, 16, 26, 27), pEMT1 (CP026110), pNK84 (AB050198) and pkk4 (CP016005). The remaining three clusters were identified in the genomes of Bordetella petrii BT1 9.2 (WMBV01000066), an uncultured bacterium (AB478351) and chromosome 1 of Paraburkholderia phytofirmans OLGA172 (CP014578). The two identified species belonged to the Alcaligenaceae and Burkholderiaceae families of the order Burkholderiales, respectively (Table S1).
The genetic structure of each tfdI gene cluster was determined and is represented in Figure 1. Although the genetic architecture tfdBIFIEIDICIT is well conserved in most members of the tfdI type, subsequent comparative analyses have suggested that another five clusters had an incomplete arrangement with deletions of the tfdBI or tfdT genes. Amongst them were the tfdFIEIDICIT clusters of the plasmid pNK84 (AP024328) and P. phytofirmans OLGA172 (Burkholderia sp. R172 = Burkholderia sp. OLGA172) (CP014578). It should be noted that an incomplete cluster with a tfdDICITT genetic architecture was found earlier in P. phytofirmans OLGA172. A tfdFIEIDICI type structure was found in an uncultivated bacterium (AB478351). It is also interesting to note that, in addition to the complete tfdI gene cluster of plasmid5, an incomplete cluster with the tfdBIFIEIDI structure, fully identical at the DNA sequence level, was found almost immediately downstream on the opposite strand of the plasmid (Figure 1). Analysis of the DNA sequence identity of the aligned complete clusters of this type showed that it varied between 92.7% and 100%. The pkk4 plasmid cluster had the lowest identity to the others, in the range of 92.7–96.8%. An interesting feature to note is that the tfdT gene was truncated both in the pJP4 plasmid and in the vast majority of pPO line plasmids (with the exception of pPO4). Thus, the vast majority of tfdI type clusters are represented by a complete set of genes without genetic rearrangements.
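For readers unfamiliar with how a percent-identity figure such as those above is obtained from an alignment, here is a minimal sketch. It is not the authors' tool or exact convention: the function `percent_identity` is hypothetical, and the treatment of gaps (columns gapped in both sequences are skipped, a gap against a base counts as a mismatch) is an assumption, since different programs handle gaps differently.

```python
def percent_identity(a, b):
    """Percent of aligned columns carrying identical residues.

    Assumed convention: columns where both sequences have a gap are
    skipped; a gap opposite a residue counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy aligned fragments (not real tfd sequences): one mismatch in ten columns.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

Applying such a function to every pair of aligned clusters and taking the minimum and maximum gives a range of the kind reported in the text.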

2.1.2. The tfdII Gene Cluster

A blastn search revealed thirteen clusters which had a genetic architecture closely related to the tfdII seven-gene cluster of the pJP4 plasmid, namely, tfdIIKBFECDR. All the clusters had plasmid localization in bacterial hosts belonging to the Burkholderiaceae family. Four of them had a tfdIIAKBFECDR eight-gene structure (plasmids pDB1 (JQ436721), p712 (JQ436722), pEMT3 (JX469827) and plasmid5 (CP038640)), with the tfdAII gene located immediately downstream of the tfdKII gene in a cluster, in contrast with the prototype plasmid pJP4. Thus, they had a complete set of genes in their structures (Figure 1). It should be noted that the tfdAII gene was absent from the tfdII cluster in the pJP4 plasmid. It was located on the opposite strand downstream from the tfdRII gene of the cluster, together with the tfdSII gene, and was separated from them by the open reading frames ORF31 and ORF32. However, unlike pJP4, three of the four eight-gene plasmids (pDB1, p712, and pEMT3) did not have the tfdI gene cluster, whereas plasmid5 did contain it.
It should be pointed out that the tfdIISA clusters were also identified on plasmid5, pkk4 (CP016005), pPO3 (CCJJ010000004), pPO4 (CCJK010000002), pPO10 (CCJM010000001) and pEMT1, and these shared almost 100% identity at the nucleotide level. Separately, it is worth noting the variety of structures of the tfdII type clusters on contigs for the pPO line plasmids (pPO1, 2, 3, 7, 10, 16, 26, 27). Of these, only pPO26 had the pJP4-like seven-gene cluster with the tfdIIKBFECDR structure, while the others had gene structures of the form tfdIIFECDR (pPO2, 7, 10, 16, 27), tfdIIKFECD (pPO1) and tfdIIFECD (pPO3). A number of contigs had tfdIISA mini-clusters or tfdAII and tfdSII genes. Thus, twenty clusters with architectures related to tfdIIKBFECDR and tfdIISA were identified. The six plasmids, namely, pPO16, pJP4, p712, pEMT3, pDB1 and plasmid5, possessed a core set of genes tfdIIKBFECDR which shared identity values at the nucleotide level ranging from 87.9% to 100%.
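The reduced pPO structures listed above can be compared against the canonical seven-gene order programmatically. The sketch below encodes each cluster as a string with one letter per gene, taken from the structures named in the text (pPO2 stands in for the whole tfdIIFECDR group); the helper names `missing_genes` and `keeps_order` are hypothetical.

```python
CANONICAL_TFDII = "KBFECDR"  # pJP4 seven-gene order, one letter per gene

# Gene orders as given in the text; pPO2 represents the tfdIIFECDR group.
variants = {
    "pPO26": "KBFECDR",
    "pPO2": "FECDR",
    "pPO1": "KFECD",
    "pPO3": "FECD",
}


def missing_genes(variant, canonical=CANONICAL_TFDII):
    """Genes of the canonical cluster that are absent from the variant."""
    return [g for g in canonical if g not in variant]


def keeps_order(variant, canonical=CANONICAL_TFDII):
    """True if the variant is an in-order subsequence of the canonical cluster,
    i.e. it can be explained by deletions alone, without rearrangement."""
    remaining = iter(canonical)
    return all(g in remaining for g in variant)


for name, order in variants.items():
    print(name, order, "missing:", missing_genes(order), "collinear:", keeps_order(order))
```

In this encoding every pPO variant is collinear with the canonical order, which is consistent with reading them as deletion derivatives rather than rearranged clusters.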

2.1.3. The tfdIII Gene Cluster

A large group of eleven tfdIII type clusters sharing a common gene structure of different lengths was identified by blastn search (details of the new classification and nomenclature are described in Sections 3 and 4). Some were annotated without specifying the type of tfd cluster: tfdDRCEBKAF [36], tfdRCDEF [59], tfdRCEBKA [37], and tfdAKBECR [39]. Others were annotated as tfdI and tfdII type clusters (tfdRCIIEIIBKA [59], tfdFAKBIEICIDIRI, and tfdCIIEIIBII [42]) or were unpublished (plasmid pkk4) or not annotated [60] (Table S1). The more complete tfdIIIFAKBECDR eight-gene structures were encoded only on mega-plasmids pM7012 (AB853026) and pkk4. They were fully identical to each other and shared a 67.7–68.2% identity with the complete clusters tfdIIAKBFECDR for the tfdII type (plasmids p712, pEMT3, pDB1 and plasmid5) at the nucleotide level. In both plasmids, the tfdFIII gene was located downstream of the tfdAIII gene. Another feature was the presence of incomplete versions of this type of cluster with structures tfdBECRR and tfdBECR immediately downstream of the tfdRIII gene on the opposite DNA strand. In these reduced clusters, the tfdDIII gene was completely deleted, while tfdRIII was partially reduced and, in pkk4, was represented by two copies. The plasmid regions that included both clusters on pM7012 and pkk4 were almost identical at the nucleotide level. The clusters with structures tfdIIIBECRR and tfdIIIBECR both shared a 77.6% identity with both tfdIIIFAKBECDR clusters. Importantly, in the mega-plasmid pkk4, the tfdI type cluster was located upstream from tfdIIIFAKBECDR, as described below (Figure 1).
The clusters with structures tfdIIIF,AKBECR were identified in plasmids pIJB1 (JX847411) and pEST4011 (AY540995). In them, tfdFIII genes were also located downstream of the tfdAIII gene, as in the mega-plasmids pkk4 and pM7012 described above, but there were two small ORFs between them. Additionally, in contrast to pM7012 and pkk4, these clusters lacked a tfdDIII gene and short clusters tfdIIIBECR/BECRR. Both plasmids pAKD25 (JN106170) and pAKD26 (JN106171) from the same bacterial host [39] had clusters with a tfdIIIAKBECR gene structure without tfdFIII and tfdDIII genes. The shortest structures of that type were identified in plasmids pRK1-5 (CP062809), pTV1 (AB028643) and Delftia acidovorans P4a (AY078159). They had clusters with tfdIIIKBECR, tfdIIIAKBE and tfdIIIBECR gene structures, respectively. Interestingly, the core set of genes of this tfdIII cluster type (tfdIIIBECR), with the exception of D. acidovorans P4a, varied in terms of nucleotide identity by between 77.3% and 100%. An important distinguishing feature of this group of clusters, in addition to a similar structure, was the presence of tcb gene clusters adjacent to the tfdIII gene clusters. In total, nine clusters were identified, of which three had a full set of tcbRCDEF genes with ORF3, five had the structure tcbDORF3 and one had tcbRC. The plasmids pkk4 and pM7012 were exceptions as they did not possess any tcb gene clusters. The complete tcb clusters were identified on opposite strands almost immediately downstream of the tfdAIII genes of D. acidovorans P4a and plasmid pAKD26, as well as the tfdBIII gene of plasmid pRK1-5. Moreover, in D. acidovorans P4a, the tcbRC cluster was identified immediately upstream of gene tfdEIII. This was flanked by an ORF encoding a transcriptional regulator. The other plasmids pEST4011, pIJB1, pAKD25, and pTV1 possessed a tcbDORF3 cluster. Interestingly, pEST4011 possessed two copies of that cluster.
The detailed synteny of the regions mentioned above is depicted in Figure 1.
Moreover, the two-gene clusters tcbDORF3 identified in a number of plasmids (pEST4011, pIJB1, pAKD25 and pTV1) were previously annotated as tfdD [61], tfdDI [36], tfdD and orf5 [37], and tfdD and ORF [39]. The analysis also revealed the presence of a unique two-gene tcbCR gene cluster adjacent to tfdIIIAKBE of D. acidovorans P4a, previously annotated as tfdRCII [59].

2.1.4. The tfdIV Gene Clusters from Sphingomonas and Bradyrhizobium

This group combines non-traditional clusters that have variable gene structures and significantly differ from the canonical tfd gene clusters of the I and II types. The published clusters were annotated as dccEAIDI, dccAIIDII, dccEAIDI [62], tfdC2E2, tfdC2E2F2, tfdDRFCE, tfdDRF [38,63], cnbCDEF [64], tfdBCDEFKR [41], tfdEC, tfdCE [7], tfdBaFRDEC [10], tfdEICIFIRDI,K,BI and tfdDII,FIIEIICII,BII [46]. Nevertheless, comparative genomic analysis showed that the genetic structure of this tfdIV type cluster could be divided into three main subtypes: A, B and C (Figure 2a) (details of the new classification and nomenclature are described in Sections 3 and 4).
The most common structure, subtype A (tfdIVAECFRD,B), was found in the bacterial strains Sphingobium herbicidovorans MH and Sphingopyxis sp. KK2 and on two plasmids, pDB-1 and pCADAB1. According to Nielsen et al. [63], the strain Sphingomonas sp. TFD44 also possesses tfdIVAECFRD,B, but this sequence was absent in publicly available databases. Therefore, the earlier sequence variant from Sphingomonas sp. TFD44, with a tfdIVAECFRD structure lacking the tfdBIVA gene, which was sequenced by Thiel and colleagues (2005) [38], was used for analysis. An interesting feature of tfdIVAECFRD,B subclusters was the reverse orientation of the tfdRIVA gene compared with the tcb, clc and other tfd clusters. Normally, that gene has the opposite orientation to other genes in the cluster. Moreover, the tfdDIVA gene also had another orientation, opposite to the other cluster genes (with the exception of the pDB-1 plasmid). In the strain Sphingomonas histidinilytica BT1 5.2 (WMBU01000047) and on the plasmid pMSHV (CP020539) of the S. herbicidovorans MH described above, less clustered variations with structures tfdIVARD,B,CE and tfdIVAFRD,B,CE, respectively, were found. It should be noted that there was a high degree of synteny between the majority of the members of this subtype (Figure 2b). The most common tfdIVAECFRD structures of subtype A shared an identity at the nucleotide level ranging from 88.6% to 100%.
The second subtype of this cluster, subtype B (tfdIVBD,FEC/FEC/D,FEC,B), was identified on the plasmids pDB-4 (CP102388) and pHSL1 (CP018222) and in the strains Pseudomonas stutzeri ZWLR2-1 (GU181397) and Sphingomonas sp. tfd44. The last of these had the short variant without the tfdDIVB gene, tfdIVBFEC. The whole set of genes in that subtype was ordered in one direction, in contrast with subtype A (Figure 2a). Interestingly, on pHSL1 and pDB-4, the gene structure tfdIVBFEC was sandwiched into the cluster of genes encoding the three-component Rieske-type [2Fe-2S] dioxygenase (anthranilate 1,2-dioxygenase) of B. cepacia DBO1, designated AntDO-3C and consisting of AndAaAbAcAd genes. The hybrid cluster had the structure tfdDIVBAndAaAbtfdIVBFECAndAdAc. Other strains, P. stutzeri ZWLR2-1 and Sphingomonas sp. tfd44, had shorter structures: tfdDIVBAndAaAbtfdIVBFECAndAd and AndAaAbtfdIVBFECAndAd, respectively. The tfdIVBFEC core set of genes shared 84.0–95.2% identity at the nucleotide level. Thus, that subtype was distributed mostly in the Sphingomonadaceae family, with the exception of P. stutzeri ZWLR2-1.
The third subtype, C, was present on contigs of Bradyrhizobium sp. RD5-C2 (BOVL01000048) and S. histidinilytica BT1 5.2 (WMBU01000019) as gene structures with different degrees of gene assemblage: tfdIVCCEDRF,B and tfdIVCCED,R,F, respectively. The order and orientation of genes in that subtype differed from both subtypes A and B (Figure 2a).
The results indicated that three bacterial strains each possessed two subtypes of tfd-like clusters. Specifically, Sphingomonas sp. TFD44 (tfd44) and Sphingopyxis sp. DBS4 had the A and B subtypes, while S. histidinilytica BT1 5.2 had the A and C subtypes.

2.1.5. The tcb and clc Gene Clusters

A blastn search revealed sixteen sequences with high identity at the nucleotide level to the canonical clcRABDE cluster of plasmid pAC27 (M16964). Among them were gene clusters annotated as clc and others annotated differently: clcr1a1b1d1e1 [52], dccEDCBAR [53] and tfdFE(tctC)DCS [7]. Additionally, some clusters were sequenced by different authors but belonged to the same bacterial strain: Pseudomonas knackmussii B13 [52,61,62,63,64,65,66], Paraburkholderia xenovorans LB400 [67,68], Pseudomonas aeruginosa strain JB2 [69,70,71] and Pandoraea pnomenusa MCB032 [51,72]. As a result, the most fully sequenced gene clusters were taken for further analysis based on the latest data deposited in GenBank (Table S1). From a genetic architecture point of view, all the clusters were highly conserved and had a common set of clcRABDE genes, as well as ORF3 genes in their structure. Within the clc cluster, the nucleotide sequences had a shared identity of between 96.3% and 100%. The plasmid pAC27 (M16964) and Pseudomonas aeruginosa 142 (AF161263) possessed a truncated clcR gene. The ORF3 of strain Diaphorobacter sp. JS3051 was represented by two truncated ORFs. There were no deletions, inversions or duplications of genes in the identified clusters. Figure 3a illustrates the results of a comparative genomic analysis of the identified clusters and flanking ORFs. Moreover, the clusters of strains B. petrii DSM 12804 (AM902716), P. knackmussii B13 (HG322950), and plasmid unnamed 2 of the strain P. pnomenusa MCB032 (CP015373) had a high level of synteny downstream of the clcE gene. Downstream of the clcR gene, a high level of synteny was observed in B. petrii DSM 12804 (AM902716), P. knackmussii B13 (HG322950), P. xenovorans LB400 (CP008760), P. aeruginosa JB2 and Diaphorobacter sp. JS3051 (CP065406). Almost all the gene clusters, with the exception of those on pAC27, paaa and unnamed 2, had non-plasmid localization.
Most of the bacteria carrying these clusters belonged to the families Alcaligenaceae, Burkholderiaceae, and Comamonadaceae of the order Burkholderiales of Betaproteobacteria, with the exception of a few pseudomonads and Escherichia coli JM103 from Gammaproteobacteria.
Additionally, the blastn search indicated that ten tcb clusters had a five-gene RCDEF structure (including the canonical tcb gene cluster of the plasmid pP51) and were mainly plasmid encoded: pP51 (M57629), pA81 (CP002288), pENH91 (CP017760), pRK1-5 (CP062809), pC1-1 (HQ891317), pAKD26 (JN106171). All the clusters shared between 97.0% and 100% identity and were highly conserved in gene structure, with no deletions, inversions or duplications. Four of the identified clusters were not annotated; the other six were designated as tetRtetCDEF [47], tfdRCDEF [59], cbnR-ABCD [35,49], mocpRABCD [73] and mocpDCBAR [39] (Table S1).
Comparative genomic analysis has shown a high level of synteny of flanking ORFs for all the clusters (Figure 3b). Interestingly, only in the genome of Bordetella petrii DSM 12804 was a tcbAB gene cluster identified near RCDEF. Most of the bacteria carrying these clusters belonged to the families Alcaligenaceae, Burkholderiaceae and Comamonadaceae of the order Burkholderiales of Betaproteobacteria, with the exception of Sphingomonas sp. C8-2 and Pseudomonas sp. P51 from Alpha- and Gammaproteobacteria, respectively.
A common feature of both clc and tcb clusters was that the gene encoding maleylacetate reductase (clcE and tcbF, respectively) was flanked immediately downstream by an ORF containing a conserved AraC binding domain. The exceptions were strains P. aeruginosa 142 and Alcaligenes sp. NyZ215 and plasmids pAC27, pAKD26 and pRK1-5. The ORFs across these clusters shared a similarity in the range of 42.5% to 100%. Meanwhile, similarity across the clc and tcb clusters ranged from 97.3% to 100% and from 71.5% to 100%, respectively.

2.1.6. Comparative Genomics for the tfd, tcb and clc Clusters

Structurally, clc, tcb and all types of tfd clusters (excluding tfdIV) shared a common structure: a unidirectional set of genes encoding enzymes for the catabolic reactions, and a regulatory protein gene in the opposite direction. Figure 4 represents the comparative genomic analysis of the clc and tcb gene clusters of B. petrii DSM 12804, the tfdI cluster of plasmid5, the tfdII cluster of pEMT3, and the tfdIII cluster of the pkk4 plasmid. The clc and tcb clusters were closely related structurally. Both possessed an ORF3 open reading frame with an unknown function and an identical order of genes encoding proteins with the same activity. Additionally, the tfdI clusters were closely related to them, especially in the tandem localization of genes encoding chloromuconate cycloisomerases (clcD, tcbD and tfdDI) and chlorocatechol 1,2-dioxygenases (clcA, tcbC and tfdCI). Nevertheless, there were two differences between the clc and tcb clusters and the tfdI cluster, namely, the presence of the tfdBI gene and the absence of ORF3 in the latter. The results of multiple alignments of complete clc, tcb and tfdI clusters revealed that clc and tcb shared between 46.2% and 50.2% identity and between 47.5% and 50.2% identity at the nucleotide level with tfdI, respectively. When comparing complete clc and tcb gene clusters, they were shown to have a similarity in the range of 62.0–63.2%.
Like clc and tcb, the tfdII and tfdIII clusters were structurally closely related to each other, but differed in terms of tfdF gene localization. Both tfdII and tfdIII clusters possessed a reverse order of chloromuconate cycloisomerase (tfdDII and tfdDIII) and chlorocatechol 1,2-dioxygenase (tfdCII and tfdCIII) encoding genes compared with clc, tcb and tfdI. However, localization of the tfdB gene in tfdI, tfdII and tfdIII clusters was identical. The identity between tfdII and tfdIII clusters ranged from 67.7% to 68.2%. At the same time, tfdI clusters shared from 49.0% to 52.2% and from 49.5% to 52.5% identity at the nucleotide level with tfdII and tfdIII, respectively.

2.2. The Phylogenetic Analysis of Deduced Protein Sequences Encoded by the tfd, tcb and clc Gene Clusters

Phylogenetic analysis of the corresponding deduced protein sequences allowed the evolutionary fate of genes in the tfd, tcb and clc clusters, as well as the role of horizontal gene transfer, to be assessed.

2.2.1. α-Ketoglutarate-Dependent 2,4-D Dioxygenase (tfdA)

Within α-ketoglutarate-dependent 2,4-D dioxygenases (TfdA), the ML analysis recovered tfdII as a strongly supported paraphyletic group, with tfdIII falling within that clade (Figure S1a). Interestingly, plasmid5 and pDB1 were recovered with moderate support as a sister clade to the other plasmids in the tfdII cluster.
Comparative analysis of the protein sequences of α-ketoglutarate-dependent 2,4-D dioxygenases of the tfdII and tfdIII clusters showed that similarity between them ranged from 87.2% to 100%. The ranges of similarity between the protein sequences of TfdAs inside each cluster were 93.8–100% and 96.9–100% for tfdII and tfdIII, respectively.
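The distinction between a monophyletic group and a paraphyletic one with another group nested inside it, as reported here for tfdII and tfdIII, can be illustrated on a toy tree. The sketch below is not the tree from Figure S1a: the topology, the leaf labels, and the helper functions are all hypothetical and serve only to show what is being tested.

```python
# Toy rooted tree as nested tuples. Leaf names and topology are made up
# for illustration and do not reproduce the published phylogeny.
TREE = ((("tfdII_a", "tfdII_b"), ("tfdII_c", ("tfdIII_a", "tfdIII_b"))), "outgroup")


def leaves(node):
    """Return the set of leaf names under a node."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def clades(node):
    """Yield the leaf set of every node in the tree, leaves included."""
    yield leaves(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(tree, group):
    """A group is monophyletic if some clade contains exactly its members."""
    group = set(group)
    return any(c == group for c in clades(tree))


def nested_within(tree, inner, outer):
    """True if the smallest clade holding all of `outer` also holds all of `inner`."""
    inner, outer = set(inner), set(outer)
    smallest = min((c for c in clades(tree) if outer <= c), key=len)
    return inner <= smallest


tfdII = {"tfdII_a", "tfdII_b", "tfdII_c"}
tfdIII = {"tfdIII_a", "tfdIII_b"}
print(is_monophyletic(TREE, tfdIII))        # tfdIII forms its own clade
print(is_monophyletic(TREE, tfdII))         # tfdII alone does not
print(nested_within(TREE, tfdIII, tfdII))   # tfdIII sits inside the tfdII clade
```

In this toy topology tfdII is not monophyletic on its own, while tfdIII is monophyletic and nested inside the smallest clade containing tfdII; that is the pattern the text describes as tfdII being paraphyletic with tfdIII falling within it.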

2.2.2. 2,4-D Transport Protein (tfdK)

The ML tree topology recovered the clades of the tfdII and tfdIII clusters as monophyletic with strong support. Each lineage further diverged into two sister clades with good or strong support (Figure S1b). The two clades of the first lineage were formed by the TfdKs of plasmid5 and pDB1 on the one hand, and p712, pEMT3, pJP4, pPO1 and pPO26 on the other. The second lineage diverged into two clades, the first of which was formed by almost all the transport proteins of the tfdIII clusters, and the second of which was formed by pkk4 and pM7012.
Comparative analysis of the amino acid sequences of 2,4-D transport proteins of the tfdII and tfdIII clusters encoded by the tfdKII and tfdKIII genes showed that similarity between them varied from 80.8% to 100%. The ranges of similarity between the amino acid sequences of TfdKs inside each cluster ranged from 94.4% to 100% and from 93.8% to 100% for tfdII and tfdIII clusters, respectively.

2.2.3. 2,4-DCP Hydroxylase (tfdB)

The results of the ML analysis clearly showed strongly supported paraphyly of the tfdIV cluster (subcluster B) with respect to tfdI, tfdII and tfdIII. Nevertheless, the monophyly of the tfdI clade and of the other clades was poorly supported (Figure S1c). Interestingly, subcluster A of the tfdIV cluster resolved as paraphyletic with respect to the tfdII and tfdIII clusters, but that result was not supported by bootstrap. Moreover, the tfdIII clade was recovered as strongly supported and paraphyletic with respect to the tfdII cluster.
A comparative analysis of amino acid sequences of 2,4-DCP hydroxylases encoded by the tfdBI, tfdBII, tfdBIII, and tfdBIV genes showed that they shared similarities of between 47.3% and 100%. The similarities between the amino acid sequences of TfdBs inside each cluster were in the range of 78.0–100% (tfdI), 95.9–100% (tfdII), 91.3–100% (tfdIII), and 64.3–100% (tfdIV), respectively.
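Within-cluster and between-cluster similarity ranges of the kind reported in these subsections can be derived from a table of pairwise similarities. The sketch below assumes such a table already exists; the protein labels, the group assignments, and every numeric value are placeholders invented for illustration, and `similarity_ranges` is a hypothetical helper.

```python
from itertools import combinations

# Hypothetical inputs: protein -> cluster type, and pairwise similarity (%).
# All values are placeholders, not measurements from the study.
group_of = {"p1": "tfdII", "p2": "tfdII", "p3": "tfdIII", "p4": "tfdIII"}
similarity = {
    frozenset(("p1", "p2")): 96.0,
    frozenset(("p3", "p4")): 98.5,
    frozenset(("p1", "p3")): 88.0,
    frozenset(("p1", "p4")): 87.5,
    frozenset(("p2", "p3")): 89.0,
    frozenset(("p2", "p4")): 88.2,
}


def similarity_ranges(group_of, similarity):
    """Return {(group_a, group_b): (min, max)} over all distinct protein pairs."""
    ranges = {}
    for a, b in combinations(sorted(group_of), 2):
        key = tuple(sorted((group_of[a], group_of[b])))
        value = similarity[frozenset((a, b))]
        low, high = ranges.get(key, (value, value))
        ranges[key] = (min(low, value), max(high, value))
    return ranges


for key, (low, high) in sorted(similarity_ranges(group_of, similarity).items()):
    print(key, f"{low}-{high}%")
```

Keys with the same cluster type twice give the within-cluster range, and mixed keys give the between-cluster range.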

2.2.4. Chlorocatechol 1,2-Dioxygenases (tcbC, clcA, tfdCI, tfdCII, tfdCIII, tfdCIV)

The ML tree topology indicated that all the strongly-supported clades, uniting the chlorocatechol 1,2-dioxygenases of all the tfd, clc and tcb gene clusters (with the exception of subcluster C from the tfdIV cluster) resolved as monophyletic with moderate support (Figure 5a). The clades uniting chlorocatechol 1,2-dioxygenases from the tfdII and tfdIII clusters were recovered as sister clades with strong support. The monophyly of chlorocatechol 1,2-dioxygenases for tfdI, tfdII and tfdIII was not significantly supported. At the same time, clades uniting chlorocatechol 1,2-dioxygenases from the clc and tcb clusters resolved as moderately-supported sister clades. Within the tfdIII clade, two well-supported sister clades were recovered, uniting proteins from plasmids pkk4 and pM7012 in the first clade and other plasmids in the second clade. The monophyly of clades, uniting proteins from subtypes A and B of the tfdIV cluster, was well supported.
A comparative analysis of the amino acid sequences of chlorocatechol 1,2-dioxygenases in these clusters encoded by the tcbC, clcA, tfdCI, tfdCII, tfdCIII, and tfdCIV genes showed that they shared a similarity of between 37.9% and 100%. The similarity between the amino acid sequences of chlorocatechol 1,2-dioxygenases inside each cluster was in the range of 95.2–100% (tcb), 96.9–100% (clc), 85.4–100% (tfdI), 95.7–100% (tfdII), 96.5–100% (tfdIII) and 38.0–100% (tfdIV), respectively.

2.2.5. Chloromuconate Cycloisomerases (tcbD, clcB, tfdDI, tfdDII, tfdDIII, tfdDIV)

In the ML analysis, almost all the clades were resolved as well-supported monophyletic clades (Figure 5b). The exceptions to this were the two strongly supported clades uniting the chloromuconate cycloisomerases from the tfdII and tfdIII clusters. At the same time, tfdII was recovered as a poorly supported paraphyletic clade, with the tfdIII clade falling within it. Within the monophyletic lineages, all clades received strong support. At the same time, two sister clades with low support were recovered within the clade uniting chloromuconate cycloisomerases from the tcb cluster. The monophyly of the clades uniting chloromuconate cycloisomerases from clc and tcb was moderately supported. However, together with the tfdI clade, they were recovered as monophyletic with strong support. Within the tfdIV cluster, the three subclusters (A, B and C) each received good or strong support and together were found to be monophyletic with moderate support.
A comparative analysis of the amino acid sequences of chloromuconate cycloisomerases in these clusters encoded by the tcbD, clcB, tfdDI, tfdDII, tfdDIII, tfdDIV genes indicated that they shared a similarity of between 45.6% and 100%. The similarity between the amino acid sequences of chloromuconate cycloisomerases inside each cluster was in the range of 80.3–100% (tcb), 98.7–100% (clc), 92.7–100% (tfdI), 94.8–100% (tfdII), 100% (tfdIII) and 49.5–100% (tfdIV), respectively.

2.2.6. Chlorodienelactone Hydrolases (tcbE, clcD, tfdEI, tfdEII, tfdEIII, tfdEIV)

In the ML analysis, chlorodienelactone hydrolases were recovered as two monophyletic lineages with good and strong support (Figure 5c). The first included clusters tcb, clc, tfdI, and, surprisingly, subcluster C of the tfdIV cluster. Each cluster formed a strongly supported clade. The tcb and tfdI clusters appeared as sister groups, albeit with low support. The second lineage included the tfdII, tfdIII and tfdIV (subclusters A and B) clades with good and strong support. The clade comprising tfdEIII chlorodienelactone hydrolases was recovered as paraphyletic, with tfdII cluster members falling within that clade. However, that result was only moderately supported. Subclusters A and B of tfdIV were recovered as sister clades with a well-supported topology.
A comparative analysis of the amino acid sequences of chlorodienelactone hydrolases in these clusters encoded by the tcbE, clcD, tfdEI, tfdEII, tfdEIII and tfdEIV genes indicated that they shared a similarity of between 25.5% and 100%. The similarity between the amino acid sequences of chlorodienelactone hydrolases inside each cluster varied: 99.6–100% (tcb), 97.0–100% (clc), 84.7–100% (tfdI), 88.1–100% (tfdII), 80.9–100% (tfdIII) and 29.7–100% (tfdIV), respectively.

2.2.7. Maleylacetate Reductases (tcbF, clcE, tfdFI, tfdFII, tfdFIII, tfdFIV)

This phylogenetic analysis recovered these proteins, encoded by all the analyzed clusters (with the exception of maleylacetate reductases from tfdII), as monophyletic with good support (Figure 5d). The clc, tcb, tfdI, tfdIV and tfdIII proteins each clustered in well or strongly supported clades. The monophyly of maleylacetate reductases from clc, tcb and tfdI was strongly supported. The tfdIV cluster was recovered as paraphyletic, with the tfdIII cluster falling within that clade, albeit with low bootstrap support. The ML analysis recovered tfdII as monophyletic with strong support and positioned the clade consisting of pDB1 and plasmid5 as the sister of the clade formed by all the other plasmids, also with strong support. All of the above confirmed the polyphyly of maleylacetate reductases.
A comparative analysis of the amino acid sequences of maleylacetate reductases in these clusters encoded by the tcbF, clcE, tfdFI, tfdFII, tfdFIII, tfdFIV genes indicated that they shared a similarity of between 49.7% and 100%. The ranges of similarity between the amino acid sequences of maleylacetate reductases inside each cluster were 98.8–100% (tcb), 89.5–100% (clc), 84.8–100% (tfdI), 88.6–100% (tfdII), 96.9–100% (tfdIII) and 60.7–100% (tfdIV).

2.2.8. Transcriptional Regulator (tcbR, clcR, tfdT, tfdRII/tfdS, tfdRIII, tfdRIV)

The monophyly of the transcriptional regulators encoded by the clc, tcb, tfdI, tfdII and tfdIII clusters (with the exception of tfdIV) was well supported in the ML analyses (Figure S2). The nodes that are crucial for understanding the phylogenetic relationships between the clc, tcb, tfdI, tfdII and tfdIII clusters were weakly supported or unsupported (with the exception of the tfdII and tfdIII clusters). Nevertheless, each of the analyzed clusters, tfdI, tfdII, tfdIII, clc and tcb, formed its own strongly supported clade. The clade consisting of TfdRIII transcriptional regulators was recovered as a strongly supported paraphyletic group, with the tfdII cluster members falling within that clade. The monophyly of tfdIV cluster transcriptional regulators was moderately supported, and within the tfdIV cluster, both subclusters, I and III, were recovered as monophyletic with strong and good support, respectively. Thus, transcriptional regulators were recovered as polyphyletic.
A comparative analysis of the amino acid sequences of transcriptional regulators in these clusters encoded by the tcbR, clcR, tfdT, tfdRII/tfdS, tfdRIII and tfdRIV genes indicated that they shared a similarity of between 47.4% and 100%. The similarities between the amino acid sequences of transcriptional regulators inside each cluster were in the ranges 99–100% (tcb), 96.6–100% (clc), 80.7–100% (tfdI), 95.6–100% (tfdII), 82.8–100% (tfdIII) and 64.3–100% (tfdIV).
In summary, the phylogenies of all the proteins involved in catechol degradation through the ortho-cleavage pathway were congruent among the tcb, clc and tfdI clusters. The proteins of the other clusters, tfdII, tfdIII and tfdIV, showed both congruent and incongruent phylogenies. The proteins of cluster tfdII were congruent with each other, as were the proteins of cluster tfdIII, with one exception, namely, maleylacetate reductase, which showed an incongruent phylogeny. Incongruence within cluster tfdIV was observed among the chlorocatechol 1,2-dioxygenase and chlorodienelactone hydrolase proteins.

3. Discussion

3.1. The New Classification Scheme and Nomenclature of tfd Gene Clusters

The obtained results clearly indicate that tfdI and tfdII type clusters are well conserved in relation to their structures and can continue to be classified as type I and type II, without any changes in classification. Meanwhile, the two other groups of clusters are separate types of tfd clusters with substantive structural changes and different evolutionary origins. Taking this into account, they should be given independent designation numbers, type III and type IV. Thus, based on both the historical continuity of the designation of tfd gene clusters [15,17] and the results of both comparative genomics and protein phylogeny, a new classification scheme is proposed, categorizing tfd gene clusters into four types, I, II, III and IV (A, B and C), alongside a new nomenclature (for details of the syntax, see Section 4).

3.2. The Role of Horizontal Gene Transfer (HGT) and Gene Displacement in the Mosaic Nature of tfd Gene Clusters

It is generally accepted that horizontal gene transfer (HGT) is the major process in bacterial evolution; its role has been proved in numerous studies. Mosaic operons contain genes transferred by HGT, which are characterized by the incongruence of their phylogeny with other genes [74]. Clusters, as well as operons, can also be mosaic in nature [75]. Nevertheless, the analysis of tfd gene clusters from the standpoint of incongruence (or so-called discrepancies) in their phylogeny has only been performed in a few papers [33,76].
The congruence of the phylogeny of all tcb and clc cluster proteins clearly illustrates that they are not mosaic and are spread entirely by horizontal transfer. Moreover, the same conclusion can be drawn about the core five-gene part of cluster tfdI responsible for the catechol ortho-cleavage pathway. Previously, it has been pointed out that from a phylogenetic point of view these clusters diverged from a common ancestral ortho-cleavage pathway for chlorocatechols [33].
Very intriguing findings follow from the congruence of almost all the proteins of clusters tfdII and tfdIII, with the exception of maleylacetate reductases (tfdFII and tfdFIII). These findings may be explained by an HGT event followed by differential gene loss: the displacement of an ancestral tfdFIII (which probably shared common ancestry with tfdFII) by a functionally equivalent maleylacetate reductase gene homologous to tfdFIV. This phenomenon, homolog displacement, is probably selectively neutral [77], but it is one of the events that led to the origin of a different type of tfd cluster.
Analyzing the four proteins of cluster tfdIV (TfdCIV, TfdDIV, TfdEIV and TfdFIV) directly involved in the catechol pathway, it becomes obvious that subclusters A and B are not mosaic and are entirely distributed by HGT. In contrast, subcluster C, in the case of the chlorocatechol 1,2-dioxygenases (TfdCIVC) and chlorodienelactone hydrolases (TfdEIVC) proteins, had other evolutionary ancestors. These findings classify this subcluster as a mosaic.
Interestingly enough, the results of phylogenetic analysis clearly revealed that the tfdB gene for ancestral 2,4-DCP hydroxylase protein was assembled by clusters tfdI, tfdII, tfdIII and tfdIV after distribution through HGT. However, with regard to gene tfdB, the results obtained in this work are inconsistent with an earlier conclusion about its independent evolution [78].
Thus, HGT between orders Burkholderiales and Sphingomonadales provided phenomenal linkage between tfdI, tfdII, tfdIII and tfdIV type clusters. As a result, these orders have proven to be the most adapted to repeated exposures to the herbicide 2,4-D in soils.

3.3. Evolution Lineage including Homologous tfdI, tcb and clc Gene Clusters

All the obtained results point to the existence of a lineage including three types of the analyzed clusters, namely, tcb, clc and tfdI, which are considered homologs. This conclusion is supported by at least five lines of evidence: (i) the clusters have similar core five-gene structures (the genes responsible for the ortho-cleavage of catechols) with an identical order of genes, especially those encoding chloromuconate cycloisomerases and chlorocatechol 1,2-dioxygenases; (ii) the protein tree topologies suggest a common origin; (iii) the DNA and protein sequences show high identities and similarities; (iv) ORF3 and the flanking ORF encoding the AraC family conserved domain are present in tcb and clc; and (v) the clusters are widespread mainly across the order Burkholderiales (and even within the genome of a single strain, for example, B. petrii DSM 12804). Previous studies have suggested gene homology and possible evolutionary relatedness between genes of the first-described clusters tcb, clc and tfdI [27,29,31].
The results suggest that the tcb, clc and tfdI clusters are descendants of a single ancestral cluster, which further evolved by adapting to different substrates. The absence of ancestral cluster data, the high conservation, and the almost complete absence of genetic rearrangements among the tcb, clc and tfdI clusters prevent the causes and mechanisms of the main models of cluster formation from being delineated, with one exception. The prevalent plasmid localization (with the exception of the clc cluster) suggests that plasmids are the hot spots of cluster evolution and propagation. In turn, this indicates a possible assemblage of these clusters according to the ‘Scribbling Pad’ model [79]. The clc clusters are located on plasmids less often than tfdI and tcb, while at the same time remaining more conserved at the nucleotide sequence level, which also correlates well with this model. The tcb and clc gene clusters retained a high synteny of both their own genes and the flanking genes, suggesting that this is the ancestral state. Interestingly, the tcb cluster genes were more conserved than the flanking ORFs, which could suggest their high priority to bacterial hosts.
Perhaps tfdI type clusters evolved from a common ancestor with clusters tcb and clc by clustering the tfdBI gene (described above) and eliminating ORF3, resulting in the acquisition of the ability to hydroxylate 2,4-dichlorophenol (2,4-DCP) to 3,5-dichlorocatechol (3,5-DCC). This is supported by the absence of this gene in the clusters of the pNK84 plasmid and of P. phytofirmans OLGA172, which, according to the topology of all ortho-pathway proteins, diverged from a common ancestor earlier than the main group. Thus, they probably diverged before the tfdBI gene was assembled. The acquisition of tfdBI allowed this cluster type to specialize more narrowly in the conversion of 2,4-DCP into 3,5-DCC.

3.4. Evolution Lineage including Homologous tfdII and tfdIII Type Clusters

This lineage of tfd gene clusters is comprised of two types, tfdII and tfdIII, that have a similar structure and whose proteins are almost entirely homologous. The exception is the chloromaleylacetate reductase protein, which has a different evolutionary origin to those types (described above). There is no doubt that these types of clusters originated from the division of a common prototype cluster into two branches, as confirmed by the obtained results. By analogy with the evolutionary lineage including clusters tcb, clc and tfdI, only one of the potentially possible models of its formation can be distinguished for the prototype cluster, namely, the ‘Scribbling Pad’ model [79], since, obviously, exclusively plasmid localization is a consequence of plasmid-mediated clustering and propagation. Apparently, the crucial point in the evolutionary division of this lineage into two types was the homolog displacement of the tfdF gene in the common prototype cluster (described above).
Currently, it is obvious that the full tfdII cluster with a complete tfdIIAKBFECDR eight-gene structure is more widely distributed among plasmids than the tfdIIKBFECDR seven-gene structure of the pJP4 plasmid. Nevertheless, the recent report of pPO line plasmids, whose overall structure is essentially the same as that of the pJP4 plasmid and which carry deletion variations of cluster tfdII [44], puts much into perspective. Apparently, a consecutive series of genetic rearrangements, primarily deletions, in pJP4 or pJP4-like plasmids has resulted in the current diversity and propagation of these cluster variations. Since, of all the resulting variations, the seven-gene pJP4 cluster was the first to be explored, it became canonical.
The features typically associated with tfdIII type clusters can be derived through a series of rearrangements leading to incomplete clusters with the tfdIIIAKBECR structure (excluding pRK1-5 and D. acidovorans P4a). It is probable that the ancient tfdIIAKBFECDR cluster underwent: (i) successful displacement of the tfdFII gene, resulting in the tfdIIIFAKBECDR structure (plasmids pkk4 and pM7012); (ii) simultaneous or sequential successful displacement of the tfdFII gene and unsuccessful displacement of tfdDII, resulting in the tfdIIIF,AKBECR structure (plasmids pIJB1 and pEST4011); and (iii) simultaneous or sequential unsuccessful displacement of the tfdFII and tfdDII genes, resulting in the tfdIIIAKBECR structure (plasmids pAKD25, pAKD26 and pTV1). It is important to note that plasmid pTV1 also has the tfdIIIAKBECR structure, although its tfdAIII gene was only partially sequenced in a further study [78]. These findings do not support the version proposed by Sakai et al. [42], in which the tfdII type cluster was probably assembled by recruiting genes from tfdI.
Moreover, the structure of tfdIIIAKBECR has undergone further rearrangements. In D. acidovorans P4a, the two-gene cluster tfdIIICR was replaced with the two-gene cluster tcbCR, which encodes the functionally identical proteins of the tcb cluster. This led to the emergence of a hybrid cluster, tfdIIIAKBEtcbCR. Obviously, recombination has occurred between the tfdIII cluster and the tcb cluster located almost immediately downstream. Apparently, the plasmid pRK1-5 also had the structure tfdIIIAKBECR, but genes tfdIIIAK were deleted. Interestingly, in plasmids pRK1-5 and pAKD26 and in D. acidovorans P4a, the tfdIII and tcb clusters lie in almost equally close proximity to each other, indicating that their ancestral state was tcbRCDEF and tfdIIIAKBECR and that the other structural rearrangements occurred later. The latter is also indicated by the fact that plasmids pEST4011, pIJB1, pAKD25 and pTV1 have tfdIIIAKBECR and tcb clusters consisting of ORF3 and the tcbD gene instead of a complete tcb cluster. Finally, the ancestral state of tcbRCDEF and tfdIIIAKBECR is confirmed by the fact that plasmids pAKD25 and pAKD26 were isolated from the same bacterial host [39].

3.5. Evolution Lineage including Unique tfdIV Type Clusters

The tfdIV clusters have a gene structure with incomplete clustering, which significantly distinguishes them from the above-mentioned tfdI, tfdIII, clc and tcb gene clusters. Further shared features are the existence of several gene structures and the absence of the tfdA gene. Whereas the above clusters have a complete set of genes, tfdIV members have only five genes in the cluster. The results of both synteny and protein phylogenetic analysis revealed three sublineages inside the tfdIV type clusters, which evolved separately but shared a common ancestor for the majority of the encoded proteins.
The major subtype A comprises seven clusters, three of which are plasmid-localized. Five of them share a five-gene structure with a nearby tfdBIVA gene, tfdIVAECFRD,B (Figure 2b), suggesting a common path of clustering for that sublineage. The induction of genes of that tfdIV-cluster sublineage in Sphingomonas sp. TFD44 in the presence of 2,4-D was previously noted by Thiel and colleagues (2005) [38].
The members of the second subtype, B, of tfdIV type clusters, with the tfdIVBD,FEC/FEC/D,FEC,B structure, have to date evolved as parts of pathways responsible for the degradation of multiple compounds. The presented results indicate that tfdIVB type clusters are assembled with several genes encoding the anthranilate dioxygenase (AntDO-3C) enzyme of B. cepacia DBO1. Those genes are responsible for the degradation of anthranilate (2-aminobenzoate) [80]. Analogous recruitment was found by Liu et al. (2011) [64] for 2-chloronitrobenzene (2CNB) degrading genes. Nevertheless, the same cluster in Sphingomonas sp. TFD44 (tfdIVBFEC cluster) encoded 2,4-D degradation and had the tfdD gene in another place in the genome [38]. Since the above-mentioned clusters tfdIVBD,FEC/FEC/D,FEC,B have an almost identical structure, high identity and synteny, they were apparently distributed initially within the order Sphingomonadales, similar to other tfdIV type clusters, and were later acquired by P. stutzeri ZWLR2-1. The plasmid pDB-4 has the most complete tfdIVBD,FEC,B cluster, and the presence of the tfdBIV gene also indicates the relatedness of the first and second subtypes of tfdIV type clusters (Figure 2a). Additionally, it is probable that the above-mentioned dioxygenases show broad substrate specificity. The tfdIVBD,FEC/FEC/D,FEC,B clusters are also capable of degrading modified catechols, which occur as intermediates in the catabolism of many compounds, including anthranilate, 2CNB, 2,4-D and others.
The third sublineage of tfdIV type clusters, tfdIVCCEDRF,B/CE,D,R,F, can to date be defined by two clusters with different levels of clustering in contigs of Bradyrhizobium sp. RD5-C2 (BOVL01000048) and Sphingomonas histidinilytica BT1 5.2 (WMBU01000019), respectively. The first one was designated as tfdBaFRDEC and is involved in 2,4-D degradation from dichlorophenol with the same degradation pathway as that of C. pinatubonensis JMP134 [10]. S. histidinilytica BT1 5.2, which is capable of degrading 2,4-D, possesses the second cluster, tfdIVCCE,D,R,F, which was annotated by Nguyen and colleagues (2021) as tfdF,S,D,EC [7].
The main structural feature across members of this lineage, incomplete clustering, highlights the possible directions, driving forces and models of clustering, in contrast with the other lineages. The extreme rarity of genes of this evolutionary lineage among bacteria may lead them to form clusters as a protective reaction in order to avoid evolutionary loss and promote propagation across bacteria. This would be very consistent with the ‘Selfish operon model’ [81] and apparently contradicts persistence as a driving force of their clustering [82]. Moreover, the proven HGT, the absence of adjacent insertion elements (IS) with one exception, and the absence of duplicated genes oppose the co-regulation theory [83], the ‘IDE model’ [84] and the SNAP hypothesis [85], respectively. Interestingly, the revealed direct contribution of plasmids to genetic rearrangements in subtype A suggests that clustering can proceed according to two different models.
Currently, Sphingomonas and Bradyrhizobium genera are classified as I and III classes of 2,4-D degraders possessing the second system, cadABCD gene cluster, responsible for the initiation of degradation of chlorophenoxyacetic acids [86]. The cad clusters were identified in members of all tfdIV subtypes. It was assumed that they were involved in the initial stages of 2,4-D degradation [7,10,41,63]. Nevertheless, some members of Sphingomonas and Bradyrhizobium could have their own versions of the first system, the tfdA gene and its homologs, named tfdA-like and tfdAα, respectively [34].
Thus, tfdIV clusters are unique cad-dependent 2,4-D degradation clusters which have evolved into three subtypes which provide Sphingomonas and Bradyrhizobium with great competitive advantages.

4. Materials and Methods

4.1. Databases and Data Collection

All DNA and protein sequences, as well as additional information were obtained from both finished and unfinished genome sequencing records which are publicly available at NCBI resources [87]. The plasmids and genomes of 56 bacterial species were analyzed. Most belonged to the order Burkholderiales from Betaproteobacteria and a few belonged to Alpha- and Gammaproteobacteria.

4.2. Gene Annotation

Two ORF prediction tools, the web version of the NCBI ORF finder and SnapGene Viewer v.4.2.1, were used. The predicted ORFs were then annotated using NCBI blastp [87] similarity searches against the UniProtKB/Swiss-Prot (swissprot) database.
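For readers unfamiliar with ORF prediction, the general idea can be sketched as a six-frame scan for a start codon followed in frame by a stop codon. This is only a simplified illustration and not the algorithm of the NCBI ORF finder or SnapGene Viewer. It assumes ATG as the only start codon, the standard stop codons and a hypothetical `min_codons` length threshold that counts the stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """Return (strand, start, end) for ATG...stop ORFs in all six frames.

    Coordinates are 0-based, end-exclusive, and relative to the scanned
    strand (for "-" they refer to the reverse-complemented sequence).
    min_codons counts the start and stop codons.
    """
    orfs = []
    forward = seq.upper()
    for strand, dna in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            start = None
            for i in range(frame, len(dna) - 2, 3):
                codon = dna[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens an ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3))
                    start = None  # look for the next ORF in this frame
    return orfs
```

Each reported ORF would then be translated and submitted to a similarity search, as described above.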

4.3. Comparative Genomic Analyses

The canonical gene clusters of the plasmids pJP4 (AY365053), pP51 (M57629) and pAC27 (M16964) were used as templates for NCBI blastn [87] similarity searches against the nucleotide collection (nr/nt) and whole-genome shotgun contigs (wgs) databases. Genomic regions adjacent to the clusters were searched and analyzed in order to determine synteny between the clusters and their flanking genes. For incomplete genome sequence projects in contig format, this strategy was not possible, and the gene cluster architecture was inferred based on nucleotide sequence identity.
Multiple sequence alignments of DNA and protein sequences of the analyzed gene clusters were obtained using MAFFT at default parameter settings [88]. Values of identity and similarity between corresponding nucleotide and protein sequences of the analyzed clusters were calculated using SIAS (http://imed.med.ucm.es/Tools/sias.html, accessed on 15 May 2023) with default parameters.
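To illustrate what identity and similarity values of this kind measure, the sketch below computes both for one pair of aligned protein sequences. It is not the SIAS implementation. The amino acid similarity groups and the choice of denominator (the number of gap-free aligned positions) are assumptions made for illustration and may differ from the SIAS defaults.

```python
# Assumed similarity groups (illustrative choice, not taken from SIAS).
SIMILAR_GROUPS = ["GAVLI", "FYW", "CM", "ST", "KRH", "DENQ", "P"]


def identity_and_similarity(aligned_a, aligned_b, gap="-"):
    """Return (percent identity, percent similarity) for two aligned sequences.

    Positions where either sequence has a gap are skipped, so the
    denominator is the number of gap-free aligned positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    group_of = {aa: idx for idx, grp in enumerate(SIMILAR_GROUPS) for aa in grp}
    compared = identical = similar = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif x in group_of and group_of.get(x) == group_of.get(y):
            similar += 1
    if compared == 0:
        raise ValueError("no gap-free positions to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared
```

Because identical residues are counted as similar, the similarity value is never lower than the identity value for the same pair.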
Comparative synteny analysis was performed by comparison between linearized maps and oriented according to the structure of gene clusters to depict gene arrangements, gene retention and orientation. Detailed synteny maps were visualized and blastn identity comparisons were generated using Easyfig v.2.2.2 [89].
The new classification scheme for tfd gene clusters into four types—I, II, III and IV (A, B and C)—and nomenclature, were proposed. An updated nomenclature of tfd gene clusters was formulated based on the following syntax: the three italicized lowercase letters followed by Roman numerals as a subscript refer to the number of the cluster type. For example, tfdIBFEDCT (in short, tfdI), tfdIIAKBFECDR (in short, tfdII), tfdIIIFAKBECDR (in short, tfdIII), tfdIVAECFRD,B/tfdIVBCEF,D/tfdIVCCEDRF,B (in short, tfdIVA, tfdIVB, tfdIVC). It is worth noting that there is no need to designate each gene as belonging to a certain type in the cluster structure (for example, tfdBIFIEIDICITI), since tfd clusters do not form hybrid clusters among themselves.
For clusters tcb and clc, a reverse spelling of the gene order, tcbFEDCR and clcEDBAR, was proposed, to correspond to the order of genes in other types of ortho-pathway clusters (tfdI, tfdII and tfdIII).

4.4. Alignments and Phylogenetic Analyses

Multiple sequence alignments for all trees were performed with MAFFT using default parameter settings [88]. The evolutionary history was inferred by using the maximum likelihood method based on the JTT matrix-based model. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using a JTT model, and then selecting the topology with a superior log likelihood value. Evolutionary analyses were conducted in MEGA7 [90] with 1000 bootstrap replicates.
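The bootstrap support values reported for the trees rest on resampling alignment columns with replacement and re-inferring a tree from each replicate. MEGA7 performs these steps internally. The sketch below shows only the resampling step and the final percentage, with hypothetical function names, and leaves tree inference out.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate of an alignment.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths)
    rng:       a random.Random instance, passed in for reproducibility
    Columns are drawn with replacement, and the same draw is applied to
    every sequence so that the columns stay intact.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    length = lengths.pop()
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[c] for c in columns)
        for name, seq in alignment.items()
    }


def bootstrap_support(n_with_clade, n_replicates):
    """Percentage of replicate trees that contain a given clade."""
    return 100.0 * n_with_clade / n_replicates
```

With 1000 replicates, a clade recovered in 950 of the replicate trees would receive a support value of 95%.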

5. Conclusions

The tfd clusters are a unique family of diverse mosaic gene cluster types interlinked by their activity with regard to 2,4-D and other chlorinated aromatic compounds. Widespread mainly across the orders Burkholderiales and Sphingomonadales, which are extraordinary reservoirs of genes involved in the ortho-cleavage pathway of 2,4-D and catechols, these diverse tfd clusters, as well as the highly conserved tcb and clc gene clusters, enable microbes to exploit a variety of xenobiotic-polluted niches. Systematization of both sequenced and published data from over forty years and subsequent analysis through comparative genomic and protein phylogeny approaches have resulted in new insights into the evolution, classification and nomenclature of these clusters. Application of the classification scheme proposed in this work provides a powerful approach for future exploration, especially for the correct annotation of tfd, tcb and clc clusters, as well as for studies of their distribution and evolution across diverse bacteria.

Supplementary Materials

The following supporting information can be downloaded at: https://0-www-mdpi-com.brum.beds.ac.uk/article/10.3390/ijms241814370/s1.

Funding

This work was supported by the Russian Science Foundation (RSF) [grant number 23-24-00480].

Data Availability Statement

Data is contained within the article or supplementary material.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Peterson, M.A.; McMaster, S.A.; Riechers, D.E.; Skelton, J.; Stahlman, P.W. 2,4-D Past, Present, and Future: A Review. Weed Technol. 2016, 30, 303–345.
  2. Agency for Toxic Substances and Disease Registry (ATSDR). Toxicological Profile for 2,4-Dichlorophenoxyacetic Acid (2,4-D); U.S. Department of Health and Human Services, Public Health Service: Atlanta, GA, USA, 2020.
  3. Islam, F.; Wang, J.; Farooq, M.A.; Khan, M.S.S.; Xu, L.; Zhu, J.; Zhao, M.; Muños, S.; Li, Q.X.; Zhou, W. Potential impact of the herbicide 2,4-dichlorophenoxyacetic acid on human and ecosystems. Environ. Int. 2018, 111, 332–351. [Google Scholar] [CrossRef] [PubMed]
  4. Xia, Z.Y.; Zhang, L.; Zhao, Y.; Yan, X.; Li, S.P.; Gu, T.; Jiang, J.D. Biodegradation of the herbicide 2,4-dichlorophenoxyacetic acid by a new isolated strain of Achromobacter sp. LZ35. Curr. Microbiol. 2017, 74, 193–202. [Google Scholar] [CrossRef] [PubMed]
  5. Xiang, S.; Lin, R.; Shang, H.; Xu, Y.; Zhang, Z.; Wu, X.; Zong, F. Efficient degradation of phenoxyalkanoic acid herbicides by the alkali-tolerant Cupriavidus oxalaticus strain X32. J. Agric. Food Chem. 2020, 68, 3786–3795. [Google Scholar] [CrossRef]
  6. Brucha, G.; Aldas-Vargas, A.; Ross, Z.; Peng, P.; Atashgahi, S.; Smidt, H.; Langenhoff, A.; Sutton, N.B. 2,4-Dichlorophenoxyacetic acid degradation in methanogenic mixed cultures obtained from Brazilian Amazonian soil samples. Biodegradation 2021, 32, 419–433. [Google Scholar] [CrossRef]
  7. Nguyen, T.L.A.; Dang, H.T.C.; Koekkoek, J.; Braster, M.; Parsons, J.R.; Brouwer, A.; de Boer, T.; van Spanning, R.J.M. Species and metabolic pathways involved in bioremediation of vietnamese soil from Bien Hoa airbase contaminated with herbicides. Front. Sustain. Cities 2021, 3, 692018. [Google Scholar] [CrossRef]
  8. Nguyen, T.L.A.; Dang, H.T.C.; Koekkoek, J.; Dat, T.T.H.; Braster, M.; Brandt, B.W.; Parsons, J.R.; Brouwer, A.; Spanning, R.J.M. Correlating biodegradation kinetics of 2,4-dichlorophenoxyacetic acid (2,4-D) and 2,4,5-trichlorophenoxyacetic acid (2,4,5-T) to the dynamics of microbial communities originating from soil in Vietnam contaminated with herbicides. Front. Sustain. Cities 2021, 3, 692012. [Google Scholar] [CrossRef]
  9. Zharikova, N.V.; Iasakov, T.R.; Zhurenko, E.I.; Korobov, V.V.; Markusheva, T.V. Plasmids of the chlorophenoxyacetic-acid degradation of bacteria of the genus Raoultella. Appl. Biochem. Microbiol. 2021, 57, 335–343. [Google Scholar] [CrossRef]
  10. Hayashi, S.; Tanaka, S.; Takao, S.; Kobayashi, S.; Suyama, K.; Itoh, K. Multiple gene clusters and their role in the degradation of chlorophenoxyacetic acids in Bradyrhizobium sp. RD5-C2 isolated from non-contaminated soil. Microbes Environ. 2021, 36, ME21016.
  11. Don, R.H.; Pemberton, J.M. Properties of six pesticide degradation plasmids isolated from Alcaligenes paradoxus and Alcaligenes eutrophus. J. Bacteriol. 1981, 145, 681–686.
  12. Don, R.H.; Weightman, A.J.; Knackmuss, H.J.; Timmis, K.N. Transposon mutagenesis and cloning analysis of the pathways for degradation of 2,4-dichlorophenoxyacetic acid and 3-chlorobenzoate in Alcaligenes eutrophus JMP134(pJP4). J. Bacteriol. 1985, 161, 85–90.
  13. Streber, W.R.; Timmis, K.N.; Zenk, M.H. Analysis, cloning, and high-level expression of 2,4-dichlorophenoxyacetate monooxygenase gene tfdA of Alcaligenes eutrophus JMP134. J. Bacteriol. 1987, 169, 2950–2955.
  14. Harker, A.R.; Olsen, R.H.; Seidler, R.J. Phenoxyacetic acid degradation by the 2,4-dichlorophenoxyacetic acid (TFD) pathway of plasmid pJP4: Mapping and characterization of the TFD regulatory gene, tfdR. J. Bacteriol. 1989, 171, 314–320.
  15. Leveau, J.H.; Zehnder, A.J.; van der Meer, J.R. The tfdK gene product facilitates uptake of 2,4-dichlorophenoxyacetate by Ralstonia eutropha JMP134(pJP4). J. Bacteriol. 1998, 180, 2237–2243.
  16. Kaphammer, B.; Kukor, J.J.; Olsen, R.H. Regulation of tfdCDEF by tfdR of the 2,4-dichlorophenoxyacetic acid degradation plasmid pJP4. J. Bacteriol. 1990, 172, 2280–2286.
  17. Laemmli, C.M.; Leveau, J.H.; Zehnder, A.J.; van der Meer, J.R. Characterization of a second tfd gene cluster for chlorophenol and chlorocatechol metabolism on plasmid pJP4 in Ralstonia eutropha JMP134 (pJP4). J. Bacteriol. 2000, 182, 4165–4172.
  18. Trefault, N.; De la Iglesia, R.; Molina, A.M.; Manzano, M.; Ledger, T.; Pérez-Pantoja, D.; Sánchez, M.A.; Stuardo, M.; González, B. Genetic organization of the catabolic plasmid pJP4 from Ralstonia eutropha JMP134 (pJP4) reveals mechanisms of adaptation to chloroaromatic pollutants and evolution of specialized chloroaromatic degradation pathways. Environ. Microbiol. 2004, 6, 655–668.
  19. Kumar, A.; Trefault, N.; Olaniran, A.O. Microbial degradation of 2,4-dichlorophenoxyacetic acid: Insight into the enzymes and catabolic genes involved, their regulation and biotechnological implications. Crit. Rev. Microbiol. 2016, 42, 194–208.
  20. Pérez-Pantoja, D.; Guzmán, L.; Manzano, M.; Pieper, D.H.; González, B. Role of tfdC(I)D(I)E(I)F(I) and tfdD(II)C(II)E(II)F(II) gene modules in catabolism of 3-chlorobenzoate by Ralstonia eutropha JMP134 (pJP4). Appl. Environ. Microbiol. 2000, 66, 1602–1608.
  21. Laemmli, C.M.; Schönenberger, R.; Suter, M.; Zehnder, A.J.; van der Meer, J.R. TfdD(II), one of the two chloromuconate cycloisomerases of Ralstonia eutropha JMP134 (pJP4), cannot efficiently convert 2-chloro-cis,cis-muconate to trans-dienelactone to allow growth on 3-chlorobenzoate. Arch. Microbiol. 2002, 178, 13–25.
  22. Plumeier, I.; Pérez-Pantoja, D.; Heim, S.; González, B.; Pieper, D.H. Importance of different tfd genes for degradation of chloroaromatics by Ralstonia eutropha JMP134. J. Bacteriol. 2002, 184, 4054–4064.
  23. van der Meer, J.R.; Frijters, A.C.; Leveau, J.H.; Eggen, R.I.; Zehnder, A.J.; de Vos, W.M. Characterization of the Pseudomonas sp. strain P51 gene tcbR, a LysR-type transcriptional activator of the tcbCDEF chlorocatechol oxidative operon, and analysis of the regulatory region. J. Bacteriol. 1991, 173, 3700–3708.
  24. van der Meer, J.R.; van Neerven, A.R.; de Vries, E.J.; de Vos, W.M.; Zehnder, A.J. Cloning and characterization of plasmid-encoded genes for the degradation of 1,2-dichloro-, 1,4-dichloro-, and 1,2,4-trichlorobenzene of Pseudomonas sp. strain P51. J. Bacteriol. 1991, 173, 6–15.
  25. Frantz, B.; Chakrabarty, A.M. Organization and nucleotide sequence determination of a gene cluster involved in 3-chlorocatechol degradation. Proc. Natl. Acad. Sci. USA 1987, 84, 4460–4464.
  26. Ghosal, D.; You, I.S. Operon structure and nucleotide homology of the chlorocatechol oxidation genes of plasmids pJP4 and pAC27. Gene 1989, 83, 225–232.
  27. Coco, W.M.; Rothmel, R.K.; Henikoff, S.; Chakrabarty, A.M. Nucleotide sequence and initial functional characterization of the clcR gene encoding a LysR family activator of the clcABD chlorocatechol operon in Pseudomonas putida. J. Bacteriol. 1993, 175, 417–427.
  28. Ghosal, D.; You, I.S.; Chatterjee, D.K.; Chakrabarty, A.M. Genes specifying degradation of 3-chlorobenzoic acid in plasmids pAC27 and pJP4. Proc. Natl. Acad. Sci. USA 1985, 82, 1638–1642.
  29. Ghosal, D.; You, I.S. Nucleotide homology and organization of chlorocatechol oxidation genes of plasmids pJP4 and pAC27. Mol. Gen. Genet. 1988, 211, 113–120.
  30. van der Meer, J.R.; Eggen, R.I.; Zehnder, A.J.; de Vos, W.M. Sequence analysis of the Pseudomonas sp. strain P51 tcb gene cluster, which encodes metabolism of chlorinated catechols: Evidence for specialization of catechol 1,2-dioxygenases for chlorinated substrates. J. Bacteriol. 1991, 173, 2425–2434.
  31. Kasberg, T.; Daubaras, D.L.; Chakrabarty, A.M.; Kinzelt, D.; Reineke, W. Evidence that operons tcb, tfd, and clc encode maleylacetate reductase, the fourth enzyme of the modified ortho pathway. J. Bacteriol. 1995, 177, 3885–3889.
  32. Schlömann, M. Evolution of chlorocatechol catabolic pathways. Conclusions to be drawn from comparisons of lactone hydrolases. Biodegradation 1994, 5, 301–321.
  33. Vallaeys, T.; Courde, L.; Mc Gowan, C.; Wright, A.D.; Fulthorpe, R.R. Phylogenetic analyses indicate independent recruitment of diverse gene cassettes during assemblage of the 2,4-D catabolic pathway. FEMS Microbiol. Ecol. 1999, 28, 373–382.
  34. Zharikova, N.V.; Iasakov, T.R.; Zhurenko, E.Y.; Korobov, V.V.; Markusheva, T.V. Bacterial genes of 2,4-dichlorophenoxyacetic acid degradation encoding α-ketoglutarate-dependent dioxygenase activity. Biol. Bull. Rev. 2018, 8, 155–167.
  35. Liu, S.; Ogawa, N.; Miyashita, K. The chlorocatechol degradative genes, tfdT-CDEF, of Burkholderia sp. strain NK8 are involved in chlorobenzoate degradation and induced by chlorobenzoates and chlorocatechols. Gene 2001, 268, 207–214.
  36. Poh, R.P.; Smith, A.R.; Bruce, I.J. Complete characterisation of Tn5530 from Burkholderia cepacia strain 2a (pIJB1) and studies of 2,4-dichlorophenoxyacetate uptake by the organism. Plasmid 2002, 48, 1–12.
  37. Vedler, E.; Vahter, M.; Heinaru, A. The completely sequenced plasmid pEST4011 contains a novel IncP1 backbone and a catabolic transposon harboring tfd genes for 2,4-dichlorophenoxyacetic acid degradation. J. Bacteriol. 2004, 186, 7161–7174.
  38. Thiel, M.; Kaschabek, S.R.; Gröning, J.; Mau, M.; Schlömann, M. Two unusual chlorocatechol catabolic gene clusters in Sphingomonas sp. TFD44. Arch. Microbiol. 2005, 183, 80–94.
  39. Sen, D.; Van der Auwera, G.A.; Rogers, L.M.; Thomas, C.M.; Brown, C.J.; Top, E.M. Broad-host-range plasmids from agricultural soils have IncP-1 backbones with diverse accessory genes. Appl. Environ. Microbiol. 2011, 77, 7975–7983.
  40. Kim, D.U.; Kim, M.S.; Lim, J.S.; Ka, J.O. Widespread occurrence of the tfd-II genes in soil bacteria revealed by nucleotide sequence analysis of 2,4-dichlorophenoxyacetic acid degradative plasmids pDB1 and p712. Plasmid 2013, 69, 243–248.
  41. Nielsen, T.K.; Xu, Z.; Gözdereliler, E.; Aamand, J.; Hansen, L.H.; Sørensen, S.R. Novel insight into the genetic context of the cadAB genes from a 4-chloro-2-methylphenoxyacetic acid-degrading Sphingomonas. PLoS ONE 2013, 8, e83346.
  42. Sakai, Y.; Ogawa, N.; Shimomura, Y.; Fujii, T. A 2,4-dichlorophenoxyacetic acid degradation plasmid pM7012 discloses distribution of an unclassified megaplasmid group across bacterial species. Microbiol. Read. 2014, 160 Pt 3, 525–536.
  43. Ricker, N.; Shen, S.Y.; Goordial, J.; Jin, S.; Fulthorpe, R.R. PacBio SMRT assembly of a complex multi-replicon genome reveals chlorocatechol degradative operon in a region of genome plasticity. Gene 2016, 586, 239–247.
  44. Nguyen, T.P.O.; Hansen, M.A.; Hansen, L.H.; Horemans, B.; Sørensen, S.J.; De Mot, R.; Springael, D. Intra- and inter-field diversity of 2,4-dichlorophenoxyacetic acid-degradative plasmids and their tfd catabolic genes in rice fields of the Mekong delta in Vietnam. FEMS Microbiol. Ecol. 2019, 95, fiy214.
  45. Yamamoto-Tamura, K.; Moriuchi, R.; Ogawa, N. Complete genome sequence of Caballeronia sp. strain NK8 (MAFF311271), a chlorobenzoate-degrading bacterium. Microbiol. Resour. Announc. 2021, 10, e0041621.
  46. Zhang, L.; Song, M.; Mao, Z.; Liu, Y.; Li, F.; Jiang, J.; Chen, K. A new enantioselective dioxygenase for the (S)-enantiomer of the chiral herbicide dichlorprop in Sphingopyxis sp. DBS4. Int. Biodeterior. Biodegrad. 2023, 176, 105511.
  47. Potrawfke, T.; Armengaud, J.; Wittich, R.M. Chlorocatechols substituted at positions 4 and 5 are substrates of the broad-spectrum chlorocatechol 1,2-dioxygenase of Pseudomonas chlororaphis RW71. J. Bacteriol. 2001, 183, 997–1011.
  48. Ogawa, N.; Miyashita, K. The chlorocatechol-catabolic transposon Tn5707 of Alcaligenes eutrophus NH9, carrying a gene cluster highly homologous to that in the 1,2,4-trichlorobenzene-degrading bacterium Pseudomonas sp. strain P51, confers the ability to grow on 3-chlorobenzoate. Appl. Environ. Microbiol. 1999, 65, 724–731.
  49. Moriuchi, R.; Dohra, H.; Kanesaki, Y.; Ogawa, N. Complete genome sequence of 3-chlorobenzoate-degrading bacterium Cupriavidus necator NH9 and reclassification of the strains of the genera Cupriavidus and Ralstonia based on phylogenetic and whole-genome sequence analyses. Front. Microbiol. 2019, 10, 133.
  50. Xiao, Y.; Zhang, J.J.; Liu, H.; Zhou, N.Y. Molecular characterization of a novel ortho-nitrophenol catabolic gene cluster in Alcaligenes sp. strain NyZ215. J. Bacteriol. 2007, 189, 6587–6593.
  51. Jiang, X.W.; Liu, H.; Xu, Y.; Wang, S.J.; Leak, D.J.; Zhou, N.Y. Genetic and biochemical analyses of chlorobenzene degradation gene clusters in Pandoraea sp. strain MCB032. Arch. Microbiol. 2009, 191, 485–492.
  52. Miyazaki, R.; Bertelli, C.; Benaglio, P.; Canton, J.; De Coi, N.; Gharib, W.H.; Gjoksi, B.; Goesmann, A.; Greub, G.; Harshman, K.; et al. Comparative genome analysis of Pseudomonas knackmussii B13, the first bacterium known to degrade chloroaromatic compounds. Environ. Microbiol. 2015, 17, 91–104.
  53. Li, T.; Gao, Y.Z.; Xu, J.; Zhang, S.T.; Guo, Y.; Spain, J.C.; Zhou, N.Y. A recently assembled degradation pathway for 2,3-dichloronitrobenzene in Diaphorobacter sp. strain JS3051. mBio 2021, 12, e0223121.
  54. Gross, R.; Guzman, C.A.; Sebaihia, M.; dos Santos, V.A.; Pieper, D.H.; Koebnik, R.; Lechner, M.; Bartels, D.; Buhrmester, J.; Choudhuri, J.V.; et al. The missing link: Bordetella petrii is endowed with both the metabolic versatility of environmental bacteria and virulence traits of pathogenic Bordetellae. BMC Genom. 2008, 9, 449.
  55. Sen, D.; Brown, C.J.; Top, E.M.; Sullivan, J. Inferring the evolutionary history of IncP-1 plasmids despite incongruence among backbone gene trees. Mol. Biol. Evol. 2013, 30, 154–166.
  56. Heo, J.; Park, I.; You, J.; Han, B.-H.; Kwon, S.-W.; Lee, S.-W.; Ahn, J.-H. Genome sequence analysis of Sphingomonas histidinilytica C8-2 degrading a fungicide difenoconazole. Korean J. Microbiol. 2019, 55, 428–431.
  57. Pratama, A.A.; Jiménez, D.J.; Chen, Q.; Bunk, B.; Spröer, C.; Overmann, J.; van Elsas, J.D. Delineation of a subgroup of the genus Paraburkholderia, including P. terrae DSM 17804T, P. hospita DSM 17164T, and four soil-isolated fungiphiles, reveals remarkable genomic and ecological features-proposal for the definition of a P. hospita species cluster. Genome Biol. Evol. 2020, 12, 325–344.
  58. Morimoto, S.; Fujii, T. A new approach to retrieve full lengths of functional genes from soil by PCR-DGGE and metagenome walking. Appl. Microbiol. Biotechnol. 2009, 83, 389–396.
  59. Hoffmann, D.; Kleinsteuber, S.; Müller, R.H.; Babel, W. A transposon encoding the complete 2,4-dichlorophenoxyacetic acid degradation pathway in the alkalitolerant strain Delftia acidovorans P4a. Microbiol. Read. 2003, 149 Pt 9, 2545–2556.
  60. Salvà-Serra, F.; Donoso, R.A.; Cho, K.H.; Yoo, J.A.; Lee, K.; Yoon, S.H.; Piñeiro-Iglesias, B.; Moore, E.R.B.; Pérez-Pantoja, D. Complete multipartite genome sequence of the Cupriavidus basilensis type strain, a 2,6-dichlorophenol-degrading bacterium. Microbiol. Resour. Announc. 2021, 10, e00134-21.
  61. Vallaeys, T.; Albino, L.; Soulas, G.; Wright, A.D.; Weightman, A.J. Isolation and characterization of a stable 2,4-dichlorophenoxyacetic acid degrading bacterium, Variovorax paradoxus, using chemostat culture. Biotechnol. Lett. 1998, 20, 1073–1076.
  62. Müller, T.A.; Byrde, S.M.; Werlen, C.; van der Meer, J.R.; Kohler, H.P. Genetic analysis of phenoxyalkanoic acid degradation in Sphingomonas herbicidovorans MH. Appl. Environ. Microbiol. 2004, 70, 6066–6075.
  63. Nielsen, T.K.; Rasmussen, M.; Demanèche, S.; Cecillon, S.; Vogel, T.M.; Hansen, L.H. Evolution of sphingomonad gene clusters related to pesticide catabolism revealed by genome sequence and mobilomics of Sphingobium herbicidovorans MH. Genome Biol. Evol. 2017, 9, 2477–2490.
  64. Liu, H.; Wang, S.J.; Zhang, J.J.; Dai, H.; Tang, H.; Zhou, N.Y. Patchwork assembly of nag-like nitroarene dioxygenase genes and the 3-chlorocatechol degradation cluster for evolution of the 2-chloronitrobenzene catabolism pathway in Pseudomonas stutzeri ZWLR2-1. Appl. Environ. Microbiol. 2011, 77, 4547–4552.
  65. Ravatn, R.; Studer, S.; Zehnder, A.J.; van der Meer, J.R. Int-B13, an unusual site-specific recombinase of the bacteriophage P4 integrase family, is responsible for chromosomal insertion of the 105-kilobase clc element of Pseudomonas sp. strain B13. J. Bacteriol. 1998, 180, 5505–5514.
  66. Gaillard, M.; Vallaeys, T.; Vorhölter, F.J.; Minoia, M.; Werlen, C.; Sentchilo, V.; Pühler, A.; van der Meer, J.R. The clc element of Pseudomonas sp. strain B13, a genomic island with various catabolic properties. J. Bacteriol. 2006, 188, 1999–2013.
  67. Chain, P.S.; Denef, V.J.; Konstantinidis, K.T.; Vergez, L.M.; Agulló, L.; Reyes, V.L.; Hauser, L.; Córdova, M.; Gómez, L.; González, M.; et al. Burkholderia xenovorans LB400 harbors a multi-replicon, 9.73-Mbp genome shaped for versatility. Proc. Natl. Acad. Sci. USA 2006, 103, 15280–15287.
  68. Daligault, H.E.; Davenport, K.W.; Minogue, T.D.; Bishop-Lilly, K.A.; Broomall, S.M.; Bruce, D.C.; Chain, P.S.; Coyne, S.R.; Frey, K.G.; Gibbons, H.S.; et al. Whole-genome assemblies of 56 Burkholderia species. Genome Announc. 2014, 2, e01106-14.
  69. Hickey, W.J.; Sabat, G.; Yuroff, A.S.; Arment, A.R.; Pérez-Lesher, J. Cloning, nucleotide sequencing, and functional analysis of a novel, mobile cluster of biodegradation genes from Pseudomonas aeruginosa strain JB2. Appl. Environ. Microbiol. 2001, 67, 4603–4609.
  70. Corbella, M.E.; Puyet, A. Real-time reverse transcription-PCR analysis of expression of halobenzoate and salicylate catabolism-associated operons in two strains of Pseudomonas aeruginosa. Appl. Environ. Microbiol. 2003, 69, 2269–2275.
  71. Obi, C.C.; Vayla, S.; de Gannes, V.; Berres, M.E.; Walker, J.; Pavelec, D.; Hyman, J.; Hickey, W.J. The integrative conjugative element clc (ICEclc) of Pseudomonas aeruginosa JB2. Front. Microbiol. 2018, 9, 1532.
  72. Chao, H.J.; Chen, Y.Y.; Wu, J.; Yan, D.Z.; Zhou, N.Y. Complete genome sequence of a chlorobenzene degrader, Pandoraea pnomenusa MCB032. Curr. Microbiol. 2019, 76, 1235–1237.
  73. Jencova, V.; Strnad, H.; Chodora, Z.; Ulbrich, P.; Vlcek, C.; Hickey, W.J.; Paces, V. Nucleotide sequence, organization and characterization of the (halo)aromatic acid catabolic plasmid pA81 from Achromobacter xylosoxidans A8. Res. Microbiol. 2008, 159, 118–127.
  74. Omelchenko, M.V.; Makarova, K.S.; Wolf, Y.I.; Rogozin, I.B.; Koonin, E.V. Evolution of mosaic operons by horizontal gene transfer and gene displacement in situ. Genome Biol. 2003, 4, R55.
  75. Casjens, S.R.; Thuman-Commike, P.A. Evolution of mosaically related tailed bacteriophage genomes seen through the lens of phage P22 virion assembly. Virology 2011, 411, 393–415.
  76. McGowan, C.; Fulthorpe, R.; Wright, A.; Tiedje, J.M. Evidence for interspecies gene transfer in the evolution of 2,4-dichlorophenoxyacetic acid degraders. Appl. Environ. Microbiol. 1998, 64, 4089–4092.
  77. Huang, J.; Gogarten, J.P. Concerted gene recruitment in early plant evolution. Genome Biol. 2008, 9, R109.
  78. Vallaeys, T.; Fulthorpe, R.R.; Wright, A.M.; Soulas, G. The metabolic pathway of 2,4-dichlorophenoxyacetic acid degradation involves different families of tfdA and tfdB genes according to PCR-RFLP analysis. FEMS Microbiol. Ecol. 1996, 20, 163–172.
  79. Norris, V.; Merieau, A. Plasmids as scribbling pads for operon formation and propagation. Res. Microbiol. 2013, 164, 779–787.
  80. Chang, H.K.; Mohseni, P.; Zylstra, G.J. Characterization and regulation of the genes for a novel anthranilate 1,2-dioxygenase from Burkholderia cepacia DBO1. J. Bacteriol. 2003, 185, 5871–5881.
  81. Lawrence, J.G.; Roth, J.R. Selfish operons: Horizontal transfer may drive the evolution of gene clusters. Genetics 1996, 143, 1843–1860.
  82. Fang, G.; Rocha, E.P.; Danchin, A. Persistence drives gene clustering in bacterial genomes. BMC Genom. 2008, 9, 4.
  83. Price, M.N.; Huang, K.H.; Arkin, A.P.; Alm, E.J. Operon formation is driven by co-regulation and not by horizontal gene transfer. Genome Res. 2005, 15, 809–819.
  84. Kanai, Y.; Tsuru, S.; Furusawa, C. Experimental demonstration of operon formation catalyzed by insertion sequence. Nucleic Acids Res. 2022, 50, 1673–1686.
  85. Brandis, G.; Hughes, D. The SNAP hypothesis: Chromosomal rearrangements could emerge from positive Selection during Niche Adaptation. PLoS Genet. 2020, 16, e1008615.
  86. Zharikova, N.V.; Iasakov, T.R.; Zhurenko, E.I.; Korobov, V.V.; Markusheva, T.V. Bacterial genes of non-heme iron oxygenases, which have a Rieske-type cluster, catalyzing initial stages of degradation of chlorophenoxyacetic acids. Russ. J. Genet. 2018, 54, 284–295.
  87. Sayers, E.W.; Bolton, E.E.; Brister, J.R.; Canese, K.; Chan, J.; Comeau, D.C.; Farrell, C.M.; Feldgarden, M.; Fine, A.M.; Funk, K.; et al. Database resources of the National Center for Biotechnology Information in 2023. Nucleic Acids Res. 2023, 51, D29–D38.
  88. Madeira, F.; Pearce, M.; Tivey, A.R.N.; Basutkar, P.; Lee, J.; Edbali, O.; Madhusoodanan, N.; Kolesnikov, A.; Lopez, R. Search and sequence analysis tools services from EMBL-EBI in 2022. Nucleic Acids Res. 2022, 50, W276–W279.
  89. Sullivan, M.J.; Petty, N.K.; Beatson, S.A. Easyfig: A genome comparison visualizer. Bioinformatics 2011, 27, 1009–1010.
  90. Kumar, S.; Stecher, G.; Tamura, K. MEGA7: Molecular Evolutionary Genetics Analysis version 7.0 for bigger datasets. Mol. Biol. Evol. 2016, 33, 1870–1874.
Figure 1. Comparative genomic analysis of (a) tfdI, (b) tfdII, and (c) tfdIII gene clusters, showing their genomic rearrangements and their evolutionary relationships with one another and with tcb gene clusters. Clusters and adjacent regions are shown in linear visualization, with open reading frames (ORFs) represented by arrows; the clusters are indicated by the color key (bottom). The degree of identity between clusters, as determined by blastn, is indicated by the intensity of the grayscale-shaded regions, as shown in the heat key (bottom right). The scale in kilobase pairs (kbp) is shown at the bottom right of each cluster.
Figure 2. Comparative genomic analysis of tfdIV gene clusters. (a) Genomic rearrangements and evolutionary relationships between subtypes A, B and C. (b) Synteny analysis showing putative gene assembly of the tfdIV gene cluster (subtype A). Clusters and adjacent regions are shown in linear visualization, with open reading frames (ORFs) represented by arrows; the clusters are indicated by the color key (bottom). The degree of identity between clusters, as determined by blastn, is indicated by the intensity of the grayscale-shaded regions, as shown in the heat key (bottom right). The scale in kilobase pairs (kbp) is shown at the bottom right of each cluster.
Figure 3. Comparative genomic analysis of (a) clc and (b) tcb gene clusters, showing their synteny. Clusters and adjacent regions are shown in linear visualization, with open reading frames (ORFs) represented by arrows; the clusters are indicated by the color key (bottom). The degree of identity between clusters, as determined by blastn, is indicated by the intensity of the grayscale-shaded regions, as shown in the heat key (bottom right). The scale in kilobase pairs (kbp) is shown at the bottom right of each cluster.
Figure 4. Comparison of the gene structure of complete tfd, clc and tcb clusters. Clusters and adjacent regions are shown in linear visualization, with open reading frames (ORFs) represented by arrows; the clusters are indicated by the color key (bottom). The degree of identity between clusters, as determined by blastn, is indicated by the intensity of the grayscale-shaded regions, as shown in the heat key (bottom right). The scale in kilobase pairs (kbp) is shown at the bottom right of each cluster.
Figure 5. Phylogenetic classification of (a) chlorocatechol 1,2-dioxygenases, (b) chloromuconate cycloisomerases, (c) dienelactone hydrolases, and (d) maleylacetate reductases of tfd, tcb and clc gene clusters. Each color corresponds to one cluster. Bootstrap support values from the maximum likelihood (ML) analyses that are higher than 50% are indicated above the branches. The trees are drawn to scale, with branch lengths measured in the number of substitutions per site.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Iasakov, T. Evolution End Classification of tfd Gene Clusters Mediating Bacterial Degradation of 2,4-Dichlorophenoxyacetic Acid (2,4-D). Int. J. Mol. Sci. 2023, 24, 14370. https://0-doi-org.brum.beds.ac.uk/10.3390/ijms241814370