Genome mining and phylogenetic analyses of gene clusters
To gain insight into potential gene evolution scenarios we searched for gene loci coding for assembly lines related to the aur and nor PKSs. Genome mining using BLAST (Basic Local Alignment Search Tool) in the NCBI (National Center for Biotechnology Information) database identified several gene loci with high homology to the aur and nor gene clusters (Fig. 2a). These gene clusters share the genes for PNBA starter unit biosynthesis, polyketide chain elongation, post-PKS modification and regulation (Supplementary Table 1).
Although eight of the gene clusters can be grouped either into aur- or nor-type, one orphan gene cluster from Streptomyces scabrisporus (DSM41855) deviates from the others as it lacks a norB/aurB homolog. The absence of this gene and the correct assembly of the contigs were confirmed by four pairs of PCR primers (Supplementary Fig. 1). To elucidate the product of the encoded cryptic assembly line, we cultured S. scabrisporus and monitored its metabolic profile. Unexpectedly, we found that this strain produces neoaureothin (Fig. 2b and Supplementary Fig. 11) despite the absence of a norB homolog in the identified gene cluster (Fig. 2a). It is conceivable that a NorB homolog is encoded elsewhere in the genome. Indeed, we identified a candidate for a freestanding NorB gene (WP_078978330.1) in the yet incomplete genome sequence of S. scabrisporus. This finding of a split nor gene cluster is intriguing as it shows that gene rearrangements take place in aur-/nor-type gene clusters. Such rearrangements could in fact drive the evolution of metabolic diversity in the aureothin family.
With the enlarged set of gene clusters at hand, we aimed at gaining insight into their phylogenetic relationship, which could give clues about their evolution. Therefore, the amino acid sequences of the KS domains from those homologous gene clusters were aligned with the KS domain sequences from other actinobacterial PKS gene clusters by the GUIDANCE2 Server26. The aligned sequences were subjected to phylogenetic analyses, and the evolutionary tree was constructed by Bayesian inference with the MrBayes software27 (Fig. 2c, Supplementary Fig. 2, Supplementary Table 2). For simplification, some KS domains from aur- and nor-type gene clusters were excluded because their sequences were identical.
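The exclusion of identical KS sequences before tree building can be illustrated with a minimal Python sketch. It is not the procedure used in this study, which is not described at that level of detail; the function name `collapse_identical` and the toy records are hypothetical and do not correspond to real KS domain sequences.

```python
def collapse_identical(records):
    """Group sequence names by exact sequence identity.

    records: iterable of (name, sequence) pairs.
    Returns a list of (representative_name, sequence, duplicate_names),
    keeping the first name seen for each distinct sequence.
    """
    groups = {}
    for name, seq in records:
        # Ignore alignment gaps and letter case when comparing sequences.
        key = seq.replace("-", "").upper()
        groups.setdefault(key, []).append(name)
    return [(names[0], seq, names[1:]) for seq, names in groups.items()]


# Hypothetical toy input (assumption, for illustration only).
toy_records = [
    ("strainA_KS1", "MSTPIAIVGM"),
    ("strainB_KS1", "MSTPIAIVGM"),
    ("strainC_KS1", "MSTPVAIVGL"),
]
collapsed = collapse_identical(toy_records)
for representative, _, duplicates in collapsed:
    print(representative, "also represents", duplicates)
```

Keeping a record of which names each representative stands for makes it possible to map the tree tips back to all source clusters afterwards.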
Each module of the aur and nor clusters is monophyletic, i.e. the sequences form their own branches without intermingling of sequences from other modules. The only apparent exception is the NorC KS-1 sequence of S. scabrisporus, which clusters together with NorB KS sequences. An alignment of the S. scabrisporus NorC KS-1 with selected AurB KS/NorB KS and AurC KS-1/NorC KS-1 sequences (Supplementary Fig. 3) revealed its hybrid nature. When comparing positions that have a characteristic amino acid residue or indel pattern in the NorB KS or NorC KS-1 group, it becomes clear that the S. scabrisporus sequence more closely resembles the module B type in the amino-terminal region, whereas the carboxy-terminal stretch shows higher similarity to the module C KS-1 type. This is probably due to a recombination event. Since the similarity with the module B type prevails, the sequence is located in the module B KS-1 branch of the tree.
AurA/NorA as well as NorA′-2 sequences form separate branches. The other modules, however, originated from a common ancestor. It is important to note that the NorA′-1 cluster, which comprises sequences exclusive to the nor cluster, forms an early branch within that big monophyletic group. Therefore, from a phylogenetic perspective, it is reasonable to propose that aur-type PKSs emerged from nor-type PKSs, possibly through gene deletion (Fig. 2d). This result is in line with the previous analysis, which suggested that a nor-to-aur PKS evolution would be the most parsimonious scenario20.
Morphing the nor PKS into an aur PKS
For functional analyses and PKS engineering approaches, we needed to establish a robust expression system. Initially, the heterologous expression of the nor gene cluster was achieved in S. lividans by coexpression of the transcriptional regulator AurD from the aur gene cluster. The titer of neoaureothin was, however, unsatisfactory (15 mg L−1)28. This low yield might result from the non-concerted expression of the nor biosynthesis genes using a three-plasmid system. To increase neoaureothin production we optimized the heterologous expression system to reassemble the nor gene cluster in a continuous gene region (Fig. 3a). First, a part of the nor gene cluster (pNT42) was integrated into the genome of the heterologous expression host S. albus by site-specific recombination. The complete nor gene cluster was then obtained by a homologous recombination using a suicide vector (pYU93) harboring the left part of the gene cluster. The resulting strain (S. albus::pNT42/pYU93; S. albus_nor PKS) produced threefold higher titers (45 mg L−1) of neoaureothin compared to the previous construct (Fig. 3b). Thus, S. albus_nor PKS was used as a platform for PKS engineering.
To emulate the presumed evolutionary processes involved in the nor-to-aur PKS transformation (Fig. 2d) we attempted to morph the nor PKS into an assembly line producing aureothin. Since the polyketide backbone of aureothin lacks two methylmalonyl-derived C2 units, two requisite modules in the nor PKS needed to be deleted. As (AurA/NorA), (AurB/NorB), and (AurC/NorC) share high identities on the DNA and amino acid levels, we excised the gene regions for modules 2 and 3 in the nor gene cluster. In doing so, it was essential to consider the compatibility of the docking domains between the individual PKS proteins29,30,31,32,33, because the sequence elements at the extreme C- and N-termini of PKS subunits help mediate their interactions. The amino acid sequence alignment among PKS proteins indicated that the CDD (C-terminal docking domain) of AurA/NorA and the NDD (N-terminal docking domain) of AurB/NorA′ form a class 1a docking domain system (Supplementary Fig. 4)32,33. On the other hand, the CDD of NorA′ and the NDD of NorB form a class 1b system (Supplementary Fig. 4)32,33. In a control experiment neglecting CDD/NDD compatibility we generated a mutant (S. albus_aur§ PKS) lacking norA′ (Fig. 3c). Surprisingly, this strain produced intermediate (5) and trace amounts of 7-hydroxydeoxyaureothin (3), but not aureothin (1) (Fig. 3e and Supplementary Fig. 12). Apparently, the different types of docking domains between NorA and NorB can communicate, albeit only weakly. To achieve higher compatibility between the CDD of NorA and the NDD of NorB, we constructed two recombinant PKS variants with different fusion sites. Initially, we had employed S. lividans as the heterologous host28, but to increase neoaureothin production we reconstructed an S. albus_nor PKS expression system. Whereas the first system has a fusion site at the docking domain region (aur# PKS), the second one has a fusion site at the KS-AT linker region (aur* PKS) (Fig. 3c, d and Supplementary Fig. 5).
In both cases, we swapped the NDD of NorB for that of NorA′, the natural partner of the CDD of NorA. In modular PKSs, two hot spots for evolutionary recombination events have been suggested, the KS-AT linker and the post-AT linker34,35. We initially chose one fusion site upstream of the conserved KS-AT linker, as we have already succeeded in engineering aureothin congeners using this site for recombinations (Supplementary Fig. 5)28,36. The constructs were introduced into S. albus via triparental conjugation to generate S. albus_aur# PKS and S. albus_aur* PKS.
The verified recombinant strains were fermented, and the ethyl acetate extracts of the cultures were monitored by HPLC. In both recombinant strains the production of intermediate (5, Supplementary Figs. 19–23, Supplementary Table 8) and a trace amount of 7-hydroxydeoxyaureothin (3) but no aureothin (1) could be detected (Fig. 3e and Supplementary Fig. 12). All three recombinant strains (aur§ PKS, aur# PKS, and aur* PKS) produced elevated amounts of intermediate (5) compared to AurA alone24, indicating that the docking domains promote protein interactions. However, the low production of the full-length polyketide pointed to incorrect protein folding, likely because of suboptimal fusion sites. Thus, we revisited the fusion site in the aur* PKS and found that the recombinant KS-AT linker in the aur* PKS was four amino acids longer than that in the genuine aur PKS (Fig. 3d). Since this difference might have an impact on protein configuration and interactions in the aur* PKS, we shortened the recombinant KS-AT linker region by λRed-mediated recombination, yielding, in part serendipitously, the aur** PKS gene cluster (Supplementary Figs. 5 and 6).
In the extract of the culture broth of S. albus_aur** PKS, the target molecule (1) could still not be detected by HPLC. Instead, we noted the formation of other metabolites (Fig. 3e and Supplementary Fig. 12). Through HPLC-HRMS analysis and by comparison with an authentic reference, the compounds were identified as 7-OH-deoxyaureothin (3) and 7-deoxyaureothin (4), which differ from 1 only in that they lack the tetrahydrofuran ring (Supplementary Fig. 12). Notably, the polyketide backbones of 1−3 are identical, which indicated that the nor PKS has been successfully morphed into an aureothin assembly line. Yet, the enzymatic tailoring of the polyketide scaffold proved to be erratic.
From in vivo and in vitro studies we know that the formation of the tetrahydrofuran ring is the last step in aureothin biosynthesis and that its installation involves two sequential C-O-bond formations catalyzed by a single cytochrome P450 monooxygenase, AurH37,38. Furthermore, the AurH-mediated oxygenation processes are highly fine-tuned, and changes in the enzyme or in the size of the substrate result in incomplete transformations or alternative reaction channels39,40. By analogy, the short aureothin backbone does not appear to be the preferred substrate of the homologous oxygenase (NorH) from the nor pathway. NorH is only able to convert deoxyaureothin (4) into the hydroxylated congener 3, whereas the second oxidation and thus also heterocyclization do not take place (Fig. 3f). In the case of the tentative nor-to-aur PKS evolution, not only did the PKS need to morph, but the tailoring enzyme (AurH) also needed to adjust.
Mutagenesis and cross-complementation
To gain insight into possible changes in the CYP450 we compared NorH and AurH. Both enzymes share similar amino acid sequences (64% identity, 75% positives), and recognize similar substrates. Thus, the overall structure of NorH is likely similar to AurH39. Threading of the NorH amino acid sequence onto the AurH crystal structure revealed a conserved hydrophobic pocket for binding the pyrone ring of deoxyaureothin. However, the modeling indicated that NorH has a wider cavity around the active center, as some amino acid residues possess smaller or more flexible side chains than those found in AurH (Fig. 4a). Based on this information, a number of point mutations (I19F, V71L, T291L, T292P, W317F, and T392L) could, in principle, reconfigure the active site of NorH to recognize deoxyaureothin39.
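For readers unfamiliar with how a pairwise identity value is derived from an alignment, the following Python sketch computes percent identity as identical columns divided by alignment length, which is one common convention. Whether the values reported above were computed exactly this way is not stated, so this is an assumption, and the two aligned strings are hypothetical toy sequences rather than NorH or AurH.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Convention used here (an assumption): identical columns divided by
    the number of alignment columns, skipping columns where both
    sequences have a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # a column with gaps in both rows carries no information
        columns += 1
        if a == b:
            identical += 1
    if columns == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * identical / columns


# Hypothetical toy alignment: 4 identical columns out of 6.
toy_identity = percent_identity("MKT-LV", "MKSALV")
print(round(toy_identity, 1))
```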
To test this hypothesis, we altered the active site of NorH by site-directed mutagenesis. Thus, we constructed a range of norH variants, including the I19F, V71L, T291L, T292P, W317F, and T392L mutants, and cloned them individually into expression vectors for complementation of an aurH knock-out mutant (ΔaurH)37 (Fig. 4b and Supplementary Table 3). Expression vectors containing wild-type norH and aurH served as negative and positive controls, respectively. All plasmids were introduced into the ∆aurH mutant by triparental conjugation, and the metabolic profiles of the individual transformants were monitored by HPLC-MS (Fig. 4b and Supplementary Fig. 13).
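The mutation labels used above follow the usual notation of wild-type residue, 1-based position, and replacement residue (for example, I19F). As an in silico illustration only, the Python sketch below applies such labels to a protein sequence and checks that the stated wild-type residue is actually present. The helper name `apply_point_mutations` and the five-residue toy sequence are hypothetical; the sketch does not use the NorH sequence and does not represent the cloning procedure.

```python
import re

# Matches labels such as "I19F": wild-type residue, position, replacement.
MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_point_mutations(sequence, mutations):
    """Return a copy of `sequence` with the listed substitutions applied.

    Positions are 1-based. A ValueError is raised if a label is malformed,
    out of range, or does not match the wild-type residue at that position.
    """
    residues = list(sequence)
    for label in mutations:
        match = MUTATION_PATTERN.match(label)
        if match is None:
            raise ValueError(f"malformed mutation label: {label}")
        wild_type, position, replacement = match.groups()
        index = int(position) - 1
        if index < 0 or index >= len(residues):
            raise ValueError(f"position out of range: {label}")
        if residues[index] != wild_type:
            raise ValueError(f"wild-type residue mismatch: {label}")
        residues[index] = replacement
    return "".join(residues)


# Hypothetical toy sequence (assumption, for illustration only).
toy_sequence = "MAIVT"
single_mutant = apply_point_mutations(toy_sequence, ["I3F"])
double_mutant = apply_point_mutations(toy_sequence, ["I3F", "T5L"])
print(single_mutant, double_mutant)
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors, which are easy to introduce when positions come from a structure-based alignment.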
The ∆aurH mutant produces exclusively 4; when complementing the mutant with native AurH, 4 is readily converted into 1 (positive control). In contrast, the mutant complemented with NorH partly transformed 4 into 3, and it was not capable of forming 1 (negative control). All ∆aurH mutant strains complemented with point-mutated NorH variants showed the same chemotype. The only difference was a slightly increased 3-to-4 ratio for NorH-V71L and NorH-T292P (Fig. 4b and Supplementary Fig. 13). These results indicate that these individual point mutations of NorH are not sufficient to reconfigure its active site to generate the THF ring of aureothin.
Therefore, we generated an expression plasmid for a NorH variant containing all six point mutations (I19F-V71L-T291L-T292P-W317F-T392L). Yet, in the metabolic profile of the ∆aurH strain complemented with the multiple point-mutated NorH variant, 1 could not be detected, either (Fig. 4b and Supplementary Fig. 13). Thus, we scrutinized the highly similar P450 monooxygenases NorH and AurH (Supplementary Fig. 7 and Supplementary Table 4) and attempted to target the protein domains that are relevant for THF-ring formation by constructing chimeric NorH variants. AurH variants adopt different conformations mainly at the B2 and B2′ two-helix-bundle, FG-loop and β2-loop that surround the active center and approach the center after binding to substrate39. These residues likely generate steric pressure to bend the intermediate and push it towards the reaction center, facilitating THF-ring formation. In order to test whether these residues are important for THF-ring formation, five gene regions around these residues from AurH were amplified and used to replace each corresponding region in the NorH gene (Supplementary Fig. 8, Supplementary Table 5). We noted, however, that these chimeras also produce exclusively 7-OH deoxyaureothin (3) (Supplementary Figs. 14 and 15). We also created AurH/NorH hybrids differing at the N-terminal end of the α helix (Supplementary Fig. 8, Supplementary Table 6). These head/tail exchange hybrids showed reduced catalytic activity (Fig. 4c and Supplementary Fig. 16). Thus, the exchanged region was narrowed down to avoid possible deleterious effects on the overall structure. The fusion sites were placed within the C helix and the K helix (Supplementary Fig. 8). Thus, NorH was divided into three areas, part A, B, and C. Correspondingly, AurH was dissected into parts a, b, and c (Fig. 4c). 
The HPLC profiles of the obtained NorH/AurH hybrids (ABc, AbC, aBC, Abc, aBc, and abC) indicated that only the Abc hybrid could transform 7-deoxyaureothin (4) into aureothin (1), albeit only incompletely (Fig. 4c, Supplementary Fig. 16).
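The three-part swapping scheme can be enumerated systematically. The Python sketch below lists every combination of the NorH parts (A, B, C) and the AurH parts (a, b, c) and then removes the two parental arrangements, which leaves the six hybrids named above. It only illustrates the combinatorial design; the function name `enumerate_variants` is hypothetical.

```python
from itertools import product


def enumerate_variants():
    """List all three-part combinations of NorH (upper case) and AurH
    (lower case) segments, in the order part A/a, part B/b, part C/c."""
    return ["".join(parts) for parts in product("Aa", "Bb", "Cc")]


all_variants = enumerate_variants()

# The all-NorH (ABC) and all-AurH (abc) arrangements are the parent
# enzymes, not hybrids, so they are excluded here.
hybrids = [v for v in all_variants if v not in ("ABC", "abc")]
print(hybrids)
```

With two sources and three segments there are 2^3 = 8 arrangements, so the six hybrids tested cover the complete design space for this choice of fusion sites.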
Taken together, structure-based rational mutations and domain swapping of NorH are not sufficient to reconfigure the active site of NorH to function like AurH. Even the NorH hybrid in which the C-terminal 70% is derived from AurH (Abc) shows only substantially reduced activity. These results indicate that complex evolutionary processes would be required to maintain THF ring formation activity when mutating NorH to AurH. Consequently, we also considered the reverse scenario and aimed at emulating a potential aur-to-nor PKS evolution.
Therefore, we interrogated the substrate specificity of AurH with respect to the enzyme’s ability to transform deoxyneoaureothin (6) into 2. To this end, norH was deleted in S. albus_nor PKS using the λRed system. The resulting ∆norH mutant was fermented, and the culture extract was analyzed by HPLC. The HPLC profile showed that ∆norH lost the ability to produce 2. Instead, a different metabolite was detected (Fig. 4d and Supplementary Fig. 17). The structure of this metabolite was determined as 6 by 1H and 13C NMR, 1H-1H COSY, HSQC and HMBC (Supplementary Figs. 17 and 24–29, Supplementary Table 9).
The NorH and AurH expression plasmids were introduced into the ∆norH mutant by triparental conjugation. As expected, NorH restored the production of 2 (Fig. 4d and Supplementary Fig. 17). Surprisingly, complementing the ∆norH mutant with AurH also restored the production of 2 (Fig. 4d and Supplementary Fig. 17). This result demonstrates that AurH is relatively flexible in substrate specificity and highly efficient in converting 6 to 2 (Fig. 4e and Supplementary Table 7). The broader substrate specificity of AurH indicates that an evolution from AurH to NorH is a more probable scenario according to the generalist-to-specialist model in enzyme evolution, where ancestral enzymes show higher promiscuity and the more specialized enzymes are evolved to catalyze specific reactions41,42.
Morphing the aur PKS into a nor PKS
Since AurH catalyzes the transformation of 6 into 2, we investigated the possibility of an aur-to-nor gene cluster evolution. Therefore, we aimed at integrating modules 2 and 3 of the nor PKS between modules 1 and 2 of the aur PKS. To achieve this goal, two chimeras with different recombination sites were constructed: one fusion site is in the docking domain region (nor# PKS), and the other one in the KS-AT linker region (nor* PKS) (Fig. 5a).
For the first construct (nor# PKS), we generated a fusion site at the N-terminal docking domain region of aurB and PCR-amplified intact norA′ including its N- and C-terminal docking domains (norA′#) (Supplementary Fig. 9). The alignment of amino acid sequences of the docking domains showed that the interactions between AurA/AurB and NorA/NorA′ are of class 1a, and the interaction between NorA′/NorB is of class 1b (Supplementary Fig. 4)29,30. To facilitate the interaction between NorA′ and AurB, the N-terminal docking domain of AurB was swapped for that of NorB to generate the nor# PKS variant. For the second construct (nor* PKS), we chose a fusion site within the KS-AT linker region of AurB, and amplified the gene fragments for the region between the NorA′-AT2 domain and the NorB-KS4 domain (NorA′*-NorB*) for recombination (Supplementary Figs. 5 and 10). To maintain the overall conformation of proteins in the nor* PKS, the length of the recombinant KS-AT linker was adjusted to match the size of the genuine nor PKS (Fig. 5b and Supplementary Fig. 5).
The verified constructs were introduced into S. albus to generate S. albus_nor# PKS and S. albus_nor* PKS. In the metabolic profile of S. albus_nor# PKS compound 2 could not be detected. Yet, PNBA and 1 were produced, indicating that gene expression and polyketide production were functional (Fig. 5c). In the case of S. albus_nor* PKS, we detected 1 and 2 (Fig. 5c and Supplementary Fig. 18), which indicated that we successfully modified the aur PKS to produce the homologous polyketide 2. Thus, we showed that it is possible to modify the aur gene cluster to the nor gene cluster through PKS engineering in a manner that emulates natural evolutionary processes. The strain expressing the nor* PKS genes also produces a series of congeners: non-oxidized 7-deoxyneoaureothin (6), over-oxidized 11a-hydroxyneoaureothin (7), non-reduced 7-deoxy-7-dehydroneoaureothin (10), non-reduced 2-pyrone-7-deoxy-7-dehydroneoaureothin (9), and 2-pyrone-4-desmethyl-7-deoxy-7-dehydroneoaureothin (8). The structures of all compounds were elucidated by NMR analyses (7: Supplementary Figs. 30–34, Supplementary Table 10; 8: Supplementary Figs. 35–40, Supplementary Table 11; 9: Supplementary Figs. 41–46, Supplementary Table 12; 10: Supplementary Figs. 47–52, Supplementary Table 13). The presence of three highly unstable, non-reduced congeners indicates that the enoylreductase domain of AurB cannot process the polyketide intermediate accurately, likely because the length of the intermediate differs from that of the original substrate. This observation is in line with the remarkable finding that the nor* PKS can still produce 1. A plausible explanation could be that two modules are skipped during polyketide chain elongation. The phenomenon of PKS module skipping has previously been reported in a PKS engineering study where module 2 from the rapamycin PKS was inserted between modules 1 and 2 in DEBS1-TE43.
In the hybrid PKS the polyketide chain underwent direct ACP-to-ACP transfer to pass through rapamycin module 244. In the case of nor* PKS, the formation of the shorter chain may be rationalized by a similar ACP-to-ACP transfer or by the interaction between the C-terminal docking domain of AurA and the N-terminal docking domain of NorB, which is also present in the nor# PKS.
To corroborate this model, we generated another strain (S. albus::pHY129) in which the N-terminal docking domain of AurB was swapped for that of NorB without inserting NorA′# (Supplementary Fig. 9). By LC-MS monitoring we found that the recombinant strain produces aureothin. This experiment confirms that the C-terminal docking domain of AurA in fact recognizes the N-terminal docking domain of NorB. Thus, the shortcut in the nor* PKS can be rationalized. It remains unclear why the class 1a C-terminal docking domain of AurA cannot interact with the class 1a N-terminal docking domain of NorA′ in S. albus_nor# PKS. Weak interactions between different docking domain types have also been observed in other PKS systems45.