
Alice Pettitt: An integrative characterization of proline cis and trans conformers in a disordered peptide

Intrinsically disordered proteins (IDPs) often contain proline residues that undergo cis/trans isomerization.

While molecular dynamics (MD) simulations have the potential to fully characterize the proline cis and trans subensembles, they are limited by the slow timescales of isomerization and force field inaccuracies. NMR spectroscopy can report on ensemble-averaged observables for both the cis-proline and trans-proline states, but a full atomistic characterization of these conformers is challenging. Given the importance of proline cis/trans isomerization for influencing the conformational sampling of disordered proteins, we employed a combination of all-atom MD simulations with enhanced sampling (metadynamics), NMR, and small-angle x-ray scattering (SAXS) to characterize the two subensembles of the ORF6 C-terminal region (ORF6CTR) from SARS-CoV-2 corresponding to the proline-57 (P57) cis and trans states. We performed MD simulations in three distinct force fields, all optimized for disordered proteins: AMBER03ws, AMBER99SB-disp, and CHARMM36m. Each simulation was run for an accumulated time of 180–220 μs until convergence was reached, as assessed by blocking analysis. The cis-P57 populations predicted from metadynamic simulations in AMBER03ws agreed well with the populations obtained from experimental NMR data. Likewise, the radius of gyration predicted from the metadynamic simulations in AMBER03ws agreed well with that measured using SAXS. Our findings suggest that both the cis-P57 and trans-P57 conformations of ORF6CTR are extremely dynamic and that interdisciplinary approaches combining multiscale computations and experiments offer avenues to explore highly dynamic states that cannot be reliably characterized by either approach in isolation.

Significance

This study employs MD simulations (with metadynamics), NMR spectroscopy, and SAXS to elucidate the individual cis-proline and trans-proline conformations of ORF6CTR from SARS-CoV-2. The good agreement on proline cis/trans populations observed in experiments (NMR) and those calculated from simulations in the AMBER03ws force field (with SAXS reweighting) showcases the efficiency of this interdisciplinary approach, which can be used to characterize highly dynamic disordered protein states, even for very slow processes. Furthermore, our study emphasizes the importance of considering both computational and experimental methodologies to gain a more holistic understanding of highly dynamic proteins. The presented integrative approach sets a precedent for future studies aiming to explore complex and dynamic biological systems with slow transitions such as proline isomerizations.

Introduction

Intrinsically disordered proteins (IDPs) and disordered regions, which represent at least 30% of the human proteome (1), are particularly common in cancer-associated proteins, with up to 80% containing disordered regions (2), and in viruses, where their coverage ranges from 3 to 55% depending on the viral species (3). Unlike folded proteins, disordered proteins are highly dynamic, and they often exist as an ensemble of diverse heterogeneous conformations that lack a single three-dimensional (3D) structure. Compared with folded proteins, the primary sequences of disordered proteins have a nearly 2-fold increase of proline residues (4), which are well-known to reduce the formation of secondary structure in proteins (5). In particular, proline residues in disordered proteins have been shown to play key roles in regulating protein-protein interactions (6,7), posttranslational modifications (8), and liquid-liquid phase separation (9).

Most peptide bonds within proteins exist almost exclusively in the energetically favorable trans conformation. However, for proline residues, the free energy difference between the cis and trans isomers is lower due to the cyclic structure of this amino acid. Given the high energy barrier to rotation, approximately 84 kJ mol⁻¹ (10), proline isomerization is generally a slow process (11), occurring at a rate of 10⁻³–10⁻² s⁻¹ at room temperature, depending on the adjacent residues (12,13). The cis-proline population typically ranges between 5 and 10% in disordered proteins (4), but this can vary substantially depending on the length and composition of the amino acid sequence (6,14,15). Consequently, multiple cis-proline conformations may be present within polyproline disordered protein ensembles. These ensembles sample a vast conformational space of very slowly exchanging conformers, further increasing their complexity (16).
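As a rough consistency check on the numbers quoted above, the sketch below converts the barrier into a rate and the cis population into a free energy difference. It assumes that the 84 kJ mol⁻¹ barrier can be treated as an activation free energy in the Eyring equation with a transmission coefficient of 1, and that "room temperature" means 298.15 K; neither assumption comes from the text, and the function names are illustrative.

```python
import math

R = 8.314462618e-3   # gas constant, kJ mol^-1 K^-1
K_B = 1.380649e-23   # Boltzmann constant, J K^-1
H = 6.62607015e-34   # Planck constant, J s


def eyring_rate(barrier_kj_mol, temperature_k=298.15):
    """Eyring (transition-state theory) rate in s^-1.

    Assumes the barrier is an activation free energy and that the
    transmission coefficient equals 1.
    """
    prefactor = K_B * temperature_k / H
    return prefactor * math.exp(-barrier_kj_mol / (R * temperature_k))


def cis_minus_trans_free_energy(p_cis, temperature_k=298.15):
    """Free energy of cis relative to trans (kJ mol^-1) for a two-state system."""
    return -R * temperature_k * math.log(p_cis / (1.0 - p_cis))


if __name__ == "__main__":
    print(f"rate for an 84 kJ/mol barrier: {eyring_rate(84.0):.2e} s^-1")
    for p_cis in (0.05, 0.10):
        dg = cis_minus_trans_free_energy(p_cis)
        print(f"p_cis = {p_cis:.2f}: dG(cis - trans) = {dg:.2f} kJ/mol")
```

Under these assumptions the rate comes out at roughly 10⁻² s⁻¹, at the upper end of the quoted range, and cis populations of 5 to 10% correspond to the cis state lying roughly 5 to 7 kJ mol⁻¹ above the trans state.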

Molecular dynamics (MD) simulations are often used to characterize the ensemble of disordered proteins as they can resolve individual conformations within an ensemble at atomic resolution, which is a challenge for many experimental techniques. Significant progress has been made over the last decade in optimizing force fields for modeling disordered proteins (17,18,19) and in integrating MD simulations with experimental data to improve their accuracy (20,21). Despite these advances, sampling the full configurational energy landscape of disordered protein ensembles in all-atom explicit solvent MD simulations is extremely computationally expensive. Proline cis/trans isomerization presents an additional challenge due to the slow timescales of this process (12,13), which are generally not accessible in brute-force MD simulations alone (22), even on today’s most powerful computers. However, when suitable collective variables (CVs) can be identified, metadynamics, an enhanced sampling approach, offers an effective method for sampling slow motions (23,24). Indeed, metadynamics has been used to encourage exploration of the full configurational space of disordered proteins (7,25) and to sample proline cis/trans isomerization in simulations of dipeptides and folded systems (26,27). In the latter cases, the ζ angle (Cα(i−1), O(i−1), Cδ(i), Cα(i), where i is the proline residue) was employed as one CV for the isomerization and pyramidalization of the amide nitrogen, and the ψ angle (N(i), Cα(i), C′(i), N(i+1)) was employed as an additional CV to control the amide orientation, which may affect the rate of transition between the cis-proline and trans-proline conformations. Both CVs are required to enhance proline cis/trans sampling as they compensate for each other.
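Both CVs described above are dihedral angles defined by four atoms. The following is a minimal sketch of how such an angle can be computed from Cartesian coordinates; it is not the implementation used in the cited studies. The `is_cis` helper is an added illustration that classifies the standard peptide-bond ω angle (near 0° for cis, near ±180° for trans) rather than ζ, and its 90° cutoff is an assumption.

```python
import numpy as np


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points, in (-180, 180]."""
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 = b1 / np.linalg.norm(b1)         # unit vector along the central bond
    v = b0 - np.dot(b0, b1) * b1         # b0 projected onto the plane normal to b1
    w = b2 - np.dot(b2, b1) * b1         # b2 projected onto the same plane
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return float(np.degrees(np.arctan2(y, x)))


def is_cis(omega_deg, cutoff_deg=90.0):
    """Illustrative classifier on the omega angle; the cutoff is an assumption."""
    return abs(omega_deg) < cutoff_deg


if __name__ == "__main__":
    # Synthetic coordinates, not taken from any simulation.
    a, b, c = [0.0, 1.0, 0.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]
    print(dihedral(a, b, c, [1.0, 1.0, 0.0]))    # planar, same side
    print(dihedral(a, b, c, [1.0, -1.0, 0.0]))   # planar, opposite side
```

For ζ or ψ, the four points would be the atom positions listed in the text for each frame of a trajectory.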

NMR spectroscopy is well suited to characterizing ensemble-averaged properties of disordered proteins at atomic resolution under physiological conditions (pH, temperature, and salt concentrations) (28). Furthermore, NMR can uniquely characterize and quantify the populations of cis-proline and trans-proline conformations. The distinct chemical environments for the two proline isomers, coupled with their slow exchange, can result in the detection of two separate peaks for neighboring residues or the proline itself. NMR has therefore been used not only to characterize the overall ensemble of disordered proteins (29,30) but also, extensively, to characterize the structural propensities and dynamics of cis-proline conformations in disordered proteins (6,14,15,31).

Another experimental technique that can report on the ensembles of disordered proteins in solution is small-angle x-ray scattering (SAXS). This technique can provide coarse structural information relating to a protein’s size and shape. The capability to predict SAXS profiles from atomic coordinates makes it possible to compare conformational ensembles from MD simulations with experimental SAXS data (32). While SAXS measurements offer powerful global information, they report ensemble-averaged states and cannot generally distinguish between the cis-proline and trans-proline configurations. Complementary approaches, such as NMR, are essential for providing detailed experimental information at the local scale.
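The simplest global size descriptor that can be compared between an MD ensemble and SAXS is the radius of gyration. The sketch below computes it geometrically from atomic coordinates and averages it over a weighted ensemble. It ignores the hydration layer and does not compute a full scattering profile, so it is only a simplified stand-in for the SAXS prediction methods referred to above; the function names and the root-mean-square averaging convention are assumptions.

```python
import numpy as np


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of one conformation (same units as coords)."""
    coords = np.asarray(coords, dtype=float)
    if masses is None:
        masses = np.ones(len(coords))    # equal weights if no masses are given
    masses = np.asarray(masses, dtype=float)
    center = np.average(coords, axis=0, weights=masses)
    sq_dist = np.sum((coords - center) ** 2, axis=1)
    return float(np.sqrt(np.average(sq_dist, weights=masses)))


def ensemble_rg(rg_values, weights=None):
    """Root-mean-square Rg over an ensemble, with optional per-frame weights."""
    rg = np.asarray(rg_values, dtype=float)
    return float(np.sqrt(np.average(rg ** 2, weights=weights)))


if __name__ == "__main__":
    # Two synthetic "conformations" of a three-bead chain.
    compact = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
    extended = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [3.0, 0.0, 0.0]]
    rgs = [radius_of_gyration(c) for c in (compact, extended)]
    print(rgs, ensemble_rg(rgs, weights=[0.3, 0.7]))
```

The per-frame weights are where metadynamics bias corrections or experimental reweighting would enter in practice.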

Here, we used an integrative approach anchored in all-atom explicit solvent metadynamic simulations to characterize the C-terminal region of open reading frame 6 (ORF6CTR) from severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). This region of ORF6 is predicted to be disordered (Fig. S1, A and B) and binds to host proteins via an essential methionine residue at position 58 (M58), leading to suppression of the innate immune response (33,34,35). Moreover, this 21-residue peptide contains a single proline residue at position 57 (P57), which may influence its binding to host proteins as it immediately precedes M58 (33,34). We sampled the conformational space of ORF6CTR using three different force fields, each optimized for disordered proteins: AMBER03ws (a03ws) (17), AMBER99SB-disp (a99SB-disp) (18), and CHARMM36m (C36m) (19). We employed metadynamics to enhance sampling (23,24), using various local and global CVs, including those on the P57 ζ and ψ angles (26,27). To reweight and validate the resulting conformational ensembles, we compared ensemble-averaged properties from each force field to NMR and SAXS data. Specifically, we employed NMR chemical shifts to report on the local properties and populations of the cis-P57 and trans-P57 states, NMR diffusion experiments to compare the global properties of both states, and NMR spin-relaxation experiments to probe dynamics. Moreover, SAXS data were used to select the most accurate force field for predicting the ORF6CTR global conformational ensemble. To further refine the conformational ensembles, we updated the statistical reweighting using a Bayesian/maximum entropy (BME) approach (21,36).
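The idea behind maximum entropy reweighting can be illustrated with a single observable: frame weights are perturbed as little as possible from their prior values so that the weighted average matches an experimental target. The sketch below is a deliberately simplified version that fits the target exactly and ignores experimental error; it is not the BME procedure of refs. (21,36). The function name, the bisection solver, and the Lagrange multiplier bounds are all assumptions.

```python
import numpy as np


def maxent_reweight(values, target, prior=None, lam_bounds=(-50.0, 50.0), tol=1e-10):
    """Minimal-perturbation weights w_i ~ w0_i * exp(-lam * f_i) with <f>_w = target.

    Simplified sketch: one observable, no experimental error term.
    Prior weights must be strictly positive.
    """
    f = np.asarray(values, dtype=float)
    if prior is None:
        w0 = np.full(f.size, 1.0 / f.size)
    else:
        w0 = np.asarray(prior, dtype=float) / np.sum(prior)

    def weights(lam):
        log_w = np.log(w0) - lam * (f - f.mean())
        log_w -= log_w.max()             # avoid overflow in exp
        w = np.exp(log_w)
        return w / w.sum()

    def average(lam):
        return float(np.dot(weights(lam), f))

    lo, hi = lam_bounds
    # The weighted average decreases monotonically as lam increases.
    if not (average(hi) <= target <= average(lo)):
        raise ValueError("target cannot be reached within lam_bounds")
    for _ in range(200):                 # bisection on the Lagrange multiplier
        mid = 0.5 * (lo + hi)
        if average(mid) > target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return weights(0.5 * (lo + hi))


if __name__ == "__main__":
    rg = np.array([1.0, 2.0, 3.0])       # synthetic per-frame observable
    w = maxent_reweight(rg, target=2.5)
    print(w, np.dot(w, rg))
```

With real data the observable would be a per-frame back-calculated quantity and the prior weights would come from the metadynamics bias; handling noisy data requires a regularized fit rather than the exact match used here.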

By integrating metadynamic simulations, SAXS, and NMR, we can characterize the highly dynamic cis-P57 and trans-P57 subensembles of ORF6CTR. We show that metadynamics with the P57 ζ and ψ angle CVs enhances sampling of P57 isomerization, and we observe convergence of these two CVs for the a03ws and C36m force fields. We find that a03ws most accurately predicts the cis-P57 and trans-P57 populations in the ORF6CTR. By employing SAXS BME reweighting (21,36) and two independent a03ws runs, we obtain cis-P57 populations in the a03ws force field that match those from NMR. NMR diffusion experiments suggest that the cis-P57 subensemble is slightly more compact than the trans-P57 subensemble, in qualitative agreement with the metadynamic simulation predictions. Furthermore, NMR spin-relaxation experiments and metadynamic simulations indicate that both the cis-P57 and the trans-P57 conformations of ORF6CTR are extremely dynamic. We anticipate that this interdisciplinary approach can be broadly applied to the many disordered proteins that undergo complex dynamics across varying timescales.

 
