425e Protein Structure Prediction and Refinement Using Folding Mechanism-Informed Replica Exchange Methods

M. Scott Shell, Chemical Engineering, University of California, Mail Code 5080, Santa Barbara, CA 93106 and S. Banu Ozkan, Center for Biological Physics, Arizona State University, PO Box 871504, Tempe, AZ 85287.

Efficient conformational sampling is the foremost bottleneck in predicting native structures from all-atom, physics-based simulations of proteins and peptides. Replica exchange molecular dynamics (REMD) methods improve sampling efficiency; however, it remains challenging to converge simulations of polypeptides even 15 residues long within a day of wall-clock time on commodity cluster computing, a time comparable to that taken by modern bioinformatics-based structure prediction web servers. Yet it is critical that physics-based methods succeed, because bioinformatics methods fail for many emerging synthetic protein and peptide technologies.

Here, we discuss a strategy for accelerating physics-based REMD simulations of proteins that exploits the major driving force for folding: the hydrophobic interaction. We show that detecting and encouraging the formation of strong contacts between hydrophobic residues along the chain (through restraints added to the molecular potential energy function) accelerates the convergence of REMD simulations. We demonstrate this approach in three scenarios spanning small peptides to complete protein structures: (1) on a library of small peptides that have stable secondary structures, (2) at the level where secondary structure pieces are assembled into a complete protein, and (3) during refinement and selection of full protein structures predicted by other algorithms, such as bioinformatics web servers. In each of these cases, hydrophobic restraints are selected using two sources of information: short coarse-grained simulations that generate a large ensemble of candidate protein structures with compact hydrophobic cores, and the free energies of residue-residue contact formation computed on the fly during the REMD simulations. Periodic conformational clustering and re-seeding of the REMD simulations are also used to accelerate convergence. Importantly, any biasing effects of the added hydrophobic restraints (beyond accelerating convergence) can ultimately be removed either by scaling their strength across the replica cascade, or by running a restraint-free final “consensus” REMD run using all of the predicted structures in competition with each other. We demonstrate the success of the approach for a number of model test peptides and small proteins, and for several proteins disseminated during the CASP8 competition.
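To make the ingredients above concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a two-state estimate of the contact free energy from the observed contact fraction, a flat-bottom harmonic form for the hydrophobic restraint, and the standard Metropolis exchange criterion for replicas whose potential energy functions may differ (for example, because the restraint strength is scaled from replica to replica). The function names, the 7.0 Angstrom cutoff, and the force constant are hypothetical choices made only for illustration.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def contact_free_energy(p_contact, temperature):
    """Two-state estimate (an assumption): dG = -kT ln(p / (1 - p)).

    p_contact is the fraction of sampled frames in which the residue
    pair is in contact; negative dG means the contact is favored.
    """
    if not 0.0 < p_contact < 1.0:
        raise ValueError("p_contact must lie strictly between 0 and 1")
    return -KB * temperature * math.log(p_contact / (1.0 - p_contact))


def restraint_energy(distance, cutoff=7.0, k=1.0):
    """Flat-bottom harmonic penalty (an assumed functional form).

    Zero when the pair is within the cutoff, quadratic beyond it.
    """
    excess = max(0.0, distance - cutoff)
    return 0.5 * k * excess ** 2


def exchange_probability(beta_i, beta_j, u_i_xi, u_i_xj, u_j_xi, u_j_xj):
    """Metropolis acceptance for swapping configurations x_i and x_j.

    u_a_xb is the potential energy of replica a (including its own
    restraint strength) evaluated at configuration x_b.
    """
    delta = beta_i * (u_i_xj - u_i_xi) + beta_j * (u_j_xi - u_j_xj)
    return 1.0 if delta <= 0.0 else math.exp(-delta)
```

When every replica uses the same potential, the exchange expression reduces to the familiar temperature-only REMD criterion. Evaluating each replica's own potential at both configurations is what allows restraint strengths to differ across the replica cascade while the exchanges still satisfy detailed balance.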