313d Optimal Protein Library Design Using Recombination or Point Mutations Based on Sequence Based Scoring Functions

Robert J. Pantazes, Chemical Engineering, The Pennsylvania State University, 147A Fenske Lab, University Park, PA 16802 and Costas D. Maranas, Department of Chemical Engineering, The Pennsylvania State University, University Park, PA 16802.

In this talk, we introduce and test two new sequence-based protein scoring systems (S1 and S2) for assessing the likelihood that a given protein hybrid will be functional. By binning together amino acids with similar properties (volume, hydrophobicity, and charge), the S1 and S2 scoring systems quantify the severity of mismatched interactions in the hybrids. The S2 scoring system is found to significantly enrich a P450 library in functional members. Building on these scoring systems, we then constructed two optimization formulations (OPTCOMB and OPTOLIGO) for optimally designing protein combinatorial libraries involving recombination or mutations, respectively. Two versions of OPTCOMB are developed (models M1 and M2), with the latter allowing position-dependent skipping of parental fragments. Computational benchmarking results demonstrate that OPTCOMB and OPTOLIGO generate high-scoring libraries of a pre-specified size. Notably, we find that the optimal recombination or mutation patterns are complex and difficult to identify a priori.
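To make the binning idea concrete, the following is a minimal sketch of how a property-binned mismatch penalty could be computed for a hybrid. It is not the S1 or S2 definition, which the abstract does not give. The bin assignments (`CHARGE`, `HYDROPHOBIC`, `LARGE`), the function names, the list of interacting position pairs, and the rule of comparing each hybrid pair against the closest parental pair are all assumptions made for illustration.

```python
from typing import Iterable, Sequence, Tuple

# Hypothetical property bins (illustrative only, not the bins used by S1/S2).
CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}   # every other residue treated as neutral
HYDROPHOBIC = frozenset("AVLIMFWC")           # assumed hydrophobic set
LARGE = frozenset("FWYRKHMILQE")              # assumed large-volume set


def property_bins(aa: str) -> Tuple[int, bool, bool]:
    """Map a residue to its (charge, hydrophobic, large) bin triple."""
    return (CHARGE.get(aa, 0), aa in HYDROPHOBIC, aa in LARGE)


def bin_distance(a: str, b: str) -> int:
    """Count how many of the three property bins differ between two residues (0 to 3)."""
    return sum(x != y for x, y in zip(property_bins(a), property_bins(b)))


def pair_severity(hybrid: str, parents: Sequence[str], i: int, j: int) -> int:
    """Severity for the interacting pair (i, j): distance to the closest parental pair."""
    return min(
        bin_distance(hybrid[i], p[i]) + bin_distance(hybrid[j], p[j])
        for p in parents
    )


def hybrid_penalty(
    hybrid: str, parents: Sequence[str], pairs: Iterable[Tuple[int, int]]
) -> int:
    """Sum pair severities over all interacting position pairs (lower is better)."""
    return sum(pair_severity(hybrid, parents, i, j) for i, j in pairs)
```

Under these assumptions, a hybrid that reproduces a parental pair in every binned property receives zero penalty, while a pair assembled from different parents is penalized in proportion to how many property bins it changes. A library design step could then rank candidate hybrids by this penalty.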