598d Metagenome-Wide Metabolic Network Reconstruction of an Acid Mine Drainage Microbial Community

Yu Chen, Department of Chemical Engineering, University of Michigan, 2300 Hayward Rd, 3238 H.H Dow, Ann Arbor, MI 48109 and Xiaoxia (Nina) Lin, Chemical Engineering, The University of Michigan at Ann Arbor, 2300 Hayward St., 3074 HH Dow Bldg., Ann Arbor, MI 48109.

Metagenomic sequencing has emerged as a powerful tool for studying microbial communities, which play important roles in numerous ecosystems ranging from fungus-plant associations to the human gut. In this work, we have developed a bioinformatics pipeline for generating large-scale in silico metabolic networks from metagenomic sequencing data and have applied it to investigate the metabolic capabilities of an acid mine drainage (AMD) microbial community. AMD is a major source of water pollution in many mining areas, and the resident microbial community is essential to its formation.

Our bioinformatics pipeline automatically reconstructs metagenome-wide metabolic networks of an entire microbial community by integrating different types of information, including metagenomic sequences and annotations, reaction/pathway databases, and organism/genome databases. We also take into account the inaccuracy and incompleteness of this information and attempt to identify the components missing from the reconstructed network. As a key step in the pipeline, we have developed a mixed-integer linear programming (MILP) framework based on the flux balance assumption. It reconstructs the networks and identifies additional reactions that are required to fulfill assumed metabolic capabilities but are currently missing from the metagenome annotation. Next, BLAST sequence alignment is used to estimate the probability of existence for each potentially missing component, and these probabilities are used to evaluate the overall uncertainty of including different sets of additional metabolic reactions in the network. By allowing alternative network reconstructions and evaluating the overall uncertainty of each, we are able to reconstruct the most probable metabolic network capable of achieving certain pre-defined functions, such as biomass synthesis. At the same time, enzyme candidates in the metagenome for the network gaps are identified.
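The idea of choosing the most probable set of added reactions can be illustrated with a small sketch. It is not the pipeline itself: the MILP is replaced by exhaustive search over candidate subsets, the flux balance constraint is replaced by a simple producibility (network expansion) check, and every reaction name, metabolite, and probability below is a hypothetical toy value rather than data from the AMD community. The probabilities stand in for scores that would be derived from BLAST alignments.

```python
from itertools import combinations
from math import exp, log

# Toy network; all identifiers and numbers are hypothetical.
# Annotated reactions: name -> (substrates, products)
ANNOTATED = {
    "R1": ({"glc"}, {"g6p"}),
    "R2": ({"g6p"}, {"pyr"}),
}
# Candidate gap-filling reactions: name -> (substrates, products, probability).
# The probability is an assumed stand-in for a BLAST-derived score.
CANDIDATES = {
    "C1": ({"pyr"}, {"ala"}, 0.9),
    "C2": ({"pyr"}, {"oaa"}, 0.6),
    "C3": ({"oaa"}, {"asp"}, 0.7),
    "C4": ({"g6p"}, {"asp"}, 0.05),
}
NUTRIENTS = {"glc"}
TARGETS = {"ala", "asp"}  # stand-ins for biomass components


def producible(reactions, nutrients):
    """Return every metabolite reachable from the nutrients."""
    available = set(nutrients)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return available


def fill_gaps(annotated, candidates, nutrients, targets):
    """Find the candidate subset that makes all targets producible
    while minimizing the summed -log(probability) of added reactions."""
    base = list(annotated.values())
    names = sorted(candidates)
    best_set, best_cost = None, None
    for k in range(len(names) + 1):
        for subset in combinations(names, k):
            reactions = base + [candidates[n][:2] for n in subset]
            if targets <= producible(reactions, nutrients):
                cost = sum(-log(candidates[n][2]) for n in subset)
                if best_cost is None or cost < best_cost:
                    best_set, best_cost = subset, cost
    return best_set, best_cost


if __name__ == "__main__":
    added, cost = fill_gaps(ANNOTATED, CANDIDATES, NUTRIENTS, TARGETS)
    # Joint probability of the added set, assuming independence.
    print(added, round(exp(-cost), 3))
```

In this toy case the search prefers the three-reaction route C1, C2, C3 (joint probability about 0.378) over the shorter route C1, C4 (about 0.045), which shows why the most probable reconstruction need not be the one with the fewest added reactions. Exhaustive search is only practical for a handful of candidates; a MILP formulation with one binary variable per candidate reaction scales to genome-size networks.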

We have applied this pipeline to reconstruct the metabolic network of an AMD community that was among the first microbial communities sequenced at the metagenomic scale. This community consists of five major species. For each species, the reconstructed network contains 530 to 620 reactions. To synthesize the basic biomass components (i.e., amino acids, nucleotides, and coenzymes), each network requires 165 to 175 reactions, of which 45 to 80 are added without corresponding genes in the current metagenome annotation. We have identified candidate genes in the metagenome for a substantial fraction of these added reactions. We have further analyzed the reconstructed networks and discovered potential cross-feeding relationships among the species. In particular, the biosynthesis pathways for several amino acids are very likely missing in Leptospirillum sp. Group III, which is known to be the only species in the community that can fix nitrogen into the ammonium required by all other community members. This suggests that the species may in turn depend on other community members for those amino acids. This hypothesis might lead to an effective treatment of AMD through blockage of the cross-feeding interactions.