
Genomes to Life Contractor-Grantee Workshop III
February 6-9, 2005, Washington, D.C.

Genomics:GTL Program Projects

J. Craig Venter Institute

42

Estimation of the Minimal Mycoplasma Gene Set Using Global Transposon Mutagenesis and Comparative Genomics

John I. Glass* (JGlass@venterinstitute.org), Nina Alperovich, Nacyra Assad-Garcia, Shibu Yooseph, Mahir Maruf, Carole Lartigue, Cynthia Pfannkoch, Clyde A. Hutchison III, Hamilton O. Smith, and J. Craig Venter

J. Craig Venter Institute, Rockville, MD

The Venter Institute aspires to make bacteria with specific metabolic capabilities encoded by artificial genomes. To achieve this we must develop technologies and strategies for creating bacterial cells from constituent parts of either biological or synthetic origin. Determining the minimal gene set needed for a functioning bacterial genome in a defined laboratory environment is a necessary step towards our goal. For our initial rationally designed cell we plan to synthesize a genome based on a mycoplasma blueprint (mycoplasma being the common name for the class Mollicutes). We chose this bacterial taxon because its members already have small, near-minimal genomes that encode limited metabolic capacity and complexity. We took two approaches to determine which genes would need to be included in a truly minimal synthetic chromosome of a planned Mycoplasma laboratorium: determination of all the non-essential genes through random transposon mutagenesis of a model mycoplasma species, Mycoplasma genitalium, and comparative genomics of a set of 15 mycoplasma genomes to identify genes common to all members of the taxon.

Global transposon mutagenesis has been used to predict the essential gene set for a number of bacteria. In Bacillus subtilis all but 271 of the bacterium’s ~4100 genes could be knocked out. Because many genes and systems in B. subtilis and other conventional bacteria are redundant, disruptions of genes involved in essential functions are often not lethal in these bacteria. M. genitalium is a slow-growing human urogenital pathogen that, at 580 kb, has the smallest known genome of any free-living cell. There is almost no redundancy in this genome, and as such M. genitalium is often used as a model of a minimal cell. In 1999 we published a preliminary estimate of the M. genitalium minimal gene set. Global transposon mutagenesis identified 130 of the 484 M. genitalium protein-coding genes as not essential for cell growth under laboratory conditions, and that work predicted that a complete study would likely disrupt additional genes. In our current effort we have improved the technique to allow isolation and characterization of disruption mutants. Surprisingly, after attaining saturation mutagenesis of the M. genitalium genome we could identify only 98 disrupted genes, suggesting that for growth under our laboratory conditions the minimal mycoplasma essential gene set is ~386 protein-coding genes (484 minus 98). Some of the disrupted genes are involved in presumably critical metabolic processes, such as the lactate, pyruvate, and glycerol-3-phosphate dehydrogenases. This suggests that, as has already been shown for some M. genitalium kinases, these dehydrogenases may be somewhat redundant because of less-than-stringent substrate specificity.
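The arithmetic behind the essential-gene estimate can be written out as a minimal sketch. It assumes only that every protein-coding gene not recovered as a disruption mutant counts as essential; the function name `essential_gene_estimate` is hypothetical, and the gene counts are the ones quoted in this abstract.

```python
def essential_gene_estimate(total_genes, disrupted_genes):
    """Return (essential gene count, essential fraction of the genome).

    Assumption: every gene that was not disrupted is counted as essential,
    so the estimate is a simple subtraction.
    """
    if not 0 <= disrupted_genes <= total_genes:
        raise ValueError("disrupted_genes must lie between 0 and total_genes")
    essential = total_genes - disrupted_genes
    return essential, essential / total_genes


# M. genitalium: 484 protein-coding genes, 98 disrupted at saturation.
mg_essential, mg_fraction = essential_gene_estimate(484, 98)

# B. subtilis: 271 of ~4100 genes could not be knocked out.
bs_fraction = 271 / 4100

print(f"M. genitalium essential genes: {mg_essential}")
print(f"M. genitalium essential fraction: {mg_fraction:.3f}")
print(f"B. subtilis essential fraction: {bs_fraction:.3f}")
```

The contrast between the two fractions illustrates the point about redundancy: most of the M. genitalium genome appears essential, whereas only a small share of the B. subtilis genome does.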

The 15 mycoplasma genomes comprise an excellent virtual laboratory for comparative genomics. Previous similar computational comparisons of genomes across diverse eubacterial phyla are of limited value: because of non-orthologous gene displacement, pan-bacterial comparisons identified fewer than 100 genes common to all bacteria. Determination of conserved genes within the narrow mycoplasma taxon is much more instructive. We identified 169 protein-coding genes present in all of the complete mycoplasma genome sequences, and an expanded core set of 310 genes that are encoded in almost every member of our set of genomes. The expanded core gene set takes into account, for instance, that non-glycolytic mycoplasmas do not encode some glycolysis genes, and that the obligate intracellular plant parasite Phytoplasma asteris has dispensed with many genes because their functions are provided by its host. At least 36 elements of the expanded core gene set are non-essential based on the gene disruption study. The combination of comparative genomics with the gene disruption data and reports of specific enzymatic activities in different mycoplasma species enabled us to predict which elements are critical for this bacterial taxon. In addition to determining the consensus set of genes involved in different cellular functions, we identified 10 hypothetical genes conserved in almost all the genomes, as well as paralogous gene families likely involved in antigenic variation that comprise significant fractions of each genome and are presumably unnecessary for cell viability under laboratory conditions.
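The distinction between the strict and expanded core gene sets can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes genes have already been assigned to ortholog groups, and the `min_fraction` threshold, the function name `core_gene_sets`, and all genome and group identifiers are hypothetical toy values. The abstract says only that expanded-core genes occur in "almost every" genome, so the cutoff used here is an assumption.

```python
from collections import Counter


def core_gene_sets(genomes, min_fraction=0.9):
    """Return (strict_core, expanded_core) for a dict of genome -> gene set.

    strict_core: ortholog groups present in every genome.
    expanded_core: groups present in at least min_fraction of the genomes
    (an assumed cutoff standing in for "almost every member").
    """
    if not genomes:
        raise ValueError("need at least one genome")
    n = len(genomes)
    # Count in how many genomes each ortholog group appears.
    counts = Counter(group for genes in genomes.values() for group in set(genes))
    strict_core = {g for g, c in counts.items() if c == n}
    expanded_core = {g for g, c in counts.items() if c / n >= min_fraction}
    return strict_core, expanded_core


# Toy data: four hypothetical genomes and five hypothetical ortholog groups.
toy_genomes = {
    "genome_A": {"og1", "og2", "og3", "og4"},
    "genome_B": {"og1", "og2", "og3"},
    "genome_C": {"og1", "og2", "og4"},
    "genome_D": {"og1", "og3", "og4", "og5"},
}
strict_core, expanded_core = core_gene_sets(toy_genomes, min_fraction=0.75)

# Cross-reference with hypothetical transposon-disruption results to find
# conserved genes that are nevertheless dispensable in the laboratory.
disrupted = {"og3", "og5"}
dispensable_core = expanded_core & disrupted

print(sorted(strict_core), sorted(expanded_core), sorted(dispensable_core))
```

Intersecting the expanded core with the disrupted genes mirrors the step that flags conserved but non-essential elements, such as the 36 reported above.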

* Presenting author