Genomic Science Program
U.S. Department of Energy | Office of Science | Biological and Environmental Research Program

From Viromes to Virocells: Dissecting Viral Roles in Terrestrial Microbiomes and Nutrient Cycling


Matthew Sullivan3*, Sarah Bagby1, Robert Hettich2, Cristina Howard-Varona3, Sylvain Moineau4, Vivek Mutalik5, Malak Tfaily6


1Case Western Reserve University; 2Oak Ridge National Laboratory; 3The Ohio State University; 4Université Laval; 5Lawrence Berkeley National Laboratory; 6University of Arizona


Develop paradigms for understanding the role of viruses and mobile genetic elements (MGEs) in soil ecology via ecogenomic inference and experimental interrogation of new soil-derived model systems, and build tools (scalable new methods, new databases, and new model systems) to test these paradigms.


Soil microbes affect global energy and nutrient cycles, but they do so under virus impacts that are largely unconstrained yet likely significant. While viruses play pivotal roles in other ecosystems such as the oceans, soil virus research is hindered by technical challenges. In this project, ‘VirSoil’, the group focused on three aims: (1) detect, identify, and classify soil viruses and their potential roles; (2) mechanistically understand soil virus infections; and (3) develop community resources for studying soil viruses.

In Aim 1, the team elucidated soil virus ecology in a decadal bulk metagenomic dataset from a permafrost thaw gradient at Stordalen Mire. For DNA viruses, 5,051 virus populations were cataloged, documented to have high year-to-year turnover, and linked to carbon cycling through host prediction and gene content analysis that identified virus-encoded carbon degradation, methanotrophy, and methanogenesis genes (Sun and Pratama et al. in review). For RNA viruses, nearly 9,000 sequences were identified, representing 2,651 novel “species”, and then ecologically contextualized according to habitat, depth, and soil properties (Pratama and Dominguez-Huerta et al. in prep).
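
The abstract does not state how year-to-year turnover was quantified. As a minimal sketch of one common approach, the Python snippet below computes turnover between consecutive sampling years as the Jaccard dissimilarity of the sets of virus populations detected in each year. The function name, the population IDs, and the years are hypothetical and purely illustrative; they are not project data or the authors' method.

```python
def turnover(prev, curr):
    """Jaccard dissimilarity between two sets of detected virus populations.

    0.0 means the same populations were detected in both years;
    1.0 means no population was shared.
    """
    prev, curr = set(prev), set(curr)
    union = prev | curr
    if not union:
        return 0.0  # nothing detected in either year
    return 1.0 - len(prev & curr) / len(union)


# Hypothetical detections per sampling year (illustrative IDs only).
detections = {
    2010: {"vOTU_1", "vOTU_2", "vOTU_3"},
    2011: {"vOTU_2", "vOTU_4"},
    2012: {"vOTU_4", "vOTU_5", "vOTU_6"},
}

# Compare each year with the following one.
years = sorted(detections)
yearly_turnover = {
    (a, b): turnover(detections[a], detections[b])
    for a, b in zip(years, years[1:])
}

for pair, value in yearly_turnover.items():
    print(pair, round(value, 2))
```

A presence/absence measure like this ignores abundance; an abundance-weighted dissimilarity would be a reasonable alternative when read-mapping coverage is available.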

In Aim 2, the team sought to begin understanding soil virocells (i.e., virus-infected cells) by advancing virocell-specific analytics and developing new soil virus-host model systems. To this end, researchers used an established Cellulophaga virocell model system and applied time-resolved multiomics across diverse virus genomes, infection efficiencies, and nutrient conditions to establish foundational knowledge and protocols. This revealed that virus genome type and infection efficiency strongly shape bacterial biomolecule composition and dynamics: virocell biomolecule specificity was highest in transcripts, lower in proteins, and lowest in metabolites (Howard-Varona et al. in prep), and post-translational modifications were uncovered (Peters et al. in prep). In parallel, multiomics of nutrient-limited Pseudoalteromonas virocells revealed the interplay between environment and virus infection intracellularly versus extracellularly (Howard-Varona and Lindback et al. in revision). Finally, this aim also established random bar code transposon-site sequencing (RB-TnSeq) and CRISPR-Cas9 engineering approaches to assess virus components of the virus-host arms race and scalably characterize resistance mechanisms.
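
To illustrate the idea behind barcode-based resistance screens such as RB-TnSeq, the sketch below scores each gene by the log2 ratio of barcode counts in a phage-challenged culture versus an unchallenged control. Mutants in genes the virus needs (for example, a receptor) survive the challenge and gain relative abundance, so they score high. This is a deliberately simplified illustration, not the project's analysis: the function name, the pseudocount, the use of a per-gene median, and all counts are assumptions, and real pipelines add normalization and quality filters that are omitted here.

```python
import math
from statistics import median


def gene_fitness(counts_control, counts_phage, barcode_to_gene, pseudocount=1.0):
    """Per-gene fitness as the median log2(phage / control) over its barcodes.

    The pseudocount avoids division by zero for barcodes absent from a sample.
    """
    per_gene = {}
    for barcode, gene in barcode_to_gene.items():
        control = counts_control.get(barcode, 0) + pseudocount
        phage = counts_phage.get(barcode, 0) + pseudocount
        per_gene.setdefault(gene, []).append(math.log2(phage / control))
    return {gene: median(values) for gene, values in per_gene.items()}


# Hypothetical barcode-to-gene map and read counts (illustrative only).
barcode_to_gene = {"bc1": "geneA", "bc2": "geneA", "bc3": "geneB"}
counts_control = {"bc1": 99, "bc2": 49, "bc3": 99}
counts_phage = {"bc1": 799, "bc2": 399, "bc3": 99}

fitness = gene_fitness(counts_control, counts_phage, barcode_to_gene)

# Genes with strongly positive scores are candidate resistance determinants.
for gene, score in sorted(fitness.items(), key=lambda kv: -kv[1]):
    print(gene, round(score, 2))
```

In this toy example, disrupting geneA enriches its mutants eightfold under phage challenge, while geneB mutants are unaffected.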

In Aim 3, the team focused on community empowerment. To this end, researchers established standard operating procedures for auxiliary metabolic gene (AMG) analysis (Pratama et al. 2021), developed ‘MetaPop’ to simplify population genetics analysis (Gregory et al. 2022), created an enhanced protocol for identifying and annotating metabolites through machine learning (Rajakaruna et al. in prep), curated the efam virus protein cluster database to improve virus protein annotation (Zayed et al. 2021), and worked with KBase to layer in basic iVirus functionality, including tools for virus identification (VirSorter, VirSorter2) and taxonomic classification (vConTACT2). Finally, to expand model systems for soil viral ecology, the group screened hundreds of microbial strains to isolate viruses, triply plaque-purified subsets of these, and developed 60 viruses that were genome-sequenced and host-range-characterized as new virus-host model systems for virocell multiomics and other characterization (Gittrich et al. in prep).


Gregory, A. C., et al. 2022. “MetaPop: A Pipeline for Macro- and Microdiversity Analyses and Visualization of Microbial and Viral Metagenome-Derived Populations,” Microbiome 10(1), 49. DOI:10.1186/s40168-022-01231-0.

Pratama, A. A., et al. 2021. “Expanding Standards in Viromics: In Silico Evaluation of dsDNA Viral Genome Identification, Classification, and Auxiliary Metabolic Gene Curation,” PeerJ 9. DOI:10.7717/peerj.11447.

Zayed, A. A., et al. 2021. “Efam: An Expanded, Metaproteome-Supported HMM Profile Database of Viral Protein Families,” Bioinformatics 37(22), 4202–8. DOI:10.1093/bioinformatics/btab451.

Funding Information

This material is based upon work supported by the U.S. DOE, Office of Science, BER program, under Award Numbers DE-SC0020173 and DE-SC0023307.