Genomic Science Program
U.S. Department of Energy | Office of Science | Biological and Environmental Research Program

Quantify the Impact of Your Data with DOE JGI’s Genome Citation Service


Neil Byers, Chris Beecroft, Charles Parker, and Kjiersten Fagnan*


Joint Genome Institute



Development of a new resource that deepens the community’s understanding of the impact of data reuse.


Steady increases in sequencing capacity, combined with rapid accumulation of publications and associated resources, have increased the complexity of maintaining associations between literature and genomic data. Accumulated errors and omissions in the literature and databases compound the difficulty of the task. Automated approaches to maintaining and confirming associations among these resources have become necessary.

Here researchers present the U.S. Department of Energy (DOE) Joint Genome Institute’s (JGI) Genome Citation Service (GCS), which discovers literature that incorporates genome data whether or not the authors properly attributed the source of the data. This service provides a number of advantages over manual curation, including consistent coverage of public resources, automatic updating of genome project metadata, and augmentation of genome project metadata through documentation of previously unrecognized uses by the scientific community. The service significantly reduces labor costs associated with manual literature review while improving the quality, accuracy, and consistency of genome metadata maintained by the DOE JGI.

The DOE JGI seeks to deepen its understanding of the impact of its user community’s science by connecting its data products to publications. The GCS facilitates this understanding, improves credit attribution for data generators, and can encourage data sharing by allowing scientists to see how reuse amplifies the impact of their original studies. These connections support JGI’s commitment to FAIR (findable, accessible, interoperable, and reusable) data practices and allow JGI to meet its obligations as a DOE Office of Science Public Reusable Research (SC PuRe) data resource.

The GCS increases the number of known publications that incorporate JGI data products. As a publicly available resource, it also enables scientists to better understand the impact of their data. Researchers seek feedback from the Genomic Science Program (GSP) community on the usability of this resource.

Funding Information

This research was conducted by the U.S. Department of Energy Joint Genome Institute, a DOE Office of Science User Facility, and is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.