Using genomics for understanding how AMR genes and resistant bacteria move throughout the Canadian food supply
“One Health” is a collaborative approach that recognizes that the health of people is closely connected to the health of animals and our shared environment. Genomic surveillance using a One Health approach is a powerful tool for understanding and tracking how pathogens affecting human health evolve and spread. Genomic surveillance of pathogens requires high quality sequence data as well as well structured contextual data. Contextual data is the sample, laboratory, clinical, epidemiological, and methods information that enables the interpretation of sequence data. One Health initiatives often involve data streams from different sources, agencies, sectors, and information management systems, and because the data is structured in different ways it is often difficult to harmonize and integrate. By structuring contextual data using community standards such as minimum information checklists and ontologies, this information can be more easily understood and used by both humans and computers, and can be more easily reused for different types of analyses.
Antimicrobial resistance occurs when microbes evolve mechanisms that protect them from the effects of antimicrobial agents (such as antibiotics), resulting in a decreased ability to treat infections and illnesses in people, animals and plants. Antibiotic resistance is a public health concern around the world and the number of bacteria that are resistant to antibiotics is increasing.
In support of the Canadian Federal Action Plan for Antimicrobial Resistance (AMR) and Use in Canada (Public Health Agency of Canada, 2015), as well as the Pan-Canadian Action Plan on Antimicrobial Resistance (Public Health Agency of Canada, 2023), the Genomics Research and Development Initiative Shared Priority Projects for AMR (GRDI-AMR and GRDI-AMR-One Health (GAOH; aka AMR-OH or GRDI-AMR2)) uses a genomics-based approach to understand how food production contributes to the development of antimicrobial resistance of human health concern, and explore strategies for reducing antimicrobial resistance in food production systems. The AMR-GRDI is a component of the Federal Action Plan for Antimicrobial Resistance and Use in Canada, and involves data streams from federal departments and agencies spanning human health, agriculture, the environment, and food regulation. Participating government departments and agencies include: Agriculture and Agri-Food Canada, the Canadian Food Inspection Agency, Environment and Climate Change Canada, Fisheries and Oceans Canada, Health Canada, the National Research Council of Canada, Natural Resources Canada, and the Public Health Agency of Canada.
To better harmonize AMR-GRDI contextual data across sectors and agencies, CIDGOH is leading the Metadata Harmonization Working Group in the development of an ontology-based One Health AMR data standard for foodborne pathogens, which provides standardized fields, pick lists of controlled vocabulary and prescribed formats for the harmonized capture of contextual data. The standardized fields are based on community standards such as NCBI’s combined Pathogen and Environmental attribute package derived from internationally agreed upon Minimal Data for Matching (MDM) standards, as well as applicable fields from different MIxS packages (Genomic Standards Consortium). The data standard will also be harmonized with recently developed standards (e.g. One Health Enteric Package v1.0: Expanded and Standardized Metadata for Enteric Genomic Epidemiology in the U.S, and the Food MIxS package).
Labs collect, encode and store information in different ways. They use different fields, terms and formats, they categorize variables in different ways, and the meanings of words change depending on the focus of the organization (think of the word “plant”. To someone in agriculture, “plant” could mean an organism that carries out photosynthesis, while a food regulator might understand the word “plant” to mean a factory where food products are made). This variability makes comparing, integrating and analyzing data generated by different organizations like trying to compare apples, oranges and bananas, which is difficult to do.
Ontologies are collections of controlled vocabulary that are arranged in a hierarchy, where all the terms are linked using logical relationships. Ontologies are open source and meant to represent “universal truth” as much as possible (so not tied to one organization’s vocabulary of use case). Ontologies encode synonyms, which enables mapping between the specific languages used by different organizations, and every term in the ontology is assigned a globally unique and persistent identifier. Using ontology terms to standardize GRDI-AMR contextual data not only helps make data more interoperable by using a common language, it also helps to make contextual data FAIR (Findable, Accessible, Interoperable, Reusable).
The GRDI-AMR specification is also ISO 23418 compliant (Microbiology of the food chain — Whole genome sequencing for typing and genomic characterization of bacteria — General requirements and guidance).
The GRDI-AMR standard is implemented via a spreadsheet-based data collection instrument (i.e. metadata template), accompanying Field and Term reference guides (which provide definitions and additional specific guidance) and a curation Standard Operating Procedure (SOP). New terms can be added by making a new term request using the New Term Request SOP (please note that the specification will be updated periodically to address user needs). Find these resources below:
Implementation Manuscript: Crossing the streams: improving data quality and integration across the One Health genomics continuum with data standards and implementation strategies
Griffiths EJ, Jurga E, Wajnberg G, et al. Crossing the streams: improving data quality and integration across the One Health genomics continuum with data standards and implementation strategies. Can J Microbiol. 2025;71:1-14. doi:10.1139/cjm-2024-0203
Design Manuscript: The Broom of the System: A Harmonized Contextual Data Specification for One Health AMR Pathogen Genomic Surveillance
Griffiths, Emma, et al. The Broom of the System: A Harmonized Contextual Data Specification for One Health AMR Pathogen Genomic Surveillance. OSF Preprints, 2024. doi:10.31219/osf.io/xbf4t_v1
A specification template to facilate data curation and harmonization.
- XLSX version
- A tabular, macro-containing excel spreadsheet version of the specification with a "merged sheet" generated from all the data across individual tabs.
- DataHarmonizer version
- A standardized template-driven spreadsheet editor and validator for harmonizing, validating, and transforming genomics contextual data into submission-ready formats. As the excel spreadsheet became more complex and data processing became unwieldy, a GRDI-AMR/GOAH DataHarmonizer template was engineered to faciliated curation, validation, and automated transformations. The "Pathogen-Genomics-Package" is a version of the DataHarmonizer packaged with specification templates. The GRDI-AMR/GAOH Template is listed as "GRDI" within the DataHarmonizer template menu. Template schema files can be found as .yaml/.json/.tsv here. Instructions for "Getting Started" in the "GRDI_DataHarmonizer-SOP.pdf". Further information about application functionality can be found on the DataHarmonizer Wiki.
- The template is currently implemented as a single tab/worksheet, but a multitabbed version is under active development (beta as of August 2025) to faciliate the "merged sheet"/one-to-many functionality within the DataHarmonizer.
Reference guides providing field and picklist term descriptions, guidance, examples, and version information.
Guidance describing and explaining how to curate your data across the different sections of the specification.
- PDF version
- Online version
- Note: Detailed worked examples, that may help users navigate different curation scenarios, can be found in the "Design Manuscript"
A webpage that outlines how to standardize sample, isolate, and sequence metadata as well as the process for submitting data to the central database (VMR), NCBI, and IRIDA. It contains workflow diagrams and checklists to support you in this process.
- Online version
- Note: see the Connecting to the Science Network webpage for help connecting to the NMLB's Science Network, which is required to access IRIDA to store sequences, and the dashboards.
Guidance on how to download and use the DataHarmonizer version of the specification.
How to request new fields and/or terms to expand the specificantion so that it appropriately encompasses your data needs. Requests undergo working group discussion and data structure review, then experienced data curators integrate them into the specification and create/submit the terms in the appropriate ontologies.
Please note that development of the specification is dynamic and it will be updated periodically to address user needs. Versioning is done in the format of x.y.z.
x = Field level changes
y = Term value / ID level changes
z = Definition, guidance, example, formatting, or other uncategorized changes
Descriptions of updates are provided in release notes for every new version.
Discussions contributing to updates may be tracked on the DataHarmonizer GitHub issue tracker.