MIToS-generated mmCIF templates fail in ColabFold/AlphaFold due to missing PDBx categories #194

@diegozea

Summary

When a template mmCIF is generated from a PDB file using MIToS (e.g. selecting a single chain, then writing with MMCIFFile / write_file), the resulting file can be syntactically valid but still fail in ColabFold/AlphaFold template mode with:


ValueError: mmCIF file ... is missing required field _chem_comp.id

Root cause

AlphaFold’s mmCIF parser (alphafold/data/mmcif_parsing.py) expects several PDBx/mmCIF categories that are commonly present in wwPDB mmCIF files, notably:

  • _chem_comp.id and _chem_comp.type (used to classify monomers as “peptide”)
  • _entity_poly_seq.* (polymer sequence)
  • _struct_asym.* (mapping between entity IDs and chain IDs)
  • plus header fields like _entry.id and _exptl.method
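As a quick pre-flight check, the list above can be turned into a small script that reports which of these items a given mmCIF lacks. This is only a sketch: the function names are hypothetical, the required set is taken from this issue rather than from an exhaustive reading of the AlphaFold parser, and the scan is a simple line-based one (it skips `;`-delimited text blocks but is not a full mmCIF parser).

```python
# Items named in this issue; AlphaFold may require more than these.
REQUIRED_FIELDS = ("_chem_comp.id", "_chem_comp.type", "_entry.id", "_exptl.method")
REQUIRED_CATEGORIES = ("_entity_poly_seq", "_struct_asym")


def data_names(cif_text):
    """Collect every data name (token starting with '_') found in the mmCIF text."""
    names = set()
    in_text_block = False
    for line in cif_text.splitlines():
        # A ';' in column one opens or closes a multi-line text value.
        if line.startswith(";"):
            in_text_block = not in_text_block
            continue
        if in_text_block:
            continue
        stripped = line.strip()
        if stripped.startswith("_"):
            # Data names are case-insensitive, so compare in lower case.
            names.add(stripped.split()[0].lower())
    return names


def missing_for_alphafold(cif_text):
    """Return the required fields and categories that are absent from the text."""
    names = data_names(cif_text)
    categories = {name.split(".", 1)[0] for name in names}
    missing = [field for field in REQUIRED_FIELDS if field not in names]
    missing += [cat + ".*" for cat in REQUIRED_CATEGORIES if cat not in categories]
    return missing
```

Calling `missing_for_alphafold(open(path).read())` on an atom_site-only file should list all six items, which makes the problem visible before the file is handed to ColabFold.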

MIToS' current mmCIF writing path (via BioStructures.MMCIFDict(residues; ...) and writemmcif) primarily emits _atom_site.* information. In addition, when starting from PDB input, residue identifiers often have empty PDBe numbering, which may lead to _atom_site.label_seq_id being written as "." rather than an integer. AlphaFold’s parser later casts label/auth sequence IDs to integers, so this can be another failure point after _chem_comp is addressed.
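The label_seq_id problem can be detected up front with a sketch like the following. It assumes the _atom_site category is written as a loop with one atom per line and whitespace-separated values that contain no embedded spaces; the function name is hypothetical and this is not how AlphaFold itself reads the file.

```python
def non_integer_label_seq_ids(cif_text):
    """Return (row_number, value) for atom rows whose label_seq_id is not an integer."""
    columns = []
    rows = []
    for line in cif_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("_atom_site."):
            # Header line of the atom_site loop: remember the column order.
            columns.append(stripped.split()[0])
        elif columns:
            # After the headers, data rows run until the loop ends.
            if not stripped or stripped.startswith(("#", "_", "loop_", "data_")):
                break
            rows.append(stripped.split())

    if "_atom_site.label_seq_id" not in columns:
        return []
    idx = columns.index("_atom_site.label_seq_id")

    bad = []
    for row_number, row in enumerate(rows, start=1):
        if idx >= len(row):
            continue  # malformed row; out of scope for this sketch
        try:
            int(row[idx])
        except ValueError:
            # "." and "?" (inapplicable / unknown) end up here.
            bad.append((row_number, row[idx]))
    return bad
```

An empty result means every atom row has an integer label_seq_id under the assumptions above; any `(row, ".")` entries point at residues that would need renumbering before the file is used as a template.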

Expected behavior

It would be very helpful if MIToS documentation (or helper utilities) could:

  1. warn that “atom_site-only” mmCIFs may not be sufficient for AlphaFold/ColabFold templating, and/or
  2. provide a small utility to “upgrade” a minimal mmCIF to include the minimal additional PDBx categories required by AlphaFold.
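To make the second option concrete, here is a rough sketch of what such an "upgrade" helper might emit for a single protein chain. Everything in it is an assumption for illustration: the function name and parameters are hypothetical, the chosen fields (_entity_poly_seq.entity_id/num/mon_id, _struct_asym.id/entity_id) are standard PDBx items but have not been verified here as the exact minimum AlphaFold needs, and a single comp_type is applied to every residue. It is written in Python to match AlphaFold's parser; a MIToS helper would presumably do the equivalent in Julia.

```python
def extra_categories(entry_id, chain_id, residues, method,
                     entity_id="1", comp_type="L-peptide linking"):
    """Build mmCIF text for the categories missing from an atom_site-only file.

    residues is the chain sequence as three-letter codes, in order.
    method is the experimental method string supplied by the caller.
    """
    def quote(value):
        # Values containing spaces must be quoted in mmCIF.
        return f"'{value}'" if " " in value else value

    lines = [
        f"_entry.id {quote(entry_id)}", "#",
        f"_exptl.method {quote(method)}", "#",
        "loop_", "_chem_comp.id", "_chem_comp.type",
    ]
    # One _chem_comp row per distinct residue type.
    for comp in sorted(set(residues)):
        lines.append(f"{comp} {quote(comp_type)}")

    lines += ["#", "loop_", "_entity_poly_seq.entity_id",
              "_entity_poly_seq.num", "_entity_poly_seq.mon_id"]
    # Polymer sequence, numbered from 1.
    for num, comp in enumerate(residues, start=1):
        lines.append(f"{entity_id} {num} {comp}")

    lines += ["#", "loop_", "_struct_asym.id", "_struct_asym.entity_id",
              f"{chain_id} {entity_id}", "#"]
    return "\n".join(lines) + "\n"
```

The returned text would be appended to the data block that already holds _atom_site. For the result to be consistent, _atom_site.label_seq_id would also have to be rewritten as integers matching _entity_poly_seq.num, and _atom_site.label_asym_id and label_entity_id would have to match the values used here.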

Repro (typical)

  • Start from a PDB
  • Use MIToS to select a chain / rename residues
  • Save as mmCIF using MMCIFFile + write_file
  • Use that mmCIF as a ColabFold template → error above
