Check function to make labels #119

@dprada

Are we working with format strings? They would be very useful for implementing the inverse process. Something like:

def make_atom_labels(atoms, atom_label_format: str = "{atom_name}-{atom_id}") -> list[str]:
    """Generate textual labels for atoms using a format string.

    The argument `atom_label_format` defines a *format string*, i.e. a string
    template that contains placeholders enclosed in curly braces (e.g.,
    `"{atom_id}-{atom_name}"`). These placeholders are replaced with the
    corresponding atom attributes when generating each label.

    For instance:
    ```python
    atom_label_format = "{atom_name}-{atom_id}"
    ```
    produces labels like `"CA-25"`, whereas
    ```python
    atom_label_format = "{chain_id}:{atom_name}"
    ```
    produces labels like `"A:CA"`.

    Parameters
    ----------
    atoms : iterable of Atom
        Atoms to label. Each atom object must provide the attributes used in
        the format string (e.g., `atom_id`, `atom_name`, `chain_id`, etc.).
    atom_label_format : str, optional
        Format string defining the label template. Must contain valid placeholders
        matching atom attributes. For example: `"{atom_name}-{atom_id}"`.

    Returns
    -------
    list[str]
        List of labels generated for each atom according to the format string.

    Notes
    -----
    A *format string* differs from an *f-string* in that it is not evaluated
    immediately. Instead, it acts as a template that is evaluated later via
    the `.format()` method, allowing the template to be defined dynamically
    at runtime.

    Examples
    --------
    >>> make_atom_labels(atoms, "{atom_name}-{atom_id}")
    ['CA-25', 'CB-26', 'CG-27']
    >>> make_atom_labels(atoms, "{chain_id}:{atom_name}")
    ['A:CA', 'A:CB', 'A:CG']
    """
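    # Resolve every placeholder as an attribute of the atom, so that any field
    # named in the template (atom_id, atom_name, chain_id, ...) is supported.
    from string import Formatter
    fields = [name for _, name, _, _ in Formatter().parse(atom_label_format) if name]
    return [atom_label_format.format(**{f: getattr(a, f) for f in fields}) for a in atoms]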
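
The docstring above requires the placeholders to match atom attributes. As a minimal sketch of how that could be checked before any label is generated, the standard `string.Formatter().parse` can list the field names a template uses. The `Atom` dataclass and the `template_fields` and `check_template` helpers are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass
from string import Formatter


@dataclass
class Atom:
    """Hypothetical stand-in for the real atom type (an assumption)."""
    atom_id: int
    atom_name: str
    chain_id: str


def template_fields(template: str) -> list[str]:
    """Return the placeholder names used in a format string, in order."""
    # Formatter.parse yields (literal_text, field_name, format_spec, conversion);
    # field_name is None for a trailing piece of literal text.
    return [name for _, name, _, _ in Formatter().parse(template) if name]


def check_template(template: str, atom: Atom) -> None:
    """Raise ValueError if the template uses a field the atom does not have."""
    missing = [f for f in template_fields(template) if not hasattr(atom, f)]
    if missing:
        raise ValueError(f"Unknown placeholders in {template!r}: {missing}")


atom = Atom(atom_id=25, atom_name="CA", chain_id="A")
check_template("{chain_id}:{atom_name}", atom)   # passes silently
print(template_fields("{atom_name}-{atom_id}"))  # ['atom_name', 'atom_id']
```

Failing early with the list of unknown placeholders tends to be easier to debug than a `KeyError` or `AttributeError` raised while labels are being built.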

Take a look at this piece of code for an idea of how to build a simple two-directional (format and parse) process.

import re
from typing import Any, Pattern

def template_to_regex(template: str) -> Pattern[str]:
    r"""Convert a format-string-like template into a regex with named groups.

    Example
    -------
    "atom_id:{atom_id} / group_id={group_id}"
    → r"^atom_id:(?P<atom_id>.+?)\ /\ group_id=(?P<group_id>.+?)$"
    """
    parts: list[str] = []
    i = 0
    while i < len(template):
        if template[i] == "{":
            j = template.index("}", i)
            field_name = template[i+1:j].strip()
            # non-greedy named group
            parts.append(f"(?P<{field_name}>.+?)")
            i = j + 1
        else:
            parts.append(re.escape(template[i]))
            i += 1
    regex = "^" + "".join(parts) + "$"
    return re.compile(regex)


def format_from_template(template: str, context: dict[str, Any]) -> str:
    """Render a string from a template like '{atom_id}-{atom_name}'."""
    return template.format(**context)

def parse_from_template(template: str, text: str) -> dict[str, str]:
    """Parse a string using the given template and return the captured fields."""
    pattern = template_to_regex(template)
    m = pattern.match(text)
    if not m:
        raise ValueError(f"String {text!r} does not match template {template!r}")
    return m.groupdict()

def parse_many_from_template(template: str, texts: list[str]) -> list[dict[str, str]]:
    """Parse many strings with the same template."""
    pattern = template_to_regex(template)
    results: list[dict[str, str]] = []
    for text in texts:
        m = pattern.match(text)
        if not m:
            raise ValueError(f"String {text!r} does not match template {template!r}")
        results.append(m.groupdict())
    return results
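
To see the two directions working together, here is a compact, self-contained sketch of a round trip. It swaps the character loop above for `re.split` (a different technique with the same intent), and the names `to_regex`, `render`, and `parse` are hypothetical, chosen only for this example. Note that parsed values always come back as strings.

```python
import re


def to_regex(template: str) -> re.Pattern[str]:
    """Build an anchored regex with one named group per {placeholder}."""
    # Splitting on a capturing group alternates literal text and field names.
    pieces = re.split(r"\{(\w+)\}", template)
    out = []
    for k, piece in enumerate(pieces):
        if k % 2:  # odd positions hold placeholder names
            out.append(f"(?P<{piece}>.+?)")
        else:      # even positions hold literal text
            out.append(re.escape(piece))
    return re.compile("^" + "".join(out) + "$")


def render(template: str, fields: dict) -> str:
    """Forward direction: fill the template with field values."""
    return template.format(**fields)


def parse(template: str, text: str) -> dict[str, str]:
    """Inverse direction: recover the field values from a rendered label."""
    m = to_regex(template).match(text)
    if m is None:
        raise ValueError(f"{text!r} does not match {template!r}")
    return m.groupdict()


template = "{chain_id}:{atom_name}-{atom_id}"
label = render(template, {"chain_id": "A", "atom_name": "CA", "atom_id": 25})
fields = parse(template, label)
print(label)   # A:CA-25
print(fields)  # {'chain_id': 'A', 'atom_name': 'CA', 'atom_id': '25'}
```

One caveat of the non-greedy groups: if a value contains the separator itself, the split lands on the first occurrence, so `"C-1-25"` parsed with `"{atom_name}-{atom_id}"` yields `atom_name='C'` and `atom_id='1-25'`. Picking separators that cannot appear in the values, or using stricter per-field patterns, would avoid this.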
"""

Labels: TODO, enhancement
