How are spatial coordinates used in pretraining and downstream tasks? #25

@TawhidMM

Hi Nicheformer team,

I wanted to ask for clarification regarding how spatial coordinate data is utilized throughout the Nicheformer workflow.

Specifically:

During pretraining:

  • Is spatial coordinate information explicitly used as part of the model input?

  • Or is pretraining based only on gene expression tokens, regardless of cell positions?

During downstream tasks:

  • Are spatial coordinates ever used directly in tasks like niche classification, density regression, etc.?

  • Or are the coordinates only used indirectly, to compute ground-truth labels such as X_niche_n (niche composition) or local density with tools like Squidpy?

From what I understand:

  • The coordinate data is used only to generate ground-truth labels (such as niche composition or density) via spatial neighbor graphs.

  • But the model input itself (in both pretraining and downstream tasks) consists only of ranked gene token sequences, not explicit spatial information.
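
To make this interpretation concrete, here is a minimal sketch of what I have in mind. It is only an illustration under my own assumptions, not Nicheformer's actual code: the function names `rank_gene_tokens` and `niche_composition` are hypothetical, the k-nearest-neighbor graph is built with plain NumPy as a stand-in for a Squidpy spatial graph, and the toy data is made up. The point is that the coordinates feed only the label, while the input is built from expression alone.

```python
import numpy as np

def rank_gene_tokens(expression, max_len):
    """Hypothetical model input: gene indices ordered by descending
    expression, with unexpressed genes dropped. No coordinates involved."""
    order = np.argsort(-expression, kind="stable")
    order = order[expression[order] > 0]
    return order[:max_len]

def niche_composition(coords, cell_types, n_types, k):
    """Hypothetical label: fraction of each cell type among the k nearest
    spatial neighbors of every cell (the cell itself is excluded)."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a cell is not its own neighbor
    neighbors = np.argsort(dist, axis=1, kind="stable")[:, :k]
    comp = np.zeros((coords.shape[0], n_types))
    for i, nbrs in enumerate(neighbors):
        comp[i] = np.bincount(cell_types[nbrs], minlength=n_types) / k
    return comp

# Toy data (made up): 5 cells, 2 cell types, 5 genes for one cell.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                   [10.0, 10.0], [11.0, 10.0]])
cell_types = np.array([0, 0, 1, 1, 1])
expression = np.array([0.0, 5.0, 2.0, 0.0, 9.0])

tokens = rank_gene_tokens(expression, max_len=2)                  # model input
labels = niche_composition(coords, cell_types, n_types=2, k=2)    # target only
print(tokens)
print(labels)
```

If this matches the actual workflow, then shuffling the coordinates would change `labels` but leave `tokens` untouched, which is the behavior I would like to confirm.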

Could you confirm if this interpretation is correct? And if not, could you clarify how spatial context is encoded (if at all) during model input?
Thanks in advance.
