Sign disagreement between published QTL feathers and Supp. Table 4 for caqtl_microglia and dsqtl_yoruba #45

@dryingpaint

Hi! While reproducing the QTL coefficient evaluation against your paper's Suppl. Table 4 numbers, I ran into an inconsistency that I think is most likely a feather export bug, but it could also be that there's a documented sign-correction step I'm missing. Wanted to flag it either way.

Summary

For 3 of the 5 caQTL/dsQTL coefficient datasets you publish at gs://alphagenome/evals/, the signed pearsonr(prediction, target) computed directly on the feather matches the paper's Suppl. Table 4 value. For 2 datasets (caqtl_microglia and dsqtl_yoruba), the magnitude matches but the sign is opposite.

Dataset            Direct from feather    Suppl. Table 4
caqtl_african      +0.7367                +0.7368 ✅
caqtl_european     +0.5914                +0.5916 ✅
caqtl_smc          +0.6870                +0.6870 ✅
caqtl_microglia    −0.6354                +0.6357
dsqtl_yoruba       −0.8323                +0.8323

Reproducer (no SDK install needed; runs on any machine with pandas, pyarrow, and scipy)

import os
import tempfile
from urllib.request import urlretrieve

import pandas as pd
from scipy.stats import pearsonr

# HTTPS endpoint for the gs://alphagenome/evals/ bucket path.
BASE = "https://storage.googleapis.com/alphagenome/evals/"

# (label used in Suppl. Table 4, feather file stem)
cases = [
    ("caqtl_african",   "caqtl_african_variant_coefficient_human_predictions"),
    ("caqtl_european",  "caqtl_european_variant_coefficient_human_predictions"),
    ("caqtl_smc",       "caqtl_smc_variant_coefficient_human_predictions"),
    ("caqtl_microglia", "caqtl_microglia_variant_coefficient_human_predictions"),
    ("dsqtl_yoruba",    "dsqtl_yoruba_variant_coefficient_human_predictions"),
]

for label, name in cases:
    # Cache each download in the system temp directory so reruns are fast.
    local = os.path.join(tempfile.gettempdir(), f"{name}.feather")
    if not os.path.exists(local):
        urlretrieve(BASE + name + ".feather", local)
    df = pd.read_feather(local)  # requires pyarrow
    # Signed Pearson correlation, with no per-dataset sign adjustment.
    r, _ = pearsonr(df["prediction"], df["target"])
    print(f"{label:<20} pearsonr(prediction, target) = {r:+.4f}")

Output:

caqtl_african        pearsonr(prediction, target) = +0.7367
caqtl_european       pearsonr(prediction, target) = +0.5914
caqtl_smc            pearsonr(prediction, target) = +0.6870
caqtl_microglia      pearsonr(prediction, target) = -0.6354
dsqtl_yoruba         pearsonr(prediction, target) = -0.8323
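To make the comparison mechanical, here is a minimal sketch that classifies each dataset as a match, a pure sign flip, or a real mismatch. The numbers are copied from the table above; the `classify` helper and the `5e-4` tolerance are my own assumptions, chosen only to absorb rounding differences.

```python
# (direct from feather, Suppl. Table 4), copied from the table in this issue.
observed = {
    "caqtl_african":   (+0.7367, +0.7368),
    "caqtl_european":  (+0.5914, +0.5916),
    "caqtl_smc":       (+0.6870, +0.6870),
    "caqtl_microglia": (-0.6354, +0.6357),
    "dsqtl_yoruba":    (-0.8323, +0.8323),
}

TOL = 5e-4  # assumed tolerance for rounding / numerical differences


def classify(feather_r, paper_r, tol=TOL):
    """Hypothetical helper: label one dataset's agreement with the paper."""
    if abs(feather_r - paper_r) <= tol:
        return "match"
    if abs(abs(feather_r) - abs(paper_r)) <= tol:
        return "sign_flip"  # same magnitude, opposite sign
    return "mismatch"


status = {name: classify(*values) for name, values in observed.items()}
for name, label in status.items():
    print(f"{name:<16} {label}")
```

Under that tolerance, the only datasets not labelled `match` are the two this issue is about, and both are labelled `sign_flip` rather than `mismatch`.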

What I checked

  • The paper distinguishes signed vs unsigned Pearson explicitly (e.g. caQTL Fig. 5d: "Signed Pearson r = 0.74; unsigned Pearson r = 0.45"), so Suppl. Table 4's pearsonr column is clearly the signed version. The magnitudes agree to within 0.0003 across all 5 datasets; only the sign disagrees on 2.
  • I separately verified by running model.predict_variant from this SDK on a few caqtl_microglia variants: the values reproduce the feather's prediction column to within bf16 numerics (r ≈ 0.999). So the prediction column is faithful to model output, and the disagreement is between the target column and the paper number.
  • The other 3 datasets go through the same loading code I'm using (target → effect_size, prediction → score) and match the paper to within rounding. So this isn't something on my end being applied inconsistently.

Question

Is there a per-dataset sign-correction step in your table-generation pipeline that the published feathers don't reflect (e.g. a polarity flip based on which allele was assigned REF in the upstream QTL study for those two)? Or is the target column for caqtl_microglia and dsqtl_yoruba simply exported with the wrong sign?
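For context on why a polarity flip would produce exactly this pattern: negating one variable negates Pearson r and leaves its magnitude unchanged. A minimal sketch on synthetic data (the arrays below are made up for illustration and have nothing to do with the real feathers):

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# Synthetic stand-ins for the feather columns (not real QTL data).
target = rng.normal(size=500)
prediction = 0.8 * target + rng.normal(scale=0.5, size=500)

r_same, _ = pearsonr(prediction, target)
# Swapping which allele is treated as REF negates every effect size.
r_flipped, _ = pearsonr(prediction, -target)

print(f"original polarity: {r_same:+.4f}")
print(f"flipped polarity:  {r_flipped:+.4f}")
```

The two printed values differ only in sign, which is consistent with what I see for caqtl_microglia and dsqtl_yoruba, although it does not tell us on which side the flip happened.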

Either answer is fine: the feathers could be re-exported, or the convention could be documented somewhere users can find it. Right now anyone naively running pearsonr(prediction, target) on the published artifacts gets the opposite sign from the paper for those two datasets.

Happy to send a PR with whatever fix you prefer (re-sign the feather column, or add a documentation note + a small apply_sign_convention() utility).
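If the documentation route is preferred, the utility could look roughly like the sketch below. Everything here is hypothetical: `apply_sign_convention()` does not exist in the SDK today, the `FLIPPED_DATASETS` set simply lists the two datasets from this issue, and I'm assuming the flip belongs on the `target` column, which is exactly the thing I'm asking you to confirm.

```python
import pandas as pd
from scipy.stats import pearsonr

# Hypothetical: datasets whose `target` polarity differs from the paper's.
FLIPPED_DATASETS = frozenset({"caqtl_microglia", "dsqtl_yoruba"})


def apply_sign_convention(df, dataset, flipped=FLIPPED_DATASETS):
    """Return a copy of `df` with `target` negated if `dataset` is in `flipped`.

    Proposed helper, not part of the current SDK. The input frame is not modified.
    """
    out = df.copy()
    if dataset in flipped:
        out["target"] = -out["target"]
    return out


# Toy frame standing in for a feather (values are made up).
toy = pd.DataFrame({
    "prediction": [0.1, 0.4, -0.3, 0.8],
    "target":     [-0.2, -0.5, 0.4, -0.9],
})

r_before, _ = pearsonr(toy["prediction"], toy["target"])
fixed = apply_sign_convention(toy, "caqtl_microglia")
r_after, _ = pearsonr(fixed["prediction"], fixed["target"])
untouched = apply_sign_convention(toy, "caqtl_smc")

print(f"before: {r_before:+.4f}  after: {r_after:+.4f}")
```

Returning a copy keeps the loaded feather intact, so callers can still compare the raw and corrected columns side by side.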

Why this matters downstream

Anyone benchmarking against your numbers using gs://alphagenome/evals/ directly hits this. For instance, our radical-eval pipeline cross-published baselines for all 5 datasets, and the caqtl_microglia and dsqtl_yoruba ones came out with the wrong sign, traceable entirely to this.

Thanks!
