Inconsistent ptid Indexing in Resampled SQLite Opacity Databases #398

@Nicholaswogan

Codex identified an issue with how pressure and temperature points are indexed for the linear interpolation path of a resampled opacity DB. I've pasted its report below.

Summary

PICASO appears to have an inconsistent internal convention for ptid in resampled SQLite opacity databases.

Some database-builder and reader paths treat ptid as 0-based, while other legacy paths assume ptid is 1-based. The linear interpolation path in picaso/optics.py assumes a 1-based convention when selecting molecular PT rows, which causes it to use the wrong interpolation stencil for databases whose ptid values are validly stored as 0-based.

This is not specific to one custom database. It is a PICASO consistency bug: the codebase does not enforce one ptid contract across builders and runtime readers.

What We Verified

1. The SQLite database can validly be 0-based

In the database writer used to generate the tested opacity database, ptid is written as:

l = j + i*len(P)
cur.execute(..., (int(l), molecule, float(T[i]), float(P[j]), y))

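Assuming i and j are the usual zero-based loop indices over T and P, the first row written gets ptid = 0 and the last gets len(T)*len(P) - 1, so this writer is 0-based.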

The resulting SQLite database is internally consistent:

  • min(ptid) = 0
  • max(ptid) = N-1
  • the PT rows are ordered consistently with the stored pressure/temperature grid
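The min/max check above can be reproduced with a short script. This is only a sketch: the table name `molecular`, its column layout, and the helper names `ptid_summary` and `ptid_base` are assumptions for illustration, not PICASO's API, and the toy in-memory database stands in for a real opacity file.

```python
import sqlite3

def ptid_summary(conn, table="molecular"):
    """Return (min_ptid, max_ptid, n_distinct_ptid) for a table."""
    # The table name is interpolated directly, so only pass trusted names.
    query = f"SELECT MIN(ptid), MAX(ptid), COUNT(DISTINCT ptid) FROM {table}"
    return conn.execute(query).fetchone()

def ptid_base(conn, table="molecular"):
    """Return 0 or 1 depending on the stored ptid convention."""
    lo, hi, n = ptid_summary(conn, table)
    if hi - lo + 1 != n:
        raise ValueError("ptid values are not contiguous")
    if lo not in (0, 1):
        raise ValueError(f"unexpected first ptid: {lo}")
    return lo

# Toy database built with the same flat index as the writer quoted above.
# The schema here is hypothetical.
T = [175.0, 200.0]
P = [1e-4, 3.162e-4, 1e-3]
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE molecular "
    "(ptid INTEGER, molecule TEXT, temperature REAL, pressure REAL)"
)
for i in range(len(T)):
    for j in range(len(P)):
        conn.execute(
            "INSERT INTO molecular VALUES (?, ?, ?, ?)",
            (j + i * len(P), "H2O", T[i], P[j]),
        )

print(ptid_summary(conn), ptid_base(conn))
```

For a real file, `sqlite3.connect(path)` would replace the in-memory connection, with the table name adjusted to whatever the database actually uses.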

2. opacity_factory.py is internally inconsistent about ptid

picaso/opacity_factory.py contains multiple builder/read paths with conflicting assumptions:

  • some molecular insert paths write 0-based ptid
  • at least one legacy insert path writes 1-based ptid
  • some downstream helper code assumes 1-based ids and subtracts 1 when indexing:
temp_nearest = [pt_pairs[i-1][2] for i in ind_pt]
pres_nearest = [pt_pairs[i-1][1] for i in ind_pt]

So PICASO does not currently maintain a single consistent ptid convention across the codebase.
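One way to make helper code like this independent of the base is to look rows up by their stored ptid rather than by list position. A minimal sketch, assuming `pt_pairs` rows are `(ptid, pressure, temperature)` tuples as the indexing above implies; `nearest_pt` is a hypothetical helper name, not an existing PICASO function.

```python
def nearest_pt(pt_pairs, ind_pt):
    """Return (pressures, temperatures) for the requested ptid values.

    Rows are matched on the stored ptid (row[0]), so the result is the
    same whether the database numbers its PT points from 0 or from 1.
    """
    by_id = {row[0]: row for row in pt_pairs}
    pres_nearest = [by_id[i][1] for i in ind_pt]
    temp_nearest = [by_id[i][2] for i in ind_pt]
    return pres_nearest, temp_nearest

# The same two PT points stored under each convention.
pairs_zero_based = [(0, 1e-4, 175.0), (1, 3.162e-4, 175.0)]
pairs_one_based = [(1, 1e-4, 175.0), (2, 3.162e-4, 175.0)]

print(nearest_pt(pairs_zero_based, [0]))
print(nearest_pt(pairs_one_based, [1]))
```

Both calls return the first PT point, because the lookup never converts a ptid into a list offset.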

Runtime Bug in the SQLite Linear Path

The resampled SQLite query_method='linear' path in picaso/optics.py uses logic equivalent to:

data[i + '_' + str(1 + i_t_low_p_low[ind])]
data[i + '_' + str(1 + i_t_hi_p_low[ind])]
data[i + '_' + str(1 + i_t_hi_p_hi[ind])]
data[i + '_' + str(1 + i_t_low_p_hi[ind])]

That +1 implicitly assumes molecular ptid is 1-based.

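For a valid 0-based database written as above (ptid = j + i*len(P), with j the pressure index), adding 1 moves each lookup to the next pressure at the same temperature, so all four molecular interpolation corners shift upward by one pressure row.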

Concrete Example

For a layer at:

  • P = 1e-4 bar
  • T = 180 K

The physically correct PT corners are:

  • (P=1e-4, T=175)
  • (P=1e-4, T=200)
  • (P=3.162e-4, T=175)
  • (P=3.162e-4, T=200)

But the current SQLite linear path fetches:

  • (P=3.162e-4, T=175)
  • (P=3.162e-4, T=200)
  • (P=1e-3, T=175)
  • (P=1e-3, T=200)

So the interpolation stencil is shifted upward by one pressure row.
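The shift can be reproduced on a toy grid. The sketch below is not the picaso/optics.py code: the grids contain only the values from the example, the flat index follows the writer quoted earlier, and `flat_index` and `corners` are hypothetical helpers. The `offset` argument mimics the unconditional +1.

```python
from bisect import bisect_right

T_GRID = [175.0, 200.0]
P_GRID = [1e-4, 3.162e-4, 1e-3]
N_P = len(P_GRID)

def flat_index(i_t, j_p):
    """0-based flat ptid for temperature index i_t and pressure index j_p."""
    return j_p + i_t * N_P

def corners(p, t, offset=0):
    """Return the four (P, T) corners fetched for a layer at (p, t).

    The bracketing indices are computed correctly; `offset` is then added
    to each id before it is decoded against a 0-based database.
    """
    # Lower bracketing index, clamped so the upper neighbour exists.
    j = min(max(bisect_right(P_GRID, p) - 1, 0), N_P - 2)
    i = min(max(bisect_right(T_GRID, t) - 1, 0), len(T_GRID) - 2)
    ids = [flat_index(a, b) + offset for a in (i, i + 1) for b in (j, j + 1)]
    return [(P_GRID[k % N_P], T_GRID[k // N_P]) for k in ids]

print(corners(1e-4, 180.0))            # ids used as stored
print(corners(1e-4, 180.0, offset=1))  # ids with the +1 applied
```

With `offset=0` the corners bracket the layer at P = 1e-4 and 3.162e-4; with `offset=1` they land on P = 3.162e-4 and 1e-3, matching the shifted stencil listed above.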

Impact

  • query_method='linear' in the resampled SQLite backend can return systematically biased molecular opacities for valid 0-based databases.
  • Spectra can differ noticeably because the molecular interpolation stencil is wrong.
  • This issue is not necessarily limited to one code path; any builder/reader combination that mixes 0-based and 1-based assumptions is at risk.

Expected Fix Direction

PICASO should define and enforce one ptid convention for resampled opacity databases.

Recommended fix:

  1. Pick one explicit contract for ptid everywhere.
    Most natural choice: 0-based, since it matches normal Python row indexing and is already used by some builders.
  2. Update runtime reader code in picaso/optics.py so the SQLite linear path uses the actual stored ptid values without an unconditional +1 shift.
  3. Audit picaso/opacity_factory.py for any legacy 1-based assumptions and normalize them.
  4. Add validation/tests that confirm the following all agree:
    • the PT grid row ordering
    • stored ptid
    • runtime interpolation row selection
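A test for the last item could look roughly like the following. It is a sketch under stated assumptions: rows are `(ptid, temperature, pressure)` tuples read from the database, the flat index is `j + i*len(P)` as in the writer quoted earlier, and `check_ptid_contract` is a hypothetical name.

```python
import math

def check_ptid_contract(rows, T, P, base=0):
    """Return the ptid values whose stored (T, P) disagree with the grid.

    Each ptid is decoded as (ptid - base) = j + i*len(P) and compared with
    T[i] and P[j]. An empty list means the rows follow the contract.
    """
    n_p = len(P)
    bad = []
    for ptid, temperature, pressure in rows:
        i, j = divmod(ptid - base, n_p)
        ok = (
            0 <= i < len(T)
            and math.isclose(temperature, T[i])
            and math.isclose(pressure, P[j])
        )
        if not ok:
            bad.append(ptid)
    return bad

T = [175.0, 200.0]
P = [1e-4, 3.162e-4, 1e-3]
# Rows as a 0-based writer would store them.
rows = [(j + i * len(P), T[i], P[j])
        for i in range(len(T)) for j in range(len(P))]

print(check_ptid_contract(rows, T, P, base=0))  # consistent with 0-based
print(check_ptid_contract(rows, T, P, base=1))  # read as if 1-based
```

Running the same check with the base the runtime reader assumes would flag a builder/reader mismatch before any interpolation is done.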

Suggested Validation

  • Print the four PT-corner rows chosen for several representative (P, T) layers.
  • Compare those rows against the actual physical PT grid in the database.
  • Confirm that the selected rows match the correct bracketing stencil.
  • Test both builder-generated and custom-generated resampled SQLite databases to ensure the convention is consistent.
