Blueprint for Dimension reducer / Data Processor #362

@odunbar

Proposal

Construct a data-processing tool that unifies how data is processed across the emulators.

Takes in

  • Data: input-output pairs, or the EKI object (with the input-output pairs extracted internally)
  • A schedule of data-processing and dimension-reduction steps

Gives back

  • the processed data pairs

Example

The user could specify N-stage processing.
In each stage a processor is applied to the inputs ("in"), the outputs ("out"), or both jointly ("joint"):

process_schedule = [
    ("in", DataProcessor1(...)), # first process inputs with processor 1
    ("out", DataProcessor2(...)), # next process outputs with processor 2
    ("joint", DataProcessor3(...)), # next jointly process inputs and outputs with processor 3
    ("out", DataProcessor4(...)), # finally, process outputs with processor 4
]

The user then builds emulators on the processed data

io_pairs_processed = process_data(io_pairs, process_schedule)
# one catch: interfaces must handle that io_pairs_processed may have a different dimension from io_pairs
io_pairs_recovered = reverse_process_data(io_pairs_processed, process_schedule)
# same catch in reverse: io_pairs_recovered is back in the dimension of io_pairs
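
To make the schedule idea concrete, here is a minimal Python sketch of how `process_data` and `reverse_process_data` might walk a schedule. It is only an illustration under assumptions, not the proposed design: the `fit`/`encode`/`decode` processor interface, the `Standardize` processor, and the rows-are-samples layout are all hypothetical, and "joint" stages are left out for brevity.

```python
import numpy as np

class Standardize:
    """Hypothetical processor: scale each feature to zero mean, unit variance."""
    def fit(self, data):
        self.mean = data.mean(axis=0)
        self.std = data.std(axis=0)
        return self

    def encode(self, data):
        return (data - self.mean) / self.std

    def decode(self, data):
        return data * self.std + self.mean

def process_data(io_pairs, schedule):
    """Apply each stage in order; processors are fitted in place as they run."""
    inputs, outputs = io_pairs
    for target, proc in schedule:
        if target == "in":
            inputs = proc.fit(inputs).encode(inputs)
        elif target == "out":
            outputs = proc.fit(outputs).encode(outputs)
        else:
            raise ValueError(f"stage target {target!r} is not handled in this sketch")
    return inputs, outputs

def reverse_process_data(io_pairs_processed, schedule):
    """Undo the stages in reverse order, using the already-fitted processors."""
    inputs, outputs = io_pairs_processed
    for target, proc in reversed(schedule):
        if target == "in":
            inputs = proc.decode(inputs)
        elif target == "out":
            outputs = proc.decode(outputs)
        else:
            raise ValueError(f"stage target {target!r} is not handled in this sketch")
    return inputs, outputs
```

Because each processor stores its fitted state, the sketch assumes every stage in the schedule uses its own processor instance.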

Each DataProcessor(...) acting on just the inputs or just the outputs could be selected from:

  • PCA
  • simple scalings, or regularization
  • Nonlinear dim reduction
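
As an example of a single-sided processor that changes the data dimension, PCA could look roughly like the sketch below. The class name `PCAProcessor`, its `fit`/`encode`/`decode` methods, and the rows-are-samples layout are assumptions made for illustration only.

```python
import numpy as np

class PCAProcessor:
    """Hypothetical PCA processor: project onto the leading principal directions."""
    def __init__(self, n_components):
        self.n_components = n_components

    def fit(self, data):
        # data has shape (n_samples, n_features)
        self.mean = data.mean(axis=0)
        _, _, vt = np.linalg.svd(data - self.mean, full_matrices=False)
        self.components = vt[: self.n_components]  # (n_components, n_features)
        return self

    def encode(self, data):
        # Reduce to n_components coordinates per sample
        return (data - self.mean) @ self.components.T

    def decode(self, reduced):
        # Map back to the original feature space (lossy if variance was truncated)
        return reduced @ self.components + self.mean
```

Here `encode` returns data of a smaller dimension, which is exactly the case the downstream interfaces would need to handle, and `decode` only recovers the original data exactly when the discarded directions carry no variance.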

Processors acting jointly on inputs and outputs could be selected from:

  • data-informed or likelihood informed subspaces
  • CCA
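
As one illustration of a joint processor, CCA can be sketched with plain NumPy. This is a rough sketch rather than the intended implementation: the function name `cca_fit`, the small ridge term `reg`, and the rows-are-samples layout are assumptions introduced for the example.

```python
import numpy as np

def cca_fit(X, Y, n_components, reg=1e-8):
    """Return projection matrices A, B and the canonical correlations.

    X has shape (n_samples, dx), Y has shape (n_samples, dy).
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Sample covariances, with a small ridge term for numerical stability
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    # Whiten with Cholesky factors: M = Lx^{-1} Cxy Ly^{-T}
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    M = np.linalg.solve(Lx, np.linalg.solve(Ly, Cxy.T).T)
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Map the singular vectors back to the original coordinates
    A = np.linalg.solve(Lx.T, U[:, :n_components])
    B = np.linalg.solve(Ly.T, Vt.T[:, :n_components])
    return A, B, s[:n_components]
```

The centred inputs and outputs would then be reduced as `Xc @ A` and `Yc @ B`, so a joint stage changes the dimension of both sides at once.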
