forked from AllenNeuralDynamics/poisson-numcodecs
Description
The following parameters are currently exposed in codec metadata, i.e., the JSON form of the codec when it is written to zarr.json / .zarray:
Current metadata
| field | type | required? | notes |
|---|---|---|---|
| `zero_level` | int | yes | the value in the source data that corresponds to 0 detections |
| `conversion_gain` | float | yes | the intensity of a single observed event in the source data |
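
For illustration, here is a rough sketch of what that JSON form might look like in the two metadata layouts. The codec identifier `"poisson"` and the parameter values are assumptions made up for this example, not taken from the codec itself; only the general shapes (a flat dict with an `id` key for `.zarray`, and a `name` plus nested `configuration` for `zarr.json`) follow the usual numcodecs / Zarr V3 conventions.

```python
import json

# Hypothetical values; the identifier "poisson" is an assumption for this sketch.
config = {"zero_level": 100, "conversion_gain": 2.5}

# Zarr V2 (.zarray) style: a flat dict with an "id" key, as numcodecs codecs use.
v2_metadata = {"id": "poisson", **config}

# Zarr V3 (zarr.json) style: a "name" plus a nested "configuration" object.
v3_metadata = {"name": "poisson", "configuration": config}

print(json.dumps(v2_metadata, indent=2))
print(json.dumps(v3_metadata, indent=2))
```

Either way, only `zero_level` and `conversion_gain` are serialized today.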
"true" metadata
But this codec takes more parameters than that. Here are the remaining parameters that fully specify the codec:
| field | type | required? | notes |
|---|---|---|---|
| `decoded_dtype` | Zarr data type identifier | probably not | the data type of the input data. Unclear if this is really needed, because it can be inferred from the type of the input array, and the codec can contain logic that validates certain constraints (e.g., raising an error if an incompatible data type is used). The Zarr V3 codec API defines a procedure a codec can use for statically checking its compatibility with an arbitrary ndarray, so maybe we could put that logic there. |
| `encoded_dtype` | Zarr data type identifier | yes | the data type of the array generated by the encode operation. This controls the amount of quantization. Right now this is hard-coded to `uint8`. Conceivably users might want to control this, but I don't have an intuition for that. |
| `input_max` | int, default is 32767 | unclear | the max value in the input. Used in `make_anscombe_lookup`. This depends on the `decoded_dtype` parameter: for integer inputs, `input_max` can't be larger than the largest value representable by `decoded_dtype`. |
| `beta` | float, [0, 1] inclusive | yes | used to generate the lookup table from input values -> Anscombe-transformed values. I don't have an intuition for this parameter yet, but I know that it's data-dependent and necessary for defining the lookup table, and thus effectively part of the codec configuration. |
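
To make the constraints in this table concrete, here is a minimal sketch of a configuration object that carries all six parameters and checks the two constraints stated above (the range of `beta`, and `input_max` against the integer range of `decoded_dtype`). The class name `PoissonConfig` is hypothetical, and the `int16` default for `decoded_dtype` is an assumption chosen only because it is consistent with the 32767 default of `input_max`; none of this is the codec's actual implementation.

```python
from dataclasses import asdict, dataclass

import numpy as np


@dataclass(frozen=True)
class PoissonConfig:
    """Hypothetical container for the full parameter set (illustration only)."""

    zero_level: int
    conversion_gain: float
    beta: float
    decoded_dtype: str = "int16"  # assumed default, not from the codec
    encoded_dtype: str = "uint8"  # currently hard-coded in the codec
    input_max: int = 32767

    def __post_init__(self):
        # beta is documented as lying in [0, 1] inclusive.
        if not 0.0 <= self.beta <= 1.0:
            raise ValueError(f"beta must be in [0, 1], got {self.beta}")
        # For integer inputs, input_max cannot exceed the dtype's largest value.
        decoded = np.dtype(self.decoded_dtype)
        if np.issubdtype(decoded, np.integer):
            limit = int(np.iinfo(decoded).max)
            if self.input_max > limit:
                raise ValueError(
                    f"input_max={self.input_max} exceeds the maximum of "
                    f"{decoded.name} ({limit})"
                )


cfg = PoissonConfig(zero_level=100, conversion_gain=2.5, beta=0.5)
print(asdict(cfg))  # a plain dict, ready to be serialized as JSON metadata
```

With this shape, something like `PoissonConfig(zero_level=0, conversion_gain=1.0, beta=0.5, decoded_dtype="uint8")` raises a `ValueError`, because the default `input_max` of 32767 does not fit in `uint8`.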
key questions
- what would break if we don't use `decoded_dtype` as a parameter to `encode`?
- should users specify the `encoded_dtype`?
- is `input_max` dependent on the maximum observed value in the user's data, or the maximum possible value, given the data type?
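
The last question comes down to which of two quantities `input_max` is meant to track. A small sketch with made-up sample values shows the difference; note that the current default of 32767 happens to equal the maximum of `int16`, which would be consistent with the "maximum possible value" reading, though that is only an observation and not confirmed by the codec.

```python
import numpy as np

# Made-up sample values, for illustration only.
data = np.array([100, 180, 950, 4000], dtype=np.int16)

observed_max = int(data.max())             # depends on this particular dataset
dtype_max = int(np.iinfo(data.dtype).max)  # depends only on the data type

print(observed_max)  # 4000
print(dtype_max)     # 32767
```

If `input_max` follows the observed maximum, it is data-dependent and has to be stored in the codec metadata; if it follows the dtype maximum, it could be derived from the input array's dtype instead.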