ricjsan commented Oct 15, 2025

Generated parquet still includes pvcs.

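Digging through the code, I could see that ignored_alloc_keys is not initialized correctly: the array is never retrieved from the keys property, so process_result receives the whole config dict and iterates over its top-level keys instead of the list of keys to ignore.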

ignore_alloc_keys.json:

{
    "keys": ["pvs", "lbAllocations"]
}

opencost_parquet_exporter.py:

def main():
    ...
    print("Load allocation keys to ignore")
    ignore_alloc_keys = load_config_file(
        file_path=f'{os.path.dirname(os.path.abspath(__file__))}/ignore_alloc_keys.json')

    ...
    print("Processing the data")
    processed_data = process_result(
        result=result,
        ignored_alloc_keys=ignore_alloc_keys,
        rename_cols=rename_cols,
        data_types=data_types)
    ...

def process_result(result, ignored_alloc_keys, rename_cols, data_types):
    """
    Process raw results from the OpenCost API data request.
    Parameters:
    - result (dict): Raw response data from the OpenCost API.
    - ignored_alloc_keys (dict): Allocation keys to ignore
    - rename_cols (dict): Key-value pairs for columns to rename
    - data_types (dict): Data types for properties of OpenCost response

    Returns:
    - DataFrame or None: Processed data as a Pandas DataFrame, or None if an error occurs.
    """
    for split in result:
        # Remove the entry for unmounted PVs;
        # it breaks the table schema in Athena.
        split.pop('__unmounted__/__unmounted__/__unmounted__', None)
    for split in result:
        for alloc_name in split.keys():
            for ignored_key in ignored_alloc_keys:
                split[alloc_name].pop(ignored_key, None)
    ...
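
To make the failure concrete, here is a minimal, self-contained sketch of the behavior described above. The allocation data and the `strip_keys` helper are hypothetical stand-ins (only the loop shape is copied from `process_result`), and passing `config["keys"]` is one possible fix, not necessarily the exact change in this PR.

```python
import copy

# Hypothetical stand-in for the parsed ignore_alloc_keys.json
config = {"keys": ["pvs", "lbAllocations"]}

# Hypothetical, heavily trimmed allocation data for illustration only
result = [
    {"ns/pod/container": {"cpuCost": 1.0, "pvs": {"pv-1": {}}, "lbAllocations": None}}
]


def strip_keys(result, ignored_alloc_keys):
    """Same loop shape as process_result: pop each ignored key from every allocation."""
    for split in result:
        for alloc_name in split.keys():
            for ignored_key in ignored_alloc_keys:
                split[alloc_name].pop(ignored_key, None)
    return result


# Passing the whole dict: iterating a dict yields its keys, so the only
# "ignored key" is the string "keys", and nothing useful is removed.
buggy = strip_keys(copy.deepcopy(result), config)

# Passing the list stored under "keys": the intended entries are removed.
fixed = strip_keys(copy.deepcopy(result), config["keys"])

print(sorted(buggy[0]["ns/pod/container"]))
print(sorted(fixed[0]["ns/pod/container"]))
```

Because `dict.pop(key, None)` silently ignores missing keys, the original code raises no error when handed the dict, which is likely why the problem only shows up in the generated parquet.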
