@Furkan-rgb
Contributor

@Furkan-rgb Furkan-rgb commented Nov 4, 2024

Motivation

Problem:

When using the CatCMA sampler in Optuna with categorical parameters, a shape mismatch error occurs during the optimization process. This error prevents the sampler from functioning correctly, hindering the hyperparameter optimization workflow.

Error Details:

ValueError: operands could not be broadcast together with shapes (10,33) (6,11)

This error arises from the attempt to perform element-wise operations on arrays with incompatible shapes within the CatCMA optimizer. Specifically, the categorical parameters are encoded as ragged arrays (arrays of varying lengths), leading to broadcasting issues.

Root Cause:
Each categorical parameter is one-hot encoded to a length equal to its number of choices, so the encoded vectors differ in length. When these ragged vectors are collected into a NumPy array with dtype=object, the result is a one-dimensional array of arrays rather than a uniform two-dimensional array, and element-wise operations inside the optimizer fail with a shape mismatch.
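
The ragged-array behaviour described above can be reproduced with NumPy alone. This is a minimal sketch, not the sampler's actual code path: the two one-hot vectors and the helper `try_multiply` are hypothetical, and the shapes passed to it are simply the ones quoted in the error message.

```python
import numpy as np

# Hypothetical one-hot vectors for two categorical parameters.
one_hot_a = np.array([1.0, 0.0, 0.0])        # 3 choices
one_hot_b = np.array([0.0, 1.0, 0.0, 0.0])   # 4 choices

# Fill an object array element by element so the ragged rows are stored as-is.
ragged = np.empty(2, dtype=object)
ragged[0] = one_hot_a
ragged[1] = one_hot_b
print(ragged.shape)  # NumPy only sees the outer length, not the inner lengths


def try_multiply(left_shape, right_shape):
    """Return the broadcast error message, or None if the shapes are compatible."""
    try:
        np.ones(left_shape) * np.ones(right_shape)
    except ValueError as exc:
        return str(exc)
    return None


# Shapes taken from the error message in this PR description.
message = try_multiply((10, 33), (6, 11))
print(message)
```

The object array reports only its outer length, so code that expects a regular two-dimensional array gets no early warning until an element-wise operation fails.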

Description of the changes

To resolve the shape mismatch, ensure that all categorical parameters are encoded into fixed-size, flat arrays. This can be achieved by concatenating all one-hot encoded vectors into a single flat array with a consistent length across all trials. Here's how to implement this:

Calculate Total Categories: Determine the total number of categories across all categorical variables to establish a fixed size for the concatenated array.

total_categories = sum(len(space.choices) for space in categorical_search_space.values())

Encode Categorical Parameters: Replace the ragged array encoding with a flat array approach, ensuring each c has the same shape.

for t in solution_trials[:popsize]:
    assert t.value is not None, "completed trials must have a value"
    # Convert numerical parameters
    x = trans.transform({k: t.params[k] for k in numerical_search_space.keys()})
    
    # Initialize a flat array for all categorical parameters
    c = np.zeros(total_categories)
    offset = 0
    for k in categorical_search_space.keys():
        choices = categorical_search_space[k].choices
        v = t.params.get(k)
        if v is not None:
            index = choices.index(v)
            c[offset + index] = 1
        offset += len(choices)
    
    y = t.value if study.direction == StudyDirection.MINIMIZE else -t.value
    solutions.append(((x, c), y))
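
The offset-based encoding in the loop above can be tried in isolation. The following is a self-contained sketch under stated assumptions: `encode_flat` and `choices_by_name` are hypothetical names that stand in for the sampler's search-space objects, and plain tuples replace Optuna's distribution classes.

```python
import numpy as np


def encode_flat(params, choices_by_name):
    """Concatenate one-hot encodings of all categorical parameters (hypothetical helper)."""
    total_categories = sum(len(choices) for choices in choices_by_name.values())
    c = np.zeros(total_categories)
    offset = 0
    for name, choices in choices_by_name.items():
        value = params.get(name)
        if value is not None:
            # Mark the chosen category inside this parameter's slice.
            c[offset + choices.index(value)] = 1.0
        # Always advance, so every parameter keeps a fixed position.
        offset += len(choices)
    return c


# Stand-in search space: parameter name -> tuple of choices.
choices_by_name = {"cat1": ("A", "B", "C"), "cat2": ("X", "Y", "Z", "W")}
encoded = encode_flat({"cat1": "A", "cat2": "Y"}, choices_by_name)
print(encoded)  # [1. 0. 0. 0. 1. 0. 0.]
```

Because the offset advances even when a parameter is missing from a trial, every trial yields a vector of the same length with each parameter in the same slice.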

Impact of the Change:

Consistency: Ensures that all categorical parameter arrays have a consistent shape, eliminating broadcasting issues.
Compatibility: Aligns with the CatCMA optimizer's expectations, allowing seamless integration of categorical parameters.

Robustness: Prevents runtime errors related to array shape mismatches, enhancing the reliability of the optimization process.

To illustrate:
If you have:
cat1 = ['A', 'B', 'C'] # 3 choices
cat2 = ['X', 'Y', 'Z', 'W'] # 4 choices

Original method would try to create:

[[1, 0, 0],       # shape (3,)
[0, 1, 0, 0]]     # shape (4,)

→ Results in shape mismatch

New method creates:

[[1, 0, 0, 0],    # padded to max_length
[0, 1, 0, 0]]     # uniform shape (2, 4)

→ Works correctly with the optimizer
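
The illustration above pads each one-hot row to the longest choice list, which gives a two-dimensional array rather than the single concatenated vector built in the loop earlier. A minimal sketch of that padded layout follows; `encode_padded` and `choices_by_name` are hypothetical names, and which of the two layouts the optimizer requires is not established here.

```python
import numpy as np


def encode_padded(params, choices_by_name):
    """One row per categorical parameter, zero-padded to the longest choice list."""
    max_length = max(len(choices) for choices in choices_by_name.values())
    c = np.zeros((len(choices_by_name), max_length))
    for row, (name, choices) in enumerate(choices_by_name.items()):
        c[row, choices.index(params[name])] = 1.0
    return c


# Same example as in the illustration: cat1 = 'A', cat2 = 'Y'.
choices_by_name = {"cat1": ("A", "B", "C"), "cat2": ("X", "Y", "Z", "W")}
padded = encode_padded({"cat1": "A", "cat2": "Y"}, choices_by_name)
print(padded.shape)  # (2, 4)
```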

Contributor Agreements

  • I agree to the contributor agreements.

@Furkan-rgb
Contributor Author

@HideakiImamura could you please have a look?

@HideakiImamura
Member

Thanks for the PR. It seems that the linter CI fails. Could you apply the linters as suggested here?

@HideakiImamura HideakiImamura self-assigned this Nov 5, 2024
@HideakiImamura HideakiImamura added the bug Something isn't working label Nov 5, 2024
@nabenabe0928 nabenabe0928 changed the title Fix shape mismatch error in CatCMA sampler when handling categorical … Fix shape mismatch error in CatCMASampler for categorical problems Nov 7, 2024
Member

@HideakiImamura HideakiImamura left a comment

LGTM.

@HideakiImamura HideakiImamura merged commit 2b139de into optuna:main Nov 11, 2024
4 checks passed