proposal: object.union_merge #5144

Open
@charlesdaniels

What is the underlying problem you're trying to solve?

The problem is merging objects whose collection-valued fields (arrays and sets) should themselves be merged, rather than having one value overwrite the other. For example:

obj1 := {"foo": [1,2,3]}
obj2 := {"foo": [4,5,6]}
obj3 := {"bar": {"a", "b", "c"}}
obj4 := {"bar": {"x", "y", "z"}}
obj5 := {"baz": 7}
obj6 := {"baz": [7,8,9]}

obj1_2 := object.union_merge(obj1, obj2)
obj2_1 := object.union_merge(obj2, obj1)

# all should be true
obj1_2 == {"foo": [1,2,3,4,5,6]}
obj2_1 == {"foo": [4,5,6,1,2,3]}

obj3_4 := object.union_merge(obj3, obj4)
obj4_3 := object.union_merge(obj4, obj3)

# all should be true
obj3_4 == obj4_3
obj3_4 == {"bar": {"a", "b", "c", "x", "y", "z"}}

obj5_6 := object.union_merge(obj5, obj6)
obj6_5 := object.union_merge(obj6, obj5)

# all should be true
obj5_6 == {"baz": [7,8,9]}
obj6_5 == {"baz": 7}

A reasonable argument could be made that arrays should be concatenated in the reverse of the order shown above. It doesn't really matter, so long as the order is well defined.
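To pin down the behavior shown in the examples, here is a rough reference model in Python rather than Rego. It is only a sketch of the proposed semantics, not OPA code: Python dicts, lists, and sets stand in for Rego objects, arrays, and sets. Recursing into nested objects is an assumption (it mirrors what object.union() does); the examples above only cover arrays, sets, and mismatched types.

```python
def union_merge(left, right):
    """Model of the proposed object.union_merge(left, right) semantics."""
    merged = dict(left)
    for key, rvalue in right.items():
        if key not in merged:
            merged[key] = rvalue
            continue
        lvalue = merged[key]
        if isinstance(lvalue, dict) and isinstance(rvalue, dict):
            # Assumption: nested objects are merged recursively.
            merged[key] = union_merge(lvalue, rvalue)
        elif isinstance(lvalue, list) and isinstance(rvalue, list):
            # Arrays: left elements first, then right elements.
            merged[key] = lvalue + rvalue
        elif isinstance(lvalue, (set, frozenset)) and isinstance(rvalue, (set, frozenset)):
            # Sets: plain union, so argument order does not matter.
            merged[key] = lvalue | rvalue
        else:
            # Mismatched or scalar types: the right-hand value wins.
            merged[key] = rvalue
    return merged


print(union_merge({"foo": [1, 2, 3]}, {"foo": [4, 5, 6]}))  # {'foo': [1, 2, 3, 4, 5, 6]}
print(union_merge({"baz": [7, 8, 9]}, {"baz": 7}))          # {'baz': 7}
```

Reversing the array order, as discussed above, would only change the `lvalue + rvalue` line.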

Describe the ideal solution

In a perfect world, Rego would have first-class functions, and we could build a single object.union() that accepts a callback letting the user specify the conflict resolution behavior.
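Since Rego has no first-class functions, the callback idea can only be illustrated outside Rego. A minimal Python sketch, where `union_with`, `resolve`, and `concat_arrays` are hypothetical names made up for this illustration:

```python
def union_with(left, right, resolve):
    """Union two objects, calling resolve(lvalue, rvalue) on conflicting keys."""
    merged = dict(left)
    for key, rvalue in right.items():
        merged[key] = resolve(merged[key], rvalue) if key in merged else rvalue
    return merged


def concat_arrays(lvalue, rvalue):
    """Example policy: concatenate two arrays, otherwise let the right value win."""
    if isinstance(lvalue, list) and isinstance(rvalue, list):
        return lvalue + rvalue
    return rvalue


print(union_with({"foo": [1, 2, 3]}, {"foo": [4, 5, 6]}, concat_arrays))
```

Each conflict policy would then be an ordinary function, with no need for a separate builtin per policy.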

I think the most practical solution is to implement the object.union_merge() builtin demonstrated above. It should be paired with an object.union_merge_n(), for consistency with object.union() and its n-ary counterpart object.union_n().
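One plausible reading of object.union_merge_n() is a left-to-right fold of the two-argument merge over an array of objects. The Python sketch below assumes that reading; `merge_arrays` is a hypothetical, simplified stand-in for the real two-argument merge (it only handles arrays).

```python
from functools import reduce


def union_merge_n(objects, merge):
    """Fold a two-argument merge over objects, left to right, from an empty object."""
    return reduce(merge, objects, {})


def merge_arrays(left, right):
    """Stand-in merge: concatenate arrays on shared keys, otherwise right wins."""
    merged = dict(left)
    for key, rvalue in right.items():
        lvalue = merged.get(key)
        both_arrays = isinstance(lvalue, list) and isinstance(rvalue, list)
        merged[key] = lvalue + rvalue if both_arrays else rvalue
    return merged


print(union_merge_n([{"foo": [1]}, {"foo": [2]}, {"foo": [3]}], merge_arrays))  # {'foo': [1, 2, 3]}
```

Because arrays are concatenated in argument order, the order of the input array would be significant for the n-ary variant as well.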

We could also consider some kind of special merge function that takes a JSON object describing how conflicts should be resolved as an argument. For example, it could be a list of objects like the following:

[
    {"left": "string", "right": "string", "resolution": "left"},
    ...
    {"left": "set", "right": "set", "resolution": "union"},
    ...
]

This would raise some questions, though. What does it do if you ask it to union a string and an int? Or two integers? An array and a set?
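To make those questions concrete, here is a small Python sketch of how such a rule list might be applied to a single pair of conflicting values. Everything beyond the "left", "right", and "resolution" keys is an assumption for illustration: the type names other than "string" and "set", the choice to raise an error when no rule matches, and the choice to reject "union" for anything but two sets.

```python
def type_name(value):
    """Map a Python value to a Rego-style type name (hypothetical naming)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # must precede the number check: bool is an int subclass
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, (set, frozenset)):
        return "set"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"unsupported value: {value!r}")


def resolve(rules, lvalue, rvalue):
    """Apply the first rule whose left/right types match the two values."""
    pair = (type_name(lvalue), type_name(rvalue))
    for rule in rules:
        if (rule["left"], rule["right"]) != pair:
            continue
        how = rule["resolution"]
        if how == "left":
            return lvalue
        if how == "right":
            return rvalue
        if how == "union" and pair == ("set", "set"):
            return lvalue | rvalue
        # e.g. "union" of a string and a number has no obvious meaning.
        raise ValueError(f"resolution {how!r} is not defined for {pair}")
    # Assumption: an unmatched pair is an error rather than a silent default.
    raise LookupError(f"no rule for {pair}")


RULES = [
    {"left": "string", "right": "string", "resolution": "left"},
    {"left": "set", "right": "set", "resolution": "union"},
]

print(resolve(RULES, "old", "new"))  # old
```

Every type pair that is not listed needs some answer, which is the main source of complexity in this design.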

I think it might be best to let sleeping dogs lie and just address the problem at hand for now.

Describe a "Good Enough" solution

I have worked around this problem by writing Rego code that implements the desired merging behavior for just the specific objects I needed to merge. It's not a general-case solution, though, since it requires knowing the schema of the relevant objects in advance.

Additional Context

I ran into this problem while trying to merge some JSON objects representing role information for an RBAC use case. The desired behavior was to have a legacy system of record in read-only mode, and fold new role information into it at the OPA level by merging data pulled from a replacement system that was read-write. The intended functionality was that all existing role privileges should be maintained, but also allow for new ones to be added.


Edit: for the record, I am willing to do the work to implement this, but wanted to make sure everyone was on board before I started working on a PR.
