
[JSON] Support Edit Values for JSON operator #1132

Open
@ShihChun-H

Description

Issue Description

Current State

  • It is very difficult to manipulate JSON data with the JSON operator.

Proposed Change

JSON schema pseudo code

JsonOperator:
  Task: Edit values
  
  Input:
    data: 
      type: [object, array]
      description: Original data, which can be a JSON object or array of objects.
    updates: 
      type: array
      description: An array of objects specifying the values to be updated.
      items:
        type: object
        properties:
          field: 
            type: string
            description: The field in the original data whose value needs to be updated, supports nested paths if "supportDotNotation" is true.
          newValue: 
            type: any
            description: The new value that will replace the current value at the specified field.

#   supportDotNotation:
#     type: boolean
#     default: true
#     description: Determines whether to interpret the field as a path using dot notation. If false, field is treated as a literal key.
    conflictResolution:
      type: string
      enum: [create, skip, error]
      default: skip
      description: Defines how to handle cases where the field does not exist in the data. 
      
  Output:
    data:
      type: [object, array]
      description: The modified data with the specified values updated, in the same shape as the input (a JSON object or an array of objects).
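
The input contract above can be sketched as a small validation step. This is only an illustration, not the component's actual implementation: the names `Update`, `EditValuesInput`, and `parse_input` are hypothetical, and the choice to reject unknown conflictResolution values with a `ValueError` is an assumption.

```python
from dataclasses import dataclass
from typing import Any, List

# Allowed values and default come from the pseudo schema above.
ALLOWED_MODES = ("create", "skip", "error")


@dataclass
class Update:
    field: str       # path of the value to edit, e.g. "address.city"
    new_value: Any   # replacement value (any JSON type)


@dataclass
class EditValuesInput:
    data: Any                        # JSON object or array of objects
    updates: List[Update]
    conflict_resolution: str = "skip"


def parse_input(raw: dict) -> EditValuesInput:
    """Validate a raw input dict and convert it to typed structures."""
    data = raw["data"]
    if not isinstance(data, (dict, list)):
        raise ValueError("data must be an object or an array of objects")

    mode = raw.get("conflictResolution", "skip")
    if mode not in ALLOWED_MODES:
        raise ValueError(f"conflictResolution must be one of {ALLOWED_MODES}")

    updates = [Update(u["field"], u["newValue"]) for u in raw.get("updates", [])]
    return EditValuesInput(data=data, updates=updates, conflict_resolution=mode)
```

Validating the mode up front means a typo such as "skipp" fails immediately instead of silently behaving like the default.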

Edge Cases and Considerations:

1. Non-Existent Fields:

  • Updates whose field does not exist in the data are handled according to conflictResolution. With skip, the update is ignored, and a warning can be logged for such cases.

2. Type Mismatches:

  • Ensure that the new value is compatible with the existing data structure (e.g., if replacing an array with an object, ensure the change is intended).

Key Features:

  1. field: This specifies the exact location of the value to be edited. Nested paths use dot notation (e.g., address.city, or contacts.0.value for an array element); without dot notation support, the field is treated as a literal key.
  2. newValue: This is the new value that will replace the current value at the specified field.
  3. conflictResolution Parameter:
    a. create: If the field does not exist, the function creates it.
    b. skip: If the field does not exist, the function skips the update (default behavior).
    c. error: If the field does not exist, the function logs an error or returns an error.
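
The three features above can be sketched together in one function. This is a minimal sketch under stated assumptions, not the proposed component code: `edit_values` and `MissingFieldError` are hypothetical names, numeric path segments are assumed to be array indexes, "create" is assumed to build missing intermediate segments as objects, and a path that cannot be created (for example, an out-of-range array index) is assumed to be skipped.

```python
import copy
from typing import Any


class MissingFieldError(LookupError):
    """Raised in "error" mode when a field path does not resolve."""


def _has(node: Any, key: str) -> bool:
    """Report whether one path segment exists in a dict or list."""
    if isinstance(node, dict):
        return key in node
    if isinstance(node, list):
        return key.isdecimal() and int(key) < len(node)
    return False


def _child(node: Any, key: str) -> Any:
    return node[int(key)] if isinstance(node, list) else node[key]


def _assign(node: Any, key: str, value: Any) -> None:
    if isinstance(node, list):
        node[int(key)] = value
    else:
        node[key] = value


def edit_values(data: Any, updates: list, conflict_resolution: str = "skip") -> Any:
    result = copy.deepcopy(data)  # never mutate the caller's data
    for update in updates:
        keys = update["field"].split(".")
        node = result
        for depth, key in enumerate(keys):
            is_leaf = depth == len(keys) - 1
            if _has(node, key):
                if is_leaf:
                    _assign(node, key, update["newValue"])
                else:
                    node = _child(node, key)
            elif conflict_resolution == "create" and isinstance(node, dict):
                # Assumption: missing intermediate segments become objects.
                if is_leaf:
                    node[key] = update["newValue"]
                else:
                    node[key] = {}
                    node = node[key]
            elif conflict_resolution == "error":
                raise MissingFieldError(f"Field '{update['field']}' does not exist.")
            else:
                break  # "skip", or a path that cannot be created
    return result
```

Because the function works on a deep copy, an error raised midway leaves the caller's original data unchanged.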

Type Checking:
Before updating a value, the type of the existing field is checked to ensure compatibility.
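
The issue does not define what "compatible" means, so the following is only one possible reading: two values are compatible when they have the same JSON type. The helper names `json_type` and `is_compatible` are hypothetical.

```python
from typing import Any


def json_type(value: Any) -> str:
    """Map a decoded JSON value to its JSON type name."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"not a JSON value: {value!r}")


def is_compatible(current: Any, new_value: Any) -> bool:
    """Assumption: compatible means both values share one JSON type."""
    return json_type(current) == json_type(new_value)
```

Under this reading, replacing the number 30 with the string "31" would be flagged as a mismatch, as would replacing an array with an object.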

Example Usage:

Scenario: Input data as JSON object

// input
{
  "data": {
    "name": "John Doe",
    "age": 30,
    "address": {
      "street": "123 Main St",
      "city": "Anytown",
      "state": "CA"
    },
    "contacts": [
      {
        "type": "email",
        "value": "[email protected]"
      },
      {
        "type": "phone",
        "value": "555-1234"
      }
    ]
  },
  "updates": [
    {"field": "name", "newValue": "Jane Doe"},
    {"field": "address.city", "newValue": "Othertown"},
    {"field": "age", "newValue": 31},
    {"field": "address.zipcode", "newValue": "12345"},
    {"field": "contacts.0.value", "newValue": "[email protected]"}
  ],
//  "supportDotNotation": true,
  "conflictResolution": "skip"
}

Conflict Resolution Scenarios:

1. Skip (Default):

  • Field "name": "John Doe" is updated to "Jane Doe".
  • Field "address.city": "Anytown" is updated to "Othertown".
  • Field "age": 30 is updated to 31.
  • Field "address.zipcode": The "zipcode" field does not exist in "address". Since conflictResolution is "skip", the update is skipped, and "zipcode" is not added.
  • Field "contacts.0.value": "[email protected]" is updated to "[email protected]".

Final output:

{
  "name": "Jane Doe",
  "age": 31,
  "address": {
    "street": "123 Main St",
    "city": "Othertown",
    "state": "CA"
  },
  "contacts": [
    {
      "type": "email",
      "value": "[email protected]"
    },
    {
      "type": "phone",
      "value": "555-1234"
    }
  ]
}

2. Alternate Scenario with "conflictResolution": "create":
If you had set "conflictResolution": "create", the "zipcode" field would have been created in the "address" object, and the output would look like this:

Final output:

{
  "name": "Jane Doe",
  "age": 31,
  "address": {
    "street": "123 Main St",
    "city": "Othertown",
    "state": "CA",
    "zipcode": "12345"
  },
  "contacts": [
    {
      "type": "email",
      "value": "[email protected]"
    },
    {
      "type": "phone",
      "value": "555-1234"
    }
  ]
}

3. Error:
If conflictResolution is set to "error", the function raises (or logs) an error when it encounters a non-existent field during the update process. Processing stops at that field, and none of the updates after it are applied.

In this case, the function would return an error, and the output would likely look something like this:

{
  "error": "Field 'address.zipcode' does not exist."
}
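
One way to produce such an error payload is to find the first update whose path does not resolve. The sketch below is illustrative only: `path_exists` and `first_missing_error` are hypothetical names, numeric segments are assumed to be array indexes, and paths are checked against the original data rather than against partially updated data.

```python
from typing import Any, Optional


def path_exists(data: Any, field: str) -> bool:
    """Walk a dot-notation path and report whether every segment resolves."""
    node = data
    for key in field.split("."):
        if isinstance(node, dict) and key in node:
            node = node[key]
        elif isinstance(node, list) and key.isdecimal() and int(key) < len(node):
            node = node[int(key)]
        else:
            return False
    return True


def first_missing_error(data: Any, updates: list) -> Optional[dict]:
    """Return an error payload for the first missing field, or None."""
    for update in updates:
        if not path_exists(data, update["field"]):
            return {"error": f"Field '{update['field']}' does not exist."}
    return None
```

Updates are examined in list order, so the reported field is the one at which processing would stop.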

Scenario: Input Data as an Array of Objects
If the input data is an array of objects, the logic needs to be adapted to handle each object in the array individually. The schema and the function would process each object within the array according to the specified updates and conflictResolution rules.

Input Example:

{
  "data": [
    {
      "name": "John Doe",
      "age": 30,
      "address": {
        "street": "123 Main St",
        "city": "Anytown",
        "state": "CA"
      },
      "contacts": [
        {
          "type": "email",
          "value": "[email protected]"
        },
        {
          "type": "phone",
          "value": "555-1234"
        }
      ]
    },
    {
      "name": "Jane Smith",
      "age": 28,
      "address": {
        "street": "456 Oak St",
        "city": "Othertown",
        "state": "NY"
      },
      "contacts": [
        {
          "type": "email",
          "value": "[email protected]"
        }
      ]
    }
  ],
  "updates": [
    {"field": "name", "newValue": "Updated Name"},
    {"field": "address.city", "newValue": "New City"},
    {"field": "contacts.0.value", "newValue": "[email protected]"},
    {"field": "age", "newValue": 29}
  ],
//  "supportDotNotation": true,
  "conflictResolution": "create"
}

Explanation:

  1. Field "name": Updates "name" to "Updated Name" for each object in the array.
  2. Field "address.city": Updates the "city" in the "address" for each object to "New City".
  3. Field "contacts.0.value": Updates the first contact's value to "[email protected]" in each object.
  4. Field "age": Updates the "age" field to 29 for each object.

Output:

{
  "data": [
    {
      "name": "Updated Name",
      "age": 29,
      "address": {
        "street": "123 Main St",
        "city": "New City",
        "state": "CA"
      },
      "contacts": [
        {
          "type": "email",
          "value": "[email protected]"
        },
        {
          "type": "phone",
          "value": "555-1234"
        }
      ]
    },
    {
      "name": "Updated Name",
      "age": 29,
      "address": {
        "street": "456 Oak St",
        "city": "New City",
        "state": "NY"
      },
      "contacts": [
        {
          "type": "email",
          "value": "[email protected]"
        }
      ]
    }
  ]
}
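
The per-object handling described in this scenario can be sketched as a thin wrapper around a single-object editor. Both names are hypothetical: `edit_each` is the wrapper, and `overwrite_top_level` is a deliberately simplified stand-in editor that only sets top-level keys and ignores dot paths and conflictResolution.

```python
from typing import Any, Callable


def edit_each(data: Any, edit_one: Callable[[dict], dict]) -> Any:
    """Apply edit_one to a single object, or to every object in an array."""
    if isinstance(data, list):
        return [edit_one(item) for item in data]
    return edit_one(data)


def overwrite_top_level(updates: list) -> Callable[[dict], dict]:
    """Hypothetical stand-in editor: sets top-level keys only (no dot paths)."""
    def edit_one(obj: dict) -> dict:
        edited = dict(obj)  # shallow copy so the input object is not mutated
        for update in updates:
            edited[update["field"]] = update["newValue"]
        return edited
    return edit_one


people = [{"name": "John Doe", "age": 30}, {"name": "Jane Smith", "age": 28}]
updates = [
    {"field": "name", "newValue": "Updated Name"},
    {"field": "age", "newValue": 29},
]
result = edit_each(people, overwrite_top_level(updates))
```

Keeping the array handling separate from the single-object logic means the same editor serves both input shapes.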

Rules for the Component Hackathon

  • Each issue will only be assigned to one person/team at a time.
  • You can only work on one issue at a time.
  • To express interest in an issue, please comment on it and tag @kuroxx, allowing the Instill AI team to assign it to you.
  • Ensure you address all feedback and suggestions provided by the Instill AI team.
  • If no commits are made within five days, the issue may be reassigned to another contributor.
  • Join our Discord to engage in discussions and seek assistance in the #hackathon channel. For technical queries, you can tag @chuang8511.

Component Contribution Guideline | Documentation | Official Go Tutorial

