SY-3797: Oracle Prep 12 - Refactor Tasks and Devices to Move JSON Instead of Strings #2009

Merged
emilbon99 merged 26 commits into rc from sy-3797-use-proper-json-for-tasks
Feb 20, 2026

Conversation

@emilbon99

@emilbon99 emilbon99 commented Feb 18, 2026

Contributor

Issue Pull Request

Linear Issue

SY-3797

Description

Basic Readiness

  • I have performed a self-review of my code.
  • I have added relevant, automated tests to cover the changes.
  • I have updated documentation to reflect the changes.

Greptile Summary

This PR refactors task configs and device properties throughout the entire stack to use native JSON objects instead of JSON-encoded strings, improving type safety and API ergonomics.

Key Changes:

  • Modified protobuf schemas to use google.protobuf.Struct instead of string fields for device.properties and task.config
  • Implemented binary.MsgpackEncodedJSON type in Go for backwards-compatible msgpack decoding (handles both old string format and new map format)
  • Added C++ utilities (x/cpp/json/struct.h) for converting between nlohmann::json and protobuf Struct
  • Updated all client libraries (Python, TypeScript, C++) to work with native JSON objects/dicts
  • Removed JSON string encoding/decoding throughout client code
  • Updated all tests across the codebase to use native JSON objects
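The backwards-compatible decoding described above (accept either the old JSON-encoded string or the new native map) can be sketched as follows. This is a minimal Python illustration of the idea only, not the Go `binary.MsgpackEncodedJSON` implementation; the function name `decode_json_field` and the treatment of empty values as an empty object are assumptions made for the example.

```python
import json
from typing import Any


def decode_json_field(raw: Any) -> dict[str, Any]:
    """Normalize a stored field to a dict, accepting legacy JSON strings.

    Hypothetical helper: the name and the empty-value handling are
    assumptions for illustration, not the PR's actual behavior.
    """
    if isinstance(raw, dict):
        # New format: already a native map, so use it directly.
        return raw
    if raw is None or raw in ("", b""):
        # Assumption: an empty legacy value means "no properties".
        return {}
    if isinstance(raw, (str, bytes)):
        # Old format: a JSON-encoded string that must be parsed first.
        parsed = json.loads(raw)
        if not isinstance(parsed, dict):
            raise ValueError("expected a JSON object at the top level")
        return parsed
    raise TypeError(f"unsupported stored type: {type(raw).__name__}")
```

Both formats then converge on the same in-memory representation, which is the property the storage layer relies on when reading existing data.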

Impact:

  • Breaking API change for clients using string-based configs
  • Backwards compatible at the storage layer via MsgpackEncodedJSON
  • Cleaner, more type-safe API surface
  • Simplified client code by removing manual JSON string handling

Confidence Score: 4/5

  • This PR is generally safe but requires thorough integration testing due to breaking API changes
  • The refactoring is well-structured with backwards compatibility at the storage layer, comprehensive test updates, and consistent changes across all language clients. However, the score is 4 (not 5) because: 1) this is a breaking API change that requires coordinated deployment, 2) the backwards compatibility relies on the MsgpackEncodedJSON decoder working correctly with existing data, and 3) extensive integration testing is needed to ensure all client-server interactions work correctly across the Python, TypeScript, and C++ clients
  • Pay close attention to x/go/binary/codec.go (backwards compatibility logic), core/pkg/api/grpc/device/handler.go and core/pkg/api/grpc/task/handler.go (translation layer), and integration test results

Important Files Changed

| Filename | Overview |
|---|---|
| core/pkg/api/grpc/v1/device.proto | Changed properties field from string to google.protobuf.Struct for proper JSON handling |
| core/pkg/api/grpc/v1/task.proto | Changed config field from string to google.protobuf.Struct for proper JSON handling |
| x/go/binary/codec.go | Added MsgpackEncodedJSON type for backwards-compatible msgpack decoding of JSON strings |
| core/pkg/service/device/device.go | Changed Properties field from string to binary.MsgpackEncodedJSON for native JSON support |
| core/pkg/service/task/task.go | Changed Config field from string to binary.MsgpackEncodedJSON for native JSON support |
| core/pkg/api/grpc/device/handler.go | Updated translation functions to convert between map[string]any and structpb.Struct |
| core/pkg/api/grpc/task/handler.go | Updated translation functions to convert between map[string]any and structpb.Struct |
| x/cpp/json/struct.h | New utility functions for converting between nlohmann::json and google.protobuf.Struct |
| client/cpp/device/device.cpp | Updated proto conversion to use x::json::from_struct and x::json::to_struct utilities |
| client/cpp/task/task.cpp | Updated proto conversion to use x::json::from_struct and x::json::to_struct utilities |
| driver/common/scan_task.h | Updated merge_device_properties to use x::json::json instead of JSON strings |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Client: Python/TS/C++] -->|JSON Objects| B[gRPC Layer]
    B -->|google.protobuf.Struct| C[Handler Translation]
    C -->|map string any| D[Go Service Layer]
    D -->|binary.MsgpackEncodedJSON| E[Msgpack Storage]
    E -->|Backwards Compatible| F{Old Format?}
    F -->|JSON String| G[Parse String to Map]
    F -->|Map| H[Use Directly]
    G --> I[Unified Map Representation]
    H --> I
    I -->|Reverse Flow| D
    D -->|map string any| C
    C -->|google.protobuf.Struct| B
    B -->|JSON Objects| A
    
    style E fill:#e1f5ff
    style I fill:#d4edda
    style F fill:#fff3cd

Last reviewed commit: 99f760b

@greptile-apps greptile-apps Bot left a comment

94 files reviewed, no comments

@codecov

codecov Bot commented Feb 18, 2026

Codecov Report

❌ Patch coverage is 77.46479% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 53.72%. Comparing base (eb70d8b) to head (3628b06).
⚠️ Report is 2 commits behind head on rc.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| x/go/binary/codec.go | 77.41% | 4 Missing and 3 partials ⚠️ |
| client/py/synnax/ni/types.py | 0.00% | 5 Missing ⚠️ |
| client/py/synnax/opcua/types.py | 20.00% | 4 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##               rc    #2009      +/-   ##
==========================================
+ Coverage   53.69%   53.72%   +0.03%     
==========================================
  Files        2464     2464              
  Lines      151409   151474      +65     
  Branches     7096     7116      +20     
==========================================
+ Hits        81298    81381      +83     
+ Misses      67645    67625      -20     
- Partials     2466     2468       +2     
| Flag | Coverage Δ |
|---|---|
| alamos-go | 55.25% <ø> (ø) |
| arc-go | 74.50% <ø> (ø) |
| aspen | 69.30% <ø> (-0.13%) ⬇️ |
| cesium | 83.06% <ø> (-0.11%) ⬇️ |
| client-py | 86.51% <72.72%> (-0.01%) ⬇️ |
| client-ts | 93.09% <100.00%> (-0.01%) ⬇️ |
| console | 21.46% <ø> (ø) |
| core | 66.89% <100.00%> (ø) |
| freighter-go | 63.34% <ø> (ø) |
| freighter-integration | 1.50% <ø> (ø) |
| freighter-py | 79.47% <ø> (ø) |
| freighter-ts | 74.58% <ø> (ø) |
| oracle | 55.76% <ø> (ø) |
| pluto | 48.95% <ø> (+0.07%) ⬆️ |
| x-go | 78.53% <77.41%> (-0.01%) ⬇️ |

Flags with carried forward coverage won't be shown. Click here to find out more.

@pjdotson pjdotson left a comment

Contributor

Should we force the JSON to be a JSON object or should it be allowed to be any type of JSON (string, null, number, boolean, array, or object)? While I don't see us internally using anything other than objects, it seems like it might be better to allow anything to be satisfied for a more expansive API, and preventing some sort of migrations / code changes if we, say, want to allow customers to store task configurations as an array. I would push

Comment thread core/pkg/service/arc/runtime/factory.go Outdated
f.setConfigStatus(ctx, t, xstatus.VariantError, err.Error())
return nil, true, err
}
if err := json.Unmarshal(cfgJSON, &cfg); err != nil {

Contributor

so we have to marshal and unmarshal the JSON here? will we have to do this every time we write a new Go driver? Why isn't the config just already usable?

Contributor

like why don't we try to type assert a t.Config as a TaskConfig instead so we don't have to marshal / unmarshal?

Contributor Author

I think the reality is that we're going to store this as some sort of generic protobuf type in the future (pb.Any or pb.Struct), and then we'll use oracle generated code to convert it into a concrete type. If we make the field any, we still can't do a type assertion, because the json/msgpack codec will automatically decode the interface into the underlying type of map[string]any. The long term solution here is to make this a generic and have proper pb-based encode/decode mechanisms, but that is out of scope here.

The objective with this PR was to standardize JSON representation so we could have a consistent oracle type.

So yes, this is necessary for now but will be replaced in the future.

Contributor Author

I added an unmarshal helper that abstracts this away from the caller for now.
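The thread above is about Go, where the decoded config arrives as a generic `map[string]any` and has to be converted into a concrete struct. As a rough Python analog of such a helper (not the helper added in this PR), the sketch below converts a generic dict into a typed dataclass. `ReadTaskConfig` and `parse_config` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, fields
from typing import Any, TypeVar

T = TypeVar("T")


@dataclass
class ReadTaskConfig:
    """Hypothetical config type, for illustration only."""

    device: str
    sample_rate: float
    data_saving: bool = True


def parse_config(raw: dict[str, Any], cls: type[T]) -> T:
    """Build a typed config from a generic dict, rejecting unknown keys.

    Note: dataclasses do not validate field types at runtime, so this
    only checks key names, not value types.
    """
    known = {f.name for f in fields(cls)}
    unknown = set(raw) - known
    if unknown:
        raise ValueError(f"unknown config keys: {sorted(unknown)}")
    # Missing required keys surface as a TypeError from the constructor.
    return cls(**raw)
```

Centralizing the conversion in one helper keeps callers from repeating the generic-to-typed step in every driver, which is the concern raised in the review.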

  model: str = ""
  configured: bool = False
- properties: str = ""
+ properties: dict[str, Any] = {}

Contributor

is there not a better JSON type or something than having to use Any?

Contributor Author

I think this is the narrowest reasonable type for now. In the future, I think this should use a properly validated generic structure.

@emilbon99

Contributor Author

> Should we force the JSON to be a JSON object or should it be allowed to be any type of JSON (string, null, number, boolean, array, or object)? While I don't see us internally using anything other than objects, it seems like it might be better to allow anything to be satisfied for a more expansive API, and preventing some sort of migrations / code changes if we, say, want to allow customers to store task configurations as an array. I would push

Looks like this comment got cut off

I don't see any world in which using a non-object format is of considerable value for the end user. There's definitely no reasonable case where you would use a primitive, and I think that the harm in having an array based config is negligible.

By enforcing an object we dramatically simplify the set of types we need to support in typescript, Python, and C++.

@emilbon99 emilbon99 requested a review from pjdotson February 20, 2026 04:06
Base automatically changed from sy-3798-improve-type-safety-of-device-queries to rc February 20, 2026 04:16
@pjdotson

Contributor

> > Should we force the JSON to be a JSON object or should it be allowed to be any type of JSON (string, null, number, boolean, array, or object)? While I don't see us internally using anything other than objects, it seems like it might be better to allow anything to be satisfied for a more expansive API, and preventing some sort of migrations / code changes if we, say, want to allow customers to store task configurations as an array. I would push
>
> Looks like this comment got cut off
>
> I don't see any world in which using a non-object format is of considerable value for the end user. There's definitely no reasonable case where you would use a primitive, and I think that the harm in having an array based config is negligible.
>
> By enforcing an object we dramatically simplify the set of types we need to support in typescript, Python, and C++.

Okay. I agree that primitive / array-based configs aren't that useful. However, I don't think it is the case that it really simplifies the typing. I.e., the standard JSON type in Python is this (which we should be using instead of dict[str, Any] if we can):

from typing import TypeAlias

JSON: TypeAlias = (
    dict[str, "JSON"]
    | list["JSON"]
    | str
    | int
    | float
    | bool
    | None
)

So we would then want the types in our task config to be dict[str, JSON], where JSON is defined like this. Also, the C++ client library type should maybe be different: x::json::json allows for primitives / arrays as well.

All of which is to say, I think it would actually be easier to support any JSON value with good typing, and right now the PR does not have fully good typing in the client libraries.
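To make the two positions in this thread concrete, here is a minimal sketch of a recursive JSON alias together with runtime checks for "any JSON value" versus "top-level object only". The names `JSONObject`, `is_json`, and `is_json_object` are assumptions for illustration and are not part of the client library.

```python
from typing import Any, Union

# Recursive alias for any JSON value, mirroring the alias quoted above.
JSON = Union[dict[str, "JSON"], list["JSON"], str, int, float, bool, None]

# Narrower alias: the top level must be an object, but nested values
# may be any JSON value.
JSONObject = dict[str, JSON]


def is_json(value: Any) -> bool:
    """Return True if value is built only from JSON-compatible types."""
    if value is None or isinstance(value, (str, int, float, bool)):
        return True
    if isinstance(value, list):
        return all(is_json(v) for v in value)
    if isinstance(value, dict):
        return all(isinstance(k, str) and is_json(v) for k, v in value.items())
    return False


def is_json_object(value: Any) -> bool:
    """Return True if value is a JSON object at the top level."""
    return isinstance(value, dict) and is_json(value)
```

With this split, `dict[str, JSON]` keeps the object-only constraint at the top level while still typing nested values more precisely than `dict[str, Any]`.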

@pjdotson pjdotson left a comment

Contributor

see comment

@emilbon99 emilbon99 merged commit c0dfd77 into rc Feb 20, 2026
61 of 62 checks passed
@emilbon99 emilbon99 deleted the sy-3797-use-proper-json-for-tasks branch February 20, 2026 22:33