Implement schema-driven type coercion for profile data to handle schema evolution without bulk updates #140

Copilot · 2025-12-19T07:34:51Z

Purpose

When identity attribute schemas sync from WSO2 Identity Server to CDS, schema changes (type modifications, attribute removal, new attributes) previously required expensive bulk profile updates. This becomes operationally infeasible at scale.

Goals

Enable schema evolution without bulk profile rewrites by applying type coercion at read time based on the current schema definition.

Approach

Core Implementation

Type Coercion Layer (internal/system/utils/type_coercion.go)

CoerceValueToType() - Runtime type conversion supporting: string, integer, decimal, boolean, datetime, complex
ApplySchemaToAttributes() - Batch coercion across attribute maps
Multi-valued (array) attribute support with element-wise coercion
Case-insensitive boolean parsing (true/TRUE/True → bool)

Profile Service Integration (internal/profile/service/profile_service.go)

buildSchemaMap() - Converts ProfileSchema to O(1) lookup structure
applySchemaToProfile() - Applies coercion to identity_attributes, traits, application_data
Integrated into: GetProfile(), GetAllProfiles(), GetAllProfilesWithFilter()
Schema fetched once per request, reused for batch operations

Coercion Rules

Conversion	Behavior
string → integer	Parse numeric strings, omit non-numeric
string → boolean	Parse "true"/"false"/"1"/"0"/"yes"/"no" (case-insensitive)
integer → string	Direct conversion
incompatible	Attribute omitted (graceful degradation)

Example

// Schema v1: age stored as string
profile := CreateProfile({
    IdentityAttributes: {"age": "25"}
})

// Schema v2: age changed to integer
UpdateSchemaAttribute({
    AttributeName: "identity_attributes.age",
    ValueType:     "integer",
})

// Read automatically coerces
retrieved := GetProfile(profileId)
// retrieved.IdentityAttributes["age"] == 25 (int)

Design Rationale

Read-time coercion: No bulk updates, zero downtime
Backward compatible: Removed attributes preserved in profiles
Fail-safe: Original data unchanged, failed coercions → attribute omission
Performance: Schema cached per request, O(attributes) not O(profiles)

User stories

As a platform operator, I can change attribute types in the schema without triggering bulk profile migrations
As a developer, I can evolve schemas knowing old profiles will adapt automatically at read time
As a system administrator, I can remove deprecated attributes from schemas while preserving historical data

Release note

Schema-driven type coercion enables seamless schema evolution. Profile attribute values are now automatically coerced to match the current schema at read time, eliminating the need for bulk profile updates when schemas change.

Documentation

N/A - Internal implementation pattern. No user-facing API changes. Existing profile CRUD operations work identically; coercion is transparent.

Training

N/A - No changes to user workflows or APIs.

Certification

N/A - Internal implementation enhancement with no changes to product behavior or features requiring certification content.

Marketing

N/A - Infrastructure improvement with no customer-facing feature changes.

Automation tests

Unit tests

Coverage: 53 tests across all type conversions
- String/Integer/Decimal/Boolean/DateTime/Complex coercion (47 tests)
- Multi-valued attribute handling (6 tests)
- Schema application logic (4 tests)
Results: 100% pass rate

Integration tests

Test suite: test/integration/schema_evolution_test.go
Scenarios:
- String → Integer evolution
- Integer → String evolution
- String → Boolean evolution (case-insensitive)
- Attribute removal (backward compatibility)
- Multi-valued type changes (["1","2","3"] → [1,2,3])
Results: All existing repository tests passing (profile, schema, unification, consent)

Security checks

Followed secure coding standards in http://wso2.com/technical-reports/wso2-secure-engineering-guidelines? yes
Ran FindSecurityBugs plugin and verified report? N/A (Go codebase)
Confirmed that this PR doesn't commit any keys, passwords, tokens, usernames, or other secrets? yes

Samples

N/A - Internal implementation with no sample code required.

Related PRs

None

Migrations (if applicable)

N/A - No database schema changes. Existing profiles work immediately with new schema versions.

Test environment

Runtime: Go 1.24+
Database: PostgreSQL (via testcontainers)
OS: Linux (GitHub Actions runner)
Linter: golangci-lint v1.64.8 (zero warnings)

Learning

Pattern: Read-time schema projection vs write-time normalization
Trade-off: Small read overhead vs bulk write cost - favoring read-time for schema evolution use case
Reference: Inspired by document database schema-on-read patterns (MongoDB flexible schemas, DynamoDB attribute typing)

Original prompt

This section details on the original issue you should resolve

<issue_title>Data Modification when schema updates</issue_title>
<issue_description>Currently, we are working on synchronizing identity attribute schema updates from IS into CDS. However, these schema changes also need to be reflected in existing profile data. Updating attributes across every stored profile individually would be operationally expensive and difficult to manage at scale.

What would be a suitable approach to handle this efficiently? One option could be to store profile attribute values generically (for example, as strings) within JSON, and then rely on the schema definition at read, update, or render time to interpret and enforce the correct data type. Would this be a reasonable and scalable design choice?</issue_description>

<agent_instructions>We are syncing identity attribute schema updates from WSO2 Identity Server (IS) into the Customer Data Service (CDS).

The problem: when the schema changes (attribute added, type changed, or removed), existing profile data stored in CDS does not automatically reflect these updates. Updating every stored profile eagerly is not scalable.

Please propose and implement a schema-driven approach where:

Profile data is stored in a flexible JSON structure without hard-coupled data types.

The authoritative data type and constraints come from the profile schema, not the stored value.

Type validation, casting, and enforcement happen at read, update, and render time using the latest schema.

Schema changes do NOT require bulk profile rewrites.

Backward compatibility is maintained for existing profiles.

Deliverables:

Suggested data model changes (if any).

Schema-driven validation and type coercion logic.

Handling of schema evolution cases (type change, attribute removal, new attribute).

Clear comments explaining design decisions.

Constraints:

Avoid expensive migration jobs or per-profile updates.

Assume profiles are stored in JSON/JSONB.

Keep the solution extensible for future schema attributes.</agent_instructions>

Comments on the Issue (you are @copilot in this section)

Fixes Data Modification when schema updates #139

💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.

coderabbitai · 2025-12-19T07:34:58Z

Important

Review skipped

Bot user detected.

To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

CLAassistant · 2025-12-19T07:35:07Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

Co-authored-by: VivekVinushanth <[email protected]>

…n tests Co-authored-by: VivekVinushanth <[email protected]>

Co-authored-by: VivekVinushanth <[email protected]>

…ean parsing and import Co-authored-by: VivekVinushanth <[email protected]>

Co-authored-by: VivekVinushanth <[email protected]>

VivekVinushanth · 2025-12-19T07:49:55Z

What impact it will have on unification?

Initial plan

cd0c6fb

Copilot AI assigned Copilot and VivekVinushanth Dec 19, 2025

Copilot started work on behalf of VivekVinushanth December 19, 2025 07:35 View session

Copilot AI and others added 5 commits December 19, 2025 07:39

Add type coercion utilities with comprehensive tests

df67928

Co-authored-by: VivekVinushanth <[email protected]>

Integrate schema-driven coercion into profile service with integratio…

a894348

…n tests Co-authored-by: VivekVinushanth <[email protected]>

Fix linter issues and remove accidentally committed binary

6aca554

Co-authored-by: VivekVinushanth <[email protected]>

Fix code review issues: add strings.ToLower for case-insensitive bool…

1d8d510

…ean parsing and import Co-authored-by: VivekVinushanth <[email protected]>

Remove server binary and add to gitignore

3a16c0b

Co-authored-by: VivekVinushanth <[email protected]>

Copilot AI changed the title ~~[WIP] Implement data modification for schema updates~~ Implement schema-driven type coercion for profile data to handle schema evolution without bulk updates Dec 19, 2025

Copilot AI requested a review from VivekVinushanth December 19, 2025 07:51

Copilot finished work on behalf of VivekVinushanth December 19, 2025 07:51

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Implement schema-driven type coercion for profile data to handle schema evolution without bulk updates #140

Implement schema-driven type coercion for profile data to handle schema evolution without bulk updates #140

Uh oh!

Copilot AI commented Dec 19, 2025 •

edited

Loading

Uh oh!

coderabbitai bot commented Dec 19, 2025

Review skipped

Uh oh!

CLAassistant commented Dec 19, 2025

Uh oh!

VivekVinushanth commented Dec 19, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Implement schema-driven type coercion for profile data to handle schema evolution without bulk updates #140

Are you sure you want to change the base?

Implement schema-driven type coercion for profile data to handle schema evolution without bulk updates #140

Uh oh!

Conversation

Copilot AI commented Dec 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Purpose

Goals

Approach

Core Implementation

Coercion Rules

Example

Design Rationale

User stories

Release note

Documentation

Training

Certification

Marketing

Automation tests

Unit tests

Integration tests

Security checks

Samples

Related PRs

Migrations (if applicable)

Test environment

Learning

Comments on the Issue (you are @copilot in this section)

Uh oh!

coderabbitai bot commented Dec 19, 2025

Review skipped

Uh oh!

CLAassistant commented Dec 19, 2025

Uh oh!

VivekVinushanth commented Dec 19, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Copilot AI commented Dec 19, 2025 •

edited

Loading