SchemaView - light schema access layer#9151

Draft
rschili wants to merge 37 commits into master from rschili/runtime-schema

Conversation

@rschili
Contributor

@rschili rschili commented Mar 29, 2026

imodel-native: iTwin/imodel-native#1369

Some diagrams refer to this as "Runtime Schemas" which was my working title. I didn't update all the pictures, just be aware. The final name we're going with is SchemaView.

Motivation

Yet another ECSchema layer for TypeScript, so let me explain:

  • ecschema-metadata (SchemaContext) is the full-fidelity schema toolkit. It models every detail of the EC specification, but loading it is expensive: one synchronous RPC per schema, a large memory footprint (~150-200 MB for a large iModel), and it blocks the backend event loop during loading, which is the primary concern here.
  • Incremental schema loading was implemented as a lighter alternative but remains incomplete and does not cover all use cases.
  • ECSQL against ECDbMeta works for targeted queries but is awkward for traversal patterns (e.g. walking all inherited properties of a class) and does not cache on the TS side. For one-shot questions this is perfectly viable!
  • The older backend metadata layer was discontinued. It pulled single classes; this approach is somewhat similar, but loads one bigger chunk at once.

This PR introduces an optimized schema bundle which is transported via a single async call and parsed into an in-memory cache that provides fast synchronous access. It is lossy by design - it drops what runtime consumers do not need (units, formats, phenomena, full custom attribute instances) to keep the payload small and parsing fast.
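
To illustrate why an in-memory cache suits traversal patterns, here is a minimal sketch of a synchronous "is class A of class B" check and an inherited-property walk. The `ClassInfo` shape, the function names, and the `Demo.*` classes are hypothetical and are not the SchemaView API from this PR; the sketch also assumes a single base class per class and ignores mixins.

```typescript
// Hypothetical, simplified shapes; NOT the SchemaView API from this PR.
interface ClassInfo {
  name: string; // qualified name, e.g. "Demo.Pipe"
  baseClass?: string; // qualified name of the single base class, if any
  properties: string[]; // names of properties declared directly on this class
}

type ClassIndex = Map<string, ClassInfo>;

// Returns true if `className` is `baseName` or derives from it.
// Assumes the base-class chain contains no cycles.
function isOfClass(index: ClassIndex, className: string, baseName: string): boolean {
  let current = index.get(className);
  while (current !== undefined) {
    if (current.name === baseName) return true;
    current = current.baseClass !== undefined ? index.get(current.baseClass) : undefined;
  }
  return false;
}

// Collects own and inherited property names, base-most class first.
function allProperties(index: ClassIndex, className: string): string[] {
  const chain: ClassInfo[] = [];
  let current = index.get(className);
  while (current !== undefined) {
    chain.unshift(current);
    current = current.baseClass !== undefined ? index.get(current.baseClass) : undefined;
  }
  const result: string[] = [];
  for (const cls of chain) {
    for (const prop of cls.properties) result.push(prop);
  }
  return result;
}

// Illustrative data only.
const demoIndex: ClassIndex = new Map<string, ClassInfo>([
  ["Demo.Base", { name: "Demo.Base", properties: ["Id"] }],
  ["Demo.Pipe", { name: "Demo.Pipe", baseClass: "Demo.Base", properties: ["Length"] }],
]);

console.log(isOfClass(demoIndex, "Demo.Pipe", "Demo.Base"));
console.log(allProperties(demoIndex, "Demo.Pipe"));
```

Once the index is in memory, both calls are plain map lookups with no round trip, which is the access pattern the bundle is meant to enable.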

Why one single bundle?
I did extensive testing. Of course it could be broken into smaller chunks or lazy loaded, but we have done all of that before, and every approach has tradeoffs. This keeps it simple and performs rather well in testing, even in scenarios in which it is weak by design.

The most extreme example is a single "check if class A IS of class B" on an iModel with huge schemas, orders of magnitude larger than those of an average iModel.
In this example, SchemaView has a spin-up delay of about 0.8 seconds, after which it responds instantaneously.
Asking the question directly via ECSQL takes about 0.002 seconds. With smaller iModels SchemaView catches up faster, and ECSQL is always available as an alternative option.
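
As a rough worked example of that tradeoff, the sketch below computes how many lookups it takes before the one-time spin-up pays off. It uses only the two figures quoted above and assumes a cached lookup costs about 0 ms ("instantaneously"); the function name is hypothetical.

```typescript
// Figures quoted above for the extreme large-schema case, in milliseconds.
const schemaViewSpinUpMs = 800;
const ecsqlPerQueryMs = 2;

// Assumption: once loaded, a cached lookup costs roughly 0 ms.
function breakEvenQueries(spinUpMs: number, perQueryMs: number, cachedLookupMs = 0): number {
  const savedPerQuery = perQueryMs - cachedLookupMs;
  if (savedPerQuery <= 0) return Infinity; // the cache never pays off
  return Math.ceil(spinUpMs / savedPerQuery);
}

const breakEven = breakEvenQueries(schemaViewSpinUpMs, ecsqlPerQueryMs);
console.log(breakEven); // 400
```

Under these assumptions, the cache only wins on this iModel after roughly 400 such checks, which is why ECSQL remains the better fit for one-shot questions.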

Design

Transport: A new PRAGMA schema_view(N) in ECDb returns the binary blob via ConcurrentQuery. No new RPC methods - it flows through the existing queryRows path. The pragma accepts a format version parameter for forward compatibility.

Binary format: C++ writer reads ec_ tables directly and produces a compact blob with:

  • String lookup table (deduplicates repeated schema/class/property names)
  • Property definition deduplication (many class-property pairs share identical definitions)
  • Per-item ecInstanceId (row ID from ec_ tables) for fallback to ECSQL when needed
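
The string lookup table idea can be sketched as follows. This is only an illustration of the deduplication technique: the row shape, the names, and the index-triple encoding are assumptions, and the actual blob layout produced by the C++ writer is not shown here.

```typescript
// Interns strings: each distinct value is stored once and referenced by index.
class StringTable {
  private readonly indexByValue = new Map<string, number>();
  public readonly values: string[] = [];

  public intern(value: string): number {
    const existing = this.indexByValue.get(value);
    if (existing !== undefined) return existing;
    const index = this.values.length;
    this.values.push(value);
    this.indexByValue.set(value, index);
    return index;
  }
}

// Hypothetical row shape, for illustration only.
interface PropertyRow {
  schema: string;
  className: string;
  property: string;
}

type EncodedRow = [number, number, number];

// Replaces every name in a row with its index in the shared string table.
function encodeRows(rows: PropertyRow[]): { strings: string[]; rows: EncodedRow[] } {
  const table = new StringTable();
  const encoded: EncodedRow[] = [];
  for (const row of rows) {
    encoded.push([table.intern(row.schema), table.intern(row.className), table.intern(row.property)]);
  }
  return { strings: table.values, rows: encoded };
}

// Resolves the indices back into names.
function decodeRows(strings: string[], rows: EncodedRow[]): PropertyRow[] {
  const decoded: PropertyRow[] = [];
  for (const row of rows) {
    decoded.push({ schema: strings[row[0]], className: strings[row[1]], property: strings[row[2]] });
  }
  return decoded;
}

const demoRows: PropertyRow[] = [
  { schema: "Demo", className: "Pipe", property: "Length" },
  { schema: "Demo", className: "Pipe", property: "Diameter" },
  { schema: "Demo", className: "Valve", property: "Diameter" },
];
const encoded = encodeRows(demoRows);
console.log(encoded.strings.length, "strings for", demoRows.length * 3, "references");
```

In this toy input, nine name references collapse to five stored strings; the saving grows with how often schema, class, and property names repeat.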

Cache invalidation: Uses PRAGMA checksum(ecdb_schema) (SHA3-256) to detect schema changes. Schema imports or changeset pulls that modify schemas automatically invalidate the cache.
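
A minimal sketch of checksum-based invalidation, assuming the checksum is re-read on every access and that concurrent callers share one in-flight promise. The `SchemaSource` interface and `ChecksumCache` class are hypothetical stand-ins; in the PR the checksum would come from `PRAGMA checksum(ecdb_schema)` and the payload from `PRAGMA schema_view(N)`, and the real implementation may decide when to re-check differently.

```typescript
// Hypothetical abstraction over the two queries; not an API from this PR.
interface SchemaSource<T> {
  fetchChecksum(): Promise<string>;
  fetchView(): Promise<T>;
}

class ChecksumCache<T> {
  private readonly source: SchemaSource<T>;
  private cached?: { checksum: string; view: T };
  private pending?: Promise<T>;

  public constructor(source: SchemaSource<T>) {
    this.source = source;
  }

  // Concurrent callers receive the same in-flight promise.
  public get(): Promise<T> {
    if (this.pending === undefined) {
      const run = async (): Promise<T> => {
        try {
          return await this.refresh();
        } finally {
          this.pending = undefined; // allow the next call to re-check the checksum
        }
      };
      this.pending = run();
    }
    return this.pending;
  }

  // Reloads the view only when the checksum differs from the cached one.
  private async refresh(): Promise<T> {
    const checksum = await this.source.fetchChecksum();
    if (this.cached !== undefined && this.cached.checksum === checksum) {
      return this.cached.view;
    }
    const view = await this.source.fetchView();
    this.cached = { checksum, view };
    return view;
  }
}
```

The design choice worth noting is that the expensive fetch is keyed on the checksum rather than on time or on explicit notifications, so a schema import or changeset pull is picked up the next time the checksum is compared.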

Performance characteristics

Initial investigation into how much metadata we ingest, tested on an iModel with enormous metadata and poor performance:

[image]

Performance comparison walking all properties:

[image]

Performance comparison on simulated frontend with limited network bandwidth:

[image]

Memory footprint at runtime:

[image]

@hl662
Contributor

hl662 commented Mar 30, 2026

When would users want to use this compared to the full fledged schema context?

@rschili
Contributor Author

rschili commented Mar 31, 2026

When would users want to use this compared to the full fledged schema context?

I was hoping for the answer to be "always" unless they are trying to store/author schemas.

Member

@grigasp grigasp left a comment

Overall looks good.

A few comments:

  • Not sure the prefix Runtime fits the purpose of the API. Based on the docs, it sounds like SchemaContext is reserved for editing use cases, but that happens during runtime as well.

  • I think a better distinction between schemaContext and getSchemas() on IModelConnection and IModelDb is needed. Why is one a getter and the other a method?

    The schemaContext getter is still marked @beta, so I think we could at least rename it. However, considering it becomes a niche API used only for specific situations, I would consider removing it completely and instead exposing a global factory function to create a SchemaContext when it's needed.

  • Since RuntimeSchemaContext is "lossy", I think the docs should be very clear with recommendations on what to do when that lost information is needed.

    For example, in my case I used SchemaContext API to traverse from Property -> KindOfQuantity -> Format -> Unit. With the new API, KindOfQuantity only has presentation formats string. What do I do from there?

  • I think we'll want to bump IModelReadRpcInterface.version to ensure frontends can use the new pragma.

Comment thread core/ecschema-metadata/src/RuntimeSchema.ts Outdated
Comment thread core/frontend/src/IModelConnection.ts Outdated
Comment thread docs/learning/metadata/index.md Outdated
Comment thread docs/learning/metadata/RuntimeSchemaContext.md Outdated
Comment thread core/ecschema-metadata/src/RuntimeSchema.ts Outdated
Comment thread docs/learning/metadata/RuntimeSchemaContext.md Outdated
@rschili
Contributor Author

rschili commented Apr 13, 2026

@grigasp

* Not sure the prefix `Runtime` fits the purpose of the API. Based on the docs, it sounds like `SchemaContext` is reserved for editing use cases, but that happens during runtime as well.

Fair point - "Runtime" is ambiguous since SchemaContext is also used at runtime. I thought hard about this, considered alternatives like CompactSchemaContext, LightSchemaContext, SchemaSnapshot, etc., but none clearly communicated the intent better.

The distinction we're drawing is "optimized for runtime consumption" (pre-loaded, read-only, synchronous, compact) vs "optimized for schema introspection and editing" (lazy-loaded, mutable, full-fidelity). "Runtime" is imperfect, but changing it now would be a substantial update.
The prefix does carry the connotation of "this is the thing you reach for in hot paths." Open to a better name if one comes to mind, but the alternatives seemed worse.

* I think a better distinction between `schemaContext` and `getSchemas()` on `IModelConnection` and `IModelDb` is needed. Why is one a getter and the other a method?
  The `schemaContext` getter is still marked `@beta`, so I think we could at least rename it. However, considering it becomes a niche API used only for specific situations, I would consider removing it completely and instead exposing a global factory function to create a `SchemaContext` when it's needed.

I think keeping it as a getter is the right call here. schemaContext and getSchemas() serve fundamentally different access patterns - schemaContext returns a thin wrapper object instantly (the real work happens lazily on nested calls like getSchemaItem()), while getSchemas() does real async work up front to fetch and deserialize the binary blob. Making one a getter and the other a method reflects that difference: synchronous, cheap construction vs. async, non-trivial I/O. A factory function would obscure that distinction and make the simple case harder to reach. I agree the naming asymmetry looks inconsistent at first glance, but I think it's actually communicating something useful about the cost model.

  For example, in my case I used `SchemaContext` API to traverse from `Property` -> `KindOfQuantity` -> `Format` -> `Unit`. With the new API, `KindOfQuantity` only has presentation formats string. What do I do from there?

For now you would have to use ECSQL to pull that remaining info on demand. Let's dive into that and better understand what we need. Modelling the whole units object model into this seems overkill to me. But I agree, if you DO need it, the path should be simple. Maybe RuntimeProperty could expose a helper method that does the work for you?

* I think we'll want to bump `IModelReadRpcInterface.version` to ensure frontends can use the new pragma.

done

@grigasp
Member

grigasp commented Apr 14, 2026

@rschili

Fair point - "Runtime" is ambiguous since SchemaContext is also used at runtime. I thought hard about this, considered alternatives like CompactSchemaContext, LightSchemaContext, SchemaSnapshot, etc., but none clearly communicated the intent better.

The distinction we're drawing is "optimized for runtime consumption" (pre-loaded, read-only, synchronous, compact) vs "optimized for schema introspection and editing" (lazy-loaded, mutable, full-fidelity). "Runtime" is imperfect, but changing it now would be a substantial update. The prefix does carry the connotation of "this is the thing you reach for in hot paths." Open to a better name if one comes to mind, but the alternatives seemed worse.

I would consider ReadonlySchemaContext.

I think keeping it as a getter is the right call here. schemaContext and getSchemas() serve fundamentally different access patterns - schemaContext returns a thin wrapper object instantly (the real work happens lazily on nested calls like getSchemaItem()), while getSchemas() does real async work up front to fetch and deserialize the binary blob. Making one a getter and the other a method reflects that difference: synchronous, cheap construction vs. async, non-trivial I/O. A factory function would obscure that distinction and make the simple case harder to reach. I agree the naming asymmetry looks inconsistent at first glance, but I think it's actually communicating something useful about the cost model.

Personally, a getter vs a method wouldn't make me realize that one is more expensive than the other. FWIW, I would probably think a getter is cheaper.

However, I'd like to reiterate my suggestion to remove the schemaContext getter completely. If we say that it's going to be needed in less than 10% of use cases, then why confuse all the consumers with two similar but different options? We could provide quick & simple schema access that suits 90% of consumers, and the rest could be forwarded to an API that lets them create SchemaContext. I think it's fair to say that more complex use cases require more complex setup.

But I agree, if you DO need it, the path should be simple. Maybe RuntimeProperty could expose a helper method that does the work for you?

I don't think we'll be able to stop looking at schema-based formats/units anytime soon. Even when the full replacement is ready, we'll probably want to look at what's in schema as a fallback. It doesn't necessarily have to be simple, but it has to be clear.

My preference would be for RuntimeKoQ.presentationFormats to return something like Array<{ formatName: string; precision: number; unitName: string }>. Example:

// koq format spec:
//   f:DefaultRealU(4)[u:M_PER_SEC_SQ];f:DefaultRealU(4)[u:CM_PER_SEC_SQ];f:DefaultRealU(4)[u:FT_PER_SEC_SQ]
// formats array:
[
  { formatName: "Formats.DefaultRealU", precision: 4, unitName: "Units.M_PER_SEC_SQ" },
  { formatName: "Formats.DefaultRealU", precision: 4, unitName: "Units.CM_PER_SEC_SQ" },
  { formatName: "Formats.DefaultRealU", precision: 4, unitName: "Units.FT_PER_SEC_SQ" },
]

That way we could use formatName and unitName to pull the details we need.
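
A minimal sketch of how such an array could be derived from the format spec string in the example. It assumes the alias mapping `f` to `Formats` and `u` to `Units` implied by the example, and it handles only the exact entry shape shown there (one format, one precision override, one unit); anything else is rejected. The function and type names are hypothetical.

```typescript
interface PresentationFormat {
  formatName: string;
  precision: number;
  unitName: string;
}

// Assumption: alias-to-schema mapping inferred from the example above.
const aliasToSchema = new Map<string, string>([
  ["f", "Formats"],
  ["u", "Units"],
]);

function qualify(alias: string, name: string): string {
  const schema = aliasToSchema.get(alias);
  if (schema === undefined) throw new Error(`Unknown schema alias: ${alias}`);
  return `${schema}.${name}`;
}

// Matches entries shaped like "f:DefaultRealU(4)[u:M_PER_SEC_SQ]" only.
const entryPattern = /^(\w+):(\w+)\((\d+)\)\[(\w+):(\w+)\]$/;

function parsePresentationFormats(spec: string): PresentationFormat[] {
  const result: PresentationFormat[] = [];
  for (const entry of spec.split(";")) {
    const match = entryPattern.exec(entry.trim());
    if (match === null) throw new Error(`Unsupported format entry: ${entry}`);
    result.push({
      formatName: qualify(match[1], match[2]),
      precision: Number(match[3]),
      unitName: qualify(match[4], match[5]),
    });
  }
  return result;
}

const koqSpec =
  "f:DefaultRealU(4)[u:M_PER_SEC_SQ];f:DefaultRealU(4)[u:CM_PER_SEC_SQ];f:DefaultRealU(4)[u:FT_PER_SEC_SQ]";
console.log(parsePresentationFormats(koqSpec));
```

Entries without a precision override, with unit labels, or with several units would need a fuller parser than this pattern.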

@hl662
Contributor

hl662 commented Apr 14, 2026

I don't think we'll be able to stop looking at schema-based formats/units anytime soon. Even when the full replacement is ready, we'll probably want to look at what's in schema as a fallback. It doesn't necessarily have to be simple, but it has to be clear.

I agree. The format set formatsProvider already has the concept of a fallback provider to lean into, and a schema formats provider naturally fits that. If need be, we can extend the formatsProvider interface to allow an optional fallback formatsProvider within, bringing that functionality out of FSFormatsProvider. And do something similar with unitsProvider and a fallback of some sort.
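
One way the optional-fallback idea could look, as a sketch only. `ChainedFormatsProvider` and `FormatLike` are simplified, hypothetical stand-ins and not the existing formatsProvider interface; the sketch assumes a provider signals "not found" by resolving to `undefined`.

```typescript
// Simplified stand-in for a format definition.
interface FormatLike {
  name: string;
  precision: number;
}

// Hypothetical provider shape with an optional fallback.
interface ChainedFormatsProvider {
  getFormat(name: string): Promise<FormatLike | undefined>;
  fallback?: ChainedFormatsProvider;
}

// Walks the fallback chain until some provider knows the format.
async function resolveFormat(
  provider: ChainedFormatsProvider,
  name: string,
): Promise<FormatLike | undefined> {
  let current: ChainedFormatsProvider | undefined = provider;
  while (current !== undefined) {
    const format = await current.getFormat(name);
    if (format !== undefined) return format;
    current = current.fallback;
  }
  return undefined;
}

// Helper that builds a provider from a fixed set of formats.
function mapProvider(formats: FormatLike[], fallback?: ChainedFormatsProvider): ChainedFormatsProvider {
  const byName = new Map<string, FormatLike>();
  for (const format of formats) byName.set(format.name, format);
  return { getFormat: async (name: string) => byName.get(name), fallback };
}

// A schema-based provider acting as the fallback of a format-set provider.
const schemaProvider = mapProvider([{ name: "Formats.DefaultRealU", precision: 6 }]);
const formatSetProvider = mapProvider([{ name: "Custom.Length", precision: 2 }], schemaProvider);

void resolveFormat(formatSetProvider, "Formats.DefaultRealU").then((format) => console.log(format));
```

Keeping the chain walk in one function means individual providers stay unaware of each other, and a units provider could follow the same pattern.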

Comment thread core/backend/src/IModelDb.ts Outdated
Comment thread core/ecschema-metadata/src/SchemaView.ts Fixed
Comment thread core/ecschema-metadata/src/SchemaView.ts Dismissed
@rschili rschili changed the title from "WIP: RuntimeSchemaContext - light schema access mechanism" to "SchemaView - light schema access layer" Apr 17, 2026
Member

@grigasp grigasp left a comment

I looked over the public API surface and how SchemaView is maintained by and exposed through IModelDb and IModelConnection - LGTM. The only thing I don't really like is that now we'll have schemaContext getter + getSchemaView() method, but it's just my opinion, I understand the unwillingness to break the API.

I didn't look at the tests or internals of the SchemaView implementation. Hopefully someone from the iModels / EC team can do that.

rschili added 18 commits April 27, 2026 08:55
…cess

- Added RuntimeSchemaInterfaces.ts to define runtime schema-related types and enums.
- Enhanced IModelConnection to support fetching runtime schema metadata with getRuntimeSchemas() method.
- Updated ECSQL tutorial and reference documentation to include RuntimeSchemaContext as a performance alternative.
- Created RuntimeSchemaContext.md to document its usage, advantages, and comparison with ecschema-metadata.
- Added comprehensive tests for RuntimeSchemaContext functionalities, including schema navigation, class type checks, property handling, and exhaustive walks.
…ized schema metadata access

- Added RuntimeSchemaContext for high-performance, read-only schema metadata caching.
- Introduced RuntimeSchemaInterfaces to define data structures for schemas, classes, properties, and enumerations.
- Updated ecschema-metadata module exports to include new runtime schema classes and interfaces.
- Refactored IModelConnection to utilize RuntimeSchemaContext instead of the previous common module.
- Updated documentation to reflect the new location of RuntimeSchemaContext and its usage.
- Modified example tests to import RuntimeSchemaContext from the new module.
…Context, clarifying behavior and performance considerations
…en support

- Updated RuntimeClass to use a factory function for creating RuntimeProperty instances.
- Refactored RuntimeProperty to be an abstract class with type-safe accessors for different property kinds.
- Introduced new classes for RuntimePrimitiveProperty, RuntimePrimitiveArrayProperty, RuntimeStructProperty, RuntimeStructArrayProperty, and RuntimeNavigationProperty.
- Added schemaToken support in RuntimeSchemaContext for cache invalidation and tracking context validity.
- Modified parseRuntimeSchemaBlob to accept an optional schemaToken parameter.
- Updated IModelConnection to pass schemaToken when creating RuntimeSchemaContext from binary data.
- Added methods to find enumerations, KindOfQuantities, and PropertyCategories by qualified name in RuntimeSchemaContext.
- Refactored findView method to utilize a new internal helper for resolving schema items.
- Introduced comprehensive tests for RuntimeSchemaContext, covering schema creation, property visibility, and view lookups.
- Updated documentation to reflect the current state of view support in the runtime schema.
- Adjusted example code snippets to align with the new method signatures and property checks.
- Added ecInstanceId property to RuntimeSchema, RuntimeClass, RuntimeProperty, RuntimeEnumeration, RuntimeKoQ, RuntimePropertyCategory, and RuntimeView classes to facilitate ECDbMeta queries.
- Updated RuntimeSchemaContext and RuntimeSchemaBinaryReader to handle ecInstanceId during schema parsing and property resolution.
- Modified IModelDb to improve schema hydration logic, ensuring concurrent callers share the same promise for schema updates.
- Enhanced tests to validate ecInstanceId values against the corresponding database rows for schemas, classes, properties, enumerations, and categories.
- Updated documentation to reflect the inclusion of ECViews in the runtime schema blob.
- Removed all references to views in the runtime schema, including their definitions and handling in the schema context.
- Updated RuntimeClass to handle only entity and relationship classes, with type checks adjusted accordingly.
- Refactored methods to use createRuntimeClass for instantiation, ensuring proper handling of relationship classes.
- Updated tests to reflect changes in class handling and removed any assertions related to views.
- Adjusted documentation to clarify the removal of views and the focus on entity and relationship classes.
- Updated `RuntimeSchemaInterfaces.ts` to change the import order of StrengthType and StrengthDirection, and modified arrayMinOccurs and arrayMaxOccurs to be optional.
- Enhanced `RuntimeSchema.test.ts` by introducing new helper functions for writing schema, class, and property definitions, and refactored existing tests to utilize these helpers for better readability and maintainability.
- Added a new documentation file `RuntimeSchemaBinaryFormat.md` detailing the binary format used for runtime schemas, including version history, table structures, and cross-reference resolution.
- Updated `RuntimeSchemaContext.md` to include a reference to the new binary format documentation and clarified the differences between `ecschema-metadata` and `RuntimeSchemaContext`.
rschili added 8 commits May 4, 2026 12:03
…erformance

- Renamed RuntimePrimitiveType to SchemaViewPrimitiveType for better context in SchemaView.
- Updated comments and documentation to reflect the new naming and clarify the purpose of the SchemaView binary format.
- Enhanced error handling in BinaryReader to provide more informative messages when reading from the SchemaView blob.
- Added unit tests for SchemaView parsing and error scenarios to ensure robustness.
- Removed outdated documentation related to the previous runtime schema binary format.
- Updated IModelConnection to handle schema fetching promises more safely, preventing potential race conditions.
…clarify usage for runtime read-only access and schema authoring.