3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -31,6 +31,9 @@ release.

### Entities

- Define merge algorithm.
([4768](https://github.com/open-telemetry/opentelemetry-specification/pull/4768))

### OpenTelemetry Protocol

### Compatibility
41 changes: 41 additions & 0 deletions specification/entities/data-model.md
@@ -18,6 +18,7 @@ weight: 2
- [Resource and Entities](#resource-and-entities)
* [Attribute Referencing Model](#attribute-referencing-model)
* [Placement of Shared Descriptive Attributes](#placement-of-shared-descriptive-attributes)
- [Merging of Entities](#merging-of-entities)
- [Examples of Entities](#examples-of-entities)

<!-- tocstop -->
@@ -151,6 +152,46 @@ different values, then **only** the `k8s.node` entity can reference this key
Other entities (e.g., `k8s.cluster`) can report this attribute in a separate
telemetry channel (e.g., entity events) where full ownership context is known.

## Merging of Entities

Entities MAY be merged if and only if their types are the same, their
identity attributes are exactly the same AND their schema_url is the same.

Member:

Do you see a possibility to relax the last requirement a bit? For example can we aim for: "schema_url is the same or schemas are compatible", where "schemas are compatible" if attributes used by Entities are unchanged between schema versions? I understand it is more work to check compatibility but it makes merging much more widely possible.

Contributor Author:

So in that event, we expect the following:

  • The SDK/Collector has no way of knowing if a schema_url is compatible with another, without looking at the schema.
  • If we have to look at the schema, we can convert the version successfully, and then we can merge two entities that are at the same version.

So while we do want to allow schema_url based conversions, the merge algorithm for Entities can be defined much more simply and we leave room for version-conversion prior to this.

This means both Entities MUST have the same identity attribute keys and
for each key, the values of the key MUST be the same.

Here's an example algorithm that will check compatibility:

```
can_merge(current_entity, new_entity) {
  current_entity.type == new_entity.type &&
  current_entity.schema_url == new_entity.schema_url &&
  has_same_attributes(current_entity.identity, new_entity.identity)
}
```
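
As a rough illustration only, the check above could be written in Python as follows. The `Entity` dataclass and its field names are hypothetical and simply mirror the pseudocode; they are not part of any OpenTelemetry SDK API. Dictionary equality stands in for `has_same_attributes`, since it requires the same keys with the same values.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical in-memory entity; fields mirror the pseudocode above."""
    type: str
    schema_url: str
    identity: dict = field(default_factory=dict)
    description: dict = field(default_factory=dict)


def can_merge(current_entity: Entity, new_entity: Entity) -> bool:
    # Same type, same schema_url, and identical identity keys and values.
    return (
        current_entity.type == new_entity.type
        and current_entity.schema_url == new_entity.schema_url
        and current_entity.identity == new_entity.identity
    )
```

Note that descriptive attributes play no part in this check; two entities that differ only in `description` remain mergeable.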

When merging entities, all attributes in description are merged together, with
one entity acting as "primary" where any conflicting attribute values will be
chosen from the "primary" entity.

Here's an example algorithm that will merge:

```
merge(current_entity, new_entity) {
  if can_merge(current_entity, new_entity) {
    for attribute in new_entity.description {
      if !current_entity.description.contains(attribute.key) {
        current_entity.description.insert(attribute)
      }
      // Ignore otherwise.
    }
  }
}
```
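
The merge step could be sketched in Python along these lines. This is an illustration under assumptions, not a reference implementation: the `Entity` dataclass is hypothetical, the merge mutates `current_entity` in place as the pseudocode does, and returning a boolean to report whether the merge happened is an addition made here for convenience.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical in-memory entity; fields mirror the pseudocode above."""
    type: str
    schema_url: str
    identity: dict = field(default_factory=dict)
    description: dict = field(default_factory=dict)


def can_merge(current_entity: Entity, new_entity: Entity) -> bool:
    return (
        current_entity.type == new_entity.type
        and current_entity.schema_url == new_entity.schema_url
        and current_entity.identity == new_entity.identity
    )


def merge(current_entity: Entity, new_entity: Entity) -> bool:
    """Merge new_entity into current_entity, which acts as the "primary"."""
    if not can_merge(current_entity, new_entity):
        return False  # Assumption: signal "no change" to the caller.
    for key, value in new_entity.description.items():
        # setdefault only inserts missing keys, so primary values win.
        current_entity.description.setdefault(key, value)
    return True


primary = Entity("host", "", {"host.id": "H1"}, {"host.name": "a"})
other = Entity("host", "", {"host.id": "H1"}, {"host.name": "b", "host.arch": "arm64"})
merged = merge(primary, other)
# primary.description is now {"host.name": "a", "host.arch": "arm64"}
```

The conflicting `host.name` keeps the primary value, while the previously unknown `host.arch` is added.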

Note: If Entities have different `schema_url`s, they SHOULD be converted to the
same schema version (if possible) before attempting a merge. The merge algorithm
defined here assumes the entities are already at the same schema version.

## Examples of Entities

_This section is non-normative and is present only for the purposes of
149 changes: 147 additions & 2 deletions specification/resource/data-model.md
@@ -13,6 +13,8 @@ weight: 2
<!-- toc -->

- [Identity](#identity)
- [Merging Resources](#merging-resources)
* [Merging Entities into a Resource](#merging-entities-into-a-resource)

<!-- tocstop -->

@@ -44,6 +46,149 @@ Entity includes its own notion of identity. The identity of a resource is
the set of entities contained within it. Two resources are considered
different if one contains an entity not found in the other.

Some resources include raw attributes in addition to Entities. Raw attributes
are considered identifying on a resource. That is, if the key-value pairs of
raw attributes are different, then you can assume the resource is different.
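
One way to picture this notion of identity is the sketch below. It assumes that an entity is identified by its type together with its identity attributes, that attribute values are hashable (for example strings), and it uses hypothetical helper names; none of this is an SDK API.

```python
def resource_identity(entities, raw_attributes):
    """Build a hashable identity for a resource.

    `entities` is an iterable of (type, identity_attributes) pairs and
    `raw_attributes` is a dict of loose key-value pairs (both hypothetical shapes).
    """
    entity_part = frozenset(
        (entity_type, frozenset(identity.items()))
        for entity_type, identity in entities
    )
    return entity_part, frozenset(raw_attributes.items())


def same_resource(a, b) -> bool:
    # Resources match only if entity sets and raw attributes both match.
    return resource_identity(*a) == resource_identity(*b)
```

Because sets are used, the order in which entities are listed does not affect the comparison.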

## Merging Resources

Note: The current SDK specification outlines a [merge algorithm](sdk.md#merge).
This specification updates the algorithm to be compliant with entities. This
section will replace that section upon stabilization of entities. SDKs SHOULD
NOT update their merge algorithm until full Entity SDK support is provided.

Merging resources is an action of joining together the context of observation.
That is, we can look at the resource context for a signal and *expand* that
context to include more details (see
[telescoping identity](README.md#telescoping)). As such, a merge SHOULD preserve
any identity that already existed on a Resource while adding in new identifying
information or descriptive attributes.

### Merging Entities into a Resource

We define the following algorithm for merging entities into an existing
resource.

Member:

This is a fairly complicated algorithm. Can you please add a few examples to demonstrate what it is doing?

Contributor Author:

Added


- Construct a set of existing entities on the resource, `E`.
  - For each entity, `new_entity`, in priority order (highest first),
    do one of the following:
    - If an entity `e` exists in `E` with the same entity type as `new_entity`:
      - Perform an [Entity DataModel Merge](../entities/data-model.md#merging-of-entities) with `e` and `new_entity`
      - Note: If unable to merge `e` and `new_entity`, then no change is made.
    - Otherwise, add the entity `new_entity` to set `E`
- Update the Resource to use the set of entities `E`.
  - If all entities within `E` have the same `schema_url`, set the
    Resource's `schema_url` to match.
  - Otherwise set the Resource's `schema_url` to blank.
  - Remove any attribute from `Attributes` which exists in either the
    description or identity of an entity in `E`.
- Solve for resource flattening issues (See
  [Attribute Referencing Model](../entities/data-model.md#attribute-referencing-model)).
  - If, for all entities, there are no overlapping attribute keys, then nothing
    is needed.
  - If there is a conflict where two entities use the same attribute key then
    remove the lower priority entity from the Resource.
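
The steps above can be sketched in Python as follows. This is only an illustration: the `Entity` and `Resource` dataclasses and all function names are hypothetical, entities already on the resource are assumed to rank above newly merged ones, and an empty entity set is assumed to leave the resource `schema_url` untouched. The text above does not settle those last two points.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    type: str
    schema_url: str = ""
    identity: dict = field(default_factory=dict)
    description: dict = field(default_factory=dict)

    def attribute_keys(self) -> set:
        return set(self.identity) | set(self.description)


@dataclass
class Resource:
    entities: list = field(default_factory=list)  # kept in priority order
    attributes: dict = field(default_factory=dict)
    schema_url: str = ""


def merge_entity(current: Entity, new: Entity) -> None:
    """Entity data-model merge; leaves `current` unchanged if not mergeable."""
    if (current.type, current.schema_url, current.identity) == (
        new.type, new.schema_url, new.identity
    ):
        for key, value in new.description.items():
            current.description.setdefault(key, value)


def merge_entities_into_resource(resource: Resource, new_entities: list) -> Resource:
    # Step 1: build E, merging into an existing entity of the same type if any.
    merged = list(resource.entities)
    for new_entity in new_entities:  # highest priority first
        existing = next((e for e in merged if e.type == new_entity.type), None)
        if existing is not None:
            merge_entity(existing, new_entity)
        else:
            merged.append(new_entity)

    # Step 2: schema_url is shared only if every entity agrees on it.
    urls = {e.schema_url for e in merged}
    if len(urls) == 1:
        resource.schema_url = next(iter(urls))
    elif len(urls) > 1:
        resource.schema_url = ""

    # Step 2 (cont.): drop loose attributes now owned by an entity.
    owned = set().union(*(e.attribute_keys() for e in merged))
    resource.attributes = {
        k: v for k, v in resource.attributes.items() if k not in owned
    }

    # Step 3: on a key conflict, the lower priority entity is removed.
    kept, seen = [], set()
    for entity in merged:
        if seen.isdisjoint(entity.attribute_keys()):
            kept.append(entity)
            seen |= entity.attribute_keys()
    resource.entities = kept
    return resource
```

The sketch mutates the resource and its existing entities in place for brevity; an implementation that needs immutability would copy them first.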

#### Examples

_These examples demonstrate how conflicts are resolved during a merge._

##### Example 1: Entity replaces loose attribute

This example shows a conflict between loose attributes and attributes belonging to an entity. When the entity is added, it replaces the loose attribute that uses the same key.

**Initial Resource:**
- Entities: _None_
- Attributes:
  - `host.name`: `"old-name"`
  - `env`: `"prod"`

**Entities to Merge (by priority):**
1. `host`
   - type: `"host"`
   - identity:
     - `host.id`: `"H1"`
   - description:
     - `host.name`: `"new-name"`
2. `service`
   - type: `"service"`
   - identity:
     - `service.name`: `"my-svc"`

**Resulting Resource:**
- Entities:
  - `host`
    - type: `"host"`
    - identity:
      - `host.id`: `"H1"`
    - description:
      - `host.name`: `"new-name"`
  - `service`
    - type: `"service"`
    - identity:
      - `service.name`: `"my-svc"`
- Attributes:
  - `env`: `"prod"`

##### Example 2: Loose attribute replaces entity attribute

This example shows a conflict between loose attributes and attributes belonging to an entity. When the loose attribute is added, the entity must be removed because of the conflict.

**Initial Resource:**
- Entities:
  - `host`
    - type: `"host"`
    - identity:
      - `host.id`: `"H1"`
    - description:
      - `host.name`: `"detected-name"`
- Attributes: _None_

**Resource to Merge:**
- Entities: _None_
- Attributes:
  - `host.id`: `"h2"`
  - `env`: `"prod"`

**Resulting Resource:**
- Entities: _None_
- Attributes:
  - `host.id`: `"h2"`
  - `env`: `"prod"`

##### Example 3: Identity & Attribute Conflicts

This example rejects an entity whose type matches an existing entity but whose identity differs, and drops a lower priority entity because of an attribute key conflict.

**Initial Resource:**
- Entities:
  - `host`
    - type: `"host"`
    - identity:
      - `host.id`: `"H1"`
    - description:
      - `env`: `"prod"`
- Attributes: _None_

**Entities to Merge (by priority):**
1. `host`
   - type: `"host"`
   - identity:
     - `host.id`: `"H2"`
2. `service`
   - type: `"service"`
   - identity:
     - `service.name`: `"S1"`
   - description:
     - `env`: `"dev"`

**Resulting Resource:**
- Entities:
  - `host`
    - type: `"host"`
    - identity:
      - `host.id`: `"H1"`
    - description:
      - `env`: `"prod"`
- Attributes: _None_