@CiaranMn
Member

🌟 What is the purpose of this PR?

We're currently creating a large event history in Temporal because entities created or updated in the Graph during flows are passed in and out of action steps in full. This means we hit Temporal limits more quickly, both for total event history size and for retrieving workflow data from the Temporal API.

This PR addresses that by passing only the entity id between steps, rather than the full persisted entity details, and resolving the full entity where needed.
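As a rough illustration of why this shrinks the event history, the sketch below reduces a full entity payload to an id-only record and compares serialized sizes. The type shapes, the `operation` values, and the names `toMetadata` and `serializedLength` are assumptions made for this example, not the definitions used in the PR.

```typescript
// Hypothetical shapes for illustration only; the real types live in the shared flow types.
type EntityId = string;
type Operation = "create" | "update"; // assumed set of operations

interface FullEntityPayload {
  entityId: EntityId;
  operation: Operation;
  properties: Record<string, unknown>; // the bulk of the payload
}

interface PersistedEntityMetadata {
  entityId: EntityId;
  operation: Operation;
}

// Keep only what later steps need in order to look the entity up again.
const toMetadata = (full: FullEntityPayload): PersistedEntityMetadata => ({
  entityId: full.entityId,
  operation: full.operation,
});

// Character count of the JSON form, as a rough proxy for payload size.
const serializedLength = (value: unknown): number =>
  JSON.stringify(value).length;
```

The saving grows with the size of `properties`, since the metadata record stays the same size regardless of how large the entity is.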

Pre-Merge Checklist 🚀

🚢 Has this modified a publishable library?

This PR:

  • does not modify any publishable blocks or libraries, or modifications do not need publishing

📜 Does this require a change to the docs?

The changes in this PR:

  • are internal and do not require a docs change

🕸️ Does this require a change to the Turbo Graph?

The changes in this PR:

  • do not affect the execution graph

⚠️ Known issues

The slight drawback of this approach is that we no longer have a direct record of the exact state of the entity as persisted by the Flow step, as the entity may have been updated again by the time it is resolved for display or other work.

But we can always inspect the history of the entity as of the time the Flow step ran, and any provenance information remains attached to specific editions (e.g. which Flow id and step produced that edition).

🛡 What tests cover this?

  • None yet.

@CiaranMn requested a review from TimDiekmann January 15, 2026 16:33
@CiaranMn self-assigned this Jan 15, 2026
@cursor

cursor bot commented Jan 15, 2026

PR Summary

Shifts flow IO and logging to use PersistedEntityMetadata/PersistedEntitiesMetadata (entityId + operation) instead of full entities, reducing Temporal history size and resolving entities on demand where required.

  • Update worker actions (persist-entity, persist-entities, get-file-from-url, infer-metadata-from-document, write-google-sheet) to output/log PersistedEntityMetadata; adjust error paths accordingly
  • Replace PersistedEntity logs with PersistedEntityMetadata and adapt logProgress event types
  • Add async mapActionInputEntitiesToEntities() to accept serialized entities or metadata and fetch entities via queryEntities
  • Revise frontend flow visualizer (activity-log, outputs, entity-result-table, flow-visualizer) to build filters by UUID, query subgraphs, merge missing entities from logs, and display operations; wire outputs to new metadata kinds
  • Change action definitions, flow definitions, browser plugin types, and shared flow types.ts to new payload kinds (PersistedEntityMetadata, PersistedEntitiesMetadata); remove embedded entities from failed proposals
  • Minor: ensure actorId provided where fetching entities; tidy references to entity IDs in Google Sheets writing and answer-question contexts
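The "accept serialized entities or metadata" behaviour described for `mapActionInputEntitiesToEntities()` could look roughly like the sketch below. This is not the PR's implementation: `resolveInputEntities`, `FetchEntities`, and the type shapes are hypothetical, and the injected `fetchEntities` callback stands in for whatever `queryEntities` call (with `actorId` and the Graph API client) the real function makes.

```typescript
// Hypothetical shapes for illustration only.
type EntityId = string;

interface Entity {
  entityId: EntityId;
  properties: Record<string, unknown>;
}

interface PersistedEntityMetadata {
  entityId: EntityId;
  operation: "create" | "update"; // assumed set of operations
}

type ActionInputEntity = Entity | PersistedEntityMetadata;

// Stand-in for the Graph query; the real call would also need an actor id and API client.
type FetchEntities = (entityIds: EntityId[]) => Promise<Entity[]>;

// Metadata records are distinguished here by having an `operation` field.
const isMetadata = (
  input: ActionInputEntity,
): input is PersistedEntityMetadata => "operation" in input;

const resolveInputEntities = async (
  inputs: ActionInputEntity[],
  fetchEntities: FetchEntities,
): Promise<Entity[]> => {
  const idsToFetch = inputs.filter(isMetadata).map((input) => input.entityId);

  // Fetch all referenced entities in one request, and skip it if nothing needs resolving.
  const fetched = idsToFetch.length > 0 ? await fetchEntities(idsToFetch) : [];
  const fetchedById = new Map(
    fetched.map((entity) => [entity.entityId, entity] as const),
  );

  // Preserve input order: full entities pass through, metadata is swapped for the fetched entity.
  return inputs.map((input) => {
    if (!isMetadata(input)) {
      return input;
    }
    const entity = fetchedById.get(input.entityId);
    if (!entity) {
      throw new Error(`Entity ${input.entityId} could not be resolved`);
    }
    return entity;
  });
};
```

Batching the lookups into a single fetch keeps the number of Graph requests independent of how many inputs arrive as metadata.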

Written by Cursor Bugbot for commit d4243f4. This will update automatically on new commits.

@github-actions bot added the area/apps > hash* (Affects HASH, a `hash-*` app), area/libs (Relates to first-party libraries/crates/packages), type/eng > frontend (Owned by the @frontend team), type/eng > backend (Owned by the @backend team), and area/apps labels Jan 15, 2026
@vercel

vercel bot commented Jan 15, 2026

Project         Deployment   Review             Updated (UTC)
ds-theme        Error                           Jan 15, 2026 4:37pm
hashdotdesign   Ready        Preview, Comment   Jan 15, 2026 4:37pm

@augmentcode

augmentcode bot commented Jan 15, 2026

🤖 Augment PR Summary

Summary: This PR reduces Temporal workflow/event history size by stopping Flow steps from passing full persisted entity payloads around, and passing only entity IDs (plus minimal metadata) instead.

Changes:

  • Replaced PersistedEntity/PersistedEntities with PersistedEntityMetadata/PersistedEntitiesMetadata (entityId + operation) across flow types and action definitions.
  • Updated AI worker flow activities (persistEntity, persistEntities, inferMetadataFromDocument, getFileFromUrl, etc.) to emit/log persisted-entity metadata rather than full serialized entities.
  • Updated integration worker persistence activity to return persisted entity IDs/operations while keeping failure information in failedEntityProposals.
  • Changed mapActionInputEntitiesToEntities to be async and resolve inputs by querying the Graph using the provided actorId + graphApiClient.
  • Refactored the Flow Visualizer UI to fetch persisted entities from the backend by ID/UUID (GraphQL subgraph query) instead of relying on entity snapshots embedded in run outputs/logs.
  • Adjusted Activity Log and Outputs rendering to derive entity labels/deliverables from fetched entities, with additional fallback fetching for entities mentioned in logs.
  • Updated browser plugin minimal flow run storage types to store persisted entity metadata rather than full entities.

Technical Notes: The UI now uses UUID-based filters with current-time temporal axes to load the latest entity versions, which trades exact “at-the-time” entity snapshots for smaller workflow payloads and improved Temporal scalability.
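A UUID-based filter of the kind mentioned above might be built along these lines. This is a sketch under two assumptions that should be checked against the codebase: that entity ids take the form `webId~entityUuid`, and that the query accepts a filter shaped as `{ any: [{ equal: [{ path: ["uuid"] }, { parameter }] }] }`. The names `extractUuid` and `buildUuidFilter` are hypothetical, and the temporal axes are not shown.

```typescript
type EntityId = string; // assumed format: "<webId>~<entityUuid>"

// Assumed filter shape; verify against the Graph API before relying on it.
type UuidFilter = {
  any: { equal: [{ path: ["uuid"] }, { parameter: string }] }[];
};

const extractUuid = (entityId: EntityId): string => {
  const [, uuid] = entityId.split("~");
  if (!uuid) {
    throw new Error(`Unexpected entity id format: ${entityId}`);
  }
  return uuid;
};

const buildUuidFilter = (entityIds: EntityId[]): UuidFilter => {
  // De-duplicate so an entity referenced by several steps is only requested once.
  const uniqueUuids = Array.from(new Set(entityIds.map(extractUuid)));

  return {
    any: uniqueUuids.map((uuid) => ({
      equal: [{ path: ["uuid"] }, { parameter: uuid }],
    })),
  };
};
```

Because the filter matches on UUID and the query is pinned to the current time, it returns the latest version of each entity rather than the edition written by the Flow step.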

@augmentcode augmentcode bot left a comment

Review completed. 2 suggestions posted.

const operation = metadataByEntityId.get(entity.entityId);

if (!operation) {
  throw new Error(`Operation not found for entity ${entity.entityId}`);

Throwing here can crash the Flow Visualizer UI in transient states (e.g., when Apollo serves previousData that doesn’t correspond to the current persistedEntitiesMetadata, or if some entities fail to load). Consider making this case non-fatal so the UI can still render partial results.
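One way to make this case non-fatal is to collect the entities whose operation is missing and return them alongside the resolved ones, leaving it to the caller to log or display them. The sketch below assumes simplified types; `attachOperations` and its return shape are hypothetical names, not code from the PR.

```typescript
// Hypothetical shapes for illustration only.
type EntityId = string;
type Operation = "create" | "update"; // assumed set of operations

interface Entity {
  entityId: EntityId;
}

interface EntityWithOperation {
  entity: Entity;
  operation: Operation;
}

const attachOperations = (
  entities: Entity[],
  metadataByEntityId: Map<EntityId, Operation>,
): { resolved: EntityWithOperation[]; missing: EntityId[] } => {
  const resolved: EntityWithOperation[] = [];
  const missing: EntityId[] = [];

  for (const entity of entities) {
    const operation = metadataByEntityId.get(entity.entityId);

    if (!operation) {
      // Stale or partially loaded data: record the gap instead of throwing.
      missing.push(entity.entityId);
      continue;
    }

    resolved.push({ entity, operation });
  }

  return { resolved, missing };
};
```

The UI can then render `resolved` immediately and treat a non-empty `missing` list as a cue to refetch or show a placeholder.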

@codecov

codecov bot commented Jan 15, 2026

Codecov Report

❌ Patch coverage is 0% with 27 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.25%. Comparing base (86f263f) to head (d4243f4).
⚠️ Report is 1 commit behind head on main.

Files with missing lines                                Patch %   Lines
...es/shared/map-action-input-entities-to-entities.ts   0.00%     15 Missing ⚠️
...ivities/flow-activities/persist-entities-action.ts   0.00%     8 Missing ⚠️
...tivities/flow-activities/answer-question-action.ts   0.00%     1 Missing ⚠️
...-activities/infer-metadata-from-document-action.ts   0.00%     1 Missing ⚠️
...ctivities/flow-activities/persist-entity-action.ts   0.00%     1 Missing ⚠️
...ities/flow-activities/write-google-sheet-action.ts   0.00%     1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8259      +/-   ##
==========================================
- Coverage   59.26%   59.25%   -0.02%     
==========================================
  Files        1191     1191              
  Lines      113436   113389      -47     
  Branches     4982     4977       -5     
==========================================
- Hits        67232    67187      -45     
+ Misses      45428    45426       -2     
  Partials      776      776              
Flag                          Coverage Δ
apps.hash-ai-worker-ts        1.40% <0.00%> (+<0.01%) ⬆️
apps.hash-api                 0.00% <ø> (ø)
local.hash-isomorphic-utils   0.00% <ø> (ø)


? new HashEntity(output.value.entity)
: undefined,
};
persistedEntitiesByLocalId[unresolvedEntity.localEntityId] = output.value;

Property name mismatch when spreading into FailedEntityProposal

Low Severity

When handling a non-OK status from persistEntityAction, the code spreads output.value (a PersistedEntityMetadata) into the FailedEntityProposal object. However, PersistedEntityMetadata has property entityId while FailedEntityProposal expects existingEntityId. The spread adds an entityId property that doesn't match the expected type, and existingEntityId remains undefined.
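A possible fix is to map the fields explicitly rather than spreading, as in the sketch below. The type shapes here are simplified assumptions based on the field names in the comment above, and `toFailedProposal` is a hypothetical helper. Note that TypeScript's excess property checks do not apply to properties introduced by an object spread, which is one reason a mismatch like this can pass type checking.

```typescript
// Simplified, assumed shapes based on the field names mentioned in the review comment.
type EntityId = string;
type Operation = "create" | "update"; // assumed set of operations

interface PersistedEntityMetadata {
  entityId: EntityId;
  operation: Operation;
}

interface FailedEntityProposal {
  existingEntityId?: EntityId;
  operation?: Operation;
  message: string;
}

// Map each field by name so `entityId` lands in `existingEntityId`
// instead of being copied across under the wrong key by a spread.
const toFailedProposal = (
  output: PersistedEntityMetadata,
  message: string,
): FailedEntityProposal => ({
  existingEntityId: output.entityId,
  operation: output.operation,
  message,
});
```

Listing the fields also means a future rename on either type surfaces as a compile error at this call site.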

  recordedAt: new Date().toISOString(),
  stepId: Context.current().info.activityId,
- type: "PersistedEntity",
+ type: "PersistedEntityMetadata",

Action output type mismatch in getFileFromUrl implementation

High Severity

The action definition for getFileFromUrl was updated to expect payloadKind: "PersistedEntityMetadata" for the fileEntity output, but the implementation still returns kind: "Entity" with the full serialized entity. Consumers of this action expecting PersistedEntityMetadata (with entityId and operation fields) will receive an incompatible full entity object instead.
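To bring the implementation in line with the definition, the action would return a payload whose `kind` and `value` both match `PersistedEntityMetadata`. The sketch below shows one way to have the compiler tie the two together. The `PayloadByKind` map, `StepOutput` type, and `buildFileEntityOutput` helper are hypothetical; only the `fileEntity` output name and the two payload kinds come from the comment above.

```typescript
// Hypothetical shapes for illustration only.
type EntityId = string;
type Operation = "create" | "update"; // assumed set of operations

interface FullEntity {
  entityId: EntityId;
  properties: Record<string, unknown>;
}

interface PersistedEntityMetadata {
  entityId: EntityId;
  operation: Operation;
}

// Maps each payload kind to the value type it must carry.
interface PayloadByKind {
  Entity: FullEntity;
  PersistedEntityMetadata: PersistedEntityMetadata;
}

interface StepOutput<K extends keyof PayloadByKind> {
  outputName: string;
  payload: { kind: K; value: PayloadByKind[K] };
}

// The return type pins the kind, so returning a full entity here would not compile.
const buildFileEntityOutput = (
  entity: FullEntity,
  operation: Operation,
): StepOutput<"PersistedEntityMetadata"> => ({
  outputName: "fileEntity",
  payload: {
    kind: "PersistedEntityMetadata",
    value: { entityId: entity.entityId, operation },
  },
});
```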

@graphite-app graphite-app bot requested review from a team January 15, 2026 16:54

2 participants