@tgummerer
Contributor

Add blog post for journaling (pulumi/pulumi#13502)

Related issues (optional)

pulumi/pulumi#13502

@claude
Contributor

claude bot commented Dec 1, 2025

Critical Issues:

  1. INCOMPLETE TODOs - Line 52 "up to ??(TODO)x", Line 67 "TODO: rerun test", Lines 73-75 missing benchmark data - must complete before publication

  2. YAML FORMATTING ERROR (lines 26-28) - Tags section contains TABS instead of spaces. This will cause parsing errors. Must replace tabs with spaces for: journaling, performance, data-integrity

  3. META DESCRIPTION TOO SHORT (line 11) - Only 48 characters, should be 120-160 for SEO. Suggest: "Learn how Pulumi's new journaling feature speeds up deployments by up to 10x for large stacks while maintaining data integrity guarantees throughout updates"

  4. INCONSISTENT PERFORMANCE CLAIMS - Title says "up to 10x", line 52 has TODO placeholder, line 44 claims "up to 10x". Need consistent actual numbers
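
Issues 2 and 3 above are mechanical, so they could be caught before review. The following is a minimal sketch of such a check, not part of the docs tooling: the function name `lintFrontMatter` is hypothetical, the `meta_desc` key name is an assumption about the front matter, and the 120-160 range is simply the one quoted in item 3.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// lintFrontMatter reports tab-indented lines and an out-of-range description.
// Assumptions: the description lives under a top-level "meta_desc:" key, and
// 120-160 characters is the target range quoted in the review above.
func lintFrontMatter(frontMatter string) []string {
	var problems []string
	for i, line := range strings.Split(frontMatter, "\n") {
		// The indentation is whatever TrimLeft strips from the front of the line.
		indent := line[:len(line)-len(strings.TrimLeft(line, " \t"))]
		if strings.Contains(indent, "\t") {
			problems = append(problems, fmt.Sprintf("line %d: tab used for indentation", i+1))
		}
		if strings.HasPrefix(line, "meta_desc:") {
			desc := strings.Trim(strings.TrimSpace(strings.TrimPrefix(line, "meta_desc:")), `"`)
			if n := utf8.RuneCountInString(desc); n < 120 || n > 160 {
				problems = append(problems, fmt.Sprintf("line %d: meta_desc is %d characters, want 120-160", i+1, n))
			}
		}
	}
	return problems
}

func main() {
	// Sample input with both problems: a short description and a tab-indented tag.
	sample := "meta_desc: Faster deployments with journaling\ntags:\n\t- journaling\n"
	for _, p := range lintFrontMatter(sample) {
		fmt.Println(p)
	}
}
```

Line numbers in the report are relative to the front-matter string passed in, not to the whole Markdown file.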

Style and Grammar Issues:

  • Line 52: "benchmarks end we picked" -> "benchmarks we picked"
  • Line 52: "and more so at every step" -> "and also at every step"
  • Line 78: duplicate "please please"
  • Line 89: avoid abbreviation "iow" - spell out "in other words"
  • Line 121: heading with question mark seems uncertain
  • Line 131: garbled idiom "eat our cake and eat it" -> "have our cake and eat it too" or rephrase
  • Lines 98, 147: code blocks missing "go" language identifier

Technical:

Minor: verify file ends with newline, check "intermediate" vs "intermittent" on line 124

Overall well-written technical post but CANNOT PUBLISH until TODOs completed and YAML tabs fixed. Mention @claude for re-review after fixes.

@julienp (Contributor) left a comment

didn't read all of it yet, couple nits

| | Time | Bytes sent |
|--------------------|--------|------------|
| Without journaling | 58m26s | 16.5MB |

@kramhuber (Contributor) commented Dec 5, 2025

Nit: if you have time, this table would be very powerful as a grouped bar chart.

Edit: Created here, though we may want to add titles https://docs.google.com/spreadsheets/d/1L_CSI7_fkrJogGIHPeeteUJzSHQ3SZQx5RK75WB03Uc/edit?usp=sharing

# for details, and please remove these comments before submitting for review.
---

Pulumi saves a snapshot of the current state of your cloud infrastructure at every deployment, and also at every step of the deployment. This means that Pulumi always has a current view of the state, even if there is an issue during an operation. However, this comes with a performance penalty, especially for large stacks. Today we're introducing an improvement that can speed up deployments by up to 10x. Read on for benchmarks and some technical details of the implementation.

Contributor

> at every deployment

Is deployment the term we want to use? Operation? Update?

Contributor Author

Yeah maybe operation is easier to understand? Happy to switch it around everywhere.


If you are interested in the more technical details, read on!

## Introduction to snapshotting

Contributor

I would make this the first section, before benchmarks; otherwise, the user doesn't know what's being benchmarked


<!--more-->

## Benchmarks

Contributor

I get you're trying to "get to the good stuff", but consider moving this to after "why is it slow".

Contributor Author

Yeah, I wanted to start off with "why do I even care", demonstrate that, and give just enough information to understand what is being benchmarked, rather than going into the technical details too early (I think they are cool and make for a well-rounded blog post, but I didn't want to bury the lede).

- We implemented the replay interface inside the Pulumi CLI, and ran it in parallel with the current snapshotting implementation in our tests. The snapshots were then compared automatically, and tests were made to fail when the results didn't match.
- Since tests can't cover all possible edge cases, the next step was to run the journaler in parallel with the current snapshotting implementation internally. This was still without sending the results to the service. However, we compared the snapshots and sent an error event to the service if they didn't match. In our data warehouse we could then inspect any mismatches and fix them. Since this does involve the service in a minor way, we only did this if the user was using the Cloud backend.
- Next up was adding a feature flag for the service, so journaling could be turned on selectively for some orgs. At the same time we implemented an opt-in environment variable in the CLI (`PULUMI_ENABLE_JOURNALING`), so the feature is turned on only if both the feature flag is enabled and the user sets the environment variable. This way we could slowly start enabling this in our repos, e.g. first in the integration tests for `pulumi/pulumi`, then in the tests for `pulumi/examples` and `pulumi/templates`, etc.
- Allow users to start opting in. If you want to opt in with your org, please reach out to us, either on the [Community Slack](https://slack.pulumi.com/) or through our [Support channels](https://support.pulumi.com/hc/en-us), and we'll opt your org into the feature flag. Then you can begin seeing the performance improvements by setting the `PULUMI_ENABLE_JOURNALING` environment variable to `true`.
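
The first two steps in the list above rest on one idea: replay the journal on top of a base snapshot and check that the result equals the snapshot produced by the existing path. The sketch below illustrates that idea only. The post excerpt does not show the real journal entry format or replay interface, so `Resource`, `JournalEntry`, `Snapshot`, `Replay`, and `Verify` are all hypothetical, simplified stand-ins.

```go
package main

import (
	"fmt"
	"reflect"
)

// Resource is a simplified stand-in for one resource's state.
type Resource struct {
	URN     string
	Outputs map[string]string
}

// JournalEntry is a hypothetical entry: it either upserts or deletes a resource.
type JournalEntry struct {
	Delete   bool
	Resource Resource
}

// Snapshot maps URN to resource state (a real snapshot carries much more).
type Snapshot map[string]Resource

// Replay applies the journal, in order, to a copy of the base snapshot.
func Replay(base Snapshot, journal []JournalEntry) Snapshot {
	out := make(Snapshot, len(base))
	for urn, res := range base {
		out[urn] = res
	}
	for _, entry := range journal {
		if entry.Delete {
			delete(out, entry.Resource.URN)
			continue
		}
		out[entry.Resource.URN] = entry.Resource
	}
	return out
}

// Verify compares the replayed snapshot with the one from the existing path.
func Verify(replayed, reference Snapshot) error {
	if !reflect.DeepEqual(replayed, reference) {
		return fmt.Errorf("journal replay mismatch: %d resources replayed, %d in reference",
			len(replayed), len(reference))
	}
	return nil
}

func main() {
	base := Snapshot{"urn:a": {URN: "urn:a"}}
	journal := []JournalEntry{
		{Resource: Resource{URN: "urn:b", Outputs: map[string]string{"id": "b-1"}}},
		{Delete: true, Resource: Resource{URN: "urn:a"}},
	}
	// reference plays the role of the snapshot written by the current implementation.
	reference := Snapshot{"urn:b": {URN: "urn:b", Outputs: map[string]string{"id": "b-1"}}}
	if err := Verify(Replay(base, journal), reference); err != nil {
		fmt.Println("mismatch:", err)
		return
	}
	fmt.Println("snapshots match")
}
```

In a test run, a non-nil error from `Verify` would fail the test; in the internal rollout described in the second step, it would instead be reported as an error event. Note that `reflect.DeepEqual` treats a nil map and an empty map as different, so a real comparison would likely need to normalize such cases first.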

Contributor

This is the action we want a user to take so let's consider how we can hoist it out of the list so it's more apparent, such as "what we've done so far" and "what we're doing now"

Contributor Author

Good point, let me see if I can make this more apparent.

Contributor Author

Hmm looking at it again, I did try to highlight this in the benchmark section, so people can see it early 🤔 do you think that's not enough? Any other ideas how to make this more visible?

Comment on lines 48 to 49
# See the blogging docs at https://github.com/pulumi/docs/blob/master/BLOGGING.md
# for details, and please remove these comments before submitting for review.

Contributor

Suggested change
# See the blogging docs at https://github.com/pulumi/docs/blob/master/BLOGGING.md
# for details, and please remove these comments before submitting for review.

# https://twitter.com/PulumiCorp/status/1755637618631405655

social:
  twitter: Speeding up your Pulumi deployments by 10x

Contributor

Pulumi deployments just got up to 10x faster:

  • Journaling: Send only changes, not full snapshots
  • Data integrity: No compromise on reliability
  • Network traffic cut by 85%+ on large stacks

Try it: [link]

Contributor Author

Nice, thanks!


social:
  twitter: Speeding up your Pulumi deployments by 10x
  linkedin: Want faster Pulumi deployments on big stacks? Read on how we improved the performance of Pulumi deployments with lots of resources by up to 10x, and how you can get access to it today.

Contributor

Pulumi Deployments Get Up to 10x Faster

Large Pulumi stacks just got a major performance boost. Here's what changed.

The Problem
Pulumi saves snapshots at every deployment step to maintain data integrity. For large stacks, this creates a bottleneck: uploading the full snapshot serially slows everything down.

The Solution: Journaling
Instead of sending the whole snapshot, journaling sends only individual changes. These journal entries can go in parallel, and Pulumi Cloud reconstructs the full snapshot on the backend.

The Results
In benchmarks on a 3,000+ resource stack:

  • Time dropped from 58 minutes to 3 minutes
  • Network traffic cut from 16.5MB to 2.3MB

Why This Matters
You get the speed without sacrificing data integrity. Unlike SKIP_CHECKPOINTS, journaling still tracks all in-flight operations. If something fails mid-deployment, Pulumi still knows exactly what happened.

Get Started
This feature is in opt-in testing. Reach out on Pulumi Community Slack or through Support to get your org enrolled. Then set PULUMI_ENABLE_JOURNALING=true.

Read more: [link]

Contributor Author

Nice!
