command/meta: Fix PSS migration during provider upgrade #38244

Draft
radeksimko wants to merge 2 commits into main from radek/b-fix-pss-provider-upgrade

@radeksimko (Member)

Fixes #

Target Release

1.15.x

Rollback Plan

  • If a change needs to be reverted, we will roll out an update to the code within 7 days.

Changes to Security Controls

Are there any changes to security controls (access controls, encryption, logging) in this pull request? If so, explain.

CHANGELOG entry

  • This change is user-facing and I added a changelog entry.
  • This change is not user-facing.

@radeksimko added the no-changelog-needed label on Mar 6, 2026
@radeksimko force-pushed the radek/b-fix-pss-provider-upgrade branch from dcecb82 to 447450d on March 6, 2026 14:59
@SarahFrench (Member) left a comment

I've taken a look and left some comments. I'm going to ask internally for advice about how to handle the provider cache issue.

Also, while debugging I found some fixes needed for diagnostics: #38275. It could be worth cherry-picking that change into your branch, or rebasing once it's merged.

Comment on lines -3385 to +3389
-	factories, err := m.ProviderFactories()
+	factories, err := m.providerFactoriesFromLocks(locks)

Should this be using ProviderFactoriesFromLocks instead?

// running provider instance inside the returned backend.Backend instance.
// Stopping the provider process is the responsibility of the calling code.

resp := provider.GetProviderSchema()

@SarahFrench commented on Mar 13, 2026

Through some debugging I've found that we're falling foul of the caching Core performs when accessing a provider's schemas. The cache is keyed on the provider Addr, which is identical for both versions of the same provider. We always use the newer version of the provider first, so its schema is already in the cache when savedStateStore is invoked. That will cause errors when processing config later in this method if the schema (of the provider or the store) changed between the new and old provider versions.
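To make the failure mode concrete, here is a minimal sketch of a cache keyed on provider address alone. It is not Core's actual cache: `addrCache`, `Schema`, `getOrLoad`, and the plain string address are hypothetical stand-ins used only to illustrate the collision.

```go
package main

import (
	"fmt"
	"sync"
)

// Schema is a hypothetical stand-in for a provider schema. It only records
// which provider version produced it.
type Schema struct {
	FromVersion string
}

// addrCache is a simplified, hypothetical cache keyed on provider address only.
type addrCache struct {
	mu sync.Mutex
	m  map[string]Schema
}

// getOrLoad returns the cached schema for addr and calls load only on a miss.
func (c *addrCache) getOrLoad(addr string, load func() Schema) Schema {
	c.mu.Lock()
	defer c.mu.Unlock()
	if s, ok := c.m[addr]; ok {
		return s
	}
	s := load()
	c.m[addr] = s
	return s
}

func main() {
	cache := &addrCache{m: map[string]Schema{}}
	addr := "registry.terraform.io/hashicorp/example" // hypothetical address

	// The newer provider version is used first, so it populates the cache.
	newer := cache.getOrLoad(addr, func() Schema { return Schema{FromVersion: "2.0.0"} })

	// The older version has the same address, so its loader never runs and
	// it receives the newer version's schema.
	older := cache.getOrLoad(addr, func() Schema { return Schema{FromVersion: "1.0.0"} })

	fmt.Println(newer.FromVersion, older.FromVersion)
}
```

In this sketch both lookups yield the schema loaded for 2.0.0, which mirrors what happens when the older provider is consulted after the newer one.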

I guess our options are to either update the cache to allow caching a schema per version:

type schemaCache struct {
	mu sync.Mutex
-	m  map[addrs.Provider]ProviderSchema
+	m  map[addrs.Provider]map[versions.Version]ProviderSchema
}
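As a rough sketch of how the nested map could be read and written, assuming string keys in place of `addrs.Provider` and `versions.Version` so the example stays self-contained; `versionedSchemaCache`, `Set`, `Get`, and the `Marker` field are hypothetical names, not the existing API:

```go
package main

import (
	"fmt"
	"sync"
)

// ProviderSchema is a hypothetical stand-in for the real schema type.
type ProviderSchema struct {
	Marker string
}

// versionedSchemaCache keeps one schema per provider address and version.
// Strings replace addrs.Provider and versions.Version for illustration only.
type versionedSchemaCache struct {
	mu sync.Mutex
	m  map[string]map[string]ProviderSchema
}

// Set stores the schema for one version of a provider, creating the outer
// and inner maps on first use.
func (c *versionedSchemaCache) Set(addr, version string, s ProviderSchema) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.m == nil {
		c.m = make(map[string]map[string]ProviderSchema)
	}
	if c.m[addr] == nil {
		c.m[addr] = make(map[string]ProviderSchema)
	}
	c.m[addr][version] = s
}

// Get returns the schema cached for exactly this address and version.
// Reading from a nil map is safe in Go and simply reports a miss.
func (c *versionedSchemaCache) Get(addr, version string) (ProviderSchema, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	s, ok := c.m[addr][version]
	return s, ok
}

func main() {
	cache := &versionedSchemaCache{}
	cache.Set("example", "2.0.0", ProviderSchema{Marker: "new"})
	cache.Set("example", "1.0.0", ProviderSchema{Marker: "old"})

	old, _ := cache.Get("example", "1.0.0")
	fmt.Println(old.Marker)
}
```

The cost of this shape is that every caller needs to know the provider version at lookup time, which the current address-only key does not require.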

Or we allow calling code to skip the cache. This would require updating the providers.Interface interface to pass in an argument that indicates we want to skip the cache:

type Interface interface {
	// GetSchema returns the complete schema for the provider.
-	GetProviderSchema() GetProviderSchemaResponse
+	GetProviderSchema(GetProviderSchemaRequest) GetProviderSchemaResponse

and

type GetProviderSchemaRequest struct {
    SkipCache bool
}

I don't think this needs to touch the protocol itself, though, as the cache is specific to Core (versus this protocol for rpcapi where the cache is external?).
However, I think we aim to make the providers.Interface interface match the plugin protocol, so updating the interface feels like a misstep (or at least something to discuss within the team).
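For discussion, a minimal sketch of how a `SkipCache` flag might behave on the Core side. Only `GetProviderSchemaRequest` and its `SkipCache` field come from the proposal above; `cachedProvider`, its fields, the `Version` field on the response, and the shared map are hypothetical, and locking is omitted for brevity.

```go
package main

import "fmt"

// GetProviderSchemaRequest follows the shape proposed above.
type GetProviderSchemaRequest struct {
	SkipCache bool
}

// GetProviderSchemaResponse is a hypothetical stand-in that only records
// which provider version answered.
type GetProviderSchemaResponse struct {
	Version string
}

// cachedProvider is a hypothetical wrapper around a running provider.
// fetch stands in for the real schema call; cache is shared between
// instances and keyed on address only, like the cache described above.
type cachedProvider struct {
	addr  string
	fetch func() GetProviderSchemaResponse
	cache map[string]GetProviderSchemaResponse
}

// GetProviderSchema consults the shared cache unless the caller opts out.
// With SkipCache set, it neither reads from nor writes to the cache.
func (p *cachedProvider) GetProviderSchema(req GetProviderSchemaRequest) GetProviderSchemaResponse {
	if !req.SkipCache {
		if resp, ok := p.cache[p.addr]; ok {
			return resp
		}
	}
	resp := p.fetch()
	if !req.SkipCache {
		p.cache[p.addr] = resp
	}
	return resp
}

func main() {
	shared := map[string]GetProviderSchemaResponse{}
	newer := &cachedProvider{addr: "example", cache: shared,
		fetch: func() GetProviderSchemaResponse { return GetProviderSchemaResponse{Version: "2.0.0"} }}
	older := &cachedProvider{addr: "example", cache: shared,
		fetch: func() GetProviderSchemaResponse { return GetProviderSchemaResponse{Version: "1.0.0"} }}

	newer.GetProviderSchema(GetProviderSchemaRequest{}) // populates the cache

	stale := older.GetProviderSchema(GetProviderSchemaRequest{})
	fresh := older.GetProviderSchema(GetProviderSchemaRequest{SkipCache: true})
	fmt.Println(stale.Version, fresh.Version)
}
```

One design choice in this sketch is that a skipped lookup also leaves the cache untouched, so the older provider's schema never overwrites the newer one for later callers.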

Labels

no-changelog-needed Add this to your PR if the change does not require a changelog entry

2 participants