Tentacle can detect PowerShell scripts that don't start, and can abandon them. #1200

Open
LukeButters wants to merge 28 commits into main from luke/powershell-start-fail

@LukeButters (Contributor) commented Mar 13, 2026

Background

When powershell.exe is invoked to run a script, it can sometimes start the process but never begin executing the script body. In that state the process also cannot be terminated. This results in hung deployments, where we may wait forever for a script that will never complete.

Refs: EFT-365
Refs: EFT-3145
Refs: HPY-1295, HPY-1296

Results

This PR introduces a startup detection mechanism that lets Tentacle detect and report when PowerShell never executes the script content.

Scripts that opt in include a special marker comment:

# OCTOPUS-POWERSHELL-STARTUP-DETECTION

When Tentacle bootstraps the script, it replaces this marker with generated detection code that:

  1. Attempts to exclusively create a .octopus_powershell_started sentinel file
  2. Checks for a .octopus_powershell_should_run file (written by Tentacle before execution)
  3. Exits early if either check fails
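
The three steps above can be sketched as follows. This is an illustration in Python only; the code generated by this PR is PowerShell and is not reproduced here. The file names come from the description, while the function name `try_claim_start` is hypothetical.

```python
import os
import tempfile

# File names as given in the PR description.
STARTED_FILE = ".octopus_powershell_started"
SHOULD_RUN_FILE = ".octopus_powershell_should_run"


def try_claim_start(workspace: str) -> bool:
    """Return True only if both startup checks pass (hypothetical helper)."""
    started_path = os.path.join(workspace, STARTED_FILE)
    try:
        # Step 1: O_CREAT | O_EXCL fails if the file already exists,
        # so at most one caller can ever create the sentinel.
        fd = os.open(started_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.close(fd)
    except FileExistsError:
        return False  # Step 3: exit early, someone else already claimed the start
    # Step 2: the should_run file must have been written before execution.
    return os.path.exists(os.path.join(workspace, SHOULD_RUN_FILE))


with tempfile.TemporaryDirectory() as workspace:
    # Simulate Tentacle writing the should_run file before launching the script.
    open(os.path.join(workspace, SHOULD_RUN_FILE), "w").close()
    first_attempt = try_claim_start(workspace)
    second_attempt = try_claim_start(workspace)

with tempfile.TemporaryDirectory() as workspace:
    # No should_run file: the sentinel is created but the check still fails.
    without_should_run = try_claim_start(workspace)
```

The exclusive create is what makes the sentinel usable as a one-shot claim: a second attempt in the same workspace fails instead of silently succeeding.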

Concurrently, RunningScript runs a monitoring task alongside the script execution task. If the started file isn't created within the timeout window (default: 5 minutes), the monitor concludes that PowerShell never executed the script and returns exit code -47 (PowerShellNeverStartedExitCode). Tentacle also ensures that the PowerShell script body is never executed in that case.
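
A rough sketch of racing a monitor against the script, in Python with asyncio rather than the C# used in RunningScript. The exit code and file name come from the description; the function names are hypothetical. How Tentacle actually prevents a late start is not spelled out above, so the "monitor claims the sentinel on timeout" step is an assumption.

```python
import asyncio
import os
import tempfile

POWERSHELL_NEVER_STARTED_EXIT_CODE = -47  # value from the PR description
STARTED_FILE = ".octopus_powershell_started"


async def monitor_startup(workspace: str, timeout_seconds: float,
                          poll_seconds: float = 0.02) -> bool:
    """Poll for the started file; True if it appears before the timeout."""
    started_path = os.path.join(workspace, STARTED_FILE)
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout_seconds
    while loop.time() < deadline:
        if os.path.exists(started_path):
            return True
        await asyncio.sleep(poll_seconds)
    # ASSUMPTION: on timeout the monitor claims the sentinel itself, so a
    # script that wakes up later fails its exclusive create and exits early.
    try:
        os.close(os.open(started_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY))
        return False
    except FileExistsError:
        return True  # the script created the file at the last moment


async def run_with_startup_monitor(run_script, workspace: str,
                                   timeout_seconds: float) -> int:
    script_task = asyncio.ensure_future(run_script())
    monitor_task = asyncio.ensure_future(monitor_startup(workspace, timeout_seconds))
    await asyncio.wait({script_task, monitor_task},
                       return_when=asyncio.FIRST_COMPLETED)
    if script_task.done():
        monitor_task.cancel()
        return script_task.result()
    if monitor_task.result():
        return await script_task  # script started; wait for it normally
    script_task.cancel()  # abandon the script that never started
    return POWERSHELL_NEVER_STARTED_EXIT_CODE


async def hung_script() -> int:
    await asyncio.sleep(3600)  # simulates a process that never runs the body
    return 0


def healthy_script(workspace: str):
    async def run() -> int:
        open(os.path.join(workspace, STARTED_FILE), "x").close()
        return 0
    return run


with tempfile.TemporaryDirectory() as workspace:
    hung_exit_code = asyncio.run(
        run_with_startup_monitor(hung_script, workspace, timeout_seconds=0.1))

with tempfile.TemporaryDirectory() as workspace:
    healthy_exit_code = asyncio.run(
        run_with_startup_monitor(healthy_script(workspace), workspace, timeout_seconds=1.0))
```

Claiming the sentinel with the same exclusive create that the script uses means exactly one side wins, which avoids a window where the monitor reports "never started" while the script body begins to run.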

Notes

  • Detection is opt-in — scripts without the marker comment are unaffected
  • The startup timeout is configurable via the RunningScript constructor (tests use a shorter value)
  • Currently scoped to Windows only (powershell.exe); pwsh support on Linux/Mac is possible as a follow-up
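
The opt-in behaviour can be illustrated with a small Python sketch. The marker string is the one quoted in the description; `inject_detection` and the placeholder detection code are hypothetical and stand in for whatever ScriptWorkspace does during bootstrap.

```python
from typing import Tuple

# Marker comment as quoted in the PR description.
MARKER = "# OCTOPUS-POWERSHELL-STARTUP-DETECTION"


def inject_detection(script_body: str, detection_code: str) -> Tuple[str, bool]:
    """Replace the marker with detection code; leave other scripts untouched."""
    if MARKER not in script_body:
        return script_body, False  # script did not opt in
    return script_body.replace(MARKER, detection_code, 1), True


# Placeholder only; the real generated code is not shown in the description.
placeholder_detection = "# <generated startup detection code>"

opted_in, was_injected = inject_detection(
    MARKER + "\nWrite-Host 'Hello'", placeholder_detection)
untouched, was_injected_plain = inject_detection(
    "Write-Host 'Hello'", placeholder_detection)
```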

Fixes https://github.com/OctopusDeploy/OctopusTentacle/issues/... (optional public issue)

When the Octopus feature toggle is enabled

Message written to the server task log when PowerShell startup detection is enabled:

12:46:27   Verbose  |       Executable name or full path: C:\WINDOWS\system32\WindowsPowershell\v1.0\PowerShell.exe
12:46:27   Verbose  |       Starting C:\WINDOWS\system32\WindowsPowershell\v1.0\PowerShell.exe in working directory 'C:\Octopus\Tentacle-002\Work\08de894f-87c0-33c4-b71f-356080ed09bf' using 'OEM United States' encoding running as 'DOMAIN\user'
12:46:27   Info     |       PowerShell startup detection: Checks passed, continuing script execution
...
12:46:27   Verbose  |       Executing 'C:\Octopus\Tentacle-002\Work\08de894f-87c0-33c4-b71f-356080ed09bf\Script.ps1'
...
12:46:27   Verbose  |       Invoking target script 'C:\Octopus\Tentacle-002\Work\08de894f-87c0-33c4-b71f-356080ed09bf\Script.ps1'.
12:46:27   Info     |       Hello

Warning written to the Tentacle logs when PowerShell takes longer than the configured timeout to start:

2026-03-24 12:46:05.8801   3172      1  INFO  WorkspaceCleanerTask.Start(): Starting
...
2026-03-24 12:47:46.8117   3172      5  WARN  PowerShell startup detection: PowerShell did not start within <n> minutes for task T4neCG
...
2026-03-24 12:47:59.6891   3172      1  INFO  WorkspaceCleanerTask.Stop(): Stopping
2026-03-24 12:47:59.6891   3172      1  INFO  WorkspaceCleanerTask.Stop(): Stopped
2026-03-24 12:47:59.6891   3172      1  INFO  WorkspaceCleanerTask.Dispose(): Disposing

How to review this PR

  • PowerShellStartupDetection.cs (new) — Generates and injects the detection code; manages sentinel file paths
  • PowerShellStartupStatus.cs (new) — Enum for monitoring outcomes: NotMonitored, Started, NeverStarted
  • RunningScript.cs — Made Execute async; added concurrent startup monitoring task
  • ScriptWorkspace.cs — Injects detection code during bootstrap when the marker comment is present; creates the should_run file
  • ScriptExitCodes.cs — Added PowerShellNeverStartedExitCode = -47
  • StartScriptCommandV2.cs / ExecuteShellScriptCommand.cs — Added DurationToWaitForPowerShellToStartup parameter to allow configurable timeout
  • PowerShellStartupDetectionTests.cs (new) — Integration tests covering successful detection, failure detection, monitoring timeout, and scripts without the marker

Quality ✔️

Pre-requisites

  • I have read How we use GitHub Issues for help deciding when and where it's appropriate to make an issue.
  • I have considered informing or consulting the right people, according to the ownership map.
  • I have considered appropriate testing for my change.

@LukeButters requested a review from hnrkndrssn on March 13, 2026 01:30
@hnrkndrssn changed the title from "Add support for powershell start up failire detection" to "Add support for powershell start up failure detection" on Mar 31, 2026
@LukeButters requested a review from hnrkndrssn on April 1, 2026 02:07
@LukeButters marked this pull request as ready for review on April 1, 2026 02:09
@LukeButters requested a review from a team as a code owner on April 1, 2026 02:09
@LukeButters requested a review from Copilot on April 1, 2026 02:09

This comment was marked as outdated.

@LukeButters force-pushed the luke/powershell-start-fail branch from 5a599f2 to 5bb0dd6 on April 1, 2026 02:45
@LukeButters changed the title from "Add support for powershell start up failure detection" to "Tentacle can detect Powershell scripts that don't start, and can abandon them." on Apr 1, 2026
@hnrkndrssn (Contributor) left a comment

LGTM 👍

@rhysparry (Contributor) left a comment

Some initial comments

/// but silently stall before executing any script content. This was seen happening because
/// CrowdStrike prevented the script body from running.
///
/// When this happens, we get no output from the script AND the script is un-killable.

Contributor

Is this because it deviates from what we expect or can we literally not kill the powershell process?

Contributor Author

Tentacle is unable to kill the script using whatever standard kill mechanism dotnet provides.

We have taken dumps and CrowdStrike shows up in them, though I never saw the dumps myself.

I suspect PowerShell calls something that enters the kernel and hangs there. Since it is stuck in the kernel, the process can never be killed.

@rhysparry (Contributor) left a comment

I'm a little unclear on the cancellation behaviour.

readonly IOctopusFileSystem fileSystem;
readonly IHomeDirectoryProvider home;
readonly SensitiveValueMasker sensitiveValueMasker;
readonly bool useBashWorkspace;

Contributor

Should this be an enum to select a specific workspace type?

Contributor Author

yes

[IntegrationTestTimeout]
public class PowerShellStartupDetectionTests : IntegrationTest
{
static (ScriptServiceV2 service, ScriptWorkspaceFactory workspaceFactory, ScriptStateStoreFactory stateStoreFactory, TemporaryDirectory tempDir) CreateScriptService(

Contributor

Can we replace this with a nested class? It should implement IDisposable for the TemporaryDirectory.

Comment on lines +132 to +134
# Sleep for a long time to simulate PowerShell hanging before reaching detection code
Start-Sleep -Seconds 3600
# TENTACLE-POWERSHELL-STARTUP-DETECTION

Contributor

Does this highlight a risk if we don't inject the startup detection at the top of the script?

Contributor Author

Yes, it is a concern if the detection code doesn't come first. Perhaps the template variable name could reflect that, so that it is self-documenting?


directOutputText.Should().Contain("PowerShell startup detection", "The detection code should have run and reported why it exited");

// On Mac/Linux exit codes are unsigned 8-bit, so -47 wraps to 209

Contributor

Should we be checking for that value then?


directOutputText.Should().Contain("PowerShell startup detection", "The detection code should have run and reported why it exited");

// On Mac/Linux exit codes are unsigned 8-bit, so -47 wraps to 209

Contributor

As above, should we be checking that value?
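
For reference, the wrap mentioned in the test comment follows from keeping only the low 8 bits of the exit status. A small Python check of the arithmetic (the helper name is hypothetical, and this does not launch a real process):

```python
def wrap_to_unsigned_8_bit(exit_code: int) -> int:
    """Keep only the low 8 bits, as POSIX wait status reporting does."""
    return exit_code & 0xFF


# PowerShellNeverStartedExitCode from the PR description.
wrapped_exit_code = wrap_to_unsigned_8_bit(-47)
```

Here `wrapped_exit_code` is 209, because -47 + 256 = 209.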

4 participants