SAM deploy stalls on first run, works on second. #3811

Open
@MisterGlass

Description:

Several months ago we started observing failures during our service deployments. Our deploy script pulls a prebuilt container image and then runs the sam deploy command. On the first attempt to deploy to a given instance, the deployment will stall indefinitely (we have left it running for hours while debugging). When we rerun the deploy, everything is successful.

Steps to reproduce:

Run sam deploy. It stalls until you send it a SIGTERM. Run sam deploy again and it deploys the service without issue.
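For anyone trying to reproduce this in CI, the stall-then-retry sequence can be scripted roughly as follows. This is only a sketch under assumptions: the helper name run_with_retry, the timeout value, and the attempt count are hypothetical and not part of our deploy script; the only thing it does is terminate a stalled command with SIGTERM and run it again.

```python
import subprocess
import sys


def run_with_retry(cmd, timeout_s, attempts=2):
    """Run cmd; if it does not finish within timeout_s, send SIGTERM and retry.

    Returns (attempt_number, returncode) for the first attempt that finishes
    within the timeout, or (attempts, None) if every attempt stalled.
    Hypothetical helper for illustration only.
    """
    for attempt in range(1, attempts + 1):
        proc = subprocess.Popen(cmd)
        try:
            returncode = proc.wait(timeout=timeout_s)
            return attempt, returncode
        except subprocess.TimeoutExpired:
            proc.terminate()  # sends SIGTERM on POSIX systems
            proc.wait()       # reap the terminated process before retrying
            print(f"attempt {attempt} stalled past {timeout_s}s; terminated",
                  file=sys.stderr)
    return attempts, None


# Example call (assumed values; flags taken from the command in the log below):
# run_with_retry(
#     ["sam", "deploy", "--config-env", "nightly",
#      "--no-confirm-changeset", "--debug"],
#     timeout_s=1800,
# )
```

With the behaviour described in this issue, the call would be expected to come back with attempt number 2, since the first run stalls and the second completes.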

Observed result:

At first there is nothing returned. The following is dumped after the job is terminated:

Sending interrupt signal to process

	SAM CLI now collects telemetry to better understand customer needs.

	You can OPT OUT and disable telemetry collection by setting the
	environment variable SAM_CLI_TELEMETRY=0 in your shell.
	Thanks for your help!

	Learn More: https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-telemetry.html

2022-04-12 19:49:00,757 | Telemetry endpoint configured to be https://aws-serverless-tools-telemetry.us-west-2.amazonaws.com/metrics
2022-04-12 19:49:00,758 | Using config file: /home/ubuntu/workspace/MyApp Deploy Prod/apps/myapp/reporting-service/samconfig.toml, config environment: nightly
2022-04-12 19:49:00,758 | Expand command line arguments to:
2022-04-12 19:49:00,758 | --template_file=/home/ubuntu/workspace/MyApp Deploy Prod/apps/myapp/reporting-service/template.deploy.yaml --stack_name=pdfgen-nightly --image_repositories={'PdfGenFunction': 'REDACTED ECR URI'} --capabilities=('CAPABILITY_IAM',) --s3_bucket=myapp-reporting-service-deploy --s3_prefix=cmyapp-reporting-service

The push refers to repository [REDACTED ECR URI]
a49912bbd1b1: Preparing
# I cut out ~1000 lines of similar docker push output
c4bb7b34a062: Pushing [=============>                                     ]  134.1MB/514.5MB
a49912bbd1b1: Pushing [============================================>      ]  282.9MB/318.1MB
1916928847df: Pushing [==================================>                ]  72.94MB/104.6MB
713628b4c6d9: Pushing [==============>                                    ]  118.7MB/405.7MB
c4bb7b34a062: Pushing [=============>                                     ]  135.8MB/514.5MB
a49912bbd1b1: Pushing [============================================>      ]    284MB/318.1MB
713628b4c6d9: Pushing [==============>                                    ]  120.8MB/405.7MB
> sam deploy --template /home/ubuntu/workspace/MyApp Deploy Prod/apps/myapp/reporting-service/template.deploy.yaml --config-file /home/ubuntu/workspace/MyApp Deploy Prod/apps/myapp/reporting-service/samconfig.toml --config-env nightly --stack-name pdfgen-nightly --image-repositories PdfGenFunction=REDACTED ECR URI --region us-east-1 --capabilities CAPABILITY_IAM --no-confirm-changeset --no-fail-on-empty-changeset --debug
Failure while executing command, exit code -15
script returned exit code 241

Expected result:

This issue is odd because of how repeatable it is. Deploys fail on the first attempt, succeed on the second, every time.

I expected things to work the first time.

Additional environment details (Ex: Windows, Mac, Amazon Linux etc)

  1. OS: Ubuntu server
  2. sam --version: SAM CLI, version 1.23.0
  3. AWS region: us-east-1

Metadata

Assignees

No one assigned

    Labels

    area/deploy: sam deploy command
    stage/bug-repro: The issue/bug needs to be reproduced
