23 changes: 15 additions & 8 deletions backend-api/infra/README.md
@@ -1,14 +1,16 @@
# CloudFormation template structure

`template.yaml` contains the CloudFormation resources that are deployed to AWS.
It is generated from smaller `*.yaml` files that live in the `backend-api/infra` directory. Modify those files, not `template.yaml`.

The `*.yaml` files in the `backend-api/infra` folder are organised into:

- a `parent.yaml`, which serves as an empty shell for the template and holds the template format version, description and serverless transform.
- a `config/` directory containing separate definitions for parameters, mappings, conditions, globals and outputs.
- a `resources/` directory containing all of our resource definitions, grouped by domain.

## Generating the template.yaml

Whenever the `*.yaml` files in the `backend-api/infra` folder change, `backend-api/template.yaml` needs to be regenerated:

```bash
@@ -22,25 +24,30 @@ Note: this script is also run in the pre-commit hook to ensure that the `backend

Merging templates requires `rain` to be installed. See https://github.com/aws-cloudformation/rain for installation instructions.

## Adding or updating infrastructure

1) If the infrastructure already exists, update the relevant `infra/*.yaml` file with the required changes.
2) If it doesn't, create a file in the relevant folder and add the resource to it.
3) Generate the `template.yaml`.

## Deleting infrastructure

1) Locate the resources to be removed in the `resources/` folder.
2) Check whether they use definitions from the `config/` folder and remove those definitions if no other resources depend on them.
3) Delete the `infra/*.yaml` file if it is now empty.
4) Generate the `template.yaml`.

## CloudWatch Alarms

All CloudWatch Alarms must have a runbook linked in the alarm description. The alarm must be listed in the runbook with the following details:

- What triggers the alarm
- The impact on the end user
- Possible causes
- Details about what to do next
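
The runbook rule above could be checked mechanically. The following is a minimal sketch, not an existing test in this repository: `alarmsMissingRunbook` and the `CfnResource` type are hypothetical names, and it assumes a runbook link shows up in `AlarmDescription` either as a literal URL or as a `${SupportManualUrl}` substitution (the form used by one of the alarms in this change).

```typescript
// Minimal shape of a parsed CloudFormation resource; only the fields used here.
type CfnResource = { Type: string; Properties?: { AlarmDescription?: unknown } };

// Hypothetical helper: returns the logical IDs of CloudWatch alarms whose
// description contains neither a literal URL nor a ${SupportManualUrl} placeholder.
function alarmsMissingRunbook(resources: Record<string, CfnResource>): string[] {
  const linkPattern = /https?:\/\/|\$\{SupportManualUrl\}/;
  return Object.keys(resources).filter((logicalId) => {
    const resource = resources[logicalId];
    if (resource.Type !== "AWS::CloudWatch::Alarm") return false;
    // Serialising covers both plain strings and intrinsic functions such as Fn::Sub.
    const description = JSON.stringify(resource.Properties?.AlarmDescription ?? "");
    return !linkPattern.test(description);
  });
}
```

A test could assert that the returned list is empty for the `Resources` section of the parsed template.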

## Infrastructure tests

Infrastructure tests target `backend-api/template.yaml`, since this is the template that is deployed to AWS.

The exception is a test in `tests/infra-tests/template.test.ts`, which validates that `backend-api/template.yaml` contains the same resources as those defined across all `*.yaml` files in the `backend-api/infra` folder.
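
As an illustration of that consistency check, here is a minimal sketch. It is not the contents of `template.test.ts`; `diffResourceIds`, `collectIds` and the `Template` type are hypothetical, and it assumes the YAML files have already been parsed into plain objects.

```typescript
// Only the Resources section matters for this comparison.
type Template = { Resources?: Record<string, unknown> };

// Gathers every logical ID declared under Resources across the given templates.
function collectIds(templates: Template[]): string[] {
  const ids: string[] = [];
  for (const template of templates) {
    ids.push(...Object.keys(template.Resources ?? {}));
  }
  return ids;
}

// Hypothetical helper: lists logical IDs present on one side but not the other.
function diffResourceIds(merged: Template, fragments: Template[]) {
  const mergedIds = collectIds([merged]);
  const fragmentIds = collectIds(fragments);
  return {
    missingFromTemplate: fragmentIds.filter((id) => mergedIds.indexOf(id) === -1),
    missingFromFragments: mergedIds.filter((id) => fragmentIds.indexOf(id) === -1),
  };
}
```

Both lists being empty would mean the generated template and the source files declare the same set of resources.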
148 changes: 148 additions & 0 deletions backend-api/infra/resources/async/backoffRetryDemo/alarms.yaml
@@ -0,0 +1,148 @@
AWSTemplateFormatVersion: "2010-09-09"
Resources:
  AsyncBackoffRetryDemoInvalidSqsEventAlarm:
    Type: AWS::CloudWatch::Alarm
    Condition: DeployAlarms
    DependsOn:
      - "AsyncBackoffRetryDemoMetricFilter"
    Properties:
      ActionsEnabled: true
      AlarmActions:
        - !Sub "arn:aws:sns:${AWS::Region}:${AWS::AccountId}:platform-alarms-sns-critical"
      OKActions:
        - !Sub "arn:aws:sns:${AWS::Region}:${AWS::AccountId}:platform-alarms-sns-critical"
      InsufficientDataActions: [ ]
      AlarmDescription: !Sub
        - "Fires when a log event with messageCode MOBILE_ASYNC_BACKOFF_RETRY_DEMO_INVALID_SQS_EVENT is detected. See support manual: ${SupportManualUrl}"
        - SupportManualUrl: !FindInMap
            - StaticVariables
            - SupportManual
            - value
      AlarmName: !Sub "${AWS::StackName}-backoff-retry-demo-lambda-invalid-sqs-event"
      Namespace: !Sub "${AWS::StackName}/LogMessages"
      MetricName: AsyncBackoffRetryDemoMessageCode
      Dimensions:
        - Name: MessageCode
          Value: MOBILE_ASYNC_BACKOFF_RETRY_DEMO_INVALID_SQS_EVENT
      Statistic: Sum
      Period: 60
      EvaluationPeriods: 1
      DatapointsToAlarm: 1
      Threshold: 1
      ComparisonOperator: GreaterThanOrEqualToThreshold
      TreatMissingData: notBreaching

  AsyncBackoffRetryDemoErrorRateAlarm:
    Condition: DeployAlarms
    Type: AWS::CloudWatch::Alarm
    Properties:
      ActionsEnabled: true
      AlarmDescription: "The number of Async Backoff Retry Demo Lambda errors is greater than or equal to 10% for the latest function version"
      AlarmName: !Sub "${AWS::StackName}-backoff-retry-demo-lambda-error-rate"
      AlarmActions:
        - !Sub "arn:aws:sns:${AWS::Region}:${AWS::AccountId}:platform-alarms-sns-warning"
      OKActions:
        - !Sub "arn:aws:sns:${AWS::Region}:${AWS::AccountId}:platform-alarms-sns-warning"
      ComparisonOperator: GreaterThanOrEqualToThreshold
      DatapointsToAlarm: 1
      EvaluationPeriods: 1
      TreatMissingData: notBreaching
      Threshold: 60
      Metrics:
        - Id: lambdaInvocations
          Label: "Sum of invocations for latest Lambda version"
          ReturnData: false
          MetricStat:
            Metric:
              Namespace: AWS/Lambda
              MetricName: Invocations
              Dimensions:
                - Name: Resource
                  Value: !Sub "${AsyncBackoffRetryDemoFunction}:live"
                - Name: FunctionName
                  Value: !Ref AsyncBackoffRetryDemoFunction
                - Name: ExecutedVersion
                  Value: !GetAtt AsyncBackoffRetryDemoFunction.Version.Version
            Period: 60
            Stat: Sum
        - Id: lambdaErrors
          Label: "Sum of function errors for latest Lambda version"
          ReturnData: false
          MetricStat:
            Metric:
              Namespace: AWS/Lambda
              MetricName: Errors
              Dimensions:
                - Name: Resource
                  Value: !Sub "${AsyncBackoffRetryDemoFunction}:live"
                - Name: FunctionName
                  Value: !Ref AsyncBackoffRetryDemoFunction
                - Name: ExecutedVersion
                  Value: !GetAtt AsyncBackoffRetryDemoFunction.Version.Version
            Period: 60
            Stat: Sum
        - Id: lambdaErrorPercentage
          Label: "Percentage of invocations that result in a function error"
          ReturnData: false
          Expression: (lambdaErrors/lambdaInvocations)*100
        - Id: lambdaErrorRate
          Label: "Error threshold calculation"
          ReturnData: true
          Expression: IF(lambdaErrors >= 10, lambdaErrorPercentage)

  AsyncBackoffRetryDemoLowCompletionAlarm:
    Condition: DeployAlarms
    Type: AWS::CloudWatch::Alarm
    DependsOn:
      - "AsyncBackoffRetryDemoCompletionMetricFilter"
    Properties:
      ActionsEnabled: true
      AlarmActions:
        - !Sub "arn:aws:sns:${AWS::Region}:${AWS::AccountId}:platform-alarms-sns-warning"
      OKActions:
        - !Sub "arn:aws:sns:${AWS::Region}:${AWS::AccountId}:platform-alarms-sns-warning"
      InsufficientDataActions: [ ]
      AlarmDescription: "A large proportion of Async Backoff Retry Demo requests have not completed successfully."
      AlarmName: !Sub "${AWS::StackName}-backoff-retry-demo-lambda-low-completion"
      EvaluationPeriods: 1
      DatapointsToAlarm: 1
      Threshold: 80
      ComparisonOperator: LessThanOrEqualToThreshold
      TreatMissingData: notBreaching
      Metrics:
        - Id: lambdaLogStarted
          Label: "Sum of MOBILE_ASYNC_BACKOFF_RETRY_DEMO_STARTED messageCodes for latest Lambda version"
          ReturnData: false
          MetricStat:
            Metric:
              Namespace: !Sub "${AWS::StackName}/LogMessages"
              MetricName: AsyncBackoffRetryDemoMessageCode
              Dimensions:
                - Name: MessageCode
                  Value: MOBILE_ASYNC_BACKOFF_RETRY_DEMO_STARTED
                - Name: Version
                  Value: !GetAtt AsyncBackoffRetryDemoFunction.Version.Version
            Period: 60
            Stat: Sum
        - Id: lambdaLogCompleted
          Label: "Sum of MOBILE_ASYNC_BACKOFF_RETRY_DEMO_COMPLETED messageCodes for latest Lambda version"
          ReturnData: false
          MetricStat:
            Metric:
              Namespace: !Sub "${AWS::StackName}/LogMessages"
              MetricName: AsyncBackoffRetryDemoMessageCode
              Dimensions:
                - Name: MessageCode
                  Value: MOBILE_ASYNC_BACKOFF_RETRY_DEMO_COMPLETED
                - Name: Version
                  Value: !GetAtt AsyncBackoffRetryDemoFunction.Version.Version
            Period: 60
            Stat: Sum
        - Id: lambdaLogCompletePercentage
          Label: "Percentage of invocations that complete successfully"
          ReturnData: false
          Expression: (lambdaLogCompleted/lambdaLogStarted)*100
        - Id: lowCompletionRateThreshold
          Label: "Error threshold calculation"
          ReturnData: true
          Expression: IF((lambdaLogStarted-lambdaLogCompleted)>= 5, lambdaLogCompletePercentage)
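
To make the two metric-math alarms in `alarms.yaml` easier to reason about, the sketch below models them for a single 60-second period. The function names are hypothetical, and it assumes that a two-argument `IF` produces no datapoint when its condition is false, which `TreatMissingData: notBreaching` then treats as healthy. The error-rate check follows `Threshold: 60`, which differs from the 10% mentioned in that alarm's `AlarmDescription`.

```typescript
// null models "no datapoint for this period".
type Datapoint = number | null;

// IF(lambdaErrors >= 10, (lambdaErrors/lambdaInvocations)*100)
function errorRateDatapoint(errors: number, invocations: number): Datapoint {
  if (errors < 10) return null;
  return (errors / invocations) * 100;
}

// Threshold: 60 with GreaterThanOrEqualToThreshold; missing data is not breaching.
function errorRateAlarmBreaches(errors: number, invocations: number): boolean {
  const value = errorRateDatapoint(errors, invocations);
  return value !== null && value >= 60;
}

// IF((lambdaLogStarted-lambdaLogCompleted) >= 5, (lambdaLogCompleted/lambdaLogStarted)*100)
function lowCompletionDatapoint(started: number, completed: number): Datapoint {
  if (started - completed < 5) return null;
  return (completed / started) * 100;
}

// Threshold: 80 with LessThanOrEqualToThreshold; missing data is not breaching.
function lowCompletionAlarmBreaches(started: number, completed: number): boolean {
  const value = lowCompletionDatapoint(started, completed);
  return value !== null && value <= 80;
}
```

Under these assumptions, the minimum counts (10 errors, 5 incomplete requests) act as a guard against low-traffic noise, and the percentage is compared to the threshold only once the guard passes.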
125 changes: 125 additions & 0 deletions backend-api/infra/resources/async/backoffRetryDemo/function.yaml
@@ -0,0 +1,125 @@
AWSTemplateFormatVersion: "2010-09-09"
Resources:
  AsyncBackoffRetryDemoFunction:
    Type: AWS::Serverless::Function
    DependsOn:
      - AsyncBackoffRetryDemoLogGroup
    Metadata:
      BuildMethod: esbuild
      BuildProperties:
        Minify: true
        Target: es2022
        Sourcemap: false
        EntryPoints:
          - src/functions/asyncBackoffRetryDemo/asyncBackoffRetryDemoHandler.ts
    Properties:
      FunctionName: !Sub ${AWS::StackName}-backoff-retry-demo
      Handler: asyncBackoffRetryDemoHandler.lambdaHandler
      DeploymentPreference:
        Enabled: true
        Alarms: !If
          - UseCanaryDeployment
          - - !Ref AsyncBackoffRetryDemoErrorRateAlarm
            - !Ref AsyncBackoffRetryDemoLowCompletionAlarm
          - - !Ref AWS::NoValue
        Type: !Ref LambdaDeploymentPreference
      Environment:
        Variables:
          DEMO_SQS: !GetAtt BackoffRetryDemoSqs.QueueUrl
          MAX_RETRY_DELAY_IN_SECONDS: 60
      Role: !GetAtt AsyncBackoffRetryDemoLambdaRole.Arn
      VpcConfig:
        SubnetIds:
          - !ImportValue devplatform-vpc-ProtectedSubnetIdA
          - !ImportValue devplatform-vpc-ProtectedSubnetIdB
          - !ImportValue devplatform-vpc-ProtectedSubnetIdC
        SecurityGroupIds:
          - !ImportValue devplatform-vpc-AWSServicesEndpointSecurityGroupId
      ReservedConcurrentExecutions: !Ref AWS::NoValue

  AsyncBackoffRetryDemoLogGroup:
    Type: AWS::Logs::LogGroup
    Properties:
      RetentionInDays: 30
      LogGroupName: !Sub /aws/lambda/${AWS::StackName}-backoff-retry-demo

  AsyncBackoffRetryDemoSubscriptionFilter:
    Type: AWS::Logs::SubscriptionFilter
    Condition: EgressLogsToSplunkUsingCsls
    Properties:
      DestinationArn:
        !FindInMap [EnvironmentVariables, CslsEgress, !Ref Environment]
      FilterPattern: ""
      LogGroupName: !Ref AsyncBackoffRetryDemoLogGroup

  AsyncBackoffRetryDemoLambdaRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: !Sub ${AWS::StackName}-backoff-retry-demo
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: lambda.amazonaws.com
            Action: sts:AssumeRole
      Policies:
        - PolicyName: AsyncBackoffRetryDemoLoggingPolicy
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action:
                  - logs:CreateLogGroup
                  - logs:CreateLogStream
                  - logs:PutLogEvents
                Resource: arn:aws:logs:*:*:*
        - PolicyName: VpcPolicy
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action:
                  - ec2:DescribeNetworkInterfaces
                  - ec2:CreateNetworkInterface
                  - ec2:DeleteNetworkInterface
                Resource: '*'
        - PolicyName: SQSPolicy
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action:
                  - sqs:ReceiveMessage
                  - sqs:DeleteMessage
                  - sqs:GetQueueAttributes
                  - sqs:ChangeMessageVisibility
                Resource:
                  - !GetAtt BackoffRetryDemoSqs.Arn
        - PolicyName: AsyncBackoffRetryDemoFunctionSQSPolicy
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action:
                  - sqs:SendMessage
                Resource: !GetAtt BackoffRetryDemoSqs.Arn
              - Effect: Allow
                Action:
                  - kms:Decrypt
                  - kms:GenerateDataKey
                Resource: !GetAtt BackoffRetryDemoKMSEncryptionKey.Arn
      PermissionsBoundary: !If
        - UsePermissionsBoundary
        - !Ref PermissionsBoundary
        - !Ref AWS::NoValue

  BackoffRetryDemoEventSourceMapping:
    Type: AWS::Lambda::EventSourceMapping
    Properties:
      BatchSize: 1
      ScalingConfig:
        MaximumConcurrency: 34
      Enabled: true
      EventSourceArn: !GetAtt BackoffRetryDemoSqs.Arn
      FunctionName: !Ref AsyncBackoffRetryDemoFunction.Alias
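
The handler itself is not part of this diff, so how it uses `MAX_RETRY_DELAY_IN_SECONDS` is not shown. As a rough sketch only, one common way to apply such a cap is exponential backoff with an upper bound. The function name, the one-second base delay and the doubling factor below are all assumptions, not the handler's actual behaviour.

```typescript
// Assumed strategy: the delay doubles on each attempt, starting from baseSeconds,
// and is never allowed to exceed maxDelaySeconds (60 in function.yaml).
function retryDelaySeconds(attempt: number, maxDelaySeconds: number, baseSeconds: number = 1): number {
  // Attempts are 1-based; anything below 1 is treated as the first attempt.
  const exponent = Math.max(0, attempt - 1);
  const uncapped = baseSeconds * Math.pow(2, exponent);
  return Math.min(uncapped, maxDelaySeconds);
}
```

With a cap of 60, this gives delays of 1, 2, 4, 8, 16 and 32 seconds for the first six attempts, and 60 seconds from the seventh onwards. A delay computed this way could be applied either when re-sending a message or when changing its visibility timeout, both of which the role's SQS policies permit.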
31 changes: 31 additions & 0 deletions backend-api/infra/resources/async/backoffRetryDemo/metrics.yaml
@@ -0,0 +1,31 @@
AWSTemplateFormatVersion: "2010-09-09"
Resources:
  AsyncBackoffRetryDemoMetricFilter:
    Type: AWS::Logs::MetricFilter
    Condition: DeployAlarms
    Properties:
      LogGroupName: !Ref AsyncBackoffRetryDemoLogGroup
      FilterPattern: '{ $.messageCode = * }'
      MetricTransformations:
        - MetricValue: "1"
          MetricNamespace: !Sub "${AWS::StackName}/LogMessages"
          MetricName: "AsyncBackoffRetryDemoMessageCode"
          Dimensions:
            - Key: MessageCode
              Value: $.messageCode

  AsyncBackoffRetryDemoCompletionMetricFilter:
    Condition: DeployAlarms
    Type: AWS::Logs::MetricFilter
    Properties:
      LogGroupName: !Ref AsyncBackoffRetryDemoLogGroup
      FilterPattern: "{ ($.messageCode = *) && ($.functionVersion = *) }"
      MetricTransformations:
        - MetricValue: "1"
          MetricNamespace: !Sub "${AWS::StackName}/LogMessages"
          MetricName: "AsyncBackoffRetryDemoMessageCode"
          Dimensions:
            - Key: MessageCode
              Value: $.messageCode
            - Key: Version
              Value: $.functionVersion
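
The two metric filters in `metrics.yaml` differ only in which JSON fields a log line must carry. The sketch below approximates that matching locally; `emittedDimensions` is a hypothetical helper, and treating a `null` field value as "not present" is a simplifying assumption rather than documented CloudWatch behaviour.

```typescript
type Dimensions = Record<string, string>;

// Approximates a `{ $.field = * }` style filter: the log line matches only if it is
// a JSON object and every required field is present. `fields` maps a dimension name
// to the JSON field it is read from, e.g. { MessageCode: "messageCode" }.
function emittedDimensions(logLine: string, fields: Record<string, string>): Dimensions | null {
  let event: unknown;
  try {
    event = JSON.parse(logLine);
  } catch {
    return null; // Not JSON, so a JSON filter pattern cannot match.
  }
  if (typeof event !== "object" || event === null) return null;
  const dimensions: Dimensions = {};
  for (const dimensionName of Object.keys(fields)) {
    const value = (event as Record<string, unknown>)[fields[dimensionName]];
    if (value === undefined || value === null) return null;
    dimensions[dimensionName] = String(value);
  }
  return dimensions;
}
```

Under this model, a log line carrying both `messageCode` and `functionVersion` matches both filters. It is therefore counted once with only the `MessageCode` dimension and once with `MessageCode` plus `Version`, which is why the alarms select the dimension set they need.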