Commit 749ffc3

dlpzx, mourya-33, Mourya Darivemula, noah-paige and petrkalos authored
2.6.2 Security features (#1737)
### Feature or Bugfix
- Security

### Detail

### 🔐 Security
* Update sanitization technique for terms filtering by @noah-paige in #1692 and in #1693
* Move access logging to a separate environment logging bucket by @noah-paige in #1695
* Add explicit token duration config for both JWTs by @noah-paige in #1698
* Disable GraphQL introspection if prod sizing by @noah-paige in #1704
* Add snyk workflow on schedule by @noah-paige in #1705, #1708, #1713, #1745 and in #1746
* Unify Logger Config for Tasks by @noah-paige in #1709
* Updating overly permissive policies tagged by checkov for environment role using least privilege principles by @mourya-33 in #1632

Data.all permission model has been reviewed to ensure all Mutations and Queries have proper permissions:
* Add MANAGE_SHARES permissions by @dlpzx in #1702
* Add permission check - is tenant to update SSM parameters API by @dlpzx in #1714
* Add GET_SHARE_OBJECT permissions to get data filters API by @dlpzx in #1717
* Add permissions on list datasets for env group + cosmetic S3 Datasets by @dlpzx in #1718
* Add GET_WORKSHEET permission in RUN_SQL_QUERY by @dlpzx in #1716
* Add permissions to Quicksight monitoring service layer by @dlpzx in #1715
* Add LIST_ENVIRONMENT_DATASETS permission for listing shared datasets and cleanup unused code by @dlpzx in #1719
* Add is_owner permissions to Glossary mutations + add new integration tests by @dlpzx in #1721
* Refactor env permissions + modify getTrustAccount by @dlpzx in #1712
* Add Feed consistent permissions by @dlpzx in #1722
* Add Votes consistent permissions by @dlpzx in #1724
* Consistent get_<DATA_ASSET> permissions - Dashboards by @dlpzx in #1729

### 🧪 Test improvements
Integration tests are in sync with `main` without 2.7 planned features. In this PR all core modules, optional modules and submodules are tested. That includes: tenant-permissions, omics, mlstudio, votes, notifications and backwards compatibility of s3 shares.
by @SofiaSazonova, @noah-paige, @petrkalos and @dlpzx

In addition, the following PR adds functional tests that ensure the permission model of data.all is not corrupted.
* ⭐ Add resource permission checks by @petrkalos in #1711

### Dependencies
* Update FastAPI by @petrkalos in #1577
* update fastapi dependency by @noah-paige in #1699
* Upgrade "cross-spawn" to "7.0.5" by @dlpzx in #1701
* Bump python runtime to bump cdk klayers cryptography version by @noah-paige in #1707

### Relates
- List above

### Security
Please answer the questions below briefly where applicable, or write `N/A`. Based on [OWASP 10](https://owasp.org/Top10/en/).

- Does this PR introduce or modify any input fields or queries - this includes fetching data from storage outside the application (e.g. a database, an S3 bucket)?
  - Is the input sanitized?
  - What precautions are you taking before deserializing the data you consume?
  - Is injection prevented by parametrizing queries?
  - Have you ensured no `eval` or similar functions are used?
- Does this PR introduce any functionality or component that requires authorization?
  - How have you ensured it respects the existing AuthN/AuthZ mechanisms?
  - Are you logging failed auth attempts?
- Are you using or adding any cryptographic features?
  - Do you use standard proven implementations?
  - Are the used keys controlled by the customer? Where are they stored?
- Are you introducing any new policies/roles/users?
  - Have you used the least-privilege principle? How?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

---------

Co-authored-by: mourya-33 <134511711+mourya-33@users.noreply.github.com>
Co-authored-by: Mourya Darivemula <mouryacd@amazon.com>
Co-authored-by: Noah Paige <69586985+noah-paige@users.noreply.github.com>
Co-authored-by: Petros Kalos <kalosp@amazon.com>
Co-authored-by: Sofia Sazonova <sofia-s@304.ru>
Co-authored-by: Sofia Sazonova <sazonova@amazon.co.uk>
1 parent 99dd5bb commit 749ffc3

244 files changed, +9412 -13591 lines changed

.checkov.baseline

Lines changed: 25 additions & 46 deletions
@@ -417,7 +417,7 @@
             ]
         },
         {
-            "file": "/cdk.out/asset.3045cb6b4340be1e173df6dcf6248d565aa849ceda3e2cf2c2f221ccee4bc1d6/pivotRole.yaml",
+            "file": "/cdk.out/asset.05d71d8b69cd4483d3c9db9120b556b718c72f349debbb79d461c74c4964b350/pivotRole.yaml",
             "findings": [
                 {
                     "resource": "AWS::IAM::ManagedPolicy.PivotRolePolicy0",
@@ -490,12 +490,6 @@
         {
             "file": "/checkov_environment_synth.json",
             "findings": [
-                {
-                    "resource": "AWS::IAM::ManagedPolicy.dataallanothergroup111111servicespolicy19AC37181",
-                    "check_ids": [
-                        "CKV_AWS_111"
-                    ]
-                },
                 {
                     "resource": "AWS::IAM::ManagedPolicy.dataallanothergroup111111servicespolicy2E85AF510",
                     "check_ids": [
@@ -508,24 +502,6 @@
                         "CKV_AWS_111"
                     ]
                 },
-                {
-                    "resource": "AWS::IAM::ManagedPolicy.dataallanothergroup111111servicespolicy5A19E75CA",
-                    "check_ids": [
-                        "CKV_AWS_109"
-                    ]
-                },
-                {
-                    "resource": "AWS::IAM::ManagedPolicy.dataallanothergroup111111servicespolicyCC720210",
-                    "check_ids": [
-                        "CKV_AWS_109"
-                    ]
-                },
-                {
-                    "resource": "AWS::IAM::ManagedPolicy.dataalltestadmins111111servicespolicy1A0C96958",
-                    "check_ids": [
-                        "CKV_AWS_111"
-                    ]
-                },
                 {
                     "resource": "AWS::IAM::ManagedPolicy.dataalltestadmins111111servicespolicy2B12D381A",
                     "check_ids": [
@@ -538,18 +514,6 @@
                         "CKV_AWS_111"
                     ]
                 },
-                {
-                    "resource": "AWS::IAM::ManagedPolicy.dataalltestadmins111111servicespolicy3E3CBA9E",
-                    "check_ids": [
-                        "CKV_AWS_109"
-                    ]
-                },
-                {
-                    "resource": "AWS::IAM::ManagedPolicy.dataalltestadmins111111servicespolicy56D7DC525",
-                    "check_ids": [
-                        "CKV_AWS_109"
-                    ]
-                },
                 {
                     "resource": "AWS::Lambda::Function.CustomCDKBucketDeployment8693BB64968944B69AAFB0CC9EB8756C81C01536",
                     "check_ids": [
@@ -563,38 +527,34 @@
                     "resource": "AWS::Lambda::Function.GlueDatabaseLFCustomResourceHandler7FAF0F82",
                     "check_ids": [
                         "CKV_AWS_115",
-                        "CKV_AWS_117",
-                        "CKV_AWS_173"
+                        "CKV_AWS_117"
                     ]
                 },
                 {
                     "resource": "AWS::Lambda::Function.LakeformationDefaultSettingsHandler2CBEDB06",
                     "check_ids": [
                         "CKV_AWS_115",
-                        "CKV_AWS_117",
-                        "CKV_AWS_173"
+                        "CKV_AWS_117"
                     ]
                 },
                 {
                     "resource": "AWS::Lambda::Function.dataallGlueDbCustomResourceProviderframeworkonEventF8347BA7",
                     "check_ids": [
                         "CKV_AWS_115",
                         "CKV_AWS_116",
-                        "CKV_AWS_117",
-                        "CKV_AWS_173"
+                        "CKV_AWS_117"
                     ]
                 },
                 {
                     "resource": "AWS::Lambda::Function.dataallLakeformationDefaultSettingsProviderframeworkonEventBB660E32",
                     "check_ids": [
                         "CKV_AWS_115",
                         "CKV_AWS_116",
-                        "CKV_AWS_117",
-                        "CKV_AWS_173"
+                        "CKV_AWS_117"
                     ]
                 },
                 {
-                    "resource": "AWS::S3::Bucket.EnvironmentDefaultBucket78C3A8B0",
+                    "resource": "AWS::S3::Bucket.EnvironmentDefaultLogBucket7F0EFAB3",
                     "check_ids": [
                         "CKV_AWS_18"
                     ]
@@ -653,6 +613,25 @@
                 }
             ]
         },
+        {
+            "file": "/checkov_pipeline_synth.json",
+            "findings": [
+                {
+                    "resource": "AWS::IAM::Role.PipelineRoleDCFDBB91",
+                    "check_ids": [
+                        "CKV_AWS_107",
+                        "CKV_AWS_108",
+                        "CKV_AWS_111"
+                    ]
+                },
+                {
+                    "resource": "AWS::S3::Bucket.thistableartifactsbucketDB1C8C64",
+                    "check_ids": [
+                        "CKV_AWS_18"
+                    ]
+                }
+            ]
+        },
         {
             "file": "/frontend/docker/prod/Dockerfile",
             "findings": [

.github/workflows/snyk.yaml

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
+name: Snyk
+
+on:
+  workflow_dispatch:
+
+  schedule:
+    - cron: "0 9 * * 1" # runs each Monday at 9:00 UTC
+
+permissions:
+  contents: read
+  security-events: write
+
+jobs:
+  security:
+    strategy:
+      matrix:
+        python-version: [3.9]
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: snyk/actions/setup@master
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v4
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: Install All Requirements
+        run: make install
+      - name: Run Snyk to check for vulnerabilities
+        run: snyk test --all-projects --detection-depth=5 --severity-threshold=high
+        env:
+          SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}

Makefile

Lines changed: 7 additions & 1 deletion
@@ -16,7 +16,7 @@ venv:
 	@python3 -m venv "venv"
 	@/bin/bash -c "source venv/bin/activate"
 
-install: upgrade-pip install-deploy install-backend install-cdkproxy install-tests
+install: upgrade-pip install-deploy install-backend install-cdkproxy install-tests install-integration-tests install-custom-auth install-userguide
 
 upgrade-pip:
 	pip install --upgrade pip setuptools
@@ -36,6 +36,12 @@ install-tests:
 install-integration-tests:
 	pip install -r tests_new/integration_tests/requirements.txt
 
+install-custom-auth:
+	pip install -r deploy/custom_resources/custom_authorizer/requirements.txt
+
+install-userguide:
+	pip install -r documentation/userguide/requirements.txt
+
 lint:
 	pip install ruff
 	ruff check --fix

backend/api_handler.py

Lines changed: 9 additions & 1 deletion
@@ -23,6 +23,7 @@
 from dataall.base.db import get_engine
 from dataall.base.loader import load_modules, ImportMode
 
+from graphql.pyutils import did_you_mean
 
 logger = logging.getLogger()
 logger.setLevel(os.environ.get('LOG_LEVEL', 'INFO'))
@@ -32,6 +33,11 @@
 for name in ['boto3', 's3transfer', 'botocore', 'boto']:
     logging.getLogger(name).setLevel(logging.ERROR)
 
+ALLOW_INTROSPECTION = True if os.getenv('ALLOW_INTROSPECTION') == 'True' else False
+
+if not ALLOW_INTROSPECTION:
+    did_you_mean.__globals__['MAX_LENGTH'] = 0
+
 load_modules(modes={ImportMode.API})
 SCHEMA = bootstrap_schema()
 TYPE_DEFS = gql(SCHEMA.gql(with_directives=False))
@@ -137,7 +143,9 @@ def handler(event, context):
     else:
         raise Exception(f'Could not initialize user context from event {event}')
 
-    success, response = graphql_sync(schema=executable_schema, data=query, context_value=app_context)
+    success, response = graphql_sync(
+        schema=executable_schema, data=query, context_value=app_context, introspection=ALLOW_INTROSPECTION
+    )
 
     dispose_context()
     response = json.dumps(response)
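
The `ALLOW_INTROSPECTION` flag in the diff above is parsed with a strict string comparison, so only the exact value `True` enables introspection; any other value, or an unset variable, leaves it disabled. A minimal sketch of that parsing follows. The helper name `introspection_allowed` is hypothetical and not part of the commit; it takes the environment as a parameter so the behaviour can be checked without modifying `os.environ`.

```python
import os
from typing import Mapping


def introspection_allowed(environ: Mapping[str, str] = os.environ) -> bool:
    """Hypothetical helper mirroring the flag parsing shown in the diff.

    Only the exact string 'True' enables introspection; any other value,
    or a missing variable, leaves it disabled.
    """
    return environ.get('ALLOW_INTROSPECTION') == 'True'


# Exercise the three interesting cases: exact match, wrong case, unset.
print(introspection_allowed({'ALLOW_INTROSPECTION': 'True'}))  # True
print(introspection_allowed({'ALLOW_INTROSPECTION': 'true'}))  # False
print(introspection_allowed({}))                               # False
```

Because the comparison is case-sensitive and defaults to disabled, a deployment that never sets the variable does not expose the schema through introspection.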

backend/dataall/__init__.py

Lines changed: 11 additions & 0 deletions
@@ -1,2 +1,13 @@
 from . import core, version
 from .base import utils, db, api
+import logging
+import os
+import sys
+
+logging.basicConfig(
+    level=os.environ.get('LOG_LEVEL', 'INFO'),
+    handlers=[logging.StreamHandler(sys.stdout)],
+    format='[%(levelname)s] %(message)s',
+)
+for name in ['boto3', 's3transfer', 'botocore', 'boto', 'urllib3']:
+    logging.getLogger(name).setLevel(logging.ERROR)
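
The log format and the per-library level overrides from the diff above can be tried in isolation. The sketch below is an illustration under stated assumptions, not the module's code: instead of calling `logging.basicConfig` on the root logger, it attaches the same format to a hypothetical logger named `dataall.example` that writes to an in-memory stream, so the result can be inspected.

```python
import io
import logging

# In-memory stream standing in for sys.stdout (assumption for the demo).
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('[%(levelname)s] %(message)s'))

# 'dataall.example' is a hypothetical logger name used only in this sketch.
logger = logging.getLogger('dataall.example')
logger.addHandler(handler)
logger.setLevel('INFO')
logger.propagate = False  # keep the demo independent of root handlers

logger.info('share approved')
logger.debug('this line is filtered out at INFO level')

# Same list of chatty third-party loggers as in the diff.
for name in ['boto3', 's3transfer', 'botocore', 'boto', 'urllib3']:
    logging.getLogger(name).setLevel(logging.ERROR)

print(stream.getvalue(), end='')  # [INFO] share approved
print(logging.getLogger('botocore').isEnabledFor(logging.WARNING))  # False
```

Raising the listed third-party loggers to `ERROR` keeps their informational and warning messages out of the application logs while still surfacing their errors.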
Lines changed: 5 additions & 10 deletions
@@ -1,17 +1,12 @@
-aws-cdk-lib==2.99.0
-boto3==1.28.23
-boto3-stubs==1.28.23
-botocore==1.31.23
+aws-cdk-lib==2.160.0
+boto3==1.35.26
+boto3-stubs==1.35.26
 cdk-nag==2.7.2
-constructs==10.0.73
-starlette==0.36.3
-fastapi == 0.109.2
-Flask==2.3.2
+fastapi == 0.115.5
 PyYAML==6.0
 requests==2.32.2
 tabulate==0.8.9
 uvicorn==0.15.0
-werkzeug==3.0.3
-constructs>=10.0.0,<11.0.0
+werkzeug==3.0.6
 git-remote-codecommit==1.16
 aws-ddk-core==1.3.0

backend/dataall/base/context.py

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
 that in the request scope
 
 The class uses Flask's approach to handle request: ThreadLocal
-That approach should work fine for AWS Lambdas and local server that uses Flask app
+That approach should work fine for AWS Lambdas and local server that uses FastApi app
 """
 
 from dataclasses import dataclass

backend/dataall/base/feature_toggle_checker.py

Lines changed: 3 additions & 0 deletions
@@ -2,6 +2,8 @@
 Contains decorators that check if a feature has been enabled or not
 """
 
+import functools
+
 from dataall.base.config import config
 from dataall.base.utils.decorator_utls import process_func
 
@@ -10,6 +12,7 @@ def is_feature_enabled(config_property: str):
     def decorator(f):
         fn, fn_decorator = process_func(f)
 
+        @functools.wraps(fn)
         def decorated(*args, **kwargs):
             value = config.get_property(config_property)
             if not value:
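
Adding `functools.wraps` makes the wrapper keep the wrapped function's `__name__` and docstring, which matters whenever other code identifies decorated functions by name. Whether the permission tests mentioned in the commit message rely on this is an assumption. A self-contained sketch, using hypothetical decorators and resolver names that do not appear in the commit:

```python
import functools


def without_wraps(f):
    # Wrapper metadata replaces that of the original function.
    def decorated(*args, **kwargs):
        return f(*args, **kwargs)

    return decorated


def with_wraps(f):
    # functools.wraps copies __name__, __doc__ and more onto the wrapper.
    @functools.wraps(f)
    def decorated(*args, **kwargs):
        return f(*args, **kwargs)

    return decorated


@without_wraps
def list_items_plain():
    """List items."""
    return []


@with_wraps
def list_items_wrapped():
    """List items."""
    return []


print(list_items_plain.__name__)    # decorated
print(list_items_wrapped.__name__)  # list_items_wrapped
print(list_items_wrapped.__doc__)   # List items.
```

`functools.wraps` also sets `__wrapped__` on the wrapper, so tools that need the undecorated function can reach it.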

backend/dataall/base/utils/naming_convention.py

Lines changed: 28 additions & 6 deletions
@@ -1,28 +1,46 @@
 from enum import Enum
-
+import re
 from .slugify import slugify
 
 
 class NamingConventionPattern(Enum):
-    S3 = {'regex': '[^a-zA-Z0-9-]', 'separator': '-', 'max_length': 63}
+    S3 = {
+        'regex': '[^a-zA-Z0-9-]',
+        'separator': '-',
+        'max_length': 63,
+        'valid_external_regex': '(?!(^xn--|.+-s3alias$))^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$',
+    }
+    KMS = {'regex': '[^a-zA-Z0-9-]$', 'separator': '-', 'max_length': 63, 'valid_external_regex': '^[a-zA-Z0-9_-]+$'}
     IAM = {'regex': '[^a-zA-Z0-9-_]', 'separator': '-', 'max_length': 63} # Role names up to 64 chars
     IAM_POLICY = {'regex': '[^a-zA-Z0-9-_]', 'separator': '-', 'max_length': 128} # Policy names up to 128 chars
-    GLUE = {'regex': '[^a-zA-Z0-9_]', 'separator': '_', 'max_length': 240} # Limit 255 - 15 extra chars buffer
+    GLUE = {
+        'regex': '[^a-zA-Z0-9_]',
+        'separator': '_',
+        'max_length': 240,
+        'valid_external_regex': '^[a-zA-Z0-9_]+$',
+    } # Limit 255 - 15 extra chars buffer
     GLUE_ETL = {'regex': '[^a-zA-Z0-9-]', 'separator': '-', 'max_length': 52}
     NOTEBOOK = {'regex': '[^a-zA-Z0-9-]', 'separator': '-', 'max_length': 63}
     MLSTUDIO_DOMAIN = {'regex': '[^a-zA-Z0-9-]', 'separator': '-', 'max_length': 63}
     DEFAULT = {'regex': '[^a-zA-Z0-9-_]', 'separator': '-', 'max_length': 63}
+    DEFAULT_SEARCH = {'regex': '[^a-zA-Z0-9-_:. ]'}
     OPENSEARCH = {'regex': '[^a-z0-9-]', 'separator': '-', 'max_length': 27}
     OPENSEARCH_SERVERLESS = {'regex': '[^a-z0-9-]', 'separator': '-', 'max_length': 31}
+    DATA_FILTERS = {'regex': '[^a-z0-9_]', 'separator': '_', 'max_length': 31}
+    REDSHIFT_DATASHARE = {
+        'regex': '[^a-zA-Z0-9_]',
+        'separator': '_',
+        'max_length': 1000,
+    } # Maximum length of 2147483647
 
 
 class NamingConventionService:
     def __init__(
         self,
         target_label: str,
-        target_uri: str,
         pattern: NamingConventionPattern,
-        resource_prefix: str,
+        target_uri: str = '',
+        resource_prefix: str = '',
     ):
         self.target_label = target_label
         self.target_uri = target_uri if target_uri else ''
@@ -37,4 +55,8 @@ def build_compliant_name(self) -> str:
         separator = NamingConventionPattern[self.service].value['separator']
         max_length = NamingConventionPattern[self.service].value['max_length']
         suffix = f'-{self.target_uri}' if len(self.target_uri) else ''
-        return f"{slugify(self.resource_prefix + '-' + self.target_label[:(max_length- len(self.resource_prefix + self.target_uri))] + suffix, regex_pattern=fr'{regex}', separator=separator, lowercase=True)}"
+        return f"{slugify(self.resource_prefix + '-' + self.target_label[:(max_length - len(self.resource_prefix + self.target_uri))] + suffix, regex_pattern=fr'{regex}', separator=separator, lowercase=True)}"
+
+    def sanitize(self):
+        regex = NamingConventionPattern[self.service].value['regex']
+        return re.sub(regex, '', self.target_label)
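
The new `sanitize` method removes every character that the pattern's `regex` rejects. The `valid_external_regex` entries appear intended for validating names of resources created outside data.all, but that is an assumption, since the diff does not show where they are used. A minimal standalone sketch using the `DEFAULT_SEARCH` and `S3` patterns copied from the diff; the function names are hypothetical:

```python
import re

# Patterns copied verbatim from NamingConventionPattern in the diff.
DEFAULT_SEARCH_REGEX = '[^a-zA-Z0-9-_:. ]'
S3_VALID_EXTERNAL_REGEX = '(?!(^xn--|.+-s3alias$))^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$'


def sanitize_search_term(term: str) -> str:
    """Drop every character outside the DEFAULT_SEARCH allow-list."""
    return re.sub(DEFAULT_SEARCH_REGEX, '', term)


def is_valid_external_bucket_name(name: str) -> bool:
    """Hypothetical check: does the name satisfy the S3 external-name pattern?"""
    return re.match(S3_VALID_EXTERNAL_REGEX, name) is not None


print(sanitize_search_term('glue:db<script>alert(1)</script>'))  # glue:dbscriptalert1script
print(is_valid_external_bucket_name('my-data-bucket'))           # True
print(is_valid_external_bucket_name('xn--bucket'))               # False
print(is_valid_external_bucket_name('bucket-s3alias'))           # False
```

The sanitizer is an allow-list: angle brackets, parentheses and slashes are stripped, while letters, digits, spaces and the characters `-`, `_`, `:` and `.` pass through unchanged.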

backend/dataall/core/environment/api/queries.py

Lines changed: 1 addition & 0 deletions
@@ -32,6 +32,7 @@
 
 getTrustAccount = gql.QueryField(
     name='getTrustAccount',
+    args=[gql.Argument(name='organizationUri', type=gql.NonNullableType(gql.String))],
     type=gql.String,
     resolver=get_trust_account,
     test_scope='Environment',
