@dlpzx dlpzx commented Aug 14, 2025

Issue #, if available:
Running the workshop by following the steps in https://catalog.us-east-1.prod.workshops.aws/workshops/501cb14c-91b3-455c-a2a9-d0a21ce68114/en-US/20-production results in stageA and stageB failures (in the routing Lambdas). The initial error is described in #538.

On closer inspection, the issue with the datalake_library and the Lambdas is larger than a few broken imports:

  • octagon was marked as deprecated in previous commits, so the octagon code was removed from the datalake_library, but it is still used in some Lambdas
  • if octagon is removed, some Lambdas no longer have a purpose, namely the Lambdas that updated metadata in the DynamoDB octagon tables
  • the datalake_library references SSM parameters that do not exist

Those are bugs. On top of that, the architecture of the datalake_library is not clean: there are unused methods and a mix of interfaces and configs.
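
For the missing SSM parameters, a small pre-flight check could surface the problem before a Lambda fails at runtime. The following is only a sketch, not code from this PR: the helper name `find_missing_parameters` is hypothetical, and it assumes a boto3 SSM client (or any object with the same `get_parameter` call and `exceptions.ParameterNotFound` attribute) is passed in.

```python
from typing import Any, Iterable, List


def find_missing_parameters(ssm_client: Any, names: Iterable[str]) -> List[str]:
    """Return the parameter names that SSM reports as not found.

    `ssm_client` is assumed to behave like boto3.client("ssm").
    """
    missing = []
    for name in names:
        try:
            ssm_client.get_parameter(Name=name)
        except ssm_client.exceptions.ParameterNotFound:
            # The parameter is referenced in code but was never created.
            missing.append(name)
    return missing
```

It could be called as `find_missing_parameters(boto3.client("ssm"), names)`, where `names` is the list of parameter names the library expects; which names those are is left open here.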

Next steps:

  • The Glue Job deployed using the deploy.sh script in the workshop is also not working as expected: it does not take teams into account. I am working on a fix in a separate pull request
  • Provide patterns on how custom transformation code can be added

Description of changes:

  • Removed octagon Lambda steps from stageA and stageB
  • Marked stage lambda and glue as deprecated, following the docs, where users are redirected to stageA/B
  • Simplified and rearchitected datalake_library (see the description of the layer in datalake_library/README)
  • Updated the Lambdas to use the new DataLakeClient

Testing

  • Stage A deployment
  • Stage B deployment
  • Stage A runs successfully (files transformed and stored in stage bucket)
  • (provoke the failure of the stageA state machine) -> Stage A error lambda succeeds
  • Stage B runs successfully (files transformed and stored in analytics bucket)
  • (provoke the failure of the stageB state machine) -> Stage B error lambda succeeds

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@dlpzx dlpzx force-pushed the fix/datalake-library-import branch 3 times, most recently from a0ab49f to f3141ad August 14, 2025 11:29
@dlpzx dlpzx force-pushed the fix/datalake-library-import branch from f3141ad to 5758a7d August 14, 2025 11:33
@dlpzx dlpzx marked this pull request as ready for review August 14, 2025 11:38
@dlpzx dlpzx added the bug and enhancement labels Aug 14, 2025
@@ -1,3 +1,4 @@
# [**DEPRECATED**]

should these be removed?

sqs_interface = SQSInterface(sqs_config.get_stage_dlq_name)

client = DataLakeClient(
team=event["body"]["team"], pipeline=event["body"]["pipeline"], stage=event["body"]["pipeline_stage"]

fw these keys might not exist in the event

Contributor Author

how? we send the event from the routing lambda with those inputs in the event.body
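
If a guard against missing keys is wanted anyway (for example when a state machine is started by hand with a partial payload), it could look roughly like the sketch below. `extract_routing_fields` is a hypothetical helper, not part of this PR; the only assumption taken from the diff is that `team`, `pipeline` and `pipeline_stage` live under `event["body"]`.

```python
from typing import Any, Dict

REQUIRED_BODY_KEYS = ("team", "pipeline", "pipeline_stage")


def extract_routing_fields(event: Dict[str, Any]) -> Dict[str, str]:
    """Return team, pipeline and stage from event["body"], or raise a clear error."""
    body = event.get("body")
    if not isinstance(body, dict):
        raise ValueError("event is missing a 'body' mapping")

    missing = [key for key in REQUIRED_BODY_KEYS if key not in body]
    if missing:
        raise ValueError(f"event['body'] is missing keys: {', '.join(missing)}")

    # Keyword names mirror the DataLakeClient call shown in the diff above.
    return {
        "team": body["team"],
        "pipeline": body["pipeline"],
        "stage": body["pipeline_stage"],
    }
```

The Lambda could then build the client with `DataLakeClient(**extract_routing_fields(event))`, so a malformed event fails with a readable message instead of a bare `KeyError`.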

local_path = client.s3.download_object(bucket, key)

- # Apply business business logic:
+ # Apply business logic:
@kukushking kukushking Aug 14, 2025

qq - previously datalake library also contained transformation business logic. is the user expected to extend the library or logic has completely moved here?

Contributor Author

the transformation logic was removed in previous PRs. I am assessing the best way to allow customers to add their custom transformation logic
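
One option that could be considered is a small registry that the Lambda looks up by name, so custom code lives outside the library. This is purely an illustrative sketch under stated assumptions: `register_transform`, `run_transform` and the `passthrough` example are hypothetical names, and the signature (local path in, list of output paths out) is assumed, not taken from the repository.

```python
from typing import Callable, Dict, List

# Assumed shape: a transformation takes the local path of a downloaded object
# and returns the local paths of the files it produced.
Transform = Callable[[str], List[str]]

_TRANSFORMS: Dict[str, Transform] = {}


def register_transform(name: str) -> Callable[[Transform], Transform]:
    """Decorator that registers a transformation under a unique name."""

    def decorator(func: Transform) -> Transform:
        if name in _TRANSFORMS:
            raise ValueError(f"transform {name!r} is already registered")
        _TRANSFORMS[name] = func
        return func

    return decorator


def run_transform(name: str, local_path: str) -> List[str]:
    """Look up a registered transformation and apply it to one file."""
    try:
        transform = _TRANSFORMS[name]
    except KeyError:
        raise KeyError(f"no transform registered under {name!r}") from None
    return transform(local_path)


@register_transform("passthrough")
def passthrough(local_path: str) -> List[str]:
    """Example transform: hand the file back unchanged."""
    return [local_path]
```

With something like this, the processing Lambda would call `run_transform(...)` after `client.s3.download_object(bucket, key)`, and teams would only add decorated functions.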

@dlpzx dlpzx merged commit 3ba3fb4 into aws-solutions-library-samples:main Aug 14, 2025
2 of 3 checks passed
