Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AWS: Add AWS integ tests to check task and enable tests based on required environment variables #12671

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

lliangyu-lin
Copy link
Contributor

@lliangyu-lin lliangyu-lin commented Mar 27, 2025

Description

  • Delegate some of the aws integration tests to use minio docker container, that emulates aws services.
    • This allows running tests in pipeline without the need to have a dedicated aws account and credentials
  • Move container tests from aws test to integrationTest folder
  • Added check task to depend on integrationTest

Fixes #12236

Testing

./gradlew :iceberg-aws:check Confirmed unit test and aws mocked integration tests are ran successfully. When required AWS environment variables are not set, the tests are correctly skipped.

@@ -56,8 +58,7 @@
import software.amazon.awssdk.services.s3.model.HeadObjectRequest;
import software.amazon.awssdk.services.s3.model.NoSuchKeyException;

public class TestDynamoDbCatalog {

public class TestMockDynamoDbCatalog extends BaseAwsMockTest {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we achieve the same without having to rename tests? I don't think it's clear that tests need to have a Mock in the name to properly be executed and it's very likely that this will be missed in the future if new tests are added. Hence I would suggest to keep the test names as they are

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now I use filtering to only run tests start with TestMock, but I agree it's not very clear. I think we could maybe introduce a custom annotation. Let me try out new approaches and give update on this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like using Tag and

useJUnitPlatform {
      includeTags 'verification'
    }

is the cleanest approach

@github-actions github-actions bot added the docs label Mar 31, 2025
@lliangyu-lin
Copy link
Contributor Author

cc: @nastra @xiaoxuandev I made some updates based on the reviews. Feel free to take another look when you have time.

@lliangyu-lin
Copy link
Contributor Author

@xiaoxuandev @nastra There are some concerns on moving the tests to moto since we need to add additional configs to make it work correctly. So I revert those changes in this PR and we can revisit those later with adding separate mock test group that can be enabled in the pipeline while maintaining integration tests that use actual aws services.

But we can still close out on the original issue on moving the docker based tests from test to integrationTest. Let me know what you think.

@xiaoxuandev
Copy link
Contributor

@xiaoxuandev @nastra There are some concerns on moving the tests to moto since we need to add additional configs to make it work correctly. So I revert those changes in this PR and we can revisit those later with adding separate mock test group that can be enabled in the pipeline while maintaining integration tests that use actual aws services.

But we can still close out on the original issue on moving the docker based tests from test to integrationTest. Let me know what you think.

Thanks @lliangyu-lin, that sounds good to me. Do we need to change the commit title?

@nastra
Copy link
Contributor

nastra commented Apr 2, 2025

Since the PR is now using JUnit tags to distinguish between tests I was wondering whether it would make more sense to actually move the docker-specific tests to integrationTest instead of introducing a verificationIntegrationTest.
We would then annotate the current tests under integrationTest to indicate that those need ENV vars to be set and we would automatically include those annotated tests when those ENV vars are set.
That way we don't need to introduce another test task and we could finally add integrationTest to be executed as part of normal CI.

@lliangyu-lin lliangyu-lin changed the title AWS: Delegate part of AWS integration tests to using mock aws services and enable tests in check task AWS: Add AWS integ tests to check task and enable tests based on required environment variables Apr 2, 2025
@lliangyu-lin
Copy link
Contributor Author

lliangyu-lin commented Apr 2, 2025

We would then annotate the current tests under integrationTest to indicate that those need ENV vars to be set and we would automatically include those annotated tests when those ENV vars are set.

I think in this scenario, custom annotation is better compared to tags since some integration tests require different env vars, and we don't need to introduce another task as well. Otherwise we will need to add many different tags and need to handle the condition in gradle build file.

@lliangyu-lin
Copy link
Contributor Author

Updated based on the comments.
Please take another look when you have time. cc: @nastra @xiaoxuandev

build.gradle Outdated
@@ -529,7 +529,10 @@ project(':iceberg-aws') {
inputSpec.set(s3SignerSpec)
recommend.set(true)
}
check.dependsOn('validateS3SignerSpec')
tasks.named('check') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to not change this line and keep it as is. This is so that we're in-line with how we define this in other places. And then right after the integrationTest task (after L526) is defined you would add the check.dependsOn integrationTest

@Inherited
@Target({ElementType.TYPE, ElementType.METHOD})
@Retention(RetentionPolicy.RUNTIME)
public @interface EnableAwsTest {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we should be able to reuse EnabledIfEnvironmentVariables instead of introducing a separate Java annotation

AwsIntegTestUtil.AWS_REGION,
AwsIntegTestUtil.AWS_TEST_BUCKET
})
@ExtendWith(EnableAwsTestCondition.class)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you please try and see whether we can achieve the same thing by using EnabledIfEnvironmentVariables? This would be more straightforward and easier to read/understand when the tests should be executed. Also we don't need to introduce any new classes/annotations

Copy link
Contributor Author

@lliangyu-lin lliangyu-lin Apr 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nastra Thank you for the suggestion! I made a revision to change to EnabledIfEnvironmentVariables and is working the same way, and it's good to reuse as well. But some caveats I found are that EnabledIfEnvironmentVariables cannot be applied on parent class and be inherited to all children classes, which means we need to put these annotations on each test class.

This might be enough, but some other thoughts I have are

  • Extend the EnabledIfEnvironmentVariables to include @inherited (Need to introduce a class)
  • Use AssumeNotNull to check env vars in BeforeAll (Remove any annotation, but is not clear to user why the tests are skipped since it's embedded in test setup).

@@ -47,6 +49,14 @@
import software.amazon.awssdk.services.iam.model.PutRolePolicyRequest;
import software.amazon.awssdk.services.s3.model.S3Exception;

@EnabledIfEnvironmentVariables({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think skipping those tests is reasonable, but hardcoding AWS credentials in environment variables poses security risks. While this is an existing issue, it was never enabled for CI. I would prefer a configuration-based approach over using @EnabledIfEnvironmentVariables.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed with @xiaoxuandev offline. Since we are mixing other issues on this PR, I think we are good with moving with environment variables approach and close out on moving the docker based tests first.
I will follow up with separate PRs to address other concerns in the AWS integration tests. I have created this issue to track.
cc: @nastra

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Move docker-specific tests to integrationTest configuration
4 participants