-
Notifications
You must be signed in to change notification settings - Fork 195
e2e test for KafkaTopic webhook #1031
base: master
Are you sure you want to change the base?
Conversation
c272723 to
856ccc3
Compare
tests/e2e/kafkatopic_webhook.go
Outdated
| ) | ||
|
|
||
| // TODO(mihalexa): move to k8s.go | ||
| // applyK8sResourceFromTemplateWithDryRun is copy of applyK8sResourceFromTemplate which calls a "--dry-run=<strategy>" kubectl command |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this could be solved with just adding an option for dry-run to the original command.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Such an option would need to become an input parameter and modifying the initial function was overkill, in my mind, as it's plenty good the way it is and our use of it doesn't normally (aka 90+% of the time) need it to execute with --dry-run in mind.
Outside of what I deemed to be an optimized workflow for webhook testing (aka making use of the dry-run option), in our e2e tests we always want to fully create objects (aka have them written to etcd) because that's in the nature of the tests: we create objects and we check their functionality.
The initial function is also part of a chain - it is not really standalone. Modifying its signature means modifying the signatures of all functions that relate to it in any way.
I preferred to add it as an optional/alternative function.
I could be convinced of the opposite but right now I don't feel like it is a 50-50 choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I get that but at he same time I'd like to avoid major code duplication if we can, and that in my head is more important than what you've written here but I won't hold up the PR on this if everyone else is fine with this reasoning, it's a personal preference thing for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's a difficult topic.
I definitely wouldn't add a dryRunStrategy string parameter to the original function.
But maybe we should pay the one time cost and add an extraArgs map[string][]string parameter to that one which is passed down to the TerraTest call through the other helpers.
I know it's a big refactor and I can live with postponing it, but that looks like the desirable ideal solution to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The chain in question is : applyK8sResourceFromTemplate -> applyK8sResourceManifestFromString -> now moving into terratest's k8s pkg -> KubectlApplyFromStringE -> KubectlApplyE (inserts "apply -f") -> RunKubectlE -> RunKubectlAndGetOutputE -> shell.RunCommandAndGetOutputE.
Part of the issue with this chain is that if we want to use terratest's KubectlApply<...> functions (link here), they don't have ...args in their signature at all - that's the root of the problem and otherwise it would've probably been an easy refactor. The first point in the chain where we get ...args is RunKubectlE.
Examples of function signatures relevant to us in the current implementation (also check comments I added at the end of some lines):
func KubectlApplyFromStringE(t testing.TestingT, options *KubectlOptions, configData string) error {
...
return KubectlApplyE(t, options, tmpfile)
}
func KubectlApplyE(t testing.TestingT, options *KubectlOptions, configPath string) error {
return RunKubectlE(t, options, "apply", "-f", configPath) <<<<< first args... ; this is where "apply -f" gets added
}
func RunKubectlE(t testing.TestingT, options *KubectlOptions, args ...string) error { <<<<< first args...
_, err := RunKubectlAndGetOutputE(t, options, args...)
return err
}
It is from the implementations behind that chain of functions that I pieced together a direct path to my goal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to take Marton's suggestion from #1031 (comment) (which I am guessing was meant for this thread) and refactor applyK8sResourceManifestFromString (and upwards one level in our own functions) to approximate the KubectlApply... from terratest instead of using it directly ? (and in the process eliminate the need for a new function for dry-run ?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to take Marton's suggestion from #1031 (comment) (which I am guessing was meant for this thread) and refactor applyK8sResourceManifestFromString (and upwards one level in our own functions) to approximate the KubectlApply... from terratest instead of using it directly ? (and in the process eliminate the need for a new function for dry-run ?)
Yeah, I think that was kind of my point as well, I meant it somewhat that way, I see why my thought was simpler but not working, but the end result seems similar to me.
tests/e2e/kafkatopic_webhook.go
Outdated
| requireCreatingKafkaCluster(kubectlOptions, "../../config/samples/simplekafkacluster.yaml") | ||
| testWebhookCreateKafkaTopic(kubectlOptions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the current implementation of e2e tests shouldn't this be done by this point? Not that these should cause problems in that case, it just might make stuff slower.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not 100% sure about what you meant in the question. I will try to answer what I understood from it but please follow up if I didn't get it.
I favored having the webhook testing not rely on other previous tests and the flexibility this gives us.
There a few things I considered :
- our resources have different prerequisites (e.g. KafkaTopic requires that the indicated KafkaCluster exists) and we need a clear place to aggregate those
- this PR also aimed to provide a structure where other webhook e2e tests (and their prerequisites) could easily slot in ;
- e.g. kafkatopics may only require a/any kafkacluster but other things may require more intricate or specific setups
- this PR also aimed to provide a structure where other webhook e2e tests (and their prerequisites) could easily slot in ;
- our KafkaCluster related testing may not always be done on the absolutely most minimal setting possible while for the purposes of KafkaTopic webhook testing we don't care about the kafkacluster resource (so minimum is best) but there is no way to go around not having the KafkaCluster.
- this structure allowed me to decouple what kafkaclusters we test when we test kafkaclusters from what kafkaclusters we use when we test webhooks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I'm fine with this reasoning, though I'd like to see how this looks when Marton's PR is merged.
Update: Checked back on that and I'd suggest modifying this to only really deal with the webhook tests themselves, not with installing a kafka clsuter and such as any real test case we might create which includes the webhook tests will include installing a kafka cluster and such, like it does now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think either we have to do everything from deploying components here or only do the webhooks and let the rest be built up at somewhere else, we are in the middle here which is not good because ensuring the kafka cluster is not the webhook test's responsibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can create an explicit or implicit dependency on the KafkaCluster CR name, remove the creation and cleanup steps and reorder the function calls in the suite file so that webhook testing comes right after one of the KafkaCluster creations we do.
One worry I voiced is about how such code scales for other webhooks and their CRs and their prerequisites and dependencies.
For example, explicit dependency injection (via function parameters like a KafkaClusterName) might mean we would keeeeep adding params to function definitions and that's not something we want either.
Or it could be "implicit" and we just "assume" certain constants are used for names - which we already do "here and there".
I'll explain my initial stance: I wanted to separate the setup steps for each webhook because I worried the prerequisites between webhooks could vary enough to make another type of implementation very messy. That's why the sequence of I am/was proposing was : testWebhooks() -> testWebhookKafkaTopic(kubectlOptions) -> -> testCreate + testUpdate -> teardown.
I can be convinced to, for instance, not have the setup phase as part of testWebhookKafkaTopic().
Later edit: as in to not have it at all and depend on resources installed in the main test suite function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can create an explicit or implicit dependency on the KafkaCluster CR name, remove the creation and cleanup steps and reorder the function calls in the suite file so that webhook testing comes right after one of the KafkaCluster creations we do.
Explicit rather, but otherwise this was my thought, yes.
I can be convinced to, for instance, not have the setup phase as part of testWebhookKafkaTopic().
Yeah with this naming IMO the setup/teardown of the components or clusters shouldn't be part of the function. Either we should name it testKafkaClusterCreateAndWebhookKafkaTopic() or we need to make something else do the setup.
panyuenlau
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Found a couple of typos and left a question for my own understanding, no major issues. Good job!
tests/e2e/kafkatopic_webhook.go
Outdated
| }) | ||
| } | ||
|
|
||
| func testWebhookKafkTopic(kubectlOptions k8s.KubectlOptions) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo
| func testWebhookKafkTopic(kubectlOptions k8s.KubectlOptions) { | |
| func testWebhookKafkaTopic(kubectlOptions k8s.KubectlOptions) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for catching this. I'll have to update it in the function definition and at the call site.
Done in 10b2248.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you are using VSCode/VSCodium the extension CodeSpellChecker (streetsidesoftware.code-spell-checker) can catch this in editing time.
856ccc3 to
10b2248
Compare
pregnor
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, couple comments, overall looks really good, thanks.
tests/e2e/koperator_suite_test.go
Outdated
| testInstallKafkaCluster("../../config/samples/simplekafkacluster_ssl.yaml") | ||
| testProduceConsumeInternalSSL(defaultTLSSecretName) | ||
| testUninstallKafkaCluster() | ||
| testWebhooks() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: Shouldn't this be at line 62? This sounds like something which is both easier to test and required sooner than what comes from line 62.
I know you discussed somewhat this with Dávid, I'm coming from a slightly different angle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the current implementation where webhook prerequisites are decoupled, the logical trouble I saw was that webhook testing would end up doing the install testing for the big CRs before those dedicated tests happen.
I'll give the example we have regarding KafkaTopic : in order to test anything about KafkaTopic we need a KafkaCluster to be up which means we would end up testing KafkaCluster installation (without meaning to) before the dedicated KafkaCluster installation and functional tests.
I saw that as not good so I dedided to place webhook testing towards the end as a category of its own which can then rely on dedicated install and functional tests working (and having been addressed in their own sections beforehand).
I can be convinced otherwise, especially within the context of this conversation : #1031 (comment) .
tests/e2e/kafkatopic_webhook.go
Outdated
| ) | ||
|
|
||
| // TODO(mihalexa): move to k8s.go | ||
| // applyK8sResourceFromTemplateWithDryRun is copy of applyK8sResourceFromTemplate which calls a "--dry-run=<strategy>" kubectl command |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's a difficult topic.
I definitely wouldn't add a dryRunStrategy string parameter to the original function.
But maybe we should pay the one time cost and add an extraArgs map[string][]string parameter to that one which is passed down to the TerraTest call through the other helpers.
I know it's a big refactor and I can live with postponing it, but that looks like the desirable ideal solution to me.
tests/e2e/kafkatopic_webhook.go
Outdated
| requireCreatingKafkaCluster(kubectlOptions, "../../config/samples/simplekafkacluster.yaml") | ||
| testWebhookCreateKafkaTopic(kubectlOptions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think either we have to do everything from deploying components here or only do the webhooks and let the rest be built up at somewhere else, we are in the middle here which is not good because ensuring the kafka cluster is not the webhook test's responsibility.
tests/e2e/kafkatopic_webhook.go
Outdated
| dryRunStrategyServer, | ||
| ) | ||
| Expect(err).To(HaveOccurred()) | ||
| // Example error: The KafkaTopic "topic-test-internal" is invalid: spec.clusterRef.name: Invalid value: "kafkaNOT": kafkaCluster 'kafkaNOT' in the namespace 'kafka' does not exist |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: Is this really meaningful with line 108 being present? I see it as somewhat of a noise, but could be wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea was to show the messages in full and then match on noteworthy parts of them.
Also, in future refactors, someone may have a better matching logic idea if they see the "real" and full expected message/output.
Also on the point of consistency: if it's deemed useful for most other situations to have the "example error" comment, then I'd rather see it even in situations where it can look redundant on first glance, at the very least as an example of sticking to a developer "communication" practice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In such cases my practice was to build the message clearly from 1 level of indirection if variables are defined - but mostly no variables for reuse in end to end tests for clarity purposes so string literals are the interface we are using.
I'm biased towards my experience but I can try and convince myself about this approach as well.
|
LGTM! Line 418 in a13805f
like here: func deleteK8sResource( |
94f0a6e
94f0a6e to
d76b478
Compare
| applyK8sResourceManifest(kubectlOptions, tempPath) | ||
| err = applyK8sResourceManifest(kubectlOptions, tempPath) | ||
| if err != nil { | ||
| return errors.WrapIfWithDetails(err, "applying CRD failed", "crd", string(crd)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note:
When we use the emperror.dev/errors's WithDetails feature then IMHO the error message will not contain the details because it is stored in a separate structure member variable and not in the error message itself.
I think we should figure out something to solve this problem.
I favor the std Go error handling with the fmt.Errorf but I know the stacktrace will be not there. (it is also stored in a separate struct member variable in emperror case)
I know we use that lot of places in the e2e codebase
38e9f64 to
e0282d3
Compare
b66f9e3 to
274bc65
Compare
274bc65 to
d21676d
Compare
Description
This PR aims to provide an e2e test for the KafkaTopic validating webhook (for create and update operations).
I chose to implement this by using the
--dry-run=servermode ofkubectlas much as possible because that triggers all the api-server-side logic but without persisting the objects to storage (etcd) which is in exactly what we want in most cases during such tests. This way we have fewer objects to clean up and account/wait for.This mean I had to provide a local shortened replica of Terratest's
KubectlApplyFromStringEbecause the function chain involved in the regular call to ourapplyK8sResourceFromTemplate()does not allow passing...argssoon enough to be able to inject the--dry-run=serverI wanted.The chain in question is : applyK8sResourceFromTemplate -> applyK8sResourceManifestFromString -> now moving into terratest's k8s pkg -> KubectlApplyFromStringE -> KubectlApplyE (inserts "apply -f") -> RunKubectlE -> RunKubectlAndGetOutputE -> shell.RunCommandAndGetOutputE.
ℹ️
Updateoperation tests look similar toCreateones but they are not identical neither in intent nor in output (there are some cases and errors specific only to Update). That is why I did not group them further in some common function.ℹ️ The reason for assertions like
Expect(len(strings.Split(output, "\n"))).To(Equal(1))is that I intend to cause particular errors and test cases so I also want to match that little else happens outside of the targeted validation error.requireDeployKafkaTopic()unusable outside of their initial context.The core issue at play here is that the pattern present in many tests like this one is incorrect.
Ginkgo runs in phases which means that at runtime, the code above doesn't quite execute in the order that it is written.
Doc reference : https://onsi.github.io/ginkgo/#mental-model-how-ginkgo-traverses-the-spec-hierarchy
To be precise :
What this means is that, in that current pattern, the
kubectlOptionsthat is passed (by copy) to those functions is empty because the copying is done during theTree Construction Phase(in aWhen) but the useful value is populated inkubectlOptionsmuch later during theRun Phase(in anIt). This can be easily proven with debug/print statements.For this issue, during this PR, I only provided a temporary local fix so that the new tests can run and reuse the necessary functions. To some extent I'm asking you to overlook that part during review. It will get refactored after #1020.
ℹ️ I indicated some of my refactoring plans with comments including about moving some functions to other files. The reason for keeping as much as possible within a single file during this PR (but not forever!) is that #1020 is large and complicated and in case my PR gets merged first (on account of being far less complex), it will be significantly simpler to resolve conflicts or rebase #1020 if my PR mainly adds one noteworthy file (that I will break up later as indicated in the comments).
Type of Change
Checklist