e2e test for KafkaTopic webhook #1031

mihaialexandrescu · 2023-08-03T11:07:15Z

Description

This PR aims to provide an e2e test for the KafkaTopic validating webhook (for create and update operations).

I chose to implement this by using the --dry-run=server mode of kubectl as much as possible because that triggers all the api-server-side logic but without persisting the objects to storage (etcd), which is exactly what we want in most cases during such tests. This way we have fewer objects to clean up and account/wait for.
This means I had to provide a local shortened replica of Terratest's KubectlApplyFromStringE because the function chain involved in the regular call to our applyK8sResourceFromTemplate() does not allow passing ...args early enough to inject the --dry-run=server I wanted.
The chain in question is : applyK8sResourceFromTemplate -> applyK8sResourceManifestFromString -> now moving into terratest's k8s pkg -> KubectlApplyFromStringE -> KubectlApplyE (inserts "apply -f") -> RunKubectlE -> RunKubectlAndGetOutputE -> shell.RunCommandAndGetOutputE.
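To make the shortcut concrete, here is a minimal stdlib-only sketch of what such a shortened replica could look like. It is not the code from this PR: the helper name dryRunApplyCommand and its signature are hypothetical, and it only builds the kubectl command instead of running it through Terratest's RunKubectlAndGetOutputE.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// dryRunApplyCommand is a hypothetical helper (not part of this PR or of Terratest).
// It stores the manifest in a temporary file and builds a
// `kubectl apply --dry-run=<strategy> -f <file>` command without running it.
// The returned cleanup function removes the temporary file.
func dryRunApplyCommand(manifest, strategy string) (*exec.Cmd, func(), error) {
	tmp, err := os.CreateTemp("", "manifest-*.yaml")
	if err != nil {
		return nil, nil, err
	}
	cleanup := func() { _ = os.Remove(tmp.Name()) }

	if _, err := tmp.WriteString(manifest); err != nil {
		_ = tmp.Close()
		cleanup()
		return nil, nil, err
	}
	if err := tmp.Close(); err != nil {
		cleanup()
		return nil, nil, err
	}

	// The dry-run flag sits next to "apply", which is the argument position
	// that the Terratest KubectlApply* helpers do not expose.
	cmd := exec.Command("kubectl", "apply", "--dry-run="+strategy, "-f", tmp.Name())
	return cmd, cleanup, nil
}

func main() {
	cmd, cleanup, err := dryRunApplyCommand("kind: KafkaTopic\n", "server")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer cleanup()
	fmt.Println(cmd.Args)
}
```

In the real test code the assembled arguments would presumably be handed to a Terratest function that accepts ...args, such as RunKubectlAndGetOutputE, rather than to os/exec directly.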

ℹ️ Update operation tests look similar to Create ones but they are not identical in either intent or output (there are some cases and errors specific only to Update). That is why I did not group them further in some common function.

ℹ️ The reason for assertions like Expect(len(strings.Split(output, "\n"))).To(Equal(1)) is that I intend to cause particular errors and test cases so I also want to match that little else happens outside of the targeted validation error.
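A sketch of that pair of checks outside of Gomega, assuming a hypothetical helper isOnlyExpectedError and an output string modeled on the example error quoted in the test comments of this PR. The strings.TrimSpace call is an addition of this sketch so that a trailing newline does not count as a second line.

```go
package main

import (
	"fmt"
	"strings"
)

// isOnlyExpectedError is a hypothetical helper: it reports whether output
// consists of a single line and that line contains the targeted message.
func isOnlyExpectedError(output, want string) bool {
	lines := strings.Split(strings.TrimSpace(output), "\n")
	return len(lines) == 1 && strings.Contains(lines[0], want)
}

func main() {
	output := `The KafkaTopic "topic-test-internal" is invalid: spec.clusterRef.name: Invalid value: "kafkaNOT": kafkaCluster 'kafkaNOT' in the namespace 'kafka' does not exist`

	// Only the targeted validation error is present.
	fmt.Println(isOnlyExpectedError(output, "does not exist"))

	// An additional, unexpected line makes the check fail.
	fmt.Println(isOnlyExpectedError(output+"\nsome other error", "does not exist"))
}
```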

⚠️ This PR will be followed by some refactoring after #1020 because that PR, via its kubeconfig injection logic, will inadvertently fix an issue we have today which makes functions like requireDeployKafkaTopic() unusable outside of their initial context.
The core issue at play here is that the pattern present in many tests like this one is incorrect.

	return When("Internally produce and consume message to/from Kafka cluster", func() {
		var kubectlOptions k8s.KubectlOptions
		var err error

		It("Acquiring K8s config and context", func() {
			kubectlOptions, err = kubectlOptionsForCurrentContext()
			Expect(err).NotTo(HaveOccurred())
		})

		kubectlOptions.Namespace = koperatorLocalHelmDescriptor.Namespace

		requireDeployingKcatPod(kubectlOptions, kcatPodName, "")
		requireDeployingKafkaTopic(kubectlOptions, testInternalTopicName)

Ginkgo runs in phases which means that at runtime, the code above doesn't quite execute in the order that it is written.
Doc reference : https://onsi.github.io/ginkgo/#mental-model-how-ginkgo-traverses-the-spec-hierarchy
To be precise :

When Ginkgo runs a suite it does so in two phases. The Tree Construction Phase followed by the Run Phase.

During the Tree Construction Phase Ginkgo enters all container nodes by invoking their closures to construct the spec tree. During this phase Ginkgo is capturing and saving off the various setup and subject node closures it encounters in the tree without running them. Only container node closures run during this phase and Ginkgo does not expect to encounter any assertions as no specs are running yet.
...
Once the spec tree is constructed Ginkgo walks the tree to generate a flattened list of specs.
...
During the Run Phase Ginkgo runs through each spec in the spec list sequentially. When running a spec Ginkgo invokes the setup and subject nodes closures in the correct order and tracks any failed assertions. Note that container node closures are never invoked during the run phase.

What this means is that, in that current pattern, the kubectlOptions that is passed (by copy) to those functions is empty because the copying is done during the Tree Construction Phase (in a When) but the useful value is populated in kubectlOptions much later during the Run Phase (in an It). This can be easily proven with debug/print statements.
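The effect can be reproduced without Ginkgo. The sketch below is a simplified simulation, not Ginkgo's implementation: the suite type, the options struct (standing in for k8s.KubectlOptions) and both require* helpers are hypothetical. Container closures run immediately, spec closures are stored and run later, so a value copied during construction stays empty while a pointer read during the run sees the populated field. Passing a pointer is only one way to defer the read and is not necessarily the fix used in this PR or in #1020.

```go
package main

import "fmt"

// options stands in for k8s.KubectlOptions; it is a simplified, hypothetical type.
type options struct{ Namespace string }

// suite mimics the two phases: container closures run immediately
// (tree construction), spec closures are stored and executed later (run phase).
type suite struct{ specs []func() }

func (s *suite) When(body func()) { body() }

func (s *suite) It(spec func()) { s.specs = append(s.specs, spec) }

func (s *suite) Run() {
	for _, spec := range s.specs {
		spec()
	}
}

// requireByValue copies opts when it is called, i.e. during tree construction.
func requireByValue(s *suite, opts options, seen *[]string) {
	s.It(func() { *seen = append(*seen, opts.Namespace) })
}

// requireByPointer dereferences opts only when the spec actually runs.
func requireByPointer(s *suite, opts *options, seen *[]string) {
	s.It(func() { *seen = append(*seen, opts.Namespace) })
}

// runDemo returns the namespace each registered spec observed, in run order.
func runDemo() []string {
	s := &suite{}
	var seen []string
	s.When(func() {
		var opts options
		// Populated only in the run phase.
		s.It(func() { opts.Namespace = "kafka" })
		// The copy is taken now, while opts is still empty.
		requireByValue(s, opts, &seen)
		// The pointer is read later, after the first spec has run.
		requireByPointer(s, &opts, &seen)
	})
	s.Run()
	return seen
}

func main() {
	fmt.Printf("%q\n", runDemo())
}
```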

For this issue, during this PR, I only provided a temporary local fix so that the new tests can run and reuse the necessary functions. To some extent I'm asking you to overlook that part during review. It will get refactored after #1020.

ℹ️ I indicated some of my refactoring plans with comments including about moving some functions to other files. The reason for keeping as much as possible within a single file during this PR (but not forever!) is that #1020 is large and complicated and in case my PR gets merged first (on account of being far less complex), it will be significantly simpler to resolve conflicts or rebase #1020 if my PR mainly adds one noteworthy file (that I will break up later as indicated in the comments).

Type of Change

Other (please describe): e2e test

Checklist

I have read the contributing guidelines
I have verified this change is not present in other open pull requests
All code style checks pass

Kuvesz · 2023-08-07T09:17:00Z

tests/e2e/kafkatopic_webhook.go

+)
+
+// TODO(mihalexa): move to k8s.go
+// applyK8sResourceFromTemplateWithDryRun is copy of applyK8sResourceFromTemplate which calls a "--dry-run=<strategy>" kubectl command


I feel like this could be solved with just adding an option for dry-run to the original command.

Such an option would need to become an input parameter and modifying the initial function was overkill, in my mind, as it's plenty good the way it is and our use of it doesn't normally (aka 90+% of the time) need it to execute with --dry-run in mind.
Outside of what I deemed to be an optimized workflow for webhook testing (aka making use of the dry-run option), in our e2e tests we always want to fully create objects (aka have them written to etcd) because that's in the nature of the tests: we create objects and we check their functionality.

The initial function is also part of a chain - it is not really standalone. Modifying its signature means modifying the signatures of all functions that relate to it in any way.

I preferred to add it as an optional/alternative function.

I could be convinced of the opposite but right now I don't feel like it is a 50-50 choice.

Yeah, I get that, but at the same time I'd like to avoid major code duplication if we can, and in my head that is more important than what you've written here. I won't hold up the PR on this if everyone else is fine with this reasoning; it's a personal preference thing for me.

Yeah, that's a difficult topic.
I definitely wouldn't add a dryRunStrategy string parameter to the original function.

But maybe we should pay the one time cost and add an extraArgs map[string][]string parameter to that one which is passed down to the TerraTest call through the other helpers.
I know it's a big refactor and I can live with postponing it, but that looks like the desirable ideal solution to me.

The chain in question is : applyK8sResourceFromTemplate -> applyK8sResourceManifestFromString -> now moving into terratest's k8s pkg -> KubectlApplyFromStringE -> KubectlApplyE (inserts "apply -f") -> RunKubectlE -> RunKubectlAndGetOutputE -> shell.RunCommandAndGetOutputE.

Part of the issue with this chain is that if we want to use terratest's KubectlApply<...> functions (link here), they don't have ...args in their signature at all - that's the root of the problem and otherwise it would've probably been an easy refactor. The first point in the chain where we get ...args is RunKubectlE.
Examples of function signatures relevant to us in the current implementation (also check comments I added at the end of some lines):

	func KubectlApplyFromStringE(t testing.TestingT, options *KubectlOptions, configData string) error {
		...
		return KubectlApplyE(t, options, tmpfile)
	}

	func KubectlApplyE(t testing.TestingT, options *KubectlOptions, configPath string) error {
		return RunKubectlE(t, options, "apply", "-f", configPath) // <<<<< first args... ; this is where "apply -f" gets added
	}

	func RunKubectlE(t testing.TestingT, options *KubectlOptions, args ...string) error { // <<<<< first args...
		_, err := RunKubectlAndGetOutputE(t, options, args...)
		return err
	}

It is from the implementations behind that chain of functions that I pieced together a direct path to my goal.

Do we want to take Marton's suggestion from #1031 (comment) (which I am guessing was meant for this thread) and refactor applyK8sResourceManifestFromString (and upwards one level in our own functions) to approximate the KubectlApply... from terratest instead of using it directly ? (and in the process eliminate the need for a new function for dry-run ?)

Yeah, I think that was kind of my point as well, I meant it somewhat that way, I see why my thought was simpler but not working, but the end result seems similar to me.

Kuvesz · 2023-08-07T09:18:36Z

tests/e2e/kafkatopic_webhook.go

+	requireCreatingKafkaCluster(kubectlOptions, "../../config/samples/simplekafkacluster.yaml")
+	testWebhookCreateKafkaTopic(kubectlOptions)


In the current implementation of e2e tests shouldn't this be done by this point? Not that these should cause problems in that case, it just might make stuff slower.

I am not 100% sure about what you meant in the question. I will try to answer what I understood from it but please follow up if I didn't get it.

I favored having the webhook testing not rely on other previous tests and the flexibility this gives us.

There are a few things I considered :

our resources have different prerequisites (e.g. KafkaTopic requires that the indicated KafkaCluster exists) and we need a clear place to aggregate those

this PR also aimed to provide a structure where other webhook e2e tests (and their prerequisites) could easily slot in ;

e.g. kafkatopics may only require a/any kafkacluster but other things may require more intricate or specific setups

our KafkaCluster-related testing may not always be done on the most minimal setup possible, while for the purposes of KafkaTopic webhook testing we don't care about the KafkaCluster resource (so minimal is best), but there is no way around needing the KafkaCluster.

this structure allowed me to decouple what kafkaclusters we test when we test kafkaclusters from what kafkaclusters we use when we test webhooks

Okay, I'm fine with this reasoning, though I'd like to see how this looks when Marton's PR is merged.

Update: I checked back on that, and I'd suggest modifying this to deal only with the webhook tests themselves, not with installing a Kafka cluster and such, since any real test case we create that includes the webhook tests will already install a Kafka cluster, like it does now.

Yeah, I think either we have to do everything here, starting from deploying components, or only do the webhooks and let the rest be built up somewhere else. We are in the middle here, which is not good, because ensuring the Kafka cluster exists is not the webhook test's responsibility.

We can create an explicit or implicit dependency on the KafkaCluster CR name, remove the creation and cleanup steps and reorder the function calls in the suite file so that webhook testing comes right after one of the KafkaCluster creations we do.

One worry I voiced is about how such code scales for other webhooks and their CRs and their prerequisites and dependencies.
For example, explicit dependency injection (via function parameters like a KafkaClusterName) might mean we would keeeeep adding params to function definitions and that's not something we want either.
Or it could be "implicit" and we just "assume" certain constants are used for names - which we already do "here and there".

I'll explain my initial stance: I wanted to separate the setup steps for each webhook because I worried the prerequisites between webhooks could vary enough to make another type of implementation very messy. That's why the sequence I am/was proposing was : testWebhooks() -> testWebhookKafkaTopic(kubectlOptions) -> testCreate + testUpdate -> teardown.

I can be convinced to, for instance, not have the setup phase as part of testWebhookKafkaTopic().
Later edit: as in to not have it at all and depend on resources installed in the main test suite function

We can create an explicit or implicit dependency on the KafkaCluster CR name, remove the creation and cleanup steps and reorder the function calls in the suite file so that webhook testing comes right after one of the KafkaCluster creations we do.

Explicit rather, but otherwise this was my thought, yes.

I can be convinced to, for instance, not have the setup phase as part of testWebhookKafkaTopic().

Yeah with this naming IMO the setup/teardown of the components or clusters shouldn't be part of the function. Either we should name it testKafkaClusterCreateAndWebhookKafkaTopic() or we need to make something else do the setup.

tests/e2e/kafkatopic_webhook.go

panyuenlau

Found a couple of typos and left a question for my own understanding, no major issues. Good job!

panyuenlau · 2023-08-09T14:44:53Z

tests/e2e/kafkatopic_webhook.go

+	})
+}
+
+func testWebhookKafkTopic(kubectlOptions k8s.KubectlOptions) {


Typo

Suggested change

func testWebhookKafkTopic(kubectlOptions k8s.KubectlOptions) {

func testWebhookKafkaTopic(kubectlOptions k8s.KubectlOptions) {

Thank you for catching this. I'll have to update it in the function definition and at the call site.

Done in 10b2248.

If you are using VSCode/VSCodium, the Code Spell Checker extension (streetsidesoftware.code-spell-checker) can catch this at editing time.

tests/e2e/kafkatopic_webhook.go

tests/e2e/koperator_suite_test.go

pregnor

LGTM, couple comments, overall looks really good, thanks.

pregnor · 2023-08-14T16:53:17Z

tests/e2e/koperator_suite_test.go

 	testInstallKafkaCluster("../../config/samples/simplekafkacluster_ssl.yaml")
 	testProduceConsumeInternalSSL(defaultTLSSecretName)
 	testUninstallKafkaCluster()
+	testWebhooks()


Question: Shouldn't this be at line 62? This sounds like something which is both easier to test and required sooner than what comes from line 62.
I know you somewhat discussed this with Dávid; I'm coming from a slightly different angle.

With the current implementation where webhook prerequisites are decoupled, the logical trouble I saw was that webhook testing would end up doing the install testing for the big CRs before those dedicated tests happen.

I'll give the example we have regarding KafkaTopic : in order to test anything about KafkaTopic we need a KafkaCluster to be up which means we would end up testing KafkaCluster installation (without meaning to) before the dedicated KafkaCluster installation and functional tests.

I saw that as not good, so I decided to place webhook testing towards the end as a category of its own which can then rely on dedicated install and functional tests working (and having been addressed in their own sections beforehand).

I can be convinced otherwise, especially within the context of this conversation : #1031 (comment) .

tests/e2e/kafkatopic_webhook.go

pregnor · 2023-08-14T17:05:01Z

tests/e2e/kafkatopic_webhook.go

+)
+
+// TODO(mihalexa): move to k8s.go
+// applyK8sResourceFromTemplateWithDryRun is copy of applyK8sResourceFromTemplate which calls a "--dry-run=<strategy>" kubectl command


Yeah, that's a difficult topic.
I definitely wouldn't add a dryRunStrategy string parameter to the original function.

But maybe we should pay the one time cost and add an extraArgs map[string][]string parameter to that one which is passed down to the TerraTest call through the other helpers.
I know it's a big refactor and I can live with postponing it, but that looks like the desirable ideal solution to me.

pregnor · 2023-08-14T17:07:42Z

tests/e2e/kafkatopic_webhook.go

+	requireCreatingKafkaCluster(kubectlOptions, "../../config/samples/simplekafkacluster.yaml")
+	testWebhookCreateKafkaTopic(kubectlOptions)


Yeah, I think either we have to do everything here, starting from deploying components, or only do the webhooks and let the rest be built up somewhere else. We are in the middle here, which is not good, because ensuring the Kafka cluster exists is not the webhook test's responsibility.

pregnor · 2023-08-14T17:09:52Z

tests/e2e/kafkatopic_webhook.go

+				dryRunStrategyServer,
+			)
+			Expect(err).To(HaveOccurred())
+			// Example error: The KafkaTopic "topic-test-internal" is invalid: spec.clusterRef.name: Invalid value: "kafkaNOT": kafkaCluster 'kafkaNOT' in the namespace 'kafka' does not exist


Question: Is this really meaningful with line 108 being present? I see it as somewhat of a noise, but could be wrong.

The idea was to show the messages in full and then match on noteworthy parts of them.
Also, in future refactors, someone may have a better matching logic idea if they see the "real" and full expected message/output.

Also on the point of consistency: if it's deemed useful for most other situations to have the "example error" comment, then I'd rather see it even in situations where it can look redundant on first glance, at the very least as an example of sticking to a developer "communication" practice.

In such cases my practice was to build the message clearly from 1 level of indirection if variables are defined - but mostly no variables for reuse in end to end tests for clarity purposes so string literals are the interface we are using.
I'm biased towards my experience but I can try and convince myself about this approach as well.

bartam1 · 2023-08-16T12:13:16Z

LGTM!
I agree with pregnor that it would be better to go deeper and propagate upwards the extraArgs map[string][]string from here:

koperator/tests/e2e/k8s.go

Line 418 in a13805f

    
           func applyK8sResourceManifestFromString(kubectlOptions k8s.KubectlOptions, manifest string) error {

So we would go deeper at this point and use something like this:

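	tmpfile, err := StoreConfigToTempFileE(t, configData)
	if err != nil {
		return err
	}
	defer os.Remove(tmpfile)

	// Pass the temp file path itself (the variable, not the string literal "tmpfile").
	args := []string{"apply", "-f", tmpfile}
	// Assumes extraArgs has already been flattened into a []string.
	args = append(args, extraArgs...)

	// err is already declared above, so assign instead of redeclaring it.
	_, err = k8s.RunKubectlAndGetOutputE(
		GinkgoT(),
		&kubectlOptions,
		args...,
	)
	return err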

like here: func deleteK8sResource(
This would not be a big refactor and it would make the helpers more consistent with each other.
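Since the proposal mentions an extraArgs map[string][]string while kubectl ultimately needs a flat argument list, a conversion step would be needed somewhere. The sketch below shows one possible shape for it. buildApplyArgs, the "flag=value" rendering and the sorting are all assumptions made for illustration; none of them come from koperator or Terratest.

```go
package main

import (
	"fmt"
	"sort"
)

// buildApplyArgs is a hypothetical helper. It appends every flag in extraArgs
// to the base "apply -f <path>" arguments, emitting one "flag=value" entry per
// value (or the bare flag when no values are given). Flags are sorted so the
// result is deterministic despite Go's random map iteration order.
func buildApplyArgs(path string, extraArgs map[string][]string) []string {
	args := []string{"apply", "-f", path}

	flags := make([]string, 0, len(extraArgs))
	for flag := range extraArgs {
		flags = append(flags, flag)
	}
	sort.Strings(flags)

	for _, flag := range flags {
		values := extraArgs[flag]
		if len(values) == 0 {
			args = append(args, flag)
			continue
		}
		for _, value := range values {
			args = append(args, flag+"="+value)
		}
	}
	return args
}

func main() {
	args := buildApplyArgs("/tmp/topic.yaml", map[string][]string{
		"--dry-run": {"server"},
	})
	fmt.Println(args)
}
```

With a helper of this kind, callers that do not need extra flags could pass nil and get the plain "apply -f" arguments back.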

bartam1 · 2023-08-23T12:24:39Z

tests/e2e/k8s.go

-		applyK8sResourceManifest(kubectlOptions, tempPath)
+		err = applyK8sResourceManifest(kubectlOptions, tempPath)
+		if err != nil {
+			return errors.WrapIfWithDetails(err, "applying CRD failed", "crd", string(crd))


Note:
When we use the WithDetails feature of emperror.dev/errors, IMHO the error message will not contain the details because they are stored in a separate struct member and not in the error message itself.

I think we should figure out something to solve this problem.
I favor the std Go error handling with fmt.Errorf, but I know the stack trace will not be there (it is also stored in a separate struct member in the emperror case).

I know we use that in a lot of places in the e2e codebase.
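The concern can be illustrated with the standard library alone. In the sketch below, detailedError and withDetails are hypothetical stand-ins that mimic the behavior described above (details kept in a field, message left untouched); they are not the emperror.dev/errors API.

```go
package main

import (
	"errors"
	"fmt"
)

// detailedError is a hypothetical stand-in for an error wrapper that keeps
// key-value details in a field instead of in the message.
type detailedError struct {
	err     error
	details []interface{}
}

func (e *detailedError) Error() string { return e.err.Error() }

func (e *detailedError) Unwrap() error { return e.err }

// withDetails wraps err and stores the details next to it.
func withDetails(err error, details ...interface{}) error {
	return &detailedError{err: err, details: details}
}

// messageWithDetailsField returns the message of an error whose details
// live only in the wrapper's field.
func messageWithDetailsField() string {
	cause := errors.New("kubectl exited with status 1")
	err := withDetails(fmt.Errorf("applying CRD failed: %w", cause), "crd", "kafkatopics")
	return err.Error()
}

// messageWithInlineDetails returns the message of an error built with
// fmt.Errorf, where the detail is part of the message itself.
func messageWithInlineDetails() string {
	cause := errors.New("kubectl exited with status 1")
	err := fmt.Errorf("applying CRD failed (crd=%s): %w", "kafkatopics", cause)
	return err.Error()
}

func main() {
	fmt.Println(messageWithDetailsField())
	fmt.Println(messageWithInlineDetails())
}
```

Only the second message mentions the CRD name. To surface the details in the first case, the test output would have to print them explicitly from the wrapper.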

mihaialexandrescu force-pushed the e2etest/webhook_kafkatopic branch from c272723 to 856ccc3 Compare August 3, 2023 11:35

mihaialexandrescu marked this pull request as ready for review August 3, 2023 11:57

mihaialexandrescu requested a review from a team as a code owner August 3, 2023 11:57

Kuvesz reviewed Aug 7, 2023

Kuvesz previously approved these changes Aug 9, 2023

panyuenlau reviewed Aug 9, 2023

mihaialexandrescu dismissed Kuvesz’s stale review via 10b2248 August 9, 2023 15:11

mihaialexandrescu force-pushed the e2etest/webhook_kafkatopic branch from 856ccc3 to 10b2248 Compare August 9, 2023 15:11

Kuvesz previously approved these changes Aug 10, 2023

panyuenlau previously approved these changes Aug 10, 2023

pregnor reviewed Aug 14, 2023

mihaialexandrescu dismissed stale reviews from panyuenlau and Kuvesz via 94f0a6e August 21, 2023 08:18

mihaialexandrescu force-pushed the e2etest/webhook_kafkatopic branch 2 times, most recently from 94f0a6e to d76b478 Compare August 22, 2023 08:06

bartam1 reviewed Aug 25, 2023

mihaialexandrescu marked this pull request as draft August 25, 2023 11:03

mihaialexandrescu force-pushed the e2etest/webhook_kafkatopic branch 4 times, most recently from 38e9f64 to e0282d3 Compare August 31, 2023 08:00

mihaialexandrescu added 8 commits August 31, 2023 16:19

webhook test logic with temporary structure until kubeconfig PR gets merged

205bb67

fix typo in root kafkatopic validator; same typo cannot yet be fixed in e2e tests

02700a1

split some function calls into new lines better

7208e1b

refactor Apply-type functions

111d82c

refactor kafkatopic webhook test case data

8a0926f

update to go.mod and go.sum in root and tests/e2e after tidy

cf9faef

update testcase logic; temporary commit - needs cleanup

34483c4

update testcase logic; use local kafkatopic struct; use local tempfile function

651a2c8

mihaialexandrescu force-pushed the e2etest/webhook_kafkatopic branch 3 times, most recently from b66f9e3 to 274bc65 Compare September 5, 2023 14:43

remove kafkaCluster parameter from testWebhookKafkaTopic()

d21676d

mihaialexandrescu force-pushed the e2etest/webhook_kafkatopic branch from 274bc65 to d21676d Compare September 5, 2023 15:01
