-
Notifications
You must be signed in to change notification settings - Fork 413
Add test/integration
#3372
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add test/integration
#3372
Conversation
Hi @ntnn. Thanks for your PR. I'm waiting for a kcp-dev member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
func NewControllers() *Controllers { | ||
kcmDefaults, err := kcmoptions.NewKubeControllerManagerOptions() | ||
if err != nil { | ||
panic(err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to bubble up the error, but that'd require to add error handling to every NewOptions
that is calling it:
kcp/pkg/server/options/options.go
Line 99 in 90605ce
Controllers: *NewControllers(), |
kcp/cmd/kcp/options/options.go
Line 41 in 90605ce
Server: *serveroptions.NewOptions(rootDir), |
Line 90 in 90605ce
kcpOptions := options.NewOptions(rootDir) |
Which would be a good change - but would also touch a lot of the codebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather to that in a separate PR tbh.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Panic here is not that bad. It at the point of startup config validation so its quick feedback. Not something happening at runtime 🤷
Tests with race condition testing fails. I think kube also doesn't race test their integration tests. |
/ok-to-test |
8abda95
to
f7cb942
Compare
/test pull-kcp-test-unit |
/test pull-kcp-test-e2e-shared |
This is kind of a known problem with starting kcp the first time, so we should probably tackle this in kcp itself, not the integration test framework. |
Yeah, I just measured the startup times - etcd takes less than a second, kcp phase1 takes roughly 22s. And using a shared kcd for the integration tests would make less sense imho - if the kcp instance could be shared without tests affecting each other they could likely also be e2e tests. |
Running with race detection works. Root cause was that this method could be called from multiple goroutines if no Lines 39 to 48 in a2014d6
This originates from the server config here: Lines 208 to 210 in a2014d6
So either the wrapper function is wrapped with a mutex to prevent concurrent writes or the wrapper function acts atomically. I don't particulalry like either option. kcp/pkg/network/dialer_linux.go Lines 35 to 41 in a2014d6
|
Technically it would maybe possible to set Maybe there was a good reason not to use However I can't discern a reason from reading the documentation - it even reads more like setting the transport would be the right approach to setting the TCP timeout: // Transport may be used for custom HTTP behavior. This attribute may not
// be specified with the TLS client certificate options. Use WrapTransport
// to provide additional per-server middleware behavior.
Transport http.RoundTripper From attempting to set this attribute instead of wrap - I'd guess wrapping was used because setting a transport requires setting up a transport how the library expects it rather than having the library provide a working transport and just adjusting the timeouts. |
/test pull-kcp-test-e2e-multiple-runs |
Found out why // New returns an http.RoundTripper that will provide the authentication
// or transport level security defined by the provided Config.
func New(config *Config) (http.RoundTripper, error) {
// Set transport level security
if config.Transport != nil && (config.HasCA() || config.HasCertAuth() || config.HasCertCallback() || config.TLS.Insecure) {
return nil, fmt.Errorf("using a custom transport with TLS certificate options or the insecure flag is not allowed")
} So the TLS configuration would have to be removed, which then causes connections to fail due to the missing TLS configuration:
a07f610#diff-d84eac55294d8f540a660bfcb4043f7c4e1f76c8bccbb91c129e8e4992dbafe7R210 Technically a bug, but that was introduced >7y ago - for a fix on the KCP side for now we'll have to go with either of the workarounds. |
Very nice! About startup, I think we need somebody debugging our startup process. I think there must be some polling that is slow. At some point, we have split the kcp core informers into one part that does not need identities (system CRDs) and those resource based on APIExports (which need identities). And there is logic to bootstrap in a multi-shard env. Something in this area must be far too slow. Maybe some polling with too long interval. |
Marking as WIP as I want to change how the server is started |
c5f6cb0
to
8bdd8e0
Compare
The goleak ignore list is pretty long but I'm fairly certain that quite a few are just downstream of other issues. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, happy to also approve if none of the other usual suspects in this PR (@sttts and @mjudeikis) want to take a look.
LGTM label has been added. Git tree hash: 3f155fed1096611d0ced7799a303725e25426e9b
|
Signed-off-by: Nelo-T. Wallus <[email protected]> Replace init function to prevent sharing the same defaults When multiple kcp instances are started in the same process they share the same kcmDefaults, which is initialized in the `init` function. Signed-off-by: Nelo-T. Wallus <[email protected]> Implement atomic concurrenct access solution Signed-off-by: Nelo-T. Wallus <[email protected]> Implement sync.Once concurrent access solution Signed-off-by: Nelo-T. Wallus <[email protected]> Fix nil server returned when Option changes name Signed-off-by: Nelo-T. Wallus <[email protected]> Fix data races and panics when setting up multiple servers in parallel Signed-off-by: Nelo-T. Wallus <[email protected]> Add pull-kcp-test-integration Signed-off-by: Nelo-T. Wallus <[email protected]> Fix deadlock in .Stop .cancel is locking by itself. Signed-off-by: Nelo-T. Wallus <[email protected]>
/lgtm |
LGTM label has been added. Git tree hash: 40a308411262a2b308e7b33218fabbd9051a7f92
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mjudeikis The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Summary
Adding a framework for integration tests.
Startup takes a long time - I am contemplating adding support for a shared embedded etcd (or an external etcd like in kube) if that significantly improves startup.
What Type of PR Is This?
/kind feature
Related Issue(s)
Release Notes