Skip to content

Do not persist a driver instance in the configuration object but in the browser #266

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 5, 2025

Conversation

Vampire
Copy link
Contributor

@Vampire Vampire commented Apr 19, 2025

Before this PR the driver instance was persisted in the configuration object.
So if the same configuration object was used, the same driver was used.
This happens for example when using a GebTestManager which persists a configuration per thread
and uses that in the browser creator if no custom one was set.

This causes for example the same driver to be used multiple times, even if cacheDriver is set to false.
And more prominent, if quitDriverOnBrowserReset is enabled, then after the reset the driver is still used,
but as it is quit already, throws exceptions.

With this PR, the driver is no longer persisted in the configuration instance,
but in the browser instance, and the configuration is only used to create new driver instances.

@Vampire Vampire force-pushed the non-cached-driver-fix branch from b04a389 to 32e84f4 Compare April 19, 2025 04:05
@jonnybot0
Copy link
Contributor

jonnybot0 commented Apr 22, 2025

Hey! Thanks for working to contribute.

I can see from CI (and from checking it out locally), that this breaks the ParallelExecutionWithReportingTest2 and a few other tests.

To be clear, I think the problem you describe of driver objects getting recycled even when they definitely shouldn't be is a problem. I'm concerned, though, that the test failures indicate we'd actually be breaking a part of the API contract (even if it's only implicit).

Can I ask a bit more about the scenarios where this problem pops up? I'm trying to wrap my head around what a "best fix" would be.

@jonnybot0 jonnybot0 self-assigned this Apr 22, 2025
@Vampire
Copy link
Contributor Author

Vampire commented Apr 23, 2025

Actually, this is kind of a multi-part regression, and I think this should indeed be the proper way to fix it.
If someone relied on this behaviour (which I doubt) then he relied on a bug and should imho fix their code.
Fixing would usually be as simple as enabling driver caching that was manually disabled, or at worst using a custom Configuration implementation.

Before commit 6f2117e (i.e. version 0.6.0.RC1)

  • whenever you constructed a Browser without giving an explicit Driver, the Configuration was asked for the driverInstance
  • the Browser remembered this Driver instance so new Browser means new question to Configuration
  • the Configuration decided what to do
    • whether to always return the same instance e.g. when the dirver config option was set to a concrete driver instance
    • or always return the same instance on the same thread e.g. when cacheDriver was at its default value true (cacheDriver back then meant cache driver per thread, it was a static cache with thread-local so was even cached if new configuration instances were used)
    • or whatever a custom Configuration implementation decided to do

This was how it is supposed to be.

With the following commit 84466aa (i.e. version 0.6.0.RC1) this was changed to

  • Browser does not anymore remember the Driver but always just forwards the call to the Configuration
  • Configuration now does remember the Driver once it was created, totally independent from cacheDriver setting
  • So effectively as long as the same Configuration instance was used, always the same Driver instance was used across all threads
  • When Browser.drive was used without an explicit configuration each time a new Configuration was created so this was not affected.
  • The built-in test tool integrations each time when creating a new Browser also gave it a fresh Configuration instance so they were not affected
  • At this point it was already broken, but with the built-in integrations you still were not affected as always new configurations were created. Only if you had some manual code that created multiple Browser instances with the same Configuration instance of the built-in `Configuration class

With commit 08d1a0d (i.e. version 6.0) to fix geb/issues#660 this finally backfired for the built-in integrations, because now

  • The GebTestManager which is used by all the built-in integrations before this commit had a browserCreator that like in 0.6.0.RC1 as described above each time a new Browser was created a new Configuration object was created, so still the issue did not happen.
  • After this commit, to not execute the configuration script over and over again for each test case, the GebTestManager was changed to have one Configuration instance per Thread and new Browser instances are created using this same Configuration instance, which badly remembers the Driver instance although it should not do.

So currently with the built-in Configuration class we have three cases:

  1. cacheDriver && !cacheDriverPerThread which is the default
  2. cacheDriver && cacheDriverPerThread
  3. !cacheDriver

For case 1 there is no difference with this PR merged, because whether you create new configuration objects or not is irrelevant.
Even if you create new configuration objects, the global cache will always give you the same Driver instance across all threads, so all Configuration instances currently remember the same Driver instance anyway.

For case 2 there is no difference with this PR merged, because the Configuration instances in the GebTestManager are thread-local, so the Driver that is wrongly reused is reused thread-local and even if you create new Configuration instances every time, the cache will give you the same thread-local instance.

Case 3 is where the difference is, because you configured that you do not want the driver to be cached / reused, but the Configuration instances in the GebTestManager since 6.0 badly remember it.

Especially if you combine case 3 with quitDriverOnBrowserReset or otherwise quite the driver, this bites you immediately, because due to the remembered Driver in the Configuration even with cacheDriver disabled the next Browser gets the same Driver instance that is quit already and cannot be used anymore.

My currently work-around for this is to effectively revert the geb/issues#660 performance boost by using my own base spec that uses .withBrowserCreator { new Browser() } so that each time a new configuration instance is created and so the caching mechanism can behave as intended (i.e. not cache in my case), even though hat means that the configuration script again is executed for each test.

Another case where behaviour changes is, if you try to follow the Book of Geb paragraph

A new driver can be forced at anytime by calling either of CachingDriverFactory.clearCache() or CachingDriverFactory.clearCacheAndQuitDriver() both of which are static. After calling any of these methods, the next request for a default driver will result in a new driver instance being created.

While technically correct, the methods are currently effectively useless if you use the built-in integrations, because there is never a new "default driver" requested, as the Configuration instances are reused and they remember the Driver instance.


I have no idea why the CI has test failures.
But I really think this problem should be fixed in this way, even if someone relies on this faulty behavior.
You could list it as potentially breaking change, but I really think that 98.7% of the Geb usages out there should not see a change in behavior.
And if so, it should be relatively easy to mitigate by enabling driver caching, or using a custom Configuration implementation, or a custom browserCreator, whatever is more appropriate for the respective case.

Looking at ParallelExecutionWithReportingTest2 it should not be affected by this change actually, because it already uses what I use as workaround, doing .withBrowserCreator { new Browser() }.
Also, if I try to execute it, it just hangs forever and does not execute anything.
And that the test was successful on CI using Java 21 but fails with Java 17 also suggests to me that the problem is not my change but some other problem, causing flaky behavior. :-/

@Vampire
Copy link
Contributor Author

Vampire commented Apr 23, 2025

Ah, ok, it did not do anything because you cannot run one of these tests alone, it forcibly waits until both are started to ensure they run in parallel.
It should have failed after 30 seconds actually and does not which is something else to look at.
But if I run both of these tests as they required (ParallelExecutionWithReportingTest1 and ParallelExecutionWithReportingTest2) and additionally ParallelExecutionTest for which they wait, or all JUnit5 tests, then they run successfully here without any problem.

@Vampire
Copy link
Contributor Author

Vampire commented Apr 23, 2025

By running those tests in a loop I was able to make them fail, but not for the reason they fail on CI, but due to this wrong report path:
image

@Vampire
Copy link
Contributor Author

Vampire commented Apr 24, 2025

That problem is now fixed in #267.
With that PR integrated I could not make the parallel tests fail anymore locally.
Let's see what happens here after that PR is merged and this rebased.

@Vampire Vampire force-pushed the non-cached-driver-fix branch from 32e84f4 to e455a1a Compare April 26, 2025 00:38
@Vampire
Copy link
Contributor Author

Vampire commented Apr 26, 2025

The local chrome tests should also work fine now.
Those were also not "indicate we'd actually be breaking a part of the API contract", but just stupid me.
Before the test set the config.driver to the mocked one for each test.
As I removed the config.driver field I set config.driverConf instead to a closure returning that mocked driver.
But as caching is globally enabled by default and I did not disable it, the second test got the driver of the first test and thus it did not work properly.
I changed it now to set browser.driver instead, now the test again does what it is supposed to do.

@jonnybot0
Copy link
Contributor

If I rebase this branch from master after the last merge, the test originally mentioned works. Happy to merge this now, though I'd like it to pass a CI check, just to make sure nothing new has cropped up.

Trouble is, I think something might be wrong with the Build, test, and quality check workflow. The error reports from this run, which is ostensibly running Java 21, reports Java11 (see line 221). I can see a Java 17 build failing with the same error at https://github.com/apache/groovy-geb/actions/runs/14801195371/job/41560129100. Makes me think our actions/setup-java is misconfigured, but it looks correct at first blush.

If you can rebase and re-push, I'll rerun... but I think you're right that something else is causing this flakiness.

@Vampire Vampire force-pushed the non-cached-driver-fix branch from e455a1a to 98ad355 Compare May 4, 2025 01:46
@Vampire
Copy link
Contributor Author

Vampire commented May 4, 2025

Nah, setup-java is fine.
Gradle is running Java 21.
But the build uses a Java 11 toolchain here since 7 months, so the test matrix is probably a bit useless since then:

languageVersion = JavaLanguageVersion.of(11)

@jonnybot0 jonnybot0 merged commit 42d5ad9 into apache:master May 5, 2025
9 checks passed
@jonnybot0
Copy link
Contributor

Ah, yeah, I see that.

I believe the purpose of those multi-JVM builds is actually to ensure Geb's compatibility after the build when running on those Java versions, not necessarily to ensure you can compile with them.

I expect the best thing to do would be to use the release flag to set cross-compile compatibility to Java 11, as in the guide on combining toolchains. Then we could either let the environment decide the Java version or continue using the toolchain to specify a version.

Then we could override the toolchain at the task level to test different versions, as Gradle has documented.

But this is really another issue altogether. I'll see if I can get a PR up to that effect.

@Vampire Vampire deleted the non-cached-driver-fix branch May 5, 2025 13:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants