Convert Particle_tracking_with_Parcels notebook to Intake #534

Merged
marc-white merged 15 commits into main from INTAKE-ParticleTrackingParcels on Jul 17, 2025

Conversation

@marc-white

Contributor

Progresses #313 by converting the notebook Recipes/Mains-Advanced/Particle_tracking_with_Parcels.ipynb to use Intake rather than COSIMA Cookbook.

@review-notebook-app

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


@marc-white

Contributor Author

I'm not sure what's up with this notebook, but every time I try to start the Dask cluster, it gives me a cluster with only one worker, but the combined memory of all the workers:

Screenshot 2025-06-27 at 2 38 24 pm
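
For anyone wanting to confirm the same symptom without relying on the dashboard widget, here is a minimal sketch. It assumes the dictionary has the shape I understand `Client.scheduler_info()` to return (a `"workers"` mapping whose entries carry `nthreads` and `memory_limit`); the `summarise_workers` helper and the example values are hypothetical, not taken from the notebook.

```python
from typing import Any, Dict, List


def summarise_workers(scheduler_info: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one small record per worker from a scheduler-info style dict."""
    workers = scheduler_info.get("workers", {})
    summary = []
    for address, details in sorted(workers.items()):
        summary.append(
            {
                "address": address,
                "nthreads": details.get("nthreads"),
                # Convert bytes to GiB for easier reading.
                "memory_gib": round(details.get("memory_limit", 0) / 2**30, 1),
            }
        )
    return summary


# Hypothetical data mimicking the symptom: a single worker holding all the memory.
example_info = {
    "workers": {
        "tcp://127.0.0.1:40001": {"nthreads": 1, "memory_limit": 128 * 2**30},
    }
}
for row in summarise_workers(example_info):
    print(row)
```

With a live cluster, the same helper could be given the result of `client.scheduler_info()`; a healthy cluster would presumably list several workers, each with a share of the memory.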

I've even tried blowing away my entire cosima-recipes copy on Gadi, re-checking it out, and then opening up the existing main version of the notebook - same problem.

@rbeucher do you get this issue if you try this notebook?

@rbeucher

Contributor

No, it works for me.
What queue are you using?
Have you tried renaming or deleting the ondemand folder in your home directory? It may be a caching issue.
Do you have any Dask configuration files somewhere?

image
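
A minimal sketch of how one might look for the leftovers mentioned above. It assumes the session folder is literally `~/ondemand` and that user-level Dask settings live as YAML files under `~/.config/dask`; the `find_possible_dask_state` helper is hypothetical and only lists paths, it does not delete anything.

```python
from pathlib import Path
from typing import List


def find_possible_dask_state(home: Path) -> List[Path]:
    """List files and folders under `home` that might hold stale Dask or session settings."""
    candidates: List[Path] = []
    # Assumed location of user-level Dask configuration files.
    dask_config_dir = home / ".config" / "dask"
    if dask_config_dir.is_dir():
        candidates.extend(sorted(dask_config_dir.glob("*.yaml")))
        candidates.extend(sorted(dask_config_dir.glob("*.yml")))
    # Assumed name of the session folder referred to in this thread.
    ondemand_dir = home / "ondemand"
    if ondemand_dir.is_dir():
        candidates.append(ondemand_dir)
    return candidates


for path in find_possible_dask_state(Path.home()):
    print(path)
```

Renaming a listed folder rather than deleting it keeps a way back if the problem turns out to be elsewhere.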

@rbeucher

rbeucher commented Jun 27, 2025

Contributor

That also works

image

@marc-white

Contributor Author

@rbeucher Killing the ondemand folder in my home area fixed the problem.

@marc-white

marc-white commented Jun 30, 2025

Contributor Author

@rbeucher or @charles-turner-1 would you be able to take a look at where this branch is up to? I've managed to get the 2D example running, but getting the 3D data in is giving me issues:

  • I'm struggling to load the data efficiently, particularly because under the current-style file_id system it comes out as multiple datasets that I need to merge;
  • As far as I can tell, there seems to be a gap in the time series of the data around the 1975-2000 dates that I used in the 2D advection sample.
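
One way to pin down a suspected gap like the one in the second point is to scan the sorted timestamps for unusually large steps. The sketch below uses plain `datetime` values and made-up monthly data; the `find_time_gaps` helper and the 32-day threshold are hypothetical, and real model output may use a calendar type that needs converting first.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def find_time_gaps(
    times: List[datetime], max_step: timedelta
) -> List[Tuple[datetime, datetime]]:
    """Return (earlier, later) pairs of neighbouring timestamps further apart than max_step."""
    ordered = sorted(times)
    return [
        (earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
        if later - earlier > max_step
    ]


# Hypothetical monthly timestamps with the years 1980-1989 missing.
years = list(range(1975, 1980)) + list(range(1990, 1995))
times = [datetime(year, month, 1) for year in years for month in range(1, 13)]

gaps = find_time_gaps(times, max_step=timedelta(days=32))
print(gaps)
```

Running the same check on the time coordinate of each dataset before merging would show whether the gap comes from the source files or from the merge.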

@charles-turner-1

Collaborator

Sorry @marc-white, I've only just been able to get round to being able to take a look at this now - would it still be helpful for you if I take a look through, or have things moved on?

@marc-white

Contributor Author

> Sorry @marc-white, I've only just been able to get round to being able to take a look at this now - would it still be helpful for you if I take a look through, or have things moved on?

@charles-turner-1 No please do! I've been banging my head against it for a while.

TLDR: Getting the 3D data in is slow, painful, and occasionally fails for no better reason than the kernel seems to get bored of the job. Once it does load, I can't for the life of me figure out how to get the graph size down for the particle advection step (the last calculation before plotting).

@charles-turner-1

Collaborator

Cool - will dig into things ASAP

@charles-turner-1

Collaborator

I'm getting lots of weird Dask issues in this notebook, similar to the ones you were getting @marc-white - and interestingly, ones I've never seen before elsewhere.

I've optimised a couple of the dataset openings with regexes & it seems to have made no difference. I'm wondering if some of the kernel functions in Parcels have some sort of hardware acceleration features that don't play nicely with Dask or similar.

I'm gonna dig through the documentation & see if I can work out whether there's anything that might be causing these issues.

@charles-turner-1

Collaborator

Okay, so I tried my "let's just copy-paste the code into a new notebook" approach, and I very quickly discovered that just importing parcels is what causes Dask to default to one very large worker. This is making me very suspicious about how nicely Dask & Parcels play together...
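
A possible way to narrow this down is to record a few pieces of process state before and after the import. The sketch below is only a diagnostic under assumptions: nothing in this thread establishes that environment variables or CPU affinity are the cause, and the `import_side_effects` and `usable_cpu_count` helpers are hypothetical. It uses `json` as a harmless stand-in; in a fresh kernel one would pass `"parcels"` instead.

```python
import importlib
import os
from typing import Any, Dict


def usable_cpu_count() -> int:
    """CPUs this process may run on (affinity-aware where the platform supports it)."""
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


def import_side_effects(module_name: str) -> Dict[str, Any]:
    """Import a module and report changes to the environment and usable CPU count."""
    env_before = dict(os.environ)
    cpus_before = usable_cpu_count()
    importlib.import_module(module_name)
    env_after = dict(os.environ)
    # Map each changed variable to its (before, after) values.
    changed_env = {
        key: (env_before.get(key), env_after.get(key))
        for key in set(env_before) | set(env_after)
        if env_before.get(key) != env_after.get(key)
    }
    return {
        "cpus_before": cpus_before,
        "cpus_after": usable_cpu_count(),
        "changed_env": changed_env,
    }


report = import_side_effects("json")
print(report)
```

The check only means something if the module has not already been imported in the session, since a repeated import does nothing.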

@marc-white

marc-white commented Jul 16, 2025

Contributor Author

https://docs.oceanparcels.org/en/latest/examples/documentation_MPI.html

Does the above mean you need to do something manual to get Parcels to parallelize correctly?

@charles-turner-1

charles-turner-1 commented Jul 16, 2025

Collaborator

I'm not quite sure, I've been reading through that page you linked as well as any relevant git issues I can find & I still can't make heads or tails of what it's saying. Perhaps a skill issue on my part but I basically haven't got a clue right now.

@marc-white

Contributor Author

Perhaps Google AI search to the rescue (or a calamitous explosion, one of the two):

Running Simulations: When creating the ParticleSet, you can pass the Dask client to the pset.execute method, indicating that calculations should be distributed across the cluster.

from parcels import ParticleSet, JITParticle

# Assuming you have a ParticleSet 'pset' already defined
pset = ParticleSet(fieldset, pclass=JITParticle, lon=..., lat=...)

# Run the simulation with the Dask client
kernels = pset.Kernel(AdvectionRK4)
pset.execute(kernels,
                       runtime=time_课,
                       dt=dt,
                       recovery_manager=recovery_manager,
                       output_file=pfile,
                       verbose_progress=True,
                       client=client) # Pass the client here

@charles-turner-1

charles-turner-1 commented Jul 17, 2025

Collaborator

Unfortunately I couldn't find that argument in the source code 🫤 .

Also a word of caution with Google's AI search - not quite up to scratch compared to other LLMs unfortunately.

Screenshot 2025-07-17 at 10 01 26 am
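
A quick way to test a suggested keyword such as `client=` against an installed library, without reading the source, is `inspect.signature`. This is a minimal sketch: the `accepts_keyword` helper is hypothetical, and the `execute` function below is a stand-in with made-up parameters, not the real Parcels signature.

```python
import inspect
from typing import Callable


def accepts_keyword(func: Callable, name: str) -> bool:
    """True if `func` declares a parameter called `name` that can be passed by keyword."""
    param = inspect.signature(func).parameters.get(name)
    return param is not None and param.kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )


# Hypothetical stand-in for a library method; not the real Parcels signature.
def execute(kernels, runtime=None, dt=None, output_file=None):
    return None


print(accepts_keyword(execute, "runtime"))
print(accepts_keyword(execute, "client"))
```

A function that takes `**kwargs` would report `False` here even though it silently accepts the keyword, so a negative result is worth confirming in the source or documentation.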

@charles-turner-1 charles-turner-1 marked this pull request as ready for review July 17, 2025 06:39
@charles-turner-1

Collaborator

Again can't request review from you @marc-white but this should be ready to go if you can confirm you're happy with my updates!

@review-notebook-app

review-notebook-app Bot commented Jul 17, 2025

View / edit / reply to this conversation on ReviewNB

marc-white commented on 2025-07-17T06:45:37Z
----------------------------------------------------------------

Typo in last reference to xx-large


@review-notebook-app

review-notebook-app Bot commented Jul 17, 2025

View / edit / reply to this conversation on ReviewNB

marc-white commented on 2025-07-17T06:45:38Z
----------------------------------------------------------------

Has this markdown been executed to make it a proper header?


charles-turner-1 commented on 2025-07-17T06:56:11Z
----------------------------------------------------------------

Weird, looks fine when I open the notebook

marc-white commented on 2025-07-17T07:00:15Z
----------------------------------------------------------------

Must just be a ReviewNB quirk then...

@review-notebook-app

review-notebook-app Bot commented Jul 17, 2025

View / edit / reply to this conversation on ReviewNB

marc-white commented on 2025-07-17T06:45:39Z
----------------------------------------------------------------

That is mindbogglingly faster than my load - win for disposing of the bulk parcels import!


@review-notebook-app

review-notebook-app Bot commented Jul 17, 2025

View / edit / reply to this conversation on ReviewNB

marc-white commented on 2025-07-17T06:45:40Z
----------------------------------------------------------------

We can probably remove the comments.

We should also add somewhere that the simplified data set is needed to push everything onto the same coordinate grid, as per the comments in this cell.


@review-notebook-app

review-notebook-app Bot commented Jul 17, 2025

View / edit / reply to this conversation on ReviewNB

marc-white commented on 2025-07-17T06:45:40Z
----------------------------------------------------------------

Get rid of the commented print statements


@charles-turner-1

Collaborator

@marc-white another quick scan please, and if you're happy let's merge!

@charles-turner-1

Collaborator

Closes #313

@charles-turner-1 charles-turner-1 linked an issue Jul 17, 2025 that may be closed by this pull request
44 tasks
@marc-white marc-white merged commit 3ef13c1 into main Jul 17, 2025
3 checks passed
@marc-white marc-white deleted the INTAKE-ParticleTrackingParcels branch July 17, 2025 07:05

Development

Successfully merging this pull request may close these issues.

Converting notebooks from COSIMA Cookbook to ACCESS-NRI intake catalog

3 participants