Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add LeastCost and RandomWalk MovementModes #34

Open
wants to merge 25 commits into
base: alg_efficiency
Choose a base branch
from

Conversation

rafaqz
Copy link
Collaborator

@rafaqz rafaqz commented Mar 11, 2025

This PR adds new algorithms for least cost and random walk movements - which should be equal to RSP for ~0 and ~Inf values of theta (if those could actually run numerically).

I'm also using the opportunity (new algs without tests) to rethink from scratch the type structure of all conscape algs, and how we organise single-target compute for fast/low-memory vector solves.

First it adds a MovementMode abstract type that has LeastCost, RandomShortestPath and RandomWalk as concrete implementations.

It then splits betweenness measures into two type heirarchies - one for edge/node betweenness as types of GraphMeasure, and a BetweennessMeasures heirarchy for Q/K/QK/Unweighted.

These objects control dispatch in compute and compute_target methods.

KullbackLeiblerDivergence is moved to connectivity_meaures as its more like them... but we add a lower abstract type SourceTargetMeasures to capture them all under the same umbrella.

@vboussange would be great to hear what you think of the structure (a lot of the alg code may be a bit broken in practice, I havent run any of this)

Edit: also replaced GridRSP with the more generic GridPrecalculations abstract type and LeastCostPrecalculations, RandomWalkPrecalculations and RandomShortestPathPrecalculations, to generalize accross all of the movement types.

@rafaqz
Copy link
Collaborator Author

rafaqz commented Mar 12, 2025

To see whats possible I've just move all of the algorithms to compute methods with single-target execution.

Probably none of it runs but it all seems to be able to fit in this new context.

Probably will lead to just deleting grid.jl, gridrsp.jl and rsp.jl files and saving 1000k lines of code.

Copy link
Collaborator

@vboussange vboussange left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Raf, great work here. Overall, I really like the idea of defining a MovementMode and distances (or more generally measures) based on each mode. I also like your approach with the precalculations. Finally, it seems that you tried hard getting a clean integration of a flexible linear solve - great!

My major comment concerns the GraphMeasure and ConnectivityMeasure hierarchies. They seem a bit cumbersome, and hard to relate to (from a mathematical perspective). A suggestion would be to define the hierarchy based on the output of these algorithms. I see 3 different types of outputs: a scalar (e.g., equivalent connected habitat), which is a global measure of the graph, a vector (e.g., centrality of the node), and a matrix (e.g., distance from nodes to nodes). Maybe that these strict distinction would facilitate the construction of a robust type system?

# TODO: the loop here doesn't decompose to single targets, so the full Z matrix seems to be needed.
# this is a problem for memory use in e.g. BatchProblem, D may be large fraction of
# the available memory per node on the cluster (~3gb per core)
function compute(::RandomWalk, g::Grid)
Copy link
Collaborator

@vboussange vboussange Mar 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I believe that you calculate the CommuteCost based on a RandomWalk type of movement (which should subtypeMovementMode).

k = ones(size(g.costmatrix, 1)) # TODO use a workspace
# And maybe transform them
# TODO untangle this logic so apply_weight handles everythign
if !isnothing(distance_transformation(cm))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is cm?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Connectivity_measure, but these names are kind of in flux

# ConnectedHabitat
compute(::ConnectedHabitat, tp::TargetPrecalculations) = tp.landscape_matrix

function compute(gm::Sensitivity, tp::RandomisedShortestPathTargetPrecalculations)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sensitivity is w.r.t quality or permeability, so this should appear somewhere.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's already wrt cost and/or affinity or quality

_scale!(A, ::Union{Affinity,AffinityAndCost}, g) = A .*= g.affinities
_scale!(A, ::Union{Cost,CostAndAffinity}, g) = A .*= g.costmatrix

function sensitivity(::ExpectedCost, tp::RandomisedShortestPathTargetPrecalculations)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, sensitivity w.r.t what?

Copy link
Collaborator Author

@rafaqz rafaqz Mar 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are subsections, wrt is handled in the main compute that calls these. They could I guess be renamed to make that clear

Copy link
Collaborator

@vboussange vboussange Mar 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the general idea of the type hierarchy, in particular the definition of a MovementMode, and defining graph measures based on the different modes through traits. However, I do not get the ArrivingMovement terminology and usage. I also think that RandomisedShortestPath reads more like a DistanceMeasure than a MovementMode. Maybe RandomisedMovement? Finally, I feel like the definition of FundamentalMeasure, DistanceMeasure and the likes are slightly confusing and cumbersome.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RSP is pretty embedded now... There are a lot of compromises between cleanest language and common useage

Copy link
Collaborator Author

@rafaqz rafaqz Mar 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arriving is for movements that are assumed to arrive without mortality, but there may be a better word.

It's opposed to absorbing. Yeah, sounds pretty vague

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here also, I feel like we could have a simplified type hierarchy, or have some clear definitions of these different measures. The difference between TopologicalMeasure , SpatialMeasure and LandscapeMeasure are a bit obscure to me.

Copy link
Collaborator Author

@rafaqz rafaqz Mar 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, they're kind of placeholders, the structure isn't clear to me yet either

(If you have ideas...)

@rafaqz
Copy link
Collaborator Author

rafaqz commented Mar 14, 2025

Thanks for the review! Yes the hierarchy could be better.

But I'm afraid it needs multiple inheritance as eg. Betweenesses are mostly similar but have different outputs.

Currently multiple inheritance is implemented with traits like returntrait. But these can be better organised and renamed.

@rafaqz
Copy link
Collaborator Author

rafaqz commented Mar 14, 2025

And yes it's all per-target now so LinearSolvers will just work. Another upside is the memory footprint is much smaller with the fundamental matrix being a single column. So running windowed we can thread at the per window level, staying well inside the ~3gb/ core ratio of our computer clusters.

One downside is that EigMax apparently needs the whole graph, and has to run at the outer level rather than in the per-target loop. So it still may not run at scale due to memory limitations.

@vboussange
Copy link
Collaborator

That's awesome. Do you plan to implement GPU support?

@rafaqz
Copy link
Collaborator Author

rafaqz commented Mar 15, 2025

I think we can get cuda sparse solves via LinearSolve.jl? But probably lots of other things will break.

After seeing the high ratio of CPUs to GPUs in large clusters it hasn't been something I've focussed on.

Also with the per-target approach our arrays are not that big, so possibly it won't be worth the effort of launching GPU kernels or copying data back and forth. But I would like to try at some stage!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants