add lazy compute operations #25

Open: wants to merge 21 commits into base: dev
Changes from 12 commits
2 changes: 2 additions & 0 deletions Project.toml
@@ -8,6 +8,7 @@ DelimitedFiles = "8bb1440f-4735-579b-a4ab-409b98df4dab"
Graphs = "86223c79-3864-5bf0-83f7-82e725a168b6"
LaTeXStrings = "b964fa9f-0449-5b57-a5c2-d3ea65f4040f"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
LinearSolve = "7ed4a6bd-45f5-4d41-b270-4a48e9bafcae"
Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
ProgressLogging = "33c8b6b6-d38a-422a-b730-caa89a2f386c"
Rasters = "a3a2b9e3-a471-40c9-b274-f788e487c689"
@@ -20,6 +21,7 @@ ArnoldiMethod = "0.0.4, 0.4"
DelimitedFiles = "1"
Graphs = "1"
LaTeXStrings = "1.1"
LinearSolve = "2.38.0"
Plots = "1.4"
ProgressLogging = "0.1"
Rasters = "0.12"
49 changes: 48 additions & 1 deletion examples/2_landmarks.jmd
@@ -1,6 +1,7 @@
# Landmark function for testing purposes

```julia
nothing
using ConScape, Rasters, ArchGDAL, SparseArrays, LinearAlgebra, Plots
```

@@ -16,6 +17,7 @@ grid = 10

```julia
datadir = joinpath(dirname(pathof(ConScape)), "..", "data")

```

```julia
@@ -83,7 +85,7 @@ heatmap(qbetw)

```julia
qbetw_coarse = ConScape.betweenness_qweighted(h_coarse);
ConScape.heatmap(qbetw_coarse)
heatmap(qbetw_coarse)
```

```julia
@@ -99,6 +101,51 @@ kbetw_coarse = @time ConScape.betweenness_kweighted(h_coarse, distance_transform
cor(filter(!isnan, kbetw), filter(!isnan, kbetw_coarse))
```

We can write this using a lazy problem definition:

```julia
ops = (;
func=ConScape.ConnectedHabitat(),
qbetw=ConScape.BetweennessQweighted(),
kbetw=ConScape.BetweennessKweighted(),
)
# Put least cost, random walk, and rsp
problem = ConScape.Problem(ops;
connectivity_measure=ConScape.ExpectedCost(θ=1.0),
distance_transformation=(exp=x -> exp(-x/75), oddsfor=ConScape.OddsFor()),
solver=ConScape.MatrixSolve(),
)
problem = ConScape.Problem(ops;
distance_transformation=(exp=x -> exp(-x/75), oddsfor=ConScape.OddsFor()),
connectivity_measure=ConScape.ExpectedCost(θ=1.0),
solver=ConScape.VectorSolve(nothing),
)
```

Then run it for all operations on both the normal and coarse grids:

```julia
res = ConScape.solve(problem, g)
res_coarse = ConScape.solve(problem, g_coarse)
```

We can plot these outputs:
```julia
heatmap(res.qbetw.oddsfor)
heatmap(res.qbetw.exp)
heatmap(res_coarse.qbetw.exp)
heatmap(res.kbetw.exp)
heatmap(res_coarse.kbetw.exp)
heatmap(res.func.exp)
heatmap(res_coarse.func.exp)
```

And we can check the correlation similarly to above, by getting
layers from `res` and `res_coarse`:
```julia
cor(filter(!isnan, res.kbetw.exp), filter(!isnan, res_coarse.kbetw.exp))
```

# Test landmark performance for amount of connected habitat

```julia
10 changes: 9 additions & 1 deletion src/ConScape.jl
@@ -3,8 +3,10 @@ module ConScape
using SparseArrays, LinearAlgebra
using Graphs, Plots, SimpleWeightedGraphs, ProgressLogging, ArnoldiMethod
using Rasters
using LinearSolve
using Rasters.DimensionalData

# Old function-based interface
abstract type ConnectivityFunction <: Function end
abstract type DistanceFunction <: ConnectivityFunction end
abstract type ProximityFunction <: ConnectivityFunction end
@@ -16,6 +18,9 @@ module ConScape
struct survival_probability <: ProximityFunction end
struct power_mean_proximity <: ProximityFunction end

# Need to define before loading files
abstract type AbstractProblem end

# Randomized shortest path algorithms
include("randomizedshortestpath.jl")
# Grid struct and methods
@@ -26,5 +31,8 @@
include("io.jl")
# Utilities
include("utils.jl")

include("graph_measure.jl")
include("connectivity_measure.jl")
include("problem.jl")
include("tiles.jl")
end
38 changes: 38 additions & 0 deletions src/connectivity_measure.jl
Collaborator:
Why not have something like

(d::LeastCostDistance)(args...; kwargs...) = least_cost_distance(args...; kwargs...) 

@rafaqz (Collaborator, Author) commented on Jan 4, 2025:
I've allowed passing solver modes to \ in RSP, but playing around I found some situations where LinearSolve solvers were very fast on Z but stalled in RSP, so maybe we want different solvers per problem?

We also want to factor out some of the solves as the same thing is being done in multiple places if you want multiple kinds of betweenness metrics in the same run.
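A rough sketch of what factoring out the shared solve could look like: factorise the sparse system once, then reuse the factorisation for every measure that needs it. The cache type and function names here are hypothetical, not part of this PR:

```julia
using LinearAlgebra, SparseArrays

# Hypothetical cache: factorise the sparse system once, so every measure
# that needs columns of Z can reuse the same factorisation.
struct FundamentalCache{F}
    fact::F  # e.g. an LU factorisation of I - W
end

FundamentalCache(A::SparseMatrixCSC) = FundamentalCache(lu(A))

# Each betweenness metric would call this instead of re-solving from scratch.
solve_columns(c::FundamentalCache, B::AbstractMatrix) = c.fact \ B

# Toy usage: a well-conditioned sparse system solved for several columns.
A = sparse(1.0I, 4, 4) + sprand(4, 4, 0.3) / 10
cache = FundamentalCache(A)
Z = solve_columns(cache, Matrix{Float64}(I, 4, 4))
```

The point of the sketch is only the shape of the API: the factorisation lives in one place and the per-measure code receives it as an argument.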

Collaborator:
How about having a connectivity_measures folder containing files with specialised methods belonging to a certain framework, such as the RSP framework, but possibly also other frameworks?

@rafaqz (Collaborator, Author):

I think we should reorganise all the code like that, but in this PR I'm trying to add new things without completely reorganising the old ones, so that changes to the old code diff properly.

@rafaqz (Collaborator, Author):
So for now it's just a file with struct definitions, and the methods are left in their current place.

@@ -0,0 +1,38 @@
# New type-based interface
# Easier to add parameters to these
abstract type ConnectivityMeasure end
@vboussange (Collaborator) commented on Dec 18, 2024:
Can you describe the difference between these? I feel like the type system should be as close as possible to the mathematical classification of these measures.

I suggest that we could follow the classification of e.g. Graphs.jl or that of networkx: https://networkx.org/documentation/stable/reference/algorithms/index.html

i.e., DistanceMeasure, ConnectivityMeasure, CentralityMeasure, etc...
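For illustration, the suggested networkx-style split might look roughly like this; the type names are placeholders for discussion, not the PR's actual hierarchy:

```julia
# Sketch of a classification closer to Graphs.jl / networkx:
abstract type GraphMeasure end
abstract type DistanceMeasure <: GraphMeasure end
abstract type CentralityMeasure <: GraphMeasure end

struct LeastCostDistance <: DistanceMeasure end
# connected_habitat as a weighted closeness centrality, and the
# betweennesses as betweenness centralities:
struct ConnectedHabitat <: CentralityMeasure end
struct BetweennessQweighted <: CentralityMeasure end
```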

Member:
Sounds like a good idea; the GraphMeasures that we discussed would then be CentralityMeasures (connected_habitat is a weighted closeness centrality and the betweennesses are betweenness centralities).

@rafaqz (Collaborator, Author):
Let's do a round or two of documenting and reorganising/renaming the type structure when the differences get clearer; the objects here can be seen as placeholders for better names.


abstract type FundamentalMeasure <: ConnectivityMeasure end
abstract type RSPDistanceMeasure <: FundamentalMeasure end

struct LeastCostDistance <: ConnectivityMeasure end
@kwdef struct ExpectedCost{T<:Union{Real,Nothing}} <: RSPDistanceMeasure
θ::T=nothing
approx::Bool=false
end
@kwdef struct FreeEnergyDistance{T<:Union{Real,Nothing}} <: RSPDistanceMeasure
θ::T=nothing
approx::Bool=false
end
@kwdef struct SurvivalProbability{T<:Union{Real,Nothing}} <: FundamentalMeasure
θ::T=nothing
approx::Bool=false
end
@kwdef struct PowerMeanProximity{T<:Union{Real,Nothing}} <: FundamentalMeasure
θ::T=nothing
approx::Bool=false
end

keywords(cm::ConnectivityMeasure) = _keywords(cm)

# TODO remove the complexity of the connectivity_function
# These methods are mostly to avoid changing the original interface for now
connectivity_function(::LeastCostDistance) = least_cost_distance
connectivity_function(::ExpectedCost) = expected_cost
connectivity_function(::FreeEnergyDistance) = free_energy_distance
connectivity_function(::SurvivalProbability) = survival_probability
connectivity_function(::PowerMeanProximity) = power_mean_proximity

# This is not used yet but could be
compute(cm::ConnectivityMeasure, g) =
    connectivity_function(cm)(g; keywords(cm)...)
70 changes: 70 additions & 0 deletions src/graph_measure.jl
@@ -0,0 +1,70 @@
abstract type GraphMeasure end

keywords(o::GraphMeasure) = _keywords(o)

abstract type TopologicalMeasure <: GraphMeasure end
abstract type BetweennessMeasure <: GraphMeasure end
abstract type PerturbationMeasure <: GraphMeasure end
abstract type PathDistributionMeasure <: GraphMeasure end

struct BetweennessQweighted <: BetweennessMeasure end
@kwdef struct BetweennessKweighted{DV} <: BetweennessMeasure
diagvalue::DV=nothing
end
struct EdgeBetweennessQweighted <: BetweennessMeasure end
@kwdef struct EdgeBetweennessKweighted{DV} <: BetweennessMeasure
diagvalue::DV=nothing
end

@kwdef struct ConnectedHabitat{DV} <: GraphMeasure
Collaborator:
I think that this should be a topological measure

diagvalue::DV=nothing
end

@kwdef struct Criticality{DV,AV,QT,QS} <: PerturbationMeasure
diagvalue::DV=nothing
avalue::AV=floatmin()
qˢvalue::QS=0.0
qᵗvalue::QT=0.0
end

# These maybe don't quite belong here
@kwdef struct EigMax{F,DV,T} <: TopologicalMeasure
diagvalue::DV=nothing
tol::T=1e-14
end
struct MeanLeastCostKullbackLeiblerDivergence <: PathDistributionMeasure end
struct MeanKullbackLeiblerDivergence <: PathDistributionMeasure end

# Kind of a hack for now but makes the input requirements clear
keywords(o::GraphMeasure, dt, p::AbstractProblem) =
(; _keywords(o)..., distance_transformation=dt)
keywords(o::Union{BetweennessQweighted,EdgeBetweennessQweighted,PathDistributionMeasure}, dt, p::AbstractProblem) =
(; _keywords(o)...)
keywords(o::Union{BetweennessKweighted,EigMax}, dt, p::AbstractProblem) =
(; _keywords(o)...,
connectivity_function=connectivity_function(p),
distance_transformation=dt)
keywords(o::ConnectedHabitat, dt, p::AbstractProblem) =
(; _keywords(o)...,
distance_transformation=dt,
connectivity_function=connectivity_function(p),
approx=connectivity_measure(p).approx)

graph_function(m::BetweennessKweighted) = betweenness_kweighted
graph_function(m::EdgeBetweennessKweighted) = edge_betweenness_kweighted
graph_function(m::BetweennessQweighted) = betweenness_qweighted
graph_function(m::EdgeBetweennessQweighted) = edge_betweenness_qweighted
graph_function(m::ConnectedHabitat) = connected_habitat
graph_function(m::Criticality) = criticality
graph_function(m::EigMax) = eigmax
graph_function(m::MeanLeastCostKullbackLeiblerDivergence) = mean_lc_kl_divergence
graph_function(m::MeanKullbackLeiblerDivergence) = mean_kl_divergence

compute(m::GraphMeasure, p::AbstractProblem, g::Union{Grid,GridRSP}) =
compute(m, p.distance_transformation, p, g)
compute(m::GraphMeasure, dts::Union{Tuple,NamedTuple}, p::AbstractProblem, g::Union{Grid,GridRSP}) =
map(dts) do dt
compute(m, dt, p, g)
end
compute(m::GraphMeasure, distance_transformation, p::AbstractProblem, g::Union{Grid,GridRSP}) =
graph_function(m)(g; keywords(m, distance_transformation, p)...)
44 changes: 33 additions & 11 deletions src/grid.jl
Collaborator:
I would argue that memory use and flops are also related to the prob::Problem. It would be easy to assess the expected number of flops in VectorSolve(), where we can do the calculation for one node and multiply by the number of targets, but it may be more difficult for MatrixSolve() without performing the operation.

@rafaqz (Collaborator, Author):
Yes, makes sense; memory can just be checked by running one column.

Flops may need us to get some values I'm not sure we have access to (but e.g. UMFPACK does track flops internally).
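A hedged sketch of the one-column memory check mentioned above: solve for a single target and scale the measured allocations by the number of targets. The helper name is made up; ConScape has no such function:

```julia
using LinearAlgebra, SparseArrays

# Solve for a single target column, measure the allocations, and scale by
# the number of targets to estimate total memory for a VectorSolve-style run.
function estimate_solve_memory(A::SparseMatrixCSC, ntargets::Integer)
    b = zeros(size(A, 1))
    b[1] = 1.0
    A \ b  # warm-up so compilation is not counted
    bytes_one_column = @allocated A \ b
    return bytes_one_column * ntargets
end

A = sparse(1.0I, 100, 100)
bytes_estimate = estimate_solve_memory(A, 50)
```

This only estimates memory; flop counting would still need solver internals such as UMFPACK's statistics.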

Collaborator:
Since you are refactoring this, it would make sense to already think of #29. Maybe add a field connected_components that is a vector of vectors, where each vector corresponds to the list of nodes of the associated connected component.
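As an illustration of the suggested field's shape (a vector of node-index vectors), here is a standalone union-find sketch over a symmetric sparse affinity matrix; in ConScape one would presumably reuse Graphs.jl's connected_components instead:

```julia
using SparseArrays

# Group node indices into connected components of a symmetric sparse
# affinity matrix, returning a vector of vectors as suggested above.
function components_from_affinities(A::SparseMatrixCSC)
    n = size(A, 1)
    parent = collect(1:n)
    # Union-find with path compression.
    find(i) = parent[i] == i ? i : (parent[i] = find(parent[i]))
    rows, cols, _ = findnz(A)
    for (i, j) in zip(rows, cols)
        parent[find(i)] = find(j)  # union the endpoints of each edge
    end
    groups = Dict{Int,Vector{Int}}()
    for i in 1:n
        push!(get!(groups, find(i), Int[]), i)
    end
    return collect(values(groups))
end

# Two components: nodes {1,2} and {3,4}.
A = sparse([1, 2, 3, 4], [2, 1, 4, 3], 1.0, 4, 4)
comps = components_from_affinities(A)
```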

Expand Up @@ -26,6 +26,10 @@ struct Grid
id_to_grid_coordinate_list::Vector{CartesianIndex{2}}
source_qualities::AbstractMatrix{Float64}
target_qualities::AbstractMatrix{Float64}
targetidx::Vector{CartesianIndex{2}}
targetnodes::Vector{Int}
qs::Vector{Float64}
qt::Vector{Float64}
dims::Union{Tuple,Nothing}
end

@@ -75,7 +79,7 @@ function Grid(nrows::Integer,
# _affinities = affinities
# vec(CartesianIndices((nrows, ncols)))
# end
id_to_grid_coordinate_list = vec(CartesianIndices((nrows, ncols)))
id_to_grid_coordinate_list = vec(collect(CartesianIndices((nrows, ncols))))

costfunction, costmatrix = if costs isa Transformation
costs, mapnz(costs, affinities)
@@ -95,6 +99,11 @@
throw(ArgumentError("cost graph contains edges not present in the affinity graph"))
end

targetidx, targetnodes = _targetidx_and_nodes(target_qualities, id_to_grid_coordinate_list)
qs = [_source_qualities[i] for i in id_to_grid_coordinate_list]
qt = [_target_qualities[i] for i in id_to_grid_coordinate_list ∩ targetidx]

@show typeof(qs) typeof(qt) typeof(targetidx) typeof(targetnodes)
g = Grid(
nrows,
ncols,
@@ -104,6 +113,10 @@
id_to_grid_coordinate_list,
_source_qualities,
_target_qualities,
targetidx,
targetnodes,
qs,
qt,
dims(source_qualities),
)

@@ -142,15 +155,17 @@ _unwrap(R::Raster) = parent(R)
_unwrap(R::AbstractMatrix) = R
# Compute a vector of the cartesian indices of nonzero target qualities and
# the corresponding node id corresponding to the indices
_targetidx(q::Matrix, grididxs::Vector) = grididxs
_targetidx(q::AbstractMatrix, grididxs::Vector) = grididxs
_targetidx(q::SparseMatrixCSC, grididxs::Vector) =
CartesianIndex.(findnz(q)[1:2]...) ∩ grididxs

function _targetidx_and_nodes(g::Grid)
targetidx = _targetidx(g.target_qualities, g.id_to_grid_coordinate_list)
_targetidx_and_nodes(g::Grid) =
_targetidx_and_nodes(g.target_qualities, g.id_to_grid_coordinate_list)
function _targetidx_and_nodes(target_qualities, id_to_grid_coordinate_list)
targetidx = _targetidx(target_qualities, id_to_grid_coordinate_list)
targetnodes = findall(
t -> t ∈ targetidx,
g.id_to_grid_coordinate_list)
id_to_grid_coordinate_list)
return targetidx, targetnodes
end

@@ -246,15 +261,23 @@ function largest_subgraph(g::Grid)
affinities = g.affinities[scci, scci]
# affinities = convert(SparseMatrixCSC{Float64,Int}, graph[scci])

id_to_grid_coordinate_list = g.id_to_grid_coordinate_list[scci]
targetidx, targetnodes = _targetidx_and_nodes(g.target_qualities, id_to_grid_coordinate_list)
qs = [g.source_qualities[i] for i in id_to_grid_coordinate_list]
qt = [g.target_qualities[i] for i in id_to_grid_coordinate_list ∩ targetidx]
return Grid(
g.nrows,
g.ncols,
affinities,
g.costfunction,
g.costfunction === nothing ? g.costmatrix[scci, scci] : mapnz(g.costfunction, affinities),
g.id_to_grid_coordinate_list[scci],
id_to_grid_coordinate_list,
g.source_qualities,
g.target_qualities,
targetidx,
targetnodes,
qs,
qt,
g.dims,
)
end
@@ -294,13 +317,12 @@ function least_cost_distance(g::Grid; θ::Nothing=nothing, approx::Bool=false)
if approx
throw(ArgumentError("no approximate algorithm is available for this distance function"))
end
targets = ConScape._targetidx_and_nodes(g)[1]
@progress vec_of_vecs = [_least_cost_distance(g, target) for target in targets]
targets = g.targetidx
@progress vec_of_vecs = [least_cost_distance(g, target) for target in targets]

return reduce(hcat, vec_of_vecs)
end

function _least_cost_distance(g::Grid, target::CartesianIndex{2})
function least_cost_distance(g::Grid, target::CartesianIndex{2})
graph = SimpleWeightedDiGraph(g.costmatrix)
targetnode = findfirst(isequal(target), g.id_to_grid_coordinate_list)
distvec = dijkstra_shortest_paths(graph, targetnode).dists
@@ -457,4 +479,4 @@ power_mean_proximity(
g::Grid;
θ::Union{Real,Nothing}=nothing,
approx::Bool=false
) = survival_probability(g; θ=θ, approx=approx) .^ (1/θ)
) = survival_probability(g; θ=θ, approx=approx) .^ (1/θ)