Skip to content

Latest commit

 

History

History
452 lines (377 loc) · 22.4 KB

File metadata and controls

452 lines (377 loc) · 22.4 KB

Algorithms in nx-neptune

nx-neptune defined algorithms map NetworkX calls to Amazon Neptune Analytics functions. For a complete listing of functions provided by Amazon Neptune Analytics, please visit here.

Traversal Algorithms

bfs_edges

Iterate over edges in a breadth-first-search starting at source. Runs the breadth-first search (BFS) algorithm for finding nodes.
Link: https://docs.aws.amazon.com/neptune-analytics/latest/userguide/bfs-standard.html

Source: nx_neptune/algorithms/traversal/bfs.py

@configure_if_nx_active()
def bfs_edges(
    neptune_graph: NeptuneGraph,
    source,
    reverse=False,
    depth_limit=None,
    sort_neighbors=None,
    vertex_label: Optional[str] = None,
    edge_labels: Optional[List] = None,
    concurrency: Optional[int] = None,
):

Parameters:

  • neptune_graph : NeptuneGraph
  • source : node - Specify starting node for breadth-first search; this function iterates over only those edges in the component reachable from this node.
  • reverse : bool, optional - If True traverse a directed graph in the reverse direction
  • depth_limit : int, optional(default=len(G)) - Specify the maximum search depth
  • sort_neighbors : function (default=None) - A function that takes an iterator over nodes as the input, and returns an iterable of the same nodes with a custom ordering. For example, sorted will sort the nodes in increasing order. (not supported in Neptune Analytics)
  • vertex_label : str, optional - A vertex label for vertex filtering.
  • edge_labels : list, optional - To filter on one more edge label, provide a list of the ones to filter on. If no edgeLabels field is provided then all edge labels are processed during traversal.
  • concurrency : int, optional - Controls the number of concurrent threads used to run the algorithm. If set to 0, uses all available threads to complete execution of the individual algorithm invocation. If set to 1, uses a single thread.

Yields:

  • edge - Edges in the breadth-first search starting from source.

descendants_at_distance

The BFS-Levels algorithm is executed on the Neptune Analytics graph, to achieve feature parity with the results generated by NetworkX.
Link: https://docs.aws.amazon.com/neptune-analytics/latest/userguide/bfs-levels.html

Source: nx_neptune/algorithms/traversal/bfs.py

@configure_if_nx_active()
def descendants_at_distance(
    neptune_graph: NeptuneGraph,
    source,
    distance: int,
    edge_labels: Optional[List] = None,
    vertex_label: Optional[str] = None,
    traversal_direction: Optional[str] = None,
    concurrency: Optional[int] = None,
):

Parameters:

  • neptune_graph : NeptuneGraph - A NeptuneGraph instance
  • source - Specify starting node for breadth-first levels search
  • distance : int - The distance of the wanted nodes from source
  • edge_labels : list, optional - To filter on one more edge labels, provide a list of the ones to filter on. If no edgeLabels field is provided then all edge labels are processed during traversal.
  • vertex_label : str, optional - A vertex label for vertex filtering.
  • traversal_direction : str, optional - The direction of edge to follow. Must be one of: "outbound" or "inbound".
  • concurrency : int, optional - Controls the number of concurrent threads used to run the algorithm.

Returns:

  • set() - The descendants of source in G at the given distance from source

bfs_layers

The BFS-Levels algorithm is executed on the Neptune Analytics graph, to achieve feature parity with the results generated by NetworkX.
Link: https://docs.aws.amazon.com/neptune-analytics/latest/userguide/bfs-levels.html

Source: nx_neptune/algorithms/traversal/bfs.py

@configure_if_nx_active()
def bfs_layers(
    neptune_graph: NeptuneGraph,
    sources,
    edge_labels: Optional[List] = None,
    vertex_label: Optional[str] = None,
    traversal_direction: Optional[str] = None,
    concurrency: Optional[int] = None,
):

Parameters:

  • neptune_graph : NeptuneGraph - A NeptuneGraph instance
  • sources - Specify starting node for breadth-first levels search
  • edge_labels : list, optional - To filter on one more edge labels, provide a list of the ones to filter on. If no edgeLabels field is provided then all edge labels are processed during traversal.
  • vertex_label : str, optional - A vertex label for vertex filtering.
  • traversal_direction : str, optional - The direction of edge to follow. Must be one of: "outbound" or "inbound".
  • concurrency : int, optional - Controls the number of concurrent threads used to run the algorithm.

Yields:

  • List of nodes at the same distance from sources.

Link Analysis Algorithms

pagerank

Executes PageRank algorithm on the graph.
Link: https://docs.aws.amazon.com/neptune-analytics/latest/userguide/page-rank.html

Source: nx_neptune/algorithms/link_analysis/pagerank.py

@configure_if_nx_active()
def pagerank(
    neptune_graph: NeptuneGraph,
    alpha: float,
    personalization: Optional[Dict],
    max_iter: int,
    tol: float,
    nstart: Optional[Dict],
    weight: Optional[str] = None,
    dangling: Optional[Dict] = None,
    vertex_label: Optional[str] = None,
    edge_labels: Optional[List] = None,
    concurrency: Optional[int] = None,
    traversal_direction: Optional[str] = None,
    edge_weight_property: Optional[str] = None,
    edge_weight_type: Optional[str] = None,
    source_nodes: Optional[List] = None,
    source_weights: Optional[List] = None,
    write_property: Optional[str] = None,
):

Parameters:

  • neptune_graph : NeptuneGraph - A NeptuneGraph instance
  • alpha : float - Damping parameter for PageRank
  • personalization : dict, optional - Dict with nodes as keys and personalization values (not supported in Neptune Analytics)
  • max_iter : int - Maximum number of iterations
  • tol : float - Error tolerance to check convergence
  • nstart : dict, optional - Dict with nodes as keys and initial PageRank values (not supported in Neptune Analytics)
  • weight : str, optional - Edge attribute to use as weight (not supported in Neptune Analytics)
  • dangling : dict, optional - Dict with nodes as keys and dangling values (not supported in Neptune Analytics)
  • vertex_label : str, optional - A vertex label for vertex filtering.
  • edge_labels : list, optional - To filter on one more edge labels, provide a list of the ones to filter on. If no edgeLabels field is provided then all edge labels are processed during traversal.
  • concurrency : int, optional - Controls the number of concurrent threads used to run the algorithm.
  • traversal_direction : str, optional - The direction of edge to follow. Must be one of: "outbound" or "inbound".
  • edge_weight_property : str, optional - The weight property to consider for weighted pageRank computation.
  • edge_weight_type : str, optional - required if edgeWeightProperty is present, valid values: "int", "long", "float", "double".
  • source_nodes : list, optional - If a vertexLabel is provided, nodes that do not have the given vertexLabel are ignored.
  • source_weights : list, optional - A personalization weight list. The weight distribution among the personalized vertices.
  • write_property : str, optional - Specifies the name of the node property that will store the computed pageRank values.

Returns:

  • Computation result of pagerank algorithm, or an empty dictionary when write_property is specified.

Centrality Algorithms

degree_centrality

Compute the degree centrality for nodes.
Link: https://docs.aws.amazon.com/neptune-analytics/latest/userguide/degree.html

Source: nx_neptune/algorithms/centrality/degree_centrality.py

@configure_if_nx_active()
def degree_centrality(
    neptune_graph: NeptuneGraph,
    vertex_label: Optional[str] = None,
    edge_labels: Optional[List] = None,
    concurrency: Optional[int] = None,
    write_property: Optional[str] = None,
):

Parameters:

  • neptune_graph : NeptuneGraph - A NeptuneGraph instance
  • vertex_label : str, optional - A vertex label for vertex filtering.
  • edge_labels : list, optional - To filter on one more edge labels, provide a list of the ones to filter on. If no edgeLabels field is provided then all edge labels are processed during traversal.
  • concurrency : int, optional - Controls the number of concurrent threads used to run the algorithm.
  • write_property : str, optional - Specifies the name of the node property that will store the computed degree values.

Returns:

  • Dictionary of node.id:degree pairs

in_degree_centrality

Executes Degree algorithm on the graph with inbound edges.
Link: https://docs.aws.amazon.com/neptune-analytics/latest/userguide/degree.html

Source: nx_neptune/algorithms/centrality/degree_centrality.py

@configure_if_nx_active()
def in_degree_centrality(
    neptune_graph: NeptuneGraph,
    vertex_label: Optional[str] = None,
    edge_labels: Optional[List] = None,
    concurrency: Optional[int] = None,
    write_property: Optional[str] = None,
):

Parameters:

  • neptune_graph : NeptuneGraph - A NeptuneGraph instance
  • vertex_label : str, optional - A vertex label for vertex filtering.
  • edge_labels : list, optional - To filter on one more edge labels, provide a list of the ones to filter on. If no edgeLabels field is provided then all edge labels are processed during traversal.
  • concurrency : int, optional - Controls the number of concurrent threads used to run the algorithm.
  • write_property : str, optional - Specifies the name of the node property that will store the computed degree values.

Returns:

  • Dictionary of node.id:degree pairs

out_degree_centrality

Executes Degree algorithm on the graph with outbound edges.
Link: https://docs.aws.amazon.com/neptune-analytics/latest/userguide/degree.html

Source: nx_neptune/algorithms/centrality/degree_centrality.py

@configure_if_nx_active()
def out_degree_centrality(
    neptune_graph: NeptuneGraph,
    vertex_label: Optional[str] = None,
    edge_labels: Optional[List] = None,
    concurrency: Optional[int] = None,
    write_property: Optional[str] = None,
):

Parameters:

  • neptune_graph : NeptuneGraph - A NeptuneGraph instance
  • vertex_label : str, optional - A vertex label for vertex filtering.
  • edge_labels : list, optional - To filter on one more edge labels, provide a list of the ones to filter on. If no edgeLabels field is provided then all edge labels are processed during traversal.
  • concurrency : int, optional - Controls the number of concurrent threads used to run the algorithm.
  • write_property : str, optional - Specifies the name of the node property that will store the computed degree values.

Returns:

  • Dictionary of node.id:degree pairs

closeness_centrality

Compute the closeness centrality for nodes.
Link: https://docs.aws.amazon.com/neptune-analytics/latest/userguide/closeness-centrality.html

Source: nx_neptune/algorithms/centrality/closeness.py

@configure_if_nx_active()
def closeness_centrality(
        neptune_graph: NeptuneGraph,
        u=None,
        distance=None,
        wf_improved=True,
        num_sources: Optional[int] = None,
        edge_labels: Optional[List] = None,
        vertex_label: Optional[str] = None,
        traversal_direction: Optional[str] = None,
        concurrency: Optional[int] = None,
        write_property: Optional[str] = None,
):

Parameters:

  • neptune_graph : NeptuneGraph - A NeptuneGraph instance
  • u, optional - Optional; limits the computation scope to the specified node. If omitted, the computation is performed over the entire graph.
  • distance, optional - (Unsupported) Specifies the edge attribute to use as the distance in the shortest path calculations.
  • wf_improved : bool - If True, scale by the fraction of nodes reachable. This gives the Wasserman and Faust improved formula. For single component graphs it is the same as the original formula.
  • num_sources : int, optional - The number of source nodes to use for approximating closeness centrality. If omitted, defaults to maxInt, resulting in exact closeness centrality computation.
  • edge_labels : list, optional - To filter on one more edge labels, provide a list of the ones to filter on. If no edgeLabels field is provided then all edge labels are processed during traversal.
  • vertex_label : str, optional - A vertex label for vertex filtering.
  • traversal_direction : str, optional - The direction of edge to follow. Must be one of: "outbound" or "inbound".
  • concurrency : int, optional - Controls the number of concurrent threads used to run the algorithm.
  • write_property : str, optional - Specifies the name of the node property that will store the computed group id values.

Returns:

  • Dictionary with the pair of node ID as key and closeness centrality score as value.

Community Algorithms

label_propagation_communities

Executes labelPropagation algorithm on the graph.
Link: https://docs.aws.amazon.com/neptune-analytics/latest/userguide/label-propagation.html

Source: nx_neptune/algorithms/communities/label_propagation.py

@configure_if_nx_active()
def label_propagation_communities(
    neptune_graph: NeptuneGraph,
    edge_labels: Optional[List] = None,
    vertex_label: Optional[str] = None,
    vertex_weight_property: Optional[str] = None,
    vertex_weight_type: Optional[str] = None,
    edge_weight_property: Optional[str] = None,
    edge_weight_type: Optional[str] = None,
    max_iterations: Optional[int] = None,
    traversal_direction: Optional[str] = None,
    concurrency: Optional[int] = None,
    write_property: Optional[str] = None,
):

Parameters:

  • neptune_graph : NeptuneGraph - A NeptuneGraph instance
  • edge_labels : list, optional - To filter on one more edge labels, provide a list of the ones to filter on. If no edgeLabels field is provided then all edge labels are processed during traversal.
  • vertex_label : str, optional - A vertex label for vertex filtering.
  • vertex_weight_property : str, optional - The node weight used in Label Propagation. When vertexWeightProperty is not specified, each node's communityId is treated equally.
  • vertex_weight_type : str, optional - The type of the numeric values in the node property specified by vertexWeightProperty.
  • edge_weight_property : str, optional - The weight property to consider for weighted computation.
  • edge_weight_type : str, optional - required if edgeWeightProperty is present, valid values: "int", "long", "float", "double".
  • max_iterations : int, optional - The maximum number of iterations to run.
  • traversal_direction : str, optional - The direction of edge to follow. Must be one of: "outbound" or "inbound".
  • concurrency : int, optional - Controls the number of concurrent threads used to run the algorithm.
  • write_property : str, optional - Specifies the name of the node property that will store the computed degree values.

Returns:

  • Dictionary of community items to members, in the form of dict_value.

fast_label_propagation_communities

Executes labelPropagation algorithm on the graph.
Link: https://docs.aws.amazon.com/neptune-analytics/latest/userguide/label-propagation.html

Source: nx_neptune/algorithms/communities/label_propagation.py

@configure_if_nx_active()
def fast_label_propagation_communities(
    neptune_graph: NeptuneGraph,
    *,
    weight=None,
    seed=None,
    edge_labels: Optional[List] = None,
    vertex_label: Optional[str] = None,
    vertex_weight_property: Optional[str] = None,
    vertex_weight_type: Optional[str] = None,
    edge_weight_property: Optional[str] = None,
    edge_weight_type: Optional[str] = None,
    max_iterations: Optional[int] = None,
    traversal_direction: Optional[str] = None,
    concurrency: Optional[int] = None,
    write_property: Optional[str] = None,
):

Parameters:

  • neptune_graph : NeptuneGraph - A NeptuneGraph instance
  • weight - The edge attribute representing a non-negative weight of an edge. If None, each edge is assumed to have weight one.
  • seed - Indicator of random number generation state.
  • edge_labels : list, optional - To filter on one more edge labels, provide a list of the ones to filter on. If no edgeLabels field is provided then all edge labels are processed during traversal.
  • vertex_label : str, optional - A vertex label for vertex filtering.
  • vertex_weight_property : str, optional - The node weight used in Label Propagation.
  • vertex_weight_type : str, optional - The type of the numeric values in the node property specified by vertexWeightProperty.
  • edge_weight_property : str, optional - The weight property to consider for weighted pageRank computation.
  • edge_weight_type : str, optional - required if edgeWeightProperty is present, valid values: "int", "long", "float", "double".
  • max_iterations : int, optional - The maximum number of iterations to run.
  • traversal_direction : str, optional - The direction of edge to follow. Must be one of: "outbound" or "inbound".
  • concurrency : int, optional - Controls the number of concurrent threads used to run the algorithm.
  • write_property : str, optional - Specifies the name of the node property that will store the computed degree values.

Returns:

  • Dictionary of community items to members, in the form of dict_value.

asyn_lpa_communities

Executes labelPropagation algorithm on the graph.
Link: https://docs.aws.amazon.com/neptune-analytics/latest/userguide/label-propagation.html

Source: nx_neptune/algorithms/communities/label_propagation.py

@configure_if_nx_active()
def asyn_lpa_communities(
    neptune_graph: NeptuneGraph,
    weight=None,
    seed=None,
    edge_labels: Optional[List] = None,
    vertex_label: Optional[str] = None,
    vertex_weight_property: Optional[str] = None,
    vertex_weight_type: Optional[str] = None,
    edge_weight_property: Optional[str] = None,
    edge_weight_type: Optional[str] = None,
    max_iterations: Optional[int] = None,
    traversal_direction: Optional[str] = None,
    concurrency: Optional[int] = None,
    write_property: Optional[str] = None,
):

Parameters:

  • neptune_graph : NeptuneGraph - A NeptuneGraph instance
  • weight - The edge attribute representing a non-negative weight of an edge. If None, each edge is assumed to have weight one.
  • seed - Indicator of random number generation state.
  • edge_labels : list, optional - To filter on one more edge labels, provide a list of the ones to filter on. If no edgeLabels field is provided then all edge labels are processed during traversal.
  • vertex_label : str, optional - A vertex label for vertex filtering.
  • vertex_weight_property : str, optional - The node weight used in Label Propagation.
  • vertex_weight_type : str, optional - The type of the numeric values in the node property specified by vertexWeightProperty.
  • edge_weight_property : str, optional - The weight property to consider for weighted pageRank computation.
  • edge_weight_type : str, optional - required if edgeWeightProperty is present, valid values: "int", "long", "float", "double".
  • max_iterations : int, optional - The maximum number of iterations to run.
  • traversal_direction : str, optional - The direction of edge to follow. Must be one of: "outbound" or "inbound".
  • concurrency : int, optional - Controls the number of concurrent threads used to run the algorithm.
  • write_property : str, optional - Specifies the name of the node property that will store the computed degree values.

Returns:

  • Dictionary of community items to members, in the form of dict_value.

louvain_communities

Executes Louvain algorithm on the graph.
Link: https://docs.aws.amazon.com/neptune-analytics/latest/userguide/louvain.html

Source: nx_neptune/algorithms/communities/louvain.py

@configure_if_nx_active()
def louvain_communities(
    neptune_graph: NeptuneGraph,
    weight: str,
    resolution: float,
    threshold: float,
    max_level: Optional[int],
    seed: Optional[int],
    edge_weight_property: Optional[str] = None,
    edge_weight_type: Optional[str] = None,
    edge_labels: Optional[List] = None,
    max_iterations: Optional[int] = None,
    concurrency: Optional[int] = None,
    level_tolerance: Optional[float] = None,
    write_property: Optional[str] = None,
):

Parameters:

  • neptune_graph : NeptuneGraph - A NeptuneGraph instance
  • weight : str - The edge attribute representing a non-negative weight of an edge.
  • resolution : float - (not supported in Neptune Analytics) If resolution is less than 1, the algorithm favors larger communities. Greater than 1 favors smaller communities
  • threshold : float - Modularity gain threshold for each level. If the gain of modularity between 2 levels of the algorithm is less than the given threshold, then the algorithm stops and returns the resulting communities.
  • max_level : int, optional - The minimum change in modularity required to continue to the next level.
  • seed : int, optional - (not supported in Neptune Analytics) Indicator of random number generation state.
  • edge_weight_property : str, optional - The weight property to consider for weighted pageRank computation.
  • edge_weight_type : str, optional - required if edgeWeightProperty is present, valid values: "int", "long", "float", "double".
  • edge_labels : list, optional - To filter on one more edge labels, provide a list of the ones to filter on. If no edgeLabels field is provided then all edge labels are processed during traversal.
  • max_iterations : int, optional - The maximum number of iterations to run.
  • concurrency : int, optional - Controls the number of concurrent threads used to run the algorithm.
  • level_tolerance : float, optional - The minimum change in modularity required to continue to the next level.
  • write_property : str, optional - Specifies the name of the node property that will store the computed group id values.

Returns:

  • Dictionary of community items to members, in the form of list of set.