---
title: "005 - Inspecting Nodes and Edges"
output: html_document
---
## Setup
Ensure that the development version of **DiagrammeR** is installed. Load the package with `library()`, and also load the **tidyverse** packages.
```{r load_packages, message=FALSE, warning=FALSE, include=FALSE, results=FALSE}
#devtools::install_github("rich-iannone/DiagrammeR")
library(DiagrammeR)
library(tidyverse)
```
## Part 1. Information on All Nodes and Edges
When you have a graph object, you'll sometimes want to poke around and inspect some of its nodes and edges. There are good reasons for doing so: the nodes and edges can hold valuable information, further graph construction may hinge on what already exists in the graph, and inspection is a good way to verify that a graph modification took place as intended.
First, let's build a graph to use for the examples. For the node data frame we will include values for the `type`, `label`, and `data` attributes. The edge data frame will contain the `rel`, `color`, and `weight` edge attributes.
```{r create_initial_graph}
# Create a node data frame (ndf) with
# 4 nodes
ndf <-
  create_node_df(
    n = 4,
    type = "number",
    label = c(
      "one", "two",
      "three", "four"),
    data = c(
      3.5, 2.6, 9.4, 2.7))

# Create an edge data frame (edf) with
# 4 edges
edf <-
  create_edge_df(
    from = c(1, 2, 3, 4),
    to = c(4, 3, 1, 1),
    rel = c("P", "B", "L", "L"),
    color = c("green", "blue", "red", "red"),
    weight = c(2.1, 5.7, 10.1, 3.9))

# Combine the ndf and edf into a graph object
graph <-
  create_graph(
    nodes_df = ndf,
    edges_df = edf)

# Render the graph to see it in the RStudio Viewer
graph %>% render_graph()
```
The `get_node_ids()` function simply returns a vector of node ID values. This is useful in many cases and is great when used as a sanity check.
```{r use_get_node_ids}
graph %>% get_node_ids()
```
The `get_node_info()` function provides a data frame with detailed information on nodes and their interrelationships within a graph. It always returns the same columns, in the same order, and as many rows as there are nodes in the graph. It's useful when you want a quick summary of the node ID values, their labels and `type` values, and their degrees of connectedness with other nodes.
```{r use_get_node_info}
graph %>% get_node_info()
```
In the above table, the base attributes of the nodes come first (`id`, `type`, and `label`), followed by the columns with degree information (`deg`, `indeg`, and `outdeg`). The node degree (`deg`) is the number of edges to or from the node. The indegree and outdegree are the number of edges coming in to the node and going out from it, respectively. Finally, the `loops` column gives the number of self edges for the node (a self edge starts and terminates at the same node, so it contributes 2 to that node's degree).
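To make the degree columns concrete, here is a minimal base R sketch that recomputes them from the `from` and `to` values of the example edge data frame. It is only an illustration of the definitions, not how **DiagrammeR** computes them internally, and the names `node_ids`, `from`, `to`, and `degree_tbl` are made up for this example.

```{r sketch_degree_base_r}
# Node IDs and edge endpoints copied from the example graph
node_ids <- 1:4
from <- c(1, 2, 3, 4)
to <- c(4, 3, 1, 1)

# Count how often each node appears as a source (outdegree)
# and as a target (indegree)
outdeg <- tabulate(match(from, node_ids), nbins = length(node_ids))
indeg <- tabulate(match(to, node_ids), nbins = length(node_ids))

# A self edge has the same node at both ends
loops <- tabulate(match(from[from == to], node_ids), nbins = length(node_ids))

# The degree is the sum of indegree and outdegree, so a
# self edge is counted twice
degree_tbl <-
  data.frame(
    id = node_ids,
    deg = indeg + outdeg,
    indeg = indeg,
    outdeg = outdeg,
    loops = loops)

degree_tbl
```

Node `1` receives edges from nodes `3` and `4` and sends one edge to node `4`, so its indegree is 2, its outdegree is 1, and its degree is 3.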
The `get_edges()` function returns all of the node ID values related to each edge in the graph:
```{r use_get_edges}
graph %>% get_edges()
```
Like nodes, edges also have ID values. This is important for distinguishing between edges when a pair of nodes has multiple edges between them (and especially if they are in the same direction in a directed graph). To get all edge ID values in the graph, use the `get_edge_ids()` function.
```{r use_get_edge_ids}
graph %>% get_edge_ids()
```
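The value of edge IDs is easiest to see with parallel edges. The following base R sketch uses a small, hypothetical edge table (not the example graph, which has no parallel edges) in which two edges share the same `from` and `to` values. The node pair alone cannot tell them apart, but the `id` column can.

```{r sketch_parallel_edges}
# Hypothetical edge table: edges 1 and 2 both run from node 1 to node 2
edges <-
  data.frame(
    id = 1:3,
    from = c(1, 1, 2),
    to = c(2, 2, 3),
    rel = c("a", "b", "a"))

# Describe each edge by its node pair only
pair <- paste0(edges$from, "->", edges$to)

# Node pairs that occur more than once are ambiguous
dup_pairs <- unique(pair[duplicated(pair)])

# The edge IDs still distinguish the parallel edges
ids_for_dup <- edges$id[pair %in% dup_pairs]

dup_pairs
ids_for_dup
```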
The `get_edge_info()` function, like `get_node_info()`, always returns a data frame with a set number of columns. In this case, these are the edge ID value `id`, the node ID values `from` and `to` that define the links, and the relationship (`rel`) labels for the edges.
```{r use_get_edge_info}
graph %>% get_edge_info()
```
## Part 2. Inspecting Nodes, Edges, and their Attributes
Two of a graph object's main components are its node data frame (ndf) and its edge data frame (edf). These can be obtained as individual data frames using the `get_node_df()` and `get_edge_df()` functions:
```{r use_get_node_df}
# Get the graph's ndf with the `get_node_df()` function
graph %>% get_node_df()
```
```{r use_get_edge_df}
# Get the graph's edf with the `get_edge_df()` function
graph %>% get_edge_df()
```
For the ndf, the `id`, `type`, and `label` columns will always be present and in that prescribed order. For the edf, the `id`, `from`, `to`, and `rel` columns will always be present. Any additional columns can be either parameters recognized by the graph rendering engine (e.g., `color`, `fontname`, etc.) or non-aesthetic properties of the nodes or edges (e.g., a node `data` value or an edge `weight`).
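One way to see which columns are additional attributes is to compare the column names against the fixed set. The sketch below does this in base R on a plain data frame that mirrors the example edf. The names `edf_like`, `required_cols`, and `extra_cols` are made up here, and the same `setdiff()` call should apply to the data frame returned by `get_edge_df()`, though that is not shown.

```{r sketch_extra_columns}
# A plain data frame with the same columns as the example edf
edf_like <-
  data.frame(
    id = 1:4,
    from = c(1, 2, 3, 4),
    to = c(4, 3, 1, 1),
    rel = c("P", "B", "L", "L"),
    color = c("green", "blue", "red", "red"),
    weight = c(2.1, 5.7, 10.1, 3.9))

# Columns that are always present in an edf
required_cols <- c("id", "from", "to", "rel")

# Check that the fixed columns are all there
has_required <- all(required_cols %in% names(edf_like))

# Everything else is an additional attribute
extra_cols <- setdiff(names(edf_like), required_cols)

has_required
extra_cols
```

Here `color` is an aesthetic attribute used when rendering, while `weight` is a non-aesthetic property of the edges.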
## Part 3. Determining Existence of Nodes or Edges
There may be cases where you need to verify that a certain node ID exists in the graph or that an edge definition is present. The `is_node_present()` and `is_edge_present()` functions will provide a `TRUE` or `FALSE` value as confirmation.
Get the node ID values present in the graph with the `get_node_ids()` function.
```{r use_get_node_id_2}
graph %>% get_node_ids()
```
Is the node with ID `1` in the graph? Use `is_node_present()` to find out.
```{r is_node_1_present}
graph %>%
  is_node_present(node = 1)
```
Is the node with ID `5` in the graph?
```{r is_node_5_present}
graph %>%
  is_node_present(node = 5)
```
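When several candidate IDs need checking at once, the same question can be asked of a vector of node IDs with base R's `%in%` operator. This is a small sketch that assumes the ID vector has already been extracted (for instance with `get_node_ids()`). The names `node_ids`, `candidates`, and `present` are illustrative only.

```{r sketch_many_nodes_present}
# Node IDs as they appear in the example graph
node_ids <- 1:4

# Candidate IDs to look up
candidates <- c(1, 5)

# One TRUE/FALSE value per candidate, labelled by ID
present <- setNames(candidates %in% node_ids, candidates)

present
```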
Get the node ID values associated with the edges present in the graph (with the `get_edges()` function).
```{r use_get_edges_2}
graph %>% get_edges()
```
To determine whether an edge is present, the `is_edge_present()` function takes 2 arguments after `graph`: `from` and `to`. So, to find out whether the edge `1->4` is present, the following can be used:
```{r is_edge_1_4_present}
graph %>%
  is_edge_present(
    from = 1,
    to = 4)
```
Since the edge `2->4` does not exist, the following will return `FALSE`:
```{r is_edge_2_4_present}
graph %>%
  is_edge_present(
    from = 2,
    to = 4)
```
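For a directed graph, an edge check of this kind amounts to looking for a row whose `from` and `to` values both match. The base R sketch below illustrates that idea on the example's edge endpoints. The helper `edge_exists()` is hypothetical and is not part of **DiagrammeR**.

```{r sketch_edge_exists}
# Edge endpoints copied from the example graph
from <- c(1, 2, 3, 4)
to <- c(4, 3, 1, 1)

# Hypothetical helper: TRUE if any edge runs from `from_id` to `to_id`
edge_exists <- function(from_id, to_id, from, to) {
  any(from == from_id & to == to_id)
}

edge_1_4 <- edge_exists(1, 4, from, to)  # TRUE: the edge 1->4 exists
edge_2_4 <- edge_exists(2, 4, from, to)  # FALSE: no edge 2->4
edge_3_2 <- edge_exists(3, 2, from, to)  # FALSE: only 2->3 exists

c(edge_1_4, edge_2_4, edge_3_2)
```

The last check shows that direction matters in this sketch: the edge `2->3` is in the graph, but the reversed pair `3->2` is not.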