Commit c23e90c

Merge pull request #92 from inzva/muratbiberoglu-patch
Migration of bundles continue
2 parents ac2bbcc + f275e29 commit c23e90c

27 files changed: +624 -1 lines changed

docs/data-structures/img/mo.png (146 KB)

docs/data-structures/img/trie.png (67.1 KB)

docs/data-structures/index.md

Lines changed: 2 additions & 0 deletions

```diff
@@ -22,12 +22,14 @@ Bilgisayar biliminde veri yapıları, belirli bir eleman kümesi üzerinde verim
 ### [Deque](deque.md)
 ### [Fenwick Tree](fenwick-tree.md)
 ### [Segment Tree](segment-tree.md)
+### [Trie](trie.md)

 ## Statik Veri Yapıları

 ### [Prefix Sum](prefix-sum.md)
 ### [Sparse Table](sparse-table.md)
 ### [SQRT Decomposition](sqrt-decomposition.md)
+### [Mo's Algorithm](mo-algorithm.md)

 ## Common Problems
```

docs/data-structures/mo-algorithm.md

Lines changed: 114 additions & 0 deletions
---
title: Mo's Algorithm
tags:
- Data Structures
- Mo's Algorithm
---

This method is key for solving offline range queries on an array. By offline, we mean that there are no updates and we may compute the answers to the queries in any order we want. Let's introduce a problem and construct an efficient solution for it.

You are given an array $a$ with $N$ elements whose values range from $1$ to $M$. You have to answer $Q$ queries, all of the same type: each query gives a range $[l, r]$, and you must print how many distinct values appear in the subarray $a_l, a_{l+1}, \ldots, a_{r-1}, a_r$.

First let's find a naive solution and improve it. Recall the frequency array we mentioned before. We will maintain a frequency array that counts only the values inside the given subarray; the number of entries in it that are greater than $0$ is the answer for the query. We then update the frequency array for the next query. Each query takes $\mathcal{O}(N)$ time, so the total complexity is $\mathcal{O}(Q \times N)$. See the code below for the implementation.
```cpp
class Query {
public:
    int l, r, ind;
    Query(int l, int r, int ind) {
        this->l = l, this->r = r, this->ind = ind;
    }
};

void del(int ind, vector<int> &a, vector<int> &F, int &num) {
    if (F[a[ind]] == 1) num--;
    F[a[ind]]--;
}

void add(int ind, vector<int> &a, vector<int> &F, int &num) {
    if (F[a[ind]] == 0) num++;
    F[a[ind]]++;
}

vector<int> solve(vector<int> &a, vector<Query> &q) {
    int Q = q.size(), N = a.size();
    int M = *max_element(a.begin(), a.end());
    vector<int> F(M + 1, 0); // the frequency array mentioned above
    vector<int> ans(Q, 0);
    int l = 0, r = -1, num = 0; // current window [l, r] starts empty
    for (int i = 0; i < Q; i++) {
        int nl = q[i].l, nr = q[i].r;
        while (l < nl) del(l++, a, F, num);
        while (l > nl) add(--l, a, F, num);
        while (r > nr) del(r--, a, F, num);
        while (r < nr) add(++r, a, F, num);
        ans[q[i].ind] = num;
    }
    return ans;
}
```
Here each query still takes $\mathcal{O}(N)$ time, so the total complexity is $\mathcal{O}(Q \times N)$. Just by changing the order of the queries we will reduce this to $\mathcal{O}((Q + N) \times \sqrt N)$.

## Mo's Algorithm

We will reorder the queries so that the overall complexity drops drastically. We will use the following comparison operator to sort the queries, and answer them in this sorted order. The block size here is $\mathcal{O}(\sqrt N)$.
```cpp
bool operator<(const Query &other) const {
    return make_pair(l / block_size, r) <
           make_pair(other.l / block_size, other.r);
}
```
Why does this work? Let's examine what we do first, then find the complexity. We divide the $l$ values of the queries into blocks: the block number of a given $l$ is $\lfloor l / \text{block\_size} \rfloor$ (integer division). We sort the queries first by block number, and for equal block numbers by their $r$ values. Sorting all queries takes $\mathcal{O}(Q \log Q)$ time.

Let's count how many add and del operations are needed to move the current $r$. Within one block, $r$ only increases, so it contributes $\mathcal{O}(N)$ operations per block; since there are $N / \text{block\_size}$ blocks in total, this is $\mathcal{O}(N \times N / \text{block\_size})$ operations overall. Within one block, the add and del operations that move $l$ are called at most $\mathcal{O}(\text{block\_size})$ times per query, because queries in the same block have $l$ values that differ by at most $\mathcal{O}(\text{block\_size})$; overall this is $\mathcal{O}(Q \times \text{block\_size})$. When consecutive queries fall in different blocks we may perform up to $\mathcal{O}(N)$ operations, but there are at most $\mathcal{O}(N / \text{block\_size})$ such block transitions, so this does not change the overall complexity. If we pick $\text{block\_size} = \sqrt N$, the overall complexity is $\mathcal{O}((Q + N) \times \sqrt N)$. The full code is given below.

<figure markdown="span" style="width: 64%">
![Example for the Algorithm](img/mo.png)
<figcaption>Example for the Algorithm</figcaption>
</figure>
```cpp
int block_size;

class Query {
public:
    int l, r, ind;
    Query(int l, int r, int ind) {
        this->l = l, this->r = r, this->ind = ind;
    }
    bool operator<(const Query &other) const {
        return make_pair(l / block_size, r) <
               make_pair(other.l / block_size, other.r);
    }
};

void del(int ind, vector<int> &a, vector<int> &F, int &num) {
    if (F[a[ind]] == 1) num--;
    F[a[ind]]--;
}

void add(int ind, vector<int> &a, vector<int> &F, int &num) {
    if (F[a[ind]] == 0) num++;
    F[a[ind]]++;
}

vector<int> solve(vector<int> &a, vector<Query> &q) {
    int Q = q.size(), N = a.size();
    int M = *max_element(a.begin(), a.end());
    block_size = sqrt(N);
    sort(q.begin(), q.end());
    vector<int> F(M + 1, 0); // the frequency array mentioned above
    vector<int> ans(Q, 0);
    int l = 0, r = -1, num = 0; // current window [l, r] starts empty
    for (int i = 0; i < Q; i++) {
        int nl = q[i].l, nr = q[i].r;
        while (l < nl) del(l++, a, F, num);
        while (l > nl) add(--l, a, F, num);
        while (r > nr) del(r--, a, F, num);
        while (r < nr) add(++r, a, F, num);
        ans[q[i].ind] = num;
    }
    return ans;
}
```

docs/data-structures/trie.md

Lines changed: 66 additions & 0 deletions

---
title: Trie
tags:
- Data Structures
- Trie
---

A Trie is an efficient information retrieval data structure. Using a Trie, search complexity can be brought to the optimal limit, the key length: a well-balanced BST storing the keys needs time proportional to $M \times \log N$, where $M$ is the maximum string length and $N$ is the number of keys in the tree, while a Trie lets us search for a key in $\mathcal{O}(M)$ time. The penalty is the Trie's storage requirements (see [Applications of Trie](https://www.geeksforgeeks.org/advantages-trie-data-structure/) for more details).
<figure markdown="span" style="width: 36%">
![Trie Structure https://www.geeksforgeeks.org/wp-content/uploads/Trie.png](img/trie.png)
<figcaption>Trie Structure. https://www.geeksforgeeks.org/wp-content/uploads/Trie.png</figcaption>
</figure>

Every node of a Trie consists of multiple branches, each representing a possible character of the keys. We need to mark the last node of every key as an end-of-word node; the `isEndOfWord` field distinguishes such nodes. A simple node structure for the English alphabet can be as follows.
```cpp
const int ALPHABET_SIZE = 26; // lowercase English letters

// Trie node
class TrieNode {
public:
    TrieNode *children[ALPHABET_SIZE];
    bool isEndOfWord;
    TrieNode() {
        isEndOfWord = false;
        for (int i = 0; i < ALPHABET_SIZE; i++)
            children[i] = NULL;
    }
};
```
## Insertion

Inserting a key into a Trie is a simple procedure. Every character of the input key is inserted as an individual Trie node; `children` is an array of pointers to next-level Trie nodes, and the key character acts as an index into it. If the input key is new, or an extension of an existing key, we construct the missing nodes along the key and mark the last node as end of word. If the input key is a prefix of an existing key, we simply mark the last node of the key as end of word. The key length determines the Trie depth.
```cpp
void insert(TrieNode *root, string key) {
    TrieNode *pCrawl = root;
    for (int i = 0; i < (int)key.length(); i++) {
        int index = key[i] - 'a';
        if (!pCrawl->children[index])
            pCrawl->children[index] = new TrieNode;
        pCrawl = pCrawl->children[index];
    }
    pCrawl->isEndOfWord = true; // mark the last node as end of word
}
```
## Search

Searching for a key is similar to insertion, except we only compare the characters and move down without creating nodes. The search can terminate either because the key's characters are exhausted or because a required child is missing. In the former case, the key exists in the Trie if the `isEndOfWord` field of the last node is true. In the latter case, the search ends without examining all characters of the key, so the key is not present in the Trie.
```cpp
bool search(TrieNode *root, string key) {
    TrieNode *pCrawl = root;
    for (int i = 0; i < (int)key.length(); i++) {
        int index = key[i] - 'a';
        if (!pCrawl->children[index])
            return false; // required child is missing
        pCrawl = pCrawl->children[index];
    }
    return pCrawl->isEndOfWord;
}
```
Insert and search cost $\mathcal{O}(\text{key\_length})$. However, the memory requirements of a Trie are high: $\mathcal{O}(\text{ALPHABET\_SIZE} \times \text{key\_length} \times N)$, where $N$ is the number of keys in the Trie. There are more memory-efficient representations of trie nodes (e.g. compressed tries, ternary search trees) that minimize the memory requirements.

docs/graph/bipartite-checking.md

Lines changed: 58 additions & 0 deletions

---
title: Bipartite Checking
tags:
- Bipartite Checking
- Graph
---

The question is in the title: is the given graph bipartite? We can use BFS or DFS on the graph. Let's first focus on the BFS-based algorithm. The procedure is very similar to BFS, except that we keep an extra color array and assign a color to each vertex while traversing the graph. The proof relies on the fact that BFS explores the graph level by level: if the graph contains an odd cycle, there must be an edge between two vertices in the same depth (layer; a proof can be found in [1 - Algorithm Design, Kleinberg, Tardos]). Say the colors are red and green. We traverse the graph with BFS, assigning red to even layers and green to odd layers, and then check whether any edge has two endpoints of the same color. If such an edge exists, the graph is not bipartite; otherwise it is.
<figure markdown="span">
![If two nodes x and y in the same layer are joined by an edge, then the cycle through x, y, and their lowest common ancestor z has odd length, demonstrating that the graph cannot be bipartite.](img/bipartite_check.png)
<figcaption>If two nodes x and y in the same layer are joined by an edge, then the cycle through x, y, and their lowest common ancestor z has odd length, demonstrating that the graph cannot be bipartite.</figcaption>
</figure>
```cpp
typedef vector<int> adjList;
typedef vector<adjList> graph;
typedef pair<int, int> ii;
enum COLOR { RED, GREEN };

// Assumes the graph is connected; otherwise run the BFS from every component.
bool bipartite_check(graph &g) {
    int root = 0; // pick the 0-indexed node as root
    vector<bool> visited(g.size(), false);
    vector<int> Color(g.size(), 0);
    queue<ii> Q({{root, 0}}); // insert root to queue, it is in layer 0
    visited[root] = true;
    Color[root] = RED;
    while (!Q.empty()) {
        /* top.first is the node, top.second its depth, i.e. layer */
        auto top = Q.front();
        Q.pop();
        for (int u : g[top.first]) {
            if (!visited[u]) {
                visited[u] = true;
                // mark even layers red, odd layers green
                Color[u] = (top.second + 1) % 2 == 0 ? RED : GREEN;
                Q.push({u, top.second + 1});
            }
        }
    }
    for (int i = 0; i < (int)g.size(); ++i) {
        for (auto v : g[i]) {
            if (Color[i] == Color[v]) return false;
        }
    }
    return true;
}

int main() {
    graph g(4); // 4 nodes: 0, 1, 2, 3
    // store both directions, since the graph is undirected
    g[0].push_back(1); g[1].push_back(0);
    g[1].push_back(2); g[2].push_back(1);
    g[2].push_back(3); g[3].push_back(2);
    cout << (bipartite_check(g) ? "YES" : "NO") << endl; // a path is bipartite
    return 0;
}
```
The complexity of the algorithm is $O(V + E) + O(E)$: the BFS plus the loop over the edges. Since this is Big-O notation, we can simply write it as $O(V + E)$.
Lines changed: 127 additions & 0 deletions

---
title: Bridges and Articulation Points
tags:
- Bridge
- Articulation Point
- Cut Vertex
- Cut Edge
- Graph
---
## DFS Order

**DFS order** is traversing all the nodes of a given graph from a fixed root, as in the DFS algorithm, but without revisiting a discovered node. An important observation is that the edges and nodes we use form a **tree**: every node **except the root** is reached from exactly one other node, and the **root** is reached from none.
```cpp
// g is the adjacency list, used marks discovered nodes
void dfs(int node) {
    used[node] = true;
    for (auto it : g[node])
        if (!used[it])
            dfs(it);
}
```
### Types of Edges

When traversing a graph in DFS order, several types of edges can be encountered. These edges will be very helpful in understanding some graph algorithms.

**Types of Edges:**

- **Tree edge:** The main edges used while traversing the graph.
- **Forward edge:** Leads to an already visited node located in our own subtree.
- **Back edge:** Leads to an already visited node whose DFS process is not yet complete (an ancestor).
- **Cross edge:** Leads to an already visited node whose DFS process is already complete.

An important observation: in an undirected graph a cross edge is impossible, because an edge out of a node whose DFS is complete cannot have remained unvisited.
<figure markdown="span" style="width: 36%">
![Green-colored edges are tree edges. Edge (1,8) is a forward edge. Edge (6,4) is a back edge. Edge (5,4) is a cross edge.](img/types-of-edges.png)
<figcaption>Green-colored edges are tree edges. Edge (1,8) is a forward edge. Edge (6,4) is a back edge. Edge (5,4) is a cross edge.</figcaption>
</figure>
## Bridge

In an **undirected** and **connected** graph, an edge whose removal disconnects the graph is called a **bridge**.

### Finding Bridges

Although there are several algorithms for finding bridges (such as **Chain Decomposition**), we will focus on **Tarjan's Algorithm**, which is among the easiest to implement and the fastest.

When traversing a graph with DFS, if a **back edge** leaves the subtree of the lower endpoint of an edge, then that edge is **not** a bridge: the back edge keeps the subtree connected to its ancestors when the edge is removed.

The algorithm is based on exactly this principle, keeping track of the minimum depth reached by the **back edge**s within the subtree of each node.

If the minimum depth reached by the **back edge**s in the subtree of the lower endpoint of an edge is greater than or equal to the depth of the upper endpoint, then the edge is a **bridge**: no **back edge** in that subtree reaches a node above the edge, so removing the edge disconnects the subtree from its ancestors.

Using Tarjan's Algorithm, we can find all bridges in a graph in $\mathcal{O}(V + E)$ time, where $V$ is the number of vertices and $E$ is the number of edges.
```cpp
int dfs(int node, int parent, int depth) {
    int minDepth = depth;
    dep[node] = depth; // the dep array holds the depth of each node
    used[node] = true;
    for (auto it : g[node]) {
        if (it == parent)
            continue;
        if (used[it]) {
            minDepth = min(minDepth, dep[it]);
            // If the neighbor has already been visited,
            // this edge is a back edge or a forward edge.
            continue;
        }
        int val = dfs(it, node, depth + 1);
        // val is the minimum depth reachable from the subtree below.
        if (val >= depth + 1)
            bridges.push_back({node, it});
        minDepth = min(minDepth, val);
    }
    return minDepth;
}
```
## Articulation Point

In an undirected graph, a node whose removal increases the number of connected components is called an **articulation point** or **cut point**.

<figure markdown="span" style="width: 36%">
![For example, if we remove node 0, the remaining nodes are split into two groups: 5 and 1, 2, 3, 4. Similarly, if we remove node 1, the nodes are split into 5, 0 and 2, 3, 4. Therefore, nodes 0 and 1 are articulation points.](img/cut-point.png)
<figcaption>For example, if we remove node 0, the remaining nodes are split into two groups: 5 and 1, 2, 3, 4. Similarly, if we remove node 1, the nodes are split into 5, 0 and 2, 3, 4. Therefore, nodes 0 and 1 are **articulation points**.</figcaption>
</figure>
### Finding Articulation Points

Tarjan's Algorithm for finding articulation points in an undirected graph:

- Traverse the graph in DFS order.

- For each node, calculate the depth of the minimum-depth node reachable from the current node and its subtree through back edges. This value is called the **low** value of the node.

- If the **low** value of any child of a non-root node is greater than or equal to the depth of the current node, the node is an **articulation point**: no **back edge** in that child's subtree reaches above the current node, so removing the node disconnects the subtree from its ancestors.

- If the current node is the root (the starting node of the DFS order) and the DFS branches into multiple subtrees, then the root itself is an **articulation point**, since it connects multiple otherwise-separate subgraphs.

Using Tarjan's Algorithm, we can find all articulation points in a graph in $\mathcal{O}(V + E)$ time, where $V$ is the number of vertices and $E$ is the number of edges.
```cpp
int dfs(int node, int parent, int depth) {
    int minDepth = depth, children = 0;
    dep[node] = depth; // the dep array holds the depth of each node
    used[node] = true;
    for (auto it : g[node]) {
        if (it == parent)
            continue;
        if (used[it]) {
            minDepth = min(minDepth, dep[it]);
            continue;
        }
        int val = dfs(it, node, depth + 1);
        if (val >= depth and parent != -1)
            isCutPoint[node] = true;
        minDepth = min(minDepth, val);
        children++;
    }
    // This if handles the root condition mentioned above.
    if (parent == -1 and children >= 2)
        isCutPoint[node] = true;
    return minDepth;
}
```
