Commit ff2b597

Merge pull request #87 from beyzanurdeniz/master
segment tree translated, lazy and binary search added
2 parents 10134d8 + 1806d37 commit ff2b597

4 files changed

+228
-37
lines changed

docs/data-structures/segment-tree.md

Lines changed: 228 additions & 37 deletions
@@ -5,97 +5,288 @@ tags:
55
- Segment Tree
66
---
77

8-
Segment Tree bir dizide $\mathcal{O}(\log N)$ zaman karmaşıklığında herhangi bir $[l,r]$ aralığı icin minimum, maksimum, toplam gibi sorgulara cevap verebilmemize ve bu aralıklar üzerinde güncelleme yapabilmemize olanak sağlayan bir veri yapısıdır.
8+
Segment Tree is a data structure that enables us to answer queries like minimum, maximum, sum etc. for any $[l,r]$ interval in $\mathcal{O}(\log N)$ time complexity and update these intervals.
99

10-
Segment Tree, [Fenwick Tree](fenwick-tree.md) ve [Sparse Table](sparse-table.md) yapılarından farklı olarak elemanlar üzerinde güncelleme yapılabilmesi ve minimum, maksimum gibi sorgulara da olanak sağlaması yönünden daha kullanışlıdır. Ayrıca Segment Tree $\mathcal{O}(N)$ hafıza karmaşıklığına sahipken Sparse Table yapısında gereken hafıza karmaşıklığı $\mathcal{O}(N \log N)$'dir.
10+
Segment Tree is more useful than [Fenwick Tree](fenwick-tree.md) and [Sparse Table](sparse-table.md) structures because it allows updates on elements and provides the possibility to answer queries like minimum, maximum etc. for any $[l,r]$ interval. Also, the memory complexity of Segment Tree is $\mathcal{O}(N)$ while the memory complexity of the Sparse Table structure is $\mathcal{O}(N \log N)$.
1111

12-
## Yapısı ve Kuruluşu
13-
Segment Tree, "Complete Binary Tree" yapısına sahiptir. Segment Tree'nin yaprak düğümlerinde dizinin elemanları saklıdır ve bu düğümlerin atası olan her düğüm kendi çocuğu olan düğümlerinin cevaplarının birleşmesiyle oluşur. Bu sayede her düğümde belirli aralıkların cevapları ve root düğümünde tüm dizinin cevabı saklanır. Örneğin toplam sorgusu için kurulmuş bir Segment Tree yapısı için her düğümün değeri çocuklarının değerleri toplamına eşittir.
12+
## Structure and Construction
13+
Segment Tree has a "Complete Binary Tree" structure. The leaf nodes of the Segment Tree store the elements of the array, and each internal node's value is calculated with a function that takes its children's values as inputs. Thus, the answers of certain intervals are stored in each node, and the answer of the whole array is stored in the root node. For example, for a Segment Tree structure built for the sum query, the value of each node is equal to the sum of its children's values.
1414

1515
<figure markdown="span">
16-
![a = [41,67,6,30,85,43,39] dizisinde toplam sorgusu icin oluşturulmuş Segment Tree yapısı](img/segtree.png){ width="100%" }
17-
<figcaption>$a = [41,67,6,30,85,43,39]$ dizisinde toplam sorgusu icin oluşturulmuş Segment Tree yapısı</figcaption>
16+
![segment tree structure to query sum on array a = [41,67,6,30,85,43,39]](img/segtree.png){ width="100%" }
17+
<figcaption>segment tree structure to query sum on array $a = [41,67,6,30,85,43,39]$</figcaption>
1818
</figure>
1919

2020
```c++
2121
void build(int ind, int l, int r) {
22-
// tree[ind] dizinin [l,r] araliginin cevabini saklar.
23-
if (l == r) { // yaprak dugum'e ulasti
24-
tree[ind] = a[l]; // bu dugum dizinin l. elamaninin cevabini saklar
22+
// tree[ind] stores the answer of the interval [l,r]
23+
if (l == r) { // leaf node reached
24+
tree[ind] = a[l]; // store the value of the leaf node
2525
} else {
2626
int mid = (l + r) / 2;
2727
build(ind * 2, l, mid);
2828
build(ind * 2 + 1, mid + 1, r);
29-
// [l,r] araliginin cevabini
30-
// [l,mid] ve [mid + 1,r] araliklarinin cevaplarinin birlesmesiyle olusur.
29+
// the answer of the interval [l,r] is the sum of the answers of [l,mid] and [mid + 1,r]
3130
tree[ind] = tree[ind * 2] + tree[ind * 2 + 1];
3231
}
3332
}
3433
```
3534
36-
## Aralık Sorgu ve Eleman Güncelleme
35+
## Query and Update Algorithms
3736
38-
### Sorgu Algoritması
37+
### Query Algorithm
3938
40-
Herhangi bir $[l,r]$ aralığı için sorgu algoritması sırası ile şu şekilde çalışır:
41-
- $[l,r]$ aralığını ağacımızda cevapları saklı olan en geniş aralıklara parçala.
42-
- Bu aralıkların cevaplarını birleştirerek istenilen cevabı hesapla.
39+
For any $[l,r]$ interval, the query algorithm works as follows:
40+
- Divide the $[l,r]$ interval into the widest intervals that are stored in the tree.
41+
- Merge the answers of these intervals to calculate the desired answer.
4342
44-
Ağacın her derinliğinde cevabımız için gerekli aralıklardan maksimum $2$ adet bulunabilir. Bu yüzden sorgu algoritması $\mathcal{O}(\log N)$ zaman karmaşıklığında çalışır.
43+
There are at most $2$ intervals that are needed to calculate the answer at each depth of the tree. Therefore, the query algorithm works in $\mathcal{O}(\log N)$ time complexity.
4544
4645
<figure markdown="span">
47-
![a = [41,67,6,30,85,43,39] dizisinde $[2,6]$ aralığında sorgu işlemi](img/segtreequery.png){ width="100%" }
48-
<figcaption>$a = [41,67,6,30,85,43,39]$ dizisinde $[2,6]$ aralığında sorgu işlemi</figcaption>
46+
![on array a = [41,67,6,30,85,43,39] query at $[2,6]$ interval](img/segtreequery.png){ width="100%" }
47+
<figcaption>on array $a = [41,67,6,30,85,43,39]$ query at $[2,6]$ interval</figcaption>
4948
</figure>
5049
51-
$a = [41,67,6,30,85,43,39]$ dizisinde $[2,6]$ aralığının cevabı $[2,3]$ ile $[4,6]$ aralıklarının cevaplarının birleşmesiyle elde edilir. Toplam sorgusu için cevap $36+167=203$ şeklinde hesaplanır.
50+
On array $a = [41,67,6,30,85,43,39]$, the answer of the $[2,6]$ interval is obtained by merging the answers of the $[2,3]$ and $[4,6]$ intervals. The answer for the sum query is calculated as $36+167=203$.
5251
5352
```c++
54-
// [lw,rw] sorguda cevabini aradigimiz aralik.
55-
// [l,r] ise agactaki ind nolu node'da cevabini sakladigimiz aralik.
53+
// [lw,rw] is the interval we are looking for the answer
54+
// [l,r] is the interval that the current node stores the answer
5655
int query(int ind, int l, int r, int lw, int rw) {
57-
if (l > rw or r < lw) // bulundugumuz aralik cevabini aradigimiz araligin disinda.
56+
if (l > rw or r < lw) // current interval lies completely outside the interval we are looking for
5857
return 0;
59-
if (l >= lw and r <= rw) // cevabini aradigimiz aralik bu araligi tamamen kapsiyor.
58+
if (l >= lw and r <= rw) //current interval is completely inside the interval we are looking for
6059
return tree[ind];
6160
6261
int mid = (l + r) / 2;
6362
64-
// Agacta recursive bir sekilde araligimizi araliklara bolup gelen cevaplari birlestiyoruz.
63+
// recursively query both halves of the current interval and merge their answers
6564
return query(ind * 2, l, mid, lw, rw) + query(ind * 2 + 1, mid + 1, r, lw, rw);
6665
}
6766
```
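
As a rough, self-contained sketch of how `build` and `query` fit together, the snippet below wraps them in a hypothetical `SumSegmentTree` struct (the struct name, the `4 * N` array size, the 0-based indexing, and the `long long` sums are assumptions made for illustration, not part of the article's code). It reproduces the $[2,6]$ query from the figure.

```cpp
#include <cassert>
#include <vector>

// Hypothetical wrapper around the article's build/query routines.
// Array positions are 0-based so that [2,6] matches the figure.
struct SumSegmentTree {
    int n;
    std::vector<long long> tree;  // nodes are numbered from 1; 4 * n slots is a safe upper bound

    explicit SumSegmentTree(const std::vector<int>& a)
        : n(static_cast<int>(a.size())), tree(4 * a.size(), 0) {
        assert(n > 0);  // the sketch assumes a non-empty array
        build(a, 1, 0, n - 1);
    }

    // tree[ind] stores the sum of a[l..r]
    void build(const std::vector<int>& a, int ind, int l, int r) {
        if (l == r) {
            tree[ind] = a[l];
            return;
        }
        int mid = (l + r) / 2;
        build(a, ind * 2, l, mid);
        build(a, ind * 2 + 1, mid + 1, r);
        tree[ind] = tree[ind * 2] + tree[ind * 2 + 1];
    }

    // sum of a[lw..rw], where node ind covers a[l..r]
    long long query(int ind, int l, int r, int lw, int rw) const {
        if (l > rw || r < lw) return 0;            // no overlap
        if (lw <= l && r <= rw) return tree[ind];  // node fully inside the query
        int mid = (l + r) / 2;
        return query(ind * 2, l, mid, lw, rw) +
               query(ind * 2 + 1, mid + 1, r, lw, rw);
    }

    long long query(int lw, int rw) const { return query(1, 0, n - 1, lw, rw); }
};
```

With `a = {41, 67, 6, 30, 85, 43, 39}`, calling `query(2, 6)` on this sketch combines the stored answers for $[2,3]$ and $[4,6]$, that is $36 + 167 = 203$.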
6867

69-
### Eleman Güncelleme Algoritması
68+
### Update Algorithm
7069

71-
Dizideki $x$ indeksli elemanının değerini güncellemek için kullanılan algoritma şu şeklide çalışır:
70+
To update the element at index $x$, update the value of every node whose interval contains index $x$.
7271

73-
- Ağaçta $x$ indeksli elemanı içeren tüm düğümlerin değerlerini güncelle.
74-
75-
Ağaçta $x$ indeksli elemanın cevabını tutan yaprak düğümden root düğüme kadar toplamda $\log(N)$ düğümün değerini güncellememiz yeterlidir. Dolayısıyla herhangi bir elemanın değerini güncellemenin zaman karmaşıklığı $\mathcal{O}(\log N)$'dir.
72+
It is sufficient to update the values of at most $\log(N)$ nodes from the leaf node containing the $x$ indexed element to the root node. Therefore, the time complexity of updating the value of any element is $\mathcal{O}(\log N)$.
7673

7774
<figure markdown="span">
78-
![a = [41,67,6,30,85,43,39] dizisinde 5 indeksli elemanın cevabını güncellerken güncellememiz gereken düğümler şekildeki gibidir.](img/segtreeupdate.png){ width="100%" }
79-
<figcaption>$a = [41,67,6,30,85,43,39]$ dizisinde 5 indeksli elemanın cevabını güncellerken güncellememiz gereken düğümler sekildeki gibidir.</figcaption>
75+
![the nodes that should be updated when updating the $5^{th}$ index of the array a = [41,67,6,30,85,43,39] are as follows:](img/segtreeupdate.png){ width="100%" }
76+
<figcaption>the nodes that should be updated when updating the $5^{th}$ index of the array $a = [41,67,6,30,85,43,39]$ are as follows:</figcaption>
8077
</figure>
8178

79+
80+
8281
```c++
8382
void update(int ind, int l, int r, int x, int val) {
84-
if (l > x || r < x) // bulundugumuz aralik x indeksli elemani icermiyor.
83+
if (l > x || r < x) // x index is not in the current interval
8584
return;
8685
if (l == x and r == x) {
87-
tree[ind] = val; // x indeksli elemani iceren yaprak dugumunun cevabini guncelliyoruz.
86+
tree[ind] = val; // update the value of the leaf node
8887
return;
8988
}
9089

9190
int mid = (l + r) / 2;
9291

93-
// recursive bir sekilde x indeksli elemani iceren
94-
// tum araliklarin cevaplarini guncelliyoruz.
92+
// recursively update the values of all nodes containing the x index
9593
update(ind * 2, l, mid, x, val);
9694
update(ind * 2 + 1, mid + 1, r, x, val);
9795
tree[ind] = tree[ind * 2] + tree[ind * 2 + 1];
9896
}
9997
```
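
The point update can be tried in isolation with a small sketch like the one below. The `PointUpdateTree` struct, its `set` and `total` helpers, and the choice to fill the tree through repeated updates (which costs $\mathcal{O}(N \log N)$ instead of the $\mathcal{O}(N)$ of `build`) are assumptions made only to keep the example short.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sum tree that is filled and modified only through point updates.
struct PointUpdateTree {
    int n;
    std::vector<long long> tree;  // nodes are numbered from 1

    explicit PointUpdateTree(const std::vector<int>& a)
        : n(static_cast<int>(a.size())), tree(4 * a.size(), 0) {
        assert(n > 0);  // the sketch assumes a non-empty array
        for (int i = 0; i < n; i++) set(i, a[i]);
    }

    // Same logic as the article's update: set position x to val,
    // then recompute every node on the path back to the root.
    void update(int ind, int l, int r, int x, long long val) {
        if (x < l || r < x) return;  // x is not in this node's interval
        if (l == r) {
            tree[ind] = val;         // leaf that stores position x
            return;
        }
        int mid = (l + r) / 2;
        update(ind * 2, l, mid, x, val);
        update(ind * 2 + 1, mid + 1, r, x, val);
        tree[ind] = tree[ind * 2] + tree[ind * 2 + 1];
    }

    void set(int x, long long val) { update(1, 0, n - 1, x, val); }

    long long total() const { return tree[1]; }  // the root stores the sum of the whole array
};
```

For the array from the figure, setting position 5 (0-based, currently $43$) to $100$ changes the root from $311$ to $368$, and only the nodes whose interval contains position 5 are rewritten.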
10098
101-
Segment Tree veri yapısı ile ilgili örnek bir probleme [buradan](https://codeforces.com/gym/100739/problem/A){target="_blank"} ulaşabilirsiniz.
99+
A sample problem related to the Segment Tree data structure can be found [here](https://codeforces.com/gym/100739/problem/A){target="_blank"}.
100+
101+
## Segment Tree with Lazy Propagation
102+
Previously, the update function was called to update only a single value in the array. Please note that a single value update in the array may cause changes in multiple nodes of the Segment Tree, as there may be many nodes that have this element in their range.
103+
104+
### Lazy Propagation Algorithm
105+
We need a structure that can perform the following operations on an array indexed over $[1,N]$.
106+
- Add inc to all elements in the given range $[l, r]$.
107+
- Return the sum of all elements in the given range $[l, r]$.
108+
109+
Notice that if the update were for a single element, we could use the segment tree we have learned before. The trivial structure that comes to mind is a plain array, performing each operation by traversing the range and handling the elements one by one. Both operations take $\mathcal{O}(L)$ time in this structure, where $L$ is the number of elements in the given range.
110+
111+
Let’s use the segment tree we have learned. The second operation is easy: we can do it in $\mathcal{O}(\log N)$. What about the first operation? Since the regular segment tree supports only single-element updates, we have to update all elements in the given range one by one. Thus we have to perform the update operation $L$ times, which takes $\mathcal{O}(L \times \log N)$ for each range update. This looks bad, even worse than just using an array in a lot of cases.
112+
113+
So we need a better structure. People developed a trick called lazy propagation to perform range updates on a structure that supports single-element updates (this trick can be used in segment trees, treaps, k-d trees ...).
114+
115+
The trick is to be lazy, i.e., do work only when needed. Do the updates only when you have to. Using lazy propagation we can do range updates in $\mathcal{O}(\log N)$ on a standard segment tree. This is definitely fast enough.
116+
117+
### Updates Using Lazy Propagation
118+
Let’s be <i>lazy</i> as told: when we need to update an interval, we update a node and mark its children as having a pending update, which is applied only when those children are visited. For this we need an array $lazy[]$ of the same size as the segment tree. Initially all elements of $lazy[]$ are $0$, meaning there is no pending update. A non-zero $lazy[k]$ means that this value must be applied to node $k$ before that node is used in any query or update; once it is applied, $lazy[2 \cdot k]$ and $lazy[2 \cdot k + 1]$ must be increased by the same amount so that the children receive the update later.
119+
120+
To update an interval we will keep 3 things in mind.
121+
- If the current segment tree node has any pending update, then first add that pending update to the current node and push the update to its children.
122+
- If the interval represented by current node lies completely in the interval to update, then update the current node and update the $lazy[]$ array for children nodes.
123+
- If the interval represented by the current node only partially overlaps with the interval to update, then recurse into both children and recompute the node as in the earlier update function.
124+
125+
```c++
126+
void update(int node, int start, int end, int l, int r, int val) {
127+
// If there's a pending update on the current node, apply it
128+
if (lazy[node] != 0) {
129+
tree[node] += (end - start + 1) * lazy[node]; // Apply the pending update
130+
// If not a leaf node, propagate the lazy update to the children
131+
if (start != end) {
132+
lazy[2 * node] += lazy[node];
133+
lazy[2 * node + 1] += lazy[node];
134+
}
135+
lazy[node] = 0; // Clear the pending update
136+
}
137+
138+
// If the current interval [start, end] does not intersect with [l, r], return
139+
if (start > r || end < l) {
140+
return;
141+
}
142+
143+
// If the current interval [start, end] is completely within [l, r], apply the update
144+
if (l <= start && end <= r) {
145+
tree[node] += (end - start + 1) * val; // Update the segment
146+
// If not a leaf node, propagate the update to the children
147+
if (start != end) {
148+
lazy[2 * node] += val;
149+
lazy[2 * node + 1] += val;
150+
}
151+
return;
152+
}
153+
154+
// Otherwise, split the interval and update both halves
155+
int mid = (start + end) / 2;
156+
update(2 * node, start, mid, l, r, val);
157+
update(2 * node + 1, mid + 1, end, l, r, val);
158+
159+
// After updating the children, recalculate the current node's value
160+
tree[node] = tree[2 * node] + tree[2 * node + 1];
161+
}
162+
```
163+
164+
This is the update function for the given problem. Notice that when we arrive at a node, all the postponed updates that would affect this node have been applied, since we push them downwards on the way to this node. Thus this node holds exactly the value it would have if the range updates had been done without lazy propagation. So it seems to be working. How about queries?
165+
166+
### Queries Using Lazy Propagation
167+
Since we have changed the update function to postpone the update operation, we will have to change the query function as well. The only change we need to make is to check if there is any pending update operation on that node. If there is a pending update, first update the node and then proceed the same way as the earlier query function. As mentioned in the previous subsection, all the postponed updates that would affect this node will be performed before we reach it. Therefore, the sum value we look for will be correct.
168+
169+
```c++
170+
int query(int node, int start, int end, int l, int r) {
171+
// If the current interval [start, end] does not intersect with [l, r], return 0
172+
if (start > r || end < l) {
173+
return 0;
174+
}
175+
176+
// If there's a pending update on the current node, apply it
177+
if (lazy[node] != 0) {
178+
tree[node] += (end - start + 1) * lazy[node]; // Apply the pending update
179+
// If not a leaf node, propagate the lazy update to the children
180+
if (start != end) {
181+
lazy[2 * node] += lazy[node];
182+
lazy[2 * node + 1] += lazy[node];
183+
}
184+
lazy[node] = 0; // Clear the pending update
185+
}
186+
187+
// If the current interval [start, end] is completely within [l, r], return the value
188+
if (l <= start && end <= r) {
189+
return tree[node];
190+
}
191+
192+
// Otherwise, split the interval and query both halves
193+
int mid = (start + end) / 2;
194+
int p1 = query(2 * node, start, mid, l, r); // Query the left child
195+
int p2 = query(2 * node + 1, mid + 1, end, l, r); // Query the right child
196+
197+
// Combine the results from the left and right child nodes
198+
return (p1 + p2);
199+
}
200+
```
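
Putting the two functions together, a minimal self-contained sketch could look like the following. The `LazySumTree` struct, the `push` helper that factors out the repeated "apply pending update" block, and the `add`/`sum` wrappers are hypothetical names introduced here; the logic is meant to mirror the update and query functions above on a 1-indexed array that starts as all zeros.

```cpp
#include <cassert>
#include <vector>

// Hypothetical range-add / range-sum tree over positions 1..n, initially all zero.
struct LazySumTree {
    int n;
    std::vector<long long> tree, lazy;  // lazy[node] = addition not yet applied to node

    explicit LazySumTree(int size) : n(size) {
        assert(n > 0);  // the sketch assumes at least one position
        tree.resize(4 * n);
        lazy.resize(4 * n);
    }

    // Apply the pending addition at this node and hand it down to the children.
    void push(int node, int start, int end) {
        if (lazy[node] == 0) return;
        tree[node] += static_cast<long long>(end - start + 1) * lazy[node];
        if (start != end) {
            lazy[2 * node] += lazy[node];
            lazy[2 * node + 1] += lazy[node];
        }
        lazy[node] = 0;
    }

    void update(int node, int start, int end, int l, int r, long long val) {
        push(node, start, end);
        if (start > r || end < l) return;   // no overlap
        if (l <= start && end <= r) {       // fully covered: record and apply at once
            lazy[node] += val;
            push(node, start, end);
            return;
        }
        int mid = (start + end) / 2;
        update(2 * node, start, mid, l, r, val);
        update(2 * node + 1, mid + 1, end, l, r, val);
        tree[node] = tree[2 * node] + tree[2 * node + 1];
    }

    long long query(int node, int start, int end, int l, int r) {
        if (start > r || end < l) return 0; // no overlap
        push(node, start, end);
        if (l <= start && end <= r) return tree[node];
        int mid = (start + end) / 2;
        return query(2 * node, start, mid, l, r) +
               query(2 * node + 1, mid + 1, end, l, r);
    }

    void add(int l, int r, long long val) { update(1, 1, n, l, r, val); }
    long long sum(int l, int r) { return query(1, 1, n, l, r); }
};
```

For example, with `n = 8`, calling `add(2, 5, 3)` and then `add(5, 8, 2)` leaves position 5 holding $5$, so `sum(5, 5)` returns $5$ and `sum(1, 8)` returns $20$.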
201+
Notice that the only difference from the regular query function is pushing the lazy values downwards as we traverse. This is a widely used trick applicable to various problems, though not all range problems. You may notice that we leveraged properties of addition here. Because addition is associative and commutative, multiple pending updates can be merged into a single value in the lazy array without considering their order. This assumption is crucial for lazy propagation. Other necessary properties are left as an exercise to the reader.
202+
203+
## Binary Search on Segment Tree
204+
Assume we have an array A that contains elements between 1 and $M$. We have to perform 2 kinds of operations.
205+
- Change the value of the element at the given index $i$ to $x$.
206+
- Return the value of the $k^{th}$ element of the array when sorted.
207+
208+
### How to Solve It Naively
209+
Let’s construct a frequency array: $F[i]$ keeps how many times the number $i$ occurs in our original array. We want to find the smallest $i$ such that $\sum_{j=1}^{i} F[j] \geq k$. Then the number $i$ is the answer to the query. For updates we just have to change the $F$ array accordingly.
210+
211+
<figure markdown="span">
212+
![naive updates](img/naive_update.png){ width="100%" }
213+
<figcaption>A naive update example</figcaption>
214+
</figure>
215+
216+
This is the naive algorithm. Update is $\mathcal{O}(1)$ and query is $\mathcal{O}(M)$.
217+
218+
```c++
219+
void update(int i, int x) {
220+
F[A[i]]--;
221+
F[A[i] = x]++;
222+
}
223+
224+
int query(int k) {
225+
int sum = 0;
226+
// Iterate through the frequency array F to find the smallest value
227+
// for which the cumulative frequency is at least k
228+
for (int i = 1; i <= M; i++) {
229+
sum += F[i]; // Add the frequency of F[i] to the cumulative sum
230+
if (sum >= k) {
231+
return i;
232+
}
233+
}
return -1; // not reached when 1 <= k <= number of elements in A
234+
}
235+
```
236+
237+
### How to Solve It With Segment Tree
238+
This is, of course, slow. Let’s use a segment tree to improve it. First we construct a segment tree on the $F$ array. The segment tree performs single-element updates and range sum queries. We use binary search to find the corresponding $i$ for $k^{th}$ element queries.
239+
240+
<figure markdown = "span">
241+
![segment tree updates](img/updated_segtree.png){ width="100%" }
242+
<figcaption>Segment Tree After First Update</figcaption>
</figure>
243+
244+
```c++
245+
void update(int i, int x) {
246+
update(1, 1, M, A[i], --F[A[i]]); // Decrement frequency of old value
247+
A[i] = x; // Update A[i] to new value
248+
update(1, 1, M, A[i], ++F[A[i]]); // Increment frequency of new value
249+
}
250+
251+
int query(int k) {
252+
int l = 1, r = M; // Initialize binary search range
253+
while (l < r) {
254+
int mid = (l + r) / 2;
255+
if (query(1, 1, M, 1, mid) < k)
256+
l = mid + 1; // Move lower bound up
257+
else
258+
r = mid; // Move upper bound down
259+
}
260+
return l; // Return index where cumulative frequency is at least k
261+
}
262+
```
263+
264+
If you look at the code above you can notice that each update takes $\mathcal{O}(\log M)$ time and each query takes $\mathcal{O}(\log^{2} M)$ time, but we can do better.
265+
266+
### How To Speed Up?
267+
If you look at the segment tree solution in the preceding subsection, you can see that queries are performed in $\mathcal{O}(\log^{2} M)$ time. We can make it faster: we can reduce the time complexity to $\mathcal{O}(\log M)$, which is the same as the time complexity of updates. We do the binary search while traversing the segment tree. We start from the root and look at its left child’s sum value. If this value is greater than or equal to $k$, our answer is somewhere in the left child’s subtree. Otherwise it is somewhere in the right child’s subtree, and we continue there looking for the element of rank $k$ minus the left child’s sum. We follow a path using this rule until we reach a leaf, which is our answer. Since we traverse $\mathcal{O}(\log M)$ nodes (one node at each level), the time complexity is $\mathcal{O}(\log M)$. Look at the code below for better understanding.
268+
269+
<figure markdown = "span">
270+
![solution of first query](img/query_soln.png){ width="100%" }
271+
<figcaption>Solution of First Query</figcaption>
272+
</figure>
273+
274+
```c++
275+
void update(int i, int x) {
276+
update(1, 1, M, A[i], --F[A[i]]); // Decrement frequency of old value
277+
A[i] = x; // Update A[i] to new value
278+
update(1, 1, M, A[i], ++F[A[i]]); // Increment frequency of new value
279+
}
280+
281+
int query(int node, int start, int end, int k) {
282+
if (start == end) return start; // Leaf node, return the index
283+
int mid = (start + end) / 2;
284+
if (tree[2 * node] >= k)
285+
return query(2 * node, start, mid, k); // Search in left child
286+
return query(2 * node + 1, mid + 1, end, k - tree[2 * node]); // Search in right child
287+
}
288+
289+
int query(int k) {
290+
return query(1, 1, M, k); // Public interface for querying
291+
}
292+
```
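
For experimenting with the descent on its own, here is a compact sketch that keeps only the frequency tree. The `KthTree` struct and its `insert`, `erase`, and `kth` helpers are hypothetical, and the `-1` returned for an out-of-range `k` is an added guard that the code above does not have. Changing `A[i]` from an old value to `x` corresponds to `erase(old)` followed by `insert(x)`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical frequency tree over the values 1..M.
struct KthTree {
    int M;
    std::vector<int> tree;  // tree[node] = how many stored values fall in the node's value range

    explicit KthTree(int maxValue) : M(maxValue) {
        assert(M > 0);  // the sketch assumes at least one possible value
        tree.resize(4 * M);
    }

    // Add delta to the frequency of `value`, updating every node on the root-to-leaf path.
    void add(int node, int start, int end, int value, int delta) {
        tree[node] += delta;
        if (start == end) return;
        int mid = (start + end) / 2;
        if (value <= mid) add(2 * node, start, mid, value, delta);
        else add(2 * node + 1, mid + 1, end, value, delta);
    }

    // Walk down one path: go left if the left subtree holds at least k values,
    // otherwise skip those values and go right.
    int kth(int node, int start, int end, int k) const {
        if (start == end) return start;
        int mid = (start + end) / 2;
        if (tree[2 * node] >= k) return kth(2 * node, start, mid, k);
        return kth(2 * node + 1, mid + 1, end, k - tree[2 * node]);
    }

    void insert(int value) { add(1, 1, M, value, +1); }
    void erase(int value) { add(1, 1, M, value, -1); }

    // k-th smallest stored value, or -1 if k is out of range (assumed guard).
    int kth(int k) const {
        if (k < 1 || k > tree[1]) return -1;
        return kth(1, 1, M, k);
    }
};
```

Each call to `kth` visits one node per level, which is where the $\mathcal{O}(\log M)$ bound comes from.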
