RAKTA_ROPAN/Note.txt at master · ayushgithub/RAKTA_ROPAN · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
The Nearest Neighbour of a query point in R-tree is described below -
The metrics for finding a NN are -
1 - Minimum Distance(MINDIST)
It is basically the minimum distance from a query point to a MBR. The property of this distance is it is the lower bound to the distance of any object from the query point in the MBR
2 - Minimax Distance(MINMAXDIST)
In order to avoid visiting unnecessary MBRs, we should have an upper bound of the NN distance to any object inside an MBR. This will allow us to prune MBRs that have MINDIST higher than this upper bound. MINMAXDIST exploits the fact that every face of a MBR will have a object. It is quite intutive as if a MBR doesnot have a object on its surface we can compress that surface to a extent so as a point comes on that surface without losing any another point. MINMAXDIST computes the minimum value of all the maximum distances between the query point and points on the each of the n axes respectively. It guarantees there is an object within the MBR at a distance less than or equal to MINMAXDIST.

So we now have two distances between which a object point is guaranteed.
Now there are three pruning heuristics which will help us in pruning unnecessary nodes
1 - an MBR M with MINDIST(P,M) greater than the MINMAXDIST(P,M') of another MBR M' is discarded because it cannot contain the NN. We use this in downward pruning.
2 - an actual distance from P to a given object O which is greater than the MINMAXDIST(P,M) for an MBR M can be discarded because M contains a object O' which is nearer to P. This is also used in upward pruning.
3 - every MBR M with MINDIST(P,M) greater than the actual distance from P to a given object O is discarded because it cannot enclose an object nearer than O. We use this in upward pruning.

The algorithm presented here implements an ordered
depth First traversal. It begins with the R-tree root node and proceeds down the tree. Originally, our guess for the nearest neighbor distance (call it Nearest) is infinity. During the descending phase, at each newly visited nonleaf node, the algorithm computes the ordering metric bounds (e.g. MINDIST, Definition 2) for all its MBRs and sorts them (associated with their corresponding node) into an Active Branch List (ABL). We then apply pruning strategies 1 and 2 to the ABL to remove unnecessary branches. The algorithm iterates on this ABL until the ABL is empty: For each iteration, the algorithm selects the next branch in the list and applies itself recursively to the node corresponding to the MBR of this branch. At a leaf node (DB objects level), the algorithm calls a type specifyc distance function for each object and selects the smaller distance between current value of N earest and each computed value and updates N earest appropriately. At the return from the recursion, we take this new estimate of the NN and apply pruning strategy 3 to remove all branches with M INDIST (P;M) > N earest for all MBRs M in the ABL.

But the above algorithm will not work in our case. As every possible object can be a skyline with some probability, So pruning nodes from R-tree is not desired. We modified the algorithm so as to remove pruning in any case possible. This gave us a list which is ordered by the distance from the query point. Now after applying the PRNN2D algorithm we get our desired result. But the problem is we have computed the distance of every object from the query point. Suppose our database is very huge and the number of desired nearest neighbour are very less compared to the database size this computation is useless. So this can be improved further. So we used the 1999 approach which gives me the nearest neighbour incrementally which will drastically improve the performance of our system. The algorithm's psuedo code is given below