Skip to content

SangzorDeGeit/MPQP

Repository files navigation

MPQP

Name

MPQP (Multi Party Query Planner) is a program that makes a query plan for queries that involve multiple parties. It will utilize rewrite rules to restructure the relational algebra tree to optimize the query for multiple parties.

Requirements

To run the program you need (at least) the following:

  • rust (cargo to build and run)
  • sqlite3
  • graphviz (for .dot files)

Usage

You can run the executable with the -h (--help) flag to get input arguments. You need to apply database (.db) files with the -d (--database) flag. The easiest way to interact with the program is to use the -f (--file) option and provide a file with a complete SQL query. The query supports the following keywords:

  • SELECT, FROM, JOIN ON, WHERE, GROUP BY, AS
  • MIN, MAX, COUNT, AVG, SUM
  • AND, OR, >, <, =, !=, >= and <=
    In the query each table should be defined as <database>.<table> where database should be equal to some given database file name (with -d) without extension.

Example: when you input -d sample_databases/chinook.db you can use the table chinook.<table in chinook>
You also need to give the program an output file using the -o flag, MPQP will create this file and a <filename>_original_tree.dot file in the same directory.

You can inspect the outputted dotfiles using the command:

dot -Tsvg <output_file> > <graphname>.svg  

Rewrite join implementation

To illustrate the implementation of a join rewrite an example is given.
Say we have the following tree with a complex predicate over all diamond tables
The initial tree, all diamonds have the same owner and should be joined first
For each diamond we list the parent joins from low to high:
diamond1 = 2 1
diamond2 = 4 2 1
diamond3 = 5 3 1
diamond4 = 6 3 1
dimaond5 = 6 3 1

First, if we find two or more arrays that identical we create a new array that is this same array but with the first index removed (call it diamond6)
Then we remove the duplicates that this array is based on.
We loop this process until no two arrays are the same.
diamond1 = 2 1
diamond2 = 4 2 1
diamond3 = 5 3 1
diamond6 = 3 1

Now all diamonds are equal to a pointer to a certain relation, we create a new (cluster of) join from these diamonds, we call this 'J':
The newly created join tree (cluster)
Now we need to update the pointers to all diamonds from the original tree. From the following list we pick the pointer to the diamond with the longest array (if there are more then two with the same length we pick the first of the two):
diamond1 = 2 1
diamond2 = 4 2 1
diamond3 = 5 3 1
diamond6 = 3 1
In this case diamond2, we update this pointer to point to the newly created join (cluster) and the tree looks like this:
Diamond two has become the newly created join relation, all diamonds in this tree could be seen as 'floating pointers'
We compare the first array that is not diamond2 with the array of diamond2
(diamond1: 2 1)
Take its first index and check if it is in the diamond2 array:
yes: push all predicates contained in this join (join2) to join that comes before two in the diamond2 array (so join4), then remove this join from the tree and its array from the list. Update the tree and all arrays accordingly (so remove element "2" from all arrays) remove 2 from the tree.
diamond2: 4 1
diamond3: 5 3 1
diamond6: 3 1
The tree now looks like this:
Tree after first restructure operation
again remove the duplicate arrays and create a new one with the first element removed
We compare the first array that is not diamond2 with the array of diamond2
(diamond3: 5 3 1)
Take its first index and check if it is in the diamond2 array:
no: find the first element in each array that is the same take the index before that. Set the pointer that pointed to diamond3 to 4 (element from diamond2 array).
Push all predicates from '1' to 3 (element from diamond3 array). and remove 1 from all arrays. Push the leftover elements from diamond3 array after 4, remove 1 from the tree.
diamond2: 4 5 3
diamond6: 3
The tree now looks like this:
Tree after second restructure operation
again remove the duplicate arrays and create a new one with the first element removed
We compare the first array that is not diamond2 with the array of diamond2
(diamond6: 3)
Take its first index and check if it is in the diamond2 array:
yes: push all predicates contained in this join (join3) to join that comes before two in the diamond2 array (so join5), then remove this join from the tree and its array from the list. Update the tree and all arrays accordingly (so remove element "3" from all arrays) remove 3 from the tree.
The final tree looks like this:
Final tree after complete rewrite of the predicate
Even though the new tree looks like it has less join operations, there are more predicates per join.

About

Multi Party Query Planner. A query planner for SQL for collaborative analytics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors