-
Notifications
You must be signed in to change notification settings - Fork 0
Description
Cristian raised the following difference attack:
I was under the impression this attack was already described in another discussion. But to be more explicit:
We have 2 tables: purchases, with columns product_id and client_id, and clients, with columns id and ssn.
Column ssn is labeled as the AID. Assume the following data is present:
Clients:
| id | ssn |
|---|---|
| 1 | a |
| 2 | b |
| 3 | c |
| 4 | d |
| 5 | e |
| 6 | f |
Purchases:
| product_id | client_id |
|---|---|
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 4 |
| 5 | 5 |
| 1 | 6 |
| 2 | 6 |
| 3 | 6 |
| 4 | 6 |
| 5 | 6 |
If we issue a query like:
select count(*) from purchases join clients on client_id = clients.idwe get the following input to the count aggregator:
| count | ssn |
|---|---|
| 1 | a |
| 1 | b |
| 1 | c |
| 1 | d |
| 1 | e |
| 5 | f |
If Ne=1 and Nt = 2, this results in a flattening of 4. The output will be 10 - 4 + noise(1) = 6 + noise(1).
For the query:
select count(*) from purchases left join clients on client_id = clients.id and ssn <> 'f'we get the following input to the count aggregator:
| count | ssn |
|---|---|
| 1 | a |
| 1 | b |
| 1 | c |
| 1 | d |
| 1 | e |
| 5 | NULL |
The last row is put aside as unaccounted.
If Ne=1 and Nt = 2, this results in a flattening of 0 for accounted rows. The output will be 5 - 0 + 5 - 0 + noise(1) = 10 + noise(1).
The difference in the results is large enough to dominate the noise, allowing us to know the entity with ssn = 'f' is an outlier and to detect other attributes of it by conditional inclusion/exclusion in queries.