Memory footprint of sorting a medium-size list with M >= 1 #54
Replies: 6 comments
-
What you are observing is indeed expected behavior. You can also see this for the demo mpyc/demos/sort.py if you run it for a larger list with -M1 or more parties. The built-in sorting routine behaves the same way. To lower the memory footprint, reducing the size of the secure integers should help a bit (the default is L=32). There will also be a vectorized version using secure NumPy arrays for sorting, which will reduce the memory footprint considerably, achieving the same effect as already done for other vectorized operations.
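For instance, a smaller bit length can be selected when creating the secure integer type; a minimal sketch (the choice of 16 bits and the list contents are illustrative):

from mpyc.runtime import mpc

secint16 = mpc.SecInt(16)   # secure integers with 16-bit values instead of the default 32
x = [secint16(i) for i in range(1000, 0, -1)]   # 1000 secret-shared 16-bit integers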
-
Thank you, I understand the phenomenon better now. I have a follow-up question: is there a way to control the expansion of the circuit? I envision a couple of possibilities; these solutions create a memory-footprint/round-complexity trade-off, which is necessary for large lists.
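For instance, one pattern along these lines, as a rough sketch (assuming mpc.sorted() and mpc.barrier() from the MPyC runtime; the batching is illustrative):

from mpyc.runtime import mpc

async def sort_in_batches(lists):
    # Sort several lists one after another instead of scheduling everything at once.
    results = []
    for x in lists:
        results.append(mpc.sorted(x))
        await mpc.barrier()   # synchronization point: wait before scheduling the next sort's circuit
    return [await mpc.output(y) for y in results]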
-
I've done a quick experiment already with secure NumPy arrays, and then sorting 1000 numbers with 3 parties takes about 10 seconds. The circuit size also remains much smaller, linear in the depth of the sorting network, so it is much less of an issue.
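A rough sketch of such an experiment, assuming the vectorized sort becomes available as mpc.np_sort and secure integer types get an array counterpart (the names and sizes here are assumptions, not a released API):

import numpy as np
from mpyc.runtime import mpc

async def main():
    await mpc.start()
    secint = mpc.SecInt(32)
    x = secint.array(np.random.randint(0, 2**16, size=1000))   # secret-shared array of 1000 values
    y = await mpc.output(mpc.np_sort(x))                       # vectorized Batcher sort
    print(y[:10])
    await mpc.shutdown()

mpc.run(main())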
-
Thanks, these functions should do the job. Wow, this is a nice speed-up. Did you just parallelize the Batcher sort, or did you also switch to a quicksort or a radix sort?
-
It's a vectorized version of Batcher's merge-exchange sort algorithm, the same algorithm used by the built-in sorting routine:

def np_sort(self, a, axis=-1):
    if axis is None:
        a = self.np_flatten(a)
        axis = 0
    n = a.shape[axis]
    if a.size == 0 or n <= 1:
        return a
    a = self.np_swapaxes(a, axis, -1)  # switch to last axis
    t = (n-1).bit_length()  # n >= 2, so t >= 1
    p = 1 << t-1
    while p:  # passes of Batcher's merge-exchange network
        d, q, r = p, 1 << t-1, 0
        while d:
            # indices of the lower halves of this round's compare-exchange pairs
            I = np.fromiter((i for i in range(n - d) if i & p == r), dtype=int)
            b0 = a[..., I]
            b1 = a[..., I + d]
            h = (b0 >= b1) * (b0 - b1)  # oblivious swap amount: 0 if already in order
            b0, b1 = b0 - h, b1 + h
            a = self.np_update(a, (..., I), b0)
            a = self.np_update(a, (..., I + d), b1)
            d, q, r = q - p, q >> 1, p
        p >>= 1
    a = self.np_swapaxes(a, axis, -1)  # restore original axis
    return a
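For comparison, here is a plain-NumPy rendering of the same merge-exchange network without any secret sharing; this is only an illustrative sanity check, not MPyC code:

import numpy as np

def batcher_sort_plain(a):
    # Same compare-exchange schedule as np_sort above, but on public values.
    a = np.asarray(a).copy()
    n = len(a)
    if n <= 1:
        return a
    t = (n - 1).bit_length()
    p = 1 << t - 1
    while p:
        d, q, r = p, 1 << t - 1, 0
        while d:
            I = np.fromiter((i for i in range(n - d) if i & p == r), dtype=int)
            b0, b1 = a[I], a[I + d]
            h = (b0 >= b1) * (b0 - b1)   # swap amount: 0 if the pair is already ordered
            a[I], a[I + d] = b0 - h, b1 + h
            d, q, r = q - p, q >> 1, p
        p >>= 1
    return a

print(batcher_sort_plain(np.random.randint(0, 100, size=17)))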
-
Looks nice! Looking forward to the next release!
-
Hello,
I've run some experiments where I sort "medium-size" secure lists (i.e., around 1K elements). For example, see the following snippet:
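A minimal sketch along these lines (the reverse-ordered values, the size of 1000, and the call to mpc.sorted() are illustrative assumptions):

from mpyc.runtime import mpc

async def main():
    await mpc.start()
    secint = mpc.SecInt()                          # default bit length l=32
    x = [secint(i) for i in range(1000, 0, -1)]    # ~1K secret-shared integers, reverse order
    y = await mpc.output(mpc.sorted(x))            # built-in sort
    print(y[:10])
    await mpc.shutdown()

mpc.run(main())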
If I execute my script with no argument or with -M0, the script finishes without any issue. If I use -M3 or even -M1, the Python process takes more and more memory. Sometimes it even fills my memory entirely for larger lists or larger M. On the other hand, with -M0, the memory footprint is constant (and much smaller).
I suppose this behaviour comes from the secret-sharing costs, but I wonder whether I can optimize it to reduce the memory footprint. For example, I see several nested loops in the _sort() function and wonder whether the coroutine stores the "comparison shares" from all comparison rounds until the end instead of releasing some of them round by round.
To sum up, is there a simple trick I missed in the documentation or the code to optimize the memory footprint in these cases?