Private model training (DP-SGD) with sparse features #1370
Description
Hello,
Private model training was recently mentioned here. One of the privacy considerations is incorporating DP into the training loop via DP-SGD.
There are cases where DP-SGD makes training considerably slower because it destroys the sparsity of the gradients computed during backprop, making it impossible to use optimization techniques that rely on that sparsity. This typically happens when some features are categorical, or when working with embedding tables as in LLMs. I am aware there is research aimed at remedying this, but it is not clear from the explainer linked above whether it has been considered in the context of the Protected Audience API.
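To make the concern concrete, here is a minimal sketch (using NumPy, with made-up shapes and hyperparameters) of why DP-SGD densifies an embedding-table gradient: only a few rows are non-zero before clipping, but adding isotropic Gaussian noise to every coordinate leaves essentially no zeros afterwards.

```python
import numpy as np

# Hypothetical example: gradient of an embedding table after one training
# step. Only the rows for the categorical IDs seen in this example are
# non-zero, so the gradient is sparse.
vocab_size, dim = 10, 4
grad = np.zeros((vocab_size, dim))
grad[[2, 7]] = 1.0  # only two embedding rows were touched

sparsity_before = np.mean(grad == 0)

# DP-SGD step (illustrative values): clip the per-example gradient to a
# fixed L2 norm, then add Gaussian noise to EVERY coordinate. The noise
# is dense, so the resulting update is dense too.
clip_norm, noise_multiplier = 1.0, 1.1
grad = grad * min(1.0, clip_norm / np.linalg.norm(grad))
rng = np.random.default_rng(0)
noisy_grad = grad + rng.normal(0.0, noise_multiplier * clip_norm, grad.shape)

sparsity_after = np.mean(noisy_grad == 0)
print(sparsity_before, sparsity_after)  # 0.8 before, 0.0 after
```

In a real workload the table has millions of rows, so losing the ability to apply a sparse (rows-only) update turns a cheap scatter into a full dense write every step.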
Are there any techniques under consideration to address this, or any thoughts on the topic?
Thanks