When Croissant metadata is fetched from a URL (done [here](https://github.com/mlcommons/croissant/blob/main/python/mlcroissant/mlcroissant/_src/structure_graph/nodes/metadata.py#L427-L430)), we should set a `User-Agent` header that identifies the package, so that servers can attribute the traffic to `mlcroissant`. For reference, `kagglehub` does something similar [here](https://github.com/Kaggle/kagglehub/blob/main/src/kagglehub/clients.py#L61-L83).
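
As a rough illustration of the idea, here is a minimal sketch using only the standard library. The helper names `build_user_agent` and `build_request` are hypothetical and not part of `mlcroissant`, the `name/version python/version` format is just one possible convention, and the actual fetch code may use a different HTTP client, in which case the same header would be passed through that client's own mechanism.

```python
import platform
import urllib.request
from importlib import metadata


def build_user_agent(package: str = "mlcroissant") -> str:
    """Build an identifying User-Agent string (hypothetical helper)."""
    try:
        # Look up the installed version of the package, if it is installed.
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # Fall back to a placeholder when the package is not installed.
        version = "unknown"
    return f"{package}/{version} python/{platform.python_version()}"


def build_request(url: str) -> urllib.request.Request:
    """Create a request object carrying the User-Agent header (hypothetical helper).

    Nothing is sent over the network until the request is passed to
    urllib.request.urlopen.
    """
    return urllib.request.Request(url, headers={"User-Agent": build_user_agent()})
```

The request would then be sent with `urllib.request.urlopen(build_request(url))`. Keeping the string construction in one helper means the same value can be reused wherever the package makes outbound requests.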