SSL certificate verification failure when loading HFDataset locally #1294

@ElenaKhaustova

Description

When loading HFDataset using datasets.load_dataset() (via HFDataset.load()), SSL verification fails in some local environments:

httpx.ConnectError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate in certificate chain

This issue no longer affects CI now that the doctests are skipped (see #1293), but it remains relevant for local developer experience and documentation.

Context

This occurs on macOS with Python 3.11, 3.12, and 3.13. The issue is independent of Kedro and reproduces when calling datasets.load_dataset() directly.
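
Because the failure reproduces outside Kedro, it can help to look at the TLS trust configuration directly. The following is a minimal diagnostic sketch using only the Python standard library. The helper names `tls_trust_report` and `can_verify` are hypothetical and not part of `datasets`, `huggingface_hub`, or Kedro. The standard-library trust store may also differ from the CA bundle the HTTP client uses, so treat the result as a rough indication only.

```python
import os
import socket
import ssl


def tls_trust_report():
    """Collect the CA-related settings visible to this Python process."""
    paths = ssl.get_default_verify_paths()
    return {
        # Environment overrides, if any are set.
        "SSL_CERT_FILE": os.environ.get("SSL_CERT_FILE"),
        "SSL_CERT_DIR": os.environ.get("SSL_CERT_DIR"),
        # Locations OpenSSL was compiled to look in by default.
        "openssl_cafile": paths.openssl_cafile,
        "openssl_capath": paths.openssl_capath,
    }


def can_verify(host, port=443, cafile=None, timeout=10.0):
    """Attempt a verified TLS handshake and report whether it succeeded.

    `cafile` is an optional path to a PEM bundle; when omitted, the
    interpreter's default trust store is used.
    """
    context = ssl.create_default_context(cafile=cafile)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True, "certificate chain verified"
    except ssl.SSLCertVerificationError as exc:
        return False, f"verification failed: {exc.verify_message}"
    except OSError as exc:
        return False, f"connection error: {exc}"
```

For example, calling `can_verify("huggingface.co")` with the VPN on and then off would show whether the handshake itself fails at the certificate-verification step, independently of `datasets`.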

Steps to Reproduce

  1. pip install "kedro-datasets[huggingface-hfdataset]"
  2. Run:

     from kedro_datasets.huggingface import HFDataset

     dataset_name = "openai_humaneval"
     dataset = HFDataset(dataset_name=dataset_name)
     ds = dataset.load()

  3. Or run without the Kedro dataset:

     from datasets import load_dataset

     ds = load_dataset(
         "openai_humaneval",
     )

     print("Keys in dataset:", ds.keys())
     print("Train examples:", len(ds["train"]))

Notes

  • Not caused by Kedro implementation
  • Not fixed by downgrading datasets or huggingface_hub
  • Loading succeeded a few times with the VPN turned off
  • Likely requires explicit CA trust configuration, or documentation explaining how to set it up
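
One possible shape for the explicit CA trust configuration mentioned above is sketched below. It assumes the installed HTTP client honors the `SSL_CERT_FILE` environment variable (and `REQUESTS_CA_BUNDLE` for `requests`-based versions of `huggingface_hub`). That should be checked against the versions in use. The helper `trust_ca_bundle` and the bundle path in the usage comment are hypothetical.

```python
import os
from pathlib import Path

# Environment variables commonly read by Python HTTP stacks to find a CA bundle.
# Assumption: the installed httpx / requests versions honor these variables.
CA_ENV_VARS = ("SSL_CERT_FILE", "REQUESTS_CA_BUNDLE")


def trust_ca_bundle(bundle_path, environ=os.environ):
    """Point the CA-bundle environment variables at `bundle_path`.

    Must be called before the first HTTPS request is made, because clients
    typically build their SSL context once. Returns the resolved path.
    """
    path = Path(bundle_path).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"CA bundle not found: {path}")
    for name in CA_ENV_VARS:
        environ[name] = str(path)
    return str(path)


# Usage sketch (hypothetical path to a PEM file that includes the VPN/proxy CA):
#
#   trust_ca_bundle("~/certs/corporate-ca.pem")
#   from datasets import load_dataset
#   ds = load_dataset("openai_humaneval")
```

Setting the variables in the shell before starting Python would have the same effect. The helper only keeps the configuration next to the code that needs it.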
