Filing a public issue because this has died in private channels and threads many times.
Part of the work towards ipshipyard/roadmaps#9 and ipshipyard/roadmaps#15
This also helps with webseeds from ipshipyard/roadmaps#19
Need
We need to agree on and write down a specification for how HTTP-only Trustless Gateway providers can be announced on the existing Amino DHT.
Such a provider can have a synthetic PeerID for interop with routing systems and software, but in reality it won't have any libp2p networking stack and will only expose an HTTP endpoint that follows the Trustless Gateway spec.
Wider context
- We already have large providers that announce an HTTP-only "PeerID" provider (e.g. Storacha announces a special peer with a `/tls/http` multiaddr; right now it is only announced to IPNI).
- We already have experimental HTTP-only retrieval in `boxo`/`rainbow` that:
  - Detects `/tls/http` multiaddrs.
  - Performs content-type negotiation / a probe to confirm that the HTTP endpoint supports the trustless gateway protocol.
  - Performs HTTP-only retrieval from such a provider.
    - The PeerID is not used for anything.
    - There is no auth when a client learns about this provider from a delegated routing system that proxies to IPNI and DHT, nor does IPNI run any validation checks when it accepts such an announcement.
- We want to allow people to self-host over HTTP and turn off Bitswap.
  - Stability and cost reduction thanks to HTTP caching and free HTTP CDNs make self-hosting possible on cheap hardware and with narrow bandwidth.
- We can't deploy protocol changes to the DHT without waiting 6-24+ months for a significant % of DHT server nodes to update.
  - Annual reminder that no spec for IPFS behaviors on Amino DHT exists, and if we want to make changes, we need to fill this gap first: IPFS DHT Specification is missing #345
- Most people who self-host run Kubo or IPFS Cluster backed by a fleet of Kubo nodes that announce to Amino DHT (and some run a sidecar that also announces to IPNI).
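The detection step described for HTTP-only retrieval can be illustrated with a minimal sketch. It assumes the simplest multiaddr shape, `/<dns*|ip*>/<host>/tcp/<port>/tls/http`, and uses plain string handling instead of a multiaddr library. The function name `https_url_from_multiaddr` is hypothetical and is not taken from `boxo` or `rainbow`; other valid shapes (for example ones carrying an SNI component) are deliberately not handled here.

```python
from typing import Optional

# Address families this sketch accepts (an assumption, not an exhaustive list).
HOST_PROTOCOLS = ("dns", "dns4", "dns6", "ip4", "ip6")


def https_url_from_multiaddr(maddr: str) -> Optional[str]:
    """Return an https:// base URL if maddr ends with /tls/http, else None."""
    parts = [p for p in maddr.split("/") if p]
    # Expected components: <host-proto> <host> tcp <port> tls http
    if len(parts) != 6:
        return None
    host_proto, host, transport, port, security, app = parts
    if host_proto not in HOST_PROTOCOLS:
        return None
    if (transport, security, app) != ("tcp", "tls", "http"):
        return None
    if not port.isdigit():
        return None
    if host_proto == "ip6":
        host = f"[{host}]"  # IPv6 literals are bracketed in URLs
    suffix = "" if port == "443" else f":{port}"
    return f"https://{host}{suffix}"
```

A provider record whose addresses all map to `None` would be treated as a regular libp2p peer; one that yields a URL becomes a candidate for the HTTP probe.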
North star
- Use this as an opportunity to fill the gap described in IPFS DHT Specification is missing #345
- We want to leverage existing DHT clients and servers, including third-party software.
- Leverage the PoC that works with Storacha, and generalize it for Amino DHT and private swarms.
- Make it possible for existing users to run a Kubo-based node/cluster that only exposes a non-recursive Trustless Gateway over HTTPS and has the Bitswap server shut down.
Proposed spec direction
- Use `/tls/http` in the announced Multiaddr as the signaling method.
  - Document that libp2p peers should follow the libp2p+http spec for coexistence of multiple services on the same HTTP endpoint.
    - For libp2p-specific ones, follow https://specs.ipfs.tech/http-gateways/libp2p-gateway/#well-known-libp2p-protocols to discover libp2p protocols supported by the HTTP endpoint → libp2p+http: clarify peerid auth optionality and gw type detection #495
    - For HTTP-only retrieval, follow plain HTTP content type negotiation (`Accept` header) and the probing convention documented in the trustless-gateway spec.
      - This keeps the system open-ended: HTTP-based transfer protocols can be added the same way; all they need to do is use a new content type in the `Accept`/`Content-Type` header.
  - Document that HTTP-only Trustless Gateway clients can ignore PeerID → libp2p+http: clarify peerid auth optionality and gw type detection #495
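The content type negotiation in this direction can be sketched as follows. This is an illustration under stated assumptions, not the probing convention itself: it assumes a raw-block request (`?format=raw` with `Accept: application/vnd.ipld.raw`) and treats a `200` response with a matching `Content-Type` as a positive signal. The helper names `build_probe` and `looks_like_trustless_gateway` are hypothetical, and the probe CID is left as a parameter because the exact convention belongs to the trustless-gateway spec.

```python
import urllib.request

# Media type for raw blocks served by trustless gateways.
RAW_BLOCK = "application/vnd.ipld.raw"


def build_probe(base_url: str, cid: str) -> urllib.request.Request:
    """Build (but do not send) a trustless raw-block request for cid."""
    return urllib.request.Request(
        f"{base_url}/ipfs/{cid}?format=raw",
        headers={"Accept": RAW_BLOCK},
        method="GET",
    )


def looks_like_trustless_gateway(status: int, content_type: str) -> bool:
    """Accept only a 200 response whose media type matches what was requested."""
    media_type = content_type.split(";")[0].strip().lower()
    return status == 200 and media_type == RAW_BLOCK
```

A new HTTP-based transfer protocol would slot into the same shape by swapping the media type constant, which is the open-endedness the list above refers to.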
Open questions
- Any concerns with using `/tls/http` in announced Multiaddrs for this?
  - Example: is it OK for a Kubo user to put their `Gateway.NoFetch=true` & `Gateway.DeserializedResponses` gateway behind Cloudflare and add its URL to `Addresses.AppendAnnounce` as `/dns4/gw.example.com/tcp/443/tls`, as an extra hint for clients that prefer HTTP-only retrieval?
  - AFAIK we have all the specs necessary for reliable interop (see the spec direction above), but comment below if any spec gaps exist.
- If content providers start announcing their trustless, non-recursive gateways as `/dnsX/example.com/../tls/http`, do we need any auth of DNS names?
  - Should DHT nodes accept and gossip `/dnsX/example.com/../tls/http` addrs blindly, or should extra validation be added at the routing system level (DHT, IPNI)?
  - Following the prior art of ACME challenges, we need to think about HTTPS on raw IPs, or other setups without access to DNS. We could require a signed PeerID published in a DNS TXT record or as an HTTP GET file on the `.well-known/libp2p/signed-peerid` path, but this complicates deployments and requires people to deal with PeerIDs. At the end of the day, how is this auth request any better than an HTTP client blindly sending a trustless probe `GET /ipfs/cid` and getting a 404 indicating that the endpoint is not a valid gateway?
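For readers less familiar with Kubo, the configuration in the first question could look roughly like the sketch below, written as a Python dict that serializes to the JSON shape of a Kubo config. Only the key names and the example multiaddr come from the question; setting `DeserializedResponses` to `False` is an assumption about what a trustless-only gateway would want, and the variable name `kubo_config_overrides` is hypothetical.

```python
import json

# Illustrative overrides only; not a complete Kubo configuration.
kubo_config_overrides = {
    "Gateway": {
        "NoFetch": True,  # non-recursive: serve only data the node already has
        "DeserializedResponses": False,  # assumption: trustless responses only
    },
    "Addresses": {
        # Extra address announced next to the node's regular ones,
        # copied verbatim from the question above.
        "AppendAnnounce": ["/dns4/gw.example.com/tcp/443/tls"],
    },
}

print(json.dumps(kubo_config_overrides, indent=2))
```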
As usual, ideas and feedback are welcome.