| 1 | +# IPIP-337: Delegated Content Routing HTTP API |
| 2 | + |
| 3 | +- Start Date: 2022-10-18 |
| 4 | +- Related Issues: |
| 5 | + - https://github.com/ipfs/specs/pull/337 |
| 6 | + |
| 7 | +## Summary |
| 8 | + |
| 9 | +This IPIP specifies an HTTP API for delegated content routing. |
| 10 | + |
| 11 | +## Motivation |
| 12 | + |
| 13 | +Idiomatic and first-class HTTP support for delegated routing is an important requirement for large content routing providers, |
| 14 | +and supporting large content providers is a key strategy for driving down IPFS content routing latency. |
| 15 | +These providers must handle high volumes of traffic and support many users, so leveraging industry-standard tools and services |
| 16 | +such as HTTP load balancers, CDNs, reverse proxies, etc. is a requirement. |
| 17 | +To maximize compatibility with standard tools, IPFS needs an HTTP API specification that uses standard HTTP idioms and payload encoding. |
| 18 | +The [Reframe spec](https://github.com/ipfs/specs/blob/main/reframe/REFRAME_PROTOCOL.md) for delegated content routing is an experimental attempt at this, |
| 19 | +but it has resulted in a very unidiomatic HTTP API which is difficult to implement and is incompatible with many existing tools. |
| 20 | +The cost of properly redesigning Reframe, and then implementing and maintaining the result, is too high relative to the urgency of having a delegated content routing HTTP API. |
| 21 | + |
| 22 | +Note that this neither supplants nor deprecates Reframe. Ideally in the future, Reframe and its implementation would receive the resources needed to map the IDL to idiomatic HTTP, |
| 23 | +and implementations of this spec could then be rewritten in the IDL, maintaining backwards compatibility. |
| 24 | + |
| 25 | +We expect this API to be extended beyond "content routing" in the future, so additional IPIPs may rename this to something more general such as "Delegated Routing HTTP API". |
| 26 | + |
| 27 | +## Detailed design |
| 28 | + |
| 29 | +See the [Delegated Content Routing HTTP API spec](../routing/DELEGATED_CONTENT_ROUTING_HTTP.md) included with this IPIP. |
| 30 | + |
| 31 | +## Design rationale |
| 32 | + |
| 33 | +To understand the design rationale, it is important to consider the concrete Reframe limitations that we know about: |
| 34 | + |
| 35 | +- Reframe [method types](../reframe/REFRAME_KNOWN_METHODS.md) using the HTTP transport are encoded inside IPLD-encoded messages |
| 36 | + - This prevents URL-based pattern matching on methods, which makes it hard and expensive to do basic HTTP scaling and optimizations: |
| 37 | + - Configuring different caching strategies for different methods |
| 38 | + - Configuring reverse proxies on a per-method basis |
| 39 | + - Routing methods to specific backends |
| 40 | + - Method-specific reverse proxy config such as timeouts |
| 41 | + - Developer UX is poor as a result, e.g. for CDN caching you must encode the entire request message and pass it as a query parameter |
| 42 | + - This was initially done by URL-escaping the raw bytes |
| 43 | + - Not possible to consume correctly using standard JavaScript (see [edelweiss#61](https://github.com/ipld/edelweiss/issues/61)) |
| 44 | + - Shipped in Kubo 0.16 |
| 45 | + - Packing a CID into a struct, encoding it with DAG-CBOR, multibase-encoding that, percent-encoding that, and then passing it in a URL, rather than merely passing the CID in the URL, is needlessly complex from a user's perspective, and has already made it difficult to manually construct requests or interpret logs |
| 46 | + - Added complexity of "Cacheable" methods supporting both POSTs and GETs |
| 47 | +- The required streaming support and message groups add a lot of implementation complexity, but streaming does not currently work for cacheable methods sent over HTTP |
| 48 | + - For example, the FindProviders response is buffered anyway for ETag calculation |
| 49 | + - There are no limits on response sizes nor ways to impose limits and paginate |
| 50 | + - Streaming is useful for routers that have highly variable resolution time, because it lets them send results as soon as possible, but this is not a use case we are focusing on right now and we can add it later |
| 51 | +- The Identify method is not implemented because it is not currently useful |
| 52 | + - This is because Reframe's ambition is to be a generic catch-all bag of methods across protocols, while the delegated routing use case only requires a subset of its methods. |
| 53 | +- Client and server implementations are difficult to write correctly, because of the non-standard wire formats and conventions |
| 54 | + - Example: [bug reported by implementer](https://github.com/ipld/edelweiss/issues/62), and [another one](https://github.com/ipld/edelweiss/issues/61) |
| 55 | +- The Go implementation is [complex](https://github.com/ipfs/go-delegated-routing/blob/main/gen/proto/proto_edelweiss.go) and [brittle](https://github.com/ipfs/go-delegated-routing/blame/main/client/provide.go#L51-L100), and is currently maintained by IPFS Stewards who are already over-committed with other priorities |
| 56 | +- Only the HTTP transport has been designed and implemented, so it's unclear if the existing design will work for other transports, and what their use cases and requirements are |
| 57 | + - This means Reframe can't be trusted to be transport-agnostic until there is at least a second transport implemented (e.g. as a reframe-over-libp2p protocol) |
| 58 | +- There's naming confusion around "Reframe, the protocol" and "Reframe, the set of methods" |
| 59 | + |
| 60 | +So this API proposal makes the following changes: |
| 61 | + |
| 62 | +- The Delegated Content Routing API is defined using HTTP semantics, and can be implemented without introducing Reframe concepts or IPLD |
| 63 | +- There is a clear distinction between the RPC protocol (HTTP) and the API (Delegated Content Routing) |
| 64 | +- "Method names" and cache-relevant parameters are pushed into the URL path |
| 65 | +- Streaming support is removed, and default response size limits are added. |
| 66 | + - We will add streaming support in a subsequent IPIP, but we are trying to minimize the scope of this IPIP to what is immediately useful |
| 67 | +- Bodies are encoded using idiomatic JSON, instead of using IPLD codecs, and are compatible with OpenAPI specifications |
| 68 | +- The JSON uses human-readable string encodings of common data types |
| 69 | + - CIDs are encoded as CIDv1 strings with a multibase prefix (e.g. base32), for consistency with CLIs, browsers, and [gateway URLs](https://docs.ipfs.io/how-to/address-ipfs-on-web/) |
| 70 | + - Multiaddrs use the [human-readable format](https://github.com/multiformats/multiaddr#specification) that is used in existing tools and Kubo CLI commands such as `ipfs id` or `ipfs swarm peers` |
| 71 | + - Byte array values, such as signatures, are multibase-encoded strings (with an `m` prefix indicating Base64) |
| 72 | +- The "Identify" method and "message groups" are not included |
| 73 | +- The "GetIPNS" and "PutIPNS" methods are not included |
| 74 | + |
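The encoding choices in the list above can be illustrated with a minimal Go sketch. It assumes the multibase convention that the `m` prefix denotes standard Base64 without padding. The function names, the `router.example` host, and the `/v1/providers/` path layout are hypothetical and chosen only for illustration; the linked spec defines the actual paths.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// encodeBytes returns a multibase string with the "m" prefix, which the
// multibase table assigns to standard Base64 without padding.
func encodeBytes(b []byte) string {
	return "m" + base64.RawStdEncoding.EncodeToString(b)
}

// decodeBytes reverses encodeBytes. This sketch only understands the "m"
// prefix; a full multibase decoder would handle other prefixes as well.
func decodeBytes(s string) ([]byte, error) {
	if !strings.HasPrefix(s, "m") {
		return nil, errors.New("unsupported multibase prefix")
	}
	return base64.RawStdEncoding.DecodeString(s[1:])
}

// providersURL places the CID string directly in the URL path instead of
// packing it into an encoded request message. The path layout here is a
// hypothetical example, not taken from the spec.
func providersURL(base, cid string) string {
	return strings.TrimRight(base, "/") + "/v1/providers/" + url.PathEscape(cid)
}

func main() {
	sig := encodeBytes([]byte("hello"))
	fmt.Println(sig)

	raw, err := decodeBytes(sig)
	fmt.Println(string(raw), err)

	fmt.Println(providersURL("https://router.example/",
		"bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi"))
}
```

Because the CID appears verbatim in the path, a request built this way can be matched by URL-based rules in caches and reverse proxies, and it stays readable in logs.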
| 75 | +### User benefit |
| 76 | + |
| 77 | +The cost of building and operating content routing services will be much lower, as developers will be able to maximally reuse existing industry-standard tooling. |
| 78 | +Users will not need to learn a new RPC protocol and tooling to consume or expose the API. |
| 79 | +This will result in more content routing providers, each providing a better experience for users, driving down content routing latency across the IPFS network |
| 80 | +and increasing data availability. |
| 81 | + |
| 82 | +### Compatibility |
| 83 | + |
| 84 | +#### Backwards Compatibility |
| 85 | + |
| 86 | +IPFS Stewards will implement this API in [go-delegated-routing](https://github.com/ipfs/go-delegated-routing), using breaking changes in a new minor version. |
| 87 | +Because the existing Reframe spec can't be safely used in JavaScript and we won't be investing time and resources into changing the wire format implemented in edelweiss to fix it, |
| 88 | +the experimental support for Reframe in Kubo will be deprecated in the next release and delegated content routing will subsequently use this HTTP API. |
| 89 | +We may decide to re-add Reframe support in the future once these issues have been resolved. |
| 90 | + |
| 91 | +#### Forwards Compatibility |
| 92 | + |
| 93 | +Standard HTTP mechanisms for forward compatibility are used: |
| 94 | + |
| 95 | +- The API is versioned using a version number prefix in the path |
| 96 | +- The `Accept` and `Content-Type` headers are used for content type negotiation, allowing for backwards-compatible additions of new MIME types, hypothetically such as: |
| 97 | + - `application/cbor` for binary-encoded responses |
| 98 | + - `application/x-ndjson` for streamed responses |
| 99 | + - `application/octet-stream` if the content router can provide the content/block directly |
| 100 | +- New paths+methods can be introduced in a backwards-compatible way |
| 101 | +- Parameters can be added using either new query parameters or new fields in the request/response body. |
| 102 | +- Provider records are both opaque and versioned to allow evolution of schemas and semantics for the same transfer protocol |
| 103 | + |
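As a rough illustration of why `Accept`-based negotiation allows new MIME types to be added without breaking existing deployments, the Go sketch below picks a response type from a server's supported list. The `negotiate` function is a hypothetical helper, not part of any implementation named in this IPIP, and it deliberately ignores `q` weights and partial wildcards such as `application/*`.

```go
package main

import (
	"fmt"
	"strings"
)

// negotiate returns the first media type in the Accept header that the
// server supports. An empty header or "*/*" selects the server default,
// which is supported[0]. The boolean is false when nothing matches, in
// which case a server would typically answer 406 Not Acceptable.
//
// Simplification: q-values are ignored, so client preference is taken
// from the order of the header alone.
func negotiate(accept string, supported []string) (string, bool) {
	if strings.TrimSpace(accept) == "" {
		return supported[0], true
	}
	for _, part := range strings.Split(accept, ",") {
		// Drop parameters such as ";q=0.9" and normalize the media type.
		mediaType, _, _ := strings.Cut(part, ";")
		mediaType = strings.ToLower(strings.TrimSpace(mediaType))
		if mediaType == "*/*" {
			return supported[0], true
		}
		for _, s := range supported {
			if mediaType == s {
				return s, true
			}
		}
	}
	return "", false
}

func main() {
	jsonOnly := []string{"application/json"}

	// A newer client that prefers a streamed response but lists JSON as a
	// fallback still gets a usable answer from a JSON-only server.
	fmt.Println(negotiate("application/x-ndjson, application/json;q=0.9", jsonOnly))

	// A client that only accepts a type the server lacks gets no match.
	fmt.Println(negotiate("application/cbor", jsonOnly))
}
```

A server that later adds a new type only needs to extend its supported list; clients that never ask for the new type are unaffected.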
| 104 | +As a proof-of-concept, the tests for the initial implementation of this HTTP API were successfully run over a libp2p transport using [libp2p/go-libp2p-http](https://github.com/libp2p/go-libp2p-http), demonstrating the viability of also using this API over libp2p. |
| 105 | + |
| 106 | +### Security |
| 107 | + |
| 108 | +- All CID requests are sent to a central HTTPS endpoint as plain text, with TLS being the only protection against third-party observation. |
| 109 | +- While privacy is not a concern in the current version, plans are underway to add a separate endpoint that prioritizes lookup privacy. Follow the progress in related pre-work in [IPIP-373 (double hashed DHT)](https://github.com/ipfs/specs/pull/373/) and [ipni#5 (reader privacy in indexers)](https://github.com/ipni/specs/pull/5). |
| 110 | +- The usual JSON parsing rules apply. To prevent potential Denial of Service (DoS) attacks, clients should ignore responses containing more than 100 providers and introduce a byte size limit that is applicable to their use case. |
| 111 | + |
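The client-side limits described in the last bullet above could be enforced along the following lines. This is a minimal sketch under stated assumptions: the top-level `Providers` field name, the 1 MiB byte cap, and the helper names are all hypothetical; only the limit of 100 providers comes from the text. Provider records are kept as opaque raw JSON.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

const (
	maxProviders = 100     // limit stated in the Security section
	maxBodyBytes = 1 << 20 // assumed byte cap; pick one that fits the use case
)

var errTooLarge = errors.New("response exceeds client limits")

// response models only what this sketch needs. The "Providers" field name
// is an assumption; individual records are left undecoded.
type response struct {
	Providers []json.RawMessage `json:"Providers"`
}

// decodeProviders reads at most maxBodyBytes from body and rejects
// responses that are too large in bytes or in number of providers.
func decodeProviders(body io.Reader) ([]json.RawMessage, error) {
	// Read one byte past the cap so an oversized body can be detected.
	data, err := io.ReadAll(io.LimitReader(body, maxBodyBytes+1))
	if err != nil {
		return nil, err
	}
	if len(data) > maxBodyBytes {
		return nil, errTooLarge
	}
	var r response
	if err := json.Unmarshal(data, &r); err != nil {
		return nil, err
	}
	if len(r.Providers) > maxProviders {
		return nil, errTooLarge
	}
	return r.Providers, nil
}

// sampleBody builds a response body with n empty provider records. It
// exists only to exercise decodeProviders in this example.
func sampleBody(n int) io.Reader {
	items := make([]string, n)
	for i := range items {
		items[i] = "{}"
	}
	return strings.NewReader(`{"Providers":[` + strings.Join(items, ",") + `]}`)
}

func main() {
	for _, n := range []int{2, 101} {
		providers, err := decodeProviders(sampleBody(n))
		fmt.Println(n, len(providers), err)
	}
}
```

In a real client, `body` would be the HTTP response body, and the byte cap bounds memory use before any JSON parsing takes place.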
| 112 | +### Alternatives |
| 113 | + |
| 114 | +- Reframe (general-purpose RPC) was evaluated; see the "Design rationale" section for why it was not selected. |
| 115 | + |
| 116 | +### Copyright |
| 117 | + |
| 118 | +Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). |