Minimum graphs needed for top-level health reporting on the ipfs.io gateway

# Background

IP Shipyard has been entrusted to steward the ipfs.io gateway.  Other leaders in the ecosystem should have the ability to see the health and usage of the ipfs.io gateway.  This issue is about defining the minimum graphs needed to give others confidence in the maintained health of the service.

# Graphs
Some general requests:
* Provide time ranges of 1+ year.  Why?  The longer context helps highlight small changes over time that can get missed in too short a time period.
* Show weekly rather than daily values.  Why?  It facilitates looking at longer time horizons.  There has been discussion on how to accomplish this in [this Slack thread](https://filecoinproject.slack.com/archives/C03FFEVK30F/p1696968735725959).
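
As a rough illustration of the weekly rollup, here is a minimal sketch that sums daily request counts into ISO weeks. The input shape (a list of `(date, count)` pairs) and the function name `weekly_totals` are assumptions made for this example, not how the existing pipeline works, and the sample numbers are made up.

```python
from collections import defaultdict
from datetime import date


def weekly_totals(daily_counts):
    """Sum (date, count) pairs into {(iso_year, iso_week): total}."""
    totals = defaultdict(int)
    for day, count in daily_counts:
        iso = day.isocalendar()
        # Key on ISO year + ISO week so weeks spanning New Year stay together.
        totals[(iso[0], iso[1])] += count
    return dict(totals)


# Hypothetical sample data, not real gateway numbers.
sample = [
    (date(2023, 10, 2), 100),  # Monday, ISO week 40
    (date(2023, 10, 8), 50),   # Sunday, ISO week 40
    (date(2023, 10, 9), 70),   # Monday, ISO week 41
]
print(weekly_totals(sample))
```

Summing only works for additive metrics such as request counts. Weekly unique clients cannot be derived by adding daily uniques, since a client seen on two days would be counted twice; that number has to be recomputed as a distinct count over the whole week.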

## Unique Clients accessing ipfs.io / dweb.link
Current source: https://probelab.io/ipfsgateways/#daily-unique-clients-accessing-ipfsio--dweblink
Snapshot: 
![gateway-clients-overall](https://github.com/ipshipyard/waterworks-community/assets/85411/1a8a149d-2840-42b1-948c-7bc80d3b4d7a)
Improvements needed:
- [ ] Weekly numbers (rather than daily).  
- [ ] Combine with weekly data further back in time from https://docs.google.com/spreadsheets/d/1qnrAhqt_i5l9m48jge6617XD0hRK4qbTebTxWKhJdV0/edit#gid=1875197224 .  
<img width="971" alt="image" src="https://github.com/ipshipyard/waterworks-community/assets/85411/7a763fbd-9e2e-4273-a720-cbbb810f61d8">

## HTTP Requests to ipfs.io / dweb.link, by region
Current source: https://probelab.io/ipfsgateways/#daily-http-requests-to-ipfsio--dweblink-by-region
Snapshot: 
![gateway-requests-region](https://github.com/ipshipyard/waterworks-community/assets/85411/d965e4e4-aec2-4ca1-82c2-2c2673db8ab1)
Improvements needed: 
- [ ] Generate weekly (rather than daily) aggregates 
- [ ] Make clear which requests/responses are included here.  Is it all requests or only “successful” (HTTP 200) ones?
- [ ] Combine with weekly data further back in time from https://docs.google.com/spreadsheets/d/1qnrAhqt_i5l9m48jge6617XD0hRK4qbTebTxWKhJdV0/edit#gid=1875197224 .  
<img width="928" alt="image" src="https://github.com/ipshipyard/waterworks-community/assets/85411/6d943cb2-332e-4863-980d-63a1f1f17411">

## p95 of TTFB for “200” responses
Current source: none currently, other than a weekly snapshot value in https://protocollabs.grafana.net/d/J2_IHYTVz/gateway-report?orgId=1 .  I'm also not sure whether that value covers only "200" responses or all responses.
Existing data: https://docs.google.com/spreadsheets/d/1qnrAhqt_i5l9m48jge6617XD0hRK4qbTebTxWKhJdV0/edit#gid=1875197224 contains the data below.  That said, I don't know whether it is for "200" responses or all responses.
<img width="594" alt="image" src="https://github.com/ipshipyard/waterworks-community/assets/85411/9661e1fc-b025-4908-adbd-d8e316e601f0">

What's needed:
- [ ] Create new weekly plot for p95 of TTFB for “200” responses.  We don't need to combine with existing data.
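
To make the intended metric concrete, here is a minimal sketch that filters to "200" responses first and then takes a nearest-rank p95 per ISO week. The record shape `(date, status_code, ttfb_ms)` and both function names are hypothetical; a real implementation would more likely run as a query against the gateway logs or metrics store.

```python
from collections import defaultdict
from datetime import date


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    # 1-based rank = ceil(0.95 * n), done in integers to avoid float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def weekly_p95_ttfb(records):
    """records: iterable of (date, status_code, ttfb_ms) tuples."""
    by_week = defaultdict(list)
    for day, status, ttfb_ms in records:
        if status != 200:
            continue  # only "200" responses count toward this graph
        iso = day.isocalendar()
        by_week[(iso[0], iso[1])].append(ttfb_ms)
    return {week: p95(ttfbs) for week, ttfbs in by_week.items()}


# Hypothetical sample: twenty 200s (100..2000 ms) and one slow 410 that is ignored.
records = [(date(2023, 10, 2), 200, ms) for ms in range(100, 2100, 100)]
records.append((date(2023, 10, 2), 410, 99999))
print(weekly_p95_ttfb(records))
```

Filtering before computing the percentile matters: mixing in other status codes would shift the p95 in ways that say little about successful fetches.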

## Response code distribution
For the requests in a given week, we should be able to show how the gateway is responding.  

Why:
1. Catch if there is a deployment issue that is affecting traffic.
2. Prove the value of certain functionality.

Example looking at the last 7 days:
<img width="389" alt="image" src="https://github.com/ipshipyard/waterworks-community/assets/85411/c8c2a257-73e6-4d59-89a8-d1c0ca0aede9">

The high number of 410s emphasizes the importance of “Badbits”.  If we didn’t have it, the majority of requests would result in serving content we don’t want to serve.  
If this distribution were ever to change (e.g., “Badbits” was disabled), that would be bad and we’d want to see it.

Current source: none currently, other than a weekly snapshot value in https://protocollabs.grafana.net/d/J2_IHYTVz/gateway-report?orgId=1 

What's needed:
- [ ] Create new weekly plot that shows status code distribution.  Maybe use the top 5-10 status codes and bucket the rest as other.
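
The "top N plus other" bucketing could look roughly like the sketch below. It assumes a plain list of status codes for one week as input; the function name `status_distribution` and the sample counts are made up for illustration.

```python
from collections import Counter


def status_distribution(status_codes, top_n=5):
    """Return {code: share} for the top_n codes, with the rest under "other"."""
    counts = Counter(status_codes)
    total = sum(counts.values())
    top = counts.most_common(top_n)
    other = total - sum(count for _, count in top)
    shares = {str(code): count / total for code, count in top}
    if other:
        shares["other"] = other / total
    return shares


# Hypothetical week of responses, not real gateway data.
week = [410] * 6 + [200] * 3 + [404] * 1
print(status_distribution(week, top_n=2))
```

One caveat with this approach: if the top N is recomputed every week, a code can drift in and out of "other" between weeks. Fixing the set of plotted codes up front may make the weekly plot easier to read.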

## Unique CIDs requested per week
Why: Gives a sense of how much of the content addressable space is being requested through the ipfs.io gateway.

What's needed:
- [ ] weekly plot for the number of top-level / root CIDs requested by clients
- [ ] weekly plot for total number of CIDs fetched by ipfs.io gateway 
   - If clients only fetched non-badbit content, then this number would always be larger than the number above.  
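
Both plots reduce to a distinct count per week. A minimal sketch, assuming hypothetical `(date, cid)` records (one stream for root CIDs requested by clients, another for all CIDs the gateway fetched):

```python
from collections import defaultdict
from datetime import date


def weekly_unique_cids(requests):
    """requests: iterable of (date, cid). Returns {(iso_year, iso_week): distinct count}."""
    seen = defaultdict(set)
    for day, cid in requests:
        iso = day.isocalendar()
        seen[(iso[0], iso[1])].add(cid)  # the set drops repeat requests for a CID
    return {week: len(cids) for week, cids in seen.items()}


# Placeholder CID strings, for illustration only.
requests = [
    (date(2023, 10, 2), "cid-a"),
    (date(2023, 10, 3), "cid-a"),  # repeat within the same week
    (date(2023, 10, 4), "cid-b"),
    (date(2023, 10, 9), "cid-a"),  # next week, counted again there
]
print(weekly_unique_cids(requests))
```

Holding every CID in memory is only workable for a sketch; at gateway scale an approximate distinct count (for example HyperLogLog, if the log store offers it) would probably be needed.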
