Scraper: increase HTTP timeout and set a User-Agent header

## Summary

The planet scraper (via the `river` library) has two issues causing avoidable feed fetch failures:

### 1. HTTP timeout is too short (3 seconds)

The timeout in `river`'s `lib/http.ml` is hardcoded to 3 seconds. Several feeds that are reachable time out in CI, some consistently and some intermittently:

- `https://ocamlpro.com/blog/feed` — works (200) but slow
- `https://mirage.io/feed.xml` — works (200) but slow
- `https://hannes.robur.coop/atom` — intermittent
- `https://blog.robur.coop/feed.xml` — intermittent
- `https://jon.recoil.org/atom.xml` — intermittent

**Proposed fix:** increase the timeout to 10 seconds.
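For illustration, a 10-second limit could be enforced by racing the fetch against a timer. This is a minimal sketch assuming the `lwt` and `lwt.unix` libraries; `timeout_seconds` and `with_timeout` are hypothetical names, and the actual code in `river`'s `lib/http.ml` may be structured differently.

```ocaml
(* Hypothetical constant: the proposed limit, replacing the current 3 seconds. *)
let timeout_seconds = 10.0

(* Race [work ()] against a timer. [Lwt.pick] returns the first promise
   to resolve and cancels the other one. *)
let with_timeout ~seconds work =
  let timer =
    Lwt.map (fun () -> Error `Timeout) (Lwt_unix.sleep seconds)
  in
  Lwt.pick [ Lwt.map (fun v -> Ok v) (work ()); timer ]
```

A call such as `with_timeout ~seconds:timeout_seconds (fun () -> fetch uri)`, where `fetch` stands for whatever function performs the request, would then resolve to ``Error `Timeout`` instead of hanging on a slow feed.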

### 2. No User-Agent header

The scraper does not set a `User-Agent` header of its own, so requests go out with whatever cohttp provides by default. Some sites behind Cloudflare or similar CDNs reject requests without a recognized user agent:

- `https://priver.dev/tags/ocaml/index.xml` — returns **403 Forbidden** from CI, but **200** with a browser user agent

**Proposed fix:** set a common browser `User-Agent` header on each request. `Cohttp_lwt_unix.Client.get` accepts an optional `~headers` parameter.
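A rough sketch of what this might look like, assuming the `cohttp` and `cohttp-lwt-unix` libraries. The names `user_agent`, `request_headers`, and `fetch` are hypothetical, and the user agent string is only an example value, not a tested recommendation.

```ocaml
(* Example browser-like value; which string to send is an open choice. *)
let user_agent =
  "Mozilla/5.0 (X11; Linux x86_64; rv:128.0) Gecko/20100101 Firefox/128.0"

(* Headers attached to every outgoing request. *)
let request_headers = Cohttp.Header.of_list [ ("User-Agent", user_agent) ]

(* Same call as a plain GET, with the headers passed explicitly. *)
let fetch uri = Cohttp_lwt_unix.Client.get ~headers:request_headers uri
```

Building the header value once and reusing it keeps the change to a single extra argument at each call site.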

### How `river` is managed

`river` is pinned to a specific commit via `ocamlorg.opam.template`:

```
pin-depends: [
  ["river.dev" "git+https://github.com/aantron/river#476dc945a908a69548bddd267f143a3e5d9c8a1a"]
]
```

`aantron/river` is a fork of `kayceesrk/river`. To apply the fixes:

1. Submit a PR to `aantron/river` (or `kayceesrk/river`) with the timeout and User-Agent changes
2. Update the pin hash in `ocamlorg.opam.template` to the new commit

### Context

- #3512 tracks the YouTube RSS feed platform-wide outage
- #3514 disables dead feeds and adds the Rocq Prover source
