Skip to content
Merged
Show file tree
Hide file tree
Changes from 8 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
793 changes: 401 additions & 392 deletions .github/actions/spelling/expect.txt

Large diffs are not rendered by default.

6 changes: 6 additions & 0 deletions data/clients/telegram-preview.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
- name: telegrambot
action: ALLOW
expression:
all:
- userAgent.matches("TelegramBot")
- verifyFCrDNS(remoteAddress, "ptr\\.telegram\\.org$")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love this. So much.

6 changes: 6 additions & 0 deletions data/clients/vk-preview.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
- name: vkbot
action: ALLOW
expression:
all:
- userAgent.matches("vkShare[^+]+\\+http\\://vk\\.com/dev/Share")
- verifyFCrDNS(remoteAddress, "^snipster\\d+\\.go\\.mail\\.ru$")
1 change: 1 addition & 0 deletions data/crawlers/_allow-good.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -8,3 +8,4 @@
- import: (data)/crawlers/marginalia.yaml
- import: (data)/crawlers/mojeekbot.yaml
- import: (data)/crawlers/commoncrawl.yaml
- import: (data)/crawlers/yandexbot.yaml
6 changes: 6 additions & 0 deletions data/crawlers/yandexbot.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
- name: yandexbot
action: ALLOW
expression:
all:
- userAgent.matches("\\+http\\://yandex\\.com/bots")
- verifyFCrDNS(remoteAddress, "^.*\\.yandex\\.(ru|com|net)$")
2 changes: 2 additions & 0 deletions data/meta/messengers-preview.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
- import: (data)/clients/telegram-preview.yaml
- import: (data)/clients/vk-preview.yaml
25 changes: 24 additions & 1 deletion docs/docs/CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Add support to simple Valkey/Redis cluster mode
- Open Graph passthrough now reuses the configured target Host/SNI/TLS settings, so metadata fetches succeed when the upstream certificate differs from the public domain. ([1283](https://github.com/TecharoHQ/anubis/pull/1283))
- Stabilize the CVE-2025-24369 regression test by always submitting an invalid proof instead of relying on random POW failures.
- Add Polish locale ([#1292](https://github.com/TecharoHQ/anubis/pull/1309))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

- Add Polish locale ([#1292](https://github.com/TecharoHQ/anubis/pull/1309))


### Logging customization

Expand All @@ -44,6 +43,30 @@ logging:
```

Additionally, information about [how Anubis uses each logging level](./admin/policies.mdx#log-levels) has been added to the documentation.
Comment thread
Xe marked this conversation as resolved.
### DNS Features

- CEL expressions for:
- FCrDNS checks
- Forward DNS queries
- Reverse DNS queries
- `arpaReverseIP` to transform IPv4/6 addresses into ARPA reverse IP notation.
- `regexSafe` to escape regex special characters (useful for including `remoteAddress` or headers in regular expressions).
- DNS cache and other optimizations to minimize unnecessary DNS queries.

The DNS cache TTL can be changed in the bots config like this:
```yaml
dns_ttl:
forward: 600
reverse: 600
```
The default value for both forward and reverse queries is 300 seconds.

The `verifyFCrDNS` CEL function has two overloads:
- `(addr)`
Simply verifies that the remote side has PTR records pointing to the target address.
- `(addr, ptrPattern)`
Verifies that the remote side refers to a specific domain and that this domain points to the target IP.


## v1.23.1: Lyse Hext - Echo 1

Expand Down
114 changes: 114 additions & 0 deletions docs/docs/admin/configuration/expressions.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -233,6 +233,27 @@ This is best applied when doing explicit block rules, eg:

It seems counter-intuitive to allow known bad clients through sometimes, but this allows you to confuse attackers by making Anubis' behavior random. Adjust the thresholds and numbers as facts and circumstances demand.

### `regexSafe`

Available in `bot` expressions.

```ts
function regexSafe(input: string): string;
```

`regexSafe` takes a string and escapes it for safe use inside of a regular expression. This is useful when you are creating regular expressions from headers or variables such as `remoteAddress`.

| Input | Output |
| :------------------------ | :------------------------------ |
| `regexSafe("1.2.3.4")` | `1\\.2\\.3\\.4` |
| `regexSafe("techaro.lol")` | `techaro\\.lol` |
| `regexSafe("star*")` | `star\\*` |
| `regexSafe("plus+")` | `plus\\+` |
| `regexSafe("{braces}")` | `\\{braces\\}` |
| `regexSafe("start^")` | `start\\^` |
| `regexSafe("back\\slash")` | `back\\\\slash` |
| `regexSafe("dash-dash")` | `dash\\-dash` |

### `segments`

Available in `bot` expressions.
Expand Down Expand Up @@ -266,6 +287,99 @@ This is useful if you want to write rules that allow requests that have no query
- size(segments(path)) < 2
```

### DNS Functions

Anubis can also perform DNS lookups as a part of its expression evaluation. This can be useful for doing things like checking for a valid [Forward-confirmed reverse DNS (FCrDNS)](https://en.wikipedia.org/wiki/Forward-confirmed_reverse_DNS) record.

#### `arpaReverseIP`

Available in `bot` expressions.

```ts
function arpaReverseIP(ip: string): string;
```

`arpaReverseIP` takes an IP address and returns its value in [ARPA notation](https://www.ietf.org/rfc/rfc2317.html). This can be useful when matching PTR record patterns.

| Input | Output |
| :----------------------------- | :------------------------------------------------------------------- |
| `arpaReverseIP("1.2.3.4")` | `4.3.2.1` |
| `arpaReverseIP("2001:db8::1")` | `1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.8.b.d.0.1.0.0.2` |

#### `lookupHost`

Available in `bot` expressions.

```ts
function lookupHost(host: string): string[];
```

`lookupHost` performs a DNS lookup for the given hostname and returns a list of IP addresses.

```yaml
- name: cloudflare-ip-in-host-header
action: DENY
expression: '"104.16.0.0" in lookupHost(headers["Host"])'
```

#### `reverseDNS`

Available in `bot` expressions.

```ts
function reverseDNS(ip: string): string[];
```

`reverseDNS` takes an IP address and returns the DNS names associated with it. This is useful when you want to check PTR records of an IP address.

```yaml
- name: allow-googlebot
action: ALLOW
expression: 'reverseDNS(remoteAddress).endsWith(".googlebot.com")'
```

::: warning

Do not use this for validating the legitimacy of an IP address. It is possible for DNS records to be out of date or otherwise manipulated. Use [`verifyFCrDNS`](#verifyfcrdns) instead for a more reliable result.

:::

#### `verifyFCrDNS`

Available in `bot` expressions.

```ts
function verifyFCrDNS(ip: string): bool;
function verifyFCrDNS(ip: string, pattern: string): bool;
```

`verifyFCrDNS` checks if the reverse DNS of an IP address matches its forward DNS. This is a common technique to filter out spam and bot traffic. `verifyFCrDNS` comes in two forms:

- `verifyFCrDNS(remoteAddress)` will check that the reverse DNS of the remote address resolves back to the remote address.
- `verifyFCrDNS(remoteAddress, pattren)` will check that the reverse DNS of the remote address is matching with pattern and that name resolves back to the remote address.

This is best used in rules like this:

```yaml
- name: require-fcrdns-for-post
action: DENY
expression:
all:
- method == "POST"
- "!verifyFCrDNS(remoteAddress)"
```

Here is an another example that allows requests from telegram:

```yaml
- name: telegrambot
action: ALLOW
expression:
all:
- userAgent.matches("TelegramBot")
- verifyFCrDNS(remoteAddress, "ptr\\.telegram\\.org$")
```

## Life advice

Expressions are very powerful. This is a benefit and a burden. If you are not careful with your expression targeting, you will be liable to get yourself into trouble. If you are at all in doubt, throw a `CHALLENGE` over a `DENY`. Legitimate users can easily work around a `CHALLENGE` result with a [proof of work challenge](../../design/why-proof-of-work.mdx). Bots are less likely to be able to do this.
70 changes: 70 additions & 0 deletions internal/dns/cache.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,70 @@
package dns

import (
"log/slog"
"time"

"github.com/TecharoHQ/anubis/lib/store"

_ "github.com/TecharoHQ/anubis/lib/store/all"
)

type DnsCache struct {
forward store.JSON[[]string]
reverse store.JSON[[]string]
forwardTTL time.Duration
reverseTTL time.Duration
}

func NewDNSCache(forwardTTL int, reverseTTL int, backend store.Interface) *DnsCache {
return &DnsCache{
forward: store.JSON[[]string]{
Underlying: backend,
Prefix: "forwardDNS",
},
reverse: store.JSON[[]string]{
Underlying: backend,
Prefix: "reverseDNS",
},
forwardTTL: time.Duration(forwardTTL) * time.Second,
reverseTTL: time.Duration(reverseTTL) * time.Second,
}
}

func (d *Dns) getCachedForward(host string) ([]string, bool) {
if d.cache == nil {
return nil, false
}
if cached, err := d.cache.forward.Get(d.ctx, host); err == nil {
slog.Debug("DNS: forward cache hit", "name", host, "ips", cached)
return cached, true
}
slog.Debug("DNS: forward cache miss", "name", host)
return nil, false
}

func (d *Dns) getCachedReverse(addr string) ([]string, bool) {
if d.cache == nil {
return nil, false
}
if cached, err := d.cache.reverse.Get(d.ctx, addr); err == nil {
slog.Debug("DNS: reverse cache hit", "addr", addr, "names", cached)
return cached, true
}
slog.Debug("DNS: reverse cache miss", "addr", addr)
return nil, false
}

func (d *Dns) forwardCachePut(host string, entries []string) {
if d.cache == nil {
return
}
d.cache.forward.Set(d.ctx, host, entries, d.cache.forwardTTL)
}

func (d *Dns) reverseCachePut(addr string, entries []string) {
if d.cache == nil {
return
}
d.cache.reverse.Set(d.ctx, addr, entries, d.cache.reverseTTL)
}
Loading