
Commit 8699599

Merge pull request #590 from SierraSoftworks/copilot/migrate-src-filters-module
Migrate filter DSL to the filt-rs crate
2 parents f385e2a + 3e76e6f commit 8699599

18 files changed

Lines changed: 129 additions & 1783 deletions


Cargo.lock

Lines changed: 14 additions & 1 deletion
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,7 @@ chrono = { version = "0.4.45", features = ["serde"] }
1010
clap = { version = "4.6.1", features = ["derive", "string"] }
1111
croner = "3.0.1"
1212
ctrlc = "3.5.2"
13+
filt-rs = { version = "1.1.0", features = ["chrono", "regex", "serde"] }
1314
gix = { version = "0.84.0", features = [
1415
"blocking-http-transport-reqwest-rust-tls",
1516
] }

docs/advanced/filters.md

Lines changed: 53 additions & 2 deletions
@@ -19,6 +19,9 @@ Here are a few common filter examples which you might use in your configuration.
1919
- `!release.prerelease && !asset.source-code` - Only include release artifacts which are not marked as pre-releases and are not source code archives.
2020
- `repo.name in ["git-tool", "grey"]` - Only include repositories with the names "git-tool" or "grey".
2121
- `repo.stargazers >= 5` - Only include repositories with at least 5 stars.
22+
- `repo.name like "*-backup"` - Only include repositories whose name ends with "-backup" using glob pattern matching.
23+
- `repo.name matches r"^awesome-\d+$"` - Only include repositories whose name matches the given regular expression.
24+
- `repo.pushed_at > now() - 30d` - Only include repositories which have been pushed to within the last 30 days.
2225

2326
## Language Features
2427
### Properties - `repo.<field>`
@@ -46,6 +49,12 @@ If you wish to treat an empty string as a valid value, you can use `repo.<field>
4649
evaluation of an empty string.
4750
:::
4851

52+
::: tip
53+
You can also write *raw strings* using an `r` prefix (for example `r"^v\d+$"`), within which backslashes are treated literally
54+
rather than as escape sequences. This is particularly convenient when writing [regular expression](#pattern-matching-like-matches)
55+
patterns. Use the hashed form `r#"..."#` if your pattern needs to contain a double quote.
56+
:::
57+
4958
#### Numbers
5059
Numbers are represented internally as a 64-bit floating-point value, which means that they can represent most reasonably sized
5160
integers as well as most reasonably precise decimal numbers. For example, `5` and `5.0` are equivalent in the filter language.
@@ -61,6 +70,18 @@ example, `repo.fork` will evaluate to `true` if the repository is a fork, and `f
6170
The `null` value is used to represent the absence of a value, and is considered falsey when evaluated. Accessing a property which
6271
does not exist will return `null`.
6372

73+
#### Datetimes and Durations
74+
Some fields, such as `repo.pushed_at` or `release.published_at`, expose native timestamps rather than strings. These can be compared
75+
against one another, and against the current time using the [`now()`](#functions) function, allowing you to back up only those entities
76+
which have changed recently.
77+
78+
Durations are written as a number immediately followed by a unit (`ms`, `s`, `m` for minutes, `h`, `d`, or `w`), and several segments
79+
can be chained together to form a more precise duration, for example `1h30m`. Datetimes and durations support `+` and `-` arithmetic,
80+
so `now() - 7d` evaluates to the point in time seven days ago.
81+
82+
- `repo.pushed_at > now() - 30d` - Only include repositories which have been pushed to within the last 30 days.
83+
- `release.published_at < now() - 1w` - Only include releases which were published more than a week ago.
84+
6485
## Operators
6586
### Unary Negation - `!`
6687
The unary negation operator converts the following expression into the boolean opposite of its value.
@@ -102,8 +123,9 @@ comparison. These operators **DO NOT** perform type coercion, which means that y
102123
type - for example, comparing `5 <= "5" || 5 >= "5"` will always return `false`.
103124

104125
::: warning
105-
String comparisons are performed using a case-insensitive comparison of ASCII characters, which means that `"Hello" == "hello"` will return `true`,
106-
as will `"hello👋" == "hello"`.
126+
String comparisons are performed case-insensitively using the filter language's Unicode case-folding rules, which means that
127+
`"Hello" == "hello"` will return `true`, as will `"STRASSE" == "straße"`. If you need an exact, case-sensitive comparison, use the
128+
[`_cs` variants](#case-sensitivity-cs) of the string operators.
107129
:::
108130

109131
- `==` - Returns `true` if the left and right hand expressions are equal.
@@ -136,6 +158,35 @@ The prefix and suffix matching operators are used to determine whether a string
136158
- `"hello" startswith "he"` - Determines whether the string `hello` starts with the sequence `he`, returning `true` in this case.
137159
- `"goodbye" endswith "bye"` - Determines whether the string `goodbye` ends with the sequence `bye`, returning `true` in this case.
138160

161+
### Pattern Matching - `like`, `matches`
162+
The pattern matching operators allow you to match a string against a pattern, which can be useful when you want to match
163+
repositories whose names follow a particular convention without listing each of them explicitly.
164+
165+
- `like` performs a case-insensitive [glob](https://en.wikipedia.org/wiki/Glob_(programming)) match, where `*` matches any
166+
sequence of characters (including none), `?` matches exactly one character, and a backslash makes the following character
167+
literal (`\*`, `\?`, `\\`). For example, `repo.name like "*-rs"` matches any repository whose name ends with `-rs`.
168+
- `matches` performs a [regular expression](https://docs.rs/regex/latest/regex/#syntax) match. Regular expressions are
169+
case-sensitive (use `(?i)` to ignore case) and unanchored (use `^` and `$` to anchor the match). For example,
170+
`release.tag matches r"^v\d+(\.\d+){2}$"` matches tags like `v1.2.3`.
171+
172+
::: tip
173+
Regular expression patterns are easiest to write using [raw strings](#strings) (`r"..."`), which do not process backslash
174+
escape sequences and so avoid the need to double-escape characters like `\d`.
175+
:::
176+
177+
### Case Sensitivity - `_cs`
178+
The string operators (`contains`, `in`, `startswith`, `endswith`, and `like`) compare values case-insensitively by default. Each of
179+
them has a case-sensitive variant with a `_cs` suffix (`contains_cs`, `in_cs`, `startswith_cs`, `endswith_cs`, and `like_cs`) which
180+
compares strings exactly as written. The `matches` operator is always case-sensitive unless you opt in with the `(?i)` flag.
181+
182+
## Functions
183+
Filters may call built-in functions using the familiar `name(args...)` syntax. Unknown function names and incorrect argument counts
184+
are rejected when the filter is parsed.
185+
186+
- `now()` - Returns the current UTC time, evaluated afresh on every evaluation. This is most useful in combination with
187+
[durations](#datetimes-and-durations), for example `repo.pushed_at > now() - 30d`.
188+
- `trim(string)` - Returns the string argument with leading and trailing whitespace removed (`null` for non-string values).
189+
139190
## Nerdy Details
140191
The filtering language itself is implemented as a simple recursive descent parser which compiles an expression
141192
tree from the input string. This expression tree is then evaluated using an interpreter to determine whether
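
The `like` semantics documented in this file (`*` for any run of characters, `?` for exactly one character, backslash to make the next character literal, case-insensitive by default) can be sketched as a small recursive matcher. This is only an illustration of the documented behaviour, not the filt-rs implementation: `glob_like` and `matches_from` are hypothetical names, and lowercasing both sides is a simplification that does not reproduce full Unicode case folding.

```rust
/// Hypothetical helper: case-insensitive glob match in the spirit of `like`.
/// Lowercasing is an assumption made for brevity; it is not true case folding.
fn glob_like(text: &str, pattern: &str) -> bool {
    let text: Vec<char> = text.to_lowercase().chars().collect();
    let pattern: Vec<char> = pattern.to_lowercase().chars().collect();
    matches_from(&text, &pattern)
}

/// Recursively matches `text` against `pattern`, one pattern element at a time.
fn matches_from(text: &[char], pattern: &[char]) -> bool {
    match pattern {
        // An exhausted pattern only matches an exhausted text.
        [] => text.is_empty(),
        // `*` may consume any number of characters, including none.
        ['*', rest @ ..] => (0..=text.len()).any(|skip| matches_from(&text[skip..], rest)),
        // `?` consumes exactly one character.
        ['?', rest @ ..] => !text.is_empty() && matches_from(&text[1..], rest),
        // A backslash makes the following pattern character literal.
        ['\\', escaped, rest @ ..] => {
            text.first() == Some(escaped) && matches_from(&text[1..], rest)
        }
        // Anything else (including a trailing backslash) must match literally.
        [literal, rest @ ..] => text.first() == Some(literal) && matches_from(&text[1..], rest),
    }
}
```

Under these assumptions, `glob_like("git-tool-rs", "*-rs")` holds, while `glob_like("axb", r"a\*b")` does not because the escaped `*` only matches a literal asterisk. The naive backtracking is fine for short repository names but is not tuned for adversarial patterns.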

docs/reference/gist.md

Lines changed: 2 additions & 0 deletions
@@ -65,3 +65,5 @@ These fields are also available when using [`github/repo`](./repo.md) or [`githu
6565
| `gist.file_names` | `array` | List of file names in the gist |
6666
| `gist.languages` | `array` | List of programming languages used in the gist |
6767
| `gist.type` | `string` | MIME-Type of content in the gist |
68+
| `gist.created_at` | `datetime`| When the gist was created |
69+
| `gist.updated_at` | `datetime`| When the gist was last updated |

docs/reference/release.md

Lines changed: 14 additions & 2 deletions
@@ -70,9 +70,13 @@ For `kind: github/release`
7070
| `release.draft` | `boolean` | Whether the release is a draft (unpublished) release |
7171
| `release.prerelease` | `boolean` | Whether to identify the release as a prerelease or a full release |
7272
| `release.published` | `boolean` | Whether the release is a published (not a draft) release |
73+
| `release.created_at` | `datetime` | When the release was created (_2013-02-27T19:35:32Z_) |
74+
| `release.published_at`| `datetime` | When the release was published, or `null` for drafts (_2013-02-27T19:35:32Z_) |
7375
| `asset.name` | `string` | The file name of the asset (_github-backup-darwin-arm64_) |
7476
| `asset.size` | `integer` | The size of the asset, in kilobytes. (_1024_) |
7577
| `asset.downloaded` | `boolean` | If the asset was downloaded at least once from the GitHub Release |
78+
| `asset.created_at` | `datetime` | When the asset was created (_2013-02-27T19:35:32Z_) |
79+
| `asset.updated_at` | `datetime` | When the asset was last updated (_2013-02-27T19:35:32Z_) |
7680

7781
```json
7882
{
@@ -117,7 +121,11 @@ For `kind: github/release`
117121
// Whether the release is a draft (inverse of published)
118122
"draft": false,
119123
/// Whether the release has been published yet or not (inverse of draft)
120-
"published": true
124+
"published": true,
125+
// When the release was created
126+
"created_at": "2013-02-27T19:35:32Z",
127+
// When the release was published (null for draft releases)
128+
"published_at": "2013-02-27T19:35:32Z"
121129
},
122130

123131
// Describes a specific artifact which is part of a release
@@ -127,7 +135,11 @@ For `kind: github/release`
127135
// The size of the release asset in kilobytes
128136
"size": 1024,
129137
// Whether the asset has been downloaded at least once
130-
"downloaded": true
138+
"downloaded": true,
139+
// When the asset was created
140+
"created_at": "2013-02-27T19:35:32Z",
141+
// When the asset was last updated
142+
"updated_at": "2013-02-27T19:35:32Z"
131143
}
132144
}
133145
```

docs/reference/repo.md

Lines changed: 10 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -73,6 +73,9 @@ These fields are also available when using [`github/release`](./release.md) or [
7373
| `repo.template` | `boolean` | Whether this repository acts as a template that can be used to generate new repositories |
7474
| `repo.forks` | `integer` | The number of times this repository is forked |
7575
| `repo.stargazers` | `integer` | The number of people starred this repository |
76+
| `repo.pushed_at` | `datetime` | When the repository was last pushed to (_2011-01-26T19:06:43Z_) |
77+
| `repo.created_at` | `datetime` | When the repository was created (_2011-01-26T19:01:12Z_) |
78+
| `repo.updated_at` | `datetime` | When the repository was last updated (_2011-01-26T19:14:43Z_) |
7679

7780
```json
7881
{
@@ -102,7 +105,13 @@ These fields are also available when using [`github/release`](./release.md) or [
102105
// The number of times this repository has been forked.
103106
"forks": 0,
104107
// The number of people who have starred this repository.
105-
"stargazers": 501
108+
"stargazers": 501,
109+
// When the repository was last pushed to.
110+
"pushed_at": "2011-01-26T19:06:43Z",
111+
// When the repository was created.
112+
"created_at": "2011-01-26T19:01:12Z",
113+
// When the repository was last updated.
114+
"updated_at": "2011-01-26T19:14:43Z"
106115
}
107116
}
108117
```
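
The new `datetime` fields are intended for filters such as `repo.pushed_at > now() - 30d`. As a rough sketch of what that expression computes, the following uses only the standard library. `parse_duration` and `pushed_within` are hypothetical helpers rather than part of filt-rs, and the parser assumes whole-number segments with the units listed in the filter documentation (`ms`, `s`, `m`, `h`, `d`, `w`).

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical parser for chained duration literals such as `1h30m` or `500ms`.
/// Assumes each segment is an unsigned integer immediately followed by a unit.
fn parse_duration(literal: &str) -> Option<Duration> {
    if literal.is_empty() {
        return None;
    }
    let mut total = Duration::ZERO;
    let mut rest = literal;
    while !rest.is_empty() {
        // Leading digits form the amount; an empty amount fails to parse.
        let digits_end = rest.find(|c: char| !c.is_ascii_digit()).unwrap_or(rest.len());
        let (digits, tail) = rest.split_at(digits_end);
        let amount: u64 = digits.parse().ok()?;
        // The letters that follow form the unit.
        let unit_end = tail.find(|c: char| !c.is_ascii_alphabetic()).unwrap_or(tail.len());
        let (unit, remainder) = tail.split_at(unit_end);
        let millis_per_unit: u64 = match unit {
            "ms" => 1,
            "s" => 1_000,
            "m" => 60_000,
            "h" => 3_600_000,
            "d" => 86_400_000,
            "w" => 604_800_000,
            _ => return None,
        };
        let segment = Duration::from_millis(amount.checked_mul(millis_per_unit)?);
        total = total.checked_add(segment)?;
        rest = remainder;
    }
    Some(total)
}

/// Mirrors `pushed_at > now - window`; if the cutoff underflows, everything is recent.
fn pushed_within(pushed_at: SystemTime, now: SystemTime, window: Duration) -> bool {
    now.checked_sub(window).map_or(true, |cutoff| pushed_at > cutoff)
}
```

Passing `now` in explicitly, instead of calling `SystemTime::now()` inside the helper, keeps the comparison deterministic in tests. A real evaluator would presumably read the clock on each evaluation, as the `now()` documentation describes.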

src/entities/macros.rs

Lines changed: 3 additions & 3 deletions
@@ -29,7 +29,7 @@ macro_rules! entity {
2929
}
3030
)*
3131

32-
pub fn with_metadata<V: Into<FilterValue>>(mut self, key: &'static str, value: V) -> Self {
32+
pub fn with_metadata<'a, V: Into<FilterValue<'a>>>(mut self, key: &'static str, value: V) -> Self {
3333
self.metadata.insert(key, value.into());
3434
self
3535
}
@@ -47,7 +47,7 @@ macro_rules! entity {
4747
}
4848

4949
impl crate::Filterable for $name {
50-
fn get(&self, key: &str) -> crate::FilterValue {
50+
fn get(&self, key: &str) -> crate::FilterValue<'_> {
5151
self.metadata.get(key)
5252
}
5353
}
@@ -79,7 +79,7 @@ mod tests {
7979
assert_eq!(entity.url, "http://example.com");
8080
assert_eq!(entity.credentials, Credentials::Token("test".to_string()));
8181

82-
assert_eq!(entity.get("test"), FilterValue::String("test".to_string()));
82+
assert_eq!(entity.get("test"), FilterValue::String("test".into()));
8383
assert_eq!(entity.get("test2"), FilterValue::Number(1_f64));
8484
}
8585
}

src/entities/mod.rs

Lines changed: 20 additions & 4 deletions
@@ -7,6 +7,7 @@ use crate::{FilterValue, Filterable};
77

88
pub use credentials::Credentials;
99
pub use release::Release;
10+
use std::borrow::Cow;
1011
use std::collections::HashMap;
1112
use unicase::UniCase;
1213

@@ -18,21 +19,36 @@ pub trait BackupEntity: std::fmt::Display + Filterable {
1819
}
1920

2021
#[derive(Default, Clone, Debug)]
21-
pub struct Metadata(HashMap<UniCase<&'static str>, FilterValue>);
22+
pub struct Metadata(HashMap<UniCase<&'static str>, FilterValue<'static>>);
2223

2324
impl Metadata {
24-
pub fn insert<V: Into<FilterValue>>(&mut self, key: &'static str, value: V) {
25-
self.0.insert(UniCase::new(key), value.into());
25+
pub fn insert<'a, V: Into<FilterValue<'a>>>(&mut self, key: &'static str, value: V) {
26+
self.0.insert(UniCase::new(key), into_owned(value.into()));
2627
}
2728

28-
pub fn get(&self, key: &str) -> FilterValue {
29+
pub fn get(&self, key: &str) -> FilterValue<'_> {
2930
self.0
3031
.get(&UniCase::new(key))
3132
.cloned()
3233
.unwrap_or(FilterValue::Null)
3334
}
3435
}
3536

37+
/// Converts a [`FilterValue`] into one which owns all of its data so that it
38+
/// can be cached within a [`Metadata`] collection (whose entries must outlive
39+
/// the entity they were derived from).
40+
fn into_owned(value: FilterValue<'_>) -> FilterValue<'static> {
41+
match value {
42+
FilterValue::Null => FilterValue::Null,
43+
FilterValue::Bool(b) => FilterValue::Bool(b),
44+
FilterValue::Number(n) => FilterValue::Number(n),
45+
FilterValue::String(s) => FilterValue::String(Cow::Owned(s.into_owned())),
46+
FilterValue::Tuple(v) => FilterValue::Tuple(v.into_iter().map(into_owned).collect()),
47+
FilterValue::DateTime(dt) => FilterValue::DateTime(dt),
48+
FilterValue::Duration(d) => FilterValue::Duration(d),
49+
}
50+
}
51+
3652
pub trait MetadataSource {
3753
fn inject_metadata(&self, metadata: &mut Metadata);
3854
}
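
The `into_owned` helper added in this file exists because `Metadata` stores `FilterValue<'static>`, while callers may hand in values that borrow from shorter-lived data. The pattern can be shown in isolation with a stand-in type. `Value` below is a hypothetical, trimmed-down enum for illustration only and is not the filt-rs `FilterValue`; `cache_from_temporary` is likewise a made-up name.

```rust
use std::borrow::Cow;

/// Stand-in for a borrowed-or-owned filter value (NOT the real `FilterValue`).
#[derive(Debug, Clone, PartialEq)]
enum Value<'a> {
    Null,
    Number(f64),
    String(Cow<'a, str>),
    Tuple(Vec<Value<'a>>),
}

/// Detaches a value from whatever it borrowed by copying string data.
fn into_owned(value: Value<'_>) -> Value<'static> {
    match value {
        Value::Null => Value::Null,
        Value::Number(n) => Value::Number(n),
        // `Cow::into_owned` allocates only if the string was borrowed.
        Value::String(s) => Value::String(Cow::Owned(s.into_owned())),
        // Tuples are converted element by element.
        Value::Tuple(items) => Value::Tuple(items.into_iter().map(into_owned).collect()),
    }
}

/// Builds a value that borrows from a local buffer, then returns an owned copy.
fn cache_from_temporary() -> Value<'static> {
    let buffer = String::from("git-tool");
    let borrowed = Value::String(Cow::Borrowed(buffer.as_str()));
    // `buffer` is dropped when this function returns, but the result owns its data.
    into_owned(borrowed)
}
```

Returning `borrowed` directly from `cache_from_temporary` would not compile, because it still refers to `buffer`. That is the situation the conversion in `Metadata::insert` appears designed to avoid.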

src/filter/expr.rs

Lines changed: 0 additions & 113 deletions
This file was deleted.
