@@ -44,18 +44,26 @@ catp [OPTIONS] PATH ...
4444 use 0 for multi-threaded zst decoder (slightly faster at cost of more CPU) (default 1)
4545 -pass value
4646 filter matching, may contain multiple AND patterns separated by ^,
47- if filter matches, line is passed to the output (unless filtered out by -skip)
48- each -pass value is added with OR logic,
49- for example, you can use "-pass bar^baz -pass foo" to only keep lines that have (bar AND baz) OR foo
47+ if filter matches, line is passed to the output (may be filtered out by preceding -skip)
48+ other -pass values are evaluated if preceding pass/skip did not match,
49+ for example, you can use "-pass bar^baz -pass foo -skip fo" to only keep lines that have (bar AND baz) OR foo, but not fox
50+ -pass-any
51+ finishes matching and passes the line to the output even if previous -pass did not match,
52+ if previous -skip matched, the line would be skipped anyway.
53+ -pass-csv value
54+ filter matching, loads pass params from CSV file,
55+ each line is treated as -pass, each column value is AND condition.
5056 -progress-json string
5157 write current progress to a file
5258 -rate-limit float
5359 output rate limit lines per second
5460 -skip value
5561 filter matching, may contain multiple AND patterns separated by ^,
56- if filter matches, line is removed from the output (even if it passed -pass)
57- each -skip value is added with OR logic,
62+ if filter matches, line is removed from the output (may be kept if it passed preceding -pass)
5863 for example, you can use "-skip quux^baz -skip fooO" to skip lines that have (quux AND baz) OR fooO
64+ -skip-csv value
65+ filter matching, loads skip params from CSV file,
66+ each line is treated as -skip, each column value is AND condition.
5967 -version
6068 print version and exit
6169```
@@ -77,10 +85,10 @@ get-key.log: 100.0% bytes read, 1000000 lines processed, 8065.7 l/s, 41.8 MB/s,
7785```
7886
7987Run log filtering (lines containing `foo bar` or `baz`) on multiple files in background (with `screen`) and output to a
80- new file.
88+ new compressed file.
8189
8290```
83- screen -dmS foo12 ./catp -output ~/foo-2023-07-12.log -pass "foo bar" -pass "baz" /home/logs/server-2023-07-12*
91+ screen -dmS foo12 ./catp -output ~/foo-2023-07-12.log.zst -pass "foo bar" -pass "baz" /home/logs/server-2023-07-12*
8492```
8593
8694```
@@ -100,3 +108,14 @@ all: 32.3% bytes read, /home/logs/server-2023-07-12-09-00.log_6.zst: 5.1% bytes
100108# detaching from screen with ctrl+a+d
101109```
102110
111+ Filter based on a large list of needles. Values from allow and block lists are loaded into high-performance
112+ [Aho Corasick](https://en.wikipedia.org/wiki/Aho%E2%80%93Corasick_algorithm) indexes.
113+
114+ ```
115+ catp -pass-csv allowlist.csv -skip-csv blocklist.csv -pass-any -output filtered.log.zst source.log.zst
116+ ```
117+
118+ Each source line goes through the following filtering pipeline:
119+ * if `allowlist.csv` has at least one row whose cells are all present in the source line, the line is written to the output
120+ * otherwise, if `blocklist.csv` has at least one row whose cells are all present in the source line, the line is skipped
121+ * otherwise, the line is written to the output because of `-pass-any`
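
The pipeline above can be modeled with a short Python sketch. It is an illustration of the described semantics, not catp's implementation: the names `load_rows`, `row_matches`, `keep_line` and the `default_keep` parameter are hypothetical, plain substring search stands in for the Aho Corasick indexes, and the handling of empty cells and of lines that match no rule is an assumption, since the text does not specify either.

```python
import csv
import io

# Hypothetical rule kinds, named after the catp flags they model.
PASS, SKIP, PASS_ANY = "pass", "skip", "pass-any"


def load_rows(csv_text):
    """Parse CSV text into rows of cells.

    Assumption: empty cells and blank rows are ignored.
    """
    rows = []
    for row in csv.reader(io.StringIO(csv_text)):
        cells = [cell for cell in row if cell]
        if cells:
            rows.append(cells)
    return rows


def row_matches(line, cells):
    """A row matches when every cell occurs in the line (AND condition)."""
    return all(cell in line for cell in cells)


def keep_line(line, rules, default_keep=False):
    """Walk the rules in command-line order; the first match decides."""
    for kind, rows in rules:
        if kind == PASS_ANY:
            return True
        if any(row_matches(line, cells) for cells in rows):
            return kind == PASS
    # Assumption: the outcome when no rule matches is not stated in the text.
    return default_keep


# Models: -pass-csv allowlist.csv -skip-csv blocklist.csv -pass-any
allow = load_rows("foo,bar\nbaz\n")
block = load_rows("debug\n")
rules = [(PASS, allow), (SKIP, block), (PASS_ANY, [])]

for line in ["foo bar debug", "foo debug", "plain line"]:
    print(line, "->", "keep" if keep_line(line, rules) else "skip")
```

Because the rules are evaluated in order, swapping the allow and block entries in `rules` changes the outcome for a line such as `foo bar debug`, which mirrors the order sensitivity of the flags on the command line.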