Stream processing tool
zherczeg/pcresp
Pcresp is a stream processing tool which can be used to update
stream data using PCRE2 regular expressions and custom scripts.
Some examples:
echo ABCD | pcresp '[BC]+'
By default pcresp prints the matched strings.
The unmatched characters are discarded by default.
Result: BC
echo ABCD | pcresp -s '' -p '[BC]+'
An empty program prints nothing.
The unmatched characters are printed because of -p.
Result: AD
echo ABCD | pcresp -p '[BC]+' -s '/bin/echo -n {#0}'
The echo program prints its arguments.
There is no newline because of the -n parameter.
Result: A{BC}D
echo A7 BB29 | pcresp '([A-Z]+)(\d+)' -s "*print -#1- :#2:"
The script arguments can be printed by the *print flag.
This is considerably faster than using the echo program.
Result:
-A- :7:
-BB- :29:
echo 29 30 31 32 33 | pcresp '(\d+)(*SKIP)(?C^ *null /usr/bin/expr #1 % 3 = 0 ^)'
This command prints the decimal numbers that are divisible by 3.
The condition is checked by 'expr number % 3 = 0'.
The exit status of expr influences the matching: zero - continue, non-zero - fail.
Result: 30 33
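The exit status convention used by the callout above can be checked with expr alone, without pcresp. The following is a minimal sketch; the function name divisible_by_3 is a hypothetical helper introduced only for this illustration.

```shell
# expr prints 1 and exits with status 0 when the comparison holds,
# and prints 0 and exits with status 1 when it does not.
# The printed value is discarded here, as *null does in the callout.
divisible_by_3() {
    expr "$1" % 3 = 0 >/dev/null
}

for n in 29 30 31 32 33; do
    if divisible_by_3 "$n"; then
        echo "$n: exit status 0, matching continues"
    else
        echo "$n: non-zero exit status, matching fails"
    fi
done
```

Only 30 and 33 report exit status 0, which agrees with the result shown above.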
SRC='
INC=`expr $2 + 1`
echo $1 : $INC
'
echo A7 BB29 | pcresp -d prog "$SRC" -s "/bin/bash -c #[prog] arg0 #1 #2" '([A-Z]+)(\d+)'
This example shows an easy way to embed strings (usually source code).
The source referenced by #[prog] is not processed by pcresp.
Result:
A : 8
BB : 30
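The argument layout of the script above follows the usual 'bash -c' convention: the first argument after the source becomes $0, the following ones become $1, $2 and so on. The sketch below runs the same source by hand, with the capture values typed in directly instead of being supplied by pcresp. The variable names first and second are used only for this illustration.

```shell
SRC='
INC=`expr $2 + 1`
echo $1 : $INC
'
# bash -c SOURCE NAME ARG1 ARG2: NAME becomes $0, ARG1 becomes $1, ARG2 becomes $2.
# "arg0" only fills the $0 slot so that the captures land in $1 and $2.
first=$(bash -c "$SRC" arg0 A 7)
second=$(bash -c "$SRC" arg0 BB 29)
echo "$first"
echo "$second"
```

This prints the same two lines as the pcresp example, which suggests that 'arg0' is needed only as a placeholder for $0.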
SRC='
INC=`expr $2 + 1`
echo $1 : $INC
'
echo A7 BB29 | pcresp --def-string prog "$SRC" --shell "/bin/bash -c ? arg0" \
-s "#[prog] #1 #2" '([A-Z]+)(\d+)'
Same as above, with the default shell set by --shell.
export PCRESP_SHELL="/bin/bash -c ? arg0"
SRC='
INC=`expr $2 + 1`
echo $1 : $INC
'
echo A7 BB29 | pcresp -d prog "$SRC" -s "#[prog] #1 #2" '([A-Z]+)(\d+)'
Same as above, with the default shell taken from the PCRESP_SHELL environment variable.
Usage pcresp [options] pcre_pattern [file list]
-h, --help
Print help and exit
-s, --script script
Execute this script after each successful match
-p, --print
Print the characters between matched strings
-d, --def-string name string
Define a constant string (see #[name])
--shell default-shell
Specify the default shell for each script
[--pattern] pcre2_pattern
Specify the pattern. The --pattern can be omitted
if the pattern does not start with dash
--limit n
Stop after n successful matches (0 - unlimited)
-i
Enable caseless matching
-m
Both ^ and $ match newlines
-x
Ignore white space and # comments (extended regex)
-u, --utf
Enable UTF-8
--dot-all
Dot matches any character
--newline-[type]
[type] can be: lf (linefeed), cr (carriage return),
crlf (CR followed by LF), anycrlf (any combination of
CR and LF), unicode (any Unicode newline sequence)
--bsr-[type]
[type] can be: anycrlf (any combination of CR and LF),
unicode (any Unicode newline sequence)
--verbose
Display executed commands (useful for debugging)
--end
End of parameter list. If the pattern has not been
specified yet, it must be the next argument
Reads from stdin if file list is empty
Return value:
0 - pattern matched at least once
1 - pattern does not match
2 - error occurred
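The return values listed above can be tested in a calling script through $?. The sketch below only maps a status code to a message; describe_pcresp_status is a hypothetical helper name and is not part of pcresp.

```shell
# Hypothetical helper: translate a pcresp return value into a short message.
describe_pcresp_status() {
    case "$1" in
        0) echo "pattern matched at least once" ;;
        1) echo "pattern does not match" ;;
        2) echo "error occurred" ;;
        *) echo "unexpected status $1" ;;
    esac
}

# Typical use, assuming pcresp is installed and on PATH:
#   echo ABCD | pcresp '[BC]+' >/dev/null
#   describe_pcresp_status $?
```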
Pattern format:
Patterns must follow PCRE2 regular expression syntax
Scripts can be executed during matching using (?C) callouts
Script format:
A list of arguments separated by white space(s)
The first argument must be an executable file
Recognized # (hash mark) sequences:
Sequences marked by [*] are not accepted by --shell
#idx - string value of capture block idx (0-65535) [*]
#{idx} - string value of capture block idx (0-65535) [*]
#[name] - insert constant string by name [*]
#{idx,name} - same as #{idx} if capture block is not empty
same as #[name] otherwise [*]
#M - current MARK value [*]
## - # (hash mark)
#< - less-than sign character
#> - greater-than sign character
#n - newline (\n) character
Script arguments can be preceded by control flags:
*print - print arguments
*!nl - no newline after arguments are printed
*null - discard output
*!sh - default-shell (--shell) is not used
Arguments enclosed in <> brackets:
Arguments can be enclosed in <> brackets. These enclosed
arguments are never recognized as special arguments such
as control flags, but # sequences are still recognized.
Examples: <argument with spaces> <!null> <?>
a '/bin/echo <##>' script prints a # sign
Setting the default shell:
The default shell can be set by the --shell option or by the
PCRESP_SHELL environment variable.
Example: --shell '/bin/bash -c ? my_script'
The arguments defined by the default shell are added before
the arguments provided by the currently executed script.
If a question mark argument is present in the default shell,
it is replaced by the first argument of the script, and
this first argument is not appended after the default shell.
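The merge rule described above can be modelled in a few lines of shell. This is only a sketch of the rule as stated in this section, not the code pcresp uses; merge_with_default_shell is a hypothetical helper, and it assumes that the default shell arguments contain no embedded white space. Each resulting argument is printed on its own line in square brackets.

```shell
# Hypothetical model of the default shell merge rule (not part of pcresp).
# $1 is the default shell specification, the remaining arguments are the
# arguments of the executed script.
merge_with_default_shell() {
    shell_spec=$1
    shift
    first_used=no
    set -f    # disable globbing so that an unquoted "?" stays literal
    for word in $shell_spec; do
        if [ "$word" = "?" ] && [ $# -gt 0 ]; then
            # The question mark is replaced by the first script argument.
            printf '[%s]\n' "$1"
            first_used=yes
        else
            printf '[%s]\n' "$word"
        fi
    done
    set +f
    # The first script argument is not appended again if it replaced "?".
    if [ "$first_used" = yes ]; then
        shift
    fi
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

merge_with_default_shell '/bin/bash -c ? arg0' 'SOURCE' 'A' '7'
```

With the default shell '/bin/bash -c ? arg0' and the script arguments SOURCE, A and 7, the model yields /bin/bash -c SOURCE arg0 A 7, the same layout as the explicit -s "/bin/bash -c #[prog] arg0 #1 #2" form in the examples.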