Skip to content

Commit f1c24b5

Browse files
DeltaMichaelDilyan Marinovpre-commit-ci[bot]
authored
vdk-structlog: update README.md (#3047)
Add info about bound loggers and syslog support --------- Signed-off-by: Dilyan Marinov <[email protected]> Co-authored-by: Dilyan Marinov <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent 64864b1 commit f1c24b5

File tree

1 file changed

+141
-25
lines changed

1 file changed

+141
-25
lines changed

Diff for: projects/vdk-plugins/vdk-structlog/README.md

+141-25
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,9 @@ This plugin allows users to:
77
- select the log output format
88
- configure the logging metadata
99
- display metadata added by bound loggers
10+
- emit logs to syslog
11+
12+
For a more in-depth technical analysis, check out [VEP-2448](https://github.com/vmware/versatile-data-kit/blob/main/specs/vep-2448-vdk-run-logs-simplified-and-readable/README.md)
1013

1114
## Usage
1215

@@ -16,28 +19,36 @@ pip install vdk-structlog
1619

1720
### Configuration
1821

19-
(`vdk config-help` is useful command to browse all config options of your installation of vdk)
22+
(`vdk config-help` is a useful command to browse all config options of your installation of vdk)
2023

21-
| Name | Description | Example Value | Possible Values |
22-
|----------------------------|---------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------|
23-
| logging_metadata | Configure the metadata that will be output along with the log message | "timestamp, level, logger_name, file_name, vdk_job_name | "timestamp, level, logger_name, file_name, line_number, function_name, vdk_job_name, vdk_step_name, vdk_step_type" |
24-
| logging_format | Configure the logging output format. Available formats: json, console | "console" | "console", "json" |
25-
| custom_console_log_pattern | Custom format string for console logging, applied only when`logging_format` is 'console' and overrides `logging_metadata` | "%(asctime)s %(name)-12s %(levelname)-8s %(message)s" | Any valid Python logging format string |
26-
| log_level_module | Configure the log level of different Python modules separately | "a.b.c=INFO;foo.bar=ERROR" | Semicolon-separated list of pairs of Python module paths and log level labels |
24+
| Name | Description | Example Value | Possible Values |
25+
|----------------------------|-------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
26+
| logging_metadata | Configure the metadata that will be output along with the log message | "timestamp, level, logger_name, file_name, vdk_job_name | Any combination of the following: "timestamp, level, logger_name, file_name, line_number, function_name, vdk_job_name, vdk_step_name, vdk_step_type". Can be expanded by extra params and bound key-value pairs. See the bound logger examples for more information |
27+
| logging_format | Configure the logging output format. Available formats: json, console | "console" | "console", "json", "ltsv" |
28+
| custom_console_log_pattern | Custom format string for console logging, applied only when`logging_format` is 'console'. Overrides `logging_metadata`. | "%(asctime)s %(name)-12s %(levelname)-8s %(message)s" | Any valid Python logging format string |
29+
| log_level_module | Configure the log level of different Python modules separately | "a.b.c=INFO;foo.bar=ERROR" | Semicolon-separated list of pairs of Python module paths and log level labels |
30+
| syslog_host | Syslog host to which logs are emitted | "syslog.vmware.com" | Any valid host name |
31+
| syslog_port | Syslog port used to emit logs | 514 | Any valid port number |
32+
| syslog_protocol | Protocol used to emit logs | "UDP" | "TCP", "UDP" |
33+
| syslog_enabled | Enable/disable syslog | "True" | "True", "False" |
2734

2835
### Example: Configure Custom Console Format
2936

30-
If you wish to apply a specific format to your console logs, you can define a custom format using the `custom_console_log_pattern` configuration. This custom format string will be used only when the `logging_format` is set to 'console'.
37+
If you wish to apply a specific format to your console logs, you can define a
38+
custom format using the `custom_console_log_pattern` configuration. This custom
39+
format string will be used only when the `logging_format` is set to 'console'.
3140

3241
For example, add the following to your data job configuration:
3342

34-
```
43+
```ini
3544
[vdk]
3645
custom_console_log_pattern=%(asctime)s %(name)-12s %(levelname)-8s %(message)s
3746
```
38-
When you run your data job, regardless of other logging settings, your logs will strictly follow this custom format, displaying the timestamp, logger's name, log level, and the log message as per the format string specified.
47+
When you run your data job, regardless of other logging settings, your logs will
48+
strictly follow this custom format, displaying the timestamp, logger's name, log
49+
level, and the log message as per the format string specified.
3950

40-
```
51+
```log
4152
2023-10-17 11:20:59,202 managed_cursor INFO ingest-from-db-example-job - Executing query SUCCEEDED. Query duration 00h:00m:00s
4253
2023-10-17 11:20:59,202 managed_connectio INFO ingest-from-db-example-job - Fetching query result...
4354
2023-10-17 11:20:59,202 managed_cursor INFO ingest-from-db-example-job - Fetching all results from query ...
@@ -54,7 +65,7 @@ When you run your data job, regardless of other logging settings, your logs will
5465

5566
Create a data job and add the following config options
5667

57-
```
68+
```ini
5869
[vdk]
5970
logging_metadata=timestamp,level,file_name,vdk_job_name
6071
logging_format=console
@@ -64,7 +75,7 @@ Then run the data job. You should see just the configured tags where relevant.
6475
For example, you won't see the vdk_job_name outside of log statements directly
6576
related to job execution.
6677

67-
```
78+
```log
6879
2023-10-17 11:20:59,202 [VDK] [INFO ] managed_cursor.py ingest-from-db-example-job - Executing query SUCCEEDED. Query duration 00h:00m:00s
6980
2023-10-17 11:20:59,202 [VDK] [INFO ] managed_connection_b ingest-from-db-example-job - Fetching query result...
7081
2023-10-17 11:20:59,202 [VDK] [INFO ] managed_cursor.py ingest-from-db-example-job - Fetching all results from query ...
@@ -78,14 +89,15 @@ related to job execution.
7889

7990
Now, let's remove the timestamp from the configuration and add the line number
8091

81-
```
92+
```ini
93+
[vdk]
8294
logging_metadata=level,file_name,line_number,vdk_job_name
8395
logging_format=console
8496
```
8597

8698
And run the job again
8799

88-
```
100+
```log
89101
[INFO ] managed_cursor.py :97 ingest-from-db-example-job - Executing query SUCCEEDED. Query duration 00h:00m:00s
90102
[INFO ] managed_connection_b :133 ingest-from-db-example-job - Fetching query result...
91103
[INFO ] managed_cursor.py :193 ingest-from-db-example-job - Fetching all results from query ...
@@ -101,7 +113,7 @@ And run the job again
101113

102114
Create a data job and add the following config options
103115

104-
```
116+
```ini
105117
[vdk]
106118
logging_metadata=timestamp,level,file_name,vdk_job_name
107119
logging_format=json
@@ -123,11 +135,112 @@ And you should see json-formatted logs.
123135

124136
### Example: Bound loggers
125137

126-
TODO: Add an example once bound loggers are part of vdk-core
138+
Python's logging module allows you to pass extra params as a dict. vdk-structlog
139+
takes this into account and displays the extra params, as long as they're added
140+
to `structlog_metadata` in config.ini
141+
142+
```python
143+
import logging
144+
import uuid
145+
146+
log = logging.getLogger(__name__)
147+
148+
myid = uuid.uuid4()
149+
log.info(f"Starting job step {__name__}")
150+
log.info("This is an info log statement with extra stuff.", extra={"uuid": myid})
151+
```
152+
153+
```ini
154+
[vdk]
155+
structlog_metadata=timestamp,level,logger_name,file_name,line_number,function_name,uuid
156+
structlog_format=console
157+
```
158+
159+
```log
160+
2024-01-16 10:16:53,237 [VDK] [INFO ] step_20_python_ste 20_python_step.py :21 run - Starting job step step_20_python_ste
161+
2024-01-16 10:16:53,237 [VDK] [INFO ] step_20_python_ste 20_python_step.py :22 run 78118954-fe0c-451d-80cf-9b9bc80b8140 - This is an info log statement with extra stuff
162+
```
163+
164+
This is fine, however, if we want to have the same uuid with multiple log
165+
statements, it's not very convenient to pass it every time. `vdk` offers a
166+
function to solve this problem.
167+
168+
```python
169+
import logging
170+
import uuid
171+
from vdk.internal.core.logging import bind_logger
172+
173+
log = logging.getLogger(__name__)
174+
175+
log.info(f"Starting job step {__name__}")
176+
177+
myid = uuid.uuid4()
178+
bound_log = bind_logger(log, {"uuid": myid})
179+
bound_log.info("This is an info log statement with extra stuff")
180+
bound_log.info("This is another log statement with uuid")
181+
```
182+
183+
```ini
184+
[vdk]
185+
structlog_metadata=timestamp,level,logger_name,file_name,line_number,function_name,uuid
186+
structlog_format=console
187+
```
188+
189+
Now every time you log using the bound logger, you have the uuid attached.
190+
191+
```log
192+
2024-01-16 10:31:04,761 [VDK] [INFO ] step_20_python_ste 20_python_step.py :22 run - Starting job step step_20_python_ste
193+
2024-01-16 10:31:04,761 [VDK] [INFO ] step_20_python_ste 20_python_step.py :25 run ed1e447d-3746-4068-8a83-353176327985 - This is an info log statement with extra stuff
194+
2024-01-16 10:31:04,761 [VDK] [INFO ] step_20_python_ste 20_python_step.py :26 run ed1e447d-3746-4068-8a83-353176327985 - This is another log statement with uuid
195+
```
196+
197+
Bound loggers act the same as regular loggers. You can even pass extra params to
198+
them, or bind additional context.
199+
200+
```python
201+
import logging
202+
import uuid
203+
from vdk.internal.core.logging import bind_logger
204+
205+
log = logging.getLogger(__name__)
206+
207+
log.info(f"Starting job step {__name__}")
127208

128-
### Example: Passing custom metadata fields with extra_params
209+
myid = uuid.uuid4()
210+
bound_log = bind_logger(log, {"uuid": myid})
211+
bound_log.info("This is an info log statement with extra stuff")
212+
bound_log.info("This is another log statement with uuid")
213+
bound_log.info("More stuff", extra={"new_key": "new_value", "extra_key":"more_value"})
214+
rebound_log = bind_logger(bound_log, {"another_key": "another_value"})
215+
```
216+
217+
```ini
218+
[vdk]
219+
structlog_metadata=timestamp,level,logger_name,file_name,line_number,function_name,uuid,new_key,another_key,extra_key
220+
structlog_format=console
221+
```
222+
223+
### Syslog support
224+
225+
Outputting logs to a syslog server is supported with the following config
226+
options
129227

130-
TODO: Add an example
228+
```ini
229+
[vdk]
230+
syslog_host=localhost
231+
syslog_port=514
232+
syslog_protocol=UDP
233+
syslog_enabled=True
234+
```
235+
236+
The syslog output format is set to `console` and the following formatter string
237+
is used. These are currently not configurable.
238+
239+
```python
240+
DETAILED_FORMAT =
241+
"%(asctime)s [VDK] %(job_name)s [%(levelname)-5.5s] %(name)-30.30s %(filename)20.20s:%("
242+
"lineno)-4.4s %(funcName)-16.16s[id:%(attempt_id)s]- %(message)s"
243+
```
131244

132245
### Build and test
133246

@@ -137,15 +250,18 @@ pip install -e .
137250
pytest
138251
```
139252

140-
In VDK repo [../build-plugin.sh](https://github.com/vmware/versatile-data-kit/tree/main/projects/vdk-plugins/build-plugin.sh) script can be used also.
253+
The
254+
[../build-plugin.sh](https://github.com/vmware/versatile-data-kit/tree/main/projects/vdk-plugins/build-plugin.sh)
255+
script can be used also.
141256

142257

143258
#### Note about the CICD:
144259

145260
.plugin-ci.yaml is needed only for plugins part of [Versatile Data Kit Plugin repo](https://github.com/vmware/versatile-data-kit/tree/main/projects/vdk-plugins).
146261

147-
The CI/CD is separated in two stages, a build stage and a release stage.
148-
The build stage is made up of a few jobs, all which inherit from the same
149-
job configuration and only differ in the Python version they use (3.7, 3.8, 3.9 and 3.10).
150-
They run according to rules, which are ordered in a way such that changes to a
151-
plugin's directory trigger the plugin CI, but changes to a different plugin does not.
262+
The CI/CD is separated in two stages, a build stage and a release stage. The
263+
build stage is made up of a few jobs, all which inherit from the same job
264+
configuration and only differ in the Python version they use (3.7, 3.8, 3.9 and
265+
3.10). They run according to rules, which are ordered in a way such that changes
266+
to a plugin's directory trigger the plugin CI, but changes to a different plugin
267+
does not.

0 commit comments

Comments
 (0)