Skip to content

Commit 6b42ab8

Browse files
tchapahg-g
andauthored
Use contextual logging (#337)
* Use contextual logging All possible direct klog calls are removed. Instead logr.Logger is loaded from the context or passed around as an argument. * Fix log levels * server: Handle context canceled * pod_reconciler: Use TRACE log level Co-authored-by: Abdullah Gharaibeh <[email protected]> * pod_reconciler: Don't log pod not found err --------- Co-authored-by: Abdullah Gharaibeh <[email protected]>
1 parent 5114a55 commit 6b42ab8

32 files changed

+436
-301
lines changed

docs/dev.md

+49-31
Original file line numberDiff line numberDiff line change
@@ -1,27 +1,33 @@
11
<!-- Dev guide -->
22

3-
43
## Logging
54

5+
We use `logr.Logger` interface for logging everywhere.
6+
The logger instance is loaded from `context.Context` or passed around as an argument directly.
7+
This is aligned with contextual logging as explained in [k8s instrumentation logging guidelines](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-instrumentation/logging.md).
8+
9+
In other words, we explicitly don't use `klog` global logging calls.
10+
Using `klog` log value helpers like `klog.KObj` is just fine.
11+
612
### Change log verbosity
7-
We use the `k8s.io/klog/v2` package to manage logging.
813

914
We generally follow the [k8s instrumentation logging guidelines](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-instrumentation/logging.md), which states "the practical default level is V(2). Developers and QE environments may wish to run at V(3) or V(4)".
1015

11-
To configure logging verbosity, specify the `v` flag such as `--v=2`.
16+
To configure logging verbosity, specify the `v` flag such as `--v=2`.
1217

1318
### Add logs
1419

1520
The [k8s instrumentation logging guidelines](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-instrumentation/logging.md) has the following definitions:
1621

17-
* `klog.V(0).InfoS` = `klog.InfoS` - Generally useful for this to **always** be visible to a cluster operator
18-
* `klog.V(1).InfoS` - A reasonable default log level if you don't want verbosity.
19-
* `klog.V(2).InfoS` - Useful steady state information about the service and important log messages that may correlate to significant changes in the system. This is the recommended default log level for most systems.
20-
* `klog.V(3).InfoS` - Extended information about changes
21-
* `klog.V(4).InfoS` - Debug level verbosity
22-
* `klog.V(5).InfoS` - Trace level verbosity
22+
- `logger.V(0).Info` = `logger.Info` - Generally useful for this to **always** be visible to a cluster operator
23+
- `logger.V(1).Info` - A reasonable default log level if you don't want verbosity.
24+
- `logger.V(2).Info` - Useful steady state information about the service and important log messages that may correlate to significant changes in the system. This is the recommended default log level for most systems.
25+
- `logger.V(3).Info` - Extended information about changes
26+
- `logger.V(4).Info` - Debug level verbosity
27+
- `logger.V(5).Info` - Trace level verbosity
2328

2429
We choose to simplify to the following 3 common levels.
30+
2531
```
2632
const(
2733
DEFAULT=2
@@ -33,34 +39,46 @@ const(
3339

3440
The guidelines are written in the context of a k8s controller. Our [ext-proc](../pkg/ext-proc/) does more things such as handling requests and scraping metrics, therefore we adapt the guidelines as follows:
3541

36-
1. The server startup process and configuration.
37-
* `klog.InfoS` Logging at the `V(0)` verbosity level is generally welcome here as this is only logged once at startup, and provides useful info for debugging.
42+
1. The server startup process and configuration.
43+
44+
- `logger.Info` Logging at the `V(0)` verbosity level is generally welcome here as this is only logged once at startup, and provides useful info for debugging.
3845

3946
2. Reconciler loops. The reconciler loops watch for CR changes such as the `InferenceModel` CR. And given changes in these CRs significantly affect the behavior of the extension, we recommend using v=1 verbosity level as default, and sparsely use higher verbosity levels.
40-
41-
* `klog.V(DEFAULT).InfoS`
42-
* Default log level in the reconcilers.
43-
* Information about config (listening on X, watching Y)
44-
* Errors that repeat frequently that relate to conditions that can be corrected (e.g., inference model not initialized yet)
45-
* System state changing (adding/removing objects in the data store)
46-
* `V(VERBOSE)` and above: Use your best judgement.
47+
48+
- `logger.V(DEFAULT)`
49+
- Default log level in the reconcilers.
50+
- Information about config (listening on X, watching Y)
51+
- Errors that repeat frequently that relate to conditions that can be corrected (e.g., inference model not initialized yet)
52+
- System state changing (adding/removing objects in the data store)
53+
- `logger.V(VERBOSE)` and above: Use your best judgement.
4754

4855
3. Inference request handling. These requests are expected to be much higher volume than the control flow in the reconcilers and therefore we should be mindful of log spamming. We recommend using v=2 to log important info about a request, such as the HTTP response code, and higher verbosity levels for less important info.
4956

50-
* `klog.V(DEFAULT).InfoS`
51-
* Logging the status code of an HTTP request
52-
* Important decision making such as picking the target model, target pod
53-
* `klog.V(VERBOSE).InfoS`
54-
* Detailed request scheduling algorithm operations, such as running the filtering logic
55-
* `V(DEBUG)` and above: Use your best judgement.
57+
- `logger.V(DEFAULT)`
58+
- Logging the status code of an HTTP request
59+
- Important decision making such as picking the target model, target pod
60+
- `logger.V(VERBOSE)`
61+
- Detailed request scheduling algorithm operations, such as running the filtering logic
62+
- `logger.V(DEBUG)` and above: Use your best judgement.
5663

5764
4. Metric scraping loops. These loops run at a very high frequency, and logs can be very spammy if not handled properly.
58-
* `klog.V(TRACE).InfoS`
59-
* Transient errors/warnings, such as failure to get response from a pod.
60-
* Important state changes, such as updating a metric.
6165

62-
5. Misc
66+
- `logger.V(TRACE)`
67+
- Transient errors/warnings, such as failure to get response from a pod.
68+
- Important state changes, such as updating a metric.
69+
70+
5. Misc
6371
1. Periodic (every 5s) debug loop which prints the current pods and metrics.
64-
* `klog.WarningS` If the metrics are not fresh enough, which indicates an error occurred during the metric scraping loop.
65-
* `klog.V(DEBUG).InfoS`
66-
* This is very important to debug the request scheduling algorithm, and yet not spammy compared to the metric scraping loop logs.
72+
- `logger.V(DEFAULT).Error` If the metrics are not fresh enough, which indicates an error occurred during the metric scraping loop.
73+
- `logger.V(DEBUG)`
74+
- This is very important to debug the request scheduling algorithm, and yet not spammy compared to the metric scraping loop logs.
75+
76+
### Passing Logger Around
77+
78+
You can pass around a `context.Context` that contains a logger or a `logr.Logger` instance directly.
79+
You need to make the call which one to use. Passing a `context.Context` is more standard,
80+
on the other hand you then need to call `log.FromContext` everywhere.
81+
82+
As `logger.V` calls are cummulative, i.e. `logger.V(2).V(3)` results in `logger.V(5)`,
83+
a logger should be passed around with no verbosity level set so that `logger.V(DEFAULT)`
84+
actually uses `DEFAULT` verbosity level.

go.mod

+1-1
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,7 @@ require (
88
github.com/bojand/ghz v0.120.0
99
github.com/elastic/crd-ref-docs v0.1.0
1010
github.com/envoyproxy/go-control-plane/envoy v1.32.4
11+
github.com/go-logr/logr v1.4.2
1112
github.com/google/go-cmp v0.6.0
1213
github.com/jhump/protoreflect v1.17.0
1314
github.com/onsi/ginkgo/v2 v2.22.2
@@ -61,7 +62,6 @@ require (
6162
github.com/felixge/httpsnoop v1.0.4 // indirect
6263
github.com/fsnotify/fsnotify v1.7.0 // indirect
6364
github.com/fxamacker/cbor/v2 v2.7.0 // indirect
64-
github.com/go-logr/logr v1.4.2 // indirect
6565
github.com/go-logr/stdr v1.2.2 // indirect
6666
github.com/go-logr/zapr v1.3.0 // indirect
6767
github.com/go-openapi/jsonpointer v0.21.0 // indirect

pkg/ext-proc/backend/datastore.go

+5-5
Original file line numberDiff line numberDiff line change
@@ -7,10 +7,11 @@ import (
77
"strconv"
88
"sync"
99

10+
"github.com/go-logr/logr"
1011
corev1 "k8s.io/api/core/v1"
1112
"k8s.io/apimachinery/pkg/labels"
12-
"k8s.io/klog/v2"
1313
"sigs.k8s.io/controller-runtime/pkg/client"
14+
"sigs.k8s.io/controller-runtime/pkg/log"
1415
"sigs.k8s.io/gateway-api-inference-extension/api/v1alpha1"
1516
logutil "sigs.k8s.io/gateway-api-inference-extension/pkg/ext-proc/util/logging"
1617
)
@@ -87,7 +88,7 @@ func (ds *K8sDatastore) HasSynced() bool {
8788
return ds.inferencePool != nil
8889
}
8990

90-
func RandomWeightedDraw(model *v1alpha1.InferenceModel, seed int64) string {
91+
func RandomWeightedDraw(logger logr.Logger, model *v1alpha1.InferenceModel, seed int64) string {
9192
var weights int32
9293

9394
source := rand.NewSource(rand.Int63())
@@ -98,7 +99,7 @@ func RandomWeightedDraw(model *v1alpha1.InferenceModel, seed int64) string {
9899
for _, model := range model.Spec.TargetModels {
99100
weights += *model.Weight
100101
}
101-
klog.V(logutil.VERBOSE).InfoS("Weights for model computed", "model", model.Name, "weights", weights)
102+
logger.V(logutil.TRACE).Info("Weights for model computed", "model", model.Name, "weights", weights)
102103
randomVal := r.Int31n(weights)
103104
for _, model := range model.Spec.TargetModels {
104105
if randomVal < *model.Weight {
@@ -128,7 +129,7 @@ func (ds *K8sDatastore) flushPodsAndRefetch(ctx context.Context, ctrlClient clie
128129
LabelSelector: selectorFromInferencePoolSelector(newServerPool.Spec.Selector),
129130
Namespace: newServerPool.Namespace,
130131
}); err != nil {
131-
klog.Error(err, "error listing clients")
132+
log.FromContext(ctx).V(logutil.DEFAULT).Error(err, "Failed to list clients")
132133
}
133134
ds.pods.Clear()
134135

@@ -139,7 +140,6 @@ func (ds *K8sDatastore) flushPodsAndRefetch(ctx context.Context, ctrlClient clie
139140
}
140141
ds.pods.Store(pod, true)
141142
}
142-
143143
}
144144

145145
func selectorFromInferencePoolSelector(selector map[v1alpha1.LabelKey]v1alpha1.LabelValue) labels.Selector {

pkg/ext-proc/backend/datastore_test.go

+3-1
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,7 @@ import (
55

66
v1 "k8s.io/apimachinery/pkg/apis/meta/v1"
77
"sigs.k8s.io/gateway-api-inference-extension/api/v1alpha1"
8+
logutil "sigs.k8s.io/gateway-api-inference-extension/pkg/ext-proc/util/logging"
89
)
910

1011
func TestHasSynced(t *testing.T) {
@@ -46,6 +47,7 @@ func TestHasSynced(t *testing.T) {
4647
}
4748

4849
func TestRandomWeightedDraw(t *testing.T) {
50+
logger := logutil.NewTestLogger()
4951
tests := []struct {
5052
name string
5153
model *v1alpha1.InferenceModel
@@ -118,7 +120,7 @@ func TestRandomWeightedDraw(t *testing.T) {
118120
for _, test := range tests {
119121
t.Run(test.name, func(t *testing.T) {
120122
for range 10000 {
121-
model := RandomWeightedDraw(test.model, seedVal)
123+
model := RandomWeightedDraw(logger, test.model, seedVal)
122124
if model != test.want {
123125
t.Errorf("Model returned!: %v", model)
124126
break

pkg/ext-proc/backend/fake.go

+2-2
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,7 @@ package backend
33
import (
44
"context"
55

6-
klog "k8s.io/klog/v2"
6+
"sigs.k8s.io/controller-runtime/pkg/log"
77
"sigs.k8s.io/gateway-api-inference-extension/api/v1alpha1"
88
logutil "sigs.k8s.io/gateway-api-inference-extension/pkg/ext-proc/util/logging"
99
)
@@ -17,7 +17,7 @@ func (f *FakePodMetricsClient) FetchMetrics(ctx context.Context, pod Pod, existi
1717
if err, ok := f.Err[pod]; ok {
1818
return nil, err
1919
}
20-
klog.V(logutil.VERBOSE).InfoS("Fetching metrics for pod", "pod", pod, "existing", existing, "new", f.Res[pod])
20+
log.FromContext(ctx).V(logutil.VERBOSE).Info("Fetching metrics for pod", "pod", pod, "existing", existing, "new", f.Res[pod])
2121
return f.Res[pod], nil
2222
}
2323

pkg/ext-proc/backend/inferencemodel_reconciler.go

+14-12
Original file line numberDiff line numberDiff line change
@@ -3,13 +3,14 @@ package backend
33
import (
44
"context"
55

6+
"github.com/go-logr/logr"
67
"k8s.io/apimachinery/pkg/api/errors"
78
"k8s.io/apimachinery/pkg/runtime"
89
"k8s.io/apimachinery/pkg/types"
910
"k8s.io/client-go/tools/record"
10-
"k8s.io/klog/v2"
1111
ctrl "sigs.k8s.io/controller-runtime"
1212
"sigs.k8s.io/controller-runtime/pkg/client"
13+
"sigs.k8s.io/controller-runtime/pkg/log"
1314
"sigs.k8s.io/gateway-api-inference-extension/api/v1alpha1"
1415
logutil "sigs.k8s.io/gateway-api-inference-extension/pkg/ext-proc/util/logging"
1516
)
@@ -27,38 +28,39 @@ func (c *InferenceModelReconciler) Reconcile(ctx context.Context, req ctrl.Reque
2728
return ctrl.Result{}, nil
2829
}
2930

30-
klogV := klog.V(logutil.DEFAULT)
31-
klogV.InfoS("Reconciling InferenceModel", "name", req.NamespacedName)
31+
logger := log.FromContext(ctx)
32+
loggerDefault := logger.V(logutil.DEFAULT)
33+
loggerDefault.Info("Reconciling InferenceModel", "name", req.NamespacedName)
3234

3335
infModel := &v1alpha1.InferenceModel{}
3436
if err := c.Get(ctx, req.NamespacedName, infModel); err != nil {
3537
if errors.IsNotFound(err) {
36-
klogV.InfoS("InferenceModel not found. Removing from datastore since object must be deleted", "name", req.NamespacedName)
38+
loggerDefault.Info("InferenceModel not found. Removing from datastore since object must be deleted", "name", req.NamespacedName)
3739
c.Datastore.InferenceModels.Delete(infModel.Spec.ModelName)
3840
return ctrl.Result{}, nil
3941
}
40-
klogV.ErrorS(err, "Unable to get InferenceModel", "name", req.NamespacedName)
42+
loggerDefault.Error(err, "Unable to get InferenceModel", "name", req.NamespacedName)
4143
return ctrl.Result{}, err
4244
} else if !infModel.DeletionTimestamp.IsZero() {
43-
klogV.InfoS("InferenceModel is marked for deletion. Removing from datastore", "name", req.NamespacedName)
45+
loggerDefault.Info("InferenceModel is marked for deletion. Removing from datastore", "name", req.NamespacedName)
4446
c.Datastore.InferenceModels.Delete(infModel.Spec.ModelName)
4547
return ctrl.Result{}, nil
4648
}
4749

48-
c.updateDatastore(infModel)
50+
c.updateDatastore(logger, infModel)
4951
return ctrl.Result{}, nil
5052
}
5153

52-
func (c *InferenceModelReconciler) updateDatastore(infModel *v1alpha1.InferenceModel) {
53-
klogV := klog.V(logutil.DEFAULT)
54+
func (c *InferenceModelReconciler) updateDatastore(logger logr.Logger, infModel *v1alpha1.InferenceModel) {
55+
loggerDefault := logger.V(logutil.DEFAULT)
5456

5557
if infModel.Spec.PoolRef.Name == c.PoolNamespacedName.Name {
56-
klogV.InfoS("Updating datastore", "poolRef", infModel.Spec.PoolRef, "serverPoolName", c.PoolNamespacedName)
57-
klogV.InfoS("Adding/Updating InferenceModel", "modelName", infModel.Spec.ModelName)
58+
loggerDefault.Info("Updating datastore", "poolRef", infModel.Spec.PoolRef, "serverPoolName", c.PoolNamespacedName)
59+
loggerDefault.Info("Adding/Updating InferenceModel", "modelName", infModel.Spec.ModelName)
5860
c.Datastore.InferenceModels.Store(infModel.Spec.ModelName, infModel)
5961
return
6062
}
61-
klogV.InfoS("Removing/Not adding InferenceModel", "modelName", infModel.Spec.ModelName)
63+
loggerDefault.Info("Removing/Not adding InferenceModel", "modelName", infModel.Spec.ModelName)
6264
// If we get here. The model is not relevant to this pool, remove.
6365
c.Datastore.InferenceModels.Delete(infModel.Spec.ModelName)
6466
}

pkg/ext-proc/backend/inferencemodel_reconciler_test.go

+4-1
Original file line numberDiff line numberDiff line change
@@ -13,6 +13,7 @@ import (
1313
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
1414
"k8s.io/apimachinery/pkg/types"
1515
"sigs.k8s.io/gateway-api-inference-extension/api/v1alpha1"
16+
logutil "sigs.k8s.io/gateway-api-inference-extension/pkg/ext-proc/util/logging"
1617
)
1718

1819
var (
@@ -46,6 +47,8 @@ var (
4647
)
4748

4849
func TestUpdateDatastore_InferenceModelReconciler(t *testing.T) {
50+
logger := logutil.NewTestLogger()
51+
4952
tests := []struct {
5053
name string
5154
datastore *K8sDatastore
@@ -135,7 +138,7 @@ func TestUpdateDatastore_InferenceModelReconciler(t *testing.T) {
135138
Datastore: test.datastore,
136139
PoolNamespacedName: types.NamespacedName{Name: test.datastore.inferencePool.Name},
137140
}
138-
reconciler.updateDatastore(test.incomingService)
141+
reconciler.updateDatastore(logger, test.incomingService)
139142

140143
if ok := mapsEqual(reconciler.Datastore.InferenceModels, test.wantInferenceModels); !ok {
141144
t.Error("Maps are not equal")

pkg/ext-proc/backend/inferencepool_reconciler.go

+11-7
Original file line numberDiff line numberDiff line change
@@ -4,12 +4,14 @@ import (
44
"context"
55
"reflect"
66

7+
"github.com/go-logr/logr"
78
"k8s.io/apimachinery/pkg/runtime"
89
"k8s.io/apimachinery/pkg/types"
910
"k8s.io/client-go/tools/record"
1011
klog "k8s.io/klog/v2"
1112
ctrl "sigs.k8s.io/controller-runtime"
1213
"sigs.k8s.io/controller-runtime/pkg/client"
14+
"sigs.k8s.io/controller-runtime/pkg/log"
1315
"sigs.k8s.io/gateway-api-inference-extension/api/v1alpha1"
1416
logutil "sigs.k8s.io/gateway-api-inference-extension/pkg/ext-proc/util/logging"
1517
)
@@ -29,29 +31,31 @@ func (c *InferencePoolReconciler) Reconcile(ctx context.Context, req ctrl.Reques
2931
if req.NamespacedName.Name != c.PoolNamespacedName.Name || req.NamespacedName.Namespace != c.PoolNamespacedName.Namespace {
3032
return ctrl.Result{}, nil
3133
}
32-
klogV := klog.V(logutil.DEFAULT)
33-
klogV.InfoS("Reconciling InferencePool", "name", req.NamespacedName)
34+
35+
logger := log.FromContext(ctx)
36+
loggerDefault := logger.V(logutil.DEFAULT)
37+
loggerDefault.Info("Reconciling InferencePool", "name", req.NamespacedName)
3438

3539
serverPool := &v1alpha1.InferencePool{}
3640
if err := c.Get(ctx, req.NamespacedName, serverPool); err != nil {
37-
klogV.ErrorS(err, "Unable to get InferencePool", "name", req.NamespacedName)
41+
loggerDefault.Error(err, "Unable to get InferencePool", "name", req.NamespacedName)
3842
return ctrl.Result{}, err
3943
}
4044
if c.Datastore.inferencePool == nil || !reflect.DeepEqual(serverPool.Spec.Selector, c.Datastore.inferencePool.Spec.Selector) {
41-
c.updateDatastore(serverPool)
45+
c.updateDatastore(logger, serverPool)
4246
c.Datastore.flushPodsAndRefetch(ctx, c.Client, serverPool)
4347
} else {
44-
c.updateDatastore(serverPool)
48+
c.updateDatastore(logger, serverPool)
4549
}
4650

4751
return ctrl.Result{}, nil
4852
}
4953

50-
func (c *InferencePoolReconciler) updateDatastore(serverPool *v1alpha1.InferencePool) {
54+
func (c *InferencePoolReconciler) updateDatastore(logger logr.Logger, serverPool *v1alpha1.InferencePool) {
5155
pool, _ := c.Datastore.getInferencePool()
5256
if pool == nil ||
5357
serverPool.ObjectMeta.ResourceVersion != pool.ObjectMeta.ResourceVersion {
54-
klog.V(logutil.DEFAULT).InfoS("Updating inference pool", "target", klog.KMetadata(&serverPool.ObjectMeta))
58+
logger.V(logutil.DEFAULT).Info("Updating inference pool", "target", klog.KMetadata(&serverPool.ObjectMeta))
5559
c.Datastore.setInferencePool(serverPool)
5660
}
5761
}

pkg/ext-proc/backend/inferencepool_reconciler_test.go

+4-1
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,7 @@ import (
66

77
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
88
"sigs.k8s.io/gateway-api-inference-extension/api/v1alpha1"
9+
logutil "sigs.k8s.io/gateway-api-inference-extension/pkg/ext-proc/util/logging"
910
)
1011

1112
var (
@@ -41,6 +42,8 @@ var (
4142
)
4243

4344
func TestUpdateDatastore_InferencePoolReconciler(t *testing.T) {
45+
logger := logutil.NewTestLogger()
46+
4447
tests := []struct {
4548
name string
4649
datastore *K8sDatastore
@@ -74,7 +77,7 @@ func TestUpdateDatastore_InferencePoolReconciler(t *testing.T) {
7477
for _, test := range tests {
7578
t.Run(test.name, func(t *testing.T) {
7679
inferencePoolReconciler := &InferencePoolReconciler{Datastore: test.datastore}
77-
inferencePoolReconciler.updateDatastore(test.incomingPool)
80+
inferencePoolReconciler.updateDatastore(logger, test.incomingPool)
7881

7982
gotPool := inferencePoolReconciler.Datastore.inferencePool
8083
if !reflect.DeepEqual(gotPool, test.wantPool) {

0 commit comments

Comments
 (0)