Commits (55 total; this view shows changes from 11 commits)
d34edd4
initial
ansoniS1 Jan 4, 2024
15c0403
remove Oauth2 settings from print_useful_settings
ansoniS1 Jan 10, 2024
57d50cc
bump otel libs to latest
ansoniS1 Jan 10, 2024
e3aca57
get rid of opentelemetry library
ansoniS1 Jan 24, 2024
bc0b572
add requirement
ansoniS1 Jan 24, 2024
22f3bb9
update requirements
ansoniS1 Jan 24, 2024
5c3f188
change import
ansoniS1 Jan 24, 2024
00116f6
update generated protobuf OTEL files
ansoniS1 Jan 24, 2024
069ef52
update import
ansoniS1 Jan 24, 2024
19eca12
try alternative import style
ansoniS1 Jan 24, 2024
ae430bb
move import
ansoniS1 Jan 25, 2024
d9c9779
Integration tests and kuberentes_monitor / k8s implementation
alesnovak-s1 Jan 26, 2024
9462017
Prefix oauth configurations
ansoniS1 Jan 29, 2024
1ffd6a2
rename methods to use oauth_ prefix
ansoniS1 Jan 29, 2024
1884372
Re-factor client methods to util
ansoniS1 Jan 29, 2024
0217af5
first test
ansoniS1 Jan 29, 2024
8b163f0
update reference to ssl
ansoniS1 Jan 29, 2024
a4bf7e1
remove `__` from __receive_response_status
ansoniS1 Jan 29, 2024
8431a88
DTIN-3315 (#1223)
alesnovak-s1 Jan 8, 2024
5fe0df4
Removing unused queue import, which is not supported in python2. conc…
alesnovak-s1 Jan 10, 2024
1e2c4ce
Project import generated by Copybara. (#1232)
alesnovak-s1 Jan 11, 2024
e17c449
Fixing windows unit tests (#1234)
alesnovak-s1 Jan 17, 2024
447682d
EventID Parsing (#1235)
alesnovak-s1 Jan 18, 2024
c5246b5
DTIN-2346: Set monitor attribute when message_log_template is used (#…
jmakar-s1 Jan 22, 2024
b366b54
Integration tests and kuberentes_monitor / k8s implementation
alesnovak-s1 Jan 26, 2024
238c883
Update new LogWatcher `add_log_config` signature into unit-tests
ansoniS1 Jan 29, 2024
7c56dde
Merge branch 'master' into otlp_2
ansoniS1 Jan 29, 2024
393f6e2
Revert "EventID Parsing (#1235)"
ansoniS1 Jan 30, 2024
5e228ea
Revert "Fixing windows unit tests (#1234)"
ansoniS1 Jan 30, 2024
ef62d8f
Revert "Project import generated by Copybara. (#1232)"
ansoniS1 Jan 30, 2024
e586a03
Revert "Removing unused queue import, which is not supported in pytho…
ansoniS1 Jan 30, 2024
c39a1b4
revert all the changes (very messy)
ansoniS1 Jan 30, 2024
b709d12
revert all the changes (very messy)
ansoniS1 Jan 30, 2024
1036729
Enable OAuth/OTLP configs to be EnvAware
ansoniS1 Jan 30, 2024
adea362
add opentelemetry thirdparty to our docker container
ansoniS1 Jan 30, 2024
116086b
add reference worker_config server_url
ansoniS1 Jan 30, 2024
4c26218
fix worker config reference
ansoniS1 Jan 30, 2024
a9042ab
fix flow with selecting server_url
ansoniS1 Jan 31, 2024
42e35ce
remove requirement for scalyr_server to be set (server_url is adequate)
ansoniS1 Feb 13, 2024
08c913a
Handle implicit ports
ansoniS1 Feb 13, 2024
a4e986e
allow server_url or scalyr_server for setting otlp endpoint
ansoniS1 Feb 15, 2024
e20f0dc
add debug
ansoniS1 Feb 15, 2024
a615853
implement some statistics
ansoniS1 Feb 16, 2024
05dbfd0
add basic auth support
ansoniS1 Feb 27, 2024
6362d2b
fix header typo
ansoniS1 Feb 27, 2024
81278d9
add rename_logfile support
ansoniS1 Mar 11, 2024
b851c34
likely fix the rename_logfile support
ansoniS1 Mar 18, 2024
b7d7a30
Merge remote-tracking branch 'origin/master' into otlp_2
ansoniS1 Mar 18, 2024
e4a076e
Merge remote-tracking branch 'origin/master' into otlp_2
ansoniS1 Mar 21, 2024
af71d6c
add opentelemetry as thirdparty in aio
ansoniS1 Mar 26, 2024
af50ba2
fix handling of session_info and log/thread
ansoniS1 Mar 26, 2024
75334ef
Fix order of attribute encoding
ansoniS1 Mar 27, 2024
300584f
quick test
ansoniS1 Apr 2, 2024
62dc253
add log and thread last
ansoniS1 Apr 2, 2024
d627f14
avoid ambiguous resolution between event and session variables
ansoniS1 Apr 4, 2024
3 changes: 3 additions & 0 deletions dev-requirements-new.txt
@@ -12,6 +12,9 @@ python-dateutil==2.8.2
repoze.lru==0.7
six==1.14.0

# Required for opentelemetry support.
protobuf==3.17.3

# Required for redis monitor.
redis==2.10.5

78 changes: 78 additions & 0 deletions scalyr_agent/client_auth.py
@@ -0,0 +1,78 @@
import datetime
import json
import urllib
from base64 import b64encode

from urllib.parse import urlparse

import requests


class ClientAuth(object):
    def __init__(self, configuration, headers):
        self.configuration = configuration
        self.headers = headers
        if self.configuration.auth == "oauth2":
            self.auth = OAuth2(self.configuration, self.headers)
        elif self.configuration.auth == "bearer":
            self.auth = BearerToken(self.configuration, self.headers)
        else:
            self.auth = NoAuth(self.configuration, self.headers)

    def authenticate(self):
        return self.auth.authenticate()

class NoAuth(object):
    def __init__(self, configuration, headers):
        self.headers = headers

    def authenticate(self):
        # Add a header just for traceability
        self.headers["x-no-auth"] = "true"
        return True

# Simple Authorization header as a Bearer token using the `api_key` configuration
class BearerToken(object):
    def __init__(self, configuration, headers):
        # The other classes in this file assign to headers by key, so do the same here
        headers["Authorization"] = "Bearer " + configuration.api_key

    def authenticate(self):
        return True

# Implements the https://datatracker.ietf.org/doc/html/rfc6749#section-4.4 flow
class OAuth2(object):
Review comment (Contributor):

Isn't there a Python library for OAuth2 that would avoid having an in-house implementation?

Reply:

The Python 2 requirement paints us into some really weird corners, such as having to stay on older library versions that open up security risks. An in-house implementation seems like the path of least resistance. I originally used the OpenTelemetry Python SDK but had to back out of it because it has no Python 2 support.

    def __init__(self, configuration, headers):
        self.headers = headers  # Headers modified for external requests
        self.client_id = configuration.client_id
        self.client_secret = configuration.client_secret
        self.token_url = configuration.token_url
        scopes = " ".join(configuration.scopes)
        # Payload body for the token exchange
        self.auth_request = "grant_type=client_credentials&scope=" + urllib.parse.quote_plus(scopes)
        # Our token to use in requests
        self.token = None
        # When the token expires
        self.expiry_time = datetime.datetime.now()
        self.verify_ssl = configuration.verify_server_certificate
        # Authentication headers for the token exchange
        authorization_header_value = b64encode(
            (self.client_id + ":" + self.client_secret).encode("utf-8")
        ).decode("utf-8")
        self.auth_headers = {
            "Content-Type": "application/x-www-form-urlencoded",
            "Authorization": "Basic " + authorization_header_value,
        }

    def authenticate(self):
        if self.token is None or self.expiry_time < datetime.datetime.now():
            print("Request/Refresh OAuth2 Token")
            if not self.refresh_token():
                raise Exception("OAuth2: Unable to refresh token")
        self.headers["Authorization"] = "Bearer " + self.token
        return True

    def refresh_token(self):
        resp = requests.post(self.token_url, data=self.auth_request, headers=self.auth_headers, verify=self.verify_ssl)
        if resp.status_code == 200:
            auth_response = json.loads(resp.content)
            self.token = auth_response["access_token"]
            self.expiry_time = datetime.datetime.now() + datetime.timedelta(seconds=auth_response["expires_in"])
            return True
        # Any other status is an error, so no fall-through return is needed
        raise Exception("Unable to obtain OAuth2 Token: " + str(resp.status_code) + "(" + str(resp.content) + ")")
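
The client-credentials exchange that `OAuth2` performs (RFC 6749 section 4.4) can be sketched without any network access. The sketch below only builds the request pieces and tracks expiry. `build_token_request`, `TokenState`, and the sample credentials and scopes are hypothetical names chosen for illustration, not part of this PR, and the import assumes Python 3 (on Python 2 the same function is available through `six.moves.urllib.parse`).

```python
import datetime
from base64 import b64encode
from urllib.parse import quote_plus  # Python 3 only; an assumption of this sketch


def build_token_request(client_id, client_secret, scopes):
    """Return (headers, body) for a client-credentials token request."""
    credentials = (client_id + ":" + client_secret).encode("utf-8")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Authorization": "Basic " + b64encode(credentials).decode("utf-8"),
    }
    body = "grant_type=client_credentials&scope=" + quote_plus(" ".join(scopes))
    return headers, body


class TokenState(object):
    """Hypothetical holder for an access token and its expiry time."""

    def __init__(self):
        self.token = None
        self.expiry_time = datetime.datetime.now()

    def needs_refresh(self, now=None):
        now = now or datetime.datetime.now()
        return self.token is None or self.expiry_time < now

    def store(self, auth_response, now=None):
        # auth_response is the parsed JSON body of a successful token response.
        now = now or datetime.datetime.now()
        self.token = auth_response["access_token"]
        self.expiry_time = now + datetime.timedelta(seconds=auth_response["expires_in"])


headers, body = build_token_request("my-client", "my-secret", ["logs.write", "logs.read"])
state = TokenState()
state.store({"access_token": "abc123", "expires_in": 3600})
```

Keeping the request construction separate from the HTTP call makes the encoding of credentials and scopes easy to check in a unit test, which may matter for an in-house implementation like this one.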
58 changes: 58 additions & 0 deletions scalyr_agent/configuration.py
@@ -765,6 +765,8 @@ def print_useful_settings(self, other_config=None):
            "default_sessions_per_worker",
            "default_worker_session_status_message_interval",
            "enable_worker_session_process_metrics_gather",
            "server_url",
            "transport",
            # NOTE: It's important we use sanitized_ version of this method which masks the API key
            "sanitized_worker_configs",
        ]
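
Every entry in a list like this one needs a comma after it. Python joins adjacent string literals, so `"server_url"` followed directly by `"transport"` would silently become a single entry. A small illustration (the variable names are made up for the example):

```python
# Adjacent string literals are concatenated, so a missing comma
# merges two list entries into one without raising any error.
with_missing_comma = [
    "enable_worker_session_process_metrics_gather",
    "server_url"
    "transport",
]

with_comma = [
    "enable_worker_session_process_metrics_gather",
    "server_url",
    "transport",
]
```

In the first list the second element is the single string `server_urltransport`, which would not match any real setting name.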
@@ -1497,6 +1499,41 @@ def api_key(self):
        """Returns the configuration value for 'api_key'."""
        return self.__get_config().get_string("api_key")

    @property
    def server_url(self):
        """Returns the configuration value for 'server_url'."""
        return self.__get_config().get_string("server_url")

    @property
    def auth(self):
        """Returns the configuration value for 'auth'."""
        return self.__get_config().get_string("auth")

    @property
    def client_id(self):
Review comment (Contributor):

Please move all oauth2 configuration keys under a new oauth2 key:
oauth2:
    client_id:
    client_secret:
    ....

Reply:

I went with an oauth_ prefix instead, which follows the current pattern used for Kubernetes keys (k8s_).
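
To make the trade-off concrete: with a flat prefix, the related settings can still be gathered into one group when needed. This is only a sketch. `collect_prefixed` and every key name in `flat_config` are hypothetical and are not taken from the agent's configuration schema.

```python
def collect_prefixed(config, prefix):
    """Return entries whose keys start with prefix, with the prefix stripped."""
    return {
        key[len(prefix):]: value
        for key, value in config.items()
        if key.startswith(prefix)
    }


# Hypothetical flat configuration; the key names are made up for illustration.
flat_config = {
    "api_key": "example-key",
    "oauth_client_id": "my-client",
    "oauth_token_url": "https://auth.example.com/token",
    "k8s_example_option": "value",
}

oauth_settings = collect_prefixed(flat_config, "oauth_")
```

A nested `oauth2` object would give the same grouping directly, at the cost of departing from the flat style of the existing keys.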

        """Returns the configuration value for 'client_id'."""
        return self.__get_config().get_string("client_id")

    @property
    def client_secret(self):
        """Returns the configuration value for 'client_secret'."""
        return self.__get_config().get_string("client_secret")

    @property
    def token_url(self):
        """Returns the configuration value for 'token_url'."""
        return self.__get_config().get_string("token_url")

    @property
    def scopes(self):
        """Returns the configuration value for 'scopes'."""
        return self.__get_config().get_json_array("scopes")

    @property
    def transport(self):
        """Returns the configuration value for 'transport'."""
        return self.__get_config().get_string("transport")

    @property
    def scalyr_server(self):
        """Returns the configuration value for 'scalyr_server'."""
@@ -2826,6 +2863,27 @@ def __verify_main_config_and_apply_defaults(
            apply_defaults,
            env_aware=True,
        )
        self.__verify_or_set_optional_string(
            config,
            "transport",
            "",
            description,
            apply_defaults
        )
        self.__verify_or_set_optional_string(
            config,
            "server_url",
            "",
            description,
            apply_defaults
        )
        self.__verify_or_set_optional_string(
            config,
            "auth",
            "",
            description,
            apply_defaults
        )
        self.__verify_or_set_optional_bool(
            config,
            "use_new_ingestion",
11 changes: 9 additions & 2 deletions scalyr_agent/copying_manager/worker.py
@@ -212,7 +212,8 @@ def __init__(self, add_events_request, completion_callback):
        # If there is a AddEventsTask object already created for the next request due to pipelining, this is set to it.
        # This must be the next request if this request is successful, otherwise, we will lose bytes.
        self.next_pipelined_task = None

        # last status
        self.__receive_response_status = ()

class CopyingManagerWorkerSessionInterface(six.with_metaclass(ABCMeta)):
    """
@@ -354,6 +355,7 @@ def __init__(self, configuration, worker_config_entry, session_id, is_daemon=Fal

        # The current pending AddEventsTask. We will retry the contained AddEventsRequest several times.
        self.__pending_add_events_task = None
        self.__receive_response_status = {}
Review comment (Contributor):

Where is this private field used?

Reply:

It is used in self.__pending_add_events_task.__receive_response_status.

I added it while trying to resolve this error (it doesn't resolve it):

2024-01-26 17:17:51.891Z ERROR [core] [scalyr_agent.scalyr_logging:707] Failed while attempting to scan and transmit logs :stack_trace:
  Traceback (most recent call last):
    File "/Users/anthonyj/_Scalyr/scalyr-agent-2/scalyr_agent/copying_manager/worker.py", line 500, in run
      "Repeatedly failed to parse response due to exception.  Dropping events",
  AttributeError: 'AddEventsTask' object has no attribute '_CopyingManagerWorkerSession__receive_response_status'

I defined the attribute in the hope that this was just a sequencing issue in the way the scalyr_client works, but that didn't fix it. The error happens when we start to hit publishing delays. The easiest way for me to reproduce it is to debug the agent, pause execution for X minutes, and then resume the agent.

Since this is experimental, I'm comfortable shipping this bug to the customer so they can provide feedback on the feature.

Review comment (Contributor):

I meant the second one, on line 358. I believe that's a mistake.
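
The `AttributeError` in the traceback above is consistent with Python's name mangling of double-underscore attributes: inside a class body, `obj.__name` is rewritten to `obj._ClassName__name` using the class where the code is written, not the class of `obj`. The stand-in classes below reuse the names from the diff but are simplified for illustration and are not the agent's real implementations.

```python
class AddEventsTask(object):
    def __init__(self):
        # Written inside AddEventsTask, so this is stored as
        # _AddEventsTask__receive_response_status.
        self.__receive_response_status = ()


class CopyingManagerWorkerSession(object):
    def __init__(self):
        self.__pending_add_events_task = AddEventsTask()

    def read_status(self):
        # Written inside CopyingManagerWorkerSession, so the lookup becomes
        # _CopyingManagerWorkerSession__receive_response_status, which the
        # task object does not have.
        return self.__pending_add_events_task.__receive_response_status


session = CopyingManagerWorkerSession()
try:
    session.read_status()
    error_message = None
except AttributeError as e:
    error_message = str(e)

task = AddEventsTask()
```

Under this reading, defining the same double-underscore attribute on the session object cannot help, because the lookup happens on the task. An attribute with a single leading underscore (or none) is not mangled and can be read across classes.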


        # The next LogFileProcessor that should have log lines read from it for transmission.
        self.__current_processor = 0
@@ -489,6 +491,7 @@ def run(self):
                    # on the ground and advance.
                    if current_time - last_success > self.__config.max_retry_time:
                        if self.__pending_add_events_task is not None:

                            if (
                                "parseResponseFailed"
                                in self.__pending_add_events_task.__receive_response_status
@@ -1226,8 +1229,12 @@ def _init_scalyr_client(self, quiet=False):
        """
        api_key = self.__worker_config_entry["api_key"]
        if self.__config.use_new_ingestion:

            self.__new_scalyr_client = create_new_client(self.__config, api_key=api_key)
        elif self.__config.transport == "otlp":
            from scalyr_agent.otlp_client import (
                create_otlp_client,
            )
            self.__scalyr_client = create_otlp_client(self.__config, self.__worker_config_entry)
        else:
            self.__scalyr_client = create_client(
                self.__config,