Skip to content

Conversation

AndreKurait
Copy link
Member

@AndreKurait AndreKurait commented Oct 2, 2024

Description

This change includes several bug fixes and enhancements for request and tuple transformations.

  1. Fixed potential memory leaks during transformations when Binary ByteBuf was modified or replaced.
    • previously, transformations were responsible for releasing the Binary ByteBuf if replaced, now it's wrapped in a Unpooled.unreleaseable buffer beforehand and unwrapped afterwards
  2. Added support during transformations for replacing Binary ByteBuf with a string making Jolt transforms possible to modify this field.
  3. Added support during transformations parsing/serializing for inlinedTextBody, that is, if the content type is text/* and the payload is successfully encoded in utf-8. inlinedTextBody key will be populated with the payload value as a string for transformations accessing the body (or tuples as described below)
  4. Added Transformation support for tuples in a similar paradigm for request transformations aligning transformation input (or output when no transformations are provided) with the expected in input for request transformations.
    • This means JSON payloads and perhaps plain text in the near future will not be Base64 encoded in the tuples and "human readable" scripts aren't needed.

See Example Tuples:

See example tuples:
[
  {
    "sourceRequest": {
      "Host": [
        "localhost:9200"
      ],
      "Authorization": [
        "Basic YWRtaW46YWRtaW4="
      ],
      "User-Agent": [
        "curl/8.7.1"
      ],
      "Accept": [
        "*/*"
      ],
      "Request-URI": "/",
      "Method": "GET",
      "HTTP-Version": "HTTP/1.1",
      "payload": {
        "inlinedBase64Body": ""
      }
    },
    "sourceResponse": {
      "content-type": [
        "application/json; charset=UTF-8"
      ],
      "content-length": [
        "538"
      ],
      "HTTP-Version": "HTTP/1.1",
      "Status-Code": 200,
      "Reason-Phrase": "OK",
      "response_time_ms": 10,
      "payload": {
        "inlinedJsonBody": {
          "name": "2383a194365a",
          "cluster_name": "docker-cluster",
          "cluster_uuid": "fhZZvFEiS92srLRLvGXKrA",
          "version": {
            "number": "7.10.2",
            "build_flavor": "oss",
            "build_type": "docker",
            "build_hash": "747e1cc71def077253878a59143c1f785afa92b9",
            "build_date": "2021-01-13T00:42:12.435326Z",
            "build_snapshot": false,
            "lucene_version": "8.7.0",
            "minimum_wire_compatibility_version": "6.8.0",
            "minimum_index_compatibility_version": "6.0.0-beta1"
          },
          "tagline": "You Know, for Search"
        }
      }
    },
    "targetRequest": {
      "Host": [
        "opensearchtarget"
      ],
      "Authorization": [
        "Basic YWRtaW46bXlTdHJvbmdQYXNzd29yZDEyMyE="
      ],
      "User-Agent": [
        "curl/8.7.1"
      ],
      "Accept": [
        "*/*"
      ],
      "Request-URI": "/",
      "Method": "GET",
      "HTTP-Version": "HTTP/1.1",
      "payload": {
        "inlinedBase64Body": ""
      }
    },
    "targetResponses": [
      {
        "content-type": [
          "application/json; charset=UTF-8"
        ],
        "content-length": [
          "568"
        ],
        "HTTP-Version": "HTTP/1.1",
        "Status-Code": 200,
        "Reason-Phrase": "OK",
        "response_time_ms": 112,
        "payload": {
          "inlinedJsonBody": {
            "name": "758b4454da60",
            "cluster_name": "docker-cluster",
            "cluster_uuid": "Uu3orSZ-Tie1Jnq7p-GrVw",
            "version": {
              "distribution": "opensearch",
              "number": "2.15.0",
              "build_type": "tar",
              "build_hash": "61dbcd0795c9bfe9b81e5762175414bc38bbcadf",
              "build_date": "2024-06-20T03:27:32.562036890Z",
              "build_snapshot": false,
              "lucene_version": "9.10.0",
              "minimum_wire_compatibility_version": "7.10.0",
              "minimum_index_compatibility_version": "7.0.0"
            },
            "tagline": "The OpenSearch Project: https://opensearch.org/"
          }
        }
      }
    ],
    "connectionId": "0242acfffe12000b-0000000a-0000000f-d1aa22e30e1211a4-eba39e55.0",
    "numRequests": 1,
    "numErrors": 0
  },
  {
    "sourceRequest": {
      "Host": [
        "localhost:9200"
      ],
      "Authorization": [
        "Basic YWRtaW46YWRtaW4="
      ],
      "User-Agent": [
        "curl/8.7.1"
      ],
      "Accept": [
        "*/*"
      ],
      "Request-URI": "/_cat/indices",
      "Method": "GET",
      "HTTP-Version": "HTTP/1.1",
      "payload": {
        "inlinedBase64Body": ""
      }
    },
    "sourceResponse": {
      "content-type": [
        "text/plain; charset=UTF-8"
      ],
      "content-length": [
        "162"
      ],
      "HTTP-Version": "HTTP/1.1",
      "Status-Code": 200,
      "Reason-Phrase": "OK",
      "response_time_ms": 12,
      "payload": {
        "inlinedTextBody": "green  open searchguard             KRWtFn0nQwi6BdObOAsCYQ 1 0 8 0 45.4kb 45.4kb\nyellow open sg7-auditlog-2024.10.04 F2PV5IeTT0aVxP_BmuJSaQ 1 1 4 0 57.8kb 57.8kb\n"
      }
    },
    "targetRequest": {
      "Host": [
        "opensearchtarget"
      ],
      "Authorization": [
        "Basic YWRtaW46bXlTdHJvbmdQYXNzd29yZDEyMyE="
      ],
      "User-Agent": [
        "curl/8.7.1"
      ],
      "Accept": [
        "*/*"
      ],
      "Request-URI": "/_cat/indices",
      "Method": "GET",
      "HTTP-Version": "HTTP/1.1",
      "payload": {
        "inlinedBase64Body": ""
      }
    },
    "targetResponses": [
      {
        "content-type": [
          "text/plain; charset=UTF-8"
        ],
        "content-length": [
          "249"
        ],
        "HTTP-Version": "HTTP/1.1",
        "Status-Code": 200,
        "Reason-Phrase": "OK",
        "response_time_ms": 39,
        "payload": {
          "inlinedTextBody": "green open .plugins-ml-config        hfn4ZxQCQvOe0BN5TejuJA 1 0  1 0  3.9kb  3.9kb\ngreen open .opensearch-observability 7LKST3UWQDiNZyG3rnSqxg 1 0  0 0   208b   208b\ngreen open .opendistro_security      AvLAB1yDR4uk8PEQ212Yvg 1 0 10 0 78.3kb 78.3kb\n"
        }
      }
    ],
    "connectionId": "0242acfffe12000b-0000000a-00000011-657b97d8be126192-72df00be.0",
    "numRequests": 1,
    "numErrors": 0
  },
  • Category: New Feature, bug fix
  • Why these changes are required? Transformation support for tuples, easier transformations with plain text requests
  • What is the old behavior before changes and new behavior after changes? see above

Issues Resolved

MIGRATIONS-1593

Is this a backport? If so, please add backport PR # and/or commits #

Testing

Unit tests added for cases, some tuple transformations passed in docker startup manually testing e2e workflow.

Check List

  • New functionality includes testing
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link

codecov bot commented Oct 2, 2024

Codecov Report

Attention: Patch coverage is 69.29348% with 113 lines in your changes missing coverage. Please review.

Project coverage is 80.28%. Comparing base (376180d) to head (1d2fd5c).
Report is 21 commits behind head on main.

Files with missing lines Patch % Lines
.../opensearch/migrations/replay/TrafficReplayer.java 0.00% 36 Missing ⚠️
...datahandlers/http/NettyJsonBodyConvertHandler.java 36.73% 25 Missing and 6 partials ⚠️
...ahandlers/http/NettyJsonBodyAccumulateHandler.java 58.33% 13 Missing and 2 partials ⚠️
...h/migrations/replay/ParsedHttpMessagesAsDicts.java 73.80% 7 Missing and 4 partials ⚠️
...DecodedHttpRequestPreliminaryTransformHandler.java 88.52% 1 Missing and 6 partials ⚠️
...tahandlers/http/NettyJsonBodySerializeHandler.java 63.63% 1 Missing and 3 partials ⚠️
...arch/migrations/replay/tracing/ReplayContexts.java 73.33% 4 Missing ⚠️
...lers/http/StrictCaseInsensitiveHttpHeadersMap.java 0.00% 3 Missing ⚠️
...rs/http/NettyDecodedHttpRequestConvertHandler.java 97.05% 0 Missing and 1 partial ⚠️
...s/http/NettyDecodedHttpResponseConvertHandler.java 97.05% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #1044      +/-   ##
============================================
- Coverage     80.59%   80.28%   -0.32%     
- Complexity     2736     2873     +137     
============================================
  Files           365      383      +18     
  Lines         13653    14333     +680     
  Branches        941      988      +47     
============================================
+ Hits          11004    11507     +503     
- Misses         2069     2234     +165     
- Partials        580      592      +12     
Flag Coverage Δ
gradle-test 78.32% <69.29%> (-0.27%) ⬇️
python-test 90.24% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@AndreKurait AndreKurait force-pushed the TupleTransforms branch 10 times, most recently from e0b1a7b to e7f6db0 Compare October 5, 2024 19:54
@AndreKurait AndreKurait marked this pull request as ready for review October 5, 2024 20:34
var expectedBytes = (hostTransformation)
? new String(testBytes, StandardCharsets.UTF_8).replace("foo.example", "bar.example")
.getBytes(StandardCharsets.UTF_8)
? replaceBytes(testBytes,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This enables us to test with gzipped requests without messing up the body bytes by encoding in utf8

this.httpTransactionContext = httpTransactionContext;
}

public ListKeyAdaptingCaseInsensitiveHeadersMap clone(ListKeyAdaptingCaseInsensitiveHeadersMap original) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect that the fix for the gzip decompression bug should require us to pass the netty HTTP Request object through this handler. Since this handler is only used in one spot, I'd prefer to see it just use an original copy of the headers rather than go through any translation. Header translations are extremely sensitive. By writing a clone, we risk doing a normalization that makes differences dissolve away.

Anyway, there are existing bugs in the request transformation pipeline. There's a slight bit more risk here, but I'm fine reevaluating the risk when we tackle the other pipeline bugs. No change is required here to move this tuple transformation change closer to the approval.

"HTTP-Version": {
"keepAliveDefault": true
$ cat /shared-logs-output/traffic-replayer-default/*/tuples/tuples.log | jq
[
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this right? I thought that it was ndjson in and jq would just keep printing objects at the top-level.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct, i'll update

map.put(STATUS_CODE_KEY, Integer.parseInt(message.code()));
map.put("Reason-Phrase", message.reason());
map.put(RESPONSE_TIME_MS_KEY, latency.toMillis());
context.setHttpVersion(message.protocol());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: why did you move this? There are potentially 2 reasons to have this earlier. 1) if there's an exception within this code, context's side effects will 'escape the stack' and map won't. 2) tightening the calls to protocol() has an ever so-negligible improvement on cache-hit likelihood.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought you left an earlier comment to move this to the bottom #1044 (comment)

}

/**
* Writes a tuple object to an output stream as a JSON object.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this what jackson does? If not, this would be a good comment above too.

Comment on lines +345 to +347
"--" + params.getTransformerConfigParameterArgPrefix() + "transformer-config-base64" + ", " +
"--" + params.getTransformerConfigParameterArgPrefix() + "transformer-config" + ", or " +
"--" + params.getTransformerConfigParameterArgPrefix() + "transformer-config-file" + ".");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: a helper function would make this more readable

public class HttpJsonMessageWithFaultingPayload extends LinkedHashMap<String, Object> {

public static final String MESSAGE_SCHEMA_VERSION = "1.0";
public static final String MESSAGE_SCHEMA_VERSION = "2.0";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we want the .0 at all. If this were a flat integer, it probably makes a lot of things simpler (if > 14, etc). Is there any value in bifurcating this schema?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move to an int

Comment on lines +120 to +121
// Touch for easier debugging if later leak detected
// e.g. if received unreleasable ByteBuf
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Playing w/ netty & even looking at touch backtraces for years & this comment makes me understand when I would want to use touch(). Thanks!

@AndreKurait AndreKurait merged commit 823fe07 into opensearch-project:main Oct 15, 2024
13 of 14 checks passed
@AndreKurait AndreKurait changed the title Add Transformation support for tuples Replayer - Add Transformation support for Tuples Sep 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants