Skip to content

Simplify XMPP element metrics #4520

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 9 commits into
base: master
Choose a base branch
from
Draft

Conversation

chrzaszcz
Copy link
Member

@chrzaszcz chrzaszcz commented Apr 11, 2025

The main goal is to simplify and unify events for XMPP (XML) elements. The main benefit is that unified metrics become easier to understand, maintain, test and document.

Before, there would be pairs of events like c2s_element_in and xmpp_element_size_in. After, there is only one event: xmpp_element_in. The complete list is shown below:

  1. c2s_element_in (labels: host_type) and xmpp_element_size_in (labels: connection_type=c2s) are merged to xmpp_element_in (labels: host_type, connection_type=c2s).
  2. c2s_element_out (labels: host_type) and xmpp_element_size_out (labels: connection_type=c2s) are merged to xmpp_element_out (labels: host_type, connection_type=c2s).
  3. s2s_element_in and xmpp_element_size_in (labels: connection_type=s2s) are merged to xmpp_element_in (labels: host_type, connection_type=s2s).
  4. s2s_element_out and xmpp_element_size_out (labels: connection_type=s2s) are merged to xmpp_element_out (labels: host_type, connection_type=s2s).
  5. component_element_in and xmpp_element_size_in (labels: connection_type=component) are merged to xmpp_element_in (labels: host_type="", connection_type=component).
  6. component_element_out and xmpp_element_size_out (labels: connection_type=component) are merged to xmpp_element_out (labels: host_type="", connection_type=component).

Notes:

  • Ad 1, 2: c2s_element_* were missing if host type was unknown (e.g. in case of early stream errors). Now it is no longer the case.
  • Size-related events and all s2s/component events didn't include host types. Now they do if host type is known.
  • If host type is unknown, it is set to an empty binary in the labels. Prometheus sees this as an empty label value, and allows to e.g. display graphs of all events without host types (useful for debugging; I tested this manually). For exometer, we just put global as the prefix in such cases.

Left for the subsequent PR's:

  • Document all the XMPP metrics, updating previous docs, and filling the gaps (only c2s events were documented).

@mongoose-im

This comment was marked as outdated.

Copy link

codecov bot commented Apr 11, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 85.88%. Comparing base (e89f360) to head (a631cef).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4520      +/-   ##
==========================================
+ Coverage   85.86%   85.88%   +0.02%     
==========================================
  Files         565      565              
  Lines       33803    33795       -8     
==========================================
  Hits        29025    29025              
+ Misses       4778     4770       -8     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@chrzaszcz chrzaszcz force-pushed the simplify-element-metrics branch from 9923835 to 84d7d06 Compare April 11, 2025 12:19
@mongoose-im

This comment was marked as outdated.

@chrzaszcz chrzaszcz force-pushed the simplify-element-metrics branch from 84d7d06 to 9d1a993 Compare April 11, 2025 16:02
@mongoose-im

This comment was marked as outdated.

@chrzaszcz chrzaszcz force-pushed the simplify-element-metrics branch from 9d1a993 to 4540c4f Compare April 11, 2025 16:19
@mongoose-im

This comment was marked as outdated.

@chrzaszcz chrzaszcz force-pushed the simplify-element-metrics branch from 4540c4f to f39f613 Compare April 14, 2025 14:19
@mongoose-im

This comment was marked as outdated.

@chrzaszcz chrzaszcz force-pushed the simplify-element-metrics branch from f39f613 to 262ddac Compare April 16, 2025 07:29
@mongoose-im

This comment was marked as outdated.

@mongoose-im
Copy link
Collaborator

mongoose-im commented Apr 16, 2025

elasticsearch_and_cassandra_27 / elasticsearch_and_cassandra_mnesia / 7da5aab
Reports root/ big
OK: 683 / Failed: 0 / User-skipped: 72 / Auto-skipped: 0


small_tests_27 / small_tests / 7da5aab
Reports root / small


small_tests_26 / small_tests / 7da5aab
Reports root / small


small_tests_27_arm64 / small_tests / 7da5aab
Reports root / small


ldap_mnesia_26 / ldap_mnesia / 7da5aab
Reports root/ big
OK: 2332 / Failed: 0 / User-skipped: 1187 / Auto-skipped: 0


ldap_mnesia_27 / ldap_mnesia / 7da5aab
Reports root/ big
OK: 2332 / Failed: 0 / User-skipped: 1352 / Auto-skipped: 0


dynamic_domains_mysql_redis_27 / mysql_redis / 7da5aab
Reports root/ big
OK: 5160 / Failed: 0 / User-skipped: 154 / Auto-skipped: 0


dynamic_domains_pgsql_mnesia_27 / pgsql_mnesia / 7da5aab
Reports root/ big
OK: 5195 / Failed: 0 / User-skipped: 119 / Auto-skipped: 0


internal_mnesia_27 / internal_mnesia / 7da5aab
Reports root/ big
OK: 2466 / Failed: 0 / User-skipped: 1218 / Auto-skipped: 0


dynamic_domains_pgsql_mnesia_26 / pgsql_mnesia / 7da5aab
Reports root/ big
OK: 5030 / Failed: 0 / User-skipped: 119 / Auto-skipped: 0


pgsql_cets_27 / pgsql_cets / 7da5aab
Reports root/ big
OK: 5281 / Failed: 0 / User-skipped: 188 / Auto-skipped: 0


dynamic_domains_mssql_mnesia_27 / odbc_mssql_mnesia / 7da5aab
Reports root/ big
OK: 5190 / Failed: 0 / User-skipped: 124 / Auto-skipped: 0


cockroachdb_cets_27 / cockroachdb_cets / 7da5aab
Reports root/ big
OK: 5291 / Failed: 5 / User-skipped: 188 / Auto-skipped: 0

pubsub_SUITE:dag+last_item_cache:send_last_published_item_no_items_test
{error,
  {timeout_when_waiting_for_stanza,
    [{escalus_client,wait_for_stanza,
       [{client,
          <<"alice_send_last_published_item_no_items_test_3693@localhost/res1">>,
          escalus_tcp,<0.117971.0>,
          [{event_manager,<0.117959.0>},
           {server,<<"localhost">>},
           {username,
             <<"alicE_send_last_published_item_no_items_test_3693">>},
           {resource,<<"res1">>}],
          [{event_client,
             [{event_manager,<0.117959.0>},
            {server,<<"localhost">>},
            {username,
              <<"alicE_send_last_published_item_no_items_test_3693">>},
            {resource,<<"res1">>}]},
           {resource,<<"res1">>},
           {username,
             <<"alice_send_last_published_item_no_items_test_3693">>},
           {server,<<"localhost">>},
           {host,<<"localhost">>},
           {port,5222},
           {auth,fun escalus_auth:auth_plain/2},
           {wspath,undefined},
           {username,
             <<"alicE_send_last_published_item_no_items_test_3693">>},
           {server,<<"localhost">>},
           {password,<<"matygrysa">>},
           {stream_id,<<"8f2d3c009a440dd0">>}]},
        5000],
       [{file,
          "/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_client.erl"},
        {line,136}]},
     {pubsub_tools,receive_response,3,
       [{file,"/home/circleci/project/big_tests/tests/pubsub_tools.erl"},
        {line,434}]},
     {pubsub_tools,receive_and_c...

Report log

pubsub_SUITE:dag+last_item_cache:send_last_published_item_no_items_test
{error,
  {timeout_when_waiting_for_stanza,
    [{escalus_client,wait_for_stanza,
       [{client,
          <<"alice_send_last_published_item_no_items_test_3696@localhost/res1">>,
          escalus_tcp,<0.118048.0>,
          [{event_manager,<0.118036.0>},
           {server,<<"localhost">>},
           {username,
             <<"alicE_send_last_published_item_no_items_test_3696">>},
           {resource,<<"res1">>}],
          [{event_client,
             [{event_manager,<0.118036.0>},
            {server,<<"localhost">>},
            {username,
              <<"alicE_send_last_published_item_no_items_test_3696">>},
            {resource,<<"res1">>}]},
           {resource,<<"res1">>},
           {username,
             <<"alice_send_last_published_item_no_items_test_3696">>},
           {server,<<"localhost">>},
           {host,<<"localhost">>},
           {port,5222},
           {auth,fun escalus_auth:auth_plain/2},
           {wspath,undefined},
           {username,
             <<"alicE_send_last_published_item_no_items_test_3696">>},
           {server,<<"localhost">>},
           {password,<<"matygrysa">>},
           {stream_id,<<"f8e4777a9e2a2d15">>}]},
        5000],
       [{file,
          "/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_client.erl"},
        {line,136}]},
     {pubsub_tools,receive_response,3,
       [{file,"/home/circleci/project/big_tests/tests/pubsub_tools.erl"},
        {line,434}]},
     {pubsub_tools,receive_and_c...

Report log

pubsub_SUITE:dag+last_item_cache:send_last_published_item_no_items_test
{error,
  {timeout_when_waiting_for_stanza,
    [{escalus_client,wait_for_stanza,
       [{client,
          <<"alice_send_last_published_item_no_items_test_3700@localhost/res1">>,
          escalus_tcp,<0.118126.0>,
          [{event_manager,<0.118122.0>},
           {server,<<"localhost">>},
           {username,
             <<"alicE_send_last_published_item_no_items_test_3700">>},
           {resource,<<"res1">>}],
          [{event_client,
             [{event_manager,<0.118122.0>},
            {server,<<"localhost">>},
            {username,
              <<"alicE_send_last_published_item_no_items_test_3700">>},
            {resource,<<"res1">>}]},
           {resource,<<"res1">>},
           {username,
             <<"alice_send_last_published_item_no_items_test_3700">>},
           {server,<<"localhost">>},
           {host,<<"localhost">>},
           {port,5222},
           {auth,fun escalus_auth:auth_plain/2},
           {wspath,undefined},
           {username,
             <<"alicE_send_last_published_item_no_items_test_3700">>},
           {server,<<"localhost">>},
           {password,<<"matygrysa">>},
           {stream_id,<<"2afee07a8376381d">>}]},
        5000],
       [{file,
          "/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_client.erl"},
        {line,136}]},
     {pubsub_tools,receive_response,3,
       [{file,"/home/circleci/project/big_tests/tests/pubsub_tools.erl"},
        {line,434}]},
     {pubsub_tools,receive_and_c...

Report log

pubsub_SUITE:dag+last_item_cache:send_last_published_item_no_items_test
{error,
  {timeout_when_waiting_for_stanza,
    [{escalus_client,wait_for_stanza,
       [{client,
          <<"alice_send_last_published_item_no_items_test_3702@localhost/res1">>,
          escalus_tcp,<0.118201.0>,
          [{event_manager,<0.118190.0>},
           {server,<<"localhost">>},
           {username,
             <<"alicE_send_last_published_item_no_items_test_3702">>},
           {resource,<<"res1">>}],
          [{event_client,
             [{event_manager,<0.118190.0>},
            {server,<<"localhost">>},
            {username,
              <<"alicE_send_last_published_item_no_items_test_3702">>},
            {resource,<<"res1">>}]},
           {resource,<<"res1">>},
           {username,
             <<"alice_send_last_published_item_no_items_test_3702">>},
           {server,<<"localhost">>},
           {host,<<"localhost">>},
           {port,5222},
           {auth,fun escalus_auth:auth_plain/2},
           {wspath,undefined},
           {username,
             <<"alicE_send_last_published_item_no_items_test_3702">>},
           {server,<<"localhost">>},
           {password,<<"matygrysa">>},
           {stream_id,<<"a255f4cd8970e22f">>}]},
        5000],
       [{file,
          "/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_client.erl"},
        {line,136}]},
     {pubsub_tools,receive_response,3,
       [{file,"/home/circleci/project/big_tests/tests/pubsub_tools.erl"},
        {line,434}]},
     {pubsub_tools,receive_and_c...

Report log

pubsub_SUITE:tree+last_item_cache:send_last_published_item_no_items_test
{error,
  {timeout_when_waiting_for_stanza,
    [{escalus_client,wait_for_stanza,
       [{client,
          <<"alice_send_last_published_item_no_items_test_3758@localhost/res1">>,
          escalus_tcp,<0.119488.0>,
          [{event_manager,<0.119483.0>},
           {server,<<"localhost">>},
           {username,
             <<"alicE_send_last_published_item_no_items_test_3758">>},
           {resource,<<"res1">>}],
          [{event_client,
             [{event_manager,<0.119483.0>},
            {server,<<"localhost">>},
            {username,
              <<"alicE_send_last_published_item_no_items_test_3758">>},
            {resource,<<"res1">>}]},
           {resource,<<"res1">>},
           {username,
             <<"alice_send_last_published_item_no_items_test_3758">>},
           {server,<<"localhost">>},
           {host,<<"localhost">>},
           {port,5222},
           {auth,fun escalus_auth:auth_plain/2},
           {wspath,undefined},
           {username,
             <<"alicE_send_last_published_item_no_items_test_3758">>},
           {server,<<"localhost">>},
           {password,<<"matygrysa">>},
           {stream_id,<<"3ac2dddbffbaf4a7">>}]},
        5000],
       [{file,
          "/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_client.erl"},
        {line,136}]},
     {pubsub_tools,receive_response,3,
       [{file,"/home/circleci/project/big_tests/tests/pubsub_tools.erl"},
        {line,434}]},
     {pubsub_tools,receive_and_c...

Report log


mysql_redis_27 / mysql_redis / 7da5aab
Reports root/ big
OK: 5563 / Failed: 0 / User-skipped: 149 / Auto-skipped: 0


mssql_mnesia_27 / odbc_mssql_mnesia / 7da5aab
Reports root/ big
OK: 5579 / Failed: 0 / User-skipped: 133 / Auto-skipped: 0


pgsql_mnesia_26 / pgsql_mnesia / 7da5aab
Reports root/ big
OK: 5419 / Failed: 0 / User-skipped: 128 / Auto-skipped: 0

The new module has two functions:
  - execute_element_event/5 for incoming/outgoing XML elements
  - instrumentation/1 for declaration of events for elements and data

Element events are simplified to xmpp_element_in/out with the following labels:
  - connection_type: c2s, s2s or component
  - host_type: it can be an empty binary if no host type is known,
    e.g. for components, or for stream start errors.
    * for exometer, empty host type is translated to 'global';
    * prometheus treats empty string as no label, so no translation is needed.
Host type is always undefined for components.
It is now easier to collect all XMPP metrics (not only c2s),
and the total traffic is a better estimation of system load,
e.g. someone might only use components, and no c2s.
Also, refactor helpers to be more concise.
There is need to check for the empty host type in the labels,
because components don't have host types assigned to them.
Replace 'global' with host type when necessary.
This is needed, because all XML elements are reported per host type
if the host type is known.
@chrzaszcz chrzaszcz force-pushed the simplify-element-metrics branch from 7da5aab to a631cef Compare April 17, 2025 13:05
@mongoose-im
Copy link
Collaborator

mongoose-im commented Apr 17, 2025

elasticsearch_and_cassandra_27 / elasticsearch_and_cassandra_mnesia / a631cef
Reports root/ big
OK: 683 / Failed: 0 / User-skipped: 72 / Auto-skipped: 0


small_tests_26 / small_tests / a631cef
Reports root / small


small_tests_27 / small_tests / a631cef
Reports root / small


small_tests_27_arm64 / small_tests / a631cef
Reports root / small


ldap_mnesia_26 / ldap_mnesia / a631cef
Reports root/ big
OK: 2332 / Failed: 0 / User-skipped: 1187 / Auto-skipped: 0


ldap_mnesia_27 / ldap_mnesia / a631cef
Reports root/ big
OK: 2332 / Failed: 0 / User-skipped: 1352 / Auto-skipped: 0


dynamic_domains_mysql_redis_27 / mysql_redis / a631cef
Reports root/ big
OK: 5160 / Failed: 0 / User-skipped: 154 / Auto-skipped: 0


internal_mnesia_27 / internal_mnesia / a631cef
Reports root/ big
OK: 2466 / Failed: 0 / User-skipped: 1218 / Auto-skipped: 0


dynamic_domains_pgsql_mnesia_27 / pgsql_mnesia / a631cef
Reports root/ big
OK: 5195 / Failed: 0 / User-skipped: 119 / Auto-skipped: 0


dynamic_domains_pgsql_mnesia_26 / pgsql_mnesia / a631cef
Reports root/ big
OK: 5030 / Failed: 0 / User-skipped: 119 / Auto-skipped: 0


pgsql_cets_27 / pgsql_cets / a631cef
Reports root/ big
OK: 5281 / Failed: 0 / User-skipped: 188 / Auto-skipped: 0


dynamic_domains_mssql_mnesia_27 / odbc_mssql_mnesia / a631cef
Reports root/ big
OK: 5190 / Failed: 0 / User-skipped: 124 / Auto-skipped: 0


mysql_redis_27 / mysql_redis / a631cef
Reports root/ big
OK: 5563 / Failed: 0 / User-skipped: 149 / Auto-skipped: 0


cockroachdb_cets_27 / cockroachdb_cets / a631cef
Reports root/ big
OK: 5285 / Failed: 2 / User-skipped: 188 / Auto-skipped: 0

pubsub_SUITE:tree+last_item_cache:send_last_published_item_no_items_test
{error,
  {timeout_when_waiting_for_stanza,
    [{escalus_client,wait_for_stanza,
       [{client,
          <<"alice_send_last_published_item_no_items_test_3845@localhost/res1">>,
          escalus_tcp,<0.119354.0>,
          [{event_manager,<0.119352.0>},
           {server,<<"localhost">>},
           {username,
             <<"alicE_send_last_published_item_no_items_test_3845">>},
           {resource,<<"res1">>}],
          [{event_client,
             [{event_manager,<0.119352.0>},
            {server,<<"localhost">>},
            {username,
              <<"alicE_send_last_published_item_no_items_test_3845">>},
            {resource,<<"res1">>}]},
           {resource,<<"res1">>},
           {username,
             <<"alice_send_last_published_item_no_items_test_3845">>},
           {server,<<"localhost">>},
           {host,<<"localhost">>},
           {port,5222},
           {auth,fun escalus_auth:auth_plain/2},
           {wspath,undefined},
           {username,
             <<"alicE_send_last_published_item_no_items_test_3845">>},
           {server,<<"localhost">>},
           {password,<<"matygrysa">>},
           {stream_id,<<"47726d6b589f39f0">>}]},
        5000],
       [{file,
          "/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_client.erl"},
        {line,136}]},
     {pubsub_tools,receive_response,3,
       [{file,"/home/circleci/project/big_tests/tests/pubsub_tools.erl"},
        {line,434}]},
     {pubsub_tools,receive_and_c...

Report log

pubsub_SUITE:tree+last_item_cache:send_last_published_item_no_items_test
{error,
  {timeout_when_waiting_for_stanza,
    [{escalus_client,wait_for_stanza,
       [{client,
          <<"alice_send_last_published_item_no_items_test_3848@localhost/res1">>,
          escalus_tcp,<0.119433.0>,
          [{event_manager,<0.119429.0>},
           {server,<<"localhost">>},
           {username,
             <<"alicE_send_last_published_item_no_items_test_3848">>},
           {resource,<<"res1">>}],
          [{event_client,
             [{event_manager,<0.119429.0>},
            {server,<<"localhost">>},
            {username,
              <<"alicE_send_last_published_item_no_items_test_3848">>},
            {resource,<<"res1">>}]},
           {resource,<<"res1">>},
           {username,
             <<"alice_send_last_published_item_no_items_test_3848">>},
           {server,<<"localhost">>},
           {host,<<"localhost">>},
           {port,5222},
           {auth,fun escalus_auth:auth_plain/2},
           {wspath,undefined},
           {username,
             <<"alicE_send_last_published_item_no_items_test_3848">>},
           {server,<<"localhost">>},
           {password,<<"matygrysa">>},
           {stream_id,<<"275b9a097e49e92f">>}]},
        5000],
       [{file,
          "/home/circleci/project/big_tests/_build/default/lib/escalus/src/escalus_client.erl"},
        {line,136}]},
     {pubsub_tools,receive_response,3,
       [{file,"/home/circleci/project/big_tests/tests/pubsub_tools.erl"},
        {line,434}]},
     {pubsub_tools,receive_and_c...

Report log


pgsql_mnesia_26 / pgsql_mnesia / a631cef
Reports root/ big
OK: 5419 / Failed: 0 / User-skipped: 128 / Auto-skipped: 0


pgsql_mnesia_27 / pgsql_mnesia / a631cef
Reports root/ big
OK: 5584 / Failed: 0 / User-skipped: 128 / Auto-skipped: 0


mssql_mnesia_27 / odbc_mssql_mnesia / a631cef
Reports root/ big
OK: 5579 / Failed: 0 / User-skipped: 133 / Auto-skipped: 0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants