Nvidia/Mellanox expose ROCE ECN information on sysfs on the path #695

Open
wants to merge 1 commit into
base: master

@dasturiasArista (Contributor) commented Jan 28, 2025

Nvidia/Mellanox exposes RoCE ECN information in sysfs under the path:
/sys/class/net/<interface>/ecn/<protocol>/

There are two protocols: Reaction Point (rp) and Notification Point (np).

Each protocol has a list of attributes:
/sys/class/net/<interface>/ecn/<protocol>/params/<requested attribute>

Each protocol also reports whether ECN is enabled per priority (where X is
the priority):
/sys/class/net/<interface>/ecn/<protocol>/enable/X

This is documented here:
https://docs.nvidia.com/networking/display/mlnxofedv571020/explicit+congestion+notification+(ecn)

The attributes are documented here:
https://enterprise-support.nvidia.com/s/article/dcqcn-parameters

Signed-off-by: Diego Asturias <[email protected]>

@discordianfish (Member)

LGTM in general, but it makes me wonder where to draw the line on which vendor-specific stuff to include here and which not. I feel this is probably relevant enough to be included, but I'm not sure. @SuperQ wdyt?

@dasturiasArista (Contributor, Author)

Just to be clear, this is Nvidia-specific only because other vendors aren't exposing these values through sysfs. These are pretty generic values for RoCEv2. For example, https://docs.broadcom.com/doc/NCC-WP1XX has information on Broadcom's congestion control, which also uses ECN. Ideally other vendors would also expose these values in a sysfs path instead of relying on proprietary utilities. I'm not aware whether other vendors that implement RoCEv2 plan to integrate with sysfs in the future (or already have), but this seems fairly generic in terms of RoCEv2.

@discordianfish (Member)

@dasturiasArista Thanks for clarifying. I think this is a good argument for including it here.
