-
Notifications
You must be signed in to change notification settings - Fork 7.4k
docs: Switch from 'breathe' to 'docleaf' for API documentation generation #59570
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @michaeljones, and thank you very much for your first pull request to the Zephyr project!
A project maintainer just triggered our CI pipeline to run it against your PR and ensure it's compliant and doesn't cause any issues. You might want to take this opportunity to review the project's Contributor Expectations and make any updates to your pull request if necessary. 😊
This switches the Sphinx conf.py file over to using the 'docleaf.doxygen' module and the associated docleaf configuration entries, replacing the breathe module and config. Signed-off-by: Michael Jones <[email protected]>
b610d99
to
4bfd6ff
Compare
The "Documentation Build (HTML)" test is failing as there is a new warning that isn't covered by the current |
yes, if it can be fixed at the root, that is better than excluding via file, but to move forward you might want to use the exclusion path I guess. |
I will work through them and post progress here. The current error contains |
This is very cool, thanks for the PR!
FWIW it works like a charm on macOS (M2) :)
Ya not sure either but I've had a similar error recently when tinkering with the sphinx_immaterial theme. |
I've marked the PR as 'ready for review' as the tests that ran on the draft seemed to be passing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: try to preserve bisectability with the changes
scripts/requirements-doc.txt
Outdated
docleaf~=0.5.0 | ||
docleaf==0.6.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you squash original commit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Squashed.
doc/conf.py
Outdated
# Filters out any 'function' or 'variable' members that have 'all caps' names as | ||
# they are likely unprocessed macro calls | ||
docleaf_doxygen_skip = ["members:all_caps"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixup original commit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
scripts/requirements-doc.txt
Outdated
@@ -1,6 +1,6 @@ | |||
# DOC: used to generate docs | |||
|
|||
docleaf==0.6.0 | |||
docleaf==0.7.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -85,7 +85,7 @@ jobs: | |||
DOC_TARGET="html" | |||
fi | |||
|
|||
DOC_TAG=${DOC_TAG} SPHINXOPTS="-q -W -t publish" make -C doc ${DOC_TARGET} | |||
DOC_TAG=${DOC_TAG} SPHINXOPTS="-q -W --keep-going -t publish" make -C doc ${DOC_TARGET} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
drop? I wonder if we should actually default to this, I was not aware of this option!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed the commit that removed this and amended the commit message to indicate your approval.
.*Duplicate C declaration.*\n.*'\.\. c:member:: enum *bt_mesh_dfd_upload_phase bt_mesh_dfd_srv.phase'.* | ||
.*Duplicate C declaration.*\n.*'\.\. c:member:: struct *bt_mesh_blob_xfer bt_mesh_dfu_cli.blob'.* | ||
.*Duplicate C declaration.*\n.*'\.\. c:member:: struct *net_if *\* net_if_mcast_monitor.iface'. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: why didn't these fail in breathe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. It would also be sensible to ask how many of the other patterns in this file we still need.
Docleaf isn't a line-for-line translation of Breathe into Rust. It is a new attempt to solve the same problem in a similar fashion. If it was easier to follow the logic the Breathe code base then a from-scratch re-write might not have been so appealing. All this is to say, I don't know why these didn't fail for Breathe and I'm not exactly sure why they are required for Docleaf.
Would you like clear answers to this before progressing? I can see why that would be appealing and feel like the correct approach. It'll take a bit of digging around though.
Edit: I'll dig into them. It'll be good to understand them more. Maybe a system we were can feed the names in as some config and docleaf can flag where it sees them in the rendering process or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initial impression is that the root of the issue at times seems to be that the Sphinx C-domain does not have good handling for when structs, functions and other entities have the same name. This is allowed in C and present a few times in the Zephyr code base. This leads to duplicate targets being generated and triggering warnings from the Sphinx C Domain code.
It is mentioned here: sphinx-doc/sphinx#8241 (comment) in an issue which goes on to reference this pull request: sphinx-doc/sphinx#8313 as the solution but that seems to have been open for a long time.
As the references are parsed and stored by the C Domain which then generates the warnings for clashes, I don't see there being a clear way to work around them other than the current regexes. The question then becomes the difference between Breathe and Docleaf and that comes down to how much of the Doxygen XML the two systems attempt to render. In various edge cases, the XML becomes harder to make clear sense of and so the two systems potentially draw the lines at different points in different places. Neither are perfect.
I will work through the various cases (as highlighted by deleting all the lines in known-warnings.txt file) and document why each appears to be happening and attempt to improve Docleaf where possible. Any due to name clashes are potentially going to have to stay but I can see at least one case where field name is clashing with a field in a nested structure and that might be possible to avoid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a new commit which adds categories to the known-warnings.txt file attempting to indicate the source of the various issues. Some are due to entities sharing the same name and the Sphinx issues indicated before. Some are due to fields with the same name appearing in multiple anonymous structs. Some are due to Doxygen not indicating extern
declarations in the XML output and so Docleaf and Breathe are unable to filter them out in any fashion. I have raised a ticket for this: doxygen/doxygen#10163 (Edit - and created a PR: doxygen/doxygen#10164)
It is possible to use the Doxygen @cond
syntax to have it ignore parts of the file and so we could use that to explicit stop it processing the extern
declarations but that would introduce more changes than the current known-warnings.txt patterns.
Finally the new lines are due to Docleaf attempting to render fields from anonymous structs which then clash with fields in the parent struct. I believe this didn't happen in Breathe as it doesn't attempt to render the members in some of these situations. For example, the breathe output for bt_mesh_dfd_srv
is:
Whilst the docleaf output is:
With the actual code being:
zephyr/include/zephyr/bluetooth/mesh/dfd_srv.h
Lines 117 to 136 in db1a718
struct bt_mesh_dfd_srv { | |
const struct bt_mesh_dfd_srv_cb *cb; | |
struct bt_mesh_model *mod; | |
struct bt_mesh_dfu_cli dfu; | |
struct bt_mesh_dfu_target targets[CONFIG_BT_MESH_DFD_SRV_TARGETS_MAX]; | |
struct bt_mesh_blob_target_pull pull_ctxs[CONFIG_BT_MESH_DFD_SRV_TARGETS_MAX]; | |
const struct bt_mesh_blob_io *io; | |
uint16_t target_cnt; | |
uint16_t slot_idx; | |
bool apply; | |
enum bt_mesh_dfd_phase phase; | |
struct bt_mesh_blob_cli_inputs inputs; | |
struct { | |
enum bt_mesh_dfd_upload_phase phase; | |
const struct bt_mesh_dfu_slot *slot; | |
const struct flash_area *area; | |
struct bt_mesh_blob_srv blob; | |
} upload; | |
}; |
Which shows that the anonymous struct members incorrectly appear in the body of the main struct. Unfortunately this is because the Doxygen xml output (structbt__mesh__dfd__srv.zip) doesn't distinguish between the two levels. It flattens everything. I'm not sure what is better or worse. Leaving out the details or having them poorly represented. I haven't yet looked to see if there is a doxygen issue for this but I fear it might involve changes to their XML structures to better support this and that seems like it is possibly unlikely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would welcome input as to what we might want to address or account for at this stage or if the current state is acceptable in some fashion for moving forwards.
Instead of breathe. I'm not sure of the best version comparison to use. There is no intention to break the API for docleaf but it is also still young in some ways and there might be adjustments to the configuration values that would need to be reflected in the version number as it is still pre-1.0 that would only be a minor version change. Signed-off-by: Michael Jones <[email protected]>
Where valid. Signed-off-by: Michael Jones <[email protected]>
We make sure that we have the 'root' and 'xml' defined for the project which is necessary at the moment for docleaf though the 'root' is only really used for the 'linkcode' integration. Signed-off-by: Michael Jones <[email protected]>
As that is not supported by docleaf. We get everything by default. Signed-off-by: Michael Jones <[email protected]>
There are times when the log message is an exception object and that causes the regex to fail with a TypeError. Here we work around that by converting the input to a string. Signed-off-by: Michael Jones <[email protected]>
So that we see all the potential errors in CI and not just the first and can assess how much work there is to do. This was introduced for debuggin but gmarull has suggested keeping it in. Signed-off-by: Michael Jones <[email protected]>
We should probably clear out the old warnings and see what still applies for docleaf now that we have that instead of breathe. However for this moment, these are new duplication declarations that we're seeing errors for and so we'd like to silence them. Signed-off-by: Michael Jones <[email protected]>
This allows us to identify any patterns that we can remove or that we didn't realise are no longer in use. This might happen if issues within doxygen or docleaf are resolved. This allows us to remove the pattern: .*Duplicate C declaration.*\n.*'\.\. c:.*:: uint16_t id'.* which does not match anything in the current set up. We also split the filter patterns in known-warnings out into different sections depending on their cause. Also extend the pattern parser to ignore empty lines so that we can have some formatting in the known-warnings file. Signed-off-by: Michael Jones <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @michaeljones!
Congratulations on getting your very first Zephyr pull request merged 🎉🥳. This is a fantastic achievement, and we're thrilled to have you as part of our community!
To celebrate this milestone and showcase your contribution, we'd love to award you the Zephyr Technical Contributor badge. If you're interested, please claim your badge by filling out this form: Claim Your Zephyr Badge.
Thank you for your valuable input, and we look forward to seeing more of your contributions in the future! 🪁
This PR switches the Zephyr documentation from using Breathe to using Docleaf for generating API documentation from Doxygen XML output.
Docleaf is a re-implementation of Breathe with the aim to improve on Breathe's performance and memory consumption issues. For reference, I am the original creator of Breathe. Docleaf is largely implemented in Rust with a Python layer to implement the Sphinx extension interface. Builds for Windows, MacOS and Linux and for Python 3.7 to 3.11 are available on PyPI. Docleaf is published under the Parity Public License.
I have been working on Docleaf using the Zephyr docs as a test bed and I think the output is at a sufficient standard to replace Breathe. A clean build of the docs on my Ubuntu 22.04 laptop is 2.1x faster than with Breathe and Docleaf performs content checks on the XML files to minimise incremental rebuild times which means that some of the XML copying logic in the doxyrunner code can be removed if you want. I confess I have not done detailed performance analysis of Docleaf to discover further opportunities for improvement. I have not done memory comparisons of the two solutions but assume that the Rust structures holding the XML information are more efficient than the problematic
minidom
module relied upon by Breathe.Further notes:
logger.warning
which results in thewarnings_filter.py
code attempting to process an exception object with a regex which then failed so I have adapted thewarnings_filter.py
code to work around that though the might be better approaches. I'm not sure why it wasn't triggered before.requirements-doc.txt
as the project is still young and updates should be taken with some care for the moment. Though no significant breaking changes are planned.:members:
from all the directive sites as that isn't currently respected by Docleaf which will output all content from the named group which I assume is the desired behaviour. If that is an issue, I can look into adjusting it.I hope this all seems reasonable.
Checklist