manifest: Use faster yaml CLoader if available #806
Speed up the yaml parsing used by west by using LibYAML if available. Signed-off-by: Pieter De Gendt <pieter.degendt@basalte.be>
This is an 8% improvement when doing practically nothing. On your test system, this saved 20ms. Doesn't 20ms become negligible when west actually performs something instead?
Also, this adds a dependency on additional C code, doesn't it? Not something desirable in 2024.
This does not add a dependency: if the C library isn't available, it falls back to the previous behaviour. It's a free optimisation, even documented by the authors, and also used in Zephyr. It's a small gain, but if you start adding up all the small bits of daily runs in CI or locally, why wouldn't we?
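The fallback described in this comment can be sketched roughly as follows. This is a minimal illustration, not the PR's actual diff: `pick_safe_loader` is a hypothetical helper, and the stand-in modules exist only so the sketch runs without PyYAML installed. It relies on PyYAML exposing `CSafeLoader` only when its LibYAML bindings are available.

```python
import types


def pick_safe_loader(yaml_module):
    """Return the LibYAML-backed safe loader if present, else the pure-Python one.

    Hypothetical helper for illustration; not taken from the west sources.
    """
    return getattr(yaml_module, "CSafeLoader", yaml_module.SafeLoader)


# Stand-in modules so the sketch runs even without PyYAML installed.
with_libyaml = types.SimpleNamespace(SafeLoader="python", CSafeLoader="libyaml")
without_libyaml = types.SimpleNamespace(SafeLoader="python")

print(pick_safe_loader(with_libyaml))     # libyaml
print(pick_safe_loader(without_libyaml))  # python

# With the real library (assuming PyYAML is installed):
try:
    import yaml
except ImportError:
    yaml = None

if yaml is not None:
    data = yaml.load("manifest:\n  projects: []\n", Loader=pick_safe_loader(yaml))
    print(data)  # {'manifest': {'projects': []}}
```

Either loader is a "safe" loader, so the set of YAML constructs accepted should be the same; only the parsing backend differs.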
I understand but this does run more C code than before when the library is available, doesn't it? I bet this C library is valuable for programs that spend a lot of time parsing a lot of large YAML files but
If you add up 1% savings everywhere it's still 1%.
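For readers following the two positions, here is a rough sketch of the arithmetic, using the 8% and 20 ms figures quoted earlier in this thread. The run count is a made-up number and the function names are hypothetical; the point is only that the absolute time saved grows with the number of runs while the relative saving stays the same.

```python
def implied_baseline_ms(saved_ms, percent):
    """Baseline duration implied by an absolute saving and its percentage."""
    return saved_ms * 100 / percent


def savings_over_runs(saved_ms, percent, runs):
    """Return (total ms saved, relative saving in percent) over `runs` runs."""
    total_before = implied_baseline_ms(saved_ms, percent) * runs
    total_saved = saved_ms * runs
    return total_saved, 100 * total_saved / total_before


# 20 ms saved at 8% implies a baseline of about 250 ms per invocation.
print(implied_baseline_ms(20, 8))  # 250.0

# Hypothetical: 1000 invocations (an assumption, not a measured figure).
total_saved, relative = savings_over_runs(20, 8, runs=1000)
print(total_saved, relative)  # 20000 8.0
```

Under these assumptions, 1000 runs save about 20 seconds in total, which is still 8% of the time those runs would otherwise have taken.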
As is running Python itself? This just takes a shortcut.
I'm afraid you lost me. The current commit message says "Speed up the yaml parsing used by west by using LibYAML if available", but now you say Python itself leverages the C library if available even without this PR? How so?
I mean Python itself is running some compiled C code, how is that so much different than calling into a library?
Does Python run I don't see why the fact Python runs other C code matters. This is not about banning all C code overnight, this is about minimizing (and pinning) dependencies and especially new C dependencies (best avoided). EDIT: moreover this is a... parser = the poster child of where NOT to use C. I'm really confused by the current security "posture". On one hand we have PRs to pin every single, de-privileged, CI workflow (#802) that runs only inside an ephemeral github.com container. On the other hand, we have this PR which adds a brand new, run-time C dependency for every PS: again, I'm sure
Pull Request Overview
This PR speeds up YAML parsing in west by leveraging LibYAML’s CSafeLoader when available.
- Switches YAML loading in manifest.py from yaml.safe_load to yaml.load with Loader=SafeLoader.
- Applies the same change in commands.py for processing command specifications.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/west/manifest.py | Uses CSafeLoader (faster LibYAML loader) for manifest loading. |
| src/west/commands.py | Uses CSafeLoader for loading extension command specs. |
Sorry @marc-hb, I did not see you had an open comment here. But I have to disagree with the potential security concern: is running precompiled C code inherently more dangerous, as long as we pin the version carefully, as is done in #802? Feel free to revert this, but I think it's worthwhile.
So, pinning is the latest "silver bullet"? For years we were told to "update, update and update" to get fixes early and stay secure. Which is it? Answer: neither is a silver bullet. It's complicated and it depends. Pinning this particular library is more secure if and only if the current version has been very thoroughly reviewed, fuzzed and tested. Has it?
I really don't think increasing the attack surface with a parser in C is worth 20ms here. Other places with a bigger performance impact may want to spend time evaluating that dependency and risk but not here in west.
I'm not sure why we turned this back into a pinning thing? It's unrelated? The reasons why I proposed this:
This was brought by @carlescufi for one reason: security.
None of these seem concerned with security, which is what the only controversy has been about.
Speed up the yaml parsing used by west by using LibYAML if available, as specified in the docs.
On Zephyr main with optional modules enabled:
Before
After
An ~8% speedup for a simple command that loads the manifest and loops the modules.