Skip to content

Change to address transitive dependency issue for vulcan.#6

Open
vinodshelke82 wants to merge 7 commits intoSpirentCom:vulcanfrom
vinodshelke82:vshelke/transitive_dep
Open

Change to address transitive dependency issue for vulcan.#6
vinodshelke82 wants to merge 7 commits intoSpirentCom:vulcanfrom
vinodshelke82:vshelke/transitive_dep

Conversation

@vinodshelke82
Copy link

@vinodshelke82 vinodshelke82 commented May 14, 2020

This change is Reviewable

Copy link

@cliffc-spirent cliffc-spirent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

Copy link

@djoyner djoyner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r1.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @vinodshelke82)


src/engine/SCons/Spirent/init.py, line 139 at r1 (raw file):

            f.write('  n%x -> n%x;\n' % (from_id, to_id))

    def _is_shared_lib(self, node):

Consistency: Since the logic in _walk_tree_write_nodes is all about testing whether transitive is true or false, please rename this method to _has_transitive_dependencies.


src/engine/SCons/Spirent/init.py, line 143 at r1 (raw file):

        filename = str(node.rfile())
        extension = os.path.splitext(str(node.rfile()))[1]
        return extension in sh_lib_extn or ".so." in filename

Maintainability: This still strikes me as a very brittle way to detect transitive linking scenarios. Have you considered whether there are other properties on node that will provide a better "signal" that this is a shared library?


src/engine/SCons/Spirent/init.py, line 145 at r1 (raw file):

        return extension in sh_lib_extn or ".so." in filename

    def _walk_tree_write_nodes(self, f, parent_id, node, visited, written, transitive=False):

Maintainability: Please don't provide a default value for transitive.


src/engine/SCons/Spirent/init.py, line 148 at r1 (raw file):

        node = node.disambiguate()
        node_id = id(node)
        path = str(node)

Minor: Just out of curiosity, why did path = ... move higher in the method?


src/engine/SCons/Spirent/init.py, line 153 at r1 (raw file):

                if self._is_shared_lib(child):
                    self._walk_tree_write_nodes(f, parent_id, child, visited, written, True)
            self._walk_tree_write_nodes(f, parent_id, node, visited, written)

Why is transitive false here?


src/engine/SCons/Spirent/init.py, line 186 at r1 (raw file):

                for child in node.all_children():
                    self._walk_tree_write_nodes(f, node_id, child, visited, written)
        elif isinstance(node, SCons.Node.FS.File) and node.rfile().exists() and not os.path.isabs(path):

What's the purpose of this change?

Copy link

@cliffc-spirent cliffc-spirent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @djoyner and @vinodshelke82)


src/engine/SCons/Spirent/init.py, line 143 at r1 (raw file):

Previously, djoyner (David Joyner) wrote…

Maintainability: This still strikes me as a very brittle way to detect transitive linking scenarios. Have you considered whether there are other properties on node that will provide a better "signal" that this is a shared library?

There is an attribute.shared (not always present so you'd need to use getattr(node.attribute, 'shared', 0)). This appears to be true if the builder for the object is the SharedLibrary builder. It won't pick up .so files that are not built, though. Or symlinks of .so files.

Copy link

@cliffc-spirent cliffc-spirent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @djoyner and @vinodshelke82)


src/engine/SCons/Spirent/init.py, line 143 at r1 (raw file):

Previously, cliffc-spirent wrote…

There is an attribute.shared (not always present so you'd need to use getattr(node.attribute, 'shared', 0)). This appears to be true if the builder for the object is the SharedLibrary builder. It won't pick up .so files that are not built, though. Or symlinks of .so files.

Oops make that node.attributes

Copy link

@cliffc-spirent cliffc-spirent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @djoyner and @vinodshelke82)


src/engine/SCons/Spirent/init.py, line 143 at r1 (raw file):

Previously, cliffc-spirent wrote…

Oops make that node.attributes

Yeah so that alone fails with liblog4cplus-2.0.so which is built by a Command (it's renamed) in framework/il/vendor/lib/SConscript. Argh.

Copy link
Author

@vinodshelke82 vinodshelke82 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @djoyner)


src/engine/SCons/Spirent/init.py, line 139 at r1 (raw file):

_has_transitive_dependencies
Thanks. modified.


src/engine/SCons/Spirent/init.py, line 143 at r1 (raw file):

Previously, cliffc-spirent wrote…

Yeah so that alone fails with liblog4cplus-2.0.so which is built by a Command (it's renamed) in framework/il/vendor/lib/SConscript. Argh.

I tried node.attributes, but its returning shared=1 for shared object nodes too ( .os ).
I tried multiple things like if builder.src_builder is SharedObject or new adding env variable for the builder.
But it fails to differentiate the sharedlib node.


src/engine/SCons/Spirent/init.py, line 145 at r1 (raw file):

Previously, djoyner (David Joyner) wrote…

Maintainability: Please don't provide a default value for transitive.

modified.


src/engine/SCons/Spirent/init.py, line 148 at r1 (raw file):

Previously, djoyner (David Joyner) wrote…

Minor: Just out of curiosity, why did path = ... move higher in the method?

was using the path variable while making the change. not used now. will move back lower.


src/engine/SCons/Spirent/init.py, line 153 at r1 (raw file):

Previously, djoyner (David Joyner) wrote…

Why is transitive false here?

for the node, I just need to add an edge and recurse only for its children. If I put True for node, it goes into recursion loop and reaches max stack depth.
RuntimeError: maximum recursion depth exceeded:


src/engine/SCons/Spirent/init.py, line 186 at r1 (raw file):

Previously, djoyner (David Joyner) wrote…

What's the purpose of this change?

This change was done by pramod long back and checked in to p4, but is not part of git repo.
This has to do with the way we build the IL sources in build directory and not at the place where sources reside.

Copy link

@djoyner djoyner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 files reviewed, 1 unresolved discussion (waiting on @cliffc-spirent, @djoyner, and @vinodshelke82)


src/engine/SCons/Spirent/init.py, line 143 at r1 (raw file):

Previously, vinodshelke82 (Vinod Shelke) wrote…

I tried node.attributes, but its returning shared=1 for shared object nodes too ( .os ).
I tried multiple things like if builder.src_builder is SharedObject or new adding env variable for the builder.
But it fails to differentiate the sharedlib node.

Could we consider a different approach?

Given a node that is being linked, write edges to all of its dependencies. Recurse (e.g. if one of those dependencies was linked, then repeat).

This is subtly different. We'd need a way to detect that a node is being linked (vs compiled or something else). The recursion could stop at source nodes (i.e. nodes that don't have a builder).

Thinking this through, it seems like this could work. The only thing I'm unsure about is vendored libraries. These are dependencies but probably don't have builders.

If this doesn't work, then I'm out of ideas and we can stick with the current approach. It will still need a bit of work though. Needs to detect DLL's (we should undo explicit transitive dependencies that were added to make the BLL compile... terrible hack). Also I'd probably use a regex instead of the combination of splitext and the in check.


src/engine/SCons/Spirent/init.py, line 186 at r1 (raw file):

Previously, vinodshelke82 (Vinod Shelke) wrote…

This change was done by pramod long back and checked in to p4, but is not part of git repo.
This has to do with the way we build the IL sources in build directory and not at the place where sources reside.

Ah, ok. Thanks for including this.

Copy link

@cliffc-spirent cliffc-spirent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 files reviewed, 2 unresolved discussions (waiting on @cliffc-spirent, @djoyner, and @vinodshelke82)


src/engine/SCons/Spirent/init.py, line 153 at r1 (raw file):

                if self._is_shared_lib(child):
                    self._walk_tree_write_nodes(f, parent_id, child, visited, written, True)
            self._walk_tree_write_nodes(f, parent_id, node, visited, written)

I don't think that line is necessary. Basically it's calling itself with the same arguments except for the last one. If it doesn't return here it will end up running the rest of the function twice.

Copy link

@cliffc-spirent cliffc-spirent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 files reviewed, 2 unresolved discussions (waiting on @djoyner and @vinodshelke82)


src/engine/SCons/Spirent/init.py, line 143 at r1 (raw file):

Previously, djoyner (David Joyner) wrote…

Could we consider a different approach?

Given a node that is being linked, write edges to all of its dependencies. Recurse (e.g. if one of those dependencies was linked, then repeat).

This is subtly different. We'd need a way to detect that a node is being linked (vs compiled or something else). The recursion could stop at source nodes (i.e. nodes that don't have a builder).

Thinking this through, it seems like this could work. The only thing I'm unsure about is vendored libraries. These are dependencies but probably don't have builders.

If this doesn't work, then I'm out of ideas and we can stick with the current approach. It will still need a bit of work though. Needs to detect DLL's (we should undo explicit transitive dependencies that were added to make the BLL compile... terrible hack). Also I'd probably use a regex instead of the combination of splitext and the in check.

Here's an alternative version. It uses the fact that the shared object builder has ['SharedObject'] in its src_builder list. Not sure if that's more or less hacky than just looking at the filename, but it is cross-platform. Also won't detect vendored shared objects (but those don't list their dependencies anyway).

def _is_shared_lib(self, node):
    if isinstance(node, SCons.Node.FS.Dir):
        return False
    if not node.has_builder():
        return False
    return node.builder.src_builder == ['SharedObject']

Copy link

@cliffc-spirent cliffc-spirent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 files reviewed, 2 unresolved discussions (waiting on @djoyner and @vinodshelke82)


src/engine/SCons/Spirent/init.py, line 143 at r1 (raw file):

Previously, cliffc-spirent wrote…

Here's an alternative version. It uses the fact that the shared object builder has ['SharedObject'] in its src_builder list. Not sure if that's more or less hacky than just looking at the filename, but it is cross-platform. Also won't detect vendored shared objects (but those don't list their dependencies anyway).

def _is_shared_lib(self, node):
    if isinstance(node, SCons.Node.FS.Dir):
        return False
    if not node.has_builder():
        return False
    return node.builder.src_builder == ['SharedObject']

This is missing the magic to detect symlinks of shared libs too. sigh

Copy link

@djoyner djoyner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 files reviewed, 2 unresolved discussions (waiting on @cliffc-spirent, @djoyner, and @vinodshelke82)


src/engine/SCons/Spirent/init.py, line 143 at r1 (raw file):

Previously, cliffc-spirent wrote…

This is missing the magic to detect symlinks of shared libs too. sigh

I guess if there's a diversity of ways that transitive dependencies are built (as shared libraries, via Command, via copying) then we might just have to stick with the current approach. I'd still like a more straightforward and extensible mechanism of working with the filenames though.

Copy link

@cliffc-spirent cliffc-spirent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 files reviewed, 2 unresolved discussions (waiting on @cliffc-spirent, @djoyner, and @vinodshelke82)


src/engine/SCons/Spirent/init.py, line 153 at r1 (raw file):

Previously, cliffc-spirent wrote…

I don't think that line is necessary. Basically it's calling itself with the same arguments except for the last one. If it doesn't return here it will end up running the rest of the function twice.

I have a refactored version of this (putting the _has_transitive_dependencies part aside) that doesn't have the explosion seen in the bll builds. Instead of adding an argument I have a separate function for walking/writing the transitive dependencies. Then I can track which edges were already written. The version checked into perforce ends up re-writing the same edges thousands of times. (This can happen for example if libbgp includes libfoo which includes liblog4cplus and libbar which also includes liblog4cplus. There were about 16000 lines saying that libbgp depended on liblog4cplus.

@vinodshelke82
Copy link
Author


src/engine/SCons/Spirent/init.py, line 153 at r1 (raw file):

Previously, cliffc-spirent wrote…

I have a refactored version of this (putting the _has_transitive_dependencies part aside) that doesn't have the explosion seen in the bll builds. Instead of adding an argument I have a separate function for walking/writing the transitive dependencies. Then I can track which edges were already written. The version checked into perforce ends up re-writing the same edges thousands of times. (This can happen for example if libbgp includes libfoo which includes liblog4cplus and libbar which also includes liblog4cplus. There were about 16000 lines saying that libbgp depended on liblog4cplus.

yes in BLL the recursion is going through same dependency multiple times through different node context. Can you share your change. I was planning to work on it today :). but you seem to have a fix already. can you please share your change. will run all the yocto builds and BLL and check.

Copy link

@cliffc-spirent cliffc-spirent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 files reviewed, 2 unresolved discussions (waiting on @cliffc-spirent, @djoyner, and @vinodshelke82)


src/engine/SCons/Spirent/init.py, line 153 at r1 (raw file):

Previously, vinodshelke82 (Vinod Shelke) wrote…

yes in BLL the recursion is going through same dependency multiple times through different node context. Can you share your change. I was planning to work on it today :). but you seem to have a fix already. can you please share your change. will run all the yocto builds and BLL and check.

I've shelved it at 968986.

@vinodshelke82
Copy link
Author


src/engine/SCons/Spirent/init.py, line 143 at r1 (raw file):

Previously, djoyner (David Joyner) wrote…

I guess if there's a diversity of ways that transitive dependencies are built (as shared libraries, via Command, via copying) then we might just have to stick with the current approach. I'd still like a more straightforward and extensible mechanism of working with the filenames though.

updated code with regex.

@vinodshelke82
Copy link
Author


src/engine/SCons/Spirent/init.py, line 153 at r1 (raw file):

Previously, cliffc-spirent wrote…

I've shelved it at 968986.

updated PR with your change.

Copy link

@djoyner djoyner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r3.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @cliffc-spirent and @vinodshelke82)


src/engine/SCons/Spirent/init.py, line 153 at r1 (raw file):

Previously, vinodshelke82 (Vinod Shelke) wrote…

updated PR with your change.

Thanks Cliff!


src/engine/SCons/Spirent/init.py, line 14 at r3 (raw file):

import SCons.Subst
import re
import ntpath

Why ntpath instead of os.path? Won't this do the wrong thing (below, in basename(...)) for scons invocations on Linux machines?


src/engine/SCons/Spirent/init.py, line 143 at r3 (raw file):

    def _has_transitive_dependencies(self, node):
        filename = ntpath.basename(str(node.rfile())) 
        return re.search("([a-zA-Z0-9\s_\\.\-\+])+[\.](so|lib)",filename)

Several comments and questions here:

  • This regex is not a raw string (r"regex") and I think you intend for it to be. In any case, encoding regexes as a raw string is preferred because it cuts down on the backslashes and reduces the potential for error.

  • Do we need to match libraries that don't end in .so? For example libfoo.so.1? These files certainly exist on disk but may not be an issue when it comes to transitive dependencies.

  • Do we need to worry about .dll? This would only matter if the Microsoft toolchain needs to read these files during link time. I know it needs .lib but I cannot remember re: .dll.

  • This will compile the regex for every call to _has_transitive_dependencies. We can avoid this by creating a module-level global, e.g. RE_SHARED_LIBRARY = re.compile(r"..."), and using RE_SHARED_LIBRARY.search(filename) here.

Copy link
Author

@vinodshelke82 vinodshelke82 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @cliffc-spirent and @djoyner)


src/engine/SCons/Spirent/init.py, line 14 at r3 (raw file):

Previously, djoyner (David Joyner) wrote…

Why ntpath instead of os.path? Won't this do the wrong thing (below, in basename(...)) for scons invocations on Linux machines?

My bad. I should have used os.path instead.


src/engine/SCons/Spirent/init.py, line 143 at r3 (raw file):

Previously, djoyner (David Joyner) wrote…

Several comments and questions here:

  • This regex is not a raw string (r"regex") and I think you intend for it to be. In any case, encoding regexes as a raw string is preferred because it cuts down on the backslashes and reduces the potential for error.

  • Do we need to match libraries that don't end in .so? For example libfoo.so.1? These files certainly exist on disk but may not be an issue when it comes to transitive dependencies.

  • Do we need to worry about .dll? This would only matter if the Microsoft toolchain needs to read these files during link time. I know it needs .lib but I cannot remember re: .dll.

  • This will compile the regex for every call to _has_transitive_dependencies. We can avoid this by creating a module-level global, e.g. RE_SHARED_LIBRARY = re.compile(r"..."), and using RE_SHARED_LIBRARY.search(filename) here.

  1. Changed to raw string. Thanks
  2. Many of the libs are built with versioned config. These libs are linked against the versioned extension.
    eg. boost has extension so.1.64.0 . we need to rebuild many of the libs for this. current downside is we have
    couple of duplicate dependency edges with .so and .so.x.y.z nodes.
  3. I will extend this for .dll in case of MS toolchain.
  4. updated code with use of rex compile obj.

Copy link

@cliffc-spirent cliffc-spirent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 files reviewed, 3 unresolved discussions (waiting on @djoyner and @vinodshelke82)


src/engine/SCons/Spirent/init.py, line 15 at r4 (raw file):

import re

REX_SH_LIBRARY_OBJ  = re.compile(r"([a-zA-Z0-9\s_\\.\-\+])+[\.](so|lib|dll)")

Is there really whitespace (\s) in some of our shared library names?

@vinodshelke82
Copy link
Author


src/engine/SCons/Spirent/init.py, line 15 at r4 (raw file):

Previously, cliffc-spirent wrote…

Is there really whitespace (\s) in some of our shared library names?

sorry. corrected.

Copy link

@djoyner djoyner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r5.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @cliffc-spirent and @vinodshelke82)


src/engine/SCons/Spirent/init.py, line 14 at r3 (raw file):

Previously, vinodshelke82 (Vinod Shelke) wrote…

My bad. I should have used os.path instead.

Ok, thanks. Also since re is a standard module, please move its import into the first import section above.


src/engine/SCons/Spirent/init.py, line 15 at r5 (raw file):

import re

REX_SH_LIBRARY_OBJ  = re.compile(r"([a-zA-Z0-9\_\\.\-\+])+[\.](so|lib|dll)")

Likewise, I don't think we have the \ literal character in any of our filenames. Referring to the double \\ before the period. If it helps you can paste this string directly into regex101.com and it'll give you an explanation. This is one of my favorite tools for understanding and debugging regexes.

@vinodshelke82
Copy link
Author


src/engine/SCons/Spirent/init.py, line 14 at r3 (raw file):

Previously, djoyner (David Joyner) wrote…

Ok, thanks. Also since re is a standard module, please move its import into the first import section above.

done. Thanks

@vinodshelke82
Copy link
Author


src/engine/SCons/Spirent/init.py, line 15 at r5 (raw file):

Previously, djoyner (David Joyner) wrote…

Likewise, I don't think we have the \ literal character in any of our filenames. Referring to the double \\ before the period. If it helps you can paste this string directly into regex101.com and it'll give you an explanation. This is one of my favorite tools for understanding and debugging regexes.

thanks for regex101. its really good site. modified the pattern.

Copy link
Author

@vinodshelke82 vinodshelke82 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @cliffc-spirent and @djoyner)


src/engine/SCons/Spirent/init.py, line 15 at r5 (raw file):

[a-zA-Z0-9\_\\.\-\+])

src/engine/SCons/Spirent/init.py, line 15 at r5 (raw file):


REX_SH_LIBRARY_OBJ  = re.compile(r"([a-zA-Z0-9\_\\.\-\+])+[\.](so|lib|dll)")

Copy link

@djoyner djoyner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but playing around with the new regex in regex101.com I'm wondering if maybe you need to also use re.IGNORECASE? You explicitly handle a-zA-Z in the basename part of the filename but the extension, e.g. lib, is still case sensitive. Likely never a problem on Linux but could show up as an issue under Windows.

Reviewed 1 of 1 files at r6.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @cliffc-spirent)

Copy link

@cliffc-spirent cliffc-spirent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r6.
Reviewable status: all files reviewed, 2 unresolved discussions

@vinodshelke82
Copy link
Author

Thanks

Copy link
Author

@vinodshelke82 vinodshelke82 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions


src/engine/SCons/Spirent/init.py, line 153 at r1 (raw file):

Previously, djoyner (David Joyner) wrote…

Thanks Cliff!

Thanks

Copy link

@cliffc-spirent cliffc-spirent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r1.
Reviewable status: all files reviewed, 2 unresolved discussions

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants