So, I'm hitting some edge cases in our core atpm algorithm and I think we have some design issues.
Notation
I use A->B to mean "A depends on B"
Greedy approach is insufficient
Currently our approach to dependency resolution is "greedy", that is we proceed vaguely as follows:
- Parse A's manifest, learn
A->B, A->C
- Choose a version of B
- Parse B's manifest, find no dependencies
- Parse C's manifest, learn
C->B, but with particular version constraints on B
- If the version constraints were not satisfied by the earlier choice in 2, fail.
- There may be a version of B that satisfies both requirement 2 and 4, but because 2 was processed first, we never "backtrack" to a version of B that also meets requirement 4.
It's not clear to me what the right solution here is. Fundamentally, we are not able to see the path A->C->B before we fetch C, so we cannot merely reorder them up front to resolve the issue. Nor can we know, when we hit condition 6, whether or not a satisfactory version of B exists, because the satisfactory version of B may introduce new dependencies, violating 3 and altering the system further.
Perhaps I need to research how other people solve this problem, but the obvious algorithms to me are to backtrack and try again, which has awful worst-case performance and may not even halt (!)
Atlock
The next problem is that we have an interaction with the above and the atlock.
I believe we currently ignore the atlock completely for packages that aren't the outermost, which may be the most sensible thing we can do. However I don't think we've thought through the details:
- Suppose we have an end-user application,
App
- The application depends on two libraries,
Logging and Network
Network also depends on Logging
Network developers are aware of some bugs in new Logging versions and so they pin to a particular commit in Network/build.atlock
App developers will see no effect of this, because external/Network/build.atlock is not consulted during atpm fetch of App.
- They are then running
Network in an unsupported configuration and who knows what happens now.
I see no obvious solution here either.
ping @owensd @dunkelstern
So, I'm hitting some edge cases in our core atpm algorithm and I think we have some design issues.
Notation
I use
A->Bto mean "A depends on B"Greedy approach is insufficient
Currently our approach to dependency resolution is "greedy", that is we proceed vaguely as follows:
A->B,A->CC->B, but with particular version constraints on BIt's not clear to me what the right solution here is. Fundamentally, we are not able to see the path
A->C->Bbefore we fetch C, so we cannot merely reorder them up front to resolve the issue. Nor can we know, when we hit condition 6, whether or not a satisfactory version of B exists, because the satisfactory version of B may introduce new dependencies, violating 3 and altering the system further.Perhaps I need to research how other people solve this problem, but the obvious algorithms to me are to backtrack and try again, which has awful worst-case performance and may not even halt (!)
Atlock
The next problem is that we have an interaction with the above and the
atlock.I believe we currently ignore the atlock completely for packages that aren't the outermost, which may be the most sensible thing we can do. However I don't think we've thought through the details:
AppLoggingandNetworkNetworkalso depends onLoggingNetworkdevelopers are aware of some bugs in newLoggingversions and so they pin to a particular commit inNetwork/build.atlockAppdevelopers will see no effect of this, becauseexternal/Network/build.atlockis not consulted duringatpm fetchofApp.Networkin an unsupported configuration and who knows what happens now.I see no obvious solution here either.
ping @owensd @dunkelstern