Overview
This feature introduces a new analysis capability to pyscn to evaluate architectural health at the module and package level. It goes beyond code-level metrics by implementing the "Abstractness vs. Instability Principle," as popularized by Robert C. Martin.
The goal is to provide developers with insights into design-level issues that are not apparent from analyzing code complexity or coupling alone.
Metrics to Implement
1. Instability (I)
- Purpose: Indicates how susceptible the module is to change; a lower value means a more stable module.
- Formula: I = Ce / (Ce + Ca)
  - Ce (Efferent Couplings): The number of external modules that this module depends on (outgoing dependencies).
  - Ca (Afferent Couplings): The number of external modules that depend on this module (incoming dependencies).
- Interpretation:
  - I = 0: Maximally stable. The module has no outgoing dependencies.
  - I = 1: Maximally unstable. The module has only outgoing dependencies and no incoming ones.
- Implementation in pyscn: This can be calculated from the dependency graph already generated by the existing deps analysis, by counting the incoming (Ca) and outgoing (Ce) edges for each module node.
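As a rough illustration of the edge counting described above, here is a minimal Python sketch. It is not pyscn's implementation: the edge-list input format and the `instability` function name are assumptions made for the example, and the choice to report I = 0 for a module with no couplings at all (where the formula would divide by zero) is also an assumption.

```python
from collections import defaultdict


def instability(edges):
    """Return {module: I} for a graph given as (importer, imported) pairs.

    Hypothetical helper: the edge-list format is an assumption, not the
    data structure used by the deps analysis.
    """
    efferent = defaultdict(set)  # module -> modules it depends on (Ce)
    afferent = defaultdict(set)  # module -> modules that depend on it (Ca)
    modules = set()
    for src, dst in edges:
        modules.update((src, dst))
        if src == dst:
            continue  # a self-reference is not an external coupling
        efferent[src].add(dst)
        afferent[dst].add(src)

    result = {}
    for module in modules:
        ce, ca = len(efferent[module]), len(afferent[module])
        # Assumption: an isolated module (Ce + Ca == 0) is reported as 0.0
        # instead of raising ZeroDivisionError.
        result[module] = ce / (ce + ca) if ce + ca else 0.0
    return result


edges = [("app", "core"), ("app", "utils"), ("core", "utils")]
scores = instability(edges)
# app only depends on others (I = 1.0), utils is only depended on (I = 0.0),
# and core has one edge in each direction (I = 0.5).
```

Using sets means that several imports of the same module count as a single coupling, which matches the "number of external modules" wording of the formula.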
2. Abstractness (A)
- Purpose: Measures the degree to which a module consists of abstract elements (i.e., how extensible it is).
- Formula: A = (Number of abstract classes) / (Total number of classes)
- Implementation in pyscn: This will be calculated by traversing the AST for each module.
  - Count all class definition nodes to get the "Total number of classes."
  - Identify "abstract classes" by checking if a class inherits from abc.ABC.
  - Compute the ratio A based on the formula.
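The three steps above can be sketched with Python's standard `ast` module. This is only an illustration under stated assumptions: the `abstractness` function name is hypothetical, only direct bases written as `ABC` or `abc.ABC` are recognized (aliased imports, `metaclass=ABCMeta`, and indirect inheritance are ignored), and a module with no classes is reported as A = 0 to avoid dividing by zero. `ast.unparse` requires Python 3.9 or later.

```python
import ast

# Assumption: only these literal spellings of the base class are recognized.
ABSTRACT_BASES = {"ABC", "abc.ABC"}


def abstractness(source):
    """Return A = abstract classes / total classes for one module's source."""
    tree = ast.parse(source)
    total = 0
    abstract = 0
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            total += 1  # every class definition, including nested ones
            bases = {ast.unparse(base) for base in node.bases}
            if bases & ABSTRACT_BASES:
                abstract += 1
    # Assumption: a module without classes gets A = 0.0.
    return abstract / total if total else 0.0


sample = '''
import abc
from abc import ABC

class Shape(ABC):
    pass

class Reader(abc.ABC):
    pass

class Circle(Shape):
    pass

class Point:
    pass
'''
ratio = abstractness(sample)  # 2 abstract classes out of 4
```

Note that `Circle` is counted as concrete here because it does not list `abc.ABC` as a direct base; whether subclasses of abstract classes should count is a design decision this sketch does not settle.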
3. Distance from the Main Sequence (D)
- Purpose: A final score that indicates how well a module balances abstractness and instability.
- Formula: D = |A + I - 1| (the absolute value of A + I - 1)
- Interpretation:
  - D = 0: The module is well-balanced and lies on the "Main Sequence" (the line A + I = 1).
  - D = 1: The module is as far from the Main Sequence as possible and has a poor balance.
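A short worked example of the formula, written as a Python sketch (the `distance` function name is an assumption for illustration):

```python
def distance(abstractness, instability):
    """Distance from the Main Sequence: D = |A + I - 1|."""
    return abs(abstractness + instability - 1)


balanced = distance(0.5, 0.5)     # on the Main Sequence: 0.0
concrete_stable = distance(0.0, 0.0)   # fully concrete and fully stable: 1.0
abstract_unstable = distance(1.0, 1.0)  # fully abstract and fully unstable: 1.0
in_between = distance(0.25, 0.5)  # |0.25 + 0.5 - 1| = 0.25
```

Both extremes yield D = 1, so D alone does not say on which side of the Main Sequence a module lies; the values of A and I are needed for that.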
Identifying Problematic Design Patterns
Modules with a high D value fall into one of two problematic zones:
- The Zone of Pain (D is high; I and A are both close to 0):
  - State: The module is highly stable (depended on by many) but very concrete (not abstract).
  - Problem: It is difficult to change, and any modification will impact a large number of dependent modules, causing significant "pain."
- The Zone of Uselessness (D is high; I and A are both close to 1):
  - State: The module is highly abstract but also very unstable (depends on many other modules, and few depend on it).
  - Problem: It may be an over-engineered abstraction with no real purpose.
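One possible way to flag the two zones is sketched below in Python. The `classify` function, its labels, and the 0.7 threshold are all hypothetical; the proposal does not define a cutoff for "high D." The sketch uses the sign of A + I - 1 to decide the zone: a negative value means the module lies below the Main Sequence (concrete and stable), a positive value means it lies above it (abstract and unstable).

```python
def classify(abstractness, instability, threshold=0.7):
    """Label a module by its position relative to the Main Sequence.

    Assumption: `threshold` is an illustrative cutoff for "high D",
    not a value taken from the proposal.
    """
    signed = abstractness + instability - 1
    if abs(signed) < threshold:
        return "ok"
    return "zone of pain" if signed < 0 else "zone of uselessness"


labels = {
    "db_models": classify(0.0, 0.1),   # concrete and stable
    "plugin_api": classify(1.0, 0.9),  # abstract and unstable
    "services": classify(0.5, 0.5),    # on the Main Sequence
}
```

The module names in `labels` are made up for the example.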
Expected Benefits
- Architectural Visualization: The HTML report will include a scatter plot with Instability on the x-axis and Abstractness on the y-axis. This will allow developers to instantly identify which modules are in the "Zone of Pain" or "Zone of Uselessness."
- Objective Refactoring Guidance: The D metric provides an objective, data-driven basis for prioritizing architectural refactoring efforts.
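Implementation Tasks
- Calculate Ca (afferent) and Ce (efferent) couplings for each module from the existing dependency graph.
- Count the classes and abstract classes (abc.ABC subclasses) within each module.
- Compute I, A, and D for each module.
- Integrate the metrics into the deps analysis.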
References
- Martin, Robert C. Agile Software Development, Principles, Patterns, and Practices. Prentice Hall, 2002.
- Martin, Robert C. Clean Architecture: A Craftsman's Guide to Software Structure and Design. Prentice Hall, 2017.