The project models the NPM ecosystem as a directed graph:
Instead of analyzing the entire NPM ecosystem (3M+ packages), a Combined Sampling Strategy was followed to capture both the infrastructural backbone and popular end-user packages, drawing on data from ecosyste.ms.
As a result, a directed graph of 1,844 nodes and 3,814 edges (after full traversal) was constructed.
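As an illustration of how such a graph can be assembled, the minimal sketch below builds a node set and an edge set from a dependency map. The sample packages and the `build_graph` helper are hypothetical, and the edge direction (dependent → dependency) is an assumption chosen so that in-degree counts the packages relying on a node.

```python
# Hypothetical sample: package -> list of packages it depends on.
dependencies = {
    "app-a": ["express", "lodash"],
    "app-b": ["express"],
    "express": ["debug"],
    "lodash": [],
    "debug": ["ms"],
}


def build_graph(deps):
    """Return (nodes, edges); each edge points dependent -> dependency (assumed)."""
    nodes = set(deps)
    edges = set()
    for pkg, requires in deps.items():
        for dep in requires:
            nodes.add(dep)          # dependencies may not appear as keys
            edges.add((pkg, dep))   # the set removes duplicate declarations
    return nodes, edges


nodes, edges = build_graph(dependencies)
print(len(nodes), len(edges))
```

A real traversal would repeat this step for every newly discovered dependency until no new nodes appear.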
Four structural metrics were used to determine the importance of packages within the network:
| Metric | Definition | Risk Context Meaning |
|---|---|---|
| In-Degree | The number of packages that depend on this package (incoming edges). | Direct Impact Radius: Indicates popularity and the immediate number of projects affected by a vulnerability ($w=0.30$). |
| Out-Degree | The number of packages this package depends on (outgoing edges). | Attack Surface: A high out-degree suggests an expanded surface for upstream attacks ($w=0.10$). |
| Betweenness | Frequency of the node appearing on shortest paths between others. | Bridge Role: Quantifies the package's role as a network "bridge" controlling information and risk flow ($w=0.35$). |
| Inverted Clustering | $1 - \text{ClusteringCoefficient}$ | Local Fragility: Captures structural holes. Nodes bridging distinct clusters without redundant connections pose higher risk ($w=0.15$). |
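The four metrics in the table can be computed with a short standard-library sketch. This is not necessarily how the project computes them: the function names are hypothetical, betweenness is left unnormalized and uses Brandes' algorithm on the directed graph, and the clustering coefficient ignores edge direction, which is an assumption.

```python
from collections import deque


def degrees(nodes, edges):
    """Count incoming and outgoing edges per node."""
    indeg = {n: 0 for n in nodes}
    outdeg = {n: 0 for n in nodes}
    for u, v in edges:
        outdeg[u] += 1
        indeg[v] += 1
    return indeg, outdeg


def betweenness(nodes, edges):
    """Unnormalized betweenness on a directed, unweighted graph (Brandes)."""
    succ = {n: [] for n in nodes}
    for u, v in edges:
        succ[u].append(v)
    score = {n: 0.0 for n in nodes}
    for s in nodes:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        sigma = {n: 0 for n in nodes}
        sigma[s] = 1
        dist = {s: 0}
        preds = {n: [] for n in nodes}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in succ[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies, farthest nodes first.
        delta = {n: 0.0 for n in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


def inverted_clustering(nodes, edges):
    """1 - local clustering coefficient, ignoring edge direction (assumed)."""
    nbrs = {n: set() for n in nodes}
    for u, v in edges:
        if u != v:
            nbrs[u].add(v)
            nbrs[v].add(u)
    result = {}
    for n in nodes:
        k = len(nbrs[n])
        if k < 2:
            result[n] = 1.0  # no neighbor pairs, so clustering is 0
            continue
        links = sum(1 for a in nbrs[n] for b in nbrs[n] if a < b and b in nbrs[a])
        result[n] = 1.0 - 2.0 * links / (k * (k - 1))
    return result


# Hypothetical chain a -> b -> c -> d.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d")]
in_deg, out_deg = degrees(nodes, edges)
btw = betweenness(nodes, edges)
inv = inverted_clustering(nodes, edges)
```

In the chain, `b` and `c` each sit on two shortest paths between other nodes, while the endpoints sit on none, which is the "bridge" behavior the Betweenness row describes.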
A single metric is insufficient to express complex supply chain risks. Therefore, a Behavioral Risk Score (BRS) was developed as a weighted combination of structural and popularity metrics.
Each metric is normalized to the [0,1] range using the Min-Max method. To focus on orders of magnitude rather than raw counts, skewed metrics (Dependents, Downloads, Staleness) are Log-Normalized before scaling:
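The normalization step can be sketched as follows. The exact log transform is not stated here, so $\log(1 + x)$ is an assumption (it keeps zero counts defined), and the helper names `min_max` and `log_min_max` are hypothetical.

```python
import math


def min_max(values):
    """Rescale values linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to rescale, so map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def log_min_max(values):
    """Apply log(1 + x) to compress skewed counts, then Min-Max scale."""
    return min_max([math.log1p(v) for v in values])


# Hypothetical download counts spanning two orders of magnitude.
scaled = log_min_max([0, 9, 99])
```

With the log transform, the middle value lands near 0.5 rather than near 0.09, so a package with ten times the downloads moves a fixed step up the scale instead of dominating it.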
The v3 model prioritizes strategic position (Betweenness) and impact radius (In-Degree), while accounting for local fragility (Inverted Clustering).
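The structural part of the score can be sketched from the weights listed in the metrics table. Those four weights sum to 0.90; how the remaining weight is distributed is not shown here, so the hypothetical `structural_score` function below computes only the structural contribution and is not the full BRS.

```python
# Weights taken from the metrics table; they sum to 0.90.
STRUCTURAL_WEIGHTS = {
    "betweenness": 0.35,
    "in_degree": 0.30,
    "inverted_clustering": 0.15,
    "out_degree": 0.10,
}


def structural_score(normalized):
    """Weighted sum of already-normalized ([0, 1]) structural metrics."""
    return sum(w * normalized[name] for name, w in STRUCTURAL_WEIGHTS.items())


# Hypothetical package: a strong bridge with moderate popularity.
example = {
    "betweenness": 1.0,
    "in_degree": 0.5,
    "inverted_clustering": 1.0,
    "out_degree": 0.0,
}
score = structural_score(example)
```

For this example the contribution is 0.35 + 0.15 + 0.15 + 0 = 0.65, most of it coming from the bridge position rather than from raw popularity.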
The model's validity was tested through targeted attack simulations:
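One common form of targeted attack simulation removes the highest-scoring nodes one at a time and tracks how the network fragments. The sketch below assumes that form, uses the size of the largest weakly connected component as the damage measure, and ranks nodes once without recomputing scores after each removal; the function names and the star-shaped sample graph are hypothetical.

```python
def largest_component_size(nodes, edges):
    """Size of the largest weakly connected component among `nodes`."""
    nbrs = {n: set() for n in nodes}
    for u, v in edges:
        if u in nbrs and v in nbrs:  # skip edges touching removed nodes
            nbrs[u].add(v)
            nbrs[v].add(u)
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            n = stack.pop()
            size += 1
            for m in nbrs[n]:
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        best = max(best, size)
    return best


def targeted_attack(nodes, edges, scores, k):
    """Remove the k highest-scoring nodes; return component size after each step."""
    remaining = set(nodes)
    sizes = [largest_component_size(remaining, edges)]
    ranked = sorted(nodes, key=lambda n: scores[n], reverse=True)
    for victim in ranked[:k]:
        remaining.discard(victim)
        sizes.append(largest_component_size(remaining, edges))
    return sizes


# Hypothetical star: four packages that all depend on one hub.
nodes = ["hub", "a", "b", "c", "d"]
edges = [("a", "hub"), ("b", "hub"), ("c", "hub"), ("d", "hub")]
scores = {"hub": 1.0, "a": 0.0, "b": 0.0, "c": 0.0, "d": 0.0}
sizes = targeted_attack(nodes, edges, scores, k=1)
```

Comparing this curve with the one obtained from random removals indicates whether the score actually identifies the nodes whose loss fragments the network fastest.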