Design Decisions
Zero npm Dependencies
npm-scanner has no npm dependencies. This is intentional.
The tool audits npm packages for supply chain attacks. If it depended on npm packages, it would be vulnerable to the same attacks it detects. An attacker could compromise a dependency of npm-scanner to disable or subvert its detection.
This constraint means:
- All code is written as Bash scripts that rely on standard Unix tools
- No package.json or node_modules
- CI fails if npm dependency files are detected
The tradeoff is slower development, since no third-party libraries are available, in exchange for stronger security guarantees.
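The CI rule in the list above could be enforced with a check along these lines. This is a minimal sketch, not the project's actual CI step: the function name `check_no_npm_files` and the exact list of file names are assumptions, and a real repository might need to exclude test fixtures.

```bash
#!/usr/bin/env bash
# Sketch of a CI guard: fail when npm dependency files appear in the repository.
# check_no_npm_files and the file-name list below are illustrative assumptions.

check_no_npm_files() {
  local root="${1:-.}"
  local found
  # Search for manifests, lockfiles, and install directories; skip .git internals.
  found="$(find "$root" -path "$root/.git" -prune -o \
    \( -name package.json -o -name package-lock.json \
       -o -name npm-shrinkwrap.json -o -name node_modules \) -print)"
  if [ -n "$found" ]; then
    printf 'npm dependency files detected:\n%s\n' "$found" >&2
    return 1
  fi
  return 0
}
```

A CI job would call `check_no_npm_files . || exit 1` from the repository root so that any match fails the build.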
Defense in Depth
npm-scanner uses multiple detection mechanisms because no single check catches all attacks:
- URL dependency detection catches PhantomRaven-style attacks
- IOC matching catches known infrastructure
- Metadata analysis catches suspicious patterns
- Typosquatting detection catches name confusion
Each mechanism has blind spots. Together, they provide broader coverage.
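As an illustration of one such mechanism, the sketch below flags dependency entries whose value is an HTTP(S) URL rather than a version range. It is a line-oriented heuristic that assumes a pretty-printed package.json with one entry per line. The function name `find_url_dependencies` is hypothetical, and the tool's real check may work differently.

```bash
#!/usr/bin/env bash
# Sketch: report dependency entries that point at a URL.
# Assumes pretty-printed JSON on stdin (one entry per line); not a JSON parser.

find_url_dependencies() {
  awk '
    # Enter a dependency block on a line such as:  "dependencies": {
    # (an empty block written as {} on one line is skipped).
    /"(dependencies|devDependencies|optionalDependencies|peerDependencies)"[ \t]*:[ \t]*[{]/ {
      in_deps = ($0 !~ /[}]/)
      next
    }
    # Leave the block at its closing brace.
    in_deps && /[}]/ { in_deps = 0; next }
    # Inside a block, print entries whose value starts with http:// or https://.
    in_deps && /:[ \t]*"https?:\/\// {
      gsub(/^[ \t]+|,[ \t]*$/, "")
      print
    }
  '
}
```

Fields outside the dependency blocks, such as `homepage`, are ignored, which is one way to keep this particular check from flagging ordinary project links.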
Risk Scoring Over Binary Decisions
npm-scanner assigns risk scores rather than pass/fail verdicts because:
- Context matters: a lifecycle script in puppeteer is legitimate; in an unknown package it is suspicious
- False positives are costly: blocking legitimate packages frustrates users
- Humans make final decisions: security tools inform; humans decide
The scoring weights were chosen based on:
- Severity of the attack vector
- Likelihood of false positives
- Real-world attack patterns
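The idea of additive scoring can be sketched as follows. The finding names, weights, and thresholds here are placeholders invented for illustration; they are not npm-scanner's actual values.

```bash
#!/usr/bin/env bash
# Sketch of weighted risk scoring. All names and numbers are hypothetical.

# Map a finding type to a weight; unknown findings contribute nothing.
finding_weight() {
  case "$1" in
    ioc_match)        echo 100 ;;
    url_dependency)   echo 50 ;;
    typosquat)        echo 30 ;;
    lifecycle_script) echo 10 ;;
    *)                echo 0 ;;
  esac
}

# Sum the weights of every finding passed as an argument.
score_package() {
  local total=0 finding
  for finding in "$@"; do
    total=$(( total + $(finding_weight "$finding") ))
  done
  echo "$total"
}

# Translate a numeric score into a label for the human reviewer.
risk_level() {
  local score="$1"
  if   [ "$score" -ge 100 ]; then echo critical
  elif [ "$score" -ge 50 ];  then echo high
  elif [ "$score" -ge 20 ];  then echo medium
  else                            echo low
  fi
}
```

With these placeholder weights, a lifecycle script on its own scores 10 (low), while the same script combined with a typosquat finding scores 40 (medium), which reflects how context changes the assessment.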
Caching Strategy
The cache exists because:
- npm registry API has rate limits
- Repeated scans of the same packages are common
- Offline operation is sometimes needed
The 3-day default expiration balances freshness with performance. Security-critical data (IOCs) is not cached; it is bundled with the tool.
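An expiration check of this kind might look like the sketch below. The function name `cache_is_fresh` and its arguments are assumptions; only the 3-day default comes from the text above.

```bash
#!/usr/bin/env bash
# Sketch of a cache freshness test. cache_is_fresh is a hypothetical helper.

cache_is_fresh() {
  local file="$1" max_age_days="${2:-3}"
  # A missing entry is never fresh.
  [ -f "$file" ] || return 1
  # find prints the path only if it was modified fewer than N whole days ago.
  [ -n "$(find "$file" -mtime "-$max_age_days" -print)" ]
}
```

A caller would fetch from the registry only when `cache_is_fresh "$entry"` fails, and could pass a second argument to override the default age.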
Shared Library Pattern
Common functionality lives in npm-audit-lib.sh because:
- IOC definitions need a single source of truth
- Cache functions should be consistent
- Code duplication leads to divergence
Scripts source the library rather than copying code.
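A script might locate and source the library roughly as shown below. This is a sketch under assumptions: the helper `scanner_lib_path` and the `NPM_SCANNER_LIB_DIR` override are hypothetical, and the library is assumed to sit next to the calling script.

```bash
#!/usr/bin/env bash
# Sketch: resolve the shared library relative to the script, not the working
# directory. scanner_lib_path and NPM_SCANNER_LIB_DIR are illustrative names.

scanner_lib_path() {
  local script_dir lib
  script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
  lib="${NPM_SCANNER_LIB_DIR:-$script_dir}/npm-audit-lib.sh"
  if [ ! -r "$lib" ]; then
    echo "error: cannot read $lib" >&2
    return 1
  fi
  echo "$lib"
}

# Typical use at the top of a script:
#   lib="$(scanner_lib_path)" || exit 1
#   . "$lib"
```

Sourcing at the top level of the script, rather than inside a function, keeps any variables the library declares in the global scope.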
Report Format
Reports are Markdown because:
- Human-readable without special tools
- Renders well on GitHub
- Easy to diff between runs
- Can be converted to other formats
Structured data (TSV) is generated for programmatic analysis.
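TSV output lends itself to filtering with standard tools. The sketch below assumes a hypothetical layout of `package`, `version`, and `score` columns with one header row; the real column set may differ.

```bash
#!/usr/bin/env bash
# Sketch: list packages whose score meets a threshold.
# Assumed TSV layout: package<TAB>version<TAB>score, with a header row.

packages_at_or_above() {
  local threshold="$1" tsv="$2"
  # Skip the header (NR > 1) and compare the third column numerically.
  awk -F '\t' -v min="$threshold" \
    'NR > 1 && $3 + 0 >= min + 0 { print $1 "@" $2 }' "$tsv"
}
```

For example, `packages_at_or_above 50 results.tsv` would print one `name@version` line per package scoring 50 or more, where `results.tsv` stands in for whatever file the scan produced.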