npm-scanner

Design Decisions

Zero npm Dependencies

npm-scanner has no npm dependencies. This is intentional.

The tool audits npm packages for supply chain attacks. If it depended on npm packages, it would be vulnerable to the same attacks it detects. An attacker could compromise a dependency of npm-scanner to disable or subvert its detection.

This constraint means:

All code is bash scripts and standard Unix tools
No package.json or node_modules
CI fails if npm dependency files are detected

The tradeoff is slower development (no libraries) for stronger security guarantees.
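
The CI rule above could be enforced with a small guard like the following. This is a minimal sketch, not the project's actual CI step: the function name `check_no_npm_files` and the exact list of file names are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical CI guard: fail if npm dependency files appear in the tree.
# The file list and function name are assumptions, not npm-scanner's real CI.

check_no_npm_files() {
    local root="${1:-.}"
    local found
    # Skip .git, then look for npm manifests, lockfiles, and node_modules.
    found=$(find "$root" -name .git -prune -o \
        \( -name package.json -o -name package-lock.json \
           -o -name npm-shrinkwrap.json -o -name node_modules \) -print)
    if [ -n "$found" ]; then
        printf 'npm dependency files detected:\n%s\n' "$found" >&2
        return 1
    fi
    return 0
}
```

A CI job would call `check_no_npm_files .` and stop the build on a non-zero exit status. A repository that keeps `package.json` test fixtures would need to exclude that directory.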

Defense in Depth

npm-scanner uses multiple detection mechanisms because no single check catches all attacks:

URL dependency detection catches PhantomRaven-style attacks
IOC matching catches known infrastructure
Metadata analysis catches suspicious patterns
Typosquatting detection catches name confusion

Each mechanism has blind spots. Together, they provide broader coverage.
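
As an illustration of one such mechanism, URL dependency detection can be sketched with `awk` alone. This is not npm-scanner's implementation: the function name `find_url_dependencies` is hypothetical, and the sketch assumes a pretty-printed `package.json` with one dependency per line. It also shows a blind spot of its own, since a minified manifest would slip past it.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: print dependency entries whose version spec is an
# http(s) URL. Assumes a pretty-printed package.json, one entry per line.

find_url_dependencies() {
    awk '
        # Entering a dependency block such as "dependencies": {
        /"(dependencies|devDependencies|optionalDependencies)"[ \t]*:[ \t]*[{]/ {
            in_deps = 1; next
        }
        # A closing brace ends the current dependency block.
        in_deps && /[}]/ { in_deps = 0; next }
        # Inside a block, flag values that start with http:// or https://
        in_deps && /:[ \t]*"https?:\/\// { print }
    ' "$1"
}
```

Because the match is restricted to dependency blocks, fields such as `homepage` that legitimately hold URLs are not reported.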

Risk Scoring Over Binary Decisions

npm-scanner assigns risk scores rather than pass/fail verdicts because:

Context matters: a lifecycle script in puppeteer is legitimate; in an unknown package it is suspicious.
False positives are costly: blocking legitimate packages frustrates users.
Humans make final decisions: security tools inform; humans decide.

The scoring weights were chosen based on:

Severity of the attack vector
Likelihood of false positives
Real-world attack patterns
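
The idea of weighted scoring can be sketched as follows. Every number and name here is an assumption chosen for illustration: the finding labels, the weights, and the thresholds are hypothetical and do not reflect the values npm-scanner actually uses.

```shell
#!/usr/bin/env bash
# Hypothetical weights and thresholds; the real values are not given here.

# Sum the weights of all findings passed as arguments.
score_findings() {
    local total=0 finding
    for finding in "$@"; do
        case "$finding" in
            known_ioc)        total=$((total + 50)) ;;  # high severity, few false positives
            url_dependency)   total=$((total + 40)) ;;
            typosquat)        total=$((total + 25)) ;;
            lifecycle_script) total=$((total + 10)) ;;  # common in legitimate packages
        esac
    done
    printf '%d\n' "$total"
}

# Map a numeric score to a coarse level for the human reviewer.
risk_level() {
    local score="$1"
    if   [ "$score" -ge 50 ]; then echo HIGH
    elif [ "$score" -ge 20 ]; then echo MEDIUM
    else echo LOW
    fi
}
```

Under these assumed weights, a lifecycle script alone scores 10 (LOW), while a lifecycle script combined with a URL dependency scores 50 (HIGH), which reflects the point that context changes the assessment.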

Caching Strategy

The cache exists because:

npm registry API has rate limits
Repeated scans of the same packages are common
Offline operation is sometimes needed

The 3-day default expiration balances freshness with performance. Security-critical data (IOCs) is not cached; it is bundled with the tool.
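
A freshness check for a cached file can be written with `find -mtime`. This is a sketch under stated assumptions: the variable `CACHE_MAX_AGE_DAYS` and the function `cache_is_fresh` are hypothetical names, and the real cache functions may work differently.

```shell
#!/usr/bin/env bash
# Hypothetical names; only the 3-day default comes from the text above.
CACHE_MAX_AGE_DAYS="${CACHE_MAX_AGE_DAYS:-3}"

# Succeeds if the file exists and was modified within the expiry window.
cache_is_fresh() {
    local file="$1"
    [ -f "$file" ] || return 1
    # -mtime -N matches files modified less than N*24 hours ago.
    [ -n "$(find "$file" -mtime "-${CACHE_MAX_AGE_DAYS}" -print)" ]
}
```

A caller would fetch from the registry only when `cache_is_fresh` fails, and could fall back to a stale entry when working offline.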

Shared Library Pattern

Common functionality lives in npm-audit-lib.sh because:

IOC definitions need a single source of truth
Cache functions should be consistent
Code duplication leads to divergence

Scripts source the library rather than copying code.
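
Sourcing the library might look like the sketch below. Only the file name `npm-audit-lib.sh` comes from this document; the helper `load_shared_lib` and the assumption that the library sits next to the calling script are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical loader: source npm-audit-lib.sh from a given directory,
# defaulting to the directory that contains the running script.

load_shared_lib() {
    local script_dir lib
    script_dir="${1:-$(cd "$(dirname "$0")" && pwd)}"
    lib="${script_dir}/npm-audit-lib.sh"
    if [ ! -r "$lib" ]; then
        echo "error: cannot read $lib" >&2
        return 1
    fi
    # Functions defined by the library become available to the caller.
    # shellcheck source=/dev/null
    . "$lib"
}
```

Resolving the path from the script's own location, rather than the current working directory, keeps the scripts runnable from anywhere.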

Report Output Format

Reports are Markdown because:

Human-readable without special tools
Renders well on GitHub
Easy to diff between runs
Can be converted to other formats

Structured data (TSV) is also generated for programmatic analysis.
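
TSV output lends itself to processing with standard Unix tools. The sketch below assumes a column layout of package, version, and risk score with a header row; that layout and the function name `high_risk_packages` are hypothetical, since the actual columns are not described here.

```shell
#!/usr/bin/env bash
# Assumed columns: package <TAB> version <TAB> risk_score, plus a header row.

# Print package name and score for rows at or above a threshold.
high_risk_packages() {
    local tsv="$1" threshold="${2:-50}"
    awk -F '\t' -v min="$threshold" \
        'NR > 1 && ($3 + 0) >= (min + 0) { print $1 "\t" $3 }' "$tsv"
}
```

For example, `high_risk_packages report.tsv 50` would list only the rows whose third column is 50 or higher, where `report.tsv` stands in for whatever file the scanner writes.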
