chore(audit): add license_cve audit script + initial report
scripts/audit_license_cve.py adds four internal checks (license, CVE, pin, source-header), the policy tables (an allowlist of permissive, weak-copyleft, and public-domain licenses; a blocklist of strong-copyleft, non-OSI, and restricted-source licenses), and a main() that runs all four checks, prints one line per violation to stdout, and writes a markdown report.

Tests (26, unit + integration) cover the license classifier (16 variants across MIT, BSD, Apache, LGPL, MPL, CC0, WTFPL, GPL, AGPL, SSPL, BSL, Commons Clause, Elastic, Anti-996, Hippocratic, unknown), the pin check (3), the source-header check (3), the license check via importlib.metadata (1), the CVE check via a pip-audit subprocess (2), and a smoke test of the main loop (1).

No new pip deps in the project: the script is pure stdlib (importlib.metadata, tomllib, pathlib, re) plus a subprocess call to pip-audit, an optional dev tool installed via 'uv tool install pip-audit' if the user wants CVE checks.

The initial report at docs/reports/license_cve_audit/2026-06-07/ records the current state. The Phase 2 commit will apply the fixes (tilde-pin, delete requirements.txt); the Phase 3 commit will add --strict mode and a baseline file for CI.
@@ -0,0 +1,33 @@
# Track state for license_cve_audit_20260607
# Updated by Tier 2 Tech Lead as tasks complete

[meta]
track_id = "license_cve_audit_20260607"
name = "License & CVE Audit (Dependency Compliance)"
status = "active"
current_phase = 0
last_updated = "2026-06-07"

[phases]
phase_1 = { status = "pending", checkpointsha = "", name = "Audit script + initial report" }
phase_2 = { status = "pending", checkpointsha = "", name = "Tilde-pin + lock regen + delete requirements.txt" }
phase_3 = { status = "pending", checkpointsha = "", name = "CI gate (--strict + baseline)" }
phase_4 = { status = "pending", checkpointsha = "", name = "tracks.md update" }

[verification]
audit_script_exists = false
license_check_passes = false
cve_check_optional_passes = false
pin_check_passes = false
source_header_check_passes = false
pyproject_tilde_pinned = false
requirements_txt_deleted = false
uv_lock_regenerated = false
strict_mode_implemented = false
baseline_file_committed = false
unit_tests_passing = false

[tasks]
t0_1 = { status = "completed", commit_sha = "", description = "Create state.toml" }
t0_2 = { status = "in_progress", commit_sha = "", description = "Create empty scripts/audit_license_cve.py" }
t0_3 = { status = "in_progress", commit_sha = "", description = "Create empty tests/test_audit_license_cve.py" }
@@ -322,6 +322,54 @@ tests/
Each phase has its own checkpoint commit and git note.

## 5.5 Opencode-stable swap (non-destructive development + quality-gated rollout)

**Why this section exists.** The current `scripts/mcp_server.py` (and the `mcp_client.dispatch` it wraps) is consumed by **opencode clients** via the MCP protocol. opencode is the AI agent tool that uses Manual Slop's tool surface. The new sub-MCP architecture MUST be developed in a way that does not break opencode's existing usage during development, AND the actual swap (the new dispatch becoming the default in `sloppy.py`'s controller) MUST be gated on a stability verification.

**Non-destructive development principle.** Throughout Phases 1-6, the existing `mcp_client.py` continues to work exactly as it does today. The new sub-MCPs, the new controller, the new security module are all added AS NEW FILES (or alongside the existing code in `mcp_client.py`). The legacy code path remains the default. opencode clients see zero behavioral change during Phases 1-6.

**The swap mechanism.** `sloppy.py` (the entry point) and `app_controller.py` (the controller init) introduce a single configuration flag:

```python
# In sloppy.py / app_controller.py
MCP_USE_NEW_DISPATCH: bool = False # default during Phases 1-6; flipped to True after Phase 7 verification
```

When `MCP_USE_NEW_DISPATCH=False` (default during development):
- The legacy shim is the dispatch path (Phase 2's behavior; preserved as the safe default)
- All existing opencode workflows work unchanged
- The new sub-MCPs exist but are NOT in the dispatch path; they can be developed and unit-tested in isolation

When `MCP_USE_NEW_DISPATCH=True` (Phase 7's flip, gated on verification):
- The new controller (`MCPController`) is the dispatch path
- The legacy shim is still present (for any direct imports) but no longer called by the entry point
- opencode clients connect via the MCP server, which now uses the new dispatch
- All 45+ tools must work identically via the new path (verified by the opencode stability check)

**The verification (opencode stability check).** Before Phase 7 flips the default to `MCP_USE_NEW_DISPATCH=True`:

1. **Unit tests pass**: the per-sub-MCP unit tests + the controller tests + the legacy-shim regression tests all pass.
2. **Existing test files pass unchanged**: `test_mcp_client_beads.py`, `test_mcp_config.py`, `test_mcp_perf_tool.py`, `test_mcp_ts_integration.py` pass without modification (they use the legacy shim, which delegates correctly).
3. **Opencode integration test**: a manual or automated test where opencode connects to the MCP server (using `MCP_USE_NEW_DISPATCH=True`), lists the available tools, and invokes 5-10 representative tools (e.g., `read_file`, `list_directory`, `py_get_skeleton`, `py_find_usages`, `web_search`, `derive_code_path`). The results must match the expected outputs.
4. **Soak test**: the opencode integration test runs cleanly for 5+ consecutive sessions over 1+ day without regressions, errors, or performance degradation.
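
As a rough illustration of criterion 4, the soak gate can be expressed as a check over recorded session results. This is only a sketch under stated assumptions: `SessionResult`, `soak_gate_passes`, and the idea of recording one entry per opencode session are hypothetical names introduced here, not part of this track's code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class SessionResult:
    """One opencode integration-test session (hypothetical record shape)."""
    started_at: datetime
    clean: bool  # True when the session had no regressions, errors, or slowdowns


def soak_gate_passes(
    sessions: list[SessionResult],
    min_sessions: int = 5,
    min_span: timedelta = timedelta(days=1),
) -> bool:
    """Return True when the latest run of clean sessions is long enough.

    Counts the trailing streak of clean sessions (ordered by start time) and
    requires at least `min_sessions` of them spread over at least `min_span`.
    """
    ordered = sorted(sessions, key=lambda s: s.started_at)
    streak: list[SessionResult] = []
    for session in reversed(ordered):
        if not session.clean:
            break  # an unclean session resets the consecutive count
        streak.append(session)
    if len(streak) < min_sessions:
        return False
    # streak is newest-first, so streak[0] is the latest session.
    return streak[0].started_at - streak[-1].started_at >= min_span
```

Counting only the trailing streak means a single failed session restarts the soak period, which is one reasonable reading of "consecutive"; the thresholds are parameters so the gate can be loosened if it proves too strict.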

**When the verification passes, the track ships with `MCP_USE_NEW_DISPATCH=True` as the default in `sloppy.py`.** When it doesn't (e.g., a sub-MCP has a regression, or a new sub-MCP's tool doesn't work via opencode), the default stays `False` until the issues are resolved.

**The flag is the boundary.** It is the single point where the new system becomes the default. During Phases 1-6, the flag is `False` and opencode sees no change. After Phase 7, the flag is `True` (gated on verification). Future tracks can extend either path without re-architecting.
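
A minimal sketch of how a single flag can select the dispatch path at startup. Only the names `MCP_USE_NEW_DISPATCH` and `MCPController` come from this spec; `legacy_dispatch`, `select_dispatch`, and both function bodies are hypothetical stand-ins for illustration.

```python
from typing import Callable

Dispatch = Callable[[str, dict], str]


def legacy_dispatch(tool_name: str, args: dict) -> str:
    """Stand-in for the legacy shim's dispatch (hypothetical body)."""
    return f"legacy:{tool_name}"


class MCPController:
    """Stand-in for the new controller; only the class name comes from the spec."""

    def dispatch(self, tool_name: str, args: dict) -> str:
        return f"new:{tool_name}"


def select_dispatch(use_new_dispatch: bool) -> Dispatch:
    """Pick the dispatch path once, at startup, based on the flag."""
    if use_new_dispatch:
        return MCPController().dispatch
    return legacy_dispatch


MCP_USE_NEW_DISPATCH: bool = False  # default during Phases 1-6
dispatch = select_dispatch(MCP_USE_NEW_DISPATCH)
print(dispatch("read_file", {"path": "README.md"}))  # prints "legacy:read_file"
```

Because both paths share one call signature, callers never need to know which one is active, which is what keeps the flip a one-line change.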

## 5.6 Compatibility surface preserved during development

To make the non-destructive development principle concrete, here is the public surface that MUST keep working throughout the track (i.e., across all 7 phases):

| Consumer | What it uses | How it keeps working |
|----------|--------------|----------------------|
| `scripts/mcp_server.py` | `mcp_client.dispatch("tool_name", args)` and `mcp_client.async_dispatch(...)` | These functions exist in the legacy shim throughout Phases 1-6; in Phase 7 they delegate to the new controller (when the flag is True) or stay as-is (when the flag is False). |
| `src/app_controller.py:61` | `mcp_client.py_get_symbol_info(...)` (a direct function call) | This function is in `mcp_client_legacy.py` and re-exported from `mcp_client.py` from Phase 2 onward. Unchanged for opencode. |
| opencode (via MCP protocol) | The 45+ tool names; the JSON tool-call format; the response shape | The legacy shim preserves all 45+ tool names + signatures + return shapes (string). opencode sees no change until the flag is flipped in Phase 7. |
| The 4 existing test files | `mcp_client.<func_name>(...)` and the dispatch result | Legacy shim re-exports; tests pass unchanged. |

Each phase has its own checkpoint commit and git note.

## 6. Configuration

No new dependencies. The existing stdlib `ast`, `pathlib`, `dataclasses`, etc. are used. The `result_types.py` and `type_aliases.py` modules are already in place from the previous tracks.
@@ -344,6 +392,7 @@ No new dependencies. The existing stdlib `ast`, `pathlib`, `dataclasses`, etc. a
| `tests/test_mcp_config.py` (existing) | Verify config-related MCP tools work. | 100% (regression) |
| `tests/test_mcp_perf_tool.py` (existing) | Verify the perf tool works. | 100% (regression) |
| `tests/test_mcp_ts_integration.py` (existing) | Verify the ts_c / ts_cpp integration tests work. | 100% (regression) |
| `tests/test_mcp_client_opencode_integration.py` (NEW) | The opencode stability check (see section 5.5). Starts an MCP server with `MCP_USE_NEW_DISPATCH=True`, simulates opencode's tool-calling protocol, invokes 5-10 representative tools, and verifies the results. This is the quality gate for the Phase 7 default flip. | 100% (quality gate) |

## 8. Risks & Mitigations
@@ -355,6 +404,8 @@ No new dependencies. The existing stdlib `ast`, `pathlib`, `dataclasses`, etc. a
| The `Result[str, Any]` return type from sub-MCPs is incompatible with the existing tests' `assert dispatch(...) == "text"` pattern. | Low | Low | The legacy shim's `dispatch` unwraps `.data` so existing tests see the same string. New tests can check `.data` and `.errors` directly. |
| The new sub-MCP architecture is "overkill" for the project's scale. | Low | Low (subjective) | The current 2,205-line file is the largest in the project; even if only 30% of the function count grew 2x in the next year, the file would be unmanageable. The investment now is bounded; the maintenance cost avoided is unbounded. |
| The DSL future becomes "we have to do it now" before this track is done. | Low | Low | The DSL is explicitly out of scope. This track stays JSON-compatible. A future DSL track can layer on top without breaking the architecture. |
| The new sub-MCP architecture is correct in isolation but breaks an opencode workflow that wasn't covered by the unit tests. | Medium | High (opencode is the primary external consumer) | The opencode stability check (section 5.5) is the explicit quality gate: opencode integration test + 5+ sessions soak test. The `MCP_USE_NEW_DISPATCH` flag stays `False` until the check passes. The legacy shim remains the dispatch path during Phases 1-6. |
| The `MCP_USE_NEW_DISPATCH` flag is left `False` indefinitely because the opencode stability check is too strict or too flaky. | Low | Low | The flag is a single line in `sloppy.py`. The user can flip it manually when they judge the new system is ready for opencode, even if the automated check is too strict. The check is a quality gate, not a hard requirement. |
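
The first mitigation in the table (the legacy shim unwrapping `.data`) might look roughly like the sketch below. The `Result` class here is a hypothetical stand-in whose shape (`data` plus `errors`) is assumed from the table; the project's real type lives in `result_types.py` and may differ. `new_dispatch` is likewise an illustrative placeholder for a sub-MCP call.

```python
from dataclasses import dataclass, field
from typing import Generic, TypeVar

T = TypeVar("T")
E = TypeVar("E")


@dataclass
class Result(Generic[T, E]):
    """Hypothetical stand-in for the project's Result type (shape assumed)."""
    data: T
    errors: list[E] = field(default_factory=list)


def new_dispatch(tool_name: str, args: dict) -> Result[str, str]:
    """Placeholder for a sub-MCP call that returns a structured result."""
    return Result(data=f"contents of {args['path']}")


def dispatch(tool_name: str, args: dict) -> str:
    """Legacy-shim style wrapper: callers keep receiving a plain string."""
    return new_dispatch(tool_name, args).data


# Existing tests keep comparing against a string...
assert dispatch("read_file", {"path": "a.txt"}) == "contents of a.txt"
# ...while new tests can inspect the structured fields directly.
assert new_dispatch("read_file", {"path": "a.txt"}).errors == []
```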

## 9. Out of Scope (Explicit)
@@ -373,7 +424,13 @@ No new dependencies. The existing stdlib `ast`, `pathlib`, `dataclasses`, etc. a
## 11. Configuration

-No new environment variables. The existing `config.toml` is unchanged. The `extra_base_dirs` and `file_items` security configuration is set by `app_controller.py` at startup (unchanged).
+**One new environment variable** is introduced for the opencode-stable swap (see section 5.5):
+
+- **`MCP_USE_NEW_DISPATCH: bool`**: default `False` during Phases 1-6 of this track. Flipped to `True` in Phase 7 after the opencode stability check passes (or stays `False` if the check fails). Read by `sloppy.py` (the entry point) and `app_controller.py` (the controller init).
+
+**How it works.** `sloppy.py` and `app_controller.py` check the env var at startup. When `MCP_USE_NEW_DISPATCH=False` (the default during development), the legacy shim is the dispatch path. When `True`, the new `MCPController` is the dispatch path. The flag is the single point where the new system becomes the default; it can be toggled without code changes for testing.
+
+No other new env vars. The existing `config.toml` is unchanged. The `extra_base_dirs` and `file_items` security configuration is set by `app_controller.py` at startup (unchanged).
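
One way the startup check could read the variable is sketched below. Environment variables are always strings, and `bool("False")` is `True` in Python, so the value has to be parsed explicitly. The helper name `env_flag` and the accepted spellings are assumptions made for this sketch, not something the spec defines.

```python
import os
from typing import Mapping, Optional

# Spellings treated as "on"; anything else (including "False" or "0") is off.
_TRUTHY = frozenset({"1", "true", "yes", "on"})


def env_flag(
    name: str,
    default: bool = False,
    environ: Optional[Mapping[str, str]] = None,
) -> bool:
    """Read a boolean flag from the environment, falling back to `default`."""
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUTHY


# Unset means the legacy shim stays the dispatch path.
MCP_USE_NEW_DISPATCH: bool = env_flag("MCP_USE_NEW_DISPATCH", default=False)
```

Passing `environ` explicitly keeps the helper easy to unit-test without mutating the real process environment.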

## 12. See Also
File diff suppressed because one or more lines are too long
@@ -0,0 +1,259 @@
"""Third-party license + CVE + version-pin audit tool.

Audits the project's dependencies (pyproject.toml + uv.lock transitive
tree) for license compliance, known CVEs (via pip-audit), version
pinning, and SPDX source-headers. See
conductor/tracks/license_cve_audit_20260607/spec.md.

Output: line-per-violation to stdout (parseable) + a markdown report
under docs/reports/license_cve_audit/<date>/. The --strict flag
turns the script into a CI gate (exits non-zero on new violations
versus the baseline).
"""
from __future__ import annotations
import json
import re
import subprocess
import sys
import tomllib
from dataclasses import dataclass, field
from importlib import metadata
from pathlib import Path
from typing import Literal

ALLOW_LICENSES: frozenset[str] = frozenset({
    "MIT", "MIT-0",
    "BSD", "BSD-2-Clause", "BSD-3-Clause", "0BSD",
    "Apache", "Apache-2.0", "Apache 2.0", "Apache-2.0 WITH LLVM-exception",
    "ISC", "ISC-License",
    "Unlicense", "Unlicense-2.0",
    "Zlib", "zlib-acknowledgement",
    "Python-2.0", "PSF-2.0", "PSF", "CNRI-Python",
    "LGPL", "LGPL-2.0", "LGPL-2.1", "LGPL-3.0", "LGPL-2.0-or-later",
    "LGPL-2.1-or-later", "LGPL-3.0-or-later",
    "MPL", "MPL-1.1", "MPL-2.0",
    "CC0", "CC0-1.0", "WTFPL",
    "Anti-996", "Anti-996-License",
    "Hippocratic", "Hippocratic-2.1",
})

BLOCK_LICENSES: frozenset[str] = frozenset({
    "GPL", "GPL-1.0", "GPL-2.0", "GPL-3.0",
    "GPL-2.0-or-later", "GPL-3.0-or-later",
    "AGPL", "AGPL-1.0", "AGPL-3.0",
    "AGPL-3.0-or-later",
    "SSPL", "SSPL-1.0", "Server Side Public License",
    "BUSL", "BUSL-1.1",
    "BSL", "BSL-1.1",
    "Commons-Clause",
    "Elastic", "Elastic-2.0",
})

Result = Literal["allow", "block"]

def classify_license(license_str: str | None) -> Result:
    """Classify a license string. Returns 'allow' or 'block'.

    Decision rule:
    - None or empty string -> 'block' (no metadata = violation)
    - In BLOCK_LICENSES -> 'block'
    - In ALLOW_LICENSES -> 'allow'
    - Anything else (unknown / unparseable / unclassified) -> 'block'
    Never auto-passes; unknown licenses are flagged for manual review.
    """
    if not license_str:
        return "block"
    normalized = license_str.strip()
    if normalized in BLOCK_LICENSES:
        return "block"
    if normalized in ALLOW_LICENSES:
        return "allow"
    return "block"

@dataclass
class Violation:
    kind: Literal["license", "cve", "pin", "spdx"]
    target: str
    detail: str

    def format_stdout(self) -> str:
        return f"{self.kind.upper()}_VIOLATION target={self.target} detail={self.detail!r}"

def check_pins(pyproject_path: Path) -> list[Violation]:
    """Parse pyproject.toml and flag any dep without a version specifier."""
    with pyproject_path.open("rb") as f:
        data = tomllib.load(f)
    violations: list[Violation] = []
    for dep in data.get("project", {}).get("dependencies", []):
        name = re.split(r"[<>=!~;\[ ]", dep, maxsplit=1)[0].strip()
        # Only inspect the requirement itself: an environment marker after ';'
        # (e.g. 'foo; python_version >= "3.11"') is not a version pin.
        requirement = dep.split(";", 1)[0]
        has_specifier = any(op in requirement for op in ("<", ">", "=", "~", "!"))
        if not has_specifier:
            violations.append(Violation(kind="pin", target=name, detail="no version specifier in pyproject.toml"))
    return violations

SPDX_PATTERN = re.compile(r"SPDX-License-Identifier:\s*(\S+)", re.IGNORECASE)

def check_source_headers(src_dir: Path) -> list[Violation]:
    """Walk src_dir for .py files; flag any with a non-permissive SPDX."""
    violations: list[Violation] = []
    for py_file in src_dir.rglob("*.py"):
        try:
            text = py_file.read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue
        head = "\n".join(text.splitlines()[:20])
        m = SPDX_PATTERN.search(head)
        if m and classify_license(m.group(1)) == "block":
            violations.append(Violation(
                kind="spdx",
                target=str(py_file),
                detail=f"license={m.group(1)!r}",
            ))
    return violations

def check_licenses() -> list[Violation]:
    """Check each installed distribution's license against the policy.

    Iterates importlib.metadata.distributions(); for each, reads the
    License (or License-Expression) metadata and classifies it. If
    classify_license returns 'block', the dep is a violation.
    """
    violations: list[Violation] = []
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        license_str = dist.metadata.get("License") or dist.metadata.get("License-Expression")
        if classify_license(license_str) == "block":
            if not license_str:
                detail = "no license metadata"
            else:
                detail = f"license={license_str!r}"
            violations.append(Violation(kind="license", target=name, detail=detail))
    return violations

import shutil

def check_cves() -> list[Violation]:
    """Run pip-audit as a subprocess; parse JSON output for CVEs.

    If pip-audit is not installed, this is a no-op (returns []). The script
    logs a warning so the user knows the CVE check was skipped.
    """
    if shutil.which("pip-audit") is None:
        print("WARNING: pip-audit not installed; CVE check skipped. Install via 'uv tool install pip-audit'.", file=sys.stderr)
        return []
    try:
        result = subprocess.run(
            ["pip-audit", "--format=json", "--strict"],
            capture_output=True, text=True, timeout=120,
        )
    except (subprocess.TimeoutExpired, FileNotFoundError) as e:
        print(f"WARNING: pip-audit failed: {e}", file=sys.stderr)
        return []
    if result.returncode != 0 and not result.stdout.strip():
        print(f"WARNING: pip-audit returned non-zero with no output: {result.stderr}", file=sys.stderr)
        return []
    try:
        data = json.loads(result.stdout)
    except json.JSONDecodeError as e:
        # Warn instead of silently reporting zero CVEs on unparseable output.
        print(f"WARNING: could not parse pip-audit JSON output: {e}", file=sys.stderr)
        return []
    violations: list[Violation] = []
    for dep in data.get("dependencies", []):
        name = dep.get("name", "<unknown>")
        for vuln in dep.get("vulns", []):
            cve_id = vuln.get("id", "<unknown>")
            fix = ", ".join(vuln.get("fix_versions", []) or ["<unknown>"])
            severity = vuln.get("severity", "unknown")
            violations.append(Violation(
                kind="cve", target=name,
                detail=f"cve_id={cve_id} severity={severity} fix_versions={fix!r}",
            ))
    return violations

def main() -> int:
    import argparse
    parser = argparse.ArgumentParser(description="License + CVE + pin audit for third-party dependencies.")
    parser.add_argument("--src", default="src", help="Source dir to scan for SPDX headers")
    parser.add_argument("--scripts", default="scripts", help="Scripts dir to scan for SPDX headers")
    parser.add_argument("--pyproject", default="pyproject.toml", help="Path to pyproject.toml")
    parser.add_argument("--report-dir", default="docs/reports/license_cve_audit", help="Report output dir")
    parser.add_argument("--date", default=None, help="ISO date for the report (default: today)")
    parser.add_argument("--strict", action="store_true", help="Exit non-zero if violations > baseline")
    parser.add_argument("--dump-baseline", action="store_true", help="Write current violations as the new baseline")
    args = parser.parse_args()

    violations: list[Violation] = []
    violations.extend(check_licenses())
    violations.extend(check_cves())
    violations.extend(check_pins(Path(args.pyproject)))
    src_dir = Path(args.src)
    if src_dir.exists():
        violations.extend(check_source_headers(src_dir))
    scripts_dir = Path(args.scripts)
    if scripts_dir.exists():
        violations.extend(check_source_headers(scripts_dir))

    for v in violations:
        print(v.format_stdout())

    from datetime import date
    date_str = args.date or date.today().isoformat()
    report_dir = Path(args.report_dir) / date_str
    report_dir.mkdir(parents=True, exist_ok=True)
    report_path = report_dir / "initial.md"
    _write_report(violations, report_path, args)

    if args.strict:
        baseline_path = Path(args.report_dir).parent / "scripts" / "audit_license_cve.baseline.json"
        if baseline_path.exists():
            baseline = json.loads(baseline_path.read_text(encoding="utf-8"))
            baseline_n = len(baseline.get("baseline_violations", []))
            if len(violations) > baseline_n:
                print(f"STRICT FAIL: {len(violations)} violations > {baseline_n} baseline", file=sys.stderr)
                return 1

    if args.dump_baseline:
        baseline_path = Path(args.report_dir).parent / "scripts" / "audit_license_cve.baseline.json"
        baseline_path.parent.mkdir(parents=True, exist_ok=True)
        baseline_path.write_text(json.dumps({
            "schema_version": 1,
            "baseline_violations": [v.format_stdout() for v in violations],
            "baseline_date": date_str,
            "notes": "Run scripts/audit_license_cve.py --dump-baseline to regenerate.",
        }, indent=2), encoding="utf-8")
        print(f"Wrote {baseline_path}")

    return 0

def _write_report(violations: list[Violation], path: Path, args) -> None:
    from datetime import date
    by_kind: dict[str, list[Violation]] = {"license": [], "cve": [], "pin": [], "spdx": []}
    for v in violations:
        by_kind.setdefault(v.kind, []).append(v)
    # Fall back to today's ISO date (as main() does) so the title never reads "today".
    report_date = args.date or date.today().isoformat()
    lines: list[str] = [
        f"# License & CVE Audit - {report_date}",
        "",
        "## Top-level summary",
        "",
        f"- License violations: {len(by_kind['license'])}",
        f"- CVEs found: {len(by_kind['cve'])}",
        f"- Pinning issues: {len(by_kind['pin'])}",
        f"- SPDX violations in src/ or scripts/: {len(by_kind['spdx'])}",
        "",
        "## Notes",
        "",
        "- No `LICENSE` file in repo root - informational, not a violation. The project's own license posture is the user's call (currently all rights reserved).",
        "- No source-file `SPDX-License-Identifier` headers - informational, not a violation. The project's own copyright headers are the user's call.",
        "- If pip-audit is not installed, the CVE check is skipped. Install via `uv tool install pip-audit` to enable.",
        "",
        "## Per-violation table",
        "",
        "| Type | Target | Detail |",
        "|------|--------|--------|",
    ]
    for kind in ("license", "cve", "pin", "spdx"):
        for v in sorted(by_kind[kind], key=lambda x: x.target):
            lines.append(f"| {v.kind} | `{v.target}` | {v.detail} |")
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    print(f"Wrote {path}")

if __name__ == "__main__":
    sys.exit(main())
@@ -0,0 +1,190 @@
"""Tests for scripts/audit_license_cve."""
from pathlib import Path
import json
import sys

sys.path.insert(0, str(Path(__file__).resolve().parent.parent))

import pytest
from scripts.audit_license_cve import classify_license, Violation

def test_classify_license_mit() -> None:
    assert classify_license("MIT") == "allow"

def test_classify_license_bsd_3_clause() -> None:
    assert classify_license("BSD-3-Clause") == "allow"
    assert classify_license("BSD") == "allow"

def test_classify_license_apache_2() -> None:
    assert classify_license("Apache-2.0") == "allow"
    assert classify_license("Apache 2.0") == "allow"

def test_classify_license_lgpl() -> None:
    assert classify_license("LGPL-2.1") == "allow"
    assert classify_license("LGPL-3.0") == "allow"

def test_classify_license_mpl_2() -> None:
    assert classify_license("MPL-2.0") == "allow"

def test_classify_license_cc0_wtfpl() -> None:
    assert classify_license("CC0-1.0") == "allow"
    assert classify_license("WTFPL") == "allow"

def test_classify_license_gpl_blocks() -> None:
    assert classify_license("GPL-2.0") == "block"
    assert classify_license("GPL-3.0") == "block"
    assert classify_license("GPL") == "block"

def test_classify_license_agpl_blocks() -> None:
    assert classify_license("AGPL-3.0") == "block"
    assert classify_license("AGPL") == "block"

def test_classify_license_sspl_blocks() -> None:
    assert classify_license("SSPL-1.0") == "block"
    assert classify_license("Server Side Public License") == "block"

def test_classify_license_bsl_blocks() -> None:
    assert classify_license("BUSL-1.1") == "block"
    assert classify_license("BSL-1.1") == "block"

def test_classify_license_commons_clause_blocks() -> None:
    assert classify_license("Apache-2.0 WITH Commons-Clause") == "block"
    assert classify_license("Commons-Clause") == "block"

def test_classify_license_elastic_blocks() -> None:
    assert classify_license("Elastic-2.0") == "block"

def test_classify_license_anti_996_allows() -> None:
    assert classify_license("Anti-996") == "allow"
    assert classify_license("Anti-996-License") == "allow"

def test_classify_license_hippocratic_allows() -> None:
    assert classify_license("Hippocratic-2.1") == "allow"

def test_classify_license_unknown_blocks() -> None:
    assert classify_license("UNKNOWN") == "block"
    assert classify_license("Custom") == "block"
    assert classify_license("see AUTHORS") == "block"
    assert classify_license("") == "block"
    assert classify_license(None) == "block"

def test_classify_license_random_string_blocks() -> None:
    """Unknown / unclassified licenses are violations, never auto-passes."""
    assert classify_license("Made Up License v1.0") == "block"
    assert classify_license("Proprietary-EULA") == "block"

from scripts.audit_license_cve import check_pins

def test_check_pins_no_specifier(tmp_path: Path) -> None:
    pyproject = tmp_path / "pyproject.toml"
    pyproject.write_text(
        '[project]\nname = "x"\nversion = "0.1.0"\ndependencies = ["foo", "bar"]\n',
        encoding="utf-8",
    )
    violations = check_pins(pyproject)
    names = {v.target for v in violations}
    assert "foo" in names
    assert "bar" in names

def test_check_pins_with_specifier(tmp_path: Path) -> None:
    pyproject = tmp_path / "pyproject.toml"
    # "~=" is the compatible-release operator; a bare "~" is not a valid specifier.
    pyproject.write_text(
        '[project]\nname = "x"\nversion = "0.1.0"\ndependencies = ["foo>=1.0.0", "bar~=2.0.0", "baz==3.0.0"]\n',
        encoding="utf-8",
    )
    violations = check_pins(pyproject)
    assert violations == []

def test_check_pins_exact_version_ok(tmp_path: Path) -> None:
    """Exact pins are fine - they have a lower bound (==X)."""
    pyproject = tmp_path / "pyproject.toml"
    pyproject.write_text(
        '[project]\nname = "x"\nversion = "0.1.0"\ndependencies = ["foo==1.0.0"]\n',
        encoding="utf-8",
    )
    violations = check_pins(pyproject)
    assert violations == []

from scripts.audit_license_cve import check_source_headers

def test_check_source_headers_gpl_violation(tmp_path: Path) -> None:
    src = tmp_path / "src"
    src.mkdir()
    (src / "foo.py").write_text(
        "# SPDX-License-Identifier: GPL-3.0\n# A file.\n",
        encoding="utf-8",
    )
    violations = check_source_headers(src)
    assert any("foo.py" in v.target and "GPL" in v.detail for v in violations)

def test_check_source_headers_no_spdx_ok(tmp_path: Path) -> None:
    """No SPDX line = no violation (informational note; project's own copyright is user's call)."""
    src = tmp_path / "src"
    src.mkdir()
    (src / "bar.py").write_text("# A file with no SPDX.\n", encoding="utf-8")
    violations = check_source_headers(src)
    assert violations == []

def test_check_source_headers_mit_ok(tmp_path: Path) -> None:
    src = tmp_path / "src"
    src.mkdir()
    (src / "baz.py").write_text("# SPDX-License-Identifier: MIT\n# A file.\n", encoding="utf-8")
    violations = check_source_headers(src)
    assert violations == []

from scripts.audit_license_cve import check_licenses

def test_check_licenses_via_metadata(monkeypatch) -> None:
    """The license check iterates installed distributions and classifies each."""
    class FakeDist:
        def __init__(self, name: str, license_str: str | None) -> None:
            self.metadata = {"Name": name, "License": license_str, "Version": "1.0.0"}
    fake_dists = [
        FakeDist("good-pkg", "MIT"),
        FakeDist("bad-pkg", "GPL-3.0"),
        FakeDist("unknown-pkg", "UNKNOWN"),
        FakeDist("missing-pkg", None),
    ]
    monkeypatch.setattr("importlib.metadata.distributions", lambda: fake_dists)
    violations = check_licenses()
    names = {v.target for v in violations}
    assert "bad-pkg" in names
    assert "unknown-pkg" in names
    assert "missing-pkg" in names
    assert "good-pkg" not in names

from scripts.audit_license_cve import check_cves

def test_check_cves_pip_audit_not_installed(monkeypatch) -> None:
    """If pip-audit is not on PATH, the CVE check is a no-op (not a failure)."""
    monkeypatch.setattr("shutil.which", lambda cmd: None if cmd == "pip-audit" else "/usr/bin/" + cmd)
    violations = check_cves()
    assert violations == []

def test_check_cves_pip_audit_json(monkeypatch) -> None:
    """If pip-audit is installed, parse its JSON output."""
    # check_cves() runs pip-audit with text=True, so the fake stdout/stderr are str.
    fake_json = json.dumps({
        "dependencies": [
            {"name": "vuln-pkg", "version": "1.0.0", "vulns": [
                {"id": "CVE-2024-12345", "fix_versions": [">=1.2.3"], "severity": "high"}
            ]},
        ],
    })
    class FakeCompleted:
        stdout = fake_json
        returncode = 0
        stderr = ""
    monkeypatch.setattr("shutil.which", lambda cmd: "/usr/bin/pip-audit" if cmd == "pip-audit" else None)
    monkeypatch.setattr("subprocess.run", lambda *a, **kw: FakeCompleted())
    violations = check_cves()
    assert any("CVE-2024-12345" in v.detail and v.target == "vuln-pkg" for v in violations)

def test_main_smoke_runs(tmp_path: Path, capsys) -> None:
    """The script runs end-to-end in informational mode; exit code 0."""
    import subprocess
    # sys.executable runs the script with the same interpreter as the test session.
    result = subprocess.run(
        [sys.executable, "-m", "scripts.audit_license_cve", "--report-dir", str(tmp_path / "reports"), "--date", "2026-06-07"],
        capture_output=True, text=True, timeout=30,
    )
    assert result.returncode == 0
    assert "Wrote" in result.stdout