ed
8bf7cd175b
docs(tier2): add user guide for Tier 2 autonomous sandbox
2026-06-16 22:48:13 -04:00
ed
3e17aa6c8b
test(tier2): add smoke e2e test (opt-in, double-gate TIER2_SANDBOX_TESTS+TIER2_SMOKE)
2026-06-16 22:26:04 -04:00
ed
5b6e7db174
test(tier2): add sandbox enforcement test (pre-push hook refuses push)
2026-06-16 20:25:44 -04:00
ed
5d150dc6e0
test(tier2): add bootstrap -WhatIf test (opt-in via TIER2_SANDBOX_TESTS)
2026-06-16 20:01:32 -04:00
ed
37eafc008e
test(tier2): add trivial smoke track for e2e test (force-added, fixture)
2026-06-16 19:57:36 -04:00
ed
cb7c82008e
test(tier2): add tier2_sandbox and tier2_smoke pytest markers
2026-06-16 19:56:20 -04:00
ed
e487d34b40
feat(tier2): add post-checkout detection hook (logs to tier2_checkout_log.txt)
2026-06-16 19:51:16 -04:00
ed
01be39236b
feat(tier2): add pre-push hook that refuses all pushes
2026-06-16 19:50:58 -04:00
ed
cba5457b9d
feat(tier2): add run_tier2_sandboxed.ps1 launcher with restricted token (skeleton)
2026-06-16 19:49:47 -04:00
ed
a9be60ae50
feat(tier2): add setup_tier2_clone.ps1 bootstrap script with -WhatIf support
2026-06-16 19:47:06 -04:00
ed
796da0de60
feat(tier2): add run_track.py CLI with init/status/report modes + git fetch/switch
2026-06-16 19:27:08 -04:00
ed
9964ad3b3e
test(tier2): add 12 slash command + agent + config spec contract tests
2026-06-16 19:23:10 -04:00
ed
154a370728
feat(tier2): add opencode.json.fragment with deny rules + path allowlist
2026-06-16 19:19:37 -04:00
ed
016381c4ff
feat(tier2): create tier2-autonomous agent profile template
2026-06-16 19:18:36 -04:00
ed
7380e23bc0
feat(tier2): create tier-2-auto-execute slash command template
2026-06-16 19:17:41 -04:00
ed
73ab2778ca
feat(report): implement write_failure_report + 8 tests, 100% coverage
2026-06-16 19:13:30 -04:00
ed
5ca8444f35
test(report): add report writer tests (red, opt-in via TIER2_SANDBOX_TESTS=1)
2026-06-16 19:10:22 -04:00
ed
2dbfaeb60e
test(failcount): add 13 unit tests + 6 coverage tests; 100% coverage achieved
2026-06-16 19:06:09 -04:00
ed
190766fe03
feat(failcount): add default failcount.toml thresholds
2026-06-16 19:01:31 -04:00
ed
fc92e1aa74
feat(failcount): add FailcountState + FailcountConfig dataclasses + all stub functions
2026-06-16 18:59:38 -04:00
ed
e646067a8a
test(failcount): add test_initial_state_zero (red)
2026-06-16 18:58:00 -04:00
ed
9f2ff29c2e
feat(tier2): create scripts/tier2/ package
2026-06-16 18:57:09 -04:00
ed
e060399579
conductor(plan): add state.toml for tier2_autonomous_sandbox track
...
44 tasks across 9 phases, all pending. Tracks:
- failcount unit test progression (13 target)
- slash command spec tests (11 target)
- report writer tests (4 opt-in)
- bootstrap test (1 opt-in)
- sandbox enforcement test (1 opt-in)
- smoke e2e test (1 opt-in, double gate)
Enforcement stack contract: 9 flags tracking the 4 git bans + filesystem
boundary + 3 hook installs + OpenCode deny rules + Windows restricted token.
Final verification requires all 9 enforcement flags = true.
status: active, current_phase: 0, blocked_by: none, blocks: none
2026-06-16 18:51:42 -04:00
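A minimal sketch of the "double gate" on the smoke e2e test, assuming both environment variables must equal "1". The variable names TIER2_SANDBOX_TESTS and TIER2_SMOKE come from this log; the helper names are hypothetical and are not taken from the repository.

```python
import os

# Variable names appear in the commit log; everything else here is illustrative.
SANDBOX_FLAG = "TIER2_SANDBOX_TESTS"
SMOKE_FLAG = "TIER2_SMOKE"


def sandbox_tests_enabled(env=None):
    """Return True when the opt-in sandbox tests are switched on."""
    env = os.environ if env is None else env
    return env.get(SANDBOX_FLAG) == "1"


def smoke_test_enabled(env=None):
    """Return True only when both gates are set (the double gate)."""
    env = os.environ if env is None else env
    return sandbox_tests_enabled(env) and env.get(SMOKE_FLAG) == "1"
```

In a pytest suite, a check like this could feed `pytest.mark.skipif(not smoke_test_enabled(), reason="...")` so the smoke test is skipped unless both variables are set.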
ed
2551ff18c7
no t-shirt nonsense (agents.md)
2026-06-16 18:47:50 -04:00
ed
6a26713d74
conductor(plan): Tier 2 autonomous sandbox - implementation plan + metadata
...
9 phases, 30+ tasks, scope-only (no T-shirt size per user feedback):
- Phase 1: failcount module (15 TDD tasks, 13 unit tests, 100% coverage target)
- Phase 2: failure report writer (4 sections, opt-in tests)
- Phase 3: slash command + agent + opencode.json.fragment templates (11 spec tests)
- Phase 4: run_track.py CLI entry point (duplicates slash command protocol)
- Phase 5: setup_tier2_clone.ps1 bootstrap (idempotent, -WhatIf support)
- Phase 6: run_tier2_sandboxed.ps1 launcher (restricted token skeleton v1)
- Phase 7: git hooks (pre-push refuses all pushes, post-checkout logs)
- Phase 8: opt-in tests (TIER2_SANDBOX_TESTS=1, TIER2_SMOKE=1)
- Phase 9: user guide + tracks.md registration + metadata
Key contracts:
- FailcountState dataclass with 3 signals (red/green/no_progress)
- Result-style with to_dict/from_dict for state persistence
- Atomic write via tmp + os.replace
- 3-layer enforcement: OpenCode permission system + Windows restricted token + git hooks
2026-06-16 18:46:36 -04:00
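The first three key contracts above could look roughly like this. It is a sketch under assumptions, not the track's implementation: the integer counter fields, the JSON file format, and the save_state/load_state helper names are all hypothetical. Only the FailcountState name, the three signals, to_dict/from_dict, and the tmp + os.replace pattern come from the plan.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass


@dataclass
class FailcountState:
    # The three signals named in the plan; integer counters are an assumption.
    red: int = 0
    green: int = 0
    no_progress: int = 0

    def to_dict(self):
        return asdict(self)

    @classmethod
    def from_dict(cls, data):
        return cls(
            red=int(data.get("red", 0)),
            green=int(data.get("green", 0)),
            no_progress=int(data.get("no_progress", 0)),
        )


def save_state(state, path):
    """Write to a temp file in the target directory, then os.replace it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state.to_dict(), handle)
        # os.replace swaps the file in one step, so readers never see a partial write.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_state(path):
    with open(path, "r", encoding="utf-8") as handle:
        return FailcountState.from_dict(json.load(handle))
```

Creating the temp file in the same directory as the target keeps both on one filesystem, which os.replace needs for the swap to be atomic.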
ed
568804c7d9
conductor(spec): drop T-shirt size per user feedback
2026-06-16 18:38:09 -04:00
ed
024938bd46
conductor(spec): Tier 2 autonomous sandbox track spec
2026-06-16 18:31:48 -04:00
ed
88e44d1c0e
docs(report): add session report (audit + migration plan + tech-rot prevention)
2026-06-16 10:48:15 -04:00
ed
b90d4bdd4e
feat(scripts): add --ci alias for --strict + CI-gate doc updates
2026-06-16 10:40:21 -04:00
ed
ce85c379ad
docs(agents): add Convention Enforcement section at the top (4 mechanisms)
2026-06-16 10:37:35 -04:00
ed
734840375f
docs(guidelines): add AI Agent Obligations section with 4 enforcement audit scripts
2026-06-16 10:35:55 -04:00
ed
ef1b0a1c6d
docs(styleguide): add AI Agent Checklist section against tech rot
2026-06-16 10:29:26 -04:00
ed
4a55a14fc0
conductor: register result_migration_20260616 in tracks.md (umbrella + 5 sub-tracks)
2026-06-16 10:26:10 -04:00
ed
4cf885da90
docs(workflow+agents): add HARD BAN on day estimates + Tier 1 Track Initialization Rules section
2026-06-16 10:16:49 -04:00
ed
ed6602274d
docs(tracks): strip day estimates from exception_handling_audit + rag_test_failures (Tier 1 rule)
2026-06-16 10:16:17 -04:00
ed
4c0b19b4db
conductor(track): spec/plan/metadata for result_migration_20260616 (5 sub-tracks, NO day estimates)
2026-06-16 10:15:46 -04:00
ed
4521a7df96
feat(scripts): add --summary and --by-size modes to exception_handling audit
2026-06-16 09:41:20 -04:00
ed
01fbd62a3f
conductor(track): mark exception_handling_audit_20260616 as completed
2026-06-16 09:10:14 -04:00
ed
4b8363bd71
conductor: register exception_handling_audit_20260616 in tracks.md
2026-06-16 09:09:34 -04:00
ed
3c59e24162
docs(report): add exception handling audit report (211 violations across 42 files)
2026-06-16 09:07:42 -04:00
ed
4209523228
docs(app_controller+guidelines): add Exception Handling section + audit script cross-reference
2026-06-16 09:07:24 -04:00
ed
b447f66818
docs(styleguide): add 5 sections clarifying the convention's boundaries
2026-06-16 09:06:54 -04:00
ed
9a04153abd
feat(scripts): add exception_handling audit script (10-category classification)
2026-06-16 09:06:25 -04:00
ed
3c267f6b9c
conductor(track): metadata.json for exception_handling_audit_20260616
2026-06-16 09:05:59 -04:00
ed
a33bfb0abd
conductor(track): plan for exception_handling_audit_20260616 (5 phases, ~12 tasks)
2026-06-16 09:05:40 -04:00
ed
e81413a2cd
conductor(track): spec for exception_handling_audit_20260616 (audit + doc clarification)
2026-06-16 09:05:19 -04:00
ed
3d35bb5b3f
todo
2026-06-16 01:03:59 -04:00
ed
ff91c4e8b0
docs(report): add completion report for rag_test_failures_20260615
...
Comprehensive 12-section completion report following the format of
TRACK_COMPLETION_ai_loop_regressions_20260615.md. Documents:
- 4 atomic commits, fully green baseline (1288 pass + 4 skip + 0 fail)
- 2 defensive guards in src/rag_engine.py (lines 150 and 331)
- 3 new unit tests in tests/test_rag_sync_none_error.py
- 4 plan deviations (spec wrong about root cause, test_rag_visual_sim
was already passing, traceback diagnostic was a dead end, temp dir
cleanup retry loop for Windows)
- 5 followup recommendations for Tier 1 review
2026-06-16 00:36:24 -04:00
ed
ba04363003
conductor(track): mark rag_test_failures_20260615 as completed
...
Updated metadata.json: status=completed, completed_at=2026-06-15,
verification_criteria filled with actual results.
Updated tracks.md: status=shipped, 4-commit summary, test file added.
Final result: 1288 pass + 4 skip + 0 fail. All 11 batched test tiers pass
in 873.6s. First fully green baseline since 2026-06-12.
2026-06-16 00:31:26 -04:00
ed
d89c58103d
docs(rag): add troubleshooting section for NoneType.get error
...
Documents the two bugs fixed in the rag_test_failures_20260615 track:
1. get_all_indexed_paths: m.get('path') failing on None metadata
2. _validate_collection_dim_result: 'if not embeddings' raising
ValueError on non-empty numpy arrays
Also documents the 'no such table: tenants' chromadb corruption
symptom (wipe .slop_cache/chroma_* to recover).
Plus: in 'rag_status', the 'error: ' prefix is the failure indicator;
the actual error message is the part after the prefix.
2026-06-16 00:28:53 -04:00
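The two bug patterns described above can be illustrated as follows. This is a sketch, not the code in src/rag_engine.py: indexed_paths and has_embeddings are hypothetical names, and ArrayLike is a small stand-in that copies only one numpy behaviour, raising ValueError when truth-tested, so the example runs without numpy.

```python
class ArrayLike:
    """Hypothetical stand-in for a numpy array: truth-testing raises ValueError."""

    def __init__(self, rows):
        self._rows = list(rows)

    def __len__(self):
        return len(self._rows)

    def __bool__(self):
        raise ValueError("truth value of an array is ambiguous")


def indexed_paths(metadatas):
    """Collect 'path' values, skipping entries whose metadata is None."""
    if metadatas is None:
        return []
    paths = []
    for m in metadatas:
        if m is None:
            # Guard for bug 1: calling m.get('path') on None raises AttributeError.
            continue
        path = m.get("path")
        if path is not None:
            paths.append(path)
    return paths


def has_embeddings(embeddings):
    """Guard for bug 2: check emptiness without evaluating the array's truth value."""
    return embeddings is not None and len(embeddings) > 0
```

The len-based check behaves the same for plain lists and for array-like objects, which is why it is a safer replacement for 'if not embeddings'.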