manual_slop

Author	SHA1	Message	Date
ed	cfdf8988fb	feat(audit): add scripts/audit_dataclass_coverage.py + baseline (t0_2) GREEN phase for Phase 0. Mirrors scripts/audit_weak_types.py design with 3 additions specific to the any-type componentization track: 1. PROMOTED_SITE_MODULES allowlist: the 3 new src/ modules (mcp_tool_specs.py, openai_schemas.py, provider_state.py) are exempt from Any-counting (their new dataclasses intentionally have raw_response: Any and SDK holder fields that stay as Any per Pattern 3). 2. INLINE_PROMOTED_SITE_MODULES: log_registry.py + api_hooks.py get their dataclasses added inline in Phase 4 + 5 (not new modules); same exemption. 3. Combined counter: counts both Any AND weak-struct patterns (dict_str_any, list_of_dict, optional_dict, etc.). Modes: - default: informational (exits 0; prints human report) - --json: machine-readable with by_file, by_category, total_weak - --strict: CI gate (exits 1 when current > baseline) - --baseline: path to baseline file (default: scripts/audit_dataclass_coverage.baseline.json) Baseline: scripts/audit_dataclass_coverage.baseline.json = 207 weak sites (captured pre-Phase-1; expected to drop to ~118 after 89 sites promoted). Verification: uv run python scripts/audit_dataclass_coverage.py --strict STRICT OK: 207 weak sites <= baseline 207 uv run pytest tests/test_audit_dataclass_coverage.py --timeout=30 7 passed in 5.15s	2026-06-21 15:56:41 -04:00
ed	647ad3d49d	test(audit): add tests/test_audit_dataclass_coverage.py (t0_1) RED phase for Phase 0. Mirrors tests/test_audit_weak_types.py structure: - test_audit_script_exists: AUDIT_SCRIPT.is_file() sanity - test_audit_help_runs: --help exits 0 - test_audit_json_mode_emits_valid_json: --json emits valid JSON with expected fields - test_audit_default_mode_emits_human_report: default mode prints a report - test_audit_strict_mode_against_existing_baseline_passes: --strict exits 0 when current <= baseline - test_audit_strict_mode_fails_when_baseline_is_zero: --strict exits 1 when current > baseline=0 - test_audit_baseline_field_shape: --json output has expected baseline-shape fields 7 tests total. Run with: uv run pytest tests/test_audit_dataclass_coverage.py --timeout=30 NOTE: 6 of 7 tests fail at this commit (audit script not yet implemented). This is the RED phase; GREEN comes in the next commit.	2026-06-21 15:56:19 -04:00
ed	3669ce590c	conductor(plan): author plan.md for any_type_componentization_20260621 The spec.md was approved 2026-06-21 without a plan.md (the metadata.json noted 'plan.md (to be authored by writing-plans skill after spec approval)'). This plan mirrors the state.toml's per-task ledger and specifies the TDD protocol, tier-3 delegation conventions, hard bans, failcount contract, and per-phase verification commands. Plan structure: 7 phases, 61 tasks, ~50 atomic commits per the spec. Reads all 13 conductor/code_styleguides/*.md per the agent mandate.	2026-06-21 15:53:28 -04:00
ed	f1c23c7da5	conductor(plan): any_type_componentization_20260621 - 7 phases, 23 tasks, ~150 TDD steps Implements the 5 fat-struct candidates from docs/reports/ANY_TYPE_AUDIT_20260621.md: - Phase 0: JsonValue TypeAlias + audit_dataclass_coverage.py + styleguide section 12 - Phase 1: src/mcp_tool_specs.py (P1, 8 sites) - Phase 2: src/openai_schemas.py (P1, 17 sites) - Phase 3: src/provider_state.py (P2, 41 sites) - Phase 4: src/log_registry.py Session (P2, 7 sites) - Phase 5: src/api_hooks.py WebSocketMessage (P3, 16 sites) - Phase 6: verify + docs + archive Blocked by data_structure_strengthening_20260606 (pending merge). Sequencing: NOT blocked by code_path_audit_20260607 (orthogonal tracks). Tier 2 autonomous sandbox will execute via: /tier-2-auto-execute any_type_componentization_20260621 Spec: conductor/tracks/any_type_componentization_20260621/spec.md (approved 2026-06-21) Plan: this commit State: conductor/tracks/any_type_componentization_20260621/state.toml Metadata: conductor/tracks/any_type_componentization_20260621/metadata.json	2026-06-21 15:46:25 -04:00
ed	46a2245658	conductor(plan): mark Phase 0+1+2 init tasks complete in umbrella plan.md	2026-06-21 15:45:39 -04:00
ed	ebadfda9d6	docs(reports): TRACK_COMPLETION for video_analysis_campaign_20260621 (Phase 0+1+2 init only)	2026-06-21 15:44:06 -04:00
ed	365fa554d9	conductor(plan): mark Phase 0+1 complete + Phase 2 init complete in umbrella state.toml	2026-06-21 15:42:39 -04:00
ed	c1a15c45c5	conductor(tracks): scaffold plan.md + metadata.json + state.toml for 12 child + 1 synthesis tracks	2026-06-21 15:41:38 -04:00
ed	548c4fef63	feat(video_analysis): synthesize_report.py orchestrator with TDD (5 tests)	2026-06-21 15:39:22 -04:00
ed	ed0d198afe	feat(video_analysis): ocr_frames.py with TDD (4 tests, winsdk + tesseract backends)	2026-06-21 15:35:41 -04:00
ed	9ccdedeeb3	feat(video_analysis): extract_keyframes.py with TDD (4 tests)	2026-06-21 15:34:18 -04:00
ed	45a5e81406	feat(video_analysis): download_video.py with TDD (5 tests)	2026-06-21 15:32:46 -04:00
ed	94f4a4eee9	feat(video_analysis): extract_transcript.py with TDD (8 tests)	2026-06-21 15:31:42 -04:00
ed	12fcc55cfc	chore(scripts): scaffold scripts/video_analysis/ + placeholder test	2026-06-21 15:26:56 -04:00
ed	1c05305a98	chore(deps): add yt-dlp, cv2, imagehash, pillow, youtube-transcript-api, winsdk, pytesseract for video_analysis campaign	2026-06-21 15:26:02 -04:00
ed	a22e0f5473	Merge branch 'tier2/data_structure_strengthening_20260606'	2026-06-21 15:15:22 -04:00
ed	3529161b0f	conductor(track): add TIER2_STARTER.md for video_analysis_campaign dispatch 3 prompt templates for Tier 2 autonomous agents: 1. Umbrella Tier 2 (Phase 0+1+2 init): installs tooling, builds 5 scripts, scaffolds 12 children 2. Per-child Tier 2 (one child's 5-phase pipeline): Acquire, Keyframes, OCR, Synthesis, Verification 3. Synthesis Tier 2 (after all 12 children): cross-cutting per_video_summary.md + report.md Includes: file-read order, key risks, hard constraints, verification criteria, per-track Tier 2 dispatch commands, and a quick-reference table.	2026-06-21 15:13:24 -04:00
ed	6533b7120c	conductor(plan): enhance video_analysis_campaign plan with bite-sized Phase 0+1 Phase 0 (4 tasks): yt-dlp install, cv2/imagehash/PIL install, OCR backend decision, scripts/ namespace scaffold Phase 1 (5 tasks = 5 scripts): extract_transcript.py (8 tests), download_video.py (5 tests), extract_keyframes.py (4 tests), ocr_frames.py (4 tests), synthesize_report.py (5 tests) Phase 2-4: brief pointers (per-child plans deferred to Tier 2 during execution) Total: 26 unit tests across 5 test files. All scripts follow Result[T] convention + 1-space indent + type hints per project styleguides.	2026-06-21 15:08:20 -04:00
ed	de01131349	conductor(tracks): Register video_analysis_campaign_20260621 as active research track (row 26) - Added row 26 in Active Tracks table: priority A (research), independent, multi-pass handoff - Added detailed section under 'Active Research Tracks (2026-06+)' so the anchor link resolves - Documents: 12 videos in 5 clusters, per-child deliverables, reusable tooling, Phase 0 blockers, Pass 2/3 handoff contract	2026-06-21 15:05:58 -04:00
ed	1b40fa5345	conductor(video_analysis): Initialize 12 child + 1 synthesis spec scaffolds Each child spec is lightweight (~100 lines): references the umbrella, gives video details, specifies the 7 deliverables (transcript.json, frames/, ocr.md, report.md 1000-10000 LOC, summary.md), and the 5-phase pipeline. Children in execution order: 1. cs229_building_llms (Stanford CS229, Cluster E) 2. probability_logic (Cluster A) 3. entropy_epiplexity (Cluster A) 4. score_dynamics_giorgini (Cluster A) 5. platonic_intelligence_kumar (Cluster B) 6. free_lunches_levin (Cluster B) 7. generic_systems_fields (Cluster C) 8. brain_counterintuitive (Cluster C) 9. neural_dynamics_miller (Cluster C) 10. multiscale_hoffman (Cluster C) 11. cs336_architectures (Stanford CS336, Cluster E) 12. creikey_dl_cv (Cluster D) Plus 1 synthesis track (video_analysis_synthesis_20260621) blocked_by all 12 children.	2026-06-21 15:03:10 -04:00
ed	b184250b78	conductor(video_analysis_campaign): Initialize umbrella track + 12 child + 1 synthesis scaffold Pass 1 of 3 user research campaign (12 videos, 5 clusters). - Umbrella: spec.md (full design), plan.md, metadata.json, state.toml, README.md - Multi-pass framing (Pass 2 de-obfuscation, Pass 3 projection) - Lossless preservation directive (1000-10000 LOC per video report target) - Tooling prerequisites: yt-dlp, cv2, imagehash install in repo venv - 5 reusable scripts to live in scripts/video_analysis/ (TDD) - 12 children + 1 synthesis = 14 folders total	2026-06-21 15:02:44 -04:00
ed	aca84b881b	docs(reports): ANY_TYPE_AUDIT_20260621 - Any-type usage & componentization opportunities	2026-06-21 14:28:16 -04:00
ed	c4c45d4a54	conductor(plan): rewrite chronology_20260619 plan for v2 (11 phases, 4 pause points) Replaces the v1 plan (10 phases, single-stage cross-check) with an 11-phase plan that executes the v2 spec's git-history classifier + 3-stage cross-check + 30% quality gate. Plan Phase 2 = Spec Phase 2 part 1; renumbering shifts from Plan Phase 4 onwards (per the spec-vs-plan mapping in the summary table). 11 phases, 28 tasks, 4 hard pause points (Plan Phase 6 quality gate, Plan Phase 7 Tier 1 review, Plan Phase 10 user sign-off, plus the Plan Phase 6 ABORT fallback to manual review). TDD red+green cycles for Phases 2-4 (8 new tests for _classify_status + 4 for extract_summary + 3 for format_markdown + 5 for the quality gate). Test runner: scripts/run_tests_batched.py (per Tier 2 sandbox rule #1). Throw-away scripts: scripts/tier2/artifacts/chronology_20260619/ (rule #4). Default branch: master (rule #2). Line endings: preserve existing (rule #3).	2026-06-21 14:12:03 -04:00
ed	5c9249659f	conductor(spec): rewrite chronology_20260619 spec for v2 (git-history classifier + 30% quality gate) The first run shipped chronology.md with a status classifier that read stale metadata.json.status, marking 167/216 rows with wrong status. This v2 spec replaces FR1 (5-value status enum + per-row evidence + confidence), FR5 (git-history classifier with the 5-step algorithm from the handover), FR6 (3-stage cross-check), and adds FR7 (classifier quality gate at 30% low confidence threshold with abort-to-manual-review fallback). Substantive changes from v1: - 7 FRs (was 6); FR7 is new - 14 VCs (was 12); VC10-VC14 are new - 10 Risks (was 9) - 5-value status enum: Active / In Progress / Completed / Abandoned / Special (was 6-value: Shipped/Superseded/etc.) - Per-row evidence line format documented with worked example - 'Needs Review' section as a 5th section in chronology.md - Quality gate hard-codes the user's 'A only if classifier is good, else B' fallback design from chat 2026-06-21 Out of scope: 24 v1 commits + conductor/chronology.md.broken-v1 remain as the foundation; this is a continuation, not a re-do. state.toml still shows current_phase=10 from v1's false completion; the Tier 2 implementing agent will reset it in Phase 1.4 of the plan.	2026-06-21 14:08:40 -04:00
ed	6210410cda	conductor(plan): mark all phases/tasks complete in data_structure_strengthening_20260606	2026-06-21 13:07:58 -04:00
ed	bb4d85e4b4	conductor(tracks): mark data_structure_strengthening_20260606 as shipped	2026-06-21 13:05:52 -04:00
ed	d3205c7253	conductor(archive): ship data_structure_strengthening_20260606 to archive	2026-06-21 13:03:34 -04:00
ed	dff1dbb812	docs(reports): TRACK_COMPLETION_data_structure_strengthening_20260606	2026-06-21 13:03:07 -04:00
ed	60196a8723	docs(smoke): Phase 2 smoke test for data structure strengthening track	2026-06-21 13:02:00 -04:00
ed	c9c5abfbae	docs(product-guidelines): add Data Structure Conventions section	2026-06-21 13:01:19 -04:00
ed	7a52fca588	docs(styleguide): add canonical reference for type aliases convention	2026-06-21 12:59:41 -04:00
ed	f8990dae11	docs(type_registry): initial auto-generated registry (Phase 2)	2026-06-21 12:57:49 -04:00
ed	f7c16954d4	feat(generate_type_registry): AST-based registry generator with --check and --diff modes	2026-06-21 12:57:32 -04:00
ed	281cf0f01e	test(generate_type_registry): add red tests for the registry generator	2026-06-21 12:49:15 -04:00
ed	d81339ecb3	refactor(ai_client): _reread_file_items_result returns FileItemsDiff NamedTuple	2026-06-21 12:47:07 -04:00
ed	c147238970	conductor(plan): mark Phase 1 complete in data_structure_strengthening_20260606	2026-06-21 12:45:05 -04:00
ed	794ca91db0	conductor(plan): Phase 1 checkpoint - 8 commits; 528->112 weak sites (79% reduction)	2026-06-21 12:44:31 -04:00
ed	1985551f91	test(audit_weak_types): add tests for the audit script and --strict mode	2026-06-21 12:43:22 -04:00
ed	79c4b47b2b	chore(audit): generate baseline file (post-Phase-1: 112 weak sites, 79% reduction)	2026-06-21 12:41:34 -04:00
ed	dd26a79310	feat(audit_weak_types): add --strict mode for CI gate	2026-06-21 12:40:43 -04:00
ed	833e99f2ec	refactor(project_manager,aggregate,api_hook_client): replace weak type sites with aliases	2026-06-21 12:39:17 -04:00
ed	d0c0571bde	refactor(api_hook_client): replace weak type sites with aliases	2026-06-21 12:38:22 -04:00
ed	23b7b9357d	docs(reports): POST_CAMPAIGN_TEST_FIXES — closure for 3 failures 3 surgical test-side fixes shipped after the result-migration campaign was claimed '100% complete' (commit `0d11e917`). Each failure had a distinct root cause that bypassed the targeted track-level test sets: 1. test_phase_1_inventory_has_42_rows (tier-1-unit-gui): gitignored artifact deleted by cruft-removal at `b3508f0b` (commit `107d902d`) 2. test_live_warmup_canaries_endpoint (tier-3-live_gui): race with deferred warmup in live_gui subprocess (commit `69b7ab67`) 3. test_do_generate_uses_context_files (tier-1-unit-core): sandbox violation via paths.get_logs_dir default (commit `e2411e5c`) Full batched test suite: 11/11 tiers PASS. Campaign is now actually 100% complete. Report documents root causes, fixes, verification, and process learnings (rounds 6+7 of the false-completion pattern).	2026-06-21 12:36:41 -04:00
ed	57f0ddc815	refactor(app_controller): replace weak type sites with aliases	2026-06-21 12:33:51 -04:00
ed	852dea845f	refactor(ai_client): replace 192 weak type sites with aliases	2026-06-21 12:31:27 -04:00
ed	877bc0f06b	feat(type_aliases): add 10 TypeAliases + FileItemsDiff NamedTuple	2026-06-21 12:24:44 -04:00
ed	90d8c57a0f	test(type_aliases): add red tests for 10 TypeAliases + FileItemsDiff NamedTuple	2026-06-21 12:21:28 -04:00
ed	e2411e5c54	fix(test_sandbox): redirect session logs to tests/artifacts via autouse fixture Per FR1 of test_sandbox_hardening_20260619 spec, all writes must be under <project_root>/tests/. Tests that create an AppController + call init_state() trigger session_logger.open_session() at src/session_logger.py:85 which writes to paths.get_logs_dir() - by default logs/ at project root, outside tests/. This was triggered by tests/test_context_composition_decoupled.py and surfaced in the latest batched test run. Add a function-scoped autouse fixture in tests/conftest.py that monkeypatches src.paths.get_logs_dir to return a per-run tests/-allowed path. Per-run subdirectory prevents log_registry.toml collisions across test runs. Skips test_paths.py, test_test_sandbox.py, and test_app_controller_offloading.py which directly assert on paths.get_logs_dir() behavior or set up their own session via tmp_session_dir (overriding get_logs_dir at the module level breaks those tests' assertions). No production code is modified.	2026-06-21 11:59:51 -04:00
ed	69b7ab670d	fix(warmup_test): poll for canary records in live_gui test The live_gui subprocess spawns the desktop GUI, which creates AppController with defer_warmup=True (src/gui_2.py:318). Warmup is deferred until the first frame is painted (src/gui_2.py:1076). The previous test queried /api/warmup_canaries immediately after wait_for_server, racing against the first frame - canary list was empty until start_warmup() ran. Replace the immediate assert with a poll-with-retry loop (15s deadline, 0.5s interval) per workflow.md 'Async Setters Need Poll-For-State' rule.	2026-06-21 10:38:17 -04:00
ed	107d902d3c	fix(gui_2_result): regenerate PHASE1_SITE_INVENTORY.md via session fixture Tests/artifacts/PHASE1_SITE_INVENTORY.md was deleted by the cruft-removal track at commit `b3508f0b` (mistaken for sub-track 5's combined doc). The file is gitignored and cannot be restored from git history. This commit adds a session-scoped autouse fixture in tests/test_gui_2_result.py that regenerates the inventory markdown from scripts/audit_exception_handling.py --json output before the test runs. The 3 split files (PHASE1_INVENTORY_*.md, no 'SITE') are for sub-track 5 and cover mcp_client/ai_client/rag_engine (not gui_2). They coexist with this regenerated file per sub-track 4's convention.	2026-06-21 10:12:56 -04:00
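
The poll-with-retry fix described in commit 69b7ab670d (15s deadline, 0.5s interval) could look roughly like the sketch below. This is not the repository's actual test code: `poll_until`, `fetch`, and `is_ready` are hypothetical names chosen for illustration, and only the deadline and interval values come from the commit message.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def poll_until(
    fetch: Callable[[], T],
    is_ready: Callable[[T], bool],
    deadline_s: float = 15.0,   # deadline quoted in the commit message
    interval_s: float = 0.5,    # interval quoted in the commit message
) -> T:
    """Call fetch() repeatedly until is_ready(value) is true or the deadline passes.

    Hypothetical helper: fetch stands in for something like an HTTP GET against
    /api/warmup_canaries, and is_ready for a "canary list is non-empty" check.
    """
    stop_at = time.monotonic() + deadline_s
    while True:
        value = fetch()
        if is_ready(value):
            return value
        if time.monotonic() >= stop_at:
            raise TimeoutError(f"condition not met within {deadline_s}s")
        time.sleep(interval_s)
```

In a test, the immediate assert would then become an assertion on the value returned by `poll_until(...)`, so a late first frame only delays the test rather than failing it.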

4047 Commits