manual_slop

Author	SHA1	Message	Date
ed	167eacc1de	Merge branch 'master' of C:\projects\manual_slop into tier2/send_result_to_send_20260616	2026-06-17 07:37:36 -04:00
ed	07a0e66a19	docs(tier2): apply user feedback - 6 workflow conventions User feedback from the first sandbox run (send_result_to_send_20260616, 2026-06-17) identified 6 conventions Tier 2 must follow. Update the agent prompt template, slash command template, user guide, and workflow doc: 1. Test runner: ALWAYS use 'uv run python scripts/run_tests_batched.py' (NOT 'uv run pytest'). The batched runner provides tier filtering, parallelization (xdist), and a summary table that direct pytest lacks. 2. Default branch: this repo uses 'master', not 'main'. The Tier 2 slash command now does 'git fetch origin master' (was 'origin main'). 3. Line endings: preserve existing. This repo has a mix of CRLF and LF; a repo-wide LF standardization is a future track. 4. Throw-away scripts: write to 'scripts/tier2/artifacts/<track>/', NOT the base 'scripts/tier2/' directory. The base is reserved for production code; throw-away scripts are kept for archival but isolated per-track. 5. End-of-track report: write 'docs/reports/TRACK_COMPLETION_<track>.md' and update 'state.toml' to 'status=completed'. The user reads this to decide merge. Previously this was implicit; now it's explicit. 6. Run-time expectation: tracks are 1-4 hours. If context runs out, Tier 2 notes progress to disk and continues. The --resume flag picks up from the last completed task. Also updated the user guide with a 'Conventions' section and a troubleshooting entry for the resume flow. The verify-the-sandbox checklist now uses 'origin master' instead of 'origin main'.	2026-06-17 02:13:29 -04:00
ed	86fc1c5477	Merge branch 'master' of C:\projects\manual_slop into tier2/send_result_to_send_20260616	2026-06-17 02:00:56 -04:00
ed	e2e570369e	wrong folder	2026-06-17 01:57:52 -04:00
ed	1fc4a6026b	plan update for (send_result_to_send)	2026-06-17 01:54:52 -04:00
ed	9899ad8a41	ignore coverage	2026-06-17 01:54:24 -04:00
ed	abf92a8b31	feat(tier2): add fetch_tier2_branch.ps1 - bridge from sandbox to main repo The Tier 2 sandbox blocks git push (and all other destructive git ops). After Tier 2 finishes a track, this script is the bridge: it fetches the tier2/<track> branch from the sandboxed clone (C:\projects\manual_slop_tier2) into the main repo (C:\projects\manual_slop), creating a local review/<track> branch so the working tree is untouched. Usage: pwsh -File scripts\\tier2\\fetch_tier2_branch.ps1 -TrackName send_result_to_send_20260616 Supports -WhatIf for dry-run. Does NOT push to origin (user's call).	2026-06-17 01:52:04 -04:00
ed	a91c1da33c	end of track: test suite log.	2026-06-17 01:43:50 -04:00
ed	959ea38b87	conductor(track): fable_review_20260617 metadata — point to plan.md Plan committed at `8ec6d8f4` (1010 lines, 7 phases, 50+ tasks).	2026-06-17 01:41:58 -04:00
ed	8ec6d8f4a6	conductor(plan): Add fable_review_20260617 plan 7 phases, 50+ bite-sized tasks. Phase 1: init + 4 skeleton files. Phase 2: 10 parallel Tier 3 cluster sub-agent dispatches. Phase 3: 17 synthesis sections (Tier 1 max-token-output strategy). Phase 4: 3 side artifacts. Phase 5: self-review. Phase 6: user review. Phase 7: final commit + register. Every task has a verification command. Fable artifact at docs/artifacts/Fable System Prompt.txt is NEVER staged (verified per-task). No day estimates (per conductor/workflow.md §Tier 1 Track Initialization Rules).	2026-06-17 01:41:42 -04:00
ed	511a19aab2	send_result_to_send_20260616 session transcript. This one was important to keep, as it was the first attempt at an autonomous run. It essentially worked, except for turn exhaustion on the AI side (some config may need tweaking).	2026-06-17 01:32:07 -04:00
ed	219b653a45	docs(tier2): add track completion report (final verification + handoff) End-of-track report following the same format as TRACK_COMPLETION_tier2_autonomous_sandbox_20260616.md. Documents: - 24-commit inventory (10 atomic renames + 14 plan/script commits) - All 6 phases completed, all 9 verification flags = true - Pre-existing failures (7 tests, all credentials.toml, confirmed against origin/master baseline where they also fail) - 2 surgical doc fixes in error_handling.md (deprecation section + line 204 contradiction) - Sandbox enforcement contracts held (4 of 4 hard bans + 4 of 4 secondary contracts) - User handoff instructions (fetch + diff + merge + per-commit review) The track is the first end-to-end test of the tier2_autonomous_sandbox; this report is the final deliverable for that test.	2026-06-17 01:22:57 -04:00
ed	8eaf694f4a	conductor(tracks): Register fable_review_20260617 in tracks.md New research track for critical analysis of Anthropic's Claude Fable 5 system prompt. Added as row 25 in the Active Tracks table (Priority B research) and as a section in the new 'Active Research Tracks (2026-06+)' grouping. The companion spec + metadata + state.toml are committed in `058e2c93` and `a6114ef9`.	2026-06-17 01:19:45 -04:00
ed	c0e2051ec9	conductor(plan): Mark Phase 6 complete - all track tasks done Phase 6 tasks (t6_1, t6_2, t6_3) and the phase itself marked completed. All 16 task entries now have status=completed. All 6 phase entries now have status=completed. This is the final state.toml commit for the track.	2026-06-17 01:18:40 -04:00
ed	9a5d3b9c8c	conductor(plan): Mark Task 6.3 complete - register in tracks.md Added entry after the Tier 2 Autonomous Sandbox track (its parent dependency). Status: shipped 2026-06-17. Notes: 6 phases, 10 atomic rename commits, 37 files modified, 0 new/deleted. Test inventory: 100/101 pass in renamed files; 7 broader pre-existing failures all due to missing credentials.toml (confirmed against origin/master).	2026-06-17 01:18:02 -04:00
ed	5a58e1ceaf	conductor(plan): Mark Task 6.2 complete - metadata.json to status=shipped Track marked shipped 2026-06-17. All 6 verification criteria evaluated with PASS/EXCEEDED/READY status and notes. 7 pre-existing test failures documented with root cause and pre_existing_failures_remaining flag. Risk register updated: scope_creep=none, behavior_change=none, doc_drift=medium (error_handling.md deprecation section required surgical rewrite to historical note). No deferred_to_followup_tracks (this track completed cleanly).	2026-06-17 01:16:43 -04:00
ed	a6114ef9ac	conductor(track): Add fable_review_20260617 state.toml 7 phases (init -> 10 parallel cluster dispatches -> 17 synthesis sections -> 3 side artifacts -> self-review -> user review -> register). Each phase has explicit task IDs (t1_1 .. t7_4) for Tier 2 to walk through. current_phase = 0 (spec approved, not started). Hard rule encoded in [meta]: docs/artifacts/Fable System Prompt.txt is NEVER committed.	2026-06-17 01:16:20 -04:00
ed	058e2c9385	conductor(track): Add fable_review_20260617 spec + metadata Critical-analysis track for Anthropic's Claude Fable 5 system prompt (1585 lines, the public 'Mythos' version). 10 cluster sub-reports written by Tier 3 workers in parallel, synthesized by Tier 1 into a 17-section report (>3500 LOC) with 3 side artifacts. T-shirt size: XL. Fable artifact at docs/artifacts/Fable System Prompt.txt is local-only and MUST NOT be committed (per user hard rule). No day estimates (per conductor/workflow.md §Tier 1 Track Initialization Rules).	2026-06-17 01:15:58 -04:00
ed	aad6deffcb	conductor(plan): Mark Task 6.1 complete - state.toml updated All 16 task entries now have status=completed and commit_sha. All 6 phases marked completed (phase_6 in_progress pending metadata+tracks.md). All 9 verification flags = true. All 6 enforcement_stack flags = true (sandbox contracts exercised). Added [notes] section documenting: - Phase 4 file count discrepancy (22 actual vs 24 spec) - error_handling.md deprecation section replacement - Pre-existing test failures (unrelated to track) - MCP edit_file unreliability + Python fallback	2026-06-17 01:15:33 -04:00
ed	d86131d951	conductor(plan): Mark Task 5.2 + 5.3 complete (Phase 5 verification) Final grep: 0 send_result in active code. 3 historical refs in error_handling.md (intentional, in the 'Historical deprecation' note). Test verification: 100/101 tests pass in the 26 files renamed by this track. 1 pre-existing failure in test_headless_service.py due to missing credentials.toml (verified against origin/master baseline where it also fails - unrelated to the rename).	2026-06-17 01:14:24 -04:00
ed	ea7d794a6b	conductor(plan): Mark Task 5.2 + 5.3 complete (Phase 5 verification done) Final grep: 0 send_result in active code. 3 historical refs in error_handling.md (intentional, in the 'Historical deprecation' note). Test verification: 100/101 tests pass in the 26 files renamed by this track. 1 pre-existing failure in test_headless_service.py due to missing credentials.toml (verified against origin/master baseline where it also fails - unrelated to the rename). 7 broader suite failures all pre-existing (all FileNotFoundError on credentials.toml, confirmed against origin/master baseline). Track verification: - git grep send_result: 0 in active code (3 historical intentional) - Full test suite: matches pre-rename baseline (7 pre-existing failures unrelated to the rename, 0 new regressions)	2026-06-17 01:13:25 -04:00
ed	5cc422b34b	conductor(plan): Mark Task 5.1 complete (Phase 5 docs done)	2026-06-17 00:51:07 -04:00
ed	9b5011231c	docs(ai_client): rename send_result to send in 3 current docs Doc consistency: guide_ai_client.md, guide_app_controller.md, and the error_handling styleguide now reference the new symbol name. Also fixes two consistency issues in error_handling.md introduced by the mechanical rename: 1. The 'Deprecation: send -> send_result' section (lines 623-642) was rewritten as a 'Historical deprecation (added 2026-06-15, reverted 2026-06-16)' note that points to the relevant track specs. 2. Line 204 (the 'Current State Audit' summary for src/ai_client.py) had a self-contradictory claim ('send() is the new public API; send() is @deprecated') after the rename. Updated to describe the canonical public API. Historical archives (conductor/tracks//spec.md, conductor/tracks//plan.md, docs/reports/*) are NOT modified - they document the 2026-06-15 public_api_migration decision and stay as historical record.	2026-06-17 00:50:36 -04:00
ed	d17d8743dd	conductor(plan): Mark Task 4.1 complete (Phase 4 done)	2026-06-17 00:45:44 -04:00
ed	ada9617308	test(ai_client): rename send_result to send in 22 remaining test files Batch rename of 22 test files. 62 references renamed total. The full test suite is now GREEN again, matching the pre-rename baseline from Task 1.1. Pure mechanical rename. No behavior change. Files affected: test_ai_cache_tracking, test_ai_client_cli, test_ai_client_result, test_api_events, test_context_pruner, test_deepseek_provider, test_gemini_cli_* (3 files), test_gui2_mcp, test_headless_* (2 files), test_live_gui_integration_v2, test_orchestration_logic, test_phase6_engine, test_rag_integration, test_run_worker_lifecycle_abort, test_spawn_interception_v2, test_symbol_parsing, test_tier4_interceptor, test_tiered_aggregation, test_token_usage. Note: spec estimated 24 files; actual is 22 (test_deprecation_warnings no longer exists, and 1 fewer file than spec's list). Refs: conductor/tracks/send_result_to_send_20260616/	2026-06-17 00:38:29 -04:00
ed	2f45bc4d68	conductor(plan): Mark Task 3.5 + 3.6 complete (Phase 3 done)	2026-06-17 00:35:32 -04:00
ed	e8a9102f19	test(ai_client): rename send_result to send in test_orchestrator_pm_history 4 references renamed. Test file state: GREEN. 3 tests pass. Phase 3 complete (all 5 high-impact test files green).	2026-06-17 00:34:37 -04:00
ed	53b35de5c6	conductor(plan): Mark Task 3.4 complete	2026-06-17 00:34:00 -04:00
ed	423f9a95b0	test(ai_client): rename send_result to send in test_conductor_tech_lead 11 references renamed (planned 8; the count grew with the @patch pattern + local var name). Test file state: GREEN. 9 tests pass.	2026-06-17 00:33:36 -04:00
ed	58fe3a9cb5	conductor(plan): Mark Task 3.3 complete	2026-06-17 00:33:00 -04:00
ed	4393e831b0	test(ai_client): rename send_result to send in test_ai_loop_regressions_20260614 13 references renamed (planned 12; one extra found in a comment). Test function test_fr2_send_result_callable_in_app_controller_namespace renamed to test_fr2_send_callable_in_app_controller_namespace. 7 tests pass.	2026-06-17 00:32:33 -04:00
ed	6dbba46a25	conductor(plan): Mark Task 3.2 complete	2026-06-17 00:31:33 -04:00
ed	5e99c204a3	test(ai_client): rename send_result to send in test_orchestrator_pm 14 references renamed (decorators + parameter names + assertions). Test file state: GREEN. 3 tests pass.	2026-06-17 00:30:48 -04:00
ed	f0663fda6a	conductor(plan): Mark Task 3.1 complete	2026-06-17 00:29:54 -04:00
ed	3e2b4f74ba	test(ai_client): rename send_result to send in test_conductor_engine_v2 22 references renamed (mostly monkeypatch.setattr calls + comments). Test file state: GREEN. All 10 tests in this file now pass.	2026-06-17 00:29:21 -04:00
ed	d714d10fd4	conductor(plan): Mark Task 2.1 complete	2026-06-17 00:28:17 -04:00
ed	d87d909f7b	refactor(ai_client): rename send_result to send in 5 src/ call sites Renames 10 references across app_controller, conductor_tech_lead, mcp_client (docstring example), multi_agent_conductor, orchestrator_pm. 5 call sites in ai_client.send_result(...) -> ai_client.send(...) 3 print strings mentioning send_result 1 docstring comment (conductor_tech_lead) 1 docstring example (mcp_client) 'src.ai_client.send_result' -> 'src.ai_client.send' Test suite state: still red, but all src/-level call sites are now renamed. Remaining failures are in test files (mocks and patches that still reference send_result). Refs: conductor/tracks/send_result_to_send_20260616/	2026-06-17 00:27:47 -04:00
ed	4a59567939	conductor(plan): Mark Task 1.1 complete	2026-06-17 00:26:05 -04:00
ed	5351389fc0	refactor(ai_client): rename send_result to send (the impl, TDD red moment) The TDD red moment. The implementation is renamed but the call sites in src/, tests/, and docs still use send_result. Subsequent commits rename the call sites and progressively move the test suite back to green. 10 references renamed in src/ai_client.py: - 4 'Called by: send_result' docstring tags in private provider helpers - 1 function definition (def send_result -> def send) - 1 [C: ...] SDM tag referencing test function names - 2 monitor component names (start_component / end_component) - 2 error source strings (CONFIG + INTERNAL) Also adds scripts/tier2/apply_t1_1_edits.py - the helper script that applied the 10 edits. Kept in scripts/tier2/ as a record of the mechanical change pattern. Refs: conductor/tracks/send_result_to_send_20260616/	2026-06-17 00:23:16 -04:00
ed	c1d9a966d7	conductor(plan): Rename send_result to send (sandbox test track) The first end-to-end test of the tier2_autonomous_sandbox_20260616 sandbox. Pure mechanical rename: ai_client.send_result to ai_client.send across 38 active files (6 src/, 29 tests/, 3 current docs). 10 atomic commits across 5 phases. No behavior change; no new tests; the existing test suite is the safety net. Phase structure: - Phase 1: rename src/ai_client.py (TDD red moment) - Phase 2: rename 5 other src/ files (batch) - Phase 3: rename top 5 test files (one commit per file) - Phase 4: rename 24 remaining test files (batch) - Phase 5: rename 3 current docs + final verification - Phase 6: update state + metadata + register in tracks.md Historical archives (conductor/tracks//spec.md, conductor/tracks//plan.md, docs/reports/*) are NOT modified per spec section 7.	2026-06-16 23:52:59 -04:00
ed	9ba61d43d3	docs(tier2): add track completion report (final verification + spec coverage matrix)	2026-06-16 23:29:00 -04:00
ed	00c6922c0b	conductor(plan): mark tier2_autonomous_sandbox_20260616 as complete (all 9 phases done)	2026-06-16 23:23:28 -04:00
ed	eedbfa1180	conductor(plan): update metadata.json to status=shipped + actual test counts	2026-06-16 23:22:24 -04:00
ed	2f79f19989	conductor(plan): register tier2_autonomous_sandbox_20260616 in tracks.md	2026-06-16 23:21:21 -04:00
ed	8bf7cd175b	docs(tier2): add user guide for Tier 2 autonomous sandbox	2026-06-16 22:48:13 -04:00
ed	3e17aa6c8b	test(tier2): add smoke e2e test (opt-in, double-gate TIER2_SANDBOX_TESTS+TIER2_SMOKE)	2026-06-16 22:26:04 -04:00
ed	5b6e7db174	test(tier2): add sandbox enforcement test (pre-push hook refuses push)	2026-06-16 20:25:44 -04:00
ed	5d150dc6e0	test(tier2): add bootstrap -WhatIf test (opt-in via TIER2_SANDBOX_TESTS)	2026-06-16 20:01:32 -04:00
ed	37eafc008e	test(tier2): add trivial smoke track for e2e test (force-added, fixture)	2026-06-16 19:57:36 -04:00
ed	cb7c82008e	test(tier2): add tier2_sandbox and tier2_smoke pytest markers	2026-06-16 19:56:20 -04:00
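
Commit 3e17aa6c8b describes the smoke e2e test as opt-in behind a double gate (TIER2_SANDBOX_TESTS + TIER2_SMOKE). The log does not show how the repository checks those variables, so the following is only a minimal sketch of one way such a gate could work. The helper names (missing_gates, tier2_smoke_enabled) and the set of accepted "enabled" values are assumptions made for illustration, not code from the repo.

```python
import os
from typing import List, Mapping, Optional

# The two variable names come from the commit message; everything else
# in this sketch is hypothetical.
GATES = ("TIER2_SANDBOX_TESTS", "TIER2_SMOKE")

# Assumption: these values count as "enabled". The repo may use another rule.
_TRUTHY = frozenset({"1", "true", "yes", "on"})


def missing_gates(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the gate variables that are unset or not set to a truthy value."""
    env = os.environ if env is None else env
    return [
        name for name in GATES
        if env.get(name, "").strip().lower() not in _TRUTHY
    ]


def tier2_smoke_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """True only when both gates are enabled (the 'double gate')."""
    return not missing_gates(env)


if __name__ == "__main__":
    missing = missing_gates()
    if missing:
        print("smoke test skipped; gates not enabled: " + ", ".join(missing))
    else:
        print("smoke test enabled")
```

In a test module, a check like this would typically feed a skip condition, so that the expensive test stays off unless both variables are set on purpose.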
