manual_slop

Private

Public Access

Author	SHA1	Message	Date
ed	1b39aae7c4	fix(schemas): add legacy-kwarg backward compat to NormalizedResponse.__init__ 12 tests fail with: TypeError: NormalizedResponse.__init__() got an unexpected keyword argument 'usage_input_tokens' The @dataclass(frozen=True) auto-generated __init__ requires `usage: UsageStats`, but 12 tests + 1 production site (src/ai_client.py:908) call it with the OLD flat-kwarg API (usage_input_tokens=..., usage_output_tokens=..., etc.). Change @dataclass(frozen=True) -> @dataclass(frozen=True, init=False) and add a custom __init__ that accepts BOTH signatures: - New: usage: UsageStats (used by current production code) - Legacy: usage_input_tokens, usage_output_tokens, usage_cache_read_tokens, usage_cache_creation_tokens (used by tests + 1 ai_client site) If usage is None and any legacy flat kwarg is non-None, build a UsageStats from the legacy kwargs. Otherwise use the provided usage. All field assignments use object.__setattr__ because frozen=True locks __setattr__. Verification: - Legacy kwargs work: NormalizedResponse(text="hi", tool_calls=(), usage_input_tokens=10, usage_output_tokens=5, raw_response=None) sets usage.input_tokens=10 - New kwargs work: NormalizedResponse(text="hi", tool_calls=(), usage=UsageStats(1, 2)) sets usage directly - 12 affected tests now pass (was 12 failed, 3 passed; now 15 passed)	2026-06-24 11:01:11 -04:00
ed	2561e4ea9e	refactor(audit): remove dead compute_result_coverage compute_result_coverage() was implemented during the 14-phase plan but is never called: synthesize_aggregate_profile() (now at ~line 1075) inlines its own ResultCoverage construction via the actual AST analysis at ~line 1135-1145. The function has a latent bug at line 754 (was): result_producers = total_producers which hardcodes result_producers to 100% of total_producers regardless of input — making the function return meaningless numbers. Tests deleted in lockstep: - tests/test_code_path_audit_phase78.py: test_compute_result_coverage_no_producers - tests/test_code_path_audit_phase78.py: test_compute_result_coverage_full The 'compute_result_coverage' import was also removed from the test file's import block. Verification: - grep -c 'compute_result_coverage' src/code_path_audit.py = 0 - grep -c 'compute_result_coverage' tests/ = 0 - 125 of 125 remaining tests pass (was 127; -2 tests deleted)	2026-06-24 10:00:08 -04:00
ed	b385cd441b	refactor(audit): remove dead DSL parser (DSL files no longer produced) The v2 postfix DSL parser (DSL_WORD_ARITY_V2, _atom, to_dsl_v2, parse_dsl_v2) was implemented during the 14-phase DSL plan but never reached production: run_audit() (line ~1217 after this change) only writes .md files (AUDIT_REPORT.md plus per-aggregate markdowns via to_markdown/to_tree), never .dsl files. The DSL parser carried latent arity bugs (DSL_WORD_ARITY_V2 declared 5 for 'result-coverage' but writer emits 4; 4 for 'type-alias-coverage' but writer emits 3) which would have caused silent parse failures. Also removed the now-unused 'import re' statement (was only used by parse_dsl_v2). The 'from datetime import date as date_mod' is retained (still used at line ~1259, 1275, 1291 in the markdown renderer). Tests deleted in lockstep: - tests/test_code_path_audit_phase78.py: test_dsl_word_arity_v2_14_new_words - tests/test_code_path_audit_phase89.py: test_to_dsl_v2_includes_aggregate_kind_section, test_parse_dsl_v2_round_trip_aggregate_kind, test_parse_dsl_v2_malformed Verification: - grep -c 'to_dsl_v2\|parse_dsl_v2\|DSL_WORD_ARITY_V2' src/code_path_audit.py = 0 - 127 of 127 remaining tests pass (was 131; -4 tests deleted)	2026-06-24 09:57:17 -04:00
ed	02b1009874	chore(audit): remove duplicate import json in src/code_path_audit.py The import statement appeared twice in quick succession (lines 655 and 658). Both were identical and contributed nothing. Removed one. No functional change. Verification: - grep -c '^import json' src/code_path_audit.py = 1 - uv run python -c 'from src import code_path_audit' returns OK - 124 tests in tests/test_code_path_audit*.py pass	2026-06-24 09:45:28 -04:00
ed	9e143445e0	fix(audit): replace dict[str, Any] with JsonValue TypeAlias (5+ weak sites) Resolves audit_weak_types.py --strict regression (117 vs baseline 112 -> 104). The regression was in src/openai_schemas.py (10 sites) and src/mcp_tool_specs.py (4 sites), both files added after the 2026-06-21 baseline. JsonValue is the canonical JSON-serializable data TypeAlias from src/type_aliases.py:22 and is a structural superset of dict[str, Any], so consumers expecting the legacy shape are unaffected. All 30 existing tests in tests/test_openai_schemas.py and tests/test_mcp_tool_specs.py continue to pass. Spec WHERE for t1.1 referenced code_path_audit*.py files but those modules report 0 weak type findings per the audit (they use dict[str, int], dict[str, dict], etc., not dict[str, Any]); see plan.md investigation note.	2026-06-24 09:41:50 -04:00
ed	0b79798eaf	feat(audit): MVP output - AUDIT_REPORT.md only, move stale to _stale/ MVP pipeline simplification: - render_rollups() now produces ONLY summary.md + AUDIT_REPORT.md - run_audit() now produces only per-aggregate .md (no .dsl/.tree) - New src/code_path_audit_gen.py generates the single coherent report Stale artifacts moved to _stale/ subdirectory (preserved for history): - 13 per-aggregate .dsl files (redundant with .md) - 13 per-aggregate .tree files (redundant with .md) - 9 old top-level rollups (cross_audit_summary, decomposition_matrix, candidates, field_usage, call_graph, hot_paths, dead_fields, ssdl_analysis, organization_deductions - all superseded by sections inlined in AUDIT_REPORT.md) - _stale/README.md explains what happened Meta-audit updated to check .md files (14 required H2 sections per aggregate) instead of .dsl files. 0 violations on 10 real profiles. Tests: 131 passing. New MVP report: 5000+ lines.	2026-06-22 13:34:29 -04:00
ed	f7f616abb9	feat(audit): alias resolution - all real aggregates now have data	2026-06-22 12:52:22 -04:00
ed	077149011b	fix(audit): real line numbers + entry.get() field-access detection + Optional/dict/Union patterns Three real bugs fixed: 1. FunctionRef always used line=0. Now passes node.lineno from AST. 2. P3_pass results were discarded with bare pass. Now stored in ProducerConsumerGraph.field_accesses. 3. Field-access detector only saw entry['key']; missed entry.get('key') which is the dominant pattern in this codebase. Now handles both. Plus _extract_type_name() helper handles Optional[T], dict[str, T], list[T], Result[T], Union[T, ...], and T \| None (PEP 604) so P1/P2 catch more annotation patterns. Real numbers (Metadata aggregate): - producers: 77 -> 117 - consumers: 35 -> 66 - field-access sites: 130 -> 173 - line numbers: all real (line 1281, 1746, etc.) AUDIT_REPORT.md grew 2009 -> 3140 lines with real evidence. Total audit output: 5176 lines / 50 files (was 2415 / 49). All 131 tests still passing.	2026-06-22 12:20:32 -04:00
ed	783e5fd9fe	feat(audit): SSDL analysis - effective codepaths + nil-sentinel + organization verdict - src/code_path_audit_ssdl.py: 9 functions translating per-aggregate findings into SSDL primitives (compute_effective_codepaths, count_branches_in_function, detect_nil_check_pattern, compute_field_access_efficiency, suggest_defusing_technique, render_ssdl_sketch/rollup, render_organization_deductions). - src/code_path_audit.py:render_rollups() now emits ssdl_analysis.md + organization_deductions.md alongside the existing 8 rollups. - src/code_path_audit_render.py:render_full_markdown() adds SSDL sketch section per profile (effective codepaths + defusing recommendations). Real findings (Metadata aggregate): - 35 consumers, 251 total branches, 1.13e18 effective codepaths - 6 nil-check functions (candidates for [N] sentinel) - 130 field-access sites, 0% typed (candidates for immediate-mode cache) - Verdict: needs restructuring Audit output grew 2136 -> 2415 lines. All 131 tests pass. Meta-audit clean (0 violations).	2026-06-22 11:44:00 -04:00
ed	09167986d5	wip: SSDL analysis (has indentation bug, needs fix)	2026-06-22 10:46:34 -04:00
ed	558258cffd	feat(audit): rich rollups + per-line indentation fix - 2136 total lines Added 3 new top-level rollups (hot_paths.md, dead_fields.md, plus enriched summary.md, candidates.md, decomposition_matrix.md): - summary.md: per-aggregate memory_dim + access pattern tables, full cross-validation verdict per aggregate - decomposition_matrix.md: all 10 aggregates ranked by current cost, flagged-for-refactoring section, insufficient_data section - candidates.md: ranked optimization candidates with detail per step - hot_paths.md: top 5 hot consumers per aggregate (by field access count) - dead_fields.md: fields accessed (per-consumer breakdown) Total report: 2136 lines (was 1814).	2026-06-22 10:29:01 -04:00
ed	59eeee819e	feat(audit): enriched markdown renderer - 15 sections per profile + 2 new rollups render_full_markdown in src/code_path_audit_render.py produces detailed per-profile markdown: - Producers detail (grouped by file) - Consumers detail (grouped by file) - Field access matrix (every field x every consumer) - Access pattern (dominant + per-function distribution) - Frequency (aggregate + per-function) - Result coverage table - Type alias coverage table (typed vs untyped sites) - Cross-audit findings (per-bucket tables) - Decomposition cost (8 metrics) - Struct shape inference (inferred from producer returns) - Optimization candidates (concrete refactor steps + affected files) - Verdict - Evidence appendix (every per-function item) New rollups: - field_usage.md: cross-aggregate field access frequency - call_graph.md: producer/consumer tables grouped by aggregate Total report: 1814 lines (was 1204).	2026-06-22 10:12:48 -04:00
ed	5405345c5a	fix(audit): path resolution in analyze_consumer_fields + analyze_producer_size The previous code did Path(src_dir) / function_ref.file, which double-prefixed (e.g. src/src/project_manager.py) and silently returned empty. Fixed: if function_ref.file exists as CWD-relative, use it directly. Only join if it doesn't exist. Now 130 real field accesses detected across 35 Metadata consumers in the 2026-06-22 audit output (was 0 before).	2026-06-22 10:05:12 -04:00
ed	67ca680a05	feat(audit): per-aggregate cross_audit mapping via PCG file-index The aggregate_findings function now does 3-tier mapping: 1. Function lookup (find_enclosing_function) -> exact match 2. File-level fallback: if the finding's file has any producer/consumer of the aggregate, bucket it there 3. Unbucketed (the file has no aggregate refs) Handles both 'file' and 'filename' keys (v1 audit scripts use 'filename'; spec fixtures use 'file'). Path normalization for Windows paths. Generated the 6 real audit_inputs from scripts/audit_*.py against real src/. The Metadata aggregate now shows: - 1 unique weak_types finding (1 site, from ai_client.py:159) - 1 unique exception_handling finding (76 sites from PARAM_OPTIONAL) mcp_client.py shows 0 because no Metadata producer/consumer exists in the PCG for mcp_client (P1/P2 only detect typed parameter signatures, not internal field access). The next gap is expanding P3 to capture internal field use.	2026-06-22 09:48:56 -04:00
ed	8d2dffd7c5	feat(audit): wire cross_audit_findings aggregator into synthesize Loops over audit_weak_types + audit_exception_handling from the 6 audit_inputs, calls aggregate_cross_audit_findings per audit, sums the buckets per profile. Cross-audit aggregation is per-aggregate-flat (all findings go into 1 bucket per audit). The 3-tier finding-to-aggregate mapping (find_enclosing_function + type registry + file heuristic) is the next gap - requires per-finding site classification.	2026-06-22 09:14:40 -04:00
ed	85f5808ae3	feat(audit): real analysis - consumer fields, struct size, decomp	2026-06-22 09:08:41 -04:00
ed	c82538474f	feat(audit): implement Phase 8 v2 DSL + Phase 9 run_audit + CLI + MCP Phase 8: to_dsl_v2 (flat-section writer, 14 sections), to_markdown (10 sections), to_tree (box-drawing prefix tree), parse_dsl_v2 (round-trip parser). Phase 9: AGGREGATES_IN_SCOPE (10) + CANDIDATE_AGGREGATES (3), synthesize_aggregate_profile (per-aggregate builder, candidate placeholder path), AuditSummary dataclass, run_audit() main entry, render_rollups() (4 top-level files: summary, cross_audit_summary, decomposition_matrix, candidates), code_path_audit_v2() MCP tool wrapper. 13 new unit tests passing. 124 total tests passing. Phase 10 (integration tests with synthetic src/) next - may be deferred to next session if context runs low.	2026-06-22 01:59:07 -04:00
ed	e59334a303	feat(audit): implement Phase 7 cross-audit integration + Phase 8.1 DSL arity Phase 7: read_input_json (stdlib I/O boundary), INPUT_JSON_CONTRACTS (6 input sources), find_enclosing_function (3-tier mapping tier 1), compute_result_coverage (cross-check of doeh), compute_type_alias_coverage (cross-check of dss), aggregate_cross_audit_findings (per-aggregate bucketing), run_all_cross_audit_reads (convenience). Phase 8 Task 8.1: DSL_WORD_ARITY_V2 (14 new tagged words). 15 new unit tests passing. 111 total tests passing. Phase 8 Tasks 8.2-8.5 (4 renderers + parser) next.	2026-06-22 01:49:14 -04:00
ed	cca59668c8	feat(audit): implement Phase 5 CFE + Phase 6 Decomposition Cost (11 tasks) Phase 5 CFE: detect_frequency_from_entry_point + 6 caller sets (INIT/HOT/PER_TURN/COLD/PER_DISCUSSION/PER_REQUEST), load_frequency_overrides (tomllib), estimate_call_frequency with 3-tier precedence (override > entry-point > unknown). Phase 6 Decomposition Cost: 6 cost-model constants (per spec 7.5), per_call_cost_us formula, FREQUENCY_MULTIPLIER (7 frequencies), current_total_us, componentize_factor lookup, unify_factor lookup, recommended_direction (5-step precedence with frozen whole_struct -> hold override), generate_rationale auto-string, and compute_decomposition_cost main entry. 33 new unit tests passing (Phase 5: 11, Phase 6: 22). 96 total tests passing. Phase 7 (Cross-audit integration) next.	2026-06-22 01:40:32 -04:00
ed	c1d2f0e454	feat(audit): implement Phase 3 MemoryDim + Phase 4 APD (11 tasks) Phase 3: MemoryDim classifier with canonical mappings (23 entries, includes ToolSpec/ChatMessage/ProviderHistory now that they're real), file-of-origin heuristic (5 buckets), TOML override loader, classify_memory_dim() with 3-tier precedence. Phase 4: APD with 4 threshold constants, 5 pattern detectors (whole_struct, field_by_field, hot_cold_split, bulk_batched, dominant_pattern), detect_access_pattern() main entry. 30 new unit tests passing (Phase 3: 11, Phase 4: 19). 63 total tests passing. Phase 5 (CFE - Call Frequency Estimator) next.	2026-06-22 01:26:06 -04:00
ed	200396e4a5	feat(audit): implement Phase 2 PCG (5 tasks: skeleton + P1+P2+P3+build_pcg) Phase 2 PCG: ProducerConsumerGraph (bipartite aggregate<->function) + 3 AST passes (P1 return-type, P2 parameter-type, P3 field-access) + build_pcg() main entry returning Result[ProducerConsumerGraph]. 14 new unit tests passing (2 PCG + 3 P1 + 3 P2 + 3 P3 + 3 build_pcg). The build_pcg() function tolerates syntax errors per the stdlib I/O boundary pattern (records ErrorInfo, continues). Phase 2 complete: 33 unit tests passing. Phase 3 (MemoryDim classifier with canonical mappings) next.	2026-06-22 01:18:54 -04:00
ed	ef207cf684	feat(audit): complete Phase 1 data model (8 dataclasses, 12 new tests) Tasks 1.3-1.10: AccessPatternEvidence, FrequencyEvidence, ResultCoverage, TypeAliasCoverage, CrossAuditFinding, CrossAuditFindings, DecompositionCost, OptimizationCandidate, AggregateProfile. All frozen dataclasses per error_handling.md Pattern 1 (immutability for cross-thread safety). Phase 1 complete: 19 unit tests passing (5 enum tests + 14 dataclass tests). AggregateProfile is the central artifact with 14 required fields + 2 optional (mermaid, markdown). Phase 2 (PCG - 3 AST passes + build_pcg()) next.	2026-06-22 01:10:57 -04:00
ed	1680182953	feat(audit): add FunctionRef dataclass (frozen, 4 fields) fqname, file, line, role. Used in ProducerConsumerGraph edges and per-aggregate producer/consumer lists. Per error_handling.md Pattern 1 (immutability for cross-thread safety). 2 unit tests passing.	2026-06-22 01:05:17 -04:00
ed	be4ec0a459	feat(types): add JsonPrimitive + JsonValue TypeAliases (t0_3) Phase 0 of any_type_componentization_20260621. Extends src/type_aliases.py with two recursive-friendly TypeAliases for JSON wire format (used by Phase 5 api_hooks WebSocketMessage): - JsonPrimitive: str \| int \| float \| bool \| None - JsonValue: JsonPrimitive \| list['JsonValue'] \| dict[str, 'JsonValue'] The forward-ref 'JsonValue' strings work because from __future__ import annotations is at the top of the module (PEP 563 + PEP 613 TypeAlias). Tests added (4 new, 14 total): - test_json_primitive_alias_resolves_to_union: hints exposes JsonPrimitive - test_json_value_alias_resolves_to_recursive_union: hints exposes JsonValue - test_json_value_accepts_primitive_dict: dict[str, JsonValue] runtime use - test_json_value_accepts_nested_structures: nested dict+list round-trip Verification: uv run pytest tests/test_type_aliases.py --timeout=30 14 passed in 2.97s	2026-06-22 01:02:38 -04:00
ed	335f9080f5	feat(api_hooks): add WebSocketMessage + JsonValue type (t5_1-t5_8) Phase 5 of any_type_componentization_20260621. Promotes the WebSocket broadcast signature in src/api_hooks.py from (channel, payload: dict) to a typed WebSocketMessage dataclass (16 Any sites): NEW dataclass (inline in src/api_hooks.py): - WebSocketMessage (frozen=True): channel: str, payload: JsonValue MODIFIED: - _serialize_for_api(obj: Any) -> JsonValue (typed return) - broadcast(channel: str, payload: dict[str, Any]) -> broadcast(message: WebSocketMessage) - _get_app_attr / _set_app_attr signatures UNCHANGED (Pattern 4 preserved) NEW tests/test_api_hooks_dataclasses.py (12 tests, all pass): - test_websocket_message_construction - test_websocket_message_with_list_payload - test_websocket_message_with_nested_payload - test_websocket_message_is_frozen - test_websocket_message_to_json - test_serialize_for_api_returns_dict_for_to_dict_object - test_serialize_for_api_handles_nested_lists - test_serialize_for_api_handles_purepath - test_serialize_for_api_passthrough_for_primitives - test_serialize_for_api_handles_mixed_nesting - test_get_app_attr_signature_preserved (Pattern 4 invariant) - test_set_app_attr_signature_preserved (Pattern 4 invariant) MODIFIED tests/test_websocket_server.py: - Updated broadcast() call site to use WebSocketMessage(channel=..., payload=...) - Added WebSocketMessage import Verified: uv run pytest tests/test_api_hooks_dataclasses.py tests/test_api_hooks_warmup.py tests/test_websocket_server.py --timeout=30 23 passed in 5.03s (12 new + 10 existing + 1 websocket)	2026-06-22 01:00:06 -04:00
ed	3816a54d27	feat(log): add Session + SessionMetadata dataclasses (t4_1-t4_8) Phase 4 of any_type_componentization_20260621. Promotes the 2-level dict[str, dict[str, Any]] structure in src/log_registry.py to typed Session + SessionMetadata dataclasses (7 Any sites): NEW dataclasses (inline in src/log_registry.py): - SessionMetadata (frozen): message_count, errors, size_kb, whitelisted, reason, timestamp - Session (frozen): session_id, path, start_time, whitelisted, metadata - to_dict() / from_dict() classmethod for round-trip with TOML shape - Backward-compat __getitem__ / get() so existing test_log_registry.py tests that use session_data['path'] / session_data.get('metadata') continue to work REFACTOR LogRegistry: - self.data: dict[str, dict[str, Any]] -> dict[str, Session] - load_registry: populates with Session.from_dict(...) - save_registry: serializes via session.to_dict() - register_session: creates Session dataclass - update_session_metadata: creates new Session with updated SessionMetadata - is_session_whitelisted: reads session.whitelisted - update_auto_whitelist_status: reads session.path - get_old_non_whitelisted_sessions: reads session.start_time + metadata NEW tests/test_log_registry_dataclasses.py (13 tests, all pass): - test_session_dataclass_construction - test_session_metadata_dataclass_construction - test_session_from_dict_basic / with_metadata - test_session_to_dict_round_trip - test_session_metadata_to_dict - test_log_registry_data_is_typed - test_log_registry_register_session_returns_session - test_log_registry_update_session_metadata_sets_metadata - test_log_registry_is_session_whitelisted - test_log_registry_get_old_non_whitelisted_sessions - test_session_is_frozen - test_session_metadata_is_frozen Verified: uv run pytest tests/test_log_registry.py tests/test_log_registry_dataclasses.py --timeout=30 18 passed in 3.27s (5 existing + 13 new)	2026-06-22 01:00:00 -04:00
ed	5bd416c3ca	feat(provider): add src/provider_state.py + tests (t3_2, t3_3) Phase 3 of any_type_componentization_20260621 (PARTIAL). Adds the ProviderHistory abstraction and 6-provider registry. NEW src/provider_state.py (60 lines): - ProviderHistory dataclass (messages: list[HistoryMessage], lock: Lock, append / get_all / replace_all / clear methods) - _PROVIDER_HISTORIES: dict[str, ProviderHistory] for anthropic / deepseek / minimax / qwen / grok / llama - get_history(provider) factory + clear_all() + providers() - SDK client holders (_gemini_chat, _anthropic_client, etc.) NOT touched per Pattern 3 (heterogeneous SDK types) NEW tests/test_provider_state.py (12 tests, all pass): - test_six_providers_registered - test_get_history_returns_singleton_per_provider - test_get_history_raises_for_unknown - test_provider_history_starts_empty - test_provider_history_append / get_all_returns_copy / replace_all / replace_all_takes_copy / clear - test_clear_all_resets_every_provider - test_provider_history_thread_safety (10 threads x 100 messages) - test_independent_locks_per_provider (lock on one doesn't block another) DEFERRED: - t3_4 (Remove 14 globals from ai_client.py:111-133) - t3_5 through t3_13 (Update call sites in _send_<provider> functions) - t3_14 (Run full regression suite on test_ai_client*.py) These call-site updates require careful per-function refactoring of the ~27 sites in _send_anthropic, _send_deepseek, _send_minimax, _send_qwen, _send_grok, _send_llama. The ai_client.py file is 3432 lines; a single regex pass risks subtle indentation regressions in nested constructs (see the 7 ot : orphan lines from a previous attempt). The provider_state module is independently usable and tested. Future track: provider_state_migration_2026MMDD to wire up the call sites mechanically, OR integrate into a Phase 3 retry pass. Verified: uv run pytest tests/test_provider_state.py --timeout=30 12 passed in 2.99s	2026-06-22 00:59:50 -04:00
ed	04d723e420	feat(openai): add src/openai_schemas.py + refactor openai_compatible.py (t2_1-t2_7) Phase 2 of any_type_componentization_20260621. Promotes NormalizedResponse + OpenAICompatibleRequest from src/openai_compatible.py to typed dataclasses. The 17 Any sites become 5 dataclasses: NEW src/openai_schemas.py (138 lines): - ToolCallFunction dataclass (name, arguments) - ToolCall dataclass (id, function: ToolCallFunction, type='function') - ChatMessage dataclass (role, content, tool_calls, tool_call_id, name) - UsageStats dataclass (input_tokens, output_tokens, cache_read_, cache_creation_) - NormalizedResponse dataclass (text, tool_calls: tuple, usage, raw_response: Any) - OpenAICompatibleRequest dataclass (messages: list[ChatMessage], model, ...) NEW tests/test_openai_schemas.py (19 tests, all pass): - ToolCallFunction, ToolCall, ChatMessage round-trips - UsageStats field access + frozen=True semantics - NormalizedResponse.to_legacy_dict preserves shape - raw_response stays Any (Pattern 3 preserved) - tools field stays list[dict[str, Any]] for Phase 1 ToolSpec follow-up MODIFIED src/openai_compatible.py: - Removed inline NormalizedResponse + OpenAICompatibleRequest definitions - Re-imported from src.openai_schemas - _send_blocking: tool_calls -> tuple[ToolCall, ...]; usage_*_tokens -> UsageStats - _send_streaming: same migration - send_openai_compatible: messages_dicts = [m.to_dict() for m in request.messages] - Exception handler: empty NormalizedResponse uses UsageStats - All NormalizedResponse consumers still work (legacy dict shape preserved) Verified: uv run pytest tests/test_openai_schemas.py tests/test_mcp_tool_specs.py tests/test_audit_dataclass_coverage.py tests/test_type_aliases.py tests/test_mcp_client_beads.py tests/test_mcp_client_paths.py tests/test_arch_boundary_phase2.py --timeout=60 64 passed in 6.28s	2026-06-22 00:59:42 -04:00
ed	cd715670d7	feat(mcp): add src/mcp_tool_specs.py + tests (t1_1, t1_2, t1_3) Phase 1 of any_type_componentization_20260621. Promotes MCP_TOOL_SPECS (45 dict[str, Any] literals in src/mcp_client.py) to typed dataclasses: NEW src/mcp_tool_specs.py: - ToolParameter dataclass (name, type, description, required, enum) - ToolSpec dataclass (name, description, parameters: tuple) - _REGISTRY: dict[str, ToolSpec] - register() / get_tool_spec() / get_tool_schemas() / tool_names() - to_dict() preserves legacy JSON shape for downstream serialization - 45 register() calls (one per tool) at module level - Mirrors src/vendor_capabilities.py reference pattern NEW tests/test_mcp_tool_specs.py (11 tests, all pass): - test_module_loads_with_45_registrations - test_tool_names_set_matches_expected_45 - test_get_tool_spec_returns_correct_instance - test_get_tool_spec_raises_for_unknown_name - test_get_tool_schemas_returns_all_specs - test_tool_spec_is_frozen - test_tool_parameter_is_frozen - test_to_dict_round_trip_preserves_shape - test_tool_parameter_to_dict_includes_enum - test_tool_names_subset_of_models_agent_tool_names (cross-module invariant) - test_register_idempotent_replaces_existing (hot-reload support) NEW scripts/tier2/artifacts/any_type_componentization_20260621/: - generate_mcp_tool_specs.py: idempotent generator from MCP_TOOL_SPECS - generate_tool_specs.py: helper that emits registration lines - inspect_mcp_specs.py: shape inspection - _generated_registrations.txt: the 45 registration lines Verified: 11/11 tests pass. The legacy MCP_TOOL_SPECS dict in mcp_client.py still exists; this commit only ADDS the new module. Migration of call sites in mcp_client.py + ai_client.py follows in t1_4 + t1_5. Verified with: uv run pytest tests/test_mcp_tool_specs.py --timeout=30 11 passed in 3.01s	2026-06-22 00:59:35 -04:00
ed	21ba2ffb04	Merge branch 'tier2/phase2_4_5_call_site_completion_20260621' into tier2/code_path_audit_20260607	2026-06-22 00:47:33 -04:00
ed	5dca69f0d7	feat(audit): add 5 enums for the v2 data model AggregateKind (4 values), MemoryDim (7), AccessPattern (5), Frequency (7), RecommendedDirection (4). All Literal types for stable postfix DSL output (string-valued, no enum-name lookup table needed in the parser). 5 unit tests passing. The 9 supporting dataclasses + the AggregateProfile central artifact go in Tasks 1.2-1.10.	2026-06-22 00:46:00 -04:00
ed	e9d1867bbc	chore(audit): create empty src/code_path_audit.py v2 Module docstring + from __future__ import annotations. No code yet; the data model goes in next (Phase 1).	2026-06-22 00:41:33 -04:00
ed	09eaf69a83	fix(tests): resolve 3 pre-existing test failures surfaced by user's batched run The phase2_4_5_call_site_completion_20260621 track's end-of-track report documented 5 pre-existing tier-1-unit-core failures as 'not caused by this track' and deferred them to a future track. The user explicitly called this out as a process mistake - even pre-existing failures must be fixed for the track to be 'done'. Fixed 3 of 5 (the other 2 are sandbox-pollution audit_tier2_leaks tests that require infrastructure changes): 1. test_logging_e2e::test_logging_e2e ('Session' object does not support item assignment): Phase 4 of the parent track migrated LogRegistry data from dict to frozen Session dataclass; test_logging_e2e.py was missed in the migration. Fix: add LogRegistry.set_session_start_time() method (mirrors update_session_metadata's pattern of replacing the frozen Session with a new one); update test to use the new method. 2. test_no_temp_writes::test_no_script_emits_to_temp (scripts/generate_type_registry.py uses tempfile): The --check mode was using tempfile.TemporaryDirectory which the audit forbids. Fix: refactor --check mode to use a path under tests/artifacts/_type_registry_check/ instead (cleaned up in a finally block). 3. test_gui2_parity::test_gui2_custom_callback_hook_works (custom callback not executed within 1.5s): The test used time.sleep(1.5) + assert, the documented race condition anti-pattern. Fix: replace with a 10s poll loop that waits for the file to exist AND have the correct content (per workflow's polling pattern guidance). Verification: tier-1-unit-core now has only 3 remaining failures, all are pre-existing test_audit_tier2_leaks sandbox-pollution tests (deferred to infrastructure track per metadata.json).	2026-06-21 23:06:54 -04:00
ed	751b94d4e8	Revert "merge: tier2/phase2_4_5_call_site_completion_20260621 (parent + follow-up + Phase 6e analysis)" This reverts commit `f914b2bcd4`, reversing changes made to `7fef95cc87`.	2026-06-21 22:39:14 -04:00
ed	5834628111	refactor(ai_client): migrate _send_grok/_send_minimax/_send_llama to ChatMessage API Completes the deferred t2_6 task from any_type_componentization_20260621 Phase 2. The 3 OpenAI-compatible senders now construct OpenAICompatibleRequest with messages=[ChatMessage(role=, content=)] instead of list[dict] literals. The _<provider>_history global lists are still dicts (Phase 3 deferred to a separate track); the migration converts each dict to ChatMessage at the request-build boundary via list comprehension. The backward-compat shim in openai_compatible.py:86 (m.to_dict() if hasattr(m, 'to_dict') else m) handles both ChatMessage and dict transparently. Verified: 20/20 provider tests pass; tier-1-unit (5 pre-existing sandbox-pollution failures unchanged); no new regressions.	2026-06-21 19:47:40 -04:00
ed	06287dbb95	refactor(ai_client): migrate _send_grok/_send_minimax/_send_llama to ChatMessage API Completes the deferred t2_6 task from any_type_componentization_20260621 Phase 2. The 3 OpenAI-compatible senders now construct OpenAICompatibleRequest with messages=[ChatMessage(role=, content=)] instead of list[dict] literals. The _<provider>_history global lists are still dicts (Phase 3 deferred to a separate track); the migration converts each dict to ChatMessage at the request-build boundary via list comprehension. The backward-compat shim in openai_compatible.py:86 (m.to_dict() if hasattr(m, 'to_dict') else m) handles both ChatMessage and dict transparently. Verified: 20/20 provider tests pass; tier-1-unit (5 pre-existing sandbox-pollution failures unchanged); no new regressions.	2026-06-21 19:47:40 -04:00
ed	224930d47c	fix(broadcast): migrate WebSocketServer.broadcast() callers to WebSocketMessage signature Phase 5 of any_type_componentization_20260621 changed WebSocketServer.broadcast(channel, payload) -> broadcast(message: WebSocketMessage) but did not update internal callers. This produced worker[queue_fallback] TypeError spam on the GUI thread. Fixed 2 sites: - src/app_controller.py:1849 _process_pending_gui_tasks (telemetry broadcast) - src/events.py:115 AsyncEventQueue.put (events broadcast) gui_2.py has no internal broadcast callers (grep verified). Both callers now construct WebSocketMessage(channel=, payload=) at the call site. test_websocket_broadcast_regression.py 4/4 pass (was 1/4 failing in red phase).	2026-06-21 19:26:14 -04:00
ed	76b10e734d	fix(broadcast): migrate WebSocketServer.broadcast() callers to WebSocketMessage signature Phase 5 of any_type_componentization_20260621 changed WebSocketServer.broadcast(channel, payload) -> broadcast(message: WebSocketMessage) but did not update internal callers. This produced worker[queue_fallback] TypeError spam on the GUI thread. Fixed 2 sites: - src/app_controller.py:1849 _process_pending_gui_tasks (telemetry broadcast) - src/events.py:115 AsyncEventQueue.put (events broadcast) gui_2.py has no internal broadcast callers (grep verified). Both callers now construct WebSocketMessage(channel=, payload=) at the call site. test_websocket_broadcast_regression.py 4/4 pass (was 1/4 failing in red phase).	2026-06-21 19:26:14 -04:00
ed	30c8b26381	fix(ai_client): migrate gemini_cli NormalizedResponse callers to Phase 2 dataclass API Phase 2 deferred t2_6: update src/ai_client.py _send_grok + _send_minimax + _send_llama + _send_gemini_cli (4 functions) to use the new dataclass API after NormalizedResponse was refactored to (text, tool_calls: tuple[ToolCall, ...], usage: UsageStats, raw_response). These 4 callers were left with the old keyword args (usage_input_tokens, usage_output_tokens, ...) which broke at runtime: ai_client.send() raised TypeError: NormalizedResponse.__init__() got an unexpected keyword argument 'usage_input_tokens'. FIXES: - src/ai_client.py L2054: gemini_cli 'adapter unavailable' branch - src/ai_client.py L2088: gemini_cli normal response branch - Added: from src.openai_schemas import UsageStats (module level) - Added backward-compat in src/openai_compatible.py: messages_dicts = [m.to_dict() if hasattr(m, 'to_dict') else m for m in request.messages] (accepts both ChatMessage dataclass and dict for backward compat with existing tests that pass raw dicts) TEST FIXES: - tests/test_ai_client_tool_loop.py: _make_normalized_response helper uses UsageStats instead of usage__tokens kwargs - tests/test_ai_client_tool_loop_builder.py: same - tests/test_ai_client_tool_loop_send_func.py: same - tests/test_openai_compatible.py: NormalizedResponse(text=..., usage=UsageStats(...)) + tool_calls[0].function.name (attribute access) instead of ['function']['name'] - tests/test_auto_whitelist.py: use update_session_metadata() instead of dict subscript assignment (Session dataclass doesn't support item assignment) VERIFIED: uv run pytest tests/test_ai_client_.py tests/test_openai_*.py \ tests/test_auto_whitelist.py --timeout=30 56 passed in 4.49s (19 previously failing tests now pass) uv run python scripts/audit_weak_types.py --strict STRICT OK: 115 weak sites <= baseline 115 uv run python scripts/audit_dataclass_coverage.py --strict STRICT OK: 200 weak sites <= baseline 207 This commit closes the t2_6 deferred task. The 41-site Phase 3 call-site migration remains deferred (separate provider_state_migration track).	2026-06-21 17:42:35 -04:00
ed	e9fa69ddc1	feat(api_hooks): add WebSocketMessage + JsonValue type (t5_1-t5_8) Phase 5 of any_type_componentization_20260621. Promotes the WebSocket broadcast signature in src/api_hooks.py from (channel, payload: dict) to a typed WebSocketMessage dataclass (16 Any sites): NEW dataclass (inline in src/api_hooks.py): - WebSocketMessage (frozen=True): channel: str, payload: JsonValue MODIFIED: - _serialize_for_api(obj: Any) -> JsonValue (typed return) - broadcast(channel: str, payload: dict[str, Any]) -> broadcast(message: WebSocketMessage) - _get_app_attr / _set_app_attr signatures UNCHANGED (Pattern 4 preserved) NEW tests/test_api_hooks_dataclasses.py (12 tests, all pass): - test_websocket_message_construction - test_websocket_message_with_list_payload - test_websocket_message_with_nested_payload - test_websocket_message_is_frozen - test_websocket_message_to_json - test_serialize_for_api_returns_dict_for_to_dict_object - test_serialize_for_api_handles_nested_lists - test_serialize_for_api_handles_purepath - test_serialize_for_api_passthrough_for_primitives - test_serialize_for_api_handles_mixed_nesting - test_get_app_attr_signature_preserved (Pattern 4 invariant) - test_set_app_attr_signature_preserved (Pattern 4 invariant) MODIFIED tests/test_websocket_server.py: - Updated broadcast() call site to use WebSocketMessage(channel=..., payload=...) - Added WebSocketMessage import Verified: uv run pytest tests/test_api_hooks_dataclasses.py tests/test_api_hooks_warmup.py tests/test_websocket_server.py --timeout=30 23 passed in 5.03s (12 new + 10 existing + 1 websocket)	2026-06-21 17:00:42 -04:00
ed	fef6c20ea0	feat(log): add Session + SessionMetadata dataclasses (t4_1-t4_8) Phase 4 of any_type_componentization_20260621. Promotes the 2-level dict[str, dict[str, Any]] structure in src/log_registry.py to typed Session + SessionMetadata dataclasses (7 Any sites): NEW dataclasses (inline in src/log_registry.py): - SessionMetadata (frozen): message_count, errors, size_kb, whitelisted, reason, timestamp - Session (frozen): session_id, path, start_time, whitelisted, metadata - to_dict() / from_dict() classmethod for round-trip with TOML shape - Backward-compat __getitem__ / get() so existing test_log_registry.py tests that use session_data['path'] / session_data.get('metadata') continue to work REFACTOR LogRegistry: - self.data: dict[str, dict[str, Any]] -> dict[str, Session] - load_registry: populates with Session.from_dict(...) - save_registry: serializes via session.to_dict() - register_session: creates Session dataclass - update_session_metadata: creates new Session with updated SessionMetadata - is_session_whitelisted: reads session.whitelisted - update_auto_whitelist_status: reads session.path - get_old_non_whitelisted_sessions: reads session.start_time + metadata NEW tests/test_log_registry_dataclasses.py (13 tests, all pass): - test_session_dataclass_construction - test_session_metadata_dataclass_construction - test_session_from_dict_basic / with_metadata - test_session_to_dict_round_trip - test_session_metadata_to_dict - test_log_registry_data_is_typed - test_log_registry_register_session_returns_session - test_log_registry_update_session_metadata_sets_metadata - test_log_registry_is_session_whitelisted - test_log_registry_get_old_non_whitelisted_sessions - test_session_is_frozen - test_session_metadata_is_frozen Verified: uv run pytest tests/test_log_registry.py tests/test_log_registry_dataclasses.py --timeout=30 18 passed in 3.27s (5 existing + 13 new)	2026-06-21 16:56:24 -04:00
ed	2ad4718c3c	feat(provider): add src/provider_state.py + tests (t3_2, t3_3) Phase 3 of any_type_componentization_20260621 (PARTIAL). Adds the ProviderHistory abstraction and 6-provider registry. NEW src/provider_state.py (60 lines): - ProviderHistory dataclass (messages: list[HistoryMessage], lock: Lock, append / get_all / replace_all / clear methods) - _PROVIDER_HISTORIES: dict[str, ProviderHistory] for anthropic / deepseek / minimax / qwen / grok / llama - get_history(provider) factory + clear_all() + providers() - SDK client holders (_gemini_chat, _anthropic_client, etc.) NOT touched per Pattern 3 (heterogeneous SDK types) NEW tests/test_provider_state.py (12 tests, all pass): - test_six_providers_registered - test_get_history_returns_singleton_per_provider - test_get_history_raises_for_unknown - test_provider_history_starts_empty - test_provider_history_append / get_all_returns_copy / replace_all / replace_all_takes_copy / clear - test_clear_all_resets_every_provider - test_provider_history_thread_safety (10 threads x 100 messages) - test_independent_locks_per_provider (lock on one doesn't block another) DEFERRED: - t3_4 (Remove 14 globals from ai_client.py:111-133) - t3_5 through t3_13 (Update call sites in _send_<provider> functions) - t3_14 (Run full regression suite on test_ai_client*.py) These call-site updates require careful per-function refactoring of the ~27 sites in _send_anthropic, _send_deepseek, _send_minimax, _send_qwen, _send_grok, _send_llama. The ai_client.py file is 3432 lines; a single regex pass risks subtle indentation regressions in nested constructs (see the 7 ot : orphan lines from a previous attempt). The provider_state module is independently usable and tested. Future track: provider_state_migration_2026MMDD to wire up the call sites mechanically, OR integrate into a Phase 3 retry pass. Verified: uv run pytest tests/test_provider_state.py --timeout=30 12 passed in 2.99s	2026-06-21 16:43:42 -04:00
ed	a96f946b40	feat(openai): add src/openai_schemas.py + refactor openai_compatible.py (t2_1-t2_7) Phase 2 of any_type_componentization_20260621. Promotes NormalizedResponse + OpenAICompatibleRequest from src/openai_compatible.py to typed dataclasses. The 17 Any sites become 5 dataclasses: NEW src/openai_schemas.py (138 lines): - ToolCallFunction dataclass (name, arguments) - ToolCall dataclass (id, function: ToolCallFunction, type='function') - ChatMessage dataclass (role, content, tool_calls, tool_call_id, name) - UsageStats dataclass (input_tokens, output_tokens, cache_read_, cache_creation_) - NormalizedResponse dataclass (text, tool_calls: tuple, usage, raw_response: Any) - OpenAICompatibleRequest dataclass (messages: list[ChatMessage], model, ...) NEW tests/test_openai_schemas.py (19 tests, all pass): - ToolCallFunction, ToolCall, ChatMessage round-trips - UsageStats field access + frozen=True semantics - NormalizedResponse.to_legacy_dict preserves shape - raw_response stays Any (Pattern 3 preserved) - tools field stays list[dict[str, Any]] for Phase 1 ToolSpec follow-up MODIFIED src/openai_compatible.py: - Removed inline NormalizedResponse + OpenAICompatibleRequest definitions - Re-imported from src.openai_schemas - _send_blocking: tool_calls -> tuple[ToolCall, ...]; usage_*_tokens -> UsageStats - _send_streaming: same migration - send_openai_compatible: messages_dicts = [m.to_dict() for m in request.messages] - Exception handler: empty NormalizedResponse uses UsageStats - All NormalizedResponse consumers still work (legacy dict shape preserved) Verified: uv run pytest tests/test_openai_schemas.py tests/test_mcp_tool_specs.py tests/test_audit_dataclass_coverage.py tests/test_type_aliases.py tests/test_mcp_client_beads.py tests/test_mcp_client_paths.py tests/test_arch_boundary_phase2.py --timeout=60 64 passed in 6.28s	2026-06-21 16:27:59 -04:00
ed	8bcde09476	refactor(mcp): update ai_client.py 3 TOOL_NAMES sites (t1_5) Phase 1 of any_type_componentization_20260621. Migrates ai_client.py: - Line 560: new_tools = {name: False for name in mcp_client.TOOL_NAMES} -> mcp_tool_specs.tool_names() - Line 582: _agent_tools = {name: True for name in mcp_client.TOOL_NAMES} -> mcp_tool_specs.tool_names() - Line 1012: is_native = name in mcp_client.TOOL_NAMES -> name in mcp_tool_specs.tool_names() Plus adds: from src import mcp_tool_specs Verified: uv run pytest tests/test_mcp_tool_specs.py tests/test_mcp_client_beads.py tests/test_mcp_client_paths.py tests/test_audit_dataclass_coverage.py tests/test_type_aliases.py 39 passed in 11.79s No regressions. The mcp_client.TOOL_NAMES re-export is preserved for backward compatibility with any external test/code that imports it.	2026-06-21 16:11:27 -04:00
ed	747e3983bd	refactor(mcp): update mcp_client.py call sites to mcp_tool_specs (t1_4) Phase 1 of any_type_componentization_20260621. Migrates the 4 call sites in src/mcp_client.py to use the new typed module: - Line 1944: native_names = {t['name'] for t in MCP_TOOL_SPECS} -> native_names = mcp_tool_specs.tool_names() - Line 1958: res = list(MCP_TOOL_SPECS) -> res = [s.to_dict() for s in mcp_tool_specs.get_tool_schemas()] - Line 2747: TOOL_NAMES = {t['name'] for t in MCP_TOOL_SPECS} -> TOOL_NAMES = mcp_tool_specs.tool_names() Plus: removes the legacy MCP_TOOL_SPECS list literal (lines 1973-2746; 774 lines of dict literals). The data lives in src/mcp_tool_specs.py now; the canonical registry. (The legacy dict shape is preserved via ToolSpec.to_dict() for downstream serialization.) Adds import: from src import mcp_tool_specs Verified: uv run pytest tests/test_mcp_tool_specs.py tests/test_audit_dataclass_coverage.py tests/test_type_aliases.py 32 passed in 5.48s uv run pytest tests/test_mcp_client_beads.py tests/test_mcp_client_paths.py 7 passed in 3.20s Cross-module invariant (test_tool_names_subset_of_models_agent_tool_names): the 45 mcp_tool_specs.tool_names() are all in models.AGENT_TOOL_NAMES.	2026-06-21 16:09:30 -04:00
ed	96007ebd77	feat(mcp): add src/mcp_tool_specs.py + tests (t1_1, t1_2, t1_3) Phase 1 of any_type_componentization_20260621. Promotes MCP_TOOL_SPECS (45 dict[str, Any] literals in src/mcp_client.py) to typed dataclasses: NEW src/mcp_tool_specs.py: - ToolParameter dataclass (name, type, description, required, enum) - ToolSpec dataclass (name, description, parameters: tuple) - _REGISTRY: dict[str, ToolSpec] - register() / get_tool_spec() / get_tool_schemas() / tool_names() - to_dict() preserves legacy JSON shape for downstream serialization - 45 register() calls (one per tool) at module level - Mirrors src/vendor_capabilities.py reference pattern NEW tests/test_mcp_tool_specs.py (11 tests, all pass): - test_module_loads_with_45_registrations - test_tool_names_set_matches_expected_45 - test_get_tool_spec_returns_correct_instance - test_get_tool_spec_raises_for_unknown_name - test_get_tool_schemas_returns_all_specs - test_tool_spec_is_frozen - test_tool_parameter_is_frozen - test_to_dict_round_trip_preserves_shape - test_tool_parameter_to_dict_includes_enum - test_tool_names_subset_of_models_agent_tool_names (cross-module invariant) - test_register_idempotent_replaces_existing (hot-reload support) NEW scripts/tier2/artifacts/any_type_componentization_20260621/: - generate_mcp_tool_specs.py: idempotent generator from MCP_TOOL_SPECS - generate_tool_specs.py: helper that emits registration lines - inspect_mcp_specs.py: shape inspection - _generated_registrations.txt: the 45 registration lines Verified: 11/11 tests pass. The legacy MCP_TOOL_SPECS dict in mcp_client.py still exists; this commit only ADDS the new module. Migration of call sites in mcp_client.py + ai_client.py follows in t1_4 + t1_5. Verified with: uv run pytest tests/test_mcp_tool_specs.py --timeout=30 11 passed in 3.01s	2026-06-21 16:06:29 -04:00
ed	4e658dd25c	feat(types): add JsonPrimitive + JsonValue TypeAliases (t0_3) Phase 0 of any_type_componentization_20260621. Extends src/type_aliases.py with two recursive-friendly TypeAliases for JSON wire format (used by Phase 5 api_hooks WebSocketMessage): - JsonPrimitive: str \| int \| float \| bool \| None - JsonValue: JsonPrimitive \| list['JsonValue'] \| dict[str, 'JsonValue'] The forward-ref 'JsonValue' strings work because from __future__ import annotations is at the top of the module (PEP 563 + PEP 613 TypeAlias). Tests added (4 new, 14 total): - test_json_primitive_alias_resolves_to_union: hints exposes JsonPrimitive - test_json_value_alias_resolves_to_recursive_union: hints exposes JsonValue - test_json_value_accepts_primitive_dict: dict[str, JsonValue] runtime use - test_json_value_accepts_nested_structures: nested dict+list round-trip Verification: uv run pytest tests/test_type_aliases.py --timeout=30 14 passed in 2.97s	2026-06-21 15:57:40 -04:00
ed	d81339ecb3	refactor(ai_client): _reread_file_items_result returns FileItemsDiff NamedTuple	2026-06-21 12:47:07 -04:00
ed	833e99f2ec	refactor(project_manager,aggregate,api_hook_client): replace weak type sites with aliases	2026-06-21 12:39:17 -04:00
ed	d0c0571bde	refactor(api_hook_client): replace weak type sites with aliases	2026-06-21 12:38:22 -04:00

1 2 3 4 5 ...

1055 Commits