32 Commits

Author SHA1 Message Date
ed 2bf55a89c2 chore(conductor): Add new track 'GUI 2.0 Feature Parity and Migration' 2026-02-24 18:39:21 -05:00
ed 9ba8ac2187 chore(conductor): Add new track 'Update documentation and cleanup MainContext.md' 2026-02-24 18:36:03 -05:00
ed 5515a72cf3 update conductor files 2026-02-24 18:32:38 -05:00
ed ef3d8b0ec1 chore(conductor): Add new track 'Move discussion histories to their own toml to prevent the ai agent from reading it (will be on a blacklist).' 2026-02-24 18:32:09 -05:00
ed 874422ecfd comitting 2026-02-23 23:28:49 -05:00
ed 57cb63b9c9 conductor(track): Complete gui2_feature_parity track
Close gui2_feature_parity track after implementing all features
and conducting manual and automated verification.

Key Achievements:
- Integrated event-driven architecture and MCP client.
- Ported API hooks and performance diagnostics.
- Implemented Prior Session Viewer.
- Refactored UI to a Hub-based layout.
- Added agent capability toggles.
- Achieved full theme integration.
- Developed comprehensive test suite.

Note: Remaining UI display issues for text panels in the comms and
tool call history will be addressed in a subsequent track.
2026-02-23 23:27:43 -05:00
ed dbf2962c54 fix(gui): Restore 'Load Log' button and fix docking crash
fix(mcp): Improve path resolution and error messages
2026-02-23 23:00:17 -05:00
ed f5ef2d850f refactor(gui): Implement user feedback for UI layout 2026-02-23 22:36:45 -05:00
ed 366cd8ebdd conductor(plan): Mark phase 'UI/UX Refinement' as complete 2026-02-23 22:18:11 -05:00
ed cc5074e682 conductor(checkpoint): Checkpoint end of Phase 3 2026-02-23 22:17:37 -05:00
ed 1b49e20c2e conductor(plan): Mark Hub refactoring as complete 2026-02-23 22:16:30 -05:00
ed ddb53b250f refactor(gui2): Restructure layout into discrete Hubs
Automates the refactoring of the monolithic _gui_func in gui_2.py into separate rendering methods, nested within 'Context Hub', 'AI Settings Hub', 'Discussion Hub', and 'Operations Hub', utilizing tab bars. Adds tests to ensure the new default windows correctly represent this Hub structure.
2026-02-23 22:15:13 -05:00
ed c6a756e754 conductor(plan): Mark phase 'Core Architectural Integration' as complete 2026-02-23 22:11:17 -05:00
ed 712d5a856f conductor(checkpoint): Checkpoint end of Phase 1 2026-02-23 22:10:05 -05:00
ed ece84d4c4f feat(gui2): Integrate mcp_client.py for native file tools
Wires up the mcp_client.perf_monitor_callback to the gui_2.py App class and verifies the dispatch loop through a newly created test.
2026-02-23 22:06:55 -05:00
ed 2ab3f101d6 Merge origin/cache 2026-02-23 22:03:06 -05:00
ed 1d8626bc6b chore: Update config and manual_slop.toml 2026-02-23 21:55:00 -05:00
r00tz bd8551d282 Harden reliability, security, and UX across core modules
- Add thread safety: _anthropic_history_lock and _send_lock in ai_client to prevent concurrent corruption
  - Add _send_thread_lock in gui_2 for atomic check-and-start of send thread
  - Add atexit fallback in session_logger to flush log files on abnormal exit
  - Fix file descriptor leaks: use context managers for urlopen in mcp_client
  - Cap unbounded tool output growth at 500KB per send() call (both Gemini and Anthropic)
  - Harden path traversal: resolve(strict=True) with fallback in mcp_client allowlist checks
  - Add SLOP_CREDENTIALS env var override for credentials.toml with helpful error message
  - Fix Gemini token heuristic: use _CHARS_PER_TOKEN (3.5) instead of hardcoded // 4
  - Add keyboard shortcuts: Ctrl+Enter to send, Ctrl+L to clear message input
  - Add auto-save: flush project and config to disk every 60 seconds
2026-02-23 21:29:30 -05:00
ed 6d825e6585 wip: gemini doing gui_2.py catchup track 2026-02-23 21:07:06 -05:00
ed 3db6a32e7c conductor(plan): Update plan after merge from cache branch 2026-02-23 20:34:14 -05:00
ed c19b13e4ac Merge branch 'origin/cache' 2026-02-23 20:32:49 -05:00
ed 1b9a2ab640 chore: Update discussion timestamp 2026-02-23 20:24:51 -05:00
ed 4300a8a963 conductor(plan): Mark task 'Integrate events.py into gui_2.py' as complete 2026-02-23 20:23:26 -05:00
ed 24b831c712 feat(gui2): Integrate core event system
Integrates the ai_client.events emitter into the gui_2.py App class. Adds a new test file to verify that the App subscribes to API lifecycle events upon initialization. This is the first step in aligning gui_2.py with the project's event-driven architecture.
2026-02-23 20:22:36 -05:00
ed bf873dc110 for some reason didn't add? 2026-02-23 20:17:55 -05:00
ed f65542add8 chore(conductor): Add new track 'get gui_2 working with latest changes to the project.' 2026-02-23 20:16:53 -05:00
ed 229ebaf238 Merge branch 'sim' 2026-02-23 20:11:01 -05:00
ed e51194a9be remove live_ux_test from active tracks 2026-02-23 20:10:47 -05:00
ed 85f8f08f42 chore(conductor): Archive track 'live_ux_test_20260223' 2026-02-23 20:10:22 -05:00
ed 70358f8151 conductor(plan): Mark task 'Apply review suggestions' as complete 2026-02-23 20:09:54 -05:00
ed 064d7ba235 fix(conductor): Apply review suggestions for track 'live_ux_test_20260223' 2026-02-23 20:09:41 -05:00
r00tz 69401365be Port missing features to gui_2 and optimize caching
- Port 10 missing features from gui.py to gui_2.py: performance
    diagnostics, prior session log viewing, token budget visualization,
    agent tools config, API hooks server, GUI task queue, discussion
    truncation, THINKING/LIVE indicators, event subscriptions, and
    session usage tracking
  - Persist window visibility state in config.toml
  - Fix Gemini cache invalidation by separating discussion history
    from cached context (use MD5 hash instead of built-in hash)
  - Add cost optimizations: tool output truncation at source, proactive
    history trimming at 40%, summary_only support in aggregate.run()
  - Add cleanup() for destroying API caches on exit
2026-02-23 20:06:13 -05:00
43 changed files with 2155 additions and 1039 deletions
+14 -1
View File
@@ -164,6 +164,18 @@ def build_markdown_from_items(file_items: list[dict], screenshot_base_dir: Path,
return "\n\n---\n\n".join(parts) return "\n\n---\n\n".join(parts)
def build_markdown_no_history(file_items: list[dict], screenshot_base_dir: Path, screenshots: list[str], summary_only: bool = False) -> str:
"""Build markdown with only files + screenshots (no history). Used for stable caching."""
return build_markdown_from_items(file_items, screenshot_base_dir, screenshots, history=[], summary_only=summary_only)
def build_discussion_text(history: list[str]) -> str:
"""Build just the discussion history section text. Returns empty string if no history."""
if not history:
return ""
return "## Discussion History\n\n" + build_discussion_section(history)
def build_markdown(base_dir: Path, files: list[str], screenshot_base_dir: Path, screenshots: list[str], history: list[str], summary_only: bool = False) -> str: def build_markdown(base_dir: Path, files: list[str], screenshot_base_dir: Path, screenshots: list[str], history: list[str], summary_only: bool = False) -> str:
parts = [] parts = []
# STATIC PREFIX: Files and Screenshots must go first to maximize Cache Hits # STATIC PREFIX: Files and Screenshots must go first to maximize Cache Hits
@@ -195,8 +207,9 @@ def run(config: dict) -> tuple[str, Path, list[dict]]:
output_file = output_dir / f"{namespace}_{increment:03d}.md" output_file = output_dir / f"{namespace}_{increment:03d}.md"
# Build file items once, then construct markdown from them (avoids double I/O) # Build file items once, then construct markdown from them (avoids double I/O)
file_items = build_file_items(base_dir, files) file_items = build_file_items(base_dir, files)
summary_only = config.get("project", {}).get("summary_only", False)
markdown = build_markdown_from_items(file_items, screenshot_base_dir, screenshots, history, markdown = build_markdown_from_items(file_items, screenshot_base_dir, screenshots, history,
summary_only=False) summary_only=summary_only)
output_file.write_text(markdown, encoding="utf-8") output_file.write_text(markdown, encoding="utf-8")
return markdown, output_file, file_items return markdown, output_file, file_items
+133 -41
View File
@@ -15,7 +15,11 @@ import tomllib
import json import json
import time import time
import datetime import datetime
import hashlib
import difflib
import threading
from pathlib import Path from pathlib import Path
import os
import file_cache import file_cache
import mcp_client import mcp_client
import anthropic import anthropic
@@ -51,6 +55,8 @@ _GEMINI_CACHE_TTL = 3600
_anthropic_client = None _anthropic_client = None
_anthropic_history: list[dict] = [] _anthropic_history: list[dict] = []
_anthropic_history_lock = threading.Lock()
_send_lock = threading.Lock()
# Injected by gui.py - called when AI wants to run a command. # Injected by gui.py - called when AI wants to run a command.
# Signature: (script: str, base_dir: str) -> str | None # Signature: (script: str, base_dir: str) -> str | None
@@ -67,6 +73,10 @@ tool_log_callback = None
# Increased to allow thorough code exploration before forcing a summary # Increased to allow thorough code exploration before forcing a summary
MAX_TOOL_ROUNDS = 10 MAX_TOOL_ROUNDS = 10
# Maximum cumulative bytes of tool output allowed per send() call.
# Prevents unbounded memory growth during long tool-calling loops.
_MAX_TOOL_OUTPUT_BYTES = 500_000
# Maximum characters per text chunk sent to Anthropic. # Maximum characters per text chunk sent to Anthropic.
# Kept well under the ~200k token API limit. # Kept well under the ~200k token API limit.
_ANTHROPIC_CHUNK_SIZE = 120_000 _ANTHROPIC_CHUNK_SIZE = 120_000
@@ -128,8 +138,18 @@ def clear_comms_log():
def _load_credentials() -> dict: def _load_credentials() -> dict:
with open("credentials.toml", "rb") as f: cred_path = os.environ.get("SLOP_CREDENTIALS", "credentials.toml")
return tomllib.load(f) try:
with open(cred_path, "rb") as f:
return tomllib.load(f)
except FileNotFoundError:
raise FileNotFoundError(
f"Credentials file not found: {cred_path}\n"
f"Create a credentials.toml with:\n"
f" [gemini]\n api_key = \"your-key\"\n"
f" [anthropic]\n api_key = \"your-key\"\n"
f"Or set SLOP_CREDENTIALS env var to a custom path."
)
# ------------------------------------------------------------------ provider errors # ------------------------------------------------------------------ provider errors
@@ -244,7 +264,8 @@ def reset_session():
_gemini_cache_md_hash = None _gemini_cache_md_hash = None
_gemini_cache_created_at = None _gemini_cache_created_at = None
_anthropic_client = None _anthropic_client = None
_anthropic_history = [] with _anthropic_history_lock:
_anthropic_history = []
_CACHED_ANTHROPIC_TOOLS = None _CACHED_ANTHROPIC_TOOLS = None
file_cache.reset_client() file_cache.reset_client()
@@ -435,6 +456,13 @@ def _run_script(script: str, base_dir: str) -> str:
return output return output
def _truncate_tool_output(output: str) -> str:
"""Truncate tool output to _history_trunc_limit chars before sending to API."""
if _history_trunc_limit > 0 and len(output) > _history_trunc_limit:
return output[:_history_trunc_limit] + "\n\n... [TRUNCATED BY SYSTEM TO SAVE TOKENS.]"
return output
# ------------------------------------------------------------------ dynamic file context refresh # ------------------------------------------------------------------ dynamic file context refresh
def _reread_file_items(file_items: list[dict]) -> tuple[list[dict], list[dict]]: def _reread_file_items(file_items: list[dict]) -> tuple[list[dict], list[dict]]:
@@ -460,7 +488,7 @@ def _reread_file_items(file_items: list[dict]) -> tuple[list[dict], list[dict]]:
refreshed.append(item) # unchanged — skip re-read refreshed.append(item) # unchanged — skip re-read
continue continue
content = p.read_text(encoding="utf-8") content = p.read_text(encoding="utf-8")
new_item = {**item, "content": content, "error": False, "mtime": current_mtime} new_item = {**item, "old_content": item.get("content", ""), "content": content, "error": False, "mtime": current_mtime}
refreshed.append(new_item) refreshed.append(new_item)
changed.append(new_item) changed.append(new_item)
except Exception as e: except Exception as e:
@@ -486,6 +514,35 @@ def _build_file_context_text(file_items: list[dict]) -> str:
return "\n\n---\n\n".join(parts) return "\n\n---\n\n".join(parts)
_DIFF_LINE_THRESHOLD = 200
def _build_file_diff_text(changed_items: list[dict]) -> str:
"""
Build text for changed files. Small files (<= _DIFF_LINE_THRESHOLD lines)
get full content; large files get a unified diff against old_content.
"""
if not changed_items:
return ""
parts = []
for item in changed_items:
path = item.get("path") or item.get("entry", "unknown")
content = item.get("content", "")
old_content = item.get("old_content", "")
new_lines = content.splitlines(keepends=True)
if len(new_lines) <= _DIFF_LINE_THRESHOLD or not old_content:
suffix = str(path).rsplit(".", 1)[-1] if "." in str(path) else "text"
parts.append(f"### `{path}` (full)\n\n```{suffix}\n{content}\n```")
else:
old_lines = old_content.splitlines(keepends=True)
diff = difflib.unified_diff(old_lines, new_lines, fromfile=str(path), tofile=str(path), lineterm="")
diff_text = "\n".join(diff)
if diff_text:
parts.append(f"### `{path}` (diff)\n\n```diff\n{diff_text}\n```")
else:
parts.append(f"### `{path}` (no changes detected)")
return "\n\n---\n\n".join(parts)
# ------------------------------------------------------------------ content block serialisation # ------------------------------------------------------------------ content block serialisation
def _content_block_to_dict(block) -> dict: def _content_block_to_dict(block) -> dict:
@@ -530,22 +587,26 @@ def _get_gemini_history_list(chat):
return chat.get_history() return chat.get_history()
return [] return []
def _send_gemini(md_content: str, user_message: str, base_dir: str, file_items: list[dict] | None = None) -> str: def _send_gemini(md_content: str, user_message: str, base_dir: str,
file_items: list[dict] | None = None,
discussion_history: str = "") -> str:
global _gemini_chat, _gemini_cache, _gemini_cache_md_hash, _gemini_cache_created_at global _gemini_chat, _gemini_cache, _gemini_cache_md_hash, _gemini_cache_created_at
try: try:
_ensure_gemini_client(); mcp_client.configure(file_items or [], [base_dir]) _ensure_gemini_client(); mcp_client.configure(file_items or [], [base_dir])
# Only stable content (files + screenshots) goes in the cached system instruction.
# Discussion history is sent as conversation messages so the cache isn't invalidated every turn.
sys_instr = f"{_get_combined_system_prompt()}\n\n<context>\n{md_content}\n</context>" sys_instr = f"{_get_combined_system_prompt()}\n\n<context>\n{md_content}\n</context>"
tools_decl = [_gemini_tool_declaration()] tools_decl = [_gemini_tool_declaration()]
# DYNAMIC CONTEXT: Check if files/context changed mid-session # DYNAMIC CONTEXT: Check if files/context changed mid-session
current_md_hash = hash(md_content) current_md_hash = hashlib.md5(md_content.encode()).hexdigest()
old_history = None old_history = None
if _gemini_chat and _gemini_cache_md_hash != current_md_hash: if _gemini_chat and _gemini_cache_md_hash != current_md_hash:
old_history = list(_get_gemini_history_list(_gemini_chat)) if _get_gemini_history_list(_gemini_chat) else [] old_history = list(_get_gemini_history_list(_gemini_chat)) if _get_gemini_history_list(_gemini_chat) else []
if _gemini_cache: if _gemini_cache:
try: _gemini_client.caches.delete(name=_gemini_cache.name) try: _gemini_client.caches.delete(name=_gemini_cache.name)
except: pass except Exception as e: _append_comms("OUT", "request", {"message": f"[CACHE DELETE WARN] {e}"})
_gemini_chat = None _gemini_chat = None
_gemini_cache = None _gemini_cache = None
_gemini_cache_created_at = None _gemini_cache_created_at = None
@@ -558,7 +619,7 @@ def _send_gemini(md_content: str, user_message: str, base_dir: str, file_items:
if elapsed > _GEMINI_CACHE_TTL * 0.9: if elapsed > _GEMINI_CACHE_TTL * 0.9:
old_history = list(_get_gemini_history_list(_gemini_chat)) if _get_gemini_history_list(_gemini_chat) else [] old_history = list(_get_gemini_history_list(_gemini_chat)) if _get_gemini_history_list(_gemini_chat) else []
try: _gemini_client.caches.delete(name=_gemini_cache.name) try: _gemini_client.caches.delete(name=_gemini_cache.name)
except: pass except Exception as e: _append_comms("OUT", "request", {"message": f"[CACHE DELETE WARN] {e}"})
_gemini_chat = None _gemini_chat = None
_gemini_cache = None _gemini_cache = None
_gemini_cache_created_at = None _gemini_cache_created_at = None
@@ -601,9 +662,16 @@ def _send_gemini(md_content: str, user_message: str, base_dir: str, file_items:
_gemini_chat = _gemini_client.chats.create(**kwargs) _gemini_chat = _gemini_client.chats.create(**kwargs)
_gemini_cache_md_hash = current_md_hash _gemini_cache_md_hash = current_md_hash
# Inject discussion history as a user message on first chat creation
# (only when there's no old_history being restored, i.e., fresh session)
if discussion_history and not old_history:
_gemini_chat.send_message(f"[DISCUSSION HISTORY]\n\n{discussion_history}")
_append_comms("OUT", "request", {"message": f"[HISTORY INJECTED] {len(discussion_history)} chars"})
_append_comms("OUT", "request", {"message": f"[ctx {len(md_content)} + msg {len(user_message)}]"}) _append_comms("OUT", "request", {"message": f"[ctx {len(md_content)} + msg {len(user_message)}]"})
payload, all_text = user_message, [] payload, all_text = user_message, []
_cumulative_tool_bytes = 0
# Strip stale file refreshes and truncate old tool outputs ONCE before # Strip stale file refreshes and truncate old tool outputs ONCE before
# entering the tool loop (not per-round — history entries don't change). # entering the tool loop (not per-round — history entries don't change).
@@ -634,37 +702,30 @@ def _send_gemini(md_content: str, user_message: str, base_dir: str, file_items:
if cached_tokens: if cached_tokens:
usage["cache_read_input_tokens"] = cached_tokens usage["cache_read_input_tokens"] = cached_tokens
# Fetch cache stats in the background thread to avoid blocking GUI events.emit("response_received", payload={"provider": "gemini", "model": _model, "usage": usage, "round": r_idx})
cache_stats = None
try:
cache_stats = get_gemini_cache_stats()
except Exception:
pass
events.emit("response_received", payload={"provider": "gemini", "model": _model, "usage": usage, "round": r_idx, "cache_stats": cache_stats})
reason = resp.candidates[0].finish_reason.name if resp.candidates and hasattr(resp.candidates[0], "finish_reason") else "STOP" reason = resp.candidates[0].finish_reason.name if resp.candidates and hasattr(resp.candidates[0], "finish_reason") else "STOP"
_append_comms("IN", "response", {"round": r_idx, "stop_reason": reason, "text": txt, "tool_calls": [{"name": c.name, "args": dict(c.args)} for c in calls], "usage": usage}) _append_comms("IN", "response", {"round": r_idx, "stop_reason": reason, "text": txt, "tool_calls": [{"name": c.name, "args": dict(c.args)} for c in calls], "usage": usage})
# Guard: if Gemini reports input tokens approaching the limit, drop oldest history pairs # Guard: proactively trim history when input tokens exceed 40% of limit
total_in = usage.get("input_tokens", 0) total_in = usage.get("input_tokens", 0)
if total_in > _GEMINI_MAX_INPUT_TOKENS and _gemini_chat and _get_gemini_history_list(_gemini_chat): if total_in > _GEMINI_MAX_INPUT_TOKENS * 0.4 and _gemini_chat and _get_gemini_history_list(_gemini_chat):
hist = _get_gemini_history_list(_gemini_chat) hist = _get_gemini_history_list(_gemini_chat)
dropped = 0 dropped = 0
# Drop oldest pairs (user+model) but keep at least the last 2 entries # Drop oldest pairs (user+model) but keep at least the last 2 entries
while len(hist) > 4 and total_in > _GEMINI_MAX_INPUT_TOKENS * 0.7: while len(hist) > 4 and total_in > _GEMINI_MAX_INPUT_TOKENS * 0.3:
# Drop in pairs (user + model) to maintain alternating roles required by Gemini # Drop in pairs (user + model) to maintain alternating roles required by Gemini
saved = 0 saved = 0
for _ in range(2): for _ in range(2):
if not hist: break if not hist: break
for p in hist[0].parts: for p in hist[0].parts:
if hasattr(p, "text") and p.text: if hasattr(p, "text") and p.text:
saved += len(p.text) // 4 saved += int(len(p.text) / _CHARS_PER_TOKEN)
elif hasattr(p, "function_response") and p.function_response: elif hasattr(p, "function_response") and p.function_response:
r = getattr(p.function_response, "response", {}) r = getattr(p.function_response, "response", {})
if isinstance(r, dict): if isinstance(r, dict):
saved += len(str(r.get("output", ""))) // 4 saved += int(len(str(r.get("output", ""))) / _CHARS_PER_TOKEN)
hist.pop(0) hist.pop(0)
dropped += 1 dropped += 1
total_in -= max(saved, 200) total_in -= max(saved, 200)
@@ -689,15 +750,23 @@ def _send_gemini(md_content: str, user_message: str, base_dir: str, file_items:
if i == len(calls) - 1: if i == len(calls) - 1:
if file_items: if file_items:
file_items, changed = _reread_file_items(file_items) file_items, changed = _reread_file_items(file_items)
ctx = _build_file_context_text(changed) ctx = _build_file_diff_text(changed)
if ctx: if ctx:
out += f"\n\n[SYSTEM: FILES UPDATED]\n\n{ctx}" out += f"\n\n[SYSTEM: FILES UPDATED]\n\n{ctx}"
if r_idx == MAX_TOOL_ROUNDS: out += "\n\n[SYSTEM: MAX ROUNDS. PROVIDE FINAL ANSWER.]" if r_idx == MAX_TOOL_ROUNDS: out += "\n\n[SYSTEM: MAX ROUNDS. PROVIDE FINAL ANSWER.]"
out = _truncate_tool_output(out)
_cumulative_tool_bytes += len(out)
f_resps.append(types.Part.from_function_response(name=name, response={"output": out})) f_resps.append(types.Part.from_function_response(name=name, response={"output": out}))
log.append({"tool_use_id": name, "content": out}) log.append({"tool_use_id": name, "content": out})
events.emit("tool_execution", payload={"status": "completed", "tool": name, "result": out, "round": r_idx}) events.emit("tool_execution", payload={"status": "completed", "tool": name, "result": out, "round": r_idx})
if _cumulative_tool_bytes > _MAX_TOOL_OUTPUT_BYTES:
f_resps.append(types.Part.from_text(
f"SYSTEM WARNING: Cumulative tool output exceeded {_MAX_TOOL_OUTPUT_BYTES // 1000}KB budget. Provide your final answer now."
))
_append_comms("OUT", "request", {"message": f"[TOOL OUTPUT BUDGET EXCEEDED: {_cumulative_tool_bytes} bytes]"})
_append_comms("OUT", "tool_result_send", {"results": log}) _append_comms("OUT", "tool_result_send", {"results": log})
payload = f_resps payload = f_resps
@@ -955,7 +1024,7 @@ def _repair_anthropic_history(history: list[dict]):
}) })
def _send_anthropic(md_content: str, user_message: str, base_dir: str, file_items: list[dict] | None = None) -> str: def _send_anthropic(md_content: str, user_message: str, base_dir: str, file_items: list[dict] | None = None, discussion_history: str = "") -> str:
try: try:
_ensure_anthropic_client() _ensure_anthropic_client()
mcp_client.configure(file_items or [], [base_dir]) mcp_client.configure(file_items or [], [base_dir])
@@ -969,7 +1038,11 @@ def _send_anthropic(md_content: str, user_message: str, base_dir: str, file_item
context_blocks = _build_chunked_context_blocks(context_text) context_blocks = _build_chunked_context_blocks(context_text)
system_blocks = stable_blocks + context_blocks system_blocks = stable_blocks + context_blocks
user_content = [{"type": "text", "text": user_message}] # Prepend discussion history to the first user message if this is a fresh session
if discussion_history and not _anthropic_history:
user_content = [{"type": "text", "text": f"[DISCUSSION HISTORY]\n\n{discussion_history}\n\n---\n\n{user_message}"}]
else:
user_content = [{"type": "text", "text": user_message}]
# COMPRESS HISTORY: Truncate massive tool outputs from previous turns # COMPRESS HISTORY: Truncate massive tool outputs from previous turns
for msg in _anthropic_history: for msg in _anthropic_history:
@@ -1000,6 +1073,7 @@ def _send_anthropic(md_content: str, user_message: str, base_dir: str, file_item
}) })
all_text_parts = [] all_text_parts = []
_cumulative_tool_bytes = 0
# We allow MAX_TOOL_ROUNDS, plus 1 final loop to get the text synthesis # We allow MAX_TOOL_ROUNDS, plus 1 final loop to get the text synthesis
for round_idx in range(MAX_TOOL_ROUNDS + 2): for round_idx in range(MAX_TOOL_ROUNDS + 2):
@@ -1086,10 +1160,12 @@ def _send_anthropic(md_content: str, user_message: str, base_dir: str, file_item
_append_comms("OUT", "tool_call", {"name": b_name, "id": b_id, "args": b_input}) _append_comms("OUT", "tool_call", {"name": b_name, "id": b_id, "args": b_input})
output = mcp_client.dispatch(b_name, b_input) output = mcp_client.dispatch(b_name, b_input)
_append_comms("IN", "tool_result", {"name": b_name, "id": b_id, "output": output}) _append_comms("IN", "tool_result", {"name": b_name, "id": b_id, "output": output})
truncated = _truncate_tool_output(output)
_cumulative_tool_bytes += len(truncated)
tool_results.append({ tool_results.append({
"type": "tool_result", "type": "tool_result",
"tool_use_id": b_id, "tool_use_id": b_id,
"content": output, "content": truncated,
}) })
events.emit("tool_execution", payload={"status": "completed", "tool": b_name, "result": output, "round": round_idx}) events.emit("tool_execution", payload={"status": "completed", "tool": b_name, "result": output, "round": round_idx})
elif b_name == TOOL_NAME: elif b_name == TOOL_NAME:
@@ -1105,17 +1181,26 @@ def _send_anthropic(md_content: str, user_message: str, base_dir: str, file_item
"id": b_id, "id": b_id,
"output": output, "output": output,
}) })
truncated = _truncate_tool_output(output)
_cumulative_tool_bytes += len(truncated)
tool_results.append({ tool_results.append({
"type": "tool_result", "type": "tool_result",
"tool_use_id": b_id, "tool_use_id": b_id,
"content": output, "content": truncated,
}) })
events.emit("tool_execution", payload={"status": "completed", "tool": b_name, "result": output, "round": round_idx}) events.emit("tool_execution", payload={"status": "completed", "tool": b_name, "result": output, "round": round_idx})
if _cumulative_tool_bytes > _MAX_TOOL_OUTPUT_BYTES:
tool_results.append({
"type": "text",
"text": f"SYSTEM WARNING: Cumulative tool output exceeded {_MAX_TOOL_OUTPUT_BYTES // 1000}KB budget. Provide your final answer now."
})
_append_comms("OUT", "request", {"message": f"[TOOL OUTPUT BUDGET EXCEEDED: {_cumulative_tool_bytes} bytes]"})
# Refresh file context after tool calls — only inject CHANGED files # Refresh file context after tool calls — only inject CHANGED files
if file_items: if file_items:
file_items, changed = _reread_file_items(file_items) file_items, changed = _reread_file_items(file_items)
refreshed_ctx = _build_file_context_text(changed) refreshed_ctx = _build_file_diff_text(changed)
if refreshed_ctx: if refreshed_ctx:
tool_results.append({ tool_results.append({
"type": "text", "type": "text",
@@ -1160,21 +1245,26 @@ def send(
user_message: str, user_message: str,
base_dir: str = ".", base_dir: str = ".",
file_items: list[dict] | None = None, file_items: list[dict] | None = None,
discussion_history: str = "",
) -> str: ) -> str:
""" """
Send a message to the active provider. Send a message to the active provider.
md_content : aggregated markdown string from aggregate.run() md_content : aggregated markdown string (for Gemini: stable content only,
user_message: the user question / instruction for Anthropic: full content including history)
base_dir : project base directory (for PowerShell tool calls) user_message : the user question / instruction
file_items : list of file dicts from aggregate.build_file_items() for base_dir : project base directory (for PowerShell tool calls)
dynamic context refresh after tool calls file_items : list of file dicts from aggregate.build_file_items() for
dynamic context refresh after tool calls
discussion_history : discussion history text (used by Gemini to inject as
conversation message instead of caching it)
""" """
if _provider == "gemini": with _send_lock:
return _send_gemini(md_content, user_message, base_dir, file_items) if _provider == "gemini":
elif _provider == "anthropic": return _send_gemini(md_content, user_message, base_dir, file_items, discussion_history)
return _send_anthropic(md_content, user_message, base_dir, file_items) elif _provider == "anthropic":
raise ValueError(f"unknown provider: {_provider}") return _send_anthropic(md_content, user_message, base_dir, file_items, discussion_history)
raise ValueError(f"unknown provider: {_provider}")
def get_history_bleed_stats() -> dict: def get_history_bleed_stats() -> dict:
""" """
@@ -1182,7 +1272,9 @@ def get_history_bleed_stats() -> dict:
""" """
if _provider == "anthropic": if _provider == "anthropic":
# For Anthropic, we have a robust estimator # For Anthropic, we have a robust estimator
current_tokens = _estimate_prompt_tokens([], _anthropic_history) with _anthropic_history_lock:
history_snapshot = list(_anthropic_history)
current_tokens = _estimate_prompt_tokens([], history_snapshot)
limit_tokens = _ANTHROPIC_MAX_PROMPT_TOKENS limit_tokens = _ANTHROPIC_MAX_PROMPT_TOKENS
percentage = (current_tokens / limit_tokens) * 100 if limit_tokens > 0 else 0 percentage = (current_tokens / limit_tokens) * 100 if limit_tokens > 0 else 0
return { return {
@@ -1226,4 +1318,4 @@ def get_history_bleed_stats() -> dict:
"limit": 0, "limit": 0,
"current": 0, "current": 0,
"percentage": 0, "percentage": 0,
} }
@@ -0,0 +1,5 @@
# Track gui2_feature_parity_20260223 Context
- [Specification](./spec.md)
- [Implementation Plan](./plan.md)
- [Metadata](./metadata.json)
@@ -0,0 +1,8 @@
{
"track_id": "gui2_feature_parity_20260223",
"type": "feature",
"status": "new",
"created_at": "2026-02-23T20:15:30Z",
"updated_at": "2026-02-23T20:15:30Z",
"description": "get gui_2 working with latest changes to the project."
}
@@ -0,0 +1,82 @@
# Implementation Plan: GUIv2 Feature Parity
## Phase 1: Core Architectural Integration [checkpoint: 712d5a8]
- [x] **Task:** Integrate `events.py` into `gui_2.py`. [24b831c]
- [x] Sub-task: Import the `events` module in `gui_2.py`.
- [x] Sub-task: Refactor the `ai_client` call in `_do_send` to use the event-driven `send` method.
- [x] Sub-task: Create event handlers in `App` class for `request_start`, `response_received`, and `tool_execution`.
- [x] Sub-task: Subscribe the handlers to `ai_client.events` upon `App` initialization.
- [x] **Task:** Integrate `mcp_client.py` for native file tools. [ece84d4]
- [x] Sub-task: Import `mcp_client` in `gui_2.py`.
- [x] Sub-task: Add `mcp_client.perf_monitor_callback` to the `App` initialization.
- [x] Sub-task: In `ai_client`, ensure the MCP tools are registered and available for the AI to call when `gui_2.py` is the active UI.
- [x] **Task:** Write tests for new core integrations. [ece84d4]
- [x] Sub-task: Create `tests/test_gui2_events.py` to verify that `gui_2.py` correctly handles AI lifecycle events.
- [x] Sub-task: Create `tests/test_gui2_mcp.py` to verify that the AI can use MCP tools through `gui_2.py`.
- [x] **Task:** Conductor - User Manual Verification 'Core Architectural Integration' (Protocol in workflow.md)
## Phase 2: Major Feature Implementation
- [x] **Task:** Port the API Hooks System. [merged]
- [x] Sub-task: Import `api_hooks` in `gui_2.py`.
- [x] Sub-task: Instantiate `HookServer` in the `App` class.
- [x] Sub-task: Implement the logic to start the server based on a CLI flag (e.g., `--enable-test-hooks`).
- [x] Sub-task: Implement the queue and lock for pending GUI tasks from the hook server, similar to `gui.py`.
- [x] Sub-task: Add a main loop task to process the GUI task queue.
- [x] **Task:** Port the Performance & Diagnostics feature. [merged]
- [x] Sub-task: Import `PerformanceMonitor` in `gui_2.py`.
- [x] Sub-task: Instantiate `PerformanceMonitor` in the `App` class.
- [x] Sub-task: Create a new "Diagnostics" window in `gui_2.py`.
- [x] Sub-task: Add UI elements (plots, labels) to the Diagnostics window to display FPS, CPU, frame time, etc.
- [x] Sub-task: Add a throttled update mechanism in the main loop to refresh diagnostics data.
- [x] **Task:** Implement the Prior Session Viewer. [merged]
- [x] Sub-task: Add a "Load Prior Session" button to the UI.
- [x] Sub-task: Implement the file dialog logic to select a `.log` file.
- [x] Sub-task: Implement the logic to parse the log file and populate the comms history view.
- [x] Sub-task: Implement the "tinted" theme application when in viewing mode and a way to exit this mode.
- [x] **Task:** Write tests for major features.
- [x] Sub-task: Create `tests/test_gui2_api_hooks.py` to test the hook server integration.
- [x] Sub-task: Create `tests/test_gui2_diagnostics.py` to verify the diagnostics panel displays data.
- [x] **Task:** Conductor - User Manual Verification 'Major Feature Implementation' (Protocol in workflow.md)
## Phase 3: UI/UX Refinement [checkpoint: cc5074e]
- [x] **Task:** Refactor UI to a "Hub" based layout. [ddb53b2]
- [x] Sub-task: Analyze the docking layout of `gui.py`.
- [x] Sub-task: Create wrapper windows for "Context Hub", "AI Settings Hub", "Discussion Hub", and "Operations Hub" in `gui_2.py`.
- [x] Sub-task: Move existing windows into their respective Hubs using the `imgui-bundle` docking API.
- [x] Sub-task: Ensure the default layout is saved to and loaded from `manualslop_layout.ini`.
- [x] **Task:** Add Agent Capability Toggles to the UI. [merged]
- [x] Sub-task: In the "Projects" or a new "Agent" panel, add checkboxes for each agent tool (e.g., `run_powershell`, `read_file`).
- [x] Sub-task: Ensure these UI toggles are saved to the project\'s `.toml` file.
- [x] Sub-task: Ensure `ai_client` respects these settings when determining which tools are available to the AI.
- [x] **Task:** Full Theme Integration. [merged]
- [x] Sub-task: Review all newly added windows and controls.
- [x] Sub-task: Ensure that colors, fonts, and scaling from `theme_2.py` are correctly applied everywhere.
- [x] Sub-task: Test theme switching to confirm all elements update correctly.
- [x] **Task:** Write tests for UI/UX changes. [ddb53b2]
- [x] Sub-task: Create `tests/test_gui2_layout.py` to verify the hub structure is created.
- [x] Sub-task: Add tests to verify agent capability toggles are respected.
- [x] **Task:** Conductor - User Manual Verification 'UI/UX Refinement' (Protocol in workflow.md)
## Phase 4: Finalization and Verification
- [x] **Task:** Conduct full manual testing against `spec.md` Acceptance Criteria. (Note: Some UI display issues for text panels persist and will be addressed in a future track.)
- [x] Sub-task: Verify AC1: `gui_2.py` launches.
- [x] Sub-task: Verify AC2: Hub layout is correct.
- [x] Sub-task: Verify AC3: Diagnostics panel works.
- [x] Sub-task: Verify AC4: API hooks server runs.
- [x] Sub-task: Verify AC5: MCP tools are usable by AI.
- [x] Sub-task: Verify AC6: Prior Session Viewer works.
- [x] Sub-task: Verify AC7: Theming is consistent.
- [x] **Task:** Run the full project test suite.
- [x] Sub-task: Execute `uv run run_tests.py` (or equivalent).
- [x] Sub-task: Ensure all existing and new tests pass.
- [x] **Task:** Code Cleanup and Refactoring.
- [x] Sub-task: Remove any dead code or temporary debug statements.
- [x] Sub-task: Ensure code follows project style guides.
- [x] **Task:** Conductor - User Manual Verification 'Finalization and Verification' (Protocol in workflow.md)
---
**Note:** This track is being closed. Remaining UI display issues for text panels in the comms and tool call history will be addressed in a subsequent track. Please see the project's issue tracker for details on the new track.
@@ -0,0 +1,45 @@
# Specification: GUIv2 Feature Parity
## 1. Overview
This track aims to bring `gui_2.py` (the `imgui-bundle` based UI) to feature parity with the existing `gui.py` (the `dearpygui` based UI). This involves porting several major systems and features to ensure `gui_2.py` can serve as a viable replacement and support the latest project capabilities like automated testing and advanced diagnostics.
## 2. Functional Requirements
### FR1: Port Core Architectural Systems
- **FR1.1: Event-Driven Architecture:** `gui_2.py` MUST be refactored to use the `events.py` module for handling API lifecycle events, decoupling the UI from the AI client.
- **FR1.2: MCP File Tools Integration:** `gui_2.py` MUST integrate and use `mcp_client.py` to provide the AI with native, sandboxed file system capabilities (read, list, search).
### FR2: Port Major Features
- **FR2.1: API Hooks System:** The full API hooks system, including `api_hooks.py` and `api_hook_client.py`, MUST be integrated into `gui_2.py`. This will enable external test automation and state inspection.
- **FR2.2: Performance & Diagnostics:** The performance monitoring capabilities from `performance_monitor.py` MUST be integrated. A new "Diagnostics" panel, mirroring the one in `gui.py`, MUST be created to display real-time metrics (FPS, CPU, Frame Time, etc.).
- **FR2.3: Prior Session Viewer:** The functionality to load and view previous session logs (`.log` files from the `/logs` directory) MUST be implemented, including the distinctive "tinted" UI theme when viewing a prior session.
### FR3: UI/UX Alignment
- **FR3.1: 'Hub' UI Layout:** The windowing layout of `gui_2.py` MUST be refactored to match the "Hub" paradigm of `gui.py`. This includes creating:
- `Context Hub`
- `AI Settings Hub`
- `Discussion Hub`
- `Operations Hub`
- **FR3.2: Agent Capability Toggles:** The UI MUST include checkboxes or similar controls to allow the user to enable or disable the AI's agent-level tools (e.g., `run_powershell`, `read_file`).
- **FR3.3: Full Theme Integration:** All new UI components, windows, and controls MUST correctly apply and respond to the application's theming system (`theme_2.py`).
## 3. Non-Functional Requirements
- **NFR1: Stability:** The application must remain stable and responsive during and after the feature porting.
- **NFR2: Maintainability:** The new code should follow existing project conventions and be well-structured to ensure maintainability.
## 4. Acceptance Criteria
- **AC1:** `gui_2.py` successfully launches without errors.
- **AC2:** The "Hub" layout is present and organizes the UI elements as specified.
- **AC3:** The Diagnostics panel is present and displays updating performance metrics.
- **AC4:** The API hooks server starts and is reachable when `gui_2.py` is run with the appropriate flag.
- **AC5:** The AI can successfully use file system tools provided by `mcp_client.py`.
- **AC6:** The "Prior Session Viewer" can successfully load and display a log file.
- **AC7:** All new UI elements correctly reflect the selected theme.
## 5. Out of Scope
- Deprecating or removing `gui.py`. Both will coexist for now.
- Any new features not already present in `gui.py`. This is strictly a porting and alignment task.
@@ -35,3 +35,6 @@ Consolidate the simulation into end-user artifacts and CI tests.
- [x] Task: Create `tests/test_live_workflow.py` for automated regression testing. 8bd280e - [x] Task: Create `tests/test_live_workflow.py` for automated regression testing. 8bd280e
- [x] Task: Perform a full visual walkthrough and verify 'human-readable' pace. 8e63b31 - [x] Task: Perform a full visual walkthrough and verify 'human-readable' pace. 8e63b31
- [x] Task: Conductor - User Manual Verification 'Phase 4: Final Integration & Regression' (Protocol in workflow.md) 8e63b31 - [x] Task: Conductor - User Manual Verification 'Phase 4: Final Integration & Regression' (Protocol in workflow.md) 8e63b31
## Phase: Review Fixes
- [x] Task: Apply review suggestions 064d7ba
+4 -1
View File
@@ -1,14 +1,17 @@
# Project Context # Project Context
## Definition ## Definition
- [Product Definition](./product.md) - [Product Definition](./product.md)
- [Product Guidelines](./product-guidelines.md) - [Product Guidelines](./product-guidelines.md)
- [Tech Stack](./tech-stack.md) - [Tech Stack](./tech-stack.md)
## Workflow ## Workflow
- [Workflow](./workflow.md) - [Workflow](./workflow.md)
- [Code Style Guides](./code_styleguides/) - [Code Style Guides](./code_styleguides/)
## Management ## Management
- [Tracks Registry](./tracks.md) - [Tracks Registry](./tracks.md)
- [Tracks Directory](./tracks/) - [Tracks Directory](./tracks/)
+3
View File
@@ -1,15 +1,18 @@
# Product Guidelines: Manual Slop # Product Guidelines: Manual Slop
## Documentation Style ## Documentation Style
- **Strict & In-Depth:** Documentation must follow an old-school, highly detailed technical breakdown style (similar to VEFontCache-Odin). Focus on architectural design, state management, algorithmic details, and structural formats rather than just surface-level usage. - **Strict & In-Depth:** Documentation must follow an old-school, highly detailed technical breakdown style (similar to VEFontCache-Odin). Focus on architectural design, state management, algorithmic details, and structural formats rather than just surface-level usage.
## UX & UI Principles ## UX & UI Principles
- **USA Graphics Company Values:** Embrace high information density and tactile interactions. - **USA Graphics Company Values:** Embrace high information density and tactile interactions.
- **Arcade Aesthetics:** Utilize arcade game-style visual feedback for state updates (e.g., blinking notifications for tool execution and AI responses) to make the experience fun, visceral, and engaging. - **Arcade Aesthetics:** Utilize arcade game-style visual feedback for state updates (e.g., blinking notifications for tool execution and AI responses) to make the experience fun, visceral, and engaging.
- **Explicit Control & Expert Focus:** The interface should not hold the user's hand. It must prioritize explicit manual confirmation for destructive actions while providing dense, unadulterated access to logs and context. - **Explicit Control & Expert Focus:** The interface should not hold the user's hand. It must prioritize explicit manual confirmation for destructive actions while providing dense, unadulterated access to logs and context.
- **Multi-Viewport Capabilities:** Leverage dockable, floatable panels to allow users to build custom workspaces suitable for multi-monitor setups. - **Multi-Viewport Capabilities:** Leverage dockable, floatable panels to allow users to build custom workspaces suitable for multi-monitor setups.
## Code Standards & Architecture ## Code Standards & Architecture
- **Strict State Management:** There must be a rigorous separation between the Main GUI rendering thread and daemon execution threads. The UI should *never* hang during AI communication or script execution. Use lock-protected queues and events for synchronization. - **Strict State Management:** There must be a rigorous separation between the Main GUI rendering thread and daemon execution threads. The UI should *never* hang during AI communication or script execution. Use lock-protected queues and events for synchronization.
- **Comprehensive Logging:** Aggressively log all actions, API payloads, tool calls, and executed scripts. Maintain timestamped JSON-L and markdown logs to ensure total transparency and debuggability. - **Comprehensive Logging:** Aggressively log all actions, API payloads, tool calls, and executed scripts. Maintain timestamped JSON-L and markdown logs to ensure total transparency and debuggability.
- **Dependency Minimalism:** Limit external dependencies where possible. For instance, prefer standard library modules (like `urllib` and `html.parser` for web tools) over heavy third-party packages. - **Dependency Minimalism:** Limit external dependencies where possible. For instance, prefer standard library modules (like `urllib` and `html.parser` for web tools) over heavy third-party packages.
+6 -1
View File
@@ -1,17 +1,21 @@
# Technology Stack: Manual Slop # Technology Stack: Manual Slop
## Core Language ## Core Language
- **Python 3.11+** - **Python 3.11+**
## GUI Frameworks ## GUI Frameworks
- **Dear PyGui:** For immediate/retained mode GUI rendering and node mapping. - **Dear PyGui:** For immediate/retained mode GUI rendering and node mapping.
- **ImGui Bundle (`imgui-bundle`):** To provide advanced multi-viewport and dockable panel capabilities on top of Dear ImGui. - **ImGui Bundle (`imgui-bundle`):** To provide advanced multi-viewport and dockable panel capabilities on top of Dear ImGui.
## AI Integration SDKs ## AI Integration SDKs
- **google-genai:** For Google Gemini API interaction and explicit context caching. - **google-genai:** For Google Gemini API interaction and explicit context caching.
- **anthropic:** For Anthropic Claude API interaction, supporting ephemeral prompt caching. - **anthropic:** For Anthropic Claude API interaction, supporting ephemeral prompt caching.
## Configuration & Tooling ## Configuration & Tooling
- **tomli-w:** For writing TOML configuration files. - **tomli-w:** For writing TOML configuration files.
- **psutil:** For system and process monitoring (CPU/Memory telemetry). - **psutil:** For system and process monitoring (CPU/Memory telemetry).
- **uv:** An extremely fast Python package and project manager. - **uv:** An extremely fast Python package and project manager.
@@ -19,4 +23,5 @@
- **ApiHookClient:** A dedicated IPC client for automated GUI interaction and state inspection. - **ApiHookClient:** A dedicated IPC client for automated GUI interaction and state inspection.
## Architectural Patterns ## Architectural Patterns
- **Event-Driven Metrics:** Uses a custom `EventEmitter` to decouple API lifecycle events from UI rendering, improving performance and responsiveness.
- **Event-Driven Metrics:** Uses a custom `EventEmitter` to decouple API lifecycle events from UI rendering, improving performance and responsiveness.
+17 -2
View File
@@ -7,13 +7,28 @@ This file tracks all major tracks for the project. Each track has its own detail
- [x] **Track: Implement context visualization and memory management improvements** - [x] **Track: Implement context visualization and memory management improvements**
*Link: [./tracks/context_management_20260223/](./tracks/context_management_20260223/)* *Link: [./tracks/context_management_20260223/](./tracks/context_management_20260223/)*
--- ---
- [x] **Track: Make a human-like test ux interaction where the AI creates a small python project, engages in a 5-turn discussion, and verifies history/session management features via API hooks.** - [~] **Track: get gui_2 working with latest changes to the project.**
*Link: [./tracks/live_ux_test_20260223/](./tracks/live_ux_test_20260223/)* *Link: [./tracks/gui2_feature_parity_20260223/](./tracks/gui2_feature_parity_20260223/)*
---
- [ ] **Track: Move discussion histories to their own toml to prevent the ai agent from reading it (will be on a blacklist).**
*Link: [./tracks/history_segregation_20260224/](./tracks/history_segregation_20260224/)*
---
- [ ] **Track: Update ./docs/* & ./Readme.md, review ./MainContext.md significance (should we keep it..).**
*Link: [./tracks/documentation_refresh_20260224/](./tracks/documentation_refresh_20260224/)*
---
- [ ] **Track: Investigate differences left between gui.py and gui_2.py. Needs to reach full parity, so we can sunset guy.py**
*Link: [./tracks/gui2_parity_20260224/](./tracks/gui2_parity_20260224/)*
@@ -0,0 +1,5 @@
# Track documentation_refresh_20260224 Context
- [Specification](./spec.md)
- [Implementation Plan](./plan.md)
- [Metadata](./metadata.json)
@@ -0,0 +1,8 @@
{
"track_id": "documentation_refresh_20260224",
"type": "chore",
"status": "new",
"created_at": "2026-02-24T18:35:00Z",
"updated_at": "2026-02-24T18:35:00Z",
"description": "Update ./docs/* & ./Readme.md, review ./MainContext.md significance (should we keep it..)."
}
@@ -0,0 +1,34 @@
# Implementation Plan: Documentation Refresh and Context Cleanup
This plan follows the project's standard task workflow to modernize documentation and decommission redundant context files.
## Phase 1: Context Cleanup
Permanently remove redundant files and update project-wide references.
- [ ] Task: Audit references to `MainContext.md` across the project.
- [ ] Task: Write failing test that verifies the absence of `MainContext.md` and related broken links.
- [ ] Task: Delete `MainContext.md` and update any identified references.
- [ ] Task: Verify that all internal links remain functional.
- [ ] Task: Conductor - User Manual Verification 'Context Cleanup' (Protocol in workflow.md)
## Phase 2: Core Documentation Refresh
Update the Architecture and Tools guides to reflect recent architectural changes.
- [ ] Task: Audit `docs/guide_architecture.md` against current code (e.g., `EventEmitter`, `ApiHookClient`, Conductor).
- [ ] Task: Update `docs/guide_architecture.md` with current Conductor-driven architecture and dual-GUI structure.
- [ ] Task: Audit `docs/guide_tools.md` for toolset accuracy.
- [ ] Task: Update `docs/guide_tools.md` to include API hook client and performance monitoring documentation.
- [ ] Task: Verify documentation alignment with actual implementation.
- [ ] Task: Conductor - User Manual Verification 'Core Documentation Refresh' (Protocol in workflow.md)
## Phase 3: README Refresh and Link Validation
Modernize the primary project entry point and ensure documentation integrity.
- [ ] Task: Audit `Readme.md` for accuracy of setup instructions and feature highlights.
- [ ] Task: Write failing test (or link audit) that identifies outdated setup steps or broken links.
- [ ] Task: Update `Readme.md` with `uv` setup, current project vision, and feature lists (Conductor, GUI 2.0).
- [ ] Task: Perform a project-wide link validation of all Markdown files in `./docs/` and the root.
- [ ] Task: Verify setup instructions by performing a manual walkthrough of the Readme steps.
- [ ] Task: Conductor - User Manual Verification 'README Refresh and Link Validation' (Protocol in workflow.md)
---
[checkpoint: (SHA will be recorded here)]
@@ -0,0 +1,38 @@
# Specification: Documentation Refresh and Context Cleanup
## Overview
This track aims to modernize the project's documentation suite (Architecture, Tools, README) to reflect recent significant architectural additions, including the Conductor framework, the development of `gui_2.py`, and the API hook verification system. It also includes the decommissioning of `MainContext.md`, which has been identified as redundant in the current project structure.
## Functional Requirements
1. **Architecture Update (`docs/guide_architecture.md`):**
- Incorporate descriptions of the Conductor framework and its role in spec-driven development.
- Document the dual-GUI structure (`gui.py` and `gui_2.py`) and their respective development stages.
- Detail the `EventEmitter` and `ApiHookClient` as core architectural components.
2. **Tools Update (`docs/guide_tools.md`):**
- Refresh documentation for the current MCP toolset.
- Add documentation for the API hook client and automated GUI verification tools.
- Update performance monitoring tool descriptions.
3. **README Refresh (`Readme.md`):**
- Update setup instructions (e.g., `uv`, `credentials.toml`).
- Highlight new features: Conductor integration, GUI 2.0, and automated testing capabilities.
- Ensure the high-level project vision aligns with the current state.
4. **Context Cleanup:**
- Permanently remove `MainContext.md` from the project root.
- Update any internal references pointing to `MainContext.md`.
## Non-Functional Requirements
- **Link Validation:** All internal documentation links must be verified as valid.
- **Code-Doc Alignment:** Architectural descriptions must accurately reflect the current code structure.
- **Clarity & Brevity:** Documentation should remain concise and targeted at expert-level developers.
## Acceptance Criteria
- [ ] `MainContext.md` is deleted from the project.
- [ ] `docs/guide_architecture.md` is updated and reviewed for accuracy.
- [ ] `docs/guide_tools.md` is updated and reviewed for accuracy.
- [ ] `Readme.md` setup and feature sections are current.
- [ ] All internal links between `Readme.md` and the `./docs/` folder are functional.
## Out of Scope
- Automated documentation generation (e.g., Sphinx, Doxygen).
- In-depth documentation for features still in early prototyping stages.
- Creating new video or visual walkthroughs.
@@ -0,0 +1,5 @@
# Track gui2_parity_20260224 Context
- [Specification](./spec.md)
- [Implementation Plan](./plan.md)
- [Metadata](./metadata.json)
@@ -0,0 +1,8 @@
{
"track_id": "gui2_parity_20260224",
"type": "feature",
"status": "new",
"created_at": "2026-02-24T18:38:00Z",
"updated_at": "2026-02-24T18:38:00Z",
"description": "Investigate differences left between gui.py and gui_2.py. Needs to reach full parity, so we can sunset guy.py"
}
@@ -0,0 +1,40 @@
# Implementation Plan: GUI 2.0 Feature Parity and Migration
This plan follows the project's standard task workflow to ensure full feature parity and a stable transition to the ImGui-based `gui_2.py`.
## Phase 1: Research and Gap Analysis
Identify and document the exact differences between `gui.py` and `gui_2.py`.
- [ ] Task: Audit `gui.py` and `gui_2.py` side-by-side to document specific visual and functional gaps.
- [ ] Task: Map existing `EventEmitter` and `ApiHookClient` integrations in `gui.py` to `gui_2.py`.
- [ ] Task: Write failing tests in `tests/test_gui2_parity.py` that identify missing UI components or broken hooks in `gui_2.py`.
- [ ] Task: Verify failing parity tests.
- [ ] Task: Conductor - User Manual Verification 'Phase 1: Research and Gap Analysis' (Protocol in workflow.md)
## Phase 2: Visual and Functional Parity Implementation
Address all identified gaps and ensure functional equivalence.
- [ ] Task: Implement missing panels and UX nuances (text sizing, font rendering) in `gui_2.py`.
- [ ] Task: Complete integration of all `EventEmitter` hooks in `gui_2.py` to match `gui.py`.
- [ ] Task: Verify functional parity by running `tests/test_gui2_events.py` and `tests/test_gui2_layout.py`.
- [ ] Task: Address any identified regressions or missing interactive elements.
- [ ] Task: Conductor - User Manual Verification 'Phase 2: Visual and Functional Parity Implementation' (Protocol in workflow.md)
## Phase 3: Performance Optimization and Final Validation
Ensure `gui_2.py` meets performance requirements and passes all quality gates.
- [ ] Task: Conduct performance benchmarking (FPS, CPU, Frame Time) for both `gui.py` and `gui_2.py`.
- [ ] Task: Optimize rendering and docking logic in `gui_2.py` if performance targets are not met.
- [ ] Task: Verify performance parity using `tests/test_gui2_performance.py`.
- [ ] Task: Run full suite of automated GUI tests with `live_gui` fixture on `gui_2.py`.
- [ ] Task: Conductor - User Manual Verification 'Phase 3: Performance Optimization and Final Validation' (Protocol in workflow.md)
## Phase 4: Deprecation and Cleanup
Finalize the migration and decommission the original `gui.py`.
- [ ] Task: Rename `gui.py` to `gui_legacy.py`.
- [ ] Task: Update project entry point or documentation to point to `gui_2.py` as the primary interface.
- [ ] Task: Final project-wide link validation and documentation update.
- [ ] Task: Conductor - User Manual Verification 'Phase 4: Deprecation and Cleanup' (Protocol in workflow.md)
---
[checkpoint: (SHA will be recorded here)]
@@ -0,0 +1,29 @@
# Specification: GUI 2.0 Feature Parity and Migration
## Overview
The project is transitioning from `gui.py` (Dear PyGui-based) to `gui_2.py` (ImGui Bundle-based) to leverage advanced multi-viewport and docking features not natively supported by Dear PyGui. This track focuses on achieving full visual, functional, and performance parity between the two implementations, ultimately enabling the decommissioning of the original `gui.py`.
## Functional Requirements
1. **Visual Parity:**
- Ensure all panels, layouts, and interactive elements in `gui_2.py` match the established UX of `gui.py`.
- Address nuances in UX, such as text panel sizing and font rendering, to ensure a seamless transition for existing users.
2. **Functional Parity:**
- Verify that all backend hooks (API metrics, context management, MCP tools, shell execution) work identically in `gui_2.py`.
- Ensure all interactive controls (buttons, inputs, dropdowns) trigger the correct application state changes.
3. **Performance Parity:**
- Benchmark `gui_2.py` against `gui.py` for FPS, frame time, and CPU/memory usage.
- Optimize `gui_2.py` to meet or exceed the performance metrics of the original implementation.
## Non-Functional Requirements
- **Multi-Viewport Stability:** Ensure the ImGui-bundle implementation is stable across multiple windows and docking configurations.
- **Deprecation Workflow:** Establish a clear path for renaming `gui.py` to `gui_legacy.py` for a transition period.
## Acceptance Criteria
- [ ] `gui_2.py` successfully passes the full suite of GUI automated verification tests (e.g., `test_gui2_events.py`, `test_gui2_layout.py`).
- [ ] A side-by-side audit confirms visual and functional parity for all core Hub panels.
- [ ] Performance benchmarks show `gui_2.py` is within +/- 5% of `gui.py` metrics.
- [ ] `gui.py` is renamed to `gui_legacy.py`.
## Out of Scope
- Introducing new UI features or backend capabilities not present in `gui.py`.
- Modifying the core `EventEmitter` or `AiClient` logic (unless required for GUI hook integration).
@@ -0,0 +1,5 @@
# Track history_segregation_20260224 Context
- [Specification](./spec.md)
- [Implementation Plan](./plan.md)
- [Metadata](./metadata.json)
@@ -0,0 +1,8 @@
{
"track_id": "history_segregation_20260224",
"type": "feature",
"status": "new",
"created_at": "2026-02-24T18:28:00Z",
"updated_at": "2026-02-24T18:28:00Z",
"description": "Move discussion histories to their own toml to prevent the ai agent from reading it (will be on a blacklist)."
}
@@ -0,0 +1,30 @@
# Implementation Plan: Discussion History Segregation and Blacklisting
This plan follows the Test-Driven Development (TDD) workflow to move discussion history into a dedicated sibling TOML file and enforce a strict blacklist against AI agent tool access.
## Phase 1: Foundation and Migration Logic
This phase focuses on the structural changes needed to handle dual-file project configurations and the automatic migration of legacy history.
- [ ] Task: Research existing `ProjectManager` serialization and tool access points in `mcp_client.py`.
- [ ] Task: Write TDD tests for migrating the `discussion` key from `manual_slop.toml` to a new sibling file.
- [ ] Task: Implement automatic migration in `ProjectManager.load_project()`.
- [ ] Task: Update `ProjectManager.save_project()` to persist history separately.
- [ ] Task: Verify that existing history is correctly migrated and remains visible in the GUI.
- [ ] Task: Conductor - User Manual Verification 'Foundation and Migration' (Protocol in workflow.md)
## Phase 2: Blacklist Enforcement
This phase ensures the AI agent is strictly prevented from reading the history source files through its tools.
- [ ] Task: Write failing tests that attempt to read a known history file via the `mcp_client.py` and `aggregate.py` logic.
- [ ] Task: Implement hardcoded exclusion for `*_history.toml` and `history.toml` in `mcp_client.py`.
- [ ] Task: Implement hardcoded exclusion in `aggregate.py` to prevent history from being added as a raw file context.
- [ ] Task: Verify that tool-based file reads for the history file return a "Permission Denied" or "Blacklisted" error.
- [ ] Task: Conductor - User Manual Verification 'Blacklist Enforcement' (Protocol in workflow.md)
## Phase 3: Integration and Final Validation
This phase validates the full lifecycle, ensuring the application remains functional and secure.
- [ ] Task: Conduct a full walkthrough using the simulation scripts to verify history persistence across turns.
- [ ] Task: Verify that the AI can still use the *curated* history provided in the prompt context but cannot access the raw file.
- [ ] Task: Run full suite of automated GUI and API hook tests.
- [ ] Task: Conductor - User Manual Verification 'Integration and Final Validation' (Protocol in workflow.md)
@@ -0,0 +1,32 @@
# Specification: Discussion History Segregation and Blacklisting
## Overview
Currently, `manual_slop.toml` stores both project configuration and the entire discussion history. This leads to redundancy and potential context bloat if the AI agent reads the raw TOML file via its tools. This track will move the discussion history to a dedicated sibling TOML file (`history.toml`) and strictly blacklist it from the AI agent's file tools to ensure it only interacts with the curated context provided in the prompt.
## Functional Requirements
1. **File Segregation:**
- Create a dedicated history file (e.g., `manual_slop_history.toml`) in the same directory as the main project configuration.
- The main `manual_slop.toml` will henceforth only store project settings, tracked files, and system prompts.
2. **Automatic Migration:**
- On application startup or project load, detect if the `discussion` key exists in `manual_slop.toml`.
- If found, automatically migrate all discussion entries to the new history sibling file and remove the key from the original file.
3. **Strict Blacklisting:**
- Hardcode the exclusion of the history TOML file in `mcp_client.py` and `aggregate.py`.
- The AI agent must be prevented from reading this file using the `read_file` or `search_files` tools.
4. **Backend Integration:**
- Update `ProjectManager` in `project_manager.py` to manage two distinct TOML files per project.
- Ensure the GUI correctly loads history from the new file while maintaining existing functionality.
## Non-Functional Requirements
- **Data Integrity:** Ensure no history is lost during the migration process.
- **Performance:** Minimize I/O overhead when saving history entries after each AI turn.
## Acceptance Criteria
- [ ] `manual_slop.toml` no longer contains the `discussion` array.
- [ ] A sibling `history.toml` (or similar) contains all historical and new discussion entries.
- [ ] The AI agent cannot access the history TOML file via its file tools (verification via tool call test).
- [ ] Discussion history remains visible in the GUI and is correctly included in the AI prompt context.
## Out of Scope
- Customizable blacklist via the UI.
- Support for cloud-based history storage.
+23 -6
View File
@@ -33,7 +33,7 @@ All tasks follow a strict lifecycle:
- Rerun tests to ensure they still pass after refactoring. - Rerun tests to ensure they still pass after refactoring.
6. **Verify Coverage:** Run coverage reports using the project's chosen tools. For example, in a Python project, this might look like: 6. **Verify Coverage:** Run coverage reports using the project's chosen tools. For example, in a Python project, this might look like:
```bash ```powershell
pytest --cov=app --cov-report=html pytest --cov=app --cov-report=html
``` ```
Target: >80% coverage for new code. The specific tools and commands will vary by language and framework. Target: >80% coverage for new code. The specific tools and commands will vary by language and framework.
@@ -53,7 +53,7 @@ All tasks follow a strict lifecycle:
- **Step 9.1: Get Commit Hash:** Obtain the hash of the *just-completed commit* (`git log -1 --format="%H"`). - **Step 9.1: Get Commit Hash:** Obtain the hash of the *just-completed commit* (`git log -1 --format="%H"`).
- **Step 9.2: Draft Note Content:** Create a detailed summary for the completed task. This should include the task name, a summary of changes, a list of all created/modified files, and the core "why" for the change. - **Step 9.2: Draft Note Content:** Create a detailed summary for the completed task. This should include the task name, a summary of changes, a list of all created/modified files, and the core "why" for the change.
- **Step 9.3: Attach Note:** Use the `git notes` command to attach the summary to the commit. - **Step 9.3: Attach Note:** Use the `git notes` command to attach the summary to the commit.
```bash ```powershell
# The note content from the previous step is passed via the -m flag. # The note content from the previous step is passed via the -m flag.
git notes add -m "<note content>" <commit_hash> git notes add -m "<note content>" <commit_hash>
``` ```
@@ -136,6 +136,7 @@ For features involving the GUI or complex internal state, unit tests are often i
# The GUI is now running on port 8999 # The GUI is now running on port 8999
... ...
``` ```
Note: pytest must be run with `uv`.
3. **Verify via ApiHookClient:** Use the `ApiHookClient` in `api_hook_client.py` to interact with the running application. It includes robust retry logic and health checks. 3. **Verify via ApiHookClient:** Use the `ApiHookClient` in `api_hook_client.py` to interact with the running application. It includes robust retry logic and health checks.
@@ -163,21 +164,24 @@ Before marking any task complete, verify:
**AI AGENT INSTRUCTION: This section should be adapted to the project's specific language, framework, and build tools.** **AI AGENT INSTRUCTION: This section should be adapted to the project's specific language, framework, and build tools.**
### Setup ### Setup
```bash
```powershell
# Example: Commands to set up the development environment (e.g., install dependencies, configure database) # Example: Commands to set up the development environment (e.g., install dependencies, configure database)
# e.g., for a Node.js project: npm install # e.g., for a Node.js project: npm install
# e.g., for a Go project: go mod tidy # e.g., for a Go project: go mod tidy
``` ```
### Daily Development ### Daily Development
```bash
```powershell
# Example: Commands for common daily tasks (e.g., start dev server, run tests, lint, format) # Example: Commands for common daily tasks (e.g., start dev server, run tests, lint, format)
# e.g., for a Node.js project: npm run dev, npm test, npm run lint # e.g., for a Node.js project: npm run dev, npm test, npm run lint
# e.g., for a Go project: go run main.go, go test ./..., go fmt ./... # e.g., for a Go project: go run main.go, go test ./..., go fmt ./...
``` ```
### Before Committing ### Before Committing
```bash
```powershell
# Example: Commands to run all pre-commit checks (e.g., format, lint, type check, run tests) # Example: Commands to run all pre-commit checks (e.g., format, lint, type check, run tests)
# e.g., for a Node.js project: npm run check # e.g., for a Node.js project: npm run check
# e.g., for a Go project: make check (if a Makefile exists) # e.g., for a Go project: make check (if a Makefile exists)
@@ -186,18 +190,21 @@ Before marking any task complete, verify:
## Testing Requirements ## Testing Requirements
### Unit Testing ### Unit Testing
- Every module must have corresponding tests. - Every module must have corresponding tests.
- Use appropriate test setup/teardown mechanisms (e.g., fixtures, beforeEach/afterEach). - Use appropriate test setup/teardown mechanisms (e.g., fixtures, beforeEach/afterEach).
- Mock external dependencies. - Mock external dependencies.
- Test both success and failure cases. - Test both success and failure cases.
### Integration Testing ### Integration Testing
- Test complete user flows - Test complete user flows
- Verify database transactions - Verify database transactions
- Test authentication and authorization - Test authentication and authorization
- Check form submissions - Check form submissions
### Mobile Testing ### Mobile Testing
- Test on actual iPhone when possible - Test on actual iPhone when possible
- Use Safari developer tools - Use Safari developer tools
- Test touch interactions - Test touch interactions
@@ -207,6 +214,7 @@ Before marking any task complete, verify:
## Code Review Process ## Code Review Process
### Self-Review Checklist ### Self-Review Checklist
Before requesting review: Before requesting review:
1. **Functionality** 1. **Functionality**
@@ -245,6 +253,7 @@ Before requesting review:
## Commit Guidelines ## Commit Guidelines
### Message Format ### Message Format
``` ```
<type>(<scope>): <description> <type>(<scope>): <description>
@@ -254,6 +263,7 @@ Before requesting review:
``` ```
### Types ### Types
- `feat`: New feature - `feat`: New feature
- `fix`: Bug fix - `fix`: Bug fix
- `docs`: Documentation only - `docs`: Documentation only
@@ -263,7 +273,8 @@ Before requesting review:
- `chore`: Maintenance tasks - `chore`: Maintenance tasks
### Examples ### Examples
```bash
```powershell
git commit -m "feat(auth): Add remember me functionality" git commit -m "feat(auth): Add remember me functionality"
git commit -m "fix(posts): Correct excerpt generation for short posts" git commit -m "fix(posts): Correct excerpt generation for short posts"
git commit -m "test(comments): Add tests for emoji reaction limits" git commit -m "test(comments): Add tests for emoji reaction limits"
@@ -287,6 +298,7 @@ A task is complete when:
## Emergency Procedures ## Emergency Procedures
### Critical Bug in Production ### Critical Bug in Production
1. Create hotfix branch from main 1. Create hotfix branch from main
2. Write failing test for bug 2. Write failing test for bug
3. Implement minimal fix 3. Implement minimal fix
@@ -295,6 +307,7 @@ A task is complete when:
6. Document in plan.md 6. Document in plan.md
### Data Loss ### Data Loss
1. Stop all write operations 1. Stop all write operations
2. Restore from latest backup 2. Restore from latest backup
3. Verify data integrity 3. Verify data integrity
@@ -302,6 +315,7 @@ A task is complete when:
5. Update backup procedures 5. Update backup procedures
### Security Breach ### Security Breach
1. Rotate all secrets immediately 1. Rotate all secrets immediately
2. Review access logs 2. Review access logs
3. Patch vulnerability 3. Patch vulnerability
@@ -311,6 +325,7 @@ A task is complete when:
## Deployment Workflow ## Deployment Workflow
### Pre-Deployment Checklist ### Pre-Deployment Checklist
- [ ] All tests passing - [ ] All tests passing
- [ ] Coverage >80% - [ ] Coverage >80%
- [ ] No linting errors - [ ] No linting errors
@@ -320,6 +335,7 @@ A task is complete when:
- [ ] Backup created - [ ] Backup created
### Deployment Steps ### Deployment Steps
1. Merge feature branch to main 1. Merge feature branch to main
2. Tag release with version 2. Tag release with version
3. Push to deployment service 3. Push to deployment service
@@ -329,6 +345,7 @@ A task is complete when:
7. Monitor for errors 7. Monitor for errors
### Post-Deployment ### Post-Deployment
1. Monitor analytics 1. Monitor analytics
2. Check error logs 2. Check error logs
3. Gather user feedback 3. Gather user feedback
+17 -8
View File
@@ -1,16 +1,16 @@
[ai] [ai]
provider = "gemini" provider = "gemini"
model = "gemini-2.5-flash" model = "gemini-2.5-flash-lite"
temperature = 0.6000000238418579 temperature = 0.0
max_tokens = 12000 max_tokens = 8192
history_trunc_limit = 8000 history_trunc_limit = 8000
system_prompt = "DO NOT EVER make a shell script unless told to. DO NOT EVER make a readme or a file describing your changes unless your are told to. If you have commands I should be entering into the command line or if you have something to explain to me, please just use code blocks or normal text output. DO NOT DO ANYTHING OTHER THAN WHAT YOU WERE TOLD TODO. DO NOT EVER, EVER DO ANYTHING OTHER THAN WHAT YOU WERE TOLD TO DO. IF YOU WANT TO DO OTHER THINGS, SIMPLY SUGGEST THEM, AND THEN I WILL REVIEW YOUR CHANGES, AND MAKE THE DECISION ON HOW TO PROCEED. WHEN WRITING SCRIPTS USE A 120-160 character limit per line. I don't want to see scrunched code.\n" system_prompt = ""
[theme] [theme]
palette = "10x Dark" palette = "Gold"
font_path = "C:/Users/Ed/AppData/Local/uv/cache/archive-v0/WSthkYsQ82b_ywV6DkiaJ/pygame_gui/data/FiraCode-Regular.ttf" font_path = ""
font_size = 18.0 font_size = 14.0
scale = 1.0 scale = 1.5
[projects] [projects]
paths = [ paths = [
@@ -19,3 +19,12 @@ paths = [
"C:\\projects\\manual_slop\\tests\\temp_project.toml", "C:\\projects\\manual_slop\\tests\\temp_project.toml",
] ]
active = "C:\\projects\\manual_slop\\tests\\temp_project.toml" active = "C:\\projects\\manual_slop\\tests\\temp_project.toml"
[gui.show_windows]
"Context Hub" = true
"Files & Media" = true
"AI Settings" = true
"Discussion Hub" = true
"Operations Hub" = true
Theme = true
Diagnostics = true
+1142 -706
View File
File diff suppressed because it is too large Load Diff
+77 -191
View File
File diff suppressed because one or more lines are too long
+112 -63
View File
@@ -8,100 +8,149 @@ Size=400,400
Collapsed=0 Collapsed=0
[Window][Projects] [Window][Projects]
Pos=209,396 ViewportPos=43,95
Size=387,337 ViewportId=0x78C57832
Size=897,649
Collapsed=0 Collapsed=0
DockId=0x00000014,0 DockId=0x00000002,0
[Window][Files] [Window][Files]
Pos=0,0 ViewportPos=3125,170
Size=207,1200 ViewportId=0x26D64416
Size=593,581
Collapsed=0 Collapsed=0
DockId=0x00000011,0 DockId=0x00000009,0
[Window][Screenshots] [Window][Screenshots]
Pos=209,0 ViewportPos=3125,170
Size=387,171 ViewportId=0x26D64416
Collapsed=0 Pos=0,583
DockId=0x00000015,0 Size=593,574
[Window][Discussion History]
Pos=598,128
Size=712,619
Collapsed=0
DockId=0x0000000E,0
[Window][Provider]
Pos=209,913
Size=387,287
Collapsed=0 Collapsed=0
DockId=0x0000000A,0 DockId=0x0000000A,0
[Window][Message] [Window][Discussion History]
Pos=598,749 Pos=0,17
Size=712,451 Size=1680,730
Collapsed=0 Collapsed=0
DockId=0x0000000C,0 DockId=0x00000011,0
[Window][Provider]
ViewportPos=43,95
ViewportId=0x78C57832
Pos=0,651
Size=897,468
Collapsed=0
DockId=0x00000002,0
[Window][Message]
Pos=0,749
Size=1680,451
Collapsed=0
DockId=0x0000000F,0
[Window][Response] [Window][Response]
Pos=209,735 Pos=0,749
Size=387,176 Size=1680,451
Collapsed=0 Collapsed=0
DockId=0x00000010,0 DockId=0x0000000F,1
[Window][Tool Calls] [Window][Tool Calls]
Pos=1312,733 ViewportPos=43,95
Size=368,144 ViewportId=0x78C57832
Pos=0,1121
Size=897,775
Collapsed=0 Collapsed=0
DockId=0x00000008,0 DockId=0x00000001,1
[Window][Comms History] [Window][Comms History]
Pos=1312,879 ViewportPos=43,95
Size=368,321 ViewportId=0x78C57832
Pos=0,1121
Size=897,775
Collapsed=0 Collapsed=0
DockId=0x00000006,0 DockId=0x00000001,0
[Window][System Prompts] [Window][System Prompts]
Pos=1312,0 Pos=0,749
Size=368,731 Size=1680,451
Collapsed=0 Collapsed=0
DockId=0x00000007,0 DockId=0x0000000F,2
[Window][Theme] [Window][Theme]
Pos=209,173 ViewportPos=43,95
Size=387,221 ViewportId=0x78C57832
Size=897,1896
Collapsed=0 Collapsed=0
DockId=0x00000016,0 DockId=0x00000002,0
[Window][Text Viewer - Entry #7] [Window][Text Viewer - Entry #7]
Pos=379,324 Pos=379,324
Size=900,700 Size=900,700
Collapsed=0 Collapsed=0
[Window][Diagnostics]
Pos=1190,794
Size=490,406
Collapsed=0
DockId=0x00000006,0
[Window][Context Hub]
Pos=0,17
Size=270,728
Collapsed=0
DockId=0x00000011,0
[Window][AI Settings Hub]
Pos=406,17
Size=435,1186
Collapsed=0
DockId=0x0000000D,0
[Window][Discussion Hub]
Pos=1190,17
Size=490,775
Collapsed=0
DockId=0x00000005,0
[Window][Operations Hub]
Pos=272,17
Size=916,1183
Collapsed=0
DockId=0x00000010,0
[Window][Files & Media]
Pos=0,17
Size=270,728
Collapsed=0
DockId=0x00000011,1
[Window][AI Settings]
Pos=0,747
Size=270,453
Collapsed=0
DockId=0x00000012,0
[Docking][Data] [Docking][Data]
DockSpace ID=0xAFC85805 Window=0x079D3A04 Pos=138,161 Size=1680,1200 Split=X DockNode ID=0x00000007 Pos=43,95 Size=897,1896 Split=Y
DockNode ID=0x00000011 Parent=0xAFC85805 SizeRef=207,1200 Selected=0x0469CA7A DockNode ID=0x00000002 Parent=0x00000007 SizeRef=1029,1119 Selected=0x8CA2375C
DockNode ID=0x00000012 Parent=0xAFC85805 SizeRef=1559,1200 Split=X DockNode ID=0x00000001 Parent=0x00000007 SizeRef=1029,775 Selected=0x8B4EBFA6
DockNode ID=0x00000003 Parent=0x00000012 SizeRef=1189,1200 Split=X DockNode ID=0x00000008 Pos=3125,170 Size=593,1157 Split=Y
DockNode ID=0x00000001 Parent=0x00000003 SizeRef=387,1200 Split=Y Selected=0x8CA2375C DockNode ID=0x00000009 Parent=0x00000008 SizeRef=1029,147 Selected=0x0469CA7A
DockNode ID=0x00000009 Parent=0x00000001 SizeRef=405,911 Split=Y Selected=0x8CA2375C DockNode ID=0x0000000A Parent=0x00000008 SizeRef=1029,145 Selected=0xDF822E02
DockNode ID=0x0000000F Parent=0x00000009 SizeRef=405,733 Split=Y Selected=0x8CA2375C DockSpace ID=0xAFC85805 Window=0x079D3A04 Pos=476,516 Size=1680,1183 Split=Y
DockNode ID=0x00000013 Parent=0x0000000F SizeRef=405,394 Split=Y Selected=0x8CA2375C DockNode ID=0x0000000C Parent=0xAFC85805 SizeRef=1362,1041 Split=X Selected=0x5D11106F
DockNode ID=0x00000015 Parent=0x00000013 SizeRef=405,171 Selected=0xDF822E02 DockNode ID=0x00000003 Parent=0x0000000C SizeRef=1188,1183 Split=X
DockNode ID=0x00000016 Parent=0x00000013 SizeRef=405,221 Selected=0x8CA2375C DockNode ID=0x0000000B Parent=0x00000003 SizeRef=404,1186 Split=X Selected=0xF4139CA2
DockNode ID=0x00000014 Parent=0x0000000F SizeRef=405,337 Selected=0xDA22FEDA DockNode ID=0x0000000E Parent=0x0000000B SizeRef=270,1183 Split=Y Selected=0xF4139CA2
DockNode ID=0x00000010 Parent=0x00000009 SizeRef=405,176 Selected=0x0D5A5273 DockNode ID=0x00000011 Parent=0x0000000E SizeRef=422,728 CentralNode=1 Selected=0xF4139CA2
DockNode ID=0x0000000A Parent=0x00000001 SizeRef=405,287 Selected=0xA07B5F14 DockNode ID=0x00000012 Parent=0x0000000E SizeRef=422,453 Selected=0x7BD57D6A
DockNode ID=0x00000002 Parent=0x00000003 SizeRef=800,1200 Split=Y DockNode ID=0x00000010 Parent=0x0000000B SizeRef=916,1183 Selected=0x418C7449
DockNode ID=0x0000000B Parent=0x00000002 SizeRef=1010,747 Split=Y DockNode ID=0x0000000D Parent=0x00000003 SizeRef=435,1186 Selected=0x363E93D6
DockNode ID=0x0000000D Parent=0x0000000B SizeRef=1010,126 CentralNode=1 DockNode ID=0x00000004 Parent=0x0000000C SizeRef=490,1183 Split=Y Selected=0x418C7449
DockNode ID=0x0000000E Parent=0x0000000B SizeRef=1010,619 Selected=0x5D11106F DockNode ID=0x00000005 Parent=0x00000004 SizeRef=837,775 Selected=0x6F2B5B04
DockNode ID=0x0000000C Parent=0x00000002 SizeRef=1010,451 Selected=0x66CFB56E DockNode ID=0x00000006 Parent=0x00000004 SizeRef=837,406 Selected=0xB4CBF21A
DockNode ID=0x00000004 Parent=0x00000012 SizeRef=368,1200 Split=Y Selected=0xDD6419BC DockNode ID=0x0000000F Parent=0xAFC85805 SizeRef=1362,451 Selected=0xDD6419BC
DockNode ID=0x00000005 Parent=0x00000004 SizeRef=261,877 Split=Y Selected=0xDD6419BC
DockNode ID=0x00000007 Parent=0x00000005 SizeRef=261,731 Selected=0xDD6419BC
DockNode ID=0x00000008 Parent=0x00000005 SizeRef=261,144 Selected=0x1D56B311
DockNode ID=0x00000006 Parent=0x00000004 SizeRef=261,321 Selected=0x8B4EBFA6
;;;<<<Layout_655921752_Default>>>;;; ;;;<<<Layout_655921752_Default>>>;;;
;;;<<<HelloImGui_Misc>>>;;; ;;;<<<HelloImGui_Misc>>>;;;
@@ -111,6 +160,6 @@ Name=Default
Show=false Show=false
ShowFps=true ShowFps=true
[Theme] [Theme]
Name=DarculaDarker Name=SoDark_AccentRed
;;;<<<SplitIds>>>;;; ;;;<<<SplitIds>>>;;;
{"gImGuiSplitIDs":{"MainDockSpace":2949142533}} {"gImGuiSplitIDs":{"MainDockSpace":2949142533}}
+17 -6
View File
@@ -65,7 +65,10 @@ def configure(file_items: list[dict], extra_base_dirs: list[str] | None = None):
for item in file_items: for item in file_items:
p = item.get("path") p = item.get("path")
if p is not None: if p is not None:
rp = Path(p).resolve() try:
rp = Path(p).resolve(strict=True)
except (OSError, ValueError):
rp = Path(p).resolve()
_allowed_paths.add(rp) _allowed_paths.add(rp)
_base_dirs.add(rp.parent) _base_dirs.add(rp.parent)
@@ -82,8 +85,13 @@ def _is_allowed(path: Path) -> bool:
A path is allowed if: A path is allowed if:
- it is explicitly in _allowed_paths, OR - it is explicitly in _allowed_paths, OR
- it is contained within (or equal to) one of the _base_dirs - it is contained within (or equal to) one of the _base_dirs
All paths are resolved (follows symlinks) before comparison to prevent
symlink-based path traversal.
""" """
rp = path.resolve() try:
rp = path.resolve(strict=True)
except (OSError, ValueError):
rp = path.resolve()
if rp in _allowed_paths: if rp in _allowed_paths:
return True return True
for bd in _base_dirs: for bd in _base_dirs:
@@ -108,9 +116,10 @@ def _resolve_and_check(raw_path: str) -> tuple[Path | None, str]:
except Exception as e: except Exception as e:
return None, f"ERROR: invalid path '{raw_path}': {e}" return None, f"ERROR: invalid path '{raw_path}': {e}"
if not _is_allowed(p): if not _is_allowed(p):
allowed_bases = "\\n".join([f" - {d}" for d in _base_dirs])
return None, ( return None, (
f"ACCESS DENIED: '{raw_path}' is not within the allowed paths. " f"ACCESS DENIED: '{raw_path}' resolves to '{p}', which is not within the allowed paths.\\n"
f"Use list_directory or search_files on an allowed base directory first." f"Allowed base directories are:\\n{allowed_bases}"
) )
return p, "" return p, ""
@@ -269,7 +278,8 @@ def web_search(query: str) -> str:
url = "https://html.duckduckgo.com/html/?q=" + urllib.parse.quote(query) url = "https://html.duckduckgo.com/html/?q=" + urllib.parse.quote(query)
req = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'}) req = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'})
try: try:
html = urllib.request.urlopen(req, timeout=10).read().decode('utf-8', errors='ignore') with urllib.request.urlopen(req, timeout=10) as resp:
html = resp.read().decode('utf-8', errors='ignore')
parser = _DDGParser() parser = _DDGParser()
parser.feed(html) parser.feed(html)
if not parser.results: if not parser.results:
@@ -292,7 +302,8 @@ def fetch_url(url: str) -> str:
req = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'}) req = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'})
try: try:
html = urllib.request.urlopen(req, timeout=10).read().decode('utf-8', errors='ignore') with urllib.request.urlopen(req, timeout=10) as resp:
html = resp.read().decode('utf-8', errors='ignore')
parser = _TextExtractor() parser = _TextExtractor()
parser.feed(html) parser.feed(html)
full_text = " ".join(parser.text) full_text = " ".join(parser.text)
+1 -1
View File
@@ -35,5 +35,5 @@ active = "main"
[discussion.discussions.main] [discussion.discussions.main]
git_commit = "" git_commit = ""
last_updated = "2026-02-23T19:01:39" last_updated = "2026-02-23T22:59:46"
history = [] history = []
+3
View File
@@ -26,6 +26,7 @@ scripts/generated/
Where <ts> = YYYYMMDD_HHMMSS of when this session was started. Where <ts> = YYYYMMDD_HHMMSS of when this session was started.
""" """
import atexit
import datetime import datetime
import json import json
import threading import threading
@@ -71,6 +72,8 @@ def open_session():
_tool_fh.write(f"# Tool-call log — session {_ts}\n\n") _tool_fh.write(f"# Tool-call log — session {_ts}\n\n")
_tool_fh.flush() _tool_fh.flush()
atexit.register(close_session)
def close_session(): def close_session():
"""Flush and close both log files. Called on clean exit (optional).""" """Flush and close both log files. Called on clean exit (optional)."""
-1
View File
@@ -1 +0,0 @@
Get-Content .env | ForEach-Object { $name, $value = $_.Split('=', 2); [Environment]::SetEnvironmentVariable($name, $value, "Process") }
+3 -2
View File
@@ -24,9 +24,10 @@ def main():
project_name = f"LiveTest_{int(time.time())}" project_name = f"LiveTest_{int(time.time())}"
# Use actual project dir for realism # Use actual project dir for realism
git_dir = os.path.abspath(".") git_dir = os.path.abspath(".")
project_path = os.path.join(git_dir, "tests", f"{project_name}.toml")
print(f"\n[Action] Scaffolding Project: {project_name}") print(f"\n[Action] Scaffolding Project: {project_name} at {project_path}")
sim.setup_new_project(project_name, git_dir) sim.setup_new_project(project_name, git_dir, project_path)
# Enable auto-add so results appear in history automatically # Enable auto-add so results appear in history automatically
client.set_value("auto_add_history", True) client.set_value("auto_add_history", True)
+8 -5
View File
@@ -31,11 +31,14 @@ class UserSimAgent:
break break
# We need to set a custom system prompt for the User Simulator # We need to set a custom system prompt for the User Simulator
ai_client.set_custom_system_prompt(self.system_prompt) try:
ai_client.set_custom_system_prompt(self.system_prompt)
# We'll use a blank md_content for now as the 'User' doesn't need to read its own files # We'll use a blank md_content for now as the 'User' doesn't need to read its own files
# via the same mechanism, but we could provide it if needed. # via the same mechanism, but we could provide it if needed.
response = ai_client.send(md_content="", user_message=last_ai_msg) response = ai_client.send(md_content="", user_message=last_ai_msg)
finally:
ai_client.set_custom_system_prompt("")
return response return response
def perform_action_with_delay(self, action_func, *args, **kwargs): def perform_action_with_delay(self, action_func, *args, **kwargs):
+5 -2
View File
@@ -8,9 +8,12 @@ class WorkflowSimulator:
self.client = hook_client self.client = hook_client
self.user_agent = UserSimAgent(hook_client) self.user_agent = UserSimAgent(hook_client)
def setup_new_project(self, name, git_dir): def setup_new_project(self, name, git_dir, project_path=None):
print(f"Setting up new project: {name}") print(f"Setting up new project: {name}")
self.client.click("btn_project_new") if project_path:
self.client.click("btn_project_new_automated", user_data=project_path)
else:
self.client.click("btn_project_new")
time.sleep(1) time.sleep(1)
self.client.set_value("project_git_dir", git_dir) self.client.set_value("project_git_dir", git_dir)
self.client.click("btn_project_save") self.client.click("btn_project_save")
View File
+10 -2
View File
@@ -4,6 +4,7 @@ git_dir = "C:\\projects\\manual_slop"
system_prompt = "" system_prompt = ""
main_context = "" main_context = ""
word_wrap = true word_wrap = true
summary_only = false
[output] [output]
output_dir = "./md_gen" output_dir = "./md_gen"
@@ -37,5 +38,12 @@ auto_add = true
[discussion.discussions.main] [discussion.discussions.main]
git_commit = "" git_commit = ""
last_updated = "2026-02-23T19:53:17" last_updated = "2026-02-23T23:25:47"
history = [] history = [
"@2026-02-23T23:01:07\nUser:\nhey this is a test, plaase read temp-project.toml and do a random web search!",
"@2026-02-23T23:01:10\nAI:\nI encountered an error trying to read the file `temp-project.toml`. It seems the file doesn't exist.\n\nFor the web search, I found a few interesting results for \"random search\":\n\n1. **The Useless Web:** A site that takes you to random, useless websites.\n2. **Random Number Generator - Calculator.net:** Information on how random number generators work.\n3. **Random Google Search Generator - Perchance:** A tool to create random Google search queries.\n4. **Random Picker - Vondy:** A tool to randomly select items from a list.\n5. **Wikipedia on Random Search:** An explanation of random search as an optimization method.",
"@2026-02-23T23:01:29\nUser:\nmy bad temp_project.toml",
"@2026-02-23T23:01:30\nAI:\nI'm sorry, but I still couldn't find the file `temp_project.toml`. Please double-check the filename and try again.",
"@2026-02-23T23:01:52\nUser:\n./tests/temp_project.toml?",
"@2026-02-23T23:01:53\nAI:\nOkay, I've successfully read the content of `./tests/temp_project.toml`. It appears to be a configuration file for a project, detailing settings for project name, Git directory, output directories, file paths, screenshot paths, agent tools, and discussion history.",
]
+48
View File
@@ -0,0 +1,48 @@
import pytest
from unittest.mock import MagicMock, patch
from gui_2 import App
import ai_client
from events import EventEmitter
@pytest.fixture
def app_instance():
"""
Fixture to create an instance of the gui_2.App class for testing.
It mocks functions that would render a window or block execution.
"""
if not hasattr(ai_client, 'events') or ai_client.events is None:
ai_client.events = EventEmitter()
with (
patch('gui_2.load_config', return_value={'ai': {}, 'projects': {}}),
patch('gui_2.save_config'),
patch('gui_2.project_manager'),
patch('gui_2.session_logger'),
patch('gui_2.immapp.run'),
patch.object(App, '_load_active_project'),
patch.object(App, '_fetch_models'),
patch.object(App, '_load_fonts'),
patch.object(App, '_post_init')
):
yield App
def test_app_subscribes_to_events(app_instance):
"""
This test checks that the App's __init__ method subscribes the necessary
event handlers to the ai_client.events emitter.
This test will fail until the event subscription logic is added to gui_2.App.
"""
with patch.object(ai_client.events, 'on') as mock_on:
app = app_instance()
mock_on.assert_called()
calls = mock_on.call_args_list
event_names = [call.args[0] for call in calls]
assert "request_start" in event_names
assert "response_received" in event_names
assert "tool_execution" in event_names
for call in calls:
handler = call.args[1]
assert hasattr(handler, '__self__')
assert handler.__self__ is app
+48
View File
@@ -0,0 +1,48 @@
import pytest
from unittest.mock import patch
from gui_2 import App
@pytest.fixture
def app_instance():
with (
patch('gui_2.load_config', return_value={'gui': {'show_windows': {}}}),
patch('gui_2.save_config'),
patch('gui_2.project_manager'),
patch('gui_2.session_logger'),
patch('gui_2.immapp.run'),
patch.object(App, '_load_active_project'),
patch.object(App, '_fetch_models'),
patch.object(App, '_load_fonts'),
patch.object(App, '_post_init')
):
yield App()
def test_gui2_hubs_exist_in_show_windows(app_instance):
"""
Verifies that the new consolidated Hub windows are defined in the App's show_windows.
This ensures they will be available in the 'Windows' menu.
"""
expected_hubs = [
"Context Hub",
"AI Settings",
"Discussion Hub",
"Operations Hub",
"Files & Media",
"Theme",
]
for hub in expected_hubs:
assert hub in app_instance.show_windows, f"Expected hub window '{hub}' not found in show_windows"
def test_gui2_old_windows_removed_from_show_windows(app_instance):
"""
Verifies that the old fragmented windows are removed from show_windows.
"""
old_windows = [
"Projects", "Files", "Screenshots",
"Provider", "System Prompts",
"Message", "Response", "Tool Calls", "Comms History"
]
for old_win in old_windows:
assert old_win not in app_instance.show_windows, f"Old window '{old_win}' should have been removed from show_windows"
+79
View File
@@ -0,0 +1,79 @@
import pytest
from unittest.mock import patch, MagicMock
from gui_2 import App
import ai_client
from events import EventEmitter
@pytest.fixture
def app_instance():
if not hasattr(ai_client, 'events') or ai_client.events is None:
ai_client.events = EventEmitter()
with (
patch('gui_2.load_config', return_value={'ai': {}, 'projects': {}}),
patch('gui_2.save_config'),
patch('gui_2.project_manager'),
patch('gui_2.session_logger'),
patch('gui_2.immapp.run'),
patch.object(App, '_load_active_project'),
patch.object(App, '_fetch_models'),
patch.object(App, '_load_fonts'),
patch.object(App, '_post_init')
):
yield App()
def test_mcp_tool_call_is_dispatched(app_instance):
"""
This test verifies that when the AI returns a tool call for an MCP function,
the ai_client correctly dispatches it to mcp_client.
This will fail until mcp_client is properly integrated.
"""
# 1. Define the mock tool call from the AI
mock_fc = MagicMock()
mock_fc.name = "read_file"
mock_fc.args = {"file_path": "test.txt"}
# 2. Construct the mock AI response (Gemini format)
mock_response_with_tool = MagicMock()
mock_part = MagicMock()
mock_part.text = ""
mock_part.function_call = mock_fc
mock_candidate = MagicMock()
mock_candidate.content.parts = [mock_part]
mock_candidate.finish_reason.name = "TOOL_CALLING"
mock_response_with_tool.candidates = [mock_candidate]
class DummyUsage:
prompt_token_count = 100
candidates_token_count = 10
cached_content_token_count = 0
mock_response_with_tool.usage_metadata = DummyUsage()
# 3. Create a mock for the final AI response after the tool call
mock_response_final = MagicMock()
mock_response_final.text = "Final answer"
mock_response_final.candidates = []
mock_response_final.usage_metadata = DummyUsage()
# 4. Patch the necessary components
with patch("ai_client._ensure_gemini_client"), \
patch("ai_client._gemini_client") as mock_client, \
patch('mcp_client.dispatch', return_value="file content") as mock_dispatch:
mock_chat = mock_client.chats.create.return_value
mock_chat.send_message.side_effect = [mock_response_with_tool, mock_response_final]
ai_client.set_provider("gemini", "mock-model")
# 5. Call the send function
ai_client.send(
md_content="some context",
user_message="read the file",
base_dir=".",
file_items=[],
discussion_history=""
)
# 6. Assert that the MCP dispatch function was called
mock_dispatch.assert_called_once_with("read_file", {"file_path": "test.txt"})