abc333f91b
Ctrl+C in sloppy.py's terminal would hang the process when a worker of the shared 4-thread I/O pool was mid-task in user code (e.g. a long-running Gemini/Anthropic HTTP request).

The hang chain:
1. SIGINT delivered to main thread
2. Python raises KeyboardInterrupt (default handler)
3. Exception propagates out of main()
4. Interpreter finalization begins
5. ThreadPoolExecutor.__del__ runs shutdown(wait=True)
6. shutdown(wait=True) joins all worker threads
7. The blocked worker never returns -> hang

An atexit-based fix (mirroring the conftest fix at 8957c9a5) was attempted first: register pool.shutdown(wait=False) at pool creation. Verified empirically that this DOES NOT WORK: atexit handlers do not fire at all when a pool worker is blocked in user code. The hang still occurs in ThreadPoolExecutor.__del__ -> shutdown(wait=True).

Production fix: a SIGINT handler installed by AppController.__init__ that drains the pool non-blockingly and calls os._exit(0), bypassing the broken finalization chain. One wire covers all three modes (GUI/headless/web) since they all create an AppController.

Files:
- src/app_controller.py: new module-level _install_sigint_exit_handler helper called from __init__; one-line docstring at the function level documents the rationale.
- tests/test_app_controller_sigint.py: new test file with 2 regression tests (unit: handler is installed on main thread; subprocess: handler exits within 2s when invoked with a blocked worker).
- tests/test_io_pool.py: module docstring updated to explain the reverted atexit approach and point readers at the production fix.

Best-effort: signal.signal may fail on non-main threads (some conftest warmup paths); failure is swallowed. The conftest's own atexit fix at 8957c9a5 covers the test fixture's normal-exit path.
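For readers without src/app_controller.py at hand, the production fix described above could look roughly like the sketch below. It is not the actual helper: the names make_sigint_exit_handler, install_sigint_exit_handler, and the exit_fn parameter are hypothetical, and the use of cancel_futures=True (Python 3.9+) to "drain" the pool is an assumption about what the real code does.

```python
import os
import signal
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def make_sigint_exit_handler(
    pool: ThreadPoolExecutor,
    exit_fn: Callable[[int], object] = os._exit,
) -> Callable[[int, object], None]:
    """Build a SIGINT handler that skips normal interpreter finalization.

    Hypothetical stand-in for the commit's _install_sigint_exit_handler.
    exit_fn is injectable only so the handler can be exercised in tests
    without killing the test process.
    """

    def _on_sigint(signum: int, frame: object) -> None:
        # Do not wait for workers: a worker blocked in user code would
        # never return. Queued (not yet started) jobs are cancelled.
        pool.shutdown(wait=False, cancel_futures=True)
        # os._exit terminates immediately, without running the
        # finalization steps that would otherwise join the workers.
        exit_fn(0)

    return _on_sigint


def install_sigint_exit_handler(
    pool: ThreadPoolExecutor,
    exit_fn: Callable[[int], object] = os._exit,
) -> bool:
    """Install the handler; return False if installation is not possible."""
    try:
        signal.signal(signal.SIGINT, make_sigint_exit_handler(pool, exit_fn))
    except ValueError:
        # signal.signal raises ValueError when called off the main thread.
        # Best-effort, as in the commit message: swallow the failure.
        return False
    return True
```

In this sketch the controller would call install_sigint_exit_handler(pool) once after creating its pool. Because os._exit skips atexit handlers and buffered-output flushing, anything that must be persisted on Ctrl+C would have to be written inside the handler before exit_fn runs.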
62 lines
1.9 KiB
Python
"""Tests for src/io_pool.py (the shared 4-thread job pool on AppController).

Historical note: an earlier revision of this file added two regression
tests asserting that ``make_io_pool`` registered an atexit shutdown
handler. Those tests were reverted together with the production atexit
fix they guarded, because the atexit approach does not solve the actual
Ctrl+C hang (see ``src/io_pool.py`` module docstring). The production
fix is a SIGINT handler in ``AppController.__init__``; the regression
test for that lives in ``tests/test_app_controller_sigint.py``.
"""

import threading
import time
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
import sys

ROOT = Path(__file__).resolve().parent.parent
sys.path.insert(0, str(ROOT))

from src.io_pool import make_io_pool, IO_POOL_MAX_WORKERS  # noqa: E402


def test_make_io_pool_returns_thread_pool_executor() -> None:
    pool = make_io_pool()
    assert isinstance(pool, ThreadPoolExecutor)
    pool.shutdown(wait=False)


def test_make_io_pool_has_four_workers() -> None:
    pool = make_io_pool()
    assert pool._max_workers == IO_POOL_MAX_WORKERS == 4
    pool.shutdown(wait=False)


def test_make_io_pool_workers_named_controller_io() -> None:
    pool = make_io_pool()

    def capture() -> str:
        return threading.current_thread().name

    fut = pool.submit(capture)
    name = fut.result(timeout=5)
    assert name.startswith("controller-io"), f"got {name!r}"
    pool.shutdown(wait=False)


def test_make_io_pool_runs_jobs_in_parallel() -> None:
    pool = make_io_pool()
    barrier = threading.Barrier(4)
    results: list[float] = []

    def wait_at_barrier() -> float:
        t0 = time.perf_counter()
        barrier.wait(timeout=5)
        return time.perf_counter() - t0

    futs = [pool.submit(wait_at_barrier) for _ in range(4)]
    durations = [f.result(timeout=5) for f in futs]
    assert all(d < 0.5 for d in durations), f"jobs did not run in parallel: {durations}"
    pool.shutdown(wait=False)