fix(tests): watchdog exits with code 2 so run_tests_batched.py sees the timeout

The conftest watchdog (e1c8730f) used os._exit(0) after the 30s sleep. run_tests_batched.py calls subprocess.run(check=True) and only prints 'Batch N failed.' when the subprocess exits non-zero. Exit 0 hid the failure: pytest got killed mid-test, the FAILURES section never printed, and the runner silently moved to the next batch. The 'Total batches with failures: 1' summary at the end was therefore undercounting.
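The masking effect can be reproduced with a minimal sketch of a batched runner. This is not the code of run_tests_batched.py: `run_batch` and `simulated_watchdog` are hypothetical names, and a tiny child process that calls os._exit() stands in for a pytest run killed by the watchdog.

```python
import subprocess
import sys


def run_batch(batch_number: int, cmd: list[str]) -> bool:
    """Run one batch and return True if the child exited with status 0."""
    try:
        # check=True raises CalledProcessError on any non-zero exit status.
        subprocess.run(cmd, check=True)
    except subprocess.CalledProcessError:
        print(f"Batch {batch_number} failed.")
        return False
    return True


def simulated_watchdog(exit_code: int) -> list[str]:
    """Hypothetical stand-in for a pytest run that the watchdog terminates."""
    return [sys.executable, "-c", f"import os; os._exit({exit_code})"]


results = [
    run_batch(1, simulated_watchdog(0)),  # old behaviour: looks like a pass
    run_batch(2, simulated_watchdog(2)),  # new behaviour: reported as failed
]
failed_batches = results.count(False)
print(f"Total batches with failures: {failed_batches}")
```

Under these assumptions only the second batch is counted as a failure, even though both children were terminated the same way.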

Fix: os._exit(0) -> os._exit(2). pytest itself exits with code 2 when a run is interrupted (for example by Ctrl-C), so the watchdog's exit now reads as an interrupted run rather than a clean pass. Because the code is non-zero, the batched runner reports the batch as a failure.
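A small, self-contained sketch of the watchdog pattern shows how the exit code reaches the parent process. The child script below is an illustration, not the project's conftest: the timeout is shortened from 30s to 0.2s, and a long sleep on the main thread stands in for the hang.

```python
import subprocess
import sys
import textwrap

# Child script: a daemon-thread watchdog plus a main thread that "hangs".
CHILD = textwrap.dedent(
    """
    import os
    import threading
    import time

    def _watchdog_exit():
        time.sleep(0.2)  # shortened from 30s for the demonstration
        os._exit(2)      # terminate the whole process with a non-zero status

    threading.Thread(
        target=_watchdog_exit, daemon=True, name="conftest-hang-watchdog"
    ).start()
    time.sleep(60)  # simulated hang on the main thread
    """
)

# The parent-side timeout is only a safety net in case the watchdog never fires.
completed = subprocess.run([sys.executable, "-c", CHILD], timeout=30)
print(completed.returncode)
```

The parent sees return code 2 as soon as the watchdog fires, well before the simulated hang would have ended.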

The test's docstring is updated to document the new contract. All 3 tests in test_conftest_watchdog.py still pass.
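The registration check the test performs could look roughly like the sketch below. It assumes the thread name "conftest-hang-watchdog" used in conftest. The helper `watchdog_is_registered` is hypothetical, and a harmless sleeping daemon thread stands in for the real watchdog so the example never calls os._exit().

```python
import threading
import time

WATCHDOG_NAME = "conftest-hang-watchdog"


def watchdog_is_registered() -> bool:
    """Return True if a live daemon thread with the watchdog's name exists."""
    return any(
        t.name == WATCHDOG_NAME and t.daemon and t.is_alive()
        for t in threading.enumerate()
    )


# Stand-in for the real watchdog: same name and daemon flag, but it only sleeps.
stand_in = threading.Thread(
    target=time.sleep, args=(1.0,), daemon=True, name=WATCHDOG_NAME
)

before = watchdog_is_registered()
stand_in.start()
after = watchdog_is_registered()
print(before, after)
```

Inspecting threading.enumerate() keeps the check in-process, which avoids spawning a subprocess that would itself be bound by the watchdog.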
2026-06-07 12:44:57 -04:00
parent b95935bf9b
commit 719c5e274a
2 changed files with 14 additions and 7 deletions
+1 -1
@@ -77,7 +77,7 @@ if not _warmup_app_controller.wait_for_warmup(timeout=60.0):
 def _watchdog_exit() -> None:
     import time
     time.sleep(30.0)
-    os._exit(0)
+    os._exit(2)
 import threading
 threading.Thread(target=_watchdog_exit, daemon=True, name="conftest-hang-watchdog").start()
+13 -6
@@ -9,12 +9,19 @@ observed:
 hanging on HTTP call to the hook server or on process.wait() for
 the sloppy.py subprocess.
-The conftest installs a daemon-thread watchdog (os._exit(0) after a
-timeout) to bound the hang. This test verifies the watchdog is
-actually registered after the conftest loads. It does NOT spawn a
-subprocess (which would itself be bound by the watchdog and create a
-recursive timeout), it just inspects threading.enumerate() at the
-time the test runs.
+The conftest installs a daemon-thread watchdog (os._exit(2) after a
+30s timeout) to bound the hang. The non-zero exit code is critical:
+run_tests_batched.py uses subprocess.run(check=True) and only
+prints "Batch N failed." if pytest exits non-zero. Exit code 0 would
+silently report a successful batch even when the watchdog killed
+pytest mid-test (the FAILURES section never gets printed). Exit
+code 2 is the standard "interrupted by signal/timeout" code that
+preserves the failure signal to the runner.
+This test verifies the watchdog is actually registered after the
+conftest loads. It does NOT spawn a subprocess (which would itself
+be bound by the watchdog and create a recursive timeout), it just
+inspects threading.enumerate() at the time the test runs.
 If the watchdog is removed or the timeout grows, this test fails
 and the run_tests_batched.py hang returns.