docs(styleguide): add 5 sections clarifying the convention's boundaries
This commit is contained in:
@@ -263,7 +263,7 @@ warnings use `warnings.warn(..., stacklevel=2)` which is thread-safe.
|
||||
**Don't use it for:**
|
||||
|
||||
- Constructors (`__init__`) that fail with programmer errors (use `assert` or
|
||||
`raise` for these).
|
||||
`raise` for these). See "Constructors Can Raise" below for the full rule.
|
||||
- Trivial getters that can't fail (`get_name() -> str` doesn't need a
|
||||
`Result`).
|
||||
- Performance-critical hot paths where the overhead of the dataclass
|
||||
@@ -271,6 +271,337 @@ warnings use `warnings.warn(..., stacklevel=2)` which is thread-safe.
|
||||
|
||||
---
|
||||
|
||||
## Boundary Types: What Counts as a "Boundary"?
|
||||
|
||||
The convention says "exceptions are reserved for the SDK boundary," but what
|
||||
counts as a boundary? There are 3 categories:
|
||||
|
||||
### 1. Third-party SDK calls
|
||||
|
||||
A try/except that wraps a call to a third-party SDK is the canonical
|
||||
boundary use of the pattern. The catch site converts the SDK's exception
|
||||
to `ErrorInfo` (or re-raises if the function is the public API and a Result
|
||||
is the right return type).
|
||||
|
||||
Recognized third-party SDK modules (partial list):
|
||||
`anthropic`, `google` / `google.genai` / `google.api_core`, `openai`,
|
||||
`groq`, `cohere`, `chromadb`, `sentence_transformers`, `huggingface_hub`,
|
||||
`requests`, `urllib3`, `httpx`, `aiohttp`, `websockets`, `psutil`,
|
||||
`imgui_bundle`, `dearpygui`, `PIL`, `cv2`, `numpy`.
|
||||
|
||||
Recognized third-party exception types (partial list):
|
||||
`anthropic.APIError` / `RateLimitError` / `AuthenticationError`,
|
||||
`google.api_core.exceptions.GoogleAPIError` / `ResourceExhausted`,
|
||||
`openai.OpenAIError` / `APIError` / `RateLimitError`,
|
||||
`requests.RequestException` / `ConnectionError` / `Timeout`,
|
||||
`httpx.HTTPError` / `RequestError`,
|
||||
`chromadb.errors.ChromaError`,
|
||||
`pydantic.ValidationError`.
|
||||
|
||||
### 2. Stdlib I/O that can raise
|
||||
|
||||
File and network I/O via stdlib (`open()`, `os.path.*`, `json.loads()`,
|
||||
`subprocess.run()`, `socket.*`, `sqlite3.*`, `csv.*`, `zipfile.*`,
|
||||
`xml.etree.ElementTree`) commonly raises. Catching the specific exception
|
||||
(`OSError`, `FileNotFoundError`, `PermissionError`,
|
||||
`json.JSONDecodeError`, `subprocess.CalledProcessError`, etc.) at the
|
||||
tool boundary and converting to `ErrorInfo` is compliant.
|
||||
|
||||
This is the "stdlib I/O exception caught in our own code is acceptable"
|
||||
rule. The catch site should be **specific** (`except FileNotFoundError`,
|
||||
not `except Exception`) and should convert to `ErrorInfo`, not swallow.
|
||||
|
||||
### 3. Framework boundaries (FastAPI)
|
||||
|
||||
A try/except or `raise` in a FastAPI `_api_*` handler is the framework
|
||||
boundary. `raise HTTPException(status_code=..., detail=...)` is the
|
||||
FastAPI-idiomatic way to signal an HTTP error; FastAPI converts it to a
|
||||
JSON response at the framework level. This is **not** an exception leak
|
||||
into internal code; it's the framework contract.
|
||||
|
||||
```python
|
||||
# Compliant: FastAPI boundary in _api_* handler
|
||||
async def _api_get_key(controller, header_key: str) -> str:
|
||||
if not _is_valid_key(header_key):
|
||||
raise HTTPException(status_code=403, detail="Could not validate API Key")
|
||||
return header_key
|
||||
|
||||
# Compliant: broad catch + HTTPException at the FastAPI boundary
|
||||
async def _api_generate(controller, payload):
|
||||
try:
|
||||
result = ai_client.send_result(...)
|
||||
return result.data
|
||||
except Exception as e:
|
||||
raise HTTPException(status_code=500, detail=f"AI call failed: {e}")
|
||||
```
|
||||
|
||||
The catch-all `except Exception` is acceptable here **because the
|
||||
conversion is to the framework's exception** (HTTPException), not to a
|
||||
silent swallow. The detail message includes the original error; the
|
||||
HTTP status code is the framework contract.
|
||||
|
||||
### What is NOT a boundary
|
||||
|
||||
- Internal business logic: `try/except` around a `for` loop in a
|
||||
controller method is internal, not boundary.
|
||||
- Cross-method calls within `src/`: calling a method in
|
||||
`app_controller.py` from a method in `app_controller.py` is internal,
|
||||
not boundary.
|
||||
- stdlib I/O that the user controls directly: opening a file the user
|
||||
passed via `--config` is internal; converting the failure should be
|
||||
Result-based, not exception-based.
|
||||
|
||||
---
|
||||
|
||||
## The Broad-Except Distinction
|
||||
|
||||
Anti-pattern #6 says "DON'T catch `except Exception` and silently swallow."
|
||||
But `except Exception` is **not always a violation**. The distinction is
|
||||
**what the catch site does with the exception**:
|
||||
|
||||
| What the catch does | Classification | Convention status |
|
||||
|---|---|---|
|
||||
| `pass` (or no body) | `INTERNAL_SILENT_SWALLOW` | **Violation** |
|
||||
| `print(...)` / `log(...)` only | `INTERNAL_SILENT_SWALLOW` | **Violation** (the data is lost) |
|
||||
| `return None` / `return Optional[T]` | `INTERNAL_OPTIONAL_RETURN` | **Violation** (use `Result[T]`) |
|
||||
| `return Result(data=..., errors=[ErrorInfo(...)])` | `BOUNDARY_CONVERSION` | **Compliant** (the canonical pattern) |
|
||||
| `raise` (re-raise) | `INTERNAL_RETHROW` (or `BOUNDARY_SDK` if at third-party call) | **Suspicious** (often refactorable) |
|
||||
| `raise HTTPException(...)` (in `_api_*` handler) | `BOUNDARY_FASTAPI` | **Compliant** (the framework contract) |
|
||||
|
||||
**The canonical pattern** (in `_result` functions that wrap third-party SDK
|
||||
calls):
|
||||
|
||||
```python
|
||||
def _validate_collection_dim_result(self) -> Result[None]:
|
||||
if self.collection is None or self.collection == "mock":
|
||||
return Result(data=None)
|
||||
try:
|
||||
res = self.collection.get(limit=1, include=["embeddings"])
|
||||
# ... validation logic ...
|
||||
return Result(data=None)
|
||||
except Exception as e:
|
||||
return Result(data=None, errors=[
|
||||
ErrorInfo(kind=ErrorKind.INTERNAL,
|
||||
message=f"Failed to validate collection dim: {e}",
|
||||
source="rag._validate_collection_dim",
|
||||
original=e)
|
||||
])
|
||||
```
|
||||
|
||||
This `except Exception` is **compliant** because the catch + ErrorInfo
|
||||
conversion IS the data-oriented pattern. The `original=e` field preserves
|
||||
the original exception for debugging.
|
||||
|
||||
**The anti-pattern** (in internal code that has nothing to do with a
|
||||
third-party SDK):
|
||||
|
||||
```python
|
||||
# VIOLATION: broad catch + silent swallow
|
||||
try:
|
||||
do_something()
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# VIOLATION: broad catch + log-only (data is lost)
|
||||
try:
|
||||
do_something()
|
||||
except Exception as e:
|
||||
print(f"Error: {e}")
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Constructors Can Raise
|
||||
|
||||
Per the "When to Use This Convention" section, constructors (`__init__`)
|
||||
that fail with programmer errors use `assert` or `raise`. This section
|
||||
elaborates.
|
||||
|
||||
**Compliant constructor raises:**
|
||||
|
||||
```python
|
||||
class MyClass:
|
||||
def __init__(self, config: Config):
|
||||
if config is None:
|
||||
raise ValueError("MyClass requires a non-None Config")
|
||||
if not config.api_key:
|
||||
raise ValueError("MyClass requires a non-empty api_key")
|
||||
self._config = config
|
||||
```
|
||||
|
||||
**Compliant assert (for impossible states):**
|
||||
|
||||
```python
|
||||
def _set_rag_status(self, status: str):
|
||||
# The status string is one of a known set; if it's not, the caller
|
||||
# has a bug.
|
||||
assert status in {"idle", "ready", "syncing", "error"}, f"Unknown status: {status}"
|
||||
self._rag_status = status
|
||||
```
|
||||
|
||||
**The rule:** if the failure is "this object cannot exist without X," raise
|
||||
in `__init__` is the canonical pattern. The Result pattern is for runtime
|
||||
failures ("the network is down"); raise is for programmer errors ("you
|
||||
forgot to pass X").
|
||||
|
||||
**Recognized programmer-error exception types** (per
|
||||
`scripts/audit_exception_handling.py` `INTERNAL_PROGRAMMER_RAISE`
|
||||
category):
|
||||
`AssertionError`, `ValueError`, `KeyError`, `IndexError`, `TypeError`,
|
||||
`AttributeError`, `NameError`, `RuntimeError`, `NotImplementedError`.
|
||||
|
||||
---
|
||||
|
||||
## Re-Raise Patterns
|
||||
|
||||
A `try/except + raise` (without ErrorInfo conversion) is **suspicious** but
|
||||
not always a violation. There are 3 legitimate re-raise patterns:
|
||||
|
||||
### 1. Catch + convert + raise as a different type
|
||||
|
||||
```python
|
||||
# Compliant: convert library error to user-friendly error
|
||||
try:
|
||||
value = json.loads(raw)
|
||||
except json.JSONDecodeError as e:
|
||||
raise ValueError(f"Invalid JSON: {e}") from e
|
||||
```
|
||||
|
||||
The `from e` preserves the original exception in the traceback. The
|
||||
new exception type (`ValueError`) is more meaningful to the caller.
|
||||
|
||||
### 2. Catch + log + re-raise
|
||||
|
||||
```python
|
||||
# Compliant: log before propagating
|
||||
try:
|
||||
do_something()
|
||||
except Exception as e:
|
||||
logger.exception("do_something failed; will propagate")
|
||||
raise
|
||||
```
|
||||
|
||||
The log line provides a record; the re-raise preserves the original
|
||||
control flow. This is appropriate when the failure is severe and the
|
||||
caller should still handle it.
|
||||
|
||||
### 3. Catch + cleanup + re-raise
|
||||
|
||||
```python
|
||||
# Compliant: ensure cleanup before propagating
|
||||
try:
|
||||
resource = acquire()
|
||||
do_something(resource)
|
||||
finally:
|
||||
release(resource) # `finally` is cleaner; `except+raise` is for when
|
||||
# you also need to log or convert
|
||||
```
|
||||
|
||||
Use `try/finally` for the pure cleanup case (no logging/conversion).
|
||||
Use `try/except + re-raise` when you need to log or convert AND ensure
|
||||
cleanup.
|
||||
|
||||
### Suspicious re-raise (often a code smell)
|
||||
|
||||
```python
|
||||
# SUSPICIOUS: catch + re-raise the same exception (no value-add)
|
||||
try:
|
||||
do_something()
|
||||
except Exception:
|
||||
raise
|
||||
```
|
||||
|
||||
This catches an exception, does nothing with it, and re-raises. The
|
||||
`try/except` is dead code; remove it or use a `Result`-based propagation
|
||||
instead.
|
||||
|
||||
The audit script flags this as `INTERNAL_RETHROW` (suspicious). If you
|
||||
see this pattern in code review, ask "is the `try/except` doing anything
|
||||
useful? If not, remove it."
|
||||
|
||||
---
|
||||
|
||||
## Audit Script
|
||||
|
||||
The convention is enforced via
|
||||
`scripts/audit_exception_handling.py`. This is a static analyzer (AST-based)
|
||||
that classifies every `try/except/finally/raise` site in the codebase per
|
||||
the categories in the previous sections.
|
||||
|
||||
**Usage:**
|
||||
|
||||
```bash
|
||||
# Human-readable report
|
||||
uv run python scripts/audit_exception_handling.py
|
||||
|
||||
# JSON output for tooling
|
||||
uv run python scripts/audit_exception_handling.py --json
|
||||
|
||||
# Include tests/ and scripts/
|
||||
uv run python scripts/audit_exception_handling.py --include-tests
|
||||
|
||||
# Top N files (default: 15)
|
||||
uv run python scripts/audit_exception_handling.py --top 20
|
||||
|
||||
# Show every site inline
|
||||
uv run python scripts/audit_exception_handling.py --verbose
|
||||
|
||||
# Strict mode (exit 1 on any violation; for CI use)
|
||||
uv run python scripts/audit_exception_handling.py --strict
|
||||
```
|
||||
|
||||
**"Delete to turn off"** (per `feature_flags.md`): `rm
|
||||
scripts/audit_exception_handling.py` disables the audit. Re-enable by
|
||||
restoring the file (it's tracked in git).
|
||||
|
||||
**Classification categories** (the canonical taxonomy; matches the
|
||||
script's output):
|
||||
|
||||
| Category | Convention status | When |
|
||||
|---|---|---|
|
||||
| `BOUNDARY_SDK` | Compliant | Wraps a third-party SDK call |
|
||||
| `BOUNDARY_IO` | Compliant | Wraps stdlib I/O that can raise |
|
||||
| `BOUNDARY_CONVERSION` | Compliant | Catches and converts to `ErrorInfo` in a `Result` |
|
||||
| `BOUNDARY_FASTAPI` | Compliant | FastAPI `HTTPException` in `_api_*` handler |
|
||||
| `INTERNAL_SILENT_SWALLOW` | **Violation** | `except ...: pass` or just logs |
|
||||
| `INTERNAL_BROAD_CATCH` | **Violation** | `except Exception` without ErrorInfo conversion, in non-`*_result` code |
|
||||
| `INTERNAL_OPTIONAL_RETURN` | **Violation** | `try/except + return None/Optional[T]` |
|
||||
| `INTERNAL_RETHROW` | Suspicious | `try/except + raise` (without ErrorInfo conversion) |
|
||||
| `INTERNAL_PROGRAMMER_RAISE` | Compliant | `raise` for impossible state / precondition |
|
||||
| `INTERNAL_COMPLIANT` | Compliant | `try/finally` (no except) — canonical cleanup |
|
||||
| `UNCLEAR` | Review needed | Can't determine automatically |
|
||||
|
||||
**Output structure:**
|
||||
|
||||
```
|
||||
=== Exception Handling Audit (Data-Oriented Convention) ===
|
||||
|
||||
Files scanned: 65
|
||||
Files with findings: 42
|
||||
Total sites: 348
|
||||
Compliant sites: 80
|
||||
Suspicious sites: 25
|
||||
Violation sites: 211
|
||||
Unclear (review): 32
|
||||
|
||||
--- Baseline (refactored files: mcp_client, ai_client, rag_engine) ---
|
||||
Sites: 112, violations: 77
|
||||
--- Migration target (all other src/ files) ---
|
||||
Sites: 236, violations: 134
|
||||
```
|
||||
|
||||
The **baseline** is the 3 fully-refactored files (the convention reference).
|
||||
The **migration target** is the ~10 unrefactored files in `src/`. The
|
||||
violation count is informational; the user decides which migration-target
|
||||
files warrant a refactor track.
|
||||
|
||||
**Important:** the audit is **informational**, not a CI gate. The script
|
||||
exits 0 by default. Use `--strict` to enable CI-gate mode (exit 1 on any
|
||||
violation). The user is expected to review the report and decide the
|
||||
next action.
|
||||
|
||||
---
|
||||
|
||||
## Migration Playbook
|
||||
|
||||
When converting existing code:
|
||||
|
||||
Reference in New Issue
Block a user