Plausible but wrong output
Generated work looks right, reads confidently, and quietly isn't.
Response
Validation, source checks, tests, and human review.
Scope drift
The change keeps expanding until review gets hard and intent gets murky.
Response
Small slices, explicit boundaries, and one definition of done.
Silent automation
Work happens invisibly and nobody notices until it's already gone wrong.
Response
Visible state, approvals, logs, and notifications.
Excessive permissions
An agent can touch far more than its job requires.
Response
Least privilege, narrow tool access, and isolated credentials.
Untrusted instructions in data
Retrieved or user-supplied content tries to redirect the system.
Response
Treat retrieved or user-supplied content as data — not authority over the system.
Missing provenance
Conclusions arrive with no way to tell what they were based on.
Response
Keep source references and distinguish generated conclusions from verified facts.
Context leakage
Sensitive information ends up somewhere it was never meant to go.
Response
Approved data boundaries and deliberate handling of sensitive information.
Fragile vendor coupling
Provider- or model-specific quirks leak into the whole system.
Response
Keep model- and provider-specific details behind explicit integration boundaries.
No recovery path
Something breaks and there's no clean way to stop or undo it.
Response
Dry runs, throttles, idempotency where practical, and an obvious off switch.