Status: Accepted
Date: 2026-05-11
Deciders: Engineering team
The chat feature accepts free-text input from anonymous users and forwards it to an LLM. Without input guardrails, the system is exposed to prompt injection, jailbreak attempts, toxic content, and sensitive data leakage. A guardrail strategy had to be chosen.
| Dimension | Assessment |
|---|---|
| Implementation cost | Zero |
| Risk | High — open to prompt injection, jailbreaks, and abuse |
| Acceptable | No |
| Dimension | Assessment |
|---|---|
| Implementation cost | Low initially, high to maintain |
| Coverage | Poor — easily bypassed with paraphrasing or encoding |
| Customisation | Full control |
| Maintenance | Ongoing — threat landscape evolves constantly |
| Dimension | Assessment |
|---|---|
| Implementation cost | Low — add a moderation prompt |
| Latency | High — doubles the LLM call latency |
| Reliability | Moderate — same model can be manipulated |
| Cost | Doubles inference token usage |
| Dimension | Assessment |
|---|---|
| Implementation cost | Low — single HTTP call, thin wrapper |
| Coverage | High — purpose-built threat detection with multiple sensor presets |
| Free tier | Yes — usable without upfront cost commitment |
| Customisation | Yes — custom sensors available for domain-specific threats |
| Latency | ~500–1,500 ms per check (acceptable, runs before LLM call) |
| Fail-open | Implemented — Stihia outage does not block users |
| Maintenance | None — threat models maintained by Stihia |
Stihia (api.stihia.ai,
POST /v1/sense) as the input guardrail service.
Key drivers:
POST /v1/sense call. The DISABLE_GUARDRAILS
environment variable provides a clean escape hatch for local development
without a Stihia key.STIHIA_API_KEY must be provisioned in all environments
(see README — Environment variables).high or
critical return a 403 to the browser; the LLM
is never called.low or medium severity are
allowed through — this threshold can be adjusted in
StihiaService.cs if stricter moderation is required.default-output
Stihia sensor if needed.