Operational detectors (15)

Operational detectors flag UX and quality issues — repetition loops, truncated output, blank responses, placeholder text, persona drift, sycophancy, language switches. They aren't security threats; they're signals that something is wrong with the model's output that affects user experience or downstream automation.

Reference

| ID | Detector | Type | Detects |
| --- | --- | --- | --- |
| OPS-01 | BlankResponseDetector | Rule-based | Empty or whitespace-only responses |
| OPS-02 | RepetitionLoopDetector | Rule-based | Same sentence repeated 3+ times |
| OPS-03 | IncompleteCodeBlockDetector | Rule-based | Unclosed code fences |
| OPS-04 | PlaceholderTextDetector | Rule-based | TODO, [INSERT HERE], Lorem ipsum leftovers |
| OPS-05 | ContextCollapseDetector | Semantic | Loss of conversational context across turns |
| OPS-06 | AgentProbingDetector | Semantic | Attempts to map agent capabilities or system prompt |
| OPS-07 | QueryIntentDetector | Semantic | Malicious intent hidden in benign-looking queries |
| OPS-08 | ResponseCoherenceDetector | Semantic | Response that doesn't address the question asked |
| OPS-09 | TruncatedOutputDetector | Rule-based | Mid-sentence truncation and unclosed code fences |
| OPS-10 | WaitingForContextDetector | Semantic | Stall phrases when the user prompt was substantive |
| OPS-11 | UnboundedConsumptionDetector | Rule-based | Compares response length to prompt length; flags unbounded expansion (OWASP LLM04) |
| OPS-12 | SemanticRepetitionDetector | Semantic | Same idea restated with different wording; extends RepetitionLoop (OPS-02) beyond literal string matching |
| OPS-13 | PersonaDriftDetector | Semantic | Tone, persona, or stated identity shifts significantly across turns; a context poisoning signal |
| OPS-14 | SycophancyDetector | Semantic | Model reverses a stated position purely because the user pushed back (epistemic cowardice) |
| OPS-15 | WrongLanguageDetector | Rule-based | Response language doesn't match the user's language (script / charset detection) |
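
To make the rule-based rows concrete, here is a minimal sketch of the kind of check the OPS-02 row describes ("same sentence repeated 3+ times"). It is not the library's RepetitionLoopDetector: the class name, the naive sentence splitting, and the case-insensitive comparison are all assumptions chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration only; not the library's RepetitionLoopDetector.
public static class RepetitionSketch
{
    // Returns true when any sentence occurs at least `threshold` times.
    public static bool HasRepetitionLoop(string response, int threshold = 3)
    {
        if (string.IsNullOrWhiteSpace(response)) return false;

        // Assumption: split naively on terminal punctuation and newlines,
        // then compare sentences case-insensitively after trimming.
        var sentences = response
            .Split(new[] { '.', '!', '?', '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(s => s.Trim().ToLowerInvariant())
            .Where(s => s.Length > 0);

        var counts = new Dictionary<string, int>();
        foreach (var sentence in sentences)
        {
            counts.TryGetValue(sentence, out var seen); // seen is 0 when absent
            counts[sentence] = seen + 1;
            if (seen + 1 >= threshold) return true;
        }
        return false;
    }
}
```

A literal check like this misses paraphrased repetition, which is the gap the table attributes to the semantic OPS-12 detector.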

Severity guidance

Operational issues rarely warrant Quarantine. Default routing:

opts.OnHigh = SentinelAction.Alert; // OPS-01 BlankResponse, OPS-13 PersonaDrift
opts.OnMedium = SentinelAction.Log; // most OPS-* fire here
opts.OnLow = SentinelAction.Log;

Some operational detectors are noisy by design — they cast a wide net. Disable or clamp the ones that don't fit your domain:

// Code-generation app — false positives on incomplete fences during streaming
opts.Configure<IncompleteCodeBlockDetector>(c => c.Enabled = false);

// Multilingual app — wrong-language is expected
opts.Configure<WrongLanguageDetector>(c => c.Enabled = false);

// RAG with long context — semantic repetition is by design
opts.Configure<SemanticRepetitionDetector>(c => c.SeverityCap = Severity.Low);

OPS-09 vs OPS-03 — TruncatedOutput vs IncompleteCodeBlock

TruncatedOutputDetector (OPS-09) is the broader signal — it flags any mid-sentence cutoff plus unclosed fences. IncompleteCodeBlockDetector (OPS-03) is the narrower fence-only check. If both fire on the same response, that's a hard truncation signal. If only OPS-09 fires, the prose is mid-sentence; if only OPS-03 fires, the prose is fine but a code block is open.

For most apps, leave both enabled and route at Medium so audit captures the signal without blocking.
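
The four combinations described above can be written down as a small decision function. This is a sketch for downstream triage code, assuming you already know which of the two detectors fired on a response; the enum and method names are hypothetical and not part of the library.

```csharp
// Hypothetical triage helper; names are illustrative, not library API.
public enum TruncationReading
{
    None,           // neither detector fired
    ProseCutOff,    // only OPS-09: prose stops mid-sentence
    OpenCodeBlock,  // only OPS-03: prose is fine, a code fence is unclosed
    HardTruncation  // both fired: treat as a hard truncation
}

public static class TruncationSketch
{
    public static TruncationReading Interpret(bool ops09Fired, bool ops03Fired)
    {
        if (ops09Fired && ops03Fired) return TruncationReading.HardTruncation;
        if (ops09Fired) return TruncationReading.ProseCutOff;
        if (ops03Fired) return TruncationReading.OpenCodeBlock;
        return TruncationReading.None;
    }
}
```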

OPS-11 UnboundedConsumption — DoS prevention

This one can warrant Alert or Quarantine. The detector compares response length to prompt length and flags ratios that suggest the model is being driven to emit unbounded output (for example, "write me 10,000 words about X" followed by 50,000 words of output). This is a token cost amplification signal, tagged here as OWASP LLM04 (Model Denial of Service in the 2023 OWASP Top 10 for LLM Applications; the 2025 edition lists Unbounded Consumption as LLM10).

Tune the threshold by subclassing or routing aggressively:

opts.OnHigh = SentinelAction.Quarantine; // block runaway responses; this routes every High finding, not only OPS-11
opts.Configure<UnboundedConsumptionDetector>(c => c.SeverityFloor = Severity.High);
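
For intuition about the length comparison this section describes, here is a minimal sketch of a response-to-prompt ratio check. It is not the library's UnboundedConsumptionDetector: the class name, the character-based measurement, and both thresholds (`maxRatio`, `minResponseLength`) are assumptions picked for illustration, and the real detector may measure tokens or use different limits.

```csharp
using System;

// Hypothetical illustration only; not the library's UnboundedConsumptionDetector.
public static class ConsumptionSketch
{
    // Response length divided by prompt length, measured in characters.
    public static double ExpansionRatio(string prompt, string response)
    {
        // Clamp the divisor to 1 so an empty prompt cannot divide by zero.
        int promptLength = Math.Max(1, (prompt ?? string.Empty).Length);
        return (double)(response ?? string.Empty).Length / promptLength;
    }

    // Assumed thresholds: ignore short responses, flag large expansion ratios.
    public static bool LooksUnbounded(
        string prompt,
        string response,
        double maxRatio = 50.0,
        int minResponseLength = 2000)
    {
        int responseLength = (response ?? string.Empty).Length;
        return responseLength >= minResponseLength
            && ExpansionRatio(prompt, response) > maxRatio;
    }
}
```

The minimum-length guard is a design choice in this sketch: without it, a one-word prompt answered by a normal paragraph would already show a large ratio.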

OPS-13 PersonaDrift — context poisoning canary

Persona drift is a low-frequency, high-signal detector. When the model's stated identity, tone, or role shifts across turns of a session, that often means something is poisoning the conversation context — prompt injection from a tool result, retrieval-augmented data, or a user successfully jailbreaking earlier in the conversation. Pair with SEC-09 IndirectInjection and SEC-31 VectorRetrievalPoisoning for a defense-in-depth view.