
Severity model

AI.Sentinel uses a five-level severity scale and aggregates per-detector severities into a numeric Threat Risk Score (0–100) that drives intervention decisions and dashboard visualization.

The Severity enum

public enum Severity
{
    None = 0,   // detector ran, no threat; DetectionResult.IsClean == true
    Low,
    Medium,
    High,
    Critical,
}
| Severity | Use when |
| --- | --- |
| Critical | Active exploitation, data exfiltration, credential leak |
| High | Likely threat with high confidence (e.g., direct injection phrase match) |
| Medium | Suspicious pattern with moderate confidence |
| Low | Anomaly worth flagging but probably benign |
| None | No threat (DetectionResult.IsClean == true) |

Threat Risk Score (0–100)

Each detector emits a severity. The pipeline computes a per-detector score and aggregates:

| Severity | Per-detector score |
| --- | --- |
| Critical | 100 |
| High | 70 |
| Medium | 40 |
| Low | 15 |
| None | 0 |
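As a minimal sketch, the table above can be expressed as a lookup. The `SeverityScores` class and `ToScore` method are hypothetical names used only for illustration, not part of the AI.Sentinel API; the enum is a local stand-in for the one shown earlier.

```csharp
using System;

// Local stand-in mirroring the Severity enum documented on this page.
public enum Severity { None = 0, Low, Medium, High, Critical }

// Hypothetical helper (not an AI.Sentinel type): maps a severity
// to the per-detector score listed in the table.
public static class SeverityScores
{
    public static int ToScore(Severity severity) => severity switch
    {
        Severity.Critical => 100,
        Severity.High => 70,
        Severity.Medium => 40,
        Severity.Low => 15,
        Severity.None => 0,
        _ => throw new ArgumentOutOfRangeException(nameof(severity)),
    };
}
```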

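The aggregate is not a simple sum. It is a saturating max with decay: the highest-severity detection dominates, and additional lower-severity detections contribute only attenuated weight, so a single Critical doesn't compound with multiple Mediums into noise. Conceptually: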

Score = max-with-attenuation over firing detectors

The result is capped at 100 and rounded to an integer. The dashboard's gauge maps the 0–100 score to four bands:

| Band | Range | UI color |
| --- | --- | --- |
| SAFE | 0–14 | green |
| WATCH | 15–39 | yellow |
| ALERT | 40–69 | orange |
| ISOLATE | 70–100 | red |
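The aggregation and band mapping can be illustrated with a rough sketch. The exact attenuation formula is not given on this page, so the one below is an assumption: the highest score is taken as the base, and each further score fills a geometrically shrinking share of the remaining headroom. The `Decay` value of 0.25, the `ThreatRiskSketch` class, and both method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch, not the AI.Sentinel implementation.
public static class ThreatRiskSketch
{
    // ASSUMPTION: illustrative attenuation factor; the real value is not documented here.
    private const double Decay = 0.25;

    // Combines per-detector scores (0-100) into one capped, rounded score.
    public static int Aggregate(IEnumerable<int> perDetectorScores)
    {
        var ordered = perDetectorScores
            .Where(s => s > 0)
            .OrderByDescending(s => s)
            .ToList();
        if (ordered.Count == 0) return 0;

        double score = ordered[0];   // the strongest detection dominates
        double weight = Decay;
        foreach (var s in ordered.Skip(1))
        {
            // Each extra detection fills a shrinking fraction of the
            // headroom left below 100, so the total saturates.
            score += (100 - score) * (s / 100.0) * weight;
            weight *= Decay;
        }
        return (int)Math.Round(Math.Min(score, 100.0));
    }

    // Maps a score to the dashboard band from the table above.
    public static string ToBand(int score) => score switch
    {
        < 15 => "SAFE",
        < 40 => "WATCH",
        < 70 => "ALERT",
        _ => "ISOLATE",
    };
}
```

Under this assumed formula, a Critical (100) leaves no headroom, so any number of additional Mediums keeps the score at 100, while a lone High stays at exactly 70.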

How severity flows

[1] Detector emits DetectionResult { Severity = Low, ... }

[2] Configure<T>(c => c.SeverityFloor = High) clamps Low → High

[3] LLM escalation may further adjust (Medium → Critical, etc.)

[4] Engine reads MaxSeverity from PipelineResult.Detections

[5] Engine maps severity → SentinelAction via opts.OnXxx properties

[6] AuditEntry records the post-clamp severity

Because of the clamp pass, audit entries reflect policy-applied severity, not raw detector output. If you need both, either wait for the DetectionResult clamp annotation (currently on the backlog) or derive the pre-clamp severity from the detector source.

Where each detector falls

Every detector has a typical severity range. Some pin a fixed severity to each pattern class (SEC-23 PiiLeakage emits Critical for credit cards and Medium for phone numbers, so its output is pattern-class-driven). Others span Low to High based on their semantic-similarity bucket (any SemanticDetectorBase subclass: High if a high-bucket example matches at >0.90 cosine, Medium at >0.82, Low at >0.75, otherwise Clean).

| Detector type | Severity behavior |
| --- | --- |
| Rule-based, single pattern class | One severity per detector |
| Rule-based, multi-pattern (e.g., PII) | Pattern-driven; different patterns emit different severities |
| Semantic (SemanticDetectorBase) | Bucket-driven: HighThreshold (default 0.90) → High, MediumThreshold (0.82) → Medium, LowThreshold (0.75) → Low |
| LLM escalation | Initial rule-based hit, then LLM may downgrade or upgrade |

See the detector reference pages for per-detector severity guidance.
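The bucket-driven behavior of semantic detectors can be sketched as a threshold check. This is an illustration only: `SemanticBuckets.Classify` is a hypothetical helper, and it assumes the caller already has the best cosine similarity against the examples in each bucket. The default thresholds are the ones quoted above; how SemanticDetectorBase computes and compares similarities internally is not shown on this page.

```csharp
// Local stand-in mirroring the Severity enum documented on this page.
public enum Severity { None = 0, Low, Medium, High, Critical }

// Hypothetical helper, not part of SemanticDetectorBase.
public static class SemanticBuckets
{
    // Each argument is the best cosine similarity found against the
    // examples of that bucket (ASSUMPTION about how inputs are shaped).
    public static Severity Classify(
        double highBucketSimilarity,
        double mediumBucketSimilarity,
        double lowBucketSimilarity,
        double highThreshold = 0.90,
        double mediumThreshold = 0.82,
        double lowThreshold = 0.75)
    {
        // Check the most severe bucket first; comparisons are strict (>).
        if (highBucketSimilarity > highThreshold) return Severity.High;
        if (mediumBucketSimilarity > mediumThreshold) return Severity.Medium;
        if (lowBucketSimilarity > lowThreshold) return Severity.Low;
        return Severity.None; // clean: no bucket matched
    }
}
```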

Action mapping

opts.OnCritical = SentinelAction.Quarantine;
opts.OnHigh = SentinelAction.Alert;
opts.OnMedium = SentinelAction.Log;
opts.OnLow = SentinelAction.Log;

The intervention engine looks up the action for the maximum severity across firing detectors. If three detectors fire (Low, Medium, High) on a single call, the engine applies OnHigh = Alert. The other detections still appear in audit, but only the max-severity action is taken.
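The max-severity lookup can be sketched as follows. The types here are local stand-ins rather than the library's own: `SentinelAction` lists only the members named on this page, and `ActionLookup.Resolve` is a hypothetical helper. Returning `null` when nothing fires is an assumption made for the sketch.

```csharp
using System.Collections.Generic;
using System.Linq;

// Local stand-ins; the real types live in AI.Sentinel.
public enum Severity { None = 0, Low, Medium, High, Critical }
public enum SentinelAction { Log, Alert, Quarantine }

// Hypothetical helper illustrating "only the max-severity action is taken".
public static class ActionLookup
{
    public static SentinelAction? Resolve(
        IEnumerable<Severity> detections,
        IReadOnlyDictionary<Severity, SentinelAction> actions)
    {
        // Clean results (Severity.None) never trigger an action.
        var firing = detections.Where(s => s != Severity.None).ToList();
        if (firing.Count == 0) return null; // ASSUMPTION: no action when nothing fires

        // Enum values are ordered None < Low < Medium < High < Critical.
        var max = firing.Max();
        return actions[max];
    }
}
```

With the mapping shown earlier (Critical → Quarantine, High → Alert, Medium and Low → Log), detections of Low, Medium, and High resolve to Alert.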

Tuning per detector

Configure<T>(c => c.SeverityFloor / c.SeverityCap) lets you reshape a detector's severity output without changing detector code:

// JailbreakDetector might emit Low for borderline matches — promote to High so it triggers Alert/Quarantine
opts.Configure<JailbreakDetector>(c => c.SeverityFloor = Severity.High);

// RepetitionLoopDetector is noisy on legitimate code-generation responses — clamp to Low so it just logs
opts.Configure<RepetitionLoopDetector>(c => c.SeverityCap = Severity.Low);

// WrongLanguageDetector is irrelevant in a multilingual app — disable entirely
opts.Configure<WrongLanguageDetector>(c => c.Enabled = false);

Floor and Cap apply only to firing results — Clean results pass through unchanged. You can't fabricate a detection by setting Floor = High on a non-firing detector.
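A minimal sketch of the clamp rule, assuming Floor is applied before Cap (the page does not state the order when both are set). `SeverityClamp.Apply` is a hypothetical helper, not the library's implementation.

```csharp
// Local stand-in mirroring the Severity enum documented on this page.
public enum Severity { None = 0, Low, Medium, High, Critical }

// Hypothetical helper showing how a floor and a cap reshape a firing result.
public static class SeverityClamp
{
    public static Severity Apply(Severity emitted, Severity? floor = null, Severity? cap = null)
    {
        // Clean results pass through unchanged: a floor cannot fabricate a detection.
        if (emitted == Severity.None) return emitted;

        var result = emitted;
        if (floor.HasValue && result < floor.Value) result = floor.Value; // raise to floor
        if (cap.HasValue && result > cap.Value) result = cap.Value;       // lower to cap (ASSUMPTION: cap wins if both conflict)
        return result;
    }
}
```

For example, `Apply(Severity.Low, floor: Severity.High)` yields High, while `Apply(Severity.None, floor: Severity.High)` stays None.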

See the Configure<T> page for the full rules and examples.

Severity in the API

| Surface | What you see |
| --- | --- |
| DetectionResult.Severity | What the detector emitted (post-clamp) |
| PipelineResult.MaxSeverity | Highest severity among firing detectors |
| PipelineResult.Score | Aggregate ThreatRiskScore (0–100) |
| AuditEntry.Severity | What the entry records; same as DetectionResult.Severity |
| SentinelException.PipelineResult.MaxSeverity | Quarantine action carries this |
| Dashboard gauge | Maps Score to SAFE/WATCH/ALERT/ISOLATE bands |

Defaults

If you don't configure OnCritical/OnHigh/OnMedium/OnLow at all, every action is Log. That's the conservative default — the framework won't break your app the moment you wire it up; you opt into stricter actions explicitly.

A reasonable production starting point:

opts.OnCritical = SentinelAction.Quarantine;
opts.OnHigh = SentinelAction.Alert;
opts.OnMedium = SentinelAction.Log;
opts.OnLow = SentinelAction.Log;

Tune from there as you learn which detectors fire frequently in your domain — disable the noisy ones, clamp the borderline ones, and let the high-confidence threats reach the action tier they deserve.