Detection Engine

The detection engine is the backend service that consumes Kafka messages, persists access events to PostgreSQL, runs batch learning scans when asked, and evaluates real-time violations using a composite detector.

Consumers

1. Request log consumer

Subscribes to the configured request-log topic (default sf-events-access). For each message it:

  1. Normalizes the path (including resource ID segments for IDOR-aware paths).
  2. Inserts a RequestLog row (skipped when both user_id and role are empty).
  3. Runs CompositeDetector.DetectAll (internal/detector/composite_detector.go, wired from internal/consumer/request_consumer.go), which evaluates:
    • RBAC / vertical — role vs. learned endpoint mappings (ViolationTypeVerticalIDOR in storage).
    • IDOR / horizontal — requester vs. learned resource ownership (ViolationTypeHorizontalIDOR).
  4. Persists zero or more violations per request (each type is independent).
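
Steps 1 and 2 above can be sketched as follows. This is a minimal illustration, not the engine's actual normalizer: the matching rules (numeric segments become :id, UUID-shaped segments become :uuid) and the names normalizePath and shouldStore are assumptions made for the example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Hypothetical matching rules; the engine's real normalizer may differ.
var (
	numericSegment = regexp.MustCompile(`^[0-9]+$`)
	uuidSegment    = regexp.MustCompile(`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)
)

// normalizePath replaces ID-like segments with placeholders and returns the
// normalized path plus the last resource ID it found ("" if none).
func normalizePath(path string) (normalized, resourceID string) {
	segments := strings.Split(path, "/")
	for i, seg := range segments {
		switch {
		case uuidSegment.MatchString(seg):
			segments[i], resourceID = ":uuid", seg
		case numericSegment.MatchString(seg):
			segments[i], resourceID = ":id", seg
		}
	}
	return strings.Join(segments, "/"), resourceID
}

// shouldStore mirrors the documented skip rule: a RequestLog row is skipped
// only when both user_id and role are empty.
func shouldStore(userID, role string) bool {
	return userID != "" || role != ""
}

func main() {
	path, id := normalizePath("/api/orders/1042")
	fmt.Println(path, id, shouldStore("u-1", ""))
}
```

Keeping the extracted resource ID next to the normalized path lets one pass feed both the RBAC check (keyed by path) and the IDOR check (keyed by path and resource).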

2. Scan consumer

Subscribes to KAFKA_TOPIC_SCAN_REQUESTS (default scan-requests) with consumer group KAFKA_GROUP_ID + "-scan" (see Configuration). It processes messages one at a time (single worker): for each message it runs the RBAC scanner first, then the IDOR scanner sequentially (internal/consumer/scan_consumer.go). Both scans refresh their respective in-memory caches used by the live path.
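
The single-worker ordering can be sketched as a loop over a channel. The scanFunc type and runScanWorker are hypothetical names, and the choice to still run the IDOR scan after an RBAC failure is an assumption of this sketch; the source does not say how errors are handled.

```go
package main

import "fmt"

// scanFunc is a hypothetical signature for one batch scan.
type scanFunc func(payload []byte) error

// runScanWorker drains messages one at a time. For each message it runs the
// RBAC scan first, then the IDOR scan, and returns the number of failed scans.
// Assumption: an RBAC failure does not prevent the IDOR scan from running.
func runScanWorker(messages <-chan []byte, rbac, idor scanFunc) (failures int) {
	for payload := range messages {
		if err := rbac(payload); err != nil {
			failures++
		}
		if err := idor(payload); err != nil {
			failures++
		}
	}
	return failures
}

func main() {
	messages := make(chan []byte, 1)
	messages <- []byte(`{}`)
	close(messages)

	rbac := func([]byte) error { fmt.Println("rbac scan"); return nil }
	idor := func([]byte) error { fmt.Println("idor scan"); return nil }
	fmt.Println("failures:", runScanWorker(messages, rbac, idor))
}
```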

Processing Flow

Scan request message (ScanRequestMessage)

The JSON body on scan-requests messages is optional. An empty payload uses the environment defaults, and a payload that fails to unmarshal falls back to the same defaults.

| Field | Type | Applies to | Description |
| --- | --- | --- | --- |
| request_id | string | | Correlation id (carried in the message for operators/dashboards). |
| learning_window_days | int | RBAC + IDOR | Overrides learning window for both scanners when set. |
| violation_threshold_percent | float64 | RBAC only | Minimum role traffic share (%) to treat a role as allowed for an endpoint. |
| minimum_sample_size | int | RBAC + IDOR | RBAC: minimum total requests per endpoint. IDOR: minimum total accesses per resource before learning. |
| resource_dominance_percent | float64 | IDOR only | Minimum share (0–100) of accesses by the top user vs. total for that resource to create or update a user_resource_mapping. |

Ignored field: confirmation_threshold (deprecated).
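
A sketch of how such a payload might be decoded and merged over defaults. The JSON keys come from the table above; the Go field names, the use of pointer fields to detect "not set", and the RBACParams and resolveRBACParams helpers are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ScanRequestMessage mirrors the documented JSON fields. Pointer fields
// distinguish "not set" from an explicit zero (an assumption of this sketch).
type ScanRequestMessage struct {
	RequestID                 string   `json:"request_id"`
	LearningWindowDays        *int     `json:"learning_window_days"`
	ViolationThresholdPercent *float64 `json:"violation_threshold_percent"`
	MinimumSampleSize         *int     `json:"minimum_sample_size"`
	ResourceDominancePercent  *float64 `json:"resource_dominance_percent"`
}

// RBACParams is a hypothetical holder for the effective RBAC scan settings.
type RBACParams struct {
	LearningWindowDays        int
	ViolationThresholdPercent float64
	MinimumSampleSize         int
}

// resolveRBACParams applies message overrides on top of defaults. An empty
// or malformed payload leaves the defaults untouched.
func resolveRBACParams(payload []byte, defaults RBACParams) RBACParams {
	if len(payload) == 0 {
		return defaults
	}
	var msg ScanRequestMessage
	if err := json.Unmarshal(payload, &msg); err != nil {
		return defaults
	}
	out := defaults
	if msg.LearningWindowDays != nil {
		out.LearningWindowDays = *msg.LearningWindowDays
	}
	if msg.ViolationThresholdPercent != nil {
		out.ViolationThresholdPercent = *msg.ViolationThresholdPercent
	}
	if msg.MinimumSampleSize != nil {
		out.MinimumSampleSize = *msg.MinimumSampleSize
	}
	return out
}

func main() {
	defaults := RBACParams{LearningWindowDays: 90, ViolationThresholdPercent: 5, MinimumSampleSize: 100}
	payload := []byte(`{"request_id":"scan-1","learning_window_days":30,"confirmation_threshold":3}`)
	fmt.Printf("%+v\n", resolveRBACParams(payload, defaults))
}
```

Because encoding/json ignores unknown keys by default, a leftover confirmation_threshold in the payload is dropped without an error.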

RBAC learning algorithm (batch scanner)

The RBAC scanner (internal/scanner/scanner.go) recomputes endpoint → allowed roles from request_logs inside the learning window:

For each normalized_path:
    Aggregate request counts per role (roles and paths must be non-null).
    If total_requests for that path >= minimum_sample_size:
        For each role on that path:
            If (role_requests / total_requests) * 100 >= violation_threshold_percent:
                Upsert that role as an active allowed mapping for the path
            Else:
                That role is not learned as allowed (no row for it in the active mapping set)

Results are written to the endpoint mapping store and the mapping cache is refreshed for real-time RBAC checks.
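
The threshold rule can be illustrated in memory as below. This is a sketch under assumptions: the real scanner aggregates from request_logs in PostgreSQL, and learnAllowedRoles with its map-based input is a hypothetical stand-in for that query.

```go
package main

import (
	"fmt"
	"sort"
)

// learnAllowedRoles takes request counts keyed by normalized path and role,
// and returns the roles learned as allowed for each path.
func learnAllowedRoles(counts map[string]map[string]int, minSample int, thresholdPercent float64) map[string][]string {
	allowed := make(map[string][]string)
	for path, byRole := range counts {
		total := 0
		for _, n := range byRole {
			total += n
		}
		// Below the sample size the path stays in learning mode: no rules.
		if total == 0 || total < minSample {
			continue
		}
		for role, n := range byRole {
			// Same test as (n / total) * 100 >= threshold, written without
			// division so whole-number percentages compare exactly.
			if float64(n)*100 >= thresholdPercent*float64(total) {
				allowed[path] = append(allowed[path], role)
			}
		}
		sort.Strings(allowed[path])
	}
	return allowed
}

func main() {
	counts := map[string]map[string]int{
		"/api/admin/users": {"admin": 97, "user": 3},
		"/api/orders/:id":  {"user": 40},
	}
	fmt.Println(learnAllowedRoles(counts, 100, 5))
}
```

With the defaults (minimum_sample_size 100, violation_threshold_percent 5), only admin is learned for /api/admin/users, because user holds a 3% share. /api/orders/:id has 40 requests, so it produces no rules yet.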

IDOR learning (batch scanner)

The IDOR scanner (internal/scanner/idor_scanner.go) learns resource ownership into user_resource_mappings. It considers only paths whose normalized form contains :id or :uuid, and groups their accesses by normalized_path, extracted resource_id, and user_id. For each resource it selects the single user with the highest access count (a tie for the top count skips learning for that resource) and upserts a row only if that user’s share of total accesses is at least resource_dominance_percent (e.g. 95%). The confirmed column is operator-only (set from the dashboard): new rows start as false, rescans preserve an existing confirmed value, and the column does not affect real-time alerting.

All stored mappings are loaded into the resource cache for horizontal IDOR checks.
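
The per-resource ownership decision can be sketched as a pure function. The name learnOwner and the in-memory map input are assumptions; the tie rule, sample-size gate, and dominance check follow the description above.

```go
package main

import "fmt"

// learnOwner returns the dominant user for one resource, or ok=false when
// nothing should be learned (too few accesses, a tie, or weak dominance).
func learnOwner(accessByUser map[string]int, minSample int, dominancePercent float64) (owner string, ok bool) {
	total, best, tied := 0, 0, false
	for user, n := range accessByUser {
		total += n
		switch {
		case n > best:
			owner, best, tied = user, n, false
		case n == best:
			tied = true // another user shares the current top count
		}
	}
	if total == 0 || total < minSample || tied {
		return "", false
	}
	// Same test as (best / total) * 100 < dominancePercent, without division.
	if float64(best)*100 < dominancePercent*float64(total) {
		return "", false
	}
	return owner, true
}

func main() {
	fmt.Println(learnOwner(map[string]int{"alice": 98, "bob": 2}, 2, 95))
	fmt.Println(learnOwner(map[string]int{"alice": 5, "bob": 5}, 2, 95))
}
```

In the first call alice holds 98% of accesses and is learned as owner. In the second the tie means no mapping is written.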

DefaultIDORScanParams (defaults before merging env / message overrides):

| Parameter | Default | Description |
| --- | --- | --- |
| LearningWindowDays | 90 | Days of request_logs considered (also initialized from LEARNING_WINDOW_DAYS in main). |
| ResourceDominancePercent | 95 | Minimum top-user access share (%) vs. total for that resource. |
| MinimumSampleSize | 2 | Minimum total accesses per resource before learning. |

Violation detection (real-time)

After a request is stored, CompositeDetector.DetectAll may append both an RBAC/vertical and an IDOR/horizontal violation for the same request if both checks fail. Each violation is inserted independently.

Roughly:

  1. RBAC — If the endpoint has learned rules and the request’s role is not among allowed roles, emit a vertical-style violation.
  2. IDOR — If a learned owner exists in user_resource_mappings for the path/resource and the requester is not that owner, emit a horizontal-style violation (independent of confirmed).

If an endpoint is still below minimum_sample_size for RBAC learning, no RBAC rules exist yet for that path (learning mode). IDOR behavior depends on existing resource mappings and path shape.
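
A minimal sketch of the two independent checks, assuming in-memory caches shaped as plain maps. The Detector and Request types and the string values of the violation constants are hypothetical; only the names CompositeDetector.DetectAll, ViolationTypeVerticalIDOR, and ViolationTypeHorizontalIDOR come from the text.

```go
package main

import "fmt"

type ViolationType string

// Constant names follow the document; the string values are assumptions.
const (
	ViolationTypeVerticalIDOR   ViolationType = "vertical_idor"
	ViolationTypeHorizontalIDOR ViolationType = "horizontal_idor"
)

// Request is a hypothetical view of a stored access event.
type Request struct {
	UserID, Role, NormalizedPath, ResourceID string
}

type resourceKey struct{ path, resourceID string }

// Detector holds simplified stand-ins for the mapping and resource caches.
type Detector struct {
	allowedRoles map[string]map[string]bool // path -> set of allowed roles
	owners       map[resourceKey]string     // (path, resource) -> learned owner
}

// DetectAll runs both checks independently; a request can yield 0, 1, or 2
// violations. Paths or resources with nothing learned produce no violation.
func (d *Detector) DetectAll(r Request) []ViolationType {
	var violations []ViolationType
	if roles, learned := d.allowedRoles[r.NormalizedPath]; learned && !roles[r.Role] {
		violations = append(violations, ViolationTypeVerticalIDOR)
	}
	key := resourceKey{r.NormalizedPath, r.ResourceID}
	if owner, learned := d.owners[key]; learned && owner != r.UserID {
		violations = append(violations, ViolationTypeHorizontalIDOR)
	}
	return violations
}

func main() {
	d := &Detector{
		allowedRoles: map[string]map[string]bool{"/api/invoices/:id": {"admin": true}},
		owners:       map[resourceKey]string{{"/api/invoices/:id", "77"}: "alice"},
	}
	fmt.Println(d.DetectAll(Request{UserID: "bob", Role: "user", NormalizedPath: "/api/invoices/:id", ResourceID: "77"}))
}
```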

Configurable parameters (environment)

| Parameter | Default | Description |
| --- | --- | --- |
| learning_window_days | 90 | Default learning window (RBAC scan, IDOR scan seed from env). |
| violation_threshold_percent | 5 | RBAC: minimum percentage for an allowed role. |
| minimum_sample_size | 100 | RBAC: minimum requests per endpoint before rules. (IDOR scan defaults use 2 from DefaultIDORScanParams unless overridden by a scan message.) |
| idor_resource_dominance_percent | 95 | IDOR: minimum top-user share (%) to learn a user_resource_mapping (ResourceDominancePercent). |

Configuration

| Variable | Default | Description |
| --- | --- | --- |
| DATABASE_URL | | PostgreSQL connection string |
| KAFKA_BROKERS | localhost:9092 | Kafka brokers |
| KAFKA_TOPIC_REQUEST_LOGS | sf-events-access | Topic for access events (request consumer) |
| KAFKA_TOPIC_SCAN_REQUESTS | scan-requests | Topic for batch scan triggers (scan consumer) |
| KAFKA_GROUP_ID | detection-engine | Consumer group for the request consumer; scan consumer uses {KAFKA_GROUP_ID}-scan |
| LEARNING_WINDOW_DAYS | 90 | Learning window in days |
| VIOLATION_THRESHOLD_PERCENT | 5 | Minimum percentage for allowed role (RBAC scanner) |
| MINIMUM_SAMPLE_SIZE | 100 | Minimum requests per endpoint for RBAC learning |
| IDOR_RESOURCE_DOMINANCE_PERCENT | 95 | IDOR: minimum dominant user access share (%) per resource |
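
One way these variables might be loaded with the documented defaults is sketched below. The Config struct, loadConfig, and the helpers are hypothetical. Falling back to the default when a numeric value fails to parse is also an assumption; the engine may instead reject invalid values.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Config is a hypothetical holder for the documented settings.
type Config struct {
	DatabaseURL, KafkaBrokers, RequestLogTopic, ScanTopic string
	GroupID, ScanGroupID                                  string
	LearningWindowDays, MinimumSampleSize                 int
	ViolationThresholdPercent                             float64
	IDORResourceDominancePercent                          float64
}

// loadConfig reads settings through getenv (os.Getenv in production), which
// keeps the function testable without touching the process environment.
func loadConfig(getenv func(string) string) Config {
	str := func(key, fallback string) string {
		if v := getenv(key); v != "" {
			return v
		}
		return fallback
	}
	num := func(key string, fallback int) int {
		if v, err := strconv.Atoi(getenv(key)); err == nil {
			return v
		}
		return fallback // unset or unparsable: keep the default
	}
	pct := func(key string, fallback float64) float64 {
		if v, err := strconv.ParseFloat(getenv(key), 64); err == nil {
			return v
		}
		return fallback
	}
	groupID := str("KAFKA_GROUP_ID", "detection-engine")
	return Config{
		DatabaseURL:                  getenv("DATABASE_URL"),
		KafkaBrokers:                 str("KAFKA_BROKERS", "localhost:9092"),
		RequestLogTopic:              str("KAFKA_TOPIC_REQUEST_LOGS", "sf-events-access"),
		ScanTopic:                    str("KAFKA_TOPIC_SCAN_REQUESTS", "scan-requests"),
		GroupID:                      groupID,
		ScanGroupID:                  groupID + "-scan",
		LearningWindowDays:           num("LEARNING_WINDOW_DAYS", 90),
		MinimumSampleSize:            num("MINIMUM_SAMPLE_SIZE", 100),
		ViolationThresholdPercent:    pct("VIOLATION_THRESHOLD_PERCENT", 5),
		IDORResourceDominancePercent: pct("IDOR_RESOURCE_DOMINANCE_PERCENT", 95),
	}
}

func main() {
	fmt.Printf("%+v\n", loadConfig(os.Getenv))
}
```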