Risk Score Engine

The BotBye Risk Score Engine is a fully configurable, real-time risk assessment system that goes beyond binary bot detection. It evaluates every request across multiple fraud classes - account takeover, abuse, bot activity, or any custom class you define - and returns a unified decision: ALLOW, CHALLENGE, or BLOCK.

Unlike static rule sets that require engineering involvement for every change, the Risk Score Engine puts control in the hands of fraud and security teams. You define what to measure, what thresholds to set, and what actions to take - all without code changes or redeployments.

How It Works

Every protection event you send (login, registration, transaction, etc.) is used to continuously update metrics - real-time counters and aggregations that capture user and device behavior patterns.

When a new request arrives, the engine:

1. Resolves metrics for the relevant accounts, IPs, and devices

2. Evaluates signals - each signal checks metric values against conditions and contributes a score to a fraud class

3. Makes a decision - per-class scores are compared against thresholds, and the worst outcome wins

Request arrives
  |
  v
Resolve metrics for account, IP, device
  |
  v
Evaluate signals against metric values
  |
  v
Decision
  |-- scores: { ato: 0.75, abuse: 0.2, bot: 0.1 }
  |-- signals: [brute_force, credential_stuffing]
  |-- decision: BLOCK

Metrics

Metrics are the foundation of the engine. They answer questions like "how many failed logins has this account had in the last 10 minutes?" or "how many distinct accounts have logged in from this IP?"

Dynamic Metrics

Dynamic metrics are computed in real-time from the events you send. Each metric definition specifies:

  • Aggregation type - how to compute the value
  • Key type - what entity to aggregate by
  • Time window - the rolling window (e.g. 10 minutes, 1 hour, 24 hours)
  • Event filter - optional conditions to count only matching events
  • Field - which event field to aggregate (for DISTINCT_COUNT, LAST_VALUE, SUM)

Aggregation Types

Type           | Description                   | Example
COUNT          | Number of events in window    | Failed logins in 10 min
DISTINCT_COUNT | Unique values of a field      | Distinct accounts per IP
RATE           | Events per minute             | Login attempts per minute
LAST_VALUE     | Most recent value of a field  | Last known latitude
SUM            | Sum of a numeric field        | Total transaction amount
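
As a rough illustration of what each aggregation type computes, the sketch below applies them to an in-memory list of events. The `WindowAggregations` class, the `Event` record, and the window boundary rule (events newer than `now - window`, up to and including `now`) are assumptions made for this example. The document does not describe how the engine stores or evaluates metrics internally, and RATE is assumed here to be the event count divided by the window length in minutes.

```java
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory model for illustration; not the engine's actual storage.
final class WindowAggregations {
    // One event as a metric sees it: timestamp, a field value, and a numeric amount.
    record Event(long epochMillis, String field, double amount) {}

    // Events with (nowMillis - windowMillis) < timestamp <= nowMillis.
    static List<Event> inWindow(List<Event> events, long nowMillis, long windowMillis) {
        return events.stream()
                .filter(e -> e.epochMillis() > nowMillis - windowMillis
                          && e.epochMillis() <= nowMillis)
                .toList();
    }

    // COUNT: number of events in the window.
    static long count(List<Event> events, long nowMillis, long windowMillis) {
        return inWindow(events, nowMillis, windowMillis).size();
    }

    // DISTINCT_COUNT: unique values of the field in the window.
    static long distinctCount(List<Event> events, long nowMillis, long windowMillis) {
        Set<String> seen = new HashSet<>();
        for (Event e : inWindow(events, nowMillis, windowMillis)) {
            seen.add(e.field());
        }
        return seen.size();
    }

    // RATE: events per minute (assumed to be count / window length in minutes).
    static double ratePerMinute(List<Event> events, long nowMillis, long windowMillis) {
        return count(events, nowMillis, windowMillis) / (windowMillis / 60_000.0);
    }

    // SUM: total of the numeric field in the window.
    static double sum(List<Event> events, long nowMillis, long windowMillis) {
        return inWindow(events, nowMillis, windowMillis).stream()
                .mapToDouble(Event::amount)
                .sum();
    }

    // LAST_VALUE: field value of the most recent event, or null if there are none.
    static String lastValue(List<Event> events) {
        return events.stream()
                .max(Comparator.comparingLong(Event::epochMillis))
                .map(Event::field)
                .orElse(null);
    }
}
```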

Key Types

Metrics can be aggregated by different entity keys:

Key       | Description              | Use Case
ACCOUNT   | Per user account         | Brute force, account abuse
IP        | Per IP address           | Credential stuffing, proxy abuse
DEVICE    | Per device fingerprint   | Multi-accounting, device farms
DEVICE_IP | Composite: device + IP   | Prevents fingerprint spoofing

Built-in Dynamic Metrics

Every project is seeded with a set of built-in metrics covering common fraud patterns:

Metric                      | Key       | Aggregation    | Window | Detects
Failed logins               | ACCOUNT   | COUNT          | 10 min | Brute force
IP failed logins            | IP        | COUNT          | 10 min | Distributed brute force
IP distinct accounts        | IP        | DISTINCT_COUNT | 10 min | Credential stuffing
Device+IP distinct accounts | DEVICE_IP | DISTINCT_COUNT | 24 h   | Multi-accounting
Account distinct IPs        | ACCOUNT   | DISTINCT_COUNT | 1 h    | Account sharing
Account distinct devices    | ACCOUNT   | DISTINCT_COUNT | 1 h    | Account sharing
Account events              | ACCOUNT   | COUNT          | 1 h    | Excessive usage
Last login latitude         | ACCOUNT   | LAST_VALUE     | -      | Impossible travel
Last login longitude        | ACCOUNT   | LAST_VALUE     | -      | Impossible travel
Last login timestamp        | ACCOUNT   | LAST_VALUE     | -      | Impossible travel

You can create custom dynamic metrics with any combination of aggregation type, key, time window, and event filters - for example, "SUM of transaction amounts per account in 1 hour where event type is WITHDRAWAL".
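
To make the shape of a metric definition concrete, here is a minimal sketch that models one as a Java record. The enum values come from the tables above. Everything else is hypothetical: the `MetricDefinition` record, the metric names, and the `amount` field are invented for illustration, and actual definitions are managed through the dashboard rather than in code.

```java
import java.time.Duration;
import java.util.Map;

// Hypothetical data model; real metric definitions are configured in the dashboard.
final class MetricDefinitions {
    enum Aggregation { COUNT, DISTINCT_COUNT, RATE, LAST_VALUE, SUM }
    enum KeyType { ACCOUNT, IP, DEVICE, DEVICE_IP }

    // window is null for metrics without one (e.g. LAST_VALUE);
    // field is null when the aggregation does not read a field (e.g. COUNT).
    record MetricDefinition(String name,
                            Aggregation aggregation,
                            KeyType keyType,
                            Duration window,
                            Map<String, String> filter,
                            String field) {}

    // The built-in "Failed logins" metric: COUNT per ACCOUNT over 10 minutes.
    static MetricDefinition failedLogins() {
        return new MetricDefinition("failed_logins", Aggregation.COUNT, KeyType.ACCOUNT,
                Duration.ofMinutes(10),
                Map.of("eventType", "LOGIN", "eventStatus", "FAILED"),
                null);
    }

    // The custom example from the text: SUM of withdrawal amounts per account in 1 hour.
    // The field name "amount" is an assumption.
    static MetricDefinition withdrawalSum() {
        return new MetricDefinition("withdrawal_sum_1h", Aggregation.SUM, KeyType.ACCOUNT,
                Duration.ofHours(1),
                Map.of("eventType", "WITHDRAWAL"),
                "amount");
    }
}
```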

Event Filters

Dynamic metrics support optional filters so you can count only matching events. For example, a "failed logins" metric filters for events where eventType = LOGIN and eventStatus = FAILED.

Available event fields for filtering:

Field                | Description
eventType            | LOGIN, REGISTRATION, TRANSACTION, etc.
eventStatus          | SUCCESSFUL, FAILED, DECLINED
accountId            | User account identifier
ip                   | Client IP address
email                | User email
userAgent            | Browser/client user agent
deviceFingerprint    | Device fingerprint
country              | GeoIP country
latitude / longitude | Geolocation coordinates
custom fields        | Any key-value pair you pass via customFields in the SDK

You can pass arbitrary custom fields alongside every event via the customFields parameter (a Map<String, String>). These fields are merged into the event data and become available for metric filtering and aggregation - for example, you could pass bonus_id, transaction_tier, or payment_method and then create metrics that filter or aggregate by those values.
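
The following sketch shows one way to think about filtering once custom fields are merged into the event data. It is not the engine's implementation: the `EventFilters` class is hypothetical, filters are assumed to be simple field-equals-value pairs that must all match, and custom fields are assumed to override a built-in field on a name collision.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only.
final class EventFilters {
    // Merge custom fields into the base event data.
    // Assumption: on a key collision the custom value wins.
    static Map<String, String> merge(Map<String, String> base, Map<String, String> custom) {
        Map<String, String> merged = new HashMap<>(base);
        merged.putAll(custom);
        return merged;
    }

    // An event matches when every filter entry equals the event's value for that field.
    // A field missing from the event never matches.
    static boolean matches(Map<String, String> filter, Map<String, String> event) {
        return filter.entrySet().stream()
                .allMatch(f -> f.getValue().equals(event.get(f.getKey())));
    }
}
```

With this model, a "failed logins" filter is the pair `eventType = LOGIN`, `eventStatus = FAILED`, and a filter on a custom field such as `payment_method` works the same way once the field has been merged in.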

Computed Metrics

Computed metrics are derived at request time from dynamic metrics and request context - for example, calculating the geographic distance between the current request and the last known login location.

Metric                      | Depends On                   | Logic
Geo distance (km)           | Last login latitude/longitude | Haversine distance from last known location to current request
Time since last login (min) | Last login timestamp         | Minutes elapsed since last login event
Has device fingerprint      | Request context              | 1.0 if device fingerprint is present, 0.0 otherwise

Computed metrics enable patterns like impossible travel detection: if the geo distance is over 500 km and the time since last login is under 60 minutes, the user could not plausibly have traveled that far in that time.
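
For reference, the Haversine distance and the impossible travel check can be sketched as follows. The formula is the standard one; the `ImpossibleTravel` class, its method names, and the mean Earth radius of 6371 km are choices made for this example, not details taken from the engine.

```java
// Illustrative sketch; class and method names are hypothetical.
final class ImpossibleTravel {
    static final double EARTH_RADIUS_KM = 6371.0; // mean Earth radius (assumed constant)

    // Great-circle distance in kilometers between two points given in degrees.
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        // Clamp to 1.0 to guard against rounding error before taking asin.
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    // The thresholds from the text: more than 500 km in under 60 minutes.
    static boolean isImpossibleTravel(double distanceKm, double minutesSinceLastLogin) {
        return distanceKm > 500.0 && minutesSinceLastLogin < 60.0;
    }
}
```

For example, a login from New York followed 30 minutes later by a request from London (several thousand kilometers apart) satisfies both conditions, while a Paris to London hop of a few hundred kilometers does not exceed the distance threshold.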

Signals

Signals define when to add risk score. Each signal has:

  • Conditions - one or more metric thresholds that must ALL be met (AND logic)
  • Fraud class - which fraud class this signal contributes to
  • Score - the contribution value (0.0 to 1.0) added to the fraud class when triggered
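
A signal can be pictured as a named list of conditions plus a fraud class and a score. The sketch below is a hypothetical model, not the engine's API: the `Signal` record and the metric names `geo_distance_km` and `time_since_last_login_min` are invented here, and a missing metric is assumed to make its condition fail. The values mirror the built-in Impossible Travel signal (ATO, score 0.5).

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical model of a signal; names are illustrative only.
final class SignalSketch {
    record Signal(String name,
                  String fraudClass,
                  double score,
                  List<Predicate<Map<String, Double>>> conditions) {

        // AND logic: the signal triggers only if every condition holds.
        boolean triggers(Map<String, Double> metrics) {
            return conditions.stream().allMatch(c -> c.test(metrics));
        }
    }

    // Geo distance > 500 km AND time since last login < 60 min.
    // Defaults are chosen so that a missing metric fails its condition.
    static Signal impossibleTravel() {
        return new Signal("Impossible Travel", "ATO", 0.5, List.of(
                m -> m.getOrDefault("geo_distance_km", 0.0) > 500.0,
                m -> m.getOrDefault("time_since_last_login_min", Double.MAX_VALUE) < 60.0));
    }
}
```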

Operators

Conditions support a rich set of comparison operators:

Operator    | Description             | Example
GT / GTE    | Greater than (or equal) | failed_logins > 10
LT / LTE    | Less than (or equal)    | time_since_login < 60
EQ / NEQ    | Equal / not equal       | country = "US"
IN / NOT_IN | Set membership          | country IN ["US", "CA", "UK"]
BETWEEN     | Range check             | distance BETWEEN 100,500
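
The operator semantics can be sketched as a small evaluator. This is an illustration under stated assumptions, not the engine's code: the `Conditions` class is hypothetical, only numeric values are handled (string comparisons such as `country = "US"` are left out), and BETWEEN is assumed to be inclusive on both ends, which the document does not specify.

```java
import java.util.List;

// Hypothetical evaluator for numeric conditions only.
final class Conditions {
    enum Operator { GT, GTE, LT, LTE, EQ, NEQ, IN, NOT_IN, BETWEEN }

    // operands holds one value for comparisons, two for BETWEEN, any number for IN / NOT_IN.
    record Condition(String metric, Operator op, List<Double> operands) {}

    static boolean test(Condition c, double value) {
        List<Double> o = c.operands();
        return switch (c.op()) {
            case GT      -> value > o.get(0);
            case GTE     -> value >= o.get(0);
            case LT      -> value < o.get(0);
            case LTE     -> value <= o.get(0);
            case EQ      -> value == o.get(0);
            case NEQ     -> value != o.get(0);
            case IN      -> o.contains(value);
            case NOT_IN  -> !o.contains(value);
            // Assumption: both bounds are inclusive.
            case BETWEEN -> value >= o.get(0) && value <= o.get(1);
        };
    }
}
```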

Built-in Signals

The platform ships with battle-tested signals for common fraud scenarios:

Signal                   | Fraud Class | Score | Conditions
Brute Force              | ATO         | 0.4   | Failed logins (account) > 10 in 10 min
Brute Force (mild)       | ATO         | 0.2   | Failed logins (account) between 4 and 5 in 10 min
Impossible Travel        | ATO         | 0.5   | Geo distance > 500 km AND time since login < 60 min
Credential Stuffing      | ATO         | 0.35  | Distinct accounts per IP > 3 in 10 min
IP Velocity              | ATO         | 0.3   | Failed logins per IP > 10 in 10 min
New Device with Failures | ATO         | 0.15  | Failed logins (account) > 3 AND has device fingerprint AND distinct devices > 1 in 1 h
Multi-Accounting         | Abuse       | 0.5   | Distinct accounts per device+IP > 3 in 24 h
Account Sharing          | Abuse       | 0.4   | Distinct IPs per account > 5 AND distinct devices > 3 in 1 h
Excessive Usage          | Abuse       | 0.3   | Account events > 1000 in 1 h

Custom Signals

You can create custom signals with any combination of metrics and conditions. Examples:

  • High-value fraud: SUM of transaction amounts per account > $5000 in 1 hour → Payment Fraud class, score 0.6
  • Promotion abuse: COUNT of bonus claims per device > 3 in 24 hours → Abuse class, score 0.5
  • Rapid registration: COUNT of registrations per IP > 5 in 10 minutes → Abuse class, score 0.4

Score Accumulation

When multiple signals trigger for the same fraud class, their scores are summed (capped at 1.0). With this weighted scoring, a single weak signal may not be enough to trigger a block, but several signals together can be.

For example, if both "Credential Stuffing" (0.35) and "Brute Force" (0.4) trigger for the ATO class, the combined ATO score is 0.75 - which exceeds the default block threshold of 0.7.
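
The accumulation rule is simple enough to sketch directly: sum the scores of triggered signals per fraud class and cap each total at 1.0. The `ScoreAccumulator` class and `Triggered` record below are hypothetical names used only for this illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of per-class score accumulation.
final class ScoreAccumulator {
    // A signal that fired, with the fraud class and score it contributes.
    record Triggered(String signal, String fraudClass, double score) {}

    // Sum the scores per fraud class, capping each class total at 1.0.
    static Map<String, Double> accumulate(List<Triggered> triggered) {
        Map<String, Double> scores = new HashMap<>();
        for (Triggered t : triggered) {
            scores.merge(t.fraudClass(), t.score(), (a, b) -> Math.min(1.0, a + b));
        }
        return scores;
    }
}
```

Feeding in Credential Stuffing (0.35) and Brute Force (0.4) for the ATO class gives the 0.75 from the example above; adding Impossible Travel (0.5) on top would be capped at 1.0 rather than reaching 1.25.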

Fraud Classes & Thresholds

Fraud classes group signals by threat category. Each fraud class has independently configurable block and challenge thresholds.

Built-in Fraud Classes

Fraud Class | Block Threshold | Challenge Threshold | Description
Bot         | >= 0.7          | >= 0.4              | Automated traffic and bot activity
ATO         | >= 0.7          | >= 0.4              | Account takeover indicators
Abuse       | >= 0.8          | >= 0.5              | Account abuse patterns

Custom Fraud Classes

You can define custom fraud classes for domain-specific risk categories - for example, payment_fraud, promotion_abuse, or content_spam. Each custom class gets its own thresholds and associated signals.

Decision Logic

The final decision follows a strict priority:

1. Evaluate all signals and compute per-class scores

2. For each fraud class, compare the score against its thresholds

3. The worst outcome across all fraud classes becomes the final decision:

  • If any class score >= block threshold → BLOCK
  • Else if any class score >= challenge threshold → CHALLENGE
  • Else → ALLOW

This ensures that a request flagged as high-risk in even one fraud class will be appropriately handled, regardless of how safe it appears in other classes.
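
The "worst outcome wins" rule can be sketched as below. The `DecisionLogic` class and `Thresholds` record are hypothetical, and skipping a class that has no configured thresholds is an assumption of this example, not documented behavior.

```java
import java.util.Map;

// Illustrative sketch of the decision step.
final class DecisionLogic {
    // Declared from best to worst so that compareTo reflects severity.
    enum Decision { ALLOW, CHALLENGE, BLOCK }

    record Thresholds(double challenge, double block) {}

    static Decision decide(Map<String, Double> scores, Map<String, Thresholds> thresholds) {
        Decision worst = Decision.ALLOW;
        for (Map.Entry<String, Double> e : scores.entrySet()) {
            Thresholds t = thresholds.get(e.getKey());
            if (t == null) {
                continue; // assumption: classes without thresholds are ignored
            }
            double score = e.getValue();
            Decision d = score >= t.block() ? Decision.BLOCK
                       : score >= t.challenge() ? Decision.CHALLENGE
                       : Decision.ALLOW;
            if (d.compareTo(worst) > 0) {
                worst = d; // keep the most severe outcome seen so far
            }
        }
        return worst;
    }
}
```

With the default thresholds, scores of `{ ato: 0.75, abuse: 0.2, bot: 0.1 }` produce BLOCK because the ATO score alone reaches its block threshold of 0.7, even though the other two classes would be allowed.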

Key Features

1. Real-Time Metrics

Metrics update continuously as events arrive - not in overnight batches. This means the engine always reflects the current state, enabling sub-second detection of attacks as they happen.

2. Dynamic Configuration

When you create, update, or delete a metric or signal, the change takes effect immediately - no redeployment, no downtime.

3. Composite Key Types

The DEVICE_IP composite key prevents a common evasion technique: attackers who spoof device fingerprints are still caught because the composite key requires both the fingerprint and IP to match. This closes a gap that single-dimension keying leaves open.

4. Multi-Condition Rules with Score Accumulation

Signals support multiple AND conditions, so you can express complex patterns like "geo distance > 500 km AND time since login < 60 min." When multiple signals trigger for the same fraud class, their scores accumulate - meaning weak individual signals can combine into a strong collective signal.

5. Computed Metrics

Not everything can be expressed as a simple counter. Computed metrics like geo distance and time since last login are derived on-the-fly from a combination of stored metrics and the current request context. This enables patterns like impossible travel detection.

6. Per-Project Customization

Every aspect of the engine - metrics, signals, fraud classes, thresholds - is configured per project. Different projects can have entirely different risk profiles suited to their specific threat model.

7. Event-Driven Dual Purpose

Every evaluate() call both scores the request and feeds future metrics. This means you should call evaluate() for every significant user action, not only when you need a decision. The more events flow through the system, the more accurate the metrics become.

8. Built-in + Custom Extensibility

The platform ships with battle-tested built-in metrics and signals covering common fraud patterns. But you can extend it with any custom metric, signal, or fraud class specific to your business - without writing code.

Getting Started

The Risk Score Engine is built into every BotBye project. When you create a project, built-in metrics, signals, and fraud class thresholds are automatically seeded. You can start using risk scoring immediately by sending events via the SDK:

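// Send a failed login as a protection event and get a risk decision back.
BotbyeEvaluateResponse response = botbye.evaluate(BotbyeRiskScoringEvent.of(
    request.getRemoteAddr(),                                 // client IP
    headers,                                                 // request headers
    new BotbyeUserInfo(userId, null, userEmail, userPhone),  // user identity
    "LOGIN",                                                 // event type
    BotbyeEventStatus.FAILED,                                // event status
    botbyeResult,
    Collections.emptyMap()                                   // customFields (none in this example)
));

// Act on the unified decision.
switch (response.getDecision()) {
    case BLOCK     -> { return ResponseEntity.status(403).build(); }
    case CHALLENGE -> { return showChallenge(response.getChallenge()); }
    case ALLOW     -> continueRequest();
}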

From the dashboard, you can then:

  • View and edit metric definitions
  • Create custom metrics with any aggregation, key, window, and filter
  • Define signals with multi-condition logic
  • Adjust thresholds per fraud class per project
  • Monitor triggered signals and fraud class scores in real-time