Risk Score Engine
The BotBye Risk Score Engine is a fully configurable, real-time risk assessment system that goes beyond binary bot detection. It evaluates every request across multiple fraud classes - account takeover, abuse, bot activity, or any custom class you define - and returns a unified decision: ALLOW, CHALLENGE, or BLOCK.
Unlike static rule sets that require engineering involvement for every change, the Risk Score Engine puts control in the hands of fraud and security teams. You define what to measure, what thresholds to set, and what actions to take - all without code changes or redeployments.
How It Works
Every protection event you send (login, registration, transaction, etc.) is used to continuously update metrics - real-time counters and aggregations that capture user and device behavior patterns.
When a new request arrives, the engine:
1. Resolves metrics for the relevant accounts, IPs, and devices
2. Evaluates signals - each signal checks metric values against conditions and contributes a score to a fraud class
3. Makes a decision - per-class scores are compared against thresholds, and the worst outcome wins
    Request arrives
        |
        v
    Resolve metrics for account, IP, device
        |
        v
    Evaluate signals against metric values
        |
        v
    Decision
        |-- scores: { ato: 0.75, abuse: 0.2, bot: 0.1 }
        |-- signals: [brute_force, credential_stuffing]
        |-- decision: BLOCK
Metrics
Metrics are the foundation of the engine. They answer questions like "how many failed logins has this account had in the last 10 minutes?" or "how many distinct accounts have logged in from this IP?"
Dynamic Metrics
Dynamic metrics are computed in real-time from the events you send. Each metric definition specifies:
- Aggregation type - how to compute the value
- Key type - what entity to aggregate by
- Time window - the rolling window (e.g. 10 minutes, 1 hour, 24 hours)
- Event filter - optional conditions to count only matching events
- Field - which event field to aggregate (for DISTINCT_COUNT, LAST_VALUE, SUM)
Aggregation Types
| Type | Description | Example |
|---|---|---|
| COUNT | Number of events in window | Failed logins in 10 min |
| DISTINCT_COUNT | Unique values of a field | Distinct accounts per IP |
| RATE | Events per minute | Login attempts per minute |
| LAST_VALUE | Most recent value of a field | Last known latitude |
| SUM | Sum of a numeric field | Total transaction amount |
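To make the five aggregation types concrete, here is a minimal in-memory sketch in Java. It is not the engine's implementation: the `Event` record, the string-valued field map, and the half-open window `(now - window, now]` are assumptions chosen for illustration only.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;
import java.util.stream.Stream;

/** Illustrative only: the five aggregation types applied to an in-memory event list. */
final class RollingAggregations {

    /** Hypothetical event shape: a timestamp plus string-valued fields. */
    record Event(Instant at, Map<String, String> fields) {}

    /** Events whose timestamp falls inside (now - window, now]. */
    static Stream<Event> inWindow(List<Event> events, Instant now, Duration window) {
        Instant start = now.minus(window);
        return events.stream().filter(e -> e.at().isAfter(start) && !e.at().isAfter(now));
    }

    /** COUNT: number of events in the window. */
    static long count(List<Event> events, Instant now, Duration window) {
        return inWindow(events, now, window).count();
    }

    /** DISTINCT_COUNT: unique non-null values of a field in the window. */
    static long distinctCount(List<Event> events, Instant now, Duration window, String field) {
        return inWindow(events, now, window)
                .map(e -> e.fields().get(field))
                .filter(Objects::nonNull)
                .distinct()
                .count();
    }

    /** RATE: events per minute, averaged over the window. */
    static double rate(List<Event> events, Instant now, Duration window) {
        return count(events, now, window) / (window.toMillis() / 60_000.0);
    }

    /** LAST_VALUE: field value from the most recent event that carries it. */
    static Optional<String> lastValue(List<Event> events, String field) {
        return events.stream()
                .filter(e -> e.fields().containsKey(field))
                .max(Comparator.comparing(Event::at))
                .map(e -> e.fields().get(field));
    }

    /** SUM: total of a numeric field in the window. */
    static double sum(List<Event> events, Instant now, Duration window, String field) {
        return inWindow(events, now, window)
                .map(e -> e.fields().get(field))
                .filter(Objects::nonNull)
                .mapToDouble(Double::parseDouble)
                .sum();
    }
}
```

A production engine would maintain these values incrementally rather than rescanning events, but the results are the same for a given window.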
Key Types
Metrics can be aggregated by different entity keys:
| Key | Description | Use Case |
|---|---|---|
| ACCOUNT | Per user account | Brute force, account abuse |
| IP | Per IP address | Credential stuffing, proxy abuse |
| DEVICE | Per device fingerprint | Multi-accounting, device farms |
| DEVICE_IP | Composite: device + IP | Prevents fingerprint spoofing |
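The key type decides which bucket an event's metric value lands in. The sketch below shows one way such keys could be derived from a request. The `RequestContext` record and the `prefix:value` key format are hypothetical and are not the engine's storage format.

```java
import java.util.Optional;

/** Illustrative only: deriving an aggregation key from request data. */
final class MetricKeys {

    enum KeyType { ACCOUNT, IP, DEVICE, DEVICE_IP }

    /** Hypothetical request context; any field may be null when unknown. */
    record RequestContext(String accountId, String ip, String deviceFingerprint) {}

    /** Returns the bucket key, or empty when a required field is missing. */
    static Optional<String> resolve(KeyType type, RequestContext ctx) {
        return switch (type) {
            case ACCOUNT -> Optional.ofNullable(ctx.accountId()).map(v -> "account:" + v);
            case IP -> Optional.ofNullable(ctx.ip()).map(v -> "ip:" + v);
            case DEVICE -> Optional.ofNullable(ctx.deviceFingerprint()).map(v -> "device:" + v);
            // The composite key needs both parts, so the same fingerprint
            // seen from two IPs lands in two separate buckets.
            case DEVICE_IP -> (ctx.deviceFingerprint() == null || ctx.ip() == null)
                    ? Optional.<String>empty()
                    : Optional.of("device_ip:" + ctx.deviceFingerprint() + "|" + ctx.ip());
        };
    }
}
```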
Built-in Dynamic Metrics
Every project is seeded with a set of built-in metrics covering common fraud patterns:
| Metric | Key | Aggregation | Window | Detects |
|---|---|---|---|---|
| Failed logins | ACCOUNT | COUNT | 10 min | Brute force |
| IP failed logins | IP | COUNT | 10 min | Distributed brute force |
| IP distinct accounts | IP | DISTINCT_COUNT | 10 min | Credential stuffing |
| Device+IP distinct accounts | DEVICE_IP | DISTINCT_COUNT | 24 h | Multi-accounting |
| Account distinct IPs | ACCOUNT | DISTINCT_COUNT | 1 h | Account sharing |
| Account distinct devices | ACCOUNT | DISTINCT_COUNT | 1 h | Account sharing |
| Account events | ACCOUNT | COUNT | 1 h | Excessive usage |
| Last login latitude | ACCOUNT | LAST_VALUE | - | Impossible travel |
| Last login longitude | ACCOUNT | LAST_VALUE | - | Impossible travel |
| Last login timestamp | ACCOUNT | LAST_VALUE | - | Impossible travel |
You can create custom dynamic metrics with any combination of aggregation type, key, time window, and event filters - for example, "SUM of transaction amounts per account in 1 hour where event type is WITHDRAWAL".
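As a rough mental model, the withdrawal example above can be written down as a plain data object. Metrics are configured from the dashboard, so this Java sketch only illustrates which parts a definition has. The record, the metric name `withdrawal_sum_1h`, and the `amount` field are hypothetical.

```java
import java.time.Duration;
import java.util.Map;

/** Illustrative only: the parts of a dynamic metric definition as a data object. */
final class MetricDefinitions {

    enum Aggregation { COUNT, DISTINCT_COUNT, RATE, LAST_VALUE, SUM }
    enum KeyType { ACCOUNT, IP, DEVICE, DEVICE_IP }

    record MetricDefinition(String name, Aggregation aggregation, KeyType keyType,
                            Duration window, Map<String, String> filter, String field) {
        MetricDefinition {
            // DISTINCT_COUNT, LAST_VALUE and SUM aggregate a specific event field.
            boolean needsField = aggregation == Aggregation.DISTINCT_COUNT
                    || aggregation == Aggregation.LAST_VALUE
                    || aggregation == Aggregation.SUM;
            if (needsField && (field == null || field.isBlank())) {
                throw new IllegalArgumentException(aggregation + " requires a field");
            }
        }
    }

    /** "SUM of transaction amounts per account in 1 hour where event type is WITHDRAWAL". */
    static MetricDefinition withdrawalSumPerAccount() {
        return new MetricDefinition(
                "withdrawal_sum_1h",                 // hypothetical metric name
                Aggregation.SUM,
                KeyType.ACCOUNT,
                Duration.ofHours(1),
                Map.of("eventType", "WITHDRAWAL"),   // event filter
                "amount");                           // hypothetical field name
    }
}
```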
Event Filters
Dynamic metrics support optional filters so you can count only matching events. For example, a "failed logins" metric filters for events where eventType = LOGIN and eventStatus = FAILED.
Available event fields for filtering:
| Field | Description |
|---|---|
| eventType | LOGIN, REGISTRATION, TRANSACTION, etc. |
| eventStatus | SUCCESSFUL, FAILED, DECLINED |
| accountId | User account identifier |
| ip | Client IP address |
| User email | |
| userAgent | Browser/client user agent |
| deviceFingerprint | Device fingerprint |
| country | GeoIP country |
| latitude / longitude | Geolocation coordinates |
| *custom fields* | Any key-value pair you pass via customFields in the SDK |
You can pass arbitrary custom fields alongside every event via the customFields parameter (a Map<String, String>). These fields are merged into the event data and become available for metric filtering and aggregation - for example, you could pass bonus_id, transaction_tier, or payment_method and then create metrics that filter or aggregate by those values.
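The following sketch shows how merged custom fields and equality filters could work together. It assumes that event data is a flat `Map<String, String>` and that built-in fields take precedence when a custom field reuses a built-in name. The source does not specify that precedence rule, so treat it as an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

/** Illustrative only: merging custom fields into event data and filtering on them. */
final class EventFilters {

    /** Adds custom fields to the built-in event fields. */
    static Map<String, String> merge(Map<String, String> builtIn, Map<String, String> customFields) {
        Map<String, String> merged = new HashMap<>(builtIn);
        // Assumption: a custom field never overwrites a built-in field of the same name.
        customFields.forEach(merged::putIfAbsent);
        return merged;
    }

    /** A filter that matches when every listed field equals the expected value. */
    static Predicate<Map<String, String>> allEqual(Map<String, String> expected) {
        return event -> expected.entrySet().stream()
                .allMatch(e -> e.getValue().equals(event.get(e.getKey())));
    }
}
```

With these helpers, the "failed logins" filter is `allEqual(Map.of("eventType", "LOGIN", "eventStatus", "FAILED"))`, and a filter on a custom field such as `payment_method` is written the same way.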
Computed Metrics
Computed metrics are derived at request time from dynamic metrics and request context - for example, calculating the geographic distance between the current request and the last known login location.
| Metric | Depends On | Logic |
|---|---|---|
| Geo distance (km) | Last login latitude/longitude | Haversine distance from last known location to current request |
| Time since last login (min) | Last login timestamp | Minutes elapsed since last login event |
| Has device fingerprint | Request context | 1.0 if device fingerprint is present, 0.0 otherwise |
Computed metrics enable patterns like impossible travel detection: if the geo distance is over 500 km and the time since last login is under 60 minutes, the user would have had to travel faster than 500 km/h, which is not plausible for a legitimate login.
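For reference, the Haversine distance and the impossible travel check can be sketched in a few lines of Java. The mean Earth radius constant and the method names are choices made for this example, and the engine's exact formula parameters may differ.

```java
/** Illustrative only: Haversine distance and the impossible travel check. */
final class ImpossibleTravel {

    /** Mean Earth radius in kilometres (assumed value for this sketch). */
    static final double EARTH_RADIUS_KM = 6371.0088;

    /** Great-circle distance in km between two points given in decimal degrees. */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.pow(Math.sin(dLat / 2), 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.pow(Math.sin(dLon / 2), 2);
        // Clamp to 1 to guard against rounding error for near-antipodal points.
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(Math.min(1.0, a)));
    }

    /** Mirrors the built-in condition: distance > 500 km AND time since login < 60 min. */
    static boolean isImpossibleTravel(double distanceKm, double minutesSinceLastLogin) {
        return distanceKm > 500 && minutesSinceLastLogin < 60;
    }
}
```

A login from London followed 30 minutes later by one from New York (roughly 5,570 km apart) satisfies both conditions, while London to Paris (roughly 340 km) does not.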
Signals
Signals define when risk score is added to a fraud class. Each signal has:
- Conditions - one or more metric thresholds that must ALL be met (AND logic)
- Fraud class - which fraud class this signal contributes to
- Score - the contribution value (0.0 to 1.0) added to the fraud class when triggered
Operators
Conditions support a rich set of comparison operators:
| Operator | Description | Example |
|---|---|---|
| GT / GTE | Greater than (or equal) | failed_logins > 10 |
| LT / LTE | Less than (or equal) | time_since_login < 60 |
| EQ / NEQ | Equal / not equal | country = "US" |
| IN / NOT_IN | Set membership | country IN ["US", "CA", "UK"] |
| BETWEEN | Range check | distance BETWEEN 100,500 |
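The operators and the AND logic can be modelled as a small evaluator. This is a sketch under several assumptions: metric values are carried as strings, `BETWEEN` is inclusive on both ends, `EQ`/`NEQ` compare strings exactly, and a missing metric never satisfies a condition. None of these details are stated in this document.

```java
import java.util.List;
import java.util.Map;

/** Illustrative only: evaluating signal conditions against resolved metric values. */
final class Conditions {

    enum Operator { GT, GTE, LT, LTE, EQ, NEQ, IN, NOT_IN, BETWEEN }

    /** operands: one value, a set of values (IN / NOT_IN), or two bounds (BETWEEN). */
    record Condition(String metric, Operator op, List<String> operands) {}

    private static double num(String s) {
        return Double.parseDouble(s);
    }

    static boolean test(Condition c, String actual) {
        if (actual == null) {
            return false; // assumption: an unresolved metric never matches
        }
        List<String> o = c.operands();
        return switch (c.op()) {
            case GT -> num(actual) > num(o.get(0));
            case GTE -> num(actual) >= num(o.get(0));
            case LT -> num(actual) < num(o.get(0));
            case LTE -> num(actual) <= num(o.get(0));
            case EQ -> actual.equals(o.get(0));
            case NEQ -> !actual.equals(o.get(0));
            case IN -> o.contains(actual);
            case NOT_IN -> !o.contains(actual);
            // Assumption: both bounds are inclusive.
            case BETWEEN -> num(actual) >= num(o.get(0)) && num(actual) <= num(o.get(1));
        };
    }

    /** A signal triggers only when ALL of its conditions hold (AND logic). */
    static boolean allMatch(List<Condition> conditions, Map<String, String> metrics) {
        return conditions.stream().allMatch(c -> test(c, metrics.get(c.metric())));
    }
}
```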
Built-in Signals
The platform ships with battle-tested signals for common fraud scenarios:
| Signal | Fraud Class | Score | Conditions |
|---|---|---|---|
| Brute Force | ATO | 0.4 | Failed logins (account) > 10 in 10 min |
| Brute Force (mild) | ATO | 0.2 | Failed logins (account) between 4 and 5 in 10 min |
| Impossible Travel | ATO | 0.5 | Geo distance > 500 km AND time since login < 60 min |
| Credential Stuffing | ATO | 0.35 | Distinct accounts per IP > 3 in 10 min |
| IP Velocity | ATO | 0.3 | Failed logins per IP > 10 in 10 min |
| New Device with Failures | ATO | 0.15 | Failed logins (account) > 3 AND has device fingerprint AND distinct devices > 1 in 1 h |
| Multi-Accounting | Abuse | 0.5 | Distinct accounts per device+IP > 3 in 24 h |
| Account Sharing | Abuse | 0.4 | Distinct IPs per account > 5 AND distinct devices > 3 in 1 h |
| Excessive Usage | Abuse | 0.3 | Account events > 1000 in 1 h |
Custom Signals
You can create custom signals with any combination of metrics and conditions. Examples:
- High-value fraud: SUM of transaction amounts per account > $5000 in 1 hour → Payment Fraud class, score 0.6
- Promotion abuse: COUNT of bonus claims per device > 3 in 24 hours → Abuse class, score 0.5
- Rapid registration: COUNT of registrations per IP > 5 in 10 minutes → Abuse class, score 0.4
Score Accumulation
When multiple signals trigger for the same fraud class, their scores are summed and the total is capped at 1.0. As a result, a single weak signal may not reach a block threshold on its own, but several signals triggering together can.
For example, if both "Credential Stuffing" (0.35) and "Brute Force" (0.4) trigger for the ATO class, the combined ATO score is 0.75 - which exceeds the default block threshold of 0.7.
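The accumulation rule is small enough to sketch directly. The `TriggeredSignal` record and the method name are hypothetical. Only the "sum per fraud class, then cap at 1.0" behavior comes from the description above.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: summing triggered signal scores per fraud class, capped at 1.0. */
final class ScoreAccumulation {

    /** Hypothetical shape of a signal that has triggered for a request. */
    record TriggeredSignal(String name, String fraudClass, double score) {}

    static Map<String, Double> perClassScores(List<TriggeredSignal> triggered) {
        Map<String, Double> scores = new TreeMap<>();
        for (TriggeredSignal s : triggered) {
            scores.merge(s.fraudClass(), s.score(), Double::sum);
        }
        // Cap each class total at 1.0 after summing.
        scores.replaceAll((fraudClass, total) -> Math.min(1.0, total));
        return scores;
    }
}
```

Feeding in Credential Stuffing (0.35) and Brute Force (0.4) for ATO yields an ATO score of 0.75, matching the example above.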
Fraud Classes & Thresholds
Fraud classes group signals by threat category. Each fraud class has independently configurable block and challenge thresholds.
Built-in Fraud Classes
| Fraud Class | Block Threshold | Challenge Threshold | Description |
|---|---|---|---|
| Bot | >= 0.7 | >= 0.4 | Automated traffic and bot activity |
| ATO | >= 0.7 | >= 0.4 | Account takeover indicators |
| Abuse | >= 0.8 | >= 0.5 | Account abuse patterns |
Custom Fraud Classes
You can define custom fraud classes for domain-specific risk categories - for example, payment_fraud, promotion_abuse, or content_spam. Each custom class gets its own thresholds and associated signals.
Decision Logic
The final decision follows a strict priority:
1. Evaluate all signals and compute per-class scores
2. For each fraud class, compare the score against its thresholds
3. The worst outcome across all fraud classes becomes the final decision:
   - If any class score >= block threshold → BLOCK
   - Else if any class score >= challenge threshold → CHALLENGE
   - Else → ALLOW
This ensures that a request flagged as high-risk in even one fraud class will be appropriately handled, regardless of how safe it appears in other classes.
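The decision steps above can be sketched as a "worst outcome wins" reduction. The types below are hypothetical, and the rule that classes without configured thresholds are skipped is an assumption of this sketch.

```java
import java.util.Map;

/** Illustrative only: worst-outcome-wins decision across fraud classes. */
final class DecisionLogic {

    /** Declared from least to most severe so compareTo() can rank outcomes. */
    enum Decision { ALLOW, CHALLENGE, BLOCK }

    record Thresholds(double challenge, double block) {}

    static Decision forClass(double score, Thresholds t) {
        if (score >= t.block()) return Decision.BLOCK;
        if (score >= t.challenge()) return Decision.CHALLENGE;
        return Decision.ALLOW;
    }

    static Decision decide(Map<String, Double> scores, Map<String, Thresholds> thresholds) {
        Decision worst = Decision.ALLOW;
        for (Map.Entry<String, Double> e : scores.entrySet()) {
            Thresholds t = thresholds.get(e.getKey());
            if (t == null) continue; // assumption: unconfigured classes are ignored
            Decision d = forClass(e.getValue(), t);
            if (d.compareTo(worst) > 0) worst = d;
        }
        return worst;
    }

    public static void main(String[] args) {
        // Built-in thresholds from the table above.
        Map<String, Thresholds> builtIn = Map.of(
                "bot", new Thresholds(0.4, 0.7),
                "ato", new Thresholds(0.4, 0.7),
                "abuse", new Thresholds(0.5, 0.8));
        // ato reaches its block threshold, so the overall decision is BLOCK.
        System.out.println(decide(Map.of("ato", 0.75, "abuse", 0.2, "bot", 0.1), builtIn));
    }
}
```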
Key Features
1. Real-Time Metrics
Metrics update continuously as events arrive - not in overnight batches. This means the engine always reflects the current state, enabling sub-second detection of attacks as they happen.
2. Dynamic Configuration
When you create, update, or delete a metric or signal, the change takes effect immediately - no redeployment, no downtime.
3. Composite Key Types
The DEVICE_IP composite key prevents a common evasion technique: attackers who spoof device fingerprints are still caught because the composite key requires both the fingerprint and IP to match. This closes a gap that single-dimension keying leaves open.
4. Multi-Condition Rules with Score Accumulation
Signals support multiple AND conditions, so you can express complex patterns like "geo distance > 500 km AND time since login < 60 min." When multiple signals trigger for the same fraud class, their scores accumulate - meaning weak individual signals can combine into a strong collective signal.
5. Computed Metrics
Not everything can be expressed as a simple counter. Computed metrics like geo distance and time since last login are derived on-the-fly from a combination of stored metrics and the current request context. This enables patterns like impossible travel detection.
6. Per-Project Customization
Every aspect of the engine - metrics, signals, fraud classes, thresholds - is configured per project. Different projects can have entirely different risk profiles suited to their specific threat model.
7. Event-Driven Dual Purpose
Every evaluate() call both scores the request and feeds future metrics. This means you should call evaluate() for every significant user action - not just when you need a decision. The more events flow through the system, the more accurate the metrics become.
8. Built-in + Custom Extensibility
The platform ships with battle-tested built-in metrics and signals covering common fraud patterns. You can also extend it with any custom metric, signal, or fraud class specific to your business - without writing code.
Getting Started
The Risk Score Engine is built into every BotBye project. When you create a project, built-in metrics, signals, and fraud class thresholds are automatically seeded. You can start using risk scoring immediately by sending events via the SDK:
    // Report a failed login. The same call scores the request and feeds future metrics.
    BotbyeEvaluateResponse response = botbye.evaluate(BotbyeRiskScoringEvent.of(
        request.getRemoteAddr(),
        headers,
        new BotbyeUserInfo(userId, null, userEmail, userPhone),
        "LOGIN",
        BotbyeEventStatus.FAILED,
        botbyeResult,
        Collections.emptyMap()
    ));

    // Act on the unified decision returned by the engine.
    switch (response.getDecision()) {
        case BLOCK -> { return ResponseEntity.status(403).build(); }
        case CHALLENGE -> { return showChallenge(response.getChallenge()); }
        case ALLOW -> continueRequest();
    }
From the dashboard, you can then:
- View and edit metric definitions
- Create custom metrics with any aggregation, key, window, and filter
- Define signals with multi-condition logic
- Adjust thresholds per fraud class per project
- Monitor triggered signals and fraud class scores in real-time