
What is Behavioral Analytics in Observability?

Static thresholds cannot keep up with dynamic systems. Behavioral analytics learns what "normal" looks like and alerts on meaningful deviations. Here is how it works and why data completeness matters.

The problem with static thresholds

Traditional alerting uses static thresholds: alert when CPU exceeds 80%, when error rate exceeds 1%, when latency exceeds 200ms.

This approach fails at scale for predictable reasons:

  • No seasonality awareness: 80% CPU at 2PM on Black Friday is normal. 80% CPU at 3AM on Tuesday is not.
  • Context blindness: A new deployment changes normal behavior. Static thresholds do not adapt.
  • Workload variance: Different services have different baselines. One threshold does not fit all.
  • Constant tuning: Teams spend hours adjusting thresholds that become stale within weeks.

The alert fatigue statistics

The consequences are measurable:

  • SOC teams receive an average of 4,484 alerts per day (Vectra 2023)
  • 67% of alerts are ignored due to false positives
  • Companies with 500-1,499 employees ignore 27% of all alerts (IDC)

When most alerts are noise, real problems get missed.

How behavioral baselines work

Behavioral analytics replaces static thresholds with learned baselines:

  1. Collect historical data: 2-6 weeks minimum for accurate patterns
  2. Analyze patterns: Daily cycles, weekly trends, seasonal variations
  3. Build statistical model: What is "normal" for this metric, at this time, for this entity?
  4. Alert on deviations: Flag significant departures from predicted baseline
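The four steps above can be sketched as a minimal hour-of-week baseline: learn a mean and spread per time bucket from historical data, then flag values that are unusual for that context. The bucket granularity and the 3-sigma threshold are illustrative choices, not a specific vendor's algorithm.

```python
from collections import defaultdict
from statistics import mean, stdev

def build_baseline(samples):
    """samples: list of (hour_of_week, value) pairs from 2-6 weeks of history."""
    buckets = defaultdict(list)
    for hour_of_week, value in samples:
        buckets[hour_of_week].append(value)
    # Baseline per bucket: expected value and spread at that time of week.
    return {h: (mean(vs), stdev(vs)) for h, vs in buckets.items() if len(vs) >= 2}

def is_anomalous(baseline, hour_of_week, value, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the learned mean."""
    mu, sigma = baseline[hour_of_week]
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold
```

The same 80% CPU reading can then be normal in one bucket (Black Friday afternoon) and anomalous in another (3AM Tuesday), because each bucket carries its own learned mean and spread.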

Datadog describes their approach: "Anomaly detection monitors... account for seasonality by firing alerts according to a set amount of deviation from the observed pattern, rather than alerting on a fixed threshold."

The key insight: what matters is not the absolute value, but whether the value is unusual for this context.

The AIOps landscape

The AIOps market is growing rapidly, with varying estimates depending on definition:

  • $1.87B (2024) to $8.64B (2032) at 21.4% CAGR (Fortune Business Insights)
  • Gartner predicted that the share of large enterprises relying exclusively on AIOps for monitoring would rise from 5% (2018) to 30% (2024)
  • 80% of AIOps vendors are expected to implement generative AI by 2024-2025

Vendor capabilities comparison

Major observability vendors have built ML-powered analytics:

| Vendor | Product | Key Capabilities |
| --- | --- | --- |
| Datadog | Watchdog | Anomaly detection, root cause analysis, faulty deployment detection. Requires 2 weeks of metric history. |
| Dynatrace | Davis AI | Predictive AI (forecasting), Causal AI (root cause via Smartscape topology), Davis CoPilot (natural language). |
| Splunk | ITSI | Adaptive Thresholding, Trending (historical comparison), Entity Cohesion (peer-group deviation). |
| Moogsoft | AIOps | Event correlation, noise reduction. Claims 85-99.7% noise reduction. |

Entity-level scoring: moving beyond metrics

Individual metric alerting has a fundamental problem: one unhealthy service generates multiple separate alerts.

When a service degrades, you might see alerts for:

  • CPU utilization increased
  • Memory pressure elevated
  • Error rate exceeded threshold
  • Latency p99 degraded
  • Throughput decreased

These are five alerts for one problem. Entity-level scoring aggregates signals into a single health indicator:

Service_Health = f(error_rate, latency_p99, throughput_change, CPU, memory)

Instead of five alerts, you get one: "Service X is unhealthy."
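A minimal sketch of that aggregation, assuming each metric has already been converted into a z-score against its own baseline. The equal default weights and the 3-sigma cap are hypothetical choices; real systems would tune both per entity type.

```python
def entity_health(metric_zscores, weights=None):
    """Aggregate per-metric anomaly z-scores into one health score in [0, 1].

    1.0 = healthy; lower = less healthy. metric_zscores maps metric name
    to its deviation from baseline in standard deviations.
    """
    if weights is None:
        weights = {m: 1.0 for m in metric_zscores}  # equal weighting (assumption)
    total = sum(weights.values())
    # Turn each |z| into a penalty capped at 1.0 (reached at 3 sigma),
    # then take the weighted average across metrics.
    penalty = sum(
        weights[m] * min(abs(z) / 3.0, 1.0) for m, z in metric_zscores.items()
    ) / total
    return 1.0 - penalty
```

A service with all five signals slightly elevated and a service with one signal wildly off can both score as unhealthy, which is the point: one indicator per entity, not one alert per metric.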

What is an "entity"?

An entity is any logical component you want to track as a unit:

  • Service
  • Host
  • Container
  • Endpoint
  • Pod
  • Database
  • Network device

Entity-level scoring lets you ask "Is this thing healthy?" instead of "Are any of these 47 metrics outside their thresholds?"

Anomaly score vs risk score

Advanced behavioral analytics uses dual scoring systems:

| Score Type | Question | Factors |
| --- | --- | --- |
| Anomaly Score | How unusual is this? | Statistical deviation from baseline, ML confidence |
| Risk Score | How bad could this be? | Business impact, blast radius, SLA implications |

Combined prioritization

The real power comes from combining both scores:

Priority = Anomaly_Score × Risk_Score

| Scenario | Anomaly | Risk | Priority |
| --- | --- | --- | --- |
| High anomaly on internal dev tool | High | Low | Low priority |
| Moderate anomaly on payment service | Moderate | High | High priority |
| High anomaly on payment service | High | High | Urgent |

This approach surfaces what matters: unusual behavior on critical systems. Not every anomaly deserves attention—only anomalies with impact potential.

Anomaly detection methods

Different algorithms suit different use cases:

| Method | Data Needed | Compute | Interpretability | Temporal |
| --- | --- | --- | --- | --- |
| Z-Score | Low | Minimal | High | No |
| Holt-Winters | Moderate | Low | High | Yes (seasonal) |
| Isolation Forest | Moderate | Low | Medium | Limited |
| Prophet | High (1yr+) | Moderate | High | Yes |
| LSTM | High | High | Low | Yes |
| Autoencoder | High | High | Low | Limited |

No single method is best. Production systems often combine multiple approaches: fast statistical methods for real-time alerting, more sophisticated ML for deeper analysis.
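As one example of a fast statistical method suited to real-time alerting, here is a streaming detector based on an exponentially weighted moving average (EWMA) of mean and variance, a close cousin of the Z-score row in the table. The alpha and threshold values are illustrative choices, not recommendations.

```python
class EwmaDetector:
    """Streaming anomaly detector: exponentially weighted mean and variance.

    Constant memory per metric and O(1) per data point, which is what makes
    this class of method cheap enough for real-time alerting.
    """

    def __init__(self, alpha=0.1, threshold=3.0):
        self.alpha = alpha          # adaptation speed (assumption: 0.1)
        self.threshold = threshold  # sigmas before flagging (assumption: 3.0)
        self.mean = None
        self.var = 0.0

    def update(self, x):
        """Score x against the current estimate, then fold it in."""
        if self.mean is None:
            self.mean = x
            return False
        diff = x - self.mean
        anomalous = self.var > 0 and abs(diff) / self.var ** 0.5 > self.threshold
        # Update estimates after scoring so the anomaly does not mask itself.
        self.mean += self.alpha * diff
        self.var = (1 - self.alpha) * (self.var + self.alpha * diff * diff)
        return anomalous
```

Note what this simple method gives up: no seasonality (a daily traffic peak looks anomalous every day), which is exactly the gap Holt-Winters and Prophet exist to fill.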

Why complete data matters for ML

Here is the core problem with sampling for behavioral analytics:

ML models learn from training data. If you sample during the baseline period, the model learns from an incomplete picture.

Consider what 1% sampling misses:

  • Rare but important error classes that occur less than your sampling rate
  • Latency spikes that happen to not get sampled
  • Entire user segments with low traffic
  • Intermittent failures affecting specific request patterns

The model cannot learn what it never sees. Anomaly detection trained on sampled data has blind spots that mirror your sampling gaps.
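A quick simulation makes the blind spot concrete: with an error class that occurs in 0.05% of requests, 1% random sampling typically captures one or zero examples, leaving a baseline model nothing to learn from. The traffic numbers here are synthetic.

```python
import random

random.seed(42)

# Synthetic request log: 200,000 requests, of which a rare but important
# error class appears only 100 times (0.05% of traffic).
requests = ["ok"] * 199_900 + ["rare_error"] * 100
random.shuffle(requests)

# 1% uniform random sampling, as a sampling-based pipeline might apply.
sampled = [r for r in requests if random.random() < 0.01]

rare_seen = sampled.count("rare_error")
# Expected count of the rare error in the sample is ~1 (100 x 0.01); many
# runs see zero. A baseline trained on `sampled` never learns this error
# class exists, so its later occurrence cannot be scored as anomalous.
```

The population contains a clear signal; the sample usually does not. That asymmetry is the sampling gap the baseline inherits.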

Datadog Watchdog requires 2 weeks of historical data for metric baselines. If that data is 1% sampled, the baselines are biased toward high-volume patterns.

Security detection through behavioral analytics

Behavioral baselines are not just for performance—they detect security threats too:

  • Lateral Movement: User accounts accessing systems they have never touched before
  • Data Exfiltration: Outbound traffic patterns that deviate from baseline
  • Credential Misuse: Authentication patterns that do not match normal behavior

NETSCOUT notes: "Anomaly Detection: Tools that recognize unusual behavior, like an employee's account accessing systems they have never touched before, can flag potential lateral movement."

The same behavioral model that detects performance anomalies can surface security concerns. The difference is in the interpretation, not the detection.
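As a sketch of the lateral-movement signal described above, the detection itself can be as simple as recording which systems each account touched during the baseline period, then flagging first-time access afterward. The class and method names are illustrative, not part of any product's API.

```python
from collections import defaultdict

class AccessBaseline:
    """Track which systems each account normally touches; flag novel access."""

    def __init__(self):
        self.seen = defaultdict(set)  # account -> set of systems accessed

    def observe(self, account, system):
        """Record an access event. Returns True if this account has never
        accessed this system before (a potential lateral-movement signal)."""
        novel = system not in self.seen[account]
        self.seen[account].add(system)
        return novel
```

During the learning window, `observe` is called and its return value ignored; after that, a True result feeds the anomaly score, while the risk score (is the newly touched system a production database or a wiki?) decides whether anyone gets paged.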

Introducing ALBA

ALBA—Adaptive Learning Behavioral Analytics—is Sampleless's approach to entity-level scoring. Built to take advantage of complete data, ALBA provides:

  • Entity health scores: Aggregate multiple signals into a single indicator per service, host, or endpoint
  • Dual scoring: Anomaly scores for "how unusual" and risk scores for "how impactful"
  • Context-aware baselines: Time-of-day, day-of-week, and seasonal patterns
  • Continuous learning: Baselines adapt as your system evolves

Because Sampleless collects 100% of telemetry, ALBA trains on complete data. No sampling gaps. No biased baselines. The full picture.

OpenALBA

ALBA is built on OpenALBA—an open specification for entity-level behavioral scoring. We believe the industry needs shared standards for behavioral analytics, not another proprietary lock-in vector.

OpenALBA defines entity schemas, scoring algorithms, and export formats. Your behavioral data is yours, portable to any compatible system.

Frequently asked questions

What is the difference between anomaly score and risk score?

Anomaly score measures how unusual current behavior is compared to the baseline (statistical deviation). Risk score measures potential business impact (blast radius, SLA implications, revenue exposure). Combined priority = Anomaly × Risk. A high anomaly on a low-risk service is low priority; a moderate anomaly on a critical service is high priority.

How long does it take to train behavioral baselines?

Minimum 2-6 weeks of historical data for accurate baselines. Datadog Watchdog requires 2 weeks for metric baselines and 24 hours for logs. Shorter training periods miss weekly patterns and edge cases. This is why sampled data degrades ML accuracy—incomplete training data means biased baselines.

Can behavioral analytics detect security threats?

Yes. Observability signals can detect lateral movement (accounts accessing unusual systems), data exfiltration (outbound traffic anomalies), and credential misuse (authentication pattern anomalies). Behavioral baselines flag deviations regardless of whether the cause is operational or security-related.

See ALBA in action

Request a demo to see how behavioral analytics works with complete data.