Production Deployment

A checklist and deep dive for running Lelu reliably in production, covering HTTPS, secrets management, Engine scaling, and observability.

Pre-Launch Checklist

Security & Infrastructure

TLS terminated at load balancer or ingress for all services

LELU_API_KEY rotated from default and stored in a secret manager

DATABASE_URL uses sslmode=require in production

Redis uses TLS (rediss://) or a private network

Engine replicas ≥ 2 for high availability

Health checks configured on /healthz for all services

Prompt injection detection enabled (automatic)

Observability & Monitoring

OpenTelemetry tracing configured with Jaeger/Zipkin

Prometheus metrics endpoint exposed and scraped

Behavioral analytics enabled for agent monitoring

Predictive analytics models trained with sufficient data (100+ samples)

Alert channels configured (Slack, PagerDuty, email)

Structured logs exported to your log platform

Policies & Compliance

OPA/Rego policies are version-controlled before deploy

Risk assessment thresholds tuned for your use case

Confidence gates configured appropriately

Audit retention configured (S3/object-store lifecycle, 1+ year)

Human review workflows tested and documented

In Docker Compose healthchecks, prefer 127.0.0.1 over localhost to avoid container-local hostname resolution edge cases.
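
As a minimal sketch of the tip above, a Compose healthcheck for the Engine might look like the following. The port (8080) and the presence of curl in the image are assumptions, not details from this guide; substitute whatever your image actually provides.

```yaml
services:
  engine:
    healthcheck:
      # 127.0.0.1 avoids depending on how "localhost" resolves inside the container.
      # Port 8080 and the curl binary are assumptions; adjust to your image.
      test: ["CMD", "curl", "-fsS", "http://127.0.0.1:8080/healthz"]
      interval: 10s
      timeout: 3s
      retries: 3
      start_period: 15s
```

If the image ships without curl, any command that exits non-zero when /healthz is unavailable serves the same purpose.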

Scaling the Engine

The Engine is stateless — scale horizontally by running multiple replicas behind a load balancer. All state lives in Redis.

docker-compose.override.yml

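services:
  engine:
    deploy:
      # Three replicas; the checklist minimum for high availability is two
      replicas: 3
      resources:
        limits:
          # Per-replica caps; tune against observed load
          cpus: "1"
          memory: 512M
      restart_policy:
        # Restart only after a non-zero exit, waiting 5s between attempts
        condition: on-failure
        delay: 5s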

Secrets Management

Never store secrets in environment files committed to source control. Use one of these patterns in production:

AWS Secrets Manager

Store secrets in AWS Secrets Manager or SSM Parameter Store, and grant the workload's IAM role read access so the values are injected at runtime.

Kubernetes Secrets

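Expose secrets to the Engine as environment variables from a Secret object. Kubernetes Secrets are only base64-encoded by default, so keep the source of truth encrypted with Sealed Secrets or in an external store synced by External Secrets Operator.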
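
A minimal sketch of the environment-variable pattern, assuming a Secret named lelu-secrets with keys api-key and database-url already exists in the namespace. Those names and the image placeholder are hypothetical, chosen only for illustration.

```yaml
# Fragment of a Deployment pod template. The Secret name, its keys, and the
# image placeholder are illustrative assumptions, not Lelu defaults.
spec:
  containers:
    - name: engine
      image: <your-engine-image>
      env:
        - name: LELU_API_KEY
          valueFrom:
            secretKeyRef:
              name: lelu-secrets
              key: api-key
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: lelu-secrets
              key: database-url
```

Only the Secret reference lives in the manifest, so the manifest itself stays safe to commit.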

HashiCorp Vault

Use Vault Agent Injector to automatically inject secrets into pods at startup.
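
For illustration, the injector is driven by pod annotations along these lines. The role name and secret path are hypothetical and depend entirely on how your Vault is configured.

```yaml
# Pod template annotations read by the Vault Agent Injector.
# The role name and secret path below are hypothetical.
metadata:
  annotations:
    vault.hashicorp.com/agent-inject: "true"
    vault.hashicorp.com/role: "lelu-engine"
    # Renders the secret to the file /vault/secrets/lelu inside the pod.
    vault.hashicorp.com/agent-inject-secret-lelu: "secret/data/lelu/engine"
```

The injector writes secrets to files under /vault/secrets rather than setting environment variables, so the container entrypoint typically needs to read or source that file before starting the Engine.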

Observability

Lelu exposes Prometheus metrics covering authorization decisions, agent behavior, and system performance. Configure Prometheus scraping and alerting for production deployments.
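
A minimal scrape job might look like the sketch below. The target address (engine:8080) and the /metrics path are assumptions, since this guide does not state where the Engine serves its metrics.

```yaml
# prometheus.yml fragment. Target host, port, and metrics path are assumptions.
scrape_configs:
  - job_name: lelu-engine
    metrics_path: /metrics
    scrape_interval: 15s
    static_configs:
      - targets: ["engine:8080"]
```

With several Engine replicas, service discovery (for example Kubernetes or DNS-based discovery) is usually a better fit than a static target list, so that every replica is scraped individually.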

Core Metrics

Authorization & HTTP metrics

lelu_http_requests_total{method="POST",path="/v1/agent/authorize",status="200"}
  # Request volume and status-code anomalies

lelu_http_request_duration_seconds{method="POST",path="/v1/agent/authorize"}
  # Latency SLO / p95 / p99

lelu_auth_decisions_total{type="agent",allowed="false"}
  # Deny-rate spikes and confidence policy pressure

lelu_agent_requests_total{agent_id,action,outcome}
  # Per-agent authorization outcomes

lelu_agent_confidence_score{agent_id,action}
  # Confidence score distribution

lelu_agent_risk_score{agent_id,action}
  # Risk score distribution

Behavioral Analytics Metrics

Reputation, anomalies, and alerts

lelu_agent_reputation_score{agent_id}
  # Current reputation score (0-1)

lelu_agent_anomaly_score{agent_id}
  # Anomaly detection score (0-1, higher = more anomalous)

lelu_agent_human_review_total{agent_id,reason}
  # Human review requirements by reason

lelu_policy_effectiveness_rate{policy_name,policy_version}
  # Policy success rate

Predictive Analytics Metrics

ML model performance

lelu_agent_prediction_accuracy{model_type,agent_id}
  # Model accuracy (0-1)

lelu_agent_prediction_latency_seconds{model_type}
  # Prediction latency

lelu_agent_predictions_total{model_type,outcome}
  # Prediction counts

lelu_agent_model_sample_count{model_type}
  # Training sample count

Multi-Agent Coordination Metrics

Delegation and swarm operations

lelu_agent_delegation_total{delegator,delegatee,outcome}
  # Agent delegation counts

lelu_swarm_operations_total{swarm_id,operation_type,outcome}
  # Swarm orchestration operations

lelu_swarm_agent_count{swarm_id}
  # Active agents per swarm

Recommended Alerts

Severity  Expression                                                 Description
critical  lelu_agent_reputation_score < 0.5                          Agent reputation dropped below 50%
critical  lelu_agent_anomaly_score > 0.9                             Severe anomaly detected
warning   lelu_http_request_duration_seconds{quantile="0.95"} > 0.5  P95 latency exceeds 500ms
warning   lelu_agent_prediction_accuracy < 0.7                       ML model accuracy below 70%
info      lelu_policy_effectiveness_rate < 0.6                       Policy effectiveness below 60%
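
The recommended alerts could be expressed as a Prometheus rule file roughly as follows. The expressions and severities come from the list above; the group name, alert names, `for:` durations, and annotation text are assumptions to adapt.

```yaml
# Prometheus alerting rules sketch. Alert names and "for" windows are assumptions.
groups:
  - name: lelu-alerts
    rules:
      - alert: LeluAgentReputationLow
        expr: lelu_agent_reputation_score < 0.5
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Reputation for agent {{ $labels.agent_id }} dropped below 0.5"
      - alert: LeluAgentSevereAnomaly
        expr: lelu_agent_anomaly_score > 0.9
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Severe anomaly detected for agent {{ $labels.agent_id }}"
      - alert: LeluHighP95Latency
        expr: lelu_http_request_duration_seconds{quantile="0.95"} > 0.5
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "P95 request latency exceeds 500ms"
      - alert: LeluPredictionAccuracyLow
        expr: lelu_agent_prediction_accuracy < 0.7
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: "ML model accuracy below 0.7 for {{ $labels.model_type }}"
      - alert: LeluPolicyEffectivenessLow
        expr: lelu_policy_effectiveness_rate < 0.6
        for: 1h
        labels:
          severity: info
        annotations:
          summary: "Effectiveness of policy {{ $labels.policy_name }} below 0.6"
```

The latency rule assumes the duration metric is exported as a summary with a quantile label, as the expression in this guide implies. If it is exported as a histogram instead, the rule would need histogram_quantile over the _bucket series.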

Advanced Features Configuration

Enable and configure advanced features for production deployments.

OpenTelemetry Tracing

environment variables

OTEL_EXPORTER_OTLP_ENDPOINT=http://jaeger:4318
OTEL_SERVICE_NAME=lelu-engine
OTEL_TRACES_EXPORTER=otlp
OTEL_TRACES_SAMPLER=always_on
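
The always_on sampler records every trace, which can become expensive at high request volume. As a sketch, the same settings with ratio-based sampling could be supplied through Compose as shown here. OTEL_TRACES_SAMPLER_ARG and the parentbased_traceidratio sampler are standard OpenTelemetry SDK settings; that the Engine honors them, and the 10% ratio, are assumptions.

```yaml
services:
  engine:
    environment:
      OTEL_EXPORTER_OTLP_ENDPOINT: http://jaeger:4318
      OTEL_SERVICE_NAME: lelu-engine
      OTEL_TRACES_EXPORTER: otlp
      # Sample roughly 10% of new traces while following the caller's decision
      # when a parent span already exists. The ratio is an illustrative value.
      OTEL_TRACES_SAMPLER: parentbased_traceidratio
      OTEL_TRACES_SAMPLER_ARG: "0.1"
```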

Behavioral Analytics

environment variables

# Reputation thresholds
REPUTATION_LOW_THRESHOLD=0.5
REPUTATION_MIN_DECISIONS=10

# Anomaly detection
ANOMALY_DETECTION_ENABLED=true
ANOMALY_SEVERITY_THRESHOLD=0.7
ANOMALY_WINDOW_SIZE=100

# Baseline management
BASELINE_SAMPLE_SIZE=100
BASELINE_REFRESH_INTERVAL=24h

Predictive Analytics

environment variables

# Model training
MIN_SAMPLES_FOR_MODEL=100
MODEL_UPDATE_INTERVAL=6h
CONFIDENCE_MODEL_WINDOW=30d
REVIEW_MODEL_WINDOW=14d

# Prediction thresholds
CONFIDENCE_PREDICTION_THRESHOLD=0.7
REVIEW_PREDICTION_THRESHOLD=0.6
POLICY_OPTIMIZATION_THRESHOLD=0.5

Prompt Injection Detection

environment variables

# Enabled by default
PROMPT_INJECTION_DETECTION_ENABLED=true
PROMPT_INJECTION_SEVERITY_THRESHOLD=0.8

# Alert on high-severity detections
PROMPT_INJECTION_ALERT_ENABLED=true

Multi-Agent Deployment Considerations

When deploying systems with multiple coordinating agents, consider these additional factors.

Delegation Chain Limits

Set maximum delegation depth to prevent infinite loops and excessive latency.

MAX_DELEGATION_DEPTH=5

Swarm Coordination

Configure swarm size limits and timeout values for coordinated operations.

MAX_SWARM_SIZE=10
SWARM_OPERATION_TIMEOUT=30s

Trace Context Propagation

Ensure OpenTelemetry context is propagated across agent boundaries for complete trace visibility.
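
One way to approach this, assuming the Engine and your agents are instrumented with standard OpenTelemetry SDKs, is to set the same propagators on every service. OTEL_PROPAGATORS is a standard SDK setting; the agent service name below is hypothetical.

```yaml
services:
  engine:
    environment:
      # W3C Trace Context plus baggage; every hop must use the same format.
      OTEL_PROPAGATORS: tracecontext,baggage
  my-agent:  # hypothetical agent service
    environment:
      OTEL_PROPAGATORS: tracecontext,baggage
```

With W3C Trace Context, the trace is carried in the traceparent HTTP header, so any proxy or custom client between agents and the Engine needs to forward that header unchanged.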
