
Audit Log, Live Feed, and Analytics

You want three things simultaneously:

  1. Audit: an append-only tamper-evident-ish record of what happened (requests, approvals, operations).
  2. Live feed: a real-time stream for UI/monitoring (pending approvals, durations, outcomes).
  3. Analytics + semantic search: fast queries and “find similar events” without leaking secret material.

This doc describes a broker-first approach that keeps Opaque a secrets broker, not a secret store.

1. Storage Strategy (Layered)

System of record: SQLite (transactional)

  • Primary store for:
      • audit events (append-only)
      • device pairings (public keys)
      • client identities (exe hashes, uid/gid)
      • provider metadata (non-secret config)
      • profiles (name -> secret refs)

SQLite gives durability, migrations, constraints, and low operational overhead.
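A minimal DDL sketch of what the audit table could look like; table and column names are illustrative (following the conceptual AuditEvent below), not the actual schema:

```sql
-- Hypothetical schema sketch; no secret values or raw locators are stored.
CREATE TABLE IF NOT EXISTS audit_events (
  event_id    TEXT PRIMARY KEY,   -- uuid
  ts_utc_ms   INTEGER NOT NULL,
  level       TEXT NOT NULL,      -- info|warn|error
  kind        TEXT NOT NULL,      -- request.received, approval.granted, ...
  request_id  TEXT,
  approval_id TEXT,
  operation   TEXT,
  outcome     TEXT,               -- ok|denied|error
  latency_ms  INTEGER,
  payload     TEXT                -- sanitized JSON, no secret values
);

-- Retention and tailing both scan by time, so index the timestamp.
CREATE INDEX IF NOT EXISTS idx_audit_events_ts ON audit_events (ts_utc_ms);
```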

Analytics store: Arrow/Parquet dataset

  • Periodically (or continuously) export/roll up audit events to Parquet files with a stable Arrow schema.
  • This enables:
      • DuckDB queries
      • DataFusion queries
      • Python/R/BI tooling

Parquet is an Arrow-friendly, columnar, compressible format for long-term history.

Semantic index: LanceDB (Arrow-native)

  • Build an embeddings index over sanitized event text (no secret values, no raw locators).
  • Store:
      • event_id
      • ts
      • event_text (sanitized)
      • embedding vector

LanceDB is a good fit specifically because it is Arrow-native and optimized for vector search.

2. Redaction Policy (Critical)

Audit/feeds become an exfiltration path if they contain sensitive data and are accessible to untrusted agent runtimes.

Rules:

  • Never log plaintext secrets.
  • Prefer not to log full secret locators (e.g., full Vault paths or 1Password item names) in LLM-visible channels.
  • Treat these as sensitive metadata:
      • secret ref locators
      • repository names (sometimes)
      • cluster names/namespaces (sometimes)
      • exact URLs and response bodies from authenticated HTTP proxy ops

Recommended split:

  • Human audit stream: richer detail (still no values).
  • Agent audit stream: heavily minimized (operation name + high-level target category + outcome).

Enforce this by separating:

  • transport (separate sockets/endpoints) and/or
  • authorization (role gating per client identity).
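The agent-stream minimization could be sketched as a pure mapping; field and function names here are illustrative, not the actual implementation:

```rust
// Sketch: the agent feed sees only the operation family, a coarse target
// category, and the outcome -- never the full operation target or locators.
#[derive(Debug, PartialEq)]
struct AgentFeedEvent {
    kind: String,             // e.g. "operation.succeeded"
    operation_family: String, // "github" from "github.set_actions_secret"
    target_category: String,  // "repo", "cluster", ... never the actual name
    outcome: String,          // ok|denied|error
}

fn minimize_for_agent(
    kind: &str,
    operation: &str,
    target_category: &str,
    outcome: &str,
) -> AgentFeedEvent {
    // Keep only the operation family: the text before the first '.'.
    let family = operation.split('.').next().unwrap_or("unknown").to_string();
    AgentFeedEvent {
        kind: kind.to_string(),
        operation_family: family,
        target_category: target_category.to_string(),
        outcome: outcome.to_string(),
    }
}
```

Pairing this with role gating means even a client that reaches the wrong endpoint only ever receives the minimized shape.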

3. Event Model

Event taxonomy (suggested)

  • request.received
  • policy.denied
  • approval.required
  • approval.presented
  • approval.granted
  • approval.denied
  • operation.started
  • operation.succeeded
  • operation.failed
  • provider.fetch.started / provider.fetch.finished (metadata only)

Correlation IDs

Every operation should carry a correlation chain:

  • request_id: end-to-end idempotency key from client or generated by daemon
  • approval_id: approval request id (may be multiple if step-up)
  • event_id: unique per event

Minimal event fields (conceptual)

struct AuditEvent {
  event_id: String,           // uuid
  ts_utc_ms: i64,
  level: String,              // info|warn|error
  kind: String,               // request.received, approval.granted, ...
  request_id: Option<String>,
  approval_id: Option<String>,

  client: ClientSummary,      // observed uid/gid + exe hash + optional codesign
  operation: Option<String>,  // github.set_actions_secret, k8s.set_secret, ...
  target: Option<TargetSummary>,

  outcome: Option<String>,    // ok|denied|error
  latency_ms: Option<i64>,    // approval latency, op latency, etc

  // Optional and sensitive: store only when explicitly enabled.
  location: Option<Location>,

  // No secret values.
  // Avoid full locators by default; use stable ids or hashed references.
  secret_names: Vec<String>,  // e.g. ["JWT","DATABASE_URL"]
  secret_ref_ids: Vec<String> // e.g. hashed refs or profile keys
}
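Deriving a stable secret_ref_id could look like the sketch below. DefaultHasher stands in for a real keyed cryptographic hash (e.g. HMAC-SHA-256): it is not collision-resistant and must not be used as-is, but it shows the shape (stable id, not reversible to the locator in the audit record):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Sketch: derive a stable, non-reversible reference id from a full locator
// so the audit record never carries the locator itself.
// NOTE: replace DefaultHasher with a keyed cryptographic hash in practice.
fn secret_ref_id(locator: &str) -> String {
    let mut h = DefaultHasher::new();
    locator.hash(&mut h);
    format!("ref-{:016x}", h.finish())
}
```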

Location (optional, privacy-sensitive)

Location can mean multiple things:

  • iOS approval device location (requires explicit permission)
  • network info (LAN IP, WiFi SSID) is often more sensitive than helpful

Recommendation:

  • default: location = None
  • opt-in: store coarse location only:
      • country/region if available, or
      • geohash with low precision, or
      • just “network = home/office” tags from user config
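The opt-in coarse-location options above could be modeled as a small enum; the names and the precision cap are illustrative:

```rust
// Sketch of the coarse-location policy: only low-precision variants exist,
// so a precise position can never be stored by accident.
#[derive(Debug, PartialEq)]
enum CoarseLocation {
    Region(String),     // e.g. "DE" or "EU"
    Geohash(String),    // truncated to low precision
    NetworkTag(String), // user-configured, e.g. "home" or "office"
}

/// Truncate a full-precision geohash to at most `precision` characters,
/// discarding the fine-grained suffix before anything is persisted.
fn coarse_geohash(full: &str, precision: usize) -> CoarseLocation {
    CoarseLocation::Geohash(full.chars().take(precision).collect())
}
```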

4. Live Feed

Feed requirements

  • near-real-time UI updates:
      • new requests
      • pending approvals
      • granted/denied
      • operation outcomes
  • filtering:
      • by repo/project/cluster
      • by operation kind
      • by client identity

Implementation shape

Internally:

  • append each event to SQLite
  • publish each event to an in-memory pubsub (e.g. tokio::broadcast)
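The append-then-publish step could look like the sketch below. The real daemon would use tokio::broadcast as noted above; this std-only version (with illustrative names) shows the fan-out shape:

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Mutex;

#[derive(Clone, Debug)]
struct Event {
    kind: String, // e.g. "approval.granted"
}

// Sketch: an in-memory feed that fans each event out to all live subscribers.
#[derive(Default)]
struct Feed {
    subscribers: Mutex<Vec<Sender<Event>>>,
}

impl Feed {
    fn subscribe(&self) -> Receiver<Event> {
        let (tx, rx) = channel();
        self.subscribers.lock().unwrap().push(tx);
        rx
    }

    /// Called after the event has been appended to SQLite: fan the event
    /// out to every subscriber, dropping any whose receiver is gone.
    fn publish(&self, ev: &Event) {
        self.subscribers
            .lock()
            .unwrap()
            .retain(|tx| tx.send(ev.clone()).is_ok());
    }
}
```

Appending to SQLite first means the durable record never depends on any subscriber keeping up.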

Externally (pick one or more):

  • local SQLite query via CLI (opaque audit tail)
  • HTTP localhost endpoint using SSE for web/desktop UI (/audit/stream)
  • (later) Arrow Flight / FlightSQL stream for Arrow-native consumers

Current SSE runtime:

  • disabled by default
  • enable with OPAQUE_AUDIT_SSE_ADDR=127.0.0.1:8787
  • optional tuning:
      • OPAQUE_AUDIT_SSE_POLL_MS
      • OPAQUE_AUDIT_SSE_BATCH_LIMIT
  • endpoint: GET /audit/stream?since_ms=<unix_ms>

Preventing side channels

Make sure untrusted agent clients cannot subscribe to the human feed by default.

5. Analytics

Built-in metrics (daemon can compute)

  • approvals:
      • count granted/denied
      • median approval latency
      • step-up frequency (local_bio only vs local_bio+ios_faceid)
  • operations:
      • success/error rates by operation
      • p95 latency by operation kind
      • top targets (repo/project/cluster)
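The median/p95 computations above need only a sorted sample buffer; a nearest-rank sketch (illustrative, not the daemon's actual code):

```rust
// Sketch: nearest-rank percentile over in-memory latency samples.
// percentile(&mut samples, 50.0) is the median; 95.0 gives p95.
fn percentile(samples: &mut Vec<i64>, p: f64) -> Option<i64> {
    if samples.is_empty() {
        return None;
    }
    samples.sort_unstable();
    // Nearest-rank: ceil(p/100 * n) is a 1-based rank into the sorted data.
    let rank = ((p / 100.0) * samples.len() as f64).ceil() as usize;
    Some(samples[rank.saturating_sub(1).min(samples.len() - 1)])
}
```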

Columnar analytics with Arrow/Parquet + DuckDB

For long-term analysis:

  • export audit events to Parquet partitions:
      • partition by date (dt=YYYY-MM-DD)
      • optionally partition by operation_family

DuckDB can query these locally with high performance, including joins, group-bys, and window functions.
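For example, a DuckDB query over the date-partitioned layout might look like this (paths and column names are illustrative):

```sql
-- Error rate and p95 latency per operation over recent partitions.
-- hive_partitioning exposes the dt=YYYY-MM-DD directory as a `dt` column.
SELECT operation,
       count(*)                        AS ops,
       avg(latency_ms)                 AS avg_latency_ms,
       quantile_cont(latency_ms, 0.95) AS p95_latency_ms
FROM read_parquet('audit/dt=*/*.parquet', hive_partitioning = true)
WHERE dt >= '2025-01-01'
  AND outcome = 'error'
GROUP BY operation
ORDER BY ops DESC;
```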

6. Semantic Search

What to embed

Only embed a sanitized textual summary, e.g.:

  • "approval denied for github.set_actions_secret repo=org/repo env=prod secret=JWT client=claude-code"

Never embed:

  • secret values
  • access tokens
  • raw HTTP bodies
  • full secret ref locators if you consider them sensitive

Indexing pipeline

  • emit AuditEvent
  • derive event_text_sanitized
  • compute embedding asynchronously (so approvals/operations are not blocked)
  • upsert into LanceDB with event_id as primary key
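The derive-event_text step could be sketched as below; the field names follow the conceptual AuditEvent above, everything else is illustrative. Only already-sanitized fields are included, so nothing sensitive can leak into the index:

```rust
// Sketch: build the sanitized text that gets embedded. Secret names (not
// values) and the operation name are allowed; locators and bodies are not.
fn event_text_sanitized(
    kind: &str,
    operation: Option<&str>,
    secret_names: &[String],
    outcome: Option<&str>,
) -> String {
    let mut parts = vec![kind.to_string()];
    if let Some(op) = operation {
        parts.push(format!("op={op}"));
    }
    if !secret_names.is_empty() {
        parts.push(format!("secrets={}", secret_names.join(",")));
    }
    if let Some(o) = outcome {
        parts.push(format!("outcome={o}"));
    }
    parts.join(" ")
}
```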

Queries

  • audit.search_semantic(query, limit) returns:
      • event_ids + similarity scores + short snippet
  • then fetch details from SQLite (role-gated and redacted appropriately)

7. Retention and Backpressure

Audit can grow without bound.

Implemented

  • Periodic cleanup: the writer thread deletes events older than retention_days every 6 hours (configurable via audit_cleanup_interval_secs in config.toml), using batched deletes of 5,000 rows to avoid stalling the writer
  • Incremental vacuum: after each cleanup pass, PRAGMA incremental_vacuum(500) reclaims freed pages without a full VACUUM
  • Auto-vacuum: new databases are created with PRAGMA auto_vacuum = INCREMENTAL; existing databases log an info message suggesting a one-time VACUUM migration
  • Disk-backed overflow queue: when the bounded channel (4,096 events) fills, events spill to a separate audit.overflow.db file (up to 100k events) rather than being dropped; the writer thread drains overflow events back into the main DB after each batch
  • Push-based SSE: the writer thread signals an AuditNotify handle after each successful insert, waking SSE consumers immediately instead of polling on a fixed interval; SSE falls back to polling when no notify handle is available
  • Older events are rolled up to Parquet and can optionally be removed from SQLite
  • The embeddings store follows the same retention window
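The batched retention delete described above could look like this (table and column names are illustrative; DELETE ... LIMIT is not enabled in all SQLite builds, so the rowid-subquery form is shown):

```sql
-- One cleanup batch: delete up to 5,000 expired rows without stalling
-- the writer; repeat until no rows match.
DELETE FROM audit_events
WHERE rowid IN (
  SELECT rowid FROM audit_events
  WHERE ts_utc_ms < :cutoff_ms
  LIMIT 5000
);

-- After the cleanup pass, reclaim freed pages incrementally.
PRAGMA incremental_vacuum(500);
```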