Docs — logdive

Quick start

Install the binary, ingest a log file, run a query. Five lines, no flags worth worrying about yet.

 $ cargo install logdive
$ logdive ingest --file /var/log/app.log
 → 12,847 lines indexed in 78ms
$ logdive query 'level=error last 1h'
$ logdive stats

By default, the index lives at ~/.logdive/index.db. Override with --db <path> on any command, or set LOGDIVE_DB.

Installation

Three supported paths.

From crates.io

 $ cargo install logdive
$ cargo install logdive-api # optional, for the HTTP server

From Docker

 $ docker pull ghcr.io/aryagorjipour/logdive:0.3.0

From source

 $ git clone https://github.com/Aryagorjipour/logdive
$ cd logdive
$ cargo build --release

Resulting binaries: logdive at 3.8 MB stripped, logdive-api at 4.1 MB stripped. MSRV: Rust 1.85.

The CLI

One binary, four subcommands.

ingest

Reads a file or stdin, parses log lines, and inserts them into the SQLite index. Supports JSON (default), logfmt, and plain text. Deduplicates via blake3 content hash.

 $ logdive ingest --file ./logs/app.log
$ logdive ingest --file ./logs/app.log --format logfmt --tag production
$ docker logs my-container | logdive ingest --tag my-container
$ logdive ingest --file ./logs/app.log --follow

--file <PATH>: Read from a file. Mutually exclusive with stdin.
--format json|logfmt|plain: Input format. Default json.
--tag <TAG>: Attach a tag to every ingested entry that does not already have a tag field.
--timestamp-now: Assign current UTC time to entries lacking a timestamp field instead of skipping them.
--follow: Tail the file for new lines, similar to tail -f. Detects log rotation and truncation. Requires --file. Unix only (Windows support is v0.4+).
--db <PATH>: Database path override. Default ~/.logdive/index.db. Also settable via LOGDIVE_DB.
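
The --tag and --timestamp-now rules amount to a small per-entry normalisation step. The sketch below illustrates them in Python; it is not logdive's Rust code. The function name normalise_entry and the dict-shaped entry are assumptions made for the example.

```python
from datetime import datetime, timezone
from typing import Optional


def normalise_entry(entry: dict, tag: Optional[str] = None,
                    timestamp_now: bool = False) -> Optional[dict]:
    """Apply the documented --tag and --timestamp-now rules to one parsed entry.

    Returns the normalised entry, or None when the entry would be skipped.
    Hypothetical helper: the name and the dict shape are illustrative only.
    """
    result = dict(entry)  # copy so the caller's dict is left untouched

    # --tag only fills in a tag when the entry does not already carry one.
    if tag is not None and "tag" not in result:
        result["tag"] = tag

    # Entries without a timestamp are skipped unless --timestamp-now is set.
    if "timestamp" not in result:
        if not timestamp_now:
            return None
        now = datetime.now(timezone.utc)
        result["timestamp"] = now.strftime("%Y-%m-%dT%H:%M:%SZ")

    return result
```

An entry that already has a tag keeps it, and an entry without a timestamp yields None unless timestamp_now is True.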

query

Evaluates a query expression and prints matching rows.

 $ logdive query 'level=error AND service=payments last 2h'
$ logdive query '(level=error OR level=warn) AND service=payments'
$ logdive query 'level=error OR level=warn' --output json
$ logdive query 'message contains "timeout" last 24h'
$ logdive query 'since 2026-01-01' --limit 0 --offset 200

--output pretty|json: Output format. Default pretty (colored). json is newline-delimited, pipe-friendly.
--limit <N>: Maximum results. Default 1000. Use 0 for unlimited.
--offset <N>: Skip the first N results. Use with --limit for page navigation. Default 0.
--db <PATH>: Database path override.
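
One way to picture --limit and --offset is as a SQL LIMIT/OFFSET pair. The sketch below assumes that mapping; the docs do not say how logdive builds its SQL. It relies on SQLite treating a negative LIMIT as "no upper bound". The table name entries and the helpers page_clause and fetch_ids are hypothetical.

```python
import sqlite3


def page_clause(limit: int = 1000, offset: int = 0) -> str:
    """Translate CLI-style paging values into a SQLite LIMIT/OFFSET clause.

    A limit of 0 means "unlimited" on the command line; SQLite expresses
    that as a negative LIMIT. Hypothetical helper, not logdive's code.
    """
    if limit < 0 or offset < 0:
        raise ValueError("limit and offset must be non-negative")
    sql_limit = -1 if limit == 0 else limit
    return f"LIMIT {sql_limit} OFFSET {offset}"


# Demo against an in-memory table holding ids 1..10.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO entries (id) VALUES (?)",
                 [(i,) for i in range(1, 11)])


def fetch_ids(limit: int, offset: int) -> list:
    """Return one page of ids in a stable order."""
    query = f"SELECT id FROM entries ORDER BY id {page_clause(limit, offset)}"
    return [row[0] for row in conn.execute(query)]
```

With this mapping, --limit 0 --offset 200 becomes LIMIT -1 OFFSET 200: skip 200 rows, then return everything that remains.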

stats

Reports aggregate metadata about the index — row count, time range, tags, and DB size on disk.

 $ logdive stats
logdive index: /home/user/.logdive/index.db
  Entries:       42,317
  Time range:    2026-03-14T08:22:01Z → 2026-04-22T19:45:03Z
  Tags:          api, nginx, payments, worker, (untagged)
  DB size:       8.4 MB (8,400,000 bytes)

prune

Deletes entries outside a retention window, then vacuums the database file to reclaim disk space. Safe for cron.

 $ logdive prune --older-than 30d
$ logdive prune --before 2026-01-01
$ logdive prune --older-than 7d --yes

--older-than <DURATION>: Delete entries older than this. Format: integer + m, h, or d. E.g. 30d, 24h. Mutually exclusive with --before.
--before <DATETIME>: Delete entries before this datetime. Accepts RFC 3339, ISO naive datetime, or ISO date. Mutually exclusive with --older-than.
--yes: Skip the interactive [y/N] confirmation. Useful in scripts and cron.
--db <PATH>: Database path override.
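
The --older-than format (an integer followed by m, h, or d) is simple enough to sketch. The Python below is an illustration under that stated format only; parse_duration and cutoff are hypothetical names, and logdive's own parser may accept or reject edge cases differently.

```python
import re
from datetime import datetime, timedelta, timezone

# Unit suffixes named in the docs, mapped to timedelta keyword arguments.
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}
_DURATION_RE = re.compile(r"([0-9]+)([mhd])")


def parse_duration(text: str) -> timedelta:
    """Parse strings such as '30d' or '24h' into a timedelta."""
    match = _DURATION_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"invalid duration: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def cutoff(older_than: str, now: datetime) -> datetime:
    """Entries with a timestamp before the returned instant would be pruned."""
    return now - parse_duration(older_than)
```

For example, cutoff("7d", now) with now at 2026-04-22T12:00:00Z gives 2026-04-15T12:00:00Z.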

The HTTP API

The logdive-api binary serves the same query language over HTTP, read-only. No authentication — bind it to localhost.

 $ logdive-api --db ~/.logdive/index.db --port 4000

GET /query

Runs a query and returns matching entries as newline-delimited JSON.

q (required): Query expression. URL-encoded. Same syntax as CLI.
limit (optional): Maximum results. Default 1000. 0 for unlimited.
offset (optional): Skip the first N results. Default 0. Use with limit for pagination.

 $ curl 'http://127.0.0.1:4000/query?q=level%3Derror&limit=50'
{"timestamp":"2026-05-21T14:02:31Z","level":"error","message":"..."}
{"timestamp":"2026-05-21T14:02:33Z","level":"error","message":"..."}
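
A client has two jobs here: URL-encode the expression and split the response on newlines. The Python sketch below shows both using only the standard library. The helper names build_query_url and parse_ndjson are hypothetical, and the base URL assumes the default host and port.

```python
import json
from urllib.parse import quote, urlencode


def build_query_url(expr: str, limit: int = 1000, offset: int = 0,
                    base: str = "http://127.0.0.1:4000") -> str:
    """Build a /query URL with the expression percent-encoded."""
    params = {"q": expr, "limit": limit, "offset": offset}
    # quote_via=quote encodes spaces as %20 rather than "+".
    return f"{base}/query?{urlencode(params, quote_via=quote)}"


def parse_ndjson(body: str) -> list:
    """Decode a newline-delimited JSON body into a list of dicts."""
    return [json.loads(line) for line in body.splitlines() if line.strip()]
```

To run it against a live server, pass the URL to urllib.request.urlopen and feed the decoded body to parse_ndjson.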

GET /stats

Returns aggregate metadata as a single JSON object.

 {
  "entries": 42317,
  "min_timestamp": "2026-03-14T08:22:01Z",
  "max_timestamp": "2026-04-22T19:45:03Z",
  "tags": [null, "api", "nginx", "payments", "worker"],
  "db_size_bytes": 8400000,
  "db_path": "/home/user/.logdive/index.db"
}
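
The JSON object carries the same information the stats subcommand prints. The Python sketch below turns one into the other. Two points are inferred from comparing the two samples rather than stated anywhere: that a null tag corresponds to "(untagged)", and that MB means 1,000,000 bytes. The function name summarise_stats is hypothetical.

```python
import json


def summarise_stats(payload: str) -> str:
    """Render a /stats JSON body as a short human-readable summary."""
    stats = json.loads(payload)

    # Named tags first, sorted; a null tag is shown last as "(untagged)".
    tags = sorted(t for t in stats["tags"] if t is not None)
    if None in stats["tags"]:
        tags.append("(untagged)")

    size = stats["db_size_bytes"]
    return "\n".join([
        f"Entries:    {stats['entries']:,}",
        f"Time range: {stats['min_timestamp']} -> {stats['max_timestamp']}",
        f"Tags:       {', '.join(tags)}",
        f"DB size:    {size / 1_000_000:.1f} MB ({size:,} bytes)",
    ])
```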

GET /version

Returns build version and supported capabilities. Never touches the database. Use as a liveness probe.

 {
  "version": "0.3.0",
  "formats": ["json", "logfmt", "plain"],
  "capabilities": ["query", "stats", "version"]
}

Query language

Boolean expressions over fields, plus an optional trailing time window. Fields can be indexed columns (timestamp, level, message, tag) or arbitrary JSON paths (user.id, request.method).

Grammar

Keywords are case-insensitive. AND binds tighter than OR; use parentheses to override precedence.
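
The precedence rule can be made concrete with a small recursive-descent parser. The Python sketch below covers only the boolean structure (AND, OR, parentheses) and treats each clause as opaque text. It does not handle the trailing time window, and it is an illustration of the rule, not logdive's parser. All function names are hypothetical.

```python
import re

# Parentheses, quoted strings, or runs of non-space characters.
_TOKEN_RE = re.compile(r'\(|\)|"[^"]*"|[^\s()]+')


def parse(text):
    """Parse a boolean query into nested ("AND"|"OR", left, right) tuples."""
    tokens = _TOKEN_RE.findall(text)
    tree, pos = _parse_or(tokens, 0)
    if pos != len(tokens):
        raise ValueError(f"unexpected token: {tokens[pos]!r}")
    return tree


def _parse_or(tokens, pos):
    # Lowest precedence: OR separates AND groups.
    left, pos = _parse_and(tokens, pos)
    while pos < len(tokens) and tokens[pos].upper() == "OR":
        right, pos = _parse_and(tokens, pos + 1)
        left = ("OR", left, right)
    return left, pos


def _parse_and(tokens, pos):
    # Higher precedence: AND binds clauses within a group.
    left, pos = _parse_atom(tokens, pos)
    while pos < len(tokens) and tokens[pos].upper() == "AND":
        right, pos = _parse_atom(tokens, pos + 1)
        left = ("AND", left, right)
    return left, pos


def _parse_atom(tokens, pos):
    if pos >= len(tokens):
        raise ValueError("unexpected end of query")
    if tokens[pos] == "(":
        tree, pos = _parse_or(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise ValueError("missing closing parenthesis")
        return tree, pos + 1
    # A clause is a run of tokens up to the next keyword or parenthesis.
    start = pos
    while (pos < len(tokens) and tokens[pos] not in ("(", ")")
           and tokens[pos].upper() not in ("AND", "OR")):
        pos += 1
    if pos == start:
        raise ValueError(f"expected a clause, got {tokens[pos]!r}")
    return " ".join(tokens[start:pos]), pos
```

Because _parse_or calls _parse_and for each operand, a AND b OR c AND d groups as (a AND b) OR (c AND d). Parentheses re-enter _parse_or and so override that grouping.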

Operators

=: Exact match. Hits an index on known fields.
!=: Negation of =. Still indexed.
>, <: Numeric or lexicographic comparison.
CONTAINS: Case-insensitive substring match. Full-table scan on the target field.
AND: Binds clauses within a group. Tighter precedence than OR.
OR: Separates AND groups. Each group is evaluated independently.
last <N>m|h|d: Time window ending now.
since <datetime>: Absolute lower bound. RFC 3339, ISO naive datetime, or ISO date.
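
The three datetime forms accepted by since can be normalised to a single UTC instant. The Python sketch below shows one way to do it. Treating naive values as UTC is an assumption, since the docs do not say which zone applies, and parse_since is a hypothetical name.

```python
from datetime import datetime, timezone


def parse_since(text: str) -> datetime:
    """Parse RFC 3339, ISO naive datetime, or ISO date into an aware UTC datetime.

    Assumption: values without an offset are interpreted as UTC.
    """
    value = text.strip()
    # Older Python versions reject a trailing "Z", so spell it as an offset.
    if value.endswith(("Z", "z")):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)
```

A bare date such as 2026-01-01 resolves to midnight at the start of that day.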

Examples

 # All errors from payments in the last 2 hours
level=error AND service=payments last 2h
# Errors OR warnings
level=error OR level=warn
# Parentheses for explicit precedence
(level=error OR level=warn) AND service=payments
# AND within each OR branch (no parens needed — AND binds tighter)
level=error AND service=payments OR level=warn AND tag=worker
# Case-insensitive level match (ERROR, error, Error all hit the same rows)
level=ERROR AND service=payments
# Substring search on the message body
message contains "timeout" last 24h
# Slow requests over 500ms
duration_ms > 500
# Time range by absolute date
since 2026-04-15T09:00:00Z

Configuration

All configuration is via command-line flags, with environment-variable fallbacks for containerised deployments.

LOGDIVE_DB: Database path. Fallback for --db. Default ~/.logdive/index.db.
LOGDIVE_LOG: Verbosity filter for internal diagnostics (tracing_subscriber::EnvFilter). Default warn.
LOGDIVE_API_PORT: Port for logdive-api. Fallback for --port. Default 4000.
LOGDIVE_API_HOST: Bind host for logdive-api. Fallback for --host. Default 127.0.0.1.
LOGDIVE_API_CORS_ORIGINS: Allowed CORS origins. Comma-separated list or *. Default: disabled.
NO_COLOR: Suppress ANSI colour in logdive query output when set.
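
The lookup order described above (flag first, then environment variable, then built-in default) can be sketched in a few lines of Python. The function name resolve_setting is hypothetical, and the sketch returns an empty variable as-is; whether logdive treats an empty value as unset is not stated.

```python
import os
from typing import Mapping, Optional


def resolve_setting(flag_value: Optional[str], env_name: str, default: str,
                    env: Mapping[str, str] = os.environ) -> str:
    """Return the flag value if given, else the env var, else the default."""
    if flag_value is not None:
        return flag_value
    return env.get(env_name, default)
```

For example, resolve_setting(None, "LOGDIVE_API_PORT", "4000") yields "4000" when the variable is not set.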

Docker

Multi-arch images (linux/amd64 and linux/arm64) published to GHCR on every merge to main and every version tag.

 # Start the API server
$ docker volume create logdive-data
$ docker run -d \
    --name logdive \
    -v logdive-data:/data \
    -p 4000:4000 \
    ghcr.io/aryagorjipour/logdive:0.3.0

 # Ingest with the CLI against the same volume
$ docker run --rm \
    -v logdive-data:/data \
    -v /path/to/your/logs:/logs:ro \
    --entrypoint logdive \
    ghcr.io/aryagorjipour/logdive:0.3.0 \
    ingest --file /logs/app.log --tag production

Default entrypoint is logdive-api. The image pre-sets LOGDIVE_DB=/data/index.db and LOGDIVE_API_HOST=0.0.0.0. Runtime is gcr.io/distroless/cc-debian12:nonroot — no shell, no curl, no root. HEALTHCHECK uses logdive-api --health-check, which opens a TCP connection to the server's own port via stdlib — no HTTP client required.
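
A TCP-level health check of the kind described above needs nothing beyond a socket connect. The Python sketch below shows the idea; the real check is built into logdive-api, and port_is_open is a hypothetical name.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A successful connect shows only that something is listening on the port, not that queries succeed.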

Architecture

A Cargo workspace with three crates. logdive-core is publishable to crates.io as a standalone library.

logdive/
├── logdive-core 
├── logdive 
└── logdive-api

Storage model

Hybrid schema: timestamp, level, message, tag are indexed columns. Everything else is stored in a fields TEXT JSON blob and queried at read time via SQLite's json_extract(). Deduplication uses a raw_hash UNIQUE column with blake3 hashes and INSERT OR IGNORE.
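
The storage model can be tried out with Python's built-in sqlite3 module. The sketch below is an approximation, not logdive's actual DDL. The table name entries, the index names, and the column types are assumptions; only the column names come from the description above. It also uses hashlib.blake2b as a stand-in hash, because blake3 is not in Python's standard library.

```python
import hashlib
import json
import sqlite3

# Assumed schema: four indexed columns, a JSON blob, and a unique hash.
SCHEMA = """
CREATE TABLE entries (
    id        INTEGER PRIMARY KEY,
    timestamp TEXT NOT NULL,
    level     TEXT,
    message   TEXT,
    tag       TEXT,
    fields    TEXT NOT NULL,
    raw_hash  TEXT NOT NULL UNIQUE
);
CREATE INDEX idx_entries_timestamp ON entries (timestamp);
CREATE INDEX idx_entries_level ON entries (level);
CREATE INDEX idx_entries_message ON entries (message);
CREATE INDEX idx_entries_tag ON entries (tag);
"""

KNOWN_COLUMNS = ("timestamp", "level", "message", "tag")


def ingest_line(conn: sqlite3.Connection, raw_line: str) -> bool:
    """Insert one JSON log line; return True if stored, False if a duplicate."""
    entry = json.loads(raw_line)
    extra = {k: v for k, v in entry.items() if k not in KNOWN_COLUMNS}
    # Stand-in for blake3: any stable content hash shows the dedup idea.
    raw_hash = hashlib.blake2b(raw_line.encode("utf-8"), digest_size=32).hexdigest()
    before = conn.total_changes
    conn.execute(
        "INSERT OR IGNORE INTO entries "
        "(timestamp, level, message, tag, fields, raw_hash) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (entry.get("timestamp"), entry.get("level"), entry.get("message"),
         entry.get("tag"), json.dumps(extra), raw_hash),
    )
    # total_changes only grows when a row was actually inserted.
    return conn.total_changes > before


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

line = ('{"timestamp":"2026-05-21T14:02:31Z","level":"error",'
        '"message":"timeout","duration_ms":812,"user":{"id":"u42"}}')
first = ingest_line(conn, line)   # stored
second = ingest_line(conn, line)  # ignored: same raw_hash

# Non-column fields are read back out of the JSON blob at query time.
slow = conn.execute(
    "SELECT message, json_extract(fields, '$.user.id') FROM entries "
    "WHERE json_extract(fields, '$.duration_ms') > 500"
).fetchall()
```

The second ingest_line call is a no-op because the UNIQUE constraint on raw_hash turns the repeated insert into an ignored conflict. Filters such as duration_ms > 500 read from the JSON blob, so in this sketch they scan rows instead of using an index.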

Why SQLite

Zero infrastructure. A single file, transactional, with a query planner that handles indexes, joins, and aggregates. The interesting work is the query parser and the storage schema; SQLite handles the rest.

Why Rust

Parsing log lines at 200k/s leaves almost no budget for garbage-collection pauses. The ingest path is where Rust earns its place; the query path is mostly SQL.