Quick start
Install the binary, ingest a log file, run a query. Five lines, no flags worth worrying about yet.
$ cargo install logdive $ logdive ingest --file /var/log/app.log → 12,847 lines indexed in 78ms $ logdive query 'level=error last 1h' $ logdive stats
By default, the index lives at ~/.logdive/index.db. Override with --db <path> on any command, or set LOGDIVE_DB.
Installation
Three supported paths.
From crates.io
$ cargo install logdive $ cargo install logdive-api # optional, for the HTTP server
From Docker
$ docker pull ghcr.io/aryagorjipour/logdive:0.3.0 From source
$ git clone https://github.com/Aryagorjipour/logdive $ cd logdive $ cargo build --release
Resulting binaries: logdive at 3.8 MB stripped, logdive-api at 4.1 MB stripped. MSRV: Rust 1.85.
The CLI
One binary, four subcommands.
ingest
Reads a file or stdin, parses log lines, and inserts them into the SQLite index. Supports JSON (default), logfmt, and plain text. Deduplicates via blake3 content hash.
$ logdive ingest --file ./logs/app.log $ logdive ingest --file ./logs/app.log --format logfmt --tag production $ docker logs my-container | logdive ingest --tag my-container $ logdive ingest --file ./logs/app.log --follow
- --file <PATH>
- Read from a file. Mutually exclusive with stdin.
- --format json|logfmt|plain
- Input format. Default
json. - --tag <TAG>
- Attach a tag to every ingested entry that does not already have a
tagfield. - --timestamp-now
- Assign current UTC time to entries lacking a
timestampfield instead of skipping them. - --follow
- Tail the file for new lines, similar to
tail -f. Detects log rotation and truncation. Requires--file. Unix only (Windows support is v0.4+). - --db <PATH>
- Database path override. Default
~/.logdive/index.db. Also settable viaLOGDIVE_DB.
query
Evaluates a query expression and prints matching rows.
$ logdive query 'level=error AND service=payments last 2h' $ logdive query '(level=error OR level=warn) AND service=payments' $ logdive query 'level=error OR level=warn' --output json $ logdive query 'message contains "timeout" last 24h' $ logdive query 'since 2026-01-01' --limit 0 --offset 200
- --output pretty|json
- Output format. Default
pretty(colored).jsonis newline-delimited, pipe-friendly. - --limit <N>
- Maximum results. Default 1000. Use
0for unlimited. - --offset <N>
- Skip the first N results. Use with
--limitfor page navigation. Default 0. - --db <PATH>
- Database path override.
stats
Reports aggregate metadata about the index — row count, time range, tags, and DB size on disk.
$ logdive stats logdive index: /home/user/.logdive/index.db Entries: 42,317 Time range: 2026-03-14T08:22:01Z → 2026-04-22T19:45:03Z Tags: api, nginx, payments, worker, (untagged) DB size: 8.4 MB (8,400,000 bytes)
prune
Deletes entries outside a retention window, then vacuums the database file to reclaim disk space. Safe for cron.
$ logdive prune --older-than 30d $ logdive prune --before 2026-01-01 $ logdive prune --older-than 7d --yes
- --older-than <DURATION>
- Delete entries older than this. Format: integer +
m,h, ord. E.g.30d,24h. Mutually exclusive with--before. - --before <DATETIME>
- Delete entries before this datetime. Accepts RFC 3339, ISO naive datetime, or ISO date. Mutually exclusive with
--older-than. - --yes
- Skip the interactive
[y/N]confirmation. Useful in scripts and cron. - --db <PATH>
- Database path override.
The HTTP API
The logdive-api binary serves the same query language over HTTP, read-only. No authentication — bind it to localhost.
$ logdive-api --db ~/.logdive/index.db --port 4000
GET /query
Runs a query and returns matching entries as newline-delimited JSON.
- q (required)
- Query expression. URL-encoded. Same syntax as CLI.
- limit (optional)
- Maximum results. Default 1000.
0for unlimited. - offset (optional)
- Skip the first N results. Default 0. Use with
limitfor pagination.
$ curl 'http://127.0.0.1:4000/query?q=level%3Derror&limit=50' {"timestamp":"2026-05-21T14:02:31Z","level":"error","message":"..."} {"timestamp":"2026-05-21T14:02:33Z","level":"error","message":"..."}
GET /stats
Returns aggregate metadata as a single JSON object.
{
"entries": 42317,
"min_timestamp": "2026-03-14T08:22:01Z",
"max_timestamp": "2026-04-22T19:45:03Z",
"tags": [null, "api", "nginx", "payments", "worker"],
"db_size_bytes": 8400000,
"db_path": "/home/user/.logdive/index.db"
} GET /version
Returns build version and supported capabilities. Never touches the database. Use as a liveness probe.
{
"version": "0.3.0",
"formats": ["json", "logfmt", "plain"],
"capabilities": ["query", "stats", "version"]
} Query language
Boolean expressions over fields, plus an optional trailing time window. Fields can be indexed columns (timestamp, level, message, tag) or arbitrary JSON paths (user.id, request.method).
Grammar
Keywords are case-insensitive. AND binds tighter than OR; use parentheses to override precedence.
Operators
- =
- Exact match. Hits an index on known fields.
- !=
- Negation of
=. Still indexed. - >, <
- Numeric or lexicographic comparison.
- CONTAINS
- Case-insensitive substring match. Full-table scan on the target field.
- AND
- Binds clauses within a group. Tighter precedence than OR.
- OR
- Separates AND groups. Each group is evaluated independently.
- last <N>m|h|d
- Time window ending now.
- since <datetime>
- Absolute lower bound. RFC 3339, ISO naive datetime, or ISO date.
Examples
# All errors from payments in the last 2 hours level=error AND service=payments last 2h # Errors OR warnings level=error OR level=warn # Parentheses for explicit precedence (level=error OR level=warn) AND service=payments # AND within each OR branch (no parens needed — AND binds tighter) level=error AND service=payments OR level=warn AND tag=worker # Case-insensitive level match (ERROR, error, Error all hit the same rows) level=ERROR AND service=payments # Substring search on the message body message contains "timeout" last 24h # Slow requests over 500ms duration_ms > 500 # Time range by absolute date since 2026-04-15T09:00:00Z
Configuration
All configuration is via command-line flags, with environment-variable fallbacks for containerised deployments.
- LOGDIVE_DB
- Database path. Fallback for
--db. Default~/.logdive/index.db. - LOGDIVE_LOG
- Verbosity filter for internal diagnostics (
tracing_subscriber::EnvFilter). Defaultwarn. - LOGDIVE_API_PORT
- Port for
logdive-api. Fallback for--port. Default4000. - LOGDIVE_API_HOST
- Bind host for
logdive-api. Fallback for--host. Default127.0.0.1. - LOGDIVE_API_CORS_ORIGINS
- Allowed CORS origins. Comma-separated list or
*. Default: disabled. - NO_COLOR
- Suppress ANSI colour in
logdive queryoutput when set.
Docker
Multi-arch images (linux/amd64 and linux/arm64) published to GHCR on every merge to main and every version tag.
# Start the API server $ docker volume create logdive-data $ docker run -d \ --name logdive \ -v logdive-data:/data \ -p 4000:4000 \ ghcr.io/aryagorjipour/logdive:0.3.0
# Ingest with the CLI against the same volume $ docker run --rm \ -v logdive-data:/data \ -v /path/to/your/logs:/logs:ro \ --entrypoint logdive \ ghcr.io/aryagorjipour/logdive:0.3.0 \ ingest --file /logs/app.log --tag production
Default entrypoint is logdive-api. The image pre-sets LOGDIVE_DB=/data/index.db and LOGDIVE_API_HOST=0.0.0.0. Runtime is gcr.io/distroless/cc-debian12:nonroot — no shell, no curl, no root. HEALTHCHECK uses logdive-api --health-check, which opens a TCP connection to the server's own port via stdlib — no HTTP client required.
Architecture
A Cargo workspace with three crates. logdive-core is publishable to crates.io as a standalone library.
logdive/ ├── logdive-core ├── logdive └── logdive-api
Storage model
Hybrid schema: timestamp, level, message, tag are indexed columns. Everything else is stored in a fields TEXT JSON blob and queried at read time via SQLite's json_extract(). Deduplication uses a raw_hash UNIQUE column with blake3 hashes and INSERT OR IGNORE.
Why SQLite
Zero infrastructure. A single file, transactional, with a query planner that handles indexes, joins, and aggregates. The interesting work is the query parser and the storage schema; SQLite handles the rest.
Why Rust
Parsing log lines at 200k/s with near-zero GC budget. The ingest path is where Rust earns its place. The query path is mostly SQL.