Mimir: Scalable Prometheus with Interactive Architecture
Grafana Mimir is a highly available, scalable, long-term storage for Prometheus. Think of it as "Prometheus on steroids" — distributed, fault-tolerant, and built for production scale.
The Core Idea
Prometheus is great for short-term metrics, but Mimir solves the storage problem: how do you keep metrics for months/years without running out of disk space or memory?
Mimir = Prometheus metrics → distributed storage → queryable at scale
Three-Layer Architecture Model
Mimir's components cleanly separate into three concerns:
- Read Path — How queries happen
- Write Path — How metrics get in
- Backend Storage — Where everything lives
This separation is the key insight. Let me break it down.
Write Path Components (Ingestion)
These components handle incoming Prometheus metrics.
Distributor
What it does:
- Receives metrics pushed by Prometheus over remote write (or by other compatible agents)
- Validates metrics format
- Applies per-tenant limits (authentication is usually handled by a proxy in front of Mimir)
- Routes data to ingesters
Mental model: The bouncer at the club. Checks your ID, makes sure you're allowed in, then directs you to a table.
Key details:
- Stateless (can scale horizontally)
- Handles load balancing to ingesters
- Enforces rate limits per tenant
- Can deduplicate samples from HA Prometheus pairs (optional HA tracker)
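To make "enforces rate limits per tenant" concrete, here is a minimal token-bucket sketch. It is not Mimir's code: the `TokenBucket` class, the tenant name, and the numbers are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Hypothetical per-tenant limiter: `rate` samples/second, bursts up to `burst`."""
    rate: float
    burst: float
    tokens: float = 0.0
    last: float = 0.0

    def allow(self, n: int, now: float) -> bool:
        # Refill according to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if n <= self.tokens:
            self.tokens -= n
            return True
        return False


# One bucket per tenant; the limits here are invented numbers.
buckets = {"team-a": TokenBucket(rate=100.0, burst=200.0, tokens=200.0)}

first = buckets["team-a"].allow(150, now=0.0)   # fits in the initial burst
second = buckets["team-a"].allow(150, now=0.0)  # only 50 tokens left, rejected
third = buckets["team-a"].allow(150, now=1.0)   # 50 left + 100 refilled, accepted
```

Because each bucket is tiny and keyed by tenant, a noisy tenant gets rejected without affecting anyone else's budget.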
Ingester
What it does:
- Receives time-series data from distributors
- Keeps recent data in memory (write buffer)
- Periodically flushes to long-term storage
- Handles data replication
Mental model: A server at a restaurant. Takes your order, writes it down, then hands it to the kitchen.
Key details:
- Stateful (persistent across requests)
- Each ingester owns a slice of the series hash ring (its shard)
- Maintains multiple replicas for fault tolerance
- Holds ~2-3 hours of metrics in memory
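The "buffer in memory, flush periodically" behaviour can be sketched as a head that sorts samples into 2-hour windows and hands back the windows that have closed. This is a toy model under that assumption, not the real TSDB head; `Head`, `block_start`, and `cut_blocks` are hypothetical names.

```python
from collections import defaultdict

BLOCK_RANGE_MS = 2 * 60 * 60 * 1000  # 2-hour block window, as described above


def block_start(ts_ms: int) -> int:
    """Align a timestamp down to the start of its 2-hour window."""
    return ts_ms - (ts_ms % BLOCK_RANGE_MS)


class Head:
    """Toy in-memory buffer: samples grouped by the window they belong to."""

    def __init__(self):
        self.windows = defaultdict(list)

    def append(self, series: str, ts_ms: int, value: float) -> None:
        self.windows[block_start(ts_ms)].append((series, ts_ms, value))

    def cut_blocks(self, now_ms: int):
        """Return and drop every window that has fully ended by `now_ms`."""
        done = sorted(w for w in self.windows if w + BLOCK_RANGE_MS <= now_ms)
        return [(w, self.windows.pop(w)) for w in done]


head = Head()
head.append("up", 0, 1.0)
head.append("up", 7_199_999, 1.0)   # last millisecond of the first window
head.append("up", 7_200_000, 1.0)   # first millisecond of the second window
flushed = head.cut_blocks(now_ms=7_200_000)
```

After the call, `flushed` holds the first window with its two samples, and the third sample stays in memory until its own window closes.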
Read Path Components (Querying)
These components handle PromQL queries.
Query Frontend
What it does:
- Receives PromQL queries from users/Grafana
- Splits large queries into smaller chunks
- Caches query results
- Returns results to users
Mental model: A receptionist. Takes your request, figures out who needs to handle it, caches common questions.
Key details:
- Stateless (scales horizontally)
- Implements query-result caching (reduces load on queriers)
- Cache lifetime is configurable; very recent results are typically not cached because late samples could still change them
- Useful for dashboard queries (same query, repeatedly)
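Here is a rough sketch of why result caching helps dashboards. The `ResultsCache` class, its key layout, and the 30-second TTL are assumptions for the example, not Mimir's cache design.

```python
class ResultsCache:
    """Toy query-result cache keyed by tenant, query text, and time range."""

    def __init__(self, ttl_s: float):
        self.ttl_s = ttl_s
        self.entries = {}
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, tenant, query, start, end, now, compute):
        key = (tenant, query, start, end)
        entry = self.entries.get(key)
        if entry is not None and now - entry[0] < self.ttl_s:
            self.hits += 1
            return entry[1]
        # Miss or expired entry: run the query and remember the result.
        self.misses += 1
        result = compute()
        self.entries[key] = (now, result)
        return result


executions = []


def run_query():
    executions.append(1)  # stands in for the expensive querier round trip
    return [1, 1, 1]


cache = ResultsCache(ttl_s=30)
cache.get_or_compute("team-a", "rate(up[5m])", 0, 300, now=0, compute=run_query)
cache.get_or_compute("team-a", "rate(up[5m])", 0, 300, now=10, compute=run_query)
cache.get_or_compute("team-a", "rate(up[5m])", 0, 300, now=45, compute=run_query)
```

The second call is served from the cache; the third arrives after the TTL and runs the query again. Including the tenant in the key keeps one tenant's results from ever being served to another.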
Querier
What it does:
- Executes PromQL against metrics
- Reads from ingesters (recent data)
- Reads from long-term storage (old data), through store-gateways
- Aggregates results
Mental model: A librarian searching for books. Checks recent arrivals (ingesters) and the archive (storage).
Key details:
- Stateless (scales horizontally)
- Fetches data from multiple sources
- Deduplicates & merges time-series
- Handles failed queries gracefully
Backend Storage Components
These handle long-term persistence.
Blocks Storage
What it does:
- Stores compressed metric blocks
- One block (a directory holding chunk files, an index, and a meta.json) per ~2-hour window
- Organized by tenant, then by block
Mental model: A filing cabinet. Organized, compressed, tamper-proof.
Key details:
- 2-hour blocks typical
- TSDB format (same as Prometheus)
- Immutable (write-once)
- Highly compressible
Object Storage (S3, GCS, Azure)
What it does:
- Durably stores all blocks
- Highly available & redundant
- Multi-region capable
Mental model: Bank vault. Distributed, replicated, backed up.
Key details:
- Cloud-native (S3, GCS, Azure Blob)
- Or MinIO for on-prem
- Survives machine failures
- Can span regions
Compactor
What it does:
- Merges small blocks into larger ones
- Deduplicates time-series
- Deletes blocks that are past the retention period
- Optimizes storage
Mental model: An archivist. Takes loose papers, merges them into binders, throws away outdated copies.
Key details:
- Batch job (runs periodically)
- Reduces query latency
- Saves storage space
- No built-in downsampling: blocks keep their original resolution until retention deletes them
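"Merges small blocks into larger ones" mostly comes down to deciding which blocks belong together. Here is a sketch of that planning step, assuming 2-hour source blocks and a 12-hour target range; `plan_compaction` and the block dictionaries are hypothetical.

```python
from collections import defaultdict

HOUR_MS = 60 * 60 * 1000


def plan_compaction(blocks, target_range_ms):
    """Group block IDs by the target-sized window their start time falls in."""
    groups = defaultdict(list)
    for block in blocks:
        groups[block["min_t"] // target_range_ms].append(block["id"])
    # A window holding a single block has nothing to merge.
    return [ids for _, ids in sorted(groups.items()) if len(ids) > 1]


# Thirteen hypothetical 2-hour blocks starting at hours 0, 2, ..., 24.
blocks = [{"id": f"block-{h:02d}h", "min_t": h * HOUR_MS} for h in range(0, 26, 2)]
jobs = plan_compaction(blocks, target_range_ms=12 * HOUR_MS)
```

This yields two jobs of six blocks each. The lone block starting at hour 24 waits until its 12-hour window fills up.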
Index (TSDB block index)
What it does:
- Fast label lookups
- Maps label matchers to the series stored in each block
- Quick series discovery
Mental model: A book's index. Find page numbers quickly without reading every page.
Key details:
- Speeds up metric discovery
- Prevents memory exhaustion
- Handles high cardinality
Interactive Component Diagram
Here's how they connect:
Write Flow
Prometheus Scraper
↓
Distributor (Stateless)
↓
Ingester (Stateful, in-memory)
↓
Object Storage (S3/GCS)
Read that as:
- Prometheus pushes metrics to distributor
- Distributor rate-limits, validates, routes
- Ingester buffers in memory
- Periodically flushed to cloud storage
Read Flow
User / Grafana
↓
Query Frontend (Caching)
↓
Querier (Orchestrator)
↓
├─ Ingesters (recent)
├─ Blocks (old)
└─ Index (fast lookup)
Read that as:
- Query arrives at frontend
- Frontend checks cache
- Querier fetches from multiple sources
- Results merged & returned
Real Example: "Show CPU usage last 7 days"
Write Side (What happened):
- Day 1: Prometheus sends metrics → Distributor → Ingester (in-memory)
- Day 1 onward: Ingester cuts a block roughly every 2 hours and uploads it → S3
- Nightly: Compactor merges blocks (reduce from 100 → 20 blocks)
Read Side (Query happens now):
- User queries "CPU last 7 days"
- Query Frontend splits into:
- "Last 2 hours" → check recent ingesters
- "2-24 hours ago" → check blocks from S3
- "1-7 days ago" → check older, compacted blocks from S3
- Querier parallelizes reads
- Index helps find "cpu_usage" metric instantly
- Results stream back to user
Total latency: ~500ms - 2s (depending on query complexity)
Component Interaction Matrix
| Component | Read | Write | Stateless? | Scales? |
|---|---|---|---|---|
| Distributor | ❌ | ✅ | ✅ | ✅ |
| Ingester | ✅ | ✅ | ❌ | ⚠️ (stateful) |
| Query Frontend | ✅ | ❌ | ✅ | ✅ |
| Querier | ✅ | ❌ | ✅ | ✅ |
| Compactor | ❌ | ❌ | ✅ | ✅ (batch job) |
| Object Storage | ✅ | ✅ | N/A | ✅ (infinite) |
| Index | ✅ | ❌ | ✅ | ✅ |
Write Path Deep Dive
Step 1: Metric arrives at Distributor
Metric:
name: up
labels:
job: prometheus
instance: localhost:9090
timestamp: 1234567890
value: 1
Distributor checks:
- Format valid? ✅
- Tenant ID present? ✅
- Rate limit OK? ✅
- Label names and values within limits? ✅
Then: Hash metric labels → pick 3 ingesters (replication)
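The "hash labels, pick 3 ingesters" step can be sketched with a small consistent-hash ring. This illustrates the idea only: the SHA-256 token function, the 64 tokens per ingester, and the ingester names are assumptions, and Mimir's real ring differs in detail.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a string to a 32-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:4], "big")


class Ring:
    def __init__(self, ingesters, tokens_per_ingester=64):
        # Each ingester registers several tokens so load spreads evenly.
        self.points = sorted(
            (token(f"{name}-{i}"), name)
            for name in ingesters
            for i in range(tokens_per_ingester)
        )
        self.keys = [t for t, _ in self.points]

    def replicas(self, tenant: str, series: str, rf: int = 3):
        """Walk clockwise from the series' token, collecting `rf` distinct ingesters."""
        start = bisect.bisect_right(self.keys, token(f"{tenant}/{series}"))
        chosen = []
        for offset in range(len(self.points)):
            name = self.points[(start + offset) % len(self.points)][1]
            if name not in chosen:
                chosen.append(name)
            if len(chosen) == rf:
                break
        return chosen


ring = Ring([f"ingester-{i}" for i in range(5)])
targets = ring.replicas("team-a", 'up{job="prometheus",instance="localhost:9090"}')
```

The same tenant and label set always lands on the same three ingesters, so any distributor can route a sample without coordinating with the others.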
Step 2: Ingester buffers
Ingester memory state:
┌─────────────────────────────┐
│ WAL (Write-Ahead Log) │
├─────────────────────────────┤
│ Time-series in-memory db │
│ up{job=prometheus}... │
│ http_requests_total{...}... │
│ ... │
└─────────────────────────────┘
Key: WAL persists to disk (survives restarts)
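A write-ahead log is simple at heart: append every sample to disk before acknowledging it, and replay the file on startup. Here is a toy version using JSON lines; the real WAL is a binary, segmented format, and the `Wal` class is hypothetical.

```python
import json
import os
import tempfile


class Wal:
    def __init__(self, path: str):
        self.path = path

    def append(self, series: str, ts: int, value: float) -> None:
        # Write and fsync before returning, so an acknowledged sample survives a crash.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps([series, ts, value]) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def replay(self) -> dict:
        """Rebuild the in-memory series map from the log."""
        head = {}
        if not os.path.exists(self.path):
            return head
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                series, ts, value = json.loads(line)
                head.setdefault(series, []).append((ts, value))
        return head


wal_path = os.path.join(tempfile.mkdtemp(), "wal.jsonl")
wal = Wal(wal_path)
wal.append("up", 1234567890, 1)
wal.append("up", 1234567891, 1)

# Simulate a restart: a fresh object with no memory of earlier appends.
recovered = Wal(wal_path).replay()
```

After the simulated restart, `recovered` contains both samples even though the new object never saw them being written.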
Step 3: Periodic flush to storage
Ingester → Compresses → Blocks → Object Storage
(3GB RAM) (50MB) (S3)
Compression: ~60:1 in this illustration (3 GB → 50 MB); time-series data is repetitive, but real ratios depend on the workload
Read Path Deep Dive
Step 1: Query arrives at Frontend
User writes: rate(up[5m])
Frontend:
- Parses query
- Checks label set (what metrics are needed?)
- Splits by time range:
- 0-2h ago → ingesters
- Older than 2h → blocks in object storage
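The time-range routing above can be sketched as a function that cuts a query range at the "recent data" boundary. The 2-hour boundary comes from the text; `split_by_source` is a hypothetical helper, and the sketch does not try to model which Mimir component makes this decision in a real deployment.

```python
def split_by_source(start_s: int, end_s: int, now_s: int, ingester_window_s: int = 2 * 3600):
    """Split [start_s, end_s] into the part served by blocks and the part served by ingesters."""
    boundary = now_s - ingester_window_s
    parts = []
    if start_s < boundary:
        parts.append(("blocks", start_s, min(end_s, boundary)))
    if end_s > boundary:
        parts.append(("ingesters", max(start_s, boundary), end_s))
    return parts


now = 1_000_000
last_six_hours = split_by_source(now - 6 * 3600, now, now)
last_ten_minutes = split_by_source(now - 600, now, now)
```

A six-hour query becomes two sub-ranges, one per source, while a ten-minute query never touches object storage.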
Step 2: Querier fetches from multiple sources
Querier parallel fetch:
├─ Ingester 1 (recent data)
├─ Ingester 2 (replication)
├─ Blocks reader (historical)
└─ Index (label lookup)
Deduplication: If data exists in both ingester & blocks, keep one copy.
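Deduplication here amounts to keeping one value per timestamp while merging sources. A minimal sketch, assuming each source returns `(timestamp, value)` pairs for the same series; `merge_series` is a hypothetical name.

```python
def merge_series(*sources):
    """Merge sample lists from several sources, keeping one sample per timestamp."""
    merged = {}
    for samples in sources:
        for ts, value in samples:
            # First source to report a timestamp wins; replicas carry the same value.
            merged.setdefault(ts, value)
    return sorted(merged.items())


ingester_1 = [(100, 1.0), (115, 1.0)]
ingester_2 = [(100, 1.0), (115, 1.0), (130, 1.0)]   # replica that is slightly ahead
blocks = [(70, 1.0), (85, 1.0), (100, 1.0)]          # overlaps the ingesters at t=100

result = merge_series(ingester_1, ingester_2, blocks)
```

Eight input samples collapse into five, sorted by time, with no gaps where one replica lagged behind the other.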
Step 3: Results stream back
Raw samples for up that the querier gathers as input (because up stays at 1, rate(up[5m]) itself evaluates to 0):
| timestamp | value |
|---|---|
| 1234567890 | 1 |
| 1234567891 | 1 |
| 1234567892 | 1 |
Scaling Mimir
Horizontal Scaling (add more machines)
| Component | Scale Strategy | Notes |
|---|---|---|
| Distributor | Add replicas | Stateless, easy |
| Ingester | Add shards | Hash-based routing |
| Query Frontend | Add replicas | Stateless, cache-friendly |
| Querier | Add replicas | Stateless |
| Compactor | Single job | Or distributed compaction |
| Storage | Infinite | Cloud storage scales automatically |
Vertical Scaling (bigger machines)
- Ingesters: More RAM = more active series per ingester
- Queriers: More CPU = faster PromQL evaluation
- Distributors: More CPU = higher throughput
Tenancy (Multi-tenant Prometheus)
Mimir supports multiple independent Prometheus instances in one cluster.
Tenant Isolation
Distributor receives metric
├─ Extract tenant ID from request header (X-Scope-OrgID)
├─ Route to tenant-specific ingesters
├─ Store in tenant-specific blocks
└─ Query frontend filters by tenant
Example:
Team A: Prometheus → Distributor (tenant_id=team-a)
Team B: Prometheus → Distributor (tenant_id=team-b)
Same Mimir cluster, complete isolation.
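Tenant isolation starts with one header. The sketch below reads the tenant ID from `X-Scope-OrgID` and derives a tenant-scoped storage prefix. The validation regex is simplified for the example, and `tenant_from_headers` and `block_prefix` are hypothetical helpers.

```python
import re

TENANT_HEADER = "X-Scope-OrgID"
# Simplified validation for this sketch; Mimir's real rules differ in detail.
_TENANT_RE = re.compile(r"[A-Za-z0-9_.-]{1,150}")


def tenant_from_headers(headers: dict) -> str:
    """Return the tenant ID from the request headers, or raise if missing/invalid."""
    for name, value in headers.items():
        if name.lower() == TENANT_HEADER.lower():  # header names are case-insensitive
            if not _TENANT_RE.fullmatch(value):
                raise ValueError(f"invalid tenant ID: {value!r}")
            return value
    raise ValueError(f"missing {TENANT_HEADER} header")


def block_prefix(tenant: str, block_id: str) -> str:
    """Every block lives under its tenant's prefix, so tenants never share paths."""
    return f"{tenant}/{block_id}/"


team_a = tenant_from_headers({"x-scope-orgid": "team-a"})
team_b = tenant_from_headers({"X-Scope-OrgID": "team-b"})
```

On the Prometheus side, the header is typically set in the `remote_write` configuration, so each team's data is tagged before it reaches the distributor.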
Failure Scenarios & Recovery
Ingester Dies
- Distributor routes to healthy ingesters
- Recent data is still held by the other replicas (replication factor 3)
- No data loss as long as a quorum of replicas stays healthy
- The restarted ingester replays its WAL from disk to rebuild its in-memory state
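The "no data loss because of replication" claim rests on quorum arithmetic. Here is a tiny sketch assuming majority quorum; the function names are hypothetical.

```python
def write_quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def write_succeeds(replication_factor: int, healthy_replicas: int) -> bool:
    return healthy_replicas >= write_quorum(replication_factor)


one_down = write_succeeds(replication_factor=3, healthy_replicas=2)
two_down = write_succeeds(replication_factor=3, healthy_replicas=1)
```

With three replicas, losing one ingester is survivable. Losing two of the three that own a series is not, which is why rollouts restart ingesters one at a time (or one zone at a time).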
Distributor Dies
- Load balancer detects, routes to next distributor
- Completely stateless, no state recovery needed
- Writes continue uninterrupted
Storage Goes Down
- Queries hitting blocks stall
- Recent ingesters still serve ~2h of data
- Ingesters keep writing to WAL (disk)
- When storage returns, ingesters resume flushing
Tuning for Your Use Case
High Cardinality (many unique metrics)
Tune:
- Raise per-tenant series limits deliberately, not by default
- Find and drop the labels that drive cardinality (relabeling)
- Reduce compaction interval
- Increase ingester memory
Long Retention (years of data)
Tune:
- Precompute coarse aggregates with recording rules (Mimir has no built-in downsampling)
- Increase compaction interval
- Use cheaper object storage tier
High Query Load
Tune:
- Increase query frontend cache TTL
- Add more queriers
- Enable an external cache layer (for example Memcached)
- Reduce query complexity (pre-compute aggregations)
Cost-Conscious
Tune:
- Shorter retention for data nobody queries
- Longer block intervals (4h instead of 2h)
- Compress storage more
- Use cheaper storage (AWS S3 Standard → Infrequent Access; archive tiers like Glacier need a restore before blocks can be read)
One-Liner Recap
Mimir = Prometheus metrics + distributed storage + query cache + multi-tenancy
Components:
- Write: Distributor → Ingester → Storage
- Read: Frontend → Querier → (Ingesters + Blocks + Index)
- Backend: Object Storage + Compactor + Index
Key insight: Separation of concerns. Stateless read/write paths scale independently. Stateful ingesters handle buffering. Cloud storage handles durability.
Quick Reference: Which Component When?
When you have ingestion slowness?
→ Check Distributor rate limits, add more distributors
When queries are slow?
→ Check Query Frontend cache hit rate, add more queriers
When storage bloats?
→ Tune compactor, set a retention period, drop unused series
When cardinality explodes?
→ Find and drop high-cardinality labels, enforce per-tenant series limits, reduce churn
When disk fills up on ingesters?
→ Grow the ingester volumes, shorten how long flushed blocks stay on local disk