Metrics

The metrics add-on scrapes Prometheus-style metrics from your app and surfaces them in Grafana, alongside platform metrics that Watasu reports for you.

Attach it

watasu addons:create metrics --app my-app

If this is your first observability add-on, Grafana is provisioned automatically.

What gets collected

Two streams flow into Grafana:

Your app’s metrics — anything your processes expose on a Prometheus-compatible endpoint. Use any standard client library (prometheus_client in Python, prom-client in Node, prometheus-client for Ruby/Rails, etc.). Watasu scrapes them on a regular interval.

When the metrics add-on is ready, Watasu injects a METRICS_PORT config var into your app (default 9464). Serve your Prometheus metrics endpoint on that port. With Python’s prometheus_client, for example:

start_http_server(int(os.environ["METRICS_PORT"]))

METRICS_PORT is an add-on-managed var — read it, don’t override it.
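
A fuller version of that one-liner might look like the sketch below. It assumes the Python prometheus_client package is installed. The helper names resolve_metrics_port and serve_metrics are hypothetical, and the fallback to 9464 is an assumption meant only for local runs where the add-on has not injected METRICS_PORT.

```python
import os
from collections.abc import Mapping

# The add-on default quoted above; used here only as a local-development fallback.
DEFAULT_METRICS_PORT = 9464


def resolve_metrics_port(environ: Mapping[str, str] = os.environ) -> int:
    """Return the port to serve metrics on, preferring the injected METRICS_PORT."""
    return int(environ.get("METRICS_PORT", DEFAULT_METRICS_PORT))


def serve_metrics() -> int:
    """Start the prometheus_client exporter in a background thread and return its port."""
    # Imported here so the port helper stays usable where the client library is absent.
    from prometheus_client import start_http_server

    port = resolve_metrics_port()
    start_http_server(port)  # serves the default registry on all interfaces
    return port
```

Call serve_metrics() once at process start, before the app begins handling work. The exporter runs in a daemon thread, so it does not keep the process alive by itself.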

Prefer pushing instead of being scraped? The add-on also injects OTEL_EXPORTER_OTLP_METRICS_ENDPOINT (with OTEL_EXPORTER_OTLP_METRICS_PROTOCOL=http/protobuf) and PROMETHEUS_REMOTE_WRITE_URL — standard OpenTelemetry SDKs and Prometheus remote-write clients pick these up with no extra configuration.
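
For the OpenTelemetry route, a minimal Python sketch could look like this. It assumes the opentelemetry-sdk and opentelemetry-exporter-otlp-proto-http packages are installed. The helper names and the 60-second export interval are illustrative choices, not Watasu settings.

```python
import os
from collections.abc import Mapping


def otlp_metrics_configured(environ: Mapping[str, str] = os.environ) -> bool:
    """Return True when an OTLP metrics endpoint has been injected into the environment."""
    return bool(environ.get("OTEL_EXPORTER_OTLP_METRICS_ENDPOINT"))


def build_meter_provider(export_interval_millis: int = 60_000):
    """Build a MeterProvider that pushes metrics over OTLP/HTTP on a fixed interval."""
    # Imported lazily so the check above works even without the OpenTelemetry packages.
    from opentelemetry.exporter.otlp.proto.http.metric_exporter import OTLPMetricExporter
    from opentelemetry.sdk.metrics import MeterProvider
    from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader

    # No endpoint argument: the exporter reads OTEL_EXPORTER_OTLP_METRICS_ENDPOINT itself.
    exporter = OTLPMetricExporter()
    reader = PeriodicExportingMetricReader(
        exporter, export_interval_millis=export_interval_millis
    )
    return MeterProvider(metric_readers=[reader])
```

A process would typically call build_meter_provider() once at startup, register the result with opentelemetry.metrics.set_meter_provider, and create its instruments from a meter obtained via opentelemetry.metrics.get_meter.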

Platform metrics — Watasu publishes runtime data without you instrumenting it:

pod CPU and memory usage
pod restarts and OOMKilled events
attached add-on metrics (PostgreSQL, Valkey, object storage, and so on)

Both streams land in the same Grafana instance, ready to query.

Best-effort scraping

If a process doesn’t expose metrics, or exposes them on a port other than METRICS_PORT, Watasu skips it gracefully. A metrics collection failure never blocks an app from running.
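
Because a skipped process fails silently, it can help to confirm locally that the endpoint actually answers before deploying. The sketch below uses only the Python standard library. The function name is hypothetical, and the /metrics path is an assumption (it is the common Prometheus convention; this page does not state which path Watasu scrapes).

```python
import urllib.request


def metrics_endpoint_ok(port: int, path: str = "/metrics", timeout: float = 2.0) -> bool:
    """Return True if a local endpoint answers 200 with at least one sample line."""
    # Bypass any HTTP proxy settings: this check only ever talks to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    url = f"http://127.0.0.1:{port}{path}"
    try:
        with opener.open(url, timeout=timeout) as response:
            status = response.status
            body = response.read().decode("utf-8", errors="replace")
    except OSError:  # covers refused connections, timeouts, and HTTP error statuses
        return False
    # Exposition-format lines starting with "#" are HELP/TYPE comments, not samples.
    has_sample = any(
        line.strip() and not line.startswith("#") for line in body.splitlines()
    )
    return status == 200 and has_sample
```

Running metrics_endpoint_ok(int(os.environ["METRICS_PORT"])) inside the app’s environment is one way to catch a wrong port before it shows up as a missing dashboard.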

What to instrument first

Don’t try to instrument everything on day one. The high-leverage metrics for most apps:

request latency (histogram, p50/p95/p99)
request rate by endpoint
error rate (HTTP 5xx, exceptions)
queue depth or job latency for workers
one or two domain-specific business metrics (“orders per minute”, “active sessions”)

Wider coverage can come later. Five well-named, well-shaped metrics beat fifty noisy ones.
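
As a starting point, the list above might translate into something like the following Python sketch. It assumes the prometheus_client package. The metric names, label sets, bucket boundaries, and the record_request helper are illustrative choices rather than Watasu conventions.

```python
from prometheus_client import Counter, Gauge, Histogram

# Request rate and error rate both come from this one counter:
# filter on the status label at query time to isolate 5xx responses.
REQUESTS = Counter(
    "http_requests_total",
    "HTTP requests handled, by method, route template, and status code.",
    ["method", "route", "status"],
)

# Percentiles (p50/p95/p99) are estimated at query time from these buckets.
LATENCY = Histogram(
    "http_request_duration_seconds",
    "HTTP request latency in seconds.",
    ["method", "route"],
    buckets=(0.01, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0),
)

# For workers: how many jobs are waiting in each queue.
QUEUE_DEPTH = Gauge("job_queue_depth", "Jobs waiting to run.", ["queue"])

# A business metric: "orders per minute" is a rate over this counter.
ORDERS = Counter("orders_total", "Orders placed.")


def record_request(method: str, route: str, status: int, seconds: float) -> None:
    """Record one finished HTTP request in the request counter and latency histogram."""
    REQUESTS.labels(method=method, route=route, status=str(status)).inc()
    LATENCY.labels(method=method, route=route).observe(seconds)
```

A web framework’s request hook or middleware would call record_request once per response, passing the route template rather than the raw URL.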

Cardinality discipline

Every unique combination of labels is a separate time series. High-cardinality labels (user IDs, request IDs, full URLs) explode storage and query cost. Stick to low-cardinality dimensions: status code, method, route name, queue name, region.
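
The arithmetic behind that warning, and one way to keep a route label bounded, can be sketched in plain Python. The label counts are made-up illustrative numbers. The route_label helper and its :id placeholder are hypothetical; where a web framework exposes the matched route template, using that directly is usually the better option.

```python
import re
from math import prod

# Matches a path segment that is purely numeric or a UUID.
_ID_SEGMENT = re.compile(
    r"/(?:\d+|[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})(?=/|$)"
)


def series_count(distinct_values_per_label: dict[str, int]) -> int:
    """Worst-case series for one metric: the product of the label cardinalities."""
    return prod(distinct_values_per_label.values())


def route_label(url: str) -> str:
    """Collapse a raw request path into a low-cardinality route label."""
    path = url.split("?", 1)[0]  # drop the query string
    return _ID_SEGMENT.sub("/:id", path)


# Illustrative numbers only.
bounded = {"status": 5, "method": 4, "route": 30, "region": 3}
unbounded = {**bounded, "user_id": 100_000}
```

With these counts, the bounded label set tops out at 1,800 series, while adding a user_id label with 100,000 distinct values raises the ceiling to 180 million for that single metric.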

Alerts

Grafana’s alerting UI lets you build alerts on top of any metric query. Hook them up to whatever paging or notification system your team already uses.

Set alerts on operational thresholds people will actually act on. Alerts that fire and get ignored train the team to ignore alerts.