Cost Tracking
Convox aggregates per-app spend from cloud-provider pricing data plus the rack's
in-cluster usage telemetry. Spend is the input to budget caps (see Budget
Caps) and surfaces in the Console and the convox cost
CLI.
Enabling cost tracking
Cost tracking is gated by the rack parameter cost_tracking_enable, default
false. Without it, the cost accumulator does not run — no spend is
computed and budget enforcement (caps, alerts, auto-shutdown) cannot fire
even with a budget: block in convox.yml.
Read paths still return successfully: convox cost against a rack with
cost_tracking_enable=false returns a zero spend total and an empty
breakdown (HTTP 200), so dashboards and scripts that poll the endpoint do
not break; they see "no data yet." Write paths, by contrast, fail
loudly: convox budget set, and convox deploy against a manifest
with an enforcement-bearing budget: block, return HTTP 422 with an
actionable message pointing at the enable command. Recovery operations
(convox budget clear, convox budget reset) remain available regardless
of cost-tracking state.
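The gating rules above can be read as a small decision table. The sketch below is a toy model, not Convox source code: the function name and the operation labels are hypothetical, and treating a deploy without an enforcement-bearing budget: block as allowed is an inference from the wording above rather than a documented guarantee.

```python
# Toy model of the gating rules described above; not Convox source code.
# Operation labels and the function name are hypothetical.

READ_OPS = {"cost"}                              # convox cost: HTTP 200, zero data
RECOVERY_OPS = {"budget clear", "budget reset"}  # available in any state
WRITE_OPS = {"budget set", "deploy"}             # may be rejected with HTTP 422


def is_allowed(op: str, tracking_enabled: bool,
               enforcement_budget: bool = True) -> bool:
    """Return True if the operation is expected to succeed."""
    if op in READ_OPS or op in RECOVERY_OPS:
        return True
    if op not in WRITE_OPS:
        raise ValueError(f"unknown operation: {op}")
    if tracking_enabled:
        return True
    # With tracking disabled, budget set is rejected; deploy is rejected
    # only when the manifest carries an enforcement-bearing budget: block
    # (the non-enforcement case is an inference, see the lead-in).
    if op == "deploy" and not enforcement_budget:
        return True
    return False


if __name__ == "__main__":
    for op in ("cost", "budget set", "deploy", "budget reset"):
        print(op, is_allowed(op, tracking_enabled=False))
```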
Enable on AWS racks:
$ convox rack params set cost_tracking_enable=true
Wait ~3 minutes for the rack apply to complete, then deploy or set budgets.
The first accumulator tick after the apply (default tick interval is 10
minutes) starts populating spend. The Console budget panel and convox cost
become populated from that tick onward.
cost_tracking_enable is AWS-only. Non-AWS racks (Azure, GCP,
DigitalOcean, Equinix Metal, Local) cannot enable cost tracking because the
built-in pricing tables and instance-type introspection paths only cover
AWS. Features that depend on cost tracking (a populated Console budget panel,
convox cost, per-service spend attribution) are therefore AWS-only as well.
How spend is computed
The rack samples each running pod's CPU, memory, and (where applicable) GPU
allocations on every accumulator tick (default 10 minutes). Each sample is
priced against the instance type the pod runs on, using a built-in price table
keyed by cloud provider and instance family. The per-tick samples are summed
across the month into the app's CurrentMonthSpendUsd field, surfaced through
convox budget show and the Console budget panel.
Pricing adjustment (pricingAdjustment in convox.yml) is applied
multiplicatively at sample time. A pricingAdjustment of 1.10 produces 10% more
recorded spend than the raw price would; 0.95 produces 5% less. Use this to
align Convox's internal pricing with the contract pricing your finance team
sees, or to add a buffer for cap headroom.
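As a rough illustration of the accumulation described above, the sketch below prices one sample per tick and sums the results into a month total. It is a minimal model under stated assumptions: the rates in the price table are made up, and node_fraction (the share of a node's hourly price attributed to a pod) stands in for however the rack actually weighs CPU, memory, and GPU allocations, which this page does not specify.

```python
from dataclasses import dataclass

TICK_MINUTES = 10  # default accumulator tick interval, per the text above

# Hypothetical hourly rates in USD; NOT Convox's real price table.
PRICE_TABLE_USD_PER_HOUR = {
    "t3.medium": 0.04,
    "g4dn.xlarge": 0.50,
}


@dataclass
class Sample:
    instance_type: str
    node_fraction: float  # assumed: share of the node attributed to the pod


def sample_cost(sample: Sample, pricing_adjustment: float = 1.0) -> float:
    """Price one tick's sample; the adjustment is applied at sample time."""
    rate = PRICE_TABLE_USD_PER_HOUR.get(sample.instance_type, 0.0)
    hours = TICK_MINUTES / 60
    return rate * sample.node_fraction * hours * pricing_adjustment


def month_spend(samples, pricing_adjustment: float = 1.0) -> float:
    """Sum per-tick samples into a month-to-date total."""
    return sum(sample_cost(s, pricing_adjustment) for s in samples)


if __name__ == "__main__":
    one_hour = [Sample("g4dn.xlarge", 1.0)] * 6  # six 10-minute ticks
    print(f"{month_spend(one_hour):.2f}")                           # 0.50
    print(f"{month_spend(one_hour, pricing_adjustment=1.10):.2f}")  # 0.55
```

Because the multiplier is applied per sample, changing pricingAdjustment mid-month in this model only affects samples taken after the change.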
Per-variant cost breakdown
Spend is attributed to each (instance-type, capacity-type) variant a service
runs on across the month. A service that started the month on g4dn.xlarge
on-demand and was Karpenter-replaced to g4dn.xlarge spot mid-month produces
two rows in convox cost --app myapp: one on-demand row with the early-month
replica count, one spot row with the later-month replica count. Rows are
sorted by descending spend.
A row showing 0 active replicas indicates pods previously ran on that variant
but have since migrated or been removed; the accumulated spend for the variant
is preserved through the rest of the month so the rollup reflects the actual
cloud bill.
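The per-variant rollup described above can be sketched as a group-by on (instance type, capacity type). This is an illustrative model only: the record shape (one entry per replica, with its accumulated spend and whether it is still running) and the function name are assumptions, not the rack's data model.

```python
from collections import defaultdict


def variant_rows(records):
    """Group spend by (instance_type, capacity_type).

    records: iterable of (instance_type, capacity_type, spend_usd, is_active),
    one entry per replica (hypothetical shape). Returns rows of
    (instance_type, capacity_type, active_replicas, spend_usd),
    sorted by descending spend.
    """
    spend = defaultdict(float)
    active = defaultdict(int)
    for instance_type, capacity_type, spend_usd, is_active in records:
        key = (instance_type, capacity_type)
        spend[key] += spend_usd
        active[key] += 1 if is_active else 0
    rows = [(k[0], k[1], active[k], round(spend[k], 2)) for k in spend]
    return sorted(rows, key=lambda row: row[3], reverse=True)


if __name__ == "__main__":
    records = [
        ("g4dn.xlarge", "on-demand", 40.0, False),  # replaced mid-month
        ("g4dn.xlarge", "on-demand", 35.0, False),
        ("g4dn.xlarge", "spot", 12.0, True),
        ("g4dn.xlarge", "spot", 11.5, True),
    ]
    for row in variant_rows(records):
        print(row)
```

In this example the on-demand variant keeps its accumulated spend and reports 0 active replicas, matching the behavior described above.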
Spot capacity-type rows are automatically priced at the on-demand rate
multiplied by the pricing table's spot factor (default 0.30, that is, 30% of
the on-demand rate). Per-instance overrides are configurable via the
SpotUsdPerHourFactor field on the rack-side pricing table.
The pricing-adjustment multiplier (pricingAdjustment in convox.yml) applies
multiplicatively to the variant rates. A value of 0.7 models a 30% AWS
Enterprise Discount Program / Savings Plan / Reserved Instance discount on top
of the canonical pricing, so Convox-reported spend tracks your contract
pricing rather than the raw on-demand rate. A value of 1.10 adds 10% buffer
for cap headroom.
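To make the arithmetic concrete, the sketch below combines the spot factor and the pricing-adjustment multiplier into one effective hourly rate. It assumes the spot factor is a multiplier on the on-demand rate (0.30 meaning 30% of on-demand) and that the two factors compose by multiplication; the function name and the $1.00 base rate are illustrative only.

```python
DEFAULT_SPOT_FACTOR = 0.30  # default from the text: fraction of on-demand rate


def effective_hourly_rate(on_demand_usd: float,
                          capacity_type: str,
                          pricing_adjustment: float = 1.0,
                          spot_factor: float = DEFAULT_SPOT_FACTOR) -> float:
    """Hourly rate after the spot factor and pricingAdjustment are applied."""
    rate = on_demand_usd
    if capacity_type == "spot":
        rate *= spot_factor
    return rate * pricing_adjustment


if __name__ == "__main__":
    base = 1.00  # hypothetical on-demand rate in USD per hour
    print(f"{effective_hourly_rate(base, 'on-demand', 0.7):.2f}")  # 0.70
    print(f"{effective_hourly_rate(base, 'spot'):.2f}")            # 0.30
    print(f"{effective_hourly_rate(base, 'spot', 0.7):.2f}")       # 0.21
```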
Unpriced instance types
The built-in price table covers the common instance families on each provider.
When a pod runs on an instance the table does not know about — a brand-new AWS
family, a Karpenter-spawned instance from a custom NodePool, or a custom GPU SKU
on metal — the rack records 0 for that sample. The pod still runs; only the
cost-tracking column is blank.
Symptoms:
- convox cost --app myapp shows ? or 0.00 for some services.
- app:budget:threshold and app:budget:cap events do not fire even though cloud bills indicate the app should have crossed.
To diagnose:
- convox ps --app myapp shows the running pods.
- kubectl get pod -n <rack>-<app> -o jsonpath='{.items[*].spec.nodeName}' plus kubectl get nodes -L node.kubernetes.io/instance-type resolves each pod to its instance type.
- File the unrecognized type as an issue at the convox/convox repo.
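The pod-to-instance-type join performed by the two kubectl commands above can be sketched offline once their output has been collected. Everything here is illustrative: the function name, the input dictionaries, and especially known_types are assumptions, since this page does not describe a way to list the instance types the built-in price table recognizes.

```python
def unpriced_pods(pod_to_node, node_to_instance_type, known_types):
    """Return {pod: instance_type} for pods on unrecognized instance types.

    pod_to_node: pod name -> node name (from the pod listing)
    node_to_instance_type: node name -> instance type (from the node labels)
    known_types: instance types assumed to be priced (hypothetical input)
    """
    flagged = {}
    for pod, node in pod_to_node.items():
        instance_type = node_to_instance_type.get(node, "unknown")
        if instance_type not in known_types:
            flagged[pod] = instance_type
    return flagged


if __name__ == "__main__":
    pods = {"api-1": "node-a", "vllm-1": "node-b"}
    nodes = {"node-a": "t3.medium", "node-b": "brand-new.xlarge"}
    print(unpriced_pods(pods, nodes, known_types={"t3.medium", "g4dn.xlarge"}))
```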
To work around in the meantime, set a higher pricingAdjustment to compensate
for the under-counted instances, or move the impacted services to a node group
that uses a recognized instance family.
Cost breakdown CLI
convox cost returns one row per service plus the reserved _build and
_unattributed buckets, sorted descending by SPEND-USD with alphabetical
secondary tiebreak:
$ convox cost --app myapp
SERVICE        GPU-HOURS  CPU-HOURS  MEM-GB-HOURS  INSTANCE     SPEND-USD
vllm           0.00       0.00       0.00          g4dn.xlarge  $0.30
api            0.00       0.00       0.00          t3.medium    $0.08
worker         0.00       0.00       0.00          t3.small     $0.04
_build         0.00       0.00       0.00          c5.large     $0.02
_unattributed  0.00       0.00       0.00          t3.medium    $0.01
The SPEND-USD column is populated from the accumulator's per-service totals.
The GPU-HOURS / CPU-HOURS / MEM-GB-HOURS columns are reserved for a
future per-resource pricing model; in 3.24.6 they always render 0.00. App
totals are surfaced via convox cost --aggregate (a single-row table:
APP | SPEND-USD | AS-OF | PRICING-SOURCE). See the
cost CLI reference for the full flag set.
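The row ordering described above (descending SPEND-USD, then alphabetical) can be reproduced with a two-part sort key, which is handy when post-processing the output in a script. This is a sketch, not the CLI's implementation: it assumes rows are (service, spend) pairs and that "alphabetical" means plain string comparison.

```python
def sort_cost_rows(rows):
    """Sort (service, spend_usd) pairs: spend descending, then name ascending."""
    return sorted(rows, key=lambda row: (-row[1], row[0]))


if __name__ == "__main__":
    rows = [
        ("api", 0.08),
        ("_unattributed", 0.01),
        ("vllm", 0.30),
        ("_build", 0.02),
        ("worker", 0.04),
    ]
    for service, spend in sort_cost_rows(rows):
        print(f"{service:<14} ${spend:.2f}")
```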
Service-level numbers help identify which workload is driving spend. Use the
output to refine monthlyCapUsd, decide whether to opt a service out of
atCapAction: auto-shutdown via neverAutoShutdown, or scale the workload
down before the cap fires.
Per-month rollover
Spend resets to zero at the first of each month, UTC. Caps that were tripped in the previous month are cleared as part of the rollover. Recovery banners and flap-suppress carry-overs are cleared by the stale-annotation GC tick after one poll interval (10 min default).
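Because the reset boundary is defined in UTC, scripts that bucket spend by month should convert timestamps to UTC before comparing them. The helper below is a minimal sketch of that check; the function names are hypothetical and it does not reflect how the rack itself detects rollover.

```python
from datetime import datetime, timedelta, timezone


def month_key(ts: datetime) -> str:
    """Return the UTC billing month, e.g. '2024-02', for an aware datetime."""
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return ts.astimezone(timezone.utc).strftime("%Y-%m")


def rolled_over(last_tick: datetime, now: datetime) -> bool:
    """True when the two timestamps fall in different UTC months."""
    return month_key(last_tick) != month_key(now)


if __name__ == "__main__":
    utc_minus_5 = timezone(timedelta(hours=-5))
    # 20:00 on Jan 31 at UTC-5 is already 01:00 on Feb 1 in UTC.
    local_evening = datetime(2024, 1, 31, 20, 0, tzinfo=utc_minus_5)
    print(month_key(local_evening))  # 2024-02
```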
See Also
- Budget Caps — operational management of caps
- convox.yml budget block — schema reference
- cost CLI reference — command reference
- Budget Management — Console UI for cost and budget management