Budget Caps
Convox tracks per-app cloud spend and lets you enforce a monthly cap. When the
cap is reached, the rack can alert, block new deploys, or automatically shut
services down to prevent overrun. Caps are managed at runtime with the
convox budget CLI or the Console budget tab. The
budget block in convox.yml declares the
schema and is validated on convox releases promote, but it does not itself
persist runtime cap values. Set caps with convox budget set so cap changes
are explicit, audit-attributed, and not implicitly overwritten by the next
deploy.
This page is the operational guide for managing caps in production. For schema details see the convox.yml budget block; for how spend is computed see Cost Tracking.
Prerequisite: cost tracking must be enabled
Budget enforcement (monthlyCapUsd, alertThresholdPercent, atCapAction)
requires the rack-level cost accumulator. Without it, no spend is computed, so
caps and alerts persist as config but never trip. Enable it on AWS racks:
$ convox rack params set cost_tracking_enable=true
If you run convox budget set (or deploy a manifest with a budget: block)
against a rack with cost_tracking_enable=false, the rack rejects the request
with HTTP 422 and a message pointing at this command. Set the rack parameter,
wait for the apply to complete (~3 min), then redeploy or retry.
cost_tracking_enable is AWS-only today; non-AWS racks (Azure, GCP,
DigitalOcean, Equinix Metal, Local) cannot enforce budgets in the current
release. Recovery operations (convox budget clear, convox budget reset)
remain available regardless of cost tracking state so you can always clean up.
Set a monthly cap
Set a cap for an app with convox budget set:
$ convox budget set myapp --monthly-cap 500
The cap value is in USD. You can configure the alert threshold, the at-cap action, and a pricing adjustment in the same command:
$ convox budget set myapp --monthly-cap 1000 --alert-at 75 --at-cap-action block-new-deploys --pricing-adjustment 0.7
| Flag | Default | Effect |
|---|---|---|
--monthly-cap |
none | Monthly spend cap in USD. Setting it enables enforcement for the app. |
--alert-at |
80 |
Percent of the cap at which an alert fires (the threshold alert). |
--at-cap-action |
alert-only |
What happens when spend crosses the cap. See Cap actions. |
--pricing-adjustment |
1.0 |
Multiplier applied to computed spend (e.g. 0.7 to model a 30% committed-use discount). |
You can also set these in the convox.yml budget block,
but runtime cap values come from convox budget set, not from the manifest.
Cap actions
atCapAction selects what happens when an app crosses its monthlyCapUsd:
| Action | Behavior |
|---|---|
alert-only (default) |
Fires the app:budget:cap event and webhooks. No deploy or runtime impact. |
block-new-deploys |
Fires app:budget:cap and blocks new deploys with an over-cap error. Running services keep running. |
auto-shutdown |
Starts the auto-shutdown countdown. After notifyBeforeMinutes, services are scaled to zero in shutdownOrder order. |
When you choose auto-shutdown, the CLI prints a warning reminding you to
configure your at-cap webhook and to validate the configuration first:
$ convox budget set myapp --monthly-cap 500 --at-cap-action auto-shutdown
Run convox budget simulate-shutdown myapp to preview which services would be
scaled to zero, and in what order, without actually shutting anything down.
Choosing an action
- Use
alert-onlywhile you are still learning what an app costs. You get the threshold alert and the at-cap event without any runtime impact. - Use
block-new-deploysto stop new deploys from adding more cost without touching what is already running. Good for non-production apps that should not grow further this month. - Use
auto-shutdownfor apps where stopping the spend matters more than staying up. Pair it with a configured webhook so your team is paged on the:armedevent before services scale down.
Raising or recovering a cap
Raising the cap mid-month re-enables blocked deploys (when current spend is below the new cap) and dismisses any active recovery banner:
$ convox budget cap raise myapp --monthly-cap-usd 500
Raising monthly cap for myapp... OK
budget cap raise is an alias for budget set --monthly-cap. Both
--monthly-cap-usd and --monthly-cap are accepted.
If the new cap is below current spend, the cap-raise is rejected with an
explicit error and deploys stay blocked. Use convox cost --app myapp to
confirm current spend before raising.
After auto-shutdown has fired, a cap-raise alone re-enables deploys but does
not restart already-shutdown services. To clear the breach and restore
services, use convox budget reset (see below).
Reset and force-clear cooldown
convox budget reset acknowledges a cap breach and re-enables deploys. After
auto-shutdown has fired, the plain reset also restarts services that were
scaled to zero. The cap value itself is unchanged.
$ convox budget reset myapp
Resetting budget for myapp... OK
By default, reset preserves the flap-suppression cooldown: an app that recently breached, was reset, then breached again will not flip-flop into repeated auto-shutdown cycles within 24 hours.
--force-clear-cooldown is additive. It re-enables deploys and restores
services exactly as the plain reset does, and additionally clears the
flap-suppression cooldown so the next cap breach is not suppressed by the
24-hour window. Use it only when you are sure the underlying cause is resolved.
$ convox budget reset myapp --force-clear-cooldown
Resetting budget for myapp (force-clearing flap-suppress cooldown)... OK
Block-new-deploys recovery
When atCapAction: block-new-deploys is in effect and the cap is reached,
deploys are rejected with an over-cap error similar to:
budget cap exceeded for app myapp: spent $268.42 of $250.00 cap this month
To recover:
- Run
convox cost --app myappto confirm current spend. - Either raise the cap (
convox budget cap raise --monthly-cap-usd NEW) or wait for the next month rollover (current spend resets on the 1st). - Or accept the cap and run
convox budget resetto re-enable deploys for the rest of the month without raising the cap. The breaker clears but the cap remains; subsequent cost growth will block deploys again.
What you see when a cap is hit
The convox ps STATUS column and the convox services BUDGET column both
show a per-service sub-state token when an app's budget cap is breached or
arming. The vocabulary is the same on both surfaces:
| Token | Meaning |
|---|---|
armed-Nm |
Auto-shutdown countdown is active; N minutes remain until services scale to zero (e.g. armed-25m). |
at-cap-keda |
Service has been scaled to zero by auto-shutdown (KEDA-managed services). |
at-cap-auto |
Service has been scaled to zero by auto-shutdown (deployment-only services). |
at-cap |
Cap is breached with atCapAction: block-new-deploys; no scale-to-zero, deploys are rejected. |
When you recover (raise the cap or reset), these tokens clear and any recovery banner in the Console is dismissed.
Authorization
Budget operations split across two authorization tiers. Cap mutation, clearing
budget config, and force-clearing the cooldown require the Admin role.
Threshold-only changes, plain reset, dismiss-recovery, and the read-only
simulate-shutdown preview require the read-write (rw) role. The split applies
on every provider (AWS, Azure, GCP, DigitalOcean, Equinix Metal, Local).
| Operation | Required role |
|---|---|
Set or raise the monthly cap (--monthly-cap) |
Admin |
Set the at-cap action (--at-cap-action) |
Admin |
Set the pricing adjustment (--pricing-adjustment) |
Admin |
Set the alert threshold only (--alert-at) |
rw |
Clear budget config (convox budget clear) |
Admin |
Reset (convox budget reset) |
rw |
Reset with --force-clear-cooldown |
Admin |
Dismiss recovery banner (convox budget dismiss-recovery) |
rw |
Simulate shutdown (convox budget simulate-shutdown, read-only) |
rw |
The Admin requirement on cap changes, clear, and force-clear keeps an admin-set cap from being circumvented by a non-admin (for example, clearing the budget and re-setting it without the cap).
Basic-auth callers (using the rack password) pass the Admin check automatically. A non-Admin Console user who attempts an Admin-only action sees the rejection message relayed through the Console UI. The exact rejection messages are:
403 AppBudgetSet: admin role required to set budget cap
403 AppBudgetClear: admin role required to remove budget config
403 AppBudgetReset --force-clear-cooldown requires Admin role; current role is 'w'. Contact rack admin or use Admin token.
Per-service cost breakdown
convox cost --app myapp returns a breakdown array with the cumulative
spend, instance type, and bucket name for every service that has been observed
running this month. The breakdown grows over the month and resets to zero at
month rollover alongside currentMonthSpendUsd.
Two reserved buckets appear alongside service names:
_build: spend for build pods, bucketed away from the service being built so the named service's normal-operation cost stays uninflated._unattributed: spend for pods with no service label (system sidecars, autoscaler components, anything not user-deployed). Kept visible without inflating any user-deployed service's row.
Edge cases:
- Service deleted mid-month. The deleted service's accumulated spend remains in the breakdown until month rollover. You ran the service for part of the month, so its cost stays attributed for that month.
- Service renamed mid-month. The old name keeps its pre-rename spend; the new name accumulates from the rename point forward. Both rows appear until rollover and sum to the correct app total.
- Per-app entry cap. The breakdown holds up to 1000 services per app. Once
full, existing rows keep accumulating but new services are dropped from this
month's breakdown, and an
app:budget:per-service-truncatedevent fires. Subscribe via webhook to surface it. Most apps stay well under the cap; if you hit it, check for unbounded service-name churn. - Pre-3.24.6 history. Per-service attribution starts populating after
upgrade. Spend accumulated before the upgrade remains in the app total
(
currentMonthSpendUsd) but is not retroactively attributed. - Rolling back. Total spend survives a downgrade-then-re-upgrade round trip. Per-service attribution does not; an older rack drops the per-service fields, and the breakdown re-populates after re-upgrading.
The breakdown surfaces in convox cost, the Console budget panel, and the
auto-shutdown shutdownOrder: largest-cost ranking. With per-service spends
populated, largest-cost shuts down the most expensive service first.
Audit actor
Every budget mutation emits an audit event with an actor field identifying
who triggered the action. From 3.24.6 onward, when the request supplies an
acknowledging identity (the Console passes the authenticated admin's email, and
the CLI derives it from your authenticated identity), the rack records that value as the
actor. When no such value is supplied, the rack falls back to the caller
derived from the request credentials (typically rack-password for basic-auth
clients), preserving pre-3.24.6 behavior.
Operator scripts and webhook receivers ingesting these events should follow the actor-shape guidance in ack_by Derivation.
Troubleshooting
Breaker re-trips immediately after reset
The cap is below current spend. Either raise the cap or wait for month
rollover. convox budget show myapp displays current spend versus cap.
Recovery banner persists across cycles
Pre-3.24.6 racks could carry a dismiss timestamp from one shutdown-and-recovery cycle into the next, silently suppressing the new banner. The fix shipped in 3.24.6. For racks already in a stuck state, clear the annotation manually:
$ kubectl annotate ns <rack>-<app> convox.com/budget-recovery-banner-dismissed-
Auto-shutdown fired but services did not scale down
Confirm convox.yml has atCapAction: auto-shutdown set (not
block-new-deploys). Check convox budget show myapp for the live state. If
the countdown armed but did not fire, it may still be running
(notifyBeforeMinutes).
Auto-shutdown fired but I want to keep services running
Run convox budget reset myapp. The plain reset re-enables deploys AND
restarts services that were scaled to zero. A cap-raise alone
(convox budget cap raise) re-enables deploys but does not restart services.
convox budget reset is the recovery path after auto-shutdown has fired.
See Also
- Cost Tracking: how spend is computed
- convox.yml budget block: schema reference
- budget CLI reference: command reference
- Webhooks: receiving cap events at an external URL
- ack_by Derivation: actor field semantics for audit-event receivers
- Budget Management: Console UI for budget configuration
Note on terminology: this page covers the per-app monthly spend cap introduced in 3.24.6. The unrelated Karpenter disruption budget (a cluster-level node-scheduling primitive; see Karpenter) shares the word "budget" but is a separate concept with no shared configuration surface.