scale

The convox scale command views and adjusts the scale parameters of a service, including replica count, CPU allocation, and memory allocation. When called without flags, it displays the current scale configuration for all services in the app.

scale

Scale a service

Usage

    convox scale [<service>]

The positional <service> argument is required for imperative changes (--count, --cpu, --memory, --gpu, --gpu-vendor, --min, --max) and optional in read mode. With no service argument, convox scale prints the full scale table for the app. With a service argument, the table is filtered to that service's row: same columns, one row.

    $ convox scale web -a myapp
    SERVICE  DESIRED  RUNNING  CPU  MEMORY  GPU  MIN  MAX  STATUS
    web      2        2        256  1024    -    2    2

If the supplied service does not exist in the app, convox scale <name> exits non-zero with the error: service "<name>" not found in app <app>. The check runs once, before the watch loop starts, so a typo combined with --watch does not loop forever printing the same error.
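Because a missing service produces a non-zero exit, a script can use a read-mode lookup as a guard before making changes. The following is a minimal sketch, not part of the CLI: the function name scale_or_fail and the CONVOX_BIN variable are hypothetical and exist only so the binary can be swapped out.

```shell
# Hypothetical guard: run a read-mode lookup and stop if it fails.
# CONVOX_BIN is an assumption of this sketch; it defaults to the real CLI.
scale_or_fail() {
  service=$1
  app=$2
  if ! "${CONVOX_BIN:-convox}" scale "$service" -a "$app"; then
    echo "lookup failed for service '$service' in app '$app'" >&2
    return 1
  fi
}
```

A caller could then chain the guard with an imperative change, for example scale_or_fail web myapp && convox scale web --count 3 -a myapp.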

Examples

    $ convox scale web --count 3 --cpu 250 --memory 1024
    Scaling web...
    2026-01-15T14:54:50Z system/k8s/atom/app Status: Running => Pending
    2026-01-15T14:54:51Z system/k8s/web Scaled up replica set web-745f845dc to 3
    2026-01-15T14:54:51Z system/k8s/web-745f845dc Created pod: web-745f845dc-abc12
    2026-01-15T14:54:52Z system/k8s/atom/app Status: Pending => Updating
    2026-01-15T14:54:53Z system/k8s/web-745f845dc-abc12 Pulling image "registry.0a1b2c3d4e5f.convox.cloud/myapp:web.BABCDEFGHI"
    2026-01-15T14:54:56Z system/k8s/web-745f845dc-abc12 Successfully pulled image "registry.0a1b2c3d4e5f.convox.cloud/myapp:web.BABCDEFGHI"
    2026-01-15T14:54:56Z system/k8s/web-745f845dc-abc12 Created container main
    2026-01-15T14:54:56Z system/k8s/web-745f845dc-abc12 Started container main
    2026-01-15T14:55:01Z system/k8s/atom/app Status: Updating => Running
    OK

Flags

Flag	Description
`--count`	Number of desired replicas for the service
`--cpu`	CPU allocation in millicores (e.g., 250 = 0.25 vCPU)
`--memory`	Memory allocation in MB
`--gpu`	Number of GPU devices to reserve per pod
`--gpu-vendor`	GPU vendor. Supported: `nvidia` (default), `amd`
`--min`	Minimum replica count. With autoscale configured, sets the autoscale floor. Without autoscale (3.24.6+), patches the deployment replica count directly, which is useful for short-lived overrides without editing convox.yml.
`--max`	Maximum replica count when autoscale is configured. When combined with `--min`, the service must declare `scale.autoscale` (or `scale.keda`) in convox.yml. Without an autoscale block, the rack returns an error directing you to add one or to use `--count` for a fixed replica count. Services with a Console-driven triggers override accept `--min`/`--max` through the HPA bounds (3.24.6+) without requiring a manifest autoscale block.

Output Table

When no flags are passed, convox scale prints a table of the configured scale for every service in the app. Columns 1-6 (SERVICE, DESIRED, RUNNING, CPU, MEMORY, GPU) match the 3.24.5 layout exactly, so existing scripts that parse the output positionally with awk or cut continue to work unchanged. Version 3.24.6 appends MIN, MAX, an optional AUTOSCALE, and a trailing STATUS column at positions 7 and up.

    $ convox scale
    SERVICE  DESIRED  RUNNING  CPU   MEMORY  GPU  MIN  MAX  AUTOSCALE    STATUS
    web      2        2        256   1024    -    2    2    -
    vllm     0        0        4000  16384   1    0    10   gpu-util>70  COLD (~2-5m first req)

The vllm row above is at rest with zero replicas. The cold-start hint, COLD (~2-5m first req), renders only when DESIRED and RUNNING are both 0 on a service whose autoscale min is 0, that is, a min: 0 service that has scaled back down to zero. Once a request triggers the first replica, DESIRED and RUNNING rise to 1 or more and the STATUS cell clears.

The AUTOSCALE column appears between MAX and STATUS when at least one service in the app has autoscaling enabled. Its cell summarizes the configured trigger (e.g. cpu>70, gpu-util>80 queue>10). Services without autoscale render - in that column.

The trailing STATUS column carries the cold-start hint (COLD (~2-5m first req) for services with min: 0 autoscale) and, when the app's budget cap is breached, the per-service sub-state token (armed-Nm, at-cap-keda, at-cap-auto, at-cap).
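Since the optional AUTOSCALE column shifts the position of STATUS, and a STATUS cell can contain spaces, a script may prefer to locate STATUS from the header row instead of counting fields. The sketch below assumes the columns are padded with spaces so that each cell starts at the same character offset as its header, as in the sample output above. The function name status_cells is hypothetical, and the sample text stands in for real convox scale output.

```shell
# Sample output copied from the table above; in practice, pipe `convox scale` instead.
sample='SERVICE  DESIRED  RUNNING  CPU   MEMORY  GPU  MIN  MAX  AUTOSCALE    STATUS
web      2        2        256   1024    -    2    2    -
vllm     0        0        4000  16384   1    0    10   gpu-util>70  COLD (~2-5m first req)'

# Print "service: status" for every row with a non-empty STATUS cell.
# Assumes space-padded, aligned columns (an assumption of this sketch).
status_cells() {
  awk '
    NR == 1 { start = index($0, "STATUS"); next }  # character offset of the header
    start {
      s = substr($0, start)        # everything from that offset to end of line
      sub(/[ \t]+$/, "", s)        # drop trailing padding
      if (s != "") print $1 ": " s
    }
  '
}

printf '%s\n' "$sample" | status_cells
# prints: vllm: COLD (~2-5m first req)
```

Reading the cell by character offset keeps the whole hint intact, whereas whitespace-split fields would break it into several pieces.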

Column-Position Contract

Position	Header	Source	Stable since
1	`SERVICE`	service name	3.24.5
2	`DESIRED`	configured replica count (`s.Count`)	3.24.5
3	`RUNNING`	live process count (from `ProcessList`)	3.24.5
4	`CPU`	CPU allocation in millicores	3.24.5
5	`MEMORY`	memory allocation in MB	3.24.5
6	`GPU`	GPU count per pod (or `-`)	3.24.5
7	`MIN`	min replica count from `scale.autoscale` (or `-`)	3.24.6
8	`MAX`	max replica count from `scale.autoscale` (or `-`)	3.24.6
9 (optional)	`AUTOSCALE`	configured trigger summary	3.24.6
trailing	`STATUS`	cold-start hint and budget sub-state	3.24.6
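As an illustration of the contract, a script that reads only positions 1-6 is unaffected by the columns appended in 3.24.6. The following sketch uses the sample output from this page in place of a live convox scale call; the function names resources and under_replicated are hypothetical.

```shell
# Sample output copied from this page; in practice, pipe `convox scale` instead.
sample='SERVICE  DESIRED  RUNNING  CPU   MEMORY  GPU  MIN  MAX  AUTOSCALE    STATUS
web      2        2        256   1024    -    2    2    -
vllm     0        0        4000  16384   1    0    10   gpu-util>70  COLD (~2-5m first req)'

# Print "service cpu memory" for every row (positions 1, 4, 5).
resources() {
  awk 'NR > 1 { print $1, $4, $5 }'
}

# Print "service desired running" where RUNNING (3) is below DESIRED (2).
under_replicated() {
  awk 'NR > 1 && $3 + 0 < $2 + 0 { print $1, $2, $3 }'
}

printf '%s\n' "$sample" | resources
# prints:
# web 256 1024
# vllm 4000 16384
```

Both helpers skip the header row with NR > 1 and never touch fields 7 and up, so the presence or absence of AUTOSCALE does not change their output.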
