Vertical Pod Autoscaler (VPA)
VPA is available on AWS racks.
The Vertical Pod Autoscaler automatically adjusts CPU and memory requests for your services based on observed usage. Unlike horizontal autoscaling which changes the number of replicas, VPA right-sizes each replica's resource allocation.
Prerequisites
Enable VPA on your rack:
$ convox rack params set vpa_enable=true -r rackName
Setting parameters... OK
Defining VPA in convox.yml
Define VPA settings in the scale.vpa section of your service in convox.yml:
services:
web:
build: .
port: 3000
scale:
count: 3
vpa:
updateMode: Initial
minCpu: "100"
maxCpu: "2000"
minMem: "256"
maxMem: "4096"
Attributes
| Attribute | Type | Default | Description |
|---|---|---|---|
| updateMode | string | Required. How VPA applies recommendations: Off, Initial, or Recreate |
|
| minCpu | string | Minimum CPU in millicores (e.g. "100") |
|
| maxCpu | string | Maximum CPU in millicores (e.g. "2000") |
|
| minMem | string | Minimum memory in MB (e.g. "256") |
|
| maxMem | string | Maximum memory in MB (e.g. "4096") |
|
| cpuOnly | boolean | false | Only adjust CPU, leave memory unchanged |
| memOnly | boolean | false | Only adjust memory, leave CPU unchanged |
| updateRequestOnly | boolean | false | Only update resource requests, do not modify limits |
Update Modes
- Off: VPA calculates recommendations but does not apply them. Use this to observe recommendations before enabling automatic adjustments.
- Initial: VPA sets resource requests only when pods are first created. Running pods are not restarted.
- Recreate: VPA applies recommendations by evicting pods when significant resource changes are needed.
cpuOnlyandmemOnlycannot both be set totrue.
See Also
- vpa_enable rack parameter for enabling VPA
- Autoscaling for horizontal scaling and GPU configuration
- KEDA Autoscaling for event-driven autoscaling