Web apps, APIs, and background workers all scale on the same Kubernetes runtime in your own cloud, and Convox adjusts replicas to match demand so performance holds without over-provisioning.
Tune replica counts for horizontal scaling or adjust CPU and memory allocations for vertical headroom, all declared per service in convox.yml, so you set the policies and Convox applies them.
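As a rough sketch of what that per-service declaration can look like, the fragment below sets a replica range, per-pod CPU and memory, and utilization targets. The service name `web`, the port, and every number are illustrative assumptions, and the key names reflect the `scale` section of convox.yml as commonly documented, so check them against the Convox reference for your rack version.

```yaml
# convox.yml (illustrative values only; "web" and the port are assumptions)
services:
  web:
    build: .
    port: 3000
    scale:
      count: 2-10      # horizontal: minimum and maximum replicas
      cpu: 256         # vertical: CPU units per pod (1000 = one full core)
      memory: 512      # vertical: memory per pod, in MB
      targets:
        cpu: 70        # autoscale toward ~70% average CPU utilization
        memory: 80     # autoscale toward ~80% average memory utilization
```

A fixed `count` (for example `count: 3`) would pin the service at a set size, while a range lets replicas move between the two bounds as utilization crosses the targets.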
Trigger scaling on CPU and memory utilization, with the Kubernetes Horizontal Pod Autoscaler (HPA) running underneath to react to load, so traffic spikes get absorbed automatically while idle capacity scales back down.
Scaling changes roll out with the same rolling deployment strategy as releases, so new pods come up healthy before old ones retire and services stay available through traffic surges and rapid growth.
The same scaling behavior runs on AWS, Google Cloud, Azure, and DigitalOcean, deployed into your own account, so you can move or expand across providers without rewriting how your app scales.
By matching pod counts and resource requests to actual demand, Convox keeps you from paying for idle headroom: capacity grows under load so apps stay responsive, and contracts when load drops so nothing sits unused.