Upsun User Documentation

Autoscaling

Autoscaling allows your applications to automatically scale horizontally based on resource usage.

This ensures your apps remain responsive under load while helping you optimize costs.

  • Scope: Available for applications only
  • Product tiers: Available for all Upsun Flex environments
  • Environments: Configurable per environment across development, staging, and production

Autoscaling availability

The tables below outline where autoscaling and manual scaling are supported, so you can plan your deployments with the right balance of flexibility and control.

Component support

Component                          Horizontal autoscaling   Manual scaling (Vertical)
Applications (PHP, Node.js, etc.)  Available                Available
Services (MySQL, Redis, etc.)      Unavailable              Available
Queues (workers, background jobs)  Unavailable              Available

Product tier support

Product tier   Horizontal autoscaling   Manual scaling (Vertical)
Upsun Flex     Available                Available
Upsun Fixed    Unavailable              Available

Environment support

Environment   Horizontal autoscaling   Manual scaling (Vertical)
Development   Available                Available
Staging       Available                Available
Production    Available                Available

Scaling trigger support

Trigger                    Console
Average CPU (min/max)      Available
Average Memory (min/max)   Coming

How autoscaling works

Thresholds

Autoscaling continuously monitors the average CPU utilization across your app’s running instances. You set thresholds, which are specific CPU usage levels that determine when autoscaling should take action. Your CPU utilization operates between two thresholds: a scale-up threshold and a scale-down threshold.

  • Scale-up threshold: If your chosen trigger (e.g. CPU usage) stays above this level for the time period you’ve set (the evaluation period), autoscaling will launch additional instances to share the load.

  • Scale-down threshold: If your chosen trigger stays below this level for the time period you’ve set, autoscaling will remove unneeded instances to save resources and costs.

To prevent unnecessary back-and-forth, autoscaling also uses a cooldown window: a short waiting period that must pass before another scaling action can be triggered. You can configure this window or keep the default.
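The threshold and cooldown logic described above can be sketched as a single decision step. This is an illustrative model, not Upsun's actual implementation; the function name, sample cadence, and parameter defaults are assumptions.

```python
# Hypothetical sketch of one scale-up / scale-down decision tick.
# All names and defaults are illustrative, not Upsun internals.

def scaling_decision(cpu_samples, *, scale_up=80.0, scale_down=20.0,
                     eval_samples=5, cooldown_remaining=0):
    """Return 'up', 'down', or None for one evaluation tick.

    cpu_samples: most recent per-minute average CPU readings (percent).
    eval_samples: how many consecutive samples must breach a threshold
                  (the evaluation period).
    cooldown_remaining: minutes left in the cooldown window; no action
                        is taken until it reaches zero.
    """
    if cooldown_remaining > 0 or len(cpu_samples) < eval_samples:
        return None
    window = cpu_samples[-eval_samples:]
    if all(s >= scale_up for s in window):
        return "up"      # sustained high load: add an instance
    if all(s <= scale_down for s in window):
        return "down"    # sustained low load: remove an instance
    return None          # load is between thresholds: do nothing

print(scaling_decision([85, 90, 88, 92, 95]))                        # → up
print(scaling_decision([85, 90, 88, 92, 95], cooldown_remaining=3))  # → None
print(scaling_decision([10, 15, 12, 9, 14]))                         # → down
```

Note how the cooldown suppresses an action even when the threshold condition is met, which is exactly what prevents rapid back-and-forth scaling.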

Default settings

Autoscaling continuously monitors the configured trigger across your app’s running instances. We will use the average CPU utilization trigger as the primary example for the default settings and examples below.

  • Scale-up threshold: 80% CPU for 5 minutes
  • Scale-down threshold: 20% CPU for 5 minutes
  • Cooldown window: 5 minutes between scaling actions
  • Instance limits: 1–8 per environment (region-dependent)

Default behaviour (CPU example)

  • If CPU stays at 80% or higher for 5 minutes, autoscaling adds an instance.
  • If CPU stays at 20% or lower for 5 minutes, autoscaling removes an instance.
  • After a scaling action, autoscaling waits 5 minutes before making another change.

This cycle ensures your app automatically scales up during high demand and scales down when demand drops, helping balance performance with cost efficiency.

Guardrails and evaluation

Autoscaling gives you control over the minimum and maximum number of instances your app can run. These guardrails ensure your app never scales up or down too far, keeping scaling safe, predictable, and cost-efficient.

For example, you might configure:

  • Minimum instances: Ensures a baseline number of instances is always running (e.g. 2)
  • Maximum instances: Prevents runaway scaling (e.g. 8)
  • Evaluation period: How long the trigger must stay above or below a threshold before action is taken (1–60 minutes)
  • Cooldown window: Wait time before any subsequent scaling action (default: 5 minutes)
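Putting the guardrails together with the default settings, the full cycle can be traced with a tiny simulation. This is an illustrative model only; the function name and the per-minute sampling assumption are hypothetical, not Upsun's scheduler code.

```python
# Illustrative simulation of guardrails: min/max instance limits, an
# evaluation period, and a cooldown window. Values mirror the documented
# defaults; none of this is Upsun's actual implementation.

def simulate(cpu_by_minute, *, min_instances=2, max_instances=8,
             scale_up=80.0, scale_down=20.0, eval_minutes=5, cooldown=5):
    instances = min_instances
    streak_high = streak_low = 0
    cooldown_left = 0
    history = []
    for cpu in cpu_by_minute:
        streak_high = streak_high + 1 if cpu >= scale_up else 0
        streak_low = streak_low + 1 if cpu <= scale_down else 0
        if cooldown_left > 0:
            cooldown_left -= 1               # waiting: no action allowed
        elif streak_high >= eval_minutes and instances < max_instances:
            instances += 1                   # scale up, capped at the maximum
            streak_high = streak_low = 0
            cooldown_left = cooldown
        elif streak_low >= eval_minutes and instances > min_instances:
            instances -= 1                   # scale down, floored at the minimum
            streak_high = streak_low = 0
            cooldown_left = cooldown
        history.append(instances)
    return history

# 12 minutes of sustained high CPU: a scale-up at minute 5, then the
# cooldown holds the count steady until a second scale-up at minute 11.
print(simulate([90] * 12))  # → [2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4]
```

The minimum and maximum bounds act as hard stops: no matter how long the load stays high or low, the instance count never leaves the configured range.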

Enable autoscaling

To enable autoscaling, follow the steps below:

  1. Open your project in the Console
  2. Select the environment where you want to enable autoscaling
  3. Choose Configure resources

Navigate to Configure resources in Console

  4. Under the autoscaling column, select Enable
  5. Configure thresholds, evaluation period, cooldown, and instances as needed

Configure autoscaling in Console

Alerts and metrics

When autoscaling is enabled, the system continuously monitors metrics such as CPU usage, instance count, and request latency. If a defined threshold is crossed, an alert is triggered and the platform automatically responds by adjusting resources.

Scaling activity is visible in several places:

  • Metrics dashboards show when scaling has occurred
  • Alerts and scaling actions are also visible in the Console:
    • Alerts appear with a bell icon (for example, Scaling: CPU for application below 70% for 5 minutes)
    • Scaling actions appear with a resources icon (for example, Upscale: 1 instance added to application)
  • Alerts and scaling actions are also listed in the CLI as environment.alert and environment.resources.update
  • To review detailed scaling events, open the Resources dashboard by navigating to {Select project} > {Select environment} > Resources

Configure alerts

You can also configure notifications for alerts.

For example, by setting up an activity script on environment.alert, you can automatically send yourself an email, a Slack message, or another type of custom notification.

Billing and cost impact

Autoscaling projects are billed for the resources that they consume. Instances added through autoscaling are billed the same as if you were to manually configure those resources.

However, each scaling action consumes build minutes, since adding or removing instances triggers a deployment. If your app scales frequently, this can increase build minute usage.

To control costs, avoid overly aggressive settings (e.g. very short evaluation periods).
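A quick back-of-the-envelope estimate can show how scaling frequency translates into build-minute usage. The action count and per-action build time below are made-up illustrative figures; check your own project's activity log and invoice for real numbers.

```python
# Rough build-minute estimate from autoscaling activity.
# Both inputs are hypothetical example values, not Upsun defaults.

scaling_actions_per_day = 6     # e.g. 3 scale-ups + 3 scale-downs
build_minutes_per_action = 2    # assumed deployment time per scaling action

monthly_build_minutes = scaling_actions_per_day * build_minutes_per_action * 30
print(monthly_build_minutes)    # → 360
```

If that total looks high, lengthening the evaluation period or widening the gap between the scale-up and scale-down thresholds reduces the number of actions.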

Best practices for autoscaling

Autoscaling gives you flexibility and resilience, but to get the best results it’s important to configure your app and thresholds thoughtfully. Below are some best practices to help you balance performance, stability, and cost.

Cost & stability

  • Set thresholds wisely: Configure realistic scale-up and scale-down thresholds to avoid unnecessary deployments that quickly consume build minutes.
  • Smooth spikes: Use longer evaluation periods (10–15 minutes) if your app traffic spikes often, to prevent rapid up-and-down scaling.
  • Control instance counts: Define minimum and maximum instances to manage costs while keeping required availability.
  • Monitor costs: Track billing and build minute usage after enabling autoscaling, then adjust thresholds as needed.
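To see why a longer evaluation period smooths spikes, consider a short CPU burst that breaches the threshold but never sustains it for the full window. The sketch below is illustrative; the function name and sample values are hypothetical, not Upsun internals.

```python
# Illustrative check: a 3-minute CPU burst breaches the 80% threshold,
# but a 10-minute evaluation period absorbs it, so no scaling fires.
# Names and numbers are hypothetical examples.

def sustained_breach(cpu_by_minute, threshold=80.0, eval_minutes=10):
    """True if CPU stayed at/above threshold for eval_minutes in a row."""
    streak = 0
    for cpu in cpu_by_minute:
        streak = streak + 1 if cpu >= threshold else 0
        if streak >= eval_minutes:
            return True
    return False

spiky = [30, 30, 95, 97, 96, 30, 30, 30, 30, 30]    # 3-minute spike
print(sustained_breach(spiky, eval_minutes=10))      # → False: spike absorbed
print(sustained_breach(spiky, eval_minutes=3))       # → True: short window reacts
```

The trade-off is responsiveness: a longer window ignores noise but also reacts more slowly to a genuine, sustained increase in load.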

Application design

Cron jobs & long-running tasks

  • CPU spikes from jobs: Cron jobs can increase CPU usage and may trigger scale-ups, so factor this into your threshold settings.
  • Job continuity: Cron jobs remain bound to their starting container and are not interrupted by scaling, so plan instances accordingly.