Upsun User Documentation

Autoscaling

Autoscaling is a feature that automatically adjusts how many instances of your application are running, increasing capacity when demand rises, and reducing it when things are quiet. It helps your app stay responsive under heavy load while keeping your infrastructure costs efficient.

What is autoscaling?

Autoscaling works through horizontal scaling: it adds or removes whole application instances based on resource usage. If CPU or memory utilization stays above a set threshold for a sustained period, Upsun automatically adds instances; if it stays low, Upsun removes unneeded ones. You control these thresholds and limits, so scaling always happens safely and predictably.

  • Scope: Available for applications only
  • Product tiers: Available for all Upsun Flex environments
  • Environments: Configurable per environment, across development, staging, and production

When to use autoscaling

Autoscaling makes the most sense for workloads with variable or unpredictable traffic. It’s especially valuable when:

  • You run time-sensitive or customer-facing applications where latency matters.
  • Your app experiences seasonal or campaign-driven spikes.
  • You want to avoid paying for idle capacity during quieter periods.

Example: When autoscaling works effectively

A retail app sees traffic jump fivefold every Friday evening and during holiday campaigns. By enabling autoscaling, the app automatically adds instances when CPU usage rises and scales back overnight, ensuring smooth checkouts without wasted cost.

Example: When autoscaling might not be needed

An internal dashboard with predictable, low usage may not benefit from autoscaling. In this case, a fixed number of instances and tuned vertical resources (CPU/RAM) can be more cost-effective and stable.

Autoscaling availability

The tables below outline where autoscaling and manual scaling are supported, so you can plan your deployments with the right balance of flexibility and control.

Component support

Component                           Horizontal autoscaling   Manual scaling (vertical)
Applications (PHP, Node.js, etc.)   Available                Available
Services (MySQL, Redis, etc.)       Unavailable              Available
Queues (workers, background jobs)   Unavailable              Available

Product tier support

Product tier   Horizontal autoscaling   Manual scaling (vertical)
Upsun Flex     Available                Available
Upsun Fixed    Unavailable              Available

Environment support

Environment   Horizontal autoscaling   Manual scaling (vertical)
Development   Available                Available
Staging       Available                Available
Production    Available                Available

Scaling trigger support

Trigger                    Console support
Average CPU (min/max)      Available
Average memory (min/max)   Available

How autoscaling works

Thresholds

Autoscaling monitors the average CPU and memory usage of your running app instances.
You define thresholds that determine when new instances are launched or removed.

Your CPU and memory utilization operate between two thresholds: a scale-up threshold and a scale-down threshold.

  • Scale-up threshold: If your chosen trigger (e.g. CPU usage) stays above this level for the time period you’ve set (the evaluation period), autoscaling will launch additional instances to share the load.

  • Scale-down threshold: If your chosen trigger stays below this level for the time period you’ve set, autoscaling will remove unneeded instances to save resources and costs.

To prevent unnecessary back-and-forth, autoscaling also uses a cooldown window: a short waiting period before another scaling action can be triggered. You can configure this window or keep the default.
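The two thresholds amount to a simple decision rule. Below is a minimal, hypothetical Python sketch (the function name and return values are illustrative, not part of any Upsun API); in practice the evaluation period and cooldown window are layered on top of this rule.

```python
def desired_action(avg_utilization, scale_up_at=80, scale_down_at=20):
    """Map an averaged utilization reading (in percent) to a scaling intent.

    Illustrative only: the names here are examples, not Upsun identifiers.
    """
    if avg_utilization >= scale_up_at:
        return "scale-up"    # sustained high load: add instances
    if avg_utilization <= scale_down_at:
        return "scale-down"  # sustained low load: remove instances
    return "hold"            # inside the band between thresholds: no action

print(desired_action(85))  # scale-up
print(desired_action(50))  # hold
```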

Default settings

Autoscaling continuously monitors the configured trigger across your app’s running instances. The default settings and examples below use the average CPU utilization trigger.

  • Scale-up threshold: 80% CPU for 5 minutes

  • Scale-down threshold: 20% CPU for 5 minutes

  • Cooldown window: 5 minutes between scaling actions

  • Instance limits: 1–8 per environment (region-dependent)

Default behaviour (CPU example)

  • If CPU stays at 80% or higher for 5 minutes, autoscaling adds an instance.
  • If CPU stays at 20% or lower for 5 minutes, autoscaling removes an instance.
  • After a scaling action, autoscaling waits 5 minutes before making another change.

This cycle ensures your app automatically scales up during high demand and scales down when demand drops, helping balance performance with cost efficiency.
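The cycle above can be simulated end to end. The sketch below is hypothetical code (not Upsun's implementation) that walks a per-minute CPU trace and applies the documented defaults: 80%/20% thresholds, a 5-minute evaluation period, a 5-minute cooldown window, and 1–8 instance limits.

```python
SCALE_UP = 80    # % CPU, scale-up threshold
SCALE_DOWN = 20  # % CPU, scale-down threshold
EVALUATION = 5   # minutes a threshold must hold before acting
COOLDOWN = 5     # minutes to wait after any scaling action

def simulate(cpu_per_minute, instances=1, min_instances=1, max_instances=8):
    """Apply the default scaling rules to a list of per-minute CPU readings."""
    above = below = 0   # consecutive minutes past each threshold
    cooldown_left = 0
    for cpu in cpu_per_minute:
        above = above + 1 if cpu >= SCALE_UP else 0
        below = below + 1 if cpu <= SCALE_DOWN else 0
        if cooldown_left > 0:
            cooldown_left -= 1      # no actions during the cooldown window
            continue
        if above >= EVALUATION and instances < max_instances:
            instances += 1          # sustained high CPU: add an instance
            above = below = 0
            cooldown_left = COOLDOWN
        elif below >= EVALUATION and instances > min_instances:
            instances -= 1          # sustained low CPU: remove an instance
            above = below = 0
            cooldown_left = COOLDOWN
    return instances

print(simulate([90] * 5))              # 2: five high minutes add an instance
print(simulate([90] * 5 + [10] * 11))  # 1: quiet traffic scales back down
```

Note how the cooldown delays the scale-down: after the first action, low readings accumulate but no change happens until the window expires.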

Memory-based autoscaling

Autoscaling primarily relies on CPU utilization as its trigger. However, you can also configure memory-based autoscaling, which works in a similar way but with a few important differences.

CPU-based triggers

CPU-based autoscaling reacts to sustained changes in average CPU utilization.

  • Scale-up threshold: When average CPU usage stays above your defined limit for the evaluation period, instances are added to distribute the load.
  • Scale-down threshold: When CPU usage remains below your lower limit for the evaluation period, instances are removed to save resources.
  • Cooldown window: A delay (default: 5 minutes) before another scaling action can occur.

Memory-based triggers

Memory-based autoscaling follows the same principle as CPU triggers but measures average memory utilization instead. When your app consistently uses more memory than your upper threshold, Upsun adds instances; when memory usage remains low, it removes them.

This option is useful for workloads where caching or in-memory data handling determines performance, such as large data processing apps or services with persistent caching layers.

Example

Condition                        Scaling action
Memory above 80% for 5 minutes   Scale up: add one instance
Memory below 30% for 5 minutes   Scale down: remove one instance

Configure memory triggers

  1. Open your project in the Console.
  2. Select your target environment.
  3. Choose Configure resources.
  4. Under Autoscaling, select Enable (if not already enabled).
  5. Choose Memory usage (min/max) as your scaling trigger.
  6. Set scale-up and scale-down thresholds, evaluation period, and cooldown window.
  7. Save changes. Your app will now automatically scale based on memory utilization.

Guardrails and evaluation

Autoscaling gives you control over the minimum and maximum number of instances your app can run. These guardrails ensure your app never scales too far in either direction, keeping scaling safe, predictable, and cost-efficient.

For example, you might configure:

  • Minimum instances: Keeps at least this many instances of the configured application running at all times (e.g. 2)
  • Maximum instances: Prevents runaway scaling (e.g. 8)
  • Evaluation period: How long the trigger must stay above or below a threshold before action is taken (1–60 minutes)
  • Cooldown window: Wait time before any subsequent scaling action (default: 5 minutes)
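The instance limits act as a hard clamp on any scaling decision. Upsun enforces these bounds for you; the one-line helper below is purely illustrative, using the example values above, to make the rule concrete.

```python
def clamp_instances(proposed, minimum=2, maximum=8):
    """Keep a proposed instance count inside the configured guardrails."""
    return max(minimum, min(proposed, maximum))

print(clamp_instances(1))   # 2: never drops below the minimum
print(clamp_instances(12))  # 8: runaway scale-ups are capped
print(clamp_instances(5))   # 5: values inside the bounds pass through
```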

Enable autoscaling

To enable autoscaling, follow the steps below:

  1. Open your project in the Console.
  2. Select the environment where you want to enable autoscaling.
  3. Choose Configure resources.

Navigate to Configure resources in Console

  4. Under the Autoscaling column, select Enable.
  5. Configure thresholds, evaluation period, cooldown window, and instance limits as needed.

Configure autoscaling in Console

Alerts and metrics

When autoscaling is enabled, the system continuously monitors metrics such as CPU usage, instance count, and request latency. If a defined threshold is crossed, an alert is triggered and the platform automatically responds by adjusting resources.

Scaling activity is visible in several places:

  • Metrics dashboards show when scaling has occurred
  • Alerts and scaling actions are also visible in the Console:
    • Alerts appear with a bell icon (for example, Scaling: CPU for application below 70% for 5 minutes)
    • Scaling actions appear with a resources icon (for example, Upscale: 1 instance added to application)
  • Alerts and scaling actions are also listed in the CLI as environment.alert and environment.resources.update
  • To review detailed scaling events, open the Resources dashboard by navigating to {Select project} > {Select environment} > Resources

Configure alerts

You can also configure notifications for alerts.

For example, by setting up an activity script on environment.alert, you can automatically send yourself an email, a Slack message, or another type of custom notification.

Billing and cost impact

Autoscaling projects are billed for the resources that they consume. Instances added through autoscaling are billed the same as if you were to manually configure those resources.

Added instances are deployed automatically without downtime.

To control costs, avoid overly aggressive settings (e.g. very short evaluation periods).

Best practices for autoscaling

Autoscaling gives you flexibility and resilience, but to get the best results it’s important to configure your app and thresholds thoughtfully. Below are some best practices to help you balance performance, stability, and cost.

Cost & stability

  • Set thresholds wisely: Configure realistic scale-up and scale-down thresholds to avoid unnecessary deployments.
  • Smooth spikes: Use longer evaluation periods (10–15 minutes) if your app traffic spikes often, to prevent rapid up-and-down scaling.
  • Control instance counts: Define minimum and maximum instances to manage costs while keeping required availability.
  • Monitor costs: Track billing and usage after enabling autoscaling, then adjust thresholds as needed.

Application design

Cron jobs & long-running tasks

  • CPU spikes from jobs: Cron jobs can increase CPU usage and may trigger scale-ups, so factor this into your threshold settings.
  • Job continuity: Cron jobs remain bound to their starting container and are not interrupted by scaling, so plan instances accordingly.