Crow CI

Autoscaler

The Crow CI Autoscaler dynamically provisions cloud servers to execute pipelines, then terminates them when idle.

sequenceDiagram
    participant Queue as Build Queue
    participant AS as Autoscaler
    participant Cloud as Cloud Provider
    participant Agent as Agent (VM)
    participant Server as Crow Server

    Queue->>AS: Pending build
    AS->>Cloud: Provision VM
    Cloud->>Agent: VM ready
    Agent->>Server: Register & connect
    Agent->>Agent: Execute pipeline
    Note over AS,Agent: Idle timeout
    AS->>Cloud: Terminate VM

| Provider | Configuration Reference |
| --- | --- |
| AWS | flags.go |
| Hetzner Cloud | flags.go |
| Linode | flags.go |
| Scaleway | flags.go |
| Vultr | flags.go |

Additional providers with a Go SDK can be added — contributions welcome!

  1. Deploy alongside the server: the autoscaler listens for build triggers.

  2. Configure the server connection: provide the server address and authentication tokens.

  3. Configure scaling limits: set the minimum and maximum number of agents and the workflows per agent.

  4. Configure gRPC: remote agents need a secure gRPC connection to the server.

  5. Configure the cloud provider: set provider credentials and instance settings.

docker-compose.yaml
services:
  crow-autoscaler:
    image: codefloe.com/crowci/crow-autoscaler:<version>
    restart: always
    depends_on:
      - crow-server
    environment:
      # Server connection
      - CROW_SERVER=crow-server:9000
      - CROW_TOKEN=${CROW_TOKEN} # Admin API token
      - CROW_AUTOSCALER_TOKEN=${CROW_AUTOSCALER_TOKEN}
      # Scaling limits
      - CROW_MIN_AGENTS=0
      - CROW_MAX_AGENTS=2
      - CROW_WORKFLOWS_PER_AGENT=5
      # gRPC (for remote agents)
      - CROW_GRPC_ADDR=grpc.crow.example.com
      - CROW_GRPC_SECURE=true
      # Timeouts
      - CROW_AGENT_IDLE_TIMEOUT=10m
      - CROW_AGENT_SERVER_CONNECTION_TIMEOUT=10m
      # Cloud provider (Hetzner example)
      - CROW_PROVIDER=hetznercloud
      - CROW_HETZNERCLOUD_API_TOKEN=${HETZNER_TOKEN}
      - CROW_HETZNERCLOUD_LOCATION=fsn1
      - CROW_HETZNERCLOUD_SERVER_TYPE=cax41
      - CROW_HETZNERCLOUD_IMAGE=ubuntu-24.04
      - CROW_HETZNERCLOUD_NETWORKS=my-network
      - CROW_HETZNERCLOUD_SSH_KEYS=my-key
      - CROW_HETZNERCLOUD_FIREWALLS=my-firewall
      # Agent image (optional; auto-detected from server version if omitted)
      # - CROW_AGENT_IMAGE=codefloe.com/crowci/crow-agent:v5.3.2
      # Optional: agent environment
      - CROW_AGENT_ENV=CROW_LOG_LEVEL=debug,CROW_HEALTHCHECK=false

| Variable | Description |
| --- | --- |
| CROW_SERVER | Server address (internal or public URL) |
| CROW_TOKEN | Admin API token for agent management |
| CROW_AUTOSCALER_TOKEN | Registration token for autoscaler |

| Variable | Default | Description |
| --- | --- | --- |
| CROW_MIN_AGENTS | 0 | Minimum agents always running |
| CROW_MAX_AGENTS | 1 | Maximum concurrent agents |
| CROW_WORKFLOWS_PER_AGENT | 1 | Parallel workflows per agent |
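
To make the interaction between the three limits concrete, here is a minimal sketch of one plausible sizing rule. It is not taken from the autoscaler's source: the function name `desired_agents` and the round-up-then-clamp logic are assumptions for illustration, and only the defaults (0, 1, 1) come from the table above.

```python
import math


def desired_agents(pending_workflows: int, min_agents: int = 0,
                   max_agents: int = 1, workflows_per_agent: int = 1) -> int:
    """Hypothetical rule: enough agents to cover the queue, clamped to the limits."""
    if workflows_per_agent < 1:
        raise ValueError("workflows_per_agent must be at least 1")
    # Round up so a partially filled agent still counts as one agent.
    needed = math.ceil(pending_workflows / workflows_per_agent)
    return max(min_agents, min(needed, max_agents))


# Limits from the compose example: MIN=0, MAX=2, WORKFLOWS_PER_AGENT=5
print(desired_agents(7, 0, 2, 5))   # 2
print(desired_agents(12, 0, 2, 5))  # 2 (three would be needed, capped by the maximum)
print(desired_agents(0, 0, 2, 5))   # 0
```

Under this rule, raising CROW_WORKFLOWS_PER_AGENT reduces the number of VMs started for a given queue, while CROW_MAX_AGENTS bounds the cost regardless of queue length.
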

| Variable | Default | Description |
| --- | --- | --- |
| CROW_AGENT_IDLE_TIMEOUT | 10m | Time before idle agent is terminated |
| CROW_AGENT_SERVER_CONNECTION_TIMEOUT | 10m | Max time without server connection |
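
The sketch below illustrates how a value such as `10m` could be interpreted and compared against an agent's last activity. It assumes the values are simple duration strings made of `h`, `m`, and `s` parts. The helper names `parse_duration` and `is_idle_expired` are hypothetical, and the autoscaler's real parsing and comparison may differ.

```python
import re
from datetime import datetime, timedelta

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours"}


def parse_duration(text: str) -> timedelta:
    """Parse simple values such as "10m", "90s" or "1h30m"."""
    parts = re.findall(r"(\d+)([smh])", text)
    # Reject input that contains anything besides number+unit pairs.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    return sum((timedelta(**{_UNITS[u]: int(n)}) for n, u in parts), timedelta())


def is_idle_expired(last_active: datetime, now: datetime,
                    idle_timeout: str = "10m") -> bool:
    """True once the agent has been idle for at least the timeout (assumed rule)."""
    return now - last_active >= parse_duration(idle_timeout)


now = datetime(2024, 1, 1, 12, 10)
print(is_idle_expired(datetime(2024, 1, 1, 12, 0), now))  # True
print(is_idle_expired(datetime(2024, 1, 1, 12, 5), now))  # False
```
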

Remote agents require secure gRPC to connect back to the server.

| Variable | Description |
| --- | --- |
| CROW_GRPC_ADDR | Public gRPC address (no protocol prefix) |
| CROW_GRPC_SECURE | Set true for TLS connection |
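
Because CROW_GRPC_ADDR must not carry a protocol prefix, a small pre-deployment check can catch a pasted URL. The helper below is a hypothetical sketch and not part of Crow CI; it only strips a scheme and a trailing slash.

```python
def normalize_grpc_addr(value: str) -> str:
    """Return host[:port] with any scheme prefix and trailing slash removed."""
    value = value.strip()
    if "://" in value:
        # Drop "https://", "grpc://" and similar prefixes.
        value = value.split("://", 1)[1]
    return value.rstrip("/")


print(normalize_grpc_addr("https://grpc.crow.example.com/"))  # grpc.crow.example.com
print(normalize_grpc_addr("grpc.crow.example.com:443"))       # grpc.crow.example.com:443
```
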

| Variable | Default | Description |
| --- | --- | --- |
| CROW_AGENT_IMAGE | auto | Container image for spawned agents |

When CROW_AGENT_IMAGE is not set, the autoscaler queries the Crow server’s /version endpoint and uses the matching agent image automatically — for example, if the server reports version v5.3.2, the autoscaler uses codefloe.com/crowci/crow-agent:v5.3.2.

Set this variable explicitly only if you need to pin a specific agent version or use a custom image.
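
The selection rule described above can be sketched as follows. The function `agent_image` is hypothetical. It takes the server version as a plain string because the response format of the /version endpoint is not described here, and the repository name is the one from the example.

```python
from typing import Optional


def agent_image(server_version: str, override: Optional[str] = None,
                repository: str = "codefloe.com/crowci/crow-agent") -> str:
    """Pick the agent image: an explicit CROW_AGENT_IMAGE wins, else match the server."""
    if override:
        return override
    return f"{repository}:{server_version}"


print(agent_image("v5.3.2"))
# codefloe.com/crowci/crow-agent:v5.3.2
print(agent_image("v5.3.2", override="registry.example.com/custom-agent:latest"))
# registry.example.com/custom-agent:latest
```
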

| Variable | Default | Description |
| --- | --- | --- |
| CROW_AGENT_ENV | none | Environment variables passed to spawned agents (comma-separated KEY=value pairs) |
| CROW_FILTER_LABELS | none | Only count queued tasks matching this label (key=value) toward scaling decisions. Required for multiple autoscalers. |

Example agent environment:

CROW_AGENT_ENV=CROW_AGENT_LABELS=tier=heavy,CROW_LOG_LEVEL=debug,CROW_HEALTHCHECK=false
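
Note that a value can itself contain `=`, as in `CROW_AGENT_LABELS=tier=heavy`. The sketch below shows one way to read such a string, assuming pairs are split on commas and each pair on its first `=`. The function `parse_agent_env` is hypothetical, and the autoscaler's own parser may behave differently, in particular for values that contain commas.

```python
from typing import Dict


def parse_agent_env(raw: str) -> Dict[str, str]:
    """Split comma-separated KEY=value pairs; only the first '=' separates key and value."""
    env: Dict[str, str] = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=value, got {pair!r}")
        env[key] = value
    return env


raw = "CROW_AGENT_LABELS=tier=heavy,CROW_LOG_LEVEL=debug,CROW_HEALTHCHECK=false"
print(parse_agent_env(raw))
# {'CROW_AGENT_LABELS': 'tier=heavy', 'CROW_LOG_LEVEL': 'debug', 'CROW_HEALTHCHECK': 'false'}
```
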

Remote agents need a TLS-secured gRPC endpoint. Configure your reverse proxy to forward to the server’s gRPC port (default: 9000).

Nginx:

server {
    listen 443 ssl http2;
    server_name grpc.crow.example.com;
    ssl_certificate /etc/ssl/certs/crow.crt;
    ssl_certificate_key /etc/ssl/private/crow.key;
    location / {
        grpc_pass grpc://crow-server:9000;
    }
}

Caddy:

grpc.crow.example.com {
    reverse_proxy h2c://crow-server:9000
}

Traefik:

http:
  routers:
    crow-grpc:
      rule: Host(`grpc.crow.example.com`)
      service: crow-server
      tls:
        certResolver: letsencrypt
  services:
    crow-server:
      loadBalancer:
        servers:
          - url: h2c://crow-server:9000
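
gRPC runs over HTTP/2, so a quick sanity check for any of these proxy setups is to confirm that the public endpoint negotiates `h2` via TLS ALPN. The following sketch uses only Python's standard library. The function names are hypothetical and the host is the placeholder from the examples above.

```python
import socket
import ssl
from typing import Optional


def build_h2_context() -> ssl.SSLContext:
    """TLS client context that verifies certificates and offers HTTP/2 via ALPN."""
    ctx = ssl.create_default_context()
    ctx.set_alpn_protocols(["h2"])
    return ctx


def negotiated_protocol(host: str, port: int = 443,
                        timeout: float = 5.0) -> Optional[str]:
    """Connect over TLS and return the ALPN protocol the proxy selected, if any."""
    ctx = build_h2_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.selected_alpn_protocol()
```

Calling `negotiated_protocol("grpc.crow.example.com")` against a working proxy should return `"h2"`. A result of `None` suggests HTTP/2 is not enabled on the listener, and a certificate error points to a TLS misconfiguration. This check does not exercise the gRPC service itself.
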

Combine static agents (always-on) with autoscaled agents (on-demand) for cost efficiency.

| Agent Type | Use Case |
| --- | --- |
| Static | Fast, lightweight builds; always available |
| Autoscaled | Resource-intensive builds; cost-optimized |

Example: Run a small static agent alongside the server for quick jobs. The autoscaler provisions powerful VMs only when the static agent is at capacity.

Use labels to route workflows:

Static agent configuration:

CROW_AGENT_LABELS=tier=standard

Workflow targeting autoscaled agents (.crow.yaml):

labels:
  tier: heavy

The autoscaler checks for available agents before provisioning. If a static agent can handle the workload, no new VM is created.
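
The check described above can be pictured with the following sketch. It is an illustration under stated assumptions rather than the autoscaler's implementation: the `Agent` type, the exact-match label rule, and the function names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Agent:
    labels: Dict[str, str]
    capacity: int      # workflows the agent can run in parallel
    running: int = 0   # workflows currently running


def labels_match(required: Dict[str, str], offered: Dict[str, str]) -> bool:
    """Assumed rule: every label the workflow asks for must be offered with the same value."""
    return all(offered.get(key) == value for key, value in required.items())


def needs_new_vm(workflow_labels: Dict[str, str], agents: List[Agent]) -> bool:
    """Provision only if no matching agent has a free slot."""
    for agent in agents:
        if labels_match(workflow_labels, agent.labels) and agent.running < agent.capacity:
            return False
    return True


static = [Agent(labels={"tier": "standard"}, capacity=2, running=1)]
print(needs_new_vm({"tier": "standard"}, static))  # False: the static agent has a free slot
print(needs_new_vm({"tier": "heavy"}, static))     # True: no agent offers tier=heavy
```
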

A single Crow server can use multiple autoscalers simultaneously. Each autoscaler runs as an independent process with its own registration token, provider configuration, and scaling limits.

Multiple autoscalers let you target different cloud providers from a single server, for example, Hetzner for Linux builds and Azure for Windows builds. You can also provision different instance sizes, using small VMs for unit tests and large VMs for integration tests. Multi-region setups are possible too, placing agents in eu-west for European teams and us-east for US teams. For cost optimization, non-urgent work can run on spot or preemptible instances while time-sensitive builds use on-demand capacity. Finally, you can serve different architectures by provisioning amd64 agents from one provider and arm64 agents from another.

Each autoscaler reports its capabilities to the server via heartbeat. The server uses two mechanisms to route workflows to the right autoscaler:

  1. Agent labels (CROW_AGENT_LABELS inside CROW_AGENT_ENV) — the autoscaler reports these to the server, which uses them to determine whether the autoscaler can provision agents for a given workflow. A workflow’s labels: must match an autoscaler’s reported labels for that autoscaler to handle it.

  2. Filter labels (CROW_FILTER_LABELS) — the autoscaler uses these locally to decide which queued tasks count toward its scaling decisions. Without this, every autoscaler would see all pending tasks and try to scale up for work meant for a different autoscaler.
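
The effect of the filter can be sketched as a simple count over the queue. The data shapes and the function `count_scalable_tasks` are assumptions for illustration. The example shows why an autoscaler without a filter would react to every pending task.

```python
from typing import Dict, List


def count_scalable_tasks(queued_task_labels: List[Dict[str, str]],
                         filter_labels: str = "") -> int:
    """Count queued tasks that should drive scaling for one autoscaler."""
    if not filter_labels:
        # No filter set: every pending task counts.
        return len(queued_task_labels)
    key, sep, value = filter_labels.partition("=")
    if not sep or not key:
        raise ValueError(f"expected key=value, got {filter_labels!r}")
    return sum(1 for labels in queued_task_labels if labels.get(key) == value)


queue = [{"tier": "standard"}, {"tier": "heavy"}, {"tier": "standard"}]
print(count_scalable_tasks(queue, "tier=standard"))  # 2
print(count_scalable_tasks(queue, "tier=heavy"))     # 1
print(count_scalable_tasks(queue))                   # 3
```
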

Register two autoscalers on the server and note their tokens.

docker-compose.yaml
services:
  # Small instances for standard builds
  autoscaler-standard:
    image: codefloe.com/crowci/crow-autoscaler:<version>
    restart: always
    environment:
      - CROW_SERVER=crow-server:9000
      - CROW_TOKEN=${CROW_TOKEN}
      - CROW_AUTOSCALER_TOKEN=${AUTOSCALER_TOKEN_STANDARD}
      - CROW_GRPC_ADDR=grpc.crow.example.com
      - CROW_GRPC_SECURE=true
      - CROW_MAX_AGENTS=4
      - CROW_WORKFLOWS_PER_AGENT=3
      - CROW_FILTER_LABELS=tier=standard
      - CROW_AGENT_ENV=CROW_AGENT_LABELS=tier=standard
      - CROW_PROVIDER=hetznercloud
      - CROW_HETZNERCLOUD_API_TOKEN=${HETZNER_TOKEN}
      - CROW_HETZNERCLOUD_SERVER_TYPE=cax21
      # ... other Hetzner settings

  # Large instances for heavy builds
  autoscaler-heavy:
    image: codefloe.com/crowci/crow-autoscaler:<version>
    restart: always
    environment:
      - CROW_SERVER=crow-server:9000
      - CROW_TOKEN=${CROW_TOKEN}
      - CROW_AUTOSCALER_TOKEN=${AUTOSCALER_TOKEN_HEAVY}
      - CROW_GRPC_ADDR=grpc.crow.example.com
      - CROW_GRPC_SECURE=true
      - CROW_MAX_AGENTS=2
      - CROW_WORKFLOWS_PER_AGENT=1
      - CROW_FILTER_LABELS=tier=heavy
      - CROW_AGENT_ENV=CROW_AGENT_LABELS=tier=heavy
      - CROW_PROVIDER=aws
      - CROW_AWS_INSTANCE_TYPE=c5.2xlarge
      # ... other AWS settings
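
As a quick capacity check for this setup, the upper bound on parallel workflows per tier is the product of CROW_MAX_AGENTS and CROW_WORKFLOWS_PER_AGENT. The helper name below is hypothetical; the numbers are taken from the two services above.

```python
def max_parallel_workflows(max_agents: int, workflows_per_agent: int) -> int:
    """Upper bound on workflows one autoscaler's agents can run at the same time."""
    return max_agents * workflows_per_agent


# (CROW_MAX_AGENTS, CROW_WORKFLOWS_PER_AGENT) per autoscaler
tiers = {"standard": (4, 3), "heavy": (2, 1)}
for name, (agents, per_agent) in tiers.items():
    print(name, max_parallel_workflows(agents, per_agent))
# standard 12
# heavy 2
```
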

Workflows select their tier with labels:

# .crow.yaml: lightweight job
labels:
  tier: standard
steps:
  - name: lint
    image: golangci/golangci-lint
    commands:
      - golangci-lint run

# .crow.yaml: resource-intensive job
labels:
  tier: heavy
steps:
  - name: integration
    image: golang
    commands:
      - go test -race -count=1 ./...