Autoscaler

The Crow CI Autoscaler dynamically provisions cloud servers to execute pipelines, then terminates them when idle.

sequenceDiagram
    participant Queue as Build Queue
    participant AS as Autoscaler
    participant Cloud as Cloud Provider
    participant Agent as Agent (VM)
    participant Server as Crow Server

    Queue->>AS: Pending build
    AS->>Cloud: Provision VM
    Cloud->>Agent: VM ready
    Agent->>Server: Register & connect
    Agent->>Agent: Execute pipeline
    Note over AS,Agent: Idle timeout
    AS->>Cloud: Terminate VM

Supported Providers

Provider	Configuration Reference
AWS	flags.go
Hetzner Cloud	flags.go
Linode	flags.go
Scaleway	flags.go
Vultr	flags.go

Additional providers with a Go SDK can be added — contributions welcome!

Setup

1. Deploy alongside the server: the autoscaler listens for build triggers.
2. Configure the server connection: provide the server address and authentication tokens.
3. Configure scaling limits: set the minimum and maximum number of agents and the workflows per agent.
4. Configure gRPC: remote agents need a secure gRPC connection to the server.
5. Configure the cloud provider: set provider credentials and instance settings.

Complete Example

services:
  crow-autoscaler:
    image: codefloe.com/crowci/crow-autoscaler:<version>
    restart: always
    depends_on:
      - crow-server
    environment:
      # Server connection
      - CROW_SERVER=crow-server:9000
      - CROW_TOKEN=${CROW_TOKEN} # Admin API token
      - CROW_AUTOSCALER_TOKEN=${CROW_AUTOSCALER_TOKEN}

      # Scaling limits
      - CROW_MIN_AGENTS=0
      - CROW_MAX_AGENTS=2
      - CROW_WORKFLOWS_PER_AGENT=5

      # gRPC (for remote agents)
      - CROW_GRPC_ADDR=grpc.crow.example.com
      - CROW_GRPC_SECURE=true

      # Timeouts
      - CROW_AGENT_IDLE_TIMEOUT=10m
      - CROW_AGENT_SERVER_CONNECTION_TIMEOUT=10m

      # Cloud provider (Hetzner example)
      - CROW_PROVIDER=hetznercloud
      - CROW_HETZNERCLOUD_API_TOKEN=${HETZNER_TOKEN}
      - CROW_HETZNERCLOUD_LOCATION=fsn1
      - CROW_HETZNERCLOUD_SERVER_TYPE=cax41
      - CROW_HETZNERCLOUD_IMAGE=ubuntu-24.04
      - CROW_HETZNERCLOUD_NETWORKS=my-network
      - CROW_HETZNERCLOUD_SSH_KEYS=my-key
      - CROW_HETZNERCLOUD_FIREWALLS=my-firewall

      # Agent image (optional — auto-detected from server version if omitted)
      # - CROW_AGENT_IMAGE=codefloe.com/crowci/crow-agent:v5.3.2

      # Optional: agent environment
      - CROW_AGENT_ENV=CROW_LOG_LEVEL=debug,CROW_HEALTHCHECK=false

Configuration Reference

Server Connection

Variable	Description
`CROW_SERVER`	Server address (internal or public URL)
`CROW_TOKEN`	Admin API token for agent management
`CROW_AUTOSCALER_TOKEN`	Registration token for autoscaler

Scaling

Variable	Default	Description
`CROW_MIN_AGENTS`	`0`	Minimum agents always running
`CROW_MAX_AGENTS`	`1`	Maximum concurrent agents
`CROW_WORKFLOWS_PER_AGENT`	`1`	Parallel workflows per agent
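
The three limits above interact, so a rough illustration may help. The following is a minimal sketch, assuming the autoscaler sizes the pool by dividing the workload by the per-agent capacity and clamping the result between the minimum and maximum. The function name and its inputs are hypothetical; the real scaling logic may weigh other factors.

```python
import math


def desired_agents(pending_workflows: int, running_workflows: int,
                   workflows_per_agent: int, min_agents: int, max_agents: int) -> int:
    """Return how many agents should exist for the current load (illustrative only)."""
    if workflows_per_agent < 1:
        raise ValueError("workflows_per_agent must be at least 1")
    total = pending_workflows + running_workflows
    # Each agent can run up to workflows_per_agent workflows in parallel.
    needed = math.ceil(total / workflows_per_agent)
    # Never drop below the minimum or exceed the maximum.
    return max(min_agents, min(needed, max_agents))
```

With the values from the complete example (`CROW_MIN_AGENTS=0`, `CROW_MAX_AGENTS=2`, `CROW_WORKFLOWS_PER_AGENT=5`), seven queued workflows would call for two agents under this model, and an empty queue for none.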

Timeouts

Variable	Default	Description
`CROW_AGENT_IDLE_TIMEOUT`	`10m`	Time an agent may stay idle before it is terminated
`CROW_AGENT_SERVER_CONNECTION_TIMEOUT`	`10m`	Maximum time an agent may go without a server connection
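
To make the idle timeout concrete, here is a minimal sketch of an idle check. It assumes timeout values use the compact form shown above (`10m`, and by extension combinations of `h`, `m`, and `s`); the helper names are hypothetical and the parser deliberately handles only that subset.

```python
import re
from datetime import datetime, timedelta

_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_duration(text: str) -> timedelta:
    """Parse values such as '10m' or '1h30m' (subset of duration syntax, assumed)."""
    parts = re.findall(r"(\d+)([hms])", text)
    # Reject anything that is not made up entirely of <number><unit> pairs.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(seconds=sum(int(n) * _UNIT_SECONDS[u] for n, u in parts))


def is_idle_expired(last_busy: datetime, now: datetime, idle_timeout: str) -> bool:
    """True once the agent has been idle for at least idle_timeout."""
    return now - last_busy >= parse_duration(idle_timeout)
```

Under this model, an agent that finished its last workflow 11 minutes ago is eligible for termination with the default `10m`, while one that was busy 5 minutes ago is not.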

gRPC Connection

Remote agents require secure gRPC to connect back to the server.

Variable	Description
`CROW_GRPC_ADDR`	Public gRPC address (no protocol prefix)
`CROW_GRPC_SECURE`	Set `true` for TLS connection
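
Because `CROW_GRPC_ADDR` takes no protocol prefix, a value such as `https://grpc.crow.example.com` is an easy mistake. The following is a minimal sketch of a pre-deployment check you could run on your own configuration; `validate_grpc_addr` is a hypothetical helper, not part of the autoscaler.

```python
def validate_grpc_addr(addr: str) -> str:
    """Return the trimmed address, or raise if it is empty or has a scheme."""
    addr = addr.strip()
    if not addr:
        raise ValueError("gRPC address must not be empty")
    if "://" in addr:
        # The table above asks for a bare host, e.g. grpc.crow.example.com.
        raise ValueError(f"remove the protocol prefix from {addr!r}")
    return addr
```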

Agent Image

Variable	Default	Description
`CROW_AGENT_IMAGE`	auto	Container image for spawned agents

When CROW_AGENT_IMAGE is not set, the autoscaler queries the Crow server’s /version endpoint and uses the matching agent image automatically. For example, if the server reports version v5.3.2, the autoscaler uses codefloe.com/crowci/crow-agent:v5.3.2.

Set this variable explicitly only if you need to pin a specific agent version or use a custom image.
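
The selection rule can be summarized in a few lines. This is a minimal sketch of the behavior described above, not the autoscaler's actual code; `resolve_agent_image` is a hypothetical name, and the version string is assumed to have already been fetched from the server.

```python
AGENT_IMAGE_REPO = "codefloe.com/crowci/crow-agent"


def resolve_agent_image(configured_image, server_version):
    """Prefer an explicitly configured image, else match the server version."""
    if configured_image:
        return configured_image
    return f"{AGENT_IMAGE_REPO}:{server_version}"
```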

Agent Configuration

Variable	Default	Description
`CROW_AGENT_ENV`	none	Environment variables passed to spawned agents (comma-separated `KEY=value` pairs)
`CROW_FILTER_LABELS`	none	Only count queued tasks matching this label (`key=value`) toward scaling decisions. Required for multiple autoscalers.

Example agent environment:

CROW_AGENT_ENV=CROW_AGENT_LABELS=tier=heavy,CROW_LOG_LEVEL=debug,CROW_HEALTHCHECK=false
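
Note that a value may itself contain `=`, as in `CROW_AGENT_LABELS=tier=heavy`. The following is a minimal sketch of how such a string can be read, assuming pairs are separated by commas and each pair is split on its first `=` only. The function is hypothetical, and this simple model cannot represent a value that itself contains a comma.

```python
def parse_agent_env(raw: str) -> dict[str, str]:
    """Split a comma-separated KEY=value string into a dictionary."""
    env = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue
        # Split on the first "=" only, so "tier=heavy" stays intact as the value.
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=value, got {pair!r}")
        env[key] = value
    return env
```

Applied to the example above, this yields `CROW_AGENT_LABELS` with the value `tier=heavy`, plus `CROW_LOG_LEVEL` and `CROW_HEALTHCHECK` as separate entries.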

gRPC Proxy Setup

Remote agents need a TLS-secured gRPC endpoint. Configure your reverse proxy to forward to the server’s gRPC port (default: 9000).

Nginx

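server {
    # gRPC requires HTTP/2; TLS is terminated here.
    # On nginx 1.25.1 and later, the "http2" listen parameter is deprecated;
    # use "listen 443 ssl;" together with "http2 on;" instead.
    listen 443 ssl http2;
    server_name grpc.crow.example.com;

    ssl_certificate /etc/ssl/certs/crow.crt;
    ssl_certificate_key /etc/ssl/private/crow.key;

    location / {
        # Forward plaintext gRPC to the server's gRPC port.
        grpc_pass grpc://crow-server:9000;
    }
}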

Caddy

grpc.crow.example.com {
    reverse_proxy h2c://crow-server:9000
}

Traefik

http:
  routers:
    crow-grpc:
      rule: Host(`grpc.crow.example.com`)
      service: crow-server
      tls:
        certResolver: letsencrypt
  services:
    crow-server:
      loadBalancer:
        servers:
          - url: h2c://crow-server:9000

Hybrid Setup

Combine static agents (always-on) with autoscaled agents (on-demand) for cost efficiency.

Agent Type	Use Case
Static	Fast, lightweight builds; always available
Autoscaled	Resource-intensive builds; cost-optimized

Example: Run a small static agent alongside the server for quick jobs. The autoscaler provisions powerful VMs only when the static agent is at capacity.

Use labels to route workflows:

Static agent configuration:

CROW_AGENT_LABELS=tier=standard

Workflow targeting autoscaled agents (.crow.yaml):

labels:
  tier: heavy

The autoscaler checks for available agents before provisioning. If a static agent can handle the workload, no new VM is created.
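
The check described above can be pictured as follows. This is a minimal sketch under two assumptions that the text does not spell out: an agent matches when it carries every label the workflow asks for with the same value, and each agent exposes a count of free workflow slots. The data shapes and function names are hypothetical.

```python
def labels_match(workflow_labels: dict, agent_labels: dict) -> bool:
    """True if the agent carries every label the workflow requests."""
    return all(agent_labels.get(k) == v for k, v in workflow_labels.items())


def needs_new_vm(workflow_labels: dict, agents: list) -> bool:
    """True if no existing agent with spare capacity matches the workflow."""
    for agent in agents:
        if agent["free_slots"] > 0 and labels_match(workflow_labels, agent["labels"]):
            return False
    return True
```

With a static agent labeled `tier=standard`, a `tier: standard` workflow is served without provisioning as long as the agent has a free slot, while a `tier: heavy` workflow always requires an autoscaled VM.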

Multiple Autoscalers

A single Crow server can use multiple autoscalers simultaneously. Each autoscaler runs as an independent process with its own registration token, provider configuration, and scaling limits.

Why Use Multiple Autoscalers

Multiple autoscalers let you target different cloud providers from a single server: for example, Hetzner for Linux builds and Azure for Windows builds. You can also provision different instance sizes, using small VMs for unit tests and large VMs for integration tests. Multi-region setups are possible too, placing agents in eu-west for European teams and us-east for US teams. For cost optimization, non-urgent work can run on spot or preemptible instances while time-sensitive builds use on-demand capacity. Finally, you can serve different architectures by provisioning amd64 agents from one provider and arm64 agents from another.

How It Works

Each autoscaler reports its capabilities to the server via heartbeat. The server uses two mechanisms to route workflows to the right autoscaler:

Agent labels (CROW_AGENT_LABELS inside CROW_AGENT_ENV): the autoscaler reports these to the server, which uses them to determine whether the autoscaler can provision agents for a given workflow. The labels in a workflow’s labels: section must match an autoscaler’s reported labels for that autoscaler to handle it.
Filter labels (CROW_FILTER_LABELS): the autoscaler uses these locally to decide which queued tasks count toward its scaling decisions. Without a filter, every autoscaler would see all pending tasks and try to scale up for work meant for a different autoscaler.
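
The effect of a filter label on scaling decisions can be sketched like this. It is a minimal illustration, assuming each queued task exposes its labels as a dictionary and that an unset filter means every task counts; the task structure and function names are hypothetical.

```python
def parse_filter_label(raw: str) -> tuple:
    """Split a 'key=value' filter into its two parts."""
    key, sep, value = raw.partition("=")
    if not sep or not key:
        raise ValueError(f"expected key=value, got {raw!r}")
    return key, value


def count_relevant_tasks(queued_tasks: list, filter_label) -> int:
    """Count the queued tasks that this autoscaler should scale for."""
    if not filter_label:
        # Without a filter, every pending task counts.
        return len(queued_tasks)
    key, value = parse_filter_label(filter_label)
    return sum(1 for task in queued_tasks if task.get("labels", {}).get(key) == value)
```

Given a queue of two `tier=standard` tasks and one `tier=heavy` task, an autoscaler with `CROW_FILTER_LABELS=tier=heavy` counts one task, whereas an autoscaler without a filter counts all three.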

Example: Dual-Provider Setup

services:
  # Small instances for standard builds
  autoscaler-standard:
    image: codefloe.com/crowci/crow-autoscaler:<version>
    restart: always
    environment:
      - CROW_SERVER=crow-server:9000
      - CROW_TOKEN=${CROW_TOKEN}
      - CROW_AUTOSCALER_TOKEN=${AUTOSCALER_TOKEN_STANDARD}
      - CROW_GRPC_ADDR=grpc.crow.example.com
      - CROW_GRPC_SECURE=true
      - CROW_MAX_AGENTS=4
      - CROW_WORKFLOWS_PER_AGENT=3
      - CROW_FILTER_LABELS=tier=standard
      - CROW_AGENT_ENV=CROW_AGENT_LABELS=tier=standard
      - CROW_PROVIDER=hetznercloud
      - CROW_HETZNERCLOUD_API_TOKEN=${HETZNER_TOKEN}
      - CROW_HETZNERCLOUD_SERVER_TYPE=cax21
      # ... other Hetzner settings

  # Large instances for heavy builds
  autoscaler-heavy:
    image: codefloe.com/crowci/crow-autoscaler:<version>
    restart: always
    environment:
      - CROW_SERVER=crow-server:9000
      - CROW_TOKEN=${CROW_TOKEN}
      - CROW_AUTOSCALER_TOKEN=${AUTOSCALER_TOKEN_HEAVY}
      - CROW_GRPC_ADDR=grpc.crow.example.com
      - CROW_GRPC_SECURE=true
      - CROW_MAX_AGENTS=2
      - CROW_WORKFLOWS_PER_AGENT=1
      - CROW_FILTER_LABELS=tier=heavy
      - CROW_AGENT_ENV=CROW_AGENT_LABELS=tier=heavy
      - CROW_PROVIDER=aws
      - CROW_AWS_INSTANCE_TYPE=c5.2xlarge
      # ... other AWS settings

Workflows select their tier with labels:

# .crow.yaml — lightweight job
labels:
  tier: standard

steps:
  - name: lint
    image: golangci/golangci-lint
    commands:
      - golangci-lint run

# .crow.yaml — resource-intensive job
labels:
  tier: heavy

steps:
  - name: integration
    image: golang
    commands:
      - go test -race -count=1 ./...