Behavior & Limits
Guarantees, limits, and platform-specific behavior for Alien Workers.
Guarantees
Stateless Execution. Each worker invocation runs in an isolated environment. Do not rely on in-memory state persisting between invocations — use KV or Storage for persistent state.
At-Least-Once Invocation (Triggers). Queue and storage triggers deliver events at least once. Your handler must be idempotent — the same event may be delivered more than once.
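One common way to tolerate redelivery is to record each event's identifier and skip identifiers already seen. The sketch below is illustrative only: `QueueEvent`, `ProcessedStore`, `InMemoryStore`, and `handleOnce` are hypothetical names, not part of the Alien API, and it assumes every delivery carries a stable unique `id`. The store is synchronous and in-memory purely so the example runs on its own; a real worker would back it with KV or Storage, most likely through an asynchronous client.

```typescript
// Hypothetical event shape: assumes each delivery carries a stable, unique id.
interface QueueEvent {
  id: string;
  payload: string;
}

// Hypothetical minimal store interface. In a real worker this would be backed
// by KV or Storage, because in-memory state does not persist across invocations.
interface ProcessedStore {
  has(id: string): boolean;
  add(id: string): void;
}

// In-memory stand-in, used only to keep this sketch self-contained.
class InMemoryStore implements ProcessedStore {
  private seen = new Set<string>();
  has(id: string): boolean {
    return this.seen.has(id);
  }
  add(id: string): void {
    this.seen.add(id);
  }
}

// Runs `work` at most once per event id, as long as deliveries do not overlap.
function handleOnce(
  event: QueueEvent,
  store: ProcessedStore,
  work: (payload: string) => void,
): "processed" | "skipped" {
  if (store.has(event.id)) {
    return "skipped";
  }
  work(event.payload);
  // Mark the id only after success so a failed attempt is retried on redelivery.
  store.add(event.id);
  return "processed";
}
```

The check and the write are separate steps here, so two overlapping deliveries of the same event could both run `work`. If that matters, the store needs an atomic "insert if absent" operation, or the work itself should be safe to repeat.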
Automatic Scaling. Workers scale up with incoming requests and scale to zero when idle. You do not configure concurrency or instance counts.
Binding Access. Workers can only access resources that are explicitly linked with link() in the stack definition. This is enforced at the infrastructure level via IAM/RBAC.
Limits
| Limit | Value | Notes |
|---|---|---|
| Max request body | Platform-dependent | Lambda: 6 MB sync, 256 KB async. Cloud Run: 32 MB. Container Apps: varies. |
| Max response body | Platform-dependent | Lambda: 6 MB. Cloud Run: 32 MB. |
| Max execution time | Platform-dependent | Lambda: 15 min. Cloud Run: 60 min. Container Apps: varies. |
| Queue triggers per worker | 1 | A worker can be triggered by at most one queue. |
| Concurrent invocations | Platform-dependent | Lambda: 1,000 default (requestable). Cloud Run: per-instance concurrency configurable. |
Platform Details
AWS (Lambda)
- Runtime: Container image on ARM64 (Graviton) for better price-performance.
- Cold starts: typically 1-3 seconds for the first invocation after idle.
- Payload limits: 6 MB synchronous, 256 KB asynchronous invocation.
- Max execution time: 15 minutes.
- Worker-to-worker invocation uses the Lambda Invoke API directly (not HTTP).
- Queue triggers: SQS event source mapping. One message per invocation. Visibility timeout is auto-calculated: `max(30s, min(12h, worker_timeout × 6))`.
- If the worker's ECR image is in a different region than the worker, Alien automatically handles cross-region image rewriting.
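To make the SQS visibility-timeout rule above concrete, here is a small sketch of the same arithmetic in seconds. The function and constant names are hypothetical; only the formula comes from this page.

```typescript
// Bounds from the formula: at least 30 seconds, at most 12 hours.
const MIN_VISIBILITY_SECONDS = 30;
const MAX_VISIBILITY_SECONDS = 12 * 60 * 60; // 43,200 seconds

// Hypothetical helper mirroring max(30s, min(12h, worker_timeout × 6)).
function visibilityTimeoutSeconds(workerTimeoutSeconds: number): number {
  const scaled = workerTimeoutSeconds * 6;
  return Math.max(
    MIN_VISIBILITY_SECONDS,
    Math.min(MAX_VISIBILITY_SECONDS, scaled),
  );
}

// A 3 s worker timeout scales to 18 s and is raised to the 30 s floor.
// A 60 s worker timeout scales to 360 s and is used as-is.
// A 900 s (15 min) worker timeout scales to 5,400 s (90 min).
// A 10,000 s worker timeout scales to 60,000 s and is capped at 43,200 s.
```

The factor of six leaves room for a slow invocation to finish before the message becomes visible again, while the 12-hour cap matches the maximum visibility timeout SQS allows.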
GCP (Cloud Run)
- Runtime: Container image.
- Cold starts: typically 1-2 seconds.
- Payload limits: 32 MB request/response.
- Max execution time: 60 minutes.
- Worker-to-worker invocation uses direct HTTP calls to the private service URL.
- Queue triggers: Pub/Sub push subscription.
- Supports per-instance concurrency (multiple requests per container).
Azure (Container Apps)
- Runtime: Container image.
- Worker-to-worker invocation uses direct HTTP calls to the private Container App URL.
- Queue triggers: Service Bus integration via KEDA.
Kubernetes / On-Prem
- Runs as a Pod with service discovery via internal DNS.
- Worker-to-worker invocation uses HTTP to the Kubernetes service.
- Queue triggers are not currently supported on Kubernetes.
Local
- Runs as a native process extracted from the built container image.
- Dynamic port assignment via `--port` flag.
- Automatic restart on crash via background monitor.
- Triggers supported via LocalTriggerService.
- Full environment variable injection and OTLP telemetry configuration.
Trigger Support Matrix
| Trigger Type | AWS | GCP | Azure |
|---|---|---|---|
| Queue → Worker | SQS event source | Pub/Sub push | Service Bus + KEDA |
| Storage → Worker | S3 notifications | GCS notifications | Dapr blob storage binding |
| Cron → Worker | EventBridge | Cloud Scheduler | Dapr cron binding |
Design Decisions
HTTP semantics for invocation. Worker-to-worker calls use HTTP request/response semantics (method, path, headers, body) even when the underlying transport isn't HTTP (e.g., Lambda Invoke API). This keeps the API portable and familiar.
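The idea of carrying HTTP-shaped requests over a non-HTTP transport can be sketched as a small envelope plus an adapter on each side. Everything below is an assumption made for illustration: the `WorkerRequest` and `WorkerResponse` shapes, the JSON encoding, and the `invoke` and `serve` helpers are hypothetical, and this page does not describe Alien's actual wire format. The transport is synchronous only to keep the example short.

```typescript
// Hypothetical envelope types carrying the HTTP-style fields named above.
interface WorkerRequest {
  method: string;
  path: string;
  headers: Record<string, string>;
  body: string;
}

interface WorkerResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

// A transport moves an opaque string payload and returns an opaque string reply.
// It could be backed by a real HTTP call or by a non-HTTP invoke API.
type Transport = (payload: string) => string;

// Caller side: serialize the HTTP-shaped request, send it, parse the reply.
function invoke(transport: Transport, request: WorkerRequest): WorkerResponse {
  const reply = transport(JSON.stringify(request));
  return JSON.parse(reply) as WorkerResponse;
}

// Callee side: adapt an HTTP-style handler to the opaque transport.
function serve(handler: (request: WorkerRequest) => WorkerResponse): Transport {
  return (payload: string): string => {
    const request = JSON.parse(payload) as WorkerRequest;
    return JSON.stringify(handler(request));
  };
}
```

Because caller and handler only ever see `WorkerRequest` and `WorkerResponse`, swapping the transport does not change application code, which is the portability benefit the design decision describes.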
One queue source per worker. A worker can consume from at most one queue, but can have multiple triggers of different types (e.g. a queue trigger + a cron trigger). If you need to consume from multiple queues, create multiple workers.
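The one-queue rule amounts to a simple check over a worker's trigger list. The sketch below is hypothetical: the `Trigger` type and `validateTriggers` function are not Alien APIs, and the trigger kinds are taken from the trigger support matrix on this page.

```typescript
// Trigger kinds as listed in the trigger support matrix.
type TriggerKind = "queue" | "storage" | "cron";

// Hypothetical trigger description: a kind plus the name of its source.
interface Trigger {
  kind: TriggerKind;
  source: string;
}

// Returns a list of problems; an empty list means the configuration is valid.
function validateTriggers(triggers: Trigger[]): string[] {
  const queues = triggers.filter((t) => t.kind === "queue");
  if (queues.length > 1) {
    const names = queues.map((t) => t.source).join(", ");
    return [
      `A worker can consume from at most one queue, found ${queues.length}: ${names}`,
    ];
  }
  return [];
}

// Valid: one queue trigger combined with a cron trigger.
const mixed: Trigger[] = [
  { kind: "queue", source: "orders" },
  { kind: "cron", source: "nightly" },
];

// Invalid: two queue triggers on the same worker; split into two workers.
const twoQueues: Trigger[] = [
  { kind: "queue", source: "orders" },
  { kind: "queue", source: "refunds" },
];
```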
Concurrency is optional. By default, Alien lets the cloud provider manage scaling. You can set .concurrencyLimit() to cap concurrent executions — this maps to reserved concurrency on Lambda, max instances on Cloud Run, and max replicas on Container Apps.