AWS Batch on Fargate: Single-Node, Multi-Container Jobs for Sidecar Pipelines
22-Aug-2025
This post shows how to run single-node, multi-container jobs on AWS Batch with Fargate, and why that’s powerful for sidecar-style data pipelines. We’ll build a pattern where a “main app” container exposes a local API, and a lightweight “pipeline” container consumes that API and exits when it’s done—letting the whole job complete cleanly without managing servers.
Why multi-container on a single Batch node?
- Co-locate cooperating processes on the same task/node
- Low-latency localhost communication between containers
- Shared lifecycle and logs managed by Batch
- Simple job success criteria (e.g., pipeline container exit code)
- No EC2 instances to manage (Fargate)
A common use case:
- Data extraction sidecar that scrapes an in-task API and writes to S3
How it works (Batch + Fargate + ECS orchestration)
- AWS Batch now supports ECS-style multi-container jobs via the ECS orchestration mode
- A single Batch job runs one ECS task (single node) with multiple containers
- Containers share the task network namespace and optional shared volumes
- You control container start order with dependsOn and health checks
- Batch decides job success/failure based on container exit codes and essential flags
High-level flow:
- Batch schedules a Fargate task
- “Main app” container starts and becomes healthy (exposes :8080/health)
- “Pipeline” container starts after main is healthy, calls http://localhost:8080/api, processes data, and exits 0
- Batch stops the task; the job is marked SUCCEEDED if the essential containers succeeded
Job definition (single node, multiple containers)
Below is an example AWS Batch job definition JSON using ECS orchestration with Fargate. It starts main-app first and only starts data-pipeline after the main container is healthy. The pipeline exits to signal job completion.
Notes:
- Mark the pipeline container as essential: true so its exit code drives job success
- Mark the main container as essential: false (it’s just providing an API for the pipeline)
- Use a health check on the main container so the dependsOn condition HEALTHY works
- The pipeline’s command uses Ref::EVENT_JSON, so a payload can be passed as a Batch parameter at submit time
{
"jobDefinitionName": "fargate-multicontainer-sidecar",
"type": "container",
"orchestrationType": "ECS",
"platformCapabilities": ["FARGATE"],
"ecsProperties": {
"taskProperties": [
{
"networkConfiguration": {
"assignPublicIp": "ENABLED"
},
"executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
"taskRoleArn": "arn:aws:iam::123456789012:role/myJobTaskRole",
"runtimePlatform": {
"cpuArchitecture": "X86_64",
"operatingSystemFamily": "LINUX"
},
"fargatePlatformConfiguration": {
"platformVersion": "LATEST"
},
"ephemeralStorage": {
"sizeInGiB": 30
},
"cpu": "1024",
"memory": "2048",
"containers": [
{
"name": "main-app",
"image": "123456789012.dkr.ecr.ap-southeast-6.amazonaws.com/main-app:latest",
"essential": false,
"portMappings": [
{ "containerPort": 8080, "hostPort": 8080, "protocol": "tcp" }
],
"healthCheck": {
"command": ["CMD-SHELL", "curl -fsS http://localhost:8080/health || exit 1"],
"interval": 5,
"timeout": 2,
"retries": 10,
"startPeriod": 5
},
"environment": [
{ "name": "SERVER_LOG", "value": "info" }
],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/aws/batch/job/fargate-multicontainer-sidecar",
"awslogs-region": "ap-southeast-6",
"awslogs-stream-prefix": "main"
}
}
},
{
"name": "data-pipeline",
"image": "123456789012.dkr.ecr.ap-southeast-6.amazonaws.com/data-pipeline:latest",
"essential": true,
"dependsOn": [
{ "containerName": "main-app", "condition": "HEALTHY" }
],
"environment": [
{ "name": "API_BASE_URL", "value": "http://localhost:8080" },
{ "name": "OUTPUT_BUCKET", "value": "s3://my-bucket/exports/" }
],
"command": ["Ref::EVENT_JSON"], # Payload
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/aws/batch/job/fargate-multicontainer-sidecar",
"awslogs-region": "ap-southeast-6",
"awslogs-stream-prefix": "pipeline"
}
}
}
]
}
]
},
"retryStrategy": { "attempts": 1 },
"propagateTags": true,
"tags": { "project": "sidecar-pipeline" }
}
Minimal container examples
Main app: simple HTTP API (Node.js example)
// main-app/server.js
const express = require('express');
const app = express();
app.get('/health', (req, res) => res.send('ok'));
app.get('/api', (req, res) => {
// Simulate data generation
res.json({ items: [1, 2, 3], generatedAt: new Date().toISOString() });
});
const port = process.env.PORT || 8080;
app.listen(port, () => console.log(`main-app listening on ${port}`));
# main-app/Dockerfile
FROM public.ecr.aws/docker/library/node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
# curl is used by the task health check but is not in the alpine base image
RUN apk add --no-cache curl
COPY server.js ./
EXPOSE 8080
CMD ["node", "server.js"]
Data pipeline: call local API and write to S3 (Python example)
# data-pipeline/pipeline.py
import os, json, urllib.request, boto3

base = os.getenv('API_BASE_URL', 'http://localhost:8080')
output = os.getenv('OUTPUT_BUCKET')  # e.g. s3://bucket/prefix/

# Fetch data from the main app over the task-local network (containers share localhost)
with urllib.request.urlopen(f"{base}/api") as r:
    data = json.loads(r.read())
print("Fetched:", data)

if output and output.startswith('s3://'):
    s3 = boto3.client('s3')
    # Split "s3://bucket/prefix/" into bucket name and key prefix
    bucket, _, prefix = output.removeprefix('s3://').partition('/')
    key = f"{prefix.rstrip('/')}/export.json".lstrip('/')
    s3.put_object(Bucket=bucket, Key=key, Body=json.dumps(data).encode('utf-8'))
    print(f"Wrote s3://{bucket}/{key}")
# data-pipeline/Dockerfile
FROM public.ecr.aws/docker/library/python:3.12-alpine
RUN pip install --no-cache-dir boto3
WORKDIR /app
COPY pipeline.py ./
# ENTRYPOINT (not CMD) so the job definition's command (Ref::EVENT_JSON) is passed as an argument
ENTRYPOINT ["python", "/app/pipeline.py"]
IAM roles
- Task execution role: pull ECR images, write CloudWatch Logs, (optional) read Secrets Manager/SSM
- Task role (taskRoleArn in the job definition): permissions for the pipeline itself (e.g., s3:PutObject)
Example policy attachment for pipeline writes:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": ["s3:PutObject", "s3:PutObjectAcl"],
"Resource": "arn:aws:s3:::my-bucket/exports/*"
}
]
}
Submitting the job
- Register the job definition
- Create a Batch job queue and compute environment (Fargate or Fargate Spot); a boto3 sketch follows this list
- Submit the job referencing the job definition; optionally pass env overrides (e.g., S3 prefix)
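For the compute environment and queue step, here is a boto3 sketch of a one-time setup. The names match the CLI example below; the region, subnet, and security group IDs are placeholders you would replace with your own.

# create_batch_infra.py: one-time setup sketch
import time
import boto3

batch = boto3.client("batch", region_name="ap-southeast-6")

ce = batch.create_compute_environment(
    computeEnvironmentName="my-fargate-ce",
    type="MANAGED",
    state="ENABLED",
    computeResources={
        "type": "FARGATE",                             # or "FARGATE_SPOT"
        "maxvCpus": 4,
        "subnets": ["subnet-0123456789abcdef0"],       # placeholder
        "securityGroupIds": ["sg-0123456789abcdef0"],  # placeholder
    },
)

# Wait until the compute environment is usable before attaching a queue
while True:
    desc = batch.describe_compute_environments(
        computeEnvironments=["my-fargate-ce"]
    )["computeEnvironments"][0]
    if desc["status"] == "VALID":
        break
    time.sleep(5)

batch.create_job_queue(
    jobQueueName="my-fargate-queue",
    state="ENABLED",
    priority=1,
    computeEnvironmentOrder=[
        {"order": 1, "computeEnvironment": ce["computeEnvironmentArn"]}
    ],
)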
Example (CLI, optional):
# Register job definition
aws batch register-job-definition \
--cli-input-json file://jobdef-fargate-multicontainer.json
# Submit job
aws batch submit-job \
--job-name export-run-$(date +%s) \
--job-queue my-fargate-queue \
--job-definition fargate-multicontainer-sidecar \
--container-overrides '[{"name":"data-pipeline","environment":[{"name":"OUTPUT_BUCKET","value":"s3://my-bucket/exports/run-123/"}]}]'
Behavior details and knobs
- Start order: use dependsOn plus a healthCheck to ensure the API is ready before the pipeline starts
- Job success: make the sidecar pipeline essential: true; when it exits 0 the job succeeds, and Batch stops the task and any non-essential containers (see the sketch after this list)
- Storage: use ephemeralStorage for temp space, or define volumes plus mountPoints for shared files
- Logging: send both containers to the same CloudWatch Logs group with different stream prefixes
- Cost: Fargate price is per-task vCPU/memory duration; keep the main container minimal (or non-essential) so the job ends quickly
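Because data-pipeline is the essential container, its exit code is the job's success signal. Below is a minimal sketch of making failures explicit; it assumes the pipeline's work is factored into a run() function, which is not shown in pipeline.py above.

# data-pipeline entrypoint sketch: a non-zero exit marks the Batch job FAILED
# and triggers the retryStrategy from the job definition
import sys
import traceback

def run():
    # call the local API and write to S3, as in pipeline.py above (assumed helper)
    ...

if __name__ == "__main__":
    try:
        run()
    except Exception:
        traceback.print_exc()   # surfaces the error in CloudWatch Logs
        sys.exit(1)             # essential container fails -> job FAILED
    sys.exit(0)                 # clean exit -> job SUCCEEDED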
Takeaways
- Batch + Fargate multi-container jobs are perfect for short-lived, cooperating processes
- Use dependsOn and health checks to coordinate start-up
- Make the sidecar pipeline the essential container so its exit code controls job success
- Keep containers minimal and the job short to optimize cost
This pattern gives you a clean, serverless data pipeline that travels with the job: no separate services to provision, costs that accrue only while the job runs, and solid observability through Batch and CloudWatch Logs.