Serverless means you don’t manage servers — AWS handles provisioning, scaling, and capacity. You write functions or deploy containers, and pay per execution/request.
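Lambda cold starts can reach 1-5 seconds, depending on runtime, package size, and initialization code: for latency-sensitive apps, use provisioned concurrency (pre-initialized instances) or scheduled pings to keep functions warm. Provisioned concurrency is billed for as long as it is configured, and it removes cold starts only for requests served within the provisioned capacity; traffic beyond that falls back to on-demand instances, which can still cold start. Scheduled pings typically keep only a single instance warm.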
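A minimal sketch of the scheduled-ping approach, assuming the scheduled rule sends a payload containing a `warmup` key. That key, `do_work`, and the response shapes are hypothetical names chosen for illustration, not part of any AWS API. The idea is to return early on a ping so the warm-up call does no real work:

```python
def is_warmup_ping(event):
    """Return True if the event is our (hypothetical) scheduled warm-up payload."""
    return isinstance(event, dict) and event.get("warmup") is True


def do_work(event):
    """Placeholder for the real business logic (hypothetical)."""
    return f"hello {event.get('name', 'world')}"


def handler(event, context):
    # Short-circuit warm-up pings: the instance stays warm, no real work runs.
    if is_warmup_ping(event):
        return {"warmed": True}
    return {"statusCode": 200, "body": do_work(event)}
```

Keeping the ping check at the top of the handler means the warm-up path never reaches the business logic.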
Scaling from zero is not instantaneous: there is always a brief provisioning delay. Even though Lambda scales out automatically (up to your account's concurrency limit), the first request to a new instance, or to a function that has been idle, pays a cold start. Design for this.
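One common way to design for cold starts is to do expensive setup once at module scope, so warm invocations reuse it, and to record whether a given invocation was cold. The sketch below assumes a Python handler; `load_config` and the returned fields are hypothetical stand-ins for real setup work and real logging:

```python
import time


def load_config():
    """Stand-in for expensive one-time setup (hypothetical)."""
    return {"loaded_at": time.time()}


# Module scope runs once per instance, i.e. during the cold start.
CONFIG = load_config()
_is_cold = True


def handler(event, context):
    global _is_cold
    cold = _is_cold
    _is_cold = False  # later invocations on this instance are warm
    # CONFIG is reused as-is; it is not reloaded on each request.
    return {"cold_start": cold, "config_loaded_at": CONFIG["loaded_at"]}
```

Logging the `cold_start` flag makes it possible to measure how often real traffic actually hits a cold instance.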
Lambda has a 15-minute (900-second) maximum execution time: for longer jobs, use Step Functions or ECS/Fargate. If your job takes 30 minutes, Lambda will time out at 15 minutes. Break the job into smaller steps or use a different service.
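A rough sketch of breaking a long job into steps: the handler processes items until it is close to the timeout, then returns a cursor so an orchestrator (for example a Step Functions loop) can invoke it again. `context.get_remaining_time_in_millis()` is the standard method on the Python Lambda context object; the event shape, `process_item`, and the 30-second safety margin are assumptions for illustration. Passing the item list in the payload only suits small inputs.

```python
SAFETY_MARGIN_MS = 30_000  # assumed margin; stop well before the timeout


def process_item(item):
    """Placeholder for one unit of work (hypothetical)."""
    return item * 2


def handler(event, context):
    items = event["items"]
    cursor = event.get("cursor", 0)
    while cursor < len(items):
        # Stop early if there is not enough time left for another item.
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            break
        process_item(items[cursor])
        cursor += 1
    # The caller re-invokes with this output until "done" is True.
    return {"items": items, "cursor": cursor, "done": cursor >= len(items)}
```

The orchestrator feeds each result back in as the next event, so every invocation resumes from the previous cursor.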
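App Runner is NOT the same as Lambda: it runs your application as a long-running containerized web service rather than as individual functions. You typically supply a container image, which means a Dockerfile and a registry such as Amazon ECR, although App Runner can also build from a source repository for supported runtimes. Lambda lets you upload code as a zip archive (it also accepts container images). Choose App Runner when you need full runtime control.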
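Serverless pricing can be unexpectedly high at scale: the bill is driven by duration and memory, not just request count. At commonly published us-east-1 x86 rates ($0.20 per million requests, $0.0000166667 per GB-second; check current pricing), 100M Lambda invocations/month costs only about $20 in request charges, but the compute charge can push the total into the thousands, and reaching $20K/month takes roughly 10 GB of memory at over a second per invocation. At low volume, serverless is cheap. At high volume (millions of requests/minute), a persistent service (ECS/Fargate) is often cheaper. Model your costs before going all-in on serverless.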
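To make the cost modelling concrete, here is a small estimator. The two rates are assumptions taken from commonly published us-east-1 x86 list prices and ignore the free tier and tiered discounts, so verify them against current AWS pricing before relying on the numbers. `lambda_monthly_cost` is a hypothetical helper, not an AWS API.

```python
# Assumed list prices (us-east-1, x86, no free tier); verify before use.
PRICE_PER_MILLION_REQUESTS = 0.20
PRICE_PER_GB_SECOND = 0.0000166667


def lambda_monthly_cost(invocations, avg_duration_s, memory_mb):
    """Estimate monthly Lambda cost in USD: request charge plus GB-seconds."""
    request_cost = invocations / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    gb_seconds = invocations * avg_duration_s * (memory_mb / 1024)
    compute_cost = gb_seconds * PRICE_PER_GB_SECOND
    return request_cost + compute_cost


if __name__ == "__main__":
    # Same 100M invocations/month under three memory/duration profiles.
    for memory_mb, duration_s in [(128, 0.1), (1024, 1.0), (10240, 1.2)]:
        cost = lambda_monthly_cost(100_000_000, duration_s, memory_mb)
        print(f"{memory_mb} MB, {duration_s}s: ${cost:,.2f}/month")
```

Under these assumed rates the same request volume spans roughly three orders of magnitude in cost, which is why the estimate should be compared against a persistent-service quote for your own memory and duration profile.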