AWS VPC Edge Routing Guide Part 1: Picking the Right Entry Point — A Decision Tree for ALB, NLB, API Gateway, CloudFront, and Global Accelerator


Introduction

“What should I put in front of this VPC?” is the question that comes back every time you sketch an AWS architecture, because there are five candidates: ALB, NLB, API Gateway, CloudFront, Global Accelerator. You know the names. But the moment you have to pick one for a new service, if the answer doesn’t surface within a second, the decision usually defaults to “whatever someone else picked last time.”

This series starts there. Across six parts (Part 0 through Part 5), we walk through AWS network service blocks framed as “what decision problem does this solve?” Part 1 covers the most frequent decision: picking an entry point that fronts your VPC.

  • Part 0 — Primer: network and AWS fundamentals
  • Part 1 — Picking the entry point: ALB / NLB / API Gateway / CloudFront / Global Accelerator (this post)
  • Part 2 — VPC-to-VPC and on-prem connectivity: VPC Endpoint / PrivateLink / Transit Gateway / Peering / Direct Connect
  • Part 3 — Inside the VPC: IGW / NAT GW / Route Tables / Security Group vs NACL
  • Part 4 — DNS decisions and Route 53: Hosted Zone / Routing Policy / Alias vs CNAME / Health Check
  • Part 5 — Four standard patterns: from decision tree to first sketch

The target reader is a backend or infrastructure engineer who has “spun up an ALB in the console but isn’t sure when to reach for API Gateway or CloudFront.” After this post, the goal is that picking the entry point for a new service stops being a thing you stall on.


TL;DR

  • The first split is L7 (HTTP/HTTPS) vs L4 (TCP/UDP). L7 → ALB · API Gateway · CloudFront. L4 → NLB · Global Accelerator.
  • API Gateway only earns its price tag when you actually use its built-in auth, throttling, usage plans, or managed integrations. As a plain HTTP proxy, ALB is almost always cheaper and faster.
  • CloudFront is not an entry point — it’s a caching layer in front of an ALB, an API Gateway, or S3. You almost never use it standalone.
  • NLB is the answer when you need a static IP, ultra-low latency, or non-HTTP TCP/UDP. WebSocket and gRPC are technically possible at L4, but ALB does both better at L7.
  • Global Accelerator only pays off with multi-region + worldwide users. Tacking it onto a single-region service just burns ~$18/month.

1. Why this decision is hard

AWS gives you five entry-point candidates, and they differ across OSI layer, routing granularity, auth/cache features, and pricing model. Two services that look similar on the surface diverge on a single critical variable, so a surface-level pick tends to bite you months later when cost or feature gaps force a rewrite.

flowchart LR
    User([Internet User])
    User --> Edge[Entry-point candidates]
    Edge --> CF[CloudFront]
    Edge --> APIG[API Gateway]
    Edge --> ALB[ALB]
    Edge --> NLB[NLB]
    Edge --> GA[Global Accelerator]
    CF -.cache miss.-> Origin[VPC: ALB/EC2/S3]
    APIG --> Lambda[Lambda / VPC: ALB/NLB]
    ALB --> Targets[VPC: EC2/ECS/EKS/Lambda]
    NLB --> Targets
    GA --> ALB2[ALB/NLB across regions]

The trick is that an entry point isn’t just “where traffic arrives” — it’s a component that adds processing on top of the request. So the decision narrows to four variables:

Variable | Meaning | Where it points
Protocol layer | HTTP/HTTPS = L7, anything else TCP/UDP = L4 | L7 → ALB / API Gateway / CloudFront, L4 → NLB / Global Accelerator
Global distribution | Users spread across continents | CloudFront (L7) / Global Accelerator (L4)
Managed extras | Auth, throttling, API keys, usage plans, cache | API Gateway (auth/throttle) / CloudFront (cache)
Static IP / ultra-low latency | IP allowlists, finance/game traffic | NLB

Note — L4 vs L7 and the OSI model: The OSI (Open Systems Interconnection) 7-layer model abstracts network communication into seven layers (physical, data link, network, transport, session, presentation, application). L4 (transport — TCP/UDP) routes by IP/port only, while L7 (application — HTTP) can route by message contents (host, path, headers). ALB is L7, NLB is L4 — same category of “AWS load balancer,” fundamentally different routing units.

Walk these four in order and the candidate set almost always collapses to one or two. The decision tree in §4 captures that flow on a single page; the sections before that explain how each candidate actually behaves.


2. L7 entry points — ALB / API Gateway / CloudFront

L7 means the entry point can route on the contents of the HTTP message (host, path, header, cookie). ALB, API Gateway, and CloudFront are all L7, but what they add at L7 is completely different.

Note — what is a reverse proxy?: An intermediary that takes incoming client requests and forwards them to the appropriate backend server. If a forward proxy is “a tool that goes out on behalf of clients” (e.g., squid for outbound corporate traffic), a reverse proxy “stands in front of servers and dispatches external requests to the right backend.” Four core responsibilities — (1) HTTPS termination, (2) host/path-based routing, (3) load balancing, (4) shielding the backend from direct exposure. Nginx, HAProxy, and Envoy are the canonical open-source implementations, and ALB, API Gateway, and CloudFront in this post are all “AWS-managed reverse proxies + extras.” NLB plays the same dispatching role at L4 (TCP/UDP).

2.1 ALB — the most ordinary L7 load balancer

ALB (Application Load Balancer) is AWS’s managed L7 reverse proxy. It’s the default entry point sitting in front of EC2 / ECS / EKS / Lambda inside a VPC, handling host/path/header-based routing and HTTPS termination. ALB lives in a Public Subnet — that is, a Subnet whose Route Table has a 0.0.0.0/0 → IGW route (the real meaning of Public vs Private subnet is unpacked in Part 0 §2.3).

flowchart LR
    Client([Client]) -->|HTTPS| ALB
    subgraph VPC
        ALB[ALB<br/>L7 routing + HTTPS termination]
        ALB -->|"host: api.x.com"| TG1[Target Group A<br/>ECS]
        ALB -->|"path: /admin/*"| TG2[Target Group B<br/>EC2]
        ALB -->|"path: /jobs/*"| TG3[Target Group C<br/>Lambda]
    end

The features that matter:

  • Host/path routing — one ALB fans multiple domains and paths out to different backends.
  • HTTPS termination — terminate TLS with an ACM cert and either talk plaintext to the backend or re-encrypt.
  • WebSocket / HTTP/2 / gRPC — all native. For gRPC, set the Target Group to HTTP + Protocol version: gRPC.
  • Managed HA — Multi-AZ internally. ALB itself failing is essentially not your problem.
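
The routing in the diagram above can be sketched as an ordered rule list: rules are checked by priority and the first match picks the target group. This is an illustrative Python model, not ALB's implementation; the `Rule` class, the `pick_target_group` helper, the target-group names, and the priorities are all hypothetical.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass(frozen=True)
class Rule:
    priority: int                 # lower number is evaluated first
    target_group: str             # hypothetical target group name
    host: Optional[str] = None    # exact host, e.g. "api.x.com"; None matches any host
    path: Optional[str] = None    # glob pattern, e.g. "/admin/*"; None matches any path


# Hypothetical rules mirroring the three arrows in the diagram.
RULES = [
    Rule(priority=10, target_group="tg-a-ecs", host="api.x.com"),
    Rule(priority=20, target_group="tg-b-ec2", path="/admin/*"),
    Rule(priority=30, target_group="tg-c-lambda", path="/jobs/*"),
]


def pick_target_group(host: str, path: str, default: str = "tg-default") -> str:
    """Return the target group of the first rule whose conditions all match."""
    for rule in sorted(RULES, key=lambda r: r.priority):
        host_ok = rule.host is None or rule.host == host.lower()
        path_ok = rule.path is None or fnmatchcase(path, rule.path)
        if host_ok and path_ok:
            return rule.target_group
    return default  # stands in for the listener's default action


print(pick_target_group("www.x.com", "/admin/users"))
```

Because evaluation stops at the first match, a request to `api.x.com/admin/x` lands on the host rule here, which is why rule ordering matters once host and path rules overlap.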

Pricing has two axes: hourly LB cost + LCU (Load Balancer Capacity Units). Roughly $16~20/month sits there as a fixed cost regardless of traffic. That’s the key catch — even with zero traffic, an idle ALB still bills monthly.

Note — how LCU is actually calculated: LCU isn’t a single metric. It’s a synthetic unit billed each hour at the maximum across four dimensions. Per-LCU allowances: (1) 25 new connections/sec, (2) 3,000 active connections/min, (3) 1 GB/hr processed for EC2, container, and IP targets (0.4 GB/hr for Lambda targets), (4) 1,000 rule evaluations/sec (only counting rules beyond the first 10). Whichever dimension your traffic shape pushes hardest dominates: many short API calls lean on “new connections,” WebSockets and SSE lean on “active connections,” big file responses lean on “processed bytes,” and rule sets larger than 10 lean on “rule evaluations.” One LCU sustained for a whole month is ≈ $5.84 (Seoul region), so small services don’t even fill 1 LCU and the LCU charge effectively disappears under the LB-hour cost; only when traffic spikes does LCU start to exceed the hourly charge. NLB’s NLCU works the same max-of-dimensions way but with much larger per-dimension allowances (e.g., 800 new flows/sec, 100,000 active flows/min for TCP), so for the same traffic NLB usually accrues fewer NLCUs and ends up cheaper.
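
The max-of-dimensions rule is easier to see as arithmetic. The sketch below is a simplified model using the per-LCU allowances and the ≈ $5.84/month figure quoted in the note; the function names are made up for illustration, and real billing is metered by AWS per hour, so treat the output as a rough estimate only.

```python
LCU_USD_PER_MONTH = 5.84  # one LCU sustained for a month, figure quoted in the note above


def alb_lcus(
    new_conns_per_sec: float,
    active_conns_per_min: float,
    processed_gb_per_hour: float,
    billable_rule_evals_per_sec: float = 0.0,
    gb_per_lcu: float = 1.0,  # per-LCU byte allowance; use the lower 0.4 figure where it applies
) -> float:
    """Billable LCUs for one hour: the maximum across the four dimensions."""
    dimensions = {
        "new_connections": new_conns_per_sec / 25,
        "active_connections": active_conns_per_min / 3000,
        "processed_bytes": processed_gb_per_hour / gb_per_lcu,
        "rule_evaluations": billable_rule_evals_per_sec / 1000,
    }
    return max(dimensions.values())


def lcu_usd_per_month(lcus: float) -> float:
    """Monthly LCU charge if this hourly usage were sustained all month."""
    return lcus * LCU_USD_PER_MONTH


# Many short API calls: new connections dominate (50 / 25 = 2 LCUs).
lcus = alb_lcus(new_conns_per_sec=50, active_conns_per_min=3000, processed_gb_per_hour=0.5)
print(lcus, round(lcu_usd_per_month(lcus), 2))
```

Only the largest dimension is billed, so halving the busiest dimension matters more than trimming the other three.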

2.2 API Gateway — when you need the managed extras

API Gateway is also L7, but it’s an entry point that wraps “everything you need to operate an API” in one box. Don’t think of it as a router — think of it as “auth, throttling, usage plans, response cache, custom domain, OpenAPI import, all in one service.”

These four — auth, throttling, usage plans, managed integration — show up together in the decision criteria, but each solves a different problem. Pairing each with what you’d otherwise have to build in your backend code makes it concrete.

Extra | What problem it solves | Building it yourself looks like | What API Gateway gives you
Auth | Verify “who is calling and what they can do” before traffic reaches the backend | Spring Security / Passport / custom JWT verification middleware | JWT/OIDC verification, IAM signing, Cognito User Pools, Lambda Authorizer (arbitrary logic)
Throttling | Cap requests per second or minute to protect the backend and absorb bursts and DDoS | Bucket4j / Resilience4j RateLimiter / Redis token bucket | RPS and burst limits at account/stage/route level, automatic 429 on overrun
Usage plans | Tier customers (“Free: 10 RPM, Pro: 1000 RPM”) by API Key, with quotas and billing | Custom API Key issuance + verification + metering + billing integration | REST API’s Usage Plan + API Key pairing, daily/monthly quotas + rate limits
Managed integration | Translate the incoming HTTP into a backend API call, e.g., GET /products/{id} → DynamoDB GetItem directly (no Lambda or EC2 in the middle) | Hand-rolled SDK calls + request/response mapping + retry/timeout logic | Integrations to DynamoDB / Lambda / SQS / Step Functions / arbitrary AWS APIs, VTL (Velocity Template Language, an Apache-Velocity-based templating DSL for request/response transforms via configuration only, no Lambda needed) request/response transforms, retry and timeout via configuration

The one decisive difference between managed integration and ALB: ALB takes the incoming HTTP and forwards it as-is to a compute target (EC2/ECS/Lambda), and your backend code does the work. API Gateway’s integration translates the HTTP into an AWS service API call and invokes it directly, so for simple CRUD you may not need a compute target at all — GET /products/123 becomes a DynamoDB call without any Lambda function in between. The other three (auth, throttling, usage plans) decide “how to validate and rate-limit a request”; integration decides “to whom and in what shape we forward it.”

If any one of these is “I’d rather not build this myself,” API Gateway’s price tag is justified. As a plain HTTP proxy without these, ALB is almost always cheaper — that’s the starting point for the comparison in §2.5.
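
To get a feel for what “building it yourself” means for the throttling row, here is a minimal in-process token bucket in Python: a steady refill rate plus a burst capacity, answering 429 when the bucket is empty. It is a sketch of the general technique, not API Gateway's implementation; the `TokenBucket` class and `handle` function are hypothetical, and a real deployment would also need shared state across instances (for example in Redis).

```python
import time
from typing import Optional


class TokenBucket:
    """Allow `rate_per_sec` requests on average, with bursts up to `burst`."""

    def __init__(self, rate_per_sec: float, burst: int, now: Optional[float] = None):
        self.rate = rate_per_sec
        self.capacity = float(burst)
        self.tokens = float(burst)  # start full so an initial burst is allowed
        self.updated = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill for the time that passed, never beyond the burst capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def handle(bucket: TokenBucket, now: Optional[float] = None) -> int:
    """Return the HTTP status a throttling layer would answer with."""
    return 200 if bucket.allow(now) else 429


bucket = TokenBucket(rate_per_sec=2, burst=3, now=0.0)
print([handle(bucket, now=0.0) for _ in range(4)])
```

The `now` parameter exists only so the behavior can be checked deterministically; in production code the monotonic clock default would be used.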

There are two variants and they get confused all the time.

Aspect | HTTP API | REST API
Released | v2 (2019~) | v1 (2015~)
Pricing | $1.00 per million requests | $3.50 per million requests
Auth | JWT / OIDC / Lambda Authorizer | IAM / API Key / Cognito / Lambda Authorizer
Response caching | No | Yes (separate hourly instance fee)
Request/response transforms | Limited | Powerful (Velocity templates)
When to pick | Most new APIs | API keys / usage plans / cache / VTL transforms

For a new API, default to HTTP API. Reach for REST API only when you need its specific features: usage-plan-based API keys, response cache, or VTL request/response transforms. (Mutual TLS on a custom domain is available on both variants, so it doesn’t decide the choice.)

Note: There’s a third type, WebSocket API, for bidirectional messaging (chat, real-time notifications). ALB also handles WebSocket, but if you want Lambda backends and want AWS to manage the connection state, WebSocket API is the answer.

API Gateway costs more than ALB, so the rule is simple: if you aren’t actually using the features that justify the price, you’re wasting money. The most common waste pattern is “I want something in front of Lambda” — Lambda Function URLs or ALB + Lambda targets are almost always cheaper.

Note — How API Gateway reaches a private VPC backend: VPC Link: API Gateway endpoints are public by default, so they can’t directly call ALB/NLB/EC2 backends inside a VPC. VPC Link is the bridge — the mechanism by which API Gateway dials into a VPC endpoint over PrivateLink. REST API’s VPC Link accepts only NLB, while HTTP API’s VPC Link accepts ALB, NLB, and Cloud Map (service discovery). That’s the actual identity of the “API Gateway with NLB behind it” pattern — and one of the reasons REST API costs more (no direct ALB connection). This NLB is a different case from the §5.1 antipattern (NLB in front of ALB): it’s not for static IP or whitelisting, the mechanism itself forces NLB.

2.3 CloudFront — not an entry point, a cache + global accelerator

CloudFront is AWS’s CDN. It caches content at 600+ edge locations worldwide and serves users from the closest one. You almost never use CloudFront standalone — there’s always an origin behind it (S3, ALB, API Gateway, an external HTTP server).

sequenceDiagram
    participant U as User (Seoul)
    participant E as CloudFront edge (Seoul)
    participant O as Origin ALB (us-east-1)
    U->>E: GET /static/app.js
    alt cache hit
        E-->>U: 200 (edge response)
    else cache miss
        E->>O: GET /static/app.js
        O-->>E: 200 + Cache-Control
        E-->>U: 200 (cache + respond)
    end

CloudFront pays off in three scenarios:

  • Lots of static assets — JS, CSS, images, fonts. Edge cache cuts origin traffic to nearly zero.
  • Global users — TLS handshake terminates at the closest edge, then traffic rides the AWS backbone to the origin (faster than the public-internet path).
  • API needs caching or DDoS protection — putting CloudFront in front of an API Gateway or ALB gives you short-TTL response caching and free AWS Shield Standard.

The mistake people make most often is forgetting it’s a caching layer. CloudFront on its own can’t handle a dynamic request — on a cache miss, it just forwards to the origin, and that origin (ALB, S3, whatever) is the one doing real work.
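
The hit/miss flow in the sequence diagram can be modeled in a few lines. This is a toy Python model of a single edge cache keyed by path with a TTL; the `EdgeCache` class and the origin callback are hypothetical, and real CloudFront cache keys, TTL rules, and invalidation are considerably richer.

```python
from typing import Callable, Dict, Tuple

# An origin returns (body, max_age_seconds); max_age 0 means "do not cache".
OriginFetch = Callable[[str], Tuple[bytes, int]]


class EdgeCache:
    def __init__(self, fetch_origin: OriginFetch):
        self._fetch_origin = fetch_origin
        self._store: Dict[str, Tuple[bytes, float]] = {}  # path -> (body, expires_at)

    def get(self, path: str, now: float) -> Tuple[bytes, str]:
        cached = self._store.get(path)
        if cached is not None and now < cached[1]:
            return cached[0], "hit"          # served from the edge, origin untouched
        body, max_age = self._fetch_origin(path)  # cache miss: the origin does the work
        if max_age > 0:
            self._store[path] = (body, now + max_age)
        return body, "miss"


origin_calls = []


def origin(path: str) -> Tuple[bytes, int]:
    origin_calls.append(path)
    # Hypothetical policy: static assets cacheable for 60 s, everything else dynamic.
    return (b"payload", 60 if path.startswith("/static/") else 0)


edge = EdgeCache(origin)
print(edge.get("/static/app.js", now=0.0)[1], edge.get("/static/app.js", now=30.0)[1])
```

Every dynamic path in this model reaches the origin on every request, which is the point of the paragraph above: the edge only saves work for responses the origin marks cacheable.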

2.4 Aside: Region vs Edge — what’s the difference?

§2.3 introduced “edge locations,” and the comparison table in the next section labels ALB as “Inside VPC (regional),” CloudFront as “Global edge,” and API Gateway REST API as “regional/edge.” It’s worth pausing to define the two words so the table reads smoothly.

Aspect | Region | Edge Location
Definition | A geographic cluster of AWS datacenters | A small PoP (Point of Presence) close to users
Count | ~30 (Seoul, Tokyo, Virginia, …) | 600+ (city level)
What lives there | EC2, RDS, ALB, VPC (all compute / storage / DB) | CloudFront cache, Route 53, TLS termination
Purpose | Heavy processing, durable state | Caching, DNS, TLS termination, anycast routing

In one line: Region is “where the service actually lives,” Edge is “the user-facing rim.” Heavy compute like ALB or EC2 only lives inside a region; edges only do lightweight work — caching, DNS, TLS termination.

Take a US user reaching a Seoul-region service and the difference becomes concrete:

  • Regional only — every request crosses the Pacific to Seoul. RTT ~150ms, TLS handshake adds another ~600ms in cumulative round trips.
  • With edge (CloudFront in front) — TLS terminates at a US edge → traffic rides AWS’s backbone to the Seoul origin. TLS handshake ~30ms, and static content never reaches origin at all.
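
The numbers in the two bullets follow from multiplying the round-trip time by the number of round trips needed before the first request can be sent. A back-of-the-envelope sketch, assuming the roughly four round trips this post uses in §5.3 and a hypothetical 7.5 ms RTT to a nearby edge (the real count depends on the TLS version and session resumption):

```python
def handshake_ms(rtt_ms: float, round_trips: int = 4) -> float:
    """Connection setup time before the first byte of the request is sent."""
    return rtt_ms * round_trips


regional_only = handshake_ms(150)   # user in the US, TLS terminates in Seoul
with_edge = handshake_ms(7.5)       # TLS terminates at a nearby edge (assumed RTT)

print(regional_only, with_edge)
```

The request itself still has to cross the ocean once on a cache miss; what the edge removes is the repeated long-haul round trips during connection setup.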

API Gateway REST API’s “regional/edge” means you choose between two endpoint types:

  • Regional endpoint — the user hits the API Gateway in that region directly. Useful when most users are in the same region or when you put your own CloudFront in front.
  • Edge-optimized endpoint — AWS automatically fronts the API Gateway with CloudFront edge. Shorter default latency for globally distributed users.

HTTP API supports only regional; if you want edge, you put CloudFront in front yourself.

Decision impact: If your users cluster in a single region, regional alone is fine. For globally distributed users, an edge layer (CloudFront or edge-optimized API Gateway) is almost mandatory — the TLS-RTT savings alone visibly cut perceived latency. Heavy static assets make the edge-cache effect decisive.

2.5 The L7 comparison table

Aspect | ALB | API Gateway HTTP API | API Gateway REST API | CloudFront
Where it runs | Inside VPC (regional) | AWS-managed (regional) | AWS-managed (regional/edge) | Global edge
Routing unit | host / path / header | route → integration | route → integration | path / behavior
Built-in auth | OIDC / Cognito | JWT / OIDC / Lambda | IAM / API Key / Cognito / Lambda | signed URL / signed cookie
Caching | No | No | Yes (optional) | Yes (the whole point)
WebSocket | Yes | Separate WebSocket API | No | Yes (proxied to the origin)
gRPC | Yes | No | No | Yes (proxied to the origin)
Idle cost | $16~20/month (LB hour) | $0 | $0 (cache adds hourly instance) | $0
Per-request cost | Very low (LCU) | $1.00 / million | $3.50 / million | Very low + data transfer
WAF integration | Yes (native) | No (CloudFront in front required) | Yes (native) | Yes (Shield Standard auto)
Strength | Containers/EC2 standard | Serverless API + auth/throttle | Usage plans / cache / VTL | Global cache / static assets / DDoS protection

The two confusions that come up most often:

  • ALB vs API Gateway HTTP API: Steady traffic above some threshold → ALB is cheaper (idle cost exists, but per-request is essentially zero). Low or spiky traffic → API Gateway is cheaper (no idle, only per-request). On the list prices in the table above, the crossover sits at roughly 16~20M requests/month ($16~20 of idle cost ÷ $1.00 per million), before LCU charges. Beyond cost, pick API Gateway when you need auth/throttling, ALB when you need containers/gRPC/WebSocket.
  • CloudFront vs the other two: CloudFront is not an alternative — it’s a layer you put on top. The question isn’t “ALB or CloudFront”, it’s “CloudFront + ALB or just ALB.”
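
The ALB-versus-API-Gateway crossover can be checked with simple arithmetic. The sketch below uses only the list prices quoted in the table (ALB idle cost and the per-million request prices) and deliberately ignores LCU charges, data transfer, and free tiers, so it gives an order of magnitude, not a bill. The function name is made up for illustration.

```python
ALB_IDLE_USD_PER_MONTH = (16.0, 20.0)  # fixed LB-hour range quoted in this post
HTTP_API_USD_PER_MILLION = 1.00
REST_API_USD_PER_MILLION = 3.50


def break_even_requests(idle_usd_per_month: float, usd_per_million: float) -> float:
    """Monthly request count at which per-request billing equals a fixed idle cost."""
    return idle_usd_per_month / usd_per_million * 1_000_000


for idle in ALB_IDLE_USD_PER_MONTH:
    print(
        f"idle ${idle:.0f}: "
        f"HTTP API {break_even_requests(idle, HTTP_API_USD_PER_MILLION):,.0f} req/month, "
        f"REST API {break_even_requests(idle, REST_API_USD_PER_MILLION):,.0f} req/month"
    )
```

Plugging in your own idle cost and per-million price shows how sensitive the crossover is to the inputs, which is a good reason to rerun it with current regional prices before deciding on cost alone.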

2.6 Aside: AWS API Gateway alongside Kong / Spring Cloud Gateway

Everything above stayed inside AWS. The “API Gateway” category itself is broader — Kong and Spring Cloud Gateway (Zuul’s successor) sit in the same role as self-hosted alternatives. Calling this out makes it explicit that this guide’s decision tree is “AWS-bound,” and lets you extend it to fit your organization’s constraints.

The category is the same. All three handle external ingress + path/host routing + auth/throttling + request/response transforms + logging. Three implementations of the same “API Gateway” abstraction.

The differences are in operating model, ecosystem, and integration depth.

Aspect | AWS API Gateway | Kong | Spring Cloud Gateway (Zuul successor)
Operations | AWS-managed (zero infra) | Self-hosted (you run containers/VMs) | Self-hosted (JVM process)
Stack | AWS proprietary | Nginx + OpenResty (Lua) | Java + Netty (Reactor)
Extension | Lambda Authorizer / VTL / OpenAPI import | Plugins (Lua/Go/JS), large ecosystem | Java filters
Cloud lock-in | AWS only | Portable (multi-cloud / on-prem) | Portable
Cost model | Per-request ($1~3.50 / million) | Server cost (24/7) | Server cost (24/7)
AWS service integration | Native (Lambda / DynamoDB / SQS direct) | HTTP backends only | HTTP backends only
Latency overhead | 10~30ms (managed) | Low (Nginx-based) | Medium (JVM warm-up)

When to pick which:

Situation | Pick
AWS-only, serverless (Lambda) backends, want to avoid running infra | AWS API Gateway
Multi-cloud / on-prem, high throughput, plugin customization | Kong
Spring Cloud microservices stack, Java-centric team | Spring Cloud Gateway
Simple routing only, minimize ops | ALB + path-based routing (no gateway needed)

Note: Zuul 1 was Netflix’s gateway, but it’s effectively been replaced by Spring Cloud Gateway (Reactor + Netty). New projects rarely use Zuul 1; “Zuul” usually means Spring Cloud Gateway today.

How this fits the decision tree: The tree in §4 is “which entry point to pick within AWS”. If your company is multi-cloud, runs an existing EKS/Kubernetes microservices stack, or has a Spring-Cloud-savvy team — anywhere “API Gateway” appears in the tree, “Kong or Spring Cloud Gateway are also candidates.” This guide’s tree assumes AWS lock-in as a premise.


3. L4 entry points — NLB / Global Accelerator

L4 means the entry point routes on TCP/UDP packet headers (IP, port) only. It doesn’t understand HTTP, but it’s very fast and accepts any protocol.

3.1 NLB — static IPs and ultra-low latency

NLB (Network Load Balancer) is the L4 load balancer. It fills the gaps that ALB can’t:

  • Static IPs — assign an EIP per AZ. Often the only answer when an external system requires IP allowlisting (banking, payment gateways).
  • TCP / UDP / TLS — anything that isn’t HTTP. Game servers (UDP), MQTT, custom binary protocols.
  • Ultra-low latency — packet-level processing means single-digit-ms lower latency than ALB.
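  • Preserve client IP — the backend sees the original source address when client IP preservation is on for the Target Group (the default for instance targets, configurable for IP targets); where it is off, Proxy Protocol v2 can carry the client address instead.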

ALB vs NLB in one line: does the entry point need to understand HTTP? Yes → ALB. No → NLB.

Note: NLB also supports TLS termination via TLS listener. So “HTTPS but routing on IP/port is enough” (e.g., a single backend with no host routing) is a legitimate NLB scenario.

3.2 Global Accelerator — multi-region + AWS backbone

Global Accelerator (GA) hands out two anycast IPs. Users send traffic to those IPs, enter at the closest AWS edge, then ride the AWS backbone to the actual backend (ALBs / NLBs / EC2 / EIPs in any region).

flowchart LR
    UserEU([User EU]) -->|anycast IP| EdgeEU[AWS Edge EU]
    UserAS([User Asia]) -->|anycast IP| EdgeAS[AWS Edge Asia]
    EdgeEU -.AWS backbone.-> ALBus[ALB us-east-1]
    EdgeAS -.AWS backbone.-> ALBap[ALB ap-northeast-2]
    EdgeEU -.failover.-> ALBap

Two concrete benefits:

  • AWS backbone from the first hop — similar to CloudFront riding the backbone on cache miss, but GA does it for every packet. Public-internet routing volatility doesn’t affect you.
  • Cross-region automatic failover — when a region dies, traffic shifts to another region. Similar to Route 53 health-check routing, but you don’t wait for DNS TTL.

The cost: $0.025/hour (~$18/month) fixed + extra data transfer. Almost always wasted on a single-region service. GA is justified roughly when “global users + multi-region backends already exist + DNS-based failover lag actually hurts.”

3.3 NLB vs Global Accelerator

Aspect | NLB | Global Accelerator
Layer | L4 | L4 (anycast)
Scope | Single region | Global
IP | EIP (static) per AZ | Two anycast IPs (permanent)
AWS backbone | Last hop only | First hop onward
Multi-region failover | No (Route 53 separately) | Yes (automatic)
Idle cost | $0.0225/hr | $0.025/hr + data transfer
When to pick | Static IP, ultra-low latency, TCP/UDP | Global users + multi-region

4. The decision tree

Walk the four variables above in order and almost any case resolves.

flowchart TD
    Start([External request]) --> Q1{HTTP/HTTPS?}
    Q1 -->|No: TCP/UDP/MQTT/games| Q2{Global + multi-region?}
    Q2 -->|Yes| GA[Global Accelerator<br/>+ regional NLB/ALB]
    Q2 -->|No| NLB[NLB]
    Q1 -->|Yes| Q3{Static assets ≥ 30%<br/>or global users?}
    Q3 -->|Yes| CF[CloudFront in front<br/>+ origin ALB/API Gateway/S3]
    Q3 -->|No| Q4{Need managed auth /<br/>throttling / usage plans / OpenAPI?}
    Q4 -->|No| ALB[ALB]
    Q4 -->|Yes| Q5{Need API keys / response cache /<br/>VTL transforms?}
    Q5 -->|No| HTTPAPI[API Gateway HTTP API]
    Q5 -->|Yes| RESTAPI[API Gateway REST API]

Each branch in one line:

  • Q1 (L7 vs L4): anything not HTTP is L4. WebSocket and gRPC ride on top of HTTP, so they’re L7.
  • Q2 (multi-region global): single region → NLB and you’re done. IP-allowlist scenarios that need a static IP also land here.
  • Q3 (static assets / global): heavy static or users across continents → CloudFront in front. CloudFront is never standalone — there’s always an origin.
  • Q4 (managed extras): no auth/throttling needs → ALB nearly always wins. Containers, EKS, gRPC live here too.
  • Q5 (REST-specific): usage-plan API keys, response cache, VTL request/response transforms. Any one of these → REST API. Otherwise HTTP API.

Key: The branches are ordered by feature, not cost, because missing features force a rewrite, but cost can be optimized after the fact. Picking the cheapest entry point only to find it can’t handle WebSocket means starting over from scratch.
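
The tree above is small enough to write down as a function, which is a handy way to check that you are answering the questions in the intended order. This is a direct transcription of the five questions into Python; the `Requirements` fields and the `pick_entry_point` name are invented for illustration, and the returned strings are just the labels from the diagram.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    http: bool                                  # Q1: is the traffic HTTP/HTTPS?
    global_multi_region: bool = False           # Q2: global users + multi-region backends?
    heavy_static_or_global_users: bool = False  # Q3: static-heavy or users across continents?
    needs_managed_extras: bool = False          # Q4: managed auth / throttling / usage plans?
    needs_rest_only_features: bool = False      # Q5: features only REST API offers?


def pick_entry_point(r: Requirements) -> str:
    if not r.http:  # L4 branch
        if r.global_multi_region:
            return "Global Accelerator + regional NLB/ALB"
        return "NLB"
    if r.heavy_static_or_global_users:
        return "CloudFront + origin (ALB / API Gateway / S3)"
    if not r.needs_managed_extras:
        return "ALB"
    if r.needs_rest_only_features:
        return "API Gateway REST API"
    return "API Gateway HTTP API"


print(pick_entry_point(Requirements(http=True, needs_managed_extras=True)))
```

Note that the CloudFront branch returns a layer plus an origin, so in practice you would run the remaining questions again to pick what sits behind it.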


5. Six common anti-patterns

Mistakes here repeat in predictable shapes. Walking through them once is usually enough to dodge the same trap later.

5.1 NLB in front of an ALB

“I need a static IP, so I’ll put an NLB in front, and I still need host routing so I’ll put an ALB behind it…” Chaining NLB → ALB is rarely the right shape. It does work (an NLB can register an ALB as its target), but the NLB only forwards IP/port to the ALB, which still does all the real work, so you pay the fixed and capacity charges of two load balancers and add a hop just to get static IPs. The standard answer when you need a static IP in front of an ALB is Global Accelerator in front of the ALB: GA gives you two static anycast IPs and forwards to the ALB.

5.2 Serving static assets from API Gateway

“I want the API and static files coming from the same origin, so I’ll let API Gateway handle the static stuff too.” That $1~3.50 per million requests now applies to every JS/CSS/image hit, and a single page load with dozens of asset requests blows up the bill. Standard pattern: static assets on S3 + CloudFront, API on API Gateway, with CloudFront path behaviors picking the right origin.

5.3 Serving global users without CloudFront

ALB only in Seoul, users in the US and Europe crossing the Pacific or Indian Ocean every request. The TLS handshake alone costs ~4 RTTs, so the latency hit is severe. Even for a fully dynamic API with no static assets, putting CloudFront in front with a 0-second cache moves TLS termination to the edge and noticeably reduces perceived latency.

5.4 ALB in front of a single EC2

A single EC2 instance behind an ALB. There’s nothing to load-balance, so the ALB is just an expensive HTTPS terminator — $20/month with no HA gain (when the EC2 dies, the ALB has nowhere to send traffic). Cheaper alternatives at this stage: terminate HTTPS on the EC2 with Nginx, or use Lightsail / Cloudflare Tunnel. ALB starts paying off from two EC2 instances onward.

5.5 NAT Gateway for S3 and DynamoDB access

Slightly off the entry-point-decision layer, but the most expensive cost trap in the whole series, so it’s worth flagging here too. By default a Private Subnet EC2 reaches S3 / DynamoDB through NAT Gateway → internet → S3 — and every GB triggers $0.045 in NAT GW data-processing charges. Analytics, log shipping, and image-upload workloads typically run hundreds of GB to TBs per month, all of it stacking onto the bill. A Gateway Endpoint (Part 2 §2) is free and a five-minute change to fix this — picking an entry point in Part 1 without then walking through Part 2 §2 leaves money on the table. Once the Part 1 decision is made, the Endpoint decision in Part 2 is the natural next step.
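
A quick calculation shows why this adds up. The sketch uses only the $0.045/GB data-processing figure quoted above and ignores the NAT Gateway's hourly charge; the monthly volumes are hypothetical examples.

```python
NAT_PROCESSING_USD_PER_GB = 0.045  # data-processing rate quoted in this post


def nat_processing_usd(gb_per_month: float) -> float:
    """Monthly NAT Gateway data-processing charge for traffic sent through it."""
    return gb_per_month * NAT_PROCESSING_USD_PER_GB


for gb in (500, 5_000, 50_000):  # hypothetical S3/DynamoDB volumes per month
    print(f"{gb:>6} GB/month -> ${nat_processing_usd(gb):,.2f} via NAT, $0.00 via Gateway Endpoint")
```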

5.6 Polling REST instead of using WebSocket

“I need real-time notifications, I’ll poll the REST endpoint.” It works at first, but as traffic grows, polling explodes both cost and server load. WebSocket scenarios should be drawn from the start with ALB (native WebSocket) or API Gateway WebSocket API. The split is about where connection state lives — backend (ALB) vs AWS-managed (API Gateway WebSocket).
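
To see how quickly polling grows, count the requests. The sketch below assumes every client is connected around the clock (a deliberate worst case) and prices the result at the $1.00 per million HTTP API rate quoted earlier; the client count and polling interval are hypothetical.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600


def polling_requests_per_month(clients: int, interval_sec: int) -> int:
    """Requests generated if every client polls once per interval, all month."""
    return clients * SECONDS_PER_MONTH // interval_sec


def usd_at_per_million(requests: int, usd_per_million: float = 1.00) -> float:
    return requests / 1_000_000 * usd_per_million


requests = polling_requests_per_month(clients=10_000, interval_sec=5)
print(f"{requests:,} requests/month, about ${usd_at_per_million(requests):,.0f}")
```

Most of those requests return “nothing new,” which is the waste a persistent WebSocket connection avoids.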


Recap

What this post covered:

  1. Entry-point selection collapses to four variables: protocol layer / global distribution / managed extras / static IP & ultra-low latency. Walk them in order and the candidate set drops to one or two.
  2. What separates the L7 candidates is “what processing they add” — ALB does host/path routing, API Gateway adds auth and throttling, CloudFront caches globally. Same layer, different jobs.
  3. CloudFront is a caching layer, not an entry point. There’s always an origin behind it; it almost never runs alone.
  4. L4 splits between NLB (single-region static IP / ultra-low latency) and Global Accelerator (multi-region global). GA carries a fixed cost, so it’s nearly always wasted in a single-region service.
  5. Six anti-patterns to dodge: NLB → ALB chaining, static assets from API Gateway, no CloudFront in front of a global service, ALB on a single EC2, NAT GW for S3 access, REST polling instead of WebSocket. Each comes from taking one branch wrong on the decision tree.

The goal of Part 1 was to make the decision of “what entry point fronts this VPC” almost automatic. With the decision tree in hand, picking it should be a sub-minute exercise.

Next up: VPC-to-VPC and on-prem connectivity. What’s the difference between VPC Endpoint and PrivateLink, when does Transit Gateway win over VPC Peering, and at what scale does Direct Connect start paying for itself? The decision problem after the traffic has entered — how it reaches the next system inside (or outside) the VPC.

Note — series flow: The decision tree above starts after a user has reached an ALB’s IP / domain. The step before that — how the user’s typed-in domain resolves to an IP and reaches a specific entry point — is covered in Part 4 (DNS decisions and Route 53). In actual traffic-flow order Part 4 happens before Part 1, but the entry-point picked here is what you then register in DNS via Part 4.


Appendix. One-page summary

Bookmark this section for quick reference.

A. By OSI layer

Layer | Candidates | Routing unit
L7 (HTTP) | ALB / API Gateway / CloudFront | host, path, header, cookie
L4 (TCP/UDP) | NLB / Global Accelerator | IP, port

B. Pricing in one line

Candidate | Idle cost | Per-request cost | Data transfer
ALB | $16~20/mo | LCU (effectively zero) | Standard EC2 outbound
NLB | $16~20/mo | NLCU | Standard
API Gateway HTTP API | $0 | $1.00 / million | Standard
API Gateway REST API | $0 (cache adds hourly instance) | $3.50 / million | Standard
CloudFront | $0 | Very low | edge → user (region-tiered)
Global Accelerator | ~$18/mo | $0 | $0.015/GB extra

C. Official AWS docs

D. Acronyms

AWS services and entry points

Acronym | Meaning
VPC | Virtual Private Cloud. An isolated virtual network inside AWS
EC2 | Elastic Compute Cloud. AWS virtual servers
ECS / EKS | Elastic Container Service / Elastic Kubernetes Service. Managed container orchestration on AWS
Lambda | AWS serverless compute (upload code; pay per invocation)
S3 | Simple Storage Service. AWS object storage
DynamoDB | AWS-managed NoSQL key-value database
SQS | Simple Queue Service. AWS-managed message queue
ALB | Application Load Balancer. L7 load balancer
NLB | Network Load Balancer. L4 load balancer
CloudFront | AWS’s CDN (global edge caching)
GA | Global Accelerator. AWS-backbone-based global accelerator
API Gateway | AWS-managed API entry point (auth, throttling, usage plans)

Pricing and components

Acronym | Meaning
LCU / NLCU | (Network) Load Balancer Capacity Unit. ALB / NLB usage billing unit
EIP | Elastic IP. Static public IP
TG | Target Group. ALB/NLB backend pool
OAC | Origin Access Control. CloudFront-to-S3 access protection
VPC Link | Integration mechanism API Gateway uses to reach VPC endpoints over PrivateLink (REST = NLB only; HTTP = ALB / NLB / Cloud Map)

Network and protocols

Acronym | Meaning
OSI | Open Systems Interconnection. The 7-layer model abstraction of network communication
L4 | OSI transport layer (TCP/UDP). Routes by IP/port only
L7 | OSI application layer (HTTP). Routes by message contents (host, path, header)
TLS | Transport Layer Security. The encryption protocol behind HTTPS
mTLS | Mutual TLS. Both server and client authenticate via certificates
TCP / UDP | Transmission Control Protocol / User Datagram Protocol. L4 transport protocols
WebSocket | A bidirectional real-time protocol over TCP
gRPC | Google RPC. HTTP/2 + Protocol Buffers RPC framework
RTT | Round Trip Time. The latency measure for a packet round trip
DNS | Domain Name System. Resolves domains to IPs

APIs and auth

Acronym | Meaning
REST | Representational State Transfer. The HTTP-based API design style
OpenAPI | A YAML/JSON standard for documenting APIs (formerly Swagger)
OIDC | OpenID Connect. An identity layer on top of OAuth 2.0
JWT | JSON Web Token. A signed JSON token used for auth/session passing
Cognito | AWS-managed user pools and authentication
IAM | Identity and Access Management. AWS permission and access control
VTL | Velocity Template Language. The request/response transform template DSL in API Gateway REST API (Apache-Velocity-based)
WAF | Web Application Firewall. L7 firewall (blocks SQL injection, XSS, etc.)
DDoS | Distributed Denial of Service. A traffic flooding attack from many sources

General

Acronym | Meaning
CDN | Content Delivery Network. A global edge caching network (e.g., CloudFront)
On-prem / On-premises | Your own datacenter or office server room: infrastructure you operate outside a public cloud like AWS