Understanding Load Balancers
Load balancers distribute incoming traffic across multiple servers to ensure reliability, availability, and optimal resource utilization. Understanding load balancer capacity is crucial for proper scaling and cost management.
Load Balancer Types
Application Load Balancer (Layer 7)
Operates at the HTTP/HTTPS level with advanced routing:
- Content-based routing: Route by URL, headers, query strings
- Host-based routing: Route by hostname
- WebSocket support: Persistent connections
- HTTP/2 support: Multiplexed connections
- Best for: Web applications, microservices, containers
Network Load Balancer (Layer 4)
Operates at the TCP/UDP level with ultra-low latency:
- High performance: Millions of requests per second
- Static IP: Fixed IP addresses per AZ
- Low latency: Microsecond latencies
- Protocol support: TCP, UDP, TLS
- Best for: Gaming, IoT, high-performance applications
Classic Load Balancer (Legacy)
Previous generation; supports both Layer 4 and Layer 7:
- EC2-Classic: Supports legacy EC2 platform
- Limited features: Basic load balancing
- Recommendation: Migrate to ALB or NLB
AWS ALB Pricing - Load Balancer Capacity Units (LCU)
What is an LCU?
An LCU measures load balancer resource utilization across four dimensions:
1. New Connections (per second)
- 1 LCU = 25 new connections/second
- Example: 100 new connections/sec = 4 LCUs
2. Active Connections (per minute)
- 1 LCU = 3,000 active connections
- Example: 9,000 concurrent connections = 3 LCUs
3. Processed Bytes
- 1 LCU = 1 GB per hour (for EC2, IP targets)
- 1 LCU = 0.4 GB per hour (for Lambda targets)
- Example: 2 GB/hour = 2 LCUs
4. Rule Evaluations
- 1 LCU = 1,000 rule evaluations/second
- First 10 rules: Free
- Example: 2,000 eval/sec = 2 LCUs
LCU Billing
You're charged for the highest of the four dimensions, not their sum:
- If new connections = 4 LCUs, active = 3 LCUs, bandwidth = 2 LCUs
- You pay for 4 LCUs
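The max-dimension billing rule can be sketched as a small calculation. The conversion factors are the per-LCU allowances listed above (EC2 targets for processed bytes); the function name and inputs are illustrative:

```python
def alb_lcus(new_conn_per_sec, active_conn, gb_per_hour, rule_evals_per_sec):
    """Estimate hourly ALB LCU usage from the four dimensions."""
    dims = {
        "new_connections": new_conn_per_sec / 25,       # 25 new conn/sec per LCU
        "active_connections": active_conn / 3000,       # 3,000 active conn per LCU
        "processed_bytes": gb_per_hour / 1.0,           # 1 GB/hour per LCU (EC2 targets)
        "rule_evaluations": rule_evals_per_sec / 1000,  # 1,000 evals/sec per LCU
    }
    # Billing uses only the highest dimension, not the sum.
    top = max(dims, key=dims.get)
    return top, dims[top]

dim, lcus = alb_lcus(100, 9000, 2.0, 500)
print(dim, lcus)  # new_connections 4.0
```

With the example numbers from above (4, 3, 2, and 0.5 LCUs), the billable figure is the 4 LCUs driven by new connections.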
Connection Metrics
Concurrent Connections
The number of connections active at any given moment, given by Little's Law:
Concurrent = Requests/sec × Response_time
Example: 100 req/s × 0.2s response = 20 concurrent connections
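The relationship is easy to check numerically (a one-line sketch of the formula above):

```python
def concurrent_connections(requests_per_sec, response_time_sec):
    """Little's Law: concurrency = arrival rate x time in system."""
    return requests_per_sec * response_time_sec

print(concurrent_connections(100, 0.2))  # 20.0
```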
Connection Duration
How long connections stay open:
- HTTP/1.1: Typically 5-30 seconds (keep-alive)
- HTTP/2: 60+ seconds (multiplexed)
- WebSocket: Minutes to hours (persistent)
Capacity Planning
Estimating Backend Servers
General guidelines for server capacity:
- CPU-bound: 100-500 req/sec per core
- I/O-bound: 1,000-10,000 concurrent connections per server
- Memory-bound: Depends on data size and caching
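For a CPU-bound service, the guidelines above translate into a rough sizing formula. The per-core throughput here (250 req/s, the midpoint of the stated range) is an assumption; replace it with measured numbers:

```python
import math

def servers_needed(peak_rps, rps_per_core, cores_per_server, headroom=1.0):
    """Estimate server count for a CPU-bound service.

    peak_rps: expected peak requests/sec
    rps_per_core: measured (or assumed) per-core throughput
    headroom: safety multiplier (e.g. 2.0 to size for 2x peak)
    """
    capacity_per_server = rps_per_core * cores_per_server
    return math.ceil(peak_rps * headroom / capacity_per_server)

# 5,000 req/s peak, 250 req/s per core (assumed), 8-core servers
print(servers_needed(5000, 250, 8))       # 3
print(servers_needed(5000, 250, 8, 2.0))  # 5
```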
High Availability
For production workloads:
- Deploy in multiple availability zones (minimum 2)
- Size for N+1 capacity (survive one server failure)
- Plan for 2-3x peak traffic
- Enable connection draining (300-3600 seconds)
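The multi-AZ, N+1, and peak-multiplier guidelines above can be combined into one fleet-sizing sketch. The per-server throughput and surge factor are assumptions to tune per workload:

```python
import math

def ha_fleet_size(peak_rps, rps_per_server, azs=2, surge_factor=2.0):
    """N+1 fleet sizing across availability zones (rule-of-thumb sketch).

    Size for surge traffic, add one spare instance so the fleet
    survives a single server failure, then round up to a multiple
    of the AZ count so instances spread evenly.
    """
    base = math.ceil(peak_rps * surge_factor / rps_per_server)
    n_plus_1 = base + 1
    return math.ceil(n_plus_1 / azs) * azs

# 3,000 req/s peak, 1,000 req/s per server (assumed), 2 AZs, 2x surge
print(ha_fleet_size(3000, 1000))  # 8
```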
Auto-Scaling Rules
- CPU threshold: Scale at 70% average CPU
- Active connections: Scale at 1,000 per instance
- Response time: Scale when latency > target
- Request count: Scale at target requests/instance
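The scaling rules above amount to an any-threshold-breached decision. This is a simplified sketch (real autoscalers use target tracking over a sustained window, and the latency target here is an assumed SLO):

```python
def should_scale_out(avg_cpu, active_conns_per_instance, p95_latency_ms,
                     cpu_limit=70.0, conn_limit=1000, latency_target_ms=200):
    """Scale out if any of the rule-of-thumb thresholds is breached.

    Thresholds mirror the guidelines above; latency_target_ms is
    an assumed service-level target, not a universal value.
    """
    return (avg_cpu > cpu_limit
            or active_conns_per_instance > conn_limit
            or p95_latency_ms > latency_target_ms)

print(should_scale_out(75.0, 400, 120))  # True  (CPU over 70%)
print(should_scale_out(50.0, 400, 120))  # False (all within limits)
```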
Performance Optimization
Connection Pooling
Reuse backend connections to reduce overhead:
- Reduces TCP handshake latency
- Saves on connection establishment time
- Improves throughput
Keep-Alive Settings
Optimize connection reuse:
- Client keep-alive: 60-120 seconds
- Backend keep-alive: 60-300 seconds
- Idle timeout: Balance between reuse and resource consumption
Health Checks
- Interval: 10-30 seconds (more frequent = faster failover)
- Timeout: 5-10 seconds
- Threshold: 2-3 consecutive failures
- Endpoint: Lightweight endpoint (e.g., /health)
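These settings bound how long a dead target keeps receiving traffic. A rough approximation (the exact timing depends on the load balancer's implementation) is one interval per required failure, plus the timeout on the final probe:

```python
def worst_case_failover_sec(interval_sec, timeout_sec, unhealthy_threshold):
    """Approximate time to mark a dead target unhealthy.

    The checker must observe `unhealthy_threshold` consecutive
    failures, one per interval, with the last probe also waiting
    out its timeout. This is a rough upper-bound estimate.
    """
    return interval_sec * unhealthy_threshold + timeout_sec

# 10s interval, 5s timeout, 3 consecutive failures
print(worst_case_failover_sec(10, 5, 3))  # 35
```

Tightening the interval speeds up failover but increases probe load on every backend, so keep the health endpoint cheap.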